Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync #166

Merged
merged 16 commits into from
Nov 18, 2020
Merged

Sync #166

merged 16 commits into from
Nov 18, 2020

Conversation

RCasatta
Copy link
Member

This PR contains:

  • change sync logic: biggest change in the sync algorithm is to drop the "clever but problematic" handling of the descendant transaction, see old fn check_tx_and_descendant. The rationale of the current algorithm is that internal address contains funds only in our transactions so you don't initially ask about them to the electrum server but check those in transactions retrieved from external addresses and only if present asks for spending tx. Doing so is problematic in case we have a deep change chain because these transactions must be asked serially. The algorithm in this PR simply asks for internal address as it's done for external so we can batch requests more, improving speed.
  • add timestamps to transactions: at the moment timestamp field of transactions is not populated, this PR download transaction relative block headers to populate the timestamp
  • remove listunspent calls: closes Change electrum client to use get_history instead of listunspent  #140, spent information is in our history, so we can avoid to ask this endpoint.
  • repl example can use esplora so that testing this blockchain source is easier

Results:

wallet_name max_addresses branch first sync following sync
change_deep 200 master 19.0s 9.0s
change_deep 200 sync 2.3s 1.2s
single_many_utxos 200 master 16.0s 2.0s
single_many_utxos 200 sync 2.0s 1.0s

Future improvements:

  • Error handling on connection reset. At the moment rate limiting on some electrum server (like esplora) could prevent big wallet (2000+ txs) to sync, we should handle connection failure and implement some exponential backoff time to keep servers happy
  • At the moment Esplora API miss raw blockheader endpoint in production, now it has been updated Blockstream/electrs@91076cf but not yet deployed (current code build the blockheader from a block header json)
  • Get rid of max_addresses in electrum-like syncing mechanism: change cache addresses mechanism to calculate the addresses every team previous stop_gap batch contains txs (max_addresses could still be useful for parallelism in blockchain source like compact filters)
  • improve esplora concurrency so that wasm wallets could benefit
  • use rust-miniscript with secp context to speed up address derivations
  • handle fee capabilities, avoid fetching previous txs if requested fee capability is not accurate

change_deep wallet:

RUST_LOG=info cargo run --release --example repl --features cli-utils,esplora,electrum -- --wallet change_deep --descriptor "wpkh(tpubD6NzVbkrYhZ4WrmVQXmpVcPoh1LUtuBaVLJ3PDPNEaFpw5iiBW9GwkMmtkpT3ucpN5G64Tfj9EAGVGkXrYFrpdBdvp8RMDaKhtoEp5cBDWY/0/*)" --change_descriptor "wpkh(tpubD6NzVbkrYhZ4WrmVQXmpVcPoh1LUtuBaVLJ3PDPNEaFpw5iiBW9GwkMmtkpT3ucpN5G64Tfj9EAGVGkXrYFrpdBdvp8RMDaKhtoEp5cBDWY/1/*)" sync --max_addresses 200

single_many_utxos wallet:

RUST_LOG=info cargo run --release --example repl --features cli-utils,esplora,electrum -- --wallet single_many_utxos --descriptor "wpkh(tpubD6NzVbkrYhZ4X2yy78HWrr1M9NT8dKeWfzNiQqDdMqqa9UmmGztGGz6TaLFGsLfdft5iu32gxq1T4eMNxExNNWzVCpf9Y6JZi5TnqoC9wJq/*)" sync --max_addresses 200

Make every request in batch, to save round trip times
Fetch timestamp of blockheader to populate timestamp field in transaction
Remove listunspent requests because we can compute it from our history
@RCasatta
Copy link
Member Author

PS. Today I am seeing a big swing in esplora electrum server performance, previous results are just one run from yesterday (without considering address derivation time)

src/blockchain/utils.rs Outdated Show resolved Hide resolved
examples/repl.rs Show resolved Hide resolved
examples/repl.rs Show resolved Hide resolved
src/cli.rs Outdated Show resolved Hide resolved
src/blockchain/esplora.rs Outdated Show resolved Hide resolved
src/blockchain/utils.rs Outdated Show resolved Hide resolved
src/blockchain/utils.rs Outdated Show resolved Hide resolved
src/blockchain/utils.rs Outdated Show resolved Hide resolved
src/blockchain/utils.rs Outdated Show resolved Hide resolved
src/blockchain/utils.rs Outdated Show resolved Hide resolved
@RCasatta
Copy link
Member Author

I did a commit for almost any feedback for easier review, let me know if you prefer them squashed

src/blockchain/esplora.rs Outdated Show resolved Hide resolved
@afilini
Copy link
Member

afilini commented Nov 17, 2020

I did a commit for almost any feedback for easier review, let me know if you prefer them squashed

That's fine, I think we can keep them like this

for el in flattened {
// el.height = -1 means unconfirmed with unconfirmed parents
// el.height = 0 means unconfirmed with confirmed parents
// but we threat those tx the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "treat"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in ae16c8b


let mut wallet_chains = vec![ScriptType::Internal, ScriptType::External];
// shuffling improve privacy, the server doesn't know my first request is from my internal or external addresses
wallet_chains.shuffle(&mut thread_rng());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to get all the script pubkeys first and shuffle those? Seems like that would mix internal and external better, but it might also mix different "chunks" of the script pubkeys together making the wallet easier to identify as a single cluster?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also had the same idea, it would be nice but it makes it very hard to stop when a gap large enough has been reached, because everything is mixed up. You would probably end up making many more requests than necessary

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think it would require more requests, especially in case where there are much more internal addresses than external (or viceversa)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #176 to keep track of this idea, we could potentially implement it with a flag that switches between a "fast" and a "private" sync algorithm. For now I'll merge this and the improvements can come with future PRs.

@afilini afilini merged commit a601337 into bitcoindevkit:master Nov 18, 2020
@shesek
Copy link

shesek commented Nov 30, 2020

At the moment Esplora API miss raw blockheader endpoint in production

The new API endpoint is now deployed.

https://github.com/Blockstream/esplora/blob/master/API.md#get-blockhashheader

https://blockstream.info/api/block/000000000000000000061c57a2d52944c7388007358f2ed88bbc586f60190822/header

nickfarrow pushed a commit to nickfarrow/bdk that referenced this pull request Feb 21, 2022
Update ToPublicKey API to take in context param
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change electrum client to use get_history instead of listunspent
4 participants