Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize GraphQL "coins to spend" query #2391

Closed
rafal-ch opened this issue Oct 24, 2024 · 4 comments · Fixed by #2463
Closed

Optimize GraphQL "coins to spend" query #2391

rafal-ch opened this issue Oct 24, 2024 · 4 comments · Fixed by #2463
Assignees

Comments

@rafal-ch
Copy link
Contributor

rafal-ch commented Oct 24, 2024

The source of this issue is #623.

Initially, the idea was to remove all complex queries from fuel-core and make it part of the indexer's responsibility. But the SDK is deeply integrated with these basic queries, that we need to support them on fuel-core. For that, we need to optimize the following query by aggregating some information in the off-chain worker or adding new indexes to the database:

  • The coins_to_spend query is very complex and requires n^2 operations where n is the number of coins and messages. The algorithm itself can use additional indexes from RocksDB to solve the issue with dust coins and works fast.

Extracted from the #1965, which initially covered optimizations for 3 different areas: balances, coins to spend, dry run.

@rafal-ch
Copy link
Contributor Author

rafal-ch commented Nov 12, 2024

This issue will be delivered in at least 2 PRs:

  1. Add ability to store coins to spend in an sorted way
  2. Leverage the fact that coins are sorted in the selection algorithm

How to achieve pt. 1:

  1. We'll use two DBs/columns:
    1. main_db - the current one
    2. index_db - the new, indexation DB
  2. Use the separate DB/column to build an "reverse lookup index" which will use the RocksDB key sorting capabilities
    1. RocksDB uses lexicographical sort
    2. To maintain the ordering, the integer representing the amount will be converted to big-endian bytes
    3. Each entry will have a key USER|ASSET[^3]|AMOUNT, and value is going to be a key from the main_db
      • this will allow querying by USER|ASSET prefix, getting all coins in sorted order
      • TODO [needed, useful?]: Add ability to retrieve only coins "not greater" and/or "not less" than X
  3. To solve the conflicts (for example: user has two coins with same value for the same asset ID), each coin entry will have an unique suffix (utxo_id). This suffix will also be used to quickly exclude coins.

Example DB structure:
Coins:

key: utxo_id amount block_created tx_created_idx asset_id owner
X1 1 3 12 BTC Alice
X2 100 3 10 BTC Alice
X3 10 3 11 BTC Alice
Y1 54321 4 101 LCK Bob
Y2 12345 4 100 LCK Bob
X4 20 3 14 ETH Alice

Messages:

key: nonce amount sender recipient data da_height
M1 54321 Alice Alice [] 12
M2 54321 Bob Bob [1, 2] 13

Corresponding index_db:

key1 value
Alice-BASE-0-000000000000d431-M1 CoinTypeId(Message)2
Alice-BTC-0-0000000000000001-X1 CoinTypeId(Coin)
Alice-BTC-0-000000000000000a-X3 CoinTypeId(Coin)
Alice-BTC-0-0000000000000064-X2 CoinTypeId(Coin)
Alice-ETH-0-0000000000000014-X4 CoinTypeId(Coin)
Bob-BASE-1-000000000000d431-M2 CoinTypeId(Message)
Bob-LCK-0-0000000000003039-Y2 CoinTypeId(Coin)
Bob-LCK-0-000000000000d431-Y1 CoinTypeId(Coin)

How to achieve pt. 2:

  1. We try to fulfill the request using the biggest coins available
  2. If there is not enough value, return empty set
  3. If there is enough value, put the coins into the set
  4. If there are still free slots available (ie. user provided higher limit that the number of coins we found in the point above), fill the random amount of remaining slots with dust coins (smallest ones)
  5. Observe if the added dust coins can "replace" the big coins selected in step 1 - if so, remove the "big coin".
    • this will promote spending the dust coins if applicable, but if a user wants to spend "big" coins, he can always send query with lower max

Remarks:

  1. Both main coins and dust coins should respect the "excluded" filter
  2. Coins selected as main coins should not be added as dust later in the process

Questions:

  1. Can we specify the same asset multiple times in coins_to_spend query? - Nope - see here.
    • ⚠ The duplicate check is not included in the PoC

The PoC implementation of the above is available here: https://github.com/rafal-ch/coins_to_spend_poc

Footnotes

  1. key is a concatenation of 1) owner, 2) asset_id, 3) single byte which is 1 for coins and non-retryable messages and 0 for retryable messages, 4) big-endian encoded amount, 5) utxo_id/nonce (for conflict avoidance and coin exclusion)

  2. CoinTypeId is and repr u8 enum with the following variants: {Coin, Message}. This will be used to disambiguate between "coins with base asset id" and "messages".

@rafal-ch
Copy link
Contributor Author

rafal-ch commented Nov 15, 2024

Current flow:

  1. coinsToSpend() arrives at the GraphQL API
  2. Coins are collected
    1. off_chain DB is queried for "owned coin ids"
      • these are taken from OwnedCoins storage
    2. coin exclusion filter is applied
    3. coins are read from on_chain DB (coins_iter())
      • these are taken from Coins storage
  3. if querying for base_asset_id, the above is repeated for "messages":
    1. off_chain DB is queried for "owned message ids"
      • these are taken from OwnedMessageIds storage
    2. messages are read from on_chain DB (messages_iter())
      • these are taken from Messages storage
  4. random_improve() algo is used to select coins
    • if it cannot satisfy the request, fallback to largest_first() algorithm

@rafal-ch
Copy link
Contributor Author

rafal-ch commented Nov 24, 2024

For coins the key bytes represent:
| owner | asset_id | retryable flag | amount bytes | utxo_id |
where retryable flag is always set to false

For messages the key bytes represent:
| owner | base_asset_id | retryable flag | amount bytes | nonce |

@rafal-ch
Copy link
Contributor Author

Also, to disambiguate between "coins with base asset id" and "message" we'll use the "value" in the DB.

rafal-ch added a commit that referenced this issue Jan 13, 2025
Closes #2391

This PR includes all changes from the [Part 1
PR](#2455), making it
deprecated.

## Description
Changes in this PR:

#### The new `CoinsToSpend` index
* This is the database that stores all coins to spend sorted by the
amounts (i.e. largest-by-value coins first)
* The key consists of several parts
* _Retryable flag_ - to distinguish between retryable messages and other
coins
  * _Address_ (owner)
  * _AssetID_
* _Amount_ - as "big-endian" bytes to leverage the RocksDB key sorting
capabilities
* _Foreign Key_ - this are bytes of the key from either the `Messages`
or `Coins` on-chain databases
    * for messages this is a 32-bytes `Nonce`
    * for coins this is a 34-bytes `UtxoId`
* The value is an instance of `IndexedCoinType` enum, so we know which
on-chain database to query when returning the actual coins
* This index is updated when executor events are processed
* When querying for "coins to spend" the following algorithm is applied:
* First, we get as many "big" coins as required to satisfy _double the
amount_ from the query (respecting `max` and `excluded` params)
* If we have enough coins, but there are still some "slots" in the query
left (because we selected less coins than `max`) we fill the remaining
slots with a **random** number of "dust" coins
* If it happens that the value of selected "dust coins" is able to cover
the value of some of the already selected "big coins", we remove the
latter from the response
* If at any step we encounter a problem (reading from database, integer
conversions, etc.) we bail with an appropriate error

#### Changes to `CoinsQueryError` type
* The `MaxCoinsReached` variant has been removed because in the new
algorithm we never query for more coins than the specified `max`, hence,
without additional effort, we are not able to tell whether the query
could be satisfied if user provided bigger `max`
* The `InsufficientCoins` has been renamed to
`InsufficientCoinsForTheMax` and it now contains the additional `max`
field

#### Off-chain database metadata
* The metadata for off-chain database now contain the additional
`IndexationKind` - `CoinsToSpend`

#### Refactoring
* The `indexation.rs` module was split into separate files, each per
indexation type + errors + some utils.

#### Other
* Integration tests have to be updated to not expect the exact number of
coins to spend in the response (currently, due to randomness, we're not
deterministic in this regard)
* The number of excluded ids in the `coinsToSpend` GraphQL query is now
limited to the maximum number of inputs allowed in transaction.

### Before requesting review
- [X] I have reviewed the code myself
- [X] I have created follow-up issues caused by this PR and linked them
here

### Follow-up issues
* #2498
* #2448
* #2428
* #2499
* #2496

---------

Co-authored-by: Green Baneling <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant