
rpc: sync state api #51

Merged: 1 commit into main from hacka-sync-state-api on Oct 31, 2023

Conversation

hackaugusto
Contributor

closes: #43

@hackaugusto force-pushed the hacka-sync-state-api branch 4 times, most recently from 339f16d to 400d0e6, on October 27, 2023 14:47
@@ -4,24 +4,24 @@ package block_header;
import "digest.proto";

message BlockHeader {
/// the hash of the previous blocks header.
Contributor Author

Protobuf comments use just two forward slashes. prost translated the `///` comment above as:

/// / the hash of the previous blocks header.

because the extra forward slash was considered part of the comment text.

@@ -0,0 +1,29 @@
syntax = "proto3";
Contributor Author

I moved the requests/responses to a separate module. The reason is that, for the time being, the RPC and the store use the same definitions. Having them defined once means they are the same type to the Rust compiler, so there is no need to perform additional casting.

In the long run this will probably change; for example, if we add distributed tracing, the RPC will need to forward a token to the Store, and there will likely be other reasons for the messages to diverge. But for now it is best to keep things simple.
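A minimal sketch of the point about type identity, with hypothetical module and type names standing in for the prost-generated code (the real field names and layout may differ):

mod generated {
    // Hypothetical stand-ins for the prost-generated types.
    #[derive(Debug, Default, Clone)]
    pub struct SyncStateRequest {
        pub block_num: u32,
    }

    #[derive(Debug, Default, Clone)]
    pub struct SyncStateResponse {
        pub chain_tip: u32,
    }
}

// Because the RPC and the store both use `generated::*`, the RPC can pass the
// request through and return the store's response without any conversion.
fn rpc_sync_state(request: generated::SyncStateRequest) -> generated::SyncStateResponse {
    store_sync_state(request)
}

fn store_sync_state(request: generated::SyncStateRequest) -> generated::SyncStateResponse {
    generated::SyncStateResponse { chain_tip: request.block_num }
}

fn main() {
    let response = rpc_sync_state(generated::SyncStateRequest { block_num: 0 });
    println!("{response:?}");
}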

@@ -27,6 +27,7 @@ miden-node-utils = { path = "../utils" }
miden_objects = { workspace = true }
once_cell = { version = "1.18.0" }
prost = { version = "0.12" }
rusqlite = { version = "0.29", features = ["array", "buildtime_bindgen"] }
Contributor Author

array is needed to support passing multiple values to the cursor. This is used to allow querying for tag/nullifier prefixes and account IDs.

The rusqlite library supports multiple SQLite distributions: it can use the system's library, statically link SQLite into the final binary, or use a patched SQLite with additional encryption features. To speed up compilation, the library ships with FFI bindings generated for a conservative SQLite version that works with any of the aforementioned distributions. Unfortunately, those pre-generated bindings don't support the array extension, so we have to generate the FFI bindings ourselves (hence the buildtime_bindgen feature).

store/src/db.rs Outdated
Comment on lines 38 to 54
conn.interact(|conn| array::load_module(conn))
.await
.map_err(|_| anyhow!("Loading carray module failed"))??;
Contributor Author

@hackaugusto Oct 27, 2023

This loads the array extension, which is needed to run queries that take a vector of values from the Rust side.

An alternative implementation would use the post_hook. I'll look into that later, but for now this seems to be working.
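For reference, a minimal self-contained sketch of what the array extension enables (it requires the rusqlite "array" feature and closely follows the example in the rusqlite documentation); the notes table and column names here are illustrative, not the ones from this PR:

use std::rc::Rc;

use rusqlite::{types::Value, vtab::array, Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    // Load the array (rarray/carray) extension once per connection.
    array::load_module(&conn)?;

    conn.execute_batch(
        "CREATE TABLE notes (block_num INTEGER NOT NULL, tag INTEGER NOT NULL);
         INSERT INTO notes VALUES (1, 10), (2, 20), (3, 30);",
    )?;

    // A vector of values built on the Rust side, bound as a single rarray(?1) parameter.
    let tags: Rc<Vec<Value>> = Rc::new(vec![Value::from(10i64), Value::from(30i64)]);

    let mut stmt = conn.prepare("SELECT block_num FROM notes WHERE tag IN rarray(?1)")?;
    let blocks: Vec<i64> = stmt
        .query_map([tags], |row| row.get(0))?
        .collect::<Result<_>>()?;

    println!("{blocks:?}"); // [1, 3]
    Ok(())
}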

(
block_num INTEGER NOT NULL,
block_header BLOB NOT NULL,

PRIMARY KEY (block_num),
CONSTRAINT block_header_block_num_positive CHECK (block_num >= 0)
CONSTRAINT block_header_block_num_is_u32 CHECK (block_num >= 0 AND block_num < 4294967296)
Contributor Author

The previous store PR requested using u32 for the block number. This PR does some tidying up to encode that both in the DB constraints and in the type system.
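A small illustrative sketch of the two layers (simplified table, not the actual schema from this PR): the CHECK constraint rejects out-of-range values at the DB level, and the Rust side converts the stored integer back into a u32:

use rusqlite::{Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE block_headers (
             block_num INTEGER NOT NULL,
             PRIMARY KEY (block_num),
             CONSTRAINT block_num_is_u32 CHECK (block_num >= 0 AND block_num < 4294967296)
         );",
    )?;

    // In range: accepted by the CHECK constraint.
    conn.execute("INSERT INTO block_headers (block_num) VALUES (?1)", [u32::MAX as i64])?;

    // Out of range: rejected by the CHECK constraint.
    let too_big = u32::MAX as i64 + 1;
    assert!(conn
        .execute("INSERT INTO block_headers (block_num) VALUES (?1)", [too_big])
        .is_err());

    // SQLite hands integers back as i64, so the Rust side converts explicitly.
    let stored: i64 =
        conn.query_row("SELECT block_num FROM block_headers", [], |row| row.get(0))?;
    let block_num = u32::try_from(stored).expect("guaranteed by the CHECK constraint");
    println!("{block_num}");
    Ok(())
}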

Comment on lines 166 to 174
mmr_delta: todo!(),
block_path: todo!(),
Contributor Author

The MMR only supports constructing a delta against its latest version, so this can't be done yet. I'm looking into a fix to add in the crypto repo.

Contributor Author

@hackaugusto Oct 27, 2023

Opened 0xPolygonMiden/crypto#205, 0xPolygonMiden/crypto#206, and 0xPolygonMiden/crypto#207.

With these 3 PRs merged this becomes a trivial change.

@hackaugusto force-pushed the hacka-sync-state-api branch 3 times, most recently from 19db790 to affe3ef, on October 27, 2023 19:39
@hackaugusto requested a review from bobbinth on October 27, 2023 19:39
@hackaugusto marked this pull request as ready for review on October 27, 2023 19:39
Contributor

@bobbinth left a comment

Looks good! Thank you! I left a few comments inline - they are all fairly small. I do have one broader comment/question:

If I understood the code correctly, when the store fulfills the sync_state request it makes multiple sequential requests to the database (i.e., first notes_since_block_by_tag, then get_block_header, get_account_hash_by_block_range etc.). All these requests are independent of each other - i.e., executed in different transactions (and may be executed against different versions of the database).

While I don't think this results in consistency issues for this specific endpoint, I wonder if a better approach would be to execute all these individual requests in a single transaction. One way to do this would be to have a single method on the Db struct - something like get_state_sync_info - and then to make individual queries to the DB inside this method.
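For illustration, one possible shape of such a method, as a sketch only: the queries and tables below are made up, and the real store goes through a connection pool (the interact closure) rather than a bare rusqlite Connection.

use rusqlite::{Connection, Result};

// Hypothetical container for the data needed by the sync_state response.
#[derive(Debug)]
struct StateSyncInfo {
    chain_tip: i64,
    note_count: i64,
    nullifier_count: i64,
}

// Sketch of a get_state_sync_info-style method: every SELECT runs inside the
// same transaction, so they all observe one snapshot of the database.
fn get_state_sync_info(conn: &mut Connection, block_start: i64) -> Result<StateSyncInfo> {
    let tx = conn.transaction()?;

    let chain_tip: i64 = tx.query_row(
        "SELECT COALESCE(MAX(block_num), 0) FROM block_headers",
        [],
        |row| row.get(0),
    )?;
    let note_count: i64 = tx.query_row(
        "SELECT COUNT(*) FROM notes WHERE block_num > ?1",
        [block_start],
        |row| row.get(0),
    )?;
    let nullifier_count: i64 = tx.query_row(
        "SELECT COUNT(*) FROM nullifiers WHERE block_number > ?1",
        [block_start],
        |row| row.get(0),
    )?;

    // Read-only work, so committing just ends the transaction.
    tx.commit()?;
    Ok(StateSyncInfo { chain_tip, note_count, nullifier_count })
}

fn main() -> Result<()> {
    let mut conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE block_headers (block_num INTEGER NOT NULL);
         CREATE TABLE notes (block_num INTEGER NOT NULL);
         CREATE TABLE nullifiers (block_number INTEGER NOT NULL);",
    )?;
    println!("{:?}", get_state_sync_info(&mut conn, 0)?);
    Ok(())
}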

store/src/db.rs Outdated
Comment on lines 49 to 67
/// Inserts a new nullifier to the DB.
///
/// This method may be called multiple times with the same nullifier.
pub async fn add_nullifier(
Contributor

Is this method (and other similar methods) used primarily for testing purposes? If so, we should probably indicate this in the comments and maybe group them together in a "testing" section or something similar.

Eventually, there will be only a single method for updating the data in the store - apply_block. This method will lock the database for update and perform all required updates atomically.

Contributor Author

In this PR it was added primarily for testing purposes; I assume it would also be used for other endpoints in the future (e.g., apply_block can call these methods).

  • Put the methods behind a cfg(test)

@hackaugusto
Contributor Author

hackaugusto commented Oct 29, 2023

There should be no consistency issue as long as we don't have reorgs. I wrote the endpoint to first define the start/end blocks, and all the other loads are for the same range. The data in there would only change with a reorg.

@bobbinth
Contributor

There should be no consistency issue as long as we don't have reorgs. I wrote the endpoint to first define the start/end blocks, and all the other loads are for the same range. The data in there would only change with a reorg.

Yes - as I mentioned in my comment I don't think there is a consistency issue here. But it is still probably better to retrieve all required data in a single transaction. A few reasons:

  • We push enforcement of consistency to the DB rather than leaving it in our code. This will mean less mental overhead in figuring out if the endpoint works as expected. This could prevent weird edge cases arising if we change how the endpoint works in the future.
  • We will have other endpoints which will need to get data from multiple tables in a single transaction, and it is better to use the same approach across all endpoints.
  • This will give us more flexibility in the future if we decide to switch to a different database. For example, in databases supporting stored procedures, the whole request could be fulfilled by the DB itself.

Are there any reasons to prefer multiple independent requests?

@hackaugusto
Contributor Author

hackaugusto commented Oct 30, 2023

For example, in databases supporting stored procedures, the whole request could be fulfilled by the DB itself.

I'm not sure I follow. You mean a single request would return the complete dataset? Each request has a different return type, and I'm not sure how to do that in a SQL DB. Perhaps you're planning on switching to a NoSQL database in the future (something document-based)?

Are there any reasons to prefer multiple independent requests?

Assuming we are using a SQL DB, we would need multiple SELECTs inside a single transaction (one for each data type, which is to say one per table, e.g. nullifiers / accounts). To have the consistency level you're talking about with multiple queries, we would need to set the DB isolation level to SERIALIZABLE, which is the worst case for performance.

To put it differently: the result format is different for each table, so separate SELECTs are necessary, and the default isolation level is READ COMMITTED for PG 1 / REPEATABLE READ for MySQL 2. This means each SELECT statement is internally consistent, but it does not mean that two consecutive SELECTs are consistent with each other in PG. From the PG docs:

[..] two successive SELECT commands can see different data, even though they are within a single transaction, if other transactions commit changes after the first SELECT starts and before the second SELECT starts

There are some other issues with transactions: not only are they a performance hit, they are also a huge bottleneck for migrations. IMO we should not rely on them the way you're suggesting.

Edit: Well, I guess that for MySQL and PostgreSQL the REPEATABLE READ isolation level provides more than the SQL standard requires, i.e. they have SNAPSHOT guarantees, which would be sufficient in this case but is not portable. The point I'm trying to make is that transactions alone are not sufficient to ensure consistency: the behavior of the queries, the server code, and the server/connection configuration (like the isolation level) must be looked at too. Trying to reduce complexity by wrapping everything in a transaction will probably cause subtle bugs.

Footnotes

  1. https://www.postgresql.org/docs/16/transaction-iso.html

  2. https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html

@hackaugusto
Contributor Author

Moved the logic to load the state sync data into a method on the Db struct.

@hackaugusto
Contributor Author

This is the behavior of SQLite:

Except in the case of shared cache database connections with PRAGMA read_uncommitted turned on, all transactions in SQLite show "serializable" isolation. SQLite implements serializable transactions by actually serializing the writes. There can only be a single writer at a time to an SQLite database. There can be multiple database connections open at the same time, and all of those database connections can write to the database file, but they have to take turns. SQLite uses locks to serialize the writes automatically; this is not something that the applications using SQLite need to worry about.

Contributor

@bobbinth left a comment

Thank you! Looks good! I left a few comments inline - all except for one are very minor. The main comment is to update how we select a list of relevant notes - a part of the criteria there should be based on the sender column.

A couple of other things, for which I think we should create separate issues:

  1. Regarding transaction isolation level: if I understood how SQLite works, the behavior I was going for may be provided out of the box in WAL mode (i.e., before a write transaction is committed, all reads - whether in a transaction or not - will see the database in the state prior to the write transaction). So, we may not need to do anything extra except for enabling WAL mode - but I think we should discuss this in an issue.
  2. We should figure out how we want to do indexing. For example, do we need to add an index on the first 16 bits of the note tag and on the note sender? Or should we consider some other approach to quickly finding the "anchor block" for the sync_state endpoint?

For the second point, what worries me the most is the efficiency of doing this:

SELECT
    block_num
FROM
    notes
WHERE
    ((tag >> 48) IN rarray(?1) OR sender IN rarray(?2)) AND
    block_num > ?3
ORDER BY
    block_num ASC
LIMIT
    1
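One way to check this concern empirically, not something this PR does, is to ask SQLite for its query plan and see whether it reports a full table scan; a sketch with a simplified stand-in table:

use std::rc::Rc;

use rusqlite::{types::Value, vtab::array, Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    array::load_module(&conn)?;
    conn.execute_batch(
        "CREATE TABLE notes (
             block_num INTEGER NOT NULL,
             tag INTEGER NOT NULL,
             sender INTEGER NOT NULL
         );",
    )?;

    let tags: Rc<Vec<Value>> = Rc::new(vec![Value::from(1i64)]);
    let senders: Rc<Vec<Value>> = Rc::new(vec![Value::from(2i64)]);

    // EXPLAIN QUERY PLAN reports how SQLite intends to execute the statement,
    // e.g. `SCAN notes` (full scan) vs. `SEARCH notes USING INDEX ...`.
    let mut stmt = conn.prepare(
        "EXPLAIN QUERY PLAN
         SELECT block_num FROM notes
         WHERE ((tag >> 48) IN rarray(?1) OR sender IN rarray(?2)) AND block_num > ?3
         ORDER BY block_num ASC LIMIT 1",
    )?;
    // The fourth column of the EXPLAIN QUERY PLAN output is the plan detail.
    let plan: Vec<String> = stmt
        .query_map((tags, senders, 0i64), |row| row.get(3))?
        .collect::<Result<_>>()?;
    for step in plan {
        println!("{step}");
    }
    Ok(())
}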

@hackaugusto
Contributor Author

hackaugusto commented Oct 31, 2023

a part of the criteria there should be based on the sender column.

Oh, so the client is expected to also send its account_ids as part of the request's tags? And is the idea of searching notes by sender that the client gets confirmation of when its notes are created?

if I understood how SQLite works, the behavior I was going for may be provided out of the box in WAL mode

It is provided by both the WAL and rollback journal modes. The difference is that with WAL a transaction can fail and needs to be retried, while with the rollback journal there is an exclusive lock for the writer, so it never fails, at the cost of increased latency.
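For context, enabling WAL mode from rusqlite is a one-liner; a sketch (the database file name is made up, and since the PRAGMA returns the resulting journal mode, it is read back with query_row):

use rusqlite::{Connection, Result};

fn main() -> Result<()> {
    // File name is illustrative; WAL mode requires an on-disk database.
    let conn = Connection::open("store.sqlite3")?;

    // PRAGMA journal_mode=WAL returns the journal mode now in effect,
    // so read it back with query_row instead of using execute.
    let mode: String = conn.query_row("PRAGMA journal_mode=WAL", [], |row| row.get(0))?;
    assert_eq!(mode.to_lowercase(), "wal");

    // From here on, readers see a consistent snapshot taken at the start of
    // their own read, even while a writer is committing.
    Ok(())
}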

We should figure out how we want to do indexing

I was hoping that we would not have to do that right now. We don't have all the tables defined, nor all the queries, and the ones we have may even change in the future (say, if we decide to change the number of bits in the tag, or something like that). I'm not sure optimizing that right now would be the best approach.

With that said, SQLite has very comprehensive documentation on indexes and the query planner 1 2 3. For the query you mentioned, we have a few things to consider:

  • Which columns would filter the result the most.
    • The > block_num filter is hard to predict: for the time being users will use this endpoint starting from genesis, and later on to track the chain tip. So this filter can potentially scan the whole table or just read one block's notes. To me this means it won't be a good index candidate.
    • The OR by tag and sender are much better candidates.
      • We can very confidently assume a user will send a small fraction of the total number of notes, so filtering by sender is a very good strategy.
      • The tag is more complicated: as it is, we are doing a bitshift operation (>> 48). SQLite supports indexes over expressions 4, which we can take advantage of, but they can be error-prone since indexes are chosen based on the syntactic form of the expression.
      • The two fields above require one index each, and the DB has to filter by each and then perform a union of the results. An alternative approach would be to have a separate table, say CREATE TABLE notes_idx (tag INTEGER, block_num INTEGER), with just the tag data. The idea is that each note would produce two rows in the index table, one row with the value of sender and another with the value of tag >> 48, and we would do a single IN query, which needs a single index and eliminates the need to merge results.

So here are two proposals:

-- covering indexes
CREATE INDEX
  idx_notes_tag_high_16bits
ON
  notes
( tag >> 48, block_num );

CREATE INDEX
  idx_notes_sender
ON
  notes
( sender, block_num )

Or:

CREATE TABLE
  notes_idx
(
  tag INTEGER NOT NULL,
  block_num INTEGER NOT NULL,
  
  PRIMARY KEY (tag, block_num),
  CONSTRAINT notes_idx_tag_is_felt CHECK (tag >= 0 AND tag <= 18446744069414584321),
  CONSTRAINT notes_idx_block_num_is_u32 CHECK (block_num >= 0 AND block_num < 4294967296)
) STRICT, WITHOUT ROWID;

The table above forces the pair (tag, block_num) to be unique, which is not true of the underlying data (e.g. if a user produces two notes in the same block, there are two rows with the same sender and block_num). The thing is that we don't care about that: we only want to learn the block height at which the user produced something, so that is okay. The benefit is that the table's clustered index is exactly what we want to search on, so no extra index is needed.

The table could be populated via a trigger, and we could also add a foreign key to block_num and clear the entries when blocks are deleted (once we implement pruning in the future).
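A sketch of that trigger idea, using the simplified column set from the query above and skipping the STRICT/CHECK details; INSERT OR IGNORE handles the duplicate (tag, block_num) pairs:

use rusqlite::{Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;
    conn.execute_batch(
        "CREATE TABLE notes (
             block_num INTEGER NOT NULL,
             tag INTEGER NOT NULL,
             sender INTEGER NOT NULL
         );

         CREATE TABLE notes_idx (
             tag INTEGER NOT NULL,
             block_num INTEGER NOT NULL,
             PRIMARY KEY (tag, block_num)
         ) WITHOUT ROWID;

         -- Each inserted note produces up to two index rows: one keyed by the
         -- high 16 bits of the tag and one keyed by the sender.
         CREATE TRIGGER notes_idx_populate AFTER INSERT ON notes
         BEGIN
             INSERT OR IGNORE INTO notes_idx (tag, block_num)
                 VALUES (NEW.tag >> 48, NEW.block_num);
             INSERT OR IGNORE INTO notes_idx (tag, block_num)
                 VALUES (NEW.sender, NEW.block_num);
         END;",
    )?;

    conn.execute("INSERT INTO notes (block_num, tag, sender) VALUES (1, ?1, 42)", [7i64 << 48])?;

    let rows: i64 = conn.query_row("SELECT COUNT(*) FROM notes_idx", [], |row| row.get(0))?;
    assert_eq!(rows, 2); // one row for the tag prefix (7), one for the sender (42)
    Ok(())
}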

Footnotes

  1. https://www.sqlite.org/queryplanner.html

  2. https://www.sqlite.org/queryplanner-ng.html

  3. https://www.sqlite.org/optoverview.html

  4. https://www.sqlite.org/expridx.html

@bobbinth
Contributor

so the client is expected to send its account_ids also as part of the request's tags?

Account IDs are already a part of the request - we just use them in two places (to look up account states and to look up notes by sender). Account IDs are not part of the tags.

We should figure out how we want to do indexing

I was hoping that we would not have to do that right now.

Yes, as I mentioned in my comment, we don't need to do this right now (or even in near future) - let's just create an issue to discuss this.

@hackaugusto dismissed bobbinth's stale review on October 31, 2023 16:01

applied requested changes

Contributor

@bobbinth left a comment

All looks good! Thank you! I left one small nit inline - after this, we can merge.

Also, let's create the two issues: one for transaction consistency mode and another one for indexes in the notes table.

Comment on lines 291 to 303
/// Searches for a block after `block_num` which contains note(s) matching [tag]s, and returns all
/// matching notes.
///
/// # Returns
///
/// An empty vector if the blocks after `block_num` don't contain notes matching `tags`.
/// Otherwise the matching notes from the next block are returned. Note that this method returns
/// notes from a single block.
pub async fn get_notes_since_block_by_tag_and_sender(
Contributor

nit: I'd probably mention sender in the comments here.

Contributor Author

changed

@hackaugusto merged commit 0a484fe into main on Oct 31, 2023
@hackaugusto deleted the hacka-sync-state-api branch on October 31, 2023 21:17
block_number INTEGER NOT NULL,

PRIMARY KEY (nullifier),
CONSTRAINT nullifiers_nullifier_valid_digest CHECK (length(nullifier) = 32),
CONSTRAINT nullifiers_block_number_positive CHECK (block_number >= 0),
CONSTRAINT nullifiers_nullifier_is_digest CHECK (length(nullifier) = 32),
Contributor Author

One thing that I forgot to mention: the desire to avoid encoding/decoding data when reading/writing to the DB is leaking. This table can perform a constraint check on the nullifier size because it uses the fixed encoding defined via winterfell's traits.

The tables above, on the other hand, use protobuf's serialization format. I skipped the constraint checks there because my first byte count was wrong, and in some situations the encoding is variable.

Contributor

To clarify: does this mean that some BLOB fields in accounts, notes, and block_headers tables use protobuf serialization format?

Looking through these tables, it wasn't immediately clear to me which fields these would be (maybe block_headers.block_header and notes.merkle_path?).

Contributor Author

Basically everything except the nullifier. Maybe I should change that too; it is only using our binary format to create the nullifier tree, but that is done once during initialization, and the other uses actually need the protobuf format.

Contributor

Basically everything except the nullifier.

Does this include non-blob fields too?

I would think integers would be recorded as integers - but maybe that's not the case?

Regarding blob fields: things like nullifiers, hashes etc. are just 32 bytes which cannot be compressed - so, the format should be the same. But maybe I'm missing something?

Contributor Author

@hackaugusto Nov 8, 2023

I would think integers would be recorded as integers - but maybe that's not the case?

Integers should be fine

Regarding blob fields: things like nullifiers, hashes etc. are just 32 bytes which cannot be compressed - so, the format should be the same. But maybe I'm missing something?

The encoding format is documented here. There is some additional metadata in protobuf's wire format that we don't use. I haven't spent a lot of time trying to fully digest the format, so I have an idea of how many bytes it should be, but I didn't want to add guesswork to the constraints.
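To illustrate the wire-format overhead with a made-up message (not the actual digest.proto definition, which may be structured differently): prost adds a field key and a length prefix on top of the raw bytes, so the encoded size is not simply the in-memory size.

use prost::Message;

// Hypothetical message with a single 32-byte `bytes` field.
#[derive(Clone, PartialEq, Message)]
struct RawDigest {
    #[prost(bytes = "vec", tag = "1")]
    data: Vec<u8>,
}

fn main() {
    let digest = RawDigest { data: vec![0u8; 32] };

    let mut buf = Vec::new();
    digest.encode(&mut buf).expect("Vec<u8> grows as needed");
    // One byte for the field key plus one byte for the length prefix,
    // on top of the 32 payload bytes.
    assert_eq!(buf.len(), digest.encoded_len()); // 34 bytes for this shape, not 32
    println!("encoded_len = {}", digest.encoded_len());
}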

Contributor

Let's create an issue to discuss this. On the one hand, it would be nice to reduce the number of times we encode/decode things. But on the other, storing protobuf formats in the DB doesn't feel right. Maybe there are options which can balance these somehow.


Successfully merging this pull request may close these issues.

Endpoint to perform client state sync