
Benchmarking RON-based chronofold implementation #3

Open
gritzko opened this issue Apr 14, 2020 · 7 comments

gritzko commented Apr 14, 2020

Hi!

Martin @ept pointed me at this benchmark. Thanks, Martin!

Currently I have a rather unpolished RON+Chronofold C++ implementation that does text versioning/synchronization. It works with a slightly different model than Y.js or Automerge (see the draft at https://arxiv.org/abs/2002.09511). In particular, appending a char to a text takes nanoseconds, so there is no point in comparing that case.

I thought this microbenchmark should be close to the worst case for the C++ code:
[B1.4] Insert N characters at random positions (time)

That is because insert-at-index is an expensive operation for RON+CF: it does not keep the text as a string, so positions have to be recalculated from the index. Also, inserting single characters at random positions breaks all chronofold optimizations. To make things comparable, I used the LineBasedText wrapper, which uses {line,col} addressing.
Long story short, the first run of the benchmark reported ~3800 ns per iteration (I use google/benchmark). That is a disappointing number... For N=6000 iterations that is roughly 6000 × 4 µs ≈ 24 ms in the "[B1.4]...(time)" line of the table.

It is not directly comparable, though, as it does not rebuild the entire text on each edit; only the affected line is read back. (That is the key idea of the LBT wrapper; it is made for text editors.)
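
For concreteness, here is a minimal sketch of what such a random-insert loop looks like under google/benchmark. The LineBasedText struct below is a naive std::string stand-in used only to show the shape of the loop, not the actual RON+CF wrapper, and insert(line, col, ch) is an assumed signature:

```cpp
#include <benchmark/benchmark.h>

#include <random>
#include <string>
#include <vector>

// Naive stand-in for a {line,col}-addressed text, for illustration only.
// The real RON+CF LineBasedText wrapper has a very different implementation.
struct LineBasedText {
  std::vector<std::string> lines{std::string()};
  void insert(size_t line, size_t col, char ch) {
    lines[line].insert(lines[line].begin() + col, ch);
  }
};

// Mirrors "[B1.4] Insert N characters at random positions (time)":
// one single-character insertion at a random position per iteration.
static void BM_RandomInserts(benchmark::State& state) {
  std::mt19937 rng(42);
  LineBasedText text;
  size_t len = 0;
  for (auto _ : state) {
    const size_t col = len ? rng() % len : 0;
    text.insert(0, col, 'a');
    ++len;
  }
  state.SetItemsProcessed(state.iterations());
}
BENCHMARK(BM_RandomInserts);

BENCHMARK_MAIN();
```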

I think I can measure the size of the resulting frame too...


dmonad commented Apr 14, 2020

Hi @gritzko,

thanks for sharing your benchmark results!

I will add a link to this thread to the README, so please continue sharing your results. I'd be pretty interested in the document size of the RON-encoded format and the time to rebuild the document from it.

Do you think it makes sense to add Swarm to this benchmark, as it is based on RON?

I think it is a fair assumption for text types that lines contain only a few characters. We could add another benchmark for this more realistic case, e.g. "Insert N characters at random positions in ⌊N/100⌋ existing lines".
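
If it helps, a possible shape for that case, reusing the LineBasedText stand-in from the sketch above (so ⌊N/100⌋ and insert(line, col, ch) are assumptions, not the real API):

```cpp
// Proposed variant: N random single-character insertions spread over
// ⌊N/100⌋ pre-existing lines. N is passed via ->Arg(); a real benchmark
// would operate on the CRDT text type instead of the stand-in.
static void BM_RandomInsertsManyLines(benchmark::State& state) {
  const size_t n = static_cast<size_t>(state.range(0));
  const size_t n_lines = (n / 100) ? n / 100 : 1;  // ⌊N/100⌋ existing lines
  std::mt19937 rng(42);
  LineBasedText text;
  text.lines.assign(n_lines, std::string());
  std::vector<size_t> lens(n_lines, 0);  // current length of each line
  for (auto _ : state) {
    const size_t line = rng() % n_lines;
    const size_t col = lens[line] ? rng() % lens[line] : 0;
    text.insert(line, col, 'a');
    ++lens[line];
  }
}
BENCHMARK(BM_RandomInsertsManyLines)->Arg(6000);
```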


gritzko commented Apr 15, 2020

...the size of the resulting frame in RONt is about 100 KB. That space is mostly taken up by ids. An empty replica named "test" produces ~25 bytes/op updates in "[B1.4] Insert N characters at random positions (avgUpdateSize)". Overall, ~100 KB in "[B1.4] Insert N characters at random positions (docSize)".
Again, this is close to the worst case, as random insertions break the optimizations in the format.
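(For scale, taking those numbers at face value: 6000 single-character ops at ~25 B each is ~150 KB of individual updates, so the ~100 KB merged frame works out to roughly 17 B per op.)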


gritzko commented Apr 15, 2020

Actually, the absolute worst case in RONt is 60 bytes per single-character insertion op.
That is, if we max out every field: numbers, replica ids, non-BMP Unicode chars.


gritzko commented Apr 15, 2020

Do you think it makes sense to add Swarm to this benchmark, as it is based on RON?

I don't think so; gritzko/swarm is no longer supported.


dmonad commented Jul 16, 2020

Hey @gritzko,

would you mind sharing the Chronofold results for applying the real-world dataset that @ept shared? I'd like to create a separate section that compares the different implementations on a real-world dataset as a baseline.

You can find the dataset here https://github.com/dmonad/crdt-benchmarks/blob/master/benchmarks/b4-editing-trace.js (it is copied with permission from the original repository).

I'm especially interested in the size of the final document (docSize) and the time to parse the encoded document (parseTime).

If possible, please share some insights on how much memory your implementation uses.
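
In case it is useful, here is roughly the replay-and-measure loop I mean, assuming each trace entry boils down to (position, delete count, inserted text); Doc, insert, erase, encode, and decode are placeholder names rather than the real RON+CF API, and the exact entry layout should be taken from b4-editing-trace.js:

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <vector>

// Assumed shape of one trace entry; check b4-editing-trace.js for the real layout.
struct Edit {
  size_t pos;          // character index in the document
  size_t del_count;    // number of characters deleted at pos
  std::string insert;  // text inserted at pos (may be empty)
};

// `Doc` stands in for the RON+CF document type; all member names are placeholders.
template <typename Doc>
void replay_and_report(const std::vector<Edit>& trace) {
  Doc doc;
  for (const Edit& e : trace) {
    if (e.del_count) doc.erase(e.pos, e.del_count);
    if (!e.insert.empty()) doc.insert(e.pos, e.insert);
  }

  const std::string frame = doc.encode();  // docSize = frame.size()

  const auto t0 = std::chrono::steady_clock::now();
  Doc rebuilt = Doc::decode(frame);        // parseTime = time to rebuild from the frame
  const auto t1 = std::chrono::steady_clock::now();
  const auto parse_us =
      std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();

  std::printf("docSize: %zu bytes, parseTime: %lld us\n",
              frame.size(), static_cast<long long>(parse_us));
  (void)rebuilt;
}
```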


gritzko commented Jul 18, 2020 via email


dmonad commented Jul 18, 2020

Sorry about that. Here is the correct one: https://github.com/dmonad/crdt-benchmarks/blob/main/benchmarks/b4-editing-trace.js
