
Benchmarking RON-based chronofold implementation #3

Open
gritzko opened this issue Apr 14, 2020 · 7 comments

gritzko commented Apr 14, 2020

Hi!

Martin @ept pointed me at this benchmark. Thanks, Martin!

Currently I have a rather unpolished RON+Chronofold C++ implementation that does text versioning/synchronization. It works with a slightly different model than Y.js or Automerge (see the draft at https://arxiv.org/abs/2002.09511). In particular, appending a char to a text takes nanoseconds, so there is no point in comparing that case.

I thought this microbenchmark should be close to the worst case for the C++ code:
[B1.4] Insert N characters at random positions (time)

That is because insert-at-index is an expensive operation for RON+CF: it does not keep the text as a string, so positions have to be recalculated from the index. Also, inserting single characters at random positions breaks all chronofold optimizations. To make things comparable, I used the LineBasedText wrapper, which uses {line,col} addressing.
Long story short, the first run of the benchmark reported ~3800 ns per iteration (I use google/benchmark). That is a disappointing number... For N=6000 iterations that is roughly 6000 × 4 µs ≈ 24 ms in the "[B1.4]...(time)" line of the table.

It is not directly comparable, though, as it does not rebuild the entire text on each edit; only the affected line is read back. (That is the key idea of the LBT wrapper; it is made for text editors.)
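
For concreteness, here is a minimal sketch of what such a random-insert loop looks like under google/benchmark. The LineBasedText struct below is a naive std::string stand-in used only to show the shape of the loop, not the actual RON+CF wrapper, and insert(line, col, ch) is an assumed signature:

```cpp
#include <benchmark/benchmark.h>

#include <random>
#include <string>
#include <vector>

// Naive stand-in for a {line,col}-addressed text, for illustration only.
// The real RON+CF LineBasedText wrapper has a very different implementation.
struct LineBasedText {
  std::vector<std::string> lines{std::string()};
  void insert(size_t line, size_t col, char ch) {
    lines[line].insert(lines[line].begin() + col, ch);
  }
};

// Mirrors "[B1.4] Insert N characters at random positions (time)":
// one single-character insertion at a random position per iteration.
static void BM_RandomInserts(benchmark::State& state) {
  std::mt19937 rng(42);
  LineBasedText text;
  size_t len = 0;
  for (auto _ : state) {
    const size_t col = len ? rng() % len : 0;
    text.insert(0, col, 'a');
    ++len;
  }
  state.SetItemsProcessed(state.iterations());
}
BENCHMARK(BM_RandomInserts);

BENCHMARK_MAIN();
```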

I think I can measure the size of the resulting frame too...


dmonad commented Apr 14, 2020

Hi @gritzko,

thanks for sharing your benchmark results!

I will add a link to this thread to the README, so please continue sharing your results. I'd be pretty interested in the document size of the RON-encoded format and the time to rebuild the document from it.

Do you think it makes sense to add Swarm to this benchmark, as it is based on RON?

I think it is a fair assumption for text types that lines contain only a few characters. We could add another benchmark for this more realistic case, e.g. "Insert N characters at random positions in ⌊N/100⌋ existing lines".
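
If it helps, a possible shape for that case, reusing the LineBasedText stand-in from the sketch above (so ⌊N/100⌋ and insert(line, col, ch) are assumptions, not the real API):

```cpp
// Proposed variant: N random single-character insertions spread over
// ⌊N/100⌋ pre-existing lines. N is passed via ->Arg(); a real benchmark
// would operate on the CRDT text type instead of the stand-in.
static void BM_RandomInsertsManyLines(benchmark::State& state) {
  const size_t n = static_cast<size_t>(state.range(0));
  const size_t n_lines = (n / 100) ? n / 100 : 1;  // ⌊N/100⌋ existing lines
  std::mt19937 rng(42);
  LineBasedText text;
  text.lines.assign(n_lines, std::string());
  std::vector<size_t> lens(n_lines, 0);  // current length of each line
  for (auto _ : state) {
    const size_t line = rng() % n_lines;
    const size_t col = lens[line] ? rng() % lens[line] : 0;
    text.insert(line, col, 'a');
    ++lens[line];
  }
}
BENCHMARK(BM_RandomInsertsManyLines)->Arg(6000);
```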


gritzko commented Apr 15, 2020

...the size of the resulting frame in RONt is about 100 KB. That space is mostly taken up by ids. An empty replica named "test" produces ~25 bytes/op updates in "[B1.4] Insert N characters at random positions (avgUpdateSize)". Overall, ~100 KB in "[B1.4] Insert N characters at random positions (docSize)".
Again, this is close to the worst case, as random insertions break the optimizations in the format.
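(For scale, taking those numbers at face value: 6000 single-character ops at ~25 B each is ~150 KB of individual updates, so the ~100 KB merged frame works out to roughly 17 B per op.)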


gritzko commented Apr 15, 2020

Actually, the absolute worst case in RONt is 60 bytes per single-character insertion op.
That is, if we max out every field: numbers, replica ids, non-BMP Unicode chars.


gritzko commented Apr 15, 2020

Do you think it makes sense to add Swarm to this benchmark, as it is based on RON?

I don't think so; gritzko/swarm is no longer supported.


dmonad commented Jul 16, 2020

Hey @gritzko,

would you mind sharing the Chronofold results for applying the real-world dataset that @ept shared? I'd like to create a separate section that compares the different implementations on a real-world dataset as a baseline.

You can find the dataset here https://github.com/dmonad/crdt-benchmarks/blob/master/benchmarks/b4-editing-trace.js (it is copied with permission from the original repository).

I'm especially interested in the size of the final document (docSize) and the time to parse the encoded document (parseTime).

If possible, please share some insights on how much memory your implementation uses.
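
In case it is useful, here is roughly the replay-and-measure loop I mean, assuming each trace entry boils down to (position, delete count, inserted text); Doc, insert, erase, encode, and decode are placeholder names rather than the real RON+CF API, and the exact entry layout should be taken from b4-editing-trace.js:

```cpp
#include <chrono>
#include <cstdio>
#include <string>
#include <vector>

// Assumed shape of one trace entry; check b4-editing-trace.js for the real layout.
struct Edit {
  size_t pos;          // character index in the document
  size_t del_count;    // number of characters deleted at pos
  std::string insert;  // text inserted at pos (may be empty)
};

// `Doc` stands in for the RON+CF document type; all member names are placeholders.
template <typename Doc>
void replay_and_report(const std::vector<Edit>& trace) {
  Doc doc;
  for (const Edit& e : trace) {
    if (e.del_count) doc.erase(e.pos, e.del_count);
    if (!e.insert.empty()) doc.insert(e.pos, e.insert);
  }

  const std::string frame = doc.encode();  // docSize = frame.size()

  const auto t0 = std::chrono::steady_clock::now();
  Doc rebuilt = Doc::decode(frame);        // parseTime = time to rebuild from the frame
  const auto t1 = std::chrono::steady_clock::now();
  const auto parse_us =
      std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();

  std::printf("docSize: %zu bytes, parseTime: %lld us\n",
              frame.size(), static_cast<long long>(parse_us));
  (void)rebuilt;
}
```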


gritzko commented Jul 18, 2020 via email


dmonad commented Jul 18, 2020

Sorry about that. Here is the correct one: https://github.com/dmonad/crdt-benchmarks/blob/main/benchmarks/b4-editing-trace.js
