Issue: 109: Checkpointing for HomeObject #115

Merged
merged 27 commits into from
Jan 17, 2024

Contributor

@yamingk yamingk commented Nov 18, 2023

Changes.

  1. Allow HomeObject to participate in checkpointing (CP).
  2. Created a dirty list for the PG superblock (SB), which is currently empty. Any dirty candidate that wants to join CP can be added to the same place, or to any other place that makes sense to its author.
  3. Some refactoring of the test files.
  4. Converted the existing blob_id maintenance to CP (this replaces the periodic timer, but is not a must).

Testing:

  1. Added a basic CP test that creates a PG/Shard and puts some random data in the dirty PG.
  2. Trigger CP.
  3. Restart HomeStore and assert that the recovered PG has the same dirtied values that were set before the restart.
  4. Test passed.
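
For readers unfamiliar with the flow, the test steps above can be sketched roughly as follows. This is a toy model, not the HomeObject or HomeStore API: `ToyStore`, `update`, `cp_flush`, and `restart` are hypothetical names, and the "disk" is just a second in-memory map.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical stand-in for a store with per-PG state; none of these names
// come from HomeObject or HomeStore.
struct ToyStore {
    std::map<uint16_t, uint64_t> memory;  // in-memory value per PG
    std::map<uint16_t, uint64_t> disk;    // what has been persisted so far
    std::set<uint16_t> dirty;             // PGs modified since the last checkpoint

    // Modify a PG in memory and remember that it needs flushing.
    void update(uint16_t pg_id, uint64_t value) {
        memory[pg_id] = value;
        dirty.insert(pg_id);
    }

    // Checkpoint flush: persist every dirty PG, then clear the dirty list.
    void cp_flush() {
        for (auto pg_id : dirty) { disk[pg_id] = memory[pg_id]; }
        dirty.clear();
    }

    // Simulated restart: in-memory state is lost and rebuilt from "disk".
    void restart() {
        memory = disk;
        dirty.clear();
    }
};
```

In this sketch, a value dirtied and then flushed by `cp_flush()` survives `restart()`, while a value dirtied after the last flush does not.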

codecov-commenter commented Nov 29, 2023

Codecov Report

Attention: 13 lines in your changes are missing coverage. Please review.

Comparison is base (68d9b2d) 78.06% compared to head (82968cd) 77.88%.

| Files | Patch % | Lines |
|---|---|---|
| src/lib/homestore_backend/hs_blob_manager.cpp | 60.00% | 5 Missing and 1 partial ⚠️ |
| src/lib/homestore_backend/hs_hmobj_cp.cpp | 77.77% | 4 Missing and 2 partials ⚠️ |
| src/lib/homestore_backend/hs_homeobject.hpp | 93.33% | 0 Missing and 1 partial ⚠️ |


Additional details and impacted files
@@            Coverage Diff             @@
##             main     #115      +/-   ##
==========================================
- Coverage   78.06%   77.88%   -0.18%     
==========================================
  Files          29       30       +1     
  Lines        1217     1266      +49     
  Branches      127      130       +3     
==========================================
+ Hits          950      986      +36     
- Misses        199      208       +9     
- Partials       68       72       +4     


Collaborator

@JacksonYao287 JacksonYao287 left a comment

@yamingk
1. We already have a timer that periodically flushes the PG superblock. If we have CP for HomeObject, we could remove that timer.

2. The dirty list cannot guarantee that an update is flushed to disk (a crash can happen before trigger_cp_flush), so we need a separate log to make sure all updates are recovered on restart. If we use CP for HomeObject, we should implement that log, right?

src/lib/homestore_backend/hs_pg_manager.cpp (outdated, resolved)
src/lib/homestore_backend/hs_hmobj_cp.hpp (outdated, resolved)
Contributor Author

yamingk commented Dec 2, 2023

@JacksonYao287
> 1. we already have a timer for flushing pg super block periodically, if we have CP for homeobject, we could remove that timer.

As discussed in the HS meeting, @raakella1 is going to combine the timer with CP.

> 2. the dirty list can not guarantee that it will be flushed to disk (crash before trigger_cp_flush), so we need a separate log to make sure all the updates will be recovered when restart. if we use cp for homeobject, we should implement that log, right?

Yes. Everything written through CP has to be protected by a log, unless the consumer is aware of this and wants to guarantee idempotency itself.
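
A minimal sketch of what "protected by a log" could look like, assuming each update is appended to a durable log with a monotonically increasing sequence number before it is applied in memory. `ToyLoggedState`, `LogEntry`, and the `lsn` fields are hypothetical and not taken from HomeStore; the point is only that replay skips entries the last checkpoint already covers, which makes recovery safe to repeat.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical log record: a sequence number plus the new value.
struct LogEntry {
    uint64_t lsn;
    uint64_t new_value;
};

// Toy state protected by a log and a checkpoint; not HomeStore code.
struct ToyLoggedState {
    uint64_t value = 0;            // in-memory value
    uint64_t last_lsn = 0;         // last sequence number handed out
    uint64_t persisted_value = 0;  // value written by the last checkpoint
    uint64_t persisted_lsn = 0;    // highest lsn covered by the last checkpoint
    std::vector<LogEntry> log;     // assumed durable before update() returns

    void update(uint64_t v) {
        log.push_back({++last_lsn, v});  // log first
        value = v;                       // then apply in memory
    }

    void cp_flush() {
        persisted_value = value;
        persisted_lsn = last_lsn;
    }

    // Recovery: start from the checkpoint, then replay only newer log entries.
    void recover() {
        value = persisted_value;
        uint64_t applied = persisted_lsn;
        for (const auto& e : log) {
            if (e.lsn <= applied) { continue; }  // already covered by the CP
            value = e.new_value;
            applied = e.lsn;
        }
        last_lsn = applied;
    }
};
```

Calling `recover()` twice in a row yields the same state in this sketch, because replay depends only on the checkpoint and the log.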

Contributor

@raakella1 raakella1 left a comment

Minor comment, just for knowledge purpose, LG otherwise

src/lib/homestore_backend/hs_hmobj_cp.hpp (resolved)
replica_set_uuid = rhs.replica_set_uuid;
index_table_uuid = rhs.index_table_uuid;
blob_sequence_num = rhs.blob_sequence_num;
memcpy(members, rhs.members, sizeof(pg_members) * num_members);
Collaborator

Is it safe to copy sizeof(pg_members) * num_members bytes into members, which can hold only one pg_members?

Contributor Author

@yamingk yamingk Jan 4, 2024

The original author's idea is that members always comes from the heap, and the memory it points to is always sized for num_members entries.
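
To illustrate the trailing-array idiom being discussed, here is a rough sketch under assumptions: `ToySuperblk`, `ToyMember`, `make_superblk`, and `copy_members` are hypothetical and do not mirror the real pg_info_superblk layout. The copy is only safe when the destination was allocated for at least as many members as the source, and indexing past the declared one-element array relies on a widely used idiom that the C++ standard does not strictly bless.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical member record.
struct ToyMember {
    uint64_t id;
    uint32_t priority;
};

// Hypothetical superblock with a trailing array: declared with one slot,
// but always heap-allocated with room for num_members slots.
struct ToySuperblk {
    uint32_t num_members;
    ToyMember members[1];

    static std::size_t size_for(uint32_t n) {
        return sizeof(ToySuperblk) + (n > 1 ? n - 1 : 0) * sizeof(ToyMember);
    }
    std::size_t size() const { return size_for(num_members); }
};

// Allocate a zeroed block large enough for n members (caller frees with std::free).
ToySuperblk* make_superblk(uint32_t n) {
    auto* sb = static_cast<ToySuperblk*>(std::calloc(1, ToySuperblk::size_for(n)));
    if (sb != nullptr) { sb->num_members = n; }
    return sb;
}

// Precondition: dst was allocated for at least src->num_members members.
void copy_members(ToySuperblk* dst, const ToySuperblk* src) {
    dst->num_members = src->num_members;
    std::memcpy(dst->members, src->members, sizeof(ToyMember) * src->num_members);
}
```

If the destination were a plain stack object of type `ToySuperblk`, the same `memcpy` would overflow it whenever the source holds more than one member, which is the concern raised in the question.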

src/lib/homestore_backend/hs_pg_manager.cpp (resolved)
Comment on lines +184 to +185
cache_pg_sb_ = (pg_info_superblk*)malloc(pg_sb_->size());
cache_pg_sb_->copy(*(pg_sb_.get()));
Collaborator

I would prefer to remove copy and overload operator new for pg_info_superblk. I think that would be more readable.
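
One possible reading of this suggestion, sketched with hypothetical names (`ToyPgSb`, `MemberCount`, `make_pg_sb`) rather than the real pg_info_superblk: a class-specific `operator new` takes the member count and allocates the extra trailing space itself, so call sites no longer pair `malloc` with a manual size computation. The tag type is an assumption made here to keep the placement form from colliding with sized `operator delete`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical tag type carrying the number of trailing members to allocate.
struct MemberCount {
    uint32_t n;
};

// Hypothetical superblock with a one-slot trailing array; not the real layout.
struct ToyPgSb {
    uint32_t num_members;
    uint64_t members[1];

    explicit ToyPgSb(uint32_t n) : num_members(n) {}

    // Class-specific allocation: base object plus room for the extra members.
    static void* operator new(std::size_t base, MemberCount c) {
        return ::operator new(base + (c.n > 1 ? c.n - 1 : 0) * sizeof(uint64_t));
    }
    // Matching placement delete, used only if the constructor throws.
    static void operator delete(void* p, MemberCount) { ::operator delete(p); }
    // Ordinary delete for `delete sb;`.
    static void operator delete(void* p) { ::operator delete(p); }
};

// Convenience wrapper so the count is written once.
ToyPgSb* make_pg_sb(uint32_t n) { return new (MemberCount{n}) ToyPgSb(n); }
```

With this shape, a caller writes `ToyPgSb* sb = make_pg_sb(4);` and later `delete sb;`. Whether it reads better than an explicit `copy` helper is a judgment call for the reviewers.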

src/lib/homestore_backend/hs_hmobj_cp.hpp (resolved)
Contributor

@raakella1 raakella1 left a comment

LG!

@yamingk yamingk merged commit b1ea8f9 into eBay:main Jan 17, 2024
24 checks passed
@yamingk yamingk deleted the yk_cp branch January 17, 2024 17:45
Development

Successfully merging this pull request may close these issues.

Checkpointing for HomeObject and merge blob_sequence_num into checkpointing
5 participants