Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: state sync #1026

Open
wants to merge 45 commits into
base: v1.5-dev
Choose a base branch
from
Open

feat!: state sync #1026

wants to merge 45 commits into from

Conversation

lklimek
Copy link
Collaborator

@lklimek lklimek commented Jan 15, 2025

Issue being fixed or feature implemented

What was done?

  • added new call, FinalizeSnapshot, that is sent to ABCI App once state sync is finished
  • fixed bug in QuorumVerify call
  • updated e2e tests to test state sync
  • removed unused config settings

How Has This Been Tested?

Updated e2e tests, updated unit test.

Breaking Changes

  • removed unused configuration settings from [statesync] section: trust-height, trust-hash, trust-period
  • added new ABCI method FinalizeSnapshot
  • bumped ABCI protocol version to 1.3.0

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation

For repository code-owners and collaborators only

  • I have assigned this pull request to a milestone

Summary by CodeRabbit

  • New Features

    • Added snapshot finalization functionality to the ABCI interface.
    • Enhanced state synchronization process with light block validation.
    • Introduced new methods for finalizing snapshots across multiple components.
    • Added new methods for handling light blocks in the state provider.
    • Introduced new request and response types for snapshot finalization in the ABCI protocol.
  • Improvements

    • Simplified state sync configuration by removing trust parameters.
    • Improved error handling during state synchronization.
    • Enhanced integer overflow handling in the math package.
  • Version Update

    • ABCI library version upgraded from 1.2.0 to 1.3.0.
  • Dependency Updates

    • Updated several Go module dependencies to newer versions.

Copy link
Contributor

coderabbitai bot commented Jan 15, 2025

Walkthrough

This pull request introduces a comprehensive enhancement to the state synchronization process in the Tenderdash blockchain framework. The changes primarily focus on adding a new FinalizeSnapshot method across multiple components, removing trust-related configuration parameters, and updating the ABCI (Application Blockchain Interface) protocol. The modifications aim to streamline snapshot synchronization, provide more robust validation, and simplify the configuration process for state sync.

Changes

File Change Summary
abci/client/* Added FinalizeSnapshot method to various client implementations (gRPC, routed, socket, proxy)
abci/types/application.go Added FinalizeSnapshot to StateSyncer interface and BaseApplication
abci/types/messages.go Added ToRequestFinalizeSnapshot function for converting requests
abci/types/mocks/application.go Added mock implementation for FinalizeSnapshot
config/config.go Removed trust-related fields (TrustHeight, TrustHash, TrustPeriod)
config/toml.go Removed trust-related configuration parameters
internal/statesync/* Significant refactoring of state sync process, added LightBlock method, modified syncer logic
proto/tendermint/abci/types.proto Added RequestFinalizeSnapshot and ResponseFinalizeSnapshot messages
spec/abci++/api.md Documented new request and response types
version/version.go Updated ABCI semantic version from 1.2.0 to 1.3.0
go.mod Updated various dependencies
test/e2e/* Updated tests to reflect new state synchronization changes

Sequence Diagram

sequenceDiagram
    participant App as ABCI Application
    participant Syncer as State Syncer
    participant Provider as State Provider

    Syncer->>Provider: Retrieve LightBlock
    Provider-->>Syncer: Return LightBlock
    Syncer->>App: FinalizeSnapshot(snapshot_block, genesis_block)
    App->>App: Validate Snapshot Integrity
    alt Validation Successful
        App-->>Syncer: ResponseFinalizeSnapshot (Success)
    else Validation Failed
        App-->>Syncer: ResponseFinalizeSnapshot (Error)
        Syncer->>Syncer: Restart Process
    end
Loading

Poem

🐰 Snapshot Sync, a Rabbit's Tale
Through light blocks and genesis' trail,
Trust parameters fade away,
A new sync method comes to play!
Finalize with grace, my friend! 🌟

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@lklimek
Copy link
Collaborator Author

lklimek commented Jan 20, 2025

@coderabbitai review

Copy link
Contributor

coderabbitai bot commented Jan 20, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
abci/types/application.go (1)

102-104: Consider adding documentation for the base implementation.

While the empty implementation is appropriate for a base class, adding a comment explaining that this is a default no-op implementation would improve clarity.

+// FinalizeSnapshot provides a no-op implementation for the StateSyncer interface
 func (BaseApplication) FinalizeSnapshot(_ context.Context, _req *RequestFinalizeSnapshot) (*ResponseFinalizeSnapshot, error) {
   return &ResponseFinalizeSnapshot{}, nil
 }
proto/tendermint/abci/types.proto (1)

642-644: Consider adding status field to ResponseFinalizeSnapshot.

The empty response provides no feedback mechanism for the application to indicate success/failure of the finalization process.

Consider adding a status field:

message ResponseFinalizeSnapshot {
+  bool success = 1;  // Indicates if snapshot finalization was successful
+  string error = 2;  // Error message if finalization failed
}
abci/example/kvstore/kvstore.go (1)

583-598: Implementation looks good, but error messages could be more descriptive.

The FinalizeSnapshot implementation correctly verifies both height and app hash, with proper thread safety using mutex. However, the error messages could be more informative.

Consider enhancing the error messages with more details:

-    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch")
+    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d", 
+        app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height)

-    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch")
+    return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
+        app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6890a7 and e84e0be.

⛔ Files ignored due to path filters (2)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (30)
  • abci/client/grpc_client.go (1 hunks)
  • abci/client/mocks/client.go (1 hunks)
  • abci/client/routed_client.go (1 hunks)
  • abci/client/socket_client.go (1 hunks)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • abci/types/application.go (2 hunks)
  • abci/types/messages.go (1 hunks)
  • abci/types/mocks/application.go (1 hunks)
  • config/config.go (0 hunks)
  • config/toml.go (0 hunks)
  • go.mod (3 hunks)
  • internal/proxy/client.go (1 hunks)
  • internal/statesync/chunks.go (2 hunks)
  • internal/statesync/mocks/stateprovider.go (1 hunks)
  • internal/statesync/reactor.go (2 hunks)
  • internal/statesync/reactor_test.go (1 hunks)
  • internal/statesync/stateprovider.go (5 hunks)
  • internal/statesync/syncer.go (4 hunks)
  • internal/statesync/syncer_test.go (4 hunks)
  • light/client.go (1 hunks)
  • light/client_test.go (1 hunks)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
  • test/e2e/networks/rotate.toml (2 hunks)
  • test/e2e/pkg/mockcoreserver/core_server.go (1 hunks)
  • test/e2e/pkg/mockcoreserver/methods.go (1 hunks)
  • test/e2e/runner/setup.go (0 hunks)
  • test/e2e/runner/start.go (1 hunks)
  • types/quorum_sign_data.go (1 hunks)
  • version/version.go (1 hunks)
💤 Files with no reviewable changes (3)
  • test/e2e/runner/setup.go
  • config/toml.go
  • config/config.go
✅ Files skipped from review due to trivial changes (2)
  • light/client_test.go
  • light/client.go
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


439-439: null
Link fragments should be valid

(MD051, link-fragments)


945-945: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: tests (02)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (31)
version/version.go (1)

14-14: ABCI version bump aligns with new functionality.

The version update from 1.2.0 to 1.3.0 correctly follows semantic versioning for introducing the new FinalizeSnapshot method.

Let's verify this version is consistently used across the codebase:

✅ Verification successful

ABCI version update is consistently implemented.

The version bump from 1.2.0 to 1.3.0 is properly defined and consistently used across the codebase through the centralized version.ABCIVersion constant. No conflicting version strings were found.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for any hardcoded ABCI version strings
rg -g '!version/version.go' '"1\.[0-9]\.[0-9]"' 

Length of output: 48492


Script:

#!/bin/bash
# Search for ABCI version references more specifically
rg -i "abci.*version|version.*abci"

# Also check for any proto definitions that might reference ABCI versions
fd -e proto | xargs rg -i "version|abci"

Length of output: 37694

test/e2e/runner/start.go (1)

91-91: Simplified error handling looks good.

The removal of intermediate block variables and direct error checking from waitForHeight aligns with the removal of trust parameters from state sync configuration.

test/e2e/pkg/mockcoreserver/methods.go (1)

68-68: Correctly includes QuorumHash in unmarshal parameters.

The addition of &cmd.QuorumHash ensures proper unmarshalling of the quorum verification request.

abci/types/messages.go (1)

90-94: New ToRequestFinalizeSnapshot function follows established patterns.

The implementation correctly converts RequestFinalizeSnapshot to the generic Request type, maintaining consistency with other message conversion functions in the file.

abci/types/application.go (1)

19-20: LGTM! Well-documented interface addition.

The new FinalizeSnapshot method in the StateSyncer interface is clearly documented and aligns with the state sync enhancement objectives.

test/e2e/pkg/mockcoreserver/core_server.go (1)

149-150: LGTM! Good simplification of return value.

The change simplifies the code by directly returning the boolean value instead of wrapping it in a struct, while maintaining the same functionality.

internal/statesync/mocks/stateprovider.go (1)

148-205: LGTM! Correctly generated mock implementation.

The new LightBlock mock method follows the established patterns for error handling and return values, properly supporting the state sync enhancement.

types/quorum_sign_data.go (1)

164-164: Verify the impact of changing the hash size constant.

The validation now uses crypto.QuorumHashSize instead of crypto.DefaultHashSize. Please ensure:

  1. The new constant exists and has the correct value
  2. All existing quorum hashes comply with the new size requirement
✅ Verification successful

QuorumHashSize constant is correctly defined and used consistently

The validation now uses the semantically correct crypto.QuorumHashSize constant which equals crypto.DefaultHashSize (32 bytes). The codebase consistently validates quorum hashes against this size through multiple validation checks, and test files demonstrate correct usage of the constant.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and usage of QuorumHashSize

# Check if QuorumHashSize is defined
rg "QuorumHashSize.*=.*\d+" -A 2

# Find all quorum hash declarations to verify size compatibility
ast-grep --pattern 'QuorumHash []byte'

Length of output: 75


Script:

#!/bin/bash
# Broader search for hash size constants and usages

# Find QuorumHashSize constant definition and usage
rg "QuorumHashSize" -B 2 -A 2

# Find DefaultHashSize for comparison
rg "DefaultHashSize" -B 2 -A 2

# Find quorum hash related code with more context
rg "QuorumHash" -B 2 -A 2

# Look specifically in crypto package
fd -t f . "crypto" --exec cat {}

Length of output: 80387

internal/proxy/client.go (1)

215-218: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Consistent error handling
  • Proper timing metrics
  • Correct delegation to the underlying client
abci/client/grpc_client.go (1)

258-260: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Proper request conversion using ToRequestFinalizeSnapshot
  • Consistent use of grpc.WaitForReady(true) option
  • Correct error handling
internal/statesync/chunks.go (2)

136-144: LGTM! Improved handling of empty chunks.

The changes improve the robustness of chunk handling:

  • Maintains essential nil check for the chunk itself
  • Safely handles empty chunk content by replacing nil with empty byte slice
  • Prevents potential nil pointer dereference while allowing valid empty chunks

164-164: LGTM! Consistent use of modified data.

Correctly uses the potentially modified data variable for writing, ensuring consistency with the empty chunk handling changes.

abci/client/routed_client.go (1)

354-357: LGTM! Implementation follows established patterns.

The FinalizeSnapshot method is well-implemented, following the same patterns as other methods in the file:

  • Consistent use of delegate pattern
  • Proper type assertion for response
  • Correct error handling
internal/statesync/stateprovider.go (3)

43-44: LGTM: Well-structured interface addition

The new LightBlock method follows the existing interface pattern and is properly documented.


126-131: LGTM: Thread-safe implementation of LightBlock

The implementation correctly reuses the existing verifyLightBlockAtHeight method and maintains thread safety through proper mutex usage.


271-276: LGTM: Consistent implementation across providers

The P2P implementation maintains consistency with the RPC implementation, ensuring proper thread safety and error handling.

internal/statesync/syncer.go (2)

353-354: LGTM: Enhanced sync process with light blocks

The change improves the sync process by retrieving complete light blocks instead of just commits, providing more comprehensive block data for verification.


581-599: LGTM: Robust snapshot finalization implementation

The implementation properly handles snapshot finalization by sending light blocks to the ABCI app and includes appropriate logging.

abci/types/mocks/application.go (1)

Line range hint 261-422: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.

The implementation correctly handles:

  • Error cases with appropriate panic messages
  • Type assertions and conversions
  • Method signatures matching the interface
  • Supporting mock types for testing
internal/statesync/reactor_test.go (1)

239-240: LGTM! Mock expectation for FinalizeSnapshot is correctly configured.

The mock is set up to be called exactly once and return an empty response, which is appropriate for this test scenario.

internal/statesync/reactor.go (1)

239-240: LGTM! Trust-related configuration has been cleanly removed.

The changes consistently remove trust-related parameters from both P2P and RPC state provider initialization while maintaining proper logging.

Also applies to: 252-253, 261-261

abci/client/mocks/client.go (1)

365-422: LGTM! The new FinalizeSnapshot mock implementation follows established patterns.

The implementation correctly handles:

  • Error cases with appropriate panic messages
  • Type assertions and conversions
  • Method signatures matching the interface
  • Supporting mock types for testing
test/e2e/networks/rotate.toml (1)

155-155: LGTM! State sync configuration changes look good.

The changes consistently configure p2p-based state sync for both validator06 and full01 nodes, aligning with the removal of trust-based parameters.

Also applies to: 195-195

abci/client/socket_client.go (1)

360-366: LGTM! Clean implementation of FinalizeSnapshot method.

The implementation follows the established pattern for ABCI methods in the socket client, with proper error handling and request/response flow.

go.mod (1)

12-13: Verify compatibility of updated dependencies.

Multiple dependency updates have been made. Please ensure:

  1. The new version of dashd-go (v0.26.1-0.20250113154619-015ef93ad36e) is stable and compatible
  2. The btcec/v2 v2.2.0 update doesn't introduce breaking changes
  3. The upgrade to stable versions (btclog v1.0.0) is intentional

Also applies to: 83-83, 111-111

✅ Verification successful

Dependencies update appears safe to proceed

The verification shows these are indirect dependencies with no direct usage in the codebase. The version updates follow semantic versioning principles, suggesting backward compatibility. While breaking changes exist in dashd-go, they don't appear to affect this codebase as there's no direct usage of the updated components.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for any known issues or breaking changes in the updated dependencies

# Check dashd-go changelog/releases
gh api repos/dashpay/dashd-go/releases | jq -r '.[].body' | grep -i "breaking change\|security"

# Check btcec changelog
gh api repos/dashpay/dashd-go/contents/btcec/v2/CHANGELOG.md | jq -r '.content' | base64 -d | grep -A 5 "## v2.2.0"

Length of output: 356


Script:

#!/bin/bash
# Search for import patterns and potential breaking changes
echo "=== Checking import patterns ==="
rg -l "github.com/dashpay/dashd-go/btcec/v2" --type go

echo -e "\n=== Checking documentation ==="
fd -e md -e txt . | xargs rg -l "dashd-go|btcec" || true

echo -e "\n=== Checking recent changes ==="
git log -n 5 --oneline -- go.mod

echo -e "\n=== Checking current usages ==="
rg "btcec\..*" --type go -A 2

Length of output: 454


Script:

#!/bin/bash
# Check for indirect dependencies and upgrade guides
echo "=== Checking go.mod for indirect dependencies ==="
cat go.mod | grep -A 10 "require"

echo -e "\n=== Looking for upgrade documentation ==="
fd -e md . | xargs rg -l "upgrade|migration|breaking|changelog" || true

echo -e "\n=== Checking dashd-go main repository ==="
gh api repos/dashpay/dashd-go/contents/docs | jq -r '.[].name' || true
gh api repos/dashpay/dashd-go/contents/UPGRADING.md | jq -r '.content' | base64 -d || true

Length of output: 4572

proto/tendermint/abci/types.proto (2)

187-193: LGTM! Well-defined RequestFinalizeSnapshot message.

The message includes the essential light_block field needed for state reconstruction after snapshot sync.


908-908: LGTM! Service definition properly updated.

The FinalizeSnapshot RPC method is correctly added to the ABCIApplication service.

internal/statesync/syncer_test.go (4)

103-106: LGTM: Commit structure properly includes Height field.

The commit structure now correctly includes the Height field, which is essential for proper block validation.


108-115: LGTM: Well-structured LightBlock implementation.

The LightBlock structure properly encapsulates both the Commit and Header, following the standard blockchain block structure pattern.


131-132: LGTM: StateProvider mock updated to use LightBlock.

The mock has been correctly updated to return a LightBlock instead of just the commit, aligning with the new state sync protocol.


268-270: LGTM: FinalizeSnapshot mock properly configured.

The FinalizeSnapshot mock is correctly set up with proper expectations and return values.

spec/abci++/api.md Outdated Show resolved Hide resolved
@lklimek lklimek added this to the v1.5 milestone Jan 20, 2025
@lklimek lklimek marked this pull request as ready for review January 20, 2025 08:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
proto/tendermint/abci/types.proto (2)

186-196: Fix documentation formatting.

The documentation has some minor formatting issues:

  1. Missing period after "e.g"
  2. Missing blank line before heading
 // RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, eg.
+// RequestFinalizeSnapshot is called by Tenderdash after successfully applying all snapshot chunks, e.g.,
 // when the ABCI application returned `ResponseApplySnapshotChunk` with `Result = ACCEPT`.
 // It includes the light block committed at the synced height, which Tenderdash uses to reconstruct its own state.
 // The application should validate the light block against its restored state.
+
 // If the application fails to validate the light block, it should return error in the response.
 // This is considered fatal, non-recoverable consensus failure and will cause Tenderdash to restart.

Line range hint 947-956: Fix documentation formatting.

Add a blank line before the heading.

 <a name="tendermint-abci-ResponseFinalizeSnapshot"></a>
+
 ### ResponseFinalizeSnapshot
 The response to a `RequestPrepareSnapshot` message.
abci/example/kvstore/kvstore.go (2)

583-600: Consider enhancing the logging for better observability.

The debug logging could be more detailed to help with troubleshooting. Consider adding logging for the validation steps and error cases.

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()
 
+    app.logger.Debug("Starting snapshot finalization",
+        "expected_height", app.LastCommittedState.GetHeight(),
+        "expected_app_hash", app.LastCommittedState.GetAppHash(),
+        "snapshot_height", req.LightBlock.SignedHeader.Header.Height,
+        "snapshot_app_hash", req.LightBlock.SignedHeader.Header.AppHash)
+
     // we verify snapshot height and app hash, as there is no additional logic to be called here
     if app.LastCommittedState.GetHeight() != req.LightBlock.SignedHeader.Header.Height {
+        app.logger.Error("Snapshot height mismatch",
+            "expected", app.LastCommittedState.GetHeight(),
+            "got", req.LightBlock.SignedHeader.Header.Height)
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.LightBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.LightBlock.SignedHeader.Header.AppHash) {
+        app.logger.Error("Snapshot app hash mismatch",
+            "expected", app.LastCommittedState.GetAppHash(),
+            "got", req.LightBlock.SignedHeader.Header.AppHash)
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.LightBlock.SignedHeader.Header.AppHash)
     }
 
     app.logger.Debug("FinalizeSnapshot finished successfully", "req", req)
     return &abci.ResponseFinalizeSnapshot{}, nil
 }

583-600: Well-integrated state sync implementation.

The FinalizeSnapshot method is well-integrated into the state sync workflow, following the same patterns as other ABCI methods (mutex locking, error handling, logging). It complements the existing snapshot-related methods (OfferSnapshot, ApplySnapshotChunk, etc.) to provide a complete state synchronization solution.

The implementation demonstrates good practices:

  1. Consistent error handling across all snapshot-related methods
  2. Thread-safety through mutex usage
  3. Clear separation of concerns between different snapshot operations
  4. Proper validation of snapshot metadata before finalization
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e84e0be and af79697.

⛔ Files ignored due to path filters (1)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (4)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • abci/types/application.go (2 hunks)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
🧰 Additional context used
🪛 LanguageTool
spec/abci++/api.md

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...

(E_G)

🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

🔇 Additional comments (5)
abci/types/application.go (2)

19-20: LGTM! Interface addition is well-documented.

The FinalizeSnapshot method is properly added to the StateSyncer interface with clear documentation.


102-105: LGTM! Base implementation provides good default behavior.

The no-op implementation in BaseApplication follows the pattern used for other methods.

proto/tendermint/abci/types.proto (1)

Line range hint 252-252: LGTM! Protocol changes are well-structured.

The new finalize_snapshot field is properly added to Request/Response messages and the ABCIApplication service.

Also applies to: 836-836, 1402-1402

spec/abci++/api.md (1)

431-444: Documentation needs enhancement.

The FinalizeSnapshot documentation should be expanded to include:

  1. When exactly Tenderdash calls this method in the state sync process
  2. What the application should do with the light block
  3. Error handling expectations
  4. Impact on the node's state

Consider adding a detailed section similar to other ABCI methods:

### RequestFinalizeSnapshot

#### Usage
- Called by Tenderdash after successfully applying all snapshot chunks
- The light block allows the application to verify the final state
- Applications should validate the light block against their restored state
- Any errors during finalization should trigger a state sync retry

#### When does Tenderdash call it?
Detailed sequence of events...
🧰 Tools
🪛 LanguageTool

[duplication] ~431-~431: Possible typo: you repeated a word.
Context: ...abci-RequestFinalizeSnapshot"> ### RequestFinalizeSnapshot RequestFinalizeSnapshot is called by Tenderdash after successfu...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~432-~432: The abbreviation “e.g.” (= for example) requires two periods.
Context: ...cessfully applying all snapshot chunks, eg. when the ABCI application returned `Res...

(E_G)

🪛 Markdownlint (0.37.0)

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)

abci/example/kvstore/kvstore.go (1)

583-600: LGTM! The implementation looks solid.

The FinalizeSnapshot method correctly validates both the height and AppHash of the snapshot against the last committed state, with proper error handling and thread safety.

A few notable strengths of this implementation:

  1. Thread-safe with mutex locking
  2. Clear, descriptive error messages
  3. Proper debug logging
  4. Consistent with ABCI protocol requirements

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
proto/tendermint/abci/types.proto (1)

186-196: Documentation needs enhancement.

The documentation should be expanded to include:

  1. When exactly Tenderdash calls this method in the state sync process
  2. What the application should do with the light block
  3. Error handling expectations
  4. Impact on the node's state

Consider adding a detailed section similar to other ABCI methods:

### RequestFinalizeSnapshot

#### Usage
- Called by Tenderdash after successfully applying all snapshot chunks
- The light block allows the application to verify the final state
- Applications should validate the light block against their restored state
- Any errors during finalization should trigger a state sync retry

#### When does Tenderdash call it?
Detailed sequence of events...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af79697 and a9cf16f.

⛔ Files ignored due to path filters (1)
  • abci/types/types.pb.go is excluded by !**/*.pb.go
📒 Files selected for processing (2)
  • proto/tendermint/abci/types.proto (5 hunks)
  • spec/abci++/api.md (7 hunks)
🧰 Additional context used
🪛 Markdownlint (0.37.0)
spec/abci++/api.md

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


949-949: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)

⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: test_apps
  • GitHub Check: test_abci_cli
  • GitHub Check: tests (05)
  • GitHub Check: tests (04)
  • GitHub Check: tests (03)
  • GitHub Check: tests (02)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: tests (01)
  • GitHub Check: Super linter
  • GitHub Check: tests (00)
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: golangci-lint
🔇 Additional comments (5)
proto/tendermint/abci/types.proto (3)

35-35: LGTM! Field addition follows protobuf best practices.

The finalize_snapshot field is properly added to the Request message with an appropriate field number.


529-529: LGTM! Field addition follows protobuf best practices.

The finalize_snapshot field is properly added to the Response message with an appropriate field number.


Line range hint 1402-1402: LGTM! Service definition is properly updated.

The FinalizeSnapshot RPC is correctly added to the ABCIApplication service.

spec/abci++/api.md (2)

429-449: Documentation needs enhancement.

The documentation feedback from the proto file review applies here as well.

🧰 Tools
🪛 Markdownlint (0.37.0)

431-431: Expected: 1; Actual: 0; Below
Headings should be surrounded by blank lines

(MD022, blanks-around-headings)


443-443: null
Link fragments should be valid

(MD051, link-fragments)


21-21: LGTM! Table of contents is properly updated.

The table of contents correctly reflects the new FinalizeSnapshot messages.

Also applies to: 39-39

proto/tendermint/abci/types.proto Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
abci/example/kvstore/kvstore.go (2)

583-606: LGTM! Consider additional hash verification.

The implementation correctly verifies height and app hash consistency between the restored state and provided blocks. However, consider adding verification of the genesis block's app hash as well.

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()
 
     // we only run some verifications here
 
     if app.LastCommittedState.GetHeight() != req.SnapshotBlock.SignedHeader.Header.Height {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }
 
     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
         return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch: expected %d, got %d",
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
+
+    // Verify genesis block app hash
+    genesisAppHash := make([]byte, crypto.DefaultAppHashSize)
+    if !bytes.Equal(req.GenesisBlock.SignedHeader.Header.AppHash, genesisAppHash) {
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis apphash mismatch: expected %x, got %x",
+            genesisAppHash, req.GenesisBlock.SignedHeader.Header.AppHash)
+    }
 
     app.logger.Debug("FinalizeSnapshot finished successfully", "req", req)
     return &abci.ResponseFinalizeSnapshot{}, nil
 }

589-602: Enhance error message format for better debugging.

The error messages could be more descriptive by including additional context.

-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch during state sync finalization: expected %d, got %d",
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }
 
     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch: expected %x, got %x",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch during state sync finalization: expected %x, got %x",
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }
 
     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch during state sync finalization: expected %d, got %d",
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d3dae4 and 94c5f69.

📒 Files selected for processing (2)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • internal/statesync/syncer_test.go (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (5)
internal/statesync/syncer_test.go (5)

86-86: LGTM: Proper initialization of InitialHeight.

The addition of InitialHeight: 1 ensures complete state initialization for testing.


103-115: LGTM: Well-structured block representation.

The changes properly implement the block structure with consistent height values across commit and header. The encapsulation of SignedHeader within LightBlock follows the correct blockchain data structure pattern.


131-132: LGTM: Proper mock setup for LightBlock.

The stateProvider mock is correctly configured to return the new LightBlock structure with matching heights.


268-270: LGTM: Proper mock setup for FinalizeSnapshot.

The new FinalizeSnapshot mock is correctly configured with appropriate expectations and return values, aligning with the state sync feature implementation.


633-634: LGTM: Proper chunk initialization and queueing.

The chunk handling code correctly initializes and enqueues the chunk ID for processing.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
internal/statesync/syncer.go (3)

354-355: Improve error handling for light block retrieval.

The error handling for light block retrieval could be more specific about the failure reason.

-s.logger.Info("failed to get and verify light block. Dropping snapshot and trying again",
+s.logger.Error("failed to get and verify light block. Dropping snapshot and trying again",
     "err", err, "height", snapshot.Height)
+    "witnesses_remaining", s.stateProvider.WitnessesLeft())

Also applies to: 363-363, 365-365


380-384: Consider caching the genesis block.

The genesis block is unlikely to change during sync. Consider caching it to avoid repeated retrievals.

+// Add to syncer struct
+type syncer struct {
+    // ... existing fields ...
+    genesisBlock *types.LightBlock
+}

+// Add getter method
+func (s *syncer) getGenesisBlock(ctx context.Context, height uint64) (*types.LightBlock, error) {
+    if s.genesisBlock != nil {
+        return s.genesisBlock, nil
+    }
+    block, err := s.getStateProvider().LightBlock(ctx, height)
+    if err != nil {
+        return nil, err
+    }
+    s.genesisBlock = block
+    return block, nil
+}

592-616: Add metrics for snapshot finalization.

The new finalizeSnapshot function should include metrics to track finalization success/failure rates.

 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
+    start := time.Now()
+    defer func() {
+        s.metrics.SnapshotFinalizationTime.Set(time.Since(start).Seconds())
+    }()
     // ... existing code ...
     _, err = s.conn.FinalizeSnapshot(ctx, &abci.RequestFinalizeSnapshot{
         SnapshotBlock: snapshotBlockProto,
         GenesisBlock:  genesisBlockProto,
     })
+    if err != nil {
+        s.metrics.SnapshotFinalizationFailures.Inc()
+        return err
+    }
+    s.metrics.SnapshotFinalizationSuccess.Inc()
     return err
 }
libs/math/safemath.go (1)

261-308: Extend Max and Min functions to support all Integer types

The Max and Min functions may panic with "unsupported type" when used with custom integer types that satisfy the Integer interface but are not explicitly handled in the switch cases.

Consider modifying the switch statements to handle all types that satisfy the Integer interface, possibly using type assertions or reflection to generalize the implementation. Alternatively, document this limitation to inform users about the supported types.

libs/math/safemath_test.go (1)

88-191: Enhance test coverage for custom integer types in TestSafeConvert

The TestSafeConvert function provides thorough test cases for standard integer types. However, it may be beneficial to include test cases for custom types that satisfy the Integer interface to ensure that SafeConvert handles them correctly.

Consider adding test cases with custom integer types to validate the function's compatibility with all types that implement the Integer interface.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 94c5f69 and 81367da.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • go.mod (3 hunks)
  • internal/statesync/syncer.go (6 hunks)
  • libs/math/safemath.go (3 hunks)
  • libs/math/safemath_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (5)
internal/statesync/syncer.go (5)

19-19: LGTM: Import addition for math package.

The addition of the tmmath import is appropriate for the new mathematical operations used in the code.


376-378: LGTM: Initial height validation.

Good addition of initial height validation to prevent invalid genesis states.


386-389: LGTM: Snapshot finalization implementation.

Good addition of snapshot finalization with proper error handling and context propagation.


402-402: LGTM: Updated return value.

Correctly updated to return the commit from the light block.


219-219: Consider using type conversion function for metrics.

Using MustConvertInt64 for metrics could panic. Consider using a safer conversion method or handling potential errors.

-s.lastSyncedSnapshotHeight = tmmath.MustConvertInt64(snapshot.Height)
+height, err := tmmath.SafeConvertInt64(snapshot.Height)
+if err != nil {
+    return sm.State{}, nil, fmt.Errorf("invalid snapshot height: %w", err)
+}
+s.lastSyncedSnapshotHeight = height

libs/math/safemath.go Show resolved Hide resolved
libs/math/safemath.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
abci/example/kvstore/kvstore.go (1)

589-602: Consider standardizing error handling.

To improve error handling consistency, consider:

  1. Defining error constants for each validation case
  2. Using a consistent error format across all validations

Example implementation:

+const (
+    errSnapshotHeightMismatch   = "snapshot height mismatch during state sync finalization: expected %d, got %d"
+    errSnapshotAppHashMismatch  = "snapshot apphash mismatch during state sync finalization: expected %x, got %x"
+    errGenesisHeightMismatch    = "genesis height mismatch during state sync finalization: expected %d, got %d"
+)

 func (app *Application) FinalizeSnapshot(_ctx context.Context, req *abci.RequestFinalizeSnapshot) (*abci.ResponseFinalizeSnapshot, error) {
     app.mu.Lock()
     defer app.mu.Unlock()

     if app.LastCommittedState.GetHeight() != req.SnapshotBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot height mismatch during state sync finalization: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errSnapshotHeightMismatch,
             app.LastCommittedState.GetHeight(), req.SnapshotBlock.SignedHeader.Header.Height)
     }

     if !app.LastCommittedState.GetAppHash().Equal(req.SnapshotBlock.SignedHeader.Header.AppHash) {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("snapshot apphash mismatch during state sync finalization: expected %x, got %x",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errSnapshotAppHashMismatch,
             app.LastCommittedState.GetAppHash(), req.SnapshotBlock.SignedHeader.Header.AppHash)
     }

     if app.initialHeight != req.GenesisBlock.SignedHeader.Header.Height {
-        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf("genesis height mismatch during state sync finalization: expected %d, got %d",
+        return &abci.ResponseFinalizeSnapshot{}, fmt.Errorf(errGenesisHeightMismatch,
             app.initialHeight, req.GenesisBlock.SignedHeader.Header.Height)
     }
libs/math/safemath.go (2)

99-118: Enhance function documentation.

The wrapper functions could benefit from more descriptive documentation. Consider adding:

  • Parameter descriptions
  • Return value descriptions
  • Example usage

Example improvement for SafeConvertUint64:

-// SafeConvertUint64 takes a int and checks if it overflows.
+// SafeConvertUint64 safely converts a value of type T to uint64.
+// Parameters:
+//   - a: The value to convert
+// Returns:
+//   - uint64: The converted value
+//   - error: ErrOverflow if the conversion would overflow
+// Example:
+//   val, err := SafeConvertUint64(int32(42))

262-309: Consider improving error handling in Max/Min functions.

While the implementation is thorough, the panic messages for unsupported types could be more informative.

Consider this improvement:

-		panic("unsupported type")
+		panic(fmt.Sprintf("unsupported integer type %T", max))
-		panic("unsupported type")
+		panic(fmt.Sprintf("unsupported integer type %T", T(0)))
internal/statesync/syncer.go (2)

219-219: Consider using type conversion function directly.

The MustConvertInt64 function will panic if the conversion fails. Consider using ConvertInt64 and handling the error explicitly for better error handling.

-s.lastSyncedSnapshotHeight = tmmath.MustConvertInt64(snapshot.Height)
+height, err := tmmath.ConvertInt64(snapshot.Height)
+if err != nil {
+    return sm.State{}, nil, fmt.Errorf("invalid snapshot height: %w", err)
+}
+s.lastSyncedSnapshotHeight = height

592-619: Consider adding timeout for finalization.

The finalizeSnapshot method should have a timeout to prevent hanging in case the ABCI app is unresponsive.

 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
+    // Add timeout for finalization
+    ctx, cancel := context.WithTimeout(ctx, 1*time.Minute)
+    defer cancel()
+
     s.logger.Info("Finalizing snapshot restoration",

Additionally, consider adding metrics for tracking finalization success/failure rates.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81367da and 37f8b5f.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • abci/example/kvstore/kvstore.go (1 hunks)
  • go.mod (3 hunks)
  • internal/statesync/syncer.go (6 hunks)
  • libs/math/safemath.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • go.mod
🧰 Additional context used
📓 Learnings (1)
libs/math/safemath.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: libs/math/safemath.go:123-161
Timestamp: 2025-01-20T13:38:10.707Z
Learning: The `SafeConvert` function in `libs/math/safemath.go` correctly handles negative values during integer type conversion through two mechanisms: explicit checks for int64 to uint64 conversion, and the Min[U]() check that catches negative values when converting to unsigned types.
🔇 Additional comments (11)
abci/example/kvstore/kvstore.go (2)

583-606: LGTM! The FinalizeSnapshot implementation looks solid.

The method correctly implements state sync finalization with proper validation checks, error handling, and thread safety:

  • Uses mutex for thread safety
  • Validates snapshot height and app hash consistency
  • Validates genesis block height
  • Provides clear error messages
  • Includes debug logging

583-606: Add unit tests for FinalizeSnapshot.

To ensure the reliability of state sync finalization, we should add unit tests that cover:

  • Successful finalization case
  • Height mismatch scenarios
  • App hash mismatch scenarios
  • Genesis block height mismatch scenarios

Would you like me to help generate comprehensive unit tests for the FinalizeSnapshot method?

libs/math/safemath.go (3)

11-15: LGTM! Error variables are well-defined.

The new error variables follow the established naming pattern and provide clear, specific error messages.


163-198: LGTM! Must-convert functions are well-implemented.

The implementation:

  • Uses a consistent pattern across all functions
  • Properly handles error messages using zero values for type information
  • Provides a comprehensive set of conversion functions for common integer types

123-161: Implementation is robust and handles edge cases well.

The SafeConvert function correctly handles:

  • Special cases for int64/uint64 inputs
  • Platform-specific integer sizes
  • Negative value conversions
  • Overflow checks using Max/Min bounds

Let's verify the platform-specific behavior:

✅ Verification successful

Platform-specific integer handling verified as correct

The implementation correctly handles platform-specific integer sizes through:

  • Proper detection of platform integer sizes using math.MaxInt/MaxUint comparisons
  • Special handling for 64-bit integers
  • Comprehensive bounds checking using Max[T]()/Min[T]() functions
  • Test coverage validating the behavior
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the code handles different platform integer sizes correctly
# Test: Search for platform-specific integer size handling
rg -A 2 'const (uintIsSmall|intIsSmall) ='

Length of output: 354


Script:

#!/bin/bash
# Check for test cases and related constants
rg -g '*.go' -A 5 'TestSafeConvert|func.*SafeConvert'

# Look for Max/Min function implementations
ast-grep --pattern 'func Max[$_]() $_'

Length of output: 4689

internal/statesync/syncer.go (6)

19-22: LGTM: Required imports for the new functionality.

The added imports support the new light client and math functionality needed for state sync improvements.


354-355: LGTM: Improved error handling for light block retrieval.

The transition from using commit to block improves the code by fetching the complete light block. The error handling is comprehensive, including specific handling for the case when no witnesses are available.

Also applies to: 363-365


376-378: LGTM: Essential validation of initial height.

Good addition of validation to ensure the initial height is valid (>= 1).


386-389: LGTM: Proper finalization handling.

The finalization step is well-integrated into the sync process with appropriate error handling and context propagation.


600-602: LGTM: Proper nil checks for blocks.

Good defensive programming with explicit nil checks for both genesis and snapshot blocks.


354-355: Verify the impact of LightBlock usage.

Let's verify that all callers of this method are prepared to handle the new return type.

internal/statesync/syncer.go Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
abci/client/socket_client.go (1)

360-366: Ensure robust error handling for FinalizeSnapshot.

Overall, the new FinalizeSnapshot method follows the pattern established by other ABCI methods in this file, which is great for consistency. However, to safeguard against unforeseen scenarios, consider adding extra error handling or logging if res.GetFinalizeSnapshot() ever returns a nil reference. This will help diagnose issues in production more effectively.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6332f13 and d595b00.

📒 Files selected for processing (1)
  • abci/client/socket_client.go (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (6)
  • GitHub Check: test_abci_cli
  • GitHub Check: test_apps
  • GitHub Check: build
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
  • GitHub Check: Super linter
🔇 Additional comments (1)
abci/client/socket_client.go (1)

440-441: Good addition of FinalizeSnapshot case in resMatchesReq.

The inclusion of Request_FinalizeSnapshot in resMatchesReq is consistent with the standard request-response matching pattern. This ensures that responses will map correctly to their corresponding requests, thus maintaining the correct flow of ABCI calls.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/statesync/syncer.go (2)

340-344: Consider improving error logging.

The debug log message uses incorrect formatting. The string literal contains a format specifier but the arguments are passed separately.

-s.logger.Debug("light block not found at height %d, retrying", "height", snapshot.Height)
+s.logger.Debug(fmt.Sprintf("light block not found at height %d, retrying", snapshot.Height))

591-618: Consider adding error context and improving nil checks.

The finalizeSnapshot implementation could be improved in a few ways:

  1. The nil check error message could be more specific
  2. Error context could be enhanced with more details
 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
     s.logger.Info("Finalizing snapshot restoration",
         "snapshot", snapshot.Hash.String(),
         "height", snapshot.Height,
         "version", snapshot.Version,
         "app_hash", snapshot.trustedAppHash,
     )
-    if genesisBlock == nil || snapshotBlock == nil {
-        return fmt.Errorf("nil block provided: genesis=%v snapshot=%v", genesisBlock == nil, snapshotBlock == nil)
+    if genesisBlock == nil {
+        return fmt.Errorf("genesis block cannot be nil for snapshot at height %d", snapshot.Height)
+    }
+    if snapshotBlock == nil {
+        return fmt.Errorf("snapshot block cannot be nil for snapshot at height %d", snapshot.Height)
     }

     snapshotBlockProto, err := snapshotBlock.ToProto()
     if err != nil {
-        return fmt.Errorf("failed to convert snapshot block %s to proto: %w", snapshotBlock.String(), err)
+        return fmt.Errorf("failed to convert snapshot block at height %d to proto: %w", snapshot.Height, err)
     }
     genesisBlockProto, err := genesisBlock.ToProto()
     if err != nil {
-        return fmt.Errorf("failed to convert genesis block %s to proto: %w", genesisBlock.String(), err)
+        return fmt.Errorf("failed to convert genesis block at height %d to proto: %w", genesisBlock.Height, err)
     }

     _, err = s.conn.FinalizeSnapshot(ctx, &abci.RequestFinalizeSnapshot{
         SnapshotBlock: snapshotBlockProto,
         GenesisBlock:  genesisBlockProto,
     })

-    return err
+    if err != nil {
+        return fmt.Errorf("failed to finalize snapshot at height %d: %w", snapshot.Height, err)
+    }
+    return nil
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d595b00 and 3867717.

📒 Files selected for processing (1)
  • internal/statesync/syncer.go (8 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (4)
internal/statesync/syncer.go (4)

183-186: LGTM: Snapshot re-request logic looks good.

The implementation correctly handles snapshot re-requests with proper error propagation.


375-377: LGTM: Initial height validation.

Good addition of validation to ensure the initial height is valid (>= 1).


379-383: LGTM: Genesis height conversion.

The use of MustConvertUint64 is appropriate here as per the retrieved learning which confirms that the height is already validated to be positive and cannot exceed uint64 range.


401-401: Verify the impact of returning light block commit.

The change from returning commit to block.Commit might affect downstream code that expects the commit in a specific format.

✅ Verification successful

The change from commit to block.Commit is type-safe and doesn't impact downstream code

The modification maintains type compatibility as both variables are of type *types.Commit, and the value is passed through without transformation. No downstream code changes are required.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for usages of the Sync method's return values
rg -A 5 "Sync\([^)]+\)" | grep -B 5 "commit"

Length of output: 1995


Script:

#!/bin/bash
# Find the type definition of block.Commit and its usage
ast-grep --pattern 'type Block struct {
  $$$
  Commit $$$
  $$$
}'

echo "=== Searching for commit usage patterns ==="
rg -A 5 "func.*Sync.*\(.*\).*\(.*commit.*\)" 

Length of output: 213


Script:

#!/bin/bash
# Search for Block type definition and Sync method signature
echo "=== Block type definition ==="
rg "type Block struct" -A 10

echo "=== Sync method signature ==="
rg "func.*Sync.*\(.*\).*\(.*\).*\{" -A 2

echo "=== Commit type definitions ==="
rg "type.*Commit.*struct" -A 5

Length of output: 22012

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
internal/statesync/syncer.go (3)

124-125: Consider using structured logging fields.

The debug log message could be more consistent with other logging patterns in the file by using structured key-value pairs.

-s.logger.Debug("snapshot not added", "height", snapshot.Height, "hash", snapshot.Hash)
+s.logger.Debug("Snapshot not added",
+    "height", snapshot.Height,
+    "version", snapshot.Version,
+    "hash", snapshot.Hash)

339-361: Improve error message consistency.

The error messages for light block retrieval could be more consistent with the state retrieval error messages above.

-s.logger.Error("failed to get and verify light block. Dropping snapshot and trying again",
+s.logger.Info("failed to get and verify light block. Dropping snapshot and trying again",
     "err", err, "height", snapshot.Height)

606-609: Consider error wrapping for FinalizeSnapshot call.

The error from FinalizeSnapshot should be wrapped with additional context.

-return err
+return fmt.Errorf("ABCI app rejected snapshot finalization: %w", err)
internal/statesync/reactor.go (1)

1064-1070: Consider adding explanatory comments for the snapshot filtering logic.

The condition for filtering snapshots based on finalized blocks would benefit from a more detailed explanation.

Add a comment block explaining the rationale:

+		// We only accept snapshots that are "finalized", meaning:
+		// 1. The block at snapshot height has been committed
+		// 2. The next block (height+1) containing the commit is also available
+		// 3. We are currently at height+2 or higher, ensuring the snapshot is immutable
 		if s.Height <= uint64(r.csState.GetRoundState().Height)-2 {
🧰 Tools
🪛 GitHub Check: golangci-lint

[failure] 1067-1067:
G115: integer overflow conversion int64 -> uint64 (gosec)

🪛 GitHub Actions: Golang Linter

[error] 1067-1067: G115: Potential integer overflow in conversion from int64 to uint64

🪛 GitHub Actions: Test

[error] 1067-1067: Nil pointer dereference in recentSnapshots method

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3867717 and 1973905.

📒 Files selected for processing (2)
  • internal/statesync/reactor.go (3 hunks)
  • internal/statesync/syncer.go (8 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
🪛 GitHub Check: golangci-lint
internal/statesync/reactor.go

[failure] 1067-1067:
G115: integer overflow conversion int64 -> uint64 (gosec)

🪛 GitHub Actions: Golang Linter
internal/statesync/reactor.go

[error] 1067-1067: G115: Potential integer overflow in conversion from int64 to uint64

🪛 GitHub Actions: Test
internal/statesync/reactor.go

[error] 903-903: Panic in message processing: runtime error: invalid memory address or nil pointer dereference


[error] 1067-1067: Nil pointer dereference in recentSnapshots method

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: test_apps
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (5)
internal/statesync/syncer.go (3)

182-185: LGTM! Error handling for snapshot requests is well implemented.

The error propagation and logging for snapshot requests is properly handled.


369-382: LGTM! Genesis validation and finalization is well implemented.

The code correctly:

  1. Validates initial height
  2. Converts genesis height using MustConvertUint64 (as per retrieved learning)
  3. Retrieves genesis block
  4. Finalizes the snapshot

585-612: Consider adding validation for snapshot hash.

The finalizeSnapshot method should validate that the snapshot block's hash matches the trusted app hash before sending it to the ABCI app.

internal/statesync/reactor.go (2)

Line range hint 239-265: LGTM: State provider initialization changes look good.

The removal of the TrustHeight parameter from both P2P and RPC state providers is implemented consistently, with appropriate error handling and logging.


Line range hint 903-903: Add nil checks to prevent panic in message processing.

The pipeline failure indicates a panic due to nil pointer dereference in message processing.

Let's verify the message handling paths and add necessary nil checks:

internal/statesync/reactor.go Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
config/config.go (2)

1006-1009: Introduce a rationale for the new Retries field and clarify its behavior.

While adding Retries is a helpful addition, consider expanding its docstring to clarify what happens when this is set to zero (e.g., does it disable retries or cause immediate fallback to block sync?). Such context can prevent misconfiguration.


1029-1029: Scrutinize default retry count.

Defaulting to 1 may be insufficient in highly unstable networks or for slow peer discovery. You might want to provide guidance to users on appropriate retry counts based on network conditions.

internal/statesync/syncer.go (1)

590-617: Testing coverage for finalizeSnapshot.

This method is central to the new synchronization flow. Make sure it’s exercised by unit tests or integration tests with both valid and invalid blocks to confirm correct error handling.

config/toml.go (1)

501-504: Document the new retries config parameter.

This addition aligns with StateSyncConfig.Retries. Briefly explain the effect of zero or negative values in the config comments to help users avoid misconfiguration.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34aa00b and c22edb6.

📒 Files selected for processing (5)
  • config/config.go (3 hunks)
  • config/toml.go (1 hunks)
  • internal/statesync/reactor.go (4 hunks)
  • internal/statesync/syncer.go (10 hunks)
  • internal/statesync/syncer_test.go (12 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
🪛 GitHub Check: golangci-lint
internal/statesync/reactor.go

[failure] 1071-1071:
G115: integer overflow conversion uint64 -> int64 (gosec)

🪛 GitHub Actions: Golang Linter
internal/statesync/reactor.go

[error] 1071-1071: G115: Integer overflow risk detected in conversion from uint64 to int64

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (20)
internal/statesync/syncer_test.go (7)

86-86: LGTM: Initial height field addition

The addition of InitialHeight field with value 1 in the state struct is consistent with the state initialization requirements.


103-115: LGTM: Enhanced block structure with light block

Good improvement to the test structure:

  1. Added height to commit for proper block identification
  2. Introduced light block with signed header for better test coverage

131-132: LGTM: Updated state provider mock

The mock now correctly returns a light block instead of just a commit, which better reflects the actual state provider behavior.


268-270: LGTM: Added FinalizeSnapshot mock expectation

Properly set up mock expectation for the new FinalizeSnapshot ABCI call, which is triggered after state sync completion.


280-280: Update test method signatures for consistency

The SyncAny method calls have been updated with an additional parameter, but the purpose of this parameter is not clear from the test context.

Consider adding a comment explaining the purpose of the new parameter in the test method signature.

Also applies to: 297-297, 323-323, 373-373, 416-416, 470-470, 498-498


633-634: LGTM: Simplified chunk ID initialization

The chunk ID initialization is now more straightforward with direct byte array assignment.


280-280: Let's verify the purpose of the new parameter:

Also applies to: 297-297, 323-323, 373-373, 416-416, 470-470, 498-498

✅ Verification successful

Test correctly uses nil function as a no-op callback for SyncAny

The test is correctly using func() error { return nil } as a valid no-op implementation for the requestSnapshots parameter of SyncAny. This is a common pattern for cases where the callback functionality is not needed in the test scenario.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for SyncAny method implementations and related changes
echo "Searching for SyncAny method implementations..."
rg -p "func.*SyncAny.*\(" --type go

echo -e "\nSearching for recent changes to SyncAny..."
git log -p --grep="SyncAny"

echo -e "\nSearching for related documentation..."
rg -p "SyncAny.*" --type md

Length of output: 864


Script:

#!/bin/bash
# View the implementation of SyncAny
echo "SyncAny implementation:"
rg -A 20 "func \(s \*syncer\) SyncAny" internal/statesync/syncer.go

echo -e "\nSearching for related type definitions:"
rg "type syncer struct" -A 10 internal/statesync/syncer.go

Length of output: 1114

config/config.go (1)

1062-1063: Good defensive check for negative values.

The validation correctly disallows negative retries. For completeness, you might also consider an upper bound if there’s a risk of unbounded spam or denial-of-service scenarios.

internal/statesync/syncer.go (8)

19-19: Importing tmmath for conversions is fine.

This import provides utility conversion functions such as MustConvertInt64 and MustConvertUint64. It’s consistent with the codebase’s usage of safe conversion helpers.


124-125: No issues with added log messages.

Logging “Discovered new snapshot” vs. “snapshot not added” conveys useful debugging info. The statements seem consistent with typical node logs, so no changes are needed.


Line range hint 153-176: Ensure consistent behavior with the new retries parameter.

  1. The loop checks (iters > retries) if retries is non-zero. Confirm that a value of 0 indeed short-circuits the loop to use block sync immediately and doesn’t lead to indefinite attempts.
  2. If the fallback logic is ambiguous, consider clarifying it (e.g., by comments or docstrings).

187-190: Re-requesting snapshots is a logical approach.

This step is essential for scenarios where the node has no snapshots in the pool. The fallback to requesting snapshots again is straightforward and correct.


217-217: Watch out for potential overflow in MustConvertInt64.

snapshot.Height is uint64; converting it to int64 may theoretically overflow if Height exceeds 2^63-1. If such large heights are impossible in your environment, this is acceptable; otherwise, add a safety check.

Would you like me to generate a script to scan the codebase for potential snapshot.Height usage flows or add an additional check here?


374-376: Validate edge-case height.

For initial heights less than 1, returning an error is sensible. Ensure that core logic also handles edge cases such as an initial height of 1 (the typical genesis block) suitably.


378-383: Potential panic in MustConvertUint64.

MustConvertUint64 can panic if the input is negative or exceeds the uint64 range, but you’ve already guarded with state.InitialHeight < 1. As with the other conversions, confirm that your genesis height range cannot exceed 2^64-1.


384-388: Leverage finalizeSnapshot responsibly.

The explicit finalization step is well-organized; just ensure the caller verifies the ABCI response to handle any app-layer issues after finalization.

internal/statesync/reactor.go (4)

Line range hint 239-265: LGTM! Clean removal of trust-related parameters.

The changes to state provider initialization are well-structured and maintain proper error handling.


330-330: LGTM! Well-structured sync implementation.

The sync functionality properly utilizes configuration parameters for discovery time and retries.


1068-1074: LGTM! Well-designed snapshot filtering logic.

The logic ensures that only finalized snapshots are accepted by checking if we're voting for height + 2 or higher. This is crucial for maintaining consistency in the state sync process.

🧰 Tools
🪛 GitHub Check: golangci-lint

[failure] 1071-1071:
G115: integer overflow conversion uint64 -> int64 (gosec)

🪛 GitHub Actions: Golang Linter

[error] 1071-1071: G115: Integer overflow risk detected in conversion from uint64 to int64


1064-1074: ⚠️ Potential issue

Fix potential nil pointer dereference and integer overflow issues.

The snapshot filtering logic needs defensive programming:

  1. Add nil checks for r.csState and GetRoundState()
  2. Handle potential integer overflow when converting between uint64 and int64

Apply this diff to fix the issues:

-        if r.csState == nil {
-            continue
-        }
-        currentHeight := r.csState.GetRoundState().Height
+        if r.csState == nil || r.csState.GetRoundState() == nil {
+            continue
+        }
+        currentHeight := r.csState.GetRoundState().Height
+        if currentHeight <= 0 {
+            continue
+        }
-        if int64(s.Height) >= currentHeight-2 {
+        // Safe conversion by checking bounds
+        if s.Height >= uint64(math.MaxInt64) || int64(s.Height) >= currentHeight-2 {

Likely invalid or redundant comment.

🧰 Tools
🪛 GitHub Check: golangci-lint

[failure] 1071-1071:
G115: integer overflow conversion uint64 -> int64 (gosec)

🪛 GitHub Actions: Golang Linter

[error] 1071-1071: G115: Integer overflow risk detected in conversion from uint64 to int64

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 589769d and 3370545.

📒 Files selected for processing (1)
  • internal/statesync/reactor.go (5 hunks)
🧰 Additional context used
🪛 GitHub Check: golangci-lint
internal/statesync/reactor.go

[failure] 1084-1084:
G115: integer overflow conversion uint64 -> int64 (gosec)

🪛 GitHub Actions: Golang Linter
internal/statesync/reactor.go

[error] 1084-1084: G115: Integer overflow possible in conversion from uint64 to int64

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: test_apps
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (2)
internal/statesync/reactor.go (2)

239-239: LGTM! State provider initialization changes look good.

The removal of trust-related parameters and addition of logging improves the code without introducing any issues.

Also applies to: 252-253, 261-261


280-294: LGTM! Error handling improvements look good.

The changes add proper fallback handling when no snapshots are available, with appropriate error wrapping and logging.

Comment on lines +1077 to +1087
if r.csState == nil {
continue
}
currentHeight := r.csState.GetRoundState().Height
// we only accept snapshots where next block is already finalized, that is we are voting
// for `height + 2` or higher, because we need to be able to fetch light block containing
// commit for `height` from block store (which is stored in block `height+1`)
if int64(s.Height) >= currentHeight-2 {
r.logger.Debug("snapshot too new, skipping", "height", s.Height, "state_height", currentHeight)
continue
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix potential integer overflow and improve null checks.

The snapshot filtering logic has several issues:

  1. Potential integer overflow when converting s.Height (uint64) to int64
  2. The null checks could be more concise

Apply this diff to fix the issues:

-		if r.csState == nil {
-			continue
-		}
-		currentHeight := r.csState.GetRoundState().Height
+		if r.csState == nil || r.csState.GetRoundState() == nil {
+			continue
+		}
+		currentHeight := r.csState.GetRoundState().Height
+		if currentHeight <= 0 {
+			continue
+		}
+		// Ensure safe conversion by checking for overflow
+		if s.Height > uint64(math.MaxInt64) {
+			r.logger.Debug("snapshot height too large, skipping", "height", s.Height)
+			continue
+		}
-		if int64(s.Height) >= currentHeight-2 {
+		snapshotHeight := int64(s.Height)
+		if snapshotHeight >= currentHeight-2 {

Don't forget to add the import:

+       "math"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if r.csState == nil {
continue
}
currentHeight := r.csState.GetRoundState().Height
// we only accept snapshots where next block is already finalized, that is we are voting
// for `height + 2` or higher, because we need to be able to fetch light block containing
// commit for `height` from block store (which is stored in block `height+1`)
if int64(s.Height) >= currentHeight-2 {
r.logger.Debug("snapshot too new, skipping", "height", s.Height, "state_height", currentHeight)
continue
}
if r.csState == nil || r.csState.GetRoundState() == nil {
continue
}
currentHeight := r.csState.GetRoundState().Height
if currentHeight <= 0 {
continue
}
// Ensure safe conversion by checking for overflow
if s.Height > uint64(math.MaxInt64) {
r.logger.Debug("snapshot height too large, skipping", "height", s.Height)
continue
}
snapshotHeight := int64(s.Height)
if snapshotHeight >= currentHeight-2 {
r.logger.Debug("snapshot too new, skipping", "height", s.Height, "state_height", currentHeight)
continue
}
🧰 Tools
🪛 GitHub Check: golangci-lint

[failure] 1084-1084:
G115: integer overflow conversion uint64 -> int64 (gosec)

🪛 GitHub Actions: Golang Linter

[error] 1084-1084: G115: Integer overflow possible in conversion from uint64 to int64

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
internal/statesync/syncer.go (1)

Review comment is accurate - test coverage needs improvement

The codebase analysis confirms gaps in test coverage for critical state sync components. Specific test cases needed:

  • finalizeSnapshot:

    • Test nil block handling
    • Test proto conversion errors
    • Test ABCI response errors
  • SyncAny:

    • Test retry behavior with different discovery times
    • Test peer rejection scenarios
    • Test snapshot rejection flows
  • Sync:

    • Test light block verification failures
    • Test app hash verification
    • Test chunk processing timeouts
🔗 Analysis chain

Line range hint 1-618: Improve test coverage.

The pipeline shows test coverage at 73.8%, which is below the desired threshold. Consider adding tests for:

  1. Error cases in finalizeSnapshot
  2. Retry logic in SyncAny
  3. Light block handling in Sync

Would you like me to help generate additional test cases to improve coverage?

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find untested code paths in the syncer implementation
ast-grep --pattern 'if err != nil {
  $$$
  return $_, $_, err
}'

Length of output: 8719


Script:

#!/bin/bash
# Find test files for statesync package
echo "=== Test files for statesync package ==="
fd -e go test internal/statesync/

# Look for test cases related to the functions mentioned
echo -e "\n=== Test cases for specific functions ==="
rg -A 5 "func Test.*Sync(Any|er.*Sync|er.*finalizeSnapshot)" internal/statesync/

# Check for error case testing patterns
echo -e "\n=== Error case test patterns ==="
rg "t.Error|assert.*Error|require.*Error" internal/statesync/ --type go

Length of output: 13909

🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements

🧹 Nitpick comments (2)
internal/statesync/syncer.go (2)

Line range hint 153-190: Enhance retry logic with exponential backoff.

The retry mechanism is a good addition, but it could benefit from exponential backoff to prevent overwhelming the system during repeated failures.

 func (s *syncer) SyncAny(
 	ctx context.Context,
 	discoveryTime time.Duration,
 	retries int,
 	requestSnapshots func() error,
 ) (sm.State, *types.Commit, error) {
+	backoffDuration := discoveryTime
 	if discoveryTime != 0 && discoveryTime < minimumDiscoveryTime {
 		discoveryTime = minimumDiscoveryTime
+		backoffDuration = minimumDiscoveryTime
 	}
 
-	timer := time.NewTimer(discoveryTime)
+	timer := time.NewTimer(backoffDuration)
 	defer timer.Stop()
 
     // ... existing code ...
 
     if snapshot == nil {
         if discoveryTime == 0 {
             return sm.State{}, nil, errNoSnapshots
         }
         if err := requestSnapshots(); err != nil {
             return sm.State{}, nil, err
         }
         s.logger.Info("discovering snapshots",
             "iterations", iters,
-            "interval", discoveryTime)
+            "interval", backoffDuration)
-        timer.Reset(discoveryTime)
+        backoffDuration = time.Duration(float64(backoffDuration) * 1.5) // Exponential backoff
+        timer.Reset(backoffDuration)
🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements


590-617: Add error wrapping and improve error messages.

The error handling in the finalizeSnapshot function could be more descriptive.

 func (s *syncer) finalizeSnapshot(ctx context.Context, snapshot *snapshot, genesisBlock *types.LightBlock, snapshotBlock *types.LightBlock) error {
     // ... existing logging ...
     if genesisBlock == nil || snapshotBlock == nil {
-        return fmt.Errorf("nil block provided: genesis=%v snapshot=%v", genesisBlock == nil, snapshotBlock == nil)
+        return fmt.Errorf("invalid block parameters: genesis_block_nil=%v snapshot_block_nil=%v", genesisBlock == nil, snapshotBlock == nil)
     }
 
     snapshotBlockProto, err := snapshotBlock.ToProto()
     if err != nil {
-        return fmt.Errorf("failed to convert snapshot block %s to proto: %w", snapshotBlock.String(), err)
+        return fmt.Errorf("failed to convert snapshot block to proto at height %d: %w", snapshotBlock.Height, err)
     }
     genesisBlockProto, err := genesisBlock.ToProto()
     if err != nil {
-        return fmt.Errorf("failed to convert genesis block %s to proto: %w", genesisBlock.String(), err)
+        return fmt.Errorf("failed to convert genesis block to proto at height %d: %w", genesisBlock.Height, err)
     }
 
     _, err = s.conn.FinalizeSnapshot(ctx, &abci.RequestFinalizeSnapshot{
         SnapshotBlock: snapshotBlockProto,
         GenesisBlock:  genesisBlockProto,
     })
 
-    return err
+    if err != nil {
+        return fmt.Errorf("failed to finalize snapshot at height %d: %w", snapshot.Height, err)
+    }
+    return nil
🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3370545 and 76f6c00.

📒 Files selected for processing (1)
  • internal/statesync/syncer.go (10 hunks)
🧰 Additional context used
📓 Learnings (1)
internal/statesync/syncer.go (1)
Learnt from: lklimek
PR: dashpay/tenderdash#1026
File: internal/statesync/syncer.go:380-384
Timestamp: 2025-01-24T08:11:01.354Z
Learning: For genesis height conversion from int64 to uint64, using MustConvertUint64 is preferred over error handling since the height is already validated to be positive and cannot exceed uint64 range.
🪛 GitHub Actions: Test
internal/statesync/syncer.go

[error] Package test failed with coverage: 73.8% of statements

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: test_apps
  • GitHub Check: e2e-test (rotate)
  • GitHub Check: e2e-test (dashcore)
🔇 Additional comments (4)
internal/statesync/syncer.go (4)

19-19: LGTM: Import addition for math utilities.

The addition of the math utilities import is appropriate for the new height conversion functionality.

🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements


374-376: LGTM: Initial height validation.

Good addition of validation to ensure the initial height is valid (>= 1).

🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements


378-382: LGTM: Genesis block retrieval.

Appropriate use of MustConvertUint64 for height conversion as per the retrieved learning, since the height is already validated to be positive.

🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements


400-400: LGTM: Return value update.

Appropriate update to return block.Commit instead of just commit to align with the light block changes.

🧰 Tools
🪛 GitHub Actions: Test

[error] Package test failed with coverage: 73.8% of statements

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

1 participant