Test/deploy Archive node migration tool #14338

Closed
deepthiskumar opened this issue Oct 12, 2023 · 3 comments

deepthiskumar (Member) commented Oct 12, 2023

Test Plan

Link to rfc: #14288

Features to be tested

  1. Data migration from mainnet to berkeley
    a. migration of balances into the accounts_accessed and accounts_created tables (see the check sketched after this list)
    b. block table (min_window_density, sub_window_densities)
  2. Incrementality (checkpoints)
    a. ability to pick up jobs from a checkpoint
    b. cron job setup
  3. Performance degradation
    a. detect whether the migration process has slowed down significantly
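
To make 1a and 1b concrete, here is a minimal consistency check we could run against the migrated database. It is only a sketch: the connection string is a placeholder, and it assumes the berkeley schema exposes the blocks table with the min_window_density and sub_window_densities columns and the two account tables named above.

```python
# Minimal post-migration sanity check; the DSN is a placeholder.
import psycopg2  # assumes the psycopg2 postgres driver is installed

def check_migrated_tables(dsn: str) -> None:
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Every migrated block should have the new density fields populated.
        cur.execute(
            "SELECT count(*) FROM blocks "
            "WHERE min_window_density IS NULL OR sub_window_densities IS NULL"
        )
        missing = cur.fetchone()[0]
        assert missing == 0, f"{missing} blocks are missing density fields"

        # accounts_accessed / accounts_created should not be empty after the
        # balances migration.
        for table in ("accounts_accessed", "accounts_created"):
            cur.execute(f"SELECT count(*) FROM {table}")
            assert cur.fetchone()[0] > 0, f"{table} is empty after migration"

check_migrated_tables("dbname=berkeley_archive user=postgres")
```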

Approach

Migrating the entire mainnet database can take a long time (up to several hours), so there is no point in testing such a scenario on every pull request. Another problem is that we need to verify cross-version behavior (part of the test setup lives on the compatible branch, another part on berkeley). Given the above, the proposal is to split the tests into three categories:

Fast Verifications on PRs

As a starting point we can use extensional blocks (which are in JSON form) and perform a flow like the one below (a sketch of such a harness follows the list):

  • prepare a new postgres instance
  • the test will define input data such as transactions, states, etc.
  • the test will format the input data as JSON
  • the compatible version of archive_blocks will consume the JSON from the previous step and import it into a new schema, compatible_archive
  • we will run the migration app from compatible_archive to berkeley_archive
  • we will run the add-berkeley-account app to fix balances
  • we will check that the data matches the test definition
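
A minimal sketch of this harness, assuming the compatible apps can be downloaded as standalone binaries. The executable names and flags below (mina-archive-blocks, berkeley-migration, add-berkeley-account and their options) are illustrative stand-ins, not the confirmed CLI:

```python
# Sketch of the PR-level flow; executable names and flags are assumptions.
import json, subprocess, tempfile

def run_pr_flow(test_blocks: list[dict]) -> None:
    # Steps 1-3: the test defines its input data and writes extensional blocks.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(test_blocks, f)
        blocks_path = f.name

    # Fresh databases for both schemas (createdb ships with postgres).
    subprocess.run(["createdb", "compatible_archive"], check=True)
    subprocess.run(["createdb", "berkeley_archive"], check=True)

    # Step 4: import via the compatible archive_blocks tool (flags illustrative).
    subprocess.run(["mina-archive-blocks", "--extensional",
                    "--archive-uri", "postgres:///compatible_archive",
                    blocks_path], check=True)

    # Steps 5-6: run the migration app, then add-berkeley-account.
    subprocess.run(["berkeley-migration",
                    "--mainnet-archive-uri", "postgres:///compatible_archive",
                    "--migrated-archive-uri", "postgres:///berkeley_archive"],
                   check=True)
    subprocess.run(["add-berkeley-account",
                    "--archive-uri", "postgres:///berkeley_archive"], check=True)

    # Step 7: compare the migrated rows with the test definition (test-specific).
```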

Pros:

  • We can prepare whatever scenario we want. This is a good place to test corner cases and to expand the suite if we find any regressions, since testing with the above approach is cheap

Cons:

  • We are not using production data
  • We probably cannot test directly in OCaml, as we need to download the apps from compatible

At this level we can also check that the overall process of importing data has not degraded in performance.
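
One cheap way to catch such degradation is to time the whole import and compare against a recorded baseline. A sketch, where the baseline file and the 1.5x tolerance are assumptions to be tuned:

```python
# Coarse performance-regression guard; baseline path and tolerance are assumed.
import json, pathlib, time

BASELINE = pathlib.Path("import_baseline.json")

def check_not_degraded(run_import, tolerance: float = 1.5) -> None:
    start = time.monotonic()
    run_import()  # callable wrapping the whole import + migration flow
    elapsed = time.monotonic() - start
    if BASELINE.exists():
        baseline = json.loads(BASELINE.read_text())["seconds"]
        assert elapsed <= baseline * tolerance, (
            f"import took {elapsed:.0f}s, baseline was {baseline:.0f}s")
    BASELINE.write_text(json.dumps({"seconds": elapsed}))
```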

Incrementality on PRs

With all of the above tooling available, we can test checkpoints and the various parameters that limit the scope of a migration. For example:

  • Account A has 10 Mina and B has 20 Mina in block 1
  • Account A transferred 1 Mina to B in block 2

Running the migration from block 0 to 1 should result in A=10, B=20; running the migration from block 1 to 2 should result in A=9, B=21; and so on.
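
A sketch of that check. The block-range flags on the migration app and the SQL column names are assumptions, and "A" / "B" stand in for the test public keys:

```python
# Incrementality check for the scenario above; flags and schema are assumed.
import psycopg2, subprocess

DSN = "dbname=berkeley_archive user=postgres"  # placeholder connection string

def migrate_range(start: int, stop: int) -> None:
    # Hypothetical flags limiting the migration to a block range.
    subprocess.run(["berkeley-migration",
                    "--start-block", str(start), "--stop-block", str(stop)],
                   check=True)

def get_balance(public_key: str) -> int:
    # Illustrative lookup of the latest recorded balance for a key; the
    # actual berkeley schema layout may differ.
    with psycopg2.connect(DSN) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT aa.balance FROM accounts_accessed aa "
            "JOIN account_identifiers ai ON aa.account_identifier_id = ai.id "
            "JOIN public_keys pk ON ai.public_key_id = pk.id "
            "WHERE pk.value = %s ORDER BY aa.block_id DESC LIMIT 1",
            (public_key,))
        return cur.fetchone()[0]

migrate_range(0, 1)
assert get_balance("A") == 10 and get_balance("B") == 20
migrate_range(1, 2)
assert get_balance("A") == 9 and get_balance("B") == 21
```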

Mainnet data on nightly

At this level we should focus more on production data, using the existing dumps from gcloud storage. We can pick a mainnet dump of our preference (either one that is not too fresh, to limit the amount of data to migrate, or a newer one combined with the start and stop parameters during migration). Using the compatible tooling we can import such a dump into a database and extract data to JSON to learn the initial and final state of balances and transactions. We then perform the migration, which should take more than 30 minutes. After the initial checks (validating account balances against the expected values) we will start the archive daemon to verify that there is no problem connecting to the new data; additionally, we can use rosetta to check whether it gives the expected outcome.
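
A sketch of the nightly fetch-and-import step. The bucket path and dump name are placeholders, while gsutil, tar, createdb, and psql are the standard Google Cloud and postgres CLIs:

```python
# Nightly flow sketch: fetch a mainnet dump, import it, then hand off to the
# migration. Bucket and file names are placeholders.
import subprocess

DUMP = "mainnet-archive-dump-2023-10-01.sql"  # placeholder dump name
BUCKET = "gs://mina-archive-dumps"            # placeholder bucket

subprocess.run(["gsutil", "cp", f"{BUCKET}/{DUMP}.tar.gz", "."], check=True)
subprocess.run(["tar", "-xzf", f"{DUMP}.tar.gz"], check=True)
subprocess.run(["createdb", "mainnet_archive"], check=True)
subprocess.run(["psql", "-d", "mainnet_archive", "-f", DUMP], check=True)
# ...then: extract the initial balances to JSON, run the migration, validate,
# and finally start the archive daemon and rosetta against the migrated DB.
```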

At this level, too, we can check that the overall process of importing data has not degraded in performance.

Full experiment on stable

Stable runs, as I see them, are weekly or even biweekly runs which are costly but perform the exact operation that archive node operators will carry out.

We will import the entire mainnet data set on a small deployment:

  • demo node (can this be lightnet?)
  • archive node
  • rosetta

(the mina-local-network docker setup can be helpful in this scenario)

Using cron jobs (though perhaps we could increase the run frequency) we will try to migrate the data, while at the same time sending some transactions to the demo node to ensure that those would also be migrated.
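
A sketch of that exercise: a stand-in for the cron job body runs the migration on a schedule while a background thread sends traffic to the demo node. The migrate.sh path, the keys, and the payment flags for the demo setup are placeholders:

```python
# Drive incremental migration ticks while traffic flows; paths/keys assumed.
import subprocess, threading, time

def migration_tick() -> None:
    # Stand-in for the cron job body, e.g. "*/30 * * * * /path/to/migrate.sh".
    subprocess.run(["/usr/local/bin/migrate.sh"], check=True)

def send_traffic(stop: threading.Event) -> None:
    while not stop.is_set():
        # Payment against the demo node; keys and amount are placeholders.
        subprocess.run(["mina", "client", "send-payment",
                        "--amount", "1",
                        "--sender", "B62...sender",
                        "--receiver", "B62...receiver"],
                       check=False)
        time.sleep(60)

stop = threading.Event()
traffic = threading.Thread(target=send_traffic, args=(stop,))
traffic.start()
for _ in range(5):              # a few migration ticks while traffic flows
    migration_tick()
    time.sleep(30 * 60)
stop.set()
traffic.join()
```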

At the end we will manually migrate the remaining data that was not migrated by the cron job. We will then restart our small deployment and test whether balances, rosetta, and the archive node are stable on the new data.

deepthiskumar changed the title from "Berkeley Archive node migration tool" to "Test/deploy Archive node migration tool" on Oct 12, 2023

psteckler (Member) commented:

we will run the add-berkeley-account app to fix balances

Clarification: that app populates accounts_accessed and accounts_created

psteckler (Member) commented Oct 25, 2023

The plan here doesn't mention the migration cron job suggested in #14288. That cron job would migrate the data incrementally.

Is that different from the Full experiment on stable described above?

dkijania (Member) commented:

I did mention that in the Full experiment test:

Using cron jobs (though perhaps we could increase the run frequency) we will try to migrate the data, while at the same time sending some transactions to the demo node to ensure that those would also be migrated.
