Optimise handling of duplicate (miner_id, piece_cid) pairs #59

Open
bajtos opened this issue Jan 30, 2025 · 2 comments

@bajtos
Member

bajtos commented Jan 30, 2025

There may be multiple deals with the same combination of (miner_id, piece_cid) values; that's how the UNIQUE constraint proposed in #31 works.

From what I've seen in fil-deal-ingester, it's very likely there will be many such duplicates.

Let's implement more efficient handling of such duplicates, so that we reduce the number of PieceIndexer/IPNI calls.

I see two situations to consider:

  • When we list all deals missing the payload CID, there will be duplicate (miner_id, piece_cid) pairs. I think this is unlikely to happen in practice after we complete the initial run.
  • When a new deal is created from a claim event, there is already an older deal with the same (miner_id, piece_cid) and with a resolved payload_cid.
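
Both situations above can be illustrated with a rough sketch. This is not the deal-observer implementation: the `Deal` record, the `resolve_missing_payload_cids` function, and the `lookup_payload_cid` callback (standing in for the PieceIndexer/IPNI call) are hypothetical names, and the real service would most likely do the grouping in SQL. The idea is to group deals by (miner_id, piece_cid), reuse a payload CID that an older deal has already resolved, and otherwise make one lookup per unique pair.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Deal:
    # Hypothetical, simplified deal record; the real schema has more columns.
    id: int
    miner_id: int
    piece_cid: str
    payload_cid: Optional[str] = None


def resolve_missing_payload_cids(
    deals: Iterable[Deal],
    lookup_payload_cid: Callable[[int, str], Optional[str]],
) -> int:
    """Fill in payload_cid where it is missing; return the number of lookups made."""
    pending = defaultdict(list)  # (miner_id, piece_cid) -> deals missing payload_cid
    known = {}                   # (miner_id, piece_cid) -> already resolved payload_cid

    for deal in deals:
        key = (deal.miner_id, deal.piece_cid)
        if deal.payload_cid is None:
            pending[key].append(deal)
        else:
            # Second situation: an older deal with the same pair is resolved.
            known[key] = deal.payload_cid

    lookups = 0
    for key, group in pending.items():
        payload_cid = known.get(key)
        if payload_cid is None:
            # First situation: one external call per unique pair, not per deal.
            payload_cid = lookup_payload_cid(*key)
            lookups += 1
        if payload_cid is not None:
            for deal in group:
                deal.payload_cid = payload_cid
    return lookups
```

With five deals that share three distinct pairs, one of which already has a resolved payload CID, this sketch makes two lookups instead of four.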
@NikolasHaimerl
Contributor

@bajtos is there any way to distinguish between the older and the newer deal when asking the piece indexer for the payload given a certain combination of (miner_id, piece_cid)?

@NikolasHaimerl
Contributor

When a new deal is created from a claim event, there is already an older deal with the same (miner_id, piece_cid) and with a resolved payload_cid.

Do you mean to exclude new claim events that have an already existing (miner_id, piece_cid) combination in the deal-observer database?

When we list all deals missing the payload CID, there will be duplicate (miner_id, piece_cid) pairs. I think this is unlikely to happen in practice after we complete the initial run.

After the initial run, those deals would be flagged with payload_unretrievable. Is this not a sufficient terminal state for these deals, or do you propose a further action here?
