Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: check for missing payload #77

Merged
merged 19 commits into from
Feb 7, 2025

Conversation

NikolasHaimerl
Copy link
Contributor

@NikolasHaimerl NikolasHaimerl commented Feb 3, 2025

This PR adds the following changes:

  1. Add a cache for payloads that could not be retrieved
  2. Retrieve payloads up to 4 times with a time delay of 8 hours in between calls
  3. Mark deals as unretrievable if all retries fail

See also this issue.

@NikolasHaimerl NikolasHaimerl marked this pull request as ready for review February 3, 2025 15:09
backend/bin/deal-observer-backend.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
@juliangruber

This comment was marked as outdated.

bajtos

This comment was marked as outdated.

@bajtos

This comment was marked as outdated.

@juliangruber

This comment was marked as outdated.

@NikolasHaimerl
Copy link
Contributor Author

  1. IIUC your design, on every loop iteration, you fetch all deals missing PayloadCID from the database. That will eventually mean fetching tens of millions of records.

I miscalculated the number of deals here. IIUC, this PR will mark deals as not advertised after ~1 day. fil-deal-ingester is adding 40-60k daily these days. Around 33% of deals are advertised to IPNI, according to what I see in the Spark Internal Dashboard. That means around 77% * 40k = ~30k deals to recheck after the service (re)starts.

I apologise for exaggerating the number in my initial comment.

I still prefer to keep our service as stateless as possible. It's a principle we have been following in other services, too. Please correct me if I am wrong, @juliangruber.

I created an implementation plan which makes the service stateless in exchange for an additional field in the database to keep track of when the last time was when we tried to fetch a payload. Please let me know what you think.

Copy link
Contributor

@pyropy pyropy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job, I think we're almost there.

backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
db/migrations/009.do.add-payload-retrievability-column.sql Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
db/migrations/009.do.add-payload-retrievability-column.sql Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
backend/lib/deal-observer.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
backend/lib/piece-indexer.js Outdated Show resolved Hide resolved
db/migrations/009.do.add-payload-retrievability-column.sql Outdated Show resolved Hide resolved
db/migrations/009.do.add-payload-retrievability-column.sql Outdated Show resolved Hide resolved
Nikolas Haimerl added 2 commits February 6, 2025 16:05
@NikolasHaimerl NikolasHaimerl mentioned this pull request Feb 7, 2025
60 tasks
@NikolasHaimerl NikolasHaimerl requested a review from bajtos February 7, 2025 12:11
@@ -79,7 +79,9 @@ export async function storeActiveDeals (activeDeals, pgPool) {
term_start_epoch,
term_min,
term_max,
sector_id
sector_id,
payload_retrievability_state,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
payload_retrievability_state,
payload_retrievability,

(everything stored in Postgres is a state)

If you want to annotate the type, maybe instead this?

Suggested change
payload_retrievability_state,
payload_retrievability_enum,

I suggest we wait for @bajtos with this one

@@ -0,0 +1,4 @@
CREATE TYPE PAYLOAD_RETRIEVABILITY_STATE AS ENUM ('PAYLOAD_CID_NOT_QUERIED_YET', 'PAYLOAD_CID_UNRESOLVED', 'PAYLOAD_CID_RESOLVED','PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CREATE TYPE PAYLOAD_RETRIEVABILITY_STATE AS ENUM ('PAYLOAD_CID_NOT_QUERIED_YET', 'PAYLOAD_CID_UNRESOLVED', 'PAYLOAD_CID_RESOLVED','PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE');
CREATE TYPE PAYLOAD_RETRIEVABILITY_STATE AS ENUM ('PAYLOAD_CID_RESOLUTION_NOT_ATTEMPTED_YET', 'PAYLOAD_CID_UNRESOLVED', 'PAYLOAD_CID_RESOLVED','PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE');

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as not to introduce another concept of "querying"

Comment on lines +3 to +10
const PayloadRetrievabilityState = {
NotQueried: 'PAYLOAD_CID_NOT_QUERIED_YET',
Unresolved: 'PAYLOAD_CID_UNRESOLVED',
Resolved: 'PAYLOAD_CID_RESOLVED',
TerminallyUnretrievable: 'PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE'
}

const PayloadRetrievabilityStateType = Type.Enum(PayloadRetrievabilityState)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const PayloadRetrievabilityState = {
NotQueried: 'PAYLOAD_CID_NOT_QUERIED_YET',
Unresolved: 'PAYLOAD_CID_UNRESOLVED',
Resolved: 'PAYLOAD_CID_RESOLVED',
TerminallyUnretrievable: 'PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE'
}
const PayloadRetrievabilityStateType = Type.Enum(PayloadRetrievabilityState)
const PayloadRetrievabilityStates = {
NotQueried: 'PAYLOAD_CID_NOT_QUERIED_YET',
Unresolved: 'PAYLOAD_CID_UNRESOLVED',
Resolved: 'PAYLOAD_CID_RESOLVED',
TerminallyUnretrievable: 'PAYLOAD_CID_TERMINALLY_UNRETRIEVABLE'
}
const PayloadRetrievabilityState = Type.Enum(PayloadRetrievabilityStates)

Does this make more sense? Payload Retrievability States lists all possible states. Payload Retrievability State is one of these states at a time.

We might want to wait for @bajtos with this one

if (deal.last_payload_retrieval_attempt) {
deal.payload_retrievability_state = PayloadRetrievabilityState.TerminallyUnretrievable
} else {
deal.payload_retrievability_state = PayloadRetrievabilityState.Unresolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so nice to read :)

await updatePayloadCidInActiveDeal(pgPool, deal)
deal.payload_cid = await getDealPayloadCid(minerPeerId, deal.piece_cid)
if (!deal.payload_cid) {
if (deal.last_payload_retrieval_attempt) {
Copy link
Member

@juliangruber juliangruber Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (deal.last_payload_retrieval_attempt) {
if (deal.payload_retrievability_state === PayloadRetrievabilityState.Unresolved) {

@juliangruber
Copy link
Member

@bajtos I'm merging now so that we can hopefully get more payloads assigned over the weekend, and thereby improve Spark DDO data

@juliangruber juliangruber merged commit 5b3f3fa into main Feb 7, 2025
8 checks passed
@juliangruber juliangruber deleted the nhaimerl-piece-indexer-missing-payloads branch February 7, 2025 17:18
Copy link

sentry-io bot commented Feb 7, 2025

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ Error: Connection terminated unexpectedly loadDeals(deal-observer) View Issue
  • ‼️ Error: Error updating payload of deal: { updatePayloadCidInActiveDeal(look-up-payload-cids) View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants