Gateway tracking whether requested content is in Database #14
Comments
@dchoi27 let me know if you have other thoughts/ideas of things we should look into in this context. Probably a special case if we fail to request but content is in the DB?
Yes, for sure: how old the content is (when it was requested vs. when it was first uploaded). Can we track the metrics around the response for each of the groups above? E.g. if it's pinQueued, does it take longer or is it less reliable to fetch?
Yes, so this would be targeting the incomplete uploads.
Yes, that's a good idea.
Awesome, SGTM
This sounds very similar to the needs and plans we have for niftysave (discussed as recently as today with @mikeal). I'm pulling in @the-simian here. You two may sync up on the roadmap to implement this to meet both needs.
@dchoi27 how important are these stats to us? To make this work, nftstorage/nft.storage#1386 adds logic to hit the nftstorage db for every single CID that is requested from the gateway. That seems like an amplification point, where a spike in traffic to the gateway causes a spike in requests to the nftstorage db... two systems that are currently isolated from each other become co-dependent. In the worst case, a sustainable increase in gateway traffic could be an unsustainable increase in nftstorage db reads... we can and will continue to optimise and grow that db, but I'd feel more comfortable if we ditched these metrics and kept the gateways separate from the nft.storage api.
Also notable: adding these stats makes the current gateway impl less reusable / in need of more customisation to be used as a web3.storage gateway.
So I think the main goals of these stats would be to:
The former probably gets solved by IPFS Elastic Provider in the long run, so if there are good reasons not to do a live lookup for every CID to understand its pin status at request time, it's probably not worth doing. But for the latter, it'd be great to at least have periodic datasets with samples showing a CID and when it was requested vs. when it was uploaded, if there's a way to do that asynchronously and without risking the performance of the entire database.
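To make the "periodic datasets with samples" idea concrete, here is a minimal sketch of how such a sample could be built offline. Everything in it is an assumption for illustration: the function name `sample_content_ages`, the shape of the inputs (a list of `(cid, requested_at)` pairs taken from gateway logs and a `cid -> uploaded_at` mapping exported from the database), and the ISO 8601 timestamp format. It is not how the gateway or nft.storage store this data.

```python
import random
from datetime import datetime


def sample_content_ages(requests, uploaded_at, sample_size, seed=0):
    """Return a sample of rows comparing request time with upload time.

    requests:    list of (cid, requested_at) pairs, ISO 8601 strings (assumed log shape)
    uploaded_at: dict mapping cid -> ISO 8601 upload time (assumed DB export shape)
    """
    # Only CIDs present in the export can be compared.
    known = [(cid, ts) for cid, ts in requests if cid in uploaded_at]

    # Seeded RNG so a periodic job produces a reproducible sample.
    rng = random.Random(seed)
    picked = known if len(known) <= sample_size else rng.sample(known, sample_size)

    rows = []
    for cid, requested in picked:
        age = datetime.fromisoformat(requested) - datetime.fromisoformat(uploaded_at[cid])
        rows.append({
            "cid": cid,
            "requested_at": requested,
            "uploaded_at": uploaded_at[cid],
            "age_seconds": age.total_seconds(),
        })
    return rows


# Hypothetical data: bafyA was requested exactly one day after upload,
# bafyC is not in the export and is skipped.
requests = [
    ("bafyA", "2022-03-02T12:00:00+00:00"),
    ("bafyC", "2022-03-02T13:00:00+00:00"),
]
uploaded = {"bafyA": "2022-03-01T12:00:00+00:00"}
rows = sample_content_ages(requests, uploaded, sample_size=10)
```

Because the job reads from a log extract and a database export rather than querying per request, it can run on whatever schedule the database tolerates.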
The solution here is to go through the logs and get the metrics from a separate analyser, like a DigitalOcean App similar to the checkup tool Alan built.
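A rough sketch of what such a log analyser could look like, assuming the gateway logs are available as JSON lines with `cid`, `ok` and `ms` fields and that pin status can be looked up in batches. The log shape, the `lookup_batch` callback, the `NotInDb` label and the function names are all hypothetical; only the `Pinned` / `PinQueued` groups come from this thread. The point is that unique CIDs are looked up in batches after the fact, so database reads do not scale one-to-one with gateway traffic.

```python
import json
from collections import defaultdict
from statistics import median


def iter_batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def classify_requests(log_lines, lookup_batch, batch_size=100):
    """Group logged gateway requests by the DB state of the requested CID.

    lookup_batch(cids) must return a dict cid -> status for the CIDs it finds;
    it stands in for a batched database query (an assumption of this sketch).
    """
    requests = [json.loads(line) for line in log_lines if line.strip()]
    unique_cids = sorted({r["cid"] for r in requests})

    # One lookup per batch of unique CIDs, not one per request.
    status_by_cid = {}
    for batch in iter_batches(unique_cids, batch_size):
        status_by_cid.update(lookup_batch(batch))

    groups = defaultdict(lambda: {"requests": 0, "ok": 0, "latencies_ms": []})
    for r in requests:
        g = groups[status_by_cid.get(r["cid"], "NotInDb")]
        g["requests"] += 1
        if r["ok"]:
            g["ok"] += 1
            g["latencies_ms"].append(r["ms"])

    return {
        name: {
            "requests": g["requests"],
            "success_rate": g["ok"] / g["requests"],
            # Median latency of successful responses only.
            "median_ms": median(g["latencies_ms"]) if g["latencies_ms"] else None,
        }
        for name, g in groups.items()
    }


# Hypothetical snapshot and log extract.
snapshot = {"bafyA": "Pinned", "bafyB": "PinQueued"}


def lookup_in_snapshot(cids):
    return {c: snapshot[c] for c in cids if c in snapshot}


logs = [
    '{"cid": "bafyA", "ok": true, "ms": 120}',
    '{"cid": "bafyA", "ok": true, "ms": 80}',
    '{"cid": "bafyB", "ok": false, "ms": 30000}',
    '{"cid": "bafyC", "ok": true, "ms": 400}',
]
report = classify_requests(logs, lookup_in_snapshot)
```

The per-group success rate and median latency are one way to answer the earlier question of whether pinQueued content is slower or less reliable to fetch.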
We want to know whether gateway-requested CIDs are root CIDs stored in the Content table (and also whether they are Pinned).
Requirements: