dump of todos/notes #19

Open
musnit opened this issue May 18, 2022 · 0 comments
musnit commented May 18, 2022

  • do a general audit of the documentation and a full code review to identify improvements
  • add more linter rules to reduce code diffs as the team scales up
  • better documentation on each processor type and individual processor
  • CLI cleanup, or even deprecation
  • more secure, production-grade API: run through the PostGraphile production considerations at https://www.graphile.org/postgraphile/production/ and add a maximum query depth or otherwise prevent circular lookups
  • use more db transactions to ensure atomic processor updates, eg:
    • db transactions for bulk update/upsert/delete
    • batch each processor phase into a single db transaction so that all updates and their cursor get committed together (see the transaction sketch after this list)
  • review the SQL schema and ensure all indexes are set up for good query performance
  • add clearer, more comprehensive guidelines on processor requirements (eg: idempotency? progress-tracking requirements? being pedantic about types)
  • better generic contract support: see "review of ingestion schema and more contract examples" (#18) for an overview
  • better error handling/logging/notification: it works OK already, but there are still some things to improve
    • metrics and alarming on errors so that we get notified and can fix them
    • add an errorList to fully process all known error tracks
    • handle metadata 404/526 errors automatically (cull those records, or retry, or...?)
    • catch processing errors from centralized APIs better (eg: if an API goes down or returns empty data) so that a track will not move to processed if the API changes unexpectedly or returns bad data. there are likely some protections against this already, but they are not 100% robust
    • automated delays/retries on errors, especially for handling slow ipfs pinning (eg: sometimes catalog drops don't get propagated properly across ipfs for a few minutes after release) and slow/empty API responses (eg: sound tracks during the preview period); see the retry sketch after this list
    • fix noizd changing ids premint->postmint
  • some of the platform-specific processors could likely still be refactored to be more generic & abstract, based more on a JSON config or DSL, for example processCatalogTracks/categorizeZora -> generic addPlatformAPIData/splitSharedContract (see the config sketch after this list)
  • process audio ourselves from lossless to lossy to ensure we have both for every track
  • add ipfs pinning processor to pin all metadata and media files ourselves, ideally with some edge caching solution too (maybe pinata or fleek are sufficient?)
  • make proper issues for all these todos :)
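
As a rough illustration of the batched-transaction item above, here is a minimal sketch assuming node-postgres (`pg`) and hypothetical table/column names (`processed_tracks`, `processor_cursors`); the real processor and table shapes live elsewhere in the repo:

```typescript
import { Pool } from 'pg';

const pool = new Pool();

// Hypothetical track update shape, for illustration only.
type TrackUpdate = { id: string; metadata: Record<string, unknown> };

// Commit one processor phase's writes and its cursor bump in a single
// transaction, so a crash mid-phase can never leave updated rows with a stale
// cursor (or an advanced cursor with missing rows).
async function commitProcessorPhase(
  processorName: string,
  updates: TrackUpdate[],
  newCursor: string,
): Promise<void> {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');

    for (const update of updates) {
      // Table and column names are assumptions for this sketch.
      await client.query(
        'UPDATE processed_tracks SET metadata = $2 WHERE id = $1',
        [update.id, update.metadata],
      );
    }

    // The cursor only advances if every update above succeeded.
    await client.query(
      'UPDATE processor_cursors SET cursor = $2 WHERE processor = $1',
      [processorName, newCursor],
    );

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}
```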
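
For the automated delay/retry item, a generic exponential-backoff helper could look something like this (a sketch; `fetchTrackMetadata` is a hypothetical function standing in for whatever call needs retrying):

```typescript
// Retry an async operation with exponential backoff plus jitter, eg for ipfs
// gateways that take a few minutes to propagate a new catalog drop, or for
// platform APIs that intermittently return empty data.
async function withRetries<T>(
  operation: () => Promise<T>,
  { maxAttempts = 5, baseDelayMs = 2_000 } = {},
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // 2s, 4s, 8s, ... plus up to 1s of jitter.
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 1_000;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Hypothetical usage: treat an empty response as a retryable failure rather
// than marking the track as processed with bad data.
// const metadata = await withRetries(() => fetchTrackMetadata(trackId));
```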
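
And for the generic/declarative processor item, this is one possible shape for a JSON-ish platform config; every field name here is made up to illustrate the idea, none of it comes from the existing codebase:

```typescript
// Hypothetical declarative config that a generic addPlatformAPIData /
// splitSharedContract processor could consume, instead of hard-coding the
// logic per platform in processCatalogTracks / categorizeZora.
type PlatformConfig = {
  name: string;                                    // eg: 'catalog', 'zora'
  sharedContract?: {
    address: string;                               // contract shared with other platforms
    isPlatformToken: (tokenId: string) => boolean; // how to claim a token from that contract
  };
  api?: {
    baseUrl: string;
    trackEndpoint: (tokenId: string) => string;    // where to fetch extra metadata
    mapResponse: (raw: unknown) => Record<string, unknown>; // normalize into our schema
  };
};

// Placeholder set of known token ids; in practice this would be populated
// from the platform API or an on-chain lookup.
const catalogTokenIds = new Set<string>();

// Example instance with placeholder values (not real addresses or URLs).
const catalogConfig: PlatformConfig = {
  name: 'catalog',
  sharedContract: {
    address: '0x0000000000000000000000000000000000000000',
    isPlatformToken: (tokenId) => catalogTokenIds.has(tokenId),
  },
  api: {
    baseUrl: 'https://api.example-catalog.invalid',
    trackEndpoint: (tokenId) => `/tracks/${tokenId}`,
    mapResponse: (raw) => ({ ...(raw as Record<string, unknown>) }),
  },
};
```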
musnit pushed a commit that referenced this issue Aug 2, 2022