
Feature resize images #27 (Open)

billybonks wants to merge 15 commits into main from feature/resize-images
Conversation

@billybonks (Collaborator) commented May 28, 2022

Summary

Created a table for artworks in order to have flexibility in the number of sizes, rather than adding 2 or 3 columns per size. Each artwork row has two notable columns:

  1. one for the hash
  2. one for the error

The trigger checks whether there is any track that does not have an entry in the artwork table. Artwork processing is atomic, so either both sizes get created or neither does.
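The trigger check described above can be sketched as follows. This is a minimal sketch over ids only; in the actual DB it would be a left join against the artwork table with a null filter, and the function name mirrors the trigger named later in this PR.

```typescript
// Hypothetical sketch of the trigger: pick out processed tracks that have no
// entry in the artwork table yet. Reduced to a pure function for illustration.
function missingProcessedArtworks(
  trackIds: string[],
  artworkTrackIds: Set<string>,
): string[] {
  // A track qualifies for processing when no artwork row references it.
  return trackIds.filter((id) => !artworkTrackIds.has(id));
}
```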

The IPFS pinning trigger + processor has been built but not tested.

Deployment

  1. Create a PR with the migration and merge it first
  2. Install ImageMagick on the EC2 instance
  3. Add the new ENV variables
  4. Merge this PR

State after deployment of this PR

700x700 and 200x200 sizes will be generated for all existing tracks, and they should be pinned.

Next steps

The next step would be to return a URL via GraphQL and add a toggle in the app to see the experience with the new assets.

GA Release requirements

If we are happy with the results, then we can decide:

  1. Do we want to store the original reference somewhere?
  2. Do we want to use the existing API, or do we want to make a new API for the assets?
  3. Should we pre-render all URLs for a minor performance gain?

@billybonks billybonks force-pushed the feature/resize-images branch 2 times, most recently from 82cc0b3 to 4890997 Compare May 28, 2022 13:44
@billybonks billybonks requested a review from musnit May 28, 2022 13:44
@billybonks billybonks force-pushed the feature/resize-images branch 3 times, most recently from 579bdb4 to e0cfe7d Compare May 28, 2022 13:55
@billybonks billybonks changed the title Feature/resize images Feature resize images May 28, 2022
@billybonks billybonks changed the base branch from main to lint/autofix May 28, 2022 13:56
package.json (outdated diff)
@@ -45,6 +46,9 @@
"ethers": "^5.6.2",
"graphql": "^16.3.0",
"graphql-request": "^4.2.0",
"imagemagick": "^0.1.3",
"ipfs-api": "^26.1.2",
"ipfs-http-client": "^56.0.3",
@billybonks (Collaborator, Author): I decided to use this just because it was easier for me; happy to use the non-sugar client. I see you mostly interact with axios.

}

async function uploadBuffer(buffer: Buffer) {
const url: any = '/ip4/127.0.0.1/tcp/5011'; // have to use any because create only accepts hash
@billybonks (Collaborator, Author): This needs to change to an env var. Do we have a write URL in the env variables?
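A minimal sketch of moving the hardcoded address into an env var. IPFS_API_URL is an assumed variable name (the actual env var for this repo was shared privately below), with the local daemon address from the diff as the fallback.

```typescript
// Hypothetical: read the IPFS write endpoint from the environment instead of
// hardcoding it. IPFS_API_URL is an assumed name, not the repo's real variable.
function ipfsWriteURL(): string {
  // Fall back to the local daemon multiaddr used in the PR if the var is unset.
  return process.env.IPFS_API_URL ?? '/ip4/127.0.0.1/tcp/5011';
}

// usage with ipfs-http-client:
// const client = create({ url: ipfsWriteURL() });
```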

@musnit (Contributor): Sent to you directly.

@billybonks (Collaborator, Author): Thanks.

name: 'processTrackArtworks',
trigger: missingProcessedArtworks,
processorFunction: processorFunction,
};
@billybonks (Collaborator, Author): Do we care about newlines at the end of files? I can set a rule for it.

@musnit (Contributor) May 30, 2022: Whatever, but yes, I am a fan of spamming more linting rules.

@billybonks (Collaborator, Author): lol, I am a fan as well. What editor are you using, btw?

@musnit (Contributor): VSCode.

@@ -18,6 +18,12 @@ export type NFTTrackJoin = {
processedTrackId: string;
}

export type Artworks = {
@billybonks (Collaborator, Author): I can make a new file for this type if required.

@musnit (Contributor): No problem, whichever. trackId and error seem to be missing?

@musnit (Contributor): Also, should this be singular, just Artwork?

@billybonks (Collaborator, Author): It probably should be singular.

@billybonks billybonks force-pushed the feature/resize-images branch from 680e009 to da6ebc5 Compare May 28, 2022 14:03
@@ -0,0 +1,19 @@
import { Knex } from 'knex';
@musnit (Contributor): I see you created this migration properly with a timestamp. We need to rename the other migrations to have timestamps too, as they're currently just in numeric order and could clash with timestamped ones. Will do before merging this PR.

@billybonks (Collaborator, Author): Haha, yes, I saw. I was going to ask if I need to make mine use the custom numbers.

import { ProcessedTrack } from '../../types/track';
import { rollPromises } from '../../utils/rollingPromises';

const name = 'addMetadataIPFSHash';
@musnit (Contributor):
Suggested change
const name = 'addMetadataIPFSHash';

unused?

await callback(path);
})
}
// when i use ProcessedTrack it claims lossyArtworkURL does not exist :shrug:
@musnit (Contributor):
Yeah, it's an optional field:

type ProcessedTrackArtwork = { lossyArtworkURL: string } | { lossyArtworkIPFSHash: string };

So the type claims that at least one of these fields will be there, but not necessarily both. E.g. some music NFT platforms only provide a centralized artwork URL with no IPFS involvement, while others only provide an IPFS hash and no URL.

I think the IPFS hash should take priority, so the logic should be something like:

  • If there is an IPFS hash, get the image from ipfsClient.getHTTPURL(processedTrack.lossyArtworkIPFSHash);
  • If there is no IPFS hash, get the image from processedTrack.lossyArtworkURL

Also, for the first case in the condition above to actually work, we'll require that:

  • lossyArtworkIPFSHash has been successfully pinned
  • pinning has completed

Otherwise there is a race condition where processTrackArtwork runs before the ipfsHash has completed pinning. Anyway, I think it's fine for now; don't worry about this, and if the race condition occurs it will just error, and we can clean up the errors and retry later.

The URL may not exist if, for example, the artwork is only on IPFS, in which case we need to pull it from our IPFS gateway as in src/clients/ipfs.ts:getHTTPURL.

  • hm this also means that pinning the
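The priority logic discussed in this thread could be sketched as below. This is a hedged sketch: the optional-field type shape, the placeholder gateway base URL, and the standalone getHTTPURL helper (standing in for src/clients/ipfs.ts:getHTTPURL) are all assumptions for illustration.

```typescript
// Hypothetical shape: a track may carry either field, or both.
type ArtworkSource = {
  lossyArtworkIPFSHash?: string;
  lossyArtworkURL?: string;
};

// Stand-in for src/clients/ipfs.ts:getHTTPURL; the gateway base is a placeholder.
const getHTTPURL = (hash: string): string => `https://ipfs.io/ipfs/${hash}`;

function artworkSourceURL(track: ArtworkSource): string {
  // IPFS hash takes priority: the hash is stored on chain, so it is more robust.
  if (track.lossyArtworkIPFSHash) {
    return getHTTPURL(track.lossyArtworkIPFSHash);
  }
  // Fall back to the centralized URL when no hash exists.
  if (track.lossyArtworkURL) {
    return track.lossyArtworkURL;
  }
  throw new Error('track has neither an IPFS hash nor an artwork URL');
}
```

Note this sketch ignores the pinning race condition discussed above; as the reviewer notes, an unpinned hash would simply error and can be retried later.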

@billybonks (Collaborator, Author):
"lossyArtworkIPFSHash has been successfully pinned; pinning has completed"

So: check the database first, then if it's pinned use the hash, otherwise the URL.

My assumption for why we use the ipfsHash first is that the probability that the IPFS copy will always exist is higher than for the URL? Or is it a perf thing with the pin?

@musnit (Contributor) Jun 3, 2022:
The ipfsHash gets first priority because the hash is stored on chain and so is more robust. Whether the actual image is there or not is another question, but yes, I think it's also more likely that the image exists.

Ideally, though, the code should be able to handle all combinations of an existing hash/URL or not, and an existing actual file or not.

We can improve the edge cases it handles in a future PR.

})
}
// when i use ProcessedTrack it claims lossyArtworkURL does not exist :shrug:
const processArtwork = async function (clients: Clients, nft: any): Promise<void> {
@musnit (Contributor) May 30, 2022:

Suggested change
const processArtwork = async function (clients: Clients, nft: any): Promise<void> {
const processArtwork = async function (clients: Clients, processedTrack: ProcessedTrack): Promise<void> {

(or maybe just name the var track if simpler, but nft is 100% misleading)

See the above comment for why this codebase is quite pedantic about types and rarely uses any. I think in most cases in this codebase, if it feels like TypeScript is getting in the way, it's actually likely revealing some edge case you haven't thought of. Also, don't be shy about keeping the TypeScript docs on hand while working with this codebase, because there are some uses of more esoteric features that require a more robust understanding of TypeScript.

@billybonks (Collaborator, Author):
Yes, this is true; it's mega misleading, a bit of a legacy copy-paste. I'll fix the usage of the type :)

let imageProcessingResults: any = null;
try {
const [largeImageBuffer, thumbnailImageBuffer]
= await Promise.all([resizeImage(originalPath, '700x700'), resizeImage(originalPath, '200x200')]);
@musnit (Contributor) May 30, 2022: Maybe put the sizes in a constant config var at the top of the file?

@billybonks (Collaborator, Author): Sure.
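The suggested constant could look like the sketch below. The name ARTWORK_SIZES is an assumption; the geometry strings come from the diff above, and the commented-out usage assumes the PR's resizeImage(originalPath, geometry) helper.

```typescript
// Hypothetical config constant hoisting the resize targets to the top of the file.
const ARTWORK_SIZES = {
  large: '700x700',
  thumbnail: '200x200',
} as const;

// usage, assuming the PR's resizeImage helper:
// const [largeImageBuffer, thumbnailImageBuffer] = await Promise.all([
//   resizeImage(originalPath, ARTWORK_SIZES.large),
//   resizeImage(originalPath, ARTWORK_SIZES.thumbnail),
// ]);
```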

{ error: e.toString(), size: 'thumbnail', trackId: nft.id }
]
} finally {
await clients.db.insert(Table.processedArtworks, imageProcessingResults);
@musnit (Contributor):
Hmm, so this will insert every processedArtworks row in a new DB request, one request per artwork, with all of them happening in parallel? Not batched/super efficient, but I think it's probably fine nonetheless?

@billybonks (Collaborator, Author):
I think it's super easy to batch; I can do a return here and insert afterwards. My main goal was to get the insert into the DB, since I am using the non-cursor approach.
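The batching idea could be sketched as below: each processor returns its rows instead of inserting them, the rows are flattened, and a single write goes out at the end. The ArtworkRow shape is an assumption based on this thread; the final write could use knex's batchInsert, e.g. `await knex.batchInsert(Table.processedArtworks, rows, 100);`.

```typescript
// Hypothetical row shape from the discussion: trackId + size, plus cid or error.
type ArtworkRow = { trackId: string; size: string; cid?: string; error?: string };

// Pure step: collect the per-track result arrays into one flat list, so a
// single chunked INSERT can replace one DB request per artwork.
function flattenArtworkBatches(batches: ArtworkRow[][]): ArtworkRow[] {
  return batches.flat();
}
```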

const buffer = await imageBuffer(nft.lossyArtworkURL);
return temporaryWrite(buffer, async function (originalPath) {
console.log(`wrote image to ${originalPath}`)
let imageProcessingResults: any = null;
@musnit (Contributor):
Suggested change
let imageProcessingResults: any = null;
let imageProcessingResults: Artworks[] = [];

?

@@ -18,6 +18,12 @@ export type NFTTrackJoin = {
processedTrackId: string;
}

export type Artworks = {
id: number;
cid: string;
@musnit (Contributor) May 30, 2022:
cid is optional (e.g. if there's an error), and error is also optional, but at least one of the two is required.

So, to be super pedantic, you could have a union type cidOrError = { cid: string } | { error: string }

and something like:

Artwork = {
  trackId: string;
  size: string;
} & cidOrError
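Spelled out, the suggested type could look like the sketch below (names follow the comment above; the example rows are hypothetical). The union forces every row to carry a cid or an error, which the compiler then checks at each construction site.

```typescript
// Union of the two valid outcomes: a pinned cid, or a processing error.
type cidOrError = { cid: string } | { error: string };

// Every artwork row has a track and size, plus exactly one outcome branch.
type Artwork = {
  trackId: string;
  size: string;
} & cidOrError;

// Hypothetical rows: a successful resize, and a failed one.
const ok: Artwork = { trackId: 't1', size: '700x700', cid: 'Qm123' };
const failed: Artwork = { trackId: 't1', size: '200x200', error: 'resize failed' };
```

A row with neither cid nor error would fail to type-check, which is the pedantry the reviewer is after.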

export const up = async (knex: Knex) => {
console.log('Running create contracts bootstrap');
await knex.schema.createTable(Table.processedArtworks, (table: Knex.CreateTableBuilder) => {
table.increments('id');
@musnit (Contributor) May 30, 2022:
Hmm, I'm not sure this should have an incrementing id? trackId should be unique and could just be used as the primary key. We do just use that in other places; I don't think there are any autoincrementing ids anywhere else.

One of the motivations & requirements for having no autoincrementing ids is to ensure that if we open source this and someone else runs it with their own endpoint, their DB looks the same and has the same ids as ours. With autoincrementing ids this can't be guaranteed. For example, consider the following scenario:

  • we run the db
  • we add sound and catalog
  • we process sound tracks + catalog tracks on tuesday
  • on tuesday we process sound tracks + catalog tracks from tuesday, and we also add noizd and process noizd tracks from monday and tuesday
  • on wednesday we process tracks from all 3

2 weeks later, someone else runs the thing:

  • they process all sound + catalog + noizd tracks from tuesday and wednesday

The other person will end up with a different order of IDs, since our initial run had the monday noizd tracks mixed in.

The current design has a nice property across all tables where the order of all operations in the pipeline is basically irrelevant, and 2 different people running it will eventually converge on mostly the same state and the same DB contents (with some minor buggy exceptions, probably).

This also means that if we add a peer-to-peer network at some point in the future, where different nodes share and sync their data, it will be easy to do so if everyone is on the same page about ordering, as there will be no need for debate or consensus between nodes on ordering.

(This motivation/requirement may not be that important, tbh; it could be a bit overkill/premature optimization, but given we've got this property already and it's easy to preserve for now, I think it's worth preserving.)

@billybonks (Collaborator, Author):
Yes, I figured this would be a problem based on the other code I read through :). I was being a bit lazy. I can't use trackId alone, since there are multiple images per track, but I could follow your format that makes ids deterministic.

@musnit (Contributor):
Yes. I mean, it may make sense to actually process images from the NFT directly rather than from the track, and use the NFT id too. We would need to handle the case where multiple NFTs all have the same image, though.
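A deterministic id along the lines discussed above could be sketched like this. Since trackId alone is not unique (multiple sizes per track), one option is to combine it with the size; the `${trackId}/${size}` scheme is an assumption for illustration, not the repo's actual id format.

```typescript
// Hypothetical deterministic primary key: derived purely from the row's own
// data, so independent runs converge on the same ids regardless of the order
// in which artworks get processed (unlike an autoincrementing id).
function artworkId(trackId: string, size: string): string {
  return `${trackId}/${size}`;
}
```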

@billybonks billybonks force-pushed the feature/resize-images branch from da6ebc5 to 284c419 Compare June 5, 2022 05:45
Base automatically changed from lint/autofix to main June 5, 2022 05:51
musnit added a commit that referenced this pull request Aug 2, 2022