Ways to check on the quality of images from the wikidata harvester #87

Open
hyanwong opened this issue Jul 6, 2024 · 0 comments
hyanwong commented Jul 6, 2024

At some point soon we'll want to test how the new image processing compares with our current images. This could determine whether, e.g., 35000 is a good default "image quality" rating for auto-harvested wikidata images, and give us a feel for how well the cropping works.

I can think of a few ways to do this, assuming that we can harvest a set of (small) clades over the tree. Of course, once we have harvested, we can either delete or retain all the src=20 images in the DB.

  • We could delete all existing images from a OneZoom instance, run picProcess, and see what the result looks like. This might be useful anyway, to check that we are setting copyright names etc. correctly. We should probably pick clades whose wikidata images cover a few licence types. This would be a reasonable way to see how well cropping works: we could look at the resulting images and compare them to our existing (EoL) images of quality (say) 35000, versus 30000-34999, versus 35001-40000.
  • We could leave the existing EoL images in the images_by_ott table, then see how many get replaced with new wiki images once we harvest the small clade set. We can easily change the rating arbitrarily after the harvest, by editing the DB tables then running picProcess, so we can test the effect of various rating values.
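As a rough illustration of the second option, here is a minimal sketch of how one might estimate, for a candidate default rating, how many existing images would be displaced by harvested wikidata ones. It runs against a toy in-memory SQLite table rather than the real DB. The column names (`ott`, `src`, `rating`), the EoL `src` value, and the rule that the highest-rated image for a taxon wins are all assumptions made for the example, not a description of what picProcess actually does; only `images_by_ott` and `src=20` come from the text above.

```python
import sqlite3

WIKI_SRC = 20  # src value for harvested wikidata images, as mentioned above
EOL_SRC = 2    # hypothetical src value standing in for existing EoL images


def make_demo_db():
    """Build a toy images_by_ott table (assumed columns: ott, src, rating)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images_by_ott (ott INTEGER, src INTEGER, rating INTEGER)")
    rows = [
        (1, EOL_SRC, 34000), (1, WIKI_SRC, 35000),  # EoL rated below 35000
        (2, EOL_SRC, 36000), (2, WIKI_SRC, 35000),  # EoL rated above 35000
        (3, EOL_SRC, 30000),                        # no wiki image harvested
        (4, WIKI_SRC, 35000),                       # no existing image
    ]
    conn.executemany("INSERT INTO images_by_ott VALUES (?, ?, ?)", rows)
    return conn


def rating_band(rating):
    """Bucket a rating into the comparison bands suggested in the issue."""
    if 30000 <= rating <= 34999:
        return "30000-34999"
    if rating == 35000:
        return "35000"
    if 35001 <= rating <= 40000:
        return "35001-40000"
    return "other"


def count_replaced(conn, wiki_rating):
    """Count taxa whose best non-wiki image is rated below wiki_rating
    and which also have a harvested wiki image.

    Assumes the highest-rated image per OTT is the one displayed.
    """
    query = """
        SELECT COUNT(*) FROM (
            SELECT ott, MAX(rating) AS best
            FROM images_by_ott
            WHERE src != ?
            GROUP BY ott
        ) AS existing
        WHERE best < ?
          AND ott IN (SELECT ott FROM images_by_ott WHERE src = ?)
    """
    return conn.execute(query, (WIKI_SRC, wiki_rating, WIKI_SRC)).fetchone()[0]
```

Calling `count_replaced(conn, r)` for several values of `r` gives a quick feel for how sensitive the replacement count is to the chosen default, before editing any real ratings and rerunning picProcess.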

For reference, it's quite useful to look at #85 (comment) which shows some auto-harvested examples.
