Additional wiki functionality: getting a list of alternative images from commons #75
Comments
Note that P373 can have multiple categories, although in many cases it's just one. There is also P910 which could be useful: "topic's main category". Category images are probably not in the dump, or maybe it's in a different dump. But since we'd be dealing with one image at a time, we may as well just call an API. |
Yes, I think it would be an API call to wikimedia. |
For reference, I found the call to make, e.g. for Lion: https://commons.wikimedia.org/w/api.php?action=query&prop=images&titles=Category:Panthera_leo&imlimit=500&format=json&utf8. However, it only returns 18 images, while https://commons.wikimedia.org/wiki/Category:Panthera_leo has 361, so I'm not sure what's going on here. It would be quite easy to have a command that lists all the category images for an ott. It could even return the list formatted as HTML so you can open it locally, but I'm not sure that buys much, since you may as well go to the wikimedia category page and view them there. Now if you want a UI that is served by web2py and can handle taking a user selection and making the update, it's going to be more work, as it needs a backend component. |
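One likely explanation for the 18 vs. 361 mismatch: `prop=images` lists the files embedded on the category page itself, whereas the files that belong to the category are returned by `list=categorymembers` with `cmtype=file`. Below is a minimal sketch of that alternative query, assuming the standard MediaWiki Action API continuation mechanism. The function names and the `User-Agent` string are hypothetical, not anything that exists in `get_wiki_images.py`.

```python
import json
import urllib.parse
import urllib.request

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def category_files_params(category, cmcontinue=None, limit=500):
    """Build query parameters for listing the files in a Commons category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",  # only files, not subcategories or pages
        "cmlimit": str(limit),
        "format": "json",
    }
    if cmcontinue:
        # Token from the previous response, used to fetch the next batch
        params["cmcontinue"] = cmcontinue
    return params


def parse_category_files(response):
    """Return (file titles, continuation token or None) from one response."""
    members = response.get("query", {}).get("categorymembers", [])
    titles = [m["title"] for m in members]
    token = response.get("continue", {}).get("cmcontinue")
    return titles, token


def list_category_files(category):
    """Fetch all file titles in a category, following continuation tokens."""
    titles, token = [], None
    while True:
        query = urllib.parse.urlencode(category_files_params(category, token))
        # Hypothetical User-Agent; replace with something identifying the tool
        req = urllib.request.Request(
            COMMONS_API + "?" + query,
            headers={"User-Agent": "example-script/0.1"},
        )
        with urllib.request.urlopen(req) as resp:
            page, token = parse_category_files(json.load(resp))
        titles.extend(page)
        if token is None:
            return titles
```

Calling `list_category_files("Panthera_leo")` should then return titles of the form `File:...`, one per file in the category, although I have not checked the count against the category page.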
For example, if we go to our classic example https://www.wikidata.org/wiki/Q140 (lion), we get https://www.wikidata.org/wiki/Property:P373 = https://commons.wikimedia.org/wiki/Category:Panthera%20leo, so we can look up: And return all the images from that API call. To convert those image names to thumbnails for viewing, we could, I suppose, follow https://stackoverflow.com/questions/33689980/get-thumbnail-image-from-wikimedia-commons. |
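As a rough illustration of the thumbnail step, one option is the `Special:FilePath` redirect on Commons, which accepts a `width` query parameter and redirects to a scaled version of the file. This is only a sketch: `thumbnail_url` is a hypothetical helper name, and the default width of 200 is an arbitrary choice.

```python
import urllib.parse

FILEPATH_BASE = "https://commons.wikimedia.org/wiki/Special:FilePath/"


def thumbnail_url(file_title, width=200):
    """Build a URL that redirects to a scaled copy of a Commons file.

    Accepts titles with or without the "File:" prefix, as returned by the API.
    """
    prefix = "File:"
    name = file_title[len(prefix):] if file_title.startswith(prefix) else file_title
    # MediaWiki treats spaces and underscores in titles as equivalent
    name = name.replace(" ", "_")
    return FILEPATH_BASE + urllib.parse.quote(name) + f"?width={width}"
```

The alternative is to ask the API for `prop=imageinfo` with `iiprop=url` and `iiurlwidth`, which returns the thumbnail URL directly but costs an extra request per batch of files.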
There is also the "commons gallery" for that taxon (https://www.wikidata.org/wiki/Property:P935), which points in the lion case to https://commons.wikimedia.org/wiki/Panthera%20leo - I believe this is a set of hand-collected images of that taxon, and so is sometimes a nicer curated example of a subset of pictures. We could probably specify whether to get P373 or P935, or both. |
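To make the P373 / P935 choice concrete, here is a minimal sketch of reading either property for a given QID through the Wikidata `wbgetclaims` API action. It assumes both properties hold plain string values and returns a list, since P373 can have several values. The helper names are hypothetical.

```python
import urllib.parse

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def claims_url(qid, prop):
    """Build a wbgetclaims URL for one property of one Wikidata item."""
    params = {
        "action": "wbgetclaims",
        "entity": qid,
        "property": prop,  # e.g. "P373" (Commons category) or "P935" (gallery)
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)


def string_values(response, prop):
    """Extract all string values of `prop` from a wbgetclaims response."""
    values = []
    for claim in response.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        # Skip "novalue" / "somevalue" snaks, which carry no datavalue
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values
```

For the lion, fetching and decoding `claims_url("Q140", "P373")` and passing the result to `string_values(..., "P373")` would be expected to give the category name used in the URL above. P910 would need different handling, since its value is an item rather than a string.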
Sorry - I only just saw that we were trying exactly the same thing! I think that you want |
At the risk of feature creep, one very useful wiki API function would be to take the wikidata qID and find the wikimedia commons category from the wikidata API (this is P373), then get a list of thumbnail image URLs of all the images in that category on commons. This would allow us to make a page where you could pick alternative bespoke images to harvest.
I'm guessing that it would be useful to roll that functionality into the get_wiki_images.py file, although it isn't necessary for the CLI use of that file. I'm tending to think of get_wiki_images.py as more like a set of library routines, however.