Test vidyut.prakriya programmatically against the Siddhanta Kaumudi. #157
Comments
Is this along the lines of what you have in mind?
Yes, this is almost exactly it! If you have time, these small tweaks would make the data basically perfect:
Once we have this data, we can turn it into a Google Sheet or similar and ask volunteers to go through these systematically and flag which ones are real errors. +cc @neeleshb
Something like this? https://docs.google.com/spreadsheets/d/1Sa_TI5-C37gRuepdn5Iwt0PTQTRrMYUp7rWiXIwQ2Sg/edit?usp=sharing FYI - the SK data that you linked to in the issue is indexed by |
Ah, how beautiful! Yes, this is basically it. I thought the data might be too large for sheets to handle in one sheet, but it's loading smoothly on my end. Misc notes:
Otherwise this is already immensely useful. I'm surprised words like |
(1) I suggest something like: if (2) of course, that's only natural when getting something up and running. 🙏 |
I found the issue with I'll continue going through this list. Would love your help rerunning this for a future iteration. Can you attach or share your script here? |
The idea was yours, so credit to you as well. Here is the script: https://gist.github.com/avinashvarna/d81f0304f3105206df4691f215da85c2 I can try to rerun it if I have time, but if not, feel free to use the script linked above.
The spreadsheet has been updated. I've moved the previous sheet to |
Here's an important project I'd love help with: find words that vidyut.kosha does not understand so that we can add them to our test cases and improve our output. This is a high-impact project that needs only a bit of time and some basic knowledge of Python and vyAkaraNam.
Here's the basic idea:
Once we have this CSV, we can turn it into a spreadsheet and share it with volunteers to mark which are real errors and which are just noise.
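As a rough illustration of the CSV step, here is a minimal sketch. It does not call vidyut at all: `is_known` is a hypothetical stand-in for whatever lookup you build on top of vidyut.kosha, and the function names, the sample words, the `word,count` column layout, and the file name `unknown_words.csv` are all assumptions made for this example.

```python
import csv
from collections import Counter
from typing import Callable, Iterable


def find_unknown_words(words: Iterable[str], is_known: Callable[[str], bool]) -> Counter:
    """Count every word that the lookup function does not recognize."""
    unknown = Counter()
    for word in words:
        if not is_known(word):
            unknown[word] += 1
    return unknown


def write_report(unknown: Counter, path: str) -> None:
    """Write one row per unknown word, most frequent first."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "count"])
        for word, count in unknown.most_common():
            writer.writerow([word, count])


# Stand-in lexicon for illustration only (hypothetical); replace
# `known.__contains__` with a real lookup against the dictionary.
known = {"gacCati", "rAmaH"}
unknown = find_unknown_words(["rAmaH", "gacCati", "xyz", "xyz"], known.__contains__)
write_report(unknown, "unknown_words.csv")
```

Passing the lookup in as a plain callable keeps the counting and CSV logic independent of any particular library version, so the same script can be rerun as the lookup improves. Sorting by frequency puts the most common misses at the top of the sheet, where volunteers are likely to look first.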
Tips:
Please share your early results either here or on our Discord server on the #vidyut channel.