Proper Names in SLP1 ({r}Ama for Rāma) #24
Is what you are looking for a list of MW headwords which are proper names?
Indeed, and not only that. I'm thinking about how to extract all of them from all dictionaries. I'm even ready to do additional markup, but first I would like to hear your bright ideas.
For MW, many will be caught by searching for
So, I propose generating a list of such headwords. What kind of output are you looking for?
The N. search will match some 'L-number' records that are not PROPER names, such as
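A minimal sketch of the kind of list generation discussed above. The exact search string used in the thread is not shown here, so the pattern `N. of` is an assumption, as are the record layout (pairs of SLP1 headword and entry text) and the sample records, which are made up for illustration and are not quotations from MW.

```python
import re

# Assumed pattern: the abbreviation "N." followed by "of" (e.g. "N. of a king").
# The actual search used in the thread may differ.
NAME_PATTERN = re.compile(r"\bN\. of\b")


def proper_name_candidates(records):
    """Return headwords whose entry text matches NAME_PATTERN.

    `records` is an iterable of (headword, text) pairs (a hypothetical,
    simplified layout). Each headword is reported once, in first-seen order.
    """
    seen = set()
    result = []
    for headword, text in records:
        if NAME_PATTERN.search(text) and headword not in seen:
            seen.add(headword)
            result.append(headword)
    return result


# Made-up sample records, for illustration only.
SAMPLE = [
    ("rAma", "N. of a hero"),
    ("gaja", "an elephant"),
    ("rAma", "N. of several men"),
    ("aSoka", "N. of a tree"),
]

print(proper_name_candidates(SAMPLE))
```

The last sample record shows why such a search yields false positives: the pattern also matches names of plants and other things that are not names of persons, so the result is only a candidate list.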
#12 continued. Sure there are many false positives as the
How about adding the data to https://github.com/funderburkjim/MWlexnorm/? How about comparing the list with Mahabharata Index and Puranic Encyclopedia?
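The comparison suggested here could be sketched as below. It assumes that both the candidate list and the reference headwords (for example from an index of names) are already available as plain lists in the same transliteration and spelling; the function name and the sample values are hypothetical.

```python
def split_by_reference(candidates, reference_headwords):
    """Split candidates into those found in a reference list and the rest.

    Assumes both inputs use the same transliteration and spelling, so that
    a plain string comparison is meaningful.
    """
    reference = set(reference_headwords)
    confirmed = [h for h in candidates if h in reference]
    unconfirmed = [h for h in candidates if h not in reference]
    return confirmed, unconfirmed


# Made-up values, for illustration only.
confirmed, unconfirmed = split_by_reference(
    ["rAma", "aSoka", "gaja"], ["rAma", "kfzRa"]
)
print(confirmed)
print(unconfirmed)
```

Candidates that also occur in the reference list are more likely to be genuine proper names; the remainder would still need manual review.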
Re '47027 ... seems closer to the truth': this sounds right. I did not realize that there were so many 'naked' Since you can generate the list, what help are you looking for from me?
How should I spread the ab tags? How can their usage be widened? Looking through my 30 000 additions manually does not sound like a good idea.
I need to see a sample of the file(s) you are using.
Right now only 700 words have them. I propose that 30 000 should. There are no sample files. I need to understand how I can contribute to this markup expansion project, if you agree. I'm looking for names of living creatures.
@Andhrabharati do you understand the issue?
I do, but I have no intention of working on SLP1 stuff!! And I guess you should first consult Peter Scharf before playing around with SLP1 like this.
@gasyoun Here's something that might be relevant for getting some proper names.
Similarly, under The search A slight variant would give some more headwords that can be used as proper nouns:
Another variation
Is this approach in the direction you are going, or have you perhaps already exhausted it?
An entirely different approach might be to use the headwords in INM (Index to names in the Mahabharata). Presumably nearly every headword there is a proper name. Similarly, there are many proper names among the headwords of ACC. And back to the previous comment, another search that would lead to many proper-name headwords is (in MW):
And still another variant with many matches:
Seems like there are lots of searches that will get high density of matches to headwords that may appear as |
As the issue is 'continued' in another repo, this issue could be closed now.
At drdhaval2785/SanskritSorting#27 Jim said
It would be possible to adapt transcoder files
to work with{}
for proper names. As I want to have the proper names in my reverse dictionary, I humbly ask for the proper-name data to be extracted from at least MW. Could you include this in your plans, Jim, please?
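To make the `{r}Ama` idea from the issue title concrete, here is a minimal sketch, not the actual transcoder files. It assumes the braces wrap the first SLP1 letter of a headword to mark it as a proper name, and that the desired output is IAST with an initial capital. The function names are hypothetical, and the mapping covers only the common classical letters (no accents or Vedic extras).

```python
import re

# Standard SLP1 -> IAST correspondences for the common classical letters.
SLP1_TO_IAST = {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au", "M": "ṃ", "H": "ḥ",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}


def slp1_to_iast(text):
    """Transliterate SLP1 to IAST; unknown characters pass through unchanged."""
    return "".join(SLP1_TO_IAST.get(ch, ch) for ch in text)


def render_headword(slp1):
    """Render a headword in IAST, capitalizing it if it starts with {x}.

    The {x} marker (braces around the first letter) is the assumed
    proper-name convention; headwords without it are left in lower case.
    """
    match = re.fullmatch(r"\{(.)\}(.*)", slp1)
    if match is None:
        return slp1_to_iast(slp1)
    iast = slp1_to_iast(match.group(1) + match.group(2))
    return iast[0].upper() + iast[1:]


print(render_headword("{r}Ama"))
print(render_headword("rAma"))
```

Keeping the marker outside the letter itself means the SLP1 value of the letter is unchanged, so sorting and searching can simply strip the braces first.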