vcp-skd1 comparison, part 2 #10
Why equivalence classes?
In the previous vcp-skd work, discussed in #9, we ran into a limitation. The equivalence class notion aims to handle this and other similar, but not yet identified, cases, and thereby paint a truer picture of the correspondence between verbs in VCP and verbs in SKD.

basic idea
The notion of 'equivalence classes' is a useful concept in many areas of mathematics. For example, integers can be constructed as equivalence classes of pairs of natural numbers (see). In our situation, we start with the set of entries from the Cologne digitization of a particular dictionary. Each entry has a specific Cologne id, which distinguishes that entry from all other entries in the particular dictionary. And each entry also has a particular headword spelling. Next, in our study of verbs, we use some method to determine which of the entries of the dictionary correspond to a verb. This leads to a list of verb entries.
entry equivalence by headword
We have silently assumed that two verb entries from a particular dictionary are equivalent if they have the same headword spelling.
In applying this to VCP, we get the equivalence classes of vcp_ecs.txt (and vcp_ecs_deva.txt). Consider the vcp equivalence classes (vcp_ecs.txt). An entry is a pair (headword,cologne id).
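A minimal sketch of grouping by headword, assuming the verb entries are already available as (headword, cologne id) pairs. The function name and the sample data are hypothetical and are not taken from vcp_ecs.txt.

```python
from collections import defaultdict


def classes_by_headword(entries):
    """Group (headword, cologne_id) pairs into classes keyed by headword."""
    classes = defaultdict(list)
    for headword, cologne_id in entries:
        classes[headword].append((headword, cologne_id))
    return dict(classes)


# Hypothetical sample entries, for illustration only.
sample = [("hw1", "101"), ("hw1", "102"), ("hw2", "150")]
ecs = classes_by_headword(sample)
```

Here the two entries spelled "hw1" fall into one class and "hw2" forms a class of its own.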
equivalence classes with different headwords
We are now in a position to modify the equivalence classes for a given dictionary. We do this by having a particular file. In the case of SKD, this file is skd_ecs_manual.txt.
From skd_verb_filter.txt, we know that there is just one entry
Similarly, I determined that we should think of the entries for 'staBa' and 'stamBa' as equivalent.
There is also a file for VCP: vcp_ecs_manual.txt. Currently this file is empty, which implies that currently we consider all the distinct verb spellings in VCP to be non-equivalent. As @Shalu411 continues studying the non-matching VCP and SKD verbs, I anticipate that we will add
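One way to picture the effect of a manual file is as a merge step over the headword classes. The sketch below assumes the manual equivalences have already been read into lists of headwords that belong together; that in-memory form, the function name, and the Cologne ids are hypothetical, and the actual layout of skd_ecs_manual.txt is not assumed here.

```python
def merge_classes(classes, manual_groups):
    """Merge headword classes whose headwords appear together in a manual group."""
    parent = {hw: hw for hw in classes}

    def find(hw):
        # Follow parent links to the representative headword of the class.
        while parent[hw] != hw:
            parent[hw] = parent[parent[hw]]
            hw = parent[hw]
        return hw

    for group in manual_groups:
        known = [hw for hw in group if hw in parent]
        for other in known[1:]:
            root_a, root_b = find(known[0]), find(other)
            if root_a != root_b:
                parent[root_b] = root_a

    merged = {}
    for hw, members in classes.items():
        merged.setdefault(find(hw), []).extend(members)
    return merged


# Hypothetical ids; only the pairing of 'staBa' with 'stamBa' comes from the text.
classes = {
    "staBa": [("staBa", "1")],
    "stamBa": [("stamBa", "2")],
    "other": [("other", "3")],
}
merged = merge_classes(classes, [["staBa", "stamBa"]])
unchanged = merge_classes(classes, [])
```

With an empty list of manual groups, as would be the case for an empty manual file, the classes come back unchanged.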
Matching VCP and SKD equivalences manually
The main focus of this research is to match VCP and SKD verbs.
The mapping between vcp equivalence classes and skd equivalence classes is presented in two forms.

short form of matching report
This report form is vcp_skd_ec_map.txt (or vcp_skd_ec_map_deva.txt). Each line of this report shows a vcp equivalence class and an skd equivalence class. When there is no match for a vcp equivalence class, the skd equivalence class shows as '?'. When there is no match for an skd equivalence class, the vcp equivalence class shows as '?'. One further annotation is
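The '?' convention of the short form can be sketched as follows. This is only an illustration under stated assumptions: classes are identified by a single key, matches arrive as a dictionary from vcp key to skd key, and rows are returned as pairs. The exact line layout of vcp_skd_ec_map.txt is not reproduced, and all names and keys are hypothetical.

```python
def short_report(vcp_keys, skd_keys, matches):
    """Return (vcp, skd) rows, with '?' on whichever side has no match."""
    skd_set = set(skd_keys)
    matched_skd = set()
    rows = []
    for v in vcp_keys:
        s = matches.get(v)
        if s in skd_set:
            rows.append((v, s))
            matched_skd.add(s)
        else:
            rows.append((v, "?"))
    # skd classes that no vcp class was matched to
    for s in skd_keys:
        if s not in matched_skd:
            rows.append(("?", s))
    return rows


# Hypothetical keys, for illustration only.
rows = short_report(["v1", "v2"], ["s1", "s2"], {"v1": "s1"})
```

Every vcp class and every skd class appears in at least one row, so the non-matches on both sides are visible in the same list.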
long form of matching report
This is report vcp_skd_ec_verb2.html and the Devanagari version
The long form report contains all the information of the short report.

additional information of long report
The long report has some detail from the underlying dictionary entries. Note the
Mapping principle
There are currently two methods of matching -- a 'manual' method and a 'general' method. The 'manual' method uses a file of headword spelling correspondences: vcp_skd_map.txt. For example
These correspondences were developed mostly by me; I think @Shalu411 found some of them. The 'general' method uses the rule:
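How the two methods might be combined is sketched here, with the manual correspondences tried first. The general rule is left as a caller-supplied function because its details are not assumed; the same-spelling rule used in the usage lines is only a stand-in for illustration. The dictionary form of the manual correspondences, the function names, and the headwords are all hypothetical.

```python
def match_headwords(vcp_headwords, skd_headwords, manual_map, general_rule):
    """Match vcp headwords to skd headwords: manual map first, then a general rule."""
    skd_set = set(skd_headwords)
    result = {}
    for v in vcp_headwords:
        manual = manual_map.get(v)
        if manual in skd_set:
            result[v] = (manual, "manual")
            continue
        candidate = general_rule(v)
        if candidate in skd_set:
            result[v] = (candidate, "general")
    return result


def same_spelling(headword):
    """Stand-in general rule (an assumption): propose the identical spelling."""
    return headword


# Hypothetical headwords, for illustration only.
found = match_headwords(
    ["a1", "b1", "c1"],
    ["a2", "b1"],
    {"a1": "a2"},
    same_spelling,
)
```

Headwords that neither method matches are simply absent from the result, which corresponds to the '?' rows of the reports.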
Next steps
I think the next step is to continue the comparison of non-matches, using the two 'ec' mapping reports. This will likely turn up more examples like 'ujJa'. For example, I think 'drA'/'drE' is such an example. There are also some other kinds of cases which probably should match. It might be
@Shalu411: the ball is now in your court! Hope I've given you enough material to proceed.
Hariom Jim
So do I, our Bangalore Sanskrit scholar.
@Shalu411 The main report is referred to as the 'long form of matching report', mentioned in
The report file is named 'vcp_skd_ec_verb2_deva.html'. To get this file:
Your main task is to find how to resolve the '=?' cases of this report. For example, with
Off-line issue of Typos-
Namaste
Others are not found. I am not noting them separately when a match is not found.
Right, that's the way to do it.
Re अटाट्या: Do you suggest that SKD has a print error at अट्ट क तौच्छ्ये -- the error being that this
Also, how to translate क तौच्छ्ये?
Disagree. SKD अन्धं 1295 is a nominal while VCP अन्ध 2362 is a verb. VCP अन्ध 2363 and 2388 are nominals. Comparing texts, I think VCP अन्ध 2363 corresponds to SKD अन्धं 1295.
How is the list above derived? What is your method? What things are you looking at (what files and/or displays)? What are the criteria for putting something in the list? Are all the items supposed to be verbs? Why is VCP ऋश 10520 = SKD ऋश 5410 in the list (even though the spelling is the same in VCP and SKD)?
Namaste
Hope @funderburkjim is happy with the answers.
This continues the root-matching exercise discussed in #9.
The programs and reports are in the vcp_skd1 directory. To see html files as html in your browser,
you will need to download the raw files and open the downloaded files in your browser.