'rXX' notes #1

Open
drdhaval2785 opened this issue Nov 22, 2015 · 9 comments

@drdhaval2785
Contributor

Earlier I thought that duplication or non-duplication would be uniform within each dictionary.

A cursory reading of https://github.com/sanskrit-lexicon/hwnorm1/blob/master/proberrors/21violation.txt and https://github.com/sanskrit-lexicon/hwnorm1/blob/master/proberrors/rxx.txt showed me that this is not the case.

See the image.
akarkaSaH
but akarttA

S is not duplicated, t is.
[Image: capture]

@drdhaval2785
Contributor Author

I am trying to document any peculiarities I find.

@gasyoun
Member

gasyoun commented Nov 22, 2015

21violation.txt is what kind of violation?

@drdhaval2785
Contributor Author

21violation.txt

It stores cases where there is NO duplication after 'r' even though the dictionary is one of SKD, VCP, SHS, WIL, YAT, PD.

Our hypothesis states that there is duplication after 'r' in these dictionaries.
Whenever our hypothesis doesn't hold true - it is stored in 21violation.txt.

If it fits in our hypothesis - it is stored in rxx.txt

For cursory statistics:
rxstats.txt - How often a particular pattern occurs in a given dictionary without duplication (even though duplication was supposed to happen).

rxxstats.txt - How often a particular pattern occurs in a given dictionary with duplication.
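The sorting rule described above can be sketched roughly as follows. This is only an illustration, not the actual hwnorm1 code: the function name `classify`, the SLP1 consonant list, and the choice to treat a headword containing both a doubled and an undoubled cluster as `rxx` are all assumptions.

```python
import re

# Dictionaries that, per the hypothesis, double a consonant after 'r'.
DUPLICATING = {"SKD", "VCP", "SHS", "WIL", "YAT", "PD"}

# SLP1 consonants (assumed inventory for this sketch).
CONSONANTS = "kKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"

# 'r' followed by any consonant, and 'r' followed by a doubled consonant.
R_CLUSTER = re.compile("r([" + CONSONANTS + "])")
R_DOUBLED = re.compile("r([" + CONSONANTS + "])\\1")


def classify(headword, dictionary):
    """Return 'rxx', '21violation', or None for a headword/dictionary pair."""
    if dictionary not in DUPLICATING:
        return None  # the hypothesis says nothing about this dictionary
    if R_DOUBLED.search(headword):
        return "rxx"  # fits the hypothesis
    if R_CLUSTER.search(headword):
        return "21violation"  # 'r' + consonant, but no duplication
    return None  # no 'r' + consonant cluster at all
```

With this sketch, `classify("akarttA", "WIL")` gives `'rxx'` and `classify("akarkaSaH", "WIL")` gives `'21violation'` (the dictionary code `WIL` is used here only for illustration).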

@drdhaval2785
Contributor Author

Analysis of statistics

Observation 1 - As a rule, the PD dictionary does not do duplication. Where duplication occurs, it is an aberration.
See representative statistics:

rkk pattern in PD dictionary is 2 / 12712
rKK pattern in PD dictionary is 0 / 12712
rgg pattern in PD dictionary is 0 / 12712
rGG pattern in PD dictionary is 0 / 12712
rcc pattern in PD dictionary is 0 / 12712
rjj pattern in PD dictionary is 0 / 12712
rtt pattern in PD dictionary is 20 / 12712
rdd pattern in PD dictionary is 0 / 12712
rpp pattern in PD dictionary is 0 / 12712
rmm pattern in PD dictionary is 2 / 12712
ryy pattern in PD dictionary is 0 / 12712
rvv pattern in PD dictionary is 2 / 12712

Therefore, PD deserves to be removed from the list of duplicating dictionaries.
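Lines like the ones above could be produced by a small counting routine such as the following sketch. The input format (a list of `(headword, dictionary)` pairs) and the function names are assumptions, and so is the reading of the denominator as the total number of entries examined.

```python
from collections import Counter


def pattern_stats(entries, consonants):
    """Count headwords containing 'r' + doubled consonant, per dictionary.

    entries    -- iterable of (headword, dictionary) pairs (assumed format)
    consonants -- SLP1 consonants to test, e.g. "kKgGcjtdpmyv"
    Returns (counts, total) where counts[(consonant, dictionary)] is a tally.
    """
    entries = list(entries)
    counts = Counter()
    for headword, dictionary in entries:
        for c in consonants:
            if "r" + c + c in headword:
                counts[(c, dictionary)] += 1
    return counts, len(entries)


def format_line(c, dictionary, count, total):
    """Render one statistics line in the style used in this thread."""
    return "r%s%s pattern in %s dictionary is %d / %d" % (
        c, c, dictionary, count, total)
```

For example, `format_line("t", "PD", 20, 12712)` reproduces the `rtt` line quoted above.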

@drdhaval2785
Contributor Author

Observation 2 -
None of the dictionaries do duplication in the case of 'K','G','N','C','J','Y','w','W','q','Q','T','D','n','P','B','r','l','S','z','s','h'

Exception - there is a sole entry 'rww' in YAT. dArwwura:YAT
Exception - there is a sole entry durnnirIkzya:WIL
Exception - there are two entries durllaBa:WIL and sudurllaBaH:SKD

Grammar note - the non-duplication in 'K','G','C','J','W','Q','T','D','P','B' is due to the fact that the duplicated entries are converted to 'rkK','rgG','rcC','rjJ','rwW','rqQ','rtT','rdD','rpP','rbB' because of the rule झलां जश् झशि॥ ८।४।५२. This means that I would have to capture these patterns also.
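A minimal sketch of how these extra patterns could be generated, assuming SLP1 transliteration. The mapping table and the function name `doubled_pattern` are illustrative only and are not taken from the repository.

```python
# SLP1 aspirated stops mapped to their unaspirated counterparts.
DEASPIRATE = {
    "K": "k", "G": "g", "C": "c", "J": "j", "W": "w",
    "Q": "q", "T": "t", "D": "d", "P": "p", "B": "b",
}


def doubled_pattern(consonant):
    """Return the pattern to search for when `consonant` is doubled after 'r'.

    For an aspirated stop the first member of the pair is written
    unaspirated (e.g. 'K' gives 'rkK'); any other consonant is simply
    repeated (e.g. 't' gives 'rtt').
    """
    first = DEASPIRATE.get(consonant, consonant)
    return "r" + first + consonant


# The ten patterns listed in the grammar note.
ASPIRATE_PATTERNS = [doubled_pattern(c) for c in "KGCJWQTDPB"]
```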

@drdhaval2785
Contributor Author

Observation 3 -
Duplication in the case of 'R' is seen predominantly in VCP.
See:

rRR pattern in SKD dictionary is 0 / 12712
rRR pattern in VCP dictionary is 212 / 12712
rRR pattern in SHS dictionary is 0 / 12712
rRR pattern in WIL dictionary is 3 / 12712
rRR pattern in YAT dictionary is 2 / 12712
rRR pattern in PD dictionary is 0 / 12712

drdhaval2785 added a commit that referenced this issue Nov 22, 2015
@gasyoun
Member

gasyoun commented Nov 22, 2015

The absence of uniform duplication rules makes me sad as well.
But I guess we do not need all the possible details; there seem
to be too many. The general pattern stats give a picture that is
good enough to filter out repetitive words if needed.

@funderburkjim
Contributor

addendum to dArwwura:YAT

Based on the sense,
WIL spells this dArddura.

MW spells this as dArdura.

dArwwura:YAT should be considered a print error, and be changed to dArddura:

  • Since YAT closely follows WIL and
  • dArwwura is out of alphabetical order in YAT: dArGasatra, dArQya, dArwwura, dArvvawa
    But dArddura would restore alphabetical order.

Agree with correcting dArwwura to dArddura ?
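The alphabetical-order argument can be checked mechanically. The sketch below assumes SLP1 transliteration and one common Sanskrit collation order (vowels, then anusvara and visarga, then consonants); printed dictionaries may place anusvara and visarga differently, so `SLP1_ORDER` is an assumption, as is the function name.

```python
# Assumed SLP1 collation: vowels, anusvara/visarga, then consonants.
SLP1_ORDER = "aAiIuUfFxXeEoOMHkKgGNcCjJYwWqQRtTdDnpPbBmyrlvSzsh"
RANK = {ch: i for i, ch in enumerate(SLP1_ORDER)}


def sort_key(word):
    """Map a headword to a list of ranks; unknown characters sort last."""
    return [RANK.get(ch, len(SLP1_ORDER)) for ch in word]


def out_of_order(words):
    """Return the headwords that sort before their immediate predecessor."""
    return [
        cur for prev, cur in zip(words, words[1:])
        if sort_key(cur) < sort_key(prev)
    ]


printed = ["dArGasatra", "dArQya", "dArwwura", "dArvvawa"]
corrected = ["dArGasatra", "dArQya", "dArddura", "dArvvawa"]
```

Under this ordering, `out_of_order(printed)` flags `dArwwura`, while `out_of_order(corrected)` returns an empty list.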

@drdhaval2785
Contributor Author

Agree
