pUrbb vs. pUrvv #20
Comments
TE lines beginning with anusvara
There are 852 lines of vac2 that begin with 'M', the slp1 version of anusvAra. Suggest we remove those initial 'M' in vac2.
This will get us in trouble with @drdhaval2785
So dirt takes as visarga, now that is a huge number. Maybe forget about the idea of correcting it? |
Here's the whole list of 852: filter01.txt |
The point about that 100,000 is that maybe there is a way to 'automate' some significant The lakzmI/lakzmIH example may be idiosyncratic, but perhaps some of the differences |
candrabindu
In the slp1 coding of vac.txt, candrabindu is represented by the character '~'; I believe this However, ~ is not used in vac2 (Tirupati); instead, the candrabindu is represented by 'z'; this Thus we should correct vac2 in such cases (changing such 'z' to '~'). There are 842 matches for ~ in vcp.txt.
A bird's eye view shows that VAC is correct in the majority of places. cInA-MSuka is correct. cInA-Suka is wrong in VCP. So we cannot mechanically change VAC. On the contrary, VCP would require the addition of those missing anusvAras.
I agree. There is no possibility of ष being confused with candrabindu by any typist. So, we can mechanically convert z to ~ where vcp.txt has ~. |
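The mechanical z-to-~ conversion suggested above could be sketched roughly as follows. This assumes the vac2 and vcp words have already been paired up and have equal length; the function name and the sample strings are made up for illustration, not taken from either file.

```python
def restore_candrabindu(vac_word: str, vcp_word: str) -> str:
    """Change 'z' to '~' in vac_word wherever vcp_word has '~' at the same position.

    Assumption: the two words are already aligned. Words of unequal length
    are returned unchanged so that they can be reviewed by hand.
    """
    if len(vac_word) != len(vcp_word):
        return vac_word
    chars = []
    for vac_ch, vcp_ch in zip(vac_word, vcp_word):
        # Only touch a 'z' that sits exactly where vcp has a candrabindu.
        chars.append("~" if vac_ch == "z" and vcp_ch == "~" else vac_ch)
    return "".join(chars)


# Hypothetical sample pairs (vac2 form, vcp form).
print(restore_candrabindu("tazllokam", "ta~llokam"))
print(restore_candrabindu("puruza", "puruza"))
```

A genuine 'z' (ष) is left alone because vcp has no '~' at that position.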
I love the way Jim keeps on identifying low-hanging fruit, to reduce labour.
NYRnm v/s M
I am not sure about the conventions used by VAC and VCP. But I saw some entries in meld which differed in these letters only, e.g. saMKyA and saNKyA. We can derive some stats to check the tendency of each dictionary, and correct the remaining entries to match.
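One way to derive such stats, as a rough sketch: for each class of stops, count how often the stop is preceded by its class nasal and how often by anusvAra 'M'. The grouping below follows the usual slp1 letters; the function name is an assumption, and the sample text is only the pair quoted above plus one made-up word.

```python
import re

# slp1 stops of each class, keyed by the class nasal (N, Y, R, n, m).
CLASS_STOPS = {"N": "kKgG", "Y": "cCjJ", "R": "wWqQ", "n": "tTdD", "m": "pPbB"}


def nasal_stats(text: str) -> dict:
    """Return {nasal: (class-nasal count, anusvAra count)} before stops of that class."""
    stats = {}
    for nasal, stops in CLASS_STOPS.items():
        with_class_nasal = len(re.findall(f"{nasal}[{stops}]", text))
        with_anusvara = len(re.findall(f"M[{stops}]", text))
        stats[nasal] = (with_class_nasal, with_anusvara)
    return stats


print(nasal_stats("saMKyA saNKyA saNga"))
```

Running this over each dictionary separately would show which spelling it prefers for each class.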
duplicated / deduplicated
Check for stats in VCP of The relevant portion from paper normalization.pdf is as below.
Therefore, we should remove all duplications after r in VCP and VAC. |
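A minimal sketch of such a deduplication on slp1 text, under two assumptions of mine: a doubled consonant after r is either the same letter twice (rvv, rmm) or an unaspirated stop followed by its aspirate (rdD, rtT). Some doubled forms may be etymological, so the output would still need review.

```python
import re

# Same consonant written twice after r, e.g. rvv, rmm, rtt.
_SAME = re.compile(r"r([kgcjwqtdpbyvmnRNYlSzs])\1")
# Unaspirated stop + its aspirate after r, e.g. rdD, rtT.
_ASPIRATED = re.compile(r"r(?:kK|gG|cC|jJ|wW|qQ|tT|dD|pP|bB)")


def dedupe_after_r(word: str) -> str:
    """Collapse consonant doubling after r (slp1). Handles r only, not h."""
    word = _SAME.sub(r"r\1", word)
    # For pairs like rdD keep only the aspirate: rdD -> rD.
    return _ASPIRATED.sub(lambda m: "r" + m.group(0)[2], word)


for w in ["pUrvva", "karmma", "ardDa", "Darma"]:
    print(w, "->", dedupe_after_r(w))
```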
I guess one could call it lexicographical hell otherwise.
Right, the nasals.
What other issues of |
षार्वत्यां -> पार्वत्यां was noticed by a user correction (sanskrit-lexicon/csl-orig#495). There are several other zArvat and zArvvat possible errors in VCP to be investigated. |
I would propose that the issue is even wider: |
Before we make a blanket change of 'pUrbba' to 'pUrvva' , I would like to know that the scanned images actually have 'bb' -- Can anyone find 5 instances where 'pUrbba' is clearly 'b' ? |
My remark elsewhere is not just limited to this पूर्ब्ब, but to
as well. The Eastern school (of India) of grammar (and usage) has those forms throughout the literature in (and from) that region. One may look at the
My opinion is that any kind of normalisation should be done in another layer (for searching and displaying etc.), but not in the actual "content" of the printed matter. |
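To illustrate the "another layer" idea, here is a small sketch in which the stored headwords stay exactly as printed and only a derived search key is normalised. The two folding rules (b to v, doubling after r) and all names are my assumptions for the example, not an existing part of the CSL code.

```python
import re
from collections import defaultdict


def search_key(headword: str) -> str:
    """Normalised key used only for lookup; the headword itself is never changed."""
    key = headword.replace("b", "v")  # fold b/v
    # Fold a doubled consonant after r (rvv -> rv).
    return re.sub(r"r([kgcjwqtdpvymnl])\1", r"r\1", key)


def build_index(headwords):
    """Map each normalised key to the original spellings that share it."""
    index = defaultdict(list)
    for hw in headwords:
        index[search_key(hw)].append(hw)
    return index


def lookup(index, query: str):
    return index.get(search_key(query), [])


index = build_index(["pUrbba", "pUrvva", "klIba"])
print(lookup(index, "pUrva"))
print(lookup(index, "klIva"))
```

A search for either spelling finds both printed forms, while the digitised text keeps what the scan shows.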
Interesting thought.
We have some data in tags added, that's all for now, I guess. |
I have some additional information now, and thought I should share it here. (a) The consonant doubling is not prescribed by Mugdhabodha (Vopadeva), but has been identified by Pāṇini himself. So the replacement of the double consonant after r and h with a single consonant can be taken as grammatically all right. (b) Now coming to the perpetual ba/va issue. With this information, we can safely replace the conjunct b with v [for handling the doubling cases, refer to the point above] in all the Bengal-based works (WIL, YAT, SKD, VCP etc.), when such v forms are seen in other regional texts (like AP or the European ones).
The issue is still lingering in my mind. Probably we can do the va/ba replacement (and the reverse case, ba/va, as well) in non-conjunct places too; say, klIva to klIba, if such are the forms used in texts from other regions. On the whole, it appears to be not a deliberately different form in Bengali works but just a limitation of their orthography. The outsiders then took the letters as is, without understanding or knowing the Bengali limitation. This treats the va-ba issue in toto, once and for all, I believe. What do you say, @drdhaval2785?
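The condition "only if such forms are used in other regional texts" could be checked along these lines. This is a sketch: `reference` stands for a set of headwords collected from the other dictionaries, and the function name and sample words are assumptions for illustration.

```python
def attested_bv_fix(word: str, reference: set):
    """Return a b/v-swapped spelling found in `reference`, or None.

    A swap is proposed only when the word itself is not attested and
    exactly one swapped candidate is.
    """
    if word in reference:
        return None  # already agrees with the other dictionaries
    swap = {"b": "v", "v": "b"}
    candidates = {word.replace("b", "v")}  # all b -> v at once (e.g. rbb -> rvv)
    for i, ch in enumerate(word):
        if ch in swap:
            # Swap a single occurrence, leaving the rest untouched.
            candidates.add(word[:i] + swap[ch] + word[i + 1:])
    hits = sorted(c for c in candidates if c in reference)
    return hits[0] if len(hits) == 1 else None


reference = {"klIba", "pUrva", "bala"}  # hypothetical headwords from other dictionaries
print(attested_bv_fix("klIva", reference))
print(attested_bv_fix("pUrba", reference))
print(attested_bv_fix("vana", reference))
```

Words with no attested counterpart are left alone, which keeps the change from being a blind replacement.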
This was discussed elsewhere.
So there is nothing to be lost, but everything to gain, by this b/v change. I am now convinced that we should make changes, and am making such changes in my VAC VCP comparison work.
Good, and now I also have to take back what I said a few months back, that I cannot be a part of the team's exercise with the change suggested by Jim or you (in one of the issues on Meld usage). After you finish your comparison work, I would be glad to proofread the VCP text, for the benefit of everyone.
As I am looking into SKD front pages now, found this piece of info under the section ग्रन्थपरिपाटी (Methodology adopted)- वर्णमालायां च वर्ग्य-जकारान्तःस्थयकारौ मूर्द्धन्य-णकार-दन्त्यनकारौ वर्ग्यवकारान्तःस्थवकारौ तालव्यशकार-मूर्द्धन्यषकारदन्त्यसकाराः सन्ति । एतदखिल-वर्णादि-शब्दानां धातूनाञ्च प्रभेदं कृत्वा सूचीपूर्व्वकं यथास्थानं संस्थापनं कृतवान् । वङ्गदेशे उक्तवर्णानामुच्चारण-भेदाभावः । विशेषतो वकारद्वयस्याकारोच्चारणयोर्भेदो नास्ति पश्चिमादिदेशे वर्त्तते । किन्तु मुग्धबोधटीकायां दुर्गादासविद्यावागीशधृता वकार-भेदिकैकप्राचीनकारिकास्ति । सा यथा, -- Here, we are cautioned not to change every va/ba-kAra blindly (एतत्कारिकया सकलवकार-प्रभेदो न भवति). BTW, contextually the वर्ग्यवकारान्तःस्थवकारौ in the above text should not be changed to वर्ग्यबकारान्तःस्थवकारौ (as this has been referred a few lines later as वकारद्वय-भेदं), though there is no "vargya-va" in the rest of India. |
Here SKD describes the prevalent practice in Bengal, where the (j,y), (N,n), (b,v) and (S,z,s) groups [वर्ग्य-जकारान्तःस्थयकारौ मूर्द्धन्य-णकार-दन्त्यनकारौ वर्ग्यवकारान्तःस्थवकारौ तालव्यशकार-मूर्द्धन्यषकारदन्त्यसकाराः] have no difference in pronunciation. Wilson, in the preface to his dictionary (1st ed., 1819), quotes thus: रलयोर्डलयोस्तद्वज्जययोर्बवयोरपि । “The letters R and L, D and L, J and Y, B and V, Ś and S, M and N, a final visarga or its omission, and a final nasal mark or its omission, are always optional, there being no difference between them.” Thus Wilson covers a wider range of regional variation in India than SKD. I thought @funderburkjim might pick up a piece or two (given his interest in "knowing" Skt.) from my posts, which could be of some help in cleaning the CSL texts.
With the above information before us, what is your opinion about changing कोष to कोश? This has been a long pending issue in my mind. |
As both are valid words, I would not change कोष or कोश |
May this issue be closed? |
This is one of those 'b/v' problems that @drdhaval2785 loves.
In vcp2, there are 1303 matches for 'pUrvv' and 1474 matches for 'pUrbb'.
In vac2, there are 2051 matches for 'pUrvv' and 236 matches for 'pUrbb'.
I suggest we change all the 'pUrbb' to 'pUrvv' in both vcp and vac.
This would remove a nice chunk of needless differences.
What do others think?
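For reference, the counting and the proposed replacement can be sketched as below. It works on a string so that nothing is written to disk; the function name and the sample text are made up, and the real vcp and vac files would be read in separately.

```python
def count_and_replace(text: str, old: str = "pUrbb", new: str = "pUrvv"):
    """Return (matches of old, matches of new, text with old replaced by new)."""
    n_old = text.count(old)
    n_new = text.count(new)
    return n_old, n_new, text.replace(old, new)


sample = "pUrbba pUrvva apUrbba"  # made-up sample, not taken from vcp or vac
n_bb, n_vv, changed = count_and_replace(sample)
print(n_bb, n_vv, changed)
```

Returning the counts alongside the changed text makes it easy to compare the figures before committing to the change.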