MD subheadwords #12
Comments
md_1b_subhw.txt is a proposed intermediate form for sub-headwords. @Andhrabharati please take a look! |
readme_md_1b_subhw.txt provides some explanation of the conventions in md_1b_subhw.txt. Note - I am aware that this is incomplete in various ways. The current objective, as I see it, is to 'correct' this file manually. |
@Andhrabharati MD is to have its subheadwords as separate entries, somewhat as in MW, with up to 8000 headwords. Jim has verified 10% manually, and the rest would take him 4 more weeks. So I hope that within a few days it becomes clear what is missing to take the job over from him, thanks. |
I have spent some time looking into your file today. I must openly express my feeling that you have spent much of your time (over two weeks, as mentioned by @gasyoun) for a wrong purpose; the reason being, you have erred at many places and also at times missed 'capturing' the intent of the author. Speaking of the author's intent, I have noticed that MD has clearly indicated the purpose of using italics, which surely applies to Böhtlingk as well, MD having closely followed BR's lexicons (in theme and style). [You may recall that we were pondering the significance of italics in pwk some time back, a question which remains unanswered so far.] |
Without going into much detail, I thought I should at least give an example entry to compare Jim's work (derived by AB) with what it should've been-- And if a basic text in the above (AB) manner is made, then the rest of the work could easily be done by Jim (programmatically). |
Is 'svara' simple-vowel-sandhi, e.g. 'a+a' -> 'A'? |
No, I am talking about the places where accents (as at pra-śaṃsā́ + ālāpa) are involved. |
scharfsandhi doesn't handle accents. Also not ‿. There may also be instances where the parts to combine are not handled as expected by scharfsandhi. My conclusion is that for the task of joining the parts (to get k1 from k2), I should write a separate module (perhaps making use of |
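A minimal sketch of the kind of pre-processing such a separate module might do before joining parts. It does not call scharfsandhi at all. The function names (`strip_accents`, `k1_part`, `join_parts`) are hypothetical, and the only sandhi rule shown is a/ā + a/ā -> ā, just enough for the pra-śaṃsā́ + ālāpa example; everything else is plain concatenation.

```python
import unicodedata

ACCENTS = {"\u0301", "\u0300"}        # combining acute / grave
BELOW_MARKS = {"\u0323", "\u0325"}    # dot below / ring below (vocalic r, l)
VOWEL_BASES = set("aeiou")
UNDERTIE = "\u203f"                   # the ‿ sign

def strip_accents(text):
    """Drop accent marks on vowels only; the acute of ś must be kept."""
    out, base, has_below = [], "", False
    for ch in unicodedata.normalize("NFD", text):
        if not unicodedata.combining(ch):
            base, has_below = ch.lower(), False
            out.append(ch)
            continue
        if ch in BELOW_MARKS:
            has_below = True
        on_vowel = base in VOWEL_BASES or (base in {"r", "l"} and has_below)
        if ch in ACCENTS and on_vowel:
            continue                  # accent on a vowel: drop it
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))

def k1_part(part):
    """Assumed k1 convention: no accents, no hyphens, no undertie."""
    return strip_accents(part).replace("-", "").replace(UNDERTIE, "")

def join_parts(pfx, sfx):
    """Join two parts; only a/ā + a/ā -> ā is handled (illustration only)."""
    left, right = k1_part(pfx), k1_part(sfx)
    if left and right and left[-1] in "aā" and right[0] in "aā":
        return left[:-1] + "ā" + right[1:]
    return left + right

print(join_parts("pra-śaṃsā́", "ālāpa"))
```

Stripping accents after NFD decomposition needs the vowel check, because ś itself decomposes to s plus a combining acute.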
comparison of subhw form of AB and Jim
L-12291-jim-corr.txt: my correction of my version for L=12291.
compare_H.txt: comparison of AB version of L=12291 to Jim's correction.
From this comparison, there are the same number of subheadwords. There is only 1 difference in H
AB's form has no place for the identification of what is or is not a headword represented in MW, PW, etc. Similarly, the 'pfx + sfx' part of Jim's form is absent in AB's form.
The 'sfx' is given from the {@-sfx@}, and AB version uses one line (with a tab) for each subhw, while Jim's version uses two lines for each subhw.
Conclusion: AB's form and Jim's form are functionally equivalent. |
@Andhrabharati Will you undertake the task of completing the subhw markup according to your form? If so, you may find that you can start with my md_1b_subhw.txt, but discard the
If you decide to start with your md_AB_V2.txt, you may need to take into account change_notes_0b.txt. Let me know how you plan to proceed. |
This was the intention, when I had asked about the MD task earlier!! Now, I am just contemplating whether to delegate the task to @AnnaRybakovaT (not at all doubting her capacity to understand and do things; she has indeed been doing good jobs) or do it myself (looking at the complexity and the time-factor involved; I presume, I am unbeatable in quicker working).
Yes, the H4 entries would also be marked, wherever seen.
I see no practical value in marking the presence or absence of entries in a work wrt some other work; hence I have no interest in this part.
I haven't given my full file idea, which would be slightly beyond what Jim has proposed. Pl. have a look at my revised L-12291 file, L-12291 AB (revised).txt
Would you finally be making the 'new' entries with the iast text before the broken-bar or after? And is there any plan to 'pad' the devanagari strings as well to these entries? I have put the iast string before the bar (for now), as "iast header¦ body", assuming that the devanagari text would not be there. If devanagari is also going to be 'padded', then the notation "deva header¦ body" would be appropriate (as in the rest of the text file).
Would be doing many more corrections as well(!!), see for example, wrt your
[as in md_1b_subhw]
[as in print, with missed matter in typing]
Isn't MW having the
I did not fully understand the X (non-substantive, non-verbal) type; probably it could (or might have to) be further divided into some 'meaningful' types.
I see that the dhAtus are clearly shown with all-CAPs (iast) in the print; so there is no need for explicit markup further. Probably, we can think of adding the √ to those strings (as done in my recent works, and your acceptance thereof). And yes, doing the work in two separate sessions is a good idea.
I would like to
|
Just by a cursory browsing, noticed that MD also requires too many markup corrections, as in AP90 that I had mentioned long back. |
Gone through the change_notes_0b.txt, and seen that 3 corrections made there are not required to be done,
|
This reminds me of my saying earlier that our mind-wavelengths match; we have similar thoughts on what to do, but different thoughts on how to do it!! [I'd choose the simpler (necessary and sufficient) way, and you'd choose the rigorous way.] |
one interesting observation in the MD text: the ळ (slp1 L; cdsl iast ł) is rendered as iast ḷ [like ऌ (slp1 x; cdsl iast ḷ)] at many places (a big confusion)! |
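The two letters in question can be kept apart with a small lookup. This is only a sketch: the table holds exactly the correspondences stated in the comment above, and the names `LETTERS` and `slp1_to_cdsl_iast` are hypothetical.

```python
# Devanagari -> (SLP1, CDSL IAST), as given in the comment above.
LETTERS = {
    "ळ": ("L", "ł"),   # retroflex l
    "ऌ": ("x", "ḷ"),   # vocalic l
}

# Derived SLP1 -> CDSL IAST table for these two letters only.
SLP1_TO_IAST = {slp1: iast for slp1, iast in LETTERS.values()}

def slp1_to_cdsl_iast(ch):
    """Map SLP1 'L' and 'x' to distinct IAST letters; pass others through."""
    return SLP1_TO_IAST.get(ch, ch)

for deva, (slp1, iast) in LETTERS.items():
    print(deva, slp1, iast)
```

Working from SLP1 keeps the two letters distinct, which an IAST text that already prints ḷ for both cannot do without checking each place by hand.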
In that case let's hand it over to @Andhrabharati, our biggest Indian contributor ("I am unbeatable in quicker working" - and unmatched). @AnnaRybakovaT will remain busy and will not reach MD in 2024.
This @Andhrabharati remains crucial for @funderburkjim
no need for explicit markup further - disagree. CAPS are easy for the eye to catch, but invisible to computer code. So additional markup is badly needed. We need to unite humans and robots )
@Andhrabharati if you would only know how important it remains for me.
A pity, but yes. SLP1 would be unbeatable for restoring the real picture. |
In fact, MD himself has used the √ symbol to denote the roots, when inside the body matter. While at the entry level the roots are shown in all-CAPs, in the body matter (after the √ symbol) they are shown in all-small letters [except at 2 (out of 1794) places-- √ CHṚD under L-7736 and √ CHĀ under L-7757; probably these two should be considered as print-changes to small letters in the name of consistency! Agree, @funderburkjim ?]. |
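A small sketch of how the all-CAPs convention could be picked up programmatically, assuming the IAST text is available as Unicode. The helper names are hypothetical, and writing √ directly before the headword (no space) is just one possible convention.

```python
import re
import unicodedata

ROOT_MARK = "\u221a"  # the √ sign

def mark_root_headword(headword):
    """Prefix √ to an all-CAPs headword (assumed to be a root)."""
    hw = unicodedata.normalize("NFC", headword)
    # str.isupper() is Unicode-aware, so capitals such as Ṛ and Ā count.
    if hw.isupper() and not hw.startswith(ROOT_MARK):
        return ROOT_MARK + hw
    return hw

def caps_roots_in_body(body):
    """List roots that follow √ in the body but are printed in capitals."""
    text = unicodedata.normalize("NFC", body)
    pattern = re.compile(ROOT_MARK + r"\s*([^\W\d_]+)")
    return [m.group(1) for m in pattern.finditer(text) if m.group(1).isupper()]

print(mark_root_headword("CHĀ"))
print(caps_roots_in_body("cp. √ CHṚD and √ kṛ"))
```

The second function is the kind of check that would list exceptions such as √ CHṚD and √ CHĀ for review.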
Also out of 329 places of
|
@gasyoun, is there someone outside India who has done (or is doing) such a voluminous (comprehensive) work like me? |
well, immediately the name of Donald Trump comes to mind
…On Mon, Jan 8, 2024, 14:26 Andhrabharati ***@***.***> wrote:
@Andhrabharati <https://github.com/Andhrabharati>, our biggest Indian
contributor
@gasyoun <https://github.com/gasyoun>, is there someone outside India,
that has done (or doing) *such a voluminous (comprehensive) work* like me?
|
comment on L-12291.AB.revised.txt
A couple of suggestions, with examples.
discussed in the suggestions.
Good!
Yes. and I anticipate same usage in revised md.txt. The
Good idea.
I'd like to see your proposed form for kiMnara example
Regarding alternate hws:
I thought the final word was
I don't think the avagraha should be part of k1. Not sure about k2.
Good catch -- Should use ł as iast for slp1 L , as in MW. |
Please note that previous comment has been expanded from its original form. |
italics : MD vs. PW
Note the quote in previous comment. MD uses italics for 'comments', non-italic for translations. PW(K) is the opposite: italics for translation, non-italic for comment. PWG same principle as PWK: @maltenth --- Have I got that straight? |
I gave full credit to both you and Jim, whose contributions are the cornerstones of CDSL; I was just referring to further works like corrections, refining etc. on these texts. It appears that you had taken me wrongly (in your sarcastic post). Anyway, would you pl. shed some light on the abbr. "M. or N." in MD text (p. 35), like you had helped identify the "N. N." earlier in Böhtlingk's lexicons? I think this denotes two person names (starting with M and N), but I am unable to go further. |
my reaction to your remark does not refer to the factual substance of what you or @gasyoun said but to the implication that you were not praised highly enough. |
@Andhrabharati Is work on md (subhw) progressing? Anything needed from me? |
It has gone to a much more advanced stage than discussed above; but it is stalled now @funderburkjim. I do not need anything from you. |
@Andhrabharati no, you have become the third whale, the third pillar. Outside and inside India. Hope @funderburkjim agrees. No need to stop the work @Andhrabharati as no one comes even close to the level of depth of corrections or speed. |
Only kudos for @Andhrabharati contributions to cdsl ! I hope these contributions continue. There are numerous instances in these issues where AB mentions my failure to carry forward his suggestions. These are almost always due to my inability to keep up with him -- he can make improvement suggestions faster than I can process them! My aim is to eventually take into account ALL AB's suggestions. Let him be patient with my limitations. |
I would like to offer my unreserved apology to @Andhrabharati for my
remarks, and request him to speedily resume his unmatched contributions.
…On Sun, Jan 14, 2024, 02:23 Mārcis Gasūns ***@***.***> wrote:
@Andhrabharati <https://github.com/Andhrabharati> no, you have become the
third whale, the third pillar. Outside and inside India. Hope
@funderburkjim <https://github.com/funderburkjim> agrees. No need to stop
the work @Andhrabharati <https://github.com/Andhrabharati> as no one
comes even close to he level of depth of corrections or speed.
|
My (in fact, even Jim's and others') contributions to CDSL are voluntary, and no one has asked us to do them. It is purely out of personal interest that we are all doing these works. And my intention in asking @gasyoun (in my original post that has led to this 'storm in a teacup' of misunderstanding) was not to get appraisals from anyone, but just to get the list of the 'team' contributing to the 'corrections process' made [whose names are lying beneath the covers (or in Jim's mind) till now]. It is not at all a competitive work, but a collaborative effort, as is needed in such projects as CDSL. @maltenth , I did not anticipate an apology from you; I am really sorry if my words have prompted you to this. I might resume the work (in my way) after a small gap; but shall be attending to the smaller stuff like what Dhaval is assigning me these days (after he took over the baton of doing the csl-corrections from Jim.) |
@Andhrabharati as I'm preparing a paper on the Future of Cologne, such a |
Just want to mention I'm working on the md subheadword project. I am only editing the lines under the lines below the '* +' lines, but not the lines beginning with '1'. |
May I ask you not to spend your time on this MD_subhw issue, but to focus on other issues? Once I am back to work (probably within a week or so), I shall post my MD file; and most likely you would not hesitate to "take" the same. Thus your time and effort on this issue [the result of which may not be 'used finally'] might go to waste. |
@funderburkjim
@gasyoun / @drdhaval2785 |
I'm glad we agree in 99% of cases. If you use |
It is good that those, whose knowledge of Sanskrit is much greater than mine, are closely examining the cdsl versions of the dictionaries. |
Achsel = armpit (google translate). How do we know that aṃsa-kūṭa--pṛṣṭha is not a word unique to MD? How do we know that it is not proper? |
aṃsa is the shoulder (that is 'visible' on top side); no doubt about this, right? aṃsa-kūṭa is the kūṭa projection (hump) between the shoulders of the oxen. pṛṣṭha is rear or back; thus, the rear of the 'top-side' (aṃsa) shoulder is the 'underneath' (aṃsa-pṛṣṭha) armpit. |
Objective: make an 'mw-style' version of md.