AB3 alternate form for mws #178
Comments
JIC if you are convinced to make new metalines for the 500+ cases with
i.e., at some of these lexical-change entries, the grouping structure also has to be provided. |
formal review
Work is here. My local name for your file is temp_rev1_ab_orig.txt; change_notes_orig_iast.txt contains my notes on these changes. The formal changes are mostly to enforce two internal consistencies:
@Andhrabharati I suggest your further changes should be based on this temp_rev1_ab_iast.txt file. I have not yet thought about the 'body' changes (such as '[]') and removal of the Note that the [NEW] lines are ignored in the conversion. Similarly ignored are any lines between OK so far? |
rev1a
Work is in the rev1a directory. About 25 changes, mostly L-ordering. L-nums must be not only unique but also ordered. |
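The uniqueness and ordering requirement on L-nums could be checked mechanically. The following is only a minimal sketch, not the script used in the rev1a work: it assumes each metaline begins with an `<L>` field holding an integer or decimal number, and the function name `check_l_numbers` is hypothetical.

```python
import re
from decimal import Decimal

# Assumption: a metaline starts with "<L>" followed by a number such as 450 or 450.1.
# A closing "<LEND>" line does not match, because "<L>" must be followed by a digit.
L_FIELD = re.compile(r"^<L>([0-9]+(?:\.[0-9]+)?)<")


def check_l_numbers(lines):
    """Return (line_number, message) pairs for duplicate or out-of-order L-numbers."""
    problems = []
    seen = set()
    previous = None
    for lineno, line in enumerate(lines, start=1):
        match = L_FIELD.match(line)
        if match is None:
            continue  # not a metaline, so nothing to check
        lnum = Decimal(match.group(1))
        if lnum in seen:
            problems.append((lineno, f"duplicate L-number {lnum}"))
        elif previous is not None and lnum < previous:
            problems.append((lineno, f"L-number {lnum} is out of order after {previous}"))
        seen.add(lnum)
        previous = lnum
    return problems
```

`Decimal` is used rather than `float` so that numbers such as 450.1 compare exactly.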
comment on [][] vs. <div n="P"/>
From AB's comment here:
Sample at first instance:
Clearly, the current [] display form is undesirable. |
@Andhrabharati The div markup seems preferable. And is similar in function to other I hope you will reconsider this point. |
random question
Are these special unicode characters a temporary experiment? If not, special-purpose characters like these need an explanation.
|
Yes, they serve a special purpose in my revision version; it was my mistake to copy those lines from my file, instead of taking them from the earlier cdsl file. For now, these are to be taken as
|
I agree with these, and it was my error to have violated these "rules" in my posted file. [Rather, I was assuming that Jim would "handle" these as appropriate, based on his post, namely
As such, I did not pay much attention to these.] These definitely need to be corrected in my file. |
Looked at the differences between the files "temp_rev1_ab_orig.txt" (my file, as renamed by Jim) and "temp_rev1_ab_iast.txt" (Jim's revised file), and except at two lines I have corrected all other lines mentioned in Jim's revision. And these two are given below, with my comments-- (22691) |
Changed all as suggested, but I have an issue at one entry, as given below--
AB Remark: I have changed the numbering here to suit the first meaning alone, as suggested; but this grouping should be extended till L-95075, encompassing all the lex-change (mfn. -> m. -> f. -> n. -> ind.) and meaning sense-change entries (to be with |
Now, coming to the two other posts 1 and 2 by Jim, I can only say that he has grossly mistaken/misunderstood/misinterpreted my point and considered the [] as a new line-break, which is not at all what I meant. What I was saying in my earlier posts at the other issues and the above one in this issue is that these [][] lines are to be made as entry-terminator I think, I need not elaborate this point further. |
It is for you, dear Jim, to re-look at my earlier (and the above) posts, and make the mw data in "uniform form". |
further defense of div
Please refer:
I think @Andhrabharati would agree with me that either display is a useful representation of the scan. The only minor flaw I see is that there is a missing semicolon. When comparing the two displays, the only difference is that the pre-div form has separate IDs (L-numbers) for the two senses. I view this difference as immaterial. I view the cdsl dictionaries as a kind of search engine. As I understand search engines (such as Elasticsearch, based on the Java Lucene project), there are two components to a search engine.
Why are there 'ABCE' in the <e> field of metaline? e.g. 3A
MW has the '3' -- that refers to his 4-lines. Conceptually it could be part of the document. But the 'A' is purely an artifice introduced by me at some early stage of the development. 'A' means that the headword and lexical category are the same as for the first headword of the document. In other words, the 'document' was split up into 'sub-documents'. In retrospect, I think this was a wrong choice.
Why did I 'merge' the bodies of L=450 and L=451 of the 'and/or' form i
Why did I put the <div n="P"/> markup before the second sense? Simply to recognize that this was a second sense, as indicated by the semicolon in the scan.
I similarly merged 'B'-sub-entries and others that were encountered during the and/or group work. The work done in issue 175 was focused only on the and/or groups.
These two could be merged as
So the same formalism can readily accommodate feminine forms. I present the above comments as further 'defense' of the <div n="P"/> markup introduced in issue175. I hope it furthers the dialogue with AB. |
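The merging described above can be illustrated with a small sketch. It is not the code used in issue 175. It assumes entries are already parsed into dictionaries with hypothetical keys `"L"`, `"e"` and `"body"`, and that a sub-entry is recognizable by an `<e>` code that ends in an uppercase letter, such as `3A`.

```python
import re

SENSE_BREAK = '<div n="P"/>'
# Assumption: sub-entry codes look like "3A" or "2B" (digits plus one uppercase letter).
SUB_ENTRY = re.compile(r"^\d+[A-Z]$")


def merge_sub_entries(entries):
    """Fold each sub-entry body into the preceding entry, marked by a sense break.

    `entries` is a list of dicts with the hypothetical keys "L", "e" and "body".
    The input list and its dicts are left unmodified.
    """
    merged = []
    for entry in entries:
        if merged and SUB_ENTRY.match(entry["e"]):
            # Same headword and lexical category as the parent: append as a new sense.
            merged[-1]["body"] += " " + SENSE_BREAK + entry["body"]
        else:
            merged.append(dict(entry))
    return merged
```

In this sketch the merged record keeps only the L-number of the first entry, which matches the remark above that the pre-div form differs only in having separate L-numbers for the two senses.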
(I think you meant lex="ind"). The 'document' encompasses both the mfn forms and the ind. forms. |
Refer Peter Scharf's website: https://sanskritlibrary.org/transcodeText.html
Re L=6522,6523: ँ anunAsika, U+0901 Devanagari Sign Candrabindu
Re L=16252: ᳲ arDavisarga, U+1CF2 Vedic Sign Ardhavisarga
I can adjust cdsl transcoding files accordingly. |
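The two code points quoted above can be cross-checked against the Unicode character database that ships with Python. This is a small sketch only; the labels `anunAsika` and `arDavisarga` are taken from the comment, while the dictionary and function names are hypothetical.

```python
import unicodedata

# Labels as used in the comment above, mapped to the characters in question.
SPECIAL_CHARS = {
    "anunAsika": "\u0901",
    "arDavisarga": "\u1CF2",
}


def describe(chars):
    """Map each label to its (code point, official Unicode name) pair."""
    return {
        label: (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for label, ch in chars.items()
    }


if __name__ == "__main__":
    for label, (codepoint, name) in describe(SPECIAL_CHARS).items():
        print(label, codepoint, name)
```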
For Jim's answer to 'how to mark?', compare
|
I have no issues with this!
With exactly the same view, I had completely got rid of these A-form metalines (that have
I have chosen to make new entries for all such (by appropriately padding the terminations given in braces in print), thus getting more HWs that could be 'directly' searched for; in the method adopted by Jim, those would not be 'searchable'! I have also considered grouping-inheritance (as appropriate; not all those lexical-siblings are with group-inheritance!) to these lexical-siblings (as I coined this term, and marked them with Ⓛ); these grouped-siblings are also 'out-of-searchability' in Jim's version.
My only point above was to make the full file uniform and consistent in style; but Jim has opted to limit the process just to these grouped entries, which are a minuscule portion of the whole. So, instead of asking him to extend the same to the whole rest of the text, I thought it would be more convenient and easier for him to revert those 500+ cases to metaline form. Anyway, I have no interest in debating this further; but I will continue with my marking (which I think is in a better form). Jim can simply delete those [] lines, look at the differences in the rest of my file data, and take action in the cdsl file (in a manner that he feels appropriate). |
And this issue can be closed now, as no further discussion or action is envisaged. |
Sounds like a good idea. When you conclude your marking of the annexure placements, do you plan to upload it? |
YES! |
Need your stand on this, Jim! Would you like to go with the marking of rev-entries in the GRA style (with original and revised strings together side-by-side), or to go in a simpler manner that you did some of the MW revi-entries [just marking as |
example_rev_sup.txt has some
|
karvarI has been added as an alternate headword, - check MW display on cologne server |
Love you both, @Andhrabharati and @funderburkjim |
This issue continues the discussion begun in #176.
More specifically, the first objective is to examine @Andhrabharati's version at this comment.