Abbreviation tooltips #11

Open
funderburkjim opened this issue Aug 10, 2023 · 45 comments
Labels
documentation Improvements or additions to documentation

Comments

@funderburkjim
Contributor

This issue is devoted to the initial creation of a file from the 'LIST OF ABBREVIATIONS' provided in the front matter.
This was discussed here.

image

funderburkjim added a commit that referenced this issue Aug 10, 2023
@funderburkjim
Contributor Author

@AnnaRybakovaT

Hi!

Instructions for first step prepared for you here,

Let me know when you see this note.

@Andhrabharati

@funderburkjim

You guys need to keep this also in mind, to add at the end of the abbr. markup.

@Andhrabharati

And then, if required, I could finally step in to identify the unlisted abbr.s present in the text, as I did in all other works.

@funderburkjim
Contributor Author

The particular task of this issue is to prepare an abbreviation tooltip file.

A next step would be to add markup <ab>X</ab> in md.txt.

Within that next step, (or possibly as a second next step), we can consider the 'asterisk'-related markup.
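
A minimal sketch of that next step, assuming the tooltip file yields a plain list of abbreviation strings. The function name `mark_abbreviations` and the sample abbreviations are illustrative only; this is not the script actually used for md.txt.

```python
import re


def mark_abbreviations(line, abbrevs):
    """Wrap each known abbreviation in <ab>...</ab>.

    Text that is already inside an <ab> element is left untouched, and an
    abbreviation is only matched when it does not continue a longer word.
    """
    if not abbrevs:
        return line
    # Try longer abbreviations first so that 'vbl.' is preferred over 'vb.'.
    alternatives = "|".join(
        re.escape(a) for a in sorted(abbrevs, key=len, reverse=True)
    )
    pattern = re.compile(r"<ab>.*?</ab>|(?<!\w)(?:" + alternatives + r")")

    def repl(match):
        text = match.group(0)
        # An existing <ab> element is returned unchanged.
        if text.startswith("<ab>"):
            return text
        return "<ab>" + text + "</ab>"

    return pattern.sub(repl, line)
```

For example, `mark_abbreviations("a N. of <ab>V.</ab>", ["N.", "V."])` tags the bare `N.` and leaves the existing `<ab>V.</ab>` alone. Real text would still need manual review, since a string such as `N.` can also end an ordinary sentence.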

@Andhrabharati

Andhrabharati commented Sep 5, 2023

@funderburkjim

Here are the listed abbr.s in the MD print--
md-abbr.txt

As you might be somewhat free from pwk abbr. work now (till Thomas comes back to you), I am posting this file for your perusal and further action.
[I would just like to state here that I got quite many (~60) other abbr.s from the MD text.]

@Andhrabharati

Andhrabharati commented Sep 5, 2023

It may be interesting to note that MD has employed both regular cap. N. and the small cap. ɴ. as abbr.s (the ɴ. being used 400+ times in the text, while the N. is present 4200+ times).
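
Counts like these could be reproduced with a short script. The sketch below assumes the text is available as one string and that the small capital is the character U+0274 followed by a period; the function name `count_n_forms` is hypothetical.

```python
import re

SMALL_CAP_N = "\u0274"  # LATIN LETTER SMALL CAPITAL N


def count_n_forms(text):
    """Return (regular, small_cap) counts of the abbreviations 'N.' and small-cap 'N.'.

    A match is only counted when the letter does not continue a longer word,
    so 'MN.' is not counted as 'N.'.
    """
    regular = len(re.findall(r"(?<!\w)N\.", text))
    small_cap = len(re.findall(r"(?<!\w)" + SMALL_CAP_N + r"\.", text))
    return regular, small_cap
```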

@funderburkjim
Contributor Author

@Andhrabharati acknowledged. You are right - I'll be holding off changes to pwk until Thomas finishes.
I guess Anna is not available now. Agree that abbreviation markup for MD is needed.

@Andhrabharati

If you're willing to use it, I can post my MD file, with many corrections (I just don't want to list them) incorporated.

It is something like the GRA file (from me) that you've used recently.

funderburkjim added a commit that referenced this issue Sep 18, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Sep 18, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Sep 18, 2023
@funderburkjim
Contributor Author

abbrev1

<ab> markup applied, based on the list of abbreviations shown in first comment of this issue.
Working directory: https://github.com/sanskrit-lexicon/MD/tree/master/mdissues/issue11

  • temp_md_1.zip the version of md with markup. Displays based on this version are installed at Cologne.
  • abbrev0 directory - preliminary work with the digitization provided by @Andhrabharati .
  • abbrev1 directory: application of markup
    • ab_count_1.txt shows counts of the global abbreviations, along with tooltip text.
  • ab_count_local_1.txt shows the (few) local abbreviations.
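
A count file of this kind can be derived from the marked-up text. The following is only a sketch, assuming plain `<ab>...</ab>` elements without attributes; the layout of the real ab_count_1.txt (which also carries tooltip text) is not reproduced here, and `count_ab` is a hypothetical name.

```python
import re
from collections import Counter


def count_ab(lines):
    """Count how often each abbreviation occurs inside <ab>...</ab> markup."""
    counts = Counter()
    for line in lines:
        # Non-greedy match so that several <ab> elements on one line are
        # counted separately.
        counts.update(re.findall(r"<ab>(.*?)</ab>", line))
    return counts
```

Calling `count_ab(open("md.txt", encoding="utf-8"))` would give a `Counter` whose `most_common()` method lists the abbreviations by frequency; the file name here is an assumption.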

@Andhrabharati - What do you think should be done next? Are there differences between temp_md_1 and your version that you think the cdsl version should implement?

@Andhrabharati

Andhrabharati commented Sep 19, 2023

Are there differences between temp_md_1 and your version that you think the cdsl version should implement?

Yes @funderburkjim, there are hundreds of types of changes (corrections), ranging from Sanskrit spellings (and/or accents) [sometimes even in headwords], English spellings, Greek spellings, wrong tags (italic, bold and sanskrit), hyphens, brace matching, … … …

And most important of them all is "decoding" the numeral marking ¤X¤ into various types, that I had mentioned earlier in a response to your query.

Next comes the 'relocation' of the homonym numbers that you had inserted recently, to their 'proper' position as per MD print and intention!!

You may look at various addl. tags that I had used--
<ab></ab>
<bot></bot>
<cl></cl>
<fr></fr>
<gk></gk>
<hom></hom>
<lang></lang>
<lat></lat>
<lex></lex>
<ls></ls>
<pe></pe>
<zoo></zoo>

Even if you would like to limit to abbr. markings, there are quite many yet to do, see for e.g. my extracted lists--
MD ab_local.txt
MD ab_global.txt
[And of course, there are many count differences as well between your version and AB version.]

@funderburkjim
Contributor Author

@Andhrabharati Would you upload your version?

@Andhrabharati

Andhrabharati commented Sep 19, 2023

Here is the file for your study/reference, @funderburkjim--
md_AB_v1.zip

  • I had removed the italic markers around the ab-tags etc. for better readability, with a thought that they could be rendered in italics while displaying them.
  • You may note that the '🞄' could be replaced by a line-break, to get closer (but not equal) to the cdsl version in terms of line count.
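
The replacement mentioned in the second point can be sketched as follows, assuming each entry is held as a single string. The helper name `split_entry` is hypothetical, and trimming the whitespace around each piece is an added choice, not something stated above.

```python
SEPARATOR = "🞄"  # the marker used in md_AB_v1 in place of a line-break


def split_entry(entry):
    """Split one entry at each separator and trim surrounding whitespace."""
    return [piece.strip() for piece in entry.split(SEPARATOR)]
```

Joining the result with `"\n".join(...)` gives the multi-line form.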

---------------------------------------
And here is the file, with a kind of semantic line-breaks for the sub-HWs inside the entries [they are not always the composite words formed from the main HW, but most of the times 'siblings' containing the first portion of the main entry!!]--
md_AB_v2.zip
[This is just done as a trial, not as a full (complete) work.]

@gasyoun gasyoun added the documentation Improvements or additions to documentation label Sep 22, 2023
@gasyoun
Member

gasyoun commented Sep 22, 2023

Sanskrit spellings (and/or accents) [sometimes even in headwords]

headwords is what I value the most @Andhrabharati

@AnnaRybakovaT
Contributor

Instructions for first step prepared for you here,

Let me know when you see this note

Dear Jim and dear all,
Glad to see you after a break of some months.

During the summer you mentioned me in some topics (BHS Issue 4 and PWK Issue 95). Please let me know where I should start now.

Regarding this current issue, I can't clone the directory. I am a bit confused about which one it is:
github.com/sanskrit-lexicon/MD/mdissues/issue11 or
github.com/sanskrit-lexicon/MD/tree/master/mdissues/issue11

In any case, I got messages like:

fatal: repository 'https://github.com/sanskrit-lexicon/MD/tree/master/mdissues/issue11/' not found
---
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

@drdhaval2785

I think you will have to do

git clone https://github.com/sanskrit-lexicon/MD.git
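
The `tree/master/...` address is a web page inside the repository, not a repository itself, which is why `git clone` rejects it. As a sketch of how the two relate, the hypothetical helper below derives the clone URL and the sub-directory from such a page address. It assumes a `github.com` URL of the form `owner/repo/tree/branch/path` and a branch name without slashes.

```python
from urllib.parse import urlparse


def clone_url(web_url):
    """Return (clone URL, sub-directory) for a GitHub repository page URL."""
    parts = [p for p in urlparse(web_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected a URL containing owner and repository")
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    # parts[2] is 'tree' and parts[3] the branch when a directory is shown.
    subdir = ""
    if len(parts) > 4 and parts[2] == "tree":
        subdir = "/".join(parts[4:])
    return "https://github.com/" + owner + "/" + repo + ".git", subdir
```

After cloning with the first value, the working directory for this issue is found under the second value inside the local copy.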

@Andhrabharati

@AnnaRybakovaT,

Jim has already done what he wanted you to do (as a first step) reg. the MD abbr.s; so you need not bother about the same now.

However, @funderburkjim is yet to take up further changes based on my posted file.

And the pwk-95 has nothing more to do; it has gone into many major changes in the later days.

As such, I think, you may now see if BHS-4 issue interests you, as Jim has suggested.
But probably Jim might wish to put you on some other task; so, let's wait for his response.

@AnnaRybakovaT
Contributor

As such, I think, you may now see if BHS-4 issue interests you, as Jim has suggested.
But probably Jim might wish to put you on some other task; so, let's wait for his response.

Great! Of course I would like to work with BHS. So I am waiting for the decision.

@funderburkjim
Contributor Author

@Andhrabharati -- thank you for taking an interest here. I would prefer for you to work with @AnnaRybakovaT on either this MD topic or the BHS topic (or both)
(sanskrit-lexicon/BHS#4) according to what your mutual interest suggests.

My part would then be limited to helping integrate your work into the active displays for MD, BHS.

If these tasks are completed while Anna is available, then we might consider involving Anna in an AP90 task -- developing an MW-style display so that the AP90 nominal compounds would be accessible directly.

@Andhrabharati

Andhrabharati commented Dec 18, 2023

@funderburkjim

I think, there is nothing more that Anna or I could do in this MD issue; it is ONLY you that can take-up further work based on my file(s) posted above.

@AnnaRybakovaT

Do you think you can proceed based on what Jim had suggested at BHS-4 (sanskrit-lexicon/BHS#4), or need anything more?

@Andhrabharati

Andhrabharati commented Dec 18, 2023

On a 2nd thought [re-looking at Jim's posting], apart from AP90, I think my md_AB_v2.zip file can be worked upon, to make this MD another work having sub-HWs 'accessible' to online search-queries, after MW.

And, my BEN file (already posted long ago) also can be a similar candidate.

@AnnaRybakovaT
do you think you could take-up this piece of work, to make the 'full' HWs from the 'partial' sub-HWs, by appropriately filling up the (presumed) beginnings?
[Just browsing through my above file (look for <div/> tags) might give you some ideas!!]

@AnnaRybakovaT
Contributor

AnnaRybakovaT commented Dec 18, 2023

Do you think you can proceed based on what Jim had suggested at BHS-4 (sanskrit-lexicon/BHS#4), or need anything more?

As I see - I have to work only with the file tagcount_ls.txt
In general I understood the task. Shall I make a copy of this file to make my changes in this copy?

@AnnaRybakovaT
Contributor

AnnaRybakovaT commented Dec 18, 2023

'full' HWs from the 'partial' sub-HWs, by appropriately filling up the (presumed) beginnings?

I am so sorry, but I need more explanation regarding this task. First of all, could you kindly show me examples in the file of what is meant by:

  • 'full' HWs
  • 'partial' sub-HWs

@Andhrabharati

As I see - I have to work only with the file tagcount_ls.txt
In general I understood the task. Shall I make a copy of this file to make my changes in this copy?

That's correct; pl. go ahead.

@Andhrabharati

'full' HWs from the 'partial' sub-HWs, by appropriately filling up the (presumed) beginnings?

I am so sorry, but I need more explanation regarding this task. First of all, could you kindly show me examples in the file of what is meant by:

* 'full' HWs

* 'partial' sub-HWs

@AnnaRybakovaT
We shall come back to this MD task, after the above BHS work is done.

@Andhrabharati

@funderburkjim

Is it OK if we take-up the MD sub-HWs work before the AP90 (that you suggested)?

funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Dec 23, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Dec 23, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Dec 23, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Dec 23, 2023
funderburkjim added a commit that referenced this issue Dec 23, 2023
@funderburkjim
Contributor Author

AB.v1 (with a few minor changes) now installed as cdsl version.
temp_md_ab_1pe.zip has the changes, which I think you should incorporate in further versions.

Work done in abv1 directory.

Many new items added to the tooltips. See this commit (0469541).
There are still 3 <disp>??</disp> with completely unresolved tooltips.
And several others with a ? in the tooltip where I was uncertain.

@Andhrabharati

@funderburkjim

<cl> tag -- class of verb. <cl>X</cl> X is a roman-numeral

<cl>V.</cl> -> <cl>ᴠ.</cl> class 5 (33) - to avoid conflict with <ab>V.</ab> Vedic (2470 instances)
<cl>V.</cl> is class 5 root
Change to use Unicode U+1d20 Latin Letter Small Capital V
<cl>ᴠ.</cl>

I would suggest to go for unicode Roman numerals (U+216x), throughout for the dhAtu class-numbers; as only 1-10 such cases [Ⅰ, Ⅱ, Ⅲ, Ⅳ, Ⅴ, Ⅵ, Ⅶ, Ⅷ, Ⅸ, Ⅹ] are required, we'd have no issues.
[Using a small capital letter ᴠ (U+1d20) [as above] is not a proper choice, as all other class numbers are in normal-size capital letters.]

Here is the updated file (hoping that you'd have no issue in agreeing to my proposal)--
md_AB_v1.zip
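
A minimal sketch of the proposed conversion, assuming the class numbers appear as ASCII Roman numerals inside `<cl>...</cl>`, optionally followed by a period that is kept. The function name `unicode_classes` is hypothetical, and this is not the script used to produce the file above.

```python
import re

ASCII_ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]
# U+2160 (ROMAN NUMERAL ONE) through U+2169 (ROMAN NUMERAL TEN).
TO_UNICODE = {r: chr(0x2160 + i) for i, r in enumerate(ASCII_ROMAN)}
# Also map the interim small capital V (U+1D20) to ROMAN NUMERAL FIVE.
TO_UNICODE["\u1d20"] = "\u2164"


def unicode_classes(line):
    """Replace the numeral inside each <cl> element by its U+216x form."""

    def repl(match):
        numeral, period = match.group(1), match.group(2)
        # Unknown content is passed through unchanged.
        return "<cl>" + TO_UNICODE.get(numeral, numeral) + period + "</cl>"

    return re.sub(r"<cl>([^<.]+)(\.?)</cl>", repl, line)
```

Because only `<cl>` elements are touched, `<ab>V.</ab>` (Vedic) keeps its ordinary capital V and the two can no longer be confused.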

@Andhrabharati

Andhrabharati commented Dec 24, 2023

There are still 3 <disp>??</disp> with completely unresolved tooltips.

Are these the ones having the superscript numbers ¹ and ² ?

They stand for 'rare' (or singular) occurrences in the whole 'text' being referred.

See what MD says in his Preface (p. ⅸ),--

image

Also this snippet reminds you of the pending work that I had mentioned above, to which you had also responded that it would be taken up next.

As I had already pointed this out (as above) to you, I deliberately did not mark it thus in my later working.

funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 26, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Dec 26, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 26, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Dec 26, 2023
@funderburkjim
Contributor Author

roman numeral revision.

@Andhrabharati accepted your revision re roman numerals. It is now installed. One correction:

At <L>19659<pc>358-2<k1>sfj
<ab>A.</ab> -> <lex>Ā.</lex>

Related revisions to mdab_input.txt in csl-pywork, md-meta2 in csl-orig.
For details, see the commits above or the mdissues/issue11/abv1 readme.txt.

Please note the ¹ and ² tooltips in mdab_input.txt. I translated these as 'one instance' or 'two instances' in the tooltips.

The three unknown abbrevs show as <disp>??</disp> in mdab_input.txt.

@funderburkjim
Contributor Author

Also this snippet reminds you of the pending work that I had mentioned above (#11 (comment)), to which you had also responded (#11 (comment)) that it would be taken up next.
As I had already pointed this out (as above) to you, I deliberately did not mark it thus in my later working.

@Andhrabharati I prefer to omit further work on this (e.g. marking '*' as abbreviation, with associated tooltip). Feel free to add such markup in a future version (and DOCUMENT what you do).

@Andhrabharati

Andhrabharati commented Dec 26, 2023

The three unknown abbrevs show as <disp>??</disp> in mdab_input.txt.

Filled these, and also corrected a few other abbr.s--
mdab_input_AB.txt

Here is my updated file--
md_AB_v1.zip

And, just noticed that I did not change the 𝑃. (Purāṇa) occurrences inside the text file [it is rendered as a normal letter in the print, being within the italic string(s)!], though the abbr. list is having it thus. This shall be done while on v.2 (sub-HWs) work.

@funderburkjim
Contributor Author

@Andhrabharati -- In your file [md_AB_v1.zip] https://github.com/sanskrit-lexicon/MD/files/13768757/md_AB_v1.zip from prior comment.

I see only 1 difference (under #upa at line 18596). Is this what you intended?

@Andhrabharati

Andhrabharati commented Dec 26, 2023

Yes, while looking for nl. I've noticed the vb. px. here and thought it should've been vbl. px. (the upasarga)!

funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Dec 26, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Dec 26, 2023
funderburkjim added a commit to sanskrit-lexicon/csl-corrections that referenced this issue Dec 26, 2023
@funderburkjim
Contributor Author

revisions installed.

Yes, I agree with the 'vbl.' change under 'upa', added a print change note in csl-corrections
Thanks for changes to mdab_input. I made a couple of changes to your changes,
as noted in abv1/readme.txt at
12-26-2023 AB rev to mdab_input.txt.
Or you can see via the commits above. I'm fairly sure that, in mdab_input.txt, 'sts.' is 'sometimes' (even though 'st.' is 'stem'!). I also used both 'absolute' (according to md print abbreviations) and 'absolutive' for 'abs.'

@Andhrabharati

I'm fairly sure that, in mdab_input.txt, 'sts.' is 'sometimes' (even though 'st.' is 'stem'!).

Yes-- you're correct, @funderburkjim ; I did not pay proper attention to the file content.

----------------------------------

  • In particular, is there any 'alternate headword' markup that remains to be done in v1 ?

And, there are no alt. HWs, as I recall in MD.

I was wrong here as well!!
Seen now, that the text has quite many alt. HW candidates; but these all need to be 'marked' yet [like in the GRA and pwk].
Do you think this part could be done now, or along with the sub-HWs task sometime later?

[I also noticed that I had missed quite many nuances in the text earlier. Too bad of me, that I did not put my mind properly in the MD work.]

@gasyoun
Member

gasyoun commented Dec 29, 2023

quite many alt. HW candidates

It would be interesting to know if there are any unique ones, as compared to other dictionaries, @Andhrabharati

@Andhrabharati

I recall MD being helpful in resolving an issue or two while on MW two years back; no other work had those words!!

And yes @gasyoun, I think we might get few interesting entries from MD, if fully worked upon.

And I have noticed many entries in VCP, which are not anywhere else!!

@gasyoun
Member

gasyoun commented Jan 3, 2024

noticed many entries in VCP, which are not anywhere else

Such remarks of yours are of utmost interest. I would love to call you tomorrow and talk about your vision for the future of Sanskrit dictionaries: what is done, what is important, and what will remain to be done for generations ahead. @drdhaval2785 @funderburkjim @AnnaRybakovaT how about a call on the 4th of January at 20:00 Moscow time?

@drdhaval2785

I will be able to spend time from 20:00 to 22:00 Indian Standard Time. 20:00 Moscow time will be too late at night for me.

@gasyoun
Member

gasyoun commented Jan 4, 2024

I will be able to spend time from 20:00 to 22:00 Indian Standard Time. 20:00 Moscow time will be too late in night for me.

I'm ready for a call at 20:00 Indian Standard Time as well. I wrote in our sanskrit-lexicon Skype group.


5 participants