Different markup for 'titular abbreviations' #172

Open
funderburkjim opened this issue Jul 4, 2024 · 17 comments

Comments

@funderburkjim
Contributor

funderburkjim commented Jul 4, 2024

Refer.

I am opening this issue because the referring issue has been closed, and this idea still needs attention at some point.

The discussion started with @aumsanskrit's comment:

The form "Up." and "Upapur." should remain in "the same coding" as the entire definition itself.

Currently these are marked in mw.txt as <ls>Up.</ls> and are displayed in the special html styling used for
the 'ls' tags.

@Andhrabharati suggested an alternate code such as <s1 n="Upaniṣad">Up.</s1>; then the display could

  • provide a tooltip for 'Up.' (from the n-attribute).
    • this tooltip could be initially derived from the 'mwauth' file, but could also be modified from mwauth.
  • The display would be in the same text style as normal text, as <ab n="tip">ABBREV</ab> coding is displayed.
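A minimal sketch of the suggested recoding, for illustration only. It assumes a small lookup table standing in for tooltip text derived from 'mwauth' (the `TOOLTIPS` dict and the function name `ls_to_s1` are hypothetical, not part of any CDSL code), and it only touches ls elements whose content is a single abbreviation.

```python
import re

# Hypothetical stand-in for tooltip text that would be derived from 'mwauth'.
TOOLTIPS = {"Up.": "Upaniṣad"}

# An <ls> element whose content is one abbreviation with no whitespace.
EMPTY_LS = re.compile(r"<ls>([^<\s]+)</ls>")


def ls_to_s1(text, tooltips=TOOLTIPS):
    """Rewrite <ls>ABBR</ls> as <s1 n="tooltip">ABBR</s1> when a tooltip is known."""
    def repl(match):
        abbr = match.group(1)
        tip = tooltips.get(abbr)
        if tip is None:
            return match.group(0)  # unknown abbreviation: leave untouched
        return f'<s1 n="{tip}">{abbr}</s1>'
    return EMPTY_LS.sub(repl, text)


print(ls_to_s1("found in the <ls>Up.</ls>; see <ls>RV. v, 86, 5</ls>"))
```

References that carry parameters, such as <ls>RV. v, 86, 5</ls>, do not match the pattern and pass through unchanged.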

This is my understanding of the desideratum.


To what words should a new coding apply? Proposal: all EMPTY ls tags. In other words, to ls tags which are 'titular' -- i.e. there is no possible link target. For example, <ls>RV.</ls> would get the new coding,
but <ls>RV. v, 86, 5</ls> would NOT get the new coding.
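The proposed rule can be sketched as a simple test on the content of each ls element. This is only an illustration of the "no parameters" definition; the helper names `is_empty_ls` and `split_ls` are made up here, and "empty" is approximated as "contains no space".

```python
import re

LS = re.compile(r"<ls>(.*?)</ls>")


def is_empty_ls(content):
    """True when the ls content is a bare abbreviation with no parameters."""
    return " " not in content.strip()


def split_ls(text):
    """Return (empty, parameterized) lists of the ls contents found in text."""
    empty, parameterized = [], []
    for content in LS.findall(text):
        (empty if is_empty_ls(content) else parameterized).append(content)
    return empty, parameterized


print(split_ls("<ls>RV.</ls> would change, <ls>RV. v, 86, 5</ls> would not"))
```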

@Andhrabharati @aumsanskrit -- Your thoughts?

@Andhrabharati
Contributor

Andhrabharati commented Jul 5, 2024

@funderburkjim

This is my understanding of the desideratum.
… … …
@Andhrabharati @aumsanskrit -- Your thoughts?

You've perfectly understood and paraphrased what I had suggested, and what Scott meant, for the desideratum.

To what words should a new coding apply? Proposal: all EMPTY ls tags.

  1. This rule is too generic and simplistic (a global replacement); such overreach is termed 'ativyāpti' in Sanskrit!
  2. There are some (very few) subtleties involved, which had prompted me to say thus--

[Only hitch is to "identify" the places that need this marking (appropriately leaving the citation places); a slightly involved work, but not too difficult!!]

and I guess you'd also realize the same, when actually "making" the changes in the file data for this.

@Andhrabharati
Contributor

A clue lies in the term 'the citation places' [that are not 'linkable', i. e. without a 'link target']!!

@aumsanskrit

I was going to agree with what Jim wrote, " In other words, to ls tags which are 'titular' -- i.e. there is no possible link target."

But Andhrabharati seems to be aware of some finer points of consideration to prevent "ativyāpti".

I am on the sidelines at this point, letting those who know better decide.

funderburkjim added a commit that referenced this issue Jul 5, 2024
@funderburkjim
Contributor Author

AB's "clue" is too subtle for me. So I went ahead and

These use my above proposed definition of "titular" as an abbreviation with no parameters.

Note there is no change to mw.txt (a good thing IMHO), but just changes to the displays. It could be readily adapted to displays of all dictionaries.
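As a rough illustration of a display-only change, the sketch below leaves the source line untouched and varies only the HTML produced from it. It is not the actual CDSL display code: the function name `render_ls`, the `class="ls"` span, and the tooltip table are all assumptions made for the example, and text outside the ls elements is passed through as-is.

```python
import re
from html import escape

LS = re.compile(r"<ls>(.*?)</ls>")


def render_ls(text, tooltips=None):
    """Render ls elements to HTML; bare abbreviations lose the special ls styling."""
    tooltips = tooltips or {}

    def repl(match):
        content = match.group(1)
        if " " in content.strip():
            # Reference with parameters: keep the special 'ls' styling.
            return f'<span class="ls">{escape(content)}</span>'
        tip = tooltips.get(content)
        if tip:
            # Bare abbreviation with a known expansion: plain text plus tooltip.
            return f'<span title="{escape(tip, quote=True)}">{escape(content)}</span>'
        return escape(content)  # bare abbreviation, no tooltip known

    return LS.sub(repl, text)


source = "the <ls>RV.</ls>, see <ls>RV. v, 86, 5</ls>"
print(render_ls(source, {"RV.": "Ṛg-veda"}))  # tooltip text is illustrative only
```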

Let the other two participants compare and contrast.

@aumsanskrit

Great Work Funderburkjim!

I randomly checked several entries and I am totally satisfied.

Now, we await Andhrabharati who may have more to say....

@funderburkjim
Contributor Author

@Andhrabharati Your thoughts?

@Andhrabharati
Contributor

You will have to wait two more days, as I am in the middle of some very important and interesting work.

So, I thought I should not divert my mind elsewhere.

@funderburkjim
Contributor Author

Understood.

@Andhrabharati
Contributor

Andhrabharati commented Jul 9, 2024

As Jim and Scott seem eager to see my comments, I have changed my mind a bit now, and here it goes.

  • My immediate impression is that the CDSL display has "lost" its charm with this correction, which otherwise has been visually "appealing" by employing various tags and colors.

and I guess you'd also realize the same, when actually "making" the changes in the file data for this.

  • I am glad that Jim did not "touch" the file text, and applied the correction only in the display code.

I did a quick parsing of the file data to "identify" the citation places & the "titular" abbr.s, by taking the "preceding word" of the ls-entity into consideration.

The summary is --

  1. The "titular" abbr.s are only at places "[^;,¦&>] <ls>[^ ]*</ls>", among the ones preceded by: an, as, by, in, of, on, or & to.
  2. And some probable candidates with preceding comma could be among the ones preceded by: at, by, in & to (being typo errors).
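Point 1 of the summary can be approximated in a few lines. This is a sketch under assumptions, not the parsing actually used: the regex is adapted from the pattern quoted above (the ls content is `[^ <]*` rather than `[^ ]*`, so a match cannot run across a tag boundary), the function name `titular_candidates` is invented, and the sample lines are made up for the demonstration.

```python
import re

# Preceding words listed in point 1 of the summary.
TITULAR_PRECEDERS = {"an", "as", "by", "in", "of", "on", "or", "to"}

# A preceding word, one space, then an ls element with no space in its content.
CANDIDATE = re.compile(r"(\S+) <ls>([^ <]*)</ls>")


def titular_candidates(line, preceders=TITULAR_PRECEDERS):
    """Return (preceding word, abbreviation) pairs that look 'titular'."""
    hits = []
    for match in CANDIDATE.finditer(line):
        prev, abbr = match.group(1), match.group(2)
        if prev[-1] in ";,¦&>":
            continue  # punctuation or a closing tag marks a citation place
        if prev.lower() in preceders:
            hits.append((prev, abbr))
    return hits


print(titular_candidates("a hymn in <ls>RV.</ls>"))        # invented sample
print(titular_candidates("a kind of tree, <ls>L.</ls>"))   # invented sample
```

Passing a different `preceders` set makes it easy to try the comma-preceded candidates of point 2 separately.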

As such, only these are the candidates to be looked at, for "applying" the intended/suggested "correction".

On the whole, these could be just about 2-3k lines in the mw file data. [On a lighter note, Jim's correction should go with a new term 'atyativyāpti', that crosses even the 'ativyāpti'.]
[And I can post these lines, once I am done with my on-going task, if Jim wants].

===============================
All the rest (mostly) indicate the citation places, whether or not MW mentions the exact "position" in the work (though his parent source PWG/pwk does give the full citation "position").

See for e.g. aṃśudhāna in MW (which has only <ls>R.</ls>) vs. PWG (which has "full" <ls>R. 2,71,9</ls>); and antaraprekṣin in MW (which has only <ls>MBh.</ls>; <ls>R.</ls>) vs. pwk (which has "full" <ls>MBH. 1,128,30. 7,117,5</ls>. <ls>R. 3,52,13. 5,9,46</ls>).

One of the primary (indeed the foremost!) aims of PWG/pwk is to give the literary source "place(s)" as SOLID proof of the word-usage [which at times even became a "train" of citations, taking up much print space]. MW [whose primary goal was to "fit" the dictionary into a single volume] 'reduced' this, in the majority of cases, to the literary source "name" alone [to save print space].

@Andhrabharati
Contributor

Andhrabharati commented Jul 9, 2024

Anyone who has looked into the MW print pages would easily notice the "undocumented point" that the citation places are preceded by a comma [to separate them from the meaning part] or a semi-colon (or &) [to separate them from the other citation places].

It is very unfortunate that the CDSL mw.txt file has LOST the comma and semi-colon at innumerable places due to some mishap at some stage in its "evolution".

I have employed various patterns to identify such places, and inserted them in my revision file [Feb–Apr of last year itself].
The remaining ones would be caught while proofing the text [which, for some reason, has not taken off yet].

@Andhrabharati
Contributor

Andhrabharati commented Jul 9, 2024

Look at L-20327,
<s>áṣṭaka</s> ¦ <lex>n.</lex> a whole consisting of eight parts (as each of the eight <s1 slp1="a/zwaka">Aṣṭaka</s1>s of the <ls>RV.</ls>, or as, <ls>TS. i</ls>, or as <s1 slp1="pARini">Pāṇini</s1>'s grammar &c.)

Production display--
image

Test display--
image

wherein the RV. and TS. i should go with <s1 n="" tag (instead of ls-tag), thus matching the surrounding tags very much.

This was my original (and still the same) proposal to change the ls-tag at all such places.

Incidentally, we can see that the above example has a comma after as (erroneously inserted at some stage; it is not in the earlier mw files) and the non-empty TS. ls-tag (<ls>TS. i</ls>).

So we may conclude that the correction should not be limited to the empty ls-tags.

@Andhrabharati
Contributor

I randomly checked several entries and I am totally satisfied.

Now, we await Andhrabharati who may have more to say....

Scott, (I think) I do not need to say anything more than the above.

@funderburkjim
Contributor Author

mwtestls1

url: https://sanskrit-lexicon.uni-koeln.de/work/mwtestls1/web/

Relative to this discussion, these displays are only slightly different.
Still, all the '(strictly) empty' ls references are marked differently,
but now with a pale blue color (per comment visually "appealing" by employing various tags and colors.) --
still ok with @aumsanskrit ?

I think @Andhrabharati has identified a 3-fold classification of ls references in mw:

  • linkable
  • titular
  • empty but not titular
    • this will be the biggest group. How to present it? (e.g. with a different color, or a smaller font-size)
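The three classes could be modelled roughly as follows. This is a sketch only: the names `LsKind` and `classify_ls` are invented, and the `is_titular_here` flag stands for the independent datapoint that would have to come from a separately maintained list, since 'titular' cannot be inferred from the ls content alone (the <ls>TS. i</ls> example above is titular yet not empty).

```python
from enum import Enum


class LsKind(Enum):
    LINKABLE = "linkable"
    TITULAR = "titular"
    EMPTY_NOT_TITULAR = "empty but not titular"


def classify_ls(content, is_titular_here=False):
    """Classify one ls occurrence.

    is_titular_here is external information about this occurrence; it is
    checked first so that a titular reference with a parameter is still
    classed as titular.
    """
    if is_titular_here:
        return LsKind.TITULAR
    if " " in content.strip():
        return LsKind.LINKABLE
    return LsKind.EMPTY_NOT_TITULAR


print(classify_ls("RV. v, 86, 5"))
print(classify_ls("R."))
print(classify_ls("TS. i", is_titular_here=True))
```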

[And I can post these lines, once I am done with my on-going task, if Jim wants].

'titular' class requires an independent datapoint, so YES your posting would be wanted.

I will probably use a different markup than the 's1' you suggest; maybe

  • <ls titular="yes">X</ls> or
  • <ls n="titular">X</ls>

Possible (remote) future research:
The 'empty but not titular' class of mw could be further divided into

  • those with full references in pw (and/or) pwg
  • those that are 'new' to MW. I wonder how many of these there are.
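One way such a comparison might start, shown only as a sketch: given MW's empty references for a headword and the full references of a parent dictionary for the same headword, partition MW's references by source abbreviation. The function names are hypothetical, the sample data are the aṃśudhāna and antaraprekṣin references quoted earlier in this thread, and abbreviations are compared case-insensitively because MW writes "MBh." where pwk writes "MBH.".

```python
def abbreviation(ls_content):
    """First token of a non-empty ls content, lowercased ('MBh.' ~ 'MBH.')."""
    return ls_content.split()[0].lower()


def split_by_parent(mw_empty, parent_full):
    """Partition MW's empty references by whether the parent dictionary has
    a full reference to the same source."""
    parent_abbrs = {abbreviation(c) for c in parent_full}
    with_parent = [c for c in mw_empty if abbreviation(c) in parent_abbrs]
    new_to_mw = [c for c in mw_empty if abbreviation(c) not in parent_abbrs]
    return with_parent, new_to_mw


# antaraprekṣin: MW vs. pwk
print(split_by_parent(["MBh.", "R."],
                      ["MBH. 1,128,30. 7,117,5", "R. 3,52,13. 5,9,46"]))
# aṃśudhāna: MW vs. PWG
print(split_by_parent(["R."], ["R. 2,71,9"]))
```

Aligning headwords across the dictionaries is the harder part and is not addressed here.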

@Andhrabharati
Contributor

Glad that a major chunk of my postings is being taken into consideration, for CDSL works.

@Andhrabharati
Contributor

Andhrabharati commented Jul 10, 2024

I will probably use a different markup than the 's1' you suggest; maybe
* <ls titular="yes">X</ls> or
* <ls n="titular">X</ls>

I would just suggest having a look at the entries áṣṭaka (L-20327) and taittirīya-saṃhitā (L-87012) to "clearly see" what I was telling ("matching the running-text style")--

Jim's proposal to mark as ls n="titular" (in pale blue color)

image
(having RV. and TS. as the candidates)

contrast with my suggestion

I had already changed several <ab n="" tags with '°' as <s1 n="">X°</s1> in my revision
... ... ...
and several others with '.' as <s1 n="">X.</s1>

AB's proposal to mark as <s1 n="" (in auto, i.e. black color)

image
(having , Y., V. and YV. as the candidates)

@aumsanskrit

"still ok with @aumsanskrit ?" Yes, very OK!

funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Jul 11, 2024
@Andhrabharati
Contributor

Andhrabharati commented Aug 22, 2024

As Jim did not respond about my earlier post, I thought of giving another example (of Mahābhārata) for my choosing <s1 n="XXX"> tag for titular abbr. of ls-entities, as against the proposition by Jim [<ls titular="yes">X</ls> or <ls n="titular">X</ls>] now--

the "MBh.", with ls-tag (26 instances)

(128): title of sections 64-67 of the first book of the <ls>MBh.</ls>
(22450): <ab>N.</ab> of the chapters 70-79 in the second book of the <ls>MBh.</ls>
(79874): (as a chapter of the <ls>MBh.</ls>)
(99284): <ab>N.</ab> of a <ab>wk.</ab> containing 32 legends from the <ls>MBh.</ls>
(157109): according to the <ls>MBh.</ls>,
(186061): a <ab>Comm.</ab> on the <ls>MBh.</ls>
(186454): compiler of the <ls>MBh.</ls> and of the <s1>Purāṇa</s1>s;
(209702): he is said to have written down the <ls>MBh.</ls> as dictated by <s1>Vyāsa</s1>,
(225788): <ab>N.</ab> of a commentator on the <ls>MBh.</ls>
(239592): the subject of the <ls>MBh.</ls>
(255541): <ab>N.</ab> of a celebrated king to whom <s1>Vaiśampāyana</s1> recited the <ls>MBh.</ls>
(307088): <ab>N.</ab> of a section of the <ls>MBh.</ls>
(331238): public reader of the <ls>MBh.</ls>
(371936): where <s1>Sauti</s1> related the <ls>MBh.</ls>,
(377021): the 5 gems or most admired episodes of the <ls>MBh.</ls>;
(486784): <ab>N.</ab> of an episode of the <ls>MBh.</ls>
(488341): interpolated in the <ls>MBh.</ls>
(498933): <ab>N.</ab> of the 6th book of the <ls>MBh.</ls>
(498951): <ab>N.</ab> of a <s1>Stotra</s1> from the <ls>MBh.</ls> and from the <ls>BhP.</ls>
(508258): containing episodes from the <ls>MBh.</ls>
(612200): in the <ls>MBh.</ls>
(618458): in the <ls>MBh.</ls>
(660358): in the <ls>MBh.</ls> & <ls>Hariv.</ls> he is a son of the <s1>Vasu</s1> <s1>Prabhāsa</s1> and <s1>Yoga-siddhā</s1>;
(740000): <ab>N.</ab> of the section of the <ls>MBh.</ls>
(775020): <ab>N.</ab> of <ab>ch.</ab> of the first book of the <ls>MBh.</ls>
(810969): <ab>N.</ab> of a section in the <ls>MBh.</ls>

the "Mahābhārata", with s1-tag (17 instances)

(21550): the fourteenth book of the <s1>Mahābhārata</s1>
(82539): the first book of the <s1>Mahābhārata</s1>.
(90029): the third book of the <s1>Mahābhārata</s1>
(107479): the sixth book of the <s1>Mahābhārata</s1>.
(115230): the fifth book of the <s1>Mahābhārata</s1>
(139003): the tenth book of the <s1>Mahābhārata</s1>)
(153052): the eighth book of the <s1>Mahābhārata</s1>.
(495070): ‘short sketch of the <s1>Mahābhārata</s1>’,
(553702): the 12th book of the <s1>Mahābhārata</s1>
(553705): the 12th book of the <s1>Mahābhārata</s1>
(583925): does not like the <s1>Mahābhārata</s1>, represent the production of different epochs and minds),
(652807): the fourth book of the <s1>Mahābhārata</s1>
(693874): <the <ab>Comm.</ab> on the <s1>Mahābhārata</s1>
(829830): the tenth book of the <s1>Mahābhārata</s1>
(832917): the 11th book of the <s1>Mahābhārata</s1>
(840031): [<ab>e.g.</ab> the <s1>Mahābhārata</s1> and <s1>Rāmāyaṇa</s1>],
(856493): a central scene of action in the <s1>Mahābhārata</s1>

the "Mahā-bhārata", with s1-tag (41 instances)

(90011): the third book of the <s1>Mahā-bhārata</s1>.
(90050): (the <s>āraṇyakam parva</s> of the <s1>Mahā-bhārata</s1> is either the whole third book or only the first section of it)
(95239): the fifteenth book of the <s1>Mahā-bhārata</s1>.
(95251): the fifteenth book of the <s1>Mahā-bhārata</s1>.
(95665): the fourteenth book of the <s1>Mahā-bhārata</s1>.
(96963): the first book of the <s1>Mahā-bhārata</s1>.
(100305): the third book of the <s1>Mahā-bhārata</s1>
(107287): the warriors of the <s1>Mahā-bhārata</s1>.
(166262): composed by him &c. (<ab>e.g.</ab> <s>kārṣṇaveda</s> <ab>i.e.</ab> the <s1>Mahā-bhārata</s1>,
(477893): <ab>opp.</ab> to the <s1>Mahā-bhārata</s1>
(495013): sometimes identified with the <s1>Mahā-bhārata</s1>,
(527446): the 17th book of the <s1>Mahā-bhārata</s1>.
(527915): one who knows the <s1>Mahā-bhārata</s1>,
(538089): explains some of the incidents of the <s1>Mahā-bhārata</s1>; 
(555283): the 16th book of the <s1>Mahā-bhārata</s1>
(634282): a section of the <s1>Mahā-bhārata</s1>
(640992): one of the wisest characters in the <s1>Mahā-bhārata</s1>,
(663300): in the <s1>Mahā-bhārata</s1> and <s1>Rāmāyaṇa</s1>
(663779): part of the <s1>Mahā-bhārata</s1>.
(664040): <ab>N.</ab> of a portion ... of the <s1>Mahā-bhārata</s1>
(664043): <ab>N.</ab> of a portion ... of the <s1>Mahā-bhārata</s1>
(664046): <ab>N.</ab> of a portion ... of the <s1>Mahā-bhārata</s1>
(673353): the 2nd and 8th books of the <s1>Mahā-bhārata</s1>,
(680579): the narrator of the <s1>Mahā-bhārata</s1> to <s1>Janam-ejaya</s1>
(685731): the supposed compiler of the <s1>Mahā-bhārata</s1>,
(694814): <ab>N.</ab> of a <ab>ch.</ab> of the <s1>Mahā-bhārata</s1>
(701154): the ninth book of the <s1>Mahā-bhārata</s1>,
(704885): the 12th book of the <s1>Mahā-bhārata</s1>
(711332): the 17th chapter of the <s1>Anuśāsana-parvan</s1> of the <s1>Mahā-bhārata</s1>, ... the <s1>Sabhā-parvan</s1> of the <s1>Mahā-bhārata</s1>;
(713194): <ab>N.</ab> of a <ab>ch.</ab> of the <s1>Mahā-bhārata</s1>,
(715488): <ab>N.</ab> of a <ab>ch.</ab> of the <s1>Mahā-bhārata</s1>.
(715692): an episode of the <s1>Śānti-parvan</s1> of the <s1>Mahā-bhārata</s1>,
(715713): a section of the <s1>Śānti-parvan</s1> of the <s1>Mahā-bhārata</s1>.
(726125): an episode in the <s1>Mahā-bhārata</s1>.
(761662): the second book of the <s1>Mahā-bhārata</s1>
(761695): the <s1>Sabhā</s1> and <s1>Araṇya-parvan</s1> (of the <s1>Mahā-bhārata</s1>)
(796452): a fine episode of the <s1>Mahā-bhārata</s1>;
(826551): <ab>N.</ab> of a <ab>ch.</ab> of the <s1>Mahā-bhārata</s1>
(828180): the 10th book of the <s1>Mahā-bhārata</s1>
(852494): <a celebrated poem supplementary to the <s1>Mahā-bhārata</s1>
(857773): an episode of the <s1>Mahā-bhārata</s1>,

With these examples, doesn't it seem more appropriate to choose the s1-tag for such titular abbr.s?
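Tallies like the ones above could be reproduced with a short script along these lines. It is a sketch, not the query actually used: the function name `count_mentions` is invented, the three sample lines are copied from the lists above, and it counts occurrences rather than lines, so a line that names the work twice contributes two.

```python
import re
from collections import Counter

# The work named either by its ls abbreviation or spelled out in an s1 tag.
MENTION = re.compile(r"<(ls|s1)>(MBh\.|Mahā-?bhārata)</\1>")


def count_mentions(lines):
    """Count (tag, form) pairs for mentions of the Mahābhārata."""
    counts = Counter()
    for line in lines:
        for tag, form in MENTION.findall(line):
            counts[(tag, form)] += 1
    return counts


sample = [
    "the subject of the <ls>MBh.</ls>",
    "the first book of the <s1>Mahābhārata</s1>.",
    "the 17th chapter of the <s1>Anuśāsana-parvan</s1> of the <s1>Mahā-bhārata</s1>, "
    "... the <s1>Sabhā-parvan</s1> of the <s1>Mahā-bhārata</s1>;",
]
for key, n in sorted(count_mentions(sample).items()):
    print(key, n)
```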

3 participants