Corrections in lsextract_mw.txt #135

Closed
Andhrabharati opened this issue Jul 1, 2022 · 83 comments

@Andhrabharati
Contributor

Andhrabharati commented Jul 1, 2022

Just had a cursory look at the file, and felt it needs a good revision/updating.

Here are some quick finds-

00001 Beames ? (Author)

; John Beames (for more details, look at Wiki)

00001 Hunter ? (Author)

; William Wilson Hunter (for more details, look at Wiki)

00000 Bn. ? (Author)

; Look at this #121 (comment), which had tackled the entry. Hence, to be deleted.

00000 Wh. Whitney, W. D. (1872). (Author)

; to be deleted, as "Wh." has been merged into "00001 Wh. and Ro. Atharva Veda Sanhita, Volume 1," now.

00000 Uttamac.2 2 Uttamacaritra in about 700 verses (Tit

; occurs under <L>68385, but differently marked.

00000 Bādar.,Sch.. Śaṃkara's Śārīraka-mīmāṃsā on Bādarāyaṇa

; There are as many as 46 "<ls>Bādar.</ls> <ab>Sch.</ab>" in the text.

00000 Nid,Sch.. Nidāna,Sch.i.e.Vācaspati's Comm. (Title)

; There are 4 "<ls>Nid.</ls> <ab>Sch.</ab>" in the text.

00000 Uṇ,Sch.. Uṇādi-sūtra,Sch.i.e.Ujjvaladatta
; There are as many as 52 "<ls>Uṇ.</ls> <ab>Sch.</ab>" in the text.

This clubbing together of a work with its commentary as THE "ls item marking" is what I had suggested in one of the issues at PWG recently.

Similarly many other entries should also be checked once and updated.
-----------
And there are quite a few <ls> entries that have a comma or a space after the <ls> word, or no punctuation mark (, or .) at all.
This needs correcting at the respective places in mw.txt itself, apart from updating the lsextract counts.

See for e.g.

00970 ŚāṅkhŚr. Śāṅkhāyana-śrauta-sūtra (Title)

while there are 977 instances of "ŚāṅkhŚr" in the text.
The 7 missed in the count are 1 "ŚāṅkhŚr," (<ls>ŚāṅkhŚr, xiii</ls>), 1 "ŚāṅkhŚr " (<ls>ŚāṅkhŚr vii</ls>), and 5 bare "ŚāṅkhŚr" (<ls>ŚāṅkhŚr</ls>).
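A count like this can be cross-checked mechanically. The following is a minimal sketch, assuming the citations appear as `<ls>ABBR...</ls>` within a text string; the function name `classify_ls` and the sample string are illustrative only and are not taken from any existing lsextract script.

```python
import re
from collections import Counter

def classify_ls(text, abbr):
    """Tally which character follows `abbr` at the start of each <ls> citation."""
    tally = Counter()
    # Capture at most one character after the abbreviation.
    pattern = re.compile(r"<ls>" + re.escape(abbr) + r"(.?)")
    for m in pattern.finditer(text):
        nxt = m.group(1)
        if nxt == ".":
            tally["period"] += 1   # the regular form, e.g. <ls>ŚāṅkhŚr. i, 2</ls>
        elif nxt == ",":
            tally["comma"] += 1    # e.g. <ls>ŚāṅkhŚr, xiii</ls>
        elif nxt == " ":
            tally["space"] += 1    # e.g. <ls>ŚāṅkhŚr vii</ls>
        elif nxt == "<":
            tally["bare"] += 1     # e.g. <ls>ŚāṅkhŚr</ls>
        else:
            tally["other"] += 1    # longer abbreviation or unexpected input
    return tally

# Made-up sample line, for illustration only.
sample = (
    "<ls>ŚāṅkhŚr. i, 2</ls> <ls>ŚāṅkhŚr, xiii</ls> "
    "<ls>ŚāṅkhŚr vii</ls> <ls>ŚāṅkhŚr</ls>"
)
counts = classify_ls(sample, "ŚāṅkhŚr")
print(dict(counts))
```

Run against mw.txt, the "comma", "space" and "bare" tallies would point to the places needing a correction in the text itself.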

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 1, 2022

<ls>Beta bengalensis</ls> under <L>60183, <L>69053 and <L>74487 to be tagged as <bot> entity.

@Andhrabharati
Contributor Author

And finally, the <ls> entries extracted and filtered now from the latest mw.txt gave as many as 848 entries, compared with 737 in the lsextract listed by Jim: 111 additional entries!

lsextract (AB).txt

Even if some of these extra items are resolved as typos, some additional entries would surely remain to be added into Jim's listing.
[I have seen that some entities occur with both a caret marking and a macron marking, and thus got identified separately now.]
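One way to spot such caret/macron pairs is to fold the caret (circumflex) into a macron before comparing abbreviations. This is a minimal sketch under the assumption that the two spellings really denote the same source, which still has to be confirmed case by case; the function names and the sample abbreviations are made up for illustration.

```python
import unicodedata

CIRCUMFLEX = "\u0302"  # combining circumflex accent (the caret marking)
MACRON = "\u0304"      # combining macron

def fold_caret_to_macron(abbr):
    """Return `abbr` with every circumflex replaced by a macron (NFC form)."""
    decomposed = unicodedata.normalize("NFD", abbr)
    return unicodedata.normalize("NFC", decomposed.replace(CIRCUMFLEX, MACRON))

def group_variants(abbrs):
    """Map each folded form to its spellings, keeping only forms with 2+ spellings."""
    groups = {}
    for a in abbrs:
        groups.setdefault(fold_caret_to_macron(a), set()).add(a)
    return {k: v for k, v in groups.items() if len(v) > 1}

# Hypothetical abbreviations: "Râjat." (caret), "Rājat." (macron), "MBh."
sample = ["R\u00e2jat.", "R\u0101jat.", "MBh."]
variants = group_variants(sample)
print(variants)
```

Each group reported this way is only a candidate for merging; whether it is a typo or a distinct entry remains an editorial decision.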

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 1, 2022

And there are 17 items marked NONE in the description that seem to need a proper tooltip--

Śūdradh. | NONE (Title)
Śaṃkaracetov. | NONE (Title)
Śrāddhac. | NONE (Title)
DrāhyŚr. | NONE (Title)
Mit. | NONE (Title)
Ratnak. | NONE (Title)
SaṃnyUp. | NONE (Title)
TśUp. | NONE (Title)
TAnukr. | NONE (Title)
TaṇḍināmUp. | NONE (Title)
Udbh. | NONE (Author)
Vākyap. | NONE (Title)
Vaidyajīv. | NONE (Title)
Vaidyaj. | NONE (Title)
Vaidyakaparibh. | NONE (Title)
Vallabh. | NONE (Author)
Vidvanm. | NONE (Title)

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

Most of the tooltip "Unknown references" are resolved in this file--

abbrevlist_unknown_resolved.txt

Guess @funderburkjim could work with this data.

Another 7 remaining to be resolved--

90.67 Par.
90.76 PS.
90.83 RLM.
90.87 RāmRās.
91.07 Viśv.
91.16 Śāntaś.
92.15 Sūtrakṛt.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

The unknown references are identified by padding the <ls> name to the "orphan numbers" etc. in the text, and it appears that @funderburkjim had missed a few such places.

I have also seen that some Capital_words were missed by Jim in his earlier attempt to identify the <ls> candidates.

I will be posting those lists shortly.

@Andhrabharati
Contributor Author

@funderburkjim

Should the 17 entries listed at #135 (comment) be handled?
These do not seem to appear in the tooltip file; which is the final file to be used for this purpose?

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

(847375):
= 1. <s>bandin</s>1) to be changed to
= 1. <s>bandin</s>)

Under <L>29692
<ls>MBh. xii,i 1220</ls> to be changed to
<ls>MBh. xii, 11220</ls>

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

The unknown references are identified by padding the name to the numbers etc. in the text, and it appears that @funderburkjim had missed a few such places.

Identified 96 <ls> orphan candidates (with multiple numbers)--
ls orphans-1.txt

Similarly, orphans involving single numbers are yet to be identified.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

There are
(a) 939 cases of <ab>Vārtt.</ab> [0-9]+, to be linked appropriately to the resp. P. (Pāṇini) number-seq.
(b) 20 cases of <ls>RTL.</ls> p.[0-9]+

And some individual cases are in
ls orphans-2.txt
ls orphans-3.txt

The exercise to find the <ls> orphans with a single number is still to be continued.
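The two patterns (a) and (b) can be counted with ordinary regular expressions. Below is a minimal sketch that uses the patterns as written above, with the literal dots escaped; the function name `count_orphans` and the sample text are assumptions for illustration.

```python
import re

# Patterns (a) and (b) from the comment, with literal periods escaped.
VARTT = re.compile(r"<ab>Vārtt\.</ab> [0-9]+")
RTL = re.compile(r"<ls>RTL\.</ls> p\.[0-9]+")

def count_orphans(text):
    """Count occurrences of the two orphan-number patterns in `text`."""
    return {
        "Vārtt.": len(VARTT.findall(text)),
        "RTL.": len(RTL.findall(text)),
    }

# Made-up sample; the last <ab>Vārtt.</ab> has no number and is not counted.
sample = (
    "<ls>Pāṇ. i, 1, 1</ls>, <ab>Vārtt.</ab> 3; "
    "<ls>RTL.</ls> p.45; <ab>Vārtt.</ab>"
)
orphan_counts = count_orphans(sample)
print(orphan_counts)
```

Replacing `findall` with `finditer` would also give the match positions, which helps when the orphans have to be linked one by one.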

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

Another 7 remaining to be resolved--

4 resolved--

90.76 PS. Unknown reference [Cologne Addition] Title > Paiṭhīnasi-Sūtra (in Paippalāda Saṃhitā) ?
90.87 RāmRās. Unknown reference [Cologne Addition] Title > RāmaRāsa(līlā) (in Bṛhatkośalakhaṇḍa) ?
91.07 Viśv. Unknown reference [Cologne Addition] Title > Viśvāmitra gotra; Kāty. herein refers to Kātyāyana gotra
92.15 Sūtrakṛt. Unknown reference [Cologne Addition] Title > Sūtrakṛtāṅgavṛtti (of Śīlācārya)

Still 3 more to be deciphered.

@funderburkjim
Contributor

Format of abbrevlist_unknown_resolved.txt is fine. Plan to begin my part of work on this on Monday (3 days from now).

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

Got one more entry--

91.16 Śāntaś. Unknown reference [Cologne Addition] Title > Śāntiśataka [print correction as Śāntiś.] [Śāntiś. 3.8]


Another 2 to be got.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 22, 2022

Of the 17 items marked NONE in the description (listed in my Jul 1 comment above), 14 are resolved--

Śūdradh. | NONE (Title) > Śūdradharma
Śaṃkaracetov. | NONE (Title) > Śaṅkaracetovilāsacampū
Śrāddhac. | NONE (Title) > Śrāddhacintāmaṇi
DrāhyŚr. | NONE (Title) > Drāhyāyana Śrautasūtra
Mit. | NONE (Title) > Mitākṣarā
Ratnak. | NONE (Title) > Ratnakoṣa
SaṃnyUp. | NONE (Title) > Saṃnyāsopaniṣad
TAnukr. | NONE (Title) > TaittirīyaAnukramaṇikā
Udbh. | NONE (Author) > Udbhaṭa (Kāvyālaṅkārasārasaṅgraha)
Vākyap. | NONE (Title) > Vākyapadīya, a grammatical treatise (by Bhartṛhari)
Vaidyajīv. | NONE (Title) > Vaidyajīvana (of Lolimbarāja)
Vaidyaj. | NONE (Title) > Vaidyajīvana (of Lolimbarāja)
Vaidyakaparibh. | NONE (Title) > Vaidyakaparibhāṣā
Vidvanm. | NONE (Title) > Vidvanmodataraṅgiṇī

3 more to be got still--

TśUp. | NONE (Title)
TaṇḍināmUp. | NONE (Title)
Vallabh. | NONE (Author)

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 25, 2022

I have also seen that some Capital_words were missed by Jim in his earlier attempt to identify the <ls> candidates.

I will be posting those lists shortly.

Cap. letter words are broadly treated in two categories--

<ls> related [76 instances]
Cap. letter (ls related).pdf

and miscellaneous [160 instances]
Cap. letter (miscellaneous).pdf

Also there are 78 instances where " Jain" (i.e. preceded by a space) is to be tagged as <ns>Jain</ns>

And there are 3 other miscellaneous corrections, that I wanted to list separately--
(164391): 'V' (u+0056), mentioned as a symbol, could be changed to a true symbol like character '⋎' (u+22CE), or at least as '⋁' (u+22C1).

(41998): (a Lyrae) to be changed as (<lang n="greek">α</lang> Lyræ)
;[There are many more places requiring ae > æ change; and some places that need oe > œ change.]

(706145): <ls>Ked.</ls>N. to be changed as <ls>Ked.</ls>; <ab>N.</ab>
;[should this portion be split into a separate line from here??]

Hope this gets Jim's attention in his present work on MW.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 25, 2022

@funderburkjim is requested to generate the IAST version again, after correcting (a) the <ls> and other issues herein & (b) the #131 <hom> related issues that he is "considering" next.

Probably he could consider the #137 (Miscellaneous corrections) as well in the current session itself.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 25, 2022

Got another one!

90.67 Par. Unknown reference [Cologne Addition] Title > Pur. [print correction; there are 230 instances having the sequence <ls>Kāv.</ls>; <ls>Pur.</ls>, except this one]

One more remaining now--

90.83 RLM.

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 25, 2022

The items posted in 3 files under #112 are checked now; the items yet to be corrected are being posted here for @funderburkjim to have a look at.

These contain 5 <ls> related and 39 misc. corrections--
small letters (ls and misc.).pdf

And then, there are 60 untagged or new abbr.s--
untagged or new abbr(s).txt.txt

@Andhrabharati
Contributor Author

Andhrabharati commented Jul 26, 2022

MW has kept a space after the number(s) in <ls> citations invariably (though he chose not to have the space in pure numbers!).

I see 13 instances in the latest file data that have no space in the <ls> citations, out of which
(a) 1 RV. citation & 4 AV. citations are working for links
(b) 1 RV. citation yet to be "made" to work
(c) 3 R. citations not working
(d) 4 works not yet having link-targets

(39338): <ls>AV. vi, 117,1</ls> ;[working]
(51733): <ls>R. vii, 7,3</ls> ;[not working]
(69064): <ls>R. vi, 8,10.</ls> ;[not working]
(73752): <ls>R. i, 67,15.</ls> ;[not working]
(78042): <ls>AV. xiii, 2,3</ls> ;[working]
(203832): <ls>BhP. iv, 16,7.</ls> ;[yet to link a target]
(257301): <ls>Hit. iii, 8,1/2</ls> ;[yet to link a target]
(264933): <ls>BhP. iii, 18,1</ls> ;[yet to link a target]
(316762): <ls>AV. viii, 6,3.</ls> ;[working]
(450250): <ls n="RV.">iv, 5,14</ls> ;[working]
(512900): <ls>RV. (<ab>esp.</ab> vii, 18, 6; viii, 3,9 &c.)</ls> ;[not working]
(774326): <ls>AV. viii, 9,18</ls> ;[working]
(816930): <ls>ChUp. v, 12,1</ls> ;[yet to link a target]

In any case, it is suggested to insert a space in these 13 instances also-- to make all the citations uniformly marked.
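The suggested uniform spacing could be applied with a substitution restricted to the inside of <ls> elements. This is only a sketch, assuming each <ls>...</ls> sits on a single line and that no comma directly followed by a digit inside a citation is intentional; `space_commas` is a hypothetical helper name.

```python
import re

# One <ls> element, with or without attributes such as n="RV.".
LS = re.compile(r"(<ls\b[^>]*>)(.*?)(</ls>)")

def space_commas(line):
    """Insert a space after any comma directly followed by a digit, inside <ls> only."""
    def fix(m):
        body = re.sub(r",(?=\d)", ", ", m.group(2))
        return m.group(1) + body + m.group(3)
    return LS.sub(fix, line)

print(space_commas("<ls>AV. vi, 117,1</ls>"))
print(space_commas('<ls n="RV.">iv, 5,14</ls>'))
```

Text outside the <ls> elements is left untouched, so pure numbers keep their existing form.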

funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Jul 27, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-apidev that referenced this issue Jul 27, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-websanlexicon that referenced this issue Jul 27, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-corrections that referenced this issue Jul 27, 2022
@funderburkjim
Contributor

change_1

The changes based on file abbrevlist_unknown_resolved.txt have been installed. Changes based on the numerous additional comments will be handled in a next phase.
The work is done in the issue135 directory.

  • readme.txt describes the steps taken.
  • change_1.txt contains the small number of changes to mw.txt.
    • print_change.txt indicates those changes (about 30) which are print changes to ls abbreviations.
    • these are also documented in the csl-corrections repository mw_printchange.txt file.
  • tooltip.txt in pywork repository at commit 8c29bcc is the current revised list of ls abbreviations and tooltips.
  • tooltip_2_unused.txt lists abbreviation tooltips removed from tooltip.txt because they have no usage instances in mw.txt.

@Andhrabharati
Contributor Author

  1. DharmaP.
    old tip: = tooltip 02:45 [Cologne Addition]
    new tip: Dāyabhāga [Cologne Addition]

My file has this--

90.26 DharmaP. Unknown reference [Cologne Addition] Title > DharmaPurāṇa

funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Aug 5, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Aug 5, 2022
funderburkjim added a commit that referenced this issue Aug 5, 2022
@funderburkjim
Contributor

Correction re BhP. (B.)

I think I understand it finally. See note at Further revision re BhP. (B.) in readme for issue135 directory.

Closing this issue.

@Andhrabharati
Contributor Author

90.83 RLM.

@Andhrabharati please provide a reference to the text mentioned in the above comment, and
a link to a pdf if such is available.

@funderburkjim

It is from the front pages of ACC.

@Andhrabharati
Contributor Author

Andhrabharati commented Aug 5, 2022

There are 10 places with a ? in the tooltips as of now.

00001 90.16 :: Bṛ. :: Śatapatha-brāhmaṇa ? [Cologne Addition] :: Title

This occurs under <L>117598<pc>595,1<k1>pariṇī as
to lead forward to, put or place anywhere (<s>agram</s>, at the head), <ls>Bṛ.</ls>;

PWG in the entry nI (under the prefix pari) has thus-

  1. herumführen, - geleiten, - tragen; herbeibringen Ṛv. 1, 95, 2. 1, 162, 4. सो अ॑ध्व॒राय॒ परि॑ णीयते क॒विः 3, 2, 7. ज्या॑वाजं॒ परि॑ णयन्त्या॒जौ 3, 53, 24. स सद्म परि॑णीयते 4, 9, 3. 4, 15, 1. जी॒वां मृ॒तेभ्यः॑ परिणी॒यमा॑नाम् Av. 18, 3, 3. परी॒मे गाम॑नेषत Ṛv. 10, 155, 5. 10, 165, 5. तेनै॒वैन॒मग्रं॑ दे॒वता॑नां॒ पर्य॑णयत् brachte an die Spitze Ts. 2, 3, 4, 3. Śat. Br. 5, 3, 3, 6. 7, 3, 2, 18. Śāṅkh. Br. 28, 2. Kauś. 46. 64. 80. 81.

As seen here, there are three Brāhmaṇas cited- Śatapatha, Śāṅkhāyana and Kauṣītaki; so the tooltip might have to be Brāhmaṇa literature instead of just the Śatapatha-brāhmaṇa ?

Just noticed the ṛ in Bṛ!

So, make it a print correction as Bṛ > Br. and the tooltip 02:16 gets applied to it; no need of this 90.16

@Andhrabharati
Contributor Author

Andhrabharati commented Aug 5, 2022

90.24 :: Cār. :: (possibly) Caraka [The context indicates that a work related to some herb, so some medical treatise is intended.] [Cologne Addition] :: Title

I forgot to mention a print change Cār. > Car. while saying this--
90.24 Cār. Unknown reference [Cologne Addition] Title > (possibly) Caraka [The context indicates that a work related to some herb, so some medical treatise is intended.]

The matter within [...] was used as an explanation for my interpretation, and not to be added in the tooltip.
As Car. is already there at 02:32 (02:32 :: Car. :: Caraka :: Author), this 90.24 entry may be deleted, after making the print correction.
And, Car. refers to Carakasaṃhitā (a Title), not to Caraka (the Author).

BTW, just noticed that 9x.xx items are with a dot between, while all the earlier ones 0x:xx, 1x:xx and 2x:xx are with a colon between; is there some special significance for this different notation?

Finally,

00003 9.2 :: UNKNOWN :: unknown :: ls is unknown

what are these three instances?

@Andhrabharati
Contributor Author

There are 13 places where a space is missing between the title and the citation place--

Kathās.liv, 18.
Kathās.lxxi
Kathās.lxxvii, 22.
Pañcat.iv, 2, 0/1
R.i, 3, 11.
R.i, 5
VarBṛS.li, 2
VarBṛS.li, 24
VarBṛS.liii, 48
VarBṛS.lx, 5
VarBṛS.lxviii, 97.
VarBṛS.lxxix
KātyŚr.,xxv
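Cases like these can be located, and tentatively fixed, by looking for a period that is immediately followed by a lowercase roman numeral at the start of an <ls> citation. The sketch below is an assumption-laden illustration: `add_title_space` is a made-up name, and the pattern can produce false positives (for instance on an abbreviation whose second part is a lowercase c, d, i, l, m, v or x), so its output should be reviewed rather than applied blindly. The last item, with ".," before the numeral, is deliberately left alone, since the intended form is not stated here.

```python
import re

# "<ls>" + abbreviation ending in "." + lowercase roman numeral with no space between.
NO_SPACE = re.compile(r"(<ls>[^\s<]*?\.)([ivxlcdm]+\b)")

def add_title_space(line):
    """Insert the missing space between the abbreviation and the roman numeral."""
    return NO_SPACE.sub(r"\1 \2", line)

print(add_title_space("<ls>Kathās.liv, 18.</ls>"))
print(add_title_space("<ls>R.i, 3, 11.</ls>"))
```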

@Andhrabharati
Contributor Author

Now, I move away from this ls issue.

Note. As mentioned above at #135 (comment), there still can be some ls-orphans (with a single digit) lying, as my hunt was not continued.

@Andhrabharati
Contributor Author

[Hope @funderburkjim would come back to this closed issue to do these corrections.]

@funderburkjim
Contributor

Thanks for reference to ACC front matter.

9x.xx items are with a dot between, while all the earlier ones 0x:xx, 1x:xx and 2x:xx are with a colon between; is there some special significance for this different notation?

No significance. These 'codes' are present only to make it easier to refer to a specific abbreviation. Each line should have a different code.
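The "one code per line" rule lends itself to a quick check. The sketch below assumes the four-field `code :: abbreviation :: tooltip :: type` layout seen in the lines quoted in this thread, with an optional leading count before the code; the function names are hypothetical, and the third sample line is a made-up duplicate added only to exercise the check.

```python
from collections import Counter

def parse_tooltip_line(line):
    """Split a tooltip line into (code, abbreviation, tooltip, type)."""
    parts = [p.strip() for p in line.split("::")]
    if len(parts) != 4:
        raise ValueError(f"unexpected field count: {line!r}")
    code_field, abbr, tip, kind = parts
    code = code_field.split()[-1]  # drop an optional leading count such as 00001
    return code, abbr, tip, kind

def duplicate_codes(lines):
    """Return the codes that occur on more than one line, sorted."""
    counts = Counter(parse_tooltip_line(l)[0] for l in lines)
    return sorted(c for c, n in counts.items() if n > 1)

def queried_abbrs(lines):
    """Return the abbreviations whose tooltip still contains a '?'."""
    return [abbr for _, abbr, tip, _ in map(parse_tooltip_line, lines) if "?" in tip]

sample = [
    "00001 90.16 :: Bṛ. :: Śatapatha-brāhmaṇa ? [Cologne Addition] :: Title",
    "02:32 :: Car. :: Caraka :: Author",
    "90.16 :: Xyz. :: made-up duplicate for the demo :: Title",
]
print(duplicate_codes(sample))
print(queried_abbrs(sample))
```

Because the fields are split on "::" only, codes written with a single colon (02:32) and codes written with a dot (90.16) are handled alike.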

@funderburkjim
Contributor

one of the 3 'unknown' items in lsextract_all.txt is

; <L>187338<pc>924,3<k1>varRatAla
626433 old <s>va/rRa—tAla</s> ¦ <lex>m.</lex> <ab>N.</ab> of a king, 
<ls>Vār., <ab>Introd.</ab></ls><info lex="m"/>

Vār. is the unknown. No obvious answer from varRatAla in pwg or pw.

The other two were typos: <ls><ab>ib.</ab> [Pi.']</ls>; these are being changed to <ls>ib. [Pi.']</ls>.

@funderburkjim funderburkjim reopened this Aug 5, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-corrections that referenced this issue Aug 5, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Aug 5, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Aug 5, 2022
@Andhrabharati
Contributor Author

Andhrabharati commented Aug 5, 2022

Vār. is the unknown. No obvious answer from varRatAla in pwg or pw.

Very easy, this is!

Just change Vār. to Vās. and it becomes another count increase in <ls>Vās., <ab>Introd.</ab>

PWG clearly mentions this--
वर्णताल m. N. pr. eines Fürsten Hall in der Einl. zu Vāsavad. S. 53. [ID=88431] [p= 6-0742]
(Einl. zu Vāsavad. is Introd. to Vāsavad.)

@Andhrabharati
Contributor Author

With this "final" correction done, this issue can be closed with a bang!

funderburkjim added a commit to sanskrit-lexicon/csl-corrections that referenced this issue Aug 5, 2022
funderburkjim added a commit to sanskrit-lexicon/csl-orig that referenced this issue Aug 5, 2022
funderburkjim added a commit that referenced this issue Aug 5, 2022
@funderburkjim
Contributor

I saw Vāsavad. but didn't notice that 'Einl. zu' (einleitung) corresponds to 'Introd.'.

All changes now made and installed, AFAIK.

The changes of this issue provide a substantial improvement to the literary source markup and tooltips in the Cologne digitization of mw1899.
Many thanks to @Andhrabharati for guiding this effort!

Will now close this issue and move on to 'cleanup' of other areas of mw.txt.

@Andhrabharati
Contributor Author

Andhrabharati commented Aug 8, 2022

@funderburkjim

90.83 RLM. Unknown reference [Cologne Addition] Title

In the light of this info, I would suggest RLM. to be interpreted as RājendraLālaMitra's Notices of Sanskrit MSS. (probably with a '?' at the end to be on the safe side).

I have found the RLM. for sure now, quite accidentally.

It is the RājendraLālaMitra's ed. of Lalitavistara (The Lalita-vistara: Memoirs of the Early Life of Sakya Sinha, 1877), who also translated it into English in two parts subsequently (1882, 1886).
https://archive.org/details/in.ernet.dli.2015.292668/page/n3/mode/1up

Now, the ? mark is not needed at RLM.

@Andhrabharati
Contributor Author

And you can get many source scans (for PWG, pwk and MW) from this list--

http://www.sanskritebooks.org/2015/12/bibliotheca-indica-series/

funderburkjim added a commit to sanskrit-lexicon/csl-pywork that referenced this issue Sep 2, 2022
@funderburkjim
Contributor

Changed tooltip for RLM per comment above.
