Discussion about changes in XML for Sanskrit koshas #408

Open
drdhaval2785 opened this issue Mar 16, 2023 · 4 comments

@drdhaval2785
Contributor

With the new markup syntax for Sanskrit koshas under discussion in #405 and #406, the next logical step is generating the XML files, so that the data can be used for sqlite and HTML purposes. Because the XML is generated programmatically from the xxx.txt file, duplication in the XML is not a serious problem. Ideally, though, we should design an XML structure that avoids duplication.

The Cologne dictionaries currently have the same problem with alternate headwords, and work around it by duplicating the entry.
See SKD, for example:

<H1><h><key1>kuberaH</key1><key2>kube(ve)raH</key2></h><body><s>kube(ve)raH , puM, (kumbatIti . kuba i ki AcCAdane</s> <lb/>“<s>kumbernalopaSca</s>”<s> . uRAM 1 . 60 . iti erak .</s> <lb/><s>nalopaSca . yadvA kutsitaM veraM SarIraM yasya . piNgala</s> <lb/><s>netratvAttaTAtvam .) yakzarAjaH . iti sidDAnta-</s> <lb/><s>kOmudyAmuRAdivfttiH .. (sa ca viSravasa fze</s> <lb/><s>rilavilAyAM jAtaH . sa tu tripAt azwadantaH</s> <lb/><s>kekarAkzaSca . yaTA, vAyupurARe .</s> <lb/>“<s>kutsAyAM kvitiSabdo'yaM SarIraM veramucyate .</s> <lb/><s>kuveraH kuSarIratvAt nAmnA tenEva so'NkitaH</s>”<s> ..</s> <lb/><s>taTA kASIKaRqe devIdattaSApoktO ca .</s> <lb/>“<s>kuvero Bava nAmnA tvaM mama rUperzyayA suta !</s>”<s> ..)</s></body><tail><L>8094</L><pc>2-144-a</pc></tail></H1>
<H1><h><key1>kuveraH</key1><key2>kuveraH</key2></h><body><alt><s>kuveraH</s> is an alternate of <s>kuberaH</s>.</alt> <s>kube(ve)raH , puM, (kumbatIti . kuba i ki AcCAdane</s> <lb/>“<s>kumbernalopaSca</s>”<s> . uRAM 1 . 60 . iti erak .</s> <lb/><s>nalopaSca . yadvA kutsitaM veraM SarIraM yasya . piNgala</s> <lb/><s>netratvAttaTAtvam .) yakzarAjaH . iti sidDAnta-</s> <lb/><s>kOmudyAmuRAdivfttiH .. (sa ca viSravasa fze</s> <lb/><s>rilavilAyAM jAtaH . sa tu tripAt azwadantaH</s> <lb/><s>kekarAkzaSca . yaTA, vAyupurARe .</s> <lb/>“<s>kutsAyAM kvitiSabdo'yaM SarIraM veramucyate .</s> <lb/><s>kuveraH kuSarIratvAt nAmnA tenEva so'NkitaH</s>”<s> ..</s> <lb/><s>taTA kASIKaRqe devIdattaSApoktO ca .</s> <lb/>“<s>kuvero Bava nAmnA tvaM mama rUperzyayA suta !</s>”<s> ..)</s></body><tail><L>8094.01</L><pc>2-144-a</pc><hwtype n="alt" ref="8094"/></tail></H1>

In this case, the record for the second headword contains the sentence "X is an alternate of Y", followed by the entire body of the previous entry.
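
Before dropping the duplicated bodies, it may be worth checking that an alternate record really is a copy of its parent once the `<alt>` note is removed. The following is only a rough sketch of such a check, not existing project code: `PARENT` and `ALT` are shortened stand-ins for the two SKD records above, and `body_text` and `is_duplicate` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Shortened stand-ins for the two SKD records (bodies truncated for brevity).
PARENT = (
    "<H1><h><key1>kuberaH</key1><key2>kube(ve)raH</key2></h>"
    "<body><s>kube(ve)raH , puM,</s> <lb/><s>yakzarAjaH .</s></body>"
    "<tail><L>8094</L><pc>2-144-a</pc></tail></H1>"
)
ALT = (
    "<H1><h><key1>kuveraH</key1><key2>kuveraH</key2></h>"
    "<body><alt><s>kuveraH</s> is an alternate of <s>kuberaH</s>.</alt> "
    "<s>kube(ve)raH , puM,</s> <lb/><s>yakzarAjaH .</s></body>"
    '<tail><L>8094.01</L><pc>2-144-a</pc><hwtype n="alt" ref="8094"/></tail></H1>'
)


def body_text(raw):
    """Serialize a record's <body>, ignoring a leading <alt> note if present."""
    body = ET.fromstring(raw).find("body")
    alt = body.find("alt")
    if alt is not None:
        # remove() also drops the whitespace that follows the <alt> element.
        body.remove(alt)
    return ET.tostring(body, encoding="unicode").strip()


def is_duplicate(parent_raw, alt_raw):
    """True if the alternate's body equals the parent's body minus <alt>."""
    return body_text(parent_raw) == body_text(alt_raw)


print(is_duplicate(PARENT, ALT))
```

A record that fails this check would carry content of its own and could not simply be reduced to a headword pointing at the parent entry.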

Proposed markup

<hwdetails>
<h><key1>kuberaH</key1><key2>kube(ve)raH</key2><L>8094</L></h>
<h><key1>kuveraH</key1><key2>kuveraH</key2><L>8094</L></h>
</hwdetails>

<entrydetails>
<body><L>8094</L><pc>2-144-a</pc><entry><s>kube(ve)raH , puM, (kumbatIti . kuba i ki AcCAdane</s> <lb/>“<s>kumbernalopaSca</s>”<s> . uRAM 1 . 60 . iti erak .</s> <lb/><s>nalopaSca . yadvA kutsitaM veraM SarIraM yasya . piNgala</s> <lb/><s>netratvAttaTAtvam .) yakzarAjaH . iti sidDAnta-</s> <lb/><s>kOmudyAmuRAdivfttiH .. (sa ca viSravasa fze</s> <lb/><s>rilavilAyAM jAtaH . sa tu tripAt azwadantaH</s> <lb/><s>kekarAkzaSca . yaTA, vAyupurARe .</s> <lb/>“<s>kutsAyAM kvitiSabdo'yaM SarIraM veramucyate .</s> <lb/><s>kuveraH kuSarIratvAt nAmnA tenEva so'NkitaH</s>”<s> ..</s> <lb/><s>taTA kASIKaRqe devIdattaSApoktO ca .</s> <lb/>“<s>kuvero Bava nAmnA tvaM mama rUperzyayA suta !</s>”<s> ..)</s></entry></body>
</entrydetails>
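
To make the proposal concrete, here is a minimal sketch of how existing Cologne-style `<H1>` records could be split into the `<hwdetails>` and `<entrydetails>` blocks shown above. It is an illustration under assumptions, not the eventual make_xml.py: `RECORDS` holds shortened copies of the two SKD records, `split_records` is a hypothetical name, and the sketch assumes that every alternate headword is marked with `<hwtype n="alt" ref="..."/>` and that its body can be discarded in favour of the parent's.

```python
import xml.etree.ElementTree as ET

# Shortened stand-ins for the two SKD records (bodies truncated for brevity).
RECORDS = [
    "<H1><h><key1>kuberaH</key1><key2>kube(ve)raH</key2></h>"
    "<body><s>kube(ve)raH , puM,</s> <lb/><s>yakzarAjaH .</s></body>"
    "<tail><L>8094</L><pc>2-144-a</pc></tail></H1>",
    "<H1><h><key1>kuveraH</key1><key2>kuveraH</key2></h>"
    "<body><alt><s>kuveraH</s> is an alternate of <s>kuberaH</s>.</alt> "
    "<s>kube(ve)raH , puM,</s> <lb/><s>yakzarAjaH .</s></body>"
    '<tail><L>8094.01</L><pc>2-144-a</pc><hwtype n="alt" ref="8094"/></tail></H1>',
]


def split_records(records):
    """Build (hwdetails, entrydetails) elements from a list of H1 records."""
    hwdetails = ET.Element("hwdetails")
    entrydetails = ET.Element("entrydetails")
    for raw in records:
        h1 = ET.fromstring(raw)
        tail = h1.find("tail")
        hwtype = tail.find("hwtype")
        # An alternate headword points at its parent entry through ref.
        lnum = hwtype.get("ref") if hwtype is not None else tail.findtext("L")

        # One <h> per headword, each carrying the L of the shared entry.
        h = ET.SubElement(hwdetails, "h")
        ET.SubElement(h, "key1").text = h1.findtext("h/key1")
        ET.SubElement(h, "key2").text = h1.findtext("h/key2")
        ET.SubElement(h, "L").text = lnum

        if hwtype is not None:
            continue  # duplicated body: keep only the parent's copy

        # One <body> per real entry, stored once.
        body = ET.SubElement(entrydetails, "body")
        ET.SubElement(body, "L").text = lnum
        ET.SubElement(body, "pc").text = tail.findtext("pc")
        entry = ET.SubElement(body, "entry")
        old_body = h1.find("body")
        entry.text = old_body.text
        entry.extend(list(old_body))  # move the <s>, <lb/> children across
    return hwdetails, entrydetails


hw, entries = split_records(RECORDS)
print(ET.tostring(hw, encoding="unicode"))
print(ET.tostring(entries, encoding="unicode"))
```

With this layout a lookup for either `kuberaH` or `kuveraH` resolves to the same `<L>`, and the entry text exists only once in `<entrydetails>`.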
@drdhaval2785
Contributor Author

@funderburkjim,
I need you to contribute to this thread and finalize the XML format, so that I can write make_xml.py to generate XMLs from sample TXT files.

@funderburkjim
Contributor

I think it is better for me to work on make_xml.py as well as the other components (such as displays) for these 'thesaurus-like' dictionaries. Please be patient.

@gasyoun
Member

gasyoun commented Mar 17, 2023

> Please be patient.

We sure are, aren't we, @drdhaval2785?

@drdhaval2785
Contributor Author

Sure. I just want to say that this is one of the high-priority items, and the highest for me. Once we are able to close this, I can start preparing the koshas and adding them to the Cologne and Stardict world.
