Capitalization in Proper Names (earlier was mw:179760) #1537

Open
drdhaval2785 opened this issue Jan 12, 2024 · 10 comments

@drdhaval2785
Collaborator

date: 12/22/2023 00:33:17
dict: mw
Lnum: 179760
hw: rocana
old: (ruci-ruce r°)
new: (Ruci-ruce r°)
comm: Typo

@drdhaval2785
Collaborator Author

Requires examination and comparison with other such quotes.

@drdhaval2785 added the "help wanted" label Jan 14, 2024
@Andhrabharati

This belongs to the category where an initial capital letter in IAST [marking a proper noun, as in English grammar] is converted to a lowercase letter in slp1; there are hundreds (if not thousands) of such places across many cdsl texts.

It calls for a strategic decision: either adopt something like what @gasyoun proposed long ago [i.e., use the slp1 notation {X} for such initial capitals in IAST, which would preserve the 'invertibility' property that Jim and Dhaval mention time and again] so that the digitization matches the printed text, or keep ignoring those initial capitals, as has been done till now.
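
A minimal Python sketch of why the capital gets lost and how a brace marker could keep the conversion invertible. Everything here is an assumption for illustration: the mapping table is partial, the function names are hypothetical, and the brace convention (wrapping the slp1 letter of a capitalized IAST initial in `{}`) is only a guess at what the {X} proposal means, not a documented specification.

```python
import unicodedata

# Partial IAST -> slp1 table; letters not listed map to themselves.
# slp1 already uses capitals for distinct sounds (A = ā, S = ś, R = ṇ),
# so an IAST capital cannot simply be kept as a capital.
_PAIRS = {
    "ai": "E", "au": "O",
    "kh": "K", "gh": "G", "ch": "C", "jh": "J",
    "ṭh": "W", "ḍh": "Q", "th": "T", "dh": "D", "ph": "P", "bh": "B",
    "ā": "A", "ī": "I", "ū": "U", "ṛ": "f", "ṝ": "F", "ḷ": "x",
    "ṃ": "M", "ḥ": "H", "ṅ": "N", "ñ": "Y",
    "ṭ": "w", "ḍ": "q", "ṇ": "R", "ś": "S", "ṣ": "z",
}
TABLE = {unicodedata.normalize("NFC", k): v for k, v in _PAIRS.items()}
INVERSE = {v: k for k, v in TABLE.items()}


def iast_to_slp1(text, mark_caps=False):
    """Convert IAST to slp1; optionally wrap capitalized letters in braces."""
    text = unicodedata.normalize("NFC", text)
    out, i = [], 0
    while i < len(text):
        chunk = text[i:i + 2]
        if chunk.lower() not in TABLE:  # no digraph here, take one character
            chunk = text[i]
        slp = TABLE.get(chunk.lower(), chunk.lower())
        if mark_caps and chunk[0].isupper():
            slp = "{" + slp + "}"  # hypothetical capital marker
        out.append(slp)
        i += len(chunk)
    return "".join(out)


def slp1_to_iast(text):
    """Convert slp1 back to IAST, capitalizing any single letter in braces."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "{" and text.find("}", i) == i + 2:
            letter = INVERSE.get(text[i + 1], text[i + 1])
            out.append(letter[0].upper() + letter[1:])
            i += 3
        else:
            out.append(INVERSE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(iast_to_slp1("Śālmali"))                       # SAlmali (capital lost)
print(iast_to_slp1("Śālmali", mark_caps=True))       # {S}Almali
print(slp1_to_iast("{r}uci-ruce r°"))                # Ruci-ruce r°
```

Without the marker, `Śālmali` and `śālmali` both become `SAlmali`, so the printed capital cannot be recovered from the slp1 text alone; with the marker the round trip returns the original string.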

@funderburkjim
Contributor

Current coding in 179760 rocana
<s>ruci-ruce r°</s>    In cdsl displays, this is rendered according to the user's output preference.

Compare 179772 rocanA
<s1 slp1="SAlmali">Śālmali</s1>   Always rendered as IAST Śālmali in current cdsl displays.

The cdsl transcoding routines do not implement the {} feature of slp1.

<s1 slp1="ruci-ruce r°">ruci-ruce r°</s1>
(It so happens in ruci-ruce  that this is same in both slp1 and iast.)

And in print (mw-iast), there is no capitalization.  

If we decide that a print change should be made, the coding could be
<s1 slp1="ruci-ruce r°">Ruci-ruce r°</s1>
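
A rough Python illustration of the difference between the two codings discussed here, assuming display code that treats `<s>` content as slp1 to be transcoded per user preference and `<s1>` content as literal text to be shown as typed. The `render` function and the toy transcoder are hypothetical and are not taken from the cdsl code base.

```python
import re


def toy_transcode(slp1_text):
    """Stand-in for a real slp1 -> user-preference transcoder."""
    return "[" + slp1_text + "]"


def render(markup, transcode):
    """Transcode <s> content; show <s1> content exactly as typed."""
    markup = re.sub(r"<s>(.*?)</s>", lambda m: transcode(m.group(1)), markup)
    markup = re.sub(r'<s1 slp1="[^"]*">(.*?)</s1>', lambda m: m.group(1), markup)
    return markup


print(render("<s>ruci-ruce r°</s>", toy_transcode))
# [ruci-ruce r°]
print(render('<s1 slp1="ruci-ruce r°">Ruci-ruce r°</s1>', toy_transcode))
# Ruci-ruce r°
```

Under this assumption, the capital survives only in the `<s1>` form, because its text never passes through the slp1 transcoder.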

@gasyoun
Member

gasyoun commented Jan 14, 2024

@Andhrabharati, eagle-eyed you remain. I was not aware that there are hundreds (if not thousands) of such places across many cdsl texts.

@Andhrabharati

ruci-ruce r°
(It so happens in ruci-ruce that this is same in both slp1 and iast.)

And in print (mw-iast), there is no capitalization.

Wonder how this initial-CAP letter skipped Jim's eye--

image

It is not a print-change here!!

And I am referring to the cases like
image
image
where proper nouns (names) are the 'entries'.

Compare 179772 rocanA
Śālmali Always rendered as IAST Śālmali in current cdsl displays.

image

I am aware of the <s1 slp1= notation of CDSL for the words 'inside' the body portion, but that is altogether a different matter.

Finally, it may be recalled that MW print has all the <H2> entry words rendered with initial-CAPs, but they need not be considered as initial-CAPs, unless they are denoting proper nouns.

@Andhrabharati

@gasyoun

You had started the topic when you were 10 years younger.

If you take a step forward (by showing the result of what you said "you're ready to do"), Jim probably would not mind 'adapting' the transcoder files, as he mentioned back then.

@funderburkjim
Contributor

Wonder how this initial-CAP letter skipped Jim's eye-

Jim must not have looked at the scans!

@Andhrabharati

Wonder how this initial-CAP letter skipped Jim's eye-

Jim must not have looked at the scans!

Without looking at the printed text, how could you say this:

And in print (mw-iast), there is no capitalization.

@funderburkjim
Contributor

'adapting' the transcoder files

Someone (maybe @artanat ?) needs to fill the role of transcoding expert at cdsl.

Peter's site https://sanskritlibrary.org/transcodeText.html provides an implementation of the {x} feature. See the screenshot in the next comment. I don't know where, or whether, Peter has documented the {x} feature specification.

Peter's site is based on Ralph Bunker's Java code, and on xml transcoding file formats devised by Malcolm, Peter, and Ralph.

The cdsl PHP and Python implementations (including the format of the input transcoding xml files) were made by me based on Bunker's early work. Ralph and Peter later revised their system, so the {x} feature (and probably other features) is not present in the cdsl system.

@funderburkjim
Contributor

Example of {x} feature at sanskritlibrary

image

@drdhaval2785 changed the title from "mw:179760" to "Capitalization in Proper Names (earlier was mw:179760)" Jan 24, 2024