SKD digitization (Devanagari version) #11

funderburkjim · 2021-01-09T03:47:43Z

@Shalu411 Hi!

I've made a version of the digitization of skd that you requested.
[Did you think I had forgotten?]

Currently, the version is a sample of the first 10,000 lines, and it is
skd_deva_sample.txt

Take a look, and see if this sample is what you were requesting.
Or, tell me of any problems.

When you give the go-ahead, I'll generate the whole dictionary in a similar way.

Shalu411 · 2021-01-10T02:59:28Z

Namaste Jim

[Did you think I had forgotten?]

You and forget!!? Not even in a dream! I am glad we have it at the right time. Thank you so much.
I have seen the sample. It should do!

Now, how should I note down an error?
For example, I see one right here (starred):
<L>2<pc>1-001<k1>अ<k2>अ
अ¦, व्य, अभाबः । अल्पः ।
It should be अभावः.
What format should I use to record it? Please tell with one example.
Thanks
--Shalu

drdhaval2785 · 2021-01-10T05:25:43Z

One piece of friendly advice, @Shalu411.
Don't try to change b / v errors. Otherwise you will end up writing SKD and VCP afresh.

gasyoun · 2021-01-10T07:33:52Z

Don't try to change b / v errors.

Ignoring them is not a good idea either. But there must be thousands of them.

end up writing SKD and VCP afresh.

For our digital purposes that might not be such a bad idea at all, at least at the alternate-headword level. Maybe just generate all words with v as b and vice versa, @drdhaval2785 ?
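
A rough sketch of what generating b/v alternates could look like for Devanagari text. This is only an illustration, not code from the repository: the helper name `bv_variants` is hypothetical, and it simply tries every combination of ब and व at each position where either letter occurs.

```python
from itertools import product

BA = "\u092c"  # ब
VA = "\u0935"  # व

def bv_variants(word):
    """Return every spelling obtained by swapping ब and व in any positions.

    The original spelling is excluded. A word with n such letters yields
    2**n - 1 variants, so this is only practical for short headwords.
    """
    # At each position, allow both letters if the character is ब or व;
    # otherwise keep the character unchanged.
    choices = [(BA, VA) if ch in (BA, VA) else (ch,) for ch in word]
    variants = {"".join(combo) for combo in product(*choices)}
    variants.discard(word)
    return sorted(variants)

# The example from this thread: the only variant of अभाबः is अभावः.
print(bv_variants("अभाबः"))
```

Such variants would only be candidates for alternate headwords; deciding which spelling is correct still needs a human judgment or a check against the scan.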

Shalu411 · 2021-01-10T09:52:20Z

Hariom.
OK. Assuming it is a mistake that I have to note down:
Can I note the errors this way?
Method 1) Give the whole technical detail of the word:
<L>2<pc>1-001<k1>अ<k2>
अभाबः >> अभावः

OR Method 2) just with the L-code?
L=2 अभाबः >> अभावः
Please guide me.

How is Sampada doing it?

@drdhaval2785 Can you please provide me the list of suspicious head-words / words in SKD?
Thanks

funderburkjim · 2021-01-10T18:28:02Z

Please tell with one example.

An error file: skd_error.txt

In preparation, make an 'skd_error.txt' file where the changes are detailed.
Within skd_error.txt, make a line for each change. The format of such a line would be, by example,
2:अ:अभाबः अभावः

There are 4 fields separated by colons, almost like your example. The 4 fields are:

L-code: the Cologne record number
k1: the headword
old: the word that needs to be corrected
new: the correction

If you want to make a comment in the skd_error.txt file, insert one or more lines after the above 4-field
correction line, and start each of the comment lines with a semicolon.
You can add extra blank lines if you want.

These formatting details are consistent with the xxx_error1.txt files that Sampada and Anna have
been using.
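
For illustration, a file in this format could be read with a few lines of Python. This is a minimal sketch under stated assumptions, not the script actually used for the xxx_error1.txt files: the names `Correction` and `parse_error_file` are hypothetical, all four fields are assumed to be delimited by the same separator, and comment lines are attached to the correction line that precedes them.

```python
from typing import NamedTuple

class Correction(NamedTuple):
    lcode: str     # Cologne record number
    k1: str        # headword
    old: str       # word that needs to be corrected
    new: str       # the correction
    comments: list # text of the ';' lines that follow this correction

def parse_error_file(text, sep=":"):
    """Parse correction lines, ';' comment lines and blank lines."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # extra blank lines are allowed
        if line.startswith(";"):
            # A comment belongs to the preceding correction line;
            # comments before the first correction are ignored here.
            if records:
                records[-1].comments.append(line[1:].strip())
            continue
        fields = line.split(sep)
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
        records.append(Correction(*fields, comments=[]))
    return records
```

The separator is a parameter so that a different delimiter can be tried without changing the parser. Note that the Devanagari visarga (U+0903) is a different character from the ASCII colon (U+003A), so a program splitting on ':' is not confused by it even though the two look alike to a human reader.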

change the digitization

You should also change the digitization directly (currently, for this preliminary trial, this digitization file
is named skd_deva_sample.txt).

So incorporate the changes directly.
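
For anyone who prefers to script such an edit rather than make it by hand, here is a small sketch. It assumes that each entry in the digitization begins with a metaline of the form `<L>...<pc>...` (as in the example quoted earlier in this thread) and runs until the next such metaline; the function name `apply_correction` is hypothetical.

```python
import re

def apply_correction(lines, lcode, old, new):
    """Replace `old` with `new` inside the entry whose L-code is `lcode`.

    Returns the new list of lines and the number of replacements made,
    so the caller can check that exactly the expected change happened.
    """
    out = []
    inside = False
    hits = 0
    for line in lines:
        m = re.match(r"<L>([^<]+)<pc>", line)
        if m:
            # A metaline starts a new entry; metalines are never modified.
            inside = (m.group(1) == lcode)
        elif inside and old in line:
            hits += line.count(old)
            line = line.replace(old, new)
        out.append(line)
    return out, hits
```

Checking the returned count (normally 1) is a cheap guard against changing a word in more places than intended.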

funderburkjim · 2021-01-10T18:43:19Z

अभाबः -> अभावः

I am very much in favor of this change. I think @Shalu411 is experienced enough in Sanskrit to make a
reliable judgment in such cases.

There are 3 situations that might have led to 'aBAbaH' in the skd digitization:

1. The scanned image clearly shows 'b' and the typist who did the digitization accurately entered 'b'.
2. The scanned image clearly shows 'v' and the typist erroneously entered 'b'.
3. The scanned image is unclear, and the typist entered 'b'.

In the present case, I would say case 2 applies.

@Shalu411 If you go to the trouble of examining the scanned image, and happen to notice a
case of type '1' (i.e. a case where your change definitely disagrees with the scan), then you should
make a comment in skd_error.txt of the form '; scan error'.

However, I am not saying that you should examine the scanned image in every change,
as this extra scanned image examination may be more time-consuming than it is worth.

funderburkjim · 2021-01-10T18:45:15Z

@Shalu411

Did you clone the SKD repository? Are you using git or Github desktop?

gasyoun · 2021-01-10T20:04:53Z

2:अ:अभाबः अभावः

Oh, these visargas that look like :.

Did you clone the SKD repository?

Not yet, she will need my help.

Github desktop

She will. Let the whole converted file come?

In the present case, I would say case 2 applies

Agree.

funderburkjim · 2021-01-10T20:09:27Z

Oh, these visargas that look like :

Good point. Maybe use '#' instead?

Let the whole converted file come?

Let's work a while with the sample file that is there. Once the procedural steps are ironed out,
we can go to a full skd_deva.txt.

she will need my help.

Thanks!

gasyoun · 2021-01-11T03:08:35Z

Good point. Maybe use '#' instead?

Let us give '#' a try?

Once the procedural steps are ironed out, we can go to a full skd_deva.txt.

Sure, so be it.

gasyoun · 2021-01-13T12:20:49Z

Usha pushed an update from GitHub Desktop. Is it as it should be, @drdhaval2785 @funderburkjim ?

e8686f3

Shalu411 · 2021-01-13T12:45:48Z

Hariom
Hearty thanks, Mark, for the support and guidance.
Jim, once you confirm, I am ready to carry on with the corrections.

funderburkjim · 2021-01-14T03:34:10Z

Usha: I confirm that you pushed properly; I can see the 1 change you made.

BUT please wait before making further changes.

I am having problems with inverting the Devanagari back to slp1, and need to get that problem
ironed out. Will aim to solve this problem tomorrow. The problem relates to the candrabindu
when it comes after an 'o' but is not part of the Om character ॐ. In slp1, o~ is supposed to represent ॐ.

But there are several instances in the skd digitization, such as ठोँट under headword aDaraH, that
also have 'o~' in slp1. These are what are causing the problems at the moment.

Shalu411 · 2021-01-14T04:57:09Z

Namaste
BUT please wait before making further changes.
Sure Jim!
@drdhaval2785 Can you help with the o~ issue?

gasyoun · 2021-01-14T05:44:56Z

But there are several instances in the skd digitization, such as ठोँट under headword aDaraH, that
also have 'o~' in slp1. These are what are causing the problems at the moment.

As rare as it can get. Jim, you are our fortress.

funderburkjim · 2021-01-14T20:43:42Z

As rare as it can get

This was discovered by applying a principle of invertibility. Here:

our base digitization is in SLP1 spelling : skd.txt
A conversion was made to use Devanagari spelling: skd_deva.txt
To incorporate the changes Shalu makes to skd_deva.txt, we need to convert skd_deva.txt
back to skd_slp1.txt.
And it should be that if NO changes were made to skd_deva.txt, then the round trip
skd.txt -> skd_deva.txt -> skd_slp1.txt should result in skd.txt identical to skd_slp1.txt.
The problem was noticed while investigating WHY, in an earlier version of the transcoding,
skd.txt was NOT the same as skd_slp1.txt.
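
The invertibility check described above can be sketched generically as follows. This is an illustration only, not the actual transcoding scripts: `round_trip_mismatches` is a hypothetical helper, and the two toy functions are a deliberately lossy stand-in for the real SLP1 and Devanagari transcoders.

```python
def round_trip_mismatches(lines, forward, backward):
    """Return (line_number, original, restored) for each line that does
    not survive forward conversion followed by backward conversion."""
    mismatches = []
    for n, line in enumerate(lines, start=1):
        restored = backward(forward(line))
        if restored != line:
            mismatches.append((n, line, restored))
    return mismatches

# Toy transcoders (NOT the real ones): the forward step collapses two
# different spellings into one symbol, so the backward step cannot know
# which spelling it started from.
def toy_forward(s):
    return s.replace("o~", "<OM>").replace("oM", "<OM>")

def toy_backward(s):
    return s.replace("<OM>", "o~")

sample = ["ka", "o~", "oM"]
print(round_trip_mismatches(sample, toy_forward, toy_backward))
# prints [(3, 'oM', 'o~')]
```

An empty result on the unmodified file is what makes it safe to convert edited Devanagari text back to SLP1: any remaining difference must then come from an intentional correction.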

This problem now has a satisfactory solution.

You can see that skd_deva_sample was changed in two lines from what Usha had,
by looking at this commit difference.

funderburkjim · 2021-01-14T20:52:37Z

@Shalu411

Ready for you to pull this repository and continue with changes to skd_deva_sample.txt.

ALSO, I made a file 'skd_error.txt' where you should document simple changes, such as the
first 'aBAbaH' one you made.

By 'simple change', I mean spelling errors like 'aBAbaH'.

More complex errors (like missing a headword which you mentioned elsewhere) will need to have
special handling -- meaning that probably I need to do the actual change to skd_deva_sample.txt
rather than you for the complex errors.
You can describe such complex cases as comments in the skd_error.txt file.

gasyoun · 2021-01-14T22:31:32Z

Ready for you to pull this repository and continue with changes to skd_deva_sample.txt.

Good news for India.

gasyoun added the help wanted label Jan 9, 2021

gasyoun changed the title ~~Devanagari version skd digitization~~ SKD digitization (Devanagari version) Jan 13, 2021
