In the hwnorm1c.txt file, there is one line for each normalized spelling.
For instance, the first line and its explanation:
a:a:AP,AP90,BEN,BHS,BOP,BUR,CAE,CCS,GRA,GST,MCI,MD,MW,MW72,PD,PE,PW,PWG,SCH,SHS,SKD,STC,VCP,WIL,YAT;aM:PD,SKD;aH:PD,SKD
The line has two parts, separated by the first colon:
first part: a (the normalized spelling)
second part: a:AP,AP90,BEN,BHS,BOP,BUR,CAE,CCS,GRA,GST,MCI,MD,MW,MW72,PD,PE,PW,PWG,SCH,SHS,SKD,STC,VCP,WIL,YAT;aM:PD,SKD;aH:PD,SKD
The second part is a sequence of sub-parts, separated by semicolons. There are three such sub-parts in
this example:
1. a:AP,AP90,BEN,BHS,BOP,BUR,CAE,CCS,GRA,GST,MCI,MD,MW,MW72,PD,PE,PW,PWG,SCH,SHS,SKD,STC,VCP,WIL,YAT
2. aM:PD,SKD
3. aH:PD,SKD
Each of these three sub-parts itself has two colon-separated parts:
a non-normalized spelling,
and a comma-separated list of the dictionaries that have a headword with this
spelling.
Here's another example:
uBayavat:uBayavat:MW;uBayavant:PW,PWG
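The line format described above can be split with a few string operations. The following is only a minimal sketch, not the code in distrib.py; the function name `parse_hwnorm1c_line` and the returned dictionary shape are assumptions made for illustration.

```python
def parse_hwnorm1c_line(line):
    """Split one hwnorm1c.txt line into (normalized spelling, spellings).

    `spellings` maps each non-normalized spelling to the list of
    dictionary codes that have a headword with that spelling.
    Hypothetical helper; the names are not taken from distrib.py.
    """
    line = line.rstrip("\r\n")
    # The first colon separates the normalized spelling from the rest.
    normkey, rest = line.split(":", 1)
    spellings = {}
    # Sub-parts are semicolon-separated; each is "spelling:DICT1,DICT2,...".
    for part in rest.split(";"):
        spelling, dictlist = part.split(":", 1)
        spellings[spelling] = dictlist.split(",")
    return normkey, spellings


normkey, spellings = parse_hwnorm1c_line("uBayavat:uBayavat:MW;uBayavant:PW,PWG")
print(normkey)    # uBayavat
print(spellings)  # {'uBayavat': ['MW'], 'uBayavant': ['PW', 'PWG']}
```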
The distrib.py program provides code to parse the lines of hwnorm1c.txt into a series of HWnormc objects.
distrib.txt counts how many normalized spellings occur in exactly one dictionary, exactly two dictionaries, etc.
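A count of this kind could be produced roughly as follows. This is a sketch under assumptions, not the actual distrib.py: the function name `dictionary_count_distribution` is hypothetical, and it assumes that a dictionary is counted once per normalized spelling even if it appears under several non-normalized spellings.

```python
from collections import Counter


def dictionary_count_distribution(lines):
    """Map N -> number of normalized spellings found in exactly N dictionaries.

    Hypothetical helper; assumes the hwnorm1c.txt line format
    "normkey:spelling:D1,D2;spelling:D3,..." described above.
    """
    distribution = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _normkey, rest = line.split(":", 1)
        dictionaries = set()
        for part in rest.split(";"):
            _spelling, dictlist = part.split(":", 1)
            dictionaries.update(dictlist.split(","))
        # Assumption: count distinct dictionaries per normalized spelling.
        distribution[len(dictionaries)] += 1
    return distribution


sample = [
    "uBayavat:uBayavat:MW;uBayavant:PW,PWG",
]
print(dictionary_count_distribution(sample))  # Counter({3: 1})
```

To run it over the real file, one could pass an open file object, for example `dictionary_count_distribution(open("hwnorm1c.txt", encoding="utf-8"))`; the encoding here is an assumption.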
The normalize_key function in hwnorm1c.py is used to compute a normalized spelling for any given headword. All spellings are in SLP1 transliteration.
Here is an explanation of the current normalization rules, as copied from here:
hwnorm1 normalization rules
These rules are independent of the dictionary.
Use homorganic nasal rather than anusvara
normalize so that 'rxx' is 'rx' (similarly, fxx is fx)
ending 'aM' is 'a'
ending 'aH' is 'a'
ending 'uH' is 'u'
ending 'iH' is 'i'
'ttr' is 'tr' (pattra v. patra)
ending 'ant' is 'at'
'cC' is 'C' (Jan 27, 2015)
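As an illustration, the rules above might be expressed with regular expressions along the following lines. This is a rough sketch, not the actual normalize_key function from hwnorm1c.py: the name `normalize_key_sketch`, the order in which the rules are applied, the set of consonants treated as stops for the homorganic nasal rule, and the reading of 'rxx' as "r followed by any doubled character" are all assumptions.

```python
import re

# SLP1 homorganic nasal for each class of stops (assumed mapping):
# velar -> N, palatal -> Y, retroflex -> R, dental -> n, labial -> m.
_NASAL_FOR = {}
for _nasal, _stops in (("N", "kKgG"), ("Y", "cCjJ"), ("R", "wWqQ"),
                       ("n", "tTdD"), ("m", "pPbB")):
    for _c in _stops:
        _NASAL_FOR[_c] = _nasal


def normalize_key_sketch(key):
    """Apply the listed normalization rules to an SLP1 headword (sketch)."""
    # Use homorganic nasal rather than anusvara (M) before a stop.
    key = re.sub(r"M([kKgGcCjJwWqQtTdDpPbB])",
                 lambda m: _NASAL_FOR[m.group(1)] + m.group(1), key)
    # 'rxx' is 'rx' (similarly 'fxx' is 'fx'): collapse a doubled
    # character directly after r or f.
    key = re.sub(r"([rf])(.)\2", r"\1\2", key)
    # Ending 'aM' or 'aH' is 'a'; ending 'uH' is 'u'; ending 'iH' is 'i'.
    key = re.sub(r"(aM|aH)$", "a", key)
    key = re.sub(r"uH$", "u", key)
    key = re.sub(r"iH$", "i", key)
    # 'ttr' is 'tr' (pattra v. patra).
    key = key.replace("ttr", "tr")
    # Ending 'ant' is 'at'.
    key = re.sub(r"ant$", "at", key)
    # 'cC' is 'C'.
    key = key.replace("cC", "C")
    return key


print(normalize_key_sketch("uBayavant"))  # uBayavat
print(normalize_key_sketch("pattra"))     # patra
```

With these assumptions, `aM` and `aH` both normalize to `a`, which matches the first line of hwnorm1c.txt shown earlier.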
The ejf/hwnorm1c directory initially contains the normalization program used in the Cologne hwnorm1 display.