correction aggregation #243

Closed
funderburkjim opened this issue Jan 21, 2016 · 3 comments

Comments

@funderburkjim
Contributor

For convenience in installation, I am aggregating the corrections from several issues:

#180 , #186, #229, #230, #232, #233, #239, #240, #241

This issue exists only to identify the aggregation.

I'll post the aggregated list, in standard form, when the installation is finished.

@gasyoun
Member

gasyoun commented Jan 21, 2016

A list of lists of errors: how far we have come!
Just 2 years ago there was no PWK available for download.
Now we no longer solve one error per week, but one batch per month 👍
This is what I call Sanskrit NLP. The rest is just sand castles.

@funderburkjim
Contributor Author

@gasyoun Nice comment, reminding us of the wider perspective.

@funderburkjim
Contributor Author

Corrections installed.

Here is the aggregated change file, in case it might be needed sometime for reference.

; Ref: https://github.com/sanskrit-lexicon/CORRECTIONS/issues/243
; multiple correction issues
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/180
;   See issue-189 for programs
; headword changes
; Dhaval
; format: mw72:Olapoi:Olapi:t: 
sch:aQaMQolayat:aQaMQolayat:n: A questionable form
pw:anugrahakat:anugrahakft:t:
stc:udGat:udGaw:t:Unclear print
cae:ohabrahmat:ohabrahman:p:
ccs:caturviSat:caturviMSat:t:Bad print
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/186
;
pd:anvA\hartavE\:anvAhartavE:t:remove accents in key1
pd:aDaHSAKAcaturTA~Sa:aDaHSAKAcaturTAMSa:t:
pd:aDarmA~Sa:aDarmAMSa:t:
pd:aDarmA~SodBava:aDarmAMSodBava:t:
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/229
;
pui:agnijiHva,76:agnijihva:p:Hv
stc:atiduhSrava,427:atiduHSrava:t:hS
sch:adfztAgraha,1579:adfzwAgraha:t:zt
ap:aprAJa,3202:aprAjYa:t:AJ
mw72:araksha,5699:arakzas:t:sh
ccs:asaQfSa,2301:asadfSa:t:Qf
bur:asahtya,1889:asahya:p:The meaning mandates that it has to be asahya. There is no word like asahtya.
ccs:asAQfSya,2347:asAdfSya:t:Qf
ccs:asmAQfSa,2416:asmAdfSa:t:Qf
mw:iLas,28690:iLas:n:iL
cae:ILitf,5096:ILitf:n:Li
cae:ILeRya,5098:ILeRya:n:Le
mw:ILenya,29730:ILenya:n:Le
mw:evAsha,40240.2:evAza:t:sh
pui:kanakAHvaya,2432:kanakAhvaya:p:Hv-wrong dot below h.
acc:kandALayArya,39549:kandALayArya:n:AL
pui:kAUraya,2854:kAUraya:n:AU-requires investigation from source.
mw:kAciLindika,47335.1:kAciLindika:n:iL,Li
yat:kfSAnusAtBUta,10905:kfSAnusAtBUta:n:tB-compound
pui:kOztikI,3734:kOzwikI:p:zt is not possible in Sanskrit grammar.
ben:krILi,3949:krILi:n:Li
bur:klfp,5045:kxp:t:lf
bop:kzti,2580:kziti:t:zt
ieg:ganDashastimAqa,7653:ganDahastimAqa:p:sh
pui:garutmaTfdayA,4086:garutmathfdayA:t:Tf
mw:gahsnazWa,64492.1:gahanezWa:t:hs
sch:goLikA,12492:goLikA:n:oL,Li
bur:caKO,6337:caKO:n:KO
pui:catuhSiras,4590:catuHSiras:t:hS-missing dot below h.
pui:catuhsana,4591:catuHsana:t:hs-missing dot below h.
ieg:cash,1180:cash:n:sh-Anglisized word
acc:CAjurAU,7605:CAjurAU:n:AU-Proper name
bur:jihILire,7280:jihILire:n:Li
ccs:taLat,8066:taLit:t:
cae:taLit,11764:taLit:n:aL,Li
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/230
;
ccs:tAQfkza,8419:tAdfkza:t:Qf
ccs:tAQfgguRa,8420:tAdfgguRa:t:Qf
ccs:tAQfgrUpa,8421:tAdfgrUpa:t:Qf
ccs:tAQfgviDa,8422:tAdfgviDa:t:Qf
ccs:tAQfS,8423:tAdfS:t:Qf
;acc:tirunaLavAqImAhAtmya,9064:tirunaLavAqImAhAtmya:n:aL
ieg:diestruck,7611:diestruck:n:ck-English word
mw:pacCasCaHSasya,115173.1:pacCaHSasya:t:sC
ieg:pAdishAh,3931:pAdiSAh:p:sh
mw:puroLAS,126061.1:puroLAS:n:oL,LA
bur:preLe,12279:preLe:n:Le
ben:preshyatA,10112:prezyatA:t:sh
mw:baLA,142406.1:baLA:n:aL,LA
mw:bA|a,144241.1:bA|a:n:A|
inm:mahAdaMztra,6742:mahAdaMzwra:t:zt
yat:mAnAUQa,29734:mAnArUQa:t:AU
mw72:yamushadeva,38415:yamuzadeva:t:sh
acc:yoginIAzAQakfzRA,45919:yoginIAzAQakfzRA:n:IA-multiheadword
acc:rAIAka,46058:rAIAka:n:IA-Proper name
ieg:rUpIa,4947:rUpIa:n:Ia-Indian currency
mw:liNgapratizWApadDahti,182535:liNgapratizWApadDati:t:ht
stc:lfN,17685:lfN:n:lf-grammar tense
ccs:vijOavarman,22284:vijayavarman:t:Oa
mw72:vo|f,45199:vo|f:n:|f
ccs:SapaTf,24118:SapaTya:t:Tf
acc:SivamahimnHstotra,37808:SivamahimnaHstotra:t:nH
mw:SyAmasAhSaMkara,222096:SyAmasAhasaMkara:n:hS- print smudge and a typo ??
mw:zaL,225299:zaL:n:aL
mw:zoLaSa,225557:zoLaSa:n:oL
mw:zoLaSAkzara,225407:zoLaSAkzara:n:oL
;66
mw72:samaBikFt,49673:samaBikFt:n:Ft,kF
ccs:sahsrad,27095:sahsrada:t:hs
ccs:seAgra,28345:senAgra:t:eA
mw:sTf,255844:sTf:n:Tf
ieg:shAh,5442:SAh:p:sh-Persian
mw:hiL,263406.1:hIL:t:iL
;
; REF https://github.com/sanskrit-lexicon/CORRECTIONS/issues/232
;
bur:aDyAruQa,472:aDyArUQa:t:uQ is not possible grammatically.
cae:aro|a,2806:arI|a:t:the explanation reads arIQa / arI|a.
ap:AjiJAsenya,6667:AjijYAsenya:t:iJ
pe:kuraNNu,3876:kuraNNu:n:Nu-Non-Sanskrit headword
yat:QuQ,16271:QuQ:n:uQ-verb
acc:tiNantaSezasaMgraha,9001:tiNantaSezasaMgraha:n:Na-grammar word
mw:niguQaka,108089:nigUQaka:t:uQ not possible grammatically
mw:vo|ave,208142:vo|ave:n:o|
ccs:sarvAnavadyANNa,26888:sarvAnavadyANga:t:Na
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/233
;
ieg:awWAimahotsava,645:awWAimahotsava:n:Ai,wW-Prakrit
bhs:aqQatiya,272:aqQatiya:n:qQ-Pali
acc:aparAjitapfCA,860:aparAjitapfCA:n:fC
sch:apravlava,3841:apravlava:n:vl-Seems OK by alphabetic order.
vcp:abBraNkaza,3195:abBraNkaza:n:bB
mw:abliNgA,11083:abliNgA:n:bl-Compound
sch:abliRga,3958:abliNga:t:R->N
bhs:aBiBvAyatana,1586:aBiBvAyatana:n:Bv
acc:aBirAmaviAlaMkAra,982:aBirAmavidyAlaMkAra:t:iA
vcp:aByutTeTa,3695:aByutTeya:t:eT
ap90:aBvavaskaMd,3605:aByavaskaMd:p:Bv
vei:amitratapanaSuzmiRaSEbya,114:amitratapanaSuzmiRaSEbya:n:Eb
pw:asTiSOTilya,12935:asTiSETilya:t:OT
bhs:AkaqQana,2503:AkaqQana:n:qQ
mw72:AcAika,8154:AcArika:t:Ai
ieg:AdeSanEbanDika,81:AdeSanEbanDika:n:Eb
bur:AnarPa,2211:AnarPa:n:rP
ieg:ArraNkarEttevE,7186:ArraNkarEttevE:n:rr-Tamil
ieg:ArrukkAlamaYji,7188:ArrukkAlamaYji:n:rr-Tamil
ieg:ArrukkulE,7187:ArrukkulE:n:rr-Tamil
ieg:ukkuwWI,6143:ukkuwWI:n:wW-Prakrit
bhs:uccagGana,3250:uccagGana:n:gG
mw:unnateCa,33796.1:unnatecCa:t:eC
ccs:uparfpaka,3601:uparUpaka:t:rf
ieg:ulavukAwci,7488:ulavukAwci:n:wc-Tamil
ccs:ekarfpa,3961:ekarUpa:t:rf
acc:ekAkzarIbEw,2992:ekAkzarIbEw:n:bE
ccs:evarfpa,4059:evaMrUpa:t:rf
ieg:Oreus,664:Oreus:n:eu
ieg:kaqQaka,2415:kaqQaka:n:qQ
ccs:kaTaMrfpa,4276:kaTaMrUpa:t:rf
acc:kapPinAByudaya,3251:kapPinAByudaya:n:pP
ieg:karuvUlavari,7275:karuvUlavari:n:vU-Tamil
ieg:kAwci,7278:kAwci:n:wc-Tamil
ieg:kAwciyeradukkASu,7279:kAwciyeradukkASu:n:wc-Tamil
ccs:kAmarfpa,4704:kAmarUpa:t:rf
acc:kAlaniRaryacaMdrikAlaGvI,3956:kAlaniRaryacaMdrikAlaGvI:n:Gv
ieg:kIrruvari,7290:kIrruvari:n:rr-Tamil
ieg:kudirEmArru,7305:kudirEmArru:n:rr-Tamil
ieg:kurrunel,7313:kurrunel:n:rr-Tamil
ieg:kUrrariSi,7312:kUrrariSi:n:rr-Tamil
acc:kfCracAndrAyaRalakzaRa,4522:kfCracAndrAyaRalakzaRa:n:fC
acc:kfzRaBawwaArqe,4657:kfzRaBawwaArqe:n:rq
acc:kOTumi,5156:kOTumi:n:OT
mw:kOTumI,57150:kOTumI:n:OT
mw:krILu,58213:krILu:n:Lu,IL
sch:kzEbya,11741:kzEbya:n:Eb
ieg:KaRqAit,2713:KaRqAit:n:Ai
krm:Kerf,363:Korf:t:e->o
ieg:ganjwar,1918:ganjwar:n:jw-Persian
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/239
;
acc:gAyatryAfziCandodevatAnukramaRikA,39805:gAyatryAfziCandodevatAnukramaRikA:n:Af
yat:gOT,13613:graT:t:OT
mw:caturfdDipAdacaraRatalasupratizWita,71196:caturfdDipAdacaraRatalasupratizWita:n:rf
acc:catuScaraRakalaSAhUAnapadDati,7036:catuScaraRakalaSAhvAnapadDati:t:UA and hvA are orthographically similar
pwg:jajAiRa,74108:jajAiRa:n:Ai
pwg:jajjAiRa,74111:jajjAiRa:n:Ai
acc:jayapfCADikAra,33054:jayapfCADikAra:n:fC
acc:jayasiMhasavAI,7919:jayasiMhasavAI:n:AI
bur:juhvUrzAmi,7364:juhvUrzAmi:n:vU
pwg:jYOdanIy,120050:jYOdanIy:n:YO
pw:jYOdAnIy,43446:jYOdAnIy:n:YO
pw:JaRaJJaRiti,43695:JaRaJJaRiti:n:JJ
vcp:JUli,21685:JUli:n:JU
ap:JUziRI,15901:JUziRI:n:JU
sch:JolikA,14105:JolikA:n:Jo
pwg:tAIkadeSa,74938:tAIkadeSa:n:AI
ccs:trivarfTa,9070:trivaruTa:t:rf
acc:trizKalaSAnti,9422:trizKalaSAnti:n:zK
acc:ToAka,9484:ToAka:n:oA
acc:ToAnanda,9485:ToAnanda:n:oA
vei:dakzakAtyAyaniAtreya,1239:dakzakAtyAyaniAtreya:n:m
pwg:devaheuna,34729:devaheqana:t:It is an alternate to devaleLana. So, it must be 'qa'. 'qa' and 'u' are orthographically similar in PWG fonts.
acc:dyuBvAdipAdasyadurvAdijayahAlAhalI,10449:dyuBvAdipAdasyadurvAdijayahAlAhalI:n:Bv
acc:dvAdaSAheCAvAkaprayoga,33783:dvAdaSAheCAvAkaprayoga:n:eC
pw:DUHzawkavaha,55697:DUHzawkavaha:n:UH
pwg:DoI,37049:DoI:n:oI
bur:nirfRAmi,9936:nirfRAmi:n:rf
vcp:neulI,29699:neulI:n:eu
acc:nErftalakzaRavicAra,44497:nErftalakzaRavicAra:n:rf
vcp:nErfteyA,29830:nErfteyA:n:rf
mw72:paryuqBfta,28442:paryudBfta:t:qB
acc:pAiyalacCInAmamAlA,13252:pAiyalacCInAmamAlA:n:Ai
vcp:pEWinasi,33276:pEWinasi:n:EW
ccs:pOruhvUta,15060:pOruhUta:t:vU
pw:prAptiSOTilya,74435:prAptiSETilya:t:OT
mw:prAvfq,140303:prAvfq:n:fq
vcp:prAvfqatTaya,34782:prAvfqatyaya:t:fq
pui:bEjaBft,9209:bEjaBft:n:bE
mw:bEjavApayana,146455:bEjavApAyana:t:
mw:bEqAlikarRikanTa,146478:bEqAlikarRikanTa:n:bE
pwg:bEtAlin,95709:bEtAlin:n:bE
yat:blekza,37040:vlekza:t:It is found in 'v' headwords
inm:BarataSrezWO,1349:BarataSrezWO:n:WO
acc:BAIBawwa,35209:BAIBawwa:n:AI
acc:maDuparkakOTumaSAKIya,45443:maDuparkakOTumaSAKIya:n:OT
bur:marImfqye,13083:marImfqye:n:fq
bur:mimarqizAmi,13365:mimarqizAmi:n:rq
mw:mI|a,164586.1:mI|a:n:I|,|a
mw:mI|u,164588:mI|u:n:|u,I|
acc:mUtrakfCracikitsAdi,35832:mUtrakfCracikitsAdi:n:fC
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/240
;
sch:mfqIcI,21934:mfqIcI:n:fq
acc:rAmakfzRadIkzitanAhnABAI,19864:rAmakfzRadIkzitanAhnABAI:n:AI
mw:reL,179371.1:reL:n:eL
pw:vaBluka,98268:vaBluka:n:Bl
; ap:varvU,28644:varvUraH:t:varvU(vu)raH wrongly converted to varvU only. raH is trimmed as it is separated by a parenthesis. Handled in hw1.py
vei:vAfzRivfdDa,2906:vArzRivfdDa:t:Af
bur:vAgGIna,14962:vAgGIna:n:gG
bhs:vikaqQate,13599:vikaqQate:n:qQ
acc:viwWalanAmastotra,41108:viwWalanAmastotra:n:wW
acc:viwWaleSAzwaka,46618:viwWaleSAzwaka:n:wW
acc:viwWaleSvaracintanaprakAra,46620:viwWaleSvaracintanaprakAra:n:wW
acc:viwWaleSvarAzwottaraSata,46621:viwWaleSvarAzwottaraSata:n:wW
ap90:viWaMka,25862:viWaMka:n:iW
bhs:viWapayati,13771:viWapayati:n:iW
yat:viqbarAha,34609:viqbarAha:n:b-v issue of Bengal kind words.
pe:viBvA,8238:viBvA:n:Bv
bur:vuvUrzAmi,15992:vuvUrzAmi:n:vU
bhs:vUdagra,14394:vUdagra:n:vU
skd:vfhaqQakkA,33931:vfhaqQakkA:n:qQ
pw:vyAsaviwWalAcArya,108816:vyAsaviwWalAcArya:n:wW
mw:vyU|a,210304:vyU|a:n:U|,|a
mw:vyU|acCandas,210304.25:vyU|acCandas:n:U|,|a
bur:vlepayAmi,16478:vlepayAmi:n:vl
pwg:SaMvUka,97000:SaMvUka:n:vU
pe:SAivAla,5949:SAivAla:n:Ai
pui:SEbjA,14745:SEbjA:n:Eb
inm:SEbyasugrIvavAhana,2221:SEbyasugrIvavAhana:n:Eb
inm:SEbyAtmaja,2222:SEbyAtmaja:n:Eb
mw:zawcakraBedavivftiwIkA,224975.40:zawcakraBedavivftiwIkA:n:wc
acc:zaqAmnAyazaqdarSanasaMkzepavAda,26412:zaqAmnAyazaqdarSanasaMkzepavAda:n:qd
mw:zaqAmnAyazaqdaSanasaMkzepavAda,225129:zaqAmnAyazaqdaSanasaMkzepavAda:n:qd
bhs:saMSfKalA,15446:saMSfKalA:n:fK
mw72:saqBAva,48948:sadBAva:t:qB
mw:samyagGuta,237361:samyagGuta:n:gG
mw:sAhityakOTUhala,243677:sAhityakOtUhala:t:OT
acc:sidDArTapfCA,28217:sidDArTapfCA:n:fC
pui:sudUGamuKI,16119:sudUGamuKI:n:UG
pw:susTeTa,128459:susTeya:p:PWG has susTeya with the same meaning.
pui:sEbalkA,16751:sEbalkA:n:Eb
inm:skandoPyAna,9967:skandopAKyAna:t:Py - print smudge
vcp:sPyASlizwejyADikaraRa,47422:sPyASlizwejyADikaraRa:n:Py
acc:svayaMBvagnimAhAtmya,39026:svayaMBvagnimAhAtmya:n:Bv
pwg:hareu,116370:hareu:n:eu
mw:hiteCA,262925:hiteCA:n:eC
mw:heL,263977.1:heL:n:eL
;
; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/241
;
acc:aRREyaAcArya,255:aRREyaAcArya:n:RE
acc:aRREyapaRqita,256:aRREyapaRqita:n:RE
pw:atridagja,2351:atridagja:n:gj
pw:anuzwupkArmIRa,4904:anuzwupkArmIRa:n:pk
ccs:arTayvavahAra,1827:arTavyavahAra:t:yv
pw:arDANI,9739:arDANgI:t:NI
acc:azwESvaryaPala,39397:azwESvaryaPala:n:wE
pe:uccESSravas,7650:uccESSravas:n:SS
bop:upaspfrSA,1466:upasparSa:t:fr
; also
bop:upasAna,1464:upasTAna:t: paper wrinkle in scan
bop:upasita,1465:upasTita:t: paper wrinkle in scan
inm:etAvarRO,3903:etAvarRO:n:RO
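
For anyone who wants to read the change file above programmatically, here is a minimal Python sketch. It assumes what the records themselves suggest: lines starting with `;` are comments, and each record has the colon-separated form `dict:old[,L-number]:new:code:comment`, as in the `; format:` line near the top. The names `Correction`, `parse_line`, and `changed_headwords` are hypothetical, chosen for illustration; this is not the program actually used to install the corrections (the file itself points to issue-189 for programs). The meaning of the code letters (`t`, `p`, `n`) is not spelled out in this thread, so the sketch simply stores them as-is.

```python
from typing import Iterable, List, NamedTuple, Optional


class Correction(NamedTuple):
    dictionary: str       # dictionary code, e.g. "mw", "ccs", "pui"
    old: str              # headword as currently spelled
    lnum: Optional[str]   # record number after the comma, kept as text ("40240.2")
    new: str              # proposed headword
    code: str             # status letter as found in the file (t, p, n)
    comment: str          # free-text remark, possibly empty


def parse_line(line: str) -> Optional[Correction]:
    """Parse one line of the change file; return None for comments and blanks."""
    line = line.rstrip("\n")
    if not line.strip() or line.lstrip().startswith(";"):
        return None
    # Split on the first four colons only, so a colon inside the
    # trailing comment field (if any) is preserved.
    parts = line.split(":", 4)
    if len(parts) < 4:
        raise ValueError(f"not a correction record: {line!r}")
    if len(parts) == 4:
        parts.append("")
    dictionary, old_field, new, code, comment = parts
    # The old headword may carry a record number: "agnijiHva,76".
    old, sep, lnum = old_field.partition(",")
    return Correction(dictionary, old, lnum if sep else None,
                      new, code, comment.strip())


def changed_headwords(lines: Iterable[str]) -> List[Correction]:
    """Return only the records whose headword actually changes."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and r.old != r.new]


if __name__ == "__main__":
    sample = [
        "; Ref https://github.com/sanskrit-lexicon/CORRECTIONS/issues/229",
        "pui:agnijiHva,76:agnijihva:p:Hv",
        "mw:iLas,28690:iLas:n:iL",
        "pw:anugrahakat:anugrahakft:t:",
    ]
    for rec in changed_headwords(sample):
        print(rec.dictionary, rec.old, "->", rec.new)
```

Keeping the record number as text rather than converting it to a number avoids losing values such as `40240.2` or `210304.25`, which appear in the `mw` records.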
