verbs01 #1

Open
funderburkjim opened this issue Apr 9, 2020 · 11 comments
Labels
documentation Improvements or additions to documentation

Comments

@funderburkjim
Contributor

The verbs01 directory aims

  • to identify the entries in the Burnouf dictionary which are verbs, and
  • to provide a correspondence between the headwords of these entries and verb entries of the Monier-Williams dictionary.

The comments here will focus on the bur_preverb1 report.
bur_preverb1_deva is a Devanagari version of the report.

Currently, 5815 of the 19774 entries of Burnouf are identified as verbs.
The bur_preverb1 report organizes these verbal entries according to their relation to MW verbs.

There are 1736 different groups, based on the correspondence to MW (unprefixed) verbs.

@funderburkjim funderburkjim added the documentation Improvements or additions to documentation label Apr 9, 2020
@funderburkjim
Contributor Author

Description of group in bur_preverb1 report

Here is the group related to MW root 'Bram':

; Verb 1017: Bram (1 uninflected, 4 inflected, 3 prefix entries)
  L=12860 k1=Bram       code=*R  mw=Bram,verb
  L=12416 k1=baBrAmi    code=R   mw=Bram,verb
  L=12423 k1=bamBramye  code=Aug mw=Bram,verb
  L=12535 k1=biBramizAmi code=Des mw=Bram,verb
  L=12876 k1=BrAmayAmi  code=R   mw=Bram,verb
  L=3002  k1=udBramAmi  code=D   mw=udBram,preverb,ud+Bram
  L=10652 k1=pariBramAmi code=D   mw=pariBram,preverb,pari+Bram
  L=15614 k1=viBramAmi  code=D   mw=viBram,preverb,vi+Bram
  • First appears the entry in Burnouf for the bare root 'Bram', as
    indicated by the '*' (code=*R); this is the '1 uninflected' entry.
  • Next appear 4 entries whose headword is believed to be an inflected
    form of the non-prefixed verb:
    • baBrAmi, bamBramye, biBramizAmi, BrAmayAmi.
  • Finally appear 3 entries believed to be prefixed verb forms based on 'Bram':
    • udBramAmi, pariBramAmi, viBramAmi.
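
A minimal sketch of how a group like this could be read programmatically, assuming the header and entry lines have exactly the layout shown above. The names `parse_group` and `count_kinds` and the regular expression are illustrative assumptions; they are not taken from the verbs01 code.

```python
import re

# Assumed header layout, e.g.
# "; Verb 1017: Bram (1 uninflected, 4 inflected, 3 prefix entries)"
# The singular "entry" alternative is a guess, not seen in the sample.
HEADER_RE = re.compile(
    r"^; Verb (\d+): (\S+) "
    r"\((\d+) uninflected, (\d+) inflected, (\d+) prefix entr(?:y|ies)\)$"
)

def parse_group(lines):
    """Parse one group: a header line followed by 'key=value' entry lines."""
    m = HEADER_RE.match(lines[0].strip())
    if m is None:
        raise ValueError("unexpected header: %r" % lines[0])
    entries = []
    for line in lines[1:]:
        if not line.strip():
            continue
        # Each whitespace-separated token looks like 'L=12860' or 'mw=Bram,verb'.
        entries.append(dict(tok.split("=", 1) for tok in line.split()))
    return {
        "number": int(m.group(1)),
        "root": m.group(2),
        "declared": tuple(int(m.group(i)) for i in (3, 4, 5)),
        "entries": entries,
    }

def count_kinds(entries):
    """Recount (uninflected, inflected, prefixed) from the entry fields."""
    uninflected = sum(1 for e in entries if e["code"].startswith("*"))
    prefixed = sum(1 for e in entries if "preverb" in e["mw"].split(","))
    inflected = len(entries) - uninflected - prefixed
    return (uninflected, inflected, prefixed)

SAMPLE = """\
; Verb 1017: Bram (1 uninflected, 4 inflected, 3 prefix entries)
  L=12860 k1=Bram       code=*R  mw=Bram,verb
  L=12416 k1=baBrAmi    code=R   mw=Bram,verb
  L=12423 k1=bamBramye  code=Aug mw=Bram,verb
  L=12535 k1=biBramizAmi code=Des mw=Bram,verb
  L=12876 k1=BrAmayAmi  code=R   mw=Bram,verb
  L=3002  k1=udBramAmi  code=D   mw=udBram,preverb,ud+Bram
  L=10652 k1=pariBramAmi code=D   mw=pariBram,preverb,pari+Bram
  L=15614 k1=viBramAmi  code=D   mw=viBram,preverb,vi+Bram
"""

group = parse_group(SAMPLE.splitlines())
```

Comparing `count_kinds(group["entries"])` with `group["declared"]` gives a quick consistency check between the header counts and the entry lines.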

@funderburkjim
Contributor Author

uninflected v. inflected verb entries

For simplicity, let's say there are about 6000 Burnouf entries identified as 'verbs'.
Based on our analysis, these may be thought of as three subsets, each with about
2000 entries:

  • uninflected roots. These are identified in the print by having an asterisk preceding the headword:
    image
    • In the digitization, this is identified by '*' in the 'key2' field:
      • <L>12860<pc>482,1<k1>Bram<k2>*Bram
    • There are only a handful of '*' headwords which are not verbs. I count 9 of these. See
      bur_verb_exclude
  • prefixed verb entries. The entry headword is invariably an inflected form; and almost always
    this is the first person singular present tense form.
  • inflected non-prefixed verb entries. The entry headword is almost always a first person singular.
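
As a rough sketch, the '*' test on the 'key2' field might look like the following. It assumes the metaline layout of the sample line above (`<L>…<pc>…<k1>…<k2>…`); the function names and the `excluded_ids` parameter (standing in for the handful of non-verb '*' headwords listed in bur_verb_exclude) are hypothetical.

```python
import re

# Assumed metaline layout: <L>id<pc>page,col<k1>key1<k2>key2
META_RE = re.compile(
    r"^<L>(?P<L>[^<]+)<pc>(?P<pc>[^<]+)<k1>(?P<k1>[^<]+)<k2>(?P<k2>[^<]+)"
)

def parse_metaline(line):
    """Return the metaline fields as a dict, or None if the line does not match."""
    m = META_RE.match(line.strip())
    return m.groupdict() if m else None

def is_uninflected_root(meta, excluded_ids=frozenset()):
    """True if key2 carries the '*' marker and the entry is not excluded.

    excluded_ids is a hypothetical set of L-numbers for '*' headwords
    that are known not to be verbs.
    """
    return "*" in meta["k2"] and meta["L"] not in excluded_ids

meta = parse_metaline("<L>12860<pc>482,1<k1>Bram<k2>*Bram")
```

Here `is_uninflected_root(meta)` is true for the 'Bram' entry, and becomes false if its L-number is passed in `excluded_ids`.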

@funderburkjim
Contributor Author

funderburkjim commented Apr 9, 2020

varieties of inflected headwords

About 2/3 of the verb entries have headwords which are inflected forms, usually a first person singular.
The bur_preverb1 report identifies the form of some of these:

  • 403 Aug (Intensive)
  • 423 Des (Desiderative)
  • 451 C (Causal)
  • 57 F (Simple Future)

Other forms which the bur_preverb1 report does NOT currently identify but which are identified
by Burnouf include:

  • p. Perfect
  • f1. Periphrastic future
  • ps. Passive
  • a1 First aorist
  • a2 Second aorist
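
Counts like those given for Aug, Des, C and F could be recomputed by tallying the `code=` field over the report lines. The sketch below assumes the line layout of the 'Bram' group shown earlier; `tally_codes` is an illustrative name, not a function from the repository.

```python
import re
from collections import Counter

# Assumed: every entry line carries exactly one 'code=...' field.
CODE_RE = re.compile(r"\bcode=(\S+)")

def tally_codes(report_lines):
    """Count how often each code value occurs; header lines are skipped."""
    counts = Counter()
    for line in report_lines:
        m = CODE_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    "; Verb 1017: Bram (1 uninflected, 4 inflected, 3 prefix entries)",
    "  L=12860 k1=Bram       code=*R  mw=Bram,verb",
    "  L=12416 k1=baBrAmi    code=R   mw=Bram,verb",
    "  L=12423 k1=bamBramye  code=Aug mw=Bram,verb",
    "  L=12535 k1=biBramizAmi code=Des mw=Bram,verb",
    "  L=12876 k1=BrAmayAmi  code=R   mw=Bram,verb",
    "  L=3002  k1=udBramAmi  code=D   mw=udBram,preverb,ud+Bram",
    "  L=10652 k1=pariBramAmi code=D   mw=pariBram,preverb,pari+Bram",
    "  L=15614 k1=viBramAmi  code=D   mw=viBram,preverb,vi+Bram",
]

counts = tally_codes(sample)
```

The starred code `*R` is kept distinct from `R` here, so bare roots and inflected forms are counted separately.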

@funderburkjim
Contributor Author

missed verbs

The entries corresponding to inflected verb forms are hard to find programmatically.
The strategy used was to examine headwords with various endings. The endings examined thus
far are:

  • 'Ami': most common. 1st person singular present tense active voice of conjugation classes 1, 4, 6, and 10, and of derived forms (like Desiderative, Intensive)
  • 'omi': less common, like 'karomi'
  • 'mi' (not preceded by 'o')
  • 'e': 1st person singular present tense middle voice
  • 'am': 1st person singular imperfect tense
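
The ending test can be sketched as a simple suffix check on the SLP1 headword. This is only one reading of the list above: the order of the tests (so that 'Ami' and 'omi' are tried before plain 'mi') and the name `ending_class` are assumptions, not the logic of the actual program.

```python
def ending_class(k1):
    """Return which of the examined endings the headword has, or None.

    Longer endings are tested first so that 'Ami' and 'omi' are not
    swallowed by the plain 'mi' case.
    """
    for ending in ("Ami", "omi", "mi", "e", "am"):
        if k1.endswith(ending):
            return ending
    return None

examples = ["viBramAmi", "karomi", "bamBramye", "akarot", "akarizi", "Bram"]
classes = {k1: ending_class(k1) for k1 in examples}
```

Note that the bare root 'Bram' matches the 'am' ending, so a match only makes a headword a candidate; headwords such as 'akarot' and 'akarizi' match none of the endings and are missed.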

It is known that some verb entries are missing because the inflected verb form headword has
some other ending not examined yet. For examples I just noticed
akarot, akarizi, anOt, proT (sigh!).

Maybe someone else can fill in the gaps, or maybe I will do so at some future time.

@funderburkjim
Contributor Author

Uses of inflected headwords.

The inflected headwords of Burnouf could prove useful in extending the coverage and checking the accuracy of the inflected verb forms computed in https://github.com/sanskrit-lexicon/csl-inflect/.

@gasyoun
Member

gasyoun commented Apr 10, 2020

> 5815 of the 19774 entries of Burnouf are identified as verbs.

So not dhatus, but dhatus + verbs?

> three subsets, each with about 2000 entries

Thanks for the clarification.

> bur_preverb1 report does NOT currently identify but which are identified by Burnouf include:

Means no tag is there?

> Maybe someone else can fill in the gaps, or maybe I will do so at some future time.

So we have no list of suspicious words? Or can we have the list, however big?

> The inflected headwords of Burnouf could prove useful in extending the coverage and checking the accuracy of the inflected verb forms computed in https://github.com/sanskrit-lexicon/csl-inflect/.

I would not spend time on that. But I have an idea. Here is a list of all the possible verb forms as per Huet, can we use it? https://yadi.sk/i/lvmYk0Y7Te4MrQ

@funderburkjim
Contributor Author

> bur_preverb1 report does NOT currently identify ...

This comment suggests a limitation of the 'code' values (R, Des,...) that could be enhanced.
For instance, 'baBrAmi' above has code=R. The entry in Burnouf is
babhrāmi (red. de bhram) aller; errer;. Comparing with MW 'Bram' entry, baBrAmi is
probably the 1s perfect tense form. So a better coding for 'baBrAmi' would be 'perfect'.

Similarly, 'BrAmayAmi' shows code=R, but a better coding would be code=C (Causal).

> we have no list of suspicious words?

I was pointing out that there remain some headwords in Burnouf that

  1. should be identified as verbs (like akarot) but which
  2. do not appear in the report.

In other words, there are some verb entries in BUR that the current analysis has missed.
I don't currently have a systematic way to find these. akarot is just one that I happened
to notice by chance.

Huet verb forms.

Yes, that's a good idea to make better use of Huet's work. I think the best entry point
would be to include his forms in the csl-inflect repository.

@gasyoun
Member

gasyoun commented Apr 13, 2020

> I think the best entry point would be to include his forms in the csl-inflect repository.

And the quickest way to find and add tags to verb forms. You know where to look for the XML files with all the forms on Git, don't you?

@funderburkjim
Contributor Author

I think so. Am planning to try to include Huet's aorist forms as a starter.

@gasyoun
Member

gasyoun commented Apr 20, 2020

> Am planning to try to include Huet's aorist forms as a starter.

Hope it does not take much time. Because I think it does not make much sense.

@drdhaval2785

sanskrit-lexicon/COLOGNE#228 was an old entry regarding verb identification in BUR. Referencing it here.
