Ben Chauvette edited this page Aug 4, 2015 · 3 revisions


Marking Up Examples

Leipzig.js is flexible when it comes to what underlying tags you use to mark up your glosses. For semantic reasons, I like to use <p> tags in a <div>, e.g.

<div data-gloss>
  <p>ein Beispiel</p>
  <p>DET.NOM.N.SG example</p>
  <p>‘an example’</p>
</div>

You can also mark it up as a list, and Leipzig.js will add the aligned words as an <li> element:

<ul data-gloss>
  <li>ein Beispiel</li>
  <li>DET.NOM.N.SG example</li>
  <li>‘an example’</li>
</ul>

To make the parser treat multiple words as a single unit, surround the words with curly braces, e.g.:

<div data-gloss>
  <p>El perrito está comiendo.</p>
  <p>the {little dog} is eating.</p>
</div>
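
To illustrate the grouping rule, here is a minimal sketch in plain JavaScript that does not use Leipzig.js itself. The helper names tokenize and alignWords are hypothetical, and the pattern is simply the two default lexers joined with |; the library's internal implementation may differ.

```javascript
// Hypothetical helpers for illustration; not part of the Leipzig.js API.
// The pattern matches either a {braced group} or a run of non-space characters.
var LEXER = /{(.*?)}|([^\s]+)/g;

function tokenize(line) {
  var tokens = [];
  var match;
  LEXER.lastIndex = 0; // reset, because the global regex is reused between calls
  while ((match = LEXER.exec(line)) !== null) {
    // match[1] is set for a braced group, match[2] for a plain word
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

// Pair up the nth token of every line to form one aligned "word" column.
function alignWords(lines) {
  var tokenLines = lines.map(tokenize);
  var width = Math.max.apply(null, tokenLines.map(function (t) { return t.length; }));
  var columns = [];
  for (var i = 0; i < width; i++) {
    columns.push(tokenLines.map(function (t) {
      return t[i] !== undefined ? t[i] : '';
    }));
  }
  return columns;
}

var columns = alignWords(['El perrito está comiendo.', 'the {little dog} is eating.']);
console.log(columns);
// -> [['El', 'the'], ['perrito', 'little dog'], ['está', 'is'], ['comiendo.', 'eating.']]
```

Because the braces are consumed by the first pattern, "little dog" ends up in the same column as "perrito" instead of shifting every later word by one position.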

Leipzig()

Leipzig([selector : String|NodeList|Element], [config : Object] ) -> Function

Leipzig.js takes two optional arguments during construction:

  1. selector, which tells Leipzig.js which elements to gloss
  2. config, a plain JavaScript object for configuration

Neither argument is required when creating a new Leipzig.js object, and if no arguments are provided, then Leipzig.js will use the default configuration, listed below.

Defaults

Leipzig.js defaults to a three-line glossing pattern, where the first two lines are word-aligned, and the last line is a non-aligned free translation.

The default configuration is the following:

var leipzig = Leipzig({
  selector: '[data-gloss]',
  lastLineFree: true,
  firstLineOrig: false,
  spacing: true,
  autoTag: true,
  async: false,
  lexers: [
    '{(.*?)}',
    '([^\\s]+)'
  ],
  events: {
    beforeGloss: 'gloss:beforeGloss',
    afterGloss: 'gloss:afterGloss',
    beforeLex: 'gloss:beforeLex',
    afterLex: 'gloss:afterLex',
    beforeAlign: 'gloss:beforeAlign',
    afterAlign: 'gloss:afterAlign',
    beforeFormat: 'gloss:beforeFormat',
    afterFormat: 'gloss:afterFormat',
    start: 'gloss:start',
    complete: 'gloss:complete'
  },
  classes: {
    glossed: 'gloss--glossed',
    noSpace: 'gloss--no-space',
    words: 'gloss__words',
    word: 'gloss__word',
    spacer: 'gloss__word--spacer',
    abbr: 'gloss__abbr',
    line: 'gloss__line',
    lineNum: 'gloss__line--',
    original: 'gloss__line--original',
    freeTranslation: 'gloss__line--free',
    noAlign: 'gloss__line--no-align',
    hidden: 'gloss__line--hidden'
  },
  abbreviations: {...} // See Leipzig.abbreviations section
});

When configuring Leipzig.js, you only need to specify the options that you want to change. All other options will retain their default values.
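
The "only specify what you change" behaviour can be pictured as overlaying your options on the defaults. The sketch below illustrates that general idea and is not the library's merging code: mergeConfig, isPlainObject, and the trimmed-down DEFAULTS object are hypothetical, and merging nested objects such as classes key by key is an assumption.

```javascript
// Hypothetical sketch: overlay user options on a trimmed-down copy of the defaults.
var DEFAULTS = {
  selector: '[data-gloss]',
  lastLineFree: true,
  firstLineOrig: false,
  spacing: true,
  classes: { glossed: 'gloss--glossed', word: 'gloss__word' }
};

function isPlainObject(value) {
  return Object.prototype.toString.call(value) === '[object Object]';
}

function mergeConfig(defaults, options) {
  var result = {};
  // Start from a copy of the defaults (nested objects are copied one level deep).
  Object.keys(defaults).forEach(function (key) {
    result[key] = isPlainObject(defaults[key])
      ? Object.assign({}, defaults[key])
      : defaults[key];
  });
  // Overlay the user's options.
  Object.keys(options || {}).forEach(function (key) {
    if (isPlainObject(options[key]) && isPlainObject(result[key])) {
      // Assumption: nested objects are merged key by key
      Object.assign(result[key], options[key]);
    } else {
      result[key] = options[key];
    }
  });
  return result;
}

var config = mergeConfig(DEFAULTS, { spacing: false, classes: { word: 'my-word' } });
console.log(config.spacing);         // -> false
console.log(config.lastLineFree);    // -> true
console.log(config.classes.word);    // -> my-word
console.log(config.classes.glossed); // -> gloss--glossed
```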

Back to Top ↑


config.selector

   Type : String | NodeList | Element
Default : '[data-gloss]'

This option configures which elements the Leipzig.js glosser will operate on. You can set it either by passing it as the first argument when initializing Leipzig.js, or by setting the selector option in the configuration object:

// Two ways of saying the same thing
var leipzig = Leipzig('[data-gloss]');
var leipzig = Leipzig({ selector: '[data-gloss]' });

The selector option can be a String, a NodeList, or an Element.

If the selector argument is a String, Leipzig.js will internally run document.querySelectorAll() using the specified string, and the glosser will operate on the list of DOM elements it returns.

Likewise, if selector is an Element or a NodeList, the glosser will operate on the provided DOM element(s).
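
The three accepted selector types could be normalized into a single array of elements roughly as follows. This is only a sketch: resolveSelector is a hypothetical helper, the doc parameter stands in for the browser's document so that the snippet also runs outside a browser, and the real library may perform different checks.

```javascript
// Hypothetical helper; not part of the Leipzig.js API.
function resolveSelector(selector, doc) {
  var slice = Array.prototype.slice;
  if (typeof selector === 'string') {
    // String: look the elements up with querySelectorAll
    return slice.call(doc.querySelectorAll(selector));
  }
  if (selector && selector.nodeType === 1) {
    // Single Element (nodeType 1 is ELEMENT_NODE)
    return [selector];
  }
  if (selector && typeof selector.length === 'number') {
    // NodeList, or any other array-like collection of elements
    return slice.call(selector);
  }
  return [];
}

// Browser usage (assumes a page containing [data-gloss] elements):
// var elements = resolveSelector('[data-gloss]', document);
```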

Back to Top ↑


config.lastLineFree

   Type : Boolean
Default : true

Leipzig.js can automatically mark the last line in a gloss as a non-aligned free translation. Doing so will add a special class to the last line (.gloss__line--free by default), and cause the line to be excluded from word alignment.

This behavior is controlled by the lastLineFree configuration option, and is enabled by default.

To disable automatically marking the last line as a free translation, set lastLineFree to false when initializing Leipzig.js:

var leipzig = Leipzig({ lastLineFree: false });

If you turn this option off, you can still mark a line as a free translation by adding the free translation CSS class (gloss__line--free by default) to the underlying HTML:

<p class="gloss__line--free">‘The little dog is eating.’</p>

Back to Top ↑


config.firstLineOrig

   Type : Boolean
Default : false

Leipzig.js can also automatically mark the first line in a gloss as non-aligned original-language text. Doing so will add a special class to the first line (.gloss__line--original by default), and cause the line to be excluded from word alignment.

This behavior is useful in cases where the line being glossed is long, or if the original language is not usually written with spaces, e.g. Japanese.

This behavior is controlled by the firstLineOrig configuration option, and is disabled by default.

To enable automatically parsing the first line as original text, set firstLineOrig to true when initializing Leipzig.js:

var leipzig = Leipzig({ firstLineOrig: true });

If firstLineOrig is disabled, you can still mark a line as original text by adding the original text CSS class (gloss__line--original by default) to the underlying HTML:

<p class="gloss__line--original">太陽が昇る。</p>
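
Taken together, lastLineFree and firstLineOrig decide which lines take part in word alignment. The following simplified sketch models that decision for lines given as plain objects with a className string. The classifyLines and hasClass functions are hypothetical and are not library code; the romanized and glossed lines in the sample data are added here purely for illustration.

```javascript
// Hypothetical model of how lines are sorted into roles; not the library's own code.
var CLASSES = {
  original: 'gloss__line--original',
  freeTranslation: 'gloss__line--free',
  noAlign: 'gloss__line--no-align'
};

function hasClass(line, name) {
  return (' ' + (line.className || '') + ' ').indexOf(' ' + name + ' ') !== -1;
}

function classifyLines(lines, config) {
  return lines.map(function (line, index) {
    var isFirst = index === 0;
    var isLast = index === lines.length - 1;
    // Automatic detection (config flags) or a manually added class
    if ((config.firstLineOrig && isFirst) || hasClass(line, CLASSES.original)) {
      return 'original';
    }
    if ((config.lastLineFree && isLast) || hasClass(line, CLASSES.freeTranslation)) {
      return 'free';
    }
    if (hasClass(line, CLASSES.noAlign)) {
      return 'no-align';
    }
    return 'aligned';
  });
}

var lines = [
  { text: '太陽が昇る。', className: '' },
  { text: 'taiyō ga noboru', className: '' },
  { text: 'sun NOM rise', className: '' },
  { text: '‘The sun rises.’', className: '' }
];

console.log(classifyLines(lines, { firstLineOrig: true, lastLineFree: true }));
// -> ['original', 'aligned', 'aligned', 'free']
console.log(classifyLines(lines, { firstLineOrig: false, lastLineFree: true }));
// -> ['aligned', 'aligned', 'aligned', 'free']
```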

Back to Top ↑


config.spacing

   Type : Boolean
Default : true

The default Leipzig.js styling includes small horizontal spacing at glossed word boundaries. For highly agglutinative languages, this behavior may not be ideal, because glossed phrases are likely to contain many morphemes in one word.

To remove this automatic spacing, you can set the spacing option to false when initializing Leipzig.js:

var leipzig = Leipzig({ spacing: false });

This will add an additional class to the gloss container (.gloss--no-space by default), which removes the horizontal space.

If spacing is set to false, you can still indicate spaces between words by adding an empty group ({}) where the space should be on each aligned line, e.g.:

<div id="ainu">
  <p>Usaopuspe aeyaykotuymasiramsuypa.</p>
  <p>usa- opuspe {} a- e- yay- ko- tuyma- si- ram- suy -pa</p>
  <p>various- rumors {} 1SG- APPL- REFL- APPL- far- REFL- heart- sway -ITER</p>
  <p>‘I wonder about various rumors.’</p>
</div>
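
With the default lexer patterns, an empty group {} lexes to an empty token, which is presumably what lets it act as a spacer. The snippet below checks this with a hypothetical tokenize helper built from those patterns; it is a sketch and not the library's own lexer.

```javascript
// Hypothetical helper built from the default lexer patterns.
function tokenize(line) {
  var lexer = /{(.*?)}|([^\s]+)/g;
  var tokens = [];
  var match;
  while ((match = lexer.exec(line)) !== null) {
    // A braced group yields its contents, which is '' for {}
    tokens.push(match[1] !== undefined ? match[1] : match[2]);
  }
  return tokens;
}

var morphemes = tokenize('usa- opuspe {} a- e- yay- ko- tuyma- si- ram- suy -pa');
var glosses = tokenize('various- rumors {} 1SG- APPL- REFL- APPL- far- REFL- heart- sway -ITER');

console.log(morphemes.length === glosses.length); // -> true
console.log(morphemes.indexOf(''));               // -> 2
console.log(glosses[2] === '');                   // -> true
```

Both aligned lines need the {} at the same position; otherwise the tokens after it would no longer line up.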

Back to Top ↑


config.autoTag

   Type : Boolean
Default : true

By default, Leipzig.js will try to wrap morphemic glosses in <abbr> tags. Beginning with the second line of the aligned lines, the parser looks for the following types of morphemes to tag:

  1. Numbers 1 through 4, corresponding to possible person morphemes;
  2. Sequences of ≥1 uppercase letter(s), e.g. N, SG, or PST.

The parser attempts to assign a title attribute to any matches by looking for a matching key in the Leipzig.abbreviations object. This object contains key-value pairs based on the standard abbreviations of the Leipzig Glossing Rules.

You can customize the definitions by adding or modifying the keys and values on the Leipzig.abbreviations object. For example, the following code changes the definition of COMP from complementizer to comparative:

var leipzig = Leipzig();
leipzig.abbreviations.COMP = 'comparative';
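
The tagging rule described above can be approximated with a single regular expression. The sketch below is an illustration only: tagAbbreviations is a hypothetical function, the small ABBREVIATIONS object is a stand-in for the full dictionary, and the exact markup the library emits may differ.

```javascript
// Hypothetical sketch of the tagging rule; not the library's code.
var ABBREVIATIONS = {
  1: 'first person',
  SG: 'singular',
  SBJ: 'subject',
  IND: 'indicative'
};

function tagAbbreviations(gloss, abbreviations) {
  // Match a digit 1-4, or a run of one or more uppercase letters
  return gloss.replace(/[1-4]|[A-Z]+/g, function (abbr) {
    var known = Object.prototype.hasOwnProperty.call(abbreviations, abbr);
    // Only add a title when the dictionary has a definition
    var title = known ? ' title="' + abbreviations[abbr] + '"' : '';
    return '<abbr class="gloss__abbr"' + title + '>' + abbr + '</abbr>';
  });
}

console.log(tagAbbreviations('1SG.SBJ-', ABBREVIATIONS));
// -> <abbr class="gloss__abbr" title="first person">1</abbr><abbr class="gloss__abbr" title="singular">SG</abbr>.<abbr class="gloss__abbr" title="subject">SBJ</abbr>-
console.log(tagAbbreviations('love', ABBREVIATIONS));
// -> love
```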

Back to Top ↑


config.async

   Type : Boolean
Default : false

Leipzig.js runs synchronously by default, and normally this is fine. However, when running synchronously, the browser will wait for Leipzig.js to finish before attending to the needs of other scripts and browser events. This means that if you have a large number of glosses on a page, the user experience might start to suffer.

To remedy this, you can set the async option to true, which will cause Leipzig.js to run (somewhat) asynchronously.

You can use the optional callback to Leipzig#gloss() to perform actions when the glossing has been completed.
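
One common way for a script to run "somewhat" asynchronously is to handle one item per timer tick, so the browser can do other work in between. The sketch below shows that general pattern with a hypothetical processAsync function; it is not taken from the Leipzig.js source, which may schedule its work differently.

```javascript
// Hypothetical pattern: handle one item per setTimeout tick, then call back.
function processAsync(items, worker, callback) {
  var index = 0;

  function next() {
    if (index >= items.length) {
      callback(null, items); // all done: error-first callback with no error
      return;
    }
    try {
      worker(items[index]);
    } catch (err) {
      callback(err);
      return;
    }
    index += 1;
    setTimeout(next, 0); // yield to the event loop before the next item
  }

  setTimeout(next, 0);
}

var log = [];
processAsync(['gloss 1', 'gloss 2'], function (item) {
  log.push('processed ' + item);
}, function (err) {
  log.push(err ? 'failed' : 'complete');
  console.log(log.join(', '));
});
log.push('scheduled');

// -> scheduled, processed gloss 1, processed gloss 2, complete
```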

Back to Top ↑


config.lexers

   Type : Array<String> | String | RegExp
Default : ['{(.*?)}', '([^\\s]+)']

This option controls how Leipzig breaks lines into aligned words.

If passed a String or an Array of Strings, Leipzig will convert them into a RegExp object used for lexing the lines. The following configurations produce the same lexer:

// Array<String>
var leipzig = Leipzig({ lexers: ['{(.*?)}', '([^\\s]+)'] });

// String
var leipzig = Leipzig({ lexers: '{(.*?)}|([^\\s]+)' });

// RegExp
var leipzig = Leipzig({ lexers: /{(.*?)}|([^\s]+)/g });
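
The equivalence of the three forms can be checked in plain JavaScript. The toLexer helper below is hypothetical and shows only one plausible way to normalize the option (joining an array with | and compiling it with the global flag); the library's own conversion is not shown on this page.

```javascript
// Hypothetical helper showing one way the three forms could be normalized.
function toLexer(lexers) {
  if (lexers instanceof RegExp) {
    return lexers;
  }
  // An array of pattern strings becomes one alternation
  var source = Array.isArray(lexers) ? lexers.join('|') : lexers;
  return new RegExp(source, 'g');
}

var fromArray = toLexer(['{(.*?)}', '([^\\s]+)']);
var fromString = toLexer('{(.*?)}|([^\\s]+)');
var fromRegExp = toLexer(/{(.*?)}|([^\s]+)/g);

console.log(fromArray.source === fromRegExp.source);  // -> true
console.log(fromString.source === fromRegExp.source); // -> true
console.log('the {little dog} is'.match(fromArray));  // -> ['the', '{little dog}', 'is']
```

Note that the backslash is doubled in the string forms because it has to survive JavaScript string escaping before it reaches the RegExp constructor.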

Back to Top ↑


config.events

   Type : Object
Default : { beforeGloss:  'gloss:beforeGloss',
            afterGloss:   'gloss:afterGloss',
            beforeLex:    'gloss:beforeLex',
            afterLex:     'gloss:afterLex',
            beforeAlign:  'gloss:beforeAlign',
            afterAlign:   'gloss:afterAlign',
            beforeFormat: 'gloss:beforeFormat',
            afterFormat:  'gloss:afterFormat',
            start:        'gloss:start',
            complete:     'gloss:complete' }

Leipzig.js triggers certain events during the glossing process. You can act on these events by creating an event listener before calling the glosser:

var leipzig = Leipzig();
document.addEventListener('gloss:complete', function(event) {
  console.log('Glossing complete!');
});
leipzig.gloss();

// -> Glossing complete!

Following the DOM Custom Event API, some of the events have detail objects, which contain additional information about the event. For events without detail objects, you should be able to access all relevant information from various methods on event.target.

You can customize the event names by passing a plain JavaScript object to the events key on your config object, e.g.:

var leipzig = Leipzig({
  events: { complete: 'newComplete' }
});

config.events.start

Event Name : 'gloss:start'
  Triggers : Before glossing the first Element
    Detail : {
               glosses: NodeList // Elements to be glossed
             }

config.events.complete

Event Name : 'gloss:complete'
  Triggers : After glossing every Element
    Detail : {
               glosses: NodeList // Elements that were glossed
             }

config.events.beforeGloss

Event Name : 'gloss:beforeGloss'
  Triggers : Before each Element is glossed
    Detail : --

config.events.afterGloss

Event Name : 'gloss:afterGloss'
  Triggers : After each Element is glossed
    Detail : --

config.events.beforeLex

Event Name : 'gloss:beforeLex'
  Triggers : Before lexing each line
    Detail : {
               lineNum: Number       // Index of line being lexed
             }

config.events.afterLex

Event Name : 'gloss:afterLex'
  Triggers : After lexing each line
    Detail : {
               lineNum: Number,      // Index of line that was lexed
               tokens: Array<String> // Resulting tokens
             }

config.events.beforeAlign

Event Name : 'gloss:beforeAlign'
  Triggers : Before aligning lexed lines
    Detail : {
               firstLineNum: Number, // Index of first line being aligned
               lastLineNum: Number,  // Index of last line being aligned
               lines: Array<Array<String>> // Lines of tokens to be aligned
             }

config.events.afterAlign

Event Name : 'gloss:afterAlign'
  Triggers : After aligning lexed lines
    Detail : {
               firstLineNum: Number, // Index of first line that was aligned
               lastLineNum: Number,  // Index of last line that was aligned
               lines: Array<Array<String>> // Lines of aligned tokens
             }

config.events.beforeFormat

Event Name : 'gloss:beforeFormat'
  Triggers : Before formatting aligned lines
    Detail : {
               firstLineNum: Number, // Index of first line being formatted
               lastLineNum: Number,  // Index of last line being formatted
               lines: Array<Array<String>> // Lines of aligned tokens
             }

config.events.afterFormat

Event Name : 'gloss:afterFormat'
  Triggers : After formatting aligned lines
    Detail : {
               firstLineNum: Number, // Index of first line that was formatted
               lastLineNum: Number   // Index of last line that was formatted
             }
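
As an example of reading a detail object, the sketch below formats the tokens reported by gloss:afterLex. The describeLex function is hypothetical, and listening on document assumes that this event can be observed there in the same way as the gloss:complete example earlier in this section; the stand-in event object at the end only mimics the documented detail shape.

```javascript
// Hypothetical handler: summarize the tokens reported by gloss:afterLex.
function describeLex(event) {
  var detail = event.detail;
  return 'Line ' + detail.lineNum + ': ' + detail.tokens.join(' | ');
}

// In a browser, register the listener before calling leipzig.gloss().
if (typeof document !== 'undefined') {
  document.addEventListener('gloss:afterLex', function (event) {
    console.log(describeLex(event));
  });
}

// Outside a browser, the handler can be tried with a stand-in event object.
var sample = { detail: { lineNum: 1, tokens: ['the', 'little dog', 'is', 'eating.'] } };
console.log(describeLex(sample));
// -> Line 1: the | little dog | is | eating.
```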

Back to Top ↑


config.classes

   Type : Object
Default : // See Table Below

Leipzig.js adds a number of CSS classes to the final gloss, which you can use to style your glosses. The names of these classes can be configured by changing the settings on the classes object within the configuration object.

The names, meaning, and default values of the classes are as follows:

Option           Default                 Description
glossed          gloss--glossed          Added to each element in selector after the glosser has finished
noSpace          gloss--no-space         Added to each element in selector when the spacing option is set to false
words            gloss__words            Added to the group of words that are aligned
word             gloss__word             Added to each word in the group of aligned words
spacer           gloss__word--spacer     Added to empty words when spacing is false
abbr             gloss__abbr             Added to morpheme abbreviations by the auto-tagger
line             gloss__line             Added to each visible line in the gloss
lineNum          gloss__line--           Added to each visible line in the gloss. The zero-indexed line number is automatically appended to the end of this class.
freeTranslation  gloss__line--free       The free translation line
original         gloss__line--original   The original language line
noAlign          gloss__line--no-align   Can be manually added to tell the parser to skip a line when aligning words

The following example shows the class structure of what a gloss looks like after being fully parsed and formatted:

<div id="gloss--style" class="example gloss--no-space gloss--glossed">
  <p class="gloss__line gloss__line--0 gloss__line--original">Nikukonda</p>
  <div class="gloss__words">
    <div class="gloss__word">
      <p class="gloss__line gloss__line--1">Ni-</p>
      <p class="gloss__line gloss__line--2">1SG.SBJ-</p>
    </div>
    <div class="gloss__word">
      <p class="gloss__line gloss__line--1">ku-</p>
      <p class="gloss__line gloss__line--2">2SG.OBJ-</p>
    </div>
    <div class="gloss__word">
      <p class="gloss__line gloss__line--1">kond</p>
      <p class="gloss__line gloss__line--2">love</p>
    </div>
    <div class="gloss__word">
      <p class="gloss__line gloss__line--1">-a</p>
      <p class="gloss__line gloss__line--2">-IND</p>
    </div>
  </div>
  <p class="gloss__line--hidden">Ni- ku- kond -a</p>
  <p class="gloss__line--hidden">1SG.SBJ- 2SG.OBJ- love -IND</p>
  <p class="gloss__line gloss__line--3 gloss__line--free">‘I love you’</p>
  <p class="gloss__line gloss__line--4 gloss__line--no-align">Town Nyanja (Lusaka, Zambia)</p>
</div>
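
The way the line classes combine in the markup above, with lineNum acting as a prefix for the zero-indexed line number, can be reproduced with a few lines of JavaScript. The lineClassName helper and the two lookup objects are hypothetical and only mirror the default class names; this is not how the library itself is necessarily structured.

```javascript
// Hypothetical helper mirroring the default class names in the example markup.
var BASE = { line: 'gloss__line', lineNum: 'gloss__line--' };
var ROLES = {
  original: 'gloss__line--original',
  freeTranslation: 'gloss__line--free',
  noAlign: 'gloss__line--no-align'
};

function lineClassName(index, role) {
  // Every visible line gets the base class plus the numbered class
  var names = [BASE.line, BASE.lineNum + index];
  if (role && ROLES[role]) {
    names.push(ROLES[role]);
  }
  return names.join(' ');
}

console.log(lineClassName(0, 'original'));
// -> gloss__line gloss__line--0 gloss__line--original
console.log(lineClassName(1));
// -> gloss__line gloss__line--1
console.log(lineClassName(3, 'freeTranslation'));
// -> gloss__line gloss__line--3 gloss__line--free
```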

If the class names of the last three options – classes.freeTranslation, classes.original, and classes.noAlign – are manually added to the HTML markup, the lines carrying them will be skipped by the Leipzig.js glosser during parsing, and will not be word-aligned with the other text.

NB: If a line is manually skipped by adding the classes.noAlign class, it might interfere with the automated Free Translation and Original Language line detection. If this happens, you will have to manually add the relevant classes to the underlying markup.


config.abbreviations

   Type : Object
Default : // see Leipzig.abbreviations section below

If you pass in a plain JavaScript object, it will override the default auto-tagging definitions.

Back to Top ↑


Leipzig#gloss()

Leipzig.gloss([callback : Function(err, elements)]) -> Void

This method runs the glosser over the elements that were specified when initializing the Leipzig object.

It accepts an optional, error-first style callback function that will be called once all of the glosses have been completed (or the glosser encounters an error):

var leipzig = Leipzig({ async: true });
console.log('Starting gloss...');
leipzig.gloss(function(err, elements) {
  if (err) {
    console.log(err);
  }
  console.log('Glossing complete! ' + elements);
});
console.log('Glosser is running...');

// -> Starting gloss...
// -> Glosser is running...
// -> Glossing complete! [object NodeList]

This callback is especially useful if you're using the asynchronous glosser, but you can also use it with the synchronous API.
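
If you prefer Promises, the error-first callback can be wrapped in a few lines. The glossAsPromise function is a hypothetical convenience and not part of Leipzig.js; the fakeLeipzig object is a stand-in with the same gloss(callback) shape so that the sketch can run without a page.

```javascript
// Hypothetical wrapper: turn the error-first callback into a Promise.
function glossAsPromise(leipzig) {
  return new Promise(function (resolve, reject) {
    leipzig.gloss(function (err, elements) {
      if (err) {
        reject(err);
      } else {
        resolve(elements);
      }
    });
  });
}

// Stand-in object for demonstration; a real page would pass Leipzig({ async: true }).
var fakeLeipzig = {
  gloss: function (callback) {
    setTimeout(function () {
      callback(null, ['gloss 1', 'gloss 2']);
    }, 0);
  }
};

var glossing = glossAsPromise(fakeLeipzig);
glossing.then(function (elements) {
  console.log('Glossing complete! ' + elements.length + ' elements');
});

// -> Glossing complete! 2 elements
```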

Back to Top ↑


Leipzig#config()

Leipzig.config(options : Object) -> Void

This method allows you to configure Leipzig.js after initializing it. The following code snippets have the same effect:

// Setting config during initialization
var leipzig = Leipzig({ async: true });

// Setting config via Leipzig#config()
var leipzig = Leipzig();
leipzig.config({ async: true });

NB: The config method sets only the options passed in via the configuration object. Any option that is not passed in will return to its default value.

Back to Top ↑


Leipzig#addAbbreviations()

Leipzig.addAbbreviations(abbreviations : Object) -> Void

You can add multiple new abbreviations by calling Leipzig.addAbbreviations with a plain JavaScript object containing the new definitions.

The following code will (1) replace the existing definition of COMP, and (2) add a new definition for DIM:

var leipzig = Leipzig();
var newAbbreviations = {
  COMP: 'comparative',
  DIM: 'diminutive'
};

leipzig.addAbbreviations(newAbbreviations);

Back to Top ↑


Leipzig#setAbbreviations()

Leipzig.setAbbreviations(abbreviations : Object) -> Void

You can completely replace the abbreviations by calling Leipzig.setAbbreviations with a plain JavaScript object containing the new definitions.

For example, the following code will replace all existing definitions, leaving only ones for COMP and DIM:

var leipzig = Leipzig();
var newAbbreviations = {
  COMP: 'comparative',
  DIM: 'diminutive'
};

leipzig.setAbbreviations(newAbbreviations);
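
The difference between the two methods is that one merges and the other replaces. The sketch below models that with plain objects; the two functions are hypothetical stand-ins that return new objects, and the two-entry defaults object is a shortened stand-in for the full dictionary.

```javascript
// Hypothetical model of the two methods using plain objects.
function addAbbreviations(current, additions) {
  // Merge: keep existing entries, overwrite or add the given ones
  return Object.assign({}, current, additions);
}

function setAbbreviations(current, replacement) {
  // Replace: discard existing entries entirely
  return Object.assign({}, replacement);
}

var defaults = { COMP: 'complementizer', PST: 'past' };
var changes = { COMP: 'comparative', DIM: 'diminutive' };

var merged = addAbbreviations(defaults, changes);
var replaced = setAbbreviations(defaults, changes);

console.log(Object.keys(merged));   // -> ['COMP', 'PST', 'DIM']
console.log(merged.COMP);           // -> comparative
console.log(Object.keys(replaced)); // -> ['COMP', 'DIM']
```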

Back to Top ↑


Leipzig.abbreviations

Leipzig.js comes with a built-in dictionary of the standard Leipzig Glossing Rules abbreviations. The auto-tagging engine will use these definitions by default when attempting to assign title attributes to morpheme glosses.

Modifying the abbreviations

You can replace this dictionary completely by setting the abbreviations config option when initializing Leipzig:

var newAbbreviations = { ABBREVIATION: 'definition' };
var leipzig = Leipzig({ abbreviations: newAbbreviations });

Or by calling Leipzig#setAbbreviations() after Leipzig has been initialized:

var leipzig = Leipzig();
var newAbbreviations = { ABBREVIATION: 'definition' };
leipzig.setAbbreviations(newAbbreviations);

You can also add or modify specific entries in the dictionary by setting the relevant key in the abbreviations dictionary to a different value. For example, to change COMP from complementizer to comparative, you could use the following:

var leipzig = Leipzig();
leipzig.abbreviations.COMP = 'comparative';

Likewise, to add a new entry -- DIM for diminutive -- you could do:

var leipzig = Leipzig();
leipzig.abbreviations.DIM = 'diminutive';

If you would like to modify many definitions at once, you can use the Leipzig#addAbbreviations() method. The following code redefines COMP and adds a new definition for DIM:

var leipzig = Leipzig();
var newAbbreviations = {
  COMP: 'comparative',
  DIM: 'diminutive'
};

leipzig.addAbbreviations(newAbbreviations);

Default Definitions

The standard list is as follows:

Abbreviation Definition
1 first person
2 second person
3 third person
A agent-like argument of canonical transitive verb
ABL ablative
ABS absolutive
ACC accusative
ADJ adjective
ADV adverb(ial)
AGR agreement
ALL allative
ANTIP antipassive
APPL applicative
ART article
AUX auxiliary
BEN benefactive
CAUS causative
CLF classifier
COM comitative
COMP complementizer
COMPL completive
COND conditional
COP copula
CVB converb
DAT dative
DECL declarative
DEF definite
DEM demonstrative
DET determiner
DIST distal
DISTR distributive
DU dual
DUR durative
ERG ergative
EXCL exclusive
F feminine
FOC focus
FUT future
GEN genitive
IMP imperative
INCL inclusive
IND indicative
INDF indefinite
INF infinitive
INS instrumental
INTR intransitive
IPFV imperfective
IRR irrealis
LOC locative
M masculine
N neuter
NEG negation / negative
NMLZ nominalizer / nominalization
NOM nominative
OBJ object
OBL oblique
P patient-like argument of canonical transitive verb
PASS passive
PFV perfective
PL plural
POSS possessive
PRED predicative
PRF perfect
PRS present
PROG progressive
PROH prohibitive
PROX proximal / proximate
PST past
PTCP participle
PURP purposive
Q question particle / marker
QUOT quotative
RECP reciprocal
REFL reflexive
REL relative
RES resultative
S single argument of canonical intransitive verb
SBJ subject
SBJV subjunctive
SG singular
TOP topic
TR transitive
VOC vocative

Back to Top ↑