TODO.md


Features

  • (Admin) Clone Annotations

    • Required schema changes (boolean is_clone and integer cloned_from_id to keep track of source)
      • Change models in model_sqla.py
      • Add .sql in misc/ that contains SQL migration commands
    • Database function to clone annotations from one annotator to another
    • Admin GUI to perform transfer
  • (Admin) Show Annotation Progress

    • Database utility function to fetch annotation progress (utils.database.get_annotation_progress)
      • Utilize is_cloned column to ensure cloned annotation stats are not credited to the user
    • (temporary) Basic jsonify() frontend
    • Admin GUI to neatly display annotation progress
  • Accept plaintext input

    • Simple plaintext processor (regex split)
    • Stanza or some similar processors
    • Input/Output Transliteration option at the time of chapter file upload
  • Use localStorage for QoL improvements

    • (corpus) split.js percentages

    • (corpus) custom style for unconfirmed annotations from each task

      • (word-order) store unsubmitted word-order annotations (saved entities: state, unsubmitted-order, heuristic-order)
      • (word-order) add badges to describe fixed annotation areas (sentence tokens, unused area, custom tokens)
      • (word-order) display state badge to show the phase of annotation of each boundary element
    • (admin) remember last open tab

      • data management accordion
      • ontology tabs
  • Export Data

    • Every task in 2 formats
      • suitable for annotators
      • annotator visualization
      • (admin) suitable for programmers
      • decide a standard format
  • (Admin) Download (Export) Data

  • (Admin) Task Reorder/Enable/Disable

  • (Admin) View as another user

  • Improvements to "Add Token" Interface

    • Front-end with multiple fields for Analysis and Features
    • Search Functionality for "Add Token"
    • Edit Token using same interface?
  • Keep showing the TokenGraph as triplets are added

    • Core functionality - after press of a button
    • Change graph to show permanently instead of on click
  • Unify/Modularize task backend handling items? (Task-3 onwards, since most of it is repetitive code).

  • Remove task_id hard-coding

    • Task table
    • Remove task_id hard-coding in server_sqla.py api action handling
    • Next task etc using Task.order
    • Task related elements etc in JS
    • export.py hard-coding for Sentence Boundary and Token Order task
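The clone-annotations and annotation-progress items above can be sketched together, since the progress count depends on the clone flag. This is a minimal sketch using the standard-library `sqlite3` module, not the project's SQLAlchemy models: the `annotation` table, its `annotator_id`, `token_id` and `label` columns, and both function signatures are assumptions made for illustration. The list above spells the flag both `is_clone` and `is_cloned`; the sketch picks `is_clone`.

```python
import sqlite3

# Hypothetical, simplified schema. The real models live in model_sqla.py.
SCHEMA = """
CREATE TABLE annotation (
    id INTEGER PRIMARY KEY,
    annotator_id INTEGER NOT NULL,
    token_id INTEGER NOT NULL,
    label TEXT NOT NULL,
    is_clone INTEGER NOT NULL DEFAULT 0,
    cloned_from_id INTEGER REFERENCES annotation (id)
);
"""


def clone_annotations(conn, source_id, target_id):
    """Copy every annotation of source_id to target_id; return the count."""
    cursor = conn.execute(
        """
        INSERT INTO annotation
            (annotator_id, token_id, label, is_clone, cloned_from_id)
        SELECT ?, token_id, label, 1, id
        FROM annotation
        WHERE annotator_id = ?
        """,
        (target_id, source_id),
    )
    conn.commit()
    return cursor.rowcount


def get_annotation_progress(conn):
    """Count annotations per annotator, leaving out cloned rows."""
    rows = conn.execute(
        """
        SELECT annotator_id, COUNT(*)
        FROM annotation
        WHERE is_clone = 0
        GROUP BY annotator_id
        """
    )
    return dict(rows.fetchall())
```

Keeping `cloned_from_id` on each copied row preserves the link to the source annotation, so a transfer can be audited or undone later. The actual migration would be the `.sql` file in misc/ mentioned above.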

Minor

  • Custom key-events

    • Next/Previous Verse
    • Next/Previous Page
    • Submit
  • "Submit and go to next verse" ?


Implemented

  • Update Token Order task pipeline
    • Present it as a two-step task
    • Step 1: Decide which tokens to keep or get rid of
    • Step 2: Decide token order (current full task)
  • Allow display of additional context when required
  • Task Category System (allow multiple tasks from same category)
    • Global TASK_CATEGORY Constants
    • Update models
      • Task model to have category (Enum() of all valid task categories)
      • All task models to have task_id
    • Render templates, perform JS actions based on Task.category
    • Admin interface to add/edit tasks, category to be chosen
      • Customizable Help Messages
  • Log all Submits instead of just latest per task per annotator?
  • Show graph for SentenceGraph
  • Add triplets for token graph
  • Show sentence with Token Graph
  • Do task setup on the event of tab change (so we can avoid calling it from every submit and it'll be more consistent)
  • Track progress and provide a progressbar in front-end
    • Front-end (formatter function, on-hover info)
    • Back-end (update after every task-submit)
  • Skip vs Submit
    • Remove Skip button?
    • Allow empty submits
  • Draggable Left-Right Column
    • At least adjust width (50-50 or so) (no longer needed)
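The Task Category System above can be illustrated with a small sketch in which several tasks share one category and the template is chosen from the category, not from a task id. The category names, the `Task` fields and the template file names are hypothetical; the real definitions are the global TASK_CATEGORY constants and the models.

```python
from dataclasses import dataclass
from enum import Enum


class TaskCategory(Enum):
    # Hypothetical category names, loosely following the Core list.
    SENTENCE_BOUNDARY = "sentence_boundary"
    TOKEN_ORDER = "token_order"
    TOKEN_CLASSIFICATION = "token_classification"
    TOKEN_GRAPH = "token_graph"


@dataclass
class Task:
    id: int
    category: TaskCategory
    title: str


# One template per category, so any number of tasks can reuse it.
TEMPLATES = {category: f"{category.value}.html" for category in TaskCategory}


def template_for(task):
    """Pick the template from the task's category, never from its id."""
    return TEMPLATES[task.category]


def tasks_in_category(tasks, category):
    """Return all tasks that belong to the given category."""
    return [task for task in tasks if task.category is category]
```

With this layout, adding a second classification task (for example a part-of-speech task next to a named-entity task) needs only a new row in the task table and no new template or branch in the code.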

Bugs

  • Selecting multiple chapters in the export/ interface doesn't work. Either force a single chapter or figure out why the error occurs.

  • Deleted items trigger a "Successfully updated" message even when nothing else has changed. Refer to server_sqla.py for further details.

  • TokenConnection/SentenceGraph connections/relations show incomplete rows when one of the tokens is out of context.

  • Export - boundary not shown in some cases

    • (details: the bug occurred when the boundary token has no text (or its text equals _))
  • If sentence boundary is marked for a verse in the next chapter, all the nodes in between get counted as sentences

  • Task 4 not recording

  • Task 4 display shows only a single relation out of the existing ones

  • After a sentence boundary is marked, the transition to the word-order task doesn't pick up the proper sentence for word order; setup_word_order() has to be called again. Probably an async issue.

  • When boundaries are added or deleted, the next sentence may be affected as well, and this needs handling. (e.g. if tokens 12 and 24 were both boundaries and the boundary at token 12 gets deleted, any word_order stored for the sentence ending at token 24 needs to be re-done)
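The last bug above can be made concrete with a small sketch. It assumes a sentence is identified by the token id of its closing boundary and that word orders are kept in a dict keyed by that id; both the data layout and the function names are hypothetical, not taken from server_sqla.py.

```python
from bisect import bisect_right


def next_boundary(boundaries, changed):
    """Return the first boundary after `changed`, or None if there is none.

    boundaries: sorted boundary token ids as they are AFTER the change.
    changed: token id of the boundary that was added or deleted.
    """
    index = bisect_right(boundaries, changed)
    return boundaries[index] if index < len(boundaries) else None


def invalidate_word_order(word_orders, boundaries, changed):
    """Drop word orders made stale by adding or deleting a boundary."""
    if changed not in boundaries:
        # The boundary was deleted, so its own sentence no longer exists.
        word_orders.pop(changed, None)
    stale = next_boundary(boundaries, changed)
    if stale is not None:
        # The following sentence grew or shrank; its order must be redone.
        word_orders.pop(stale, None)
    return stale
```

For the example in the bug report, deleting the boundary at token 12 from boundaries 12, 24 and 36 leaves `[24, 36]`, and the sentence ending at token 24 is the one whose word order has to be re-done. The sentence ending at token 36 is untouched.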

Core

  • Front-end
    • Sentence Boundary Interface
    • Canonical Token Order Interface
      • Reordering Front-end (sortable)
    • Token Text Annotation Interface
    • Token Classification Interface (e.g. Named Entity)
    • Token Graph Interface
      • Triplet based addition
      • Show Live Graph (on button click)
    • Co-reference Resolution Interface
      • Button click (for token selection) based interface
    • Sentence Classification Interface
    • Sentence Graph
  • Back-end
    • Sentence Boundary
      • Deleting necessary boundaries if required
      • Delete related objects
    • Canonical Token Order
    • Token Text Annotation
    • Token Classification
    • Token Graph
    • Token Connection
    • Sentence Classification
    • Sentence Graph

Future

  • Connect to Sangrahaka

    • Import entities (do it manually in database)
  • Allow selecting the source (DCS etc.) in case CoNLL-U input is allowed

  • Run SSCS for splitting (for general Sanskrit corpus when not in CoNLLU) (Think!)

  • Edit in table??

  • Allow multiple selection in classification tasks: handle it on the JS side, generating multiple entries from a single selector. Issue: when a task is actually single-select, a multi-select widget adds one extra click to change the value.

  • User-wise task allocation?

  • Conflict Resolution Strategy

    • Task-specific