
Meeting Note #2 07.03.2019

ugurcanarikan edited this page Mar 15, 2019 · 2 revisions

Location: Bogazici University Computer Engineering Building

Date/Time: 07.03.2019 / 12:00

Attendees:

  • Suzan Üsküdarlı
  • Onur Güngör
  • Uğurcan Arıkan

1. Preparation Before Meeting

  • 1.1. Research NLTK
  • 1.2. Research data visualization tools in Python

2. Agenda

  • 2.1. Pretraining BERT after the corpus has been tokenized

3. Discussion

  • 3.1. Missing parts of the wiki page were discussed
  • 3.2. Pretraining BERT was discussed

4. Outcomes

  • 4.1. Wiki page of the repository

    • 4.1.1. Link to the corpus will be added
  • 4.2. BERT

    • 4.2.1. Vocabulary file for BERT will be created using Google's SentencePiece
    • 4.2.2. Pretraining data will be created
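
One possible way to carry out outcome 4.2.1 is sketched below. It is an illustration only, not the method agreed in the meeting. It assumes SentencePiece has already been trained on the corpus and has written its usual `.vocab` file (one `piece<TAB>score` entry per line), and it converts those pieces to the WordPiece-style convention used in BERT vocabulary files, where word-internal pieces carry a `##` prefix. The function name `sentencepiece_to_bert_vocab` and the sample pieces are hypothetical.

```python
# Minimal sketch: convert SentencePiece vocabulary lines into a BERT-style
# vocabulary list. The function name and the sample data are hypothetical.

# Special tokens that BERT expects to find in its vocabulary file.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]

# SentencePiece marks the start of a word with U+2581 (LOWER ONE EIGHTH BLOCK).
SP_WORD_START = "\u2581"

# Default SentencePiece control symbols, which have no use in a BERT vocabulary.
SP_CONTROL_SYMBOLS = {"<unk>", "<s>", "</s>"}


def sentencepiece_to_bert_vocab(sp_vocab_lines):
    """Return BERT-style tokens for an iterable of 'piece<TAB>score' lines."""
    vocab = list(SPECIAL_TOKENS)
    seen = set(vocab)
    for line in sp_vocab_lines:
        piece = line.rstrip("\n").split("\t")[0]
        if not piece or piece in SP_CONTROL_SYMBOLS:
            continue
        if piece.startswith(SP_WORD_START):
            # Word-initial piece: drop the marker, keep the text as-is.
            token = piece[len(SP_WORD_START):]
        else:
            # Word-internal piece: add the WordPiece continuation prefix.
            token = "##" + piece
        # Skip the bare word-start marker and any duplicates.
        if token and token not in seen:
            vocab.append(token)
            seen.add(token)
    return vocab


if __name__ == "__main__":
    # Hypothetical sample lines in the SentencePiece .vocab format.
    sample = ["<unk>\t0", "<s>\t0", "</s>\t0", "\u2581kitap\t-3.1", "lar\t-3.5"]
    for token in sentencepiece_to_bert_vocab(sample):
        print(token)
```

Writing the returned tokens one per line produces a file in the layout BERT reads as `vocab.txt`. Whether the `##` mapping is appropriate depends on how the corpus is tokenized later, so treat it as a starting point.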

5. TO-DO list

Deadline: 14.03 12:00 Assignee: Uğurcan Arıkan

  • 5.1. Add a link to the corpus to the wiki

Deadline: 14.03 12:00 Assignee: Uğurcan Arıkan

  • 5.2. Create vocabulary for BERT

Deadline: 14.03 12:00 Assignee: Uğurcan Arıkan

  • 5.3. Create pretraining data for BERT