Update index.html

OSU-NLP-Group · Nov 22, 2023 · b370aa6
1 parent 6c3985d
Showing 1 changed file with 2 additions and 2 deletions.
diff --git a/index.html b/index.html
@@ -625,7 +625,7 @@ <h2 class="title is-3">In-domain Evaluation</h2>
               </div>
               <div>
               <br>
-              Specifically, we observe the following takeaways:
+              Specifically, we observed the following takeaways:
                 <ol>
                   <li>By simply fine-tuning a large language model on TableInstruct, TableLlama can achieve comparable or even better performance on almost all the tasks <b>without any table pretraining or special table model architecture design</b>;</li>
                   <li><b>TableLlama displays advantages in table QA tasks</b>: <b>TableLlama</b> can surpass the SOTA by <b>5.61 points</b> for highlighted cell based table QA task (i.e., FeTaQA) and <b>17.71 points</b> for hierarchical table QA (i.e., HiTab), which is full of numerical reasoning on tables. As LLMs have been shown superior in interacting with humans and answering questions, this indicates that <b>the existing underlying strong language understanding ability of LLMs may be beneficial for such table QA tasks, despite with semi-structured tables</b>;</li>
@@ -718,7 +718,7 @@ <h2 class="title is-3">Out-of-domain Evaluation</h2>
               </div>
               <div>
                 <br>
-                Specifically, we observed these following takeaways:
+                Specifically, we observed the following takeaways:
                 <ol>
                   <li><b>By learning from the table-based training tasks, the model has acquired essential underlying table understanding ability, which can be transferred to other table-based tasks/datasets and facilitate their performance;</b></li>
                   <li>FEVEROUS exhibits the largest gain over the other 5 datasets. This is likely because the fact verification task is an in-domain training task, although the dataset is unseen during training. <b>Compared with cross-task generalization, it may be easier to generalize to different datasets belonging to the same tasks</b>;</li>