Docvqa data and its format #10

omeryasar · 2023-05-12T08:56:49Z

Hi, thanks for sharing your valuable work.
I am trying to fine-tune the Dessurt model on my own dataset, which contains receipt-like documents. What exact supervision do I need to provide? Do I have to give bounding boxes for the answers, etc.? I would be very happy if you could share the dataset you used, or the format of the dataset you used, for the DocVQA task.

Thanks in advance

herobd · 2023-05-12T16:08:32Z

If you just want textual answers/supervision, the MyDataset class should meet your needs: https://github.com/herobd/dessurt#mydataset
It just requires a very simple JSON annotation format. The only tricky part is picking the task token, which depends on what you're doing. "natural_q~" is the one for answering questions like DocVQA.
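
As a rough illustration of what text-only supervision could look like, here is a minimal sketch that builds question/answer pairs with the "natural_q~" task token and serializes them to JSON. The key names `question` and `answer`, the helper names, and the choice to prepend the token to the question text are assumptions made for illustration, not the confirmed MyDataset schema; check the README section linked above for the actual format.

```python
import json

# Task token named in this thread for DocVQA-style question answering.
TASK_TOKEN = "natural_q~"


def build_qa_entries(pairs, task_token=TASK_TOKEN):
    """Turn (question, answer) pairs into text-only QA entries.

    Assumption: the task token is prepended to the question text and the
    keys are "question"/"answer". No bounding boxes are included.
    """
    return [{"question": task_token + q, "answer": a} for q, a in pairs]


def annotation_json(pairs):
    """Serialize the entries for one document image as a JSON string."""
    return json.dumps(build_qa_entries(pairs), indent=2)


if __name__ == "__main__":
    # Hypothetical receipt questions; replace with your own data.
    print(annotation_json([
        ("What is the total amount?", "12.50"),
        ("What is the store name?", "Corner Market"),
    ]))
```

The point of the sketch is only that each entry carries a question string and an answer string, with no location information.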

omeryasar · 2023-05-16T08:18:04Z

> If you just want textual answers/supervision, the MyDataset class should meet your needs: https://github.com/herobd/dessurt#mydataset
> It just requires a very simple JSON annotation format. The only tricky part is picking the task token, which depends on what you're doing. "natural_q~" is the one for answering questions like DocVQA.

Thanks for the answer. So I don't have to locate my questions or answers with bounding boxes in the related document?

herobd · 2023-05-18T00:05:43Z

Correct
