Handwritten mathematical expression dataset for value evaluation and notation type classification.
The dataset consists of 50,000 images, each 384 x 128 pixels (w x h) and containing a single expression in prefix (e.g. +12), postfix (e.g. 12+), or infix (e.g. 1+2) notation. The characters are evenly spaced, so each third of the image width (128 pixels) contains exactly one character.
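
Because the spacing is fixed, each character can be isolated by cutting the image into three equal slots. The following is a minimal sketch of that idea; the function name `character_boxes` is hypothetical, and the boxes use the (left, upper, right, lower) convention that Pillow's `Image.crop` accepts.

```python
def character_boxes(width=384, height=128, n_chars=3):
    """Return one (left, upper, right, lower) box per character slot.

    Assumes the characters are evenly spaced, as described above, so each
    slot is width // n_chars pixels wide and spans the full image height.
    """
    slot = width // n_chars
    return [(i * slot, 0, (i + 1) * slot, height) for i in range(n_chars)]


boxes = character_boxes()
print(boxes)
# [(0, 0, 128, 128), (128, 0, 256, 128), (256, 0, 384, 128)]
```

If Pillow is installed, each box could be passed to `Image.crop` on an opened image to obtain the three 128 x 128 character patches.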
Some sample images from the dataset can be seen below:
The annotations contain the notation label (prefix, postfix, or infix) and the value obtained by evaluating the expression.
Image | Label | Value |
---|---|---|
100.jpg | prefix | 0 |
311.jpg | postfix | 3 |
9991.jpg | prefix | 0 |
34788.jpg | infix | 7 |
34651.jpg | prefix | 16 |
34611.jpg | infix | 28 |
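
To illustrate how the two annotation fields relate to the three characters in an image, here is a hedged sketch that derives both from a recognized symbol sequence. The operator set (`+`, `-`, `*`) and the single-digit operands are assumptions for illustration, as are the names `notation_of` and `evaluate`; the dataset description does not list the operators used.

```python
import operator

# Assumed operator set; the dataset description does not enumerate it.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def notation_of(symbols):
    """Classify a 3-symbol expression by the position of its operator."""
    positions = [i for i, s in enumerate(symbols) if s in OPS]
    if len(symbols) != 3 or len(positions) != 1:
        raise ValueError(f"expected two digits and one operator: {symbols!r}")
    return ("prefix", "infix", "postfix")[positions[0]]


def evaluate(symbols):
    """Evaluate a 3-symbol expression written in any of the three notations."""
    notation = notation_of(symbols)
    a, b, c = symbols
    if notation == "prefix":
        op, left, right = a, b, c
    elif notation == "infix":
        left, op, right = a, b, c
    else:  # postfix
        left, right, op = a, b, c
    return OPS[op](int(left), int(right))


for expr in ("+12", "12+", "1+2"):
    print(expr, notation_of(expr), evaluate(expr))
```

All three example expressions evaluate to 3 and differ only in their notation label, which is why the label and the value are annotated separately.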
The dataset can be downloaded from Google Drive. Once extracted, it has a 'data' directory with all images and an 'annotations.csv' file with the label and value of each image.
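
Once extracted, the annotations can be read with Python's standard `csv` module. This is a sketch under stated assumptions: it assumes the CSV has a header row with columns named `Image`, `Label`, and `Value` (mirroring the table above), which should be checked against the actual file. The function name `load_annotations` is hypothetical.

```python
import csv
from pathlib import Path


def load_annotations(csv_path, image_dir):
    """Read annotations into a list of dicts with path, label, and value.

    Assumes a header row with columns 'Image', 'Label', and 'Value';
    adjust the keys if the real file names them differently.
    """
    records = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            records.append({
                "path": Path(image_dir) / row["Image"].strip(),
                "label": row["Label"].strip(),
                "value": int(row["Value"]),
            })
    return records
```

A call such as `load_annotations("annotations.csv", "data")` would then return one record per image, with the paths adjusted to wherever the archive was extracted.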