Image dataset analysis

In the field of Computer Vision, image datasets play one of the important roles. Everyone, who begins his/her research in the aforementioned sphere, initially, need to choose model and set of data, appropriate for the given CV task. Usually, it is useful to make analysis (ex. analyze proportion of classes with given number of images or get classes with n images) in order to take a closer look at data. In this repository, you might find useful tools to perform image dataset analysis.

Let's analyze the image dataset

Let's choose the LFW (Labelled Faces in the Wild) face database as an example dataset.

First of all, it is necessary to create an instance of ImageDataset class and pass the path of dataset's directory as an argument.

Following structure of dataset directory is required to analyze it appropriately:
- image_dataset:
  - class 1:
    - image 1;
    - image 2;
    - ....
  - class 2
  - class 3
  - .....
```
from image_dataset_analysis import ImageDataset
lfw = ImageDataset("/LFW")   
```

Now, we can get some useful information about dataset by calling the following method:

lfw.analyze()

Output:

   Number of images in image dataset: 13214
   Number of classes in image dataset: 5734
   Mean number of images per class: 2
   Minimum number of images per class: 1
   Maximum number of images per class: 530
   Formats of images in dataset:
      JPEG: 13213 (100.0%)
   Modes of images in dataset:
      RGB: 13213 (100.0%)
   Sizes of images in dataset:
      (128, 128): 13213 (100.0%)
   Number of classes with only 1 images : 4057
   Number of classes with only 2 images : 777
   Number of classes with only 3 images : 290
   Remaining number of classes : 610

Do you want to know more?

You might get useful information about other valuable tools along with their implementations in the example.pdf file. In addition, you may use example.ipynb to make analysis of your image datasets. By the way, don't forget to change the path of dataset's directory to your own one!

Any questions ❓

Feel free to ask or express your ideas in issues section.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Image dataset analysis

Let's analyze the image dataset

Do you want to know more?

Any questions ❓

Files

README.md

Latest commit

History

README.md

File metadata and controls

Image dataset analysis

Let's analyze the image dataset

Do you want to know more?

Any questions ❓