Scene Recognition using Bag of Visual Words with Spatial Pyramid Matching

This software implements a bag of visual words model to classify images belonging to a subset of the SUN dataset into 8 classes.

Methodology

First, a vocabulary of visual words is constructed by densely sampling SIFT features and clustering them with K-means; the cluster centres are the visual words.
The training images are then represented as histograms of these visual words, which are TF-IDF re-weighted to enhance the importance of discriminative features.
One-vs-all linear SVM classifiers are then trained on these histograms.
At query time, the test image is densely sampled for features, which are mapped to their nearest visual words (the K-means cluster centres found during training) to construct the histogram. This histogram is re-weighted in the same way as the training histograms and then passed through the SVM classifiers.
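
The steps above can be sketched end to end on toy data. This is a minimal illustration under stated assumptions, not the notebook's code: it uses NumPy only, replaces dense SIFT with synthetic 8-D descriptors, initialises K-means with a deterministic farthest-point rule, takes TF-IDF to be the L1-normalised word count times log(N / n_w), and fits each one-vs-all linear SVM by plain subgradient descent on the hinge loss. All function names and hyperparameters are hypothetical.

```python
import numpy as np

def assign_words(descriptors, centres):
    """Index of the nearest centre (visual word) for every descriptor."""
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, k, iters=10):
    """Lloyd's K-means with a deterministic farthest-point initialisation (assumption)."""
    centres = [descriptors[0]]
    while len(centres) < k:
        d2 = ((descriptors[:, None, :] - np.array(centres)[None]) ** 2).sum(axis=2)
        centres.append(descriptors[d2.min(axis=1).argmax()])
    centres = np.array(centres, dtype=float)
    for _ in range(iters):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old centre
                centres[j] = descriptors[labels == j].mean(axis=0)
    return centres

def word_histogram(descriptors, centres):
    """L1-normalised visual-word counts for one image (the term frequencies)."""
    counts = np.bincount(assign_words(descriptors, centres), minlength=len(centres))
    return counts / counts.sum()

def inverse_document_frequency(histograms):
    """log(N / n_w): N training images, n_w of which contain word w."""
    doc_freq = (histograms > 0).sum(axis=0)
    return np.log(len(histograms) / np.maximum(doc_freq, 1))

def train_one_vs_all_svm(X, y, n_classes, lam=0.01, lr=0.1, epochs=200):
    """One linear SVM per class, fitted by subgradient descent on the hinge loss."""
    n, d = X.shape
    W, b = np.zeros((n_classes, d)), np.zeros(n_classes)
    for c in range(n_classes):
        t = np.where(y == c, 1.0, -1.0)       # this class against all others
        for _ in range(epochs):
            viol = t * (X @ W[c] + b[c]) < 1  # samples violating the margin
            grad_w = lam * W[c] - (t[viol][:, None] * X[viol]).sum(axis=0) / n
            grad_b = -t[viol].sum() / n
            W[c] -= lr * grad_w
            b[c] -= lr * grad_b
    return W, b

def predict(X, W, b):
    """Pick the class whose one-vs-all classifier gives the highest score."""
    return (X @ W.T + b).argmax(axis=1)

# Toy stand-in for dense SIFT: four "patch types" in an 8-D descriptor space.
rng = np.random.default_rng(0)
prototypes = np.eye(4, 8)

def toy_image(label, per_word=10):
    """Class 0 images contain patch types 0 and 1; class 1 images types 2 and 3."""
    protos = prototypes[[2 * label, 2 * label + 1]].repeat(per_word, axis=0)
    return protos + rng.uniform(-0.05, 0.05, protos.shape)

train_labels = np.array([0, 1] * 5)
train_images = [toy_image(c) for c in train_labels]
centres = build_vocabulary(np.vstack(train_images), k=4)
train_hists = np.array([word_histogram(im, centres) for im in train_images])
idf = inverse_document_frequency(train_hists)
W, b = train_one_vs_all_svm(train_hists * idf, train_labels, n_classes=2)

# Query time: same vocabulary, same IDF weights as in training.
query_feature = word_histogram(toy_image(1), centres) * idf
print(predict(query_feature[None, :], W, b))
```

Because every step after feature extraction only sees an array of descriptors, real dense SIFT descriptors could be substituted for `toy_image` without changing the rest of the sketch.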

Spatial Pyramid Matching:
Bag of visual words does not account for where in the image the visual words occur. To address this, spatial pyramid matching divides the image into $4^l$ regions at each level $l = 0, 1, 2, \dots$ and concatenates weighted histograms from each level, giving more weight to deeper levels, which are more spatially focussed [1].
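
As a rough sketch of how such a pyramid feature can be assembled, the snippet below bins already-quantised features into a $2^l \times 2^l$ grid per level and concatenates the weighted cell histograms. It assumes the level weights given in [1] ($1/2^L$ for level 0 and $1/2^{L-l+1}$ for level $l \geq 1$, with $L$ the deepest level); the notebook may weight levels differently. The function name, its arguments, and the final L1 normalisation are hypothetical.

```python
import numpy as np

def spatial_pyramid_histogram(words, xs, ys, width, height, vocab_size, max_level=2):
    """Concatenate weighted per-cell word histograms for levels 0..max_level."""
    parts = []
    for level in range(max_level + 1):
        cells = 2 ** level  # cells per side, so 4**level regions at this level
        # Weights from [1]: 1/2^L for level 0, 1/2^(L-l+1) for level l >= 1.
        weight = 2.0 ** -max_level if level == 0 else 2.0 ** (level - max_level - 1)
        col = np.minimum(xs * cells // width, cells - 1).astype(int)
        row = np.minimum(ys * cells // height, cells - 1).astype(int)
        cell_index = row * cells + col
        hist = np.zeros((cells * cells, vocab_size))
        np.add.at(hist, (cell_index, words), 1)  # count each word in its own cell
        parts.append(weight * hist.ravel())
    pyramid = np.concatenate(parts)
    return pyramid / pyramid.sum()  # L1 normalisation (assumption)

# Four features in a 100x100 image: visual word index plus pixel position.
words = np.array([0, 1, 1, 2])
xs = np.array([10, 90, 95, 10])
ys = np.array([10, 10, 15, 90])
feat = spatial_pyramid_histogram(words, xs, ys, width=100, height=100, vocab_size=3)
print(feat.shape)
```

With `max_level=2` the feature has `vocab_size * (1 + 4 + 16)` entries, so the vector grows quickly with pyramid depth.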

Experimentation:
The effects of model variations and parameter tuning are also presented, together with an analysis of those effects.

To run

Run all commands from the repository root.

Unzip the dataset.
```
unzip dataset/SUN_data.zip -d dataset/
```

Install the Python packages from the Python Package Index (preferably in a virtual environment).

```
python3 -m venv bovw-env && source bovw-env/bin/activate # optional
pip install -r requirements.txt
```

Run the notebook on a jupyter server.
```
jupyter notebook
```
Open src/bovw.ipynb in the web browser window.

[1] Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. In CVPR, vol. 2, pp. 2169-2178. doi:10.1109/CVPR.2006.68.
