CVND---Image-Captioning-Project

Instructions

Clone the COCO API repo: https://github.com/cocodataset/cocoapi

git clone https://github.com/cocodataset/cocoapi.git

Set up the COCO API (also described in that repo's README):

cd cocoapi/PythonAPI  
make  
cd ..
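After the build step above, a quick check from Python can confirm that the `pycocotools` package is locatable. The following is a minimal sketch, not part of the project: the helper name `coco_api_available` is hypothetical, and it assumes the commands above were run from the current working directory so that the build lives in `cocoapi/PythonAPI`. It only checks that the package can be found, not that the compiled extension built correctly.

```python
import importlib.util
import sys
from pathlib import Path


def coco_api_available(python_api_dir="cocoapi/PythonAPI"):
    """Return True if the pycocotools package can be located.

    Hypothetical helper: adds the local build directory to sys.path
    (if it exists) and then asks importlib whether the package is visible.
    """
    path = Path(python_api_dir).resolve()
    if path.is_dir() and str(path) not in sys.path:
        sys.path.append(str(path))
    return importlib.util.find_spec("pycocotools") is not None


if __name__ == "__main__":
    print("pycocotools importable:", coco_api_available())
```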

Download the following data from http://cocodataset.org/#download:

Under Annotations, download:
- 2014 Train/Val annotations [241MB]: extract captions_train2014.json and captions_val2014.json, and place them at cocoapi/annotations/captions_train2014.json and cocoapi/annotations/captions_val2014.json, respectively.
- 2014 Testing Image info [1MB]: extract image_info_test2014.json and place it at cocoapi/annotations/image_info_test2014.json.

Under Images, download:
- 2014 Train images [83K/13GB]: extract the train2014 folder and place it at cocoapi/images/train2014/.
- 2014 Val images [41K/6GB]: extract the val2014 folder and place it at cocoapi/images/val2014/.
- 2014 Test images [41K/6GB]: extract the test2014 folder and place it at cocoapi/images/test2014/.
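Because the downloads are large and the target locations are easy to get wrong, it can help to verify the layout before opening the notebooks. The sketch below lists whichever of the paths named above are missing. The function name `missing_coco_paths` and the default root `cocoapi` are assumptions for illustration; adjust the root to wherever the repo was cloned.

```python
from pathlib import Path

# Paths taken from the download instructions, relative to the cocoapi folder.
EXPECTED_PATHS = [
    "annotations/captions_train2014.json",
    "annotations/captions_val2014.json",
    "annotations/image_info_test2014.json",
    "images/train2014",
    "images/val2014",
    "images/test2014",
]


def missing_coco_paths(root="cocoapi"):
    """Return the expected files and folders that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_coco_paths()
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All expected COCO files and folders are in place.")
```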

The project is structured as a series of Jupyter notebooks that are designed to be completed in sequential order (0_Dataset.ipynb, 1_Preliminaries.ipynb, 2_Training.ipynb, 3_Inference.ipynb).

Image Captioning Project

In this project, I design and train a CNN-RNN (Convolutional Neural Network - Recurrent Neural Network) model for automatically generating image captions. The network is trained on the Microsoft Common Objects in COntext (MS COCO) dataset. The image captioning model is displayed below.
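To make the CNN-RNN idea concrete, here is a minimal sketch of the general pattern, assuming PyTorch. It is not the model from this repo: the class names `ImageEncoder` and `CaptionDecoder`, the tiny convolutional stack, and all sizes are hypothetical and chosen so the example runs quickly without downloading pretrained weights. The encoder turns an image into one feature vector, and the decoder feeds that vector to an LSTM as the first input, followed by the embedded caption tokens.

```python
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Tiny stand-in CNN mapping an image batch to one feature vector each."""

    def __init__(self, embed_size):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (B, 32, 1, 1)
        )
        self.embed = nn.Linear(32, embed_size)

    def forward(self, images):
        x = self.features(images).flatten(1)  # (B, 32)
        return self.embed(x)  # (B, embed_size)


class CaptionDecoder(nn.Module):
    """LSTM that scores the next word at every position of a caption."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Teacher forcing: the image feature is the first input, followed by
        # every caption token except the last one.
        tokens = self.word_embed(captions[:, :-1])  # (B, T-1, E)
        inputs = torch.cat([features.unsqueeze(1), tokens], dim=1)  # (B, T, E)
        hidden, _ = self.lstm(inputs)  # (B, T, H)
        return self.fc(hidden)  # (B, T, vocab_size)


# Usage with random data: 2 images, captions of 12 token ids, vocabulary of 100.
encoder = ImageEncoder(embed_size=64)
decoder = CaptionDecoder(embed_size=64, hidden_size=128, vocab_size=100)
images = torch.randn(2, 3, 64, 64)
captions = torch.randint(0, 100, (2, 12))
scores = decoder(encoder(images), captions)
print(scores.shape)  # torch.Size([2, 12, 100])
```

In practice the small convolutional stack would usually be replaced by a pretrained image classifier with its final layer swapped for the embedding layer; that choice is outside what this sketch shows.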

Image source

Generating Image Captions

Here are some predictions from my model.

Good results

Not so good results

More samples

There are 500 predictions in the samples folder.

To set up your environment for the Computer Vision exercises and projects, see: CVND-Exercises-Solved

