From 2168e0a6a9808d9cc116e3e68e4319d3716f7fe3 Mon Sep 17 00:00:00 2001
From: jiviteshjain
Date: Tue, 6 Apr 2021 23:58:18 +0530
Subject: [PATCH] Question 2: Documentation.

---
 src/Assignment4.ipynb | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/src/Assignment4.ipynb b/src/Assignment4.ipynb
index fb975f3..26f8c3d 100644
--- a/src/Assignment4.ipynb
+++ b/src/Assignment4.ipynb
@@ -2,7 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "id": "korean-responsibility",
+   "id": "focused-rebel",
    "metadata": {},
    "source": [
     "# Scene Recognition using Bag of Visual Words\n",
@@ -51,7 +51,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "floral-pastor",
+   "id": "pleased-graphics",
    "metadata": {},
    "source": [
     "Here, we use generators as data loaders. This ensures that no more than one image is in memory at a time during processing. Because SIFT feature calculation is memory-intensive, this helps keep the overall memory consumption of the process low.\n",
@@ -124,7 +124,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "exclusive-wholesale",
+   "id": "transparent-restoration",
    "metadata": {},
    "source": [
     "The following functions implement major parts of the algorithm:\n",
@@ -186,7 +186,6 @@
     "    \n",
     "    histograms = np.zeros((len(descriptors), k), dtype=np.float64)\n",
     "    idx = 0\n",
-    "#     print(len(descriptors))\n",
     "    for i, desc in enumerate(descriptors):\n",
     "        for _ in range(desc.shape[0]):\n",
     "            histograms[i, labels[idx]] += 1\n",
@@ -407,7 +406,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "structural-break",
+   "id": "violent-bathroom",
    "metadata": {},
    "source": [
     "The confusion matrix on the test set is largely diagonal.\n",
@@ -492,7 +491,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "voluntary-theta",
+   "id": "paperback-judgment",
    "metadata": {},
    "source": [
     "We can't help but notice that several of the misclassified images actually do resemble the predicted classes. The highway on the bottom right has several green areas to the side, and so do the two waterfalls to its left, confusing the classifier into thinking of them as parks. Laundromats and Kitchens can easily be mixed up because of the similar colours and arrangements of washing machines, chairs, and tables."
@@ -558,7 +557,6 @@
     "        hist = w * bin_histograms(np.concatenate(sel_lab), sel_desc, k)\n",
     "        histograms = np.hstack((histograms, hist))\n",
     "    \n",
-    "#     print(histograms.shape)\n",
     "    return histograms\n",
     "\n",
     "def cluster_pyramid(keypoints, descriptors, image_shape=SIZE, k=VOCAB_SIZE, level_weights=LEVEL_WEIGHTS, kmeans=None):\n",
@@ -578,7 +576,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "static-metadata",
+   "id": "affecting-spray",
    "metadata": {},
    "source": [
     "For efficiency, the code here reuses as much of the previously computed state as possible. The SIFT descriptors are computed only once across grids and levels, and are themselves reused from the run above. Because the clustering is not hierarchical, it is also reused from the earlier run."
@@ -700,7 +698,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "homeless-opening",
+   "id": "trying-burton",
    "metadata": {},
    "source": [
     "We obtain a significantly higher test accuracy of 70.625%, demonstrating the utility and robustness of this approach."
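For context on the hunks above that touch `bin_histograms` and `cluster_pyramid`: below is a minimal, self-contained sketch of the bag-of-visual-words histogram step this patch documents. The `bin_histograms` body mirrors the code visible in the diff; `build_vocabulary`, the toy data, and the use of scikit-learn's `KMeans` are assumptions about the surrounding notebook, not code confirmed by this patch.

```python
# Minimal sketch of the bag-of-visual-words pipeline around bin_histograms.
# bin_histograms mirrors the diff; build_vocabulary and KMeans are assumed.
import numpy as np
from sklearn.cluster import KMeans

VOCAB_SIZE = 200  # vocabulary size the notebook reports using


def build_vocabulary(descriptors, k=VOCAB_SIZE):
    # Stack every image's (n_i, 128) SIFT descriptor array and cluster the
    # pooled descriptors into k visual words.
    stacked = np.concatenate(descriptors, axis=0)
    return KMeans(n_clusters=k, random_state=0).fit(stacked)


def bin_histograms(labels, descriptors, k=VOCAB_SIZE):
    # labels holds the cluster index of every descriptor, concatenated in
    # the same image order as descriptors (one array per image).
    histograms = np.zeros((len(descriptors), k), dtype=np.float64)
    idx = 0
    for i, desc in enumerate(descriptors):
        for _ in range(desc.shape[0]):
            histograms[i, labels[idx]] += 1  # one vote per descriptor
            idx += 1
    return histograms


# Toy usage: random arrays stand in for real SIFT descriptors of two images.
rng = np.random.default_rng(0)
descs = [rng.normal(size=(40, 128)), rng.normal(size=(25, 128))]
kmeans = build_vocabulary(descs, k=8)
hists = bin_histograms(kmeans.labels_, descs, k=8)
assert hists.shape == (2, 8) and hists.sum() == 65
```

The nested loop attributes each descriptor's cluster label back to its source image, which is why `labels` must be ordered exactly as the per-image descriptor arrays were concatenated.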
@@ -708,7 +706,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "subtle-baghdad",
+   "id": "recreational-princeton",
    "metadata": {},
    "source": [
     "### Without TF/IDF\n",
@@ -719,7 +717,7 @@
   {
    "cell_type": "code",
    "execution_count": 18,
-   "id": "provincial-czech",
+   "id": "supported-fetish",
    "metadata": {},
    "outputs": [
     {
@@ -801,7 +799,7 @@
   },
   {
    "cell_type": "markdown",
-   "id": "unlikely-conflict",
+   "id": "trying-machine",
    "metadata": {},
    "source": [
     "## Results and Experimentation\n",
@@ -814,7 +812,7 @@
     "\n",
     "We notice that the accuracy increases with the vocabulary size, because of finer-grained clustering. We report test accuracy scores of approximately $48.7\%$, $52.5\%$, and $54.7\%$ for vocabulary sizes of $100$, $200$, and $500$, respectively. However, we keep the vocabulary size at $200$ for efficiency.\n",
     "\n",
-    "The regularization parameter, $\lambda$, also has an effect. Setting $C$ to $10$, instead of the default $1$ (higher $C$ corresponds to a lower $\lambda$) improves the training and test accuracy slightly (test accuracy changes from $51.1%$ to the current $52.5$), although the perfect training accuracy hints at overfitting. \n",
+    "The regularization parameter, $\lambda$, also has an effect. Setting $C$ to $10$ instead of the default $1$ (a higher $C$ corresponds to a lower $\lambda$) slightly improves both training and test accuracy (test accuracy rises from $51.1\%$ to the current $52.5\%$), although the perfect training accuracy hints at overfitting.\n",
     "\n",
     "TF-IDF re-weighting also improves the accuracy, although only slightly."
    ]
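The results cell above discusses TF-IDF re-weighting and the SVM regularization parameter `C` without showing the corresponding code, so here is a hedged sketch of one standard way to implement both. The notebook's actual implementation is not visible in this patch; the function name, the toy data, and the choice of scikit-learn's `LinearSVC` are assumptions.

```python
# Sketch of TF-IDF re-weighting for visual-word histograms, plus the C=10
# linear SVM the results cell mentions. Names and LinearSVC are assumed;
# the notebook's own implementation is not shown in this patch.
import numpy as np
from sklearn.svm import LinearSVC


def tfidf_reweight(histograms):
    # histograms: (n_images, vocab_size) raw visual-word counts.
    n_images = histograms.shape[0]
    # Term frequency: each image's counts, normalised by its total count.
    tf = histograms / np.maximum(histograms.sum(axis=1, keepdims=True), 1.0)
    # Inverse document frequency: words present in fewer images weigh more.
    df = np.count_nonzero(histograms, axis=0)
    idf = np.log(n_images / np.maximum(df, 1))
    return tf * idf


# Toy usage: random counts stand in for the real training histograms.
rng = np.random.default_rng(0)
train_hists = rng.integers(0, 20, size=(80, 200)).astype(np.float64)
train_labels = rng.integers(0, 8, size=80)

# C=10 weakens regularization relative to the default C=1 (i.e. a lower
# lambda), matching the setting the results cell reports.
clf = LinearSVC(C=10).fit(tfidf_reweight(train_hists), train_labels)
print(clf.score(tfidf_reweight(train_hists), train_labels))
```

Down-weighting visual words that occur in most images lets the rarer, more discriminative words dominate the histograms, which is consistent with the small accuracy gain the cell reports for TF-IDF.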