Question 2: Documentation.
jiviteshjain committed Apr 6, 2021
1 parent bb1e83d commit 2168e0a
Showing 1 changed file with 11 additions and 13 deletions.
24 changes: 11 additions & 13 deletions src/Assignment4.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "korean-responsibility",
"id": "focused-rebel",
"metadata": {},
"source": [
"# Scene Recognition using Bag of Visual Words\n",
@@ -51,7 +51,7 @@
},
{
"cell_type": "markdown",
"id": "floral-pastor",
"id": "pleased-graphics",
"metadata": {},
"source": [
"Here, we use generators as data loaders. This ensures that more than one image is never in memory while processing. Because SIFT feature calculation is memory intensive, this helps in keeping the overall memory consumption of the process low. \n",
@@ -124,7 +124,7 @@
},
{
"cell_type": "markdown",
"id": "exclusive-wholesale",
"id": "transparent-restoration",
"metadata": {},
"source": [
"The following functions implement major parts of the algorithm:\n",
@@ -186,7 +186,6 @@
" \n",
" histograms = np.zeros((len(descriptors), k), dtype=np.float64)\n",
" idx = 0\n",
"# print(len(descriptors))\n",
" for i, desc in enumerate(descriptors):\n",
" for _ in range(desc.shape[0]):\n",
" histograms[i, labels[idx]] += 1\n",
@@ -407,7 +406,7 @@
},
{
"cell_type": "markdown",
"id": "structural-break",
"id": "violent-bathroom",
"metadata": {},
"source": [
"The confusion matrix on the test set is largely diagonal.\n",
@@ -492,7 +491,7 @@
},
{
"cell_type": "markdown",
"id": "voluntary-theta",
"id": "paperback-judgment",
"metadata": {},
"source": [
"We can't help but notice that several of the mis-classified images actually do resemble the predicted classes. The highway on the bottom-right has several green areas to the side, and so do the two waterfalls to its left - confusing the classifier into thinking of them as parks. Laundromats and Kitchens can easily be mixed up, because of the similar colours and arrangements of washing machines and chairs and tables."
@@ -558,7 +557,6 @@
" hist = w * bin_histograms(np.concatenate(sel_lab), sel_desc, k)\n",
" histograms = np.hstack((histograms, hist))\n",
" \n",
"# print(histograms.shape)\n",
" return histograms \n",
"\n",
"def cluster_pyramid(keypoints, descriptors, image_shape=SIZE, k=VOCAB_SIZE, level_weights=LEVEL_WEIGHTS, kmeans=None):\n",
@@ -578,7 +576,7 @@
},
{
"cell_type": "markdown",
"id": "static-metadata",
"id": "affecting-spray",
"metadata": {},
"source": [
"For purposes of efficiency, the code here reuses as much of the calculated entities as possible. The SIFT descriptors are calculated only once, across grids and levels, and those too are re-used from the run above. Because the clustering is not hierarchical, it is also reused from the run earlier."
@@ -700,15 +698,15 @@
},
{
"cell_type": "markdown",
"id": "homeless-opening",
"id": "trying-burton",
"metadata": {},
"source": [
"We obtain a significantly higher test accuracy of 70.625%, demonstrating the utility and robustness of this approach."
]
},
{
"cell_type": "markdown",
"id": "subtle-baghdad",
"id": "recreational-princeton",
"metadata": {},
"source": [
"### Without TF/IDF\n",
@@ -719,7 +717,7 @@
{
"cell_type": "code",
"execution_count": 18,
"id": "provincial-czech",
"id": "supported-fetish",
"metadata": {},
"outputs": [
{
@@ -801,7 +799,7 @@
},
{
"cell_type": "markdown",
"id": "unlikely-conflict",
"id": "trying-machine",
"metadata": {},
"source": [
"## Results and Experimentation\n",
@@ -814,7 +812,7 @@
"\n",
"We notice that the accuracy increases on increasing the vocabulary size, because of finer grained clustering. We report test accuracy scores of approximately $48.7\\%, 52.5\\%$ and $54.7\\%$ for vocabulary sizes of $100, 200$ and $500$, respectively. We however, keep the vocabulary size at $200$, for purposes of efficiency. \n",
"\n",
"The regularization parameter, $\\lambda$, also has an effect. Setting $C$ to $10$, instead of the default $1$ (higher $C$ corresponds to a lower $\\lambda$) improves the training and test accuracy slightly (test accuracy changes from $51.1%$ to the current $52.5$), although the perfect training accuracy hints at overfitting. \n",
"The regularization parameter, $\\lambda$, also has an effect. Setting $C$ to $10$, instead of the default $1$ (higher $C$ corresponds to a lower $\\lambda$) improves the training and test accuracy slightly (test accuracy changes from $51.1\\%$ to the current $52.5\\%$), although the perfect training accuracy hints at overfitting. \n",
"\n",
"TF-IDF re-weighting also improves the accuracy, although only slightly."
]
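
As a concrete illustration of the two knobs discussed above, here is a hedged sketch of TF-IDF re-weighting of the visual-word histograms followed by a linear SVM with `C=10`. The use of `LinearSVC` and the variable names are assumptions suggested by the $C$-parameter discussion, not details visible in the diff.

```python
import numpy as np
from sklearn.svm import LinearSVC


def tfidf_weights(train_histograms):
    """Smoothed inverse-document-frequency weights, one per visual word."""
    n_images = train_histograms.shape[0]
    # Number of training images in which each visual word occurs at least once.
    df = np.count_nonzero(train_histograms > 0, axis=0)
    return np.log((n_images + 1) / (df + 1)) + 1.0


# Illustrative usage, assuming X_train / X_test are bag-of-words histograms
# and y_train / y_test are the class labels (names assumed):
# idf = tfidf_weights(X_train)  # fit the weights on the training set only
# clf = LinearSVC(C=10, max_iter=10000).fit(X_train * idf, y_train)
# print(clf.score(X_test * idf, y_test))
```
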
