
migrate repo from gitlab to github
chulwoopack committed Feb 10, 2020
1 parent 6f03a90 commit 62f87db
Showing 290 changed files with 202,562 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
*.params filter=lfs diff=lfs merge=lfs -text
127 changes: 127 additions & 0 deletions Exploration - Digitization Type Differentiation/README.md
@@ -0,0 +1,127 @@
# Introduction

Exploration - Digitization Type Differentiation recognizes whether an image was digitized from scanned or microfilm material.

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

### Prerequisites

The required software and libraries are:
* Python 3.7
* MXNet 1.5
* CUDA 10.0 [if training on GPU]
* Matplotlib 3.1.1
* opencv-python 4.1
* numpy 1.17

### Installing

Step-by-step instructions for installing the required software and libraries; a quick import check follows the list.

1. Download Python 3.7 from <https://www.python.org/downloads/>
2. Download CUDA 10.0 from <https://developer.nvidia.com/cuda-toolkit-archive>
3. Run the downloaded installers
4. Open Terminal (macOS) or Command Prompt (Windows)
5. Install MXNet
```
pip install 'mxnet-cu100==1.5.1'
```
6. Install Matplotlib
```
python -m pip install -U 'matplotlib==3.1.1'
```
7. Install opencv-python
```
pip install 'opencv-python==4.1.*'
```
8. Install numpy
```
pip install 'numpy==1.17'
```
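
To verify the installation, here is a minimal import check (a sketch, assuming the pinned versions above installed cleanly; `mx.context.num_gpus()` reports 0 unless the CUDA build of MXNet is installed):
```
import mxnet as mx
import matplotlib
import cv2
import numpy as np

# Expect 1.5.1, 3.1.1, 4.1.x, and 1.17.x respectively.
print('MXNet:', mx.__version__)
print('Matplotlib:', matplotlib.__version__)
print('OpenCV:', cv2.__version__)
print('NumPy:', np.__version__)
print('GPUs visible to MXNet:', mx.context.num_gpus())
```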

## Data Acquisition
For the first and second iterations, the dataset was downloaded from the Library of Congress. To download the collection, please refer to [the Civil War collection on By The People](https://crowd.loc.gov/topics/civil-war/).

### Ground-truth
We sampled 1,200 images from the collection. Please refer to the following text files for the ground truth:
1. micro-test.txt
2. micro-train.txt
3. scan-test.txt
4. scan-train.txt
Note: if more labels are needed, please refer to our [ground-truth constructing tool](https://git.unl.edu/unl_loc_summer_collab/codebase/tree/master/utils/GroudtruthBuilder).
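
The internal format of these split files is not documented here; as a minimal sketch, assuming each non-empty line lists one image path:
```
# Hypothetical loader: assumes one image path per non-empty line.
def load_split(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

micro_train = load_split('micro-train.txt')
print(len(micro_train), 'microfilm training samples')
```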

## Running the training process

1. Configure the training:
   1. Set the path where the training log will be saved.
2. Download the dataset from
<https://git.unl.edu/unl_loc_summer_collab/labeled_data/tree/master/micrpfilm_scanning>
3. Copy the downloaded folder into the 'project5' folder
4. Run the training script
```
python train.py
```

### Format of the training logs

The training process generates four log files.
Training performance for each batch in the training set.
```
batch_stat_micro_affine_bak.txt
```
Testing performance for each batch in the testing set.
```
test_batch_stat_micro_affine_bak.txt
```
Each line of the batch-level logs consists of ten fields separated by "|" (a parsing sketch follows the list):
1. the batch ID;
2. the confusion table for the batch;
3. the number of true positives;
4. the number of true negatives;
5. the number of false positives;
6. the number of false negatives;
7. the average accuracy;
8. the average precision;
9. the average recall;
10. the average F1 score.
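
A minimal parsing sketch for these logs, assuming exactly the ten fields above in that order (how the confusion table is serialized within a line is not documented, so it is kept as raw text):
```
# Sketch: parse a batch-level log into dictionaries (field order assumed from the list above).
def parse_batch_log(path):
    records = []
    with open(path) as f:
        for line in f:
            fields = [p.strip() for p in line.split('|')]
            if len(fields) != 10:
                continue  # skip malformed lines
            records.append({
                'batch_id': int(fields[0]),
                'confusion_table': fields[1],  # raw text; format not documented
                'tp': int(fields[2]), 'tn': int(fields[3]),
                'fp': int(fields[4]), 'fn': int(fields[5]),
                'accuracy': float(fields[6]), 'precision': float(fields[7]),
                'recall': float(fields[8]), 'f1': float(fields[9]),
            })
    return records

stats = parse_batch_log('batch_stat_micro_affine_bak.txt')
```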

Training performance for each training step.
```
epoch_stat_micro_affine_bak.txt
```
Testing performance for each training step.
```
epoch_stat_test_micro_affine_bak.txt
```
Each line of the training-step-level logs consists of eleven fields separated by "|" (the metric definitions are sketched after the list):
1. the training-step ID;
2. the time elapsed for the training step;
3. the confusion table for the training step;
4. the number of true positives;
5. the number of true negatives;
6. the number of false positives;
7. the number of false negatives;
8. the average accuracy;
9. the average precision;
10. the average recall;
11. the average F1 score.
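
The logged averages are assumed to follow the standard definitions of these metrics in terms of the four counts; for reference:
```
# Standard metric definitions from the four counts (assumed, not taken from the training code).
def metrics(tp, tn, fp, fn):
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```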

## Built With

* [Python](https://www.python.org/) - The programming language
* [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit) - Enables GPU acceleration for model training
* [MXNet](https://mxnet.apache.org/) - Deep learning framework

## Contributing

* Digitization type differentiation using deep learning is promising.
* Recognizing the digitization type automatically can enrich metadata tagging.

## Authors

* **Yi Liu** - University of Nebraska-Lincoln - *email* - [email protected]
163 changes: 163 additions & 0 deletions Exploration - Digitization Type Differentiation/augmentation.py
@@ -0,0 +1,163 @@
#!/usr/bin/env python
# coding: utf-8



import os
from math import log, floor, sqrt

import mxnet as mx

import numpy as np
import numpy.random as random
import cv2




class ToNDArray:
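    # Convert an HxW or HxWxC numpy image (and optional label) to a CHW mx.nd.array.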
def __call__(self, img, lbl=None):

# print(img.shape)
if len(img.shape) == 2:
h,w = img.shape
img = img.reshape(h,w,1)
img = mx.nd.array(np.moveaxis(img,-1,0))

if lbl is not None:
if len(lbl.shape) == 2:
h,w = lbl.shape
lbl = lbl.reshape(h,w,1)
lbl = mx.nd.array(np.moveaxis(lbl,-1,0)) #, dtype=np.int32)

return img, lbl

class Normalize:
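    # Normalize channels with a fixed per-channel mean and std given at construction.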
def __init__(self, mean, std):
self.mean = mx.nd.array(mean)
self.std = mx.nd.array(std)

def __call__(self, img, lbl=None):
img = mx.nd.transpose(img, (1, 2, 0))
img = mx.image.color_normalize(img, self.mean, self.std)
img = mx.nd.transpose(img, (2, 0, 1))

return img, lbl

class AdaptNormalize:
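    # Normalize with the image's own mean and standard deviation.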
def __call__(self, img, lbl=None):
avg = mx.nd.mean(img)
std = np.std(img.asnumpy())
img = mx.nd.transpose(img, (1, 2, 0))
img = mx.image.color_normalize(img, avg, std)
img = mx.nd.transpose(img, (2, 0, 1))

return img, lbl

class Compose:
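    # Apply a sequence of transforms to (img, lbl) in order.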
def __init__(self, trans):
self.trans = trans

def __call__(self, img, lbl=None):
for t in self.trans:
img, lbl = t(img, lbl)
return img, lbl

class AdaptResize:
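    # Resize so the image contains roughly `resolution` pixels, preserving aspect ratio.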
def __init__(self, resolution):
self.resolution = resolution

def __call__(self, img, lbl=None):

if len(img.shape) == 3:
nb_px = np.prod(img[:,:,0].shape)
else:
nb_px = np.prod(img.shape)
factor = sqrt(nb_px / self.resolution)
prev_h = img.shape[0]
prev_w = img.shape[1]
w = floor(prev_w // factor)
h = floor(prev_h // factor)
# print(w,h)
        # Pass interpolation by keyword: passed positionally, the 0s bind to dst/fx
        # and the flag to fy, leaving interpolation at its default.
        img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)

if lbl is not None:
            lbl = cv2.resize(lbl, (w, h), interpolation=cv2.INTER_NEAREST)

return img, lbl

class Resize:
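    # Resize to a fixed width and height.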
def __init__(self, w, h):
self.w = w
self.h = h

def __call__(self, img, lbl = None):
        # interpolation must be passed by keyword (see AdaptResize above).
        img = cv2.resize(img, (self.w, self.h), interpolation=cv2.INTER_LINEAR)
if lbl is not None:
            lbl = cv2.resize(lbl, (self.w, self.h), interpolation=cv2.INTER_NEAREST)

return img, lbl

class RandomCrop:
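    # Take a random square crop, optionally of fixed size and with random scale jitter.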
    def __init__(self, crop_size=None, scale=None):
        self.crop_size = crop_size
        self.scale = scale

def __call__(self, img, lbl=None):
if self.crop_size:
crop = self.crop_size
else:
crop = min(img.shape[0], img.shape[1])

        # Clamp the crop to the smaller image side.
        if crop > min(img.shape[0], img.shape[1]):
            crop = min(img.shape[0], img.shape[1])
if self.scale:
factor = random.uniform(self.scale, 1.0)
crop = int(round(crop * factor))

        # randint's upper bound is exclusive; +1 allows a crop flush with the border
        # (the original raised ValueError whenever crop equaled an image dimension).
        x = random.randint(0, img.shape[1] - crop + 1)
        y = random.randint(0, img.shape[0] - crop + 1)

        # Slice rows/cols only, so grayscale (2-D) inputs also work.
        img = img[y:y+crop, x:x+crop]
        if lbl is not None:
            lbl = lbl[y:y+crop, x:x+crop]
return img, lbl

class RandomAffine:
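    # Random rotation about the image center combined with random horizontal/vertical flips.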
def __init__(self):
pass

def __call__(self, img, lbl=None):
        theta = random.uniform(-np.pi, np.pi)
flipx = random.choice([-1,1])
flipy = random.choice([-1,1])
imgh = img.shape[0]
imgw = img.shape[1]
T0 = np.array([[1,0,-imgw/2.],[0,1,-imgh/2.],[0,0,1]])
S = np.array([[flipx,0,0],[0, flipy,0],[0,0,1]])
R = np.array([[np.cos(theta), np.sin(theta), 0], [-np.sin(theta), np.cos(theta), 0],[0,0,1]])
T1 = np.array([[1,0,imgw/2.],[0,1,imgh/2.],[0,0,1]])
M = np.dot(S, T0)
M = np.dot(R, M)
M = np.dot(T1, M)
M = M[0:2,:]

img = cv2.warpAffine(img, M, dsize=(imgw, imgh), flags=cv2.INTER_LINEAR)
if lbl is not None:
lbl = cv2.warpAffine(lbl, M, dsize=(imgw, imgh), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT, borderValue=0)

return img, lbl

