Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
6f03a90
commit 62f87db
Showing
290 changed files
with
202,562 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
*.params filter=lfs diff=lfs merge=lfs -text |
127 changes: 127 additions & 0 deletions
Exploration - Digitization Type Differentiation/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Introduction | ||
|
||
Exploration - Digitization Type Differentiation classifies whether an image was digitized by scanning the original material or from microfilm. | ||
|
||
## Getting Started | ||
|
||
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system. | ||
|
||
### Prerequisites | ||
|
||
The required software and libraries are: | ||
* Python 3.7 | ||
* MXNet 1.5 | ||
* CUDA 10.0 [if training on GPU] | ||
* Matplotlib 3.1.1 | ||
* opencv-python 4.1 | ||
* numpy 1.17 | ||
|
||
### Installing | ||
|
||
Step-by-step instructions for installing the required software and libraries. | ||
|
||
1. Download Python 3.7 from <https://www.python.org/downloads/> | ||
2. Download CUDA 10.0 from <https://developer.nvidia.com/cuda-toolkit-archive> | ||
3. Run the downloaded installers | ||
4. Open Terminal (macOS) or Command Prompt (Windows) | ||
5. Install MXNet | ||
``` | ||
pip install 'mxnet-cu100==1.5.1' | ||
``` | ||
6. Install Matplotlib | ||
``` | ||
python -m pip install -U 'matplotlib==3.1.1' | ||
``` | ||
7. Install opencv-python | ||
``` | ||
pip install 'opencv-python==4.1.*' | ||
``` | ||
8. Install numpy | ||
``` | ||
pip install 'numpy==1.17' | ||
``` | ||
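After the installation steps, a quick check can confirm that the packages are visible to the Python interpreter. The following is a minimal sketch and not part of the project: the helper name `check_packages` is hypothetical, and `mxnet`, `matplotlib`, `cv2`, and `numpy` are assumed to be the import names of the packages listed under Prerequisites.

```python
import importlib.util


def check_packages(module_names):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None
            for name in module_names}


if __name__ == "__main__":
    # Import names for MXNet, Matplotlib, opencv-python, and numpy.
    required = ["mxnet", "matplotlib", "cv2", "numpy"]
    for name, found in check_packages(required).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

The check only reports whether each module can be located; it does not verify version numbers or GPU support.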
|
||
## Data Acquisition | ||
For the first and second iteration: | ||
The dataset was downloaded from the Library of Congress. To download the collection, please refer to [the Civil War collection on By The People](https://crowd.loc.gov/topics/civil-war/). | ||
|
||
### Ground-truth | ||
We sampled 1,200 images from the collection. Please refer to the following text files for the ground truth: | ||
1. micro-test.txt | ||
2. micro-train.txt | ||
3. scan-test.txt | ||
4. scan-train.txt | ||
Note: If more labels are needed, please refer to our [ground-truth constructing tool](https://git.unl.edu/unl_loc_summer_collab/codebase/tree/master/utils/GroudtruthBuilder). | ||
|
||
## Running the training process | ||
|
||
1. Configure the training | ||
1.1 Set the path where the training log is saved | ||
2. Download the dataset from | ||
<https://git.unl.edu/unl_loc_summer_collab/labeled_data/tree/master/micrpfilm_scanning> | ||
3. Copy the downloaded dataset folder into the downloaded 'project5' folder | ||
4. Run the training script | ||
``` | ||
python train.py | ||
``` | ||
|
||
### Formats of the training log | ||
|
||
The training process generates four log files. | ||
Training performance for each batch in the training set. | ||
``` | ||
batch_stat_micro_affine_bak.txt | ||
``` | ||
Testing performance for each batch in the testing set. | ||
``` | ||
test_batch_stat_micro_affine_bak.txt | ||
``` | ||
Each line of a batch-specific log consists of ten fields separated by "|". | ||
They are: | ||
1. the ID number of the batch; | ||
2. the confusion table for the batch; | ||
3. the number of true positives for the batch; | ||
4. the number of true negatives for the batch; | ||
5. the number of false positives for the batch; | ||
6. the number of false negatives for the batch; | ||
7. the average accuracy for the batch; | ||
8. the average precision for the batch; | ||
9. the average recall for the batch; | ||
10. the average F1 score for the batch. | ||
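A batch-specific log line can be split into its ten fields as follows. This is a minimal sketch under stated assumptions, not the project's own tooling: the names `BatchStat` and `parse_batch_line` are hypothetical, the example line is invented for illustration, and the confusion table is kept as raw text because its exact layout is not documented here (it is assumed not to contain "|").

```python
from typing import NamedTuple


class BatchStat(NamedTuple):
    batch_id: int
    confusion_table: str  # raw text; exact layout is an unknown here
    true_positive: int
    true_negative: int
    false_positive: int
    false_negative: int
    accuracy: float
    precision: float
    recall: float
    f1: float


def parse_batch_line(line):
    """Split one '|'-separated batch log line into a BatchStat."""
    parts = [p.strip() for p in line.strip().split("|")]
    if len(parts) != 10:
        raise ValueError(f"expected 10 fields, got {len(parts)}")
    # int(float(...)) tolerates counts written as "19" or "19.0".
    counts = [int(float(p)) for p in parts[2:6]]
    scores = [float(p) for p in parts[6:10]]
    return BatchStat(int(float(parts[0])), parts[1], *counts, *scores)


# Invented example line, for illustration only.
EXAMPLE = "3|[[40, 2], [3, 19]]|19|40|2|3|0.921875|0.904762|0.863636|0.883721"
stat = parse_batch_line(EXAMPLE)
print(stat.batch_id, stat.accuracy)
```

Returning a named tuple keeps the field order identical to the list above while letting callers read values by name.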
|
||
Training performance for each training step. | ||
``` | ||
epoch_stat_micro_affine_bak.txt | ||
``` | ||
Testing performance for each training step. | ||
``` | ||
epoch_stat_test_micro_affine_bak.txt | ||
``` | ||
Each line of a training-step-specific log consists of eleven fields separated by "|". | ||
They are: | ||
1. the ID number of the training step; | ||
2. the time elapsed for the training step; | ||
3. the confusion table for the training step; | ||
4. the number of true positives for the training step; | ||
5. the number of true negatives for the training step; | ||
6. the number of false positives for the training step; | ||
7. the number of false negatives for the training step; | ||
8. the average accuracy for the training step; | ||
9. the average precision for the training step; | ||
10. the average recall for the training step; | ||
11. the average F1 score for the training step. | ||
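The accuracy, precision, recall, and F1 fields relate to the four logged counts through the standard binary-classification definitions. The sketch below assumes those standard definitions; how the training script averages its values is not shown in this README, so logged numbers may differ slightly. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute (accuracy, precision, recall, f1) from confusion counts.

    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Invented counts, for illustration only.
acc, prec, rec, f1 = classification_metrics(tp=19, tn=40, fp=2, fn=3)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

Recomputing the metrics from the counts is a simple way to sanity-check a log line.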
|
||
## Built With | ||
|
||
* [Python](https://www.python.org/) - The programming language | ||
* [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit) - Enable GPU for model training | ||
* [MXNet](https://mxnet.apache.org/) - Deep learning framework | ||
|
||
## Contributing | ||
|
||
* Digitization type differentiation using deep learning is promising. | ||
* It can enrich metadata tagging by recognizing the digitization type automatically. | ||
|
||
## Authors | ||
|
||
* **Yi Liu** - University of Nebraska-Lincoln - *email* - [email protected] |
163 changes: 163 additions & 0 deletions
Exploration - Digitization Type Differentiation/augmentation.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,163 @@ | ||
#!/usr/bin/env python | ||
# coding: utf-8 | ||
|
||
# In[ ]: | ||
|
||
|
||
import os | ||
from math import log, floor, sqrt | ||
|
||
import mxnet as mx | ||
|
||
import numpy as np | ||
import numpy.random as random | ||
import cv2 | ||
|
||
|
||
# In[ ]: | ||
|
||
|
||
class ToNDArray(): | ||
    """Convert an HWC (or HW) numpy image, and optional label, to a CHW mx.nd.NDArray.""" | ||
    def __call__(self, img, lbl=None): | ||
|
||
        # print(img.shape) | ||
        if len(img.shape) == 2: | ||
            # Grayscale input: add a trailing channel axis. | ||
            h,w = img.shape | ||
            img = img.reshape(h,w,1) | ||
        img = mx.nd.array(np.moveaxis(img,-1,0)) | ||
|
||
        if lbl is not None: | ||
            if len(lbl.shape) == 2: | ||
                h,w = lbl.shape | ||
                lbl = lbl.reshape(h,w,1) | ||
            lbl = mx.nd.array(np.moveaxis(lbl,-1,0)) #, dtype=np.int32) | ||
|
||
        return img, lbl | ||
|
||
class Normalize: | ||
    """Normalize a CHW NDArray image with a fixed mean and standard deviation.""" | ||
    def __init__(self, mean, std): | ||
        self.mean = mx.nd.array(mean) | ||
        self.std = mx.nd.array(std) | ||
|
||
    def __call__(self, img, lbl=None): | ||
        # color_normalize works on HWC data, so transpose there and back. | ||
        img = mx.nd.transpose(img, (1, 2, 0)) | ||
        img = mx.image.color_normalize(img, self.mean, self.std) | ||
        img = mx.nd.transpose(img, (2, 0, 1)) | ||
|
||
        return img, lbl | ||
|
||
class AdaptNormalize: | ||
    """Normalize a CHW NDArray image with its own mean and standard deviation.""" | ||
    def __call__(self, img, lbl=None): | ||
        avg = mx.nd.mean(img) | ||
        std = np.std(img.asnumpy()) | ||
        img = mx.nd.transpose(img, (1, 2, 0)) | ||
        img = mx.image.color_normalize(img, avg, std) | ||
        img = mx.nd.transpose(img, (2, 0, 1)) | ||
|
||
        return img, lbl | ||
|
||
class Compose: | ||
    """Apply a list of (img, lbl) transforms in order.""" | ||
    def __init__(self, trans): | ||
        self.trans = trans | ||
|
||
    def __call__(self, img, lbl=None): | ||
        for t in self.trans: | ||
            img, lbl = t(img, lbl) | ||
        return img, lbl | ||
|
||
class AdaptResize: | ||
    """Resize so the image has roughly `resolution` pixels, keeping the aspect ratio.""" | ||
    def __init__(self, resolution): | ||
        self.resolution = resolution | ||
|
||
    def __call__(self, img, lbl=None): | ||
|
||
        if len(img.shape) == 3: | ||
            nb_px = np.prod(img[:,:,0].shape) | ||
        else: | ||
            nb_px = np.prod(img.shape) | ||
        factor = sqrt(nb_px / self.resolution) | ||
        prev_h = img.shape[0] | ||
        prev_w = img.shape[1] | ||
        w = floor(prev_w // factor) | ||
        h = floor(prev_h // factor) | ||
        # print(w,h) | ||
        # Pass interpolation by keyword: positionally, the arguments after dsize are dst, fx, fy. | ||
        img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR) | ||
|
||
        if lbl is not None: | ||
            lbl = cv2.resize(lbl, (w, h), interpolation=cv2.INTER_NEAREST) | ||
|
||
        return img, lbl | ||
|
||
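class Resize: | ||
    """Resize the image (and optional label) to a fixed width and height.""" | ||
    def __init__(self, w, h): | ||
        self.w = w | ||
        self.h = h | ||
|
||
    def __call__(self, img, lbl = None): | ||
        # Pass interpolation by keyword: positionally, the arguments after dsize are dst, fx, fy. | ||
        img = cv2.resize(img, (self.w, self.h), interpolation=cv2.INTER_LINEAR) | ||
        if lbl is not None: | ||
            lbl = cv2.resize(lbl, (self.w, self.h), interpolation=cv2.INTER_NEAREST) | ||
|
||
        return img, lbl | ||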
|
||
class RandomCrop: | ||
    """Take a random square crop, optionally shrunk by a random factor in [scale, 1).""" | ||
    def __init__(self, crop_size=None, scale=None): | ||
        # assert min_scale <= max_scale | ||
        self.crop_size = crop_size | ||
        self.scale = scale | ||
        # self.min_scale = min_scale | ||
        # self.max_scale = max_scale | ||
|
||
    def __call__(self, img, lbl=None): | ||
        if self.crop_size: | ||
            crop = self.crop_size | ||
        else: | ||
            crop = min(img.shape[0], img.shape[1]) | ||
|
||
        # Clamp the crop to the shorter image side. | ||
        if crop > min(img.shape[0], img.shape[1]): | ||
            crop = min(img.shape[0], img.shape[1]) | ||
            print(crop, img.shape[0], img.shape[1]) | ||
        if self.scale: | ||
            factor = random.uniform(self.scale, 1.0) | ||
            crop = int(round(crop * factor)) | ||
|
||
        # numpy's randint excludes the upper bound, so add 1 to allow the last valid offset | ||
        # (and to avoid a ValueError when the crop spans the whole side). | ||
        x = random.randint(0, img.shape[1] - crop + 1) | ||
        y = random.randint(0, img.shape[0] - crop + 1) | ||
|
||
        # Slice only the spatial axes so 2-D (grayscale) arrays also work. | ||
        img = img[y:y+crop, x:x+crop] | ||
        if lbl is not None: | ||
            lbl = lbl[y:y+crop, x:x+crop] | ||
        return img, lbl | ||
|
||
class RandomAffine: | ||
    """Apply a random flip and rotation about the image centre.""" | ||
    def __init__(self): | ||
        pass | ||
|
||
    def __call__(self, img, lbl=None): | ||
        #scale = random.uniform(1, 1) | ||
        theta = random.uniform(-np.pi, np.pi) | ||
        flipx = random.choice([-1,1]) | ||
        flipy = random.choice([-1,1]) | ||
        imgh = img.shape[0] | ||
        imgw = img.shape[1] | ||
        # T0 moves the centre to the origin, S flips, R rotates, T1 moves the centre back. | ||
        T0 = np.array([[1,0,-imgw/2.],[0,1,-imgh/2.],[0,0,1]]) | ||
        S = np.array([[flipx,0,0],[0, flipy,0],[0,0,1]]) | ||
        R = np.array([[np.cos(theta), np.sin(theta), 0], [-np.sin(theta), np.cos(theta), 0],[0,0,1]]) | ||
        T1 = np.array([[1,0,imgw/2.],[0,1,imgh/2.],[0,0,1]]) | ||
        M = np.dot(S, T0) | ||
        M = np.dot(R, M) | ||
        M = np.dot(T1, M) | ||
        # warpAffine expects the top two rows of the 3x3 homogeneous matrix. | ||
        M = M[0:2,:] | ||
|
||
        img = cv2.warpAffine(img, M, dsize=(imgw, imgh), flags=cv2.INTER_LINEAR) | ||
        if lbl is not None: | ||
            lbl = cv2.warpAffine(lbl, M, dsize=(imgw, imgh), flags=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT, borderValue=0) | ||
|
||
        return img, lbl | ||
|
||
|
||
# In[ ]: | ||
|
||
|
||
|
||
|
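The matrix composed in `RandomAffine` translates the image centre to the origin, applies the flips, rotates, and translates back. The NumPy-only sketch below rebuilds that matrix with fixed rather than random parameters (the function name `build_affine` is hypothetical, and the image size is an arbitrary example) to illustrate that the image centre is a fixed point of the transform.

```python
import numpy as np


def build_affine(imgw, imgh, theta, flipx=1, flipy=1):
    """Return the 2x3 matrix T1 @ R @ S @ T0 for a flip and rotation about the centre."""
    T0 = np.array([[1, 0, -imgw / 2.], [0, 1, -imgh / 2.], [0, 0, 1]])
    S = np.array([[flipx, 0, 0], [0, flipy, 0], [0, 0, 1]])
    R = np.array([[np.cos(theta), np.sin(theta), 0],
                  [-np.sin(theta), np.cos(theta), 0],
                  [0, 0, 1]])
    T1 = np.array([[1, 0, imgw / 2.], [0, 1, imgh / 2.], [0, 0, 1]])
    M = T1 @ R @ S @ T0
    return M[0:2, :]


# Arbitrary example: a 640x480 image, 60-degree rotation, horizontal flip.
M = build_affine(640, 480, theta=np.pi / 3, flipx=-1)
centre = np.array([320.0, 240.0, 1.0])  # homogeneous coordinates (x, y, 1)
print(M @ centre)  # the centre maps back onto itself, up to rounding
```

Because the output size equals the input size, corners rotated outside the frame are cut off and the uncovered regions are filled with the border value.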