Finish updating Sankar et al.
Guillaume Lemaitre committed Apr 18, 2016
1 parent 271f68a commit 23c2541
Showing 28 changed files with 182 additions and 4,118 deletions.
113 changes: 112 additions & 1 deletion README.md
@@ -1,4 +1,115 @@
Classification of SD-OCT volumes for DME detection: an anomaly detection approach
=================================================================================

S. Sankar, D. Sidibé, C. Y. Cheung, T. Y. Wong, E. Lamoureux, D. Milea, F. Meriaudeau, “Classification of SD-OCT volumes for DME detection: an anomaly detection approach”, SPIE Medical Imaging 2016, San Diego, USA.
```
@proceeding{sankar2016classification,
author = {Sankar, S. and Sidib\'{e}, D. and Cheung, C. Y. and Wong, T. Y. and Lamoureux, E. and Milea, D. and Meriaudeau, F.},
title = {Classification of SD-OCT volumes for DME detection: an anomaly detection approach},
journal = {Proc. SPIE},
volume = {9785},
pages = {97852O-97852O-6},
year = {2016}
}
```

How to use the pipeline?
-------

### Pre-processing pipeline

The following pre-processing routines were applied (an illustrative sketch is given at the end of this section):

- Flattening,
- Cropping.

#### Data variables

In the file `pipeline/feature-preprocessing/pipeline_preprocessing.m`, you need to set the following variables:

- `data_directory`: this directory contains the original SD-OCT volumes. The format used was `.img`.
- `store_directory`: this directory corresponds to the place where the resulting data will be stored. The format used was `.mat`.

#### Algorithm variables

The variables that are not specified in the initial publication and that can be changed are:

- `x_size`, `y_size`, `z_size`: the original size of the SD-OCT volume, needed to read the `.img` files.
- `kernelratio`, `windowratio`, `filterstrength`: the NLM denoising parameters.
- `h_over_rpe`, `h_under_rpe`, `width_crop`: the different variables driving the cropping.
- `thres_method`, `thres_val`: the thresholding method and its associated value used to binarize the image.
- `gpu_enable`: option to enable GPU computation.
- `median_sz`: size of the kernel used by the median filter.
- `se_op`, `se_cl`: sizes of the structuring elements used for the opening and closing operations.

#### Run the pipeline

From the root directory, launch MATLAB and run:

```
>> run pipeline/feature-preprocessing/pipeline_preprocessing.m
```
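
For reference, here is a minimal sketch of what the flattening and cropping steps could look like on a single B-scan. This is not the code of `pipeline/feature-preprocessing/pipeline_preprocessing.m`: the RPE estimate (brightest pixel of each A-scan smoothed by a low-order polynomial fit) and all numerical values are assumptions made for illustration.

```
% Illustrative sketch only -- all values are placeholders, not the published settings.
bscan = rand(512, 1024);   % one denoised B-scan (rows = depth, columns = A-scans)
h_over_rpe  = 160;         % rows kept above the estimated RPE
h_under_rpe = 50;          % rows kept below the estimated RPE
width_crop  = 400;         % half-width of the lateral crop

% Rough RPE estimate: brightest pixel of each A-scan, smoothed by a 2nd-order fit
[~, rpe_row] = max(bscan, [], 1);
cols = 1:size(bscan, 2);
p = polyfit(cols - mean(cols), double(rpe_row), 2);
rpe_fit = round(polyval(p, cols - mean(cols)));

% Flattening: shift every column so that the fitted RPE lies on a common row
ref_row = round(mean(rpe_fit));
flat = zeros(size(bscan));
for c = cols
    flat(:, c) = circshift(bscan(:, c), ref_row - rpe_fit(c));
end

% Cropping: keep a band around the flattened RPE and a central lateral window
rows_kept = max(ref_row - h_over_rpe, 1) : min(ref_row + h_under_rpe, size(bscan, 1));
centre = round(size(bscan, 2) / 2);
cols_kept = max(centre - width_crop, 1) : min(centre + width_crop, size(bscan, 2));
cropped = flat(rows_kept, cols_kept);
```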

### Extraction pipeline

For this pipeline, the following features were extracted:

- PCA on vectorized B-scans.

#### Data variables

In the file `pipeline/feature-extraction/pipeline_extraction.m`, you need to set the following variables:

- `data_directory`: this directory contains the pre-processed SD-OCT volumes. The format used was `.mat`.
- `store_directory`: this directory corresponds to the place where the resulting data will be stored. The format used was `.mat`.
- `pca_components`: this is the number of components to keep when reducing the dimensionality with PCA.

#### Run the pipeline

From the root directory, launch MATLAB and run:

```
>> run pipeline/feature-extraction/pipeline_extraction.m
```
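
As a rough illustration of this step, and not the actual `pipeline/feature-extraction/pipeline_extraction.m`, PCA on the vectorized B-scans of one volume could look like the sketch below; the dimensions and the use of MATLAB's `pca` function are assumptions.

```
% Illustrative sketch only: placeholder dimensions, much smaller than real B-scans.
n_bscans = 128;                % B-scans per SD-OCT volume
vol = rand(n_bscans, 2000);    % each row is one vectorized (pre-processed) B-scan
pca_components = 50;           % placeholder; the pipeline uses a larger value

% pca() centres the data and returns the principal component scores;
% each B-scan is then represented by its first pca_components scores.
[~, scores] = pca(vol);
features = scores(:, 1:min(pca_components, size(scores, 2)));
```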

### Classification pipeline

The classification method used was:

- GMM modelling.

#### Data variables

In the file `pipeline/feature-classification/pipeline_classifier.m`, you need to set the following variables:

- `data_directory`: this directory contains the features extracted from the SD-OCT volumes. The format used was `.mat`.
- `store_directory`: this directory corresponds to the place where the resulting data will be stored. The format used was `.mat`.
- `gt_file`: this is the file containing the label of each volume. You will have to define your own strategy to generate this file.
- `gmm_k`: this is the number of mixture components of the GMM.
- `pca_components`: this is the number of PCA components used during the extraction step (it must match the value used in the extraction pipeline).
- `mahal_thresh`: the threshold above which a B-scan is considered abnormal.
- `n_slices_thres`: the minimum number of abnormal B-scans required to classify the volume as DME.

#### Run the pipeline

From the root directory, launch MATLAB and run:

```
>> run pipeline/feature-classification/pipeline_classifier.m
```
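
The rationale of the anomaly-detection rule in `pipeline_classifier.m` is that, for a d-dimensional Gaussian component, the squared Mahalanobis distance of a sample follows (approximately) a chi-square distribution with d degrees of freedom, so `chi2inv(0.95, pca_components)` gives a 95% cut-off. The toy sketch below illustrates this rule; the data, the number of mixture components, and all numerical values are placeholders, not the pipeline code itself.

```
% Toy sketch of the decision rule -- placeholder data and parameters.
d = 10;                           % feature dimension (pca_components in the pipeline)
normal_train = randn(2000, d);    % placeholder "normal" training features
test_volume = randn(128, d);      % placeholder volume: one feature vector per B-scan

% Fit a GMM on normal data only
gmm_model = fitgmdist(normal_train, 2, 'RegularizationValue', 1e-3);

% Squared Mahalanobis distance of each B-scan to its closest component
mahal_dist = mahal(gmm_model, test_volume);
mahal_dist_near = min(mahal_dist, [], 2);

% A B-scan is abnormal if it falls outside the 95% chi-square envelope
mahal_thresh = chi2inv(0.95, d);
n_abnormal_slices = nnz(mahal_dist_near > mahal_thresh);

% The volume is labelled DME when enough B-scans are abnormal
n_slices_thres = 15;
is_dme = n_abnormal_slices > n_slices_thres;
```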

### Validation pipeline

#### Data variables

In the file `pipeline/feature-validation/pipeline_validation.m`, you need to set the following variables:

- `data_directory`: this directory contains the classification results. The format used was `.mat`.
- `gt_file`: this is the file containing the label of each volume. You will have to define your own strategy to generate this file.

#### Run the pipeline

From the root directory, launch MATLAB and run:

```
>> run pipeline/feature-validation/pipeline_validation.m
```
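
The function `metric_confusion_matrix` comes from the third-party `protoclass_matlab` toolbox and its implementation is not shown here. As a reminder of the usual definitions, the sketch below derives a few of the reported metrics directly from a binary confusion matrix; it is a hypothetical stand-in with placeholder labels, not the toolbox function.

```
% Hypothetical sketch -- standard definitions, NOT the protoclass_matlab code.
pred_label = [ 1 -1  1  1 -1 -1 ]';   % placeholder predictions
gt_label   = [ 1 -1 -1  1 -1  1 ]';   % placeholder ground truth

tp = nnz(pred_label ==  1 & gt_label ==  1);
tn = nnz(pred_label == -1 & gt_label == -1);
fp = nnz(pred_label ==  1 & gt_label == -1);
fn = nnz(pred_label == -1 & gt_label ==  1);

sens = tp / (tp + fn);                % sensitivity (recall)
spec = tn / (tn + fp);                % specificity
prec = tp / (tp + fp);                % precision
acc  = (tp + tn) / (tp + tn + fp + fn);
f1s  = 2 * prec * sens / (prec + sens);

disp(['Sensitivity: ', num2str(sens)]);
disp(['Specificity: ', num2str(spec)]);
```
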
43 changes: 0 additions & 43 deletions pipeline/check_outlier.m

This file was deleted.

40 changes: 0 additions & 40 deletions pipeline/crop_vols.m

This file was deleted.

67 changes: 0 additions & 67 deletions pipeline/do_pca.m

This file was deleted.

24 changes: 17 additions & 7 deletions pipeline/feature-classification/pipeline_classifier.m
@@ -23,15 +23,17 @@
 idx_class_pos = find( data_label == 1 );
 idx_class_neg = find( data_label == -1 );
 
-% Number of mixture components
-gmm_k = 8;
+% Parameters for the GMM
+rng(1);
+gmm_k = 15;
+options = statset('MaxIter', 1000);
 
 % Mahalanobis threshold
-pca_components = 300;
+pca_components = 500;
 mahal_thresh = chi2inv(0.95, pca_components);
 
 % Number of abnormal slices tolerated
-n_slices_thres = 32;
+n_slices_thres = 15;
 
 % Number of slices per volume
 x_size = 128;
@@ -50,13 +52,17 @@
 load(strcat(data_directory, filename_cv));
 
 % Apply a GMM learning on the training set
-gmm_model = fitgmdist(training_data, gmm_k);
+gmm_model = fitgmdist(training_data, gmm_k, ...
+    'Options', options, ...
+    'CovarianceType', 'diagonal', ...
+    'RegularizationValue', 0.001, ...
+    'Replicates', 10);
 
 test_vol = 1;
 % Test the gmm_model and count the number of outliers
 for test_id = 1 : x_size : size(testing_data,1)
     % Extract the data to use in the gmm model
-    t_data = testing_data(test_id : test_id + x_size - 1,:));
+    t_data = testing_data(test_id : test_id + x_size - 1,:);
 
     % Compute the Mahalanobis distance for all the slices
     mahal_dist = mahal(gmm_model, t_data);
@@ -65,7 +71,11 @@
     mahal_dist_near = min(mahal_dist, [], 2);
 
     % Check how many slices are abnormal
-    n_abnormal_slices = nnz(mahal_dist_near > mahal_thresh);
+    % Apply a median filter so that isolated detections are discarded and
+    % only consecutive abnormal slices are counted
+    n_abnormal_slices = nnz(medfilt1(single(mahal_dist_near > mahal_thresh)));
+
+    disp(['Number of estimated outliers: ', num2str(n_abnormal_slices)]);
 
     % Assign the predicted label
     if n_abnormal_slices > n_slices_thres

2 changes: 1 addition & 1 deletion pipeline/feature-extraction/pipeline_extraction.m
@@ -24,7 +24,7 @@
 idx_class_neg = find( data_label == -1 );
 
 % Number of components for the PCA
-pca_components = 300;
+pca_components = 500;
 
 % poolobj = parpool('local', 48);

52 changes: 52 additions & 0 deletions pipeline/feature-validation/pipeline_validation.m
@@ -0,0 +1,52 @@
clear all;
close all;
clc;

% Execute the setup for protoclass matlab
run('../../../../third-party/protoclass_matlab/setup.m');

% Refer to the classification pipeline to know how the testing set
% was created
% Location of the ground-truth
gt_file = '/data/retinopathy/OCT/SERI/data.xls';

% Load the spreadsheet data
[~, ~, raw_data] = xlsread(gt_file);
% Extract the information from the raw data
% Store the filename inside a cell
filename = { raw_data{ 2:end, 1} };
% Store the label information into a vector
data_label = [ raw_data{ 2:end, 2 } ];
% Get the index of positive and negative class
idx_class_pos = find( data_label == 1 );
idx_class_neg = find( data_label == -1 );

gt_label = [];
% We can now create the GT labels
for idx_cv_lpo = 1:length(idx_class_pos)
% Concatenate the value as in the classification pipeline
gt_label = [ gt_label 1 -1 ];
end

% Load the results data
results_filename = ['/data/retinopathy/OCT/SERI/results/' ...
'sankar_2016/predicition.mat'];
load(results_filename);

% Linearize the vector loaded
pred_label = pred_label_cv';
pred_label = pred_label(:);

% Get the statistic
[ sens, spec, prec, npv, acc, f1s, mcc, gmean, cm ] = metric_confusion_matrix( ...
pred_label, gt_label );

% Display the information
disp( ['Sensitivity: ', num2str(sens)] );
disp( ['Specificity: ', num2str(spec)] );
disp( ['Precision: ', num2str(prec)] );
disp( ['Negative Predictive Value: ', num2str(npv)] );
disp( ['Accuracy: ', num2str(acc)] );
disp( ['F1-score: ', num2str(f1s)] );
disp( ['Matthews Correlation Coefficient: ', num2str(mcc)] );
disp( ['Geometric Mean: ', num2str(gmean)] );
