ocular-disease-intelligent-recognition-deep-learning

ODIR-2019. Ocular Disease Intelligent Recognition Through Deep Learning Architectures


Ocular Disease Intelligent Recognition Through Deep Learning Architectures

Welcome to the repository for Jordi Corbilla’s MSc dissertation, titled “Ocular Disease Intelligent Recognition Through Deep Learning Architectures.” The dissertation was published by Universitat Oberta de Catalunya in 2020 and can be accessed through this link: [http://openaccess.uoc.edu/webapps/o2/handle/10609/113126].

The PDFs and sources for the dissertation are licensed under the Creative Commons Attribution license, which is detailed in the LICENSE file. We hope you find the dissertation and associated materials helpful in your own research and learning endeavors.

Abstract

Retinal pathologies are the most common cause of childhood blindness worldwide. Rapid and automatic detection of diseases is critical and urgent in reducing the ophthalmologist's workload. Ophthalmologists diagnose diseases based on pattern recognition through direct or indirect visualization of the eye and its surrounding structures. Dependence on the fundus of the eye and its analysis makes the field of ophthalmology perfectly suited to benefit from deep learning algorithms. Each disease has different stages of severity that can be deduced by verifying the existence of specific lesions. Each lesion is characterized by certain morphological features, and several lesions of different pathologies have similar characteristics. We note that patients may be simultaneously affected by various pathologies; consequently, the detection of eye diseases is a multi-label classification problem with a complex resolution principle.

Two deep learning solutions are studied for the automatic detection of multiple eye diseases. The solutions, GoogLeNet and VGGNet, were chosen for their high performance and final score in the ILSVRC challenge. First, we study the different characteristics of lesions and define the fundamental steps of data processing. We then identify the software and hardware needed to execute deep learning solutions. Finally, we investigate the principles of experimentation involved in evaluating the various methods, the public database used for the training and validation phases, and report the final detection accuracy with other important metrics.

Keywords

Image classification, Deep learning, Retinography, Convolutional neural networks, Eye diseases, Medical imaging analysis.

Pathologies

[Figure: pathologies]

According to 2010 World Health Organization data, the prevalence of preventable blindness is staggering: over 39 million people are affected globally, and 80% of these cases could have been prevented. In developing countries, cataracts remain the leading cause of blindness; worldwide, they account for 51% of cases.

Currently, the standard method for classifying diseases based on fundus photography involves manual estimation of lesion locations and analysis of their severity, which is time-consuming and costly for ophthalmologists and healthcare systems. Therefore, there is a pressing need for automated methods to streamline this process.

Rapid and accurate disease detection is crucial for reducing the workload of ophthalmologists and ensuring timely treatment for patients.

Deep learning architecture

[Figure: design]

What is our methodology?

Our methodology consists of several steps that enable us to accurately classify medical images into different pathologies. First, we carefully analyze the dataset to ensure that each pathology is correctly represented. Next, we develop algorithms for processing the images, which are then fed into deep learning networks capable of handling them. To address imbalances in the data, we include a data augmentation module that adds variability to the images.

Using these data vectors, we train multiple deep learning models through a series of experiments, each aimed at improving classification accuracy. We adjust the size of the images and conversion blocks to optimize performance, while convolutional layers extract relevant image features and reduce dimensionality. The final decision-making step is handled by a Sigmoid layer.

In the end, we generate and compare results from each experiment to determine the best-performing model. To illustrate our methodology, we have created a flowchart that details each stage of the process.
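The role of the final Sigmoid layer can be illustrated with a minimal sketch. Unlike a softmax, a sigmoid scores each class independently, so several pathologies can be active for the same image. The class names, their order, the 0.5 threshold, and the example scores below are assumptions made for illustration; they are not taken from the repository code.

```python
import math

# Assumed class order; the text only states that the output has 8 classes.
CLASS_NAMES = ["normal", "diabetes", "glaucoma", "cataract",
               "amd", "hypertension", "myopia", "other"]


def sigmoid(x):
    """Map a raw score (logit) to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Return every class whose sigmoid probability reaches the threshold."""
    probabilities = [sigmoid(z) for z in logits]
    return [name for name, p in zip(CLASS_NAMES, probabilities) if p >= threshold]


# Hypothetical raw scores for one fundus image: two pathologies are active.
example_logits = [-3.0, 2.2, -1.5, 0.4, -2.0, -0.7, -4.1, -0.2]
predicted = predict_labels(example_logits)
print(predicted)
```

Because each class is thresholded on its own, the result may contain zero, one, or several labels per image.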

Training Details

[Figure: training]

After conducting numerous experiments on the introduced models, we have identified the most effective configurations for each model.

For the Inception model, we used data augmentation and initialized the model with ImageNet weights for transfer learning. We enabled training of both the feature extraction and classification components, as this configuration produced satisfactory results. As previously mentioned, we added a dense layer with a Sigmoid activation as the last layer, so that the loss is computed for each of the 8 output classes. We used Stochastic Gradient Descent with a learning rate of 0.01 and binary cross-entropy as the loss function for the multi-label configuration. Additionally, we added a patience setting that terminates training if the validation loss fails to decrease for 8 iterations. The model has 23 million trainable parameters.
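Two ingredients of this configuration can be sketched in plain Python: binary cross-entropy averaged over the labels of one sample, and a patience rule that stops once the validation loss has failed to improve 8 times in a row. This is a simplified illustration, not the repository's training code. The function names, the clipping constant, and the example values are assumptions, and each entry of the loss list stands for one validation check.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the labels of a single sample."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def stopping_index(val_losses, patience=8):
    """Return the index at which training would stop, or None if it never does."""
    best = float("inf")
    waited = 0
    for index, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0  # improvement: reset the counter
        else:
            waited += 1
            if waited >= patience:
                return index
    return None


# Hypothetical 8-class target with two active labels, and two candidate outputs.
y_true = [0, 1, 0, 1, 0, 0, 0, 0]
good = [0.1, 0.9, 0.1, 0.8, 0.1, 0.1, 0.1, 0.1]
bad = [0.9, 0.1, 0.9, 0.2, 0.9, 0.9, 0.9, 0.9]
loss_good = binary_cross_entropy(y_true, good)
loss_bad = binary_cross_entropy(y_true, bad)

# Hypothetical validation losses: two improvements, then eight stagnant checks.
stop_at = stopping_index([1.0, 0.9] + [0.95] * 8)
```

The loss treats every label as its own yes/no problem, which is why it pairs naturally with a sigmoid output.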

In the VGG model, we found that transfer learning outperformed training the model from scratch. We loaded the ImageNet weights and modified the last layer to address the multi-label problem, enabling only the classifier component, resulting in 32 thousand trainable parameters. The configuration is similar to the Inception model, except for a reduced learning rate of 0.001.
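The figure of roughly 32 thousand trainable parameters is consistent with a single dense layer of 8 outputs. The arithmetic below is a sketch under one assumption that the text does not state: that the replaced layer is fed by VGG16's 4096-unit fully connected layer.

```python
def dense_params(inputs, outputs):
    """Number of weights plus biases in a fully connected layer."""
    return inputs * outputs + outputs


# Assumption: 4096 input features (VGG16's last hidden dense layer), 8 classes.
vgg_head_params = dense_params(4096, 8)
print(vgg_head_params)  # 32776
```

With every other layer frozen, only these weights and biases would be updated during training.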

Model Comparison

[Figure: modelcomparison]

The evaluation of the models reveals that the Inception model outperforms the VGG model, achieving an accuracy of 60% and a recall of 55%. The final score, computed as the mean of Cohen's Kappa coefficient, the F1-Score, and the AUC, is 76%.
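As a rough illustration of how such a final score can be assembled, the sketch below computes Cohen's Kappa, F1, and AUC on a flat list of binary labels and averages them. The exact evaluation procedure is not described in this section, so the flattening, the 0.5 threshold, and all example values are assumptions. The sketch also assumes both classes occur in the data, otherwise the divisions are undefined.

```python
def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_true = sum(y_true) / n
    p_pred = sum(y_pred) / n
    expected = p_true * p_pred + (1 - p_true) * (1 - p_pred)
    return (observed - expected) / (1 - expected)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true, scores):
    """Probability that a positive sample is scored above a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def final_score(y_true, scores, threshold=0.5):
    """Mean of Kappa, F1 and AUC, as described in the text."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return (cohen_kappa(y_true, y_pred) + f1_score(y_true, y_pred)
            + auc(y_true, scores)) / 3


# Hypothetical flattened labels and model scores, not results from the models.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]
score = final_score(y_true, scores)
```

Kappa and F1 depend on the chosen threshold, while AUC only depends on how the scores are ranked.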

While the VGG model's performance is still impressive, with an accuracy of 57%, its recall is lower at 36%. This means that the model correctly identifies only 36% of the images that are actually positive. For a more detailed analysis, we have provided a confusion matrix below.
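Recall is easy to confuse with precision, so a small worked example may help. Recall divides the true positives by all images that are actually positive, whereas precision divides them by all images the model flagged as positive. The counts below are hypothetical and chosen only so that recall comes out at 36%.

```python
def recall(tp, fn):
    """Share of actually positive images that the model found."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Share of the model's positive predictions that are correct."""
    return tp / (tp + fp)


# Hypothetical counts: 100 positive images, of which the model finds 36,
# plus 10 negative images wrongly flagged as positive.
tp, fn, fp = 36, 64, 10
example_recall = recall(tp, fn)           # 36 / 100
example_precision = precision(tp, fp)     # 36 / 46
```

In this example most positive predictions are right, yet almost two thirds of the positive images are missed, which is the situation a low recall describes.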

Confusion matrix

[Figure: ConfusionMatrix]

The confusion matrices reveal interesting insights about the performance of the models. The Inception model outperforms the VGG model in correctly classifying images, as indicated by a higher number of values on the diagonal. However, there are still some classes, such as hypertension, where both models struggle to correctly classify images. The Inception model achieves an overall accuracy of 80%, with the exception of hypertension and other pathologies, where its performance drops to 22% and 32% respectively. This indicates that despite using data augmentation, there are still features that the model has not learned. On the other hand, the VGG model achieves an accuracy of 57%, with similar issues in correctly classifying hypertension images. Overall, both models show promise in recognizing ocular diseases, but further improvements are needed to address the misclassifications in certain classes.
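Reading per-class performance off the diagonal can be sketched as follows. The example assumes that each image is reduced to a single true class index and a single predicted class index for the purpose of the matrix. It uses three made-up classes rather than the eight of the dataset, and none of the numbers come from the experiments.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Diagonal value divided by the row total, for every class."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Hypothetical class indices for ten images.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0, 1, 0, 2]
matrix = confusion_matrix(y_true, y_pred, n_classes=3)
recalls = per_class_recall(matrix)
```

A class with few samples and a weak diagonal, like the third class here, stands out immediately in the per-class values even when the overall figure looks acceptable.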

Classification Output

[Figure: classificationoutput]

The final section of the report presents the output of each model and provides some conclusions. The code for training, validation, and inference is included in this repository, allowing for reproducibility and further exploration.

Upon examining the model output, we can see that both models assign the same classification to each image. On closer inspection, however, the output values produced by each model differ significantly.

Conclusions

In conclusion, this project explores the use of two deep learning models for the classification of ocular diseases, addressing the challenges posed by multi-label classification and data imbalance. The experiments have shown that the models can achieve an accuracy of 60% on the validation set after fine-tuning. The results suggest that these models could be applied in practice to assist ophthalmologists in classifying fundus images more efficiently.

Implementation Details

Dataset

The Dataset is part of the ODIR 2019 Grand Challenge. In order to use the data you need to register and download it from there: https://odir2019.grand-challenge.org/introduction/

Works on Python 3.6

tensorflow-2.0 - use branch master

The full list of packages used can be seen below:

- tensorboard-2.0.0
- tensorflow-2.0.0
- tensorflow-estimator-2.0.1
- tensorflow-gpu-2.0
- matplotlib-3.1.1
- keras-applications-1.0.8
- keras-preprocessing-1.0.5
- opencv-python-4.1.1.26
- django-2.2.6
- image-1.5.27
- pillow-6.2.0
- sqlparse-0.3.0
- IPython-7.8.0
- keras-2.3.1
- scikit-learn-0.21.3
- pydot-1.4.1
- graphviz-0.13.2
- pylint-2.4.4
- imbalanced-learn-0.5.0
- seaborn-0.9.0
- scikit-image-0.16.2
- pandas-0.25.1
- numpy-1.17.2
- scipy-1.3.1

All the training images must be in JPEG format and 224x224 px in size.

Usage

1) Image Treatment Process

Place all the files in the following folders (Training and Validation images):

c:\temp\ODIR-5K_Training_Dataset
c:\temp\ODIR-5K_Testing_Images

The training images Dataset should contain 7000 images and the testing Dataset 1000 images. Below is a screenshot of the images in the training dataset folder:

[Figure: dataset]

Then, create the following folders:

c:\temp\ODIR-5K_Testing_Images_cropped
c:\temp\ODIR-5K_Testing_Images_treated_128
c:\temp\ODIR-5K_Testing_Images_treated_224
c:\temp\ODIR-5K_Training_Dataset_cropped
c:\temp\ODIR-5K_Training_Dataset_treated_128
c:\temp\ODIR-5K_Training_Dataset_treated_224
c:\temp\ODIR-5K_Training_Dataset_augmented_128
c:\temp\ODIR-5K_Training_Dataset_augmented_224
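If you prefer not to create the folders by hand, a short helper can do it. This is a convenience sketch and not part of the repository; the function name and the `FOLDER_SUFFIXES` table are made up for illustration, while the folder names are the ones listed above.

```python
from pathlib import Path

# Folder name prefixes and suffixes, copied from the list above.
FOLDER_SUFFIXES = {
    "ODIR-5K_Testing_Images": ["cropped", "treated_128", "treated_224"],
    "ODIR-5K_Training_Dataset": ["cropped", "treated_128", "treated_224",
                                 "augmented_128", "augmented_224"],
}


def create_working_folders(base_dir):
    """Create every working folder under base_dir and return the paths."""
    created = []
    for prefix, suffixes in FOLDER_SUFFIXES.items():
        for suffix in suffixes:
            folder = Path(base_dir) / f"{prefix}_{suffix}"
            folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
            created.append(folder)
    return created


# Example call (not executed here): create_working_folders(r"c:\temp")
```

Calling it a second time is harmless because existing folders are left untouched.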

Run the following commands to treat the training and validation images:

//These two remove the black pixels
python odir_image_crop_job.py
python odir_image_testing_crop_job.py

//These two resize the images to 224 pixels
python odir_training_image_treatment_job.py
python odir_testing_image_treatment_job.py

The odir_image_crop_job.py job will treat all the Training Dataset images and remove the black area of the images so that they end up like the image below (odir_image_testing_crop_job.py does the same job on the testing images):

[Figure: Cropped]
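The idea behind removing the black area can be sketched on a tiny grayscale image stored as nested lists. This is one common way to do it (keep the bounding box of all non-black pixels) and is not necessarily how odir_image_crop_job.py is implemented. The function name and the `threshold` parameter are assumptions.

```python
def crop_black_border(image, threshold=0):
    """Return the part of the image inside the bounding box of non-black pixels.

    image is a list of rows, each row a list of grayscale values.
    """
    rows = [r for r, row in enumerate(image) if any(v > threshold for v in row)]
    if not rows:
        return []  # the image is entirely black
    cols = [c for c in range(len(image[0]))
            if any(row[c] > threshold for row in image)]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 5x5 image with a one-pixel black frame around the content.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 7, 8, 6, 0],
    [0, 0, 5, 0, 0],
    [0, 0, 0, 0, 0],
]
cropped = crop_black_border(image)
```

Black pixels inside the bounding box are kept, so only the surrounding frame is removed. A threshold slightly above zero can help when the frame is dark but not perfectly black.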

The second job resizes and squares the images to 224 x 224 pixels. The image_width and keep_aspect_ratio variables can be edited in the Python file to test different values/scenarios. This should give you images like the ones below:

[Figure: squareimages]
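To make the effect of the two variables concrete, the sketch below computes the geometry of a resize-and-square step. It assumes that keeping the aspect ratio means scaling the longer side to image_width and centring the result on a square canvas, which is a guess about the behaviour and not a description of the actual script. The function name and the return format are made up.

```python
def square_geometry(width, height, image_width=224, keep_aspect_ratio=True):
    """Return (new_width, new_height, pad_left, pad_top) for a square canvas."""
    if not keep_aspect_ratio:
        # Stretch the image to fill the whole square.
        return image_width, image_width, 0, 0
    scale = image_width / max(width, height)
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    pad_left = (image_width - new_width) // 2
    pad_top = (image_width - new_height) // 2
    return new_width, new_height, pad_left, pad_top


# Hypothetical 2048x1536 source image.
with_ratio = square_geometry(2048, 1536)
stretched = square_geometry(2048, 1536, keep_aspect_ratio=False)
print(with_ratio, stretched)
```

Keeping the ratio preserves the shape of the retina at the cost of padding, while stretching uses every pixel of the canvas but distorts the image.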

2) Data Augmentation (if you don’t want to use this step you can skip it)

Run the following command to generate the additional images:

python.exe odir_data_augmentation_runner.py

This will generate the odir_augmented.csv file.
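As an illustration of what augmentation does to an image, the sketch below applies three simple transformations to a tiny image stored as nested lists. The transformations actually used by odir_data_augmentation_runner.py are not listed here, so the choice of flips and a rotation, as well as the function and key names, are assumptions.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return a dictionary of named variants of the input image."""
    return {
        "flip_h": flip_horizontal(image),
        "flip_v": flip_vertical(image),
        "rot_90": rotate_90(image),
    }


# Hypothetical 2x2 image.
variants = augment([[1, 2], [3, 4]])
```

Each variant shows the same eye, so it would carry the same labels as its source image when written to a ground truth file.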

3) Image to tf.Data conversion and .npy storage

Now that we have all the images, we need to translate them into a tf.Data component so we can load them into our model. Run the following command to generate the dataset for training and validation:

python.exe odir_patients_to_numpy.py

Note that any changes in the images will need a re-run of this script to rebuild the .npy files.

If you take a look at the arguments of the script you will see the following:

image_width = 224
training_path = r'C:\temp\ODIR-5K_Training_Dataset_treated' + '_' + str(image_width)
testing_path = r'C:\temp\ODIR-5K_Testing_Images_treated' + '_' + str(image_width)
augmented_path = r'C:\temp\ODIR-5K_Training_Dataset_augmented' + '_' + str(image_width)
csv_file = r'ground_truth\odir.csv'
csv_augmented_file = r'ground_truth\odir_augmented.csv'
training_file = r'ground_truth\testing_default_value.csv'

odir.csv contains the generated ground truth per eye. To see how the ground truth is generated, take a look at odir_runner.py, which contains the procedures that build it from the ODIR-5K_Training_Annotations(Updated)_V2.xlsx file, which is part of the files provided by ODIR.

odir_augmented.csv contains the ground truth per eye and per sample generated by the data augmentation process. This makes it easier to feed the data into the model and compare the results.

testing_default_value.csv contains the vectors of the testing images.
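A per-eye ground truth file can be turned into label vectors with the standard csv module. The column layout of the real CSV files is not documented in this section, so the `ID` column, the eight single-letter label columns, and the file names in the sample are hypothetical.

```python
import csv
import io

# Hypothetical column names: one identifier column plus one column per class.
LABEL_COLUMNS = ["N", "D", "G", "C", "A", "H", "M", "O"]


def load_ground_truth(csv_text):
    """Map each image identifier to its 8-element label vector."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ID"]: [int(row[c]) for c in LABEL_COLUMNS] for row in reader}


# Made-up sample content in the assumed layout.
sample = (
    "ID,N,D,G,C,A,H,M,O\n"
    "0_left.jpg,0,0,0,1,0,0,0,0\n"
    "0_right.jpg,1,0,0,0,0,0,0,0\n"
)
labels = load_ground_truth(sample)
```

To read a file from disk instead, open it with `newline=""` and pass the file object to `csv.DictReader` directly.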

Deep Learning Models

4) Run Inception-v3

-- Basic Run of the model
python.exe odir_inception_v3_training_basic.py

Sample output can be seen here: readme.md

-- Enhanced Run of the model using Data Augmentation
python.exe odir_inception_training.py

Sample output can be seen here: readme.md

4.1) Inception Inference

python.exe odir_inception_testing_inference.py

5) Run VGG16

-- Basic Run of the model
python.exe odir_vgg16_training_basic.py
-- Enhanced Run of the model using Data Augmentation
python.exe odir_vgg16_training.py

5.1) VGG16 Inference

python.exe odir_vgg_testing_inference.py
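The steps above can be chained with a small driver script. This is a convenience sketch and not part of the repository: the `PIPELINE` list and the `run_pipeline` function are made up, the script names are the ones given in steps 1 to 4.1, and the order simply follows the numbering of those steps. The driver assumes it is started from the folder that contains the scripts.

```python
import subprocess
import sys

# Script names from steps 1) to 4.1), in the order they are described.
PIPELINE = [
    "odir_image_crop_job.py",
    "odir_image_testing_crop_job.py",
    "odir_training_image_treatment_job.py",
    "odir_testing_image_treatment_job.py",
    "odir_data_augmentation_runner.py",
    "odir_patients_to_numpy.py",
    "odir_inception_training.py",
    "odir_inception_testing_inference.py",
]


def run_pipeline(scripts=PIPELINE, dry_run=True):
    """Build the commands and, unless dry_run is set, execute them in order."""
    commands = [[sys.executable, script] for script in scripts]
    if not dry_run:
        for command in commands:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)
    return commands


# Dry run: only list what would be executed.
for command in run_pipeline():
    print(" ".join(command))
```

To train VGG16 instead, replace the last two entries with odir_vgg16_training.py and odir_vgg_testing_inference.py.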

Deep Learning Models (Additional - out of the scope of this dissertation)

6) Run VGG19

-- Basic Run of the model
python.exe odir_vgg19_training_basic.py
-- Enhanced Run of the model using Data Augmentation
python.exe odir_vgg19_training.py

6.1) VGG19 Inference

python.exe odir_vgg19_testing_inference.py

7) Run ResNet50

-- Basic Run of the model
python.exe odir_resnet50_training_basic.py
-- Enhanced Run of the model using Data Augmentation
python.exe odir_resnet50_training.py

7.1) ResNet50 Inference

python.exe odir_resnet50_testing_inference.py

8) Run InceptionResNetV2

-- Basic Run of the model
python.exe odir_inception_ResNetV2_training_basic.py
-- Enhanced Run of the model using Data Augmentation
python.exe odir_inception_ResNetV2_training.py

8.1) InceptionResNetV2 Inference

python.exe odir_inception_ResNetV2_testing_inference.py

License

Creative Commons Attribution-NonCommercial 4.0 International Public License.

Sponsors

No sponsors yet! Will you be the first?
