Digit classification on the display of a turbomolecular pump using a KERAS model trained on the SVHN dataset


In this project, we use a KERAS model that was trained on the SVHN dataset to predict the digits on a turbomolecular pump readout, recorded with an IP camera.

All code and files are available in the GitHub repository: https://github.com/kromerh/displayReadout


01. Display Readout - Table of contents

02. Introduction and Motivation (with the project pipeline)
03. Model training for digit classification
04. Digit recognition and classification
05. Summary

02. Introduction and Motivation


When pieces of electronic equipment have to be isolated from one another, reading out their signals becomes a challenge. An example of such a problem is the pressure readout of a compact deuterium-deuterium fast neutron generator. The details of the device are not important here, but an electrode within the neutron generator system is biased to around -100 kV (100,000 V). A high-voltage discharge to ground can create unwanted disruptions in nearby electronic devices, which are very sensitive to discharges of this type.

To protect the equipment, the pressure levels displayed on a so-called single gauge readout are therefore read out with a camera. This makes tedious manual work a requirement. To automate the recording of the values displayed on the pressure readout, an algorithm was developed that reads the pressure level from the display automatically. Ideally, the algorithm would detect the regions with digits in the image by itself; however, at the time of this writing, that digit detection/localisation step was not yet included in the pipeline.

A test image that represents what the camera sees is shown below. The code loads the test image and displays it.

[Figure: test image of the pressure readout as seen by the camera]

The challenge is to read the three digits marked in red in the figure below. Ideally, the algorithm would find the position of the digits on its own and then report each of them back. Note that the second readout in the right part of the image is not relevant and does not need to be read.

[Figure: test image with the three relevant digits marked in red]

Pipeline


The pipeline of the algorithm to read the digits on the readout is shown below.

Step "1. Retrieve image from camera / load image" is treated in a very simplified way: instead of connecting to the camera with OpenCV, we only import a test image identical to what the camera records. The connection to the camera itself could be made with the code shown further below.

In step "2. Digit classification" the Street View House Number (SVHN) dataset (http://ufldl.stanford.edu/housenumbers/) is used to train a KERAS model that classifies images that contain digits into classes 0,1,...,9.

For step "3. Digit detection / localisation" we simply take the ground truth and give the algorithm the correct positions of the digits. One (time-consuming) solution for finding the positions automatically is shown, although better algorithms exist. For the sake of this project, that part can always be refined later.

[Figure: pipeline of the digit readout algorithm]

Code to connect to the IP camera with openCV

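The code cell that connects to the camera was not exported with this notebook. Below is a minimal sketch with OpenCV; the MJPEG stream URL scheme, host, and credentials are assumptions and depend on the camera model.

```python
def build_stream_url(host: str, user: str, password: str) -> str:
    """Build the video stream URL for the IP camera.
    The MJPEG URL scheme below is an assumption -- adapt it to your camera."""
    return f"http://{user}:{password}@{host}/video.mjpg"


def grab_frame(url: str):
    """Connect to the camera with OpenCV and grab a single frame.
    Returns the frame as a numpy array, or None if the capture failed."""
    import cv2  # imported lazily so the URL helper works without OpenCV

    cap = cv2.VideoCapture(url)
    ok, frame = cap.read()
    cap.release()
    return frame if ok else None
```

A frame could then be grabbed with `grab_frame(build_stream_url("192.168.1.10", "admin", "secret"))`, where host and credentials are placeholders.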

03. Model training for digit classification


In this part, steps "1. Retrieve image from camera / load image" and "2. Digit classification" from the project pipeline are covered. The retrieval from the camera is not covered here; see the previous notebook for the example code.

We will load the SVHN dataset and train a KERAS model on the training set. The model will be saved to disk to be loaded later. Then, we will evaluate the accuracy of the model on the test set.

Loading the test image


The image from the camera was exported as a numpy array of grayscale values. It is loaded with the following code:
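A minimal sketch of this loading step, assuming the array was saved with `numpy.save` (the filename `display_gray.npy` is a placeholder):

```python
import numpy as np

# Placeholder path to the exported grayscale array -- adjust to your setup.
IMAGE_PATH = "display_gray.npy"


def load_test_image(path: str = IMAGE_PATH) -> np.ndarray:
    """Load the grayscale test image that was exported as a numpy array."""
    img = np.load(path)
    print(f"loaded image: shape={img.shape}, dtype={img.dtype}")
    return img
```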


SVHN dataset


The SVHN dataset (http://ufldl.stanford.edu/housenumbers/) ...

is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor to MNIST (e.g., the images are of small cropped digits), but incorporates an order of magnitude more labeled data (over 600,000 digit images) and comes from a significantly harder, unsolved, real world problem (recognizing digits and numbers in natural scene images). SVHN is obtained from house numbers in Google Street View images.

It consists of 73257 training and 26032 test example images, each 32x32 pixels. To keep things simple, we will use this frame size (32x32) throughout the project.

Load dataset


This function is included in the class SVHNDataset() and is shown in the next cell. Be sure to adjust the path to the dataset accordingly.
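The loading code itself did not survive the notebook export. Below is a minimal sketch of what such a loading function does, assuming the cropped-digit .mat files (`train_32x32.mat` / `test_32x32.mat`, format 2 from the SVHN page); the paths are placeholders:

```python
import numpy as np
from scipy.io import loadmat

# Placeholder paths -- adjust to where the dataset is stored.
TRAIN_PATH = "train_32x32.mat"
TEST_PATH = "test_32x32.mat"


def load_svhn(path: str):
    """Load one SVHN .mat file (format 2, cropped 32x32 digits).

    The files store the images as (32, 32, 3, N); we move the sample
    axis to the front so the arrays become (N, 32, 32, 3) and (N,).
    """
    mat = loadmat(path)
    X = np.moveaxis(mat["X"], -1, 0)
    y = mat["y"].flatten()
    return X, y
```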

Let us plot 10 random examples from this dataset. The labels are atop the images.

[Figure: 10 random SVHN example images with their labels shown atop]

We see now that the images are in RGB, so we will convert them to grayscale.
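The conversion can be done with the standard luminosity weights; a minimal sketch (the exact weights used in the repository may differ):

```python
import numpy as np


def rgb_to_gray(images: np.ndarray) -> np.ndarray:
    """Convert a batch of RGB images (N, H, W, 3) to grayscale (N, H, W)
    using the standard luminosity weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return images @ weights
```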

[Figure: the same example images converted to grayscale]

After converting to grayscale, we need to preprocess the datasets to be used with KERAS.

Preprocessing datasets


To be used with KERAS, the feature array needs to have shape (number_of_examples, width, height, channels) and the label array shape (number_of_examples,). Also, as we saw before, the SVHN dataset encodes the label "0" as "10". Feeding the dataset into KERAS as-is would cause an error, so we take care of this by setting the 10s in the label array to 0.
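These two preprocessing steps can be sketched as follows; grayscale input and the scaling to [0, 1] are assumptions:

```python
import numpy as np


def preprocess(X: np.ndarray, y: np.ndarray):
    """Bring the SVHN arrays into the shapes KERAS expects.

    X: (N, 32, 32) grayscale -> (N, 32, 32, 1), scaled to [0, 1]
    y: SVHN encodes the digit 0 as label 10 -> map the 10s back to 0
    """
    X = X.reshape(-1, 32, 32, 1).astype("float32") / 255.0
    y = y.copy()
    y[y == 10] = 0
    return X, y
```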

Training and saving


Training this model takes around 1 hour. Important note: on my machine, if the line os.environ['KMP_DUPLICATE_LIB_OK']='True' is not included, the kernel dies with an error when saving (or loading) a model.
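The exact architecture lives in the repository; the sketch below shows a small CNN of the kind used for this task, including the KMP_DUPLICATE_LIB_OK workaround mentioned above. Layer sizes and hyperparameters are assumptions, not the repository's exact values:

```python
import os

# Without this flag, saving/loading the model crashed the kernel on my
# machine (duplicate OpenMP runtime); it is harmless elsewhere.
os.environ["KMP_DUPLICATE_LIB_OK"] = "True"


def build_model():
    """A small CNN for 32x32 grayscale digit images, classes 0-9."""
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Input(shape=(32, 32, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(10, activation="softmax"),  # one class per digit 0-9
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training and saving would then be `model.fit(X_train, y_train, epochs=10, validation_split=0.1)` followed by `model.save("svhn_model.h5")`, where the epoch count and filename are placeholders.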

Loading the model and evaluating

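A sketch of this step, assuming the trained model was saved as `svhn_model.h5` (placeholder name) and the test arrays are already preprocessed:

```python
def evaluate_saved_model(model_path: str, X_test, y_test) -> float:
    """Load a saved KERAS model and report its accuracy on the test set."""
    from tensorflow import keras

    model = keras.models.load_model(model_path)
    loss, acc = model.evaluate(X_test, y_test, verbose=0)
    print(f"test accuracy: {acc:.3f}")
    return acc
```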

04. Digit recognition and classification


In this part, step "4. Classify digit using trained model" from the project pipeline is covered. The previous step is taken as the ground truth, i.e. the images of the digits to be classified are fed to the algorithm directly. This can be addressed in the future to remove the need for manual selection of the images.

The KERAS model trained in the previous step is loaded and used to classify the digits in the image. The image containing the digit to be classified is fed to the algorithm as ground truth.

Classify digits


After loading the image, it has to be scaled down to make the digit fit in the 32x32 frame. This scaling could eventually come from the digit localization step, e.g. by looping a sliding window of varying size over the image and thereby identifying the size of the digit. However, as discussed before, in this project the image to classify is given directly to the algorithm. Given the preselected ground-truth images, the model classifies the digits they contain.

[Figures: model predictions for the three cropped digits]

As we see, the digits 6 and 5 are classified with very high confidence. The digit 4, however, is only classified with 88% confidence; the model does not predict it as well. Let's have a closer look at this digit.

[Figure: prediction probabilities for the digit 4]

We see that the model could also consider the digit 4 to be a 9, which is not surprising: the training dataset did not consist of straightforward digital display digits, but of house numbers. What could help here is to include examples of distorted images of the readout display, or to distort the training images. Still, this minimal viable example shows the potential of the whole algorithm, and its application in the field should be considered.

05. Summary


In this project a KERAS model was trained on house number digits from the SVHN dataset to classify digits. The trained model was used to classify digits on the readout of a turbomolecular pump. All of the example digits relevant to the algorithm were classified correctly; however, the accuracy during operation still has to be validated.

There are numerous ways to improve this prototype algorithm. First, the frames containing the digits were fed to the algorithm directly. To find the digits in the full image, a digit localization step can be implemented. There are blob detection methods already available that very reliably detect certain features, e.g. digits, in an image. One brute-force solution is to slide a window (e.g. of size 32x32) through the image in steps of 2 to 3 pixels and use a pretrained model (y=1 meaning a digit is in the frame, y=0 meaning no digit is in the frame) to determine the frames containing digits. A first attempt at this is included in the notebooks create_synthetic_data.ipynb and test_digitdetector.ipynb. Key in this process is to create a synthetic dataset with labels 1 (digit in image) and 0 (no digit in image). The methods developed in these two notebooks are inferior to already established methods, but they show that there is a quick-and-dirty approach that could be refined or used as a benchmark against more accurate and more developed algorithms.
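The brute-force sliding-window idea can be sketched as follows; the window size and stride match the values mentioned above, and the digit/no-digit classifier itself is left out:

```python
import numpy as np


def sliding_windows(image: np.ndarray, size: int = 32, stride: int = 3):
    """Yield (row, col, window) crops of size x size, stepping `stride`
    pixels at a time. Each window would then be scored by a pretrained
    digit / no-digit model to localise the digits in the full image."""
    h, w = image.shape[:2]
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            yield r, c, image[r:r + size, c:c + size]
```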