Complete Automation of an ML/DL Model Using Jenkins & Docker

Daksh Jain
5 min read · May 24, 2020


Ever thought of automating “automation” itself? This article takes you through the journey of automating the process of finding the accuracy of a Machine Learning model.

First, let's start by looking at the Machine Learning & Deep Learning programs provided by the client. We will be providing the environment on a per-code basis, i.e. 1 program & 1 dataset for 1 Docker environment.

Deep Learning Python File
Machine Learning Python File

You can find this code in my GitHub repository.

You can see that the file names are the same in these pictures. This is for the following reasons:

I am only providing the environment: the infrastructure, RAM, CPU, and the other required resources like the Python libraries and dependencies.

So whenever the client provides the file and dataset, I push it into a fresh, clean environment and rename the file to train.py so that Jenkins itself can find out whether it is ML or DL code.

Now creating Jenkins Job 1:

Job 1 -> Pulls files from GitHub -> Sends them to the Red Hat folder /code

Now I am creating 2 Docker images using Dockerfiles ->

  1. machinelearning:v1 -> solely for ML model training and prediction
  2. mytensor:v1 -> solely for DL model training and prediction


Dockerfile for image machinelearning:v1
Dockerfile for image mytensor:v1

You can find these Dockerfiles in my GitHub repository.

The main thing to notice is that both files run the same Python file named “train.py”.

This is because we are providing 1 docker environment for 1 code.

So after building the images from these Dockerfiles, Jenkins will look at one file named “train.py”, check whether it is a Machine Learning code or a Deep Learning code, and accordingly launch a container from one of the 2 images.

Commands to create the Docker images:

# docker build -t machinelearning:v1 <path of Dockerfile>
# docker build -t mytensor:v1 <path of Dockerfile>

Now creating Job 2:

Job 2 -> Launches the ML/DL container according to “train.py” -> Finds the accuracy -> Sends the necessary files to the client

This is the code written in the Jenkins Execute Shell (Bash):

ml=$(sudo cat /code/train.py | grep sklearn | wc -l)   # lines that mention sklearn
dl=$(sudo cat /code/train.py | grep keras | wc -l)     # lines that mention keras

# Only sklearn found -> it is an ML code, so launch the ML container
if [ "$ml" -gt 0 ] && [ "$dl" -eq 0 ]
then
    sudo docker run -dit -v /code:/code machinelearning:v1
    echo "ML"
# keras found -> it is a DL code, so launch the TensorFlow/Keras container
elif [ "$dl" -gt 0 ]
then
    sudo docker run -dit -v /code:/code mytensor:v1
    echo "DL"
fi

# Copy the results (accuracy file, trained model) into the Jenkins workspace
sudo cp /code/* /var/lib/jenkins/workspace/mlops_job2

Job 2 output

The console output clearly shows that Jenkins identified “train.py” as a Deep Learning file, so it launched a container from the mytensor:v1 image.

The training then created an “accuracy.txt” file and also saved the model as “multiclassNN.h5”, both of which will be sent to the client.
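As a rough illustration, the tail end of a Keras-based train.py could produce these two artifacts like this (a minimal sketch with dummy data and made-up layer sizes, not the client's actual code):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Minimal stand-in for the DL train.py: train a small network on dummy data,
# then write accuracy.txt and save the model so Job 2 can ship both to the client.
X = np.random.rand(100, 4)
y = np.random.randint(0, 3, 100)

model = Sequential([
    Dense(16, activation='relu', input_shape=(4,)),
    Dense(3, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
history = model.fit(X, y, epochs=5, verbose=0)

accuracy = history.history['accuracy'][-1] * 100
with open('/code/accuracy.txt', 'w') as f:
    f.write(str(accuracy))

model.save('/code/multiclassNN.h5')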

Then I copy the files into the Jenkins workspace so that a mail with the accuracy and the trained model can be sent to the client.

Mail Sent !!

Now Job 3:

This job checks the accuracy of the model and, if it is less than 85%, tweaks the code a bit and asks Job 2 to retrain it.

Job 3 runs the upgrade.py file
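A minimal sketch of that check (the path to accuracy.txt follows the Job 2 workspace shown above; the exact reading logic is my assumption, not the actual upgrade.py):

# Hypothetical skeleton of upgrade.py: read the accuracy produced by Job 2
# and decide whether the code needs to be tweaked and retrained.
with open('/var/lib/jenkins/workspace/mlops_job2/accuracy.txt') as f:
    accuracy = float(f.read().strip())

if accuracy < 85:
    print("Accuracy below 85%, tweaking the code and triggering Job 2 again")
    # edit train.py here (ML or DL tweaks, see below) and fire the trigger
else:
    print("Accuracy target reached:", accuracy)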

Now there are 2 major things I am working on here: ML & DL.

For tweaking an ML code I did a variety of things in the code, like Feature Engineering, which involves imputation and creating Dummy Variables.

We can also do Feature Scaling, where we normalize the values in the dataset to make the predictions more accurate.

Also, we have to do Feature Selection & Elimination to keep only the most informative columns.
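For example, a rough sketch of these ML tweaks with pandas and scikit-learn (the dataset and column names here are placeholders, not the client's data):

import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature-engineering tweaks: imputation, dummy variables, scaling.
df = pd.DataFrame({
    'age':    [25, None, 40, 31],
    'salary': [50000, 60000, None, 52000],
    'city':   ['Delhi', 'Mumbai', 'Delhi', 'Pune'],
})

# Imputation: fill missing numeric values with the column mean
num_cols = ['age', 'salary']
df[num_cols] = SimpleImputer(strategy='mean').fit_transform(df[num_cols])

# Dummy variables: one-hot encode the categorical column
df = pd.get_dummies(df, columns=['city'])

# Feature Scaling: normalize every value between 0 and 1
df = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns)

print(df)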

For tweaking a DL code I added new layers; I can also increase the epochs and change the number of units, i.e. the neurons per layer.
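A rough Keras sketch of what those tweaks mean (the unit counts, layer count and epoch number are illustrative only):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Hypothetical "tweaked" architecture: extra hidden layers and more units per
# layer than the original run; the retrain would also use a higher epoch count.
def build_tweaked_model(input_dim, n_classes, units=64, extra_layers=1):
    model = Sequential()
    model.add(Dense(units, activation='relu', input_shape=(input_dim,)))
    for _ in range(extra_layers):  # the new layers added by the tweak
        model.add(Dense(units, activation='relu'))
    model.add(Dense(n_classes, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# e.g. model = build_tweaked_model(input_dim=4, n_classes=3); model.fit(X, y, epochs=50)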

Code to tweak the model -> upgrade.py

I did this using a Python program, “upgrade.py”, which can be found in my GitHub repository.

The curl command is a trigger for Job 2 so that it retrains the model and sends the accuracy file again.
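The article does this with curl; an equivalent trigger written in Python with the requests library would look roughly like this (the Jenkins host, credentials and token below are placeholders for the real setup):

import requests

# Hypothetical remote trigger for Job 2 via Jenkins' "trigger builds remotely" URL.
JENKINS_URL = "http://192.168.1.10:8080"
JOB_NAME = "mlops_job2"
TOKEN = "retrain-token"

resp = requests.post(f"{JENKINS_URL}/job/{JOB_NAME}/build?token={TOKEN}",
                     auth=("admin", "jenkins-api-token"))
print("Trigger status:", resp.status_code)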

Accuracy before

The output of Job 3 is:

In this case the accuracy is less than 85%, so this job triggers Job 2 again to retrain the model and find the accuracy again.

During this process it is important to keep an eye on the jobs using the Build Pipeline.

Accuracy after

After successful tweaking, the accuracy increases to a whopping 94.44%!!

So now our task is done.

Once this is done, the mail is sent to the client!!

Now a look at the Build Pipeline:

Build Pipeline

Worked in collaboration with Ashish Kumar.

Connect with me on LinkedIn as well.
