Deploy ML models using Flask

Deploying Your Machine Learning Models Using Flask

In this article, we discuss how to deploy ML models using Flask. No prior knowledge of the Flask library is assumed.

What is Deployment?

Deployment, in very simplified terms, means making your code available for use by end-users. Let us take an example. You design an app that you believe can be of great value to society. You have tested your app, and it runs perfectly on your local machine.

But how can other users use your app? Simple: you need to run your app on a computer (server) that is accessible to those users. This whole process of testing and running your code on a server is referred to as deployment.

In our case, we will be deploying a machine learning model on our local machine.

What is Flask?

Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja and has become one of the most popular Python web application frameworks.

One more important feature of Flask is that it does not enforce any additional dependencies, giving the developer a free choice of which libraries to use. To install or update Flask, you can use the pip command in your terminal:

pip install -U Flask

Note: Linux users might want to use pip3 for the Python 3 version.

Steps to Deploy ML models using Flask

Let’s get right into the steps for deploying machine learning models using the Flask library.

1. Getting your model ready

Now that you have Flask installed, next in line is the model we need to deploy. If you have worked out your model in a notebook/IDE, now is the time to save your trained model. It should be noted that the model will not be trained during deployment. We will be using a multilayer perceptron to classify the images of the MNIST dataset. To save a TensorFlow model, we use the following:'<path to the model>/my_model.h5')

Our model receives the image as an input and returns the label of the image.

Fig 1: Working of our model: the model takes an image as input and returns the result, which is an integer.

2. Designing our workflow

Now that we have a trained model, we can design how our server should handle user requests. Here is the proposed workflow:

  1. The user uploads an image to the server using an HTTP POST request.
  2. The image is received and saved on the server. We can also check the image for some potential security threats.
  3. The saved image is passed through the model.
  4. The results of the model are returned to the user in the form of text.

Here is a flowchart summarising it:

Fig 2: A flowchart describing the working of our model

Note: This is an overly simplified model. Real-life models are a great deal harder to design and implement, and involve creating complex data pipelines that are beyond the scope of this article.

3. Coding the Flask API

We create a Python file that runs our app.

The import statements:

# os to handle saving/deleting images
import os

# Import the required functions from flask
from flask import Flask, request, flash, redirect, send_file

# secure_filename sanitises uploaded file names
from werkzeug.utils import secure_filename

# TensorFlow for loading the model
import tensorflow as tf

Creating our app

# Creates a flask app with a name same as the file name
# we can refer to this flask app as 'app' in our program
app = Flask(__name__)

Setting up image upload folder

# uploaded images are stored in 'images' folder
UPLOAD_FOLDER = './images'

# Setting a configuration variable so request handlers can locate the folder
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

Loading the model

# Assuming the model is saved in folder models as model_1.h5
model = tf.keras.models.load_model('models/model_1.h5')

REST API for our app

Our app receives data from and sends data to the user. For that purpose, we need to specify a certain set of rules. The app decorator binds the function upload_file() to our app. The function is routed to the base URL (specified as ‘/’), and the only method allowed is POST, i.e., users can upload to the base URL. upload_file() takes care of the many conditions of a file upload, from no file at all to a correct file.

@app.route('/', methods=['POST'])
def upload_file():
    # The request carries no file part at all
    if 'file' not in request.files:
        flash('No file part')
        return redirect(request.url)
    file = request.files['file']
    # An empty file name means no file was selected
    if file.filename == '':
        flash('No selected file')
        return redirect(request.url)
    if file:
        # Sanitise the file name and save the upload to disk
        filename = secure_filename(file.filename)
        filename = os.path.join(app.config['UPLOAD_FOLDER'], filename)

        # Read from file and convert to a batched tensor
        # (the 28x28 greyscale shape assumes an MNIST-style model)
        img = tf.keras.preprocessing.image.load_img(
            filename, color_mode='grayscale', target_size=(28, 28))
        img_tensor = tf.keras.preprocessing.image.img_to_array(img) / 255.0
        img_tensor = tf.expand_dims(img_tensor, 0)
        results = model.predict(img_tensor)
        # Delete the file
        os.remove(filename)
        return "\n[+] The number is : " + str(results.argmax()) + "\n\n"

Note: Unlike the rest of the code, this part of the app runs again and again, once for every client request.

4. Run the app

# If this file is run as a standalone file,
# the app will run in debug mode
if __name__ == '__main__':

Get the server up and running

# If you saved the file as (the name is up to you), run:
python

Fig 3: Starting the server

Note that the server is running on, the default address of the Flask development server; this is our app endpoint. Now that our app is running on our local machine, we can access it just by using that URL.

Uploading an image

We have not built a front-end, to keep things simple. Separating the backend from the front-end like this, on the other hand, makes it easier to interact with standalone front-end apps. Even without a front-end, we can use our good old curl command to upload an image:

curl -X POST -H "Content-Type: multipart/form-data" -F "file=@<file location>"

Replace the <file location> with the location of the image.

Fig 4: Accessing the app using the curl command from a terminal


We have seen that we can easily deploy a machine learning model on our local machine, so that users connected to your network can use your app's services. For the app to work 24×7, it needs to run around the clock on your computer. In that case, you may consider running your code on cloud platforms like Heroku, DigitalOcean, or Azure. We will cover deploying code to such a server in a later article. Stay tuned.