Shortcuts

Deploying with Flask

In this recipe, you will learn:

  • How to wrap your trained PyTorch model in a Flask container to expose it via a web API
  • How to translate incoming web requests into PyTorch tensors for your model
  • How to package your model’s output for an HTTP response

Requirements

You will need a Python 3 environment with the following packages (and their dependencies) installed:

  • PyTorch 1.5
  • TorchVision 0.6.0
  • Flask 1.1

Optionally, to get some of the supporting files, you’ll need git.

The instructions for installing PyTorch and TorchVision are available at pytorch.org. Instructions for installing Flask are available on the Flask site.

What is Flask?

Flask is a lightweight web server written in Python. It provides a convenient way for you to quickly set up a web API for predictions from your trained PyTorch model, either for direct use, or as a web service within a larger system.

Setup and Supporting Files

We’re going to create a web service that takes in images, and maps them to one of the 1000 classes of the ImageNet dataset. To do this, you’ll need an image file for testing. Optionally, you can also get a file that will map the class index output by the model to a human-readable class name.

Option 1: To Get Both Files Quickly

You can pull both of the supporting files quickly by checking out the TorchServe repository and copying them to your working folder. (NB: There is no dependency on TorchServe for this tutorial - it’s just a quick way to get the files.) Issue the following commands from your shell prompt:

git clone https://github.com/pytorch/serve
cp serve/examples/image_classifier/kitten.jpg .
cp serve/examples/image_classifier/index_to_name.json .

And you’ve got them!

Option 2: Bring Your Own Image

The index_to_name.json file is optional in the Flask service below. You can test your service with your own image - just make sure it’s a 3-color JPEG.

Building Your Flask Service

The full Python script for the Flask service is shown at the end of this recipe; you can copy and paste that into your own app.py file. Below we’ll look at individual sections to make their functions clear.

Imports

import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
from flask import Flask, jsonify, request

In order:

  • We’ll be using a pre-trained DenseNet model from torchvision.models
  • torchvision.transforms contains tools for manipulating your image data
  • Pillow (PIL) is what we’ll use to load the image file initially
  • And of course we’ll need classes from flask

Pre-Processing

def transform_image(infile):
    input_transforms = [transforms.Resize(255),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],
            [0.229, 0.224, 0.225])]
    my_transforms = transforms.Compose(input_transforms)
    image = Image.open(infile)
    timg = my_transforms(image)
    timg.unsqueeze_(0)
    return timg

The web request gave us an image file, but our model expects a PyTorch tensor of shape (N, 3, 224, 224) where N is the number of items in the input batch. (We will just have a batch size of 1.) The first thing we do is compose a set of TorchVision transforms that resize and crop the image, convert it to a tensor, then normalize the values in the tensor. (For more information on this normalization, see the documentation for torchvision.models_.)

After that, we open the file and apply the transforms. The transforms return a tensor of shape (3, 224, 224) - the 3 color channels of a 224x224 image. Because we need to make this single image a batch, we use the unsqueeze_(0) call to modify the tensor in place by adding a new first dimension. The tensor contains the same data, but now has shape (1, 3, 224, 224).

In general, even if you’re not working with image data, you will need to transform the input from your HTTP request into a tensor that PyTorch can consume.

Inference

def get_prediction(input_tensor):
    outputs = model.forward(input_tensor)
    _, y_hat = outputs.max(1)
    prediction = y_hat.item()
    return prediction

The inference itself is the simplest part: When we pass the input tensor to them model, we get back a tensor of values that represent the model’s estimated likelihood that the image belongs to a particular class. The max() call finds the class with the maximum likelihood value, and returns that value with the ImageNet class index. Finally, we extract that class index from the tensor containing it with the item() call, and return it.

Post-Processing

def render_prediction(prediction_idx):
    stridx = str(prediction_idx)
    class_name = 'Unknown'
    if img_class_map is not None:
        if stridx in img_class_map is not None:
            class_name = img_class_map[stridx][1]

    return prediction_idx, class_name

The render_prediction() method maps the predicted class index to a human-readable class label. It’s typical, after getting the prediction from your model, to perform post-processing to make the prediction ready for either human consumption, or for another piece of software.

Running The Full Flask App

Paste the following into a file called app.py:

import io
import json
import os

import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
from flask import Flask, jsonify, request


app = Flask(__name__)
model = models.densenet121(pretrained=True)               # Trained on 1000 classes from ImageNet
model.eval()                                              # Turns off autograd and



img_class_map = None
mapping_file_path = 'index_to_name.json'                  # Human-readable names for Imagenet classes
if os.path.isfile(mapping_file_path):
    with open (mapping_file_path) as f:
        img_class_map = json.load(f)



# Transform input into the form our model expects
def transform_image(infile):
    input_transforms = [transforms.Resize(255),           # We use multiple TorchVision transforms to ready the image
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406],       # Standard normalization for ImageNet model input
            [0.229, 0.224, 0.225])]
    my_transforms = transforms.Compose(input_transforms)
    image = Image.open(infile)                            # Open the image file
    timg = my_transforms(image)                           # Transform PIL image to appropriately-shaped PyTorch tensor
    timg.unsqueeze_(0)                                    # PyTorch models expect batched input; create a batch of 1
    return timg


# Get a prediction
def get_prediction(input_tensor):
    outputs = model.forward(input_tensor)                 # Get likelihoods for all ImageNet classes
    _, y_hat = outputs.max(1)                             # Extract the most likely class
    prediction = y_hat.item()                             # Extract the int value from the PyTorch tensor
    return prediction

# Make the prediction human-readable
def render_prediction(prediction_idx):
    stridx = str(prediction_idx)
    class_name = 'Unknown'
    if img_class_map is not None:
        if stridx in img_class_map is not None:
            class_name = img_class_map[stridx][1]

    return prediction_idx, class_name


@app.route('/', methods=['GET'])
def root():
    return jsonify({'msg' : 'Try POSTing to the /predict endpoint with an RGB image attachment'})


@app.route('/predict', methods=['POST'])
def predict():
    if request.method == 'POST':
        file = request.files['file']
        if file is not None:
            input_tensor = transform_image(file)
            prediction_idx = get_prediction(input_tensor)
            class_id, class_name = render_prediction(prediction_idx)
            return jsonify({'class_id': class_id, 'class_name': class_name})


if __name__ == '__main__':
    app.run()

To start the server from your shell prompt, issue the following command:

FLASK_APP=app.py flask run

By default, your Flask server is listening on port 5000. Once the server is running, open another terminal window, and test your new inference server:

curl -X POST -H "Content-Type: multipart/form-data" http://localhost:5000/predict -F "file=@kitten.jpg"

If everything is set up correctly, you should recevie a response similar to the following:

{"class_id":285,"class_name":"Egyptian_cat"}

Important Resources

Docs

Access comprehensive developer documentation for PyTorch

View Docs

Tutorials

Get in-depth tutorials for beginners and advanced developers

View Tutorials

Resources

Find development resources and get your questions answered

View Resources