Tutorial: Accelerate Deep Learning Models with Intel Movidius

The Intel Movidius Neural Compute Stick (NCS) accelerates machine learning inference at the edge. I covered the details of this device last week. In this tutorial, we will take an existing Caffe deep learning model and optimize it for Intel Movidius.

This guide is based on Intel Movidius NCS 1 and NCSDK 2. The most recent version of the device uses the Intel OpenVINO Toolkit, which is not compatible with the previous versions of the SDK. However, Intel still sells NCS 1 devices and actively maintains the SDK.

Apart from the NCS 1 device, you need an Ubuntu 16.04 PC with a free USB 3 port. You can also set up and configure the SDK inside a VirtualBox or VMware Fusion/Workstation virtual machine. Optionally, a Raspberry Pi 3 device may be used to run the optimized graph.

Installing Intel Neural Compute SDK 2

Let’s start by confirming that the Intel Movidius NCS USB device is recognized. Running lsusb should show a device with ID 03e7:2150.
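
If you would rather script this check, the following is a minimal sketch that scans the lsusb output for the ID mentioned above. The helper name has_ncs is my own and is not part of the NCSDK; the script only assumes that lsusb is on the PATH.

```python
import re
import subprocess

# USB vendor:product ID that this tutorial expects for the NCS 1
NCS_USB_ID = "03e7:2150"


def has_ncs(lsusb_output, usb_id=NCS_USB_ID):
    """Return True if any lsusb line reports the given vendor:product ID."""
    pattern = re.compile(r"\bID\s+" + re.escape(usb_id) + r"\b", re.IGNORECASE)
    return any(pattern.search(line) for line in lsusb_output.splitlines())


if __name__ == "__main__":
    try:
        # stdout=PIPE and universal_newlines keep this compatible with Python 3.5
        result = subprocess.run(["lsusb"], stdout=subprocess.PIPE,
                                universal_newlines=True, check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        print("Could not run lsusb:", exc)
    else:
        print("NCS detected" if has_ncs(result.stdout) else "NCS not detected")
```

If the device does not show up inside a virtual machine, the USB passthrough settings of the hypervisor are a likely place to look.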

Next, install the prerequisites on Ubuntu: Python 3, pip, and Git.

sudo apt-get update && sudo apt-get upgrade
sudo apt install python3
sudo apt install python3-pip
sudo apt install git-all

Clone the NCSDK 2 GitHub repository and build the SDK.

git clone -b ncsdk2 http://github.com/Movidius/ncsdk
cd ncsdk
make install

It may take a few minutes for the setup to complete. After it is done, verify that the SDK is properly installed by running the hello_ncs_py sample.

cd ~/ncsdk/examples/apps/hello_ncs_py
make run

If the sample completes without errors, the SDK is able to access the Intel Movidius NCS device.

Generating the Graph from Caffe Deep Learning Model

In one of the previous tutorials, I used NVIDIA DIGITS to build a Convolutional Neural Network (CNN) that classifies images. We will use the fully-trained model from that demo to classify the images of dogs and cats.

Download the trained Caffe model and its network definition from the links below:

mkdir cat-dog && cd cat-dog
wget "https://www.dropbox.com/s/vxyby375e82vq1b/cat-dog.caffemodel?dl=0" -O cat-dog.caffemodel
wget "https://www.dropbox.com/s/byvf1d4ul09ujn9/deploy.prototxt?dl=0" -O deploy.prototxt
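
Share links can sometimes return an HTML page instead of the file itself, so it is worth a quick sanity check before compiling. The sketch below is only a heuristic, and the helper name looks_like_html is hypothetical; it simply inspects the first bytes of each download.

```python
import os


def looks_like_html(path, probe_bytes=512):
    """Heuristic: True if the file starts with an HTML doctype or <html> tag."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


# File names used in this tutorial
for name in ("cat-dog.caffemodel", "deploy.prototxt"):
    if not os.path.exists(name):
        print("{}: missing".format(name))
    elif looks_like_html(name):
        print("{}: looks like an HTML page, download it again".format(name))
    else:
        print("{}: {} bytes".format(name, os.path.getsize(name)))
```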

Before we use the model for inference, we need to generate a graph optimized for Intel Movidius. For this, we will use mvNCCompile, one of the command-line tools available in the NCSDK. This tool takes the trained model as input and generates the required graph.

mvNCCompile deploy.prototxt -w cat-dog.caffemodel -s 12 -is 227 227

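The first argument is the network definition (deploy.prototxt), and -w points to the trained weights (cat-dog.caffemodel). The -s 12 option denotes that we are using 12 SHAVE cores for the graph. The last parameter, -is 227 227, is the size of the input image, which is 227x227 pixels.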
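
To make the role of each parameter explicit, here is a small sketch that assembles the same command from named values. The function build_compile_command is a hypothetical helper of mine, not something shipped with the SDK, and it uses only the flags shown above.

```python
import shlex


def build_compile_command(prototxt, weights, shaves=12, width=227, height=227):
    """Assemble the mvNCCompile argument list used in this tutorial."""
    return [
        "mvNCCompile", prototxt,          # network definition
        "-w", weights,                    # trained weights
        "-s", str(shaves),                # number of SHAVE cores
        "-is", str(width), str(height),   # input image size
    ]


cmd = build_compile_command("deploy.prototxt", "cat-dog.caffemodel")
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) if you want to drive the compiler from a script.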

You should now find two new files, graph and output_expected.npy, in the current directory. The graph file is the one we load onto the NCS device for inference.

It’s time for us to write the Python code that loads the graph and runs inference.

import sys
import numpy
import skimage.io
import skimage.transform

import mvnc.mvncapi as mvnc

GRAPH="graph"
SIZE=[227,227]

devices = mvnc.enumerate_devices()
if len( devices ) == 0:
    print( "No devices found" )
    quit()

device = mvnc.Device( devices[0] )
device.open()

# Read the compiled graph file into memory
with open( GRAPH, mode='rb' ) as f:
    blob = f.read()

# Load the graph buffer into the NCS
graph = mvnc.Graph(GRAPH)
# Set up fifos
fifo_in, fifo_out = graph.allocate_with_fifos( device, blob )

img = skimage.io.imread( sys.argv[1] )
img = skimage.transform.resize( img, SIZE, preserve_range=True, mode='constant' )

# Read one label per line, skipping a 'classes' header line if present
with open( "./labels.txt" ) as f:
    labels = [ line.rstrip('\n') for line in f if line != 'classes\n' ]

print( "\n==============================================================" )
# Queue the image (as a float32 array) for inference
graph.queue_inference_with_fifo_elem( fifo_in, fifo_out, img.astype(numpy.float32), None )
# Get the results from NCS
output, userobj = fifo_out.read_elem()

# Get execution time
inference_time = graph.get_option( mvnc.GraphOption.RO_TIME_TAKEN )

# Find the index of highest confidence
top_prediction = output.argmax()

# Print the top prediction for the image
print( "Predicted " + sys.argv[1]
       + " as " + labels[top_prediction]
       + " in %.2f ms" % ( numpy.sum( inference_time ) )
       + " with %3.1f%%" % ( 100.0 * output[top_prediction] )
       + " confidence." )
print( "==============================================================\n" )

fifo_in.destroy()
fifo_out.destroy()
graph.destroy()
device.close()
device.destroy()

Save the above code as run.py. Before we pass sample images to test the graph, we need to create a labels file with just two entries on separate lines.

echo cat > labels.txt
echo dog >> labels.txt
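
The order of the lines matters, because the script uses the index of the highest score to look up the label. The sketch below shows that mapping in plain Python. The helper names load_labels and top_prediction are my own, and the scores are made-up values for illustration; real scores come from fifo_out.read_elem().

```python
def load_labels(path):
    """Read one label per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def top_prediction(scores, labels):
    """Return (label, score) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("expected %d scores, got %d" % (len(labels), len(scores)))
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]


# Made-up scores for illustration only
label, score = top_prediction([0.08, 0.92], ["cat", "dog"])
print("%s (%.1f%%)" % (label, 100.0 * score))
```

If the labels were written in the opposite order, the same scores would be reported under the wrong class, so keep labels.txt aligned with the class order used during training.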

You may want to download preprocessed sample images from this link.

Run inference on one of the images with the command below:

python3 run.py images/3.jpg

In one of my upcoming tutorials, I will demonstrate how to use Intel Movidius NCS 2 with OpenVINO Toolkit.

Janakiram MSV’s Webinar series, “Machine Intelligence and Modern Infrastructure (MI2)” offers informative and insightful sessions covering cutting-edge technologies. Sign up for the upcoming MI2 webinar for a deep dive on accelerating machine learning inference with Intel Movidius.

Feature image by F. Muhammad from Pixabay.

The post Tutorial: Accelerate Deep Learning Models with Intel Movidius appeared first on The New Stack.

 
