The Deep Learning Inference Engine backend from the Intel OpenVINO toolkit is one of the supported OpenCV DNN backends. As mentioned in the previous post, ARM CPU support has recently been added to Inference Engine via the dedicated ARM CPU plugin. Let's review how the OpenCV DNN module can leverage Inference Engine and this plugin to run DL networks on ARM CPUs.
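As a quick preview of the OpenCV side, the plugin is reached through the regular cv2.dnn API. Here is a minimal sketch (assuming an OpenCV build with Inference Engine support) that asks which targets the backend offers on the current machine:

import cv2 as cv
# On ARM with the plugin installed, the list should include DNN_TARGET_CPU.
print(cv.dnn.getAvailableTargets(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE))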
Several options for configuring Inference Engine with OpenCV are described in the OpenCV wiki. We will build all components from scratch: OpenVINO, the ARM CPU plugin, and OpenCV, and then run YOLOv4-tiny inference on a Raspberry Pi.
OpenVINO and OpenCV cross-compilation
We will cross-compile OpenVINO with the plugin and OpenCV in a Docker container on an x86 platform. This speeds up the build considerably, since native compilation on a Raspberry Pi would take a while.
First, we will create a Docker image with a configured build environment that contains the OpenCV and OpenVINO dependencies and runs a build script. To do that, create a Dockerfile with the following content:
FROM debian:buster
USER root
RUN echo 'deb https://deb.debian.org/debian unstable main' > /etc/apt/sources.list.d/unstable.list
RUN dpkg --add-architecture armhf && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        crossbuild-essential-armhf \
        python3-dev \
        python3-pip \
        python3-numpy/unstable \
        git-lfs \
        scons \
        wget \
        xz-utils \
        cmake \
        libusb-1.0-0-dev:armhf \
        libgtk-3-dev:armhf \
        libavcodec-dev:armhf \
        libavformat-dev:armhf \
        libswscale-dev:armhf \
        libgstreamer1.0-dev:armhf \
        libgstreamer-plugins-base1.0-dev:armhf \
        libpython3-dev:armhf && \
    rm -rf /var/lib/apt/lists/*
COPY arm_build.sh /arm_build.sh
RUN mkdir /arm
WORKDIR /arm/
CMD ["sh", "/arm_build.sh"]
Before we build the Docker image, we need to create the build script. This script consists of three parts:
- clone OpenCV, OpenVINO, and OpenVINO contrib repositories
- build OpenVINO with ARM CPU plugin
- build OpenCV with IE backend support
Create a file named "arm_build.sh" and add the following content to it:
#!/bin/sh
set -x
fail()
{
echo "$1"
exit 1
}
git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/opencv/opencv.git && \
git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/openvinotoolkit/openvino.git && \
git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/openvinotoolkit/openvino_contrib.git || \
fail "Failed to clone source repositories"
cd /arm/openvino && \
mkdir openvino_install && mkdir openvino_build && cd openvino_build && \
cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX="../openvino_install" \
-DCMAKE_TOOLCHAIN_FILE="../cmake/arm.toolchain.cmake" \
-DTHREADING=SEQ \
-DIE_EXTRA_MODULES=/arm/openvino_contrib/modules/arm_plugin \
-DTHREADS_PTHREAD_ARG="-pthread" \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_CXX_FLAGS=-latomic \
-DENABLE_TESTS=OFF -DENABLE_BEH_TESTS=OFF -DENABLE_FUNCTIONAL_TESTS=OFF .. && \
make --jobs=$(nproc --all) && make install && \
cp /arm/openvino/bin/armv7l/Release/lib/libarmPlugin.so \
/arm/openvino/openvino_install/deployment_tools/inference_engine/lib/armv7l/ || \
fail "OpenVINO build failed"
cd /arm/opencv && mkdir opencv_install && mkdir opencv_build && cd opencv_build && \
PYTHONVER=`ls /usr/include | grep "python3.*"` && \
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_LIST=imgcodecs,videoio,highgui,dnn,python3 \
-DCMAKE_INSTALL_PREFIX="../opencv_install" \
-DOPENCV_CONFIG_INSTALL_PATH="cmake" \
-DCMAKE_TOOLCHAIN_FILE="../platforms/linux/arm-gnueabi.toolchain.cmake" \
-DWITH_IPP=OFF \
-DBUILD_TESTS=OFF \
-DBUILD_PERF_TESTS=OFF \
-DOPENCV_ENABLE_PKG_CONFIG=ON \
-DPYTHON3_PACKAGES_PATH="../opencv_install/python" \
-DPKG_CONFIG_EXECUTABLE="/usr/bin/arm-linux-gnueabihf-pkg-config" \
-DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=ON \
-DPYTHON3_INCLUDE_PATH="/usr/include/${PYTHONVER}" \
-DPYTHON3_NUMPY_INCLUDE_DIRS="/usr/lib/python3/dist-packages/numpy/core/include" \
-DPYTHON3_LIMITED_API=ON \
-DOPENCV_SKIP_PYTHON_LOADER=ON \
-DENABLE_NEON=ON \
-DCPU_BASELINE="NEON" \
-DWITH_INF_ENGINE=ON \
-DWITH_NGRAPH=ON \
-Dngraph_DIR="/arm/openvino/openvino_build/ngraph" \
-DINF_ENGINE_RELEASE=2021030000 \
-DInferenceEngine_DIR="/arm/openvino/openvino_build" \
-DINF_ENGINE_LIB_DIRS="/arm/openvino/bin/armv7l/Release/lib" \
-DINF_ENGINE_INCLUDE_DIRS="/arm/openvino/inference-engine/include" \
-DCMAKE_FIND_ROOT_PATH="/arm/openvino" \
-DENABLE_CXX11=ON .. && \
make --jobs=$(nproc --all) && make install || \
fail "OpenCV build failed"
Now we are ready to build the image and run it:
docker image build -t cross_armhf .
mkdir arm && docker container run -u `id -u`:`id -g` --rm -t -v $PWD/arm:/arm cross_armhf
As soon as the build script completes, you can find the OpenCV artifacts in the arm/opencv/opencv_install directory and the OpenVINO artifacts in the arm/openvino/openvino_install directory. We need to copy both sets of artifacts to the target ARM platform. If the target platform is reachable over SSH, the scp tool can be used:
scp -r arm/{opencv/opencv_install/,openvino/openvino_install} <user>@<host>:<path>
Running the application
To evaluate the ARM CPU plugin, we can use the following Python application to detect objects:
import cv2 as cv
import numpy as np
import os
import argparse

def draw_boxes(image, boxes, confidences, class_ids, idxs):
    if len(idxs) > 0:
        for i in idxs.flatten():
            # extract bounding box coordinates
            left, top = boxes[i][0], boxes[i][1]
            width, height = boxes[i][2], boxes[i][3]
            # draw bounding box and label
            cv.rectangle(image, (left, top), (left + width, top + height), (0, 255, 0))
            label = "%s: %.2f" % (classes[class_ids[i]], confidences[i])
            cv.putText(image, label, (left, top - 5), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0))
    return image

def make_prediction(net, layer_names, labels, frame, conf_threshold, nms_threshold):
    boxes = []
    confidences = []
    class_ids = []
    frame_height, frame_width = frame.shape[:2]
    # create a blob from a frame
    blob = cv.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(layer_names)
    # extract bounding boxes, confidences and class ids
    for output in outputs:
        for detection in output:
            # extract the scores, class id and confidence
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            # consider the predictions that are above the threshold
            if confidence > conf_threshold:
                center_x = int(detection[0] * frame_width)
                center_y = int(detection[1] * frame_height)
                width = int(detection[2] * frame_width)
                height = int(detection[3] * frame_height)
                # get top left corner coordinates
                left = int(center_x - (width / 2))
                top = int(center_y - (height / 2))
                boxes.append([left, top, width, height])
                confidences.append(float(confidence))
                class_ids.append(class_id)
    idxs = cv.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return boxes, confidences, class_ids, idxs

parser = argparse.ArgumentParser()
parser.add_argument('--model', default='yolov4-tiny.weights', help='Path to a binary file of model')
parser.add_argument('--config', default='yolov4-tiny.cfg', help='Path to network configuration file')
parser.add_argument('--classes', default='coco.names', help='Path to label file')
parser.add_argument('--conf_threshold', type=float, default=0.5, help='Confidence threshold')
parser.add_argument('--nms_threshold', type=float, default=0.3, help='Non-maximum suppression threshold')
parser.add_argument('--input', help='Path to video file')
parser.add_argument('--output', default='', help='Path to directory for output video file')
args = parser.parse_args()

# load names of classes
classes = open(args.classes).read().rstrip('\n').split('\n')

# load a network
net = cv.dnn.readNet(args.config, args.model)
net.setPreferableBackend(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE)
layer_names = net.getUnconnectedOutLayersNames()

cap = cv.VideoCapture(args.input)

# define the codec and create VideoWriter object
if args.output != '':
    input_file_name = os.path.basename(args.input)
    output_file_name, output_file_format = os.path.splitext(input_file_name)
    output_file_name += '-output'
    if output_file_format != '':
        fourcc = int(cap.get(cv.CAP_PROP_FOURCC))
    else:
        output_file_format = '.mp4'
        fourcc = cv.VideoWriter_fourcc(*'mp4v')
    output_file_path = args.output + output_file_name + output_file_format
    fps = cap.get(cv.CAP_PROP_FPS)
    frame_size = (int(cap.get(cv.CAP_PROP_FRAME_WIDTH)),
                  int(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))
    out = cv.VideoWriter(output_file_path,
                         fourcc,
                         fps,
                         frame_size)

while cv.waitKey(1) < 0:
    hasFrame, frame = cap.read()
    if not hasFrame: break
    boxes, confidences, class_ids, idxs = make_prediction(net, layer_names, classes, frame, args.conf_threshold, args.nms_threshold)
    frame = draw_boxes(frame, boxes, confidences, class_ids, idxs)
    if args.output != '':
        out.write(frame)
    else:
        cv.imshow('object detection', frame)
There is no ARM-specific code in the demo application. The default CPU target and the detected ARM architecture tell Inference Engine to use the ARM CPU plugin for inference. If the same application is run on an x86 platform, Inference Engine selects the MKL-DNN (CPU) plugin for model inference instead.
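If you prefer to make this choice explicit rather than relying on the defaults, the target can be pinned alongside the backend. A short sketch using the same model files as in the demo (both constants are part of the public cv2.dnn API):

import cv2 as cv

net = cv.dnn.readNet('yolov4-tiny.cfg', 'yolov4-tiny.weights')
# Request the Inference Engine backend explicitly...
net.setPreferableBackend(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE)
# ...and pin the CPU target: on ARM this resolves to the ARM CPU plugin,
# on x86 to the MKL-DNN CPU plugin.
net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)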
Before we run the application, we need to download the pre-trained YOLOv4-tiny model and a test video file:
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4-tiny.cfg
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/coco.names
wget https://raw.githubusercontent.com/intel-iot-devkit/sample-videos/master/people-detection.mp4
Also, we need to define the LD_LIBRARY_PATH and PYTHONPATH environment variables:
export PYTHONPATH=$PYTHONPATH:<artifacts_dir>/opencv_install/python/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<artifacts_dir>/opencv_install/lib/:<artifacts_dir>/openvino_install/deployment_tools/ngraph/lib/:<artifacts_dir>/openvino_install/deployment_tools/inference_engine/lib/armv7l/
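To verify that the freshly built cv2 module is the one Python actually imports, a quick sanity check can help (a sketch; getBuildInformation() is part of the standard OpenCV API):

import cv2 as cv
# The version should match the build from the container, and the build
# summary should report Inference Engine support.
print(cv.__version__)
print([line for line in cv.getBuildInformation().split('\n') if 'Inference Engine' in line])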
Finally, we can run the application:
python3 object_detection.py --model yolov4-tiny.weights --config yolov4-tiny.cfg --classes coco.names --input people-detection.mp4
If a window can't be displayed on the platform, you can save the output to a video file using the --output flag:
python3 object_detection.py --model yolov4-tiny.weights --config yolov4-tiny.cfg --classes coco.names --input people-detection.mp4 --output ./

Conclusion
In this article, you have learned how to use the ARM CPU plugin with OpenCV and how to validate it by running the YOLO object detection demo. If you run into any issues with the ARM CPU plugin, please don't hesitate to raise a ticket in the OpenVINO contrib repository, where the plugin is hosted.