About the author:
Zhang Yin obtained his bachelor’s degree in computer science and technology from Sun Yat-sen University in 2018. He is currently a master’s student at the Intelligent Software Research Center of the Institute of Software, Chinese Academy of Sciences. His main research interests are compiler technology, especially the support and application of the RISC-V ISA.
This post is part of a Google Summer of Code 2020 project.
OpenCV runs on many hardware platforms and makes use of the SIMD (Single Instruction Multiple Data) acceleration on the ones that support it. Today we will describe how OpenCV was ported and accelerated for RISC-V.
What is RISC-V and Why RISC-V
From Wikipedia:
RISC-V is an open standard instruction set architecture (ISA) based on established reduced instruction set computer (RISC) principles. Unlike most other ISA designs, the RISC-V is provided under open source licenses that do not require fees to use. A number of companies are offering or have announced RISC-V hardware, open source operating systems with RISC-V support are available and the instruction set is supported in several popular software toolchains.
In other words, RISC-V has a free and open license and a more concise design than some other popular architectures. These key features make it likely to be widely used in the future. Since OpenCV’s roots are in open software and hardware, we’re very happy to bring support for RISC-V to the platform and will continue to improve it.
The Optimization Approach
OpenCV provides a convenient method to port many optimized kernels at once to a new CPU, as long as that CPU supports SIMD/vector instructions. We use the so-called Wide Universal Intrinsics for that. So far, Wide Universal Intrinsics support several SIMD instruction sets: SSE, AVX, AVX2, and AVX512 on x86 and x64 architectures, NEON on ARM architecture, VSX on IBM Power architecture, and MSA on MIPS architecture.
The goal of the Google Summer of Code 2020 project, done by the author, was to add an implementation of Wide Universal Intrinsics based on the RISC-V vector extension to enable vector acceleration on RISC-V architecture.
The RISC-V “V” (vector) extension (RVV) is one of the standard extensions of the RISC-V ISA (at the time of writing, the RVV extension is still in a draft state). It adds vector registers and the corresponding vector instructions to the base RISC-V ISA, so that program code can be optimized and accelerated with a vector architecture.
In our code we use the native intrinsics of the RISC-V vector extension to access its vector data types and vector operations, and we wrap them into OpenCV’s Wide Universal Intrinsics. When OpenCV is compiled and runs on the RISC-V platform, the Wide Universal Intrinsics used by the OpenCV algorithms are translated into RISC-V vector instructions.
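To make the wrapping idea concrete, here is a minimal sketch of the pattern. It is not the code from the pull request: the `sketch` namespace, the array-backed `v_float32x4`, and the `add_arrays` kernel are hypothetical, and the wrapper bodies are plain scalar loops so the example builds on any platform. In a real RISC-V backend, each wrapper body would call the matching native RVV intrinsic instead. The names are modeled on the style of OpenCV’s Universal Intrinsics, but the exact set of functions here is an assumption chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for a 128-bit vector holding four floats.
// A plain array is used here so the example is portable.
struct v_float32x4 {
    enum { nlanes = 4 };
    float val[4];
};

// Wrapper layer: each function hides one platform-specific operation.
// In a RISC-V backend these bodies would forward to RVV intrinsics.
inline v_float32x4 v_load(const float* p) {
    v_float32x4 r;
    for (int i = 0; i < v_float32x4::nlanes; ++i) r.val[i] = p[i];
    return r;
}

inline void v_store(float* p, const v_float32x4& v) {
    for (int i = 0; i < v_float32x4::nlanes; ++i) p[i] = v.val[i];
}

inline v_float32x4 v_add(const v_float32x4& a, const v_float32x4& b) {
    v_float32x4 r;
    for (int i = 0; i < v_float32x4::nlanes; ++i) r.val[i] = a.val[i] + b.val[i];
    return r;
}

}  // namespace sketch

// A kernel written only against the wrapper API. It contains no
// platform-specific code, so swapping the wrapper bodies retargets it.
void add_arrays(const float* a, const float* b, float* dst, std::size_t n) {
    assert(n == 0 || (a && b && dst));
    using namespace sketch;
    const std::size_t step = v_float32x4::nlanes;
    std::size_t i = 0;
    // Vector part: process four elements per iteration.
    for (; i + step <= n; i += step)
        v_store(dst + i, v_add(v_load(a + i), v_load(b + i)));
    // Scalar tail: handle the remaining n % 4 elements.
    for (; i < n; ++i) dst[i] = a[i] + b[i];
}
```

The point of the design is that kernels such as `add_arrays` are written once, and only the thin wrapper layer has to be reimplemented for each new instruction set.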
The Current Status of RISC-V acceleration in OpenCV
At present, we have completed the first implementation of a RISC-V version of Wide Universal Intrinsics. This first version compiles successfully with the RISC-V GNU toolchain and with the rvv-llvm toolchain provided by PLCT. It passes all the HAL (Hardware Acceleration Layer) accuracy tests and 11000+ core tests (about 99.8% of core tests) when run on the QEMU simulator.
How to Compile and Run OpenCV on RISC-V
To build OpenCV with RISC-V RVV optimizations enabled, you can use the following commands to cross-compile OpenCV on Ubuntu (tested on Ubuntu 18.04) running on an x64 platform.
1. Gather Prerequisites
apt-get update
apt-get install gcc g++ git make cmake python python3 gcc-multilib vim autoconf automake autotools-dev curl libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev pkg-config libglib2.0-dev
2. Build RISC-V GNU Compiler Toolchain and QEMU simulator
git clone git@github.com:riscv/riscv-gnu-toolchain.git -b rvv-intrinsic
cd riscv-gnu-toolchain
git submodule update --init --recursive
./configure --prefix=/opt/RISCV --with-arch=rv64gcv_zfh --with-abi=lp64d
make linux -j$(nproc)
make build-qemu -j$(nproc)
3. Build OpenCV for RISC-V
git clone git@github.com:opencv/opencv.git
cd opencv
mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../platforms/linux/riscv64-gcc.toolchain.cmake ../
make -j$(nproc)
Once the above commands complete, you have successfully compiled the OpenCV library for RISC-V with RVV. Now we can run our code on a RISC-V platform. Because no suitable hardware is available at the time of writing, we run it on the QEMU simulator we just built. Use the following command to run the accuracy tests on the QEMU simulator.
4. Run Accuracy Test on the QEMU Simulator
/opt/RISCV/bin/qemu-riscv64 -cpu rv64,x-v=true opencv/build/bin/opencv_test_core
The Future Work
In-memory Vector Types
The Wide Universal Intrinsics framework was originally designed around a fixed vector length. Our current implementation on top of the RISC-V vector extension assumes a fixed 128-bit vector length, although the RISC-V vector extension itself is scalable. As a result, the vector types that the current version exposes through Universal Intrinsics are stored in memory, which has a negative effect on performance.
We see two ways to solve this problem:
- A new Wide Universal Intrinsics framework designed to fit vector length agnostic architectures
- Adding non-scalable support for RVV on the compiler side.
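The difference between a fixed-length and a vector-length-agnostic loop can be sketched as follows. This is a scalar emulation, not RVV code: `request_lanes` and `add_arrays_vla` are hypothetical names, and the `hw_lanes` parameter stands in for the vector length that real hardware would report. On RVV, the role of `request_lanes` is played by the instructions that set the vector length for each iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for asking the hardware how many elements it will
// process in this iteration: at most hw_lanes, and never more than remain.
inline std::size_t request_lanes(std::size_t remaining, std::size_t hw_lanes) {
    return std::min(remaining, hw_lanes);
}

// Vector-length-agnostic loop (strip-mining). The same code works for any
// hardware vector length and needs no separate scalar tail.
// Returns the number of loop iterations, for illustration only.
std::size_t add_arrays_vla(const float* a, const float* b, float* dst,
                           std::size_t n, std::size_t hw_lanes) {
    assert(hw_lanes > 0);
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < n;) {
        const std::size_t vl = request_lanes(n - i, hw_lanes);
        // Emulates one vector instruction operating on vl elements.
        for (std::size_t k = 0; k < vl; ++k)
            dst[i + k] = a[i + k] + b[i + k];
        i += vl;
        ++iterations;
    }
    return iterations;
}
```

With 10 elements, this loop takes three iterations when `hw_lanes` is 4 and two when it is 8, without any change to the source. A framework built around a compile-time lane count cannot express this directly, which is why the first option above calls for a redesigned framework.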
Performance Tests and Optimizations
The implementation shown today has passed our accuracy tests. However, as stated above, this implementation of Universal Intrinsics may not be the most efficient. We will carry out more performance testing and introduce further optimizations as time goes on.
Of course, even highly optimized code is useless without compatible hardware to run on. The RISC-V community is actively working on finalizing the RVV specification, and we hope to see hardware support soon.
References and Related Links
- RISC-V on Wikipedia: https://en.wikipedia.org/wiki/RISC-V
- Pull Request of the Implementation: https://github.com/opencv/opencv/pull/18228
- Wide Universal Intrinsic: https://docs.opencv.org/master/df/d91/group__core__hal__intrin.html
- OpenCV official repository: https://github.com/opencv/opencv
- RISC-V ISA specification: https://github.com/riscv/riscv-isa-manual
- RISC-V “V” extension specification: https://github.com/riscv/riscv-v-spec
- RVV Intrinsic specification: https://github.com/riscv/rvv-intrinsic-doc
- RISC-V GNU toolchain: https://github.com/riscv/riscv-gnu-toolchain
- Rvv-llvm from PLCT Group: https://github.com/isrc-cas/rvv-llvm
- In-memory Issue details: https://github.com/riscv/riscv-gnu-toolchain/issues/701