Fork me on GitHub

OpenCL

Intro

Open Computing Language (OpenCL) is an open standard for writing code that runs across heterogeneous platforms including CPUs, GPUs, DSPs and etc. In particular OpenCL provides applications with an access to GPUs for non-graphical computing (GPGPU) that in some cases results in significant speed-up. In Computer Vision many algorithms can run on a GPU much more effectively than on a CPU: e.g. image processing, matrix arithmetic, computational photography, object detection etc.

History

Acceleration of OpenCV with OpenCL started 2011 by AMD. As the result the OpenCV-2.4.3 release included the new ocl module containing OpenCL implementations of some existing OpenCV algorithms. That is, when OpenCL runtime and a compatible device are available on a client machine, user may call cv::ocl::resize() instead of cv::resize() to use the accelerated code. During 3 years more and more functions and classes have been added to the ocl module; but it has been a separate API alongside with the primary CPU-oriented API in OpenCV-2.x.

In OpenCV-3.x the architecture concept has been changed to the so-called Transparent API (T-API). In the new architecture a separate OpenCL-accelerated cv::ocl::resize() is removed from external API and becomes a branch in a regular cv::resize(). This branch is called automatically when it’s possible and makes sense from the performance point of view. More details on the T-API can be found in the following slides. The T-API implementation was sponsored by AMD and Intel companies.

Numbers

Some performance numbers are shown on the picture below:



Code sample

Regular CPU code OpenCL-aware code OpenCV-2.x OpenCL-aware code OpenCV-3.x
// initialization
VideoCapture vcap(...);
CascadeClassifier fd("haar_ff.xml");
Mat frame, frameGray;
vector faces;
for(;;){
  // processing loop
  vcap >> frame;
  cvtColor(frame, frameGray, BGR2GRAY);
  equalizeHist(frameGray, frameGray);
  fd.detectMultiScale(frameGray, faces, ...);
  // draw rectangles …
  // show image …
}
// initialization
VideoCapture vcap(...);
ocl::OclCascadeClassifier fd("haar_ff.xml");
ocl::oclMat frame, frameGray;
Mat frameCpu;
vector faces;
for(;;){
  // processing loop
  vcap >> frameCpu;
  frame = frameCpu;
  ocl::cvtColor(frame, frameGray, BGR2GRAY);
  ocl::equalizeHist(frameGray, frameGray);
  fd.detectMultiScale(frameGray, faces, ...);
  // draw rectangles …
  // show image …
}
// initialization
VideoCapture vcap(...);
CascadeClassifier fd("haar_ff.xml");
UMat frame, frameGray;
vector faces;
for(;;){
  // processing loop
  vcap >> frame;
  cvtColor(frame, frameGray, BGR2GRAY);
  equalizeHist(frameGray, frameGray);
  fd.detectMultiScale(frameGray, faces, ...);
  // draw rectangles …
  // show image …
}

Links