Overview

One of the most important steps in human face recognition is accurately detecting the faces present in a scene. That is why choosing the right face detection framework matters when you are planning an iOS application.

In this article, we compare the performance and precision of the most popular frameworks used for face detection on iOS: Core Image, OpenCV, and Vision.

Goals

Examine Core Image face detection in the sample app we created, using several different images
Examine Vision face detection in the sample app we created, using several different images
Examine OpenCV face detection in the sample app we created, using several different images
Compare performance and confidence across all three frameworks

About

Core Image: (https://developer.apple.com/)

Core Image is an image processing and analysis technology designed to provide near real-time processing for still and video images. It operates on image data types from the Core Graphics, Core Video, and Image I/O frameworks, using either a GPU or CPU rendering path.

Vision: (https://machinelearning.apple.com)

Apply high-performance image analysis and computer vision techniques to identify faces, detect features, and classify scenes in images and video.

OpenCV: (https://opencv.org/about.html)

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library.

What we’re going to do:

First, we’re going to talk about the APIs required to detect human faces using Core Image, Vision, and OpenCV.

Then we will compare performance and precision among all three frameworks.

Face Detection using Core Image

In order to detect a face using Core Image, here are the steps you must follow:

A: Convert the UIImage to CIImage

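     // image is the source UIImage; CIImage(image:) is failable, so unwrap the result
     guard let ciimage = CIImage(image: image) else { return }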

B: Set the CIDetector accuracy (You can choose CIDetectorAccuracyLow or CIDetectorAccuracyHigh depending upon the requirement)

     let detectOption = [CIDetectorAccuracy: CIDetectorAccuracyHigh]

C: Create a CIDetector of type face with these options (CIDetector is an image processor that identifies notable features, such as faces and barcodes, in a still image or video.)

     let faceDetector = CIDetector(ofType: CIDetectorTypeFace, context: nil, options: detectOption)

D: Finally, call the features(in:) method of CIDetector, which extracts and returns an array of the faces found in the given image.

     // CIDetector(ofType:context:options:) returns an optional, hence the optional chaining
     let faces = faceDetector?.features(in: ciimage) ?? []
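The four steps above can be combined into a single helper. This is a minimal sketch, not the sample app's actual code: the function name detectFacesWithCoreImage is hypothetical, and returning only the bounding boxes is an assumption made to keep the example short.

```swift
import UIKit
import CoreImage

/// Hypothetical helper: returns the bounding boxes of the faces CIDetector finds,
/// or an empty array if the image cannot be converted or no detector is available.
/// The rectangles use Core Image coordinates (origin at the bottom-left corner).
func detectFacesWithCoreImage(in image: UIImage) -> [CGRect] {
    // Step A: convert the UIImage to a CIImage (the initializer is failable).
    guard let ciImage = CIImage(image: image) else { return [] }

    // Step B: ask for high accuracy; CIDetectorAccuracyLow trades accuracy for speed.
    let options: [String: Any] = [CIDetectorAccuracy: CIDetectorAccuracyHigh]

    // Step C: create a detector of type face (this initializer also returns an optional).
    guard let detector = CIDetector(ofType: CIDetectorTypeFace,
                                    context: nil,
                                    options: options) else { return [] }

    // Step D: run the detector and keep only the face features.
    return detector.features(in: ciImage)
        .compactMap { $0 as? CIFaceFeature }
        .map { $0.bounds }
}
```

The number of detected faces is then simply the count of the returned array.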

Observation :

Face Detection using Vision

To detect faces using Vision, follow the steps below:

A: Create a VNDetectFaceRectanglesRequest (an image analysis request that finds faces within an image) and a VNSequenceRequestHandler (which processes image analysis requests pertaining to a sequence of multiple images)

          let faceDetectionRequest = VNDetectFaceRectanglesRequest()

          let faceDetectionHandler = VNSequenceRequestHandler()

B: Convert the UIImage to CIImage

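          // image is the source UIImage; CIImage(image:) is failable, so unwrap the result
          guard let ciimage = CIImage(image: image) else { return }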

C: Perform the face detection request on the given image

          try? faceDetectionHandler.perform([faceDetectionRequest], on: ciimage)

D: Finally, read the results property of the VNDetectFaceRectanglesRequest, which holds an array of face observations

          let results = faceDetectionRequest.results as? [VNFaceObservation]
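Put together, the Vision steps might look like the following. This is a sketch under stated assumptions: the function name detectFacesWithVision is hypothetical, and logging the error and returning an empty array is just one possible way to handle a failed request.

```swift
import UIKit
import Vision

/// Hypothetical helper: returns the face observations Vision finds in the image.
/// Each observation's boundingBox is expressed in normalized coordinates (0...1).
func detectFacesWithVision(in image: UIImage) -> [VNFaceObservation] {
    // Step B: convert the UIImage to a CIImage (the initializer is failable).
    guard let ciImage = CIImage(image: image) else { return [] }

    // Step A: create the request and the handler that will run it.
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNSequenceRequestHandler()

    // Step C: perform the request; perform(_:on:) throws if the analysis fails.
    do {
        try handler.perform([request], on: ciImage)
    } catch {
        print("Vision face detection failed: \(error)")
        return []
    }

    // Step D: read the results. On recent SDKs `results` is already typed,
    // so this cast only produces a compiler warning there.
    return (request.results as? [VNFaceObservation]) ?? []
}
```

Unlike `try?`, the explicit do/catch keeps the error visible, which helps when a request fails silently during testing.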

Observation :

Face Detection using OpenCV

OpenCV is a C++ API consisting of various modules containing a wide range of functions, from low-level image color space conversions to high-level machine learning tools.

Since OpenCV is written in C++, you cannot call its functions directly from Swift; the calls have to go through an Objective-C++ wrapper.

Read the article: How to use OpenCV with Swift

OpenCV uses a CascadeClassifier (which detects objects in an image or video) to detect faces in a Mat image (Mat: n-dimensional matrix class) by loading the Haar cascade for frontal faces (to learn more about Haar cascades: https://docs.opencv.org/3.3.0/d7/d8b/tutorial_py_face_detection.html)

To detect a face using OpenCV, follow the steps below:

A: First you need to create the Cascade Classifier by loading the haarcascade_frontalface_alt2.xml

         static cv::CascadeClassifier _faceCascade;

         // Pass the resource name without its extension; the extension goes in ofType:
         NSString *faceCascadePath = [[NSBundle mainBundle] pathForResource:@"haarcascade_frontalface_alt2" ofType:@"xml"];

         _faceCascade.load([faceCascadePath UTF8String]);

B: Convert the UIImage to a Mat image

  // CGImageGetColorSpace does not transfer ownership, so colorSpace must not be released here
  CGColorSpaceRef colorSpace = CGImageGetColorSpace(image.CGImage);

  CGFloat cols = image.size.width;

  CGFloat rows = image.size.height;

  // 8 bits per component, 4 channels (RGBA)
  cv::Mat cvMat(rows, cols, CV_8UC4);

  CGContextRef contextRef = CGBitmapContextCreate(cvMat.data, cols, rows, 8, cvMat.step[0], colorSpace, kCGImageAlphaNoneSkipLast | kCGBitmapByteOrderDefault);

  CGContextDrawImage(contextRef, CGRectMake(0, 0, cols, rows), image.CGImage);

  CGContextRelease(contextRef);

  // The cascade classifier works on a grayscale image
  cv::Mat matImage;

  cv::cvtColor(cvMat, matImage, cv::COLOR_RGBA2GRAY);

C: Finally, call the detectMultiScale function of the Cascade Classifier, which returns the faces detected in the given image.

  std::vector<cv::Rect> faces;

  _faceCascade.detectMultiScale(matImage, faces, 1.1, 4, CV_HAAR_DO_ROUGH_SEARCH, cv::Size(30, 30));

Observation:

Comparison (Tested on iPhone 7):

I compared the processing time and confidence of all three technologies for face detection, using different scenes at lower and higher resolutions.
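The article does not show how the processing times were captured. One plausible approach is to wrap each detection call in a small timing helper like the sketch below. The name measureMilliseconds is hypothetical, and the measurement is a single wall-clock sample, so averaging several runs would give steadier numbers.

```swift
import Foundation
import Dispatch

/// Hypothetical helper: runs `work` once and returns its result together with
/// the elapsed time in milliseconds, based on the monotonic uptime clock.
func measureMilliseconds<T>(_ work: () throws -> T) rethrows -> (result: T, milliseconds: Double) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = try work()
    let end = DispatchTime.now().uptimeNanoseconds
    // Convert nanoseconds to milliseconds.
    return (result, Double(end - start) / 1_000_000)
}

// Usage sketch with a stand-in workload; in the app, the closure would hold
// one of the detection calls described earlier in this article.
let timed = measureMilliseconds { (1...1_000).reduce(0, +) }
print("Result: \(timed.result), took \(timed.milliseconds) ms")
```

Timing only the detection call, and not the image loading or conversion around it, keeps the comparison between frameworks focused on the detector itself.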

The table below shows the comparison results taken from five scenes:

First Column shows the number of faces located in a scene.

Second Column shows the resolution of each scene.

Third Column contains the link to every scene.

Fourth column shows the processing time (in milliseconds) taken by Core Image, Vision, and OpenCV.

Fifth column shows the number of faces detected by Core Image, Vision and OpenCV from the given scene.

The graph below displays the processing time taken by each technology, where the x-axis represents the scene resolution and the y-axis represents the processing time in milliseconds:

The graph below displays the face detection accuracy, where the x-axis represents the scene resolution together with the number of faces located in the scene and the y-axis represents the number of faces detected:

Results of Comparison:

Comparing all three technologies, I observed that face detection using Vision and Core Image is much faster. OpenCV took more processing time, but its accuracy in detecting faces in a given scene was much better than that of Vision and Core Image.

I also observed that Vision and Core Image failed to detect all the faces in higher-resolution scenes, whereas OpenCV produced much more accurate results. It should be noted that even OpenCV sometimes produces inaccurate detections, but it is much less likely to fail than the other two frameworks.