Neural Network Taught to Detect Age and Gender from Video Almost 20% More Accurately

Researchers from the Higher School of Economics have created a technology that helps neural networks identify individual people in video and detect their age and gender more quickly and accurately. The development has already become the basis for offline detection systems in Android mobile apps. The results of the study were published in an article entitled ‘Video-based age and gender recognition in mobile applications’.

Modern neural networks detect gender in video with 90% accuracy. Age prediction is much more complicated. Traditional neural networks treat age as a set of discrete values, e.g., each year between 1 and 100. In each video frame, the network estimates the probability that the person in the image is of a certain age. For example, if the network’s top prediction is 21 years in 30% of the frames and 60 years in 10% of them, its conclusion will be as follows: with a probability of 30% this person is 21, and with a probability of 10% he or she is 60. Because of varying observation conditions or even a slight rotation of the head, the predicted age of the same person varies from frame to frame by plus or minus five years.
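The frame-by-frame voting described above can be illustrated with a short Python sketch. It is only an illustration of the baseline scheme, not the researchers’ code: the function name `vote_on_top_predictions` and the sample frame values are hypothetical.

```python
from collections import Counter


def vote_on_top_predictions(frame_top_ages):
    """Return {age: share of frames in which that age was the top prediction}."""
    if not frame_top_ages:
        raise ValueError("need at least one frame")
    counts = Counter(frame_top_ages)
    total = len(frame_top_ages)
    return {age: n / total for age, n in counts.items()}


# Hypothetical top-1 age for each frame of a 10-frame clip (invented values).
frames = [21, 21, 21, 60, 23, 24, 19, 22, 25, 20]
shares = vote_on_top_predictions(frames)

# Age 21 wins in 3 of 10 frames and age 60 in 1 of 10, matching the
# 30% / 10% example in the text.
print(shares[21], shares[60])
```

Because each frame contributes only its single top guess, one badly lit or rotated frame can add a stray vote such as 60, which is why this baseline is unstable.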

A team of HSE computer vision experts headed by Professor Andrey Savchenko has found a way to optimize the operation of neural networks. Experiments on several video datasets have shown that their technology yields more accurate gender and age recognition on video than such popular convolutional neural networks as VGGFace, VGGFace2, Light CNN, DEX, and age_net/gender_net.

The researchers have implemented a novel method, based on mathematical statistics and Dempster–Shafer theory, for aggregating the confidence levels that the neural network produces for each frame. Facial analysis software usually includes several separate neural networks: one identifies the person, another determines the gender, and so on. Instead, the team has developed an efficient neural network with several outputs. It solves several tasks at once: it predicts age and gender and produces a set of 1,000 numbers (an attribute vector) that uniquely describes each person and allows them to be distinguished from other people. According to the researchers, this solution will work even on low-performance smartphones.
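The article does not spell out the aggregation rule, so the following Python sketch shows only two generic ways of combining per-frame confidence levels: averaging them, and multiplying them and renormalising. When every frame assigns belief only to single ages, Dempster’s rule of combination reduces to this normalised product, but that is an illustration and not necessarily the authors’ method. The function names, the three-age range, and the probabilities are all hypothetical.

```python
import numpy as np


def aggregate_frames(frame_probs, rule="mean"):
    """Combine per-frame class probabilities (frames x classes) into one distribution."""
    probs = np.asarray(frame_probs, dtype=float)
    if probs.ndim != 2 or probs.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D array: frames x classes")
    if rule == "mean":
        combined = probs.mean(axis=0)
    elif rule == "product":
        # Sum logs instead of multiplying to avoid underflow on long videos.
        log_sum = np.log(probs + 1e-12).sum(axis=0)
        combined = np.exp(log_sum - log_sum.max())
    else:
        raise ValueError(f"unknown rule: {rule}")
    return combined / combined.sum()


def expected_age(age_probs, ages):
    """Turn a distribution over discrete ages into a single point estimate."""
    return float(np.dot(age_probs, ages))


# Hypothetical confidences for ages 20, 21 and 22 in three frames.
ages = np.array([20, 21, 22])
frames = [[0.2, 0.5, 0.3],
          [0.3, 0.4, 0.3],
          [0.1, 0.6, 0.3]]

mean_probs = aggregate_frames(frames, rule="mean")
product_probs = aggregate_frames(frames, rule="product")
print(expected_age(mean_probs, ages))
print(ages[int(np.argmax(product_probs))])
```

Unlike voting on each frame’s top guess, both rules use the full distribution from every frame, so a frame that is unsure contributes less than a frame that is confident.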

As of today, the researchers have scaled the solution for Android mobile apps. The neural network collects information on a user’s circle of acquaintances, family composition, and the age and gender of their closest contacts. The system works offline, processing photos and videos from the gallery on the user’s device. This is what distinguishes it from similar developments that analyse social media profiles and comments, such as those on Instagram (owned by Meta, which has been recognised as an extremist organisation in Russia).

These data may be used by the smartphone manufacturer to create various recommendation systems. For example, if a user has a considerable amount of content featuring a toddler, he or she may be shown an advertisement for a children’s store. If many friends appear in photos taken on certain days, the smartphone will suggest a restaurant for a party. This technology has already attracted the interest of the biggest smartphone manufacturer.

‘A gadget is a short-distance sprinter, since its battery quickly depletes. That is why it is important to make sure the smartphone with this app works quickly and consumes less power. To avoid wasting time and battery charge, we use our efficient convolutional neural network to analyse the images,’ explains Professor Savchenko. ‘We also pay a lot of attention to privacy: processing is done only on the user’s smartphone in offline mode. The phone does not send the photos and videos to a remote server, and the data can’t be seen or analysed by third parties. The server receives not the images but a ready-made profile with demographic and social data. For example, most often, your photos feature four women and two men, and you also like going to McDonald’s. This helps in delivering advertisements that are of greatest interest to users.’

December 20, 2018
