Machine Learning Harnessed for Mobile Eye Tracker

Swiping on tablet

(niekverlaan, Pixabay)

17 June 2016. Tracking eye movements, which usually requires high-priced equipment, will soon be possible with a mobile device camera, thanks to machine learning and crowdsourcing. An engineering team at Massachusetts Institute of Technology and the University of Georgia will describe the technology on 28 June at the IEEE Conference on Computer Vision and Pattern Recognition in Las Vegas.

Eye-tracking is a potentially valuable diagnostic and analytical tool for psychology and market research, but the complexity and expense of today’s technology limits its use. “The field is kind of stuck in this chicken-and-egg loop,” says Aditya Khosla, co-leader of the project and an MIT graduate student in electrical engineering and computer science, in an MIT statement. “Since few people have the external devices, there’s no big incentive to develop applications for them. Since there are no applications, there’s no incentive for people to buy the devices.”

Khosla and fellow computer science doctoral student Kyle Krafka at the University of Georgia seek to break this cycle by developing an inexpensive eye-tracking application for ordinary mobile devices, using their built-in cameras. Solving the problem, however, requires going beyond hardware. The system also needs to recognize and interpret small, subtle eye movements, which calls for sophisticated models and software.

For their solution, Khosla, Krafka, and colleagues employ machine learning, where underlying algorithms for software are derived from patterns of individual behavior, in this case the ways people’s eyes move when they use their phones and tablets. But the developers discovered the largest available data set of gaze patterns has only 50 cases. They needed many more data points to achieve a sufficient level of accuracy.

The researchers turned to crowdsourcing to build their database. The team created an Apple iPhone app that flashes a red dot at a spot on a phone or tablet screen, then replaces it with an R or L, an instruction to swipe the screen right or left. As the participant reacts to the red dot and executes the swipe, the device's camera captures images of the user's face. The team recruited participants through Mechanical Turk, Amazon's crowdsourcing site, and paid each individual a small fee.

The team’s efforts yielded data from 1,450 participants, with each individual generating an average of 1,600 images. To mine these data efficiently, the researchers employed a technique known as dark knowledge, which creates an interim trained network offering an approximate solution to the larger generalized model. That interim network then makes possible further training and learning on a much smaller set of new data that fits on a smartphone.
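The dark-knowledge idea described above is a form of knowledge distillation: a large trained "teacher" network produces softened class probabilities, and a much smaller "student" model is trained to reproduce them. The sketch below illustrates that general recipe with toy linear models and made-up sizes; none of the names, dimensions, or hyperparameters come from the iTracker work.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about near-miss classes
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Toy data: 200 samples, 10 features, 3 classes (all sizes illustrative)
X = rng.normal(size=(200, 10))
W_teacher = rng.normal(size=(10, 3))       # stands in for a large trained network
teacher_logits = X @ W_teacher

T = 4.0                                    # distillation temperature
soft_targets = softmax(teacher_logits, T)  # soft class probabilities from the teacher

# Train a small linear student to match the teacher's soft targets
W_student = np.zeros((10, 3))
lr = 0.5
for _ in range(500):
    probs = softmax(X @ W_student, T)
    # Gradient of cross-entropy between student probabilities and soft targets
    grad = X.T @ (probs - soft_targets) / len(X)
    W_student -= lr * grad

# The student's hard predictions should now largely agree with the teacher's
agree = np.mean((X @ W_student).argmax(1) == teacher_logits.argmax(1))
print(f"student/teacher agreement: {agree:.2f}")
```

The softened targets carry more information per example than hard labels, which is what lets a compact model, small enough to run on a phone, approximate a much larger one from comparatively little data.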

The authors say their network, called iTracker, now yields an eye-tracking prediction accuracy within 1.3 centimeters when calibrated on smartphones and 2.1 centimeters on tablets. The researchers note that reaching the 0.5-centimeter accuracy required for commercial applications will mean capturing data from some 10,000 participants.

Khosla is co-founder and chief technologist of PathAI, a start-up enterprise applying neural networks and deep machine learning to large data sets for diagnosing cancer.


*     *     *

1 comment to Machine Learning Harnessed for Mobile Eye Tracker

  • The researchers’ machine-learning system was a neural network, which is a software abstraction but can be thought of as a huge network of very simple information processors arranged into discrete layers. Training modifies the settings of the individual processors so that a data item — in this case, a still image of a mobile-device user — fed to the bottom layer will be processed by the subsequent layers. The output of the top layer will be the solution to a computational problem — in this case, an estimate of the direction of the user’s gaze.
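The layered processing the commenter describes can be sketched in a few lines: each layer is a bank of simple processors (weighted sums plus a nonlinearity), the bottom layer receives the flattened image, and the top layer emits the gaze estimate. The layer sizes below are illustrative stand-ins, not the actual iTracker architecture, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(1)

def layer(x, W, b):
    # One layer of simple processors: weighted sum followed by a
    # ReLU nonlinearity applied element-wise
    return np.maximum(0.0, x @ W + b)

# Hypothetical sizes: a 64-value flattened face image, two hidden
# layers, and a top layer emitting an (x, y) gaze estimate
image = rng.normal(size=(1, 64))                # stand-in for a face image
W1, b1 = rng.normal(size=(64, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 16)), np.zeros(16)
W3, b3 = rng.normal(size=(16, 2)), np.zeros(2)  # top layer: gaze (x, y)

h = layer(image, W1, b1)   # bottom layer processes the raw pixels
h = layer(h, W2, b2)       # next layer processes the previous layer's output
gaze = h @ W3 + b3         # top layer outputs the gaze estimate
print(gaze.shape)          # (1, 2)
```

Training, as the comment says, amounts to adjusting the weight matrices and biases so that the top layer's output matches the known gaze location for each training image.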