
COntactlesS Multibiometric mObile System in the wild


COSMOS (in Italian, "Un sistema mobile multi-biometrico senza contatto né vincoli", i.e., a contactless, unconstrained multi-biometric mobile system) aims at delivering a comprehensive approach to multibiometric person verification and recognition, covering most contactless biometrics, flexibly integrated through a context-adaptive acquisition/matching strategy based on their complementarity, and exploiting the agile and ubiquitous hardware platforms represented by last-generation smartphones and tablets. More concretely, the project will exploit the specific expertise of each participant to provide an unprecedented unified biometric platform for contactless person verification/recognition by means of both hard biometrics, such as face (both in 2D and 3D), iris, ear, and fingerprint/palmprint, and soft biometrics, such as gait and gaze. Moreover, multi-tracking methods will also be developed to enable screening-from-a-distance capabilities, allowing the proposed system to detect subjects of interest or potential threats to be checked in detail by the other biometric modalities.

Participants: Università di Salerno, Università di Sassari, Università di Pavia, Università di Milano, Università di Modena e Reggio Emilia, Università di Roma "La Sapienza", Università di Napoli "Federico II", Università di Roma Tre

Gaze-Based Biometrics
The group of the University of Pavia has mainly investigated covert gaze-based biometric approaches, as well as their combination with overt ones.

Regarding the covert category, we carried out experiments with 40 participants involved in 88 total test sessions [2]. Gaze data were acquired by means of a low-cost Eye Tribe eye tracker (with a 30 Hz sampling rate). Participants were presented with a random sequence of 20 grayscale pictures, each displayed for six seconds on a monitor and interleaved with a grey screen showing a small cross at the center (displayed for two seconds, so as to define an initial gaze location). The participants' task was simply to observe the images freely.
For each participant and each image, 180 gaze samples were acquired, from which fixations and saccades were obtained. Besides the left and right pupil diameters pdl and pdr, three derived features were considered, namely |pdl - pdr|, pdl / pdr, and pdl · pdr. For all of them, basic statistics were computed (minimum, maximum, mean, standard deviation, median, geometric mean, range, and kurtosis). Through feature ranking (with the Information Gain method), the 21 best pupil features were then selected. The percentages of lost gaze samples and of samples detected outside the screen area were also calculated for each image and participant.
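To make the feature set concrete, the following is a minimal sketch (not the project's actual code) of the pupil-feature extraction described above; the pandas column names "pd_left" and "pd_right" are hypothetical placeholders for the per-sample pupil diameters.

```python
# Minimal sketch (not the project's code) of the pupil-based features
# described above. Column names "pd_left" and "pd_right" are assumptions.
import numpy as np
import pandas as pd
from scipy.stats import gmean, kurtosis

def pupil_features(samples: pd.DataFrame) -> dict:
    """Compute the basic statistics listed above for each pupil-derived signal."""
    pdl = samples["pd_left"].to_numpy()
    pdr = samples["pd_right"].to_numpy()
    signals = {
        "pd_left": pdl,
        "pd_right": pdr,
        "pd_diff": np.abs(pdl - pdr),  # |pdl - pdr|
        "pd_ratio": pdl / pdr,         # pdl / pdr
        "pd_prod": pdl * pdr,          # pdl * pdr
    }
    feats = {}
    for name, s in signals.items():
        feats[f"{name}_min"] = float(s.min())
        feats[f"{name}_max"] = float(s.max())
        feats[f"{name}_mean"] = float(s.mean())
        feats[f"{name}_std"] = float(s.std())
        feats[f"{name}_median"] = float(np.median(s))
        feats[f"{name}_gmean"] = float(gmean(s))  # assumes positive diameters
        feats[f"{name}_range"] = float(s.max() - s.min())
        feats[f"{name}_kurtosis"] = float(kurtosis(s))
    return feats
```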
Gaze data were analyzed both considering each test picture as a whole (Method 1) and using Areas of Interest (Method 2). In both cases, seven additional features regarding fixations and saccades were added.
The following classifiers were used: Naive Bayes (with the Laplace probability estimator), Neural Network (a multi-layer perceptron with 50 hidden neurons), Random Forest (with 100 trees), and AdaBoost (with classification trees as weak learners and Information Gain as the attribute selection criterion). Classification performance was assessed in terms of accuracy, sensitivity, specificity, and precision. We used stratified 10-fold cross-validation: although cross-validation performed on the entire dataset may not be the best way to evaluate a biometric system, we considered it an acceptable compromise between the limited dataset size and an objective evaluation of classifier performance.
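For illustration only, the evaluation protocol can be reproduced with off-the-shelf scikit-learn components; the classifier settings below mirror those listed above, while the synthetic data merely stands in for the real feature vectors and participant labels (the original study may have used different tooling).

```python
# Illustrative scikit-learn re-creation of the evaluation protocol above;
# X and y are synthetic stand-ins for the real features and subject labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=880, n_features=28, n_informative=10,
                           n_classes=40, n_clusters_per_class=1, random_state=0)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(50,), max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100),
    "AdaBoost": AdaBoostClassifier(),  # decision-stump weak learners by default
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```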
Results. For identification, we obtained very good accuracies (around 0.8 with Random Forest for both Method 1 and Method 2), as well as satisfactory values (above 0.8) for sensitivity, specificity, and precision. Very good results were also obtained for verification (with both methods), with most accuracies above 0.9 (0.94 with Random Forest) and sensitivity, specificity, and precision always above 0.85 (and higher than 0.9 in most cases).

A variant of the above-described experiment consisted in the presentation of 18 images belonging to three different "affective" categories (six "Positive", six "Negative", and six "Neutral" images). Forty subjects freely observed the images, displayed on a screen for six seconds each, in random order. An Eye Tribe eye tracker was used. The images were taken from the International Affective Picture System (IAPS).
Results. The obtained results were comparable with those achieved with the previously presented study, and no significant differences in performance were found for the three affective categories.

As a combination of covert and overt approaches, we carried out a study in which eye data were acquired while a PIN was entered through the gaze [3]. In other words, we considered the way the PIN is entered as a biometric trait. The dataset was built involving 45 subjects. The experiment consisted in entering six-digit PINs using an on-screen virtual keypad. A key "press" occurred by looking at a key for two seconds (after which an acoustic signal was also played). The PIN had to be confirmed by pressing a "Done" key, while a "Canc" key was used to delete the last digit entered in case of errors. The PINs were dictated to the participants. As an eye tracker, we used the low-cost Eye Tribe.
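For clarity, the dwell-based "press" mechanism can be sketched as follows; this is an illustrative reconstruction under stated assumptions (a 30 Hz gaze stream and rectangular key areas), not the software actually used in the study.

```python
# Illustrative reconstruction (not the study's implementation) of dwell-time
# key selection: a key is "pressed" when the gaze remains inside its rectangle
# for two consecutive seconds, i.e. 60 samples at the Eye Tribe's 30 Hz rate.
DWELL_SAMPLES = 60  # 2 s * 30 Hz

def detect_presses(gaze_points, keys):
    """gaze_points: iterable of (x, y); keys: dict name -> (x0, y0, x1, y1)."""
    presses = []
    current_key, dwell = None, 0
    for x, y in gaze_points:
        hit = next((k for k, (x0, y0, x1, y1) in keys.items()
                    if x0 <= x <= x1 and y0 <= y <= y1), None)
        if hit is not None and hit == current_key:
            dwell += 1
        else:
            current_key, dwell = hit, (1 if hit is not None else 0)
        if dwell == DWELL_SAMPLES:       # two-second dwell reached
            presses.append(current_key)  # here the acoustic signal would be played
            dwell = 0                    # a new full dwell is needed to repeat a key
    return presses
```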
The features used to characterize testers can be classified into six categories, namely Fixations, Pupil Size, Lost Gaze Samples, Saccades, User Behavior, and Physiological. For the features in each category, several statistics were calculated, obtaining a total of 42 direct or derived features.
Specific features were also defined depending on two different kinds of analysis that were carried out: one based on the whole PIN sequence and the other focused on the single key.
We used four classifiers, namely Naive Bayes, Random Forest, Neural Network, and AdaBoost, with 70%-30% random sampling repeated ten times.
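The sketch below illustrates the 70%-30% random-sampling protocol with ten repetitions, shown with Random Forest only; the data are placeholders, and the number of PIN entries per subject is an assumption not stated in the text.

```python
# Sketch of the 70%-30% random-sampling protocol with ten repetitions.
# The data below are placeholders, not the study's actual features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(450, 42))    # hypothetical: 45 subjects x 10 entries, 42 features
y = np.repeat(np.arange(45), 10)  # subject labels

accuracies = []
for rep in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=rep)
    clf = RandomForestClassifier(n_estimators=100, random_state=rep).fit(X_tr, y_tr)
    accuracies.append(accuracy_score(y_te, clf.predict(X_te)))
print(f"Mean accuracy over 10 repetitions: {np.mean(accuracies):.3f}")
```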
Results. For identification, with all features and the whole PIN, we obtained a maximum classification accuracy of 0.82 with the Neural Network. Reducing the features to 20 through feature selection, the best accuracy was 0.77. Considering single buttons, with all features, the best accuracy was 0.89, again with the Neural Network, and it remained practically the same with the best selected features. Sensitivity and specificity were also fairly good (almost always above 0.8).
As for verification, with all features and the whole PIN, the best accuracy was achieved with Random Forest (0.84) and rose to about 0.87 with the selected features. Very good results were obtained with the single-button analysis, with accuracies close to 0.9 or above. Sensitivity and specificity were almost always higher than 0.9.

As another combination of covert and overt approaches, we implemented experiments in which 42 subjects were asked to freely look at an animation involving three shapes (a circle, a square, and a triangle) that moved along random paths. Data were acquired using a low-cost Eye Tribe eye tracker. The animation was subdivided into three phases: in the first, only the circle was present; in the second, both the circle and the square; and in the third, all three shapes were displayed. The analysis was performed considering the three phases separately, two consecutive phases, and all three phases together. In addition, the whole animation was subdivided into 6, 9, and 18 intervals, analyzed separately to increase the number of available feature vectors. Statistics about left pupil diameter, right pupil diameter, pupil product, pupil difference, pupil ratio, fixation length, and number of fixations were used as features. Four classifiers (Naive Bayes, Neural Network, Classification Tree, and Random Forest) and two different sampling methods were used, with 10-fold cross-validation.
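As an illustration of the interval-based analysis, the sketch below splits an animation's gaze samples into equal intervals and computes, for each interval, a simplified and assumed subset of the pupil statistics listed above.

```python
# Sketch of the interval-based analysis: split the gaze samples into n equal
# intervals and compute one (simplified) feature vector per interval. The
# choice of statistics here is an illustrative subset, not the full set used.
import numpy as np

def interval_features(samples: np.ndarray, n_intervals: int) -> np.ndarray:
    """samples: array of shape (n_samples, 2) with left/right pupil diameters."""
    rows = []
    for chunk in np.array_split(samples, n_intervals):
        pdl, pdr = chunk[:, 0], chunk[:, 1]
        rows.append([
            pdl.mean(), pdr.mean(),
            (pdl * pdr).mean(),        # pupil product
            np.abs(pdl - pdr).mean(),  # pupil difference
            (pdl / pdr).mean(),        # pupil ratio
        ])
    return np.asarray(rows)  # one feature vector per interval
```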
Results. The obtained results show that the second and the third phases, considered together, are the best case. In particular, for both identification and verification the best accuracy (0.81 and 0.9, respectively) was achieved with phases 2 and 3 together, 9 intervals, and the Random Forest classifier.

In a similar study, we used four visual stimuli composed of graphical elements (simple squares) moving on the screen according to different motion patterns. The first animation consisted of a square appearing at a random location on the screen and moving at a constant speed of 450 pixels per second along one of the four 45-degree diagonals, bouncing off the borders at a 90-degree angle. The second stimulus was a square that moved, along a set of predefined radial directions, at a constant speed of 750 pixels per second towards the border of the screen, bouncing back along the same direction to return to the center. The third stimulus consisted of 24 squares, each with an initial speed chosen randomly in the range 450-750 pixels per second, appearing simultaneously at the center of the screen and each moving along a predefined direction towards the borders, progressively shrinking by 40 pixels per second while maintaining the speed chosen at the beginning (once a square reached a border of the screen, it restarted from the center). The fourth animation exploited a single 50-pixel square that, starting from the center of the screen, moved towards the borders along a spiral trajectory.
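As an example of how such stimuli can be generated, the sketch below reproduces the first animation (a square bouncing along 45-degree diagonals at 450 pixels per second); the screen size and frame rate are assumptions, since they are not specified above.

```python
# Illustrative generator for the first stimulus: a square starting at a random
# position and moving at 450 px/s along a 45-degree diagonal, bouncing off the
# screen borders. Screen resolution and frame rate are assumptions.
import random

W, H = 1920, 1080     # assumed screen resolution (pixels)
SPEED, FPS = 450, 60  # speed from the text; frame rate assumed

def bouncing_square_path(duration_s: float):
    x, y = random.uniform(0, W), random.uniform(0, H)
    dx, dy = random.choice([(1, 1), (1, -1), (-1, 1), (-1, -1)])  # diagonal directions
    step = SPEED / FPS  # pixels per frame
    path = []
    for _ in range(int(duration_s * FPS)):
        x, y = x + dx * step, y + dy * step
        if not 0 <= x <= W:  # bounce: reverse the horizontal component and clamp
            dx, x = -dx, min(max(x, 0), W)
        if not 0 <= y <= H:  # bounce: reverse the vertical component and clamp
            dy, y = -dy, min(max(y, 0), H)
        path.append((x, y))
    return path
```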
Forty-four subjects participated in the experiments.
Results. In both identification and verification, machine learning algorithms achieved recognition accuracies higher than 74% and 90%, respectively.
As a possible application of gaze-based biometrics, we have also explored the forensic field [4], which we may consider further in the near future.

We also investigated an area of biometrics that has recently attracted great attention, namely gender and age classification (with applications in security, surveillance, marketing, and demographic information gathering). This kind of information, extracted from a biometric sample, can also help reduce the time needed to identify a specific individual.
In particular, in our study we exploited pupil size as a discriminating feature for the estimation of gender and age. Data obtained from the free observation of face images were used to train AdaBoost and SVM classifiers, considering both the best results produced by each classifier and their fusion through weighted means.
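The weighted-mean fusion can be sketched as follows, assuming two trained scikit-learn-style classifiers that expose predict_proba; the weight w is a tunable assumption, as the study's exact weighting scheme is not detailed here.

```python
# Minimal sketch of weighted-mean score fusion, assuming two trained
# classifiers exposing predict_proba (e.g., AdaBoost and an SVM trained
# with probability=True); the weight w is an assumed, tunable parameter.
import numpy as np

def fuse_predictions(clf_a, clf_b, X, w: float = 0.5):
    """Return fused class decisions from a weighted mean of class probabilities."""
    proba = w * clf_a.predict_proba(X) + (1.0 - w) * clf_b.predict_proba(X)
    # Map back to class labels (assumes both classifiers share the same class order)
    return clf_a.classes_[np.argmax(proba, axis=1)]
```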
With experiments involving more than 100 participants, we found that pupil size can provide significant results, better than those achievable using data on fixations and gaze paths.
Results. Pupil Diameter Mean (PDM) proved to be the most effective discriminating feature for both gender and age, with best accuracies of 65.46% for gender and 88.66% for age categories (both with SVM) [1].

Related publications
  1. V. Cantoni, L. Cascone, M. Nappi, M. Porta (2020). "Demographic classification through pupil analysis". Image and Vision Computing, Elsevier, Vol. 102, October 2020 (DOI 10.1016/j.imavis.2020.103980)
  2. M. Porta, A. Barboni (2019). “Strengthening Security in Industrial Settings: A Study on Gaze-Based Biometrics through Free Observation of Static Images”. Proceedings of ETFA '19 (24th IEEE Conference on Emerging Technologies and Factory Automation), Zaragoza, Spain, September 10-13, 2019, pp. 1273-1277 (DOI 10.1109/ETFA.2019.8868961)
  3. V. Cantoni, T. Lacovara, M. Porta, H. Wang (2018). “A Study on Gaze-Controlled PIN Input with Biometric Data Analysis”. Proceedings of CompSysTech ’18 (19th International Conference on Computer Systems and Technologies), Ruse, Bulgaria, September 13-14, 2018, pp. 99-103 (DOI 10.1145/3274005.3274029)
  4. V. Cantoni, M. Musci, N. Nugrahaningsih, M. Porta (2018). “Gaze-based biometrics: An introduction to forensic applications”. Pattern Recognition Letters, Vol. 113, 2018, pp. 54-57 (DOI 10.1016/j.patrec.2016.12.006)
PROJECT INFO
Duration: 01/03/2017 - 01/03/2020
Funded by: MIUR
Project type: PRIN 2015
Get In Touch

Laboratorio di Visione Artificiale e Multimedia
Dipartimento di Ingegneria Industriale e dell'Informazione
Università di Pavia
Via Ferrata 5, 27100 Pavia - ITALY

+39 0382 98 5372/5486

web-vision@unipv.it