TY - GEN
T1 - Patch-level Gaze Distribution Prediction for Gaze Following
AU - Miao, Qiaomu
AU - Hoai, Minh
AU - Samaras, Dimitris
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image. Recent works detect the gaze target by training a heatmap regression task with a pixel-wise mean-square error (MSE) loss, while formulating the in/out prediction task as a binary classification task. This training formulation puts a strict, pixel-level constraint in higher resolution on the single annotation available in training, and does not consider annotation variance and the correlation between the two subtasks. To address these issues, we introduce the patch distribution prediction (PDP) method. We replace the in/out prediction branch in previous models with the PDP branch, by predicting a patch-level gaze distribution that also considers the outside cases. Experiments show that our model regularizes the MSE loss by predicting better heatmap distributions on images with larger annotation variances, meanwhile bridging the gap between the target prediction and in/out prediction subtasks, showing a significant improvement in performance on both subtasks on public gaze following datasets.
AB - Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image. Recent works detect the gaze target by training a heatmap regression task with a pixel-wise mean-square error (MSE) loss, while formulating the in/out prediction task as a binary classification task. This training formulation puts a strict, pixel-level constraint in higher resolution on the single annotation available in training, and does not consider annotation variance and the correlation between the two subtasks. To address these issues, we introduce the patch distribution prediction (PDP) method. We replace the in/out prediction branch in previous models with the PDP branch, by predicting a patch-level gaze distribution that also considers the outside cases. Experiments show that our model regularizes the MSE loss by predicting better heatmap distributions on images with larger annotation variances, meanwhile bridging the gap between the target prediction and in/out prediction subtasks, showing a significant improvement in performance on both subtasks on public gaze following datasets.
KW - Algorithms: Biometrics
KW - Image recognition and understanding (object detection, categorization, segmentation, scene modeling, visual reasoning)
KW - Video recognition and understanding (tracking, action recognition, etc.)
KW - body pose
KW - face
KW - gesture
UR - https://www.scopus.com/pages/publications/85149050943
U2 - 10.1109/WACV56688.2023.00094
DO - 10.1109/WACV56688.2023.00094
M3 - Conference contribution
AN - SCOPUS:85149050943
T3 - Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023
SP - 880
EP - 889
BT - Proceedings - 2023 IEEE Winter Conference on Applications of Computer Vision, WACV 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 23rd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023
Y2 - 3 January 2023 through 7 January 2023
ER -