TY - GEN
T1 - RadioTransformer
T2 - 17th European Conference on Computer Vision, ECCV 2022
AU - Bhattacharya, Moinak
AU - Jain, Shubham
AU - Prasanna, Prateek
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - In this work, we present RadioTransformer, a novel student-teacher transformer framework, that leverages radiologists’ gaze patterns and models their visuo-cognitive behavior for disease diagnosis on chest radiographs. Domain experts, such as radiologists, rely on visual information for medical image interpretation. On the other hand, deep neural networks have demonstrated significant promise in similar tasks even where visual interpretation is challenging. Eye-gaze tracking has been used to capture the viewing behavior of domain experts, lending insights into the complexity of visual search. However, deep learning frameworks, even those that rely on attention mechanisms, do not leverage this rich domain information for diagnostic purposes. RadioTransformer fills this critical gap by learning from radiologists’ visual search patterns, encoded as ‘human visual attention regions’ in a cascaded global-focal transformer framework. The overall ‘global’ image characteristics and the more detailed ‘local’ features are captured by the proposed global and focal modules, respectively. We experimentally validate the efficacy of RadioTransformer on 8 datasets involving different disease classification tasks where eye-gaze data is not available during the inference phase. Code: https://github.com/bmi-imaginelab/radiotransformer
AB - In this work, we present RadioTransformer, a novel student-teacher transformer framework, that leverages radiologists’ gaze patterns and models their visuo-cognitive behavior for disease diagnosis on chest radiographs. Domain experts, such as radiologists, rely on visual information for medical image interpretation. On the other hand, deep neural networks have demonstrated significant promise in similar tasks even where visual interpretation is challenging. Eye-gaze tracking has been used to capture the viewing behavior of domain experts, lending insights into the complexity of visual search. However, deep learning frameworks, even those that rely on attention mechanisms, do not leverage this rich domain information for diagnostic purposes. RadioTransformer fills this critical gap by learning from radiologists’ visual search patterns, encoded as ‘human visual attention regions’ in a cascaded global-focal transformer framework. The overall ‘global’ image characteristics and the more detailed ‘local’ features are captured by the proposed global and focal modules, respectively. We experimentally validate the efficacy of RadioTransformer on 8 datasets involving different disease classification tasks where eye-gaze data is not available during the inference phase. Code: https://github.com/bmi-imaginelab/radiotransformer
KW - Chest radiographs
KW - Disease classification
KW - Eye-gaze
KW - Visual attention
UR - https://www.scopus.com/pages/publications/85142667682
U2 - 10.1007/978-3-031-19803-8_40
DO - 10.1007/978-3-031-19803-8_40
M3 - Conference contribution
AN - SCOPUS:85142667682
SN - 9783031198021
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 679
EP - 698
BT - Computer Vision – ECCV 2022 - 17th European Conference, Proceedings
A2 - Avidan, Shai
A2 - Brostow, Gabriel
A2 - Cissé, Moustapha
A2 - Farinella, Giovanni Maria
A2 - Hassner, Tal
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 October 2022 through 27 October 2022
ER -