TY - GEN
T1 - CoCa-MIL
T2 - 21st IEEE International Symposium on Biomedical Imaging, ISBI 2024
AU - Goel, Paras
AU - Kapse, Saarthak
AU - Pati, Pushpak
AU - Prasanna, Prateek
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Whole slide image (WSI) classification in digital pathology is a challenging weakly supervised task due to the gigapixel scale of the data. While handcrafted features bring domain-specific insights, deep learned features offer superior generalizability and performance. Drawing inspiration from the attention mechanism in transformers, we introduce CoCa-MIL, a novel framework that unifies these features using Multiple Instance Learning (MIL). CoCa-MIL comprises two methods: Co-Attention, which leverages handcrafted features to guide deep feature-based representation learning, and Cross-Attention, which fuses both feature types to harness their complementary information for slide-level tasks. In this study, we show that both methods surpass traditional single-feature-type WSI classification. On the TCGA Lung Cancer dataset, they achieve accuracy improvements of up to 2.60% and 5.21% over their respective baselines, underscoring the efficacy of attention-based fusion methods in exploiting the complementary nature of the handcrafted and deep features for enhancing performance beyond deep learning alone.
KW - Attention
KW - Co-Attention
KW - Deep Features
KW - Fusion
KW - Handcrafted Features
KW - Multiple Instance Learning
UR - https://www.scopus.com/pages/publications/85203348456
U2 - 10.1109/ISBI56570.2024.10635193
DO - 10.1109/ISBI56570.2024.10635193
M3 - Conference contribution
AN - SCOPUS:85203348456
T3 - Proceedings - International Symposium on Biomedical Imaging
BT - IEEE International Symposium on Biomedical Imaging, ISBI 2024 - Conference Proceedings
PB - IEEE Computer Society
Y2 - 27 May 2024 through 30 May 2024
ER -