TY - GEN
T1 - Contextualized attention-based knowledge transfer for spoken conversational question answering
AU - You, Chenyu
AU - Chen, Nuo
AU - Zou, Yuexian
N1 - Publisher Copyright: © 2021 ISCA
PY - 2021
Y1 - 2021
AB - Spoken conversational question answering (SCQA) requires machines to model the flow of multi-turn conversation given the speech utterances and text corpora. Different from traditional text question answering (QA) tasks, SCQA involves audio signal processing, passage comprehension, and contextual understanding. However, ASR systems introduce unexpected noisy signals to the transcriptions, which result in performance degradation on SCQA. To overcome the problem, we propose CADNet, a novel contextualized attention-based distillation approach, which applies both cross-attention and self-attention to obtain ASR-robust contextualized embedding representations of the passage and dialogue history for performance improvements. We also introduce the spoken conventional knowledge distillation framework to distill the ASR-robust knowledge from the estimated probabilities of the teacher model to the student. We conduct extensive experiments on the Spoken-CoQA dataset and demonstrate that our approach achieves remarkable performance in this task.
KW - Conversational question answering
KW - Machine reading comprehension
KW - Spoken conversational question answering
UR - https://www.scopus.com/pages/publications/85115727596
U2 - 10.21437/Interspeech.2021-110
DO - 10.21437/Interspeech.2021-110
M3 - Conference contribution
AN - SCOPUS:85115727596
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 3746
EP - 3750
BT - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
PB - International Speech Communication Association
T2 - 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021
Y2 - 30 August 2021 through 3 September 2021
ER -