TY - JOUR
T1 - Interpretable dimensionality reduction and classification of mass spectrometry imaging data in a visceral pain model via non-negative matrix factorization
AU - Pathirage, Kasun
AU - Virmani, Aman
AU - Scott, Alison J.
AU - Traub, Richard J.
AU - Ernst, Robert K.
AU - Ghodssi, Reza
AU - Babadi, Behtash
AU - Abshire, Pamela Ann
N1 - Publisher Copyright: © 2024 Pathirage et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2024/10/1
Y1 - 2024/10/1
N2 - Mass spectrometry imaging (MSI) is a powerful scientific tool for understanding the spatial distribution of biochemical compounds in tissue structures. In this paper, we introduce three novel approaches in MSI data processing to perform the tasks of data augmentation, feature ranking, and image registration. We use these approaches in conjunction with non-negative matrix factorization (NMF) to resolve two of the biggest challenges in MSI data analysis, namely: 1) the large file sizes and associated computational resource requirements and 2) the complexity of interpreting the very high dimensional raw spectral data. There are many dimensionality reduction techniques that address the first challenge but do not necessarily result in readily interpretable features, leaving the second challenge unaddressed. We demonstrate that NMF is an effective dimensionality reduction algorithm that reduces the size of MSI datasets by three orders of magnitude with limited loss of information, yielding spatial and spectral components with meaningful correlation to tissue structure that may be used directly for subsequent data analysis without the need for additional clustering steps. This analysis is demonstrated on an MSI dataset from female Sprague-Dawley rats for an animal model of comorbid visceral pain hypersensitivity (CPH). We find that high-dimensional MSI data (~100,000 ions per pixel) can be reduced to 20 spectral NMF components with <20% loss in reconstruction accuracy. The resulting spatial NMF components are reproducible and correlate well with H&E-stained tissue images. These components may also be used to generate images with enhanced specificity for different tissue types. Small patches of NMF data (i.e., 20 spatial NMF components over 20 × 20 pixels) provide an accuracy of ~87% in classifying CPH vs naïve control subjects. This paper presents the novel data processing methodologies that were used to produce these results, encompassing novel data processing pipelines for data augmentation to support training for classification, ranking of features according to their contribution to classification, and image registration to enhance tissue-specific imaging.
UR - https://www.scopus.com/pages/publications/85206027012
U2 - 10.1371/journal.pone.0300526
DO - 10.1371/journal.pone.0300526
M3 - Article
C2 - 39388402
AN - SCOPUS:85206027012
SN - 1932-6203
VL - 19
JO - PLoS ONE
JF - PLoS ONE
IS - 10
M1 - e0300526
ER -