SARATOV FALL MEETING SFM 

© 2026 All Rights Reserved

Avoiding overestimation in multivariate analysis of Raman spectra of biological objects: Interpretation with the SP-LIME algorithm

Lyudmila A. Bratchenko1,2, Yulia A. Khristoforova1, Maria A. Skuratova3, Petr A. Lebedev4, Ivan A. Bratchenko1,2; 1Samara University, Samara, Russia; 2Immanuel Kant Baltic Federal University, Kaliningrad, Russia; 3Samara City Clinical Hospital №1 named after N. I. Pirogov, Samara, Russia; 4Samara State Medical University, Samara, Russia

Abstract

Raman spectroscopy combined with multivariate analysis is a powerful analytical tool for solving regression and classification problems in fields ranging from materials science to clinical practice. In practice, however, experimental studies and applications of Raman spectroscopy face numerous challenges, including multicollinearity in spectral data and the ‘black box’ nature of complex analytical models. To address these problems, the proposed classification and regression models must be properly interpreted. This study compares the explanations produced by the SP-LIME algorithm (Local Interpretable Model-agnostic Explanations with submodular pick) for a bilinear model (projection onto latent structures, PLS) and a nonlinear model (one-dimensional convolutional neural network, CNN). The aim is to develop an approach that explains how the analytical models operate and reveals the specific Raman bands with the greatest impact on model performance. The experimental database consists of blood serum spectra recorded from 370 patients. The work focuses in particular on interpreting the performance of a deep learning model, using a real clinical problem as an example.
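To illustrate the kind of explanation SP-LIME produces, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a synthetic 100-point spectrum, a toy black-box function in place of the PLS or CNN model, ten equal-width bands, a zero baseline for switched-off bands, and an exponential proximity kernel. The names `lime_explain`, `submodular_pick`, and `black_box`, and all parameter values, are hypothetical.

```python
import numpy as np


def lime_explain(predict, x, n_bands=10, n_samples=500, kernel_width=0.75, seed=0):
    """LIME-style local explanation of one spectrum.

    Random subsets of contiguous bands are zeroed (assumed baseline), and a
    proximity-weighted ridge regression maps the on/off band mask to the
    black-box output. Returns one weight per band.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    bands = np.array_split(np.arange(x.size), n_bands)

    masks = rng.integers(0, 2, size=(n_samples, n_bands))
    masks[0] = 1  # keep the unperturbed spectrum in the sample

    perturbed = np.tile(x, (n_samples, 1))
    for j, idx in enumerate(bands):
        perturbed[np.ix_(masks[:, j] == 0, idx)] = 0.0
    y = predict(perturbed)

    # Proximity weight: samples with fewer bands switched off count more.
    dist = 1.0 - masks.mean(axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Closed-form weighted ridge regression with an unpenalized intercept.
    A = np.hstack([np.ones((n_samples, 1)), masks])
    reg = 1e-3 * np.eye(A.shape[1])
    reg[0, 0] = 0.0
    coef = np.linalg.solve(A.T @ (w[:, None] * A) + reg, A.T @ (w * y))
    return coef[1:]


def submodular_pick(W, budget):
    """Greedy pick of instances whose sparse explanations cover important bands.

    W has one row per explained spectrum and one column per band; zero
    entries mean the band is absent from that explanation.
    """
    W = np.asarray(W, dtype=float)
    importance = np.sqrt(np.abs(W).sum(axis=0))
    chosen = []
    covered = np.zeros(W.shape[1], dtype=bool)
    for _ in range(min(budget, W.shape[0])):
        gains = np.full(W.shape[0], -1.0)
        for i in range(W.shape[0]):
            if i not in chosen:
                gains[i] = importance[(W[i] != 0) & ~covered].sum()
        best = int(np.argmax(gains))
        chosen.append(best)
        covered |= W[best] != 0
    return chosen


def black_box(X):
    # Toy model: the output depends only on points 30..39 (band 3 of 10).
    return X[:, 30:40].sum(axis=1)


spectrum = np.ones(100)
weights = lime_explain(black_box, spectrum)
top_band = int(np.argmax(np.abs(weights)))
```

In this toy setting the largest weight falls on the band that the black box actually uses. With a real model, `predict` would wrap the fitted PLS or CNN, and `submodular_pick` would be applied to a matrix of sparsified per-spectrum explanations to select a small, non-redundant set of spectra to inspect.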

Speaker

Lyudmila A. Bratchenko
Samara University; Immanuel Kant Baltic Federal University
Russia
