SARATOV FALL MEETING SFM 

© 2026 All Rights Reserved

Machine learning for spectral data processing in biospectroscopy. Application to Parkinson's disease diagnostics.

N.P. Bainaev-Mangilev (1), V.V. Salmin (1,2), V.B. Loschenov (3,4), A.B. Ochirova (3), M.N. Andreev (5), E.Yu. Fedotova (5), A.B. Salmina (5), S.N. Illarioshkin (5)

1) Moscow Institute of Physics and Technology (National Research University)
2) Bauman Moscow State Technical University (National Research University)
3) National Research Nuclear University MEPhI
4) Federal Research Center A.M. Prokhorov General Physics Institute of the Russian Academy of Sciences
5) Federal State Budgetary Scientific Institution "Neurology Research Center"

Abstract

Parkinson's disease (PD) is the second most common neurodegenerative disorder, affecting approximately 1% of the population over 60 years of age. In the absence of effective treatments, early and objective diagnostic methods are critical, because they allow timely symptom relief and thereby improve patients' quality of life. A promising area of research in this field is biospectroscopy, specifically the analysis of skin autofluorescence spectra. This method is based on the hypothesis that accumulation of the pathological protein alpha-synuclein in the epidermis correlates with changes in its spectral properties. Recent advances in machine learning (ML) algorithms offer new opportunities for analyzing complex spectroscopic data: they can identify subtle pathology-related patterns that traditional analysis misses and can be used to build classification models for differential diagnosis.

This study involved 92 participants: a control group (n=24, healthy volunteers, 50-80 years), a comparison group (n=15, patients with similar symptoms, 50-90 years), and a PD group (n=53, 50-80 years). For each participant, eight autofluorescence spectra of forearm skin were recorded (excitation: 375 nm, detection range: 400-670 nm). All measurements were performed in the morning to minimize the influence of circadian rhythms. Primary spectral processing included normalization, smoothing, and artifact removal.
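The smoothing and normalization steps can be illustrated with a minimal sketch. The abstract does not say which smoothing filter or normalization scheme was used, so the centered moving average, the scaling to peak intensity, and the function names `moving_average`, `normalize_to_max`, and `preprocess` are all assumptions made for illustration. Artifact removal is not shown.

```python
from statistics import mean


def moving_average(values, window=5):
    """Smooth a spectrum with a centered moving average.

    Assumption: the abstract does not name the smoothing method;
    a moving average is used here only as a simple stand-in.
    Near the edges the window is truncated to the available points.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def normalize_to_max(values):
    """Scale a spectrum so that its maximum intensity equals 1.

    Assumption: peak normalization is one common choice; the study
    may have used a different scheme (for example, unit area).
    """
    peak = max(values)
    if peak <= 0:
        raise ValueError("spectrum has no positive intensity")
    return [v / peak for v in values]


def preprocess(values, window=5):
    """Smooth first, then normalize the smoothed spectrum."""
    return normalize_to_max(moving_average(values, window))
```

Calling `preprocess(intensities, window=5)` on a list of intensities sampled across the 400-670 nm range returns a spectrum of the same length whose peak value is 1, which makes spectra from different participants comparable in shape rather than in absolute intensity.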

Statistical analysis confirmed significant differences (p<0.05) between the control and PD groups and between the comparison and PD groups, but not between the control and comparison groups (p>0.05). A comprehensive ML approach was employed for binary classification. Multiple models, including a Support Vector Machine (SVM) and a deep neural network, were trained and calibrated. An ensemble model achieved the highest classification accuracy, 85% on the test set.
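One common way to combine calibrated models into an ensemble is soft voting, in which the predicted class probabilities are averaged before thresholding. The abstract does not state how the ensemble was built, so the sketch below is an assumption, not the authors' method. The function names `soft_vote`, `predict`, and `accuracy` are hypothetical.

```python
def soft_vote(probabilities_per_model):
    """Average positive-class probabilities across models, per sample.

    Assumption: each inner list holds one model's calibrated
    probabilities for the same samples in the same order.
    """
    if not probabilities_per_model:
        raise ValueError("at least one model is required")
    n_samples = len(probabilities_per_model[0])
    if any(len(p) != n_samples for p in probabilities_per_model):
        raise ValueError("all models must score the same samples")
    return [sum(column) / len(column) for column in zip(*probabilities_per_model)]


def predict(probabilities, threshold=0.5):
    """Convert averaged probabilities into binary labels (1 = positive class)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

In use, the per-model probability lists (for example, one from the SVM and one from the neural network) would be passed to `soft_vote`, the result to `predict`, and the predicted labels compared with the held-out test labels via `accuracy`. Averaging calibrated probabilities lets a confident model outweigh an uncertain one, which a simple majority vote cannot do.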

Speaker

Bainaev-Mangilev Nikita Pavlovich
Moscow Institute of Physics and Technology (National Research University)
Russia
