Clustering Mass Spectral Peaks Increases Recognition
Accuracy and Stability of SVM-based Feature Selection
Mikhail Pyatnitskiy, Maria Karpova *, Sergei Moshkovskii, Andrey Lisitsa, Alexander Archakov
Institute of Biomedical Chemistry, 119121, Pogodinskaya str., 10, Moscow, Russia
*Corresponding author:
Dr. Maria Karpova, Institute of Biomedical Chemistry,
119121, Pogodinskaya str., 10, Moscow, Russia,
Tel: +7-499-2461641,
E-mail: karpova@bioinformatics.ru.
Received January 13, 2010; Accepted February 12, 2010; Published
February 12, 2010
Citation: Pyatnitskiy M, Karpova M, Moshkovskii S, Lisitsa A, Archakov A (2010) Clustering Mass Spectral Peaks Increases Recognition Accuracy
and Stability of SVM-based Feature Selection. J Proteomics
Bioinform 3: 048-054. doi: 10.4172/jpb.1000120
Copyright: © 2010 Pyatnitskiy M, et al. This is an open-access article
distributed under the terms of the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any
medium, provided the original author and source are credited.
Abstract
Mass spectral profiling of serum or plasma is widely used
to build experimental diagnostic systems for different
cancer types. In this approach, a set of discriminatory
peaks serves as a multiplex cancer biomarker. Hence,
adequate selection of peaks is a crucial stage in the
development of a diagnostic rule. In the present paper we
propose applying filter and wrapper feature selection
sequentially within a complete cross-validation scheme,
with feature selection performed separately at each run of
cross-validation. Filter feature selection is represented
by hierarchical cluster analysis; recursive feature
elimination coupled with a support vector machine is used
as the wrapper feature selection method. The performance of
the method is demonstrated on a previously obtained
dataset of ovarian cancer and non-cancer sera.
Application of our approach led to a slight but statistically
significant increase in accuracy. Peak clustering made the
results of feature selection more stable and gave the
selected m/z values a biological meaning. We recommend
clustering of peaks as a filter dimensionality reduction step
for further use in mass spectral studies.
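
The per-fold filter step described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes average-linkage clustering on a distance of 1 minus the Pearson correlation, a correlation cut-off of 0.8, and the most intense peak as the representative of each cluster. The function names (`cluster_peaks`, `filter_per_fold`) and the toy data are hypothetical. The SVM-based recursive feature elimination that would follow as the wrapper step is omitted.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def cluster_peaks(rows, threshold=0.8):
    """Average-linkage agglomerative clustering of peaks (columns of rows).

    Distance is 1 - r; merging stops once the two closest clusters are
    farther apart than 1 - threshold (assumed cut-off, not from the paper).
    """
    n_peaks = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(n_peaks)]
    dist = [[1.0 - pearson(cols[i], cols[j]) for j in range(n_peaks)]
            for i in range(n_peaks)]
    clusters = [[j] for j in range(n_peaks)]

    def linkage(a, b):
        return mean(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        d, i, j = min((linkage(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > 1.0 - threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def k_folds(n, k, seed=0):
    """Split indices 0..n-1 into k shuffled, disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def filter_per_fold(rows, k=5, threshold=0.8):
    """Run the clustering filter on the training part of every fold separately."""
    selected = []
    for test_idx in k_folds(len(rows), k):
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        clusters = cluster_peaks(train, threshold)
        # Assumed representative: the peak with the highest mean intensity.
        reps = sorted(max(c, key=lambda j: mean(r[j] for r in train))
                      for c in clusters)
        selected.append(reps)
    return selected


# Toy spectra: peaks 0 and 1 follow the same latent signal, peaks 2 and 3 do not.
rng = random.Random(1)
spectra = []
for _ in range(20):
    a, b = rng.gauss(10, 2), rng.gauss(5, 1)
    spectra.append([a + rng.gauss(0, 0.1), 2 * a + rng.gauss(0, 0.1),
                    b + rng.gauss(0, 0.1), rng.gauss(3, 1)])

for fold, reps in enumerate(filter_per_fold(spectra)):
    print(f"fold {fold}: representative peak indices {reps}")
```

Because the clustering sees only the training spectra of each fold, the held-out spectra never influence which peaks are retained, which is the point of performing feature selection inside every cross-validation run.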