Gene selection in class space for molecular classification of cancer

Junying Zhang, Yue Joseph Wang, Javed Khan, Robert Clarke

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Gene selection (feature selection) is generally performed in gene space (feature space, G-space), where the curse of dimensionality is severe because the number of genes is much larger than the number of samples. This makes the data set difficult to model in this space and lowers confidence in the gene selection result, so finding a gene subset under these conditions is challenging. In this paper, G-space is transformed into its dual space, referred to as class space (C-space), in which the number of dimensions equals the number of sample classes in G-space and the number of samples equals the number of genes in G-space. Because the number of genes far exceeds the number of classes, the curse of dimensionality does not arise in C-space. A new gene selection method, based on the principle of separating different classes as far as possible, is presented with the help of Principal Component Analysis (PCA). Gene selection results on a real data set are evaluated with the Fisher criterion, the weighted Fisher criterion, and leave-one-out cross-validation, showing that the method is effective and efficient. Copyright by Science in China Press 2004.
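
The change of viewpoint described in the abstract can be illustrated with a small Python sketch. The abstract does not say how a gene's C-space coordinates are computed, so the per-class mean used in the hypothetical `to_class_space` helper is only an assumption. The sketch does not attempt the PCA-based selection rule itself. It shows the shape of the data in C-space (one point per gene, one dimension per class) and ranks genes with the standard two-class Fisher criterion, (m0 - m1)^2 / (v0 + v1), which the abstract names as an evaluation measure. The toy data and the `fisher_score` helper are illustrative and are not taken from the paper.

```python
from statistics import mean, pvariance


def to_class_space(X, y):
    """Map samples-by-genes data to one point per gene, one axis per class.

    Assumption: each coordinate is the gene's mean expression in that class.
    The abstract does not specify the actual transformation.
    """
    classes = sorted(set(y))
    points = []
    for g in range(len(X[0])):
        coords = [mean(row[g] for row, lab in zip(X, y) if lab == c)
                  for c in classes]
        points.append(coords)
    return classes, points


def fisher_score(values, y):
    """Two-class Fisher criterion for one gene: (m0 - m1)^2 / (v0 + v1)."""
    c0, c1 = sorted(set(y))
    a = [v for v, lab in zip(values, y) if lab == c0]
    b = [v for v, lab in zip(values, y) if lab == c1]
    diff_sq = (mean(a) - mean(b)) ** 2
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        # No within-class spread: perfectly separated or no signal at all.
        return float("inf") if diff_sq > 0 else 0.0
    return diff_sq / denom


# Toy data (made up): 4 samples (rows) x 3 genes (columns), two classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 5.1, 4.0],
    [3.0, 5.0, 2.1],
    [3.2, 4.9, 3.9],
]
y = ["A", "A", "B", "B"]

classes, points = to_class_space(X, y)   # 3 points, each with 2 coordinates
scores = [fisher_score([row[g] for row in X], y) for g in range(len(X[0]))]
ranking = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
print(classes, points)
print(ranking)
```

In G-space the toy data has 4 samples in 3 dimensions; in C-space it has 3 points in 2 dimensions. With thousands of genes and a handful of classes, the same reversal yields many points in very few dimensions.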

Original language: English (US)
Pages (from-to): 301-314
Number of pages: 14
Journal: Science in China, Series F: Information Sciences
Volume: 47
Issue number: 3
State: Published - Jun 2004
Externally published: Yes

Keywords

  • Class space
  • Feature selection (gene selection)
  • Feature space (gene space)
  • PCA
