Dimensionality of big data sets explored by Cluj descriptors

Claudiu Lungu, Sara Ersali, Beata Szefler, Atena Pîrvan-Moldovan, Subhash Basak, Mircea V. Diudea

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Dimensionality of a relatively big data set (95 compounds) observed for toxicity (mutagenicity) was explored in order to compute QSAR models. Distinct molecular descriptors were used. Dimensionality of data, using PCA, correlation plots and clustering, was evaluated. Analyzing data dimensionality allowed model optimization. Docking studies and PCA were used in order to expand data dimensionality. Pearson correlation coefficient (r²) values, obtained for both perceptive and predictive models, were satisfactory.
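As an illustration of the PCA step mentioned in the abstract, the following is a minimal sketch of how the effective dimensionality of a descriptor matrix might be estimated. It is not the authors' procedure: the data are synthetic (95 rows standing in for compounds, 10 columns standing in for descriptors), and the function names `pca_explained_variance` and `n_components_for`, as well as the 95% variance threshold, are assumptions made for this example.

```python
import numpy as np

def pca_explained_variance(X):
    """Return the explained-variance ratio of each principal component of X."""
    # Standardize each descriptor column (zero mean, unit variance).
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    Z = Xc / std
    # Squared singular values of Z are proportional to the component variances.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s**2 / (Z.shape[0] - 1)
    return var / var.sum()

def n_components_for(ratios, threshold=0.95):
    """Smallest number of components whose cumulative variance reaches threshold."""
    return int(np.searchsorted(np.cumsum(ratios), threshold) + 1)

# Synthetic stand-in data (assumption): 10 descriptors driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(95, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.01 * rng.normal(size=(95, 10))

ratios = pca_explained_variance(X)
k = n_components_for(ratios)
print("explained variance ratios:", np.round(ratios, 4))
print("components needed for 95% variance:", k)
```

Because the synthetic descriptors are built from three latent factors plus small noise, at most three components should be needed to reach the threshold; with real descriptor data the count would depend on how strongly the descriptors are intercorrelated.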

Original language: English (US)
Pages (from-to): 197-204
Number of pages: 8
Journal: Studia Universitatis Babes-Bolyai Chemia
Volume: 62
Issue number: 3
State: Published - 2017

Bibliographical note

Funding Information:
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CCCDI – UEFISCDI, project number 8/2015, acronym GEMNS (under the frame of the ERA-NET EuroNanoMed II European Innovative Research and Technological Development Projects in Nanomedicine).

Publisher Copyright:
© 2017, Universitatea Babes-Bolyai, Catedra de Filosofie Sistematica. All rights reserved.

Keywords

  • Ames test
  • Data dimensionality
  • Mutagenicity
  • Principal component analysis (PCA)
  • QSAR
  • Topological descriptor
