Computer models for identifying instrumental citations in the biomedical literature

Lawrence D. Fu, Yindalon Aphinyanaphongs, Constantin F. Aliferis

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

The most popular method for evaluating the quality of a scientific publication is citation count. This metric assumes that a citation is a positive indicator of the quality of the cited work. This assumption is not always true since citations serve many purposes. As a result, citation count is an indirect and imprecise measure of impact. If instrumental citations could be reliably distinguished from non-instrumental ones, this would readily improve the performance of existing citation-based metrics by excluding the non-instrumental citations. A citation was operationally defined as instrumental if either of the following was true: the hypothesis of the citing work was motivated by the cited work, or the citing work could not have been executed without the cited work. This work investigated the feasibility of developing computer models for automatically classifying citations as instrumental or non-instrumental. Instrumental citations were manually labeled, and machine learning models were trained on a combination of content and bibliometric features. The experimental results indicate that models based on content and bibliometric features are able to automatically classify instrumental citations with high predictivity (AUC = 0.86). Additional experiments using independent hold-out data and prospective validation show that the models are generalizable and can handle unseen cases. This work demonstrates that it is feasible to train computer models to automatically identify instrumental citations.
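The abstract reports classifier quality as an area under the ROC curve (AUC). As a minimal sketch of what that number measures, the Python snippet below computes AUC as the fraction of (instrumental, non-instrumental) citation pairs in which the instrumental citation receives the higher model score, counting ties as half. The function name `auc`, the toy labels, and the toy scores are all hypothetical illustrations; the abstract does not say which learning algorithm or AUC implementation the authors used.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: 1 = instrumental citation, 0 = non-instrumental citation.
    scores: model output, where higher means "more likely instrumental".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the paper.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Under this reading, an AUC of 0.86 would mean that a randomly chosen instrumental citation outscores a randomly chosen non-instrumental one about 86% of the time; 0.5 corresponds to chance and 1.0 to a perfect ranking.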

Original language: English (US)
Pages (from-to): 871-882
Number of pages: 12
Journal: Scientometrics
Volume: 97
Issue number: 3
State: Published - Dec 2013

Keywords

  • Bibliometrics
  • Citation analysis
  • Information retrieval
  • Machine learning
