Species distribution modeling and prediction: A class imbalance problem

Reid A. Johnson, Nitesh V. Chawla, Jessica J. Hellmann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

33 Scopus citations

Abstract

Predicting the distributions of species is central to a variety of applications in ecology and conservation biology. With increasing interest in using electronic occurrence records, many modeling techniques have been developed to utilize this data and compute the potential distribution of species as a proxy for actual observations. As the actual observations are typically overwhelmed by non-occurrences, we approach the modeling of species' distributions with a focus on the problem of class imbalance. Our analysis includes the evaluation of several machine learning methods that have been shown to address the problems of class imbalance, but which have rarely or never been applied to the domain of species distribution modeling. Evaluation of these methods includes the use of the area under the precision-recall curve (AUPR), which can supplement other metrics to provide a more informative assessment of model utility under conditions of class imbalance. Our analysis concludes that emphasizing techniques that specifically address the problem of class imbalance can provide AUROC and AUPR results competitive with traditional species distribution models.
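The abstract's point that AUPR can supplement AUROC under class imbalance can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not data from the paper, and the function names `auroc` and `average_precision` are hypothetical. Average precision is used here as one common step-wise estimate of AUPR, which may differ from how the authors computed it.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (presence, absence) pairs ranked correctly.

    Ties between a presence and an absence count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise estimate of AUPR: mean precision at the rank of each presence.

    Assumes the scores contain no ties.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    true_pos = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


# Toy, illustrative data: 2 presences among 18 absences (assumed values).
labels = [1, 1] + [0] * 18
scores = [0.80, 0.60]                      # scores for the two presences
scores += [0.90, 0.70]                     # two absences scored highly
scores += [0.01 * i for i in range(1, 17)] # sixteen absences scored low

roc = auroc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUROC = {roc:.3f}, average precision = {ap:.3f}")
```

In this toy case the many low-scoring absences keep AUROC high (33 of 36 presence/absence pairs are ranked correctly), while average precision is only 0.5 because two absences outrank the presences near the top of the list. This is the kind of gap between the two metrics that the abstract alludes to.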

Original language: English (US)
Title of host publication: Proceedings - 2012 Conference on Intelligent Data Understanding, CIDU 2012
Pages: 9-16
Number of pages: 8
State: Published - 2012
Externally published: Yes
Event: 2012 Conference on Intelligent Data Understanding, CIDU 2012 - Boulder, CO, United States
Duration: Oct 24 2012 → Oct 26 2012

Publication series

Name: Proceedings - 2012 Conference on Intelligent Data Understanding, CIDU 2012

Other

Other: 2012 Conference on Intelligent Data Understanding, CIDU 2012
Country/Territory: United States
City: Boulder, CO
Period: 10/24/12 → 10/26/12
