Assessment of groundwater well vulnerability to contamination through physics-informed machine learning

Mario A. Soriano; Helen G. Siegel; Nicholaus P. Johnson; Kristina M. Gutchess; Boya Xiong; Yunpo Li; Cassandra J. Clark; Desiree L. Plata; Nicole C. Deziel; James E. Saiers

doi:10.1088/1748-9326/ac10e0

Civil, Environmental, and Geo- Engineering

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Contamination from anthropogenic activities is a long-standing challenge to the sustainability of groundwater resources. Physically based (PB) models are often used in groundwater risk assessments, but their application to large scale problems requiring high spatial resolution remains computationally intractable. Machine learning (ML) models have emerged as an alternative to PB models in the era of big data, but the necessary number of observations may be impractical to obtain when events are rare, such as episodic groundwater contamination incidents. The current study employs metamodeling, a hybrid approach that combines the strengths of PB and ML models while addressing their respective limitations, to evaluate groundwater well vulnerability to contamination from unconventional oil and gas development (UD). We illustrate the approach in northeastern Pennsylvania, where intensive natural gas production from the Marcellus Shale overlaps with local community dependence on shallow aquifers. Metamodels were trained to classify vulnerability from predictors readily computable in a geographic information system. The trained metamodels exhibited high accuracy (average out-of-bag classification error <5%). A predictor combining information on topography, hydrology, and proximity to contaminant sources (inverse distance to nearest upgradient UD source) was found to be highly important for accurate metamodel predictions. Alongside violation reports and historical groundwater quality records, the predicted vulnerability provided critical insights for establishing the prevalence of UD contamination in 94 household wells that we sampled in 2018. While <10% of the sampled wells exhibited chemical signatures consistent with UD produced wastewaters, >60% were predicted to be in vulnerable locations, suggesting that future impacts are likely to occur with greater frequency if safeguards against contaminant releases are relaxed. Our results show that hybrid physics-informed ML offers a robust and scalable framework for assessing groundwater contamination risks.
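The predictor named in the abstract (inverse distance to nearest upgradient UD source) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Site` class, the function name, and the rule that a source counts as "upgradient" when its hydraulic head is higher than the well's are all assumptions made for illustration. The abstract does not state how upgradient sources were identified.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A well or UD source (hypothetical container, not from the paper)."""
    x: float     # easting in metres
    y: float     # northing in metres
    head: float  # hydraulic head in metres (land-surface elevation could stand in)


def inverse_distance_to_nearest_upgradient(well: Site, sources: list[Site]) -> float:
    """Return 1/d for the closest source with a higher head than the well.

    Assumption: "upgradient" is simplified to "head greater than the well's".
    Returns 0.0 when no source is upgradient of the well.
    """
    upgradient = [s for s in sources if s.head > well.head]
    if not upgradient:
        return 0.0
    nearest = min(math.hypot(s.x - well.x, s.y - well.y) for s in upgradient)
    # A source located exactly at the well would give infinite proximity.
    return math.inf if nearest == 0 else 1.0 / nearest


well = Site(x=0.0, y=0.0, head=300.0)
sources = [
    Site(x=300.0, y=400.0, head=350.0),  # 500 m away, higher head: counted
    Site(x=30.0, y=40.0, head=280.0),    # 50 m away, lower head: ignored
]
predictor = inverse_distance_to_nearest_upgradient(well, sources)
print(predictor)
```

In this toy case the closer source is ignored because it sits at a lower head than the well, so the predictor reflects the 500 m upgradient source only. A larger value means a closer upgradient source.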

Original language	English (US)
Article number	084013
Journal	Environmental Research Letters
Volume	16
Issue number	8
DOIs	https://doi.org/10.1088/1748-9326/ac10e0
State	Published - Aug 2021

Bibliographical note

Publisher Copyright:
© 2021 The Author(s). Published by IOP Publishing Ltd

Keywords

Drinking water quality
Gas development
Groundwater contamination risk assessment
Metamodeling
Physics-informed machine learning
Unconventional oil

Cite this

@article{0e1ecd7a03974f32a3fc73ecfa27b5d3,
  title = "Assessment of groundwater well vulnerability to contamination through physics-informed machine learning",
  abstract = "Contamination from anthropogenic activities is a long-standing challenge to the sustainability of groundwater resources. Physically based (PB) models are often used in groundwater risk assessments, but their application to large scale problems requiring high spatial resolution remains computationally intractable. Machine learning (ML) models have emerged as an alternative to PB models in the era of big data, but the necessary number of observations may be impractical to obtain when events are rare, such as episodic groundwater contamination incidents. The current study employs metamodeling, a hybrid approach that combines the strengths of PB and ML models while addressing their respective limitations, to evaluate groundwater well vulnerability to contamination from unconventional oil and gas development (UD). We illustrate the approach in northeastern Pennsylvania, where intensive natural gas production from the Marcellus Shale overlaps with local community dependence on shallow aquifers. Metamodels were trained to classify vulnerability from predictors readily computable in a geographic information system. The trained metamodels exhibited high accuracy (average out-of-bag classification error <5%). A predictor combining information on topography, hydrology, and proximity to contaminant sources (inverse distance to nearest upgradient UD source) was found to be highly important for accurate metamodel predictions. Alongside violation reports and historical groundwater quality records, the predicted vulnerability provided critical insights for establishing the prevalence of UD contamination in 94 household wells that we sampled in 2018. While <10% of the sampled wells exhibited chemical signatures consistent with UD produced wastewaters, >60% were predicted to be in vulnerable locations, suggesting that future impacts are likely to occur with greater frequency if safeguards against contaminant releases are relaxed. Our results show that hybrid physics-informed ML offers a robust and scalable framework for assessing groundwater contamination risks.",
  keywords = "Drinking water quality, Gas development, Groundwater contamination risk assessment, Metamodeling, Physics-informed machine learning, Unconventional oil",
  author = "Soriano, {Mario A.} and Siegel, {Helen G.} and Johnson, {Nicholaus P.} and Gutchess, {Kristina M.} and Xiong, Boya and Li, Yunpo and Clark, {Cassandra J.} and Plata, {Desiree L.} and Deziel, {Nicole C.} and Saiers, {James E.}",
  note = "Publisher Copyright: {\textcopyright} 2021 The Author(s). Published by IOP Publishing Ltd",
  year = "2021",
  month = aug,
  doi = "10.1088/1748-9326/ac10e0",
  language = "English (US)",
  volume = "16",
  journal = "Environmental Research Letters",
  issn = "1748-9318",
  publisher = "IOP Publishing Ltd.",
  number = "8",
}

TY - JOUR

T1 - Assessment of groundwater well vulnerability to contamination through physics-informed machine learning

AU - Soriano, Mario A.

AU - Siegel, Helen G.

AU - Johnson, Nicholaus P.

AU - Gutchess, Kristina M.

AU - Xiong, Boya

AU - Li, Yunpo

AU - Clark, Cassandra J.

AU - Plata, Desiree L.

AU - Deziel, Nicole C.

AU - Saiers, James E.

PY - 2021/8

Y1 - 2021/8

N2 - Contamination from anthropogenic activities is a long-standing challenge to the sustainability of groundwater resources. Physically based (PB) models are often used in groundwater risk assessments, but their application to large scale problems requiring high spatial resolution remains computationally intractable. Machine learning (ML) models have emerged as an alternative to PB models in the era of big data, but the necessary number of observations may be impractical to obtain when events are rare, such as episodic groundwater contamination incidents. The current study employs metamodeling, a hybrid approach that combines the strengths of PB and ML models while addressing their respective limitations, to evaluate groundwater well vulnerability to contamination from unconventional oil and gas development (UD). We illustrate the approach in northeastern Pennsylvania, where intensive natural gas production from the Marcellus Shale overlaps with local community dependence on shallow aquifers. Metamodels were trained to classify vulnerability from predictors readily computable in a geographic information system. The trained metamodels exhibited high accuracy (average out-of-bag classification error <5%). A predictor combining information on topography, hydrology, and proximity to contaminant sources (inverse distance to nearest upgradient UD source) was found to be highly important for accurate metamodel predictions. Alongside violation reports and historical groundwater quality records, the predicted vulnerability provided critical insights for establishing the prevalence of UD contamination in 94 household wells that we sampled in 2018. While <10% of the sampled wells exhibited chemical signatures consistent with UD produced wastewaters, >60% were predicted to be in vulnerable locations, suggesting that future impacts are likely to occur with greater frequency if safeguards against contaminant releases are relaxed. Our results show that hybrid physics-informed ML offers a robust and scalable framework for assessing groundwater contamination risks.

KW - Drinking water quality

KW - Gas development

KW - Groundwater contamination risk assessment

KW - Metamodeling

KW - Physics-informed machine learning

KW - Unconventional oil

UR - http://www.scopus.com/inward/record.url?scp=85112105546&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85112105546&partnerID=8YFLogxK

U2 - 10.1088/1748-9326/ac10e0

DO - 10.1088/1748-9326/ac10e0

M3 - Article

AN - SCOPUS:85112105546

SN - 1748-9318

VL - 16

JO - Environmental Research Letters

JF - Environmental Research Letters

IS - 8

M1 - 084013

ER -
