Cloud detection algorithm comparison and validation for operational Landsat data products

Steve Foga; Pat L. Scaramuzza; Song Guo; Zhe Zhu; Ronald D. Dilley; Tim Beckmann; Gail L. Schmidt; John L. Dwyer; M. Joseph Hughes; Brady Laue

doi:10.1016/j.rse.2017.03.026

Earth and Environmental Sciences-Twin Cities

Research output: Contribution to journal › Article › peer-review

717 Scopus citations

Abstract

Clouds are a pervasive and unavoidable issue in satellite-borne optical imagery. Accurate, well-documented, and automated cloud detection algorithms are necessary to effectively leverage large collections of remotely sensed data. The Landsat project is uniquely suited for comparative validation of cloud assessment algorithms because the modular architecture of the Landsat ground system allows for quick evaluation of new code, and because Landsat has the most comprehensive manual truth masks of any current satellite data archive. Currently, the Landsat Level-1 Product Generation System (LPGS) uses separate algorithms for determining clouds, cirrus clouds, and snow and/or ice probability on a per-pixel basis. With more bands onboard the Landsat 8 Operational Land Imager (OLI)/Thermal Infrared Sensor (TIRS) satellite, and a greater number of cloud masking algorithms, the U.S. Geological Survey (USGS) is replacing the current cloud masking workflow with a more robust algorithm that is capable of working across multiple Landsat sensors with minimal modification. Because of the inherent error from stray light and intermittent data availability of TIRS, these algorithms need to operate both with and without thermal data. In this study, we created a workflow to evaluate cloud and cloud shadow masking algorithms using cloud validation masks manually derived from both Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and Landsat 8 OLI/TIRS data. We created a new validation dataset consisting of 96 Landsat 8 scenes, representing different biomes and proportions of cloud cover. We evaluated algorithm performance by overall accuracy, omission error, and commission error for both cloud and cloud shadow. We found that CFMask, C code based on the Function of Mask (Fmask) algorithm, and its confidence bands have the best overall accuracy among the many algorithms tested using our validation data. The Artificial Thermal-Automated Cloud Cover Algorithm (AT-ACCA) is the most accurate nonthermal-based algorithm. We give preference to CFMask for operational cloud and cloud shadow detection, as it is derived from a priori knowledge of physical phenomena and is operable without geographic restriction, making it useful for current and future land imaging missions without having to be retrained in a machine-learning environment.
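
The three evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes the conventional confusion-matrix definitions (omission error as the share of true target pixels the algorithm missed, commission error as the share of predicted target pixels that are wrong), which may differ from the exact formulas used in the paper. The function name `mask_errors`, the class labels, and the eight-pixel masks are hypothetical and chosen only for illustration.

```python
def mask_errors(truth, predicted, target):
    """Score one class (e.g. "cloud") of a predicted mask against a truth mask.

    Both masks are flat lists of per-pixel labels of equal length.
    The definitions below are the conventional ones, assumed here for
    illustration; they are not taken from the paper itself.
    """
    if len(truth) != len(predicted):
        raise ValueError("masks must have the same number of pixels")
    if len(truth) == 0:
        raise ValueError("masks must not be empty")

    pairs = list(zip(truth, predicted))
    agree = sum(t == p for t, p in pairs)              # labels match exactly
    truth_pos = sum(t == target for t in truth)        # true target pixels
    pred_pos = sum(p == target for p in predicted)     # predicted target pixels
    missed = sum(t == target and p != target for t, p in pairs)
    false_alarm = sum(t != target and p == target for t, p in pairs)

    return {
        # Agreement over all labels, not only the target class.
        "overall_accuracy": agree / len(truth),
        # Share of true target pixels that were not detected.
        "omission_error": missed / truth_pos if truth_pos else 0.0,
        # Share of predicted target pixels that are not the target in truth.
        "commission_error": false_alarm / pred_pos if pred_pos else 0.0,
    }


# Hypothetical eight-pixel masks, for illustration only.
truth_mask = ["cloud", "cloud", "cloud", "clear", "clear", "shadow", "shadow", "clear"]
algo_mask = ["cloud", "cloud", "clear", "cloud", "clear", "shadow", "clear", "clear"]

cloud_scores = mask_errors(truth_mask, algo_mask, "cloud")
shadow_scores = mask_errors(truth_mask, algo_mask, "shadow")
print(cloud_scores)
print(shadow_scores)
```

Scoring cloud and cloud shadow separately, as the abstract describes, shows when an algorithm is strong on one class and weak on the other; overall accuracy alone would hide that difference.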

Original language	English (US)
Pages (from-to)	379-390
Number of pages	12
Journal	Remote Sensing of Environment
Volume	194
DOIs	https://doi.org/10.1016/j.rse.2017.03.026
State	Published - Jun 1 2017

Bibliographical note

Funding Information:
Work performed under USGS contract G15PC00012. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. We thank Joshua Picotte, MacKenzie Friedrichs, Dr. Christopher Barnes, and our anonymous reviewers for their helpful comments to improve the manuscript.

Publisher Copyright:
© 2017 Elsevier Inc.

Keywords

Biome sampling
CFMask
Cloud detection
Cloud validation masks
Data products
Landsat

Cite this

@article{0fb1cf6abad0438bbe6ce7a20f17bb29,

title = "Cloud detection algorithm comparison and validation for operational Landsat data products",

keywords = "Biome sampling, CFMask, Cloud detection, Cloud validation masks, Data products, Landsat",

author = "Steve Foga and Scaramuzza, {Pat L.} and Song Guo and Zhe Zhu and Dilley, {Ronald D.} and Tim Beckmann and Schmidt, {Gail L.} and Dwyer, {John L.} and Hughes, {M. Joseph} and Brady Laue",

note = "Funding Information: Work performed under USGS contract G15PC00012. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. We thank Joshua Picotte, MacKenzie Friedrichs, Dr. Christopher Barnes, and our anonymous reviewers for their helpful comments to improve the manuscript. Publisher Copyright: {\textcopyright} 2017 Elsevier Inc.",

year = "2017",

month = jun,

day = "1",

doi = "10.1016/j.rse.2017.03.026",

language = "English (US)",

volume = "194",

pages = "379--390",

journal = "Remote Sensing of Environment",

issn = "0034-4257",

publisher = "Elsevier",

}

TY - JOUR

T1 - Cloud detection algorithm comparison and validation for operational Landsat data products

AU - Foga, Steve

AU - Scaramuzza, Pat L.

AU - Guo, Song

AU - Zhu, Zhe

AU - Dilley, Ronald D.

AU - Beckmann, Tim

AU - Schmidt, Gail L.

AU - Dwyer, John L.

AU - Hughes, M. Joseph

AU - Laue, Brady

N1 - Funding Information: Work performed under USGS contract G15PC00012. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. We thank Joshua Picotte, MacKenzie Friedrichs, Dr. Christopher Barnes, and our anonymous reviewers for their helpful comments to improve the manuscript. Publisher Copyright: © 2017 Elsevier Inc.

PY - 2017/6/1

Y1 - 2017/6/1

KW - Biome sampling

KW - CFMask

KW - Cloud detection

KW - Cloud validation masks

KW - Data products

KW - Landsat

UR - http://www.scopus.com/inward/record.url?scp=85017309987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85017309987&partnerID=8YFLogxK

U2 - 10.1016/j.rse.2017.03.026

DO - 10.1016/j.rse.2017.03.026

M3 - Article

AN - SCOPUS:85017309987

SN - 0034-4257

VL - 194

SP - 379

EP - 390

JO - Remote Sensing of Environment

JF - Remote Sensing of Environment

ER -
