Objective: Commercial listings of food retail outlets are increasingly used by community members, food policy councils, and multilevel intervention researchers to identify areas with limited access to healthier food. This study quantified the count, type, and geospatial error in 2 commercial data sources.
Methods: InfoUSA and Dun and Bradstreet were compared with a validated field census, and validity statistics were calculated.
Results: Considering only completeness, Dun and Bradstreet data missed 24% of existing supermarkets and grocery stores, and InfoUSA missed 29%. When accuracy of outlet type assignment was also considered, the undercount error rose to 42% and 39%, respectively. Marked overcount existed as well, and only 43% of existing supermarkets were correctly identified with respect to presence, outlet type, and location.
Conclusions and Implications: Relying exclusively on secondary data to characterize the food environment will result in substantial error. Although extensive data cleaning can offset some error, verification of outlets with a field census is still the method of choice.
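The abstract does not say which validity statistics were calculated. As a rough illustration only, the sketch below assumes the common pairing of sensitivity and positive predictive value (PPV), with undercount read as 1 minus sensitivity and overcount as 1 minus PPV. The function name and the counts are hypothetical and are not taken from the study.

```python
def validity_stats(true_pos: int, false_neg: int, false_pos: int) -> dict:
    """Compare a commercial listing against a field census.

    true_pos  -- outlets found in both the listing and the field census
    false_neg -- outlets found in the field but missing from the listing
    false_pos -- outlets in the listing that were not found in the field
    """
    sensitivity = true_pos / (true_pos + false_neg)
    ppv = true_pos / (true_pos + false_pos)
    return {
        "sensitivity": sensitivity,
        "undercount": 1 - sensitivity,  # share of real outlets the listing misses
        "ppv": ppv,
        "overcount": 1 - ppv,           # share of listed outlets not confirmed on the ground
    }


# Hypothetical counts for illustration; not data from the study.
stats = validity_stats(true_pos=60, false_neg=20, false_pos=15)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, tightening the match criteria (for example, also requiring the correct outlet type) moves outlets from `true_pos` to the error counts, which is one way an undercount figure could rise as stricter criteria are applied.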
Bibliographical note
Funding Information:
This project was supported by grant R21CA132133 from the National Cancer Institute. The authors thank Denise M. Hodo, Kristopher Corwin, and Dr. Andrey Bortsov for conducting the fieldwork; Michele Nichols for data analyses; and Xiaoguang Ma for assistance with manuscript preparation. The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
- Food desert
- Retail food environment
- Secondary data sources