NWPerf: A system wide performance monitoring tool for large Linux clusters

Ryan Mooney, Kenneth P. Schmidt, R. Scott Studham, Jarek Nieplocha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Scopus citations

Abstract

We present NWPerf, a new system for analyzing fine-grained performance metric data on large-scale supercomputing clusters. The tool measures application efficiency system-wide, both from a global system perspective and through a detailed view of individual applications. NWPerf provides this service while minimizing the impact on the performance of user applications. We describe the type of information that can be derived from the system, and demonstrate how the system was used to detect and eliminate a performance problem in an application, improving its performance by up to several thousand percent. The NWPerf architecture has proven to be a stable and scalable platform for gathering performance data on a large 1954-CPU production Linux cluster at PNNL.

Original language: English (US)
Title of host publication: 2004 IEEE International Conference on Cluster Computing, ICCC 2004
Pages: 379-389
Number of pages: 11
State: Published - 2004
Event: 2004 IEEE International Conference on Cluster Computing, ICCC 2004 - San Diego, CA, United States
Duration: Sep 20 2004 → Sep 23 2004

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

Other

Other: 2004 IEEE International Conference on Cluster Computing, ICCC 2004
Country: United States
City: San Diego, CA
Period: 9/20/04 → 9/23/04
