Accurate statistical approaches for generating representative workload compositions

Lieven Eeckhout; Rashmi Sundareswarat; Joshua J. Yi; David J Lilja; Paul R Schrater

doi:10.1109/IISWC.2005.1526001

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Scopus citations

Abstract

Composing a representative workload is a crucial step during the design process of a microprocessor. The workload should be composed in such a way that it is representative for the target domain of application and yet, the amount of redundancy in the workload should be minimized as much as possible in order not to overly increase the total simulation time. As a result, there is an important trade-off that needs to be made between workload representativeness and simulation accuracy versus simulation speed. Previous work used statistical data analysis techniques to identify representative benchmarks and corresponding inputs, also called a subset, from a large set of potential benchmarks and inputs. These methodologies measure a number of program characteristics on which Principal Components Analysis (PCA) is applied before identifying distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA as it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA in that ICA achieves smaller benchmark subsets that are more accurate than those found by PCA.
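The subsetting pipeline the abstract attributes to previous work (measure program characteristics, reduce them with PCA, then cluster and keep one benchmark per cluster) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses PCA rather than the ICA variant the paper proposes, k-means stands in for the unspecified "cluster analysis", and the benchmark names, characteristic values, and function names are all hypothetical.

```python
import math

# Hypothetical data: one row of measured program characteristics per benchmark.
NAMES = ["bench_a", "bench_b", "bench_c", "bench_d", "bench_e", "bench_f"]
FEATURES = [
    [1.0, 10.0, 0.10], [1.1, 11.0, 0.12],                    # similar behavior
    [5.0, 2.0, 0.90], [5.2, 2.1, 0.85], [5.1, 1.9, 0.95],    # similar behavior
    [9.0, 20.0, 0.50],                                       # distinct behavior
]

def standardize(rows):
    """Z-score each column so every characteristic has mean 0 and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def top_components(rows, k, iters=500):
    """Leading k principal axes via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / n for b in range(d)] for a in range(d)]
    comps = []
    for c in range(k):
        v = [1.0 / (j + 1 + c) for j in range(d)]  # fixed start vector, no randomness
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            v = [x / norm for x in w]
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        comps.append(v)
        # Deflate: remove the component just found before searching for the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return comps

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def select_subset(names, features, n_components, k):
    """Return one representative benchmark (nearest to the centroid) per cluster."""
    z = standardize(features)
    comps = top_components(z, n_components)
    reduced = [[sum(a * b for a, b in zip(row, comp)) for comp in comps] for row in z]
    labels, centers = kmeans(reduced, k)
    subset = []
    for c in range(k):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            best = min(members, key=lambda i: dist2(reduced[i], centers[c]))
            subset.append(names[best])
    return sorted(subset)
```

Calling `select_subset(NAMES, FEATURES, n_components=2, k=3)` returns three benchmark names, one per behavior group in the toy data. Swapping the `top_components` step for an ICA routine would give the variant the paper argues for; that step is not shown here.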

Original language	English (US)
Title of host publication	Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005
Pages	56-66
Number of pages	11
DOIs	https://doi.org/10.1109/IISWC.2005.1526001
State	Published - 2005
Event	2005 IEEE International Symposium on Workload Characterization, IISWC-2005 - Austin, TX, United States
Duration	Oct 6 2005 → Oct 8 2005

Publication series

Name	Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005
Volume	2005

Other

Other	2005 IEEE International Symposium on Workload Characterization, IISWC-2005
Country/Territory	United States
City	Austin, TX
Period	10/6/05 → 10/8/05

Cite this

Eeckhout, L., Sundareswarat, R., Yi, J. J., Lilja, D. J., & Schrater, P. R. (2005). Accurate statistical approaches for generating representative workload compositions. In Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005 (pp. 56-66). Article 1526001 (Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005; Vol. 2005). https://doi.org/10.1109/IISWC.2005.1526001

Accurate statistical approaches for generating representative workload compositions. / Eeckhout, Lieven; Sundareswarat, Rashmi; Yi, Joshua J. et al.
Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005. 2005. p. 56-66 1526001 (Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005; Vol. 2005).

Eeckhout, L, Sundareswarat, R, Yi, JJ, Lilja, DJ & Schrater, PR 2005, Accurate statistical approaches for generating representative workload compositions. in Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005., 1526001, Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005, vol. 2005, pp. 56-66, 2005 IEEE International Symposium on Workload Characterization, IISWC-2005, Austin, TX, United States, 10/6/05. https://doi.org/10.1109/IISWC.2005.1526001

Eeckhout L, Sundareswarat R, Yi JJ, Lilja DJ, Schrater PR. Accurate statistical approaches for generating representative workload compositions. In Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005. 2005. p. 56-66. 1526001. (Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005). doi: 10.1109/IISWC.2005.1526001

@inproceedings{3ca3411cce2e40ac985c1fa3e140124e,

title = "Accurate statistical approaches for generating representative workload compositions",

abstract = "Composing a representative workload is a crucial step during the design process of a microprocessor. The workload should be composed in such a way that it is representative for the target domain of application and yet, the amount of redundancy in the workload should be minimized as much as possible in order not to overly increase the total simulation time. As a result, there is an important trade-off that needs to be made between workload representativeness and simulation accuracy versus simulation speed. Previous work used statistical data analysis techniques to identify representative benchmarks and corresponding inputs, also called a subset, from a large set of potential benchmarks and inputs. These methodologies measure a number of program characteristics on which Principal Components Analysis (PCA) is applied before identifying distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA as it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA in that ICA achieves smaller benchmark subsets that are more accurate than those found by PCA.",

author = "Lieven Eeckhout and Rashmi Sundareswarat and Yi, {Joshua J.} and Lilja, {David J} and Schrater, {Paul R}",

year = "2005",

doi = "10.1109/IISWC.2005.1526001",

language = "English (US)",

isbn = "0780394615",

series = "Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005",

pages = "56--66",

booktitle = "Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005",

}

TY - GEN

T1 - Accurate statistical approaches for generating representative workload compositions

AU - Eeckhout, Lieven

AU - Sundareswarat, Rashmi

AU - Yi, Joshua J.

AU - Lilja, David J

AU - Schrater, Paul R

PY - 2005

Y1 - 2005

N2 - Composing a representative workload is a crucial step during the design process of a microprocessor. The workload should be composed in such a way that it is representative for the target domain of application and yet, the amount of redundancy in the workload should be minimized as much as possible in order not to overly increase the total simulation time. As a result, there is an important trade-off that needs to be made between workload representativeness and simulation accuracy versus simulation speed. Previous work used statistical data analysis techniques to identify representative benchmarks and corresponding inputs, also called a subset, from a large set of potential benchmarks and inputs. These methodologies measure a number of program characteristics on which Principal Components Analysis (PCA) is applied before identifying distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA as it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA in that ICA achieves smaller benchmark subsets that are more accurate than those found by PCA.

AB - Composing a representative workload is a crucial step during the design process of a microprocessor. The workload should be composed in such a way that it is representative for the target domain of application and yet, the amount of redundancy in the workload should be minimized as much as possible in order not to overly increase the total simulation time. As a result, there is an important trade-off that needs to be made between workload representativeness and simulation accuracy versus simulation speed. Previous work used statistical data analysis techniques to identify representative benchmarks and corresponding inputs, also called a subset, from a large set of potential benchmarks and inputs. These methodologies measure a number of program characteristics on which Principal Components Analysis (PCA) is applied before identifying distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA as it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA in that ICA achieves smaller benchmark subsets that are more accurate than those found by PCA.

UR - http://www.scopus.com/inward/record.url?scp=33749055123&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33749055123&partnerID=8YFLogxK

U2 - 10.1109/IISWC.2005.1526001

DO - 10.1109/IISWC.2005.1526001

M3 - Conference contribution

AN - SCOPUS:33749055123

SN - 0780394615

SN - 9780780394612

T3 - Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005

SP - 56

EP - 66

BT - Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005

T2 - 2005 IEEE International Symposium on Workload Characterization, IISWC-2005

Y2 - 6 October 2005 through 8 October 2005

ER -
