A memory management system optimized for BDMPI's memory and execution model

Jeremy Iverson; George Karypis

doi:10.1145/2802658.2802666

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the scaling of data. BDMPI was recently introduced as a way of achieving this for applications written in MPI. BDMPI allows the efficient execution of standard MPI programs on systems whose aggregate amount of memory is smaller than that required by the computations and significantly outperforms other approaches. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, bypasses the operating system virtual memory manager to take advantage of BDMPI's node-level cooperative multi-tasking. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, we tested SBMA with three different classes of applications and our results show that with no modification to the original MPI program, speedups from 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.

Original language	English (US)
Title of host publication	Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI 2015
Publisher	Association for Computing Machinery
ISBN (Electronic)	9781450337953
DOIs	https://doi.org/10.1145/2802658.2802666
State	Published - Sep 21 2015
Event	22nd European MPI Users' Group Meeting, EuroMPI 2015 - Bordeaux, France Duration: Sep 21 2015 → Sep 23 2015

Publication series

Name	ACM International Conference Proceeding Series
Volume	21-23-September-2015

Other

Other	22nd European MPI Users' Group Meeting, EuroMPI 2015
Country/Territory	France
City	Bordeaux
Period	9/21/15 → 9/23/15

Bibliographical note

Funding Information:
This work was supported in part by NSF (IIS-0905220, OCI-1048018, CNS-1162405, IIS-1247632, IIP-1414153, IIS-1447788), Army Research Office (W911NF-14-1-0316), Intel Software and Services Group, and the Digital Technology Center at the University of Minnesota. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute.

Publisher Copyright:
© 2015 ACM.

Keywords

Big data
Distributed computing
MPI
Out-of-core
Virtual memory

Cite this

A memory management system optimized for BDMPI's memory and execution model. / Iverson, Jeremy; Karypis, George.
Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI 2015. Association for Computing Machinery, 2015. a2 (ACM International Conference Proceeding Series; Vol. 21-23-September-2015).

Iverson, J & Karypis, G 2015, A memory management system optimized for BDMPI's memory and execution model. in Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI 2015., a2, ACM International Conference Proceeding Series, vol. 21-23-September-2015, Association for Computing Machinery, 22nd European MPI Users' Group Meeting, EuroMPI 2015, Bordeaux, France, 9/21/15. https://doi.org/10.1145/2802658.2802666

@inproceedings{6ea9c36a595d41bd9b20e7359c6347aa,

title = "A memory management system optimized for BDMPI's memory and execution model",

abstract = "There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the scaling of data. BDMPI was recently introduced as a way of achieving this for applications written in MPI. BDMPI allows the efficient execution of standard MPI programs on systems whose aggregate amount of memory is smaller than that required by the computations and significantly outperforms other approaches. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, bypasses the operating system virtual memory manager to take advantage of BDMPI's node-level cooperative multi-tasking. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, we tested SBMA with three different classes of applications and our results show that with no modification to the original MPI program, speedups from 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.",

keywords = "Big data, Distributed computing, MPI, Out-of-core, Virtual memory",

author = "Jeremy Iverson and George Karypis",

note = "Funding Information: This work was supported in part by NSF (IIS-0905220, OCI-1048018, CNS-1162405, IIS-1247632, IIP-1414153, IIS-1447788), Army Research Office (W911NF-14-1-0316), Intel Software and Services Group, and the Digital Technology Center at the University of Minnesota. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute. Publisher Copyright: {\textcopyright} 2015 ACM.; 22nd European MPI Users' Group Meeting, EuroMPI 2015 ; Conference date: 21-09-2015 Through 23-09-2015",

year = "2015",

month = sep,

day = "21",

doi = "10.1145/2802658.2802666",

language = "English (US)",

series = "ACM International Conference Proceeding Series",

publisher = "Association for Computing Machinery",

booktitle = "Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI 2015",

}

TY - GEN

T1 - A memory management system optimized for BDMPI's memory and execution model

AU - Iverson, Jeremy

AU - Karypis, George

N1 - Funding Information: This work was supported in part by NSF (IIS-0905220, OCI-1048018, CNS-1162405, IIS-1247632, IIP-1414153, IIS-1447788), Army Research Office (W911NF-14-1-0316), Intel Software and Services Group, and the Digital Technology Center at the University of Minnesota. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute. Publisher Copyright: © 2015 ACM.

PY - 2015/9/21

Y1 - 2015/9/21

N2 - There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the scaling of data. BDMPI was recently introduced as a way of achieving this for applications written in MPI. BDMPI allows the efficient execution of standard MPI programs on systems whose aggregate amount of memory is smaller than that required by the computations and significantly outperforms other approaches. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, bypasses the operating system virtual memory manager to take advantage of BDMPI's node-level cooperative multi-tasking. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, we tested SBMA with three different classes of applications and our results show that with no modification to the original MPI program, speedups from 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.

AB - There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the scaling of data. BDMPI was recently introduced as a way of achieving this for applications written in MPI. BDMPI allows the efficient execution of standard MPI programs on systems whose aggregate amount of memory is smaller than that required by the computations and significantly outperforms other approaches. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, bypasses the operating system virtual memory manager to take advantage of BDMPI's node-level cooperative multi-tasking. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, we tested SBMA with three different classes of applications and our results show that with no modification to the original MPI program, speedups from 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.

KW - Big data

KW - Distributed computing

KW - MPI

KW - Out-of-core

KW - Virtual memory

UR - http://www.scopus.com/inward/record.url?scp=84983387034&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84983387034&partnerID=8YFLogxK

U2 - 10.1145/2802658.2802666

DO - 10.1145/2802658.2802666

M3 - Conference contribution

AN - SCOPUS:84983387034

T3 - ACM International Conference Proceeding Series

BT - Proceedings of the 22nd European MPI Users' Group Meeting, EuroMPI 2015

PB - Association for Computing Machinery

T2 - 22nd European MPI Users' Group Meeting, EuroMPI 2015

Y2 - 21 September 2015 through 23 September 2015

ER -
