TY - JOUR
T1 - A virtual memory manager optimized for node-level cooperative multi-tasking in memory constrained systems
AU - Iverson, Jeremy
AU - Karypis, George
PY - 2018/9/1
Y1 - 2018/9/1
N2 - There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the size of the data that needs to be processed. Recently, a runtime system called Big Data MPI (BDMPI) was introduced for programs using a library that implements the Message Passing Interface (MPI); it allows MPI programs whose aggregate memory requirement exceeds the amount of physical memory to be executed efficiently by utilizing node-level cooperative multi-tasking. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, takes advantage of BDMPI’s node-level cooperative multi-tasking in order to intelligently determine the parts of the virtual address space that need to be loaded into and unloaded from main memory. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the memory protection constructs necessary for the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, testing SBMA with five different classes of applications showed that, with no modification to the original MPI program, speedups of 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.
AB - There is a growing need to perform large computations on small systems, as access to large systems is not widely available and cannot keep up with the size of the data that needs to be processed. Recently, a runtime system called Big Data MPI (BDMPI) was introduced for programs using a library that implements the Message Passing Interface (MPI); it allows MPI programs whose aggregate memory requirement exceeds the amount of physical memory to be executed efficiently by utilizing node-level cooperative multi-tasking. In this paper we present a virtual memory subsystem which we implemented as part of the BDMPI runtime. Our new virtual memory subsystem, which we call SBMA, takes advantage of BDMPI’s node-level cooperative multi-tasking in order to intelligently determine the parts of the virtual address space that need to be loaded into and unloaded from main memory. Benchmarking using a synthetic application shows that for the use cases relevant to BDMPI, the overhead incurred by the memory protection constructs necessary for the BDMPI-SBMA system is amortized such that it performs as fast as explicit data movement by the application developer. Furthermore, testing SBMA with five different classes of applications showed that, with no modification to the original MPI program, speedups of 2× to 12× over a standard BDMPI implementation can be achieved for the included applications.
KW - distributed computing
KW - mpi
KW - out-of-core
KW - virtual memory
UR - http://www.scopus.com/inward/record.url?scp=85053880108&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053880108&partnerID=8YFLogxK
U2 - 10.1177/1094342017690975
DO - 10.1177/1094342017690975
M3 - Article
AN - SCOPUS:85053880108
VL - 32
SP - 744
EP - 759
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
SN - 1094-3420
IS - 5
ER -