Toward efficient search for ultrascale storage systems

Joseph L. Naps; Mohmed F. Mokbel; David H.C. Du

doi:10.1145/2125636.2125638

Computer Science and Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

As the rate at which scientific computing generates data continues to increase, managing this data, in all its facets, is quickly becoming more challenging. In many facilities with large-scale storage needs, this massive amount of data is stored in distributed, multi-tiered storage systems. It has become imperative to support efficient and effective search within these environments. For some search problems, specifically file system metadata search, traditional relational databases have been used with initially good results. As the scale of supercomputing has grown, however, it is becoming increasingly difficult for databases to scale with the volume of metadata they manage. In this paper, we propose a new direction for database management techniques within the context of high-performance computing, specifically search within ultrascale storage systems. Instead of using databases as a layer sitting above the storage system, we suggest moving database components into the storage system itself. By taking this approach, we aim to leverage the decades of research and tuning that have made relational database technology successful. At the same time, this integration gives us the ability to maintain a better view of the storage system for search optimization. Through this effort, we can position these techniques to scale to the degree required by the high-performance computing community, both now and in the future.

Original language	English (US)
Title of host publication	HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11
Pages	1-4
Number of pages	4
DOIs	https://doi.org/10.1145/2125636.2125638
State	Published - 2011
Event	1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11 - Seattle, WA, United States
Duration	Nov 13 2011 → Nov 13 2011

Publication series

Name	HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11

Other

Other	1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11
Country/Territory	United States
City	Seattle, WA
Period	11/13/11 → 11/13/11

Keywords

Databases
Exascale
File systems
Indexing
Search

Cite this

Naps, J. L., Mokbel, M. F., & Du, D. H. C. (2011). Toward efficient search for ultrascale storage systems. In HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11 (pp. 1-4). (HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11). https://doi.org/10.1145/2125636.2125638

Toward efficient search for ultrascale storage systems. / Naps, Joseph L.; Mokbel, Mohmed F.; Du, David H.C.
HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11. 2011. p. 1-4 (HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11).

Naps, JL, Mokbel, MF & Du, DHC 2011, Toward efficient search for ultrascale storage systems. in HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11. HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11, pp. 1-4, 1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11, Seattle, WA, United States, 11/13/11. https://doi.org/10.1145/2125636.2125638

@inproceedings{520cb65a0a884ad5bb3f32e793c1321b,

title = "Toward efficient search for ultrascale storage systems",

keywords = "Databases, Exascale, File systems, Indexing, Search",

author = "Naps, {Joseph L.} and Mokbel, {Mohmed F.} and Du, {David H.C.}",

year = "2011",

doi = "10.1145/2125636.2125638",

language = "English (US)",

isbn = "9781450311571",

series = "HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11",

pages = "1--4",

booktitle = "HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11",

note = "1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11 ; Conference date: 13-11-2011 Through 13-11-2011",

}

TY - GEN

T1 - Toward efficient search for ultrascale storage systems

AU - Naps, Joseph L.

AU - Mokbel, Mohmed F.

AU - Du, David H.C.

PY - 2011

Y1 - 2011

KW - Databases

KW - Exascale

KW - File systems

KW - Indexing

KW - Search

UR - http://www.scopus.com/inward/record.url?scp=84857926748&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857926748&partnerID=8YFLogxK

U2 - 10.1145/2125636.2125638

DO - 10.1145/2125636.2125638

M3 - Conference contribution

AN - SCOPUS:84857926748

SN - 9781450311571

T3 - HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11

SP - 1

EP - 4

BT - HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11

T2 - 1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11

Y2 - 13 November 2011 through 13 November 2011

ER -
