TY - GEN
T1 - Toward efficient search for ultrascale storage systems
AU - Naps, Joseph L.
AU - Mokbel, Mohamed F.
AU - Du, David H.C.
PY - 2011
Y1 - 2011
N2 - As the rate at which scientific computing generates data continues to increase, we are finding that managing this data, in all facets, is quickly becoming more challenging. In many facilities with large-scale storage needs, this massive amount of data is stored in distributed, multi-tiered storage systems. It has become imperative to allow for efficient and effective search within these kinds of environments. For some search problems, specifically file system metadata, traditional relational databases have been used with, initially, good results. As the scale of supercomputing has grown, though, we find that it is becoming increasingly difficult for databases to scale up with the volume of metadata that they are managing. In this paper, we propose a new direction for database management techniques within the context of high-performance computing, specifically, search within ultrascale storage systems. Instead of using databases as a layer sitting above the storage system, we suggest the movement of database components within the storage system itself. By taking this approach, we aim to leverage the decades of research and tuning that have made relational database technology successful. At the same time, this integration gives us the ability to maintain a better view of the storage system for search optimization. Through this effort, we can position these techniques to better scale to the degree that is required by the high-performance computing community currently and in the future.
AB - As the rate at which scientific computing generates data continues to increase, we are finding that managing this data, in all facets, is quickly becoming more challenging. In many facilities with large-scale storage needs, this massive amount of data is stored in distributed, multi-tiered storage systems. It has become imperative to allow for efficient and effective search within these kinds of environments. For some search problems, specifically file system metadata, traditional relational databases have been used with, initially, good results. As the scale of supercomputing has grown, though, we find that it is becoming increasingly difficult for databases to scale up with the volume of metadata that they are managing. In this paper, we propose a new direction for database management techniques within the context of high-performance computing, specifically, search within ultrascale storage systems. Instead of using databases as a layer sitting above the storage system, we suggest the movement of database components within the storage system itself. By taking this approach, we aim to leverage the decades of research and tuning that have made relational database technology successful. At the same time, this integration gives us the ability to maintain a better view of the storage system for search optimization. Through this effort, we can position these techniques to better scale to the degree that is required by the high-performance computing community currently and in the future.
KW - Databases
KW - Exascale
KW - File systems
KW - Indexing
KW - Search
UR - http://www.scopus.com/inward/record.url?scp=84857926748&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857926748&partnerID=8YFLogxK
U2 - 10.1145/2125636.2125638
DO - 10.1145/2125636.2125638
M3 - Conference contribution
AN - SCOPUS:84857926748
SN - 9781450311571
T3 - HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11
SP - 1
EP - 4
BT - HPCDB'11 - Proceedings of the 2011 Workshop on High-Performance Computing Meets Databases, Co-located with SC'11
T2 - 1st Annual 2011 Workshop on High-Performance Computing Meets Databases, HPCDB'11, Co-located with Supercomputing, SC'11
Y2 - 13 November 2011
ER -