TY - JOUR
T1 - Horton+: A Distributed System for Processing Declarative Reachability Queries over Partitioned Graphs
T2 - 39th International Conference on Very Large Data Bases, VLDB 2013
AU - Sarwat, Mohamed
AU - Elnikety, Sameh
AU - He, Yuxiong
AU - Mokbel, Mohamed F.
PY - 2013/9
Y1 - 2013/9
N2 - Horton+ is a graph query processing system that executes declarative reachability queries on a partitioned attributed multi-graph. It comprises a query language, a query optimizer, and a distributed execution engine. The query language expresses declarative reachability queries and supports closures and predicates on node and edge attributes to match graph paths. We introduce three algebraic operators (select, traverse, and join), and a query is compiled into an execution plan containing these operators. Because reachability queries access graph elements in a random access pattern, the graph is maintained in the main memory of a cluster of servers to reduce query execution time. We develop a distributed execution engine that processes a query plan in parallel on the graph servers. Since the query language is declarative, we build a query optimizer that uses graph statistics to estimate predicate selectivity. We experimentally evaluate system performance on a cluster of 16 graph servers using synthetic graphs as well as a real graph from an application that uses reachability queries. The evaluation shows (1) the efficiency of the optimizer in reducing query execution time, (2) system scalability with the size of the graph and with the number of servers, and (3) the convenience of using declarative queries.
UR - http://www.scopus.com/inward/record.url?scp=84891119285&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84891119285&partnerID=8YFLogxK
U2 - 10.14778/2556549.2556573
DO - 10.14778/2556549.2556573
M3 - Conference article
AN - SCOPUS:84891119285
SN - 2150-8097
VL - 6
SP - 1918
EP - 1929
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 14
Y2 - 26 August 2013 through 30 August 2013
ER -