Efficient Aggregation Algorithms on Very Large Compressed Data Warehouses

Jianzhong Li; Yingshu Li; Jaideep Srivastava

doi:10.1007/BF02948809

Computer Science and Engineering

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational database based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large and compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The algorithms are described and analyzed with respect to the I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.
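The abstract does not spell out the algorithms themselves, so the following is only an illustrative sketch of the general idea of aggregating over a compressed multidimensional array without first decompressing it. It assumes a simple run-based scheme in which only runs of non-empty cells are stored together with their row-major start offset. The function names `compress` and `aggregate_sum`, the choice of `0` as the empty value, and the compression layout are all assumptions made for illustration, not the paper's method.

```python
from collections import defaultdict


def compress(dense, empty=0):
    """Keep only runs of non-empty cells as (start_offset, [values]).

    `dense` is a row-major flattening of the array. This layout is an
    assumed stand-in for a real compression method.
    """
    runs, current = [], None
    for offset, value in enumerate(dense):
        if value == empty:
            current = None          # an empty cell ends the current run
        elif current is None:
            current = (offset, [value])
            runs.append(current)
        else:
            current[1].append(value)
    return runs


def aggregate_sum(runs, shape, keep_dim):
    """Sum over every dimension except `keep_dim`, reading only stored runs."""
    # Row-major stride of keep_dim: product of the sizes of later dimensions.
    stride = 1
    for size in shape[keep_dim + 1:]:
        stride *= size

    totals = defaultdict(int)
    for start, values in runs:
        for i, value in enumerate(values):
            offset = start + i
            # Recover the coordinate along keep_dim from the logical offset.
            coord = (offset // stride) % shape[keep_dim]
            totals[coord] += value
    return dict(totals)


# A 3 x 4 array, flattened row by row; zeros are empty cells.
dense = [0, 0, 5, 1,
         0, 0, 0, 2,
         7, 0, 0, 0]
runs = compress(dense)
by_row = aggregate_sum(runs, shape=(3, 4), keep_dim=0)
by_col = aggregate_sum(runs, shape=(3, 4), keep_dim=1)
```

In this sketch the work done is proportional to the number of stored cells rather than to the full size of the array, which is the kind of saving that operating directly on compressed data is meant to provide.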

Original language	English (US)
Pages (from-to)	213-229
Number of pages	17
Journal	Journal of Computer Science and Technology
Volume	15
Issue number	3
DOIs	https://doi.org/10.1007/BF02948809
State	Published - May 2000

Bibliographical note

Funding Information:
Research supported by the National Natural Science Foundation of China and the National '863' High-Tech Programme of China.

Keywords

Aggregation
Data warehouse
OLAP

Cite this

@article{f3b7bcbde0854ac6a1981d0099bcc850,

title = "Efficient Aggregation Algorithms on Very Large Compressed Data Warehouses",

abstract = "Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational database based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large and compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The algorithms are described and analyzed with respect to the I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.",

keywords = "Aggregation, Data warehouse, OLAP",

author = "Jianzhong Li and Yingshu Li and Jaideep Srivastava",

note = "Funding Information: Research supported by the National Natural Science Foundation of China and the National '863' High-Tech Programme of China.",

year = "2000",

month = may,

doi = "10.1007/BF02948809",

language = "English (US)",

volume = "15",

pages = "213--229",

journal = "Journal of Computer Science and Technology",

issn = "1000-9000",

publisher = "Springer New York",

number = "3",

}

TY - JOUR

T1 - Efficient Aggregation Algorithms on Very Large Compressed Data Warehouses

AU - Li, Jianzhong

AU - Li, Yingshu

AU - Srivastava, Jaideep

N1 - Funding Information: Research supported by the National Natural Science Foundation of China and the National '863' High-Tech Programme of China.

PY - 2000/5

Y1 - 2000/5

AB - Multidimensional aggregation is a dominant operation on data warehouses for on-line analytical processing (OLAP). Many efficient algorithms to compute multidimensional aggregation on relational database based data warehouses have been developed. However, to our knowledge, there is nothing to date in the literature about aggregation algorithms on multidimensional data warehouses that store datasets in multidimensional arrays rather than in tables. This paper presents a set of multidimensional aggregation algorithms on very large and compressed multidimensional data warehouses. These algorithms operate directly on compressed datasets in multidimensional data warehouses without the need to first decompress them. They are applicable to a variety of data compression methods. The algorithms have different performance behavior as a function of dataset parameters, sizes of outputs and main memory availability. The algorithms are described and analyzed with respect to the I/O and CPU costs. A decision procedure to select the most efficient algorithm, given an aggregation request, is also proposed. The analytical and experimental results show that the algorithms are more efficient than the traditional aggregation algorithms.

KW - Aggregation

KW - Data warehouse

KW - OLAP

UR - http://www.scopus.com/inward/record.url?scp=0346739135&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0346739135&partnerID=8YFLogxK

U2 - 10.1007/BF02948809

DO - 10.1007/BF02948809

M3 - Article

AN - SCOPUS:0346739135

SN - 1000-9000

VL - 15

SP - 213

EP - 229

JO - Journal of Computer Science and Technology

JF - Journal of Computer Science and Technology

IS - 3

ER -
