Similar_Join: Extending DBMS with a bio-specific operator

Jake Yue Chen; John V. Carlis

Computer Science and Engineering

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations

Abstract

Existing sequence comparison software applications lack adequate automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons in the post-genomics era. We invented and developed a new bio-specific Database Management System (DBMS) operator, Similar_Join, to abstract the labor-intensive batch sequence similarity search task into a syntactically concise database operation. We implemented the Similar_Join operator as part of a relational operator package. This implementation enabled us to write simple PL/SQL scripts within the DBMS that carry out routine sequence similarity searches conveniently, for example, a "batch BLAST" that compares 7,000 human genes against 500,000 human Expressed Sequence Tags (ESTs) in a few hours. We also implemented a simple version of Similar_Join as a database operator in the extended data cartridge of the Oracle 8i object-relational DBMS. We demonstrated that, once fully integrated into SQL language extensions, this operator could enable biology users to pose interesting, complex biological queries that were previously impossible inside the DBMS.
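To make the idea of a similarity join concrete, the following is a minimal Python sketch, not the authors' PL/SQL or Oracle implementation. It pairs every row of one sequence table with every row of another and keeps the pairs whose similarity score meets a threshold. The names `Sequence`, `similar_join`, `kmer_jaccard`, and the `threshold` parameter are hypothetical, and the k-mer Jaccard score is only a toy stand-in for a real alignment tool such as BLAST.

```python
from __future__ import annotations

from typing import Callable, Iterable, Iterator, NamedTuple


class Sequence(NamedTuple):
    """One row of a hypothetical sequence table."""
    seq_id: str
    residues: str


def kmer_set(residues: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-mers in a sequence."""
    return {residues[i:i + k] for i in range(len(residues) - k + 1)}


def kmer_jaccard(a: str, b: str, k: int = 3) -> float:
    """Toy similarity score: Jaccard index of k-mer sets (assumed stand-in, not BLAST)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similar_join(
    left: Iterable[Sequence],
    right: Iterable[Sequence],
    score: Callable[[str, str], float] = kmer_jaccard,
    threshold: float = 0.5,
) -> Iterator[tuple[str, str, float]]:
    """Yield (left_id, right_id, score) for every pair scoring at or above threshold.

    This is a naive nested-loop join; a production operator would rely on an
    indexed similarity search instead of comparing all pairs.
    """
    right_rows = list(right)  # materialize so the inner loop can be repeated
    for l_row in left:
        for r_row in right_rows:
            s = score(l_row.residues, r_row.residues)
            if s >= threshold:
                yield (l_row.seq_id, r_row.seq_id, s)
```

Calling `list(similar_join(genes, ests, threshold=0.5))` on two small lists of `Sequence` rows returns the matching pairs in one expression, which mirrors the abstraction the operator is meant to provide: the batch comparison becomes a single join rather than a hand-written loop around an external tool.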

Original language	English (US)
Pages	109-114
Number of pages	6
State	Published - 2003
Event	Proceedings of the 2003 ACM Symposium on Applied Computing - Melbourne, FL, United States
Duration	Mar 9 2003 → Mar 12 2003

Conference

Conference	Proceedings of the 2003 ACM Symposium on Applied Computing
Country/Territory	United States
City	Melbourne, FL
Period	3/9/03 → 3/12/03

Keywords

Database Management System (DBMS)
Genomic DBMS Extension
Relational Operator
Similar_Join Operator
Similarity Search

Cite this

@conference{a34075d297e2462c9af0e3963083ad72,

title = "Similar_Join: Extending DBMS with a bio-specific operator",

abstract = "Existing sequence comparison software applications lack adequate automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons in the post-genomics era. We invented and developed a new bio-specific Database Management System (DBMS) operator, Similar_Join, to abstract the labor-intensive batch sequence similarity search task into a syntactically concise database operation. We implemented the Similar Join operator as part of a relational operator package. This implementation enabled us to write simple PL/SQL scripts within the DBMS to accomplish routine sequence similarity searches conveniently, for example, a {"}batch BLAST{"} that compares 7,000 human genes against 500,000 human Expressed Sequence Tags (EST) in a few hours. We also implemented a simple version of Similar_Join as a database operator in the extended data cartridge of Oracle 8i object-relational DBMS. When fully integrated into SQL language extensions, we demonstrated this operator could enable biology users to achieve interesting complex biological queries previously impossible inside the DBMS.",

keywords = "Database Management System (DBMS), Genomic DBMS Extension, Relational Operator, Similar_Join Operator, Similarity Search",

author = "Chen, {Jake Yue} and Carlis, {John V.}",

year = "2003",

language = "English (US)",

pages = "109--114",

}

TY - CONF

T1 - Similar_Join

T2 - Proceedings of the 2003 ACM Symposium on Applied Computing

AU - Chen, Jake Yue

AU - Carlis, John V.

PY - 2003

Y1 - 2003

N2 - Existing sequence comparison software applications lack adequate automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons in the post-genomics era. We invented and developed a new bio-specific Database Management System (DBMS) operator, Similar_Join, to abstract the labor-intensive batch sequence similarity search task into a syntactically concise database operation. We implemented the Similar Join operator as part of a relational operator package. This implementation enabled us to write simple PL/SQL scripts within the DBMS to accomplish routine sequence similarity searches conveniently, for example, a "batch BLAST" that compares 7,000 human genes against 500,000 human Expressed Sequence Tags (EST) in a few hours. We also implemented a simple version of Similar_Join as a database operator in the extended data cartridge of Oracle 8i object-relational DBMS. When fully integrated into SQL language extensions, we demonstrated this operator could enable biology users to achieve interesting complex biological queries previously impossible inside the DBMS.

AB - Existing sequence comparison software applications lack adequate automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons in the post-genomics era. We invented and developed a new bio-specific Database Management System (DBMS) operator, Similar_Join, to abstract the labor-intensive batch sequence similarity search task into a syntactically concise database operation. We implemented the Similar Join operator as part of a relational operator package. This implementation enabled us to write simple PL/SQL scripts within the DBMS to accomplish routine sequence similarity searches conveniently, for example, a "batch BLAST" that compares 7,000 human genes against 500,000 human Expressed Sequence Tags (EST) in a few hours. We also implemented a simple version of Similar_Join as a database operator in the extended data cartridge of Oracle 8i object-relational DBMS. When fully integrated into SQL language extensions, we demonstrated this operator could enable biology users to achieve interesting complex biological queries previously impossible inside the DBMS.

KW - Database Management System (DBMS)

KW - Genomic DBMS Extension

KW - Relational Operator

KW - Similar_Join Operator

KW - Similarity Search

UR - http://www.scopus.com/inward/record.url?scp=0037661321&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037661321&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:0037661321

SP - 109

EP - 114

Y2 - 9 March 2003 through 12 March 2003

ER -
