Abstract
Existing sequence comparison software lacks adequate automation, abstraction, performance, and flexibility. Users need a new way of studying and applying sequence comparisons in the post-genomics era. We invented and developed a new bio-specific Database Management System (DBMS) operator, Similar_Join, to abstract the labor-intensive batch sequence similarity search task into a syntactically concise database operation. We implemented the Similar_Join operator as part of a relational operator package. This implementation enabled us to write simple PL/SQL scripts within the DBMS to accomplish routine sequence similarity searches conveniently, for example, a "batch BLAST" that compares 7,000 human genes against 500,000 human Expressed Sequence Tags (ESTs) in a few hours. We also implemented a simple version of Similar_Join as a database operator in the extended data cartridge of the Oracle 8i object-relational DBMS. We demonstrated that, when fully integrated into SQL language extensions, this operator could enable biology users to perform complex biological queries that were previously impossible inside the DBMS.
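To make the idea of a similarity-based join concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes two in-memory "tables" of (id, sequence) rows and uses Jaccard similarity over k-mer sets as a crude stand-in for a BLAST score. The names `kmers`, `similarity`, and `similar_join`, the threshold, and the sample sequences are all hypothetical and chosen only for illustration.

```python
from itertools import product


def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (an assumed stand-in for a BLAST score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similar_join(left, right, threshold=0.5, k=3):
    """Join two lists of (id, sequence) rows on sequence similarity.

    Every left row is compared with every right row; pairs scoring at or
    above `threshold` are returned as (left_id, right_id, score) tuples.
    """
    result = []
    for (lid, lseq), (rid, rseq) in product(left, right):
        score = similarity(lseq, rseq, k)
        if score >= threshold:
            result.append((lid, rid, score))
    return result


# Hypothetical toy data standing in for a gene table and an EST table.
genes = [("gene1", "ACGTACGTGA"), ("gene2", "TTTTGGGGCC")]
ests = [("est1", "ACGTACGTGA"), ("est2", "CGTACGTG"), ("est3", "AAAAAAAAAA")]

for left_id, right_id, score in similar_join(genes, ests):
    print(left_id, right_id, round(score, 3))
```

The nested comparison here is quadratic in the table sizes. A batch run at the scale the abstract describes would depend on an indexed search tool such as BLAST rather than exhaustive pairwise scoring.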
Original language | English (US) |
---|---|
Pages | 109-114 |
Number of pages | 6 |
State | Published - 2003 |
Event | Proceedings of the 2003 ACM Symposium on Applied Computing - Melbourne, FL, United States; Duration: Mar 9 2003 → Mar 12 2003 |
Other
Other | Proceedings of the 2003 ACM Symposium on Applied Computing |
---|---|
Country/Territory | United States |
City | Melbourne, FL |
Period | 3/9/03 → 3/12/03 |
Keywords
- Database Management System (DBMS)
- Genomic DBMS Extension
- Relational Operator
- Similar_Join Operator
- Similarity Search