TY - GEN
T1 - Privacy leakage in multi-relational databases via pattern based semi-supervised learning
AU - Xiong, Hui
AU - Steinbach, Michael
AU - Kumar, Vipin
PY - 2005
Y1 - 2005
N2 - In multi-relational databases, a view, which is a context- and content-dependent subset of one or more tables (or other views), is often used to preserve privacy by hiding sensitive information. However, recent developments in data mining present a new challenge for database security even when traditional database security techniques, such as database access control, are employed. This paper presents a data mining framework using semi-supervised learning that demonstrates the potential for privacy leakage in multi-relational databases. Many different types of semi-supervised learning techniques, such as the K-nearest neighbor (KNN) method, can be used to demonstrate privacy leakage. However, we also introduce a new approach to semi-supervised learning, hyperclique pattern based semi-supervised learning (HPSL), which differs from traditional semi-supervised learning approaches in that it considers the similarity among groups of objects instead of only pairs of objects. Our experimental results show that both the KNN and HPSL methods have the ability to compromise database security, although HPSL is better at this privacy violation than the KNN method.
AB - In multi-relational databases, a view, which is a context- and content-dependent subset of one or more tables (or other views), is often used to preserve privacy by hiding sensitive information. However, recent developments in data mining present a new challenge for database security even when traditional database security techniques, such as database access control, are employed. This paper presents a data mining framework using semi-supervised learning that demonstrates the potential for privacy leakage in multi-relational databases. Many different types of semi-supervised learning techniques, such as the K-nearest neighbor (KNN) method, can be used to demonstrate privacy leakage. However, we also introduce a new approach to semi-supervised learning, hyperclique pattern based semi-supervised learning (HPSL), which differs from traditional semi-supervised learning approaches in that it considers the similarity among groups of objects instead of only pairs of objects. Our experimental results show that both the KNN and HPSL methods have the ability to compromise database security, although HPSL is better at this privacy violation than the KNN method.
KW - Database Security
KW - Hyperclique Patterns
KW - Privacy Preserving Data Mining
KW - Semi-supervised Learning
UR - http://www.scopus.com/inward/record.url?scp=33745793065&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745793065&partnerID=8YFLogxK
U2 - 10.1145/1099554.1099664
DO - 10.1145/1099554.1099664
M3 - Conference contribution
AN - SCOPUS:33745793065
SN - 1595931406
SN - 9781595931405
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 355
EP - 356
BT - CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - CIKM'05 - Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Y2 - 31 October 2005 through 5 November 2005
ER -