Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees

Mohammad K. Hossain; Arjun G. Roy; Arijit Chatterjee; William Perrizo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In data mining applications, different distance metrics are used to measure the closeness of two data points. Among these metrics, the Manhattan (L1), Euclidean (L2), and Max (L∞) distances are used very frequently in various algorithms. In the pTree vertical data representation, the Max distance can be implemented efficiently using only bitwise operations across the pTrees, without any horizontal access to the data points. However, many clustering and classification algorithms require computing L1 and L2 distances in order to increase their accuracy. In this paper we show how the Manhattan (L1) distance can be calculated for vertical data represented in pTrees. Like the Max distance, this algorithm uses only bitwise operations across the pTrees, without performing any horizontal scan of the data points. As a result, the algorithm works very fast on huge volumes of data represented by pTrees compared with the traditional horizontal data representation. These algorithms also enable data mining algorithms that use pTrees to improve their accuracy without sacrificing significant speed.
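The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes uncompressed bit slices stored as Python integers (a stand-in for pTrees), unsigned attribute values that fit in a fixed number of bits, and a query point in the same range. All function names are hypothetical. The per-row value |x - q| is obtained by a ripple-carry subtraction carried out slice by slice, so every row is processed at once with AND, OR, and XOR.

```python
def to_slices(values, bits):
    """Vertical layout: slice k is an int whose bit i is bit k of values[i]."""
    return [sum(((v >> k) & 1) << i for i, v in enumerate(values))
            for k in range(bits)]


def from_slices(slices, n):
    """Decode slices back to one integer per row (used here only for checking)."""
    return [sum(((s >> i) & 1) << k for k, s in enumerate(slices))
            for i in range(n)]


def abs_diff_slices(slices, q, n):
    """Slices of |x_i - q| for all n rows; assumes 0 <= q < 2**len(slices)."""
    full = (1 << n) - 1
    # x - q == x + (~q) + 1 in two's complement; the carry ripples across slices.
    carry = full  # the "+ 1", applied to every row
    diff = []
    for k, a in enumerate(slices):
        b = 0 if (q >> k) & 1 else full  # bit k of ~q, broadcast to all rows
        diff.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    neg = full & ~carry  # no carry out of the top slice means x_i < q
    # Negate the negative rows only: (d XOR neg) + neg.
    carry = neg
    out = []
    for d in diff:
        t = d ^ neg
        out.append(t ^ carry)
        carry = t & carry
    return out


def add_slices(a, b):
    """Row-wise sum of two slice-encoded columns (ripple-carry adder)."""
    out, carry = [], 0
    for k in range(max(len(a), len(b))):
        x = a[k] if k < len(a) else 0
        y = b[k] if k < len(b) else 0
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    out.append(carry)
    return out


def manhattan_slices(columns, query, n):
    """Slices of the L1 distance from every row to the query point."""
    total = []
    for slices, q in zip(columns, query):
        total = add_slices(total, abs_diff_slices(slices, q, n))
    return total


rows = [(3, 7), (0, 0), (5, 2), (7, 1)]
query = (4, 3)
bits = 3
columns = [to_slices([r[j] for r in rows], bits) for j in range(len(query))]
vertical = from_slices(manhattan_slices(columns, query, len(rows)), len(rows))
horizontal = [sum(abs(x - q) for x, q in zip(r, query)) for r in rows]
print(vertical, horizontal)  # both are [5, 7, 2, 5]
```

The conversions `to_slices` and `from_slices` touch rows one at a time and exist only to build and check the toy data; the distance computation itself works on whole slices.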

Original language	English (US)
Title of host publication	Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012
Pages	65-70
Number of pages	6
State	Published - 2012
Externally published	Yes
Event	27th International Conference on Computers and Their Applications, CATA 2012 - Las Vegas, NV, United States
Duration	Mar 12 2012 → Mar 14 2012

Publication series

Name	Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012

Conference

Conference	27th International Conference on Computers and Their Applications, CATA 2012
Country/Territory	United States
City	Las Vegas, NV
Period	3/12/12 → 3/14/12

Cite this

Hossain, M. K., Roy, A. G., Chatterjee, A., & Perrizo, W. (2012). Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees. In Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012 (pp. 65-70). (Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012).

Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees. / Hossain, Mohammad K.; Roy, Arjun G.; Chatterjee, Arijit et al.
Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012. 2012. p. 65-70 (Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012).

Hossain, MK, Roy, AG, Chatterjee, A & Perrizo, W 2012, Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees. in Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012. Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012, pp. 65-70, 27th International Conference on Computers and Their Applications, CATA 2012, Las Vegas, NV, United States, 3/12/12.

Hossain, Mohammad K. ; Roy, Arjun G. ; Chatterjee, Arijit et al. / Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees. Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012. 2012. pp. 65-70 (Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012).

@inproceedings{e30fb7d96cc54fd18cf00a80513554ad,

title = "Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees",

abstract = "In data mining applications, different distance metrics are used to measure the closeness of two data points. Among these metrics, the Manhattan (L1), Euclidean (L2), and Max (L$_\infty$) distances are used very frequently in various algorithms. In the pTree vertical data representation, the Max distance can be implemented efficiently using only bitwise operations across the pTrees, without any horizontal access to the data points. However, many clustering and classification algorithms require computing L1 and L2 distances in order to increase their accuracy. In this paper we show how the Manhattan (L1) distance can be calculated for vertical data represented in pTrees. Like the Max distance, this algorithm uses only bitwise operations across the pTrees, without performing any horizontal scan of the data points. As a result, the algorithm works very fast on huge volumes of data represented by pTrees compared with the traditional horizontal data representation. These algorithms also enable data mining algorithms that use pTrees to improve their accuracy without sacrificing significant speed.",

author = "Hossain, {Mohammad K.} and Roy, {Arjun G.} and Arijit Chatterjee and William Perrizo",

year = "2012",

language = "English (US)",

isbn = "9781880843840",

series = "Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012",

pages = "65--70",

booktitle = "Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012",

note = "27th International Conference on Computers and Their Applications, CATA 2012 ; Conference date: 12-03-2012 Through 14-03-2012",

}

TY - GEN

T1 - Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees

AU - Hossain, Mohammad K.

AU - Roy, Arjun G.

AU - Chatterjee, Arijit

AU - Perrizo, William

PY - 2012

Y1 - 2012

N2 - In data mining applications, different distance metrics are used to measure the closeness of two data points. Among these metrics, the Manhattan (L1), Euclidean (L2), and Max (L∞) distances are used very frequently in various algorithms. In the pTree vertical data representation, the Max distance can be implemented efficiently using only bitwise operations across the pTrees, without any horizontal access to the data points. However, many clustering and classification algorithms require computing L1 and L2 distances in order to increase their accuracy. In this paper we show how the Manhattan (L1) distance can be calculated for vertical data represented in pTrees. Like the Max distance, this algorithm uses only bitwise operations across the pTrees, without performing any horizontal scan of the data points. As a result, the algorithm works very fast on huge volumes of data represented by pTrees compared with the traditional horizontal data representation. These algorithms also enable data mining algorithms that use pTrees to improve their accuracy without sacrificing significant speed.

UR - http://www.scopus.com/inward/record.url?scp=84872004517&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872004517&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84872004517

SN - 9781880843840

T3 - Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012

SP - 65

EP - 70

BT - Proceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012

T2 - 27th International Conference on Computers and Their Applications, CATA 2012

Y2 - 12 March 2012 through 14 March 2012

ER -
