Node Embedding with Adaptive Similarities for Scalable Learning over Graphs

Dimitris Berberidis; Georgios B. Giannakis

doi:10.1109/TKDE.2019.2931542

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Node embedding is the task of extracting informative and descriptive features over the nodes of a graph. The importance of node embedding for graph analytics as well as learning tasks, such as node classification, link prediction, and community detection, has led to a growing interest and a number of recent advances. Nonetheless, node embedding faces several major challenges. Practical embedding methods have to deal with real-world graphs that arise from different domains, with inherently diverse underlying processes as well as similarity structures and metrics. On the other hand, similar to principal component analysis in feature vector spaces, node embedding is an inherently unsupervised task. Lacking metadata for validation, practical schemes motivate standardization and limited use of tunable hyperparameters. Finally, node embedding methods must be scalable in order to cope with large-scale real-world graphs of networks with ever-increasing size. The present work puts forth an adaptive node embedding framework that adjusts the embedding process to a given underlying graph, in a fully unsupervised manner. This is achieved by leveraging the notion of a tunable node similarity matrix that assigns weights on multihop paths. The design of multihop similarities ensures that the resultant embeddings also inherit interpretable spectral properties. The proposed model is thoroughly investigated, interpreted, and numerically evaluated using stochastic block models. Moreover, an unsupervised algorithm is developed for training the model parameters efficiently. Extensive node classification, link prediction, and clustering experiments are carried out on many real-world graphs from various domains, along with comparisons with state-of-the-art scalable and unsupervised node embedding alternatives. The proposed method enjoys superior performance in many cases, while also yielding interpretable information on the underlying graph structure.
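
The abstract's central object, a node similarity matrix that assigns weights to multihop paths, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the authors' algorithm: it uses the raw adjacency matrix A and forms S = theta_1 A + theta_2 A^2 + ..., where the function names and the weight vector theta are hypothetical. The paper's actual normalization of A, the constraints on the weights, and the unsupervised procedure that learns them are not specified on this page.

```python
def matmul(a, b):
    """Multiply two dense square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def multihop_similarity(adj, theta):
    """Return S = sum over k of theta[k-1] * A^k, for k = 1..len(theta).

    adj   : dense adjacency matrix as a list of rows
    theta : hypothetical per-hop weights; theta[0] weights 1-hop paths
    """
    n = len(adj)
    power = [row[:] for row in adj]           # current power, starts at A^1
    sim = [[0.0] * n for _ in range(n)]
    for k, weight in enumerate(theta):
        if k > 0:
            power = matmul(power, adj)        # advance to A^(k+1)
        for i in range(n):
            for j in range(n):
                sim[i][j] += weight * power[i][j]
    return sim


# Path graph 0 - 1 - 2: nodes 0 and 2 are connected only by a 2-hop path.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
S = multihop_similarity(A, theta=[0.5, 0.5])
```

With one-hop weights only, nodes 0 and 2 would have zero similarity; putting weight on the second hop makes them similar. Embeddings would then come from a low-rank factorization of S (the keywords list SVD), a step this sketch leaves out.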

Original language	English (US)
Article number	8778744
Pages (from-to)	637-650
Number of pages	14
Journal	IEEE Transactions on Knowledge and Data Engineering
Volume	33
Issue number	2
DOIs	https://doi.org/10.1109/TKDE.2019.2931542
State	Published - Feb 1 2021
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 1989-2012 IEEE.

Keywords

SVD
SVM
multiscale
random walks
spectral
unsupervised

Cite this

@article{ea83fdafa03b4d48872c8b330f241ce8,

title = "Node Embedding with Adaptive Similarities for Scalable Learning over Graphs",

abstract = "Node embedding is the task of extracting informative and descriptive features over the nodes of a graph. The importance of node embedding for graph analytics as well as learning tasks, such as node classification, link prediction, and community detection, has led to a growing interest and a number of recent advances. Nonetheless, node embedding faces several major challenges. Practical embedding methods have to deal with real-world graphs that arise from different domains, with inherently diverse underlying processes as well as similarity structures and metrics. On the other hand, similar to principal component analysis in feature vector spaces, node embedding is an inherently unsupervised task. Lacking metadata for validation, practical schemes motivate standardization and limited use of tunable hyperparameters. Finally, node embedding methods must be scalable in order to cope with large-scale real-world graphs of networks with ever-increasing size. The present work puts forth an adaptive node embedding framework that adjusts the embedding process to a given underlying graph, in a fully unsupervised manner. This is achieved by leveraging the notion of a tunable node similarity matrix that assigns weights on multihop paths. The design of multihop similarities ensures that the resultant embeddings also inherit interpretable spectral properties. The proposed model is thoroughly investigated, interpreted, and numerically evaluated using stochastic block models. Moreover, an unsupervised algorithm is developed for training the model parameters efficiently. Extensive node classification, link prediction, and clustering experiments are carried out on many real-world graphs from various domains, along with comparisons with state-of-the-art scalable and unsupervised node embedding alternatives. The proposed method enjoys superior performance in many cases, while also yielding interpretable information on the underlying graph structure.",

keywords = "SVD, SVM, multiscale, random walks, spectral, unsupervised",

author = "Berberidis, Dimitris and Giannakis, {Georgios B.}",

note = "Publisher Copyright: {\textcopyright} 1989-2012 IEEE.",

year = "2021",

month = feb,

day = "1",

doi = "10.1109/TKDE.2019.2931542",

language = "English (US)",

volume = "33",

pages = "637--650",

journal = "IEEE Transactions on Knowledge and Data Engineering",

issn = "1041-4347",

publisher = "IEEE Computer Society",

number = "2",

}

TY - JOUR

T1 - Node Embedding with Adaptive Similarities for Scalable Learning over Graphs

AU - Berberidis, Dimitris

AU - Giannakis, Georgios B.

PY - 2021/2/1

Y1 - 2021/2/1

N2 - Node embedding is the task of extracting informative and descriptive features over the nodes of a graph. The importance of node embedding for graph analytics as well as learning tasks, such as node classification, link prediction, and community detection, has led to a growing interest and a number of recent advances. Nonetheless, node embedding faces several major challenges. Practical embedding methods have to deal with real-world graphs that arise from different domains, with inherently diverse underlying processes as well as similarity structures and metrics. On the other hand, similar to principal component analysis in feature vector spaces, node embedding is an inherently unsupervised task. Lacking metadata for validation, practical schemes motivate standardization and limited use of tunable hyperparameters. Finally, node embedding methods must be scalable in order to cope with large-scale real-world graphs of networks with ever-increasing size. The present work puts forth an adaptive node embedding framework that adjusts the embedding process to a given underlying graph, in a fully unsupervised manner. This is achieved by leveraging the notion of a tunable node similarity matrix that assigns weights on multihop paths. The design of multihop similarities ensures that the resultant embeddings also inherit interpretable spectral properties. The proposed model is thoroughly investigated, interpreted, and numerically evaluated using stochastic block models. Moreover, an unsupervised algorithm is developed for training the model parameters efficiently. Extensive node classification, link prediction, and clustering experiments are carried out on many real-world graphs from various domains, along with comparisons with state-of-the-art scalable and unsupervised node embedding alternatives. The proposed method enjoys superior performance in many cases, while also yielding interpretable information on the underlying graph structure.

KW - SVD

KW - SVM

KW - multiscale

KW - random walks

KW - spectral

KW - unsupervised

UR - http://www.scopus.com/inward/record.url?scp=85099453284&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85099453284&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2019.2931542

DO - 10.1109/TKDE.2019.2931542

M3 - Article

AN - SCOPUS:85099453284

SN - 1041-4347

VL - 33

SP - 637

EP - 650

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

IS - 2

M1 - 8778744

ER -
