Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec.
Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.
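As a rough illustration of the "explicit factorization" idea (not the authors' reference implementation), the sketch below computes, for a small undirected graph, the matrix the paper shows DeepWalk implicitly approximates, log(vol(G)/(bT) * (sum of the first T powers of D^{-1}A) D^{-1}) with entries truncated at 1, and then factorizes it with a rank-d SVD. The function name netmf_embedding and the toy parameters are ours; a practical implementation would use sparse matrices and the approximation algorithm described in the paper.

    import numpy as np

    def netmf_embedding(A, dim=2, window=10, b=1.0):
        """Dense-matrix sketch of NetMF-style embedding for a small graph.

        A      -- adjacency matrix of an undirected graph (numpy array)
        dim    -- embedding dimension d
        window -- skip-gram context window size T
        b      -- number of negative samples
        """
        vol = A.sum()                      # graph volume vol(G) = sum of degrees
        d = A.sum(axis=1)                  # degree vector
        D_inv = np.diag(1.0 / d)
        P = D_inv @ A                      # random-walk transition matrix D^{-1} A
        # Sum of the first `window` powers of P
        S = np.zeros_like(P)
        Pr = np.eye(len(A))
        for _ in range(window):
            Pr = Pr @ P
            S += Pr
        M = (vol / (b * window)) * S @ D_inv
        # Truncated logarithm: clip entries at 1 before taking the log
        logM = np.log(np.maximum(M, 1.0))
        # Rank-`dim` factorization via SVD; embedding = U_d * sqrt(Sigma_d)
        U, sigma, _ = np.linalg.svd(logM)
        return U[:, :dim] * np.sqrt(sigma[:dim])

    # Usage on a toy 4-node cycle graph
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    print(netmf_embedding(A, dim=2))

Setting window to 1 in this sketch corresponds to finding (2) above: with a context size of one, the factorized matrix reduces to the one LINE implicitly factorizes.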