Recently there has been growing interest in using deep learning (DL) techniques to analyze social graphs such as Flickr, YouTube, and Twitter. The appeal of this approach is that once DL has been used to learn node representations, network mining tasks such as node classification, link prediction, node visualization, and node recommendation can be solved with conventional machine learning algorithms.
In this project, we build a model that captures the network information of a node in an efficient and scalable manner. The learned representations are then used for node classification.
This project studies the problem of embedding very large information networks into low-dimensional vector spaces, which is useful in many tasks such as visualization, node classification, and link prediction. Most existing graph embedding methods do not scale to real-world information networks, which usually contain millions of nodes. In this project, we implement a network embedding method called LINE, which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and global network structures. It uses an edge-sampling algorithm that addresses the limitation of classical stochastic gradient descent and improves both the effectiveness and the efficiency of the inference. Empirical experiments demonstrate the effectiveness of LINE on a variety of real-world information networks, including language networks, social networks, and citation networks. The algorithm is efficient: it can learn the embedding of a network with millions of vertices and billions of edges in a few hours on a typical single machine.
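To make the edge-sampling idea concrete, here is a minimal sketch in plain Python. It is not the project's implementation: it only loosely follows LINE's first-order (local structure) objective, draws edges with probability proportional to their weight using cumulative weights from the standard library (the LINE paper uses the alias method), and picks negative nodes uniformly (the paper uses a degree-based noise distribution). The function name `train_first_order`, its parameters, and the toy graph are assumptions made for illustration.

```python
import itertools
import math
import random


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_first_order(edges, num_nodes, dim=8, steps=5000, lr=0.025,
                      negatives=2, seed=0):
    """Toy first-order embedding trained with edge sampling.

    edges: list of (u, v, weight) tuples for an undirected graph.
    Returns a list of num_nodes vectors, each of length dim.
    """
    rng = random.Random(seed)
    emb = [[rng.uniform(-0.5, 0.5) / dim for _ in range(dim)]
           for _ in range(num_nodes)]
    # Precompute cumulative weights so each draw is a binary search.
    cum_weights = list(itertools.accumulate(w for _, _, w in edges))
    for _ in range(steps):
        # Edge sampling: draw an edge with probability proportional to its
        # weight, then treat it as a binary (unit-weight) edge in the update.
        u, v, _w = rng.choices(edges, cum_weights=cum_weights, k=1)[0]
        if rng.random() < 0.5:
            u, v = v, u  # undirected graph: pick a random direction
        # One positive target plus a few uniformly drawn negative nodes.
        targets = [(v, 1.0)]
        for _ in range(negatives):
            n = rng.randrange(num_nodes)
            if n != u and n != v:
                targets.append((n, 0.0))
        grad_u = [0.0] * dim
        for t, label in targets:
            score = sum(a * b for a, b in zip(emb[u], emb[t]))
            g = lr * (label - sigmoid(score))
            for k in range(dim):
                grad_u[k] += g * emb[t][k]
                emb[t][k] += g * emb[u][k]
        for k in range(dim):
            emb[u][k] += grad_u[k]
    return emb


# Hypothetical toy graph: two disconnected 4-node cliques with unit weights.
toy_edges = [(i, j, 1.0) for i in range(4) for j in range(i + 1, 4)]
toy_edges += [(i, j, 1.0) for i in range(4, 8) for j in range(i + 1, 8)]
toy_emb = train_first_order(toy_edges, num_nodes=8)
```

Because every sampled edge is treated as having unit weight, the gradient scale does not depend on the edge weights; the weights only influence how often an edge is drawn.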
Better node representations make it possible to solve various network mining tasks with conventional machine learning algorithms. They can be used for:
- Node classification
- Link prediction
- Node visualization
- Node recommendation
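As a rough illustration of how learned embeddings feed these tasks, the sketch below classifies nodes with a nearest-centroid rule and ranks recommendation or link candidates by cosine similarity. The embedding values, node names, labels, and helper names (`fit_centroids`, `classify`, `recommend`) are all hypothetical, and the nearest-centroid rule stands in for whatever conventional classifier the project actually uses.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def fit_centroids(embeddings, labels):
    """Average the embeddings of the labeled nodes of each class."""
    sums, counts = {}, {}
    for node, label in labels.items():
        vec = embeddings[node]
        acc = sums.setdefault(label, [0.0] * len(vec))
        for k, x in enumerate(vec):
            acc[k] += x
        counts[label] = counts.get(label, 0) + 1
    return {c: [x / counts[c] for x in s] for c, s in sums.items()}


def classify(vec, centroids):
    # Node classification: pick the class whose centroid is most similar.
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


def recommend(node, embeddings, top_k=2):
    # Link prediction / recommendation: rank other nodes by similarity.
    scores = [(cosine(embeddings[node], vec), other)
              for other, vec in embeddings.items() if other != node]
    scores.sort(reverse=True)
    return [other for _, other in scores[:top_k]]


# Hypothetical 2-dimensional embeddings for six nodes.
embeddings = {
    "a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.7, 0.3],
    "x": [0.1, 0.9], "y": [0.2, 0.8], "z": [0.3, 0.7],
}
# Only some nodes are labeled; "c" and "z" are to be classified.
labels = {"a": "red", "b": "red", "x": "blue", "y": "blue"}
centroids = fit_centroids(embeddings, labels)

print(classify(embeddings["c"], centroids))  # red
print(classify(embeddings["z"], centroids))  # blue
print(recommend("c", embeddings))            # ['b', 'a']
```

The same embeddings serve every task; only the lightweight model on top changes, which is the practical benefit of learning the representations once.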
Important Project Links
- DeepWalk: Online Learning of Social Representations
- LINE: Large-scale Information Network Embedding
- Understanding basics of Neural Network
- Understanding the need of deep learning
- Datasets: BlogCatalog Data, Flickr Data, Youtube Data