floregol / age Goto Github PK
View Code? Open in Web Editor NEWThis project forked from vwz/age
Active Learning for Graph Embedding
License: MIT License
This project forked from vwz/age
Active Learning for Graph Embedding
License: MIT License
This program (AGE) implements an active learning for graph embedding framework, as proposed in the following paper. If you use it for scientific experiments, please cite this paper: @article{DBLP:journals/corr/CaiZC17, author = {HongYun Cai and Vincent Wenchen Zheng and Kevin Chen{-}Chuan Chang}, title = {Active Learning for Graph Embedding}, journal = {CoRR}, volume = {abs/1705.05085}, year = {2017}, url = {https://arxiv.org/abs/1705.05085}, timestamp = {Mon, 15 May 2017 06:49:04 GMT} } The code has been tested under Ubuntu 16.04 LTS with Intel Xeon(R) CPU E5-1620 @3.50GHz*8 and 16G memory. ============== *** Installation *** ============== python setup.py install ============== *** Requirements *** ============== tensorflow (>0.12) networkx Graph convolutional network (Kipf and Welling, ICLR 2017): https://github.com/tkipf/gcn ============== *** Data *** ============== In order to use your own data, you have to provide an N by N adjacency matrix (N is the number of nodes), an N by D feature matrix (D is the number of features per node), and an N by E binary label matrix (E is the number of classes). Have a look at the load_data() function in utils.py for an example. In this example, we load citation network data (Cora, Citeseer or Pubmed). The original datasets can be found here: http://linqs.cs.umd.edu/projects/projects/lbc/. In our version (see data folder) we use dataset splits provided by https://github.com/kimiyoung/planetoid (Zhilin Yang, William W. Cohen, Ruslan Salakhutdinov, Revisiting Semi-Supervised Learning with Graph Embeddings, ICML 2016) to load the whole dataset, and use the same test data as theirs. The validation node instances are randomly sampled from the non-test nodes set. We randomly generate 10 validation sets for each dataset and the node indexes are stored in "source/datasetname/val_idxa.txt" (where a is the validation set id, range within [0,10]). The initially labeled nodes are randomly sampled from the non-test and non-train nodes set. Given the C (the number of classes in this dataset) and a predefined L, AGE will randomly sample L nodes from each class as the initially labeled nodes (so there are C*L initial labeled nodes in total). ============== *** Run the Program *** ============== 1. First generate the graph centrality score for each node as follows. Command: python get_graph_centrality.py datasetname e.g., python get_graph_centrality.py citeseer Parameteres: datasetname: denote the dataset to process Output: The centality scores for each node (same order as in graph) are stored in "res/datasetname/graphcentrality/normcen" Note: We adopt PageRank Centrality in this work. You can try other centrality measurements by modifing function "centralissimo()" in file "get_graph_centrality.py". 2. Run the AGE algorithm to actively select nodes to label during the graph embedding process and record the MacroF1 and MicroF1 for node classification Command: python train_entropy_density_graphcentral_ts.py validation_id nb_initial_labelled_nodes_per_class class_nb datasetname e.g., python train_entropy_density_graphcentral_ts.py 0 4 6 citeseer Parameters: validation_id: the validation set id, refering to the id listed in "source/datasetname/val_idxa.txt" nb_initial_labelled_nodes_per_class: number of the initial labelled nodes per class, we use four in this work class_nb: number of class for each dataset datasetname: the name of the dataset to process
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.