CS-249-Project

Download the adjacency matrix and feature vector from https://drive.google.com/drive/folders/1ZdROyFY3KTh7ARqleMKwfYFL8BoWBsLD?usp=sharing, or run the generation code in Model 2 or Model 3 (generation takes about 10-15 minutes).

Model 1 Naive Bayes:

The code is in NaiveBayes_Jupyter.ipynb. It requires author_feature.pickle and author_features.pickle. Both files can be generated or downloaded, and both are also included in the repo. If you prefer a plain Python script, run NaiveBayes_Python.py instead.

Model 2 Pairwise Conditional Random Field:

The code is in the ModelPairwiseCRF directory and is implemented with PyStruct.

Prerequisites

  1. numpy
  2. pandas
  3. pystruct (note that pystruct supports only Python 2 and Python 3.6 or earlier; we tested the code with Python 2.7)
  4. af_py2.pickle, which stores the author feature matrix in pickle protocol 2 (since we use Python 2). The file is provided in the directory; to regenerate it, run gen_author_feature_py2_pickle.py.
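
Python 2 cannot read pickle protocols above 2, so a pickle written by Python 3 with its default protocol has to be re-saved before the CRF model can load it. The following is a minimal sketch of that conversion, not the contents of gen_author_feature_py2_pickle.py; the function name resave_as_protocol2 and its path arguments are hypothetical.

```python
import pickle


def resave_as_protocol2(src_path, dst_path):
    """Load a pickle and write it back with protocol 2 (hypothetical helper)."""
    # Read the object with whatever protocol it was originally written in.
    with open(src_path, "rb") as src:
        obj = pickle.load(src)
    # Protocol 2 is the highest protocol that Python 2.7 understands.
    with open(dst_path, "wb") as dst:
        pickle.dump(obj, dst, protocol=2)
    return obj
```

Run this under Python 3 on the existing feature pickle and point the second argument at af_py2.pickle. Whether the result is fully usable under Python 2 also depends on the types stored inside the pickle.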

Train and Evaluate

python crf.py

Experiment Results

We ran the model with python crf.py > crf.log. You can either rerun the model or check our results directly in crf.log.

Model 3 Graph Convolutional Networks in PyTorch:

PyTorch implementation of Graph Convolutional Networks (GCNs) for semi-supervised classification [1].

For a high-level introduction to GCNs, see:

Thomas Kipf, Graph Convolutional Networks (2016)

Requirements

  • PyTorch 0.4 or 0.5
  • Python 2.7 or 3.6

Usage

python train.py

Dataset:

gcn/data/DBLP_four_area/ and gcn/data/four_area/
(Note: we read the dataset directly from author_matrix.pickle and author_feature.pickle, which were generated from the dataset by the function "generate_adj_feature()" in gcn/pygcn/utils.py.)
You can generate the pickle files or download them, then modify the path used by the reading function "load_data()" in gcn/pygcn/utils.py.

Model Design:

Data structure: authors are the nodes of the graph, described by an adjacency matrix between authors, the features of each author, and the label of each author.
Generate our own dataset: "load_data()" in gcn/pygcn/utils.py.
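
To make the data structure concrete, here is a small sketch of how an adjacency matrix, author features, and author labels fit together in one GCN-style propagation step. It is an illustration under assumptions, not the code in gcn/pygcn/utils.py: the toy graph, the helper names normalize_adjacency and propagate, and the choice of row normalization with self-loops are all hypothetical (the paper [1] describes a symmetric normalization instead).

```python
def normalize_adjacency(adj):
    """Add self-loops and row-normalize, i.e. compute D^-1 (A + I) (assumed scheme)."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    normalized = []
    for row in with_loops:
        degree = sum(row)  # at least 1 because of the self-loop
        normalized.append([value / degree for value in row])
    return normalized


def propagate(adj_norm, features):
    """One neighborhood-averaging step: the matrix product adj_norm @ features."""
    n_nodes = len(features)
    n_feat = len(features[0])
    return [
        [sum(adj_norm[i][k] * features[k][f] for k in range(n_nodes))
         for f in range(n_feat)]
        for i in range(len(adj_norm))
    ]


# Hypothetical toy graph: 3 authors, author 0 is linked to authors 1 and 2.
adjacency = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
features = [  # one feature row per author
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 1.0],
]
labels = [0, 1, 1]  # one class label per author (values are made up)

adj_norm = normalize_adjacency(adjacency)
smoothed = propagate(adj_norm, features)
```

Each row of smoothed is the average of an author's own features and those of its neighbors. A GCN layer would follow this step with a learned weight matrix and a nonlinearity, and the labels would be used only for the subset of authors in the training split.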

References

[1] Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks, 2016

[2] Sen et al., Collective Classification in Network Data, AI Magazine 2008
