0zgur0 / hilap

This project forked from morningmoni/hilap

Code for paper "Hierarchical Text Classification with Reinforced Label Assignment" EMNLP 2019

Python 100.00%

hilap's Introduction

This repo provides the code for the paper "Hierarchical Text Classification with Reinforced Label Assignment", EMNLP 2019.

[Figure: prediction animation]

[Figure: HiLAP architecture]

Abstract

While existing hierarchical text classification (HTC) methods attempt to capture label hierarchies for model training, they either make local decisions regarding each label or completely ignore the hierarchy information during inference. To solve the mismatch between training and inference as well as modeling label dependencies in a more principled way, we formulate HTC as a Markov decision process and propose to learn a Label Assignment Policy via deep reinforcement learning to determine where to place an object and when to stop the assignment process. The proposed method, HiLAP, explores the hierarchy during both training and inference time in a consistent manner and makes inter-dependent decisions. As a general framework, HiLAP can incorporate different neural encoders as base models for end-to-end training. Experiments on five public datasets and four base models show that HiLAP yields an average improvement of 33.4% in Macro-F1 over flat classifiers and outperforms state-of-the-art HTC methods by a large margin.
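
To make the "where to place an object and when to stop" idea concrete, here is a toy sketch of a label-assignment rollout over a small hierarchy. It is not taken from model.py: the function `assign_labels`, the `STOP` token, the example hierarchy, and the fixed score table are all hypothetical, and a hand-written scoring function stands in for the policy that HiLAP learns with deep reinforcement learning.

```python
from typing import Callable, Dict, List

STOP = "<stop>"  # hypothetical name for the "stop assigning" action


def assign_labels(
    hierarchy: Dict[str, List[str]],
    score: Callable[[List[str], str], float],
    root: str = "root",
    max_steps: int = 10,
) -> List[str]:
    """Greedy rollout: at each step pick the best-scoring child of any
    label visited so far, or STOP to end the assignment process."""
    assigned: List[str] = []
    visited = [root]  # labels whose children are candidate actions
    for _ in range(max_steps):
        candidates = [
            child
            for parent in visited
            for child in hierarchy.get(parent, [])
            if child not in assigned
        ]
        actions = candidates + [STOP]
        best = max(actions, key=lambda a: score(assigned, a))
        if best == STOP:
            break
        assigned.append(best)
        visited.append(best)
    return assigned


# Toy hierarchy and a fixed score table standing in for a learned policy.
hierarchy = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["tennis"],
}
scores = {"science": 0.9, "physics": 0.8, "biology": 0.3,
          "sports": 0.2, "tennis": 0.1, STOP: 0.5}

labels = assign_labels(hierarchy, lambda assigned, a: scores[a])
print(labels)  # ['science', 'physics']
```

Because every candidate is a child of a label that has already been visited, the same hierarchy-respecting procedure can be run at both training and inference time, which is the consistency the abstract describes.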

Model

model.py: The main model of HiLAP.

TextCNN.py: Our implementation of "Convolutional Neural Networks for Sentence Classification" EMNLP 2014.

OHCNN(_fast).py: Our implementation of "Effective Use of Word Order for Text Categorization with Convolutional Neural Networks" NAACL 2015.

HAN.py: Our implementation of "Hierarchical Attention Networks for Document Classification" NAACL 2016.

HMCN.py: Our implementation of "Hierarchical Multi-Label Classification Networks" ICML 2018.

Requirements

Python 3

PyTorch 0.3

Data

Due to copyright restrictions, we cannot release the datasets used in our experiments directly. Instead, we provide links to the five data sources (the first two may require a license):

See readData_*.py for how to use our scripts to process the original data and generate the datasets.

Run

All parameters in conf.py have default values. To train or test under a different setting, change the parameters mode, base_model, and dataset, then run main.py. To test a saved model, set load_model=model_file and is_Train=False in conf.py, then run main.py.
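
The train/test switch described above can be sketched as follows. Only the parameter names (mode, base_model, dataset, load_model, is_Train) come from this README; the `SimpleNamespace` container, the helper `for_testing`, and every value shown are placeholders, so check conf.py for the values it actually accepts.

```python
from types import SimpleNamespace

# Parameter names are from the README; all values are placeholder assumptions.
train_conf = SimpleNamespace(
    mode="your_mode",              # placeholder, see conf.py for valid modes
    base_model="your_base_model",  # placeholder, see conf.py for valid models
    dataset="your_dataset",        # placeholder, see conf.py for valid datasets
    load_model=None,               # no saved model is loaded when training
    is_Train=True,
)


def for_testing(conf: SimpleNamespace, model_file: str) -> SimpleNamespace:
    """Hypothetical helper: copy conf and switch it to testing a saved model."""
    return SimpleNamespace(
        **{**vars(conf), "load_model": model_file, "is_Train": False}
    )


test_conf = for_testing(train_conf, "model_file")
print(test_conf.load_model, test_conf.is_Train)  # model_file False
```

In the repository itself these settings are edited in conf.py before running main.py; the helper only illustrates which two parameters change between a training run and a test run.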

Cite

@article{mao2019hierarchical,
  title={Hierarchical Text Classification with Reinforced Label Assignment},
  author={Mao, Yuning and Tian, Jingjing and Han, Jiawei and Ren, Xiang},
  journal={arXiv preprint arXiv:1908.10419},
  year={2019}
}
