bert_classify's Introduction

Netizen Sentiment Recognition During the Epidemic: Baseline

Introduction

This project is an entry for a DataFountain competition. Competition page:

https://www.datafountain.cn/competitions/423

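Given a Weibo post ID and the post content, the task is to design an algorithm that recognizes the sentiment of the post and classifies it as positive, negative, or neutral.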

Environment

  • PyTorch 1.5.0
  • GTX1080

Code Overview

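  1. A baseline built from BERT plus a classification network.
  2. The model is trained with 5-fold cross-validation, which yields 5 models. The predictions of the 5 models are summed to produce the final result.
  3. Because the classes in the training samples are imbalanced, two loss functions are provided to help mitigate this: cross-entropy loss and focal loss (Focal_loss). Set the loss_type parameter in main.py to choose between them.
  4. The BERT part and the classification network use different learning rates. By default the BERT module uses a learning rate of 0.00001 and the classification network uses 0.0001; both can be set in the main function.
  5. data_preprocess.py implements dataset preprocessing and splitting.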
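
The points above about the loss choice, the two learning rates, and the summed fold predictions can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual code: the class and function names (FocalLoss, build_loss, BertClassifier, build_optimizer, ensemble_predict), the loss_type values "focal" and "ce", the gamma value, the use of Adam, and the choice to sum softmax probabilities rather than raw logits are all assumptions. A small linear layer stands in for the real BERT encoder so the sketch runs without downloading weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocalLoss(nn.Module):
    """Multi-class focal loss: mean of (1 - p_t)^gamma * cross-entropy."""

    def __init__(self, gamma: float = 2.0):  # gamma=2.0 is an assumed default
        super().__init__()
        self.gamma = gamma

    def forward(self, logits, targets):
        # Per-sample cross-entropy equals -log(p_t) for the true class.
        ce = F.cross_entropy(logits, targets, reduction="none")
        p_t = torch.exp(-ce)
        # Well-classified samples (p_t near 1) are down-weighted.
        return ((1.0 - p_t) ** self.gamma * ce).mean()


def build_loss(loss_type: str) -> nn.Module:
    # "focal" and "ce" are hypothetical values for the loss_type parameter.
    if loss_type == "focal":
        return FocalLoss()
    return nn.CrossEntropyLoss()


class BertClassifier(nn.Module):
    """Stand-in model: `encoder` plays the role of BERT, `classifier` the FC head."""

    def __init__(self, in_dim: int = 8, hidden: int = 16, num_classes: int = 3):
        super().__init__()
        self.encoder = nn.Linear(in_dim, hidden)  # placeholder for a BERT encoder
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.classifier(torch.tanh(self.encoder(x)))


def build_optimizer(model, bert_lr: float = 1e-5, head_lr: float = 1e-4):
    # One parameter group per module, so each gets its own learning rate.
    return torch.optim.Adam([
        {"params": model.encoder.parameters(), "lr": bert_lr},
        {"params": model.classifier.parameters(), "lr": head_lr},
    ])


def ensemble_predict(fold_models, x):
    # Sum the class probabilities of all fold models, then take the argmax.
    for m in fold_models:
        m.eval()
    with torch.no_grad():
        total = sum(F.softmax(m(x), dim=-1) for m in fold_models)
    return total.argmax(dim=-1)
```

With gamma set to 0 the focal loss reduces to plain cross-entropy, which gives a quick sanity check when switching between the two options.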

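Future work:

1) Use adversarial training to improve training results.

2) Replace the fc classification network with a TextCNN model to see whether it improves results further.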

bert_classify's People

Contributors

qq751220449
