hhy5277 / sentibridge

This project forked from rainarch/sentibridge

SentiBridge: A Knowledge Base for Entity-Sentiment Representation

License: Other

Python 100.00%

sentibridge's Introduction

SentiBridge

SentiBridge: A Knowledge Base for Entity-Sentiment Representation (Chinese title: 《SentiBridge: 中文实体情感知识库》, "SentiBridge: A Chinese Entity-Sentiment Knowledge Base")

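This dictionary contains pairs of an entity or attribute and a sentiment word, for example "长城 宏伟" (Great Wall, magnificent), "性价比 高" (cost-performance ratio, high), and "价格 高" (price, high). Its main purpose is to capture how people describe a given entity; for instance, people typically use 宏伟 (magnificent) to describe the Great Wall.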
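As a rough illustration of the idea, the pairs can be viewed as a mapping from each entity or attribute to the sentiment words used to describe it. The sketch below uses only the three example pairs quoted above; the function name `build_index` is hypothetical and is not part of the SentiBridge code.

```python
from collections import defaultdict

# Example pairs quoted in the description: (entity/attribute, sentiment word).
pairs = [("长城", "宏伟"), ("性价比", "高"), ("价格", "高")]


def build_index(pairs):
    """Map each entity/attribute to the sentiment words paired with it.

    Hypothetical helper for illustration; not taken from the repository.
    """
    index = defaultdict(list)
    for target, sentiment in pairs:
        index[target].append(sentiment)
    return dict(index)


index = build_index(pairs)
print(index["长城"])  # ['宏伟']
```

A lookup such as `index["长城"]` then answers the question "how do people usually describe the Great Wall?".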

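The dictionary currently contains extraction results from corpora in three domains (news, travel, and restaurants), about 300,000 pairs in total.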

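File description (SentiBridge.zip)

Each folder contains two kinds of files:

  1. Files with the prefix pair_sort hold the results obtained by ranking:
  • pair_sort_[m,n] is the portion of the ranking from m% to n%

  • Data format: entity/attribute, sentiment word, convergence score

  2. Files with the prefix pair_mine hold the results obtained by refinement:
  • Data format: entity/attribute, sentiment word, similarity score 1, similarity score 2

  • The number after pair_mine is the score threshold chosen to guarantee the accuracy of the refinement algorithm's output. Only entries whose score 1 and score 2 are both above that value are kept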
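The two record layouts above could be read along the following lines. This is a minimal sketch under stated assumptions: fields are taken to be whitespace-separated (the README does not name the delimiter), all class and function names are hypothetical, and the sample scores are invented purely for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class SortPair(NamedTuple):
    target: str      # entity or attribute
    sentiment: str   # sentiment word
    score: float     # convergence score


class MinePair(NamedTuple):
    target: str
    sentiment: str
    sim1: float      # similarity score 1
    sim2: float      # similarity score 2


def parse_sort_lines(lines: Iterable[str]) -> Iterator[SortPair]:
    """Parse pair_sort records; assumes whitespace-separated fields."""
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        yield SortPair(fields[0], fields[1], float(fields[2]))


def parse_mine_lines(lines: Iterable[str]) -> Iterator[MinePair]:
    """Parse pair_mine records; assumes whitespace-separated fields."""
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue
        yield MinePair(fields[0], fields[1], float(fields[2]), float(fields[3]))


def filter_mine(pairs: Iterable[MinePair], threshold: float) -> List[MinePair]:
    """Keep pairs whose two similarity scores are both above the threshold."""
    return [p for p in pairs if p.sim1 > threshold and p.sim2 > threshold]


# Invented sample records; the scores are NOT taken from SentiBridge.
sample_mine = ["长城 宏伟 0.61 0.47", "价格 高 0.30 0.18"]
kept = filter_mine(parse_mine_lines(sample_mine), threshold=0.25)
print([(p.target, p.sentiment) for p in kept])  # [('长城', '宏伟')]
```

The released pair_mine files are already thresholded, so `filter_mine` is mainly useful for applying a stricter cutoff than the one in the file name.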

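Domain description

  1. News domain (Gigaword news corpus):
  • pair_sort_[0,1]: measured accuracy 92%, about 119,000 pairs
  • pair_mine_0.25: measured accuracy 90%, about 17,000 pairs
  2. Travel domain (travel user reviews):
  • pair_sort_[0,1]: measured accuracy 98%
  • pair_sort_[1,2]: measured accuracy 94%
  • pair_sort_[2,3]: measured accuracy 90%
  • The three files above total about 80,000 pairs
  • pair_mine_0.2: measured accuracy 90%, 5,624 pairs
  3. Restaurant domain (restaurant user reviews):
  • pair_sort_[0,1]: measured accuracy 92%, about 93,000 pairs
  • pair_mine_0.2: measured accuracy 90%, about 52,000 pairs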
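The file names listed above encode either a ranking range or a score threshold. A small helper could recover that information from a name. This is a sketch that assumes the names follow exactly the patterns shown here (possibly with a file extension appended); `describe_file` and the regular expressions are hypothetical, not part of the repository.

```python
import re

# Patterns inferred from the names listed above; actual file names may differ.
SORT_RE = re.compile(r"pair_sort_\[(\d+),(\d+)\]")
MINE_RE = re.compile(r"pair_mine_(\d+(?:\.\d+)?)")


def describe_file(name):
    """Return ('sort', m, n) or ('mine', threshold), or None if unrecognized."""
    match = SORT_RE.search(name)
    if match:
        # Ranking range from m% to n%.
        return ("sort", int(match.group(1)), int(match.group(2)))
    match = MINE_RE.search(name)
    if match:
        # Threshold that both similarity scores must exceed.
        return ("mine", float(match.group(1)))
    return None


print(describe_file("pair_sort_[1,2]"))  # ('sort', 1, 2)
print(describe_file("pair_mine_0.25"))   # ('mine', 0.25)
```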

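Code

  • The code is in the directory ./Entity_Emotion_Express
  • For better results, it is best to run it on user review data, for example Taobao reviews, Dianping reviews, or Ctrip reviews.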

Statement and references

  1. The SentiBridge data is for academic research use only. For commercial use, please contact us (wlchen at suda.edu.cn) to obtain authorization.
  2. Related publication
  • LU Qi, CHEN Wenliang. Automatically Building a Large Scale Dictionary of Chinese Entity Sentiment Expressions (大规模中文实体情感知识的自动获取). Journal of Chinese Information Processing (中文信息学报), 32 (8): 32-41, August 2018.
