
This project forked from liuhuanyong/huannlp


HuanNLP

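A self-implemented NLP toolkit: a personal implementation of Chinese natural language processing components. It provides HMM- and CRF-based interfaces for word segmentation, part-of-speech tagging, and named entity recognition, plus a CRF-based interface for dependency parsing.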

Usage

Import

import huannlp
nlp = huannlp.HuanNLP('HMM')  # or: nlp = huannlp.HuanNLP('CRF')
text = "刘焕勇硕士毕业于北京语言大学,目前在**科学院软件研究所工作"

Word segmentation

words = nlp.cut(text)

HMM mode:

['刘焕勇', '硕士', '毕业', '于', '北京', '语言', '大学', ',', '目前', '在', '中', '国', '科学', '院', '软', '件', '研究', '所', '工作']

CRF mode:

['刘焕勇', '硕士', '毕业于', '北京', '语言', '大学', ',', '目前', '在', '**科学院', '软件', '研究', '所', '工作']

Part-of-speech tagging

postags = nlp.postag(text)

HMM mode:

['r', 'n', 'v', 'p', 'ns', 'n', 'n', 'w', 'nt', 'p', 'nd', 'n', 'n', 'n', 'a', 'n', 'v', 'u', 'n']

CRF mode:

['n', 'n', 'v', 'ns', 'n', 'n', 'w', 'nt', 'p', 'ni', 'n', 'v', 'u', 'n']
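
The segmentation and tagging results are parallel lists, so they can be paired position by position. The following is a minimal sketch that works on the CRF-mode output shown above; the helper name `pair_tokens` is hypothetical and not part of HuanNLP.

```python
# CRF-mode output copied from the examples above.
words = ['刘焕勇', '硕士', '毕业于', '北京', '语言', '大学', ',', '目前', '在',
         '**科学院', '软件', '研究', '所', '工作']
postags = ['n', 'n', 'v', 'ns', 'n', 'n', 'w', 'nt', 'p', 'ni', 'n', 'v', 'u', 'n']


def pair_tokens(words, postags):
    """Pair each word with its tag (hypothetical helper, not a HuanNLP API)."""
    if len(words) != len(postags):
        raise ValueError("words and postags must have the same length")
    return list(zip(words, postags))


tagged = pair_tokens(words, postags)

# Keep only noun-like tokens: every tag that starts with 'n'.
nouns = [word for word, tag in tagged if tag.startswith('n')]

print(tagged[:3])
print(nouns)
```

The length check matters because a mismatch would otherwise be silently truncated by `zip`.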

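Part-of-speech tag reference

| Tag | Part of speech | Tag | Part of speech | Tag | Part of speech | Tag | Part of speech |
| --- | --- | --- | --- | --- | --- | --- | --- |
| n | common noun | nt | temporal noun | nd | locality noun | nl | place noun |
| nh | person name | nhf | | nhs | | ns | place name |
| nn | ethnic group name | ni | organization name | nz | other proper noun | v | verb |
| vd | directional verb | vl | linking verb | vu | modal verb | a | adjective |
| f | distinguishing word | m | numeral | q | classifier | d | adverb |
| r | pronoun | p | preposition | c | conjunction | u | auxiliary particle |
| e | interjection | o | onomatopoeia | i | idiom | j | abbreviation |
| h | prefix | k | suffix | g | morpheme character | x | non-morpheme character |
| w | punctuation | ws | non-Chinese character string | wu | other unknown symbol | | |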

Named entity recognition

ners = nlp.ner(text)

HMM mode:

{'TIM': [], 'PER': ['刘焕勇'], 'LOC': [], 'ORG': []}

CRF mode:

{'LOC': [], 'TIM': ['目前'], 'PER': ['刘焕勇'], 'ORG': ['**科学院', '北京语言大学']}

Entity tag reference

| Tag | Entity type |
| --- | --- |
| LOC | location |
| PER | person |
| ORG | organization |
| TIM | time |
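
The NER result is a dictionary keyed by entity tag, with a possibly empty list of entity strings for each tag. As a rough sketch, it can be flattened into (entity, tag) pairs for downstream use. The data is the CRF-mode result shown above; `flatten_entities` is a hypothetical helper, not something HuanNLP provides.

```python
# CRF-mode result copied from the example above.
ners = {'LOC': [], 'TIM': ['目前'], 'PER': ['刘焕勇'],
        'ORG': ['**科学院', '北京语言大学']}


def flatten_entities(ners):
    """Turn {tag: [entity, ...]} into a list of (entity, tag) pairs.

    Hypothetical helper; it keeps the dictionary's insertion order.
    """
    return [(entity, tag)
            for tag, entities in ners.items()
            for entity in entities]


pairs = flatten_entities(ners)

# Tags for which at least one entity was found.
found_tags = [tag for tag, entities in ners.items() if entities]

print(pairs)
print(found_tags)
```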

Dependency parsing

deps = nlp.dep(words, postags)

['刘焕勇', 'n', '硕士', 'n', 'ATT']
['硕士', 'n', '毕业于', 'v', 'SBV']
['毕业于', 'v', 'Root', '-', 'HED']
['北京', 'ns', '大学', 'n', 'ATT']
['语言', 'n', '大学', 'n', 'ATT']
['大学', 'n', '**科学院', 'ni', 'COO']
[',', 'w', '大学', 'n', 'WP']
['目前', 'nt', '大学', 'n', 'ATT']
['在', 'p', '软件', 'n', 'POB']
['**科学院', 'ni', '软件', 'n', 'ATT']
['软件', 'n', '研究', 'v', 'SBV']
['研究', 'v', '软件', 'n', 'VOB']
['所', 'u', '研究', 'v', 'ATT']
['工作', 'n', '**科学院', 'ni', 'VOB']
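
Judging from the sample output, each row appears to be `[word, word tag, head word, head tag, relation]`, with the sentence head attached to `Root` via the `HED` relation. That layout is an assumption inferred from the rows above, not documented behavior. Under that assumption, the sketch below finds the sentence head and groups dependents by their head word; `find_root` and `group_by_head` are hypothetical helpers.

```python
from collections import defaultdict

# Rows copied from the sample output above.
deps = [
    ['刘焕勇', 'n', '硕士', 'n', 'ATT'],
    ['硕士', 'n', '毕业于', 'v', 'SBV'],
    ['毕业于', 'v', 'Root', '-', 'HED'],
    ['北京', 'ns', '大学', 'n', 'ATT'],
    ['语言', 'n', '大学', 'n', 'ATT'],
    ['大学', 'n', '**科学院', 'ni', 'COO'],
    [',', 'w', '大学', 'n', 'WP'],
    ['目前', 'nt', '大学', 'n', 'ATT'],
    ['在', 'p', '软件', 'n', 'POB'],
    ['**科学院', 'ni', '软件', 'n', 'ATT'],
    ['软件', 'n', '研究', 'v', 'SBV'],
    ['研究', 'v', '软件', 'n', 'VOB'],
    ['所', 'u', '研究', 'v', 'ATT'],
    ['工作', 'n', '**科学院', 'ni', 'VOB'],
]


def find_root(deps):
    """Return the first word whose relation is HED, or None if there is none."""
    for word, _tag, _head, _head_tag, relation in deps:
        if relation == 'HED':
            return word
    return None


def group_by_head(deps):
    """Map each head word to its list of (dependent word, relation) pairs."""
    children = defaultdict(list)
    for word, _tag, head, _head_tag, relation in deps:
        children[head].append((word, relation))
    return dict(children)


root = find_root(deps)
children = group_by_head(deps)

print(root)
print(children['大学'])
```

Because the rows identify heads by word string rather than by position, a sentence that repeats a word would make this grouping ambiguous.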
