Splitting part-of-speech tags and weights (nodejieba issue, closed)

willin commented on August 15, 2024
Splitting part-of-speech tags and weights

from nodejieba.

Comments (6)

yanyiwu commented on August 15, 2024

Thanks for pointing this out!
Having to split the results yourself in [email protected] was indeed a silly design; the problem has been fixed.
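
For anyone still on a version that needs the manual split, here is a minimal sketch of what that might look like. The thread does not show the old output format, so the `'word:tag'` and `'word:weight'` string shapes below are assumptions, and `splitPair`, `parseTagged`, and `parseWeighted` are hypothetical helper names, not part of nodejieba.

```javascript
// ASSUMPTION: results arrive as 'word:tag' / 'word:weight' strings.
// The thread does not confirm this shape; adjust the separator if needed.
function splitPair(item) {
  // Use the last colon so a word that itself contains ':' stays intact.
  const i = item.lastIndexOf(':');
  if (i < 0) throw new Error(`no separator in "${item}"`);
  return [item.slice(0, i), item.slice(i + 1)];
}

// Turn ['卧室:n', ...] into [{ word: '卧室', tag: 'n' }, ...].
function parseTagged(items) {
  return items.map((item) => {
    const [word, tag] = splitPair(item);
    return { word, tag };
  });
}

// Turn ['卧室:8.2', ...] into [{ word: '卧室', weight: 8.2 }, ...].
function parseWeighted(items) {
  return items.map((item) => {
    const [word, weight] = splitPair(item);
    return { word, weight: Number(weight) };
  });
}

console.log(parseTagged(['卧室:n', '的:uj']));
console.log(parseWeighted(['卧室:8.20023407859']));
```

With the fix mentioned above, the library returns objects directly (as in the output later in this thread), so this step should no longer be needed.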

willin commented on August 15, 2024

Which part-of-speech tags does the tagger actually use?
uj, ul, n, v ... is there a complete table?

@yanyiwu

yanyiwu commented on August 15, 2024

@willin Take a look at this: https://gist.github.com/luw2007/6016931

willin commented on August 15, 2024

thx

@yanyiwu However, the jieba section there has no annotations for some tags, such as df and mg:

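a adjective
    ad adverbial adjective
    an nominal adjective
    ag adjectival morpheme
b distinguishing word
c conjunction
d adverb
    df      *********************
    dg adverbial morpheme
e interjection
f locality word
g morpheme
h prefix component
i idiom
j abbreviation
k suffix component
l fixed expression
m numeral
    mg      *********************
    mq numeral-classifier compound
n noun
    ng nominal morpheme
    nr person name
    nrfg     *********************
    nrt      *********************
    ns place name
    nt organization name
    nz other proper noun
o onomatopoeia
p preposition
q classifier (measure word)
r pronoun
    rg pronominal morpheme
    rr personal pronoun
    rz demonstrative pronoun
s place word
t time word
    tg time morpheme
u particle
    ud      *********************
    ug      *********************
    uj      *********************
    ul      *********************
    uv      *********************
    uz      *********************
v verb
    vd adverbial verb
    vg verbal morpheme
    vi intransitive verb
    vn nominal verb
    vq      *********************
x non-morpheme character
y modal particle
z status word
    zg      *********************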
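
One way to work with the table above in code is a small lookup that falls back to the top-level category when a sub-tag has no annotation. This is only a sketch: the descriptions are English glosses of the annotated rows, the unannotated tags (df, mg, uj, ul, ...) are deliberately left out rather than guessed, and `describeTag` is a hypothetical helper name.

```javascript
// English glosses for the annotated rows of the table.
// Tags marked ******** in the table are intentionally absent.
const TAG_NAMES = new Map([
  ['a', 'adjective'], ['ad', 'adverbial adjective'], ['an', 'nominal adjective'],
  ['ag', 'adjectival morpheme'], ['b', 'distinguishing word'], ['c', 'conjunction'],
  ['d', 'adverb'], ['dg', 'adverbial morpheme'], ['e', 'interjection'],
  ['f', 'locality word'], ['g', 'morpheme'], ['h', 'prefix component'],
  ['i', 'idiom'], ['j', 'abbreviation'], ['k', 'suffix component'],
  ['l', 'fixed expression'], ['m', 'numeral'], ['mq', 'numeral-classifier compound'],
  ['n', 'noun'], ['ng', 'nominal morpheme'], ['nr', 'person name'],
  ['ns', 'place name'], ['nt', 'organization name'], ['nz', 'other proper noun'],
  ['o', 'onomatopoeia'], ['p', 'preposition'], ['q', 'classifier'],
  ['r', 'pronoun'], ['rg', 'pronominal morpheme'], ['rr', 'personal pronoun'],
  ['rz', 'demonstrative pronoun'], ['s', 'place word'], ['t', 'time word'],
  ['tg', 'time morpheme'], ['u', 'particle'], ['v', 'verb'],
  ['vd', 'adverbial verb'], ['vg', 'verbal morpheme'], ['vi', 'intransitive verb'],
  ['vn', 'nominal verb'], ['x', 'non-morpheme character'], ['y', 'modal particle'],
  ['z', 'status word'],
]);

// Hypothetical helper: exact match first, then the first letter
// (the top-level category), then 'unknown'.
function describeTag(tag) {
  if (TAG_NAMES.has(tag)) return TAG_NAMES.get(tag);
  const parent = TAG_NAMES.get(tag[0]);
  return parent ? `${parent} (sub-tag "${tag}" not annotated)` : 'unknown';
}

console.log(describeTag('n'));
console.log(describeTag('uj'));
```

The fallback relies on the table's indentation, where each sub-tag sits under the single-letter category it starts with.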

willin commented on August 15, 2024

I have a new requirement to build smart control, so I will have a lot of word-segmentation questions to ask. Could you share a contact method, such as WeChat?

{ cut: [ '把', '卧室', '所有', '的', '灯', '都', '关', '了' ],
  tag:
   [ { word: '把', tag: 'p' },
     { word: '卧室', tag: 'n' },
     { word: '所有', tag: 'b' },
     { word: '的', tag: 'uj' },
     { word: '灯', tag: 'n' },
     { word: '都', tag: 'd' },
     { word: '关', tag: 'v' },
     { word: '了', tag: 'ul' } ],
  extract: [ { word: '卧室', weight: 8.20023407859 } ] }
{ cut: [ '把', '卧室', '全部', '的', '灯', '都', '关', '了' ],
  tag:
   [ { word: '把', tag: 'p' },
     { word: '卧室', tag: 'n' },
     { word: '全部', tag: 'n' },
     { word: '的', tag: 'uj' },
     { word: '灯', tag: 'n' },
     { word: '都', tag: 'd' },
     { word: '关', tag: 'v' },
     { word: '了', tag: 'ul' } ],
  extract: [ { word: '卧室', weight: 8.20023407859 } ] }
{ cut: [ '把', '卧室', '全部', '的', '灯关', '了' ],
  tag:
   [ { word: '把', tag: 'p' },
     { word: '卧室', tag: 'n' },
     { word: '全部', tag: 'n' },
     { word: '的', tag: 'uj' },
     { word: '灯关', tag: 'x' },
     { word: '了', tag: 'ul' } ],
  extract:
   [ { word: '灯关', weight: 11.739204307083542 },
     { word: '卧室', weight: 8.20023407859 } ] }

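The three sentences are nearly identical: 把卧室全部的灯关了, 把卧室全部的灯都关了, and 把卧室所有的灯都关了 (all meaning "turn off all the lights in the bedroom"). Yet 全部 is tagged as a noun while 所有 is tagged as a distinguishing word. This is very different from what I expected...

  1. 灯关 is tagged x =_=!!!
  2. 全部 is tagged n, but 所有 is tagged b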
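
Until the dictionary itself handles these cases, one possible application-level workaround is to normalize the `tag` array after the fact. The sketch below is not a nodejieba feature: `TAG_OVERRIDES`, `KNOWN_WORDS`, `splitUnknown`, and `normalize` are hypothetical names, the word lists are tiny illustrative samples, and choosing `b` for 全部 is an assumption made only so that it matches 所有.

```javascript
// Hypothetical application tables; nothing here comes from nodejieba.
const TAG_OVERRIDES = new Map([['全部', 'b'], ['所有', 'b']]);
const KNOWN_WORDS = new Map([['灯', 'n'], ['关', 'v'], ['开', 'v']]);

// Greedily split an unrecognised token into known words, longest match
// first. Returns null if any part of the token is not in KNOWN_WORDS.
function splitUnknown(word) {
  const parts = [];
  let rest = word;
  while (rest.length > 0) {
    let matched = false;
    for (let len = rest.length; len > 0; len--) {
      const head = rest.slice(0, len);
      if (KNOWN_WORDS.has(head)) {
        parts.push({ word: head, tag: KNOWN_WORDS.get(head) });
        rest = rest.slice(len);
        matched = true;
        break;
      }
    }
    if (!matched) return null;
  }
  return parts;
}

// Re-split tokens tagged 'x' where possible and apply tag overrides.
function normalize(tagged) {
  return tagged.flatMap((item) => {
    if (item.tag === 'x') return splitUnknown(item.word) ?? [item];
    const tag = TAG_OVERRIDES.get(item.word) ?? item.tag;
    return [{ word: item.word, tag }];
  });
}

// The tag array from the third result above.
const third = [
  { word: '把', tag: 'p' }, { word: '卧室', tag: 'n' },
  { word: '全部', tag: 'n' }, { word: '的', tag: 'uj' },
  { word: '灯关', tag: 'x' }, { word: '了', tag: 'ul' },
];
console.log(normalize(third));
```

Tokens tagged `x` that cannot be fully split are passed through unchanged, so the function never drops input.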

yanyiwu commented on August 15, 2024

The WeChat contact info is right at the bottom of README.md.
