
Comments (8)

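howl-anderson commented on August 11, 2024

Hi, you have misunderstood how "intent_entity_featurizer_regex" works. It does not extract entities itself; it only hints to the downstream entity extraction component that a span may be an entity. Which entity it is, and whether the hint is used at all, is learned by the downstream component during training. I have not yet tested how well this works for Chinese. Since questions about the underlying principles are unrelated to the WeatherBot project, I suggest you join the official Rasa community QQ group (Group ID: 820037374), which is a better place to ask Rasa-related questions.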
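To illustrate the point above, here is a minimal sketch of what a regex featurizer does conceptually. It is not the Rasa NLU implementation: the pattern list, `tokenize`, and `regex_flags` are hypothetical names chosen for this example. Note that the output is only a set of per-token flags, not entities.

```python
import re

# Hypothetical patterns for illustration only.
REGEX_FEATURES = [
    {"name": "zipcode", "pattern": r"[0-9]{5}"},
    {"name": "greet", "pattern": r"hey[^\s]*"},
]


def tokenize(text):
    """Whitespace tokenizer returning (token, start, end) triples."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def regex_flags(text):
    """Mark every token that overlaps a pattern match.

    Returns a list of (token, flags) pairs. No entity is emitted here;
    a downstream extractor decides during training how to use the flags.
    """
    tokens = tokenize(text)
    flags = [{p["name"]: False for p in REGEX_FEATURES} for _ in tokens]
    for p in REGEX_FEATURES:
        for match in re.finditer(p["pattern"], text):
            for i, (_, start, end) in enumerate(tokens):
                # A token is flagged if its span overlaps the match span.
                if start < match.end() and end > match.start():
                    flags[i][p["name"]] = True
    return [(tok, f) for (tok, _, _), f in zip(tokens, flags)]


if __name__ == "__main__":
    for token, token_flags in regex_flags("hey send it to 10115"):
        print(token, token_flags)
```

If no downstream component reads these flags, they have no effect on the extracted entities.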

from weatherbot.

lixiang1991 commented on August 11, 2024

OK, thanks for the pointers. I'll dig into this further.

from weatherbot.

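lixiang1991 commented on August 11, 2024

After testing, I found that the lookup table has no effect no matter how I use it.
Then I saw in the rasa_nlu documentation that regex features are only supported by CRFEntityExtractor.
Lookup tables are built on top of regex features, so they are not supported with MITIE either.
My fault for not reading the documentation carefully enough.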
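As a rough sketch of why a lookup table depends on regex features: the table entries can be folded into a single alternation pattern, which is then treated like any other regex feature. The helper name `lookup_table_to_pattern` and the exact pattern shape are assumptions for illustration, not the rasa_nlu code.

```python
import re


def lookup_table_to_pattern(entries):
    """Build one case-insensitive regex from lookup table entries.

    Hypothetical helper: entries are escaped, blank lines are skipped,
    and longer entries come first so they win over shorter prefixes.
    """
    escaped = [re.escape(e.strip()) for e in entries if e.strip()]
    escaped.sort(key=len, reverse=True)
    return r"(?i)\b(?:" + "|".join(escaped) + r")\b"


if __name__ == "__main__":
    pattern = lookup_table_to_pattern(["Shanghai", "New York"])
    print(re.findall(pattern, "flights from new york to shanghai"))
```

Because the result is just another regex feature, an extractor that ignores regex features ignores the lookup table as well.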

from weatherbot.

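howl-anderson commented on August 11, 2024

If you are comfortable getting your hands dirty, you could implement support for this feature yourself in one of the downstream NER components or classifiers. It is not very hard. If you are interested, give it a try, and contribute a feature upstream while you are at it.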

from weatherbot.

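lixiang1991 commented on August 11, 2024

I spent a few days experimenting with English data, hoping to modify the MITIE entity extraction to add this capability.
I found that CRF can indeed use regex features to extract entities that appear only in the lookup table and not in the examples. MITIE cannot.

The reason is that MITIE's Python bindings provide no interface for attaching additional features to a token. So even though the "regex" component attaches regex features to every token, those features are dropped from the dataset that is finally used for training.

I am not sure whether MITIE itself has no way to add extra features or whether only the Python version lacks the interface. Do you know?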
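The difference described above can be sketched with plain data structures. This does not call sklearn-crfsuite or MITIE; `to_crf_features` and `to_mitie_style_sample` are hypothetical helpers that only mimic the shape of each trainer's input, under the assumption that the CRF side takes one feature dict per token and the MITIE side takes tokens plus labelled index ranges.

```python
def to_crf_features(tokens, pattern_flags):
    """CRF-style input: one feature dict per token, so extra features fit."""
    return [
        {"lower": token.lower(), "pattern": bool(flag)}
        for token, flag in zip(tokens, pattern_flags)
    ]


def to_mitie_style_sample(tokens, entities):
    """MITIE-style input: tokens plus (index range, label) pairs only.

    There is no slot for per-token features, so any regex flags computed
    earlier in the pipeline never reach the trainer.
    """
    return {
        "tokens": list(tokens),
        "entities": [(range(start, end), label) for start, end, label in entities],
    }


if __name__ == "__main__":
    tokens = ["weather", "in", "Shanghai"]
    flags = [False, False, True]  # e.g. "Shanghai" matched a lookup table
    print(to_crf_features(tokens, flags))
    print(to_mitie_style_sample(tokens, [(2, 3, "city")]))
```

In the second structure the `flags` list has nowhere to go, which matches the observed behaviour that the regex features are discarded before MITIE training.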

from weatherbot.

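howl-anderson commented on August 11, 2024

Whether MITIE does not support extra features or Rasa simply has not integrated them is something you would have to work out by looking at how MITIE works. I am not sure either; that question can only be answered properly after reading the code and understanding the internals.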

from weatherbot.

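lixiang1991 commented on August 11, 2024

https://github.com/mit-nlp/MITIE/blob/master/examples/python/train_ner.py#L1
This is MITIE's official example. Its training data consists only of the words and their indices, so the cause is probably not a missing integration on the Rasa side.
I'll study MITIE more closely.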

from weatherbot.

howl-anderson commented on August 11, 2024

Getting to the bottom of a problem like this is a great way to work! Keep it up. If you find the final answer, please post an update here for everyone's reference. :)

from weatherbot.
