chinese-bionlp's Introduction

Chinese Biomedical Natural Language Processing (Chinese-BioNLP)

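This project tracks progress in Chinese biomedical natural language processing, collecting and organizing lists of related papers and reporting the performance of existing methods.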

Chinese Clinical Named Entity Recognition

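The goal of Chinese Clinical Named Entity Recognition (Chinese-CNER) is to identify and extract clinically relevant entity mentions from the plain text of electronic medical records and to classify them into predefined categories. The figure below shows a sample from the CCKS18 CNER evaluation data.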
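To make the task concrete, here is a minimal sketch of one common way to represent entity mentions: character offsets plus a category, converted into one BIO tag per character. The BIO scheme, the sample sentence, and the label names `BODY` and `SYMPTOM` are assumptions made for illustration; they are not taken from the CCKS data or from this project.

```python
from typing import List, Tuple

# (start, end, label): character offsets, end is exclusive.
Mention = Tuple[int, int, str]


def mentions_to_bio(text: str, mentions: List[Mention]) -> List[str]:
    """Convert character-offset mentions into one BIO tag per character."""
    tags = ["O"] * len(text)
    for start, end, label in mentions:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first character of the mention
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters
    return tags


# Hypothetical sentence and labels, invented for this sketch only.
text = "患者腹部疼痛"
mentions = [(2, 4, "BODY"), (4, 6, "SYMPTOM")]
tags = mentions_to_bio(text, mentions)
for char, tag in zip(text, tags):
    print(char, tag)
```

Tagging at the character level avoids depending on a Chinese word segmenter, which is one reason it is often chosen for this kind of text.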

chinese-bionlp's People

Contributors

lingluodlut


chinese-bionlp's Issues

Question about the CCKS 2017 Task 2 dataset

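Hello, and thank you for organizing the CCKS series of datasets. While using the CCKS 2017 Task 2 dataset, I found that the statistics reported in the papers I have read all match the ones you recorded (see the table below), and that most papers describe the dataset as "the training set contains 300 medical records and the test set contains 100 medical records".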

Split         Symptoms and signs  Examinations and tests  Diseases and diagnoses  Treatments  Body parts  Total
Training set  7,831               9,546                   722                     1,048       10,719      29,866
Test set      2,311               3,143                   553                     465         3,021       9,493

I have three questions about this:

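  1. The test set I downloaded from the competition platform is unlabeled, and test-set results had to be obtained by uploading predictions to the platform (now closed). How can the entity statistics of the test set be computed?
  2. The task description file I downloaded from the CCKS official website states: "The dataset used in this task is provided by Beijing Jimuyun Health Technology Co., Ltd. (北京极目云健康科技有限公司). It comes from real electronic medical records on its cloud hospital platform, 800 records in total (each a single visit by a single patient), has been de-identified, and is restricted to use in the CCKS 2017 competition evaluation." This does not match "the training set contains 300 medical records and the test set contains 100 medical records".
  3. The training-set entity statistics I computed differ considerably from the table above (I downloaded the training set from the CCKS official website):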
Split         Symptoms and signs  Examinations and tests  Diseases and diagnoses  Treatments  Body parts  Total
Training set  6,187               5,785                   604                     712         8,310       21,598
Test set      -                   -                       -                       -           -           -
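As a sanity check on the two sets of training-set figures, the short sketch below confirms that each row sums to its stated total and computes the per-category gap. The numbers are copied from the two tables above; the English dictionary keys are names chosen here for illustration.

```python
# Training-set entity counts as recorded in the project (first table).
reported = {
    "symptoms_signs": 7831,
    "exams_tests": 9546,
    "diseases_diagnoses": 722,
    "treatments": 1048,
    "body_parts": 10719,
}

# Training-set entity counts recomputed from the downloaded data (second table).
recounted = {
    "symptoms_signs": 6187,
    "exams_tests": 5785,
    "diseases_diagnoses": 604,
    "treatments": 712,
    "body_parts": 8310,
}

# Each row should add up to the "Total" column of its table.
assert sum(reported.values()) == 29866
assert sum(recounted.values()) == 21598

# Per-category shortfall of the recount relative to the reported figures.
gaps = {name: reported[name] - recounted[name] for name in reported}
for name, gap in gaps.items():
    share = gap / reported[name]
    print(f"{name}: {gap} fewer entities ({share:.1%} of the reported count)")

print("total gap:", sum(gaps.values()))
```

Both rows are internally consistent, so the discrepancy lies in the underlying data rather than in a summation error in either table.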

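CCKS 2017 was a long time ago, so I am not sure whether the dataset has changed over the past few years. If you have a copy of the dataset that matches the statistics you recorded, could you share it with me, or point me to the correct way to obtain it? Sorry for the trouble, and I hope to hear from you.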
