kmounlp / ner

A technical report on standardizing Korean named-entity definitions and labels, and a named-entity morpheme corpus built on that report.

License: Other

Python 98.47% Perl 1.53%

ner's People

Contributors

ho-lol, youngjinshin

ner's Issues

Hello!!

I'm asking because this corpus has no stated license policy.
Is it free to use?

Add a license

This data is very useful, but currently cannot legally be used for anything. Would you be willing to license the data? Some standard open-source licenses for data are Creative Commons fair-use licenses. For more information, see here.

If you're willing, I can submit a pull request with a LICENSE.

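Data errors

In the data, the second and third fields sometimes contain errors.
When the third field (tags) starts with '+', a tag is missing; after looking into it,
the entries appear to be correctable with rules roughly like the following.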

# Contracted form (morphi) -> corrected (morphs, tags) for entries whose
# tags field starts with '+', i.e. the first tag is missing.
FIXES = {
    '봤':   ('보+았',   'VV+EP'),
    '됐':   ('되+었',   'VV+EP'),
    '되':   ('되',      'VV'),
    '했':   ('하+였',   'VV+EP'),
    '했었': ('하+였었', 'VV+EP'),
    '왔':   ('오+았',   'VV+EP'),
    '왔었': ('오+았었', 'VV+EP'),
    '와':   ('오+아',   'VV+EC'),
    '와야': ('오+아야', 'VV+EC'),
    '와서': ('오+아서', 'VV+EC'),
    '컸':   ('크+었',   'VA+EP'),
    '커서': ('크+어서', 'VA+EC'),
    '커':   ('크+어',   'VA+EC'),
    '줬':   ('주+었',   'VV+EP'),
    '졌':   ('지+었',   'VX+EP'),
    '써야': ('쓰+어야', 'VV+EC'),
    '써서': ('쓰+어서', 'VV+EC'),
    '써':   ('쓰+어',   'VV+EC'),
    '써도': ('쓰+어도', 'VV+EC'),
    '썼':   ('쓰+었',   'VV+EP'),
    '쐈':   ('쏘+았',   'VV+EP'),
    '꿨':   ('꾸+었',   'VV+EP'),
    '쳤':   ('치+었',   'VV+EP'),
    '췄':   ('추+었',   'VV+EP'),
    '놨':   ('노+았',   'VV+EP'),
    '겠':   ('겠',      'EP'),
    '퍼':   ('푸+어',   'VV+EC'),
    '뒀':   ('두+었',   'VV+EP'),
    '꼈':   ('끼+었',   'VV+EP'),
    '떴':   ('뜨+었',   'VV+EP'),
    '떠':   ('뜨+어',   'VV+EC'),
    # ...
}

# morphi, morphs and tags come from the corpus entry currently being processed.
if tags[0] == '+' and morphi in FIXES:
    morphs, tags = FIXES[morphi]
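
For anyone who wants to try these corrections, here is a minimal sketch of how the rules could be wrapped in a reusable function. The names `FIXES` and `fix_entry` are hypothetical and do not come from the repository, only a few of the rules listed in this issue are included, and the broken input in the usage lines is made up for illustration. How a corpus line is split into its fields is left out, since the exact file layout is not described here.

```python
# Hypothetical helper; FIXES and fix_entry are illustrative names, not repository code.
# A subset of the rules from this issue: contracted form -> (morphs, tags).
FIXES = {
    "봤": ("보+았", "VV+EP"),
    "됐": ("되+었", "VV+EP"),
    "커서": ("크+어서", "VA+EC"),
    "겠": ("겠", "EP"),
}


def fix_entry(surface, morphs, tags):
    """Return corrected (morphs, tags) if tags starts with '+' and a rule is known.

    Entries that look fine, or for which no rule exists, are returned unchanged.
    """
    if tags.startswith("+") and surface in FIXES:
        return FIXES[surface]
    return morphs, tags


# Made-up example of a broken entry (first tag missing) and a healthy one.
print(fix_entry("봤", "보+았", "+EP"))
print(fix_entry("커서", "크+어서", "VA+EC"))
```

Using `str.startswith` rather than `tags[0]` means an empty tags field is passed through unchanged instead of raising an `IndexError`; whether that is the desired behavior depends on how empty fields should be handled.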
