Topic: corpora
corpora,A collaborative catalog of NLP resources for Indic languages
Organization: ai4bharat
Home Page: https://ai4bharat.github.io/indicnlp_catalog
corpora,API for corpora
User: alexeykosh
Home Page: https://github.com/lingcorpora/lingcorpora.py
corpora,Multilingual text corpus designed to study multilingual and cross-lingual natural language understanding (NLU) models and the strategies of localization of virtual assistants
User: cartesinus
corpora,An advanced, extensible web front-end for the Manatee-open corpus search engine
Organization: czcorpus
corpora,WaG - install your own word profile generator out of diverse data resources
Organization: czcorpus
corpora,CoNLL-X utilities
User: danieldk
corpora,The Data Format for Digital Linguistics (DaFoDiL)
Organization: digitallinguistics
Home Page: https://format.digitallinguistics.io
corpora,Jupyter Notebook for Natural Language Processing learning
Organization: digitaltools
corpora,A curated list of resources for natural language processing (NLP) in Swedish
User: dkalpakchi
corpora,Data from a corpus of written Hawaiian
User: dohliam
Home Page: https://dohliam.github.io/corpus/haw/
corpora,Table compiling the list of biomedically-related corpora available for named entity recognition (and some also suitable for association detection). The first version was published as part of the paper: Dieter Galea, Ivan Laponogov, Kirill Veselkov; Exploiting and assessing multi-source data for supervised biomedical named entity recognition, Bioinformatics, bty152, https://doi.org/10.1093/bioinformatics/bty152 . If you would like to add other (or your own) corpora, please submit a pull request and I'll happily approve it.
User: dterg
corpora,An unofficial Python API that allows users to create a corpus of lyrical text from their favorite artists and billboard charts
User: edwardseley
corpora,Dataset containing Semantic Relations and Metadata, for Training and Evaluating Distributional Semantic Models in English and Mandarin Chinese
User: esantus
Home Page: https://github.com/esantus/EVALution
corpora,Repo for Tibetan corpora
Organization: esukhia
corpora,Clean corpus generic script made with tm package
User: filipefilardi
corpora,Textstelle is a collection of corpora for the creation of bots and other things that generate text 🤖
User: gambolputty
Home Page: https://textstelle.0x0a.li
corpora,Pythonic Access to Digital Corpus of Sanskrit (DCS)
User: hrishikeshrt
corpora,Named Entity Recognition for biomedical entities
Organization: hu-ner
corpora,The Official Repository for 👉 CCAE: A Corpus of Chinese-based Asian Englishes @ NLPCC 2023
User: jacklanda
corpora,Scripts for building a geo-located web corpus using Common Crawl data
User: jonathandunn
corpora,Measure the similarity of text corpora for 74 languages
User: jonathandunn
corpora,Command-line corpus tools
User: jonsafari
corpora,Unannotated Spanish 3 Billion Words Corpora
User: josecannete
corpora,A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
User: juand-r
corpora,A variety of loaders for various NLP corpora.
Organization: juliatext
corpora,An R package for dynamic exploration of text collections
User: kgjerde
Home Page: https://kgjerde.github.io/corporaexplorer
corpora,A comprehensive list of annotated training datasets classified by use case.
Organization: kili-technology
Home Page: https://cloud.kili-technology.com/label
corpora,Open Korean NLP Dataset Curation for the Users All Around the Globe
Organization: ko-nlp
corpora,OPUS (opus.nlpl.eu) Python3 API
User: korenyoni
Home Page: https://k0ren.com
corpora,API for Russian National Corpus
User: kunansy
Home Page: https://kunansy.github.io/RNC
corpora,Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
User: m4t1ss
corpora,Demeuk is a simple tool to clean up corpora (like dictionaries) or any dataset containing plain text strings.
Organization: netherlandsforensicinstitute
corpora,NLTK Data
Organization: nltk
corpora,WeChat official account corpus
User: nonamestreet
corpora,A web-based engine for creating and annotating textual corpora
Organization: opencorpora
Home Page: http://opencorpora.org
corpora,Quantitative analysis of judgments of the European Court of Justice
User: phhartl
corpora,Data repository for pretrained NLP models and NLP corpora.
User: piskvorky
Home Page: https://rare-technologies.com/new-api-for-pretrained-nlp-models-and-datasets-in-gensim/
corpora,Official source for Spanish pretrained biomedical and clinical language models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Organization: plantl-gob-es
corpora,Official source for Spanish language models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).
Organization: plantl-gob-es
corpora,Framework for working with brat-annotated .ann files
User: s-lilo
corpora,Automatic categorization of documents consists of assigning a category to a text based on the information it contains. We'll follow different approaches to supervised machine learning.
User: saidziani
corpora,Simple CORPORA list crawler
User: tastyminerals
corpora,German Parliamentary Corpus (GerParCor)
Organization: texttechnologylab
corpora,Reading the data from OPIEC - an Open Information Extraction corpus
Organization: uma-pi1
Home Page: https://www.uni-mannheim.de/dws/research/resources/opiec/
corpora,The Potsdam Twitter Sentiment Corpus
User: wladimirsidorenko
corpora,CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)
User: zliucr