summerlight / anlp
Applied Natural Language Processing project
License: Apache License 2.0
We need to choose the project topic. Please refer to the candidate topics page.
We need a script that generates a dataset for our experiments. Our current dataset is the ALTA-2010 Shared Task. If we need more languages, different annotation, shorter texts, or anything else, we should be able to generate a similar dataset ourselves.
Steps needed:
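As a rough sketch of what such a generator could look like, the snippet below assembles synthetic documents by concatenating sentences from one or two languages and records the gold language set. The sentence pools, the `make_document` and `make_dataset` names, and the output layout are all assumptions for illustration; a real script would read sentences from an actual corpus and match whatever format the shared task data uses.

```python
import random

# Hypothetical monolingual sentence pools. A real script would load these
# from a corpus; the tiny lists here only keep the sketch self-contained.
SENTENCES = {
    "en": ["This is a short sentence.", "Language identification is fun."],
    "de": ["Das ist ein kurzer Satz.", "Sprachen sind interessant."],
    "fr": ["Ceci est une phrase courte.", "Les langues sont belles."],
}


def make_document(rng, pools, max_langs=2, sentences_per_lang=2):
    """Build one synthetic document and its gold language set."""
    n_langs = rng.randint(1, min(max_langs, len(pools)))
    langs = rng.sample(sorted(pools), n_langs)
    parts = []
    for lang in langs:
        for _ in range(sentences_per_lang):
            parts.append(rng.choice(pools[lang]))
    return {"text": " ".join(parts), "langs": sorted(langs)}


def make_dataset(n_docs, seed=0, **kwargs):
    """Generate n_docs documents reproducibly from a fixed seed."""
    rng = random.Random(seed)
    return [make_document(rng, SENTENCES, **kwargs) for _ in range(n_docs)]
```

Fixing the seed keeps a generated dataset reproducible, and `sentences_per_lang` is the knob to turn when shorter texts are needed.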
We'll decide (or at least narrow down) a project topic tomorrow. After the meeting, we should prepare answers to the questions on the corresponding wiki page.
Also, we need to decide each member's role for this project. Please choose the role you hope to take from the list below. (All members should be able to edit this issue; if you can't, please let me know.)
Write a proposal before Tuesday 23:59.
I am currently writing a proposal based on multi-lingual language identification.
At the last meeting, the selected project topics were assigned to individual members. Each evaluation text is supposed to answer the questions below:
This set of questions is basically the gist of the most important parts of the corresponding wiki page, so it helps to keep these detailed questions in mind while writing a topic evaluation.
Let's talk about our team name.
Our research needs to detect similar languages and partition the whole language set into correspondingly separated sets. At the first stage, a full-fledged LID is not needed; just make a fake detector that can "simulate" language detection results.
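One possible way to build such a fake detector is sketched below: it samples a predicted label from a per-language confusion table, and a second helper groups languages that are confused with each other above a threshold. The `CONFUSION` values, the function names, and the threshold are assumptions made up for this illustration, not measured results.

```python
import random
from collections import Counter

# Hypothetical confusion probabilities: CONFUSION[true][predicted].
# The numbers are invented for the sketch, not measured.
CONFUSION = {
    "id": {"id": 0.7, "ms": 0.3},
    "ms": {"ms": 0.6, "id": 0.4},
    "en": {"en": 1.0},
}


def fake_detect(true_lang, rng, confusion=CONFUSION):
    """Sample a predicted label for a document whose gold label is known."""
    row = confusion[true_lang]
    labels = sorted(row)
    weights = [row[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def partition_by_confusion(pairs, threshold=0.1):
    """Group languages whose mutual confusion rate exceeds the threshold.

    pairs is an iterable of (true_label, predicted_label) tuples.
    """
    pairs = list(pairs)
    totals = Counter(true for true, _ in pairs)
    counts = Counter(pairs)
    parent = {lang: lang for pair in pairs for lang in pair}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (true, pred), count in counts.items():
        if true != pred and count / totals[true] > threshold:
            parent[find(true)] = find(pred)

    groups = {}
    for lang in parent:
        groups.setdefault(find(lang), set()).add(lang)
    return sorted(sorted(group) for group in groups.values())
```

Feeding the `(true, predicted)` pairs produced by `fake_detect` into `partition_by_confusion` lets the partitioning logic be developed before a real identifier exists.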
We want to implement (very) basic LID schemes with a CRF or a structured SVM. Then we can look at the results and find out whether they can be improved. We'll use PyStruct for this purpose. At the first stage, we don't need a full dataset. Just make a small development set by hand (around 50 examples should suffice) and develop an identifier.
Before developing identifiers, please study the topic and learn how to use the library idiomatically. Fixing bugs in legacy code is much harder than writing new code from scratch, especially for those who are not the code owners.
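Before any library call, a hand-made development set has to be turned into per-token features and integer labels. The sketch below shows one assumed way to do that with hashed character bigrams; the feature design, the bucket count, and all function names are hypothetical, and the block deliberately stops short of calling PyStruct, whose expected input format should be checked against its documentation.

```python
import zlib

N_BUCKETS = 64  # hypothetical feature dimension, chosen arbitrarily


def char_ngrams(token, n=2):
    """Return character n-grams of a lowercased, boundary-padded token."""
    padded = "^" + token.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def token_features(token, n_buckets=N_BUCKETS):
    """Hash a token's character bigrams into a fixed-length count vector."""
    vec = [0.0] * n_buckets
    for gram in char_ngrams(token):
        # crc32 is stable across runs, unlike Python's built-in hash().
        vec[zlib.crc32(gram.encode("utf-8")) % n_buckets] += 1.0
    return vec


def encode_example(tokens, labels, label_ids):
    """Turn one labelled token sequence into (feature rows, label ids)."""
    X = [token_features(token) for token in tokens]
    y = [label_ids[label] for label in labels]
    return X, y


# A tiny hand-made example in the spirit of the development set above.
LABEL_IDS = {"en": 0, "de": 1}
X, y = encode_example(["the", "Haus"], ["en", "de"], LABEL_IDS)
```

Each sequence becomes one list of feature rows plus one list of label ids, which can then be converted to whatever array type the chosen learner requires.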