This project forked from ohenrik/nb_dep_ud_sm

spaCy model trained on a Norwegian corpus converted from OBT to Universal Dependencies.

License: Other

Experimental Norwegian (Bokmål) language model for spaCy

This model is based on the Norwegian Universal Dependencies dataset, which can be found here:

https://github.com/UniversalDependencies/UD_Norwegian-Bokmaal

Command used to train the model:

batch_from=16 batch_to=64 python -m spacy train nb model_out no_bokmaal-ud-train.json no_bokmaal-ud-dev.json -n 30

There is probably much room for improvement in this model. For tagging, however, it seems to perform fairly well: tag accuracy reaches about 95% in the training results below.

Iteration 7 seemed to work best, so it is the one packaged here.

To get the same results as shown here, please use the updated Norwegian language package for spaCy. It should now be part of the master branch; the pull request can be found here: explosion/spaCy#1882

Installation

To install the package use this command:

pip install https://github.com/ohenrik/nb_dep_ud_sm/raw/master/nb_dep_ud_sm-0.0.1/dist/nb_dep_ud_sm-0.0.1.tar.gz

Usage

import spacy
nb = spacy.load("nb_dep_ud_sm")

doc = nb("Det er kaldt på vinteren i Norge.")
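Once the text is parsed, the tagger and parser output can be read off each token. The following is a minimal sketch, not part of the original package: `describe` and `main` are hypothetical helper names, and `main` assumes the model has been installed as described under Installation.

```python
def describe(doc):
    """Return (text, POS tag, dependency label, head text) for each token."""
    return [(tok.text, tok.pos_, tok.dep_, tok.head.text) for tok in doc]


def main():
    # Imported here so that describe() can be reused without spaCy installed.
    import spacy

    # Assumes the nb_dep_ud_sm package from the Installation section is installed.
    nb = spacy.load("nb_dep_ud_sm")
    doc = nb("Det er kaldt på vinteren i Norge.")

    # One line per token: word, part of speech, dependency label, syntactic head.
    for text, pos, dep, head in describe(doc):
        print(f"{text:<10} {pos:<6} {dep:<10} {head}")
```

Calling `main()` prints one row per token. The exact tags and labels depend on the packaged model, so no expected output is shown here.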

Training results:

Itn.  P.Loss   N.Loss  UAS    NER P.  NER R.  NER F.  Tag %   Token %  na      na
0     500.962  0.000   83.67  0.000   0.000   0.000   93.269  100.000  3542.9  0.0
1     86.554   0.000   86.38  0.000   0.000   0.000   94.396  100.000  3767.6  0.0
2     35.351   0.000   87.07  0.000   0.000   0.000   94.762  100.000  3611.1  0.0
3     21.769   0.000   87.99  0.000   0.000   0.000   94.839  100.000  3779.8  0.0
4     19.490   0.000   88.26  0.000   0.000   0.000   95.02   100.000  3565.9  0.0
5     17.730   0.000   88.48  0.000   0.000   0.000   95.084  100.000  3421.0  0.0
6     16.141   0.000   88.77  0.000   0.000   0.000   95.042  100.000  3533.3  0.0
7     14.906   0.000   88.72  0.000   0.000   0.000   95.139  100.000  3572.3  0.0
8     13.644   0.000   88.76  0.000   0.000   0.000   95.042  100.000  3585.8  0.0
9     12.909   0.000   88.72  0.000   0.000   0.000   95.125  100.000  3694.2  0.0
10    12.194   0.000   88.72  0.000   0.000   0.000   95.075  100.000  3618.3  0.0
11    11.435   0.000   88.65  0.000   0.000   0.000   95.042  100.000  3738.2  0.0
12    10.950   0.000   88.67  0.000   0.000   0.000   94.754  100.000  3909.9  0.0
13    10.325   0.000   88.85  0.000   0.000   0.000   47.879  100.000  3673.9  0.0
14    9.793    0.000   88.88  0.000   0.000   0.000   42.063  100.000  3758.4  0.0
15    9.456    0.000   88.77  0.000   0.000   0.000   43.68   100.000  3497.1  0.0
16    8.967    0.000   88.69  0.000   0.000   0.000   45.06   100.000  3514.9  0.0
17    8.493    0.000   88.88  0.000   0.000   0.000   46.537  100.000  3632.7  0.0
18    8.109    0.000   88.76  0.000   0.000   0.000   47.249  100.000  3837.6  0.0
19    7.795    0.000   88.73  0.000   0.000   0.000   47.485  100.000  3473.2  0.0
20    7.573    0.000   88.81  0.000   0.000   0.000   47.579  100.000  3482.8  0.0
21    7.131    0.000   88.82  0.000   0.000   0.000   47.282  100.000  3327.1  0.0
22    7.053    0.000   88.87  0.000   0.000   0.000   46.916  100.000  3576.0  0.0
23    6.736    0.000   88.61  0.000   0.000   0.000   46.394  100.000  3223.6  0.0
24    6.459    0.000   88.83  0.000   0.000   0.000   45.841  100.000  3523.7  0.0
25    6.364    0.000   88.67  0.000   0.000   0.000   45.423  100.000  3163.7  0.0
26    6.080    0.000   88.80  0.000   0.000   0.000   44.959  100.000  3497.2  0.0
27    5.984    0.000   88.77  0.000   0.000   0.000   44.56   100.000  3642.3  0.0
28    5.724    0.000   88.99  0.000   0.000   0.000   44.249  100.000  3467.4  0.0
29    5.620    0.000   88.97  0.000   0.000   0.000   43.895  100.000  3628.4  0.0
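The README does not say which metric was used to pick iteration 7. One way to check the table is to find the best iteration for each score. The sketch below transcribes the Itn., UAS and Tag % columns from the table above; `parse` and `best` are hypothetical helper names, and the parser accepts either decimal commas or decimal points.

```python
# (iteration, UAS, Tag %) transcribed from the training results table.
RESULTS = """
0 83.67 93.269
1 86.38 94.396
2 87.07 94.762
3 87.99 94.839
4 88.26 95.02
5 88.48 95.084
6 88.77 95.042
7 88.72 95.139
8 88.76 95.042
9 88.72 95.125
10 88.72 95.075
11 88.65 95.042
12 88.67 94.754
13 88.85 47.879
14 88.88 42.063
15 88.77 43.68
16 88.69 45.06
17 88.88 46.537
18 88.76 47.249
19 88.73 47.485
20 88.81 47.579
21 88.82 47.282
22 88.87 46.916
23 88.61 46.394
24 88.83 45.841
25 88.67 45.423
26 88.80 44.959
27 88.77 44.56
28 88.99 44.249
29 88.97 43.895
"""


def to_float(value):
    """Convert a score written with a decimal comma or a decimal point."""
    return float(value.replace(",", "."))


def parse(raw):
    """Turn the whitespace-separated rows into a list of dicts."""
    rows = []
    for line in raw.strip().splitlines():
        itn, uas, tag = line.split()
        rows.append({"itn": int(itn), "uas": to_float(uas), "tag": to_float(tag)})
    return rows


def best(rows, metric):
    """Return the row with the highest value for the given metric."""
    return max(rows, key=lambda row: row[metric])


rows = parse(RESULTS)
print(best(rows, "tag")["itn"])  # 7
print(best(rows, "uas")["itn"])  # 28
```

Tag accuracy peaks at iteration 7, which matches the packaged model, while UAS keeps creeping up until iteration 28. The table also shows Tag % dropping from about 95 to below 48 from iteration 13 onward, so the later iterations would be a poor choice for tagging despite their slightly higher UAS.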

Not an official model

This is not yet an official spaCy model.
