
natlibfi / annif-tutorial


Instructions, exercises and example data sets for Annif hands-on tutorial

License: Creative Commons Attribution 4.0 International

Languages: Makefile 2.17%, Python 6.29%, Dockerfile 2.69%, Shell 2.01%, Jupyter Notebook 86.83%
Topics: tutorial, workshop, annif, machine-learning, code4lib, glam, multilabel-classification, subject-indexing, text-classification

annif-tutorial's People

Contributors

annakasprzik, juhoinkinen, mo-fu, monalehtinen, osma

annif-tutorial's Issues

Adding additional data sets to VirtualBox installation?

Hi,
I have the VirtualBox version of the Annif tutorial, and have been through the mandatory exercises. I must say the tutorial is very useful and pedagogically presented!
Is there any way I can try Annif with any of my own vocabularies and data sets within that framework?
Oddrun

downloading and using pretrained API models locally?

This might be a stupid question, but how do I use the pretrained models that are available via the API locally (either via Docker or Python)? All the tutorials and instructions here seem to assume that models are trained from scratch. I would just like to download the latest models you offer through your API and use them locally. I can use your online API (e.g., via Python), but I'd rather have a faster offline solution, as I have tens of thousands of documents to process.
For example, see the Docker image for the Turku NLP neural parser (http://turkunlp.org/Turku-neural-parser-pipeline/docker.html): with just a few simple steps you can use pretrained models on your own texts. I was looking for something similar for Annif.
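For context, the per-document call mentioned above (using the API via Python) might look roughly like the sketch below. It only builds the HTTP request and does not send it. The base URL `http://localhost:5000/v1`, the project ID `my-project`, and the helper name `build_suggest_request` are assumptions for illustration, as is the form-encoded `text`/`limit` body for the `suggest` endpoint; the sketch does not address where pretrained models could be downloaded from.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_suggest_request(base_url, project_id, text, limit=10):
    """Build (but do not send) a POST request asking for subject suggestions.

    Assumption: the server exposes /projects/<project_id>/suggest under
    base_url and accepts form-encoded 'text' and 'limit' fields.
    """
    url = f"{base_url.rstrip('/')}/projects/{project_id}/suggest"
    body = urlencode({"text": text, "limit": limit}).encode("utf-8")
    return Request(url, data=body, method="POST")


# Hypothetical usage against a locally running instance:
# from urllib.request import urlopen
# import json
# req = build_suggest_request("http://localhost:5000/v1", "my-project", "some text")
# with urlopen(req) as resp:
#     print(json.load(resp))
```

Pointing `base_url` at a local instance instead of the online service is what would avoid the network round trip for each of the tens of thousands of documents.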

Installation trouble (VirtualBox)

I've just installed VirtualBox on my Windows 10 laptop, downloaded Annif-tutorial, and added the Annif-tutorial image to VirtualBox, all according to the tutorial instruction video.
When trying to start the image (green arrow), I get the following error message:


Failed to open a session for the virtual machine annif-tutorial.
Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED).

Result Code: E_FAIL (0x80004005)
Component: ConsoleWrap
Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}

Any idea what to do?

Exercise about sufficient amount of train data (learning curves)

A common question in the tutorial sessions has been "How many documents do I need to train a model?" We could add an optional exercise that shows how increasing the --docs-limit value when training a model affects the model's evaluation results. A simple way to plot the results as a learning curve would also be nice.
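One possible shape for such an exercise is sketched below, assuming the model is retrained once per --docs-limit value and evaluated each time on the same held-out set. The helper names, the project ID, and the file names are hypothetical, and the exact `annif train` / `annif eval` argument order is an assumption to check against the Annif documentation. The sketch only generates the command lines and arranges scores for plotting; it runs nothing and contains no real results.

```python
import shlex


def learning_curve_commands(project_id, train_path, eval_path, limits):
    """Return (docs_limit, train_cmd, eval_cmd) tuples, one per training size.

    Assumption: the CLI is invoked as 'annif train <project> <corpus>
    --docs-limit N' followed by 'annif eval <project> <corpus>'.
    """
    steps = []
    for limit in limits:
        train_cmd = ["annif", "train", project_id, train_path,
                     "--docs-limit", str(limit)]
        eval_cmd = ["annif", "eval", project_id, eval_path]
        steps.append((limit, train_cmd, eval_cmd))
    return steps


def curve_points(scores_by_limit):
    """Turn {docs_limit: score} into x and y lists sorted by docs_limit."""
    xs = sorted(scores_by_limit)
    ys = [scores_by_limit[x] for x in xs]
    return xs, ys


if __name__ == "__main__":
    # Hypothetical project ID and corpus file names.
    for limit, train_cmd, eval_cmd in learning_curve_commands(
            "my-project", "train.tsv", "test.tsv", [100, 1000, 10000]):
        print(shlex.join(train_cmd))
        print(shlex.join(eval_cmd))
```

Once a metric has been read off each evaluation run and collected by hand into a dictionary, the lists from `curve_points` can be passed to any plotting tool, with the number of training documents on the x-axis (a logarithmic scale may be clearer) and the chosen metric on the y-axis.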
