geonodeusergroup-de / geonode-agrovoc-importer

Django management command to import the AGROVOC thesaurus into GeoNode

License: GNU General Public License v3.0

Python 100.00%
geonode

geonode-agrovoc-importer's Introduction

geonode-agrovoc-importer

Django management command to import the AGROVOC thesaurus into GeoNode.

To run the import command, copy it into your GeoNode project:

cp load_agrovoc_thesaurus.py ~/path/to/geonode/geonode/base/management/commands/load_agrovoc_thesaurus.py

The command then appears in the list of available commands shown by python manage.py help. To get an overview of its options, print its help text:

python manage.py load_agrovoc_thesaurus --help
usage: manage.py load_agrovoc_thesaurus [-h] [-d] [--name NAME] [--file FILE] [--title TITLE] [--description DESCRIPTION] [--force-lower-case] [--defaultlang DEFAULT_LANG] [--version] [-v {0,1,2,3}] [--settings SETTINGS]
                                        [--pythonpath PYTHONPATH] [--traceback] [--no-color] [--force-color] [--skip-checks]

(Down)Load a AGROVOC in RDF format into DB

options:
  -h, --help            show this help message and exit
  -d, --dry-run         Only parse and print the thesaurus file, without perform insertion in the DB.
  --name NAME           Identifier name for the thesaurus in this GeoNode instance.
  --file FILE           Full path to a thesaurus in RDF format.
  --title TITLE         title to set in the base_thesaurus table for the agrovoc thesaurus
  --description DESCRIPTION
                        description to set in the base_thesaurus table for the agrovoc thesaurus
  --force-lower-case    all tkeywords and and tkeywordlabels are stored in lower case ...
  --defaultlang DEFAULT_LANG
                        change default language.
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output, 2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g. "myproject.settings.main". If this isn't provided, the DJANGO_SETTINGS_MODULE environment variable will be used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g. "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --force-color         Force colorization of the command output.
  --skip-checks         Skip system checks.

Before you run the command, download AGROVOC from https://www.fao.org/agrovoc/releases (for example https://data.apps.fao.org/catalog/dataset/agrovoc-2024-04/resource/9e78d388-1e52-41b2-a748-9645de136eca). After downloading and unzipping the AGROVOC RDF file, load it into GeoNode with:

python manage.py load_agrovoc_thesaurus --name AGROVOC --file agrovoc_core.rdf

Please note: running this command requires a large amount of memory (~12.5GB).
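
Since the import is expensive, it may be worth doing a dry run before the real one. The following is a minimal sketch that only assembles the command line from flags listed in the help output above. The helper name `build_import_command` and the `manage_py` path are hypothetical, and using the current interpreter via `sys.executable` is an assumption about your environment.

```python
import sys


def build_import_command(name, rdf_file, dry_run=False, manage_py="manage.py"):
    """Assemble the argv list for load_agrovoc_thesaurus.

    Only flags shown in the --help output are used. `manage_py` is a
    hypothetical path to the manage.py of your GeoNode project.
    """
    argv = [
        sys.executable,
        manage_py,
        "load_agrovoc_thesaurus",
        "--name", name,
        "--file", rdf_file,
    ]
    if dry_run:
        # Parse and print only, without inserting into the database.
        argv.append("--dry-run")
    return argv
```

The resulting list can be passed to `subprocess.run(argv, check=True)`, first with `dry_run=True` and then without it. According to the help text the dry run still parses the file, so it should not be expected to reduce memory usage.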

geonode-gemet-importer

Django management command to import the GEMET thesaurus into GeoNode.

THIS IS A COPY WITH MINOR CHANGES FROM https://github.com/GeoNode/geonode/blob/4.2.2/geonode/base/management/commands/load_thesaurus.py

To run the import command, copy it into your GeoNode project:

cp load_gemet_thesaurus.py ~/path/to/geonode/geonode/base/management/commands/load_gemet_thesaurus.py

The command then appears in the list of available commands shown by python manage.py help. To get an overview of its options, print its help text:

python manage.py load_gemet_thesaurus --help
usage: manage.py load_gemet_thesaurus [-h] [-d] [--name NAME] [--defaultlang DEFAULT_LANG] [--force-lower-case] [--file FILE] [--version] [-v {0,1,2,3}] [--settings SETTINGS] [--pythonpath PYTHONPATH] [--traceback]
                                      [--no-color] [--force-color] [--skip-checks]

Load a thesaurus in RDF format into DB

options:
  -h, --help            show this help message and exit
  -d, --dry-run         Only parse and print the thesaurus file, without perform insertion in the DB.
  --name NAME           Identifier name for the thesaurus in this GeoNode instance.
  --defaultlang DEFAULT_LANG
                        change default language.
  --force-lower-case    all tkeywords and and tkeywordlabels are stored in lower case ...
  --file FILE           Full path to a thesaurus in RDF format.
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output, 2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g. "myproject.settings.main". If this isn't provided, the DJANGO_SETTINGS_MODULE environment variable will be used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g. "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --force-color         Force colorization of the command output.
  --skip-checks         Skip system checks.

Before you run the command, download the full version of GEMET from https://www.eionet.europa.eu/gemet/en/exports/rdf/latest (choose "Entire GEMET thesaurus in SKOS format"). After downloading and gunzipping the GEMET RDF file, load it into GeoNode with:

python manage.py load_gemet_thesaurus --name GEMET --file gemet.rdf
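
If no gunzip tool is at hand, the decompression step can also be done from Python. This is a minimal sketch using only the standard library. The download name `gemet.rdf.gz` used in the usage note is an assumption, and `gunzip` is a hypothetical helper name.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest=None):
    """Decompress a .gz file in chunks and return the output path.

    If `dest` is omitted, the last suffix of `src` is dropped,
    so "gemet.rdf.gz" becomes "gemet.rdf".
    """
    src = Path(src)
    dest = Path(dest) if dest else src.with_suffix("")
    if dest == src:
        raise ValueError("destination must differ from source")
    # copyfileobj streams the data, so the file is never held in memory at once.
    with gzip.open(src, "rb") as compressed, open(dest, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return dest
```

Calling `gunzip("gemet.rdf.gz")` would write `gemet.rdf` next to the archive, which is the file name passed to `--file` above.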

geonode-agrovoc-importer's People

Contributors

mwallschlaeger

Watchers

matthesrieke, Xenia Specka

geonode-agrovoc-importer's Issues

unicode latin-1 encoding

When importing the GEMET or AGROVOC thesauri, the encoding is broken. GeoNode raises the following exception when trying to set a tkeyword via the GeoNode API:

UnicodeEncodeError: 'latin-1' codec can't encode character '\u2019' in position 4821: Body ('’') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
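
The wording of this message appears to match the error Python's `http.client` raises when it is handed a `str` request body containing characters outside Latin-1. The sketch below reproduces the underlying encoding failure for U+2019 and shows the workaround the message itself suggests. The sample keyword and the helper name `encode_body` are made up for illustration, and where such a change would belong in GeoNode or in the API client is not stated in the issue.

```python
def encode_body(text):
    """Return the UTF-8 bytes of a request body, as the error message suggests."""
    return text.encode("utf-8")


# Hypothetical keyword containing U+2019 (right single quotation mark).
keyword = "farmer\u2019s market"

try:
    keyword.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    # U+2019 has no Latin-1 representation, so this branch is taken.
    latin1_ok = False

# Sending bytes instead of str avoids the implicit Latin-1 encoding step.
body = encode_body(keyword)
```

If the body is sent as UTF-8 bytes, the request's `Content-Type` header would normally need to declare `charset=utf-8` so the server decodes it accordingly.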

very high memory consumption

Because this project uses the rdflib library, memory usage is a big issue. Running the import on AGROVOC requires more than 5GB of RAM.
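
One possible direction, not what the project currently does, is to stream the RDF/XML file instead of building a full in-memory graph. The sketch below uses the standard library's `xml.etree.ElementTree.iterparse` and assumes the file is RDF/XML in which concepts are written as `skos:Concept` elements with `skos:prefLabel` children. Files that use `rdf:Description` with `rdf:type`, or SKOS-XL labels, would need different matching. The function name `iter_concepts` is hypothetical.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_concepts(source, lang="en"):
    """Yield (uri, label) per skos:Concept element without keeping the tree.

    `label` is the prefLabel in `lang`, or None if there is none.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event != "end" or elem.tag != f"{{{SKOS}}}Concept":
            continue
        uri = elem.get(f"{{{RDF}}}about")
        label = None
        for pref in elem.findall(f"{{{SKOS}}}prefLabel"):
            if pref.get(XML_LANG) == lang:
                label = pref.text
                break
        yield uri, label
        # Drop everything parsed so far so memory stays roughly constant.
        root.clear()
```

Each yielded pair could then be written to the database in batches. Whether this covers everything the rdflib-based import handles (relations, alternative labels, all languages) is an open question.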
