mat-o-lab / ckanext-csvtocsvw

Extension to automate metadata creation for CSV files with the help of the csvtocsvw tool of mat-o-lab.

License: GNU Affero General Public License v3.0

Python 84.38% JavaScript 0.24% CSS 1.02% HTML 14.36%

ckanext-csvtocsvw

This extension automatically generates CSVW metadata for uploaded textual tabular data. It uploads the data of the first documented table into a datastore for the source CSV file. It should be used as a replacement for DataPusher.

Requirements

Needs a running instance of the CSVToCSVW application. Point to it through env variables. An API token is also needed, for an account with the right privileges, so that the background job works on private datasets and resources.

CKAN_CSVTOCSVW_URL=http://${CSVTOCSVW_HOST}:${CSVTOCSVW_APP_PORT}
CSVW_API_TOKEN=${CKAN_API_TOKEN}
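
As a rough illustration of how these two variables could be consumed, the sketch below reads them from a mapping and fails early when one is missing. The function name `load_csvtocsvw_settings`, the example URL and the example token are assumptions made for this sketch; the extension's own code may read its settings differently.

```python
import os


def load_csvtocsvw_settings(env=os.environ):
    """Read the two variables named above from a mapping of env variables.

    Hypothetical helper for illustration only, not part of the extension.
    """
    url = env.get("CKAN_CSVTOCSVW_URL", "").rstrip("/")
    token = env.get("CSVW_API_TOKEN", "")
    # Collect the names of all variables that are unset or empty.
    missing = [
        name
        for name, value in (("CKAN_CSVTOCSVW_URL", url), ("CSVW_API_TOKEN", token))
        if not value
    ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"url": url, "token": token}


# Example values are placeholders, not real endpoints or credentials.
example_env = {
    "CKAN_CSVTOCSVW_URL": "http://localhost:5001/",
    "CSVW_API_TOKEN": "not-a-real-token",
}
settings = load_csvtocsvw_settings(example_env)
print(settings["url"])
```

Passing the mapping as an argument keeps the helper easy to test without touching the real process environment.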

You can set the default formats to annotate through the env variable CSVTOCSVW_FORMATS, for example:

CKANINI__CSVTOCSVW__FORMATS="csv txt asc"

Otherwise it reacts to the following formats: "csv", "txt", "asc", "tsv"
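
The format setting can be pictured as a space-separated list with a fallback. The following is a minimal sketch under that assumption; `parse_formats` and `should_annotate` are hypothetical names, and the case-insensitive matching is a choice made here, not documented behavior of the extension.

```python
# Fallback list taken from the text above.
DEFAULT_FORMATS = ("csv", "txt", "asc", "tsv")


def parse_formats(raw=None):
    """Split a space-separated format string; fall back to the defaults."""
    if not raw or not raw.strip():
        return list(DEFAULT_FORMATS)
    return [item.lower() for item in raw.split()]


def should_annotate(resource_format, formats):
    """Return True if the resource format is in the configured list."""
    return (resource_format or "").strip().lower() in formats


formats = parse_formats("csv txt asc")
print(should_annotate("CSV", formats))
print(should_annotate("tsv", formats))
```

With the example value `"csv txt asc"`, a `tsv` resource would be skipped, while it would be picked up when the variable is unset.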

Purpose

Reacts to uploaded CSV files. DEFAULT_FORMATS are "csv; txt". It creates two endpoints for each resource:

  • /annotate creates a CSVW annotation file for a CSV in JSON-LD format, named <csv_filename>-metadata.json, and uploads table-1 to the CKAN datastore so you can explore it with Recline views.
  • /transform uses the CSVW metadata to transform the whole content of the CSV file to RDF; the output is <csv_filename>.ttl.

The plugin's default behavior includes a trigger on CSV file uploads, so annotation runs automatically on upload. The transformation is a bonus feature that outputs standard tabular data as described in the W3C CSVW documentation. It must be triggered manually.
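
To give a feel for what a CSVW annotation contains, here is a minimal sketch that derives a bare-bones metadata document from a CSV header row. It is not the output of the CSVToCSVW application, which is not shown in this README; the function name, the sample data and the way column names are derived from titles are all assumptions made for illustration. Only the `@context`, `url` and `tableSchema.columns` keys come from the W3C CSVW vocabulary.

```python
import csv
import io
import json


def minimal_csvw_metadata(csv_text, url):
    """Build a very small CSVW metadata dict from the header row of a CSV.

    Hypothetical helper: real annotations usually also carry datatypes,
    units and dialect information.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = [
        {
            # Assumption: derive a machine-friendly name from the title.
            "name": title.strip().lower().replace(" ", "_"),
            "titles": title.strip(),
        }
        for title in header
    ]
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": url,
        "tableSchema": {"columns": columns},
    }


sample = "Time,Force N\n0,1.5\n1,2.0\n"
metadata = minimal_csvw_metadata(sample, "example.csv")
print(json.dumps(metadata, indent=2))
```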

Compatibility with core CKAN versions:

CKAN version       Compatible?
2.8 and earlier    not tested
2.9                yes
2.10               yes

Installation

To install ckanext-csvtocsvw:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate

  2. Clone the source and install it on the virtualenv

    git clone https://github.com/Mat-O-Lab/ckanext-csvtocsvw.git
    cd ckanext-csvtocsvw
    pip install -e .
    pip install -r requirements.txt

  3. Add csvtocsvw to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/ckan.ini).

  4. Restart CKAN. For example if you've deployed CKAN with Apache on Ubuntu:

    sudo service apache2 reload

Config settings

None at present

Developer installation

To install ckanext-csvtocsvw for development, activate your CKAN virtualenv and do:

git clone https://github.com/Mat-O-Lab/ckanext-csvtocsvw.git
cd ckanext-csvtocsvw
python setup.py develop
pip install -r dev-requirements.txt

Tests

To run the tests, do:

pytest --ckan-ini=test.ini

Releasing a new version of ckanext-csvtocsvw

If ckanext-csvtocsvw should be available on PyPI you can follow these steps to publish a new version:

  1. Update the version number in the setup.py file. See PEP 440 for how to choose version numbers.

  2. Make sure you have the latest version of necessary packages:

    pip install --upgrade setuptools wheel twine

  3. Create source and binary distributions of the new version:

    python setup.py sdist bdist_wheel && twine check dist/*
    

    Fix any errors you get.

  4. Upload the distributions to PyPI:

    twine upload dist/*
    
  5. Commit any outstanding changes:

    git commit -a
    git push
    
  6. Tag the new release of the project on GitHub with the version number from the setup.py file. For example if the version number in setup.py is 0.0.1 then do:

    git tag 0.0.1
    git push --tags
    

License

AGPL

Acknowledgments

The authors would like to thank the Federal Government and the Heads of Government of the Länder for their funding and support within the framework of the Platform Material Digital consortium. Funded by the German Federal Ministry of Education and Research (BMBF) through the MaterialDigital Call in Project KupferDigital - project id 13XP5119.

ckanext-csvtocsvw's People

Contributors

thhanke

Watchers

Alexandru Todor

ckanext-csvtocsvw's Issues

locale is not working with minipod

Line 15 in csvw_parser.py:

locale.setlocale(locale.LC_ALL, "en_US")

I had to remove this line because it raises an error saying the locale is unknown, even though I set it correctly in the Dockerfile.

Is this line required by your code, or could you remove it? In the Dockerfile the locale is en_US.
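
One possible way around this kind of failure, sketched here as a suggestion rather than as what csvw_parser.py does, is to try several locale names and fall back to the always-available "C" locale. The function name and the candidate list are assumptions; whether the parser needs an en_US locale at all is left open by the issue.

```python
import locale


def set_locale_with_fallback(candidates=("en_US.UTF-8", "en_US", "C")):
    """Try each locale name in order and return the first one that works."""
    for name in candidates:
        try:
            # setlocale raises locale.Error if the name is not installed.
            return locale.setlocale(locale.LC_ALL, name)
        except locale.Error:
            continue
    raise locale.Error("none of the candidate locales is available")


active = set_locale_with_fallback()
print(active)
```

Minimal container images often ship only the "C" and "POSIX" locales, so a bare "en_US" can be unknown to the system even when environment variables are set.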
