vivo-community / generate2vivo

Extensible Data Ingest Tool for VIVO. Contains data sources like Datacite Commons, Crossref, ORCID and ROR. Developed at TIB as part of the BMBF funded project TAPIR.

Home Page: https://projects.tib.eu/tapir/en/

License: BSD 3-Clause "New" or "Revised" License

Java 100.00%
spring-boot docker maven vivo sparql-generate graphql java datacite-commons datacite ror

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

generate2vivo

generate2vivo is an extensible data ingest and transformation tool for the open source software VIVO. It currently contains queries for metadata from DataCite Commons, Crossref, ROR and ORCID and maps the results to the VIVO ontology using sparql-generate. The resulting RDF data can either be exported directly to a VIVO instance (or any SPARQL endpoint) or be returned as JSON-LD.


Features

The starting point was the sparql-generate library, which we use as the engine for our transformations. The transformations themselves are defined in separate GENERATE queries. Because code and queries are kept apart, users can

  • write or change queries without touching the code
  • reuse the queries (you can discard the code and run only the queries, for example with the command line tool provided on the sparql-generate website)
  • reuse the code (if the data sources are not of interest to you, you can discard the queries and use the code with your own queries)

In addition, we gave the application a REST API, so other programs or services can communicate with it using HTTP requests. This allows generate2vivo to be integrated into an existing data ingest process.
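As a rough illustration of such an integration, the following sketch builds and sends an HTTP request to a locally running instance using the HTTP client that ships with Java 11. The path /example/query, the parameter name id, and the use of GET are hypothetical placeholders and not taken from the project; the actual endpoints are listed in the swagger-ui of a running instance.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

public class Generate2VivoClientSketch {

    // Builds a GET request with URL-encoded query parameters.
    // Path and parameter names are supplied by the caller; the ones used in
    // main() are hypothetical and must be replaced with real endpoints.
    public static HttpRequest buildRequest(String baseUrl, String path, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(URLEncoder.encode(entry.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        String url = baseUrl + path + (params.isEmpty() ? "" : "?" + query);
        return HttpRequest.newBuilder(URI.create(url)).GET().build();
    }

    // Sends the request and returns the response body as a string.
    // Requires a running instance, so it is not called from main().
    public static String fetch(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Hypothetical endpoint and parameter, for illustration only.
        HttpRequest request = buildRequest(
                "http://localhost:9000", "/example/query", Map.of("id", "0000-0000-0000-0000"));
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Keeping request construction separate from sending makes it possible to inspect or log the request before any network traffic happens; call fetch(request) once the application is running.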

On the output side, the generated data can be exported directly into a VIVO instance via its SPARQL API. Alternatively, if you want to check the data before importing it, or if you use a messaging service like Apache Kafka, the application can return the generated data as JSON-LD so that you can post-process it.

Installation

  1. Prerequisites: You need to have Maven and a JDK for Java 11 installed.
  2. Clone the repository to a local folder using git clone https://github.com/vivo-community/generate2vivo.git
  3. Change into the folder where the repository has been cloned.
  4. Open src/main/resources/application.properties and change your VIVO details accordingly. If you do not provide vivo.url, vivo.email or vivo.password, the application will not import the mapped data into VIVO but will return the triples as JSON-LD.
  5. Run the application:
     • You can run the application directly via mvn spring-boot:run.
     • Alternatively, you can run the application in Docker:
         mvn spring-boot:build-image
         docker run -p 9000:9000 generate2vivo:latest
  6. A minimal swagger-ui will be available at http://localhost:9000/swagger-ui/.
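The fallback described in step 4 can be pictured with a small sketch. It is not the project's actual code: the class, enum and method names are hypothetical, and it assumes one reading of the rule, namely that the data is exported to VIVO only when all three properties are set, and that a blank value counts as not provided.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class OutputModeSketch {

    // Hypothetical names for the two output paths described in the README.
    public enum OutputMode { EXPORT_TO_VIVO, RETURN_JSON_LD }

    // True when the property is absent or contains only whitespace (assumption).
    static boolean isMissing(Properties props, String key) {
        String value = props.getProperty(key);
        return value == null || value.isBlank();
    }

    // Export only when url, email and password are all present;
    // otherwise fall back to returning JSON-LD.
    public static OutputMode decide(Properties props) {
        boolean complete = !isMissing(props, "vivo.url")
                && !isMissing(props, "vivo.email")
                && !isMissing(props, "vivo.password");
        return complete ? OutputMode.EXPORT_TO_VIVO : OutputMode.RETURN_JSON_LD;
    }

    // Parses text in application.properties syntax.
    public static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Example values only; email and password are left empty here.
        Properties props = parse(
                "vivo.url=http://localhost:8080/vivo\nvivo.email=\nvivo.password=\n");
        System.out.println(decide(props)); // prints RETURN_JSON_LD
    }
}
```

Leaving the VIVO details empty is therefore a simple way to inspect the generated triples before importing anything.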

Wiki resources

Additional resources are available in the GitHub wiki, for example:

  • data sources & queries: a detailed overview of the data sources and their queries.
  • using generate2vivo: a short user tutorial with screenshots on how to use the swagger-ui to execute a query.
  • run queries in cmd line: an alternative way of running the queries from the command line with the Jar provided by sparql-generate.
  • dev guide: resources specifically for developers, e.g. on how to add a data source or query, or how to pass variables from the code into a query.

Contributors

hauschke, smierz
