illgrenoble / visa-db-etl-py

This project contains the source code for the Python library to be used to load data into the database of the VISA platform.

License: GNU General Public License v3.0

Visa Database ETL Python Library

VISA (Virtual Infrastructure for Scientific Analysis) makes it simple to create compute instances on facility cloud infrastructure to analyse your experimental data using just your web browser.

See the User Manual for deployment instructions and end user documentation.

Description

The ETL Process is an application, running independently of VISA, that is used to push data into the VISA database.

Data includes User Office information (users, proposals, experiments, instruments) and roles of different users depending on their function at the site.

The Extraction and Transformation parts of the process are left to the administrators of VISA who have access to the local facility data sources. This library helps in the creation of the ETL Process at each site by providing the load aspect of the process.

How to populate with the CSV source

The CSV source is an example of how to develop a source for the VISA ETL.

To run it, modify the code to update the connection parameters, and comment out or uncomment the call to clean() as needed (you may want to skip it for performance reasons if you have a lot of data, or if this is the first time you load the data). The CSV files must be in the csv_data/ folder below the code, and each file must have the same name as the corresponding function.
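
As an illustration of the file layout described above, here is a minimal sketch of a helper that reads one CSV file from csv_data/ and yields one dictionary per row. The helper name read_rows, the .csv extension and the header row are assumptions made for this example; they are not taken from CSV_source.py.

```python
import csv
from pathlib import Path
from typing import Dict, Iterator


def read_rows(name: str, csv_dir: Path = Path("csv_data")) -> Iterator[Dict[str, str]]:
    """Yield one dict per row of csv_dir/<name>.csv, keyed by the header names.

    Hypothetical helper: assumes the file has a header row and a .csv extension.
    """
    path = csv_dir / f"{name}.csv"
    # newline="" is what the csv module recommends when opening files for reading.
    with path.open(newline="", encoding="utf-8") as handle:
        # csv.DictReader returns every value as a string.
        yield from csv.DictReader(handle)
```

Because the helper is a generator, rows are read lazily, which may help when the files are large.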

How to create a loader

  • create a Python 3 module
  • instantiate the Loader class with an asyncpg Connection object
  • initialise the schema by loading schema.sql and/or call Loader.clean() if necessary
  • call the methods in the order specified in CSV_source.py (the reverse of the order used in Loader.clean())
  • each method expects an iterable. It can be a list, list comprehension, sequence, generator ...
  • each element of the iterable must be a dictionary, with the column name as the key, and the value as a string
  • the loader is asynchronous, so it must be run with asyncio.run()
  • all other details are up to you (for example, where you get your data from)
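
The overall shape of such a loader module can be sketched as follows. So that the example runs without a database, it uses StubLoader, a hypothetical stand-in for the library's Loader class. The method name load_instruments, the column names, and the assumption that clean() is a coroutine are all illustrative; check CSV_source.py for the real method names and call order.

```python
import asyncio
from typing import Dict, Iterable, List


class StubLoader:
    """Hypothetical stand-in for the library's Loader (not the real class)."""

    def __init__(self, connection: object) -> None:
        self.connection = connection
        self.loaded: List[Dict[str, str]] = []

    async def clean(self) -> None:
        # Assumption: the real clean() empties the target tables.
        self.loaded.clear()

    async def load_instruments(self, rows: Iterable[Dict[str, str]]) -> None:
        # Hypothetical method name; it accepts any iterable of dicts.
        for row in rows:
            self.loaded.append(dict(row))


def instrument_rows() -> Iterable[Dict[str, str]]:
    # A generator works as well as a list; all values are strings.
    yield {"id": "1", "name": "Example instrument"}
    yield {"id": "2", "name": "Another instrument"}


async def main() -> List[Dict[str, str]]:
    # With the real library this would be an asyncpg Connection,
    # for example the result of `await asyncpg.connect(...)`.
    connection = None
    loader = StubLoader(connection)
    await loader.clean()
    await loader.load_instruments(instrument_rows())
    return loader.loaded


if __name__ == "__main__":
    # The loader is asynchronous, so the entry point goes through asyncio.run().
    print(asyncio.run(main()))
```

In a real module, StubLoader would be replaced by the library's Loader, and the single load call by the full sequence of methods from CSV_source.py.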

Acknowledgements

VISA has been developed as part of the Photon and Neutron Open Science Cloud (PaNOSC).

PaNOSC has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 823852.
