unicef / magasin

Cloud native open-source end-to-end data / AI / ML platform

Home Page: https://unicef.github.io/magasin/

License: Apache License 2.0

Shell 13.20% Mustache 55.67% Python 11.73% Smarty 18.57% Dockerfile 0.82%
cloud dagster data data-pipelines data-science data-visualization helm-charts kubernetes magasin

magasin's Introduction

magasin : cloud native end-to-end open-source data platform

magasin enables organizations to perform automatic data ingestion, storage, analysis, ML/AI compute and visualization at scale.

Learn more about why to use magasin and about its architecture.

Get started

In the getting started guide, you will install magasin on your local machine or in a Kubernetes cluster, and then perform an end-to-end data analysis that includes:

  • exploring a data source,
  • creating a pipeline to automate data ingestion, and
  • creating a dashboard to present your findings.

Documentation

The documentation covers features, architecture, getting started, advanced installation, deployment, contributing, and more.

Contributing

Magasin follows an open approach towards accepting contributions.

License

Apache License Version 2.0 UNICEF

magasin's People

Contributors

0xifis, andrii29, consideratio, danielfrg, dashanji, dask-bot, dependabot[bot], dertiedemann, github-actions[bot], guillaumeeb, isvoid, jacobtomlinson, jcscottiii, jsignell, kazimuth, lbrindze, manics, matrixmanatyrservice, matt711, merlos, michcio1234, mrocklin, nathanbaleeta, pre-commit-ci[bot], raybellwaves, rgduncan, srikiz, stevededalus, tomaugspurger, zonca

Forkers

dalvarez83 merlos

magasin's Issues

Installer declare -A invalid option on MacOS

On macOS, running ./install-magasin.sh in zsh displays the following:

./install-magasin.sh: line 254: declare: -A: invalid option
declare: usage: declare [-afFirtx] [-p] [name[=value] ...]

The usage string matches bash 3.2, the version shipped with macOS; associative arrays (declare -A) require bash 4.0 or later.

Automate magasin changes in the daskhub helm chart

By default, the daskhub helm chart sets up a public proxy service in LoadBalancer mode.
If the chart is installed in a regular Kubernetes cloud service, this exposes a JupyterHub interface to any user without a password, and therefore allows running arbitrary code in that instance.

To prevent that, the following change is added to the values.yaml file within the helm chart:

jupyterhub:
  proxy:
    service:
      type: ClusterIP

However, if the helm chart is updated (dev-scripts/update-helm-charts.sh) this change will be overwritten.
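
One way to keep the override from being lost is to reapply it after every chart update. The following is a minimal sketch, assuming values.yaml has already been parsed into a nested dictionary by a YAML library; `MAGASIN_OVERRIDES` and `apply_overrides` are hypothetical names and are not part of the magasin repository.

```python
import copy

# Overrides magasin needs on top of the upstream daskhub defaults.
# Only the ClusterIP setting is taken from the issue text.
MAGASIN_OVERRIDES = {"jupyterhub": {"proxy": {"service": {"type": "ClusterIP"}}}}


def apply_overrides(values, overrides):
    """Return a copy of `values` with `overrides` merged in recursively."""
    merged = copy.deepcopy(values)
    for key, override in overrides.items():
        if isinstance(override, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = apply_overrides(merged[key], override)
        else:
            # Scalar, list, or missing key: the override wins.
            merged[key] = copy.deepcopy(override)
    return merged


# Illustrative upstream-style values after a chart update.
upstream = {"jupyterhub": {"proxy": {"service": {"type": "LoadBalancer"}}, "hub": {}}}
patched = apply_overrides(upstream, MAGASIN_OVERRIDES)
print(patched["jupyterhub"]["proxy"]["service"]["type"])
```

Keeping the overrides in one place means the update script only has to call a merge step after fetching the new chart, instead of relying on someone to re-edit values.yaml by hand.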

Installer does not work when missing dependencies using curl piping

When the installer is run through curl piping and dependencies are missing (which is almost always the case, as mag is usually not installed), the installation fails at the point where it asks the user whether to install the dependencies.

The reason for this is that the shell launched with curl piping is not interactive.

Solution:
Detect whether the shell is interactive. If it is, ask the question; otherwise, install the dependencies automatically.
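
The decision logic can be sketched as follows. The installer itself is a shell script, where the equivalent check would be a test such as `[ -t 0 ]`; this Python version only illustrates the idea, and `should_prompt` and `confirm_install` are hypothetical names.

```python
import io
import sys


def should_prompt(stdin=None):
    """Return True only when stdin is a terminal that can answer a question."""
    stream = sys.stdin if stdin is None else stdin
    try:
        return stream.isatty()
    except (AttributeError, ValueError):
        # stdin is missing or closed: treat the session as non-interactive.
        return False


def confirm_install(ask=input, stdin=None):
    """Ask before installing dependencies, or install automatically when piped."""
    if not should_prompt(stdin):
        return True  # non-interactive run: install without asking
    answer = ask("Install missing dependencies? [y/N] ")
    return answer.strip().lower() in ("y", "yes")


# A pipe is simulated with an in-memory stream, which is never a terminal.
print(confirm_install(stdin=io.StringIO("")))
```

When the script arrives through a pipe, stdin is the pipe rather than the keyboard, so the prompt is skipped and the dependencies are installed.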

Run magsh in Windows

magsh was not tested on Windows when it was first released.

After testing it on Windows, the command should be something like:

docker run -ti -P -v "$env:USERPROFILE\.kube\config:/kube/config" -v "$env:USERPROFILE/.mc:/root/.mc" -v "$env:USERPROFILE/magsh:/shared" merlos/magsh:latest

Where $env:USERPROFILE will be replaced by C:\Users\currentUserName\

Uninstall chart tenant on uninstaller not working properly

[ i ] helm uninstall tenant --namespace magasin-tenant
Error: failed to delete release: tenant
[ ✗ ] Could not uninstall magasin/tenant in the namespace magasin-tenant
namespace "magasin" deleted

This may be related to uninstalling the operator first.

Install dagster within the installer

Dagster evolves quickly, and newer versions may not be compatible with older ones. To ensure that the correct version of dagster is installed, the setup could manage this.

The version can be obtained by running:

helm list --all-namespaces | grep magasin-dagster | awk '{print $10}'

Where magasin-dagster is the namespace for the dagster component within the realm magasin.

Then the package can be installed using:

pip install dagster==<version>

Alternatively, dagster could be a dependency within the setup.py of the CLI, so that installing the mag client also installs dagster.
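
As a rough sketch of the first approach, the version lookup could read the JSON form of the helm listing instead of counting awk columns. This assumes the release is listed by `helm list --all-namespaces -o json` with an `app_version` field that matches the dagster package version; `dagster_requirement` is a hypothetical helper and the sample version below is made up.

```python
import json


def dagster_requirement(helm_list_json, namespace="magasin-dagster"):
    """Build a pip requirement from the JSON printed by `helm list -o json`."""
    for release in json.loads(helm_list_json):
        if release.get("namespace") == namespace:
            return "dagster==" + release["app_version"]
    raise LookupError(f"no helm release found in namespace {namespace!r}")


# Sample shaped like the helm JSON output; the version number is invented.
sample = json.dumps([
    {"name": "dagster", "namespace": "magasin-dagster",
     "chart": "dagster-1.6.4", "app_version": "1.6.4"},
])
print(dagster_requirement(sample))
```

The returned string can be passed directly to pip install, which keeps the client library aligned with the version deployed in the cluster.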

Cannot create main namespace for only suffix realm

Realms with only a suffix start with '-' (for instance, '-dev'). During the installation, the installer tries to create a namespace whose name starts with '-' (for instance, '-dev'). However, the naming rules for namespaces do not allow this.

This namespace is created for future use, such as identification of realms, or magasin workloads.

The constraints come from RFC 1123 label names (Kubernetes creates a label when a new namespace is created):
https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names

  • metadata.name: Invalid value: "_dev": a lowercase RFC 1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')
  • metadata.labels: Invalid value: "dev": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')

Options:

  1. Do not allow the suffix only realms
  2. Add some kind of prefix in the main realm name such as "m--".
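
Whichever option is chosen, the namespace name could be checked before calling the cluster. The sketch below applies the RFC 1123 label pattern that Kubernetes reports in its validation errors, together with the 63-character limit for DNS labels; `is_valid_namespace` is a hypothetical helper, not existing installer code.

```python
import re

# Pattern Kubernetes reports for lowercase RFC 1123 labels.
RFC1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LENGTH = 63  # maximum length of a DNS label name


def is_valid_namespace(name):
    """Return True if `name` is usable as a Kubernetes namespace name."""
    return len(name) <= MAX_LABEL_LENGTH and RFC1123_LABEL.fullmatch(name) is not None


for candidate in ("magasin-dev", "magasin", "-dev"):
    print(candidate, is_valid_namespace(candidate))
```

A check like this would let the installer stop with a clear message for a suffix-only realm instead of failing partway through the namespace creation.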

Enhancements in Contributing and support Documentation

Contributing

  • Ways of contributing: so far the documentation focuses on developers, but there are other ways of contributing, such as filing an issue, writing about magasin, or sharing it on social networks.
  • How to report an issue + Issue template
  • Code of conduct

Support

  • Include the ways of getting support from the community and the kind of support that is available.
  • Include internal support within UNICEF.

Add administrator guides

There are some tasks that are commonly done in the different components such as adding a new store in drill or adding a new datasource in superset.

This needs to be broken down and expanded.

Web publishing Github action does not include install and uninstall script

The Quarto publish action wipes out the gh-pages branch.

As a result, when a new version is deployed, anything that is not managed by Quarto is removed.
The current version does manage to keep the index.yaml file, but does not do the same with the install and uninstall scripts.

If a new version of the website is released, it breaks the curl-piped install and uninstall, as the files are removed.

curl -sSL https://unicef.github.io/magasin/install-magasin.sh | bash

Automate changes in superset values.yaml

There are three items in values.yaml that need to be changed from the defaults of the superset helm chart:

  • bootstrapScript: |
    pip install sqlalchemy-drill
  • extraSecretEnv SUPERSET_SECRET_KEY: 'LNS78tGfwCYyiIggQxLBviTT83itPTgbho822Wr9IWjo/cNcojL/CZK5'
  • initImage:
    repository: apache/superset
    tag: ea6cbcef3e980e42d3a19b8fc5928f973fd7be4a-dockerize-linux-amd64-3.9-slim-bookworm
    pullPolicy: IfNotPresent
    (No longer needed, multi-arch image added when fixing #50)

Add security.txt to publish-web deployment

Security.txt provides a machine-readable file which defines core attributes of your VDP, and is designed to be hosted on websites.

The security.txt file should be placed under the /.well-known/ path (/.well-known/security.txt) on websites. It can also be placed in the root directory (/security.txt) of a website, especially if the /.well-known/ directory cannot be used for technical reasons or as a fallback. The file can be placed in both locations of a website at the same time.

For more information, visit https://securitytxt.org/ and the associated RFCs: RFC 9116 (security.txt) and RFC 8615 (well-known URIs).

The file already exists at /docs/security.txt. It has to be added to the publish-web GitHub action.

Error: Unable to create magasin-dagster namespace using manual installation

During a manual installation, running the helm command below to create the magasin-dagster namespace returns an error.

helm install dagster magasin/dagster --namespace magasin-dagster --create-namespace

Error returned:

Error: INSTALLATION FAILED: template: dagster/templates/deployment-webserver.yaml:3:4: executing "dagster/templates/deployment-webserver.yaml" at <include "deployment-webserver" $data>: error calling include: template: dagster/templates/helpers/_deployment-webserver.tpl:36:38: executing "deployment-webserver" at <include (print $.Template.BasePath "/configmap-instance.yaml") .>: error calling include: template: dagster/templates/configmap-instance.yaml:16:12: executing "dagster/templates/configmap-instance.yaml" at <include "dagsterYaml.scheduler.daemon" .>: error calling include: template: no template "dagsterYaml.scheduler.daemon" associated with template "gotpl"

Additional context:
kubectl version
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.4

Install mentions twice helm working

[ i ] Verifying commands are working...
[ ✓ ] kubectl is working
[ ✓ ] helm is working
[ ✓ ] helm is working
[ i ] magasin helm repository already exists. Resetting it...
[ i ] Running: helm repo remove magasin; helm repo add magasin http://unicef.github.io/magasin 

mag-cli support for address in port forward

When magsh was introduced (PR #58), supporting port forwarding from the Docker image, so that the UI can be opened on localhost, required setting --address 0.0.0.0 in the kubectl command of mag <component> ui.
Adding --address 0.0.0.0 makes kubectl open the port on all interfaces. With Docker, the connection path is:

 [Kubernetes cluster service] <------> [docker-image (runs kubectl port-forward) docker-ip ]<----->[docker host: localhost] 

Before this change, kubectl only listened on localhost.

Opening the port on all interfaces is not an issue in the Docker image, but when mag-cli runs directly on a computer it opens the ports on all of that computer's interfaces, which may create an attack vector.

Potential solutions

  1. Add an --address parameter to mag <component> ui and to any command that forwards a port, so that by default it listens on localhost but there is an option to make kubectl listen on all interfaces.

  2. Test docker run with --network=host when launching magsh.

  3. Add a config setting for the default behaviour, so that the address to listen on by default can be enforced through config.

  4. Make the mag client smarter: if it cannot listen on localhost (which happens in the image), try to listen on the (hostname -i) address.
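
Options 1 and 3 could be combined along these lines. This is a sketch only: `resolve_address` and `port_forward_command` are hypothetical helpers, the service name, namespace, and ports are invented for illustration, and the only kubectl features relied on are the port-forward subcommand with its --address and --namespace flags.

```python
DEFAULT_ADDRESS = "localhost"  # kubectl's own default for port-forward


def resolve_address(cli_value=None, config_value=None):
    """Pick the listen address: CLI option first, then config, then localhost."""
    return cli_value or config_value or DEFAULT_ADDRESS


def port_forward_command(service, namespace, local_port, remote_port,
                         address=DEFAULT_ADDRESS):
    """Build the kubectl argument list for forwarding a service port."""
    return [
        "kubectl", "port-forward",
        "--namespace", namespace,
        "--address", address,
        f"service/{service}",
        f"{local_port}:{remote_port}",
    ]


# Hypothetical service, namespace, and ports, for illustration only.
cmd = port_forward_command("superset", "magasin-superset", 8088, 8088,
                           address=resolve_address(cli_value="0.0.0.0"))
print(" ".join(cmd))
```

With this shape, a plain installation keeps listening on localhost, while the magsh image could set the address through its config or pass it explicitly on the command line.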
