quannadev / lakefs-samples

This project forked from treeverse/lakefs-samples

lakefs-samples repository

License: Apache License 2.0

Shell 0.03% Python 9.37% Lua 0.83% Jupyter Notebook 89.29% Dockerfile 0.48%

lakefs-samples

Check notebooks

Incorporating the Docker Compose formerly known as Everything Bagel.

This sample repository contains a collection of notebooks, Dockerized applications, and code snippets that demonstrate how to use lakeFS.

lakeFS is a popular open-source solution for managing data. It provides a consistent and scalable data management layer on top of cloud storage, such as Amazon S3, Azure Blob Storage, or Google Cloud Storage. It allows users to create and manage data in a version-controlled and immutable manner, and offers features such as data governance, data lineage, and data access controls. lakeFS is compatible with a wide range of data processing frameworks and tools.

Let's Get Started 👩🏻‍💻

Clone this repository

git clone https://github.com/treeverse/lakeFS-samples.git
cd lakeFS-samples

You now have two options:

Run a Notebook server with your existing lakeFS Server

If you have already installed lakeFS or are using lakeFS Cloud, all you need to run is the Jupyter Notebook server:

docker compose up jupyter-notebook

Once the stack's up and running, open the Jupyter Notebook (http://localhost:8888) and check out the catalog of sample notebooks to explore lakeFS.
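If you start the stack from a script, you may want to wait until Jupyter actually answers before opening it. The following is a minimal sketch using only the Python standard library; the helper name `wait_for_url` and its default timings are illustrative assumptions, not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_url(url, timeout=60.0, interval=2.0):
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    Hypothetical helper for illustration; not part of lakeFS-samples.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True  # server answered with a success status
        except urllib.error.HTTPError:
            # The server is up but returned an error status (e.g. 403).
            return True
        except (urllib.error.URLError, OSError):
            pass  # nothing is listening yet; retry below
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_url("http://localhost:8888")` returns `True` once the notebook server responds, or `False` if it is still unreachable after the timeout.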

Once you've finished, run the following to remove all the containers:

docker compose down

Don't have a lakeFS Server or Object Store?

If you want to provision a lakeFS server and MinIO as your object store, plus Jupyter, bring up the full stack:

# make sure we have got the lakeFS hooks content too
git submodule init
git submodule update

docker compose up

As above, open the Jupyter Notebook (http://localhost:8888) and peruse the catalog of sample notebooks to explore lakeFS.

Environment Details

  • Jupyter Notebook is based on the Jupyter PySpark notebook and provides an interactive environment in which to explore lakeFS using Python and PySpark.
  • lakeFS can be provisioned as part of this environment, or provided by lakeFS cloud or your own installation.
  • If you run lakeFS as part of this environment, MinIO is provided as an S3-compatible object store. If you run lakeFS yourself, you can use other S3-compatible object stores, including S3, GCS, and MinIO.
  • A sample lakeFS webhooks server is provided, configured to work with the provided lakeFS server.
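To make the PySpark side more concrete, here is a minimal sketch of how a Spark session might be pointed at lakeFS through its S3-compatible endpoint using standard Hadoop S3A settings. The helper names (`s3a_conf`, `lakefs_uri`), the endpoint `http://lakefs:8000`, the repository name, and the credentials are all placeholders assumed for illustration. The sample notebooks may configure this differently, so check them for the actual values.

```python
def s3a_conf(endpoint, access_key, secret_key):
    """Build Spark settings that route s3a:// traffic to an S3-compatible endpoint.

    Hypothetical helper; the keys are standard Hadoop S3A options.
    """
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Path-style access keeps the repository name in the path, not the hostname.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def lakefs_uri(repo, ref, path=""):
    """Compose an s3a URI of the form s3a://<repository>/<branch-or-ref>/<path>."""
    return "s3a://{}/{}/{}".format(repo, ref, path.lstrip("/"))


# Placeholder values (assumptions); replace with your own endpoint and keys.
conf = s3a_conf("http://lakefs:8000", "<access-key-id>", "<secret-access-key>")
uri = lakefs_uri("example-repo", "main", "data/events.parquet")

# Possible usage inside a notebook where PySpark is available:
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder.appName("lakefs-sketch")
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   df = spark.read.parquet(uri)
```

Keeping the settings in a plain dictionary makes it easy to switch between the bundled lakeFS server and your own installation by changing only the endpoint and credentials.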

Containers

URLs and login details

If you've brought up the full stack you'll also have:

Other Examples

Under the standalone_examples folder are a set of examples that need to be run on their own. Some use the repository's Docker Compose file and extend it, and others are self-contained and use their own Dockerfile.
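To see at a glance which examples bring their own container setup, a small script can scan the folder. This is only a sketch: the helper `classify_examples`, the file names it looks for, and the labels it assigns are assumptions made for illustration, and the actual layout of each example may differ.

```python
from pathlib import Path

# Common Compose file names (assumed; an example may use a different name).
COMPOSE_NAMES = (
    "docker-compose.yml",
    "docker-compose.yaml",
    "compose.yml",
    "compose.yaml",
)


def classify_examples(root):
    """Map each sub-folder of `root` to a rough, hypothetical category."""
    result = {}
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue  # ignore loose files such as a README
        has_compose = any((entry / name).is_file() for name in COMPOSE_NAMES)
        has_dockerfile = (entry / "Dockerfile").is_file()
        if has_compose:
            result[entry.name] = "compose"
        elif has_dockerfile:
            result[entry.name] = "dockerfile"
        else:
            result[entry.name] = "other"
    return result
```

Running `classify_examples("standalone_examples")` from the repository root returns a dictionary keyed by folder name. Treat the result as a starting point and read each example's own instructions before running it.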

Got Questions or Want to Chat?

👉🏻 Join the lakeFS Slack group - https://lakefs.io/slack

Contributors

iddoavn, kesarwam, vinodhini-sd, adipolak, rmoff, ozkatz, itaiad200, johnnyaug
