
sidharthbolar / jumpspark


This project forked from spratiher9/jumpspark


JumpSpark - A modern cookiecutter template for pyspark projects with batteries included.

License: MIT License

Shell 4.32% Python 80.52% Makefile 15.16%

jumpspark's Introduction

Project JumpSpark - A spark-cookiecutter template with batteries included

JumpSpark

Powered by Cookiecutter, JumpSpark (Spark-Cookiecutter) is a framework for jumpstarting production-ready PySpark projects quickly, with a sample Spark codebase and test cases included.

Get started today with cookiecutter gh:Spratiher9/JumpSpark

Features


  • Modern project structure powered by Poetry
  • Pre-configured virtual environment with batteries included
  • Supports the modern PySpark ecosystem
    1. Quinn - PySpark helper functions to enhance developer productivity
    2. Chispa - PySpark test helper methods with beautiful error messages
    3. Many more updates coming ...
  • Prepackaged sample codebase for a quick start

Quickstart


Install the latest Cookiecutter if you haven't installed it yet (this requires Cookiecutter 1.4.0 or higher)::

pip install -U cookiecutter

Navigate to your project's directory location and generate with cookiecutter::

cookiecutter gh:Spratiher9/JumpSpark

Enter the relevant details (here, angelou is an example project)::

project_name [new-project]: angelou
package_name [angelou]: 
project_version [0.1.0]: 
full_name [Your Name]: Souvik Pratiher
email [Your Email]: [email protected]
github_username [github_username]: spratiher9
project_description [This is a pyspark project]: PySpark on Poetry example
python_version [3.9.6]: 
line_length [88]: 
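The same answers can also be supplied without interactive prompts through Cookiecutter's Python API. The following is a minimal sketch, not part of the template itself. It assumes the prompt names shown above are the template's variable names, and that any prompt left out (package_name, project_version, python_version, line_length) falls back to its default. The helper names `build_context` and `generate` are hypothetical, and the email address is a placeholder.

```python
from typing import Dict, Optional


def build_context(
    project_name: str,
    full_name: str,
    email: str,
    github_username: str,
    project_description: Optional[str] = None,
) -> Dict[str, str]:
    """Collect answers keyed by the prompt names shown above.

    Prompts that are not listed here are assumed to keep the
    template's defaults.
    """
    context = {
        "project_name": project_name,
        "full_name": full_name,
        "email": email,
        "github_username": github_username,
    }
    if project_description is not None:
        context["project_description"] = project_description
    return context


def generate(context: Dict[str, str], output_dir: str = ".") -> str:
    """Render the template non-interactively and return the project path."""
    # Imported lazily so build_context works without cookiecutter installed.
    from cookiecutter.main import cookiecutter

    return cookiecutter(
        "gh:Spratiher9/JumpSpark",
        no_input=True,          # skip the interactive prompts
        extra_context=context,  # override defaults with our answers
        output_dir=output_dir,
    )


if __name__ == "__main__":
    answers = build_context(
        project_name="angelou",
        full_name="Souvik Pratiher",
        email="you@example.com",  # placeholder, not taken from the README
        github_username="spratiher9",
        project_description="PySpark on Poetry example",
    )
    print(generate(answers))
```

Running the script requires network access and an installed Cookiecutter; `build_context` on its own has no such dependency.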

The following project structure will be generated in your project's directory location::

angelou
|-- angelou
|   |--- __init__.py
|   |--- sparksession.py
|   |--- transformations.py
|
|-- mkdir
|-- pyproject.toml
|-- LICENSE
|-- poetry.lock
|-- README.md
|-- tests
    |---  __init__.py
    |---  conftest.py
    |---  test_angelou.py
    |---  test_compare_dataframes.py
    |---  test_transformations.py
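To confirm that generation produced the layout above, a small standard-library check can help. This is a sketch under two assumptions: that the test file `test_angelou.py` is named after the package (`test_<package_name>.py`), and that the `mkdir` entry in the listing can be skipped, since it is unclear what it refers to. The function names `expected_paths` and `missing_paths` are hypothetical.

```python
from pathlib import Path
from typing import List


def expected_paths(project_name: str, package_name: str) -> List[Path]:
    """Relative paths taken from the project tree shown above."""
    root = Path(project_name)
    return [
        root / package_name / "__init__.py",
        root / package_name / "sparksession.py",
        root / package_name / "transformations.py",
        root / "pyproject.toml",
        root / "LICENSE",
        root / "poetry.lock",
        root / "README.md",
        root / "tests" / "__init__.py",
        root / "tests" / "conftest.py",
        # Assumption: this test module is named after the package.
        root / "tests" / f"test_{package_name}.py",
        root / "tests" / "test_compare_dataframes.py",
        root / "tests" / "test_transformations.py",
    ]


def missing_paths(base_dir: str, project_name: str, package_name: str) -> List[Path]:
    """Return the expected paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [
        path
        for path in expected_paths(project_name, package_name)
        if not (base / path).exists()
    ]


if __name__ == "__main__":
    # Run from the directory where cookiecutter was invoked.
    for path in missing_paths(".", "angelou", "angelou"):
        print(f"missing: {path}")
```

An empty result means every file from the listing is present.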

What's next

In the coming days the cookiecutter template will be updated with:

  • Support for more PySpark related packages
  • Support for CI/CD DevOps pipeline samples

If you have an idea for contributing to the project, go for it.

Fork the project
      |
      V
Contribute your enhancements/features
      |
      V
Raise PR to merge

Cheers!!

jumpspark's People

Contributors

souvik-databricks, spratiher9, mrpowers
