
jgoerner / data-science-stack-cookiecutter


🐳📊🤓 Cookiecutter template to launch an awesome dockerized Data Science toolstack (incl. Jupyter, Superset, Postgres, Minio, Airflow & API Star)

License: MIT License

Python 12.06% Shell 19.42% Jupyter Notebook 61.09% Dockerfile 7.43%
data-science docker python jupyter superset airflow apistar postgres minio docker-image

data-science-stack-cookiecutter's Introduction

Just do nail it 👌

ℹ️ TL;DR

// Staff Product Engineer // Ex Data-Scientist // Online Tutor // Open Source Enthusiast // Father //


🎯 Personal Mantra

Bias towards action + excellent execution = Just do nail it 👌


📜 Experience

In the past decade I was able to work with smart & humble people...
    ... across various industries like pharma & health care in general, automotive or aerospace
    ... within global organizations (100.000+ people) as well as seed-funded startups
    ... on throw-away POCs as well as customer facing and globally used applications


🦸‍ Personal Superpower

I would describe myself as a T-shaped generalist working between product & engineering.

So Josh, what makes you special?

I guess my personal niche of expertise is relentless innovation. While I'm not the best product manager, engineer or designer I do excel when pulling methods & tools from all of them together to craft new and delightful experiences 👌


🔑 Key Achievements

Some of my personal professional highlights include that I ...
    ... bootstrapped a cross-functional team & led the full product lifecycle, launching a new product in < 6 months with ~ 30% WAU
    ... co-authored a patent for keyless car access - deployed in global BMW fleet since mid 2020
    ... authored the MOOC Beyond Jupyter Notebooks - as of today ~ 3.000 students and 4.9 ⭐️ rating

data-science-stack-cookiecutter's People

Contributors

jgoerner

data-science-stack-cookiecutter's Issues

Running in WSL on Windows 10

Hi,

I'm trying to get this up and running using the Windows Subsystem for Linux. Three of the containers keep restarting. After debugging a little, I found the following in the output of docker-compose (build, then up):

postgres_1_44290de0f346 | 2018-11-26 23:03:17.667 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres_1_44290de0f346 | 2018-11-26 23:03:17.667 UTC [1] LOG:  listening on IPv6 address "::", port 5432
apistar_1_6e8d837f2137 | [2018-11-26 23:03:17 +0000] [1] [INFO] Starting gunicorn 19.8.1
apistar_1_6e8d837f2137 | [2018-11-26 23:03:17 +0000] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
apistar_1_6e8d837f2137 | [2018-11-26 23:03:17 +0000] [1] [INFO] Using worker: sync
apistar_1_6e8d837f2137 | [2018-11-26 23:03:17 +0000] [7] [INFO] Booting worker with pid: 7
apistar_1_6e8d837f2137 | [2018-11-26 23:03:17 +0000] [7] [ERROR] Exception in worker process
apistar_1_6e8d837f2137 | Traceback (most recent call last):
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
apistar_1_6e8d837f2137 |     worker.init_process()
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/base.py", line 129, in init_process
apistar_1_6e8d837f2137 |     self.load_wsgi()
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/workers/base.py", line 138, in load_wsgi
apistar_1_6e8d837f2137 |     self.wsgi = self.app.wsgi()
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/app/base.py", line 67, in wsgi
apistar_1_6e8d837f2137 |     self.callable = self.load()
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 52, in load
apistar_1_6e8d837f2137 |     return self.load_wsgiapp()
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
apistar_1_6e8d837f2137 |     return util.import_app(self.app_uri)
apistar_1_6e8d837f2137 |   File "/usr/local/lib/python3.6/site-packages/gunicorn/util.py", line 350, in import_app
apistar_1_6e8d837f2137 |     __import__(module)
apistar_1_6e8d837f2137 | ModuleNotFoundError: No module named 'app'

Another thing that might be relevant was this (when I ran docker-compose build apistar):

Step 3/4 : RUN pip install --proxy=${http_proxy}    psycopg2-binary    minio
 ---> Running in e8e60c7b69e4
The directory '/home/jovyan/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/jovyan/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.

Any ideas?
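
The traceback above ends with gunicorn failing to import a module named 'app'. One way to narrow this down is to check whether that module is actually visible in the directory gunicorn starts from, for example if a mounted volume turned out to be empty. The following is only a diagnostic sketch, not part of the template: the helper name module_present is hypothetical, and it assumes the gunicorn target is a top-level module located in the container's working directory.

```python
import importlib
import importlib.machinery
import os


def module_present(module_name: str, search_dir: str) -> bool:
    """Return True if `module_name` can be located directly inside `search_dir`.

    Hypothetical helper: it only looks in the given directory, not on sys.path.
    """
    # Drop cached directory listings so freshly created files are seen.
    importlib.invalidate_caches()
    spec = importlib.machinery.PathFinder.find_spec(module_name, [search_dir])
    return spec is not None


if __name__ == "__main__":
    # Run from the directory gunicorn is started in.
    print(module_present("app", os.getcwd()))
```

If this prints False inside the running container, the problem is likely that the application file never made it into the container, rather than anything in gunicorn itself.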

Error in Build, step 6

I am facing this error in step 6 of the build: pip cannot install SciPy.

Step 6/8 : RUN pip install --no-cache-dir --proxy=${http_proxy}  apistar==0.5.41   gunicorn==19.8.1   dill===0.2.7.1   minio===4.0.0   psycopg2-binary===2.7.4   scipy   scikit-learn
 ---> Running in 1cbfdda6edf0
Collecting apistar==0.5.41
  Downloading https://files.pythonhosted.org/packages/59/ca/9c559fd9b4fab2c98203555b1ff628a54b9cb32896d017db8cc1f1e626a4/apistar-0.5.41.tar.gz (587kB)
Collecting gunicorn==19.8.1
  Downloading https://files.pythonhosted.org/packages/55/cb/09fe80bddf30be86abfc06ccb1154f97d6c64bb87111de066a5fc9ccb937/gunicorn-19.8.1-py2.py3-none-any.whl (112kB)
Collecting dill===0.2.7.1
  Downloading https://files.pythonhosted.org/packages/91/a0/19d4d31dee064fc553ae01263b5c55e7fb93daff03a69debbedee647c5a0/dill-0.2.7.1.tar.gz (64kB)
Collecting minio===4.0.0
  Downloading https://files.pythonhosted.org/packages/bb/82/7137e57d625b362756ef28e7c44cd32b5c03b87b02057e642062187011e1/minio-4.0.0-py2.py3-none-any.whl (50kB)
Collecting psycopg2-binary===2.7.4
  Downloading https://files.pythonhosted.org/packages/77/09/4991fcd9a8f4bea1ee3948e1729fa17c184d25bd10809bacc143626361b9/psycopg2-binary-2.7.4.tar.gz (426kB)
Collecting scipy
  Downloading https://files.pythonhosted.org/packages/04/ab/e2eb3e3f90b9363040a3d885ccc5c79fe20c5b8a3caa8fe3bf47ff653260/scipy-1.4.1.tar.gz (24.6MB)
  Installing build dependencies: started
  Installing build dependencies: still running...
  Installing build dependencies: still running...
  Installing build dependencies: still running...
  Installing build dependencies: still running...
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
    Preparing wheel metadata: started
    Preparing wheel metadata: finished with status 'error'
    ERROR: Command errored out with exit status 1:
     command: /usr/local/bin/python /usr/local/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpk3u7ce_a
         cwd: /tmp/pip-install-fry1dkls/scipy
    Complete output (124 lines):
    lapack_opt_info:
    lapack_mkl_info:
      libraries mkl_rt not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/']
      NOT AVAILABLE
    
    openblas_lapack_info:
    customize UnixCCompiler
    C compiler: gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -DTHREAD_STACK_SIZE=0x100000 -fPIC
    
    creating /tmp/tmpvqyj5o7z/tmp
    creating /tmp/tmpvqyj5o7z/tmp/tmpvqyj5o7z
    compile options: '-c'
    gcc: /tmp/tmpvqyj5o7z/source.c
    /tmp/tmpvqyj5o7z/source.c: In function 'main':
    /tmp/tmpvqyj5o7z/source.c:4:13: warning: implicit declaration of function 'zungqr_'; did you mean 'zungqr'? [-Wimplicit-function-declaration]
        4 |             zungqr_();
          |             ^~~~~~~
          |             zungqr
    gcc /tmp/tmpvqyj5o7z/tmp/tmpvqyj5o7z/source.o -L/usr/lib -lopenblas -o /tmp/tmpvqyj5o7z/a.out
    /usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: /tmp/tmpvqyj5o7z/tmp/tmpvqyj5o7z/source.o: in function `main':
    /tmp/tmpvqyj5o7z/source.c:4: undefined reference to `zungqr_'
    collect2: error: ld returned 1 exit status
    /usr/lib/gcc/x86_64-alpine-linux-musl/9.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: /tmp/tmpvqyj5o7z/tmp/tmpvqyj5o7z/source.o: in function `main':
    /tmp/tmpvqyj5o7z/source.c:4: undefined reference to `zungqr_'
    collect2: error: ld returned 1 exit status
      NOT AVAILABLE
    
    system_info:
      NOT AVAILABLE
    
    atlas_3_10_threads_info:
    Setting PTATLAS=ATLAS
      libraries tatlas,tatlas not found in /usr/local/lib
      libraries lapack_atlas not found in /usr/local/lib
      libraries tatlas,tatlas not found in /usr/lib
      libraries lapack_atlas not found in /usr/lib
      libraries tatlas,tatlas not found in /usr/lib/
      libraries lapack_atlas not found in /usr/lib/
    <class 'numpy.distutils.system_info.atlas_3_10_threads_info'>
      NOT AVAILABLE
    
    atlas_3_10_info:
      libraries satlas,satlas not found in /usr/local/lib
      libraries lapack_atlas not found in /usr/local/lib
      libraries satlas,satlas not found in /usr/lib
      libraries lapack_atlas not found in /usr/lib
      libraries satlas,satlas not found in /usr/lib/
      libraries lapack_atlas not found in /usr/lib/
    <class 'numpy.distutils.system_info.atlas_3_10_info'>
      NOT AVAILABLE
    
    atlas_threads_info:
    Setting PTATLAS=ATLAS
      libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib
      libraries lapack_atlas not found in /usr/local/lib
      libraries ptf77blas,ptcblas,atlas not found in /usr/lib
      libraries lapack_atlas not found in /usr/lib
      libraries ptf77blas,ptcblas,atlas not found in /usr/lib/
      libraries lapack_atlas not found in /usr/lib/
    <class 'numpy.distutils.system_info.atlas_threads_info'>
      NOT AVAILABLE
    
    atlas_info:
      libraries f77blas,cblas,atlas not found in /usr/local/lib
      libraries lapack_atlas not found in /usr/local/lib
      libraries f77blas,cblas,atlas not found in /usr/lib
      libraries lapack_atlas not found in /usr/lib
      libraries f77blas,cblas,atlas not found in /usr/lib/
      libraries lapack_atlas not found in /usr/lib/
    <class 'numpy.distutils.system_info.atlas_info'>
      NOT AVAILABLE
    
    lapack_info:
      libraries lapack not found in ['/usr/local/lib', '/usr/lib', '/usr/lib/']
      NOT AVAILABLE
    
    lapack_src_info:
      NOT AVAILABLE
    
      NOT AVAILABLE
    
    setup.py:420: UserWarning: Unrecognized setuptools command ('dist_info --egg-base /tmp/pip-modern-metadata-j8946u5_'), proceeding with generating Cython sources and expanding templates
      ' '.join(sys.argv[1:])))
    Running from scipy source directory.
    /tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/numpy/distutils/system_info.py:572: UserWarning:
        Atlas (http://math-atlas.sourceforge.net/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [atlas]) or by setting
        the ATLAS environment variable.
      self.calc_info()
    /tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/numpy/distutils/system_info.py:572: UserWarning:
        Lapack (http://www.netlib.org/lapack/) libraries not found.
        Directories to search for the libraries can be specified in the
        numpy/distutils/site.cfg file (section [lapack]) or by setting
        the LAPACK environment variable.
      self.calc_info()
    /tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/numpy/distutils/system_info.py:572: UserWarning:
        Lapack (http://www.netlib.org/lapack/) sources not found.
        Directories to search for the sources can be specified in the
        numpy/distutils/site.cfg file (section [lapack_src]) or by setting
        the LAPACK_SRC environment variable.
      self.calc_info()
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 257, in <module>
        main()
      File "/usr/local/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 240, in main
        json_out['return_val'] = hook(**hook_input['kwargs'])
      File "/usr/local/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py", line 110, in prepare_metadata_for_build_wheel
        return hook(metadata_directory, config_settings)
      File "/tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 156, in prepare_metadata_for_build_wheel
        self.run_setup()
      File "/tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 237, in run_setup
        self).run_setup(setup_script=setup_script)
      File "/tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/setuptools/build_meta.py", line 142, in run_setup
        exec(compile(code, __file__, 'exec'), locals())
      File "setup.py", line 540, in <module>
        setup_package()
      File "setup.py", line 536, in setup_package
        setup(**metadata)
      File "/tmp/pip-build-env-fl7ferm4/overlay/lib/python3.6/site-packages/numpy/distutils/core.py", line 135, in setup
        config = configuration()
      File "setup.py", line 435, in configuration
        raise NotFoundError(msg)
    numpy.distutils.system_info.NotFoundError: No lapack/blas resources found.
    ----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python /usr/local/lib/python3.6/site-packages/pip/_vendor/pep517/_in_process.py prepare_metadata_for_build_wheel /tmp/tmpk3u7ce_a Check the logs for full command output.
ERROR: Service 'apistar' failed to build: The command '/bin/sh -c pip install --no-cache-dir --proxy=${http_proxy}  apistar==0.5.41   gunicorn==19.8.1   dill===0.2.7.1   minio===4.0.0   psycopg2-binary===2.7.4   scipy   scikit-learn' returned a non-zero code: 1

Any idea how to fix the issue?
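
In the log above, pip downloads scipy-1.4.1.tar.gz and builds it from source on an Alpine (musl) image, and the decisive line is the final NotFoundError about lapack/blas, buried under 124 lines of output. As a reading aid only, here is a minimal sketch that pulls the last Python exception line out of such a log. The function name last_exception_line is hypothetical, and the pattern assumes the exception is printed as `Name: message` at the start of a (possibly indented) line.

```python
import re
from typing import Optional, Tuple

# Matches lines such as
#   "    numpy.distutils.system_info.NotFoundError: No lapack/blas resources found."
# but not pip's own upper-case "ERROR: ..." summary lines.
_EXC_LINE = re.compile(
    r"^[ \t]*([A-Za-z_][\w.]*(?:Error|Exception)): (.*)$", re.MULTILINE
)


def last_exception_line(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (exception name, message) for the last exception line, or None."""
    matches = _EXC_LINE.findall(log_text)
    return matches[-1] if matches else None
```

Applied to the output above, this would point at the missing BLAS/LAPACK libraries as the reason the source build stops. It does not fix anything by itself.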

Missing Cython to build apistar service

I got an error running docker-compose build:

    File "<string>", line 1, in <module>
    File "/tmp/pip-install-t5i6nxmf/scikit-learn/setup.py", line 290, in <module>
      setup_package()
    File "/tmp/pip-install-t5i6nxmf/scikit-learn/setup.py", line 286, in setup_package
      setup(**metadata)
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/core.py", line 135, in setup
      config = configuration()
    File "/tmp/pip-install-t5i6nxmf/scikit-learn/setup.py", line 174, in configuration
      config.add_subpackage('sklearn')
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 1024, in add_subpackage
      caller_level = 2)
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 993, in get_subpackage
      caller_level = caller_level + 1)
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 930, in _get_configuration_from_setup_py
      config = setup_module.configuration(*args)
    File "sklearn/setup.py", line 66, in configuration
      config.add_subpackage('utils')
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 1024, in add_subpackage
      caller_level = 2)
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 993, in get_subpackage
      caller_level = caller_level + 1)
    File "/usr/local/lib/python3.6/site-packages/numpy/distutils/misc_util.py", line 930, in _get_configuration_from_setup_py
      config = setup_module.configuration(*args)
    File "sklearn/utils/setup.py", line 8, in configuration
      from Cython import Tempita
  ModuleNotFoundError: No module named 'Cython'
  ----------------------------------------
  ERROR: Failed building wheel for scikit-learn
  Running setup.py clean for scikit-learn
Failed to build scikit-learn

I added cython to the second-last pip install in services/apistar/Dockerfile and it built successfully.
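
A quick way to confirm this kind of problem before a long build is to check which build-time modules are importable in the image's Python environment. This is a minimal sketch under that assumption. The helper name missing_modules is hypothetical, and the names passed in must be import names (for example "Cython"), not pip package names.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Modules the scikit-learn source build in the traceback relies on.
    print(missing_modules(["numpy", "Cython"]))
```

An empty list means both modules are available. If "Cython" is listed, the environment matches the failure in the traceback above.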
