
heroku / buildpacks-python


Heroku's Cloud Native Buildpack for Python applications.

License: BSD 3-Clause "New" or "Revised" License

Rust 100.00%
heroku cloud-native-buildpacks python heroku-languages

buildpacks-python's Introduction

Heroku Cloud Native Buildpack: Python


Heroku Cloud Native Buildpack: heroku/python

heroku/python is the Heroku Cloud Native Buildpack for Python applications. It builds Python application source code into application images with minimal configuration.

Important

This is a Cloud Native Buildpack, and is a component of the Heroku Cloud Native Buildpacks project, which is in beta. If you are instead looking for the Heroku Classic Buildpack for Python (for use on the Heroku platform), you may find it here.

Usage

Note

Before getting started, ensure you have the pack CLI installed. Installation instructions are available here.

To build a Python application codebase into a production image:

$ cd ~/workdir/sample-python-app
$ pack build sample-app --builder heroku/builder:22

Then run the image:

docker run --rm -it -e "PORT=8080" -p 8080:8080 sample-app

Application Requirements

A requirements.txt file must be present at the root of your application's repository.

Configuration

Python Version

By default, the buildpack will install the latest version of Python 3.12.

To install a different version, add a runtime.txt file to your app's root directory that declares the exact version number to use:

$ cat runtime.txt
python-3.12.2

In the future this buildpack will also support specifying the Python version via a .python-version file (see #6).
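As an illustration of the exact-version format above, here is a minimal Rust sketch of how a `python-X.Y.Z` line could be parsed. The `PythonVersion` struct and `parse_runtime_txt` function are hypothetical names used for this example only; they are not taken from the buildpack's source.

```rust
/// A fully specified Python version, for example 3.12.2.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub struct PythonVersion {
    pub major: u16,
    pub minor: u16,
    pub patch: u16,
}

/// Parses the contents of a `runtime.txt` file of the form `python-X.Y.Z`.
/// Returns `None` when the contents do not match that exact shape.
pub fn parse_runtime_txt(contents: &str) -> Option<PythonVersion> {
    // Ignore surrounding whitespace such as the trailing newline.
    let version = contents.trim().strip_prefix("python-")?;
    let mut parts = version.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    // Reject extra components such as `3.12.2.1`.
    if parts.next().is_some() {
        return None;
    }
    Some(PythonVersion { major, minor, patch })
}
```

Under these assumptions, `python-3.12.2` parses successfully, while `python-3.12` (no patch component) and `3.12.2` (no prefix) are rejected.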

Contributing

Issues and pull requests are welcome. See our contributing guidelines if you would like to help.

buildpacks-python's People

Contributors

colincasey, dependabot[bot], edmorley, heroku-linguist[bot], joshwlewis


buildpacks-python's Issues

Support for `pre_compile` and `post_compile` steps

Just tested out v0.1.0 and noticed this was missing. It looks like you have some other issues to get it to parity with the legacy buildpack, so just dropping this one in too.

Thanks for your work on this!

Support the Pipenv package manager

The classic Python buildpack currently supports the package manager Pipenv:
https://pipenv.pypa.io
https://github.com/heroku/heroku-buildpack-python/blob/main/bin/steps/pipenv-python-version
https://github.com/heroku/heroku-buildpack-python/blob/main/bin/steps/pipenv

We should decide whether we want to still support it in the CNB, or whether Pipenv's declining usage (and mixed stability/issues upstream) mean we would rather only support Pip + Poetry instead.

Internal tracking epic

Automatic Python patch version updates

Currently the user has to either:
(a) not specify a Python version (in which case they get the default)
(b) specify an exact version (such as 3.10.5)

To make it easier for users to stay on an up-to-date Python release, it would be helpful if we also supported specifying just a major version (e.g. 3.11), which the buildpack would automatically map to the latest patch release.
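The mapping described here could look roughly like the following Rust sketch. The function name `resolve_latest_patch` and the idea of passing in a list of known releases are assumptions made for this example; the issue does not say how the buildpack would obtain or store that list.

```rust
/// Resolves a requested `(major, minor)` pair to the newest matching
/// `(major, minor, patch)` release in `known`, or `None` if nothing matches.
/// (Hypothetical helper; the list of known releases is supplied by the caller.)
pub fn resolve_latest_patch(
    requested: (u16, u16),
    known: &[(u16, u16, u16)],
) -> Option<(u16, u16, u16)> {
    known
        .iter()
        .copied()
        // Keep only releases from the requested series.
        .filter(|&(major, minor, _)| (major, minor) == requested)
        // Pick the highest patch number within that series.
        .max_by_key(|&(_, _, patch)| patch)
}
```

Given a known list containing 3.10.5, 3.11.1 and 3.11.2, a request for 3.11 would resolve to 3.11.2, and a request for a series that is not in the list would return `None` so the caller can report an error.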

See also:

Error running locally built image with pack: Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding

Not sure what I'm doing wrong, it might be a dummy thing, but when trying to run a locally built image (with pack) using the heroku/builder:22, I get the error:

dcaro@vulcanus$ docker run --rm wm-lol
[uWSGI] getting INI configuration from uwsgi.ini
*** Starting uWSGI 2.0.21 (64bit) on [Tue Mar 14 19:03:17 2023] ***
compiled with version: 11.3.0 on 01 January 1980 00:00:01
os: Linux-6.1.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
nodename: afb11a88e4cc
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /workspace
detected binary path: /layers/heroku_python/dependencies/bin/uwsgi
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address :8000 fd 3
Python version: 3.11.2 (main, Feb  8 2023, 12:54:20) [GCC 11.3.0]
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Python path configuration:
  PYTHONHOME = (not set)
  PYTHONPATH = (not set)
  program name = '/layers/heroku_python/dependencies/bin/uwsgi'
  isolated = 0
  environment = 1
  user site = 1
  safe_path = 0
  import site = 1
  is in build tree = 0
  stdlib dir = '/app/.heroku/python/lib/python3.11'
  sys._base_executable = '/layers/heroku_python/dependencies/bin/uwsgi'
  sys.base_prefix = '/app/.heroku/python'
  sys.base_exec_prefix = '/app/.heroku/python'
  sys.platlibdir = 'lib'
  sys.executable = '/layers/heroku_python/dependencies/bin/uwsgi'
  sys.prefix = '/app/.heroku/python'
  sys.exec_prefix = '/app/.heroku/python'
  sys.path = [
    '/app/.heroku/python/lib/python311.zip',
    '/app/.heroku/python/lib/python3.11',
    '/app/.heroku/python/lib/python3.11/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x00007f6228908440 (most recent call first):
  <no Python frame>

The build process works well:

dcaro@vulcanus$ pack build --builder heroku/builder:22 wm-lol
22: Pulling from heroku/builder
Digest: sha256:eb9486a5587e666f43695ff794a28056b18033fd2f6566c81137634087d23034
Status: Image is up to date for heroku/builder:22
22-cnb: Pulling from heroku/heroku
Digest: sha256:8ed2415c4e53df324c1af95e7286f1065a9cced71ea7e449574064a2c268d55c
Status: Image is up to date for heroku/heroku:22-cnb
latest: Pulling from buildpacksio/lifecycle
Digest: sha256:f75a04887fced3ae0504a37edb2c0d29d366511cd9ede34dbb90c5282b106e79
Status: Image is up to date for buildpacksio/lifecycle:latest
===> ANALYZING
[analyzer] Restoring data for SBOM from previous image
===> DETECTING
[detector] heroku/python   0.1.0
[detector] heroku/procfile 2.0.0
===> RESTORING
[restorer] Restoring metadata for "heroku/python:python" from app image
[restorer] Restoring metadata for "heroku/python:pip-cache" from cache
[restorer] Restoring metadata for "heroku/python:shim" from cache
[restorer] Restoring data for "heroku/python:pip-cache" from cache
[restorer] Restoring data for "heroku/python:python" from cache
[restorer] Restoring data for "heroku/python:shim" from cache
===> BUILDING
[builder]
[builder] [Determining Python version]
[builder] No Python version specified, using the current default of Python 3.11.2.
[builder] To use a different version, see: https://devcenter.heroku.com/articles/python-runtimes
[builder]
[builder] [Installing Python and packaging tools]
[builder] Using cached Python 3.11.2
[builder] Using cached pip 23.0.1, setuptools 67.5.0 and wheel 0.38.4
[builder]
[builder] [Installing dependencies using Pip]
[builder] Using cached pip download/wheel cache
[builder] Running pip install
[builder] Collecting flask
[builder]   Using cached Flask-2.2.3-py3-none-any.whl (101 kB)
[builder] Collecting werkzeug
[builder]   Using cached Werkzeug-2.2.3-py3-none-any.whl (233 kB)
[builder] Collecting uwsgi
[builder]   Using cached uWSGI-2.0.21-cp311-cp311-linux_x86_64.whl
[builder] Collecting Jinja2>=3.0
[builder]   Using cached Jinja2-3.1.2-py3-none-any.whl (133 kB)
[builder] Collecting itsdangerous>=2.0
[builder]   Using cached itsdangerous-2.1.2-py3-none-any.whl (15 kB)
[builder] Collecting click>=8.0
[builder]   Using cached click-8.1.3-py3-none-any.whl (96 kB)
[builder] Collecting MarkupSafe>=2.1.1
[builder]   Using cached MarkupSafe-2.1.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27 kB)
[builder] Installing collected packages: uwsgi, MarkupSafe, itsdangerous, click, werkzeug, Jinja2, flask
[builder] Successfully installed Jinja2-3.1.2 MarkupSafe-2.1.2 click-8.1.3 flask-2.2.3 itsdangerous-2.1.2 uwsgi-2.0.21 werkzeug-2.2.3
[builder]
[builder] [Discovering process types]
[builder] Procfile declares types -> web
===> EXPORTING
[exporter] Adding layer 'heroku/python:dependencies'
[exporter] Reusing layer 'heroku/python:python'
[exporter] Reusing layer 'buildpacksio/lifecycle:launch.sbom'
[exporter] Adding 1/1 app layer(s)
[exporter] Reusing layer 'buildpacksio/lifecycle:launcher'
[exporter] Adding layer 'buildpacksio/lifecycle:config'
[exporter] Reusing layer 'buildpacksio/lifecycle:process-types'
[exporter] Adding label 'io.buildpacks.lifecycle.metadata'
[exporter] Adding label 'io.buildpacks.build.metadata'
[exporter] Adding label 'io.buildpacks.project.metadata'
[exporter] Setting default process type 'web'
[exporter] Saving wm-lol...
[exporter] *** Images (f36700bd8dd2):
[exporter]       wm-lol
[exporter] Adding cache layer 'heroku/python:pip-cache'
[exporter] Reusing cache layer 'heroku/python:python'
Successfully built image wm-lol

I can run it though if I set both PYTHONHOME and LD_LIBRARY_PATH to point to the python layer dirs:

docker run --rm --env LD_LIBRARY_PATH=/layers/heroku_python/python/lib/ --env PYTHONHOME=/layers/heroku_python/python/ wm-lol

The code I'm running is https://github.com/david-caro/wm-lol (simple python app uwsgi+flask), but tested with the sample python app too with the same error

Note: moved from heroku/heroku-buildpack-python#1427

Decide whether to continue buildpack NLTK support

The classic Python buildpack supports installing NLTK corpora via an nltk.txt file:
https://devcenter.heroku.com/articles/python-nltk
https://github.com/heroku/heroku-buildpack-python/blob/main/bin/steps/nltk
https://www.nltk.org/data.html

This is a somewhat infrequently used feature, and it seems it could be replaced by an inline buildpack that runs a single command.

Though we'd need to check whether NLTK_DATA needs to be set, or if we can rely on the default output location.

If we do decide to drop this feature, we'll want to display migration advice in the CNB. Or if we don't drop the feature, we'll need to implement support for it in the CNB.

Automatically run Django's `collectstatic` command

Support passing arbitrary env vars to build backends (such as setuptools)

There are cases where a source distribution build for a package requires certain env vars to be set to control the build (or make it work on Heroku). For example:
heroku/heroku-buildpack-python#760

Currently the CNB (like the classic Python buildpack) doesn't pass arbitrary user-provided environment variables to pip install.

However, this means that we'll constantly be having to add special cases as they come up (like happened in the classic buildpack PR above).

Whilst wanting to prevent user-provided env vars from breaking subprocesses is worth thinking about, the blanket approach of not allowing them is IMO worse for UX than the alternative: Having a denylist of known dangerous env vars, and allowing everything else.
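A minimal Rust sketch of the denylist approach follows. The `DENYLIST` contents and the `filter_env` function are hypothetical: the variable names are placeholders chosen for illustration, and the actual set of dangerous variables would still need to be decided.

```rust
use std::collections::HashMap;

/// Illustrative denylist only; the real list of variables to block
/// is an open question in this issue.
const DENYLIST: &[&str] = &["PYTHONHOME", "PYTHONPATH", "LD_LIBRARY_PATH"];

/// Returns a copy of the user-provided environment with every denylisted
/// variable removed. Everything else is passed through unchanged.
pub fn filter_env(user_env: &HashMap<String, String>) -> HashMap<String, String> {
    user_env
        .iter()
        .filter(|(name, _)| !DENYLIST.iter().any(|denied| *denied == name.as_str()))
        .map(|(name, value)| (name.clone(), value.clone()))
        .collect()
}
```

The filtered map could then be handed to the subprocess (for example via `std::process::Command::envs`), so that new build-time variables work without a buildpack change while known-problematic ones stay blocked.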

Note: The fix for this issue is also the same as that for #52 - however, the use-case it is in aid of is different.

See also:
heroku/heroku-buildpack-python#417
heroku/heroku-buildpack-python#750

Support disabling automatic Django static asset generation

Automatic Django static asset generation (running the manage.py collectstatic command) was added in #108.

At the moment there is no way to force disable the feature (beyond removing django.contrib.staticfiles from INSTALLED_APPS in the app's Django config, or removing the manage.py script), whereas the classic Python buildpack allows disabling it using DISABLE_COLLECTSTATIC=1.

That said, the new implementation performs more thorough checks to see whether the app is using the static files feature, so between that and a few other improvements (like now passing env vars to the subprocess), there should be fewer cases where the feature needs manually disabling.

Some scenarios in which the ability to disable might be needed:

  • the user needs to run a step between Python package installation and running collectstatic (such as patching invalid urls in CSS comments to work around https://code.djangoproject.com/ticket/21080)
  • the user can't debug a failure locally or at build time, and wants the build to succeed so they can debug at runtime

If we decide we should still add support for disabling automatic Django static asset generation, then we should probably use config options in project.toml rather than env vars (to encourage infrastructure as code). However, this depends on us figuring out a convention there for all of our buildpacks.

Either way, we'll also need to add deprecation (or error) messages if DISABLE_COLLECTSTATIC is set, to warn that it no longer does anything (for people migrating from the classic buildpack).

GUS-W-14109400.

Add documentation

Whilst the Python CNB already has pretty comprehensive user-facing build log output and error messages, plus lots of developer-facing code comments, we also need:

  • User-facing usage instructions in the readme (such as how to use with pack build)
  • Description of features + differences compared to the classic Heroku Python buildpack
  • Developer facing workflow docs (eg how to develop locally, run tests, publish a new buildpack version etc)

Internal tracking epic

Outdated Python version warnings

We should add support for displaying build log warnings/notices in the following cases:

  1. Using a Python major version that has reached end-of-life upstream, for example Python 3.7 after June 2023.
  2. Using a non-latest Python patch version, for example Python 3.11.1 when 3.11.2 is the latest version.
  3. (Notice only, rather than a warning) When using a still-supported major Python version, but a newer major version is available. For example when using Python 3.10 but Python 3.11 is available. This is something the classic buildpack did not do, and would help reduce the number of users left to migrate once a Python version reaches end-of-life.
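The three cases above can be sketched as a single classification function. Everything named here (`VersionNotice`, `version_notices`, and the two input lists) is hypothetical; the sketch assumes the caller supplies the end-of-life series and the latest release of each supported series.

```rust
/// The kinds of message the three cases above would produce.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub enum VersionNotice {
    /// Case 1: the series (e.g. 3.7) has reached end-of-life upstream.
    EndOfLife,
    /// Case 2: a newer patch release exists for the same series.
    NewerPatchAvailable { latest_patch: u16 },
    /// Case 3: the series is still supported but a newer series exists.
    NewerSeriesAvailable { newest: (u16, u16) },
}

/// Classifies `(major, minor, patch)` against caller-supplied data:
/// `end_of_life` lists retired series, and `latest_releases` holds the
/// latest release of each supported series.
pub fn version_notices(
    (major, minor, patch): (u16, u16, u16),
    end_of_life: &[(u16, u16)],
    latest_releases: &[(u16, u16, u16)],
) -> Vec<VersionNotice> {
    let mut notices = Vec::new();
    let series = (major, minor);
    let is_eol = end_of_life.contains(&series);

    if is_eol {
        notices.push(VersionNotice::EndOfLife);
    }

    // Compare against the latest patch release of the same series, if known.
    if let Some(&(_, _, latest_patch)) = latest_releases
        .iter()
        .find(|&&(known_major, known_minor, _)| (known_major, known_minor) == series)
    {
        if patch < latest_patch {
            notices.push(VersionNotice::NewerPatchAvailable { latest_patch });
        }
    }

    // Only suggest a newer series when the current one is still supported.
    if let Some(newest) = latest_releases.iter().map(|&(ma, mi, _)| (ma, mi)).max() {
        if newest > series && !is_eol {
            notices.push(VersionNotice::NewerSeriesAvailable { newest });
        }
    }

    notices
}
```

Returning a list rather than a single value lets one build log show both a patch warning and a newer-series notice for the same app.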

Internal tracking epic

Support the `PIP_EXTRA_INDEX_URL` env var for specifying an additional package index URL

The classic Python buildpack supports using the PIP_EXTRA_INDEX_URL env var to tell Pip/Pipenv to use an additional package index URL. The CNB currently doesn't support this feature.

See:
https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-extra-index-url
https://pip.pypa.io/en/stable/topics/configuration/#environment-variables
https://github.com/search?q=repo%3Aheroku%2Fheroku-buildpack-python%20PIP_EXTRA_INDEX_URL&type=code

GUS-W-13753676.

Finish review on initial implementation

I wasn't actually done with my review on #3. I threw out my back and wasn't able to get to it on Friday. Here's the rest of it:

I would like to see the command logged to the end user. Above it's hard coded to "Running pip install". I would like to see the command be exactly what we run for the user so if they copy and paste (and of course modify any paths) they would get the exact same result.

Since it also looks like the results rely on the PYTHONUSERBASE env var, I would like to see that in the output as well.

In addition to local debugging, it can also help buildpack maintainers spot accidental discrepancies.

Same as above. We're streaming this command, but not announcing the (exact) command that's being streamed. If we don't want to announce this specific command (since it seems it's a helper command rather than something a user might expect to be run against their code) then perhaps we move to only emitting the command string in the error message.

I want the exact command run to be in the logs or the error (or both).
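One way to get the exact command into the logs is to render it from the same `std::process::Command` value that is about to be run. The helper below is a hypothetical sketch (the name `describe_command` is not from the buildpack); it includes explicitly set env vars such as PYTHONUSERBASE, and it does no shell quoting, so arguments containing spaces would need extra handling before the output is truly copy-pasteable.

```rust
use std::process::Command;

/// Renders a command as `KEY=value program arg1 arg2 ...`.
/// Only env vars explicitly set on the `Command` are included, and no
/// shell quoting is applied (a simplification for this sketch).
pub fn describe_command(command: &Command) -> String {
    let mut parts = Vec::new();
    for (name, value) in command.get_envs() {
        // `None` means the variable was explicitly removed; skip those.
        if let Some(value) = value {
            parts.push(format!(
                "{}={}",
                name.to_string_lossy(),
                value.to_string_lossy()
            ));
        }
    }
    parts.push(command.get_program().to_string_lossy().into_owned());
    parts.extend(
        command
            .get_args()
            .map(|arg| arg.to_string_lossy().into_owned()),
    );
    parts.join(" ")
}
```

Because the string is derived from the `Command` itself, the logged text cannot drift from what is actually executed, which also addresses the point about spotting accidental discrepancies.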

Function name is log_io_error. Function body says "unexpected error" which might not always be true. Or rather if we're saying "unexpected error" I'm reading that as synonymous with "not my fault" if I'm a customer reading it.

I imagine someone copying and pasting that function without looking too close thinking it's for handling all IO errors. For example a new file format is added and reading it generates an std::io::Error due to a permissions problem or bad symlink the customer checked in. In that case it wouldn't be so unexpected. Rename for clarity? unexpected_io_error?

Mentioned in Colin's PR. I would like to avoid early returns when there's only two branches. We can if/else this and eliminate the early return.

I think we should log_warning instead of log_info.

In a bit of a surprise to even me, I'm going to advocate for less testing. I think testing build logic inside of main.rs should be enough here. One fewer docker boots at CI test time.

Also we're effectively testing the output of pack here which is subject to change. If you do want to keep this test, I would scope it to only the strings you control.

Same as above. Doesn't need to be a docker test.

Regarding the error message (it's very good), when we say the user is missing files: It would be nice for us to give them an ls of what files we see in that directory. So at a glance they could see in one window we're looking for "requirements.txt" but they have "REQUIREMENTS.txt" (or something).
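A rough Rust sketch of that suggestion is below. Both function names and the message wording are hypothetical; the point is only that the directory listing is gathered at the time the error is built.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the sorted names of the entries directly inside `dir`.
pub fn list_file_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.file_name().to_string_lossy().into_owned()))
        .collect::<io::Result<Vec<_>>>()?;
    names.sort();
    Ok(names)
}

/// Builds an error message that names the expected file and shows what
/// was actually found, so a case mismatch is visible at a glance.
pub fn missing_file_message(dir: &Path, expected: &str) -> io::Result<String> {
    let names = list_file_names(dir)?;
    Ok(format!(
        "No {} file was found. Files in {}: {}",
        expected,
        dir.display(),
        names.join(", ")
    ))
}
```

With a directory containing `REQUIREMENTS.txt` and `app.py`, the message would list both names next to the expected `requirements.txt`, making the casing problem obvious.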

Comment: You could manually combine the streams yourself: let output = format!("{}{}", context.pack_stdout, context.pack_stderr); I'm assuming the goal is to get all of the pack output on test failure.

Also I think you could get rid of this test. Essentially it's testing that a bad requirements.txt file triggers a non-zero pip install. The git test above it seems more useful as an integration test.

Testing caching 👍. We can make these less brittle by asserting only the individual lines (or even just parts of lines). I think asserting for "Using cached Python" and "Using cached pip" without the version numbers would be enough to convince me. Maybe a "Using cached typing_extensions" for good measure. All the other values and numbers will cause churn on this file and possibly failures on otherwise unrelated changes (if libherokubuildpack updates its logging style, for example).

That comment applies to all integration tests. They're really nice and easy to review in this format, but I don't want to have to update 10 files every time I add an oxford comma (for example).
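A small helper along these lines would keep the assertions scoped to stable fragments. The name `missing_fragments` is hypothetical, and this is only a sketch of the idea rather than an existing test utility.

```rust
/// Returns the expected fragments that do not appear anywhere in `output`.
/// An empty result means every fragment was found.
pub fn missing_fragments<'a>(output: &str, expected: &[&'a str]) -> Vec<&'a str> {
    expected
        .iter()
        .copied()
        .filter(|fragment| !output.contains(*fragment))
        .collect()
}
```

An integration test could then assert that `missing_fragments(&output, &["Using cached Python", "Using cached pip"])` is empty. Version bumps and punctuation changes elsewhere in the log would no longer cause failures, and a failing assertion reports exactly which fragment went missing.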

If we know that one change invalidates things, I would bet multiple changes would as well. I think this is covered in your unit tests ✂️

Unit test should be okay.

Comment: This is a good idea

Unit test should be fine.

Time to test

Not that it's a race, but time to run CI universally always goes up, and integration tests are historically one of the last things that developers are willing to delete. Right now Python is ~6min for integration tests while Ruby is ~3min.

Ideally I would like to aim for <5min for CI completion with a maximum of around 10 min. Once you hit 10 min and a random glitch causes the tests to need to be re-run then you're pushing nearly half an hour for a single change and it absolutely kills (my) productivity.

I think we should be aggressive in testing and safety. I also think we should consider pruning some of these tests now, as this is otherwise the fastest this CI suite will ever execute.
