containerbuildsystem / dockerfile-parse

Python library for parsing Dockerfile files.

License: BSD 3-Clause "New" or "Revised" License

Python 97.48% Shell 2.52%
dockerfile python

dockerfile-parse's Issues

Interpret shell substitution for build arguments

Currently build arguments either have the value passed in with build_args in the constructor or the string value assigned in the dockerfile when the arg is defined. However, it is possible to set the value of an arg based on the value of a previous arg using shell substitution.

Here is an example of the current behavior causing issues when using other properties that do variable substitution based on the values of build args.

>>> parser = DockerfileParser()
>>> print(parser.content)
ARG FOO="DEFAULT_FOO"
ARG BAR="__${FOO:-BAR}__"

FROM my-image-${FOO}-${BAR}
>>> parser.baseimage
'my-image-DEFAULT_FOO-__${FOO:-BAR}__'

These examples demonstrate how docker handles the shell substitution.

FROM ubuntu

ARG FOO="DEFAULT_FOO"
ARG BAR="__${FOO:-BAR}__"

RUN echo $FOO $BAR

Examples of building this image:

# No build arguments
$ docker build .
Sending build context to Docker daemon  19.61MB
Step 1/4 : FROM ubuntu
 ---> 1318b700e415
Step 2/4 : ARG FOO="DEFAULT_FOO"
 ---> Running in b904d3d87fcd
Removing intermediate container b904d3d87fcd
 ---> 01a203cae2e4
Step 3/4 : ARG BAR="__${FOO:-BAR}__"
 ---> Running in 4e7d7258da89
Removing intermediate container 4e7d7258da89
 ---> 501fd278a030
Step 4/4 : RUN echo $FOO $BAR
 ---> Running in 8876cf8fd2e6
DEFAULT_FOO __DEFAULT_FOO__
Removing intermediate container 8876cf8fd2e6
 ---> 98b155f169ed

# Set FOO
$ docker build --build-arg FOO=MY_FOO .
Sending build context to Docker daemon  19.61MB
Step 1/4 : FROM ubuntu
 ---> 1318b700e415
Step 2/4 : ARG FOO="DEFAULT_FOO"
 ---> Running in 2bc786748486
Removing intermediate container 2bc786748486
 ---> 3211aa61519a
Step 3/4 : ARG BAR="__${FOO:-BAR}__"
 ---> Running in f61fb4b25a74
Removing intermediate container f61fb4b25a74
 ---> 074f0d3d627f
Step 4/4 : RUN echo $FOO $BAR
 ---> Running in 72f5c0a3c21e
MY_FOO __MY_FOO__
Removing intermediate container 72f5c0a3c21e
 ---> 9f1a5ce8839a
Successfully built 9f1a5ce8839a

# Set BAR
$ docker build --build-arg BAR=MY_BAR .
Sending build context to Docker daemon  19.61MB
Step 1/4 : FROM ubuntu
 ---> 1318b700e415
Step 2/4 : ARG FOO="DEFAULT_FOO"
 ---> Running in fd7ade7ac8ec
Removing intermediate container fd7ade7ac8ec
 ---> d71ad9beb904
Step 3/4 : ARG BAR="__${FOO:-BAR}__"
 ---> Running in c03c4a3511d6
Removing intermediate container c03c4a3511d6
 ---> 1c2301a693ae
Step 4/4 : RUN echo $FOO $BAR
 ---> Running in 6ad10b5754b9
DEFAULT_FOO MY_BAR
Removing intermediate container 6ad10b5754b9
 ---> 421c6ce936e9
Successfully built 421c6ce936e9

Note that each has a different output for step 4.

However, using the library, the substitution is not done.

>>> parser = DockerfileParser()
>>> parser.args
{'FOO': 'DEFAULT_FOO', 'BAR': '__${FOO:-BAR}__'}

>>> parser = DockerfileParser(build_args={"FOO": "MY_FOO"})
>>> parser.args
{'FOO': 'MY_FOO', 'BAR': '__${FOO:-BAR}__'}

>>> parser = DockerfileParser(build_args={"BAR": "MY_BAR"})
>>> parser.args
{'FOO': 'DEFAULT_FOO', 'BAR': 'MY_BAR'}
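
The behavior requested above could be sketched roughly as follows. This is not the library's implementation: `expand` and `resolve_args` are hypothetical helpers, and only the `$NAME`, `${NAME}` and `${NAME:-default}` forms are handled.

```python
import re

# Matches $NAME, ${NAME} and ${NAME:-default}; other shell forms are left untouched.
_VAR = re.compile(r"\$(?:(\w+)|\{(\w+)(?::-([^}]*))?\})")


def expand(value, known):
    """Substitute variables in value using the mapping of already-resolved args."""
    def repl(match):
        name = match.group(1) or match.group(2)
        default = match.group(3)
        current = known.get(name, "")
        # ${NAME:-default} falls back to the default when NAME is unset or empty.
        if not current and default is not None:
            return default
        return current
    return _VAR.sub(repl, value)


def resolve_args(declared, build_args=None):
    """Resolve ARG defaults in declaration order; build_args override defaults."""
    build_args = build_args or {}
    resolved = {}
    for name, default in declared:
        if name in build_args:
            resolved[name] = build_args[name]
        else:
            resolved[name] = expand(default, resolved)
    return resolved


declared = [("FOO", "DEFAULT_FOO"), ("BAR", "__${FOO:-BAR}__")]
print(resolve_args(declared))
print(resolve_args(declared, {"FOO": "MY_FOO"}))
print(resolve_args(declared, {"BAR": "MY_BAR"}))
```

With the three inputs above, the resolved values line up with the step 4 output of the three docker builds.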

Option to only write data back to disk when explicitly requested

The parser currently will always write out the content to the file when a value is assigned to a property. This is not immediately obvious, as it is only hinted at by the docstring for fileobj.

The usage example doesn't mention anything about interacting with the file system, but it will create or overwrite a file named Dockerfile in the working directory.

If it would be too hard to adjust the library to work in a more explicit manner, it would be nice if the documentation were updated so that someone finding the library for the first time can see clearly how it operates on files.
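
One way to picture the explicit mode is a wrapper that buffers edits in memory. The class below is a hypothetical sketch built on the standard library only; `BufferedDockerfile` and `save` are names invented for illustration and are not part of dockerfile-parse.

```python
import os
import tempfile


class BufferedDockerfile:
    """Hypothetical wrapper: edits stay in memory until save() is called."""

    def __init__(self, path):
        self.path = path
        self.content = ""
        # Reading an existing file is fine; nothing is written here.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as handle:
                self.content = handle.read()

    def save(self):
        """Write the in-memory content to disk; the only method that writes."""
        with open(self.path, "w", encoding="utf-8") as handle:
            handle.write(self.content)


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "Dockerfile")
    dockerfile = BufferedDockerfile(target)
    dockerfile.content = "FROM scratch\n"
    exists_before_save = os.path.exists(target)
    dockerfile.save()
    exists_after_save = os.path.exists(target)

print(exists_before_save, exists_after_save)
```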

Extension of DockerfileParser object to support multi-stage builds

This is a request to extend the parser for my needs, based on the current latest release, with two new properties:

  1. Type: list
    Name: baseimages
    Purpose: Lets the user iterate through the base images.
  2. Type: boolean
    Name: multistage
    Purpose: Lets the user easily check whether the Dockerfile is a multi-stage build. A user could do the same check with len on baseimages, but this would be a simple shortcut.
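
The two requested values can be approximated from the Dockerfile text alone. The sketch below is a standalone illustration, not the library's code: `base_images` is a hypothetical helper and the image names in the example are made up.

```python
def base_images(dockerfile_text):
    """Return the image named on each FROM line, in order of appearance."""
    images = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if parts and parts[0].upper() == "FROM":
            # Skip flags such as --platform=... that may precede the image name.
            operands = [part for part in parts[1:] if not part.startswith("--")]
            if operands:
                images.append(operands[0])
    return images


text = """\
FROM golang:1.21 AS builder
RUN go build -o /app .

FROM scratch
COPY --from=builder /app /app
"""

images = base_images(text)
multistage = len(images) > 1
print(images, multistage)
```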

Should accept dict representing environment vars inherited from parent

When substituting environment variables, the environment is assumed to be empty at the start. This is not the case for images that inherit from anything other than scratch.

The constructor should accept an optional dict parameter parent_env, which should be copied and used as the starting environment when parsing the Dockerfile.

Base image is incorrect when FROM statement has --platform

The Docker docs say that a Dockerfile supports --platform=<platform> in FROM. However, when we reference the baseimage property of the parser, it reports the --platform=<platform> string as the base image.

Using the following Dockerfile:

FROM --platform=linux/amd64 docker.io/library/centos:latest

And using the following code:

from dockerfile_parse import DockerfileParser

parser = DockerfileParser(fileobj=containerfile)
print(parser.baseimage)

Here is my result

--platform=linux/amd64
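
A minimal sketch of the expected behavior, assuming flags only appear before the image name: `parse_from` is a hypothetical helper (not a dockerfile-parse function) that separates leading `--flag` tokens from the image and an optional `AS` stage name.

```python
def parse_from(value):
    """Split the value of a FROM instruction into (flags, image, stage_name)."""
    tokens = value.split()
    flags = []
    # Consume leading options such as --platform=linux/amd64.
    while tokens and tokens[0].startswith("--"):
        flags.append(tokens.pop(0))
    image = tokens[0] if tokens else None
    stage = None
    if len(tokens) == 3 and tokens[1].lower() == "as":
        stage = tokens[2]
    return flags, image, stage


print(parse_from("--platform=linux/amd64 docker.io/library/centos:latest"))
```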

What if "apt-get" becomes an instruction?

Inside a multi-line RUN instruction, a missing \ causes the commands that follow to be silently dropped, and no exception is raised.

Here is an example Dockerfile

FROM ubuntu:xenial

RUN apt-get install -y curl && \
    curl --silent --location https://deb.nodesource.com/setup_4.x | bash - && \
    apt-get install -y nodejs
    apt-get install -y build-essentials
    echo Hello

Now, echo is taken as an instruction, but apt-get is lost.

If apt-get can also be taken as an instruction, I can deal with this Dockerfile syntax error manually.
If not, an exception is expected.
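
One possible check, sketched under the assumption that every non-continued, non-comment line must start with a known instruction keyword. `find_unknown_instructions` and the `INSTRUCTIONS` set are illustrative and not taken from the library.

```python
INSTRUCTIONS = {
    "ADD", "ARG", "CMD", "COPY", "ENTRYPOINT", "ENV", "EXPOSE", "FROM",
    "HEALTHCHECK", "LABEL", "MAINTAINER", "ONBUILD", "RUN", "SHELL",
    "STOPSIGNAL", "USER", "VOLUME", "WORKDIR",
}


def find_unknown_instructions(text):
    """Return (line_number, first_word) for lines that start no known instruction."""
    unknown = []
    continued = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment lines do not end a continuation
        if not continued:
            word = stripped.split()[0]
            if word.upper() not in INSTRUCTIONS:
                unknown.append((number, word))
        continued = stripped.endswith("\\")
    return unknown


dockerfile = r"""FROM ubuntu:xenial

RUN apt-get install -y curl && \
    curl --silent --location https://deb.nodesource.com/setup_4.x | bash - && \
    apt-get install -y nodejs
    apt-get install -y build-essentials
    echo Hello
"""

print(find_unknown_instructions(dockerfile))
```

A caller could raise an exception whenever the returned list is non-empty, which covers both the apt-get and the echo line.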

ARG's are not inherited from a parent stage - only the last stage is considered

When using global ARG's outside of a FROM line (for instance, to make them available to multiple stages), only the last stage is used to generate the arguments property.

For example, this Dockerfile:

ARG GLOBALARG1=foo
ARG GLOBALARG2=bar

FROM python
ARG GLOBALARG1
RUN echo $GLOBALARG1 $GLOBALARG2

FROM python
ARG GLOABLARG2
RUN echo $GLOBALARG2

With this sample script:

from pprint import pprint
from dockerfile_parse import DockerfileParser

dfp = DockerfileParser()

# Read the Dockerfile and close the handle when done
with open("Dockerfile", "r") as f:
    dfp.content = f.read()

pprint(dfp.args)

This returns {'GLOABLARG2': ''}, even though GLOBALARG2 should have inherited its value from the global declaration, and GLOBALARG1 is missing entirely.

The library successfully detects that this is a multi-stage build, but I can't find a way to get the global arguments or the arguments of any stage other than the last. In addition, arguments are not inherited correctly.
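
The expected result could look like the output of the sketch below. It is a simplified illustration, not the library's parser: `collect_args` is a hypothetical helper, it ignores quoting and line continuations, and the example uses the intended spelling GLOBALARG2 in the second stage.

```python
def collect_args(text):
    """Return (global_args, stages); stages holds one ARG dict per FROM."""
    global_args = {}
    stages = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if not parts:
            continue
        keyword = parts[0].upper()
        if keyword == "FROM":
            stages.append({})
        elif keyword == "ARG" and len(parts) == 2:
            name, sep, default = parts[1].partition("=")
            if not stages:
                # Declared before the first FROM: a global ARG.
                global_args[name] = default
            elif sep:
                stages[-1][name] = default
            else:
                # Redeclared without a value: inherit the global default, if any.
                stages[-1][name] = global_args.get(name, "")
    return global_args, stages


text = """\
ARG GLOBALARG1=foo
ARG GLOBALARG2=bar

FROM python
ARG GLOBALARG1
RUN echo $GLOBALARG1 $GLOBALARG2

FROM python
ARG GLOBALARG2
RUN echo $GLOBALARG2
"""

global_args, stages = collect_args(text)
print(global_args)
print(stages)
```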

Add support for updating global ARGs

ARGs defined before the first FROM in a Dockerfile are considered global ARGs and can be inherited into stages.

https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact

An ARG declared before a FROM is outside of a build stage, so it can’t be used in any instruction after a FROM. To use the default value of an ARG declared before the first FROM use an ARG instruction without a value inside of a build stage

Add ability to list and update global ARGs

Dockerfiles with different file names are not supported

As seen in the official Docker code base (https://github.com/docker/cli/blob/ab7cc485809455e6f1149aec5c00b6582066a396/cli/command/image/build/context.go#L324), the CLI also checks for the lowercase filename dockerfile in addition to the default name Dockerfile.
The library currently supports only the default name, because the DOCKERFILE_FILENAME constant is a single string:

DOCKERFILE_FILENAME = 'Dockerfile'

I'm working on a use case that requires parsing Dockerfiles named dockerfile if they exist, but I get this error:

self = <dockerfile_parse.parser.DockerfileParser object at 0x7f7977ae21d0>, mode = 'rb'

    @contextmanager
    def _open_dockerfile(self, mode):
        if self.fileobj is not None:
            self.fileobj.seek(0)
            if 'w' in mode:
                self.fileobj.truncate()
            yield self.fileobj
            self.fileobj.seek(0)
        else:
>           with open(self.dockerfile_path, mode) as dockerfile:
E           NotADirectoryError: [Errno 20] Not a directory: '/tmp/pytest-of-eli/pytest-16/test_sets_content_if_dockerfil0/testmodule/dockerfile/Dockerfile'

If this makes sense, I believe DOCKERFILE_FILENAME should have a value of ('Dockerfile', 'dockerfile').

I am also willing to submit a PR if this issue is approved.

make dockerfile-parse an RPM available on src.fedoraproject.org/projects/rpms/

I'm with the RHEL Compose team. This project would be very helpful for working with containers in (fedpkg/rhpkg/centpkg), but we would need it to be mirrored as an RPM available at src.fedoraproject.org/projects/rpms/.

I'm going to follow this process generally: https://blog.jwf.io/2017/11/first-rpm-package-fedora/

I'm adding this issue because I wanted to make sure product owners are aware I'm working on this. I would appreciate any experience anyone has to contribute to this process.

Rename to dockerfile-parser

I've just checked https://pypi.python.org/pypi/dockerfile-parser and there's no such project.

I clearly remember that I originally chose to name this project dockerfile-parse (and not dockerfile-parser, even the main class is called DockerfileParser), because there already was one dockerfile-parser on pypi.

So this is just to let you know that it's possible to rename this to dockerfile-parser now.

Incompatibility with pytest 7.2.0

This test fails with pytest 7.2.0

def test_nonseekable_fileobj(self):
    with pytest.raises(AttributeError):
        DockerfileParser(fileobj=sys.stdin)

The seek method now exists and raises UnsupportedOperation, see

https://github.com/pytest-dev/pytest/blob/54d5a63d1485110015665ece1065982407394517/src/_pytest/capture.py#L218-L219

The output is:

=================================== FAILURES ===================================
________________ TestDockerfileParser.test_nonseekable_fileobj _________________

self = <tests.test_parser.TestDockerfileParser object at 0x7f879703f410>

    def test_nonseekable_fileobj(self):
        with pytest.raises(AttributeError):
>           DockerfileParser(fileobj=sys.stdin)

tests/test_parser.py:914: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
dockerfile_parse/parser.py:109: in __init__
    self.fileobj.seek(0)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <_pytest.capture.DontReadFromInput object at 0x7f8797586c10>, offset = 0

    def seek(self, offset: int) -> int:
>       raise UnsupportedOperation("redirected stdin is pseudofile, has no seek(int)")
E       io.UnsupportedOperation: redirected stdin is pseudofile, has no seek(int)

/usr/lib/python3.11/site-packages/_pytest/capture.py:219: UnsupportedOperation
=========================== short test summary info ============================
FAILED tests/test_parser.py::TestDockerfileParser::test_nonseekable_fileobj
================== 1 failed, 1241 passed, 48 xfailed in 4.57s ==================
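
Both failure modes (no seek attribute at all, and a seek method that raises UnsupportedOperation) can be treated as the same condition. The helper below is a hypothetical sketch, not code from dockerfile-parse or pytest; `NoSeek` and `PseudoFile` are stand-ins written for this example.

```python
import io


def ensure_seekable(fileobj):
    """Raise ValueError unless fileobj supports seek(0)."""
    try:
        fileobj.seek(0)
    except (AttributeError, io.UnsupportedOperation) as exc:
        raise ValueError("fileobj must support seek()") from exc


class NoSeek:
    """Stand-in for an object without a seek method."""


class PseudoFile:
    """Stand-in for a redirected stdin whose seek() raises."""

    def seek(self, offset):
        raise io.UnsupportedOperation("pseudofile has no seek(int)")


def rejects(fileobj):
    """Return True when ensure_seekable refuses the object."""
    try:
        ensure_seekable(fileobj)
    except ValueError:
        return True
    return False


print(rejects(NoSeek()), rejects(PseudoFile()), rejects(io.BytesIO()))
```

A smaller change to the test itself would be to pass a tuple to pytest.raises, for example pytest.raises((AttributeError, io.UnsupportedOperation)), since pytest.raises accepts a tuple of exception types.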

Concatenated variables not expanded correctly

Given this Dockerfile:

FROM scratch
ENV NAME=name VER=1
LABEL component="$NAME$VER"

the component label should be set as "name1" but it is not. Test case:

from dockerfile_parse import DockerfileParser
from io import BytesIO
dockerfile = BytesIO(b'''\
FROM scratch
ENV NAME=name VER=1
LABEL component="$NAME$VER"
''')
p = DockerfileParser(fileobj=dockerfile)
assert p.labels['component'] == 'name1'

Instead, p.labels['component'] is "name$VER". Only the first environment variable was expanded.
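
For comparison, a substitution pass that replaces every reference in one go handles adjacent variables correctly. This is a minimal sketch, not the library's code: `expand_env` is a hypothetical helper that covers only `$NAME` and `${NAME}`, ignores escapes, and assumes unset variables expand to an empty string.

```python
import re

# $NAME or ${NAME}; \w+ stops at the next "$", so adjacent references stay separate.
_VAR = re.compile(r"\$(?:\{(\w+)\}|(\w+))")


def expand_env(value, env):
    """Replace every variable reference in value with its entry in env."""
    return _VAR.sub(lambda m: env.get(m.group(1) or m.group(2), ""), value)


env = {"NAME": "name", "VER": "1"}
print(expand_env("$NAME$VER", env))
print(expand_env("${NAME}-${VER}", env))
```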

Accepts multi-line FROM instructions

FROM instructions cannot have line continuations, at least from testing with docker-1.9.1:

$ cat > Dockerfile
FROM \
 scratch
$ docker build -t scratch --rm .
unable to process Dockerfile: unable to parse repository info: repository name component must match "[a-z0-9](?:-*[a-z0-9])*(?:[._][a-z0-9](?:-*[a-z0-9])*)*"

Proposal - add more complex FROM clause parsing

I wrote a more complex FROM clause parser, and it currently sits in an open-source piece of software. Would you be interested in transferring it to your library? It would make much more sense to have it here. It's used in production, covered with tests, and should be pretty much a direct copy-paste. It's the DockerImageId class here: https://github.com/kiwicom/the-zoo/blob/master/zoo/analytics/tasks/utils.py

Tests are here https://github.com/kiwicom/the-zoo/blob/master/test/analytics/tasks/test_utils.py

Let me know, I will submit PR if you are interested.

[packit] Propose update failed for release 0.0.17

Packit failed on creating pull-requests in dist-git:

dist-git branch    error
f30                Pagure API returned an error when calling https://src.fedoraproject.org/api/0/-/whoami: Invalid or expired token. Please visit https://src.fedoraproject.org/settings#nav-api-tab to get or renew your API token. - Expired token
f31                Pagure API returned an error when calling https://src.fedoraproject.org/api/0/-/whoami: Invalid or expired token. Please visit https://src.fedoraproject.org/settings#nav-api-tab to get or renew your API token. - Expired token
f32                Pagure API returned an error when calling https://src.fedoraproject.org/api/0/-/whoami: Invalid or expired token. Please visit https://src.fedoraproject.org/settings#nav-api-tab to get or renew your API token. - Expired token
master             Pagure API returned an error when calling https://src.fedoraproject.org/api/0/-/whoami: Invalid or expired token. Please visit https://src.fedoraproject.org/settings#nav-api-tab to get or renew your API token. - Expired token

You can re-trigger the update by adding /packit propose-update to the issue comment.

/packit propose-update

DockerfileParse does not handle cache_content properly

In our project meta-test-family (https://github.com/fedora-modularity/meta-test-family), used for testing containers, we parse the Dockerfile and sometimes a rendered version of it, named e.g. Dockerfile.rendered.

Unfortunately, there is what looks like a bug to me:

import os

from dockerfile_parse import DockerfileParser

# dockerfile refers to Dockerfile.rendered and Dockerfile itself looks like https://github.com/container-images/memcached/blob/master/Dockerfile
with open(dockerfile, 'r') as f:
    cached_content = f.readlines()
dfp = DockerfileParser(path=os.path.dirname(dockerfile), cache_content=cached_content)
for struct in dfp.structure:
    print(struct)

This prints the content of Dockerfile rather than Dockerfile.rendered.
I guess the bug is here: https://github.com/DBuildService/dockerfile-parse/blob/master/dockerfile_parse/parser.py#L176
Before the with statement there should be:

if cache_content:
    self.cached_content = cache_content
else:
    with ....

I will send a PR for it.

Incorrectly appending 'Dockerfile' to path

One can provide a file not called Dockerfile to docker build like this:

docker build -f Dockerfile.debug .

When instantiated with a file that isn't named Dockerfile, dockerfile-parse appends Dockerfile to the path and fails.

>>> test = DockerfileParser('/home/nisha/test_dockerfile')
>>> test.structure
Couldn't retrieve lines from dockerfile: NotADirectoryError(20, 'Not a directory')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nisha/scancodeenv/lib/python3.6/site-packages/dockerfile_parse/parser.py", line 250, in structure
    for line in self.lines:
  File "/home/nisha/scancodeenv/lib/python3.6/site-packages/dockerfile_parse/parser.py", line 143, in lines
    with self._open_dockerfile('rb') as dockerfile:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/home/nisha/scancodeenv/lib/python3.6/site-packages/dockerfile_parse/parser.py", line 131, in _open_dockerfile
    with open(self.dockerfile_path, mode) as dockerfile:
NotADirectoryError: [Errno 20] Not a directory: '/home/nisha/test_dockerfile/Dockerfile'
>>> import os
>>> os.path.exists('/home/nisha/test_dockerfile')  # the file exists
True
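
A path-resolution rule that accepts either a directory or a file could look like the sketch below. `resolve_dockerfile_path` and its `candidates` parameter are hypothetical and only illustrate the idea; they are not the library's API.

```python
import os
import tempfile


def resolve_dockerfile_path(path, candidates=("Dockerfile", "dockerfile")):
    """Return the file to open for path, which may be a file or a directory."""
    if not os.path.isdir(path):
        # A file path (for example Dockerfile.debug) is used as given.
        return path
    for name in candidates:
        candidate = os.path.join(path, name)
        if os.path.isfile(candidate):
            return candidate
    # Nothing exists yet: fall back to the default name so a new file can be created.
    return os.path.join(path, candidates[0])


with tempfile.TemporaryDirectory() as workdir:
    custom = os.path.join(workdir, "Dockerfile.debug")
    with open(custom, "w") as handle:
        handle.write("FROM scratch\n")
    from_file = resolve_dockerfile_path(custom)
    from_dir = resolve_dockerfile_path(workdir)

print(os.path.basename(from_file), os.path.basename(from_dir))
```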

Version 0.0.6 not installable via pip

I'd like to use the new fileobj feature in 0.0.6, but I can't seem to get pip to install it from pypi:

# pip install dockerfile-parse==0.0.6
Collecting dockerfile-parse==0.0.6
  Could not find a version that satisfies the requirement dockerfile-parse==0.0.6 (from versions: 0.0.3, 0.0.4, 0.0.5)
No matching distribution found for dockerfile-parse==0.0.6

Installing without specifying version gives 0.0.5:

# pip install dockerfile-parse
Collecting dockerfile-parse
Installing collected packages: dockerfile-parse
Successfully installed dockerfile-parse-0.0.5

But searching indicates that 0.0.6 is latest(??):

# pip search dockerfile-parse
[...trimmed output...]
dockerfile-parse (0.0.6)                - Python library for Dockerfile
                                          manipulation
  INSTALLED: 0.0.5
  LATEST:    0.0.6

Although I can install from source, I'd like to install direct from pypi if possible. Thanks in advance for any help!

RFE: Caching support for Dockerfile content

As explained in containerbuildsystem/osbs-client#200, it'd be great if DockerfileParser could cache Dockerfile content when instantiated.
Since I think the purpose of your library is not only to read existing Dockerfiles, but also to create new Dockerfiles, this can't be done automatically by constructor (i.e. I can create a DockerfileParser instance with a path to not-yet-existing Dockerfile, add the contents and then dump it).
So I think the best course of action would be to add an argument to the constructor, something like cache_content=False. If this is True, Dockerfile content would get cached and DockerfileParser would then operate on the cached data. With False, everything would work pretty much the same way as it does now.

LABEL instruction not appended correctly if no final newline

mkdir /tmp/x
printf 'FROM scratch' > /tmp/x/Dockerfile
python -c "from dockerfile_parse import DockerfileParser; DockerfileParser('/tmp/x').labels['foo'] = 'bar'"
cat /tmp/x/Dockerfile

results in:

FROM scratchLABEL foo=bar

instead of the expected:

FROM scratch
LABEL foo=bar
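
The expected behavior amounts to terminating the last line before appending. A minimal sketch, with `append_instruction` as a hypothetical helper name:

```python
def append_instruction(content, instruction):
    """Append an instruction line, adding the missing final newline first."""
    if content and not content.endswith("\n"):
        content += "\n"
    return content + instruction.rstrip("\n") + "\n"


print(append_instruction("FROM scratch", "LABEL foo=bar"), end="")
```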

Handle escape directive for Dockerfiles

Problem description
For Windows Dockerfiles it can often be beneficial to use the escape directive (Documentation) as Windows uses \ for paths. The current parser implementation does not handle this directive and reports errors on specific operations:

In [11]: df.context_structure
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-90b26318ceeb> in <module>
----> 1 df.context_structure

c:\python38\lib\site-packages\dockerfile_parse\parser.py in context_structure(self)
    818
    819             if instruction_type in ('ARG', 'ENV', 'LABEL'):
--> 820                 values = get_key_val_dictionary(
    821                     instruction_value=instr['value'],
    822                     env_replace=instruction_type != 'ARG' and self.env_replace,

c:\python38\lib\site-packages\dockerfile_parse\util.py in get_key_val_dictionary(instruction_value, env_replace, args, envs)
    256     args = args or {}
    257     envs = envs or {}
--> 258     return dict(extract_key_values(instruction_value=instruction_value,
    259                                    env_replace=env_replace,
    260                                    args=args, envs=envs))

c:\python38\lib\site-packages\dockerfile_parse\util.py in extract_key_values(env_replace, args, envs, instruction_value)
    244         for k_v in words:
    245             if '=' not in k_v:
--> 246                 raise ValueError('Syntax error - can\'t find = in "{word}". '
    247                                  'Must be of the form: name=value'
    248                                  .format(word=k_v))

ValueError: Syntax error - can't find = in "`". Must be of the form: name=value

Workaround & Solution
A short-term workaround is to avoid this feature. For the parser, Docker directives have to be on the first line of the file, so a directive is valid for the whole file and is easy to detect. From what I can see, the fix might need one change to detect the directive and one to adjust the regex based on what was detected.
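
The detection step could be sketched as below. This is an assumption-laden illustration rather than the library's code: `detect_escape` is a hypothetical helper, and it stops scanning at the first line that is not of the form `# name=value`.

```python
import re

# A parser directive looks like "# name=value".
_DIRECTIVE = re.compile(r"^#\s*(\w+)\s*=\s*(\S+)\s*$")


def detect_escape(lines, default="\\"):
    """Return the escape character declared at the top of the file, or default."""
    for line in lines:
        match = _DIRECTIVE.match(line.strip())
        if match is None:
            break  # directives are only looked for before anything else
        if match.group(1).lower() == "escape":
            return match.group(2)
    return default


print(detect_escape(["# escape=`", "FROM scratch"]))
print(detect_escape(["FROM scratch", "# escape=`"]))
```

The returned character could then be used in place of the hard-coded backslash when building the continuation regex.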

Escaped label values are mangled

>>> from dockerfile_parse import DockerfileParser
>>> parser=DockerfileParser('/tmp')
>>> parser.lines=['FROM fedora\n', 'LABEL label=\$VAR\n']
>>> print(parser.labels['label'])
$VAR
>>> assert parser.labels['label'] == '\$VAR'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

Support parsing individual commands?

Thanks for this project, it looks like it could be useful. I found it while looking for something which could parse out the files from the host system which the Dockerfile actually reads. While I can see that this library supports extracting the body of a command from its instruction (i.e: COPY separate from --chmod foo:bar things/ /var/things/) I couldn't see anything beyond that. Is this something which might be in scope for this project in future, or is that layer deliberately left to the user?

multiline `RUN` at the end of file is broken down

RUN yum-config-manager --enable rhel-7-server-rpms || :
RUN yum install -y git tar wget socat hostname sysvinit-tools util-linux yum-utils ethtool&& \
ADD Dockerfile-aos3-aos-base-v3.0.2.900-0 /root/buildinfo/Dockerfile-aos3-aos-base-v3.0.2.900-0
LABEL "Build_Host"="rcm-img-docker01.build.eng.bos.redhat.com"
    yum clean all
RUN rm -f '/etc/yum.repos.d/aos-unsigned.repo'

it should be

RUN yum-config-manager --enable rhel-7-server-rpms || :
RUN yum install -y git tar wget socat hostname sysvinit-tools util-linux yum-utils ethtool&& \
    yum clean all
RUN rm -f '/etc/yum.repos.d/aos-unsigned.repo'
ADD Dockerfile-aos3-aos-base-v3.0.2.900-0 /root/buildinfo/Dockerfile-aos3-aos-base-v3.0.2.900-0
LABEL "Build_Host"="rcm-img-docker01.build.eng.bos.redhat.com"

is this a reactor issue?

Comments in LABEL call are not ignored

$ cat Dockerfile
FROM rhel7:7.2-61

LABEL \
      # Location of the STI scripts inside the image.
      io.openshift.s2i.scripts-url=image:///usr/libexec/s2i \
      # DEPRECATED: This label will be kept here for backward compatibility.
      io.s2i.scripts-url=image:///usr/libexec/s2i \
      # Labels consumed by Red Hat build service.
      BZComponent="s2i-base-docker" \
      Name="rhscl/s2i-base-rhel7" \
      Version="1" \
      Release="6.3" \
      Architecture="x86_64"
$ python -c "from dockerfile_parse import DockerfileParser; dfp = DockerfileParser(); print(dfp.labels)"
{u'#': u'Location of the STI scripts inside the image.'}
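
A sketch of the expected handling, assuming comment lines inside a continued instruction are simply dropped before the value is parsed. `join_continued` is a hypothetical helper and treats any trailing backslash as a continuation marker.

```python
def join_continued(lines):
    """Join the lines of one instruction, skipping comment lines inside it."""
    parts = []
    for line in lines:
        stripped = line.strip()
        if parts and stripped.startswith("#"):
            continue  # a comment line inside a continued instruction
        parts.append(stripped.rstrip("\\").strip())
        if not stripped.endswith("\\"):
            break  # no continuation marker: the instruction ends here
    return " ".join(part for part in parts if part)


lines = [
    "LABEL \\",
    "      # Location of the STI scripts inside the image.",
    "      io.openshift.s2i.scripts-url=image:///usr/libexec/s2i \\",
    "      # Labels consumed by Red Hat build service.",
    '      Version="1"',
]
print(join_continued(lines))
```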

Cannot modify content if using fileobj

Hey,

it seems like there is something wrong with this line:

>>> import io
>>> from dockerfile_parse import DockerfileParser
>>> dfp = DockerfileParser(fileobj=io.StringIO())
>>> dfp.content = 'FROM test'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projects/env/lib/python3.7/site-packages/dockerfile_parse/parser.py", line 196, in content
    dockerfile.write(u2b(content))
TypeError: string argument expected, got 'bytes'

Probably a StringIO does not take bytes as an input (at least in Python 3).
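
One way to support both kinds of file object is to check whether the stream is text-based before writing. The function below is a hypothetical sketch using only the standard library; it is not the library's `content` setter.

```python
import io


def write_content(fileobj, content):
    """Replace the contents of a text or binary file object with a str."""
    fileobj.seek(0)
    fileobj.truncate()
    if isinstance(fileobj, io.TextIOBase):
        fileobj.write(content)  # text streams such as StringIO take str
    else:
        fileobj.write(content.encode("utf-8"))  # binary streams take bytes
    fileobj.seek(0)


text_stream = io.StringIO()
binary_stream = io.BytesIO()
write_content(text_stream, "FROM test")
write_content(binary_stream, "FROM test")
print(text_stream.getvalue(), binary_stream.getvalue())
```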

Rename label?

What's the best way to rename some labels in the Dockerfile?

For example, transform this:

FROM fedora
LABEL NAME="spam"
CMD bash

into this:

FROM fedora
LABEL name="spam"
CMD bash

I've tried deleting the label and adding a new one, but that always adds it to the end of the file, which is not desired for my use case.

Should handle environment variable substitution

Docker allows environment variable substitution in a number of instructions. This parser ought to do likewise.

Additionally, it would be useful to be able to disable environment variable substitution, to inspect the unquoted-but-not-substituted values.

ENV/LABEL is not parsed properly

Quote escaping is not handled properly when parsing ENV/LABEL.

  • Example Dockerfile:
FROM fedora:25
ENV x="a b 'c'"
ENV y="'d' e 'f g'"
LABEL n=$x m="$y" o='$x $y'
  • Envs are as expected:
{
    'x': "a b 'c'",
    'y': "'d' e 'f g'"
}
  • Labels:
{
    'n': 'a',
    'b': '',
    'c': '',
    'm': "'d' e 'f g'",
    'o': '$x $y'
}
expected = {
    "m": "'d' e 'f g'",
    "n": "a b 'c'",
    "o": "$x $y"
}

Support for modifying labels

With the structure property it is quite easy to modify most values, by first finding the value to modify, then replacing the lines (known from startline and endline) in the lines property.

For example, for CMD:

for insn in parser.structure:
  if insn['instruction'] == 'CMD':
    content = 'CMD my-new-cmd-value\n'
    lines = parser.lines
    del lines[insn['startline']:insn['endline'] + 1]
    lines.insert(insn['startline'], content)
    parser.lines = lines
    break

However, for the LABEL instruction it is much more complicated as firstly there are two forms of LABEL, secondly there is quoting to deal with, and thirdly a single LABEL instruction can contain many labels.

It would be nice to provide support for modifying a single label, handling all these cases.

Parser ignores RUN statements if comments are in them

To reproduce:

  1. I am trying to parse this Dockerfile: https://raw.githubusercontent.com/nginxinc/docker-nginx/594ce7a8bc26c85af88495ac94d5cd0096b306f7/mainline/buster/Dockerfile
  2. Ran the following in the repl:
>>> from dockerfile_parse import DockerfileParser
>>> parser = DockerfileParser('/path/to/Dockerfile')
>>> parser.structure[0]
{'instruction': 'FROM', 'startline': 0, 'endline': 0, 'content': 'FROM debian:buster-slim\n', 'value': 'debian:buster-slim'}
>>> parser.structure[1]
{'instruction': 'LABEL', 'startline': 2, 'endline': 2, 'content': 'LABEL maintainer="NGINX Docker Maintainers <[email protected]>"\n', 'value': 'maintainer="NGINX Docker Maintainers <[email protected]>"'}
>>> parser.structure[2]
{'instruction': 'ENV', 'startline': 4, 'endline': 4, 'content': 'ENV NGINX_VERSION   1.17.10\n', 'value': 'NGINX_VERSION   1.17.10'}
>>> parser.structure[3]
{'instruction': 'ENV', 'startline': 5, 'endline': 5, 'content': 'ENV NJS_VERSION     0.3.9\n', 'value': 'NJS_VERSION     0.3.9'}
>>> parser.structure[4]
{'instruction': 'ENV', 'startline': 6, 'endline': 6, 'content': 'ENV PKG_RELEASE     1~buster\n', 'value': 'PKG_RELEASE     1~buster'}
>>> parser.structure[5]
{'instruction': 'COMMENT', 'startline': 9, 'endline': 9, 'content': '# create nginx user/group first, to be consistent throughout docker variants\n', 'value': 'create nginx user/group first, to be consistent throughout docker variants'}

Line 9 is a RUN statement which goes from line 9 to 103.

I am guessing the problem is that since the comments in the RUN statement are not indented, the parser is ignoring the commands and just picking up the comments.

Not catching 'closing quotation' error

$ cat Dockerfile
FROM ubuntu

LABEL label1='A' \
      label2='B'@C' \
      label3='D'

RUN apt-get update -qq
$ pip install dockerfile-parse==0.0.10
$ python -c "from dockerfile_parse import DockerfileParser; dfp = DockerfileParser(); print(dfp.labels)"

result:

{u'label1': u'A', u'label2': u'B@C       label3=D'}

expected:

ValueError: No closing quotation
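
Python's shlex module raises exactly this error for unbalanced quotes, so a parser built on it would surface the problem. The sketch below assumes the continuation lines have already been joined into one value; `parse_labels` is a hypothetical helper, not the library's function.

```python
import shlex


def parse_labels(value):
    """Parse 'key=value ...' pairs; unbalanced quotes raise ValueError."""
    labels = {}
    for word in shlex.split(value):
        key, sep, val = word.partition("=")
        if not sep:
            raise ValueError('Syntax error - can\'t find = in "%s"' % word)
        labels[key] = val
    return labels


print(parse_labels("label1='A' label3='D'"))

try:
    parse_labels("label1='A' label2='B'@C' label3='D'")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```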

dockerfile-parse creates a Dockerfile file

When I run

from dockerfile_parse import DockerfileParser

dfp = DockerfileParser()
dfp.content = """\
From  base
LABEL foo="bar baz"
USER  me"""

it creates a Dockerfile locally. This is unexpected. Is it possible to not do that?

LABEL is not properly handled by dockerfile-parse

If the Dockerfile is defined like this:

LABEL summary="Postfix is a Mail Transport Agent (MTA)." \
       version = "1.0" \
       description="Postfix is mail transfer agent that routes and delivers mail." \
       io.k8s.description="Postfix is mail transfer agent that routes and delivers mail." \
       io.k8s.display-name="Postfix 3.1" \
       io.openshift.expose-services="10025:postfix" \
io.openshift.tags="postfix,mail,mta"

then dockerfile-parse parses it like this:

{u'content': u'LABEL summary="Postfix is a Mail Transport Agent (MTA)." \\\n    version="1.0" \\\n    description="Postfix is mail transfer agent that routes and delivers mail." \\\n    io.k8s.description="Postfix is mail transfer agent that routes and delivers mail." \\\n    io.k8s.diplay-name="Postfix 3.1" \\\n    io.openshift.expose-services="10025:postfix" \\\n    io.openshift.tags="postfix,mail,mta"\n',
 u'endline': 22,
 u'instruction': u'LABEL',
 u'startline': 16,
 u'value': u'summary="Postfix is a Mail Transport Agent (MTA)."     version="1.0"     description="Postfix is mail transfer agent that routes and delivers mail."     io.k8s.description="Postfix is mail transfer agent that routes and delivers mail."     io.k8s.diplay-name="Postfix 3.1"     io.openshift.expose-services="10025:postfix"     io.openshift.tags="postfix,mail,mta"'}

Python program as a reproducer:

        dfp = DockerfileParser(path=self.dir)
        import pprint
        for struct in dfp.structure:
            pprint.pprint(struct)

Would it be possible to somehow structure it?
