pdftk-java / pdftk-java-container

OCI image (compatible with e.g. Docker or Podman) for pdftk-java

License: GNU General Public License v2.0

pdftk-java-container's Introduction

Container image for pdftk-java

About

Source files and build instructions for an OCI image (compatible with e.g. Docker or Podman) for pdftk-java. If PDF is electronic paper, then pdftk-java is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. PDFtk is a simple tool for doing everyday things with PDF documents: Merge PDF documents, split PDF pages into a new document, decrypt input as necessary (password required), encrypt output as desired, burst a PDF document into single pages, report on PDF metrics, including metadata and bookmarks, uncompress and re-compress page streams, and repair corrupted PDF (where possible).

Pdftk-java is a port of the original GCJ-based PDFtk to Java. The GNU Compiler for Java (GCJ) is a portable, optimizing, ahead-of-time compiler for the Java programming language; it has seen no new development since 2009 and was finally removed from the GCC development tree in 2016, before the release of GCC 7.

Usage

The OCI image automatically runs pdftk-java with the given options and arguments. It may be started with Docker using:

docker run --rm --volume $(pwd):/work pdftk/pdftk:latest --help

And it may be started with Podman using:

podman run --rm --volume $(pwd):/work quay.io/pdftk/pdftk:latest --help

For command-line convenience, the command above can be aliased, e.g.:

alias pdftk='podman run --rm --volume $(pwd):/work quay.io/pdftk/pdftk:latest'
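As an alternative to the alias, the same invocation can be wrapped in a shell function. This is only a sketch: the environment variables PDFTK_ENGINE and PDFTK_IMAGE are hypothetical names introduced here so that the container engine and image can be swapped; the image names and options are the ones shown above.

```shell
# Hypothetical wrapper around the container invocation shown above.
# PDFTK_ENGINE and PDFTK_IMAGE are illustrative names, not part of this project.
pdftk() {
  # Quoting "$(pwd)" keeps the mount working in directories with spaces.
  "${PDFTK_ENGINE:-podman}" run --rm \
    --volume "$(pwd)":/work \
    "${PDFTK_IMAGE:-quay.io/pdftk/pdftk:latest}" "$@"
}
```

With this function defined, `pdftk --help` behaves like the Podman command above, and `PDFTK_ENGINE=docker PDFTK_IMAGE=pdftk/pdftk:latest pdftk --help` like the Docker one. Input and output files are resolved relative to /work, i.e. the current directory on the host.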

Volumes

  • /work - Default working directory for pdftk-java.

While none of the volumes is required, meaningful usage requires at least persistent storage for /work.

Custom images

For custom OCI images, the following build arguments can be passed:

  • VERSION - Version of the pdftk-java release tarball, defaults to 3.3.3.
  • GIT - Git repository URL of pdftk-java, defaults to https://gitlab.com/pdftk-java/pdftk.git.
  • COMMIT - Git commit, branch or tag of pdftk-java, e.g. master, unset by default.

To build a custom OCI image from the current Git state, pass e.g. --build-arg COMMIT=master.
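The build arguments above can be forwarded from the environment with a small helper. This is a minimal sketch that assumes it is run from a checkout of this repository (so that the Dockerfile is in the current directory); the function name build_pdftk_image, the variable CONTAINER_ENGINE and the default tag pdftk:custom are hypothetical.

```shell
# Hypothetical helper: passes VERSION, GIT and COMMIT (when set) as build arguments.
# CONTAINER_ENGINE and the default tag are illustrative assumptions.
build_pdftk_image() {
  engine="${CONTAINER_ENGINE:-docker}"
  tag="${1:-pdftk:custom}"
  set -- build
  if [ -n "${VERSION:-}" ]; then
    set -- "$@" --build-arg "VERSION=$VERSION"
  fi
  if [ -n "${GIT:-}" ]; then
    set -- "$@" --build-arg "GIT=$GIT"
  fi
  if [ -n "${COMMIT:-}" ]; then
    set -- "$@" --build-arg "COMMIT=$COMMIT"
  fi
  # The trailing "." is the build context (the repository checkout).
  "$engine" "$@" --tag "$tag" .
}
```

For example, `COMMIT=master build_pdftk_image pdftk:dev` runs a build with --build-arg COMMIT=master and tags the result pdftk:dev.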

Pipeline / Workflow

Docker Hub and Quay can both automatically build OCI images from a linked GitHub account and automatically push the built image to the respective container repository. However, as of this writing, this produces OCI images for the amd64 CPU architecture only. To support as many CPU architectures as possible (currently 386, amd64, arm/v6, arm/v7, arm64/v8, ppc64le and s390x), GitHub Actions are used instead. There, the current standard workflow "Build and push OCI image" roughly works as follows: first a GitHub Action installs QEMU static binaries, then a GitHub Action sets up Docker Buildx, and finally a GitHub Action builds and pushes the Docker images with Buildx.

Thus the OCI images are effectively built within the GitHub infrastructure (using free minutes for public repositories) and then only pushed to both container repositories, Docker Hub and Quay (which are also free for public repositories). This not only avoids spending CPU resources on repeated builds but also ensures identical bugs regardless of which container repository the OCI image is eventually pulled from (and to some extent insulates the project from program changes such as the Docker Hub rate limiting introduced in 2020). Authentication for the pushes to the container repositories happens using access tokens, which at Docker Hub need to be bound to a (community) user and at Quay to a robot account that is part of the organization. These access tokens are saved as "repository secrets" in the settings of the GitHub project.
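The multi-architecture build described above (QEMU, then Buildx, then build and push) can be approximated locally with the Docker CLI. This is a sketch under assumptions, not the project's actual workflow file: the helper names platform_list and multiarch_build and the DRY_RUN switch are hypothetical, and pushing requires being logged in to the target registry.

```shell
# One-time setup on the build host (run manually):
#   docker run --privileged --rm tonistiigi/binfmt --install all   # QEMU emulators
#   docker buildx create --use                                     # Buildx builder

# CPU architectures listed above, written as Docker platform suffixes.
ARCHES="386 amd64 arm/v6 arm/v7 arm64/v8 ppc64le s390x"

# Join architecture names into a --platform value, e.g. "linux/386,linux/amd64".
platform_list() {
  list=""
  for arch in "$@"; do
    list="${list:+$list,}linux/$arch"
  done
  printf '%s\n' "$list"
}

# Hypothetical helper: build for all architectures and push under the given tag.
# Set DRY_RUN=1 to print the command instead of running it.
multiarch_build() {
  # $ARCHES is intentionally unquoted so that it splits into separate words
  # (assumes sh or bash with the default IFS).
  ${DRY_RUN:+echo} docker buildx build \
    --platform "$(platform_list $ARCHES)" \
    --tag "$1" --push .
}
```

Calling `DRY_RUN=1 multiarch_build example/pdftk:test` prints the full docker buildx command without building anything, which is a cheap way to check the platform list first; example/pdftk:test is a placeholder tag.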

For each release of the project, a new Git branch (named after the release version, e.g. 3.3.3) is created (based on the default branch, e.g. master). The workflow takes care of creating and moving container tags, such as latest. Because Git branches are used rather than Git tags, downstream bug fixes can easily be applied to the OCI image (e.g. for bugs in the Dockerfile or patches for the source code itself). Old branches are no longer touched, just like old release archives.

Each commit to a Git branch triggers the workflow and leads to OCI images being pushed (except for GitHub pull requests), where the container tag is always based on the Git branch name. OCI images with non-release container tags that were pushed for testing purposes need to be cleaned up manually at the container repositories. Additionally, a cron-like option in the workflow leads to a nightly build that is also tagged as edge.

Re-running a workflow for failed builds can be done in the "Actions" section of the GitHub web interface. However, to re-run older or successful builds (e.g. to pick up a newer operating system base image layer for an existing release), git commit --allow-empty -m "Reason" && git push might do the trick (because the GitHub Actions API doesn't seem to allow such re-runs either).

License

This project is licensed under the GNU General Public License, version 2 or later - see the LICENSE file for details.

As with all OCI images, these also contain other software under other licenses (such as BusyBox, OpenJDK etc. from the base distribution, along with any direct or indirect dependencies of the contained pdftk-java).

As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.

pdftk-java-container's People

Contributors

dependabot[bot], robert-scheck

pdftk-java-container's Issues

Docker is sunsetting Free Team organizations

As also mentioned at docker/hub-feedback#2314, Docker is sunsetting Free Team organizations on April 13th, 2023. The initial notification was sent by Docker via e-mail yesterday.

Docker expects users to switch to its paid plans; alternatively, there is a "specific Docker-Sponsored Open Source (DSOS) program for open-source projects" for which this project might or might not be eligible.

Unlike some other projects, this one has always pushed all container images to both Docker Hub and Quay.io. It could be worth considering GHCR in addition.

java.lang.NoSuchMethodError: java.nio.MappedByteBuffer.position(I)Ljava/nio/MappedByteBuffer

Hello there,
I just tried out the pdftk:edge docker image to get an easily deployable version of the latest pdftk, but it is running into this exception upon running.

I tried this:

$ docker run --rm --volume $(pwd):/work pdftk/pdftk:edge large.pdf update_info_utf8 large.dump output large_meta.pdf
Unable to find image 'pdftk/pdftk:edge' locally
edge: Pulling from pdftk/pdftk
96526aa774ef: Pull complete
c1e35dd5d9c7: Pull complete
9bfc9ff2b6ab: Pull complete
4f4fb700ef54: Pull complete
Digest: sha256:082bb56f3255655b376112c167764029b5ca60882984cd320f7e3b9a27a24ceb
Status: Downloaded newer image for pdftk/pdftk:edge
Error: Unexpected Exception in open_reader()
Unhandled Java Exception in main():
java.lang.NoSuchMethodError: java.nio.MappedByteBuffer.position(I)Ljava/nio/MappedByteBuffer;
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.MappedRandomAccessFile.seek(MappedRandomAccessFile.java:166)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.RandomAccessFileOrArray.seek(RandomAccessFileOrArray.java:374)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.RandomAccessFileOrArray.setStartOffset(RandomAccessFileOrArray.java:680)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.PRTokeniser.checkPdfHeader(PRTokeniser.java:191)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.PdfReader.readPdf(PdfReader.java:493)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.PdfReader.<init>(PdfReader.java:172)
        at com.gitlab.pdftk_java.com.lowagie.text.pdf.PdfReader.<init>(PdfReader.java:161)
        at com.gitlab.pdftk_java.InputPdf.add_reader(InputPdf.java:73)
        at com.gitlab.pdftk_java.TK_Session.add_reader(TK_Session.java:61)
        at com.gitlab.pdftk_java.TK_Session.open_input_pdf_readers(TK_Session.java:78)
        at com.gitlab.pdftk_java.TK_Session$Parser.parse_state_output_filename(TK_Session.java:854)
        at com.gitlab.pdftk_java.TK_Session$Parser.parse(TK_Session.java:209)
        at com.gitlab.pdftk_java.TK_Session.parse(TK_Session.java:1112)
        at com.gitlab.pdftk_java.pdftk.main_noexit(pdftk.java:184)
        at com.gitlab.pdftk_java.pdftk.main(pdftk.java:161)
There was a problem with pdftk-java. Please report it at
https://gitlab.com/pdftk-java/pdftk/issues
including the message above, the version of pdftk-java (3.3.3), and if possible steps to reproduce the error.

Then tried to build my own image but the outcome is the same.

docker build --build-arg COMMIT=master --tag=pdftk:20231025 .

Is it possible that something is broken there? There were a lot of recent commits in pdftk-java, and the AUR package pdftk-git seems to work.
