docker-flink / docker-flink Goto Github PK

View Code? Open in Web Editor NEW

141.0 14.0 107.0 416 KB

Docker packaging for Apache Flink

Home Page: https://flink.apache.org

License: Apache License 2.0

Shell 73.67% Dockerfile 26.33%

docker-flink's Issues

Request: Allow user to edit number of numberOfTaskSlots, parallelism

Requesting to allow users to edit some important configurations from flink-conf.yaml like

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.

taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify and other parallelism.

parallelism.default: 1

Rework add-version.sh into update.sh a la other repos

To conform with other similar setups, add-version.sh likely needs to become update.sh and be taught how to derive the latest version (e.g. 1.2.0) from a given release (e.g. 1.2) and assemble a .travis.yml file dynamically. Two examples are httpd and cassandra.

Upgrade to gosu 1.11

Currently we're using gosu 1.7 which is pretty ancient. Let's upgrade to 1.11.

Drop 1.1 support after Flink 1.2.1 is released

I've included 1.1 in this repo since the start mainly to ensure the tooling in here works for multiple, concurrent supported releases, but I'm not sure it can actually be used from Docker effectively.

FLINK-2821 found that the hostname of the jobmanager has to be exactly the same as the name the taskmanagers connect to it with, due to Akka. A fix/workaround for this was added in 1.2, but as far as I know it wasn't backported to 1.1.

We should actually try out the 1.1 images to see if they work at all, and if not, just drop support and provide official images for 1.2 onward.

Edit: original title "Evaluate and possibly delete 1.1 support" changed to "Drop 1.1 support after Flink 1.2.1 is released"

Add tests to validate a new image release

We lack tests to ensure that the modifications we do to the docker files don't break the images, we need to automate or at least document a simple validation test for this. A simple and valuable case is a simple flink job with checkpointing.

Remove alpine image

It seems that after docker-library/openjdk#322 the official openjdk image (the one we are based on) won't support alpine anymore.
As analternative we may consider basing the alpine version on https://hub.docker.com/r/adoptopenjdk/openjdk11 but probably we'll do better to stay aligned with upstream.

Support configuration properties available as env vars in Docker

Is there any plan for flink docker image to support overrite configurations via env vars ?

I am using kubernetes to deploy a flink session cluster. When I want to change taskmanager.heap.size, I have to use Configmap to overrite flink-conf.yaml in container.

I would like to contribute the feature if the idea is approved.

The docker image should not run as root

A user flink (9999) is created but the docker image is still starting using root.
it is a security concern and it make it not usable in a kubernetes environment where the use of securityContext is enforced

When I start the image with user 999, I got a error: failed switching to "flink": operation not permitted

here is an example fo the k8s manifest

kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
        - name: taskmanager
          image: flink:1.7.1-scala_2.11
          securityContext:
            runAsUser: 9999
            runAsNonRoot: true
          args:
            - taskmanager
          ports:
            - containerPort: 6121
              name: data
            - containerPort: 6122
              name: rpc
            - containerPort: 6125
              name: query
          env:
            - name: JOB_MANAGER_RPC_ADDRESS
              value: flink-jobmanager``

RocksDB state backend causes segfault on Alpine images

Alpine Linux uses MUSL instead of glibc, but the version of RocksDB embedded in Flink apparently requires some absent glibc functionality.

java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-33e471f8d228c175dfc6148869213083/librocksdbjni-linux64.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/rocksdb-lib-33e471f8d228c175dfc6148869213083/librocksdbjni-linux64.so)

By installing the Alpine Linux package libc6-compat, this error was resolved but was replaced by this one:

java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-fea5dea17151b5dfdc87d323576a04c3/librocksdbjni-linux64.so: Error relocating /tmp/rocksdb-lib-fea5dea17151b5dfdc87d323576a04c3/librocksdbjni-linux64.so: __strtod_internal: symbol not found

I didn't find an obvious solution to that one, but it's possible installing an additional package for compatibility would work.

We might want to consider dropping the official -alpine images until this is resolved.

cc @iemejia @StephanEwen

Socket Window WordCounter example error

I've been trying to solve this issue with the SocketWindowWordCounter.jar example with no luck. I ran the image local, started the cluster with ./bin/start-cluster.sh, ran nc -l 9000 on a second shell attached to the container, then submitted the example with ./bin/flink run examples/streaming/SocketWindowWordCount.jar. The job hangs for 2 - 3 seconds before throwing this exception:

root@4689051a20fe:/opt/flink# ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
Starting execution of program

The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:264)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
at org.apache.flink.streaming.examples.socket.SocketWindowWordCount.main(SocketWindowWordCount.java:92)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
at java.lang.Thread.run(Thread.java:748)

I could run some of the other streaming examples and fetch the results from the task manager's .out file. I tried this on both docker for windows and docker on mac and keep getting the same issue.

Am I missing something here?

Add a new image based on openjdk:8-jre-slim

There is a new set of images (still experimental) that are slimmer versions of the debian and openjdk image, this can reduce the size of our current flink image.

RocksDB state backend causes segfault on Alpine images again

Seems like the problem in #14 has reappeared at least for 1.4.1-hadoop27-scala_2.11-alpine.

Anyone else experience this?

Support Hadoop-free 1.4

The scripts currently don't support the absence of a Hadoop version, though Flink 1.4 now includes a Hadoop-free release.

Update the scripts to build a Hadoop-free image, and ideally, change the Hadoop-ful images to use it as an upstream, only adding the additional jar to /opt/flink/lib.

`Error loading shared library ld-linux-x86-64.so.2` on running jar over http

Hi guys, I'm using the image tagged 1.7.2-scala_2.11-alpine however I seem to be getting shared lib issues. I'm submitting my pipelines via the http API and it's throwing the following exception when constructing the job graph:

java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.4-a1c7fa1d-2146-4884-8450-3c5b7ad994a3-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/snappy-1.1.4-a1c7fa1d-2146-4884-8450-3c5b7ad994a3-libsnappyjava.so)
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)
	at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:179)
	at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
	at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:50)
	at org.apache.beam.sdk.util.SerializableUtils.clone(SerializableUtils.java:100)
	at org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:79)
	at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:129)
	at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:124)
	at org.apache.beam.sdk.io.Read.from(Read.java:56)
	at org.apache.beam.sdk.io.kafka.KafkaIO$Read.expand(KafkaIO.java:725)
	at org.apache.beam.sdk.io.kafka.KafkaIO$Read.expand(KafkaIO.java:309)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
	at org.apache.beam.sdk.io.kafka.KafkaIO$TypedWithoutMetadata.expand(KafkaIO.java:837)
	at org.apache.beam.sdk.io.kafka.KafkaIO$TypedWithoutMetadata.expand(KafkaIO.java:826)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
	at net.thoughtmachine.common.streamio.KafkaRead.expand(KafkaRead.java:31)
	at net.thoughtmachine.common.streamio.KafkaRead.expand(KafkaRead.java:11)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
	at net.thoughtmachine.balance.BalancePipeline.lambda$expand$0(BalancePipeline.java:29)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1696)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at net.thoughtmachine.balance.BalancePipeline.expand(BalancePipeline.java:30)
	at net.thoughtmachine.balance.BalancePipeline.expand(BalancePipeline.java:17)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
	at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)
	at net.thoughtmachine.balance.FlinkBalancePipeline.main(FlinkBalancePipeline.java:39)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
	at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
	at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:78)
	at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:120)
	at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toJobGraph(JarHandlerUtils.java:117)
	at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$getJobGraphAsync$7(JarRunHandler.java:151)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Seems that snappy can't find ld-linux-x86-64.so.2. Looking inside the container, this so has been installed under /lib64.

I tried the debian based image quickly and that works fine however it's not really suitable for our purposes as we'd like to keep to alpine across the board if possible.

high availability configuration

Allow high availability configuration by specifying zookeeper parameters etc.

Doc issue ?

Hello,

I believe that there might be a mistake in the doc in "Running a JobManager or a TaskManager"
Given command for jobmanager : "$ docker run --name flink_jobmanager -d -t flink taskmanager"
Given command for taskmanager : "$ docker run --name flink_taskmanager -d -t flink taskmanager"

For me jobmanager command should be "$ docker run --name flink_jobmanager -d -t flink jobmanager"

Am I right or did I understand something wrong ?

The docker-flink image should be built via automated builds

The docker hub configuration shall not have manually pushed images, those should be built from this repo.
This guarantees assurance for its users that the images have not been altered.

Probably we shall move this into a new github organization (docker-flink) to make maintenance of the automated builds easier.

Add BLOB_SERVER_PORT, JOB_MANAGER_RPC_PORT, JOB_MANAGER_WEB_PORT environment option

Hi guys,

I'm just getting to grips with these images and I'm trying to include them in our suite of docker-based automated end-to-end tests.

One thing that's tripping me up is the blob.server.port configuration. It seems to pick an ephemeral port by default, meaning its impossible to map the port externally when creating the docker instance. This of course means you can't submit jar files to the cluster, because it fails with:

Could not connect to BlobServer at address localhost/127.0.0.1:41989

Would it be possible to add a BLOB_SERVER_PORT environment variable to allow overrides in the config and pin this to a port or range of ports. Then we can programmatically expose this when we spin up the containers in our tests.

Thanks,

Tom

Edit:

I have since discovered that the docker container is incompatible with port mapping. If you start local cluster without changing any config (jobmanager on 6123) and expose the port with a simple mapping -p 6123:6123, then an external cli call of flink list -m localhost:6123 will work fine. However, if you change that mapping to -p 6124:6123, then flink list -m localhost:6124 call will fail. In the JobManager log you can see the message being rejected with:

2017-12-22 09:58:09,656 ERROR akka.remote.EndpointWriter                                    - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@localhost:6124/]] arriving at [akka.tcp://flink@localhost:6124] inbound addresses are [akka.tcp://flink@localhost:6123]

For this reason, I think the images need to support changing of ports through environment variables. I am going to look into doing this myself, but I am no expert in Docker image configuration. I will update this ticket with how I get on.

Python batch API example issues

Hi guys,
i am playing with flink and would like to try python batch API. Unfortunately having some issues even with examples contained in the docker itself.

using docker-compose example from the docker hub https://hub.docker.com/_/flink/ (using docker flink:1.6.1)
using following Dockerfile to get python installed

FROM flink:1.6.1

RUN apt-get -y update && apt-get -y install python python-dev python-pip

connecting into the job manager container
executing following command

/opt/flink/bin/pyflink.sh /opt/flink/examples/python/batch/WordCount.py

This is CLI output:

Starting execution of program
Failed to run plan: Job failed. (JobID: e1a88feaf33c99926bde821ca9f35de3)

The program didn't contain a Flink job. Perhaps you forgot to call execute() on the execution environment.

Actual exception for the job taken from UI:

java.io.IOException: Cannot run program "kill": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
	at java.lang.Runtime.exec(Runtime.java:620)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.flink.python.api.streaming.data.PythonStreamer.destroyProcess(PythonStreamer.java:223)
	at org.apache.flink.python.api.streaming.data.PythonStreamer.close(PythonStreamer.java:198)
	at org.apache.flink.python.api.functions.PythonMapPartition.close(PythonMapPartition.java:69)
	at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
	at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:507)
	at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
	... 10 more

Any idea what could be wrong?

Thanks!

How to configure persistent directories ?

Add multi arch images for all the architectures supported by openjdk

Should the debian- or alpine-based images be the default?

Most projects appear to make their default (e.g. "latest" "1.2") images be those based on debian, but what's the real benefit there? The alpine images are significantly smaller, and I don't see why we shouldn't have them be the default, relegating the debian-based images to e.g. "latest-debian" and "1.2-debian".

Add doc/info on how to run the flink image in High Availability (HA) mode

It would be nice to include this for users who want to configure the full HA mode, we can look at the Solr or the Storm docker image for an example.

Pass through JVM crash error report

When the Java Runtime Environment segfaults, it dumps its core and an error report:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00000000000255e6, pid=1, tid=0x00007fd79a2dbae8
#
# JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.9.0
# Distribution: Custom build (Tue Oct 23 11:27:22 UTC 2018)
# Problematic frame:
# C  0x00000000000255e6
#
# Core dump written. Default location: /flink/bin/core or core.1
#
# An error report file with more information is saved as:
# /flink/bin/hs_err_pid1.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#

In a Kubernetes environment, it's not trivial to excavate this error report. Because we're on OpenJDK8, we can't politely ask the JVM to push the report to stdout instead. To make the log more easily accessible I propose adding a few lines to the docker-entrypoint to check (after main process termination) if the report exists, and to cat it to stdout.

logs question

Hello,
I am running docker versions of flink using a docker-compose.yaml file as shown on DockerHub. When I request to view logs from the Taskmanager (Stdout tab) I get the following error
java.io.IOException: TaskManager log files are unavailable. Log file location not found in environment variable log.file or configuration key taskmanager.log.path.
There are no environment variables set. I thought the default log4j and /opt/flink/log would be used.
I am running Docker 17.12.0-ce on windows.
JP

Drop Alpine-based image support for now

Until #14 is resolved we should not generate images based on Alpine Linux.

New Images not recognizing jobmanager RPC

While using an older sha docker image sha256:8757a61bd995dc43ba63d04ef575251fb36fc4e9794a3e9809fe7f443222ded8 the JOB_MANAGER_RPC_ADDRESS environment variable and configuration in the yaml file jobmanager.rpc.address are both recognized and used when the jobmanager/taskmanager starts up.

In the latest docker images pushed up a few days ago, the rpc address is cluster and not being changed by the values being passed in.

Prepare documentation for the official image

Docker's official images have to be well documented, as seen in this repo. We need to prepare this too.
https://github.com/docker-library/docs

Run "non-root" test as flink user, not arbitrary user

It would be nice to run this test as an arbitrary user, but since we rely on writing out a config file, the image actually has to be run as either root or the flink user.

docker-flink standalone mode

How to use docker image deploy standalone mode, need depend the Hadoop environment ?

Dockerfile

FROM flink:1.6.2-hadoop27-scala_2.11
ENV TZ=Asia/Shanghai
ENV LANG zh_CN.UTF-8
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && \
    echo $TZ > /etc/timezone

COPY docker-entrypoint.sh /
COPY flink-conf.yaml /opt/flink/conf/flink-conf.yaml

ENTRYPOINT ["/docker-entrypoint.sh"]

flink-conf.yaml

jobmanager.rpc.port: 6123
jobmanager.heap.size: 512
taskmanager.heap.size: 512
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
rest.port: 8081
blob.server.port: 6124
query.server.port: 6125
web.tmpdir: /opt/flink/webTmp
web.log.path: /opt/flink/log
taskmanager.tmp.dirs: /opt/flink/taskManagerTmp
high-availability.storageDir: hdfs://10.1.2.109:8020/wulin/
high-availability.zookeeper.quorum: 10.1.2.11:2181
high-availability.zookeeper.path.root: /flink
gh-availability.cluster-id: /flink

docker-entrypoint.sh

#!/bin/sh
JOB_MANAGER_RPC_ADDRESS=${JOB_MANAGER_RPC_ADDRESS}

drop_privs_cmd() {
    if [ -x /sbin/su-exec ]; then
        # Alpine
        echo su-exec
    else
        # Others
        echo gosu
    fi
}

if [ "$1" = "help" ]; then
    echo "Usage: $(basename "$0") (jobmanager|taskmanager|help)"
    exit 0
elif [ "$1" = "jobmanager" ]; then
    shift 1
    echo "Starting Job Manager"
    echo "$(sed -e "s#jobmanager\.rpc\.address: localhost#jobmanager\.rpc\.address: $JOB_MANAGER_RPC_ADDRESS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
  
    echo "config file: " && grep '^[^\n#]' $FLINK_HOME/conf/flink-conf.yaml
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/jobmanager.sh" start-foreground "$@"
elif [ "$1" = "cluster" ]; then
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/start-cluster.sh" start-foreground "$@"
elif [ "$1" = "taskmanager" ]; then
    TASK_MANAGER_NUMBER_OF_TASK_SLOTS=${TASK_MANAGER_NUMBER_OF_TASK_SLOTS:-$(grep -c ^processor /proc/cpuinfo)}
    echo "$(sed -e "s#jobmanager\.rpc\.address: localhost#jobmanager\.rpc\.address: $JOB_MANAGER_RPC_ADDRESS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
    echo "$(sed -e "s#taskmanager\.numberOfTaskSlots: 1#taskmanager\.numberOfTaskSlots: $TASK_MANAGER_NUMBER_OF_TASK_SLOTS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
   
    echo "Starting Task Manager"
    echo "config file: " && grep '^[^\n#]' "$FLINK_HOME/conf/flink-conf.yaml"
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/taskmanager.sh" start-foreground
fi

exec "$@"

masters

10.1.2.11:8081
10.1.2.10:8081

docker build -t flink:1.0 .

docker run --name jobmanager \
    --restart always \
    --net host \
    -v $PWD/flink-conf.yaml:/opt/flink/conf/flink-conf.yaml \
    -v $PWD/masters:/opt/flink/conf/masters \
    -e JOB_MANAGER_RPC_ADDRESS=10.1.2.11 \
    -d flink:1.0 cluster

Drop 'USER flink' and make docker-entrypoint.sh drop privs

We may want to teach docker-entrypoint.sh to drop privs to the flink user so docker run flink bash gives a root shell into the container. An example is Elasticsearch (debian, alpine).

Create docker images for 1.4 release and CI process for testing future RC releases.

1.4 has been released. Hurray!

But there isn't yet a dockerfile / image for the release. Furthermore, it would be nice to have dockerfile / images for each RC release, so people can incrementally test features using docker before release.

Is it possible to use CI to generate images and add Dockerfiles for each release?

Iterate through keyservers when fetching GPG key for gosu

See tianon/gosu#35 for context.

Create subdirectories per version

We should maybe create a subdirectory for each flink-version, so instead of current

1.1/hadoop24-scala_2.10-alpine
...
1.2/hadoop24-scala_2.10-alpine
...

we may better have:

1.1.4/hadoop24-scala_2.10-alpine
...
1.2.0/hadoop24-scala_2.10-alpine
...

Docker's Official Images build of Flink is broken

See the Full Build Log for details.

Snippet:

gpg: key 1F302569A96CFFD5: public key "Till Rohrmann (stsffap) <[email protected]>" imported
gpg: Total number processed: 15
gpg:               imported: 15
gpg: no ultimately trusted keys found
+ gpg --batch --verify flink.tgz.asc flink.tgz
gpg: Signature made Wed Oct 17 20:23:15 2018 UTC
gpg:                using RSA key C2EED7B111D464BA
gpg: Good signature from "Chesnay Schepler (CODE SIGNING KEY) <[email protected]>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 19F2 195E 1B48 16D7 65A2  C324 C2EE D7B1 11D4 64BA
+ rm -rf /tmp/tmp.BgHhiG flink.tgz.asc
rm: can't remove '/tmp/tmp.BgHhiG/S.gpg-agent.ssh': No such file or directory
Removing intermediate container dab02cab8ec8

Essentially I think we're hitting THIS BUG.

@tianon @yosifkit any ideas from your side?

How to submit new job from docker-compose.yml ?

docker exec -it $(docker ps --filter name=root_jobmanager --format={{.ID}}) flink run -c class.Name /new/job.jar
It works in command line.
How to add the job automatically when flink start?

[DISCUSS] Default Java configuration params for containers

When we execute applications in Java containers, by default Java does not respect the cgroups limits for the heap resulting in some inconsistent behavior, see:
https://blog.csanchez.org/2017/05/31/running-a-jvm-in-a-container-without-getting-killed/

I am really inclined to add some of these flag options as a default for the docker image but I am not sure if this fix should be let to the user. WDYT @patricklucas ?

Support multiple architectures in the docker images

Official docker images support now multi architecture builds, we need to add support for this in the docker image.
https://blog.docker.com/2017/09/docker-official-images-now-multi-platform/
It seems we need just to add some metadata and validate that this works.
https://github.com/docker-library/official-images#multiple-architectures

Change docker-flink logo

For trademark reasons we had to change the logo of this repository, I want to
propose to reuse the logo of the blog post as the official docker-flink logo.

https://activerain-store.s3.amazonaws.com/image_store/uploads/8/7/6/3/9/ar12988558393678.JPG

Blob server needs to bind to known port

The jobmanager's blob server currently binds to an arbitrary port, but should be fixed so it can be exposed outside of the container.

Improve GPG signature handling

According to the discussion on docker-library/official-images#5576, we should make some improvements to how GPG signatures are handled.

Namely, instead of fetching and including the KEYS file from the Apache mirror, keep a simple mapping of Flink version to GPG key id and fetch just that key during the build. To avoid key server reliability issues, loop through a shuffled list of mirrors, but always try ha.pool.sks-keyservers.net first to allow the official images build server to utilize the approach described here.

exec: local: not found

When I run
docker run --name flink_local -p 8081:8081 -t flink local
It returns
/docker-entrypoint.sh: 58: exec: local: not found

When I run
docker ps -a
It returns
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4134e327eb89 flink "/docker-entrypoin..." 2 minutes ago Exited (127) 2 minutes ago flink_local

How can I start it?

RocksDB causes error when mounting a tmpfs to /tmp

Adding a tmpfs: /tmp entry to a docker-compose file will make the RocksDB backend fail to load.

Example compose file:

version: '2.0'

services:
  flank:
    image: flink:1.5.1-hadoop28-scala_2.11
    entrypoint: |
      bash -c "
        echo 'state.backend: rocksdb' >> conf/flink-conf.yaml;
        echo 'state.backend.fs.checkpointdir: file:///tmp/checkpoints' >> conf/flink-conf.yaml;
        taskmanager.sh start-foreground local &
        jobmanager.sh start-foreground local &
        sleep 10;
        flink run examples/streaming/WindowJoin.jar
      "
    ports:
     - "8081:8081"
    tmpfs: /tmp

The output will contain an error like java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-[...]/librocksdbjni-linux64.so: /tmp/rocksdb-lib-[...]/librocksdbjni-linux64.so: failed to map segment from shared object

I'm guessing that this is unrelated to #14 / #37.

I'm not all that familiar with JNI and shared object loading under linux, so I'm not sure this is intended to be a possible configuration.

Cleanup images after running tests

Image updates in `openjdk:8` cause build failure

We updated the openjdk:8 images to Debian Stretch in docker-library/openjdk#124 so that it can follow the latest openjdk builds and have newer dependencies. As a result, libsnappy1 is no longer available, but is now libsnappy1v5.

Do you want me to make a PR to update Dockerfile-debian.template?

Add a generate-stackbrew-library.sh to facilitate automated builds

We should have a generate-stackbrew-library.sh script to facilitate automated building, pushing, and tagging of Docker images. Two examples are httpd and cassandra.

Add Support for ARM64V8 architecture

Hi,

The Flink docker package is building and running fine at my end for ARM64v8 architecture.

Branch used to build is:

1.5/scala_2.11-debian

Please suggest, what needs to be done to raise the request to add support for ARM64v8 arch at official docker library too.

Regards,

Update for FLIP-6 changes

The local deployment mode is no longer available, but it would be good to still have a single container option for development. See https://lists.apache.org/thread.html/c9c1fe8d96474300a2f780223327876ebe393d02673d41b3afeff901@%3Cdev.flink.apache.org%3E

Flink 1.7 on docker hub

Flink 1.7.0 is out.
What about docker image on docker hub?
https://hub.docker.com/r/library/flink/

Use Java 11

Java 8 will be end-of-life in 2019/2020, and Java 11 was released in September 2018 which is long-term support. Currently written programs in Java 9 and later don't work due to the Flink image using an outdated version of Java. Is there any reason why Java 8 is still being used?

docker-flink / docker-flink Goto Github PK

docker-flink's Issues

How to use docker image deploy standalone mode, need depend the Hadoop environment ?

Recommend Projects

Recommend Topics

Recommend Org

Jobs