
Docker packaging for Apache Flink

Home Page: https://flink.apache.org

License: Apache License 2.0


docker-flink's Introduction

MOVED: docker-flink

This repo has moved to apache/flink-docker and will receive no further updates.


Docker packaging for Apache Flink

Use add-version.sh to rebuild the Dockerfiles and all variants for a particular Flink release. Before running this, you must first delete the existing release directory.

usage: ./add-version.sh -r flink-release -f flink-version

Example

$ rm -r 1.2
$ ./add-version.sh -r 1.2 -f 1.2.1

Stackbrew Manifest

generate-stackbrew-library.sh is used to generate the library file required for official Docker Hub images.

When this repo is updated, the output of this script should be used to replace the contents of library/flink in the Docker official-images repo via a PR.

Note: running this script requires the bashbrew binary and a compatible version of Bash. The Docker image plucas/docker-flink-build contains these dependencies and can be used to run this script.

Example:

docker run --rm \
    --volume /path/to/docker-flink:/build \
    plucas/docker-flink-build \
    /build/generate-stackbrew-library.sh \
> /path/to/official-images/library/flink

License

Licensed under the Apache License, Version 2.0: https://www.apache.org/licenses/LICENSE-2.0

Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.


docker-flink's Issues

Socket Window WordCount example error

I've been trying to solve this issue with the SocketWindowWordCount.jar example with no luck. I ran the image locally, started the cluster with ./bin/start-cluster.sh, ran nc -l 9000 in a second shell attached to the container, then submitted the example with ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000. The job hangs for 2-3 seconds before throwing this exception:

root@4689051a20fe:/opt/flink# ./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000
Starting execution of program


The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:264)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
at org.apache.flink.streaming.examples.socket.SocketWindowWordCount.main(SocketWindowWordCount.java:92)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528)
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404)
at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785)
at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214)
at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at org.apache.flink.streaming.api.functions.source.SocketTextStreamFunction.run(SocketTextStreamFunction.java:96)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
at java.lang.Thread.run(Thread.java:748)


I could run some of the other streaming examples and fetch the results from the task manager's .out file. I tried this on both Docker for Windows and Docker for Mac and keep getting the same issue.

Am I missing something here?
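
For reference, here is the sequence described above as a single snippet (run inside the container, paths as given):

# in one shell attached to the container
./bin/start-cluster.sh
nc -l 9000

# in a second shell attached to the container
./bin/flink run examples/streaming/SocketWindowWordCount.jar --port 9000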

Create subdirectories per version

We should maybe create a subdirectory for each Flink version, so instead of the current layout

1.1/hadoop24-scala_2.10-alpine
...
1.2/hadoop24-scala_2.10-alpine
...

it might be better to have:

1.1.4/hadoop24-scala_2.10-alpine
...
1.2.0/hadoop24-scala_2.10-alpine
...

Add BLOB_SERVER_PORT, JOB_MANAGER_RPC_PORT, JOB_MANAGER_WEB_PORT environment option

Hi guys,

I'm just getting to grips with these images and I'm trying to include them in our suite of docker-based automated end-to-end tests.

One thing that's tripping me up is the blob.server.port configuration. It seems to pick an ephemeral port by default, meaning it's impossible to map the port externally when creating the Docker instance. This of course means you can't submit jar files to the cluster, because it fails with:

Could not connect to BlobServer at address localhost/127.0.0.1:41989

Would it be possible to add a BLOB_SERVER_PORT environment variable to allow overriding the config and pinning this to a port or range of ports? Then we could programmatically expose it when we spin up the containers in our tests.

Thanks,

Tom

Edit:

I have since discovered that the Docker container is incompatible with port mapping. If you start a local cluster without changing any config (jobmanager on 6123) and expose the port with a simple mapping -p 6123:6123, then an external CLI call of flink list -m localhost:6123 works fine. However, if you change that mapping to -p 6124:6123, then the flink list -m localhost:6124 call fails. In the JobManager log you can see the message being rejected with:

2017-12-22 09:58:09,656 ERROR akka.remote.EndpointWriter                                    - dropping message [class akka.actor.ActorSelectionMessage] for non-local recipient [Actor[akka.tcp://flink@localhost:6124/]] arriving at [akka.tcp://flink@localhost:6124] inbound addresses are [akka.tcp://flink@localhost:6123]

For this reason, I think the images need to support changing of ports through environment variables. I am going to look into doing this myself, but I am no expert in Docker image configuration. I will update this ticket with how I get on.
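
A sketch of the requested usage (BLOB_SERVER_PORT is the proposed variable, not something the image supports today):

docker run -d \
    -e BLOB_SERVER_PORT=6124 \
    -p 6123:6123 \
    -p 6124:6124 \
    flink jobmanager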

Use Java 11

Java 8 will reach end of life in 2019/2020, and Java 11, released in September 2018, is a long-term-support release. Programs written for Java 9 or later currently don't work because the Flink image uses an outdated version of Java. Is there any reason why Java 8 is still being used?

Doc issue ?

Hello,

I believe there might be a mistake in the docs under "Running a JobManager or a TaskManager".
Given command for jobmanager: "$ docker run --name flink_jobmanager -d -t flink taskmanager"
Given command for taskmanager: "$ docker run --name flink_taskmanager -d -t flink taskmanager"

I think the jobmanager command should be "$ docker run --name flink_jobmanager -d -t flink jobmanager"

Am I right, or did I misunderstand something?

Drop 1.1 support after Flink 1.2.1 is released

I've included 1.1 in this repo since the start, mainly to ensure the tooling in here works for multiple concurrently supported releases, but I'm not sure it can actually be used from Docker effectively.

FLINK-2821 found that the hostname of the jobmanager has to be exactly the same as the name the taskmanagers connect to it with, due to Akka. A fix/workaround for this was added in 1.2, but as far as I know it wasn't backported to 1.1.

We should actually try out the 1.1 images to see if they work at all, and if not, just drop support and provide official images for 1.2 onward.

Edit: original title "Evaluate and possibly delete 1.1 support" changed to "Drop 1.1 support after Flink 1.2.1 is released"

Support configuration properties available as env vars in Docker

Is there any plan for the flink Docker image to support overriding configuration via env vars?

I am using Kubernetes to deploy a Flink session cluster. When I want to change taskmanager.heap.size, I have to use a ConfigMap to overwrite flink-conf.yaml in the container.

I would like to contribute the feature if the idea is approved.
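
A minimal sketch of the kind of entrypoint logic being requested (FLINK_TASKMANAGER_HEAP_SIZE is a hypothetical variable name, not an existing one):

# rewrite a flink-conf.yaml key from an environment variable, if set
if [ -n "$FLINK_TASKMANAGER_HEAP_SIZE" ]; then
    sed -i "s/^taskmanager\.heap\.size:.*/taskmanager.heap.size: $FLINK_TASKMANAGER_HEAP_SIZE/" \
        "$FLINK_HOME/conf/flink-conf.yaml"
fi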

`Error loading shared library ld-linux-x86-64.so.2` on running jar over http

Hi guys, I'm using the image tagged 1.7.2-scala_2.11-alpine, but I seem to be getting shared-library issues. I'm submitting my pipelines via the HTTP API and it throws the following exception when constructing the job graph:

java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.4-a1c7fa1d-2146-4884-8450-3c5b7ad994a3-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/snappy-1.1.4-a1c7fa1d-2146-4884-8450-3c5b7ad994a3-libsnappyjava.so)
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)
	at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:179)
	at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:97)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:89)
	at org.xerial.snappy.SnappyOutputStream.<init>(SnappyOutputStream.java:79)
	at org.apache.beam.sdk.util.SerializableUtils.serializeToByteArray(SerializableUtils.java:50)
	at org.apache.beam.sdk.util.SerializableUtils.clone(SerializableUtils.java:100)
	at org.apache.beam.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:79)
	at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:129)
	at org.apache.beam.sdk.io.Read$Unbounded.<init>(Read.java:124)
	at org.apache.beam.sdk.io.Read.from(Read.java:56)
	at org.apache.beam.sdk.io.kafka.KafkaIO$Read.expand(KafkaIO.java:725)
	at org.apache.beam.sdk.io.kafka.KafkaIO$Read.expand(KafkaIO.java:309)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
	at org.apache.beam.sdk.io.kafka.KafkaIO$TypedWithoutMetadata.expand(KafkaIO.java:837)
	at org.apache.beam.sdk.io.kafka.KafkaIO$TypedWithoutMetadata.expand(KafkaIO.java:826)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:471)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:44)
	at net.thoughtmachine.common.streamio.KafkaRead.expand(KafkaRead.java:31)
	at net.thoughtmachine.common.streamio.KafkaRead.expand(KafkaRead.java:11)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
	at net.thoughtmachine.balance.BalancePipeline.lambda$expand$0(BalancePipeline.java:29)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1696)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at net.thoughtmachine.balance.BalancePipeline.expand(BalancePipeline.java:30)
	at net.thoughtmachine.balance.BalancePipeline.expand(BalancePipeline.java:17)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:488)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
	at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:182)
	at net.thoughtmachine.balance.FlinkBalancePipeline.main(FlinkBalancePipeline.java:39)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
	at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
	at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:78)
	at org.apache.flink.client.program.PackagedProgramUtils.createJobGraph(PackagedProgramUtils.java:120)
	at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toJobGraph(JarHandlerUtils.java:117)
	at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$getJobGraphAsync$7(JarRunHandler.java:151)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

It seems that snappy can't find ld-linux-x86-64.so.2. Looking inside the container, this .so is installed under /lib64.

I quickly tried the Debian-based image and that works fine; however, it's not really suitable for our purposes, as we'd like to stick with Alpine across the board if possible.
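
A possible workaround sketch (untested here), assuming the libc6-compat package resolves this the same way it partially does for the RocksDB issue reported below:

FROM flink:1.7.2-scala_2.11-alpine
# libc6-compat provides ld-linux-x86-64.so.2 on musl-based Alpine
RUN apk add --no-cache libc6-compat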

The docker image should not run as root

A flink user (uid 9999) is created, but the Docker image still starts as root.
This is a security concern and makes the image unusable in a Kubernetes environment where the use of securityContext is enforced.

When I start the image with user 9999, I get the error: failed switching to "flink": operation not permitted

Here is an example of the k8s manifest:

kind: Deployment
metadata:
  name: flink-taskmanager
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      containers:
        - name: taskmanager
          image: flink:1.7.1-scala_2.11
          securityContext:
            runAsUser: 9999
            runAsNonRoot: true
          args:
            - taskmanager
          ports:
            - containerPort: 6121
              name: data
            - containerPort: 6122
              name: rpc
            - containerPort: 6125
              name: query
          env:
            - name: JOB_MANAGER_RPC_ADDRESS
              value: flink-jobmanager

How to submit new job from docker-compose.yml ?

docker exec -it $(docker ps --filter name=root_jobmanager --format={{.ID}}) flink run -c class.Name /new/job.jar

This works from the command line. How can I add the job automatically when Flink starts?
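
One possible approach (a sketch, not an official pattern from this repo): add a one-shot client service to the compose file that retries the submission until the jobmanager is reachable. Service names and paths here are hypothetical:

version: '2.0'
services:
  jobmanager:
    image: flink:1.7
    command: jobmanager
  job-submitter:
    image: flink:1.7
    depends_on:
      - jobmanager
    volumes:
      - ./job.jar:/new/job.jar
    entrypoint: |
      bash -c "
        # retry until the jobmanager accepts the job
        until flink run -m jobmanager:8081 -c class.Name /new/job.jar; do sleep 5; done
      "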

Add Support for ARM64V8 architecture

Hi,

The Flink Docker package builds and runs fine on my end for the ARM64v8 architecture.

The branch used to build is:

1.5/scala_2.11-debian

Please suggest what needs to be done to request ARM64v8 support in the official Docker library as well.

Regards,

Should the debian- or alpine-based images be the default?

Most projects appear to make their default (e.g. "latest", "1.2") images the ones based on Debian, but what's the real benefit there? The Alpine images are significantly smaller, and I don't see why we shouldn't make them the default, relegating the Debian-based images to e.g. "latest-debian" and "1.2-debian".

Python batch API example issues

Hi guys,
I am playing with Flink and would like to try the Python batch API. Unfortunately I'm having issues even with the examples contained in the Docker image itself.

  1. using the docker-compose example from Docker Hub https://hub.docker.com/_/flink/ (using flink:1.6.1)
  2. using the following Dockerfile to get Python installed:

FROM flink:1.6.1

RUN apt-get -y update && apt-get -y install python python-dev python-pip

  3. connecting into the job manager container
  4. executing the following command:

/opt/flink/bin/pyflink.sh /opt/flink/examples/python/batch/WordCount.py

This is CLI output:

Starting execution of program
Failed to run plan: Job failed. (JobID: e1a88feaf33c99926bde821ca9f35de3)

The program didn't contain a Flink job. Perhaps you forgot to call execute() on the execution environment.

Actual exception for the job taken from UI:

java.io.IOException: Cannot run program "kill": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
	at java.lang.Runtime.exec(Runtime.java:620)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.flink.python.api.streaming.data.PythonStreamer.destroyProcess(PythonStreamer.java:223)
	at org.apache.flink.python.api.streaming.data.PythonStreamer.close(PythonStreamer.java:198)
	at org.apache.flink.python.api.functions.PythonMapPartition.close(PythonMapPartition.java:69)
	at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
	at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:507)
	at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
	at java.lang.ProcessImpl.start(ProcessImpl.java:134)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
	... 10 more

Any idea what could be wrong?

Thanks!
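
One possible cause (an assumption, not confirmed in this thread): on Debian-based images the /bin/kill binary comes from the procps package, which slim base images may omit. A derived image could add it alongside Python:

FROM flink:1.6.1

# procps provides /bin/kill, which PythonStreamer shells out to
RUN apt-get -y update && apt-get -y install python python-dev python-pip procps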

The docker-flink image should be built via automated builds

The Docker Hub configuration should not have manually pushed images; they should be built from this repo.
This guarantees to users that the images have not been altered.

We should probably move this into a new GitHub organization (docker-flink) to make maintenance of the automated builds easier.

New Images not recognizing jobmanager RPC

When using an older Docker image (sha256:8757a61bd995dc43ba63d04ef575251fb36fc4e9794a3e9809fe7f443222ded8), the JOB_MANAGER_RPC_ADDRESS environment variable and the jobmanager.rpc.address setting in the yaml file are both recognized and used when the jobmanager/taskmanager starts up.

In the latest Docker images, pushed a few days ago, the rpc address remains cluster and is not changed by the values being passed in.

Improve GPG signature handling

According to the discussion on docker-library/official-images#5576, we should make some improvements to how GPG signatures are handled.

Namely, instead of fetching and including the KEYS file from the Apache mirror, keep a simple mapping of Flink version to GPG key id and fetch just that key during the build. To avoid key server reliability issues, loop through a shuffled list of mirrors, but always try ha.pool.sks-keyservers.net first to allow the official images build server to utilize the approach described here.
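
A sketch of the fetch loop described above (the fallback mirror list is illustrative, and GPG_KEY stands for the per-version key id):

# try the preferred pool first, then a shuffled list of fallback servers
for server in ha.pool.sks-keyservers.net \
        $(shuf -e hkp://p80.pool.sks-keyservers.net:80 pgp.mit.edu keyserver.ubuntu.com); do
    gpg --batch --keyserver "$server" --recv-keys "$GPG_KEY" && break
done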

Pass through JVM crash error report

When the Java Runtime Environment segfaults, it dumps its core and an error report:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00000000000255e6, pid=1, tid=0x00007fd79a2dbae8
#
# JRE version: OpenJDK Runtime Environment (8.0_181-b13) (build 1.8.0_181-b13)
# Java VM: OpenJDK 64-Bit Server VM (25.181-b13 mixed mode linux-amd64 compressed oops)
# Derivative: IcedTea 3.9.0
# Distribution: Custom build (Tue Oct 23 11:27:22 UTC 2018)
# Problematic frame:
# C  0x00000000000255e6
#
# Core dump written. Default location: /flink/bin/core or core.1
#
# An error report file with more information is saved as:
# /flink/bin/hs_err_pid1.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#

In a Kubernetes environment, it's not trivial to excavate this error report. Because we're on OpenJDK8, we can't politely ask the JVM to push the report to stdout instead. To make the log more easily accessible I propose adding a few lines to the docker-entrypoint to check (after main process termination) if the report exists, and to cat it to stdout.
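
A minimal sketch of that entrypoint change (log location taken from the report above; jobmanager case shown):

# run in the foreground without exec so the script regains control afterwards
"$FLINK_HOME/bin/jobmanager.sh" start-foreground "$@"
rc=$?
# surface any JVM crash report on stdout before exiting
for f in /flink/bin/hs_err_pid*.log; do
    [ -f "$f" ] && cat "$f"
done
exit $rc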

Support Hadoop-free 1.4

The scripts currently don't support the absence of a Hadoop version, though Flink 1.4 now includes a Hadoop-free release.

Update the scripts to build a Hadoop-free image, and ideally, change the Hadoop-ful images to use it as an upstream, only adding the additional jar to /opt/flink/lib.
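
A sketch of that layering (the base tag and jar name are hypothetical):

FROM flink:1.4.0-scala_2.11
# layer the shaded Hadoop jar on top of the Hadoop-free base image
COPY flink-shaded-hadoop2-uber.jar /opt/flink/lib/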

logs question

Hello,
I am running Docker versions of Flink using a docker-compose.yaml file as shown on Docker Hub. When I request to view logs from the TaskManager (Stdout tab), I get the following error:
java.io.IOException: TaskManager log files are unavailable. Log file location not found in environment variable log.file or configuration key taskmanager.log.path.
No environment variables are set. I thought the default log4j config and /opt/flink/log would be used.
I am running Docker 17.12.0-ce on Windows.
JP

Docker's Official Images build of Flink is broken

See the Full Build Log for details.

Snippet:

gpg: key 1F302569A96CFFD5: public key "Till Rohrmann (stsffap) <[email protected]>" imported
gpg: Total number processed: 15
gpg:               imported: 15
gpg: no ultimately trusted keys found
+ gpg --batch --verify flink.tgz.asc flink.tgz
gpg: Signature made Wed Oct 17 20:23:15 2018 UTC
gpg:                using RSA key C2EED7B111D464BA
gpg: Good signature from "Chesnay Schepler (CODE SIGNING KEY) <[email protected]>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 19F2 195E 1B48 16D7 65A2  C324 C2EE D7B1 11D4 64BA
+ rm -rf /tmp/tmp.BgHhiG flink.tgz.asc
rm: can't remove '/tmp/tmp.BgHhiG/S.gpg-agent.ssh': No such file or directory
Removing intermediate container dab02cab8ec8

Essentially I think we're hitting this bug.

@tianon @yosifkit any ideas from your side?

Upgrade to gosu 1.11

Currently we're using gosu 1.7 which is pretty ancient. Let's upgrade to 1.11.
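
Presumably this is a one-line version bump in the Dockerfiles, along the lines of:

ENV GOSU_VERSION 1.11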

RocksDB state backend causes segfault on Alpine images

Alpine Linux uses MUSL instead of glibc, but the version of RocksDB embedded in Flink apparently requires some absent glibc functionality.

java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-33e471f8d228c175dfc6148869213083/librocksdbjni-linux64.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/rocksdb-lib-33e471f8d228c175dfc6148869213083/librocksdbjni-linux64.so)

By installing the Alpine Linux package libc6-compat, this error was resolved but was replaced by this one:

java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-fea5dea17151b5dfdc87d323576a04c3/librocksdbjni-linux64.so: Error relocating /tmp/rocksdb-lib-fea5dea17151b5dfdc87d323576a04c3/librocksdbjni-linux64.so: __strtod_internal: symbol not found

I didn't find an obvious solution to that one, but it's possible installing an additional package for compatibility would work.

We might want to consider dropping the official -alpine images until this is resolved.

cc @iemejia @StephanEwen

exec: local: not found

When I run
docker run --name flink_local -p 8081:8081 -t flink local
It returns
/docker-entrypoint.sh: 58: exec: local: not found

When I run
docker ps -a
It returns
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4134e327eb89 flink "/docker-entrypoin..." 2 minutes ago Exited (127) 2 minutes ago flink_local

How can I start it?

docker-flink standalone mode

How do I use the Docker image to deploy in standalone mode? Does it need a Hadoop environment?

Dockerfile

FROM flink:1.6.2-hadoop27-scala_2.11
ENV TZ=Asia/Shanghai
ENV LANG zh_CN.UTF-8
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && \
    echo $TZ > /etc/timezone

COPY docker-entrypoint.sh /
COPY flink-conf.yaml /opt/flink/conf/flink-conf.yaml

ENTRYPOINT ["/docker-entrypoint.sh"]

flink-conf.yaml

jobmanager.rpc.port: 6123
jobmanager.heap.size: 512
taskmanager.heap.size: 512
taskmanager.numberOfTaskSlots: 1
parallelism.default: 1
rest.port: 8081
blob.server.port: 6124
query.server.port: 6125
web.tmpdir: /opt/flink/webTmp
web.log.path: /opt/flink/log
taskmanager.tmp.dirs: /opt/flink/taskManagerTmp
high-availability.storageDir: hdfs://10.1.2.109:8020/wulin/
high-availability.zookeeper.quorum: 10.1.2.11:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /flink

docker-entrypoint.sh

#!/bin/sh
JOB_MANAGER_RPC_ADDRESS=${JOB_MANAGER_RPC_ADDRESS}

drop_privs_cmd() {
    if [ -x /sbin/su-exec ]; then
        # Alpine
        echo su-exec
    else
        # Others
        echo gosu
    fi
}

if [ "$1" = "help" ]; then
    echo "Usage: $(basename "$0") (jobmanager|taskmanager|help)"
    exit 0
elif [ "$1" = "jobmanager" ]; then
    shift 1
    echo "Starting Job Manager"
    echo "$(sed -e "s#jobmanager\.rpc\.address: localhost#jobmanager\.rpc\.address: $JOB_MANAGER_RPC_ADDRESS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
  
    echo "config file: " && grep '^[^\n#]' $FLINK_HOME/conf/flink-conf.yaml
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/jobmanager.sh" start-foreground "$@"
elif [ "$1" = "cluster" ]; then
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/start-cluster.sh" start-foreground "$@"
elif [ "$1" = "taskmanager" ]; then
    TASK_MANAGER_NUMBER_OF_TASK_SLOTS=${TASK_MANAGER_NUMBER_OF_TASK_SLOTS:-$(grep -c ^processor /proc/cpuinfo)}
    echo "$(sed -e "s#jobmanager\.rpc\.address: localhost#jobmanager\.rpc\.address: $JOB_MANAGER_RPC_ADDRESS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
    echo "$(sed -e "s#taskmanager\.numberOfTaskSlots: 1#taskmanager\.numberOfTaskSlots: $TASK_MANAGER_NUMBER_OF_TASK_SLOTS#g" $FLINK_HOME/conf/flink-conf.yaml)" > $FLINK_HOME/conf/flink-conf.yaml
   
    echo "Starting Task Manager"
    echo "config file: " && grep '^[^\n#]' "$FLINK_HOME/conf/flink-conf.yaml"
    exec $(drop_privs_cmd) flink "$FLINK_HOME/bin/taskmanager.sh" start-foreground
fi

exec "$@"

masters

10.1.2.11:8081
10.1.2.10:8081

docker build -t flink:1.0 .

docker run --name jobmanager \
    --restart always \
    --net host \
    -v $PWD/flink-conf.yaml:/opt/flink/conf/flink-conf.yaml \
    -v $PWD/masters:/opt/flink/conf/masters \
    -e JOB_MANAGER_RPC_ADDRESS=10.1.2.11 \
    -d flink:1.0 cluster

RocksDB causes error when mounting a tmpfs to /tmp

Adding a tmpfs: /tmp entry to a docker-compose file will make the RocksDB backend fail to load.

Example compose file:

version: '2.0'

services:
  flank:
    image: flink:1.5.1-hadoop28-scala_2.11
    entrypoint: |
      bash -c "
        echo 'state.backend: rocksdb' >> conf/flink-conf.yaml;
        echo 'state.backend.fs.checkpointdir: file:///tmp/checkpoints' >> conf/flink-conf.yaml;
        taskmanager.sh start-foreground local &
        jobmanager.sh start-foreground local &
        sleep 10;
        flink run examples/streaming/WindowJoin.jar
      "
    ports:
     - "8081:8081"
    tmpfs: /tmp

The output will contain an error like java.lang.UnsatisfiedLinkError: /tmp/rocksdb-lib-[...]/librocksdbjni-linux64.so: /tmp/rocksdb-lib-[...]/librocksdbjni-linux64.so: failed to map segment from shared object

I'm guessing that this is unrelated to #14 / #37.

I'm not all that familiar with JNI and shared object loading under linux, so I'm not sure this is intended to be a possible configuration.
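
A plausible explanation (an assumption, not confirmed in this thread): Docker mounts tmpfs with noexec by default, so the JNI library extracted to /tmp cannot be mapped as executable. With plain docker run the mount options can be overridden, e.g.:

docker run --tmpfs /tmp:rw,exec,size=64m flink:1.5.1-hadoop28-scala_2.11 taskmanager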

Add tests to validate a new image release

We lack tests ensuring that the modifications we make to the Docker files don't break the images; we need to automate, or at least document, a simple validation test for this. A simple and valuable case is a Flink job with checkpointing.

Request: Allow user to edit numberOfTaskSlots, parallelism

Requesting to allow users to edit some important configurations from flink-conf.yaml, such as:

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.

taskmanager.numberOfTaskSlots: 1

# The parallelism used for programs that did not specify and other parallelism.

parallelism.default: 1
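
For the task-slot case specifically, the docker-entrypoint.sh quoted in the standalone-mode issue above already rewrites taskmanager.numberOfTaskSlots from an environment variable, suggesting usage along these lines (a sketch):

docker run -e TASK_MANAGER_NUMBER_OF_TASK_SLOTS=4 flink taskmanager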
