Hey. I'm trying to run the Hadoop jobtracker in a containerised bridged network mode (i.e. the default network mode for Docker). My goal is to launch the jobtracker with Marathon and map the ports randomly to my host system, finding the web-ui and jobtracker IPC port through service discovery.
The hostname of the host (not the Docker container hostname, which is a random hash string) is available through environment variable $HOSTNAME, and using this at runtime when launching the jobtracker in host-mode works fine. That is, with --net=host when issuing docker run. The script I have for doing this is very similar to this one.
When running in bridged mode, I first tried setting the mapred.job.tracker property to localhost:9001, and likewise for the web-ui property. However, this disabled external exposure of the port: I could not reach the container at all when mapping the ports with docker run -p 8080:50030 -p 9001:9001 ...
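For reference, the two launch variants described so far can be put side by side in a small dry-run sketch. It only prints the commands and does not execute them. The image name and the docker_cmd helper are hypothetical placeholders; the flags are the ones mentioned above.

```shell
# Dry-run sketch: print (do not execute) the docker run command per network mode.
# IMAGE is a hypothetical image name used only for illustration.
IMAGE="example/hadoop-jobtracker"

docker_cmd() {
    case "$1" in
        # Host mode: the container shares the host's network stack.
        host)   echo "docker run --net=host $IMAGE" ;;
        # Bridged mode (Docker default): ports must be mapped explicitly.
        bridge) echo "docker run -p 8080:50030 -p 9001:9001 $IMAGE" ;;
        *)      echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

docker_cmd host
docker_cmd bridge
```

With Marathon assigning random host ports, the host side of each -p mapping would differ from run to run; the fixed values here are just the ones I tested by hand.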
Changing the mapred.job.tracker property to $HOST:9001, where $HOST equals the hostname of the Docker container, enabled me to contact the Docker container, and it seems to work all right. The only bummer is that the hostname in the web-ui is a stupid Docker string, which would be nice to override, but never mind. Everything seems to be working, until I look in Mesos.
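To make the runtime substitution concrete, here is a minimal sketch of how the property could be rendered when the container starts. The render_jobtracker_property helper is hypothetical, and writing the fragment into mapred-site.xml is left out; my actual script differs in the details.

```shell
# Sketch: render the mapred.job.tracker property for a given host and port.
# render_jobtracker_property is a hypothetical helper, not part of Hadoop.
render_jobtracker_property() {
    host="$1"
    port="${2:-9001}"   # 9001 is the IPC port used in this post
    cat <<EOF
<property>
  <name>mapred.job.tracker</name>
  <value>${host}:${port}</value>
</property>
EOF
}

# Inside a bridged container, hostname returns the container's hostname.
HOST="${HOST:-$(hostname)}"
render_jobtracker_property "$HOST"
```

Passing localhost instead of "$HOST" reproduces the first attempt, where the port was not reachable from outside the container.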
In Mesos I see a new Hadoop: (RPC port: 9001, WebUI port: 50030) framework trying to register itself every 3 seconds or so, without success. Enabling more debugging on the client side (using export GLOG_v=2) I see the following output when starting up the tracker:
I0210 15:39:21.288871 881 process.cpp:2692] Resuming [email protected]:58589 at 2015-02-10 15:39:21.288862976+00:00
I0210 15:39:21.288918 881 pid.cpp:87] Attempting to parse '[email protected]:5050' into a PID
I0210 15:39:21.288990 881 sched.cpp:234] New master detected at [email protected]:5050
I0210 15:39:21.289201 881 sched.cpp:242] No credentials provided. Attempting to register without authentication
I0210 15:39:21.289227 881 sched.cpp:481] Sending registration request to [email protected]:5050
I0210 15:39:21.289579 891 process.cpp:2692] Resuming zookeeper-master-detector(1)@198.41.200.200:58589 at 2015-02-10 15:39:21.289572096+00:00
15/02/10 15:39:21 INFO util.HostsFileReader: Setting the includes file to
15/02/10 15:39:21 INFO util.HostsFileReader: Setting the excludes file to
15/02/10 15:39:21 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
15/02/10 15:39:21 INFO mapred.JobTracker: Decommissioning 0 nodes
15/02/10 15:39:21 INFO ipc.Server: IPC Server Responder: starting
15/02/10 15:39:21 INFO ipc.Server: IPC Server listener on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 0 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 1 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 2 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 3 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 5 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 6 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 7 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 4 on 9001: starting
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 8 on 9001: starting
15/02/10 15:39:21 INFO mapred.JobTracker: Starting RUNNING
15/02/10 15:39:21 DEBUG ipc.Server: IPC Server handler 9 on 9001: starting
I0210 15:39:22.289831 882 process.cpp:2692] Resuming [email protected]:58589 at 2015-02-10 15:39:22.289812992+00:00
I0210 15:39:22.289913 882 sched.cpp:481] Sending registration request to [email protected]:5050
I0210 15:39:23.290331 892 process.cpp:2692] Resuming [email protected]:58589 at 2015-02-10 15:39:23.290322176+00:00
I0210 15:39:23.290382 892 sched.cpp:481] Sending registration request to [email protected]:5050
I0210 15:39:24.290717 881 process.cpp:2692] Resuming [email protected]:58589 at 2015-02-10 15:39:24.290708992+00:00
I0210 15:39:24.290767 881 sched.cpp:481] Sending registration request to [email protected]:5050
I0210 15:39:25.291088 893 process.cpp:2692] Resuming [email protected]:58589 at 2015-02-10 15:39:25.291079168+00:00
I0210 15:39:25.291139 893 sched.cpp:481] Sending registration request to [email protected]:5050
And the "Resuming scheduler", "Sending registration request"... output continues forever.
On the Mesos master side, the log shows the following:
I0210 10:28:00.393909 31003 master.cpp:1383] Received registration request for framework 'Hadoop: (RPC port: 9001, WebUI port: 50030)' at [email protected]:42973
I0210 10:28:00.394260 31003 master.cpp:1447] Registering framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:00.394639 30999 hierarchical_allocator_process.hpp:329] Added framework 20150204-151306-1176176138-5050-30988-1936
I0210 10:28:00.395987 31005 master.cpp:3843] Sending 1 offers to framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:00.732815 31008 master.cpp:3843] Sending 1 offers to framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:01.071971 31002 hierarchical_allocator_process.hpp:405] Deactivated framework 20140618-174325-1209730570-5050-4637-0002
I0210 10:28:01.394320 30997 master.cpp:1383] Received registration request for framework 'Hadoop: (RPC port: 9001, WebUI port: 50030)' at [email protected]:42973
I0210 10:28:01.394753 30997 master.cpp:1434] Framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973 already registered, resending acknowledgement
I0210 10:28:02.394582 31011 master.cpp:1383] Received registration request for framework 'Hadoop: (RPC port: 9001, WebUI port: 50030)' at [email protected]:42973
I0210 10:28:02.395097 31011 master.cpp:1434] Framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973 already registered, resending acknowledgement
I0210 10:28:03.363574 31000 master.cpp:789] Framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973 disconnected
I0210 10:28:03.363788 31000 master.cpp:1752] Disconnecting framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.363852 31000 master.cpp:1768] Deactivating framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.363956 31002 hierarchical_allocator_process.hpp:405] Deactivated framework 20150204-151306-1176176138-5050-30988-1936
I0210 10:28:03.364524 31000 master.cpp:811] Giving framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973 0ns to failover
I0210 10:28:03.364547 31008 hierarchical_allocator_process.hpp:563] Recovered cpus(*):15.8; mem(*):192135; ports(*):[31000-32000, 8001-9000]; disk(*):1.51388e+06 (total allocatable: cpus(*):15.8; mem(*):192135; ports(*):[31000-32000, 8001-9000]; disk(*):1.51388e+06) on slave 20150204-135039-1176176138-5050-11013-S0 from framework 20150204-151306-1176176138-5050-30988-1936
I0210 10:28:03.364784 31007 master.cpp:3713] Framework failover timeout, removing framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.364966 31007 master.cpp:4271] Removing framework 20150204-151306-1176176138-5050-30988-1936 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.365542 31007 hierarchical_allocator_process.hpp:360] Removed framework 20150204-151306-1176176138-5050-30988-1936
I0210 10:28:03.394796 31011 master.cpp:1383] Received registration request for framework 'Hadoop: (RPC port: 9001, WebUI port: 50030)' at [email protected]:42973
I0210 10:28:03.395130 31011 master.cpp:1447] Registering framework 20150204-151306-1176176138-5050-30988-1937 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.395417 31005 hierarchical_allocator_process.hpp:329] Added framework 20150204-151306-1176176138-5050-30988-1937
I0210 10:28:03.396935 30996 master.cpp:3843] Sending 1 offers to framework 20150204-151306-1176176138-5050-30988-1937 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
I0210 10:28:03.583683 31005 http.cpp:478] HTTP request for '/master/state.json'
I0210 10:28:03.737133 30998 master.cpp:3843] Sending 1 offers to framework 20150204-151306-1176176138-5050-30988-1937 (Hadoop: (RPC port: 9001, WebUI port: 50030)) at [email protected]:42973
This, too, continues in a loop like this forever.
My questions:
Sorry for the long wall of text here, but I didn't want to exclude any (perhaps) important details. I appreciate any feedback, including feedback that doesn't necessarily give away "the solution". :-)