GithubHelp home page GithubHelp logo

Comments (11)

steveniemitz avatar steveniemitz commented on June 3, 2024

Do you see anything interesting in the mesos agent logs? (slave-6 in this case). It looks like its exactly 10 seconds after the ACCEPT was sent, which seems suspiciously like the offer is timing out. Do you have an offer timeout set to 10 seconds on mesos by any chance?

There should be a log entry directly after that accept saying something like Starting broker 0: launching task ... by offer ...

Can you also check the master logs and see if you see the Launch Tasks message getting there from the framework?

Finally, what version of mesos are you running?

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

In fact, only entry in the logs apart from ACCEPTs and immediate DECLINEs is happening when I re-start the broker. Neither clients nor master sow anything related to Kafka in the logs. I'm currently debugging some problems due to overlay network configuration on the cluster, do you think this might be related?

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

I have switched to Marathon scheduler. Logs are a bit more verbose now.
Broker start fails with

2017-03-14 15:04:09,148 INFO   BrokerLifecycleManager] Broker 0 failed 2, waiting 2m, next start ~ 2017-03-14 15:06:09Z
2017-03-14 15:04:09,713 INFO      KafkaMesosScheduler] [executorLost] executor:#aad1c slave:#22-S3 status:0
2017-03-14 15:05:27,117 INFO                BrokerApi] Handling start for broker 0
2017-03-14 15:05:27,118 INFO   BrokerLifecycleManager] Activating broker 0
2017-03-14 15:05:27,118 INFO             OfferManager] Re-requesting previously suppressed offers.
2017-03-14 15:05:27,124 INFO        BrokerTaskManager] Starting broker 0: launching task kafka-0-8af35576-7318-4cae-902f-49efba5224be by offer some-slave-3#36557
 kafka-0-8af35576-7318-4cae-902f-49efba5224be slave:#22-S0 cpus:1.00; mem:2048.00; ports:[2380..2380] data:{"id":0,"options":{},"syslog":false,"log4jOptions":{},"defaults":{"broker.id":"0","log.dirs":"kafka-logs","port":"2380","zookeeper.connect":"master.mesos:2181","host.name":"some-slave-3","log.retention.bytes":"10737418240"}}
2017-03-14 15:05:27,316 INFO      KafkaMesosScheduler] [executorLost] executor:#90a7d slave:#22-S0 status:256
2017-03-14 15:05:27,317 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-8af35576-7318-4cae-902f-49efba5224be TASK_FAILED slave:#22-S0 reason:REASON_CONTAINER_LAUNCH_FAILED message:Failed to launch container: Failed to fetch all URIs for container 'a57754aa-62f8-4dd6-9cf1-7a0f9efb5e8d' with exit status: 256
2017-03-14 15:05:27,317 INFO   BrokerLifecycleManager] Broker 0 failed 1, waiting 1m, next start ~ 2017-03-14 15:06:27Z
2017-03-14 15:06:26,688 INFO        BrokerTaskManager] Starting broker 0: launching task kafka-0-1505ef66-a712-404f-95ed-2f97ee60edd3 by offer some-slave-6#36870
 kafka-0-1505ef66-a712-404f-95ed-2f97ee60edd3 slave:#22-S3 cpus:1.00; mem:2048.00; ports:[2384..2384] data:{"id":0,"options":{},"syslog":false,"log4jOptions":{},"defaults":{"broker.id":"0","log.dirs":"kafka-logs","port":"2384","zookeeper.connect":"master.mesos:2181","host.name":"some-slave-6","log.retention.bytes":"10737418240"}}
2017-03-14 15:06:27,672 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-1505ef66-a712-404f-95ed-2f97ee60edd3 TASK_STARTING slave:#22-S3 reason:REASON_COMMAND_EXECUTOR_FAILED
2017-03-14 15:06:28,299 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-1505ef66-a712-404f-95ed-2f97ee60edd3 TASK_FAILED slave:#22-S3 reason:REASON_COMMAND_EXECUTOR_FAILED message:java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.ly$stealth$mesos$kafka$executor$MetricCollectorProxy$$collect(MetricCollectorProxy.scala:49)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.start(MetricCollectorProxy.scala:45)
	at ly.stealth.mesos.kafka.executor.KafkaServer$Distro$.startCollector(BrokerServer.scala:201)
	at ly.stealth.mesos.kafka.executor.KafkaServer.start(BrokerServer.scala:90)
	at ly.stealth.mesos.kafka.executor.Executor$$anon$1.run(Executor.scala:94)

from kafka.

steveniemitz avatar steveniemitz commented on June 3, 2024

That's odd that you weren't getting those verbose logs before.

Anyways I suspect the issue is the executor crashing on launch

2017-03-14 15:06:28,299 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-1505ef66-a712-404f-95ed-2f97ee60edd3 TASK_FAILED slave:#22-S3 reason:REASON_COMMAND_EXECUTOR_FAILED message:java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.ly$stealth$mesos$kafka$executor$MetricCollectorProxy$$collect(MetricCollectorProxy.scala:49)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.start(MetricCollectorProxy.scala:45)
	at ly.stealth.mesos.kafka.executor.KafkaServer$Distro$.startCollector(BrokerServer.scala:201)
	at ly.stealth.mesos.kafka.executor.KafkaServer.start(BrokerServer.scala:90)
	at ly.stealth.mesos.kafka.executor.Executor$$anon$1.run(Executor.scala:94)

I think I remember someone reporting seeing this when the version of kafka the framework was built against didn't match the version they were running. Do yours match? The default build is 0.10.2.0 now.

Also, it seems like some of your mesos agents might not be able to hit the scheduler:

2017-03-14 15:05:27,317 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-8af35576-7318-4cae-902f-49efba5224be TASK_FAILED slave:#22-S0 reason:REASON_CONTAINER_LAUNCH_FAILED message:Failed to launch container: Failed to fetch all URIs for container 'a57754aa-62f8-4dd6-9cf1-7a0f9efb5e8d' with exit status: 256

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

I've substituded --api to Mesos-DNS provided name, got one step closer:

2017-03-14 15:14:45,877 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-a7a43b22-9088-4b97-9672-8f8f82286b70 TASK_STARTING slave:#22-S3 reason:REASON_COMMAND_EXECUTOR_FAILED
2017-03-14 15:14:46,484 INFO      KafkaMesosScheduler] [statusUpdate] kafka-0-a7a43b22-9088-4b97-9672-8f8f82286b70 TASK_FAILED slave:#22-S3 reason:REASON_COMMAND_EXECUTOR_FAILED message:java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.ly$stealth$mesos$kafka$executor$MetricCollectorProxy$$collect(MetricCollectorProxy.scala:49)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy.start(MetricCollectorProxy.scala:45)
	at ly.stealth.mesos.kafka.executor.KafkaServer$Distro$.startCollector(BrokerServer.scala:201)
	at ly.stealth.mesos.kafka.executor.KafkaServer.start(BrokerServer.scala:90)
	at ly.stealth.mesos.kafka.executor.Executor$$anon$1.run(Executor.scala:94)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy$MetricsCollector.collectJvmMetrics(MetricCollectorProxy.scala:111)
	at ly.stealth.mesos.kafka.executor.MetricCollectorProxy$MetricsCollector.run(MetricCollectorProxy.scala:194)

Ypu mean if version in docker matches the one i do ./kafka-mesos.sh broker start 0 with?

Both built from same repo based on https://github.com/mesos/kafka 20 minutes apart.
I only modified the dockerfile to use Java 8 since my build machine uses it and there were "major.minor 52.0" build error problems.

from kafka.

steveniemitz avatar steveniemitz commented on June 3, 2024

Yep, pretty sure this is due to a kafka mismatch. Make sure you're running the scala 2.11 build of kafka and are the version matches what kafka-mesos was built against.

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

gradle.build says scalaVersion = "2.11.8".

Now, I'm not sure how do I verify I run it using same Scala.

 Add mesos repo
RUN \
  apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E56151BF && \
  echo deb http://repos.mesosphere.io/ubuntu trusty main > /etc/apt/sources.list.d/mesosphere.list

# Install deps
RUN apt-get update
RUN apt-get install -qy software-properties-common python-software-properties
RUN apt-get install -qy debconf-utils
RUN \
  add-apt-repository ppa:webupd8team/java -y && \
  apt-get update && \
  apt-get install -qy \
    git vim zip mc curl
RUN echo "oracle-java8-installer shared/accepted-oracle-license-v1-1 select true" | sudo debconf-set-selections
RUN apt-get install -y oracle-java8-installer
RUN apt-get install -qy mesos libc6

# Add kafka-mesos & kafka
COPY .docker/kafka* /opt/kafka-mesos/
COPY ./docker-entrypoint.sh /docker-entrypoint.sh

It only says Java 8 is installed, how do I make sure what I launch it with?

from kafka.

steveniemitz avatar steveniemitz commented on June 3, 2024

I mean the version of kafka, that needs to be the 2.11 version. What is the file name of the kafka .tgz you're using?

If you're using the docker build script I'd be weary, I don't know if it still works (or ever worked...)

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

It turns out my gradle said Mesos version 0.28.0 while I'm running 1.1.0, so I'm testing that.

kafka-mesos-0.10.1.0-SNAPSHOT-kafka_2.11-0.10.1.1.jar
kafka_2.10-0.8.2.2.tgz

Ahaaa. "build-image.sh" has hardcoded values! I missed that during the build. Let me rebuild and see if it helps.

from kafka.

ror6ax avatar ror6ax commented on June 3, 2024

it works. would you be interested in PR?

from kafka.

steveniemitz avatar steveniemitz commented on June 3, 2024

sorry, didn't see this! Yes please! :D

from kafka.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.