Comments (12)

yangsishu avatar yangsishu commented on July 19, 2024

Do you have a more detailed error log? It is hard to troubleshoot with only these errors.

from chunjun.

yangsishu avatar yangsishu commented on July 19, 2024

Please also post the command you used to submit the job.

yelijun avatar yelijun commented on July 19, 2024

Submit command: bin/flinkx -mode yarn -job ./config-json/yarn-sale_rate_sale_rate.json -plugin /data/software/flinkx/plugins -flinkconf /data/software/flink-1.5.6/conf -yarnconf /etc/hadoop/conf
nohup.out.log

I have attached the nohup log.

yelijun avatar yelijun commented on July 19, 2024

15:33:53.346 [flink-akka.actor.default-dispatcher-2] DEBUG akka.remote.transport.netty.NettyTransport - Remote connection to [test-ai-etl-c1-1/10.3.8.49:30736] was disconnected because of [id: 0x1edde4f2, /10.3.8.49:6385 :> test-ai-etl-c1-1/10.3.8.49:30736] DISCONNECTED
15:33:53.349 [flink-akka.actor.default-dispatcher-2] DEBUG akka.remote.transport.ProtocolStateActor - Association between local [tcp://flink@test-ai-etl-c1-1:6385] and remote [tcp://flink@test-ai-etl-c1-1:30736] was disassociated because the ProtocolStateActor failed: Unknown
15:33:53.353 [flink-akka.actor.default-dispatcher-2] WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@test-ai-etl-c1-1:30736] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@test-ai-etl-c1-1:30736]] Caused by: [The remote system explicitly disassociated (reason unknown).]
I suspect the problem is related to these lines, but I don't know how to fix it.
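
When the nohup log is long, it can help to count how often each remote address shows up in association failures like the WARN line above. The following is only a rough sketch: the function name `failed_associations` and the regular expression are assumptions written against the log format quoted in this comment, not part of flinkx or Flink.

```python
import re
from collections import Counter

# Assumed pattern, derived from the WARN line quoted above:
# "Association with remote system [akka.tcp://...] has failed"
ASSOC_FAILED = re.compile(
    r"Association with remote system \[(?P<addr>akka\.tcp://[^\]]+)\] has failed"
)


def failed_associations(lines):
    """Count Akka association failures per remote address (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ASSOC_FAILED.search(line)
        if match:
            counts[match.group("addr")] += 1
    return counts


# One line copied from the log excerpt above, shortened after the address.
sample = [
    "15:33:53.353 [flink-akka.actor.default-dispatcher-2] WARN "
    "akka.remote.ReliableDeliverySupervisor - Association with remote system "
    "[akka.tcp://flink@test-ai-etl-c1-1:30736] has failed, address is now "
    "gated for [50] ms.",
]

for addr, count in failed_associations(sample).items():
    print(addr, count)
```

To run it against the real file, pass an open file handle instead of `sample`, for example `failed_associations(open("nohup.out.log", encoding="utf-8", errors="replace"))`.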

yangsishu avatar yangsishu commented on July 19, 2024

Did your Flink YARN session start normally?

yelijun avatar yelijun commented on July 19, 2024

It started normally with yarn-session.sh -d, and I can see it in the YARN resource manager.

lijiangbo avatar lijiangbo commented on July 19, 2024

Could you post a screenshot of the YARN session as it appears in the YARN UI?

yelijun avatar yelijun commented on July 19, 2024

image
image
Is this what you need?

lijiangbo avatar lijiangbo commented on July 19, 2024

Did the TaskManager not start? Please check how the JobManager started up.

yelijun avatar yelijun commented on July 19, 2024

2019-05-09 16:11:45,653 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2019-05-09 16:11:45,654 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-05-09 16:11:45,657 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - YARN daemon is running as: root Yarn client user obtainer: root
2019-05-09 16:11:45,659 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: rest.port, 8081
2019-05-09 16:11:45,659 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: internal.cluster.execution-mode, NORMAL
2019-05-09 16:11:45,659 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.cluster-id, application_1544768545362_1516
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, 10.3.8.49
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 1024
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2019-05-09 16:11:45,660 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2019-05-09 16:11:45,674 INFO org.apache.flink.runtime.clusterframework.BootstrapTools - Setting directories for temporary files to: /data/yarn/nm/usercache/root/appcache/application_1544768545362_1516,/data1/yarn/nm/usercache/root/appcache/application_1544768545362_1516,/data2/yarn/nm/usercache/root/appcache/application_1544768545362_1516,/data3/yarn/nm/usercache/root/appcache/application_1544768545362_1516
2019-05-09 16:11:45,686 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting YarnSessionClusterEntrypoint.
2019-05-09 16:11:45,686 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install default filesystem.
2019-05-09 16:11:45,736 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to root (auth:SIMPLE)
2019-05-09 16:11:45,750 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing cluster services.
2019-05-09 16:11:45,755 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Trying to start actor system at test-ai-etl-c1-2:20698
2019-05-09 16:11:46,218 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-05-09 16:11:46,269 INFO akka.remote.Remoting - Starting remoting
2019-05-09 16:11:46,422 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@test-ai-etl-c1-2:20698]
2019-05-09 16:11:46,431 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Actor system started at akka.tcp://flink@test-ai-etl-c1-2:20698
2019-05-09 16:11:46,452 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /data3/yarn/nm/usercache/root/appcache/application_1544768545362_1516/blobStore-3e28573d-d50a-4740-9d62-e321a6053cdf
2019-05-09 16:11:46,453 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:1599 - max concurrent requests: 50 - max backlog: 1000
2019-05-09 16:11:46,468 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported.
2019-05-09 16:11:46,472 INFO org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore - Initializing FileArchivedExecutionGraphStore: Storage directory /data/yarn/nm/usercache/root/appcache/application_1544768545362_1516/executionGraphStore-90531c01-f565-4b1e-8249-5d9004ee0d1c, expiration time 3600000, maximum cache size 52428800 bytes.
2019-05-09 16:11:46,495 INFO org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB cache storage directory /data3/yarn/nm/usercache/root/appcache/application_1544768545362_1516/blobStore-fdba6d30-bd71-443a-8822-d7f4621efb7e
2019-05-09 16:11:46,504 WARN org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload directory /tmp/flink-web-e2e9a8e7-e5e7-444d-b571-683487e9fd1f/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.
2019-05-09 16:11:46,504 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created directory /tmp/flink-web-e2e9a8e7-e5e7-444d-b571-683487e9fd1f/flink-web-upload for file uploads.
2019-05-09 16:11:46,507 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting rest endpoint.
2019-05-09 16:11:46,758 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component log file: /data2/yarn/container-logs/application_1544768545362_1516/container_1544768545362_1516_01_000001/jobmanager.log
2019-05-09 16:11:46,758 INFO org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined location of main cluster component stdout file: /data2/yarn/container-logs/application_1544768545362_1516/container_1544768545362_1516_01_000001/jobmanager.out
2019-05-09 16:11:46,837 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest endpoint listening at test-ai-etl-c1-2:15911
2019-05-09 16:11:46,837 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - http://test-ai-etl-c1-2:15911 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2019-05-09 16:11:46,837 INFO org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend listening at http://test-ai-etl-c1-2:15911.
2019-05-09 16:11:46,892 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.yarn.YarnResourceManager at akka://flink/user/resourcemanager .
2019-05-09 16:11:46,933 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
2019-05-09 16:11:46,967 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at test-ai-etl-c1-3/10.3.8.48:8030
2019-05-09 16:11:47,199 INFO org.apache.flink.yarn.YarnResourceManager - Recovered 0 containers from previous attempts ([]).
2019-05-09 16:11:47,201 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - yarn.client.max-cached-nodemanagers-proxies : 0
2019-05-09 16:11:47,203 INFO org.apache.flink.yarn.YarnResourceManager - ResourceManager akka.tcp://flink@test-ai-etl-c1-2:20698/user/resourcemanager was granted leadership with fencing token 00000000000000000000000000000000
2019-05-09 16:11:47,203 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Starting the SlotManager.
2019-05-09 16:11:47,216 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher akka.tcp://flink@test-ai-etl-c1-2:20698/user/dispatcher was granted leadership with fencing token 00000000-0000-0000-0000-000000000000
2019-05-09 16:11:47,218 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering all persisted jobs.
2019-05-10 14:22:29,706 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken
2019-05-10 14:22:34,709 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken
2019-05-10 14:22:39,710 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken
2019-05-10 14:22:44,712 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken
2019-05-10 14:22:49,713 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken
2019-05-10 14:22:54,715 INFO org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl - Updating with new AMRMToken

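It has always behaved this way for me: when no job has been submitted, the TaskManager count is 0, and once a job is submitted the corresponding number of TaskManagers appears.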

lijiangbo avatar lijiangbo commented on July 19, 2024

I see you are running Flink 1.5.6. The master branch is based on Flink 1.5.4, and we have not tested it against 1.5.6.

yelijun avatar yelijun commented on July 19, 2024

Yang Sishu said that any 1.5.x version should work.
In that case I will try again with Flink 1.5.4.
