Comments (3)
The bootstrap assumes you are using YARN and comes configured for it. You still need to submit the app with the master set to 'yarn' (https://spark.apache.org/docs/latest/submitting-applications.html).
What problem are you running into specifically?
from emr-bootstrap-actions.
I am trying to run one of the spark-examples (JavaKinesisWordCountASLYARN).
I created an EMR cluster and used the spark-install bootstrap action to install Spark 1.3.0c.
I also updated "EMR_DefaultRole" and attached the "AdministratorAccess" and "AmazonKinesisFullAccess" policies to it.
I read the instructions for running the example in the Java file:
/home/hadoop/spark/examples/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASLYARN.java
Example:
* $ export AWS_ACCESS_KEY_ID=<your-access-key>
* $ export AWS_SECRET_KEY=<your-secret-key>
* $ $SPARK_HOME/bin/run-example \
* org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN mySparkStream \
* https://kinesis.us-east-1.amazonaws.com
*
* There is a companion helper class called KinesisWordCountProducerASL which puts dummy data
* onto the Kinesis stream.
* Usage instructions for KinesisWordCountProducerASL are provided in the class definition.
*/
I first created the Kinesis stream "mySparkStream" and, using PuTTY, worked on the EMR master node.
I used the companion helper class KinesisWordCountProducerASL to put dummy data onto the stream,
and verified that records were written to the Kinesis stream/shard.
After writing the data I ran the command below:
./run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN mySparkStream https://kinesis.us-east-1.amazonaws.com
and got the error below:
[hadoop@ip-10-76-215-114 bin]$ ./run-example org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN mySparkStream https://kinesis.us-east-1.amazonaws.com
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/03/27 09:46:09 INFO spark.SparkContext: Running Spark version 1.3.0
15/03/27 09:46:09 WARN spark.SparkConf:
SPARK_CLASSPATH was detected (set to '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar').
This is deprecated in Spark 1.0+.
Please instead use:
- ./spark-submit with --driver-class-path to augment the driver classpath
- spark.executor.extraClassPath to augment the executor classpath
15/03/27 09:46:09 WARN spark.SparkConf: Setting 'spark.executor.extraClassPath' to '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar' as a work-around.
15/03/27 09:46:09 WARN spark.SparkConf: Setting 'spark.driver.extraClassPath' to '/home/hadoop/spark/conf:/home/hadoop/conf:/home/hadoop/spark/classpath/emr/*:/home/hadoop/spark/classpath/emrfs/*:/home/hadoop/share/hadoop/common/lib/*:/home/hadoop/share/hadoop/common/lib/hadoop-lzo.jar' as a work-around.
15/03/27 09:46:10 INFO spark.SecurityManager: Changing view acls to: hadoop
15/03/27 09:46:10 INFO spark.SecurityManager: Changing modify acls to: hadoop
15/03/27 09:46:10 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/03/27 09:46:11 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/03/27 09:46:11 INFO Remoting: Starting remoting
15/03/27 09:46:11 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:53585]
15/03/27 09:46:11 INFO util.Utils: Successfully started service 'sparkDriver' on port 53585.
15/03/27 09:46:11 INFO spark.SparkEnv: Registering MapOutputTracker
15/03/27 09:46:11 INFO spark.SparkEnv: Registering BlockManagerMaster
15/03/27 09:46:11 INFO storage.DiskBlockManager: Created local directory at /mnt/spark/spark-deaa2da9-53bc-4ceb-9659-25ce2ac1904e/blockmgr-1ce48168-1548-4dc3-86e8-ff2ff4f3bad5
15/03/27 09:46:12 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
15/03/27 09:46:12 INFO spark.HttpFileServer: HTTP File server directory is /mnt/spark/spark-0a9ca43d-0d92-4959-9d0a-1bf34537c6fc/httpd-89729846-8a4f-4b20-b7bb-086590343ee4
15/03/27 09:46:12 INFO spark.HttpServer: Starting HTTP Server
15/03/27 09:46:12 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/03/27 09:46:12 INFO server.AbstractConnector: Started [email protected]:52041
15/03/27 09:46:12 INFO util.Utils: Successfully started service 'HTTP file server' on port 52041.
15/03/27 09:46:12 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/03/27 09:46:12 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/03/27 09:46:12 INFO server.AbstractConnector: Started [email protected]:4040
15/03/27 09:46:12 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/03/27 09:46:12 INFO ui.SparkUI: Started SparkUI at http://ip-10-76-215-114.ec2.internal:4040
15/03/27 09:46:13 INFO spark.SparkContext: Added JAR file:/home/hadoop/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar at http://10.76.215.114:52041/jars/spark-examples-1.3.0-hadoop2.4.0.jar with timestamp 1427449573470
15/03/27 09:46:13 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
15/03/27 09:46:13 ERROR cluster.YarnClusterSchedulerBackend: Application ID is not set.
15/03/27 09:46:14 INFO netty.NettyBlockTransferService: Server created on 57600
15/03/27 09:46:14 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/03/27 09:46:14 INFO storage.BlockManagerMasterActor: Registering block manager ip-10-76-215-114.ec2.internal:57600 with 265.4 MB RAM, BlockManagerId(<driver>, ip-10-76-215-114.ec2.internal, 57600)
15/03/27 09:46:14 INFO storage.BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.NullPointerException
at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:581)
at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
at org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:642)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:75)
at org.apache.spark.streaming.api.java.JavaStreamingContext.<init>(JavaStreamingContext.scala:132)
at org.apache.spark.examples.streaming.JavaKinesisWordCountASLYARN.main(JavaKinesisWordCountASLYARN.java:127)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[hadoop@ip-10-76-215-114 bin]$
This is a duplicate of issue 80. JavaKinesisWordCountASLYARN was added back when the examples hard-coded the master to local. Given apache/spark@d16e161, this example is no longer valid; the better approach is to follow the stock example, which does not set the master in code.
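As a rough sketch of that suggestion, the stock example could be submitted with the master given on the command line instead of in code. Several things here are assumptions rather than anything confirmed in this thread: that the stock class `org.apache.spark.examples.streaming.JavaKinesisWordCountASL` is present in the examples jar on the cluster, that `yarn-client` is the appropriate master value for this Spark 1.x install, and the `submit_cmd` helper itself, which is hypothetical and only prints the command so it can be checked before running it. The jar path and Spark home are taken from the log above.

```shell
# Sketch only. Paths come from the log above; override them if your layout differs.
SPARK_HOME="${SPARK_HOME:-/home/hadoop/spark}"
EXAMPLES_JAR="${EXAMPLES_JAR:-$SPARK_HOME/lib/spark-examples-1.3.0-hadoop2.4.0.jar}"

# Hypothetical helper: print a spark-submit command line that passes the
# master on the command line (assumed here to be yarn-client) rather than
# relying on a value hard-coded in the example's source.
submit_cmd() {
  class="$1"
  shift
  printf '%s ' "$SPARK_HOME/bin/spark-submit" \
    --master yarn-client \
    --class "$class" \
    "$EXAMPLES_JAR" "$@"
  printf '\n'
}

# Dry run: print the command for the stock (non-YARN-suffixed) example.
# The class name is an assumption; check that it exists in your jar.
submit_cmd org.apache.spark.examples.streaming.JavaKinesisWordCountASL \
  mySparkStream https://kinesis.us-east-1.amazonaws.com
```

If the printed command looks right, it can be copied and run on the master node after exporting the AWS credentials as described in the example's header comment.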