Comments (4)
Roles have been supported in EMR Hadoop for some time (even in Hadoop 1.x).
There is no need to provide an access key in the Spark app; the S3
implementation will pick up credentials automatically based on the instance
role assigned.
On Tuesday, June 9, 2015, Jim Kleckner wrote:
I think this line suggests that the current Spark version will work with
AWS: S3 will be accessible according to the policy of the associated
role
https://github.com/awslabs/emr-bootstrap-actions/blame/master/spark/examples/spark-submit-via-step.md#L18
Our current code works with s3 by providing the access key but before
monkeying around with converting to roles, I just wanted to confirm that it
should work.
Here is a snippet of code used to configure a test:
val sparkConf = new SparkConf().setAppName("appname")
val sc = new SparkContext(sparkConf)
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", config.getString("AWS_ACCESS_KEY"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", config.getString("AWS_SECRET_KEY"))
val hdfs = HDFS(sc.hadoopConfiguration)
Note that the underlying Hadoop implementation claims that this is
implemented in Hadoop 2.6
( https://issues.apache.org/jira/browse/HADOOP-10400 ), whereas EMR 3.7 is on
Hadoop 2.4.0.
Thanks.
from emr-bootstrap-actions.
Great, thanks. Roles rock.
@christopherbozeman would you mind including a pointer to example code? Does it also work with s3n:// or does it require s3:// for the URI scheme? What would be the simplest way to detect that this spark job is running in EMR?
The URI can be either. So if you want to build code that runs both inside and
outside of EMR, just use s3n://. There is no need to even adjust the Hadoop
configuration. Some code examples can be seen at
https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark/examples.
If you need something more specific let me know and I can provide it next
week.
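As a rough sketch of the point above (not code from the examples directory), one way to keep a single code path for both environments is to set the s3n key properties only when explicit keys are supplied, and otherwise leave the Hadoop configuration untouched so that the instance role is used on EMR. The object name `S3Credentials` and the `configure` helper are hypothetical; the property names are the ones from the snippet earlier in this thread. The setter is passed in as a plain function so the helper has no Spark or Hadoop dependency.

```scala
object S3Credentials {
  // Property names as used in the snippet earlier in this thread.
  val AccessKeyProp = "fs.s3n.awsAccessKeyId"
  val SecretKeyProp = "fs.s3n.awsSecretAccessKey"

  /** Applies explicit S3 keys through `set` only when both are present and non-empty.
    *
    * Returns true if keys were applied, false if the configuration was left
    * untouched (the case where an EMR instance role is expected to supply
    * credentials).
    */
  def configure(set: (String, String) => Unit,
                accessKey: Option[String],
                secretKey: Option[String]): Boolean =
    (accessKey, secretKey) match {
      case (Some(ak), Some(sk)) if ak.nonEmpty && sk.nonEmpty =>
        set(AccessKeyProp, ak)
        set(SecretKeyProp, sk)
        true
      case _ =>
        // No explicit keys: do not touch the configuration.
        false
    }
}
```

In a Spark app this might be called as `S3Credentials.configure((k, v) => sc.hadoopConfiguration.set(k, v), sys.env.get("AWS_ACCESS_KEY"), sys.env.get("AWS_SECRET_KEY"))`, where the environment variable names are only an assumption for illustration.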
To detect EMR you could look for the JSON files as described at
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/Config_JSON.html
if running on master or in cluster mode.
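A minimal sketch of that file check could look like the following. The directory `/mnt/var/lib/info` and the file names `instance.json` and `job-flow.json` are assumptions based on my reading of the linked page, so verify them against the documentation for your AMI or release. The object name `EmrDetect` is hypothetical. The check only makes sense when the code runs on a cluster node, for example the driver on the master or in cluster mode.

```scala
import java.nio.file.{Files, Path, Paths}

object EmrDetect {
  // Assumed location of the EMR cluster info files; check the linked docs.
  val DefaultInfoDir: Path = Paths.get("/mnt/var/lib/info")

  // Assumed marker file names; finding either one is treated as "on EMR".
  val MarkerFiles: Seq[String] = Seq("instance.json", "job-flow.json")

  /** Returns true if any marker file exists as a regular file under `infoDir`. */
  def runningOnEmr(infoDir: Path = DefaultInfoDir): Boolean =
    MarkerFiles.exists(name => Files.isRegularFile(infoDir.resolve(name)))
}
```

The directory is a parameter so the check can be exercised against a temporary directory off-cluster. A driver running in client mode on a machine outside the cluster would get `false` from this check even though the executors run on EMR.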
On Wednesday, June 10, 2015, Jim Kleckner wrote:
@christopherbozeman would you
mind including a pointer to example code? Does it also work with s3n:// or
does it require s3:// for the URI scheme? What would be the simplest way to
detect that this spark job is running in EMR?