
Comments (4)

christopherbozeman commented on May 26, 2024

Roles have been supported in EMR Hadoop for some time (even in Hadoop 1.x).
There is no need to provide an access key in the Spark app; the S3
implementation picks up credentials automatically from the instance role
assigned to the cluster.

On Tuesday, June 9, 2015, Jim Kleckner wrote:

I think this line suggests that the current Spark version will work with
AWS: "S3 will be accessible according to the policy of the associated
role."

https://github.com/awslabs/emr-bootstrap-actions/blame/master/spark/examples/spark-submit-via-step.md#L18

Our current code works with S3 by providing the access key, but before
monkeying around with converting to roles, I just wanted to confirm that it
should work.

Here is a snippet of code used to configure a test:

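import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().setAppName("appname")
val sc = new SparkContext(sparkConf)
// Static credentials for the s3n:// filesystem, read from the application's config object
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", config.getString("AWS_ACCESS_KEY"))
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", config.getString("AWS_SECRET_KEY"))
// HDFS is the application's own wrapper around the Hadoop configuration
val hdfs = HDFS(sc.hadoopConfiguration)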

Note that the underlying Hadoop implementation claims that this is
implemented in Hadoop 2.6 (
https://issues.apache.org/jira/browse/HADOOP-10400 ), whereas EMR 3.7 is on
Hadoop 2.4.0.

Thanks.



from emr-bootstrap-actions.

jkleckner commented on May 26, 2024

Great, thanks. Roles rock.


jkleckner commented on May 26, 2024

@christopherbozeman would you mind including a pointer to example code? Does it also work with s3n://, or does it require s3:// for the URI scheme? What would be the simplest way to detect that this Spark job is running in EMR?


christopherbozeman commented on May 26, 2024

The URI scheme can be either. If you want to build code that runs both
inside and outside of EMR, just use s3n://. There is no need to even adjust
the Hadoop configuration. Some code examples can be seen at
https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark/examples.
If you need something more specific, let me know and I can provide it next
week.
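
For code that has to run both inside and outside of EMR, one possible approach is to set the s3n keys only when they are explicitly supplied, and otherwise leave the Hadoop configuration untouched so the instance role is used. The following is a minimal sketch under that assumption; the object name S3nCredentials and its overrides method are hypothetical, and only the two fs.s3n.* property names come from the snippet earlier in this thread.

```scala
// Hypothetical helper, not part of Spark or EMR.
object S3nCredentials {
  // Property names taken from the configuration snippet in this thread.
  val AccessKeyProp = "fs.s3n.awsAccessKeyId"
  val SecretKeyProp = "fs.s3n.awsSecretAccessKey"

  // Returns the Hadoop properties to set. The map is empty unless both
  // keys are present and non-empty, in which case nothing is overridden
  // and role-based credentials (on EMR) are left in effect.
  def overrides(accessKey: Option[String],
                secretKey: Option[String]): Map[String, String] =
    (accessKey.filter(_.nonEmpty), secretKey.filter(_.nonEmpty)) match {
      case (Some(id), Some(secret)) =>
        Map(AccessKeyProp -> id, SecretKeyProp -> secret)
      case _ =>
        Map.empty
    }
}
```

With an existing SparkContext sc, the result could be applied with something like S3nCredentials.overrides(sys.env.get("AWS_ACCESS_KEY"), sys.env.get("AWS_SECRET_KEY")).foreach { case (k, v) => sc.hadoopConfiguration.set(k, v) }, where the environment variable names are also only illustrative.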

To detect EMR, you could look for the JSON files as described at
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/Config_JSON.html
if running on master or in cluster mode.
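
A check along those lines could be sketched as below. This is only an illustration: the directory /mnt/var/lib/info and the file names instance.json and job-flow.json are assumptions about where EMR places those JSON files, so verify them against the linked page for your AMI version. The EmrDetect object is a hypothetical helper, and the check only sees the files when the driver itself runs on a cluster node.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not part of Spark or EMR.
object EmrDetect {
  // Assumed location of the EMR JSON configuration files.
  val DefaultInfoDir: Path = Paths.get("/mnt/var/lib/info")

  // Assumed file names; finding either one is treated as "on EMR".
  val MarkerFiles: Seq[String] = Seq("instance.json", "job-flow.json")

  // True if at least one marker file exists as a regular file in infoDir.
  def isRunningOnEmr(infoDir: Path = DefaultInfoDir): Boolean =
    MarkerFiles.exists(name => Files.isRegularFile(infoDir.resolve(name)))
}
```

Passing the directory as a parameter keeps the check easy to exercise on a development machine by pointing it at a temporary directory.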

