paradoxicalturn / scala-hadoop-example

This project forked from milesegan/scala-hadoop-example

A translation of the WordCount example from the Hadoop tutorial from Java to Scala.

Scala 83.23% Shell 16.77%

scala-hadoop-example's Introduction

summary

This is a translation of the WordCount example from the Apache Hadoop Map/Reduce Tutorial to Scala. I ran into a few snags making this work myself, so I thought I'd bundle up a working example and hopefully save other people some trouble.

I've tried to follow the Java example as closely as possible in the Scala version, so I haven't tried to impose any higher level of abstraction on the code, even though you can imagine building something much more expressive on top of this with Scala.

I chopped out the extra argument parsing logic that was in the Java example because I think it just obscures the point of the example. Adding it back to the Scala version is left as an exercise for the reader.

Note: this example requires Scala 2.8.

running the Scala WordCount example

  1. install Hadoop and make sure the hadoop script is on your path
  2. install Scala and make sure scalac is on your path
  3. copy the hadoop-core jar from the root directory of the Hadoop distribution to the directory in which you've checked out this tutorial
  4. copy the commons-logging jar from the lib directory of the Hadoop distribution to the directory in which you've checked out this tutorial
  5. copy the commons-cli jar from the lib directory of the Hadoop distribution to the directory in which you've checked out this tutorial
  6. copy scala-library.jar from the lib directory of the Scala distribution to the directory in which you've checked out this tutorial
  7. run the Scala version of WordCount with the run.sh script included here
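
The setup steps above could be scripted roughly as follows. This is only a sketch and not part of the repository: the function names `check_tools` and `copy_jars` are made up for illustration, and the glob patterns assume the jars carry a version number in their file names (for example `hadoop-0.20.2-core.jar`), which may differ in your distribution.

```shell
#!/bin/sh
# Hypothetical helpers for steps 1-6; adjust the patterns to your install.

# Steps 1-2: make sure the hadoop script and scalac are on the PATH.
check_tools() {
    for tool in hadoop scalac; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "$tool is not on your PATH" >&2
            return 1
        }
    done
}

# Steps 3-6: copy the required jars into the checkout directory.
# Usage: copy_jars HADOOP_DIR SCALA_DIR DEST_DIR
copy_jars() {
    hadoop_dir=$1
    scala_dir=$2
    dest_dir=$3

    # Step 3: hadoop-core jar from the root of the Hadoop distribution.
    # The file name pattern is an assumption about how the jar is named.
    cp "$hadoop_dir"/hadoop-*core*.jar "$dest_dir"/ || return 1

    # Steps 4-5: commons-logging and commons-cli from Hadoop's lib directory.
    cp "$hadoop_dir"/lib/commons-logging-*.jar "$dest_dir"/ || return 1
    cp "$hadoop_dir"/lib/commons-cli-*.jar "$dest_dir"/ || return 1

    # Step 6: scala-library.jar from the lib directory of the Scala distribution.
    cp "$scala_dir"/lib/scala-library.jar "$dest_dir"/ || return 1
}
```

From the checkout directory you might call `check_tools` and then `copy_jars /path/to/hadoop /path/to/scala .` (both paths are placeholders) before running run.sh as in step 7.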
