MapReduce

Past Tasks

@Week10

  • AUG, 26, 2015
    • Attend the Symposium

@Week9

  • AUG, 21, 2015

    • [Research note] Introduction updated (authored by Yoonseung)
    • [Presentation] OOM Case Simulation contents updated (authored by Yoonseung)
  • AUG, 18, 2015

    • [OOM Case 01.] Alternative algorithm updated (authored by Yoonseung)
  • AUG, 19, 2015

    • [OOM Case 01.] Simulation executed in Pseudo-distributed operation
    • Test 05.CountSD program update (author: Soyeong Park)
  • AUG, 17, 2015

    • [OOM Case 01.] Simulation executed in Fully-distributed operation
    • [OOM Case 01.] revised (authored by Yoonseung)
    • 05.CountSD program updated (author: Soyeong Park)
    • Test 05.CountSD program update (author: Soyeong Park)

@Week8

  • AUG, 16, 2015

    • Stack Overflow OOM case uploaded (authored by Yoonseung)
  • AUG, 15, 2015

    • Tested the 04.AverageCommentLength programs (author: Soyeong Park)
  • AUG, 14, 2015

    • Tested the 03.CountMinMax programs (author: Soyeong Park)

@Week7

  • AUG, 12, 2015

    • 05.distributedGrep program updated (author: Yoonseung Choi)
    • 04.AverageCommentLength program updated (author: Soyeong Park)
  • AUG, 11, 2015

    • 03.CountMinMax program updated (author: Soyeong Park)
    • 04.ageStdDev program updated (author: Yoonseung Choi)
  • AUG, 10, 2015

    • Midterm presentation (with a Korean government visitor)
    • Draft of the 'running time table' Excel file committed
    • 00.CountLocation program updated (author: Soyeong Park)
    • 01.AverageLocation program updated (author: Soyeong Park)

@Week6

  • AUG, 7, 2015
    • Running time measurement
    • Prepare for presentation

@Week5 - Silicon Valley trip

@Week4

  • JUL, 24, 2015

    • Pseudo-distributed environment implementation
    • ageAverage program tested as a pseudo-distributed process
      • elapsed processing time: 1 min 6 s
  • JUL, 22, 2015

    • Implementation of the average-age-by-location program
    • ageAverage program tested as a single-node process
      • elapsed processing time: 1 min 27 s
  • JUL, 21, 2015

    • Study & implement XML parser [4, pp. xi-xiii]
    • Implement location counter [4, p. 22]

@Week3

  • JUL, 17, 2015
    • Analyze Stack Overflow OOM cases [3]

@Week2

  • JUL, 9, 2015

    • Analyze WordCount’s source code [1][2]
    • Study Maven
  • JUL, 7, 2015

    • Study MapReduce Architecture [1][2]
    • Study ITask programming model (*confidential draft paper)

@Week1

  • JUL, 3, 2015
    • Install Hadoop & Java
    • Execute WordCount on a single node

Todos

  • Fully-distributed environment implementation
  • Set up Raspberry Pi Hadoop clusters

References

[1] Jeffrey Dean, Sanjay Ghemawat, ‘MapReduce: Simplified Data Processing on Large Clusters’, Google, Inc.

[2] Ki-yong Han (한기용), ‘Do it! 직접 해보는 하둡 프로그래밍’ (Do it! Hands-On Hadoop Programming), Easys Publishing (이지스퍼블리싱), 2013.

[3] Lijie Xu, ‘An Empirical Study on Real-World OOM Cases in MapReduce Jobs’, Institute of Software, Chinese Academy of Sciences.

[4] Donald Miner, Adam Shook, ‘MapReduce Design Patterns’, O’Reilly Media, Inc., 2012.
