ai-align's Projects

antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

boto

Python interface to Amazon Web Services

cf-baesd-recsys

Recommender system based on collaborative filtering, using the Netflix data

ctci

Cracking the Coding Interview

gym

A toolkit for developing and comparing reinforcement learning algorithms.

mapreduce_facility

Project 3: Design and Implementation Guidelines

General

If you make any assumptions or simplifications, please state them clearly in your report. You are welcome to use Java's RMI, or your own RMI framework from Project 2. Both are allowed. You are not allowed to store any data in AFS to share it between different components in your system. As an exception, you may use AFS when its use does not replace some functionality of your system. For example, for DFS bootstrapping, it is acceptable to have the original files stored in AFS so that all the data nodes can read them and create replicas as needed. Once the file system is set up, the data in AFS must no longer be used.

Report

As mentioned in the project handout, avoid describing your design in terms of classes and methods. We are interested in what components work together to accomplish the goals in your system, and how they do it. For example, we would like to know how the servers and clients are deployed (where, how many, etc.), rather than what classes and methods realize the server functionality. The latter is best placed in comments in your code.

MapReduce Computation Model

Your MapReduce framework is not required to explicitly support global sorting of final results. It suffices for the results produced by each Reducer to be locally sorted. If you would like to support global sorting, you are welcome to make intermediate key-value pairs produced by Mappers available to Reducers running on different machines. This can be done by, for example, writing them to your distributed file system instead of the local file system of the Mapper nodes, or by transferring them directly to Reducers over the network.

Scheduling

The scheduler should try to enforce work conservation and launch as many Mappers and Reducers as the task's computation model allows. Specifically, it should exploit multiple CPU cores on each compute node. On Andrew machines, reading /proc/cpuinfo will give you information about the number of cores. The scheduler should have a reasonable policy that considers the cost of data accesses (local or remote) and overall task throughput. For example, it is undesirable to leave some available cores idle simply because Mappers on them could access input data only remotely.

Distributed File System

You are allowed to assume that input files for Mappers are text files and that Mappers process input line by line. You can then split them into replicas at line breaks, so that individual input entries processed by Mappers do not span multiple replicas. This will also simplify determining the cost of a particular scheduling decision. Upon data node failures, it is recommended (although not required) that your file system create new replicas for those on the failing nodes, to keep the replication factor enforced. On bootstrapping, the replication factor must be enforced.
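Two of the points above, counting cores from /proc/cpuinfo and splitting input at line breaks, can be illustrated with a minimal sketch. It is written in Python for brevity and is not part of the project handout; the function names `count_cores` and `split_at_line_breaks` and the `max_chunk_bytes` parameter are hypothetical, and the core count assumes the common Linux layout with one `processor : N` entry per logical CPU.

```python
def count_cores(cpuinfo_text: str) -> int:
    """Count logical CPUs by counting 'processor' entries in /proc/cpuinfo text.

    Assumption: one 'processor : N' line per logical CPU (typical on x86 Linux).
    """
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def split_at_line_breaks(data: bytes, max_chunk_bytes: int) -> list[bytes]:
    """Split data into chunks that always end right after a newline.

    Each chunk is at most max_chunk_bytes long unless a single line is
    longer than that, in which case the chunk is extended to the end of
    that line so no input entry spans two chunks.
    """
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + max_chunk_bytes, len(data))
        if end < len(data):
            # Prefer the last newline that fits inside the size limit.
            cut = data.rfind(b"\n", start, end)
            if cut == -1:
                # The line is longer than the limit: extend to its end.
                cut = data.find(b"\n", end)
                end = len(data) if cut == -1 else cut + 1
            else:
                end = cut + 1
        chunks.append(data[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    # On a Linux machine this reads the real file; elsewhere it will fail.
    with open("/proc/cpuinfo") as f:
        print("logical cores:", count_cores(f.read()))
```

Because every chunk ends on a line boundary, concatenating the chunks reproduces the original file, and a scheduler can treat each chunk as an independent unit when weighing local against remote reads.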

new-project

This repository contains a template you can use to seed a repository for a new open source project.

node

Evented I/O for V8 JavaScript
