
hpp's Introduction

High Performance Programming Labs

This repository contains all the materials needed for the labs of the High Performance Programming course.

Each session of the course introduces a notion (Data Structures, Algorithms, Architecture) that is then applied in the following lab.

The general framework of the labs is a Maven project that processes data from the DEBS 2015 Grand Challenge. This challenge contains data from taxi trips in NYC.

You will be asked to answer queries on the data. Each query reflects the notions seen during the course. The goal is to answer these queries as fast as possible.

Installation

First, fork this project into your own account by clicking the Fork icon on this page. Clone the forked project onto your computer, then import it into Eclipse via Import->Maven Project.

Running the system

Two main classes are at your disposal. The first one, MainNoNStreaming, loads all the data into memory and then sends it to each query processor. The second one, MainStreaming, streams the data to the query processors.
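The difference between the two modes can be sketched as follows. This is a minimal illustration only, not the framework's actual code: the class DispatchSketch, its two methods, and the use of Consumer as a stand-in for a query processor are all hypothetical, and the order in which the real main classes feed the processors may differ. The sketch assumes Java 9 or later for List.of.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration; not the code of MainNoNStreaming or MainStreaming.
final class DispatchSketch {

    // Non-streaming: materialize every record in memory first,
    // then replay the loaded list to each processor.
    static <T> void runNonStreaming(Iterator<T> source, List<Consumer<T>> processors) {
        List<T> loaded = new ArrayList<>();
        while (source.hasNext()) {
            loaded.add(source.next());
        }
        for (Consumer<T> processor : processors) {
            for (T record : loaded) {
                processor.accept(record);
            }
        }
    }

    // Streaming: hand each record to every processor as soon as it is read,
    // without keeping the whole data set in memory.
    static <T> void runStreaming(Iterator<T> source, List<Consumer<T>> processors) {
        while (source.hasNext()) {
            T record = source.next();
            for (Consumer<T> processor : processors) {
                processor.accept(record);
            }
        }
    }

    public static void main(String[] args) {
        List<String> records = List.of("trip-1", "trip-2", "trip-3");
        List<String> seen = new ArrayList<>();
        Consumer<String> collector = seen::add;
        runNonStreaming(records.iterator(), List.of(collector));
        runStreaming(records.iterator(), List.of(collector));
        System.out.println("Records delivered: " + seen.size());
    }
}
```

The non-streaming variant trades memory for the ability to separate loading time from processing time, which is worth keeping in mind when comparing the measurements of the two modes.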

Additional data

The repository contains a small data file with 1000 records. This file is sufficient for testing but too small for large-scale processing. Download the 2-million-record file from here (130 MB) and unzip it into src/main/resources/data.

Create a query processor

To create a new query processor, create a new class in the package fr.tse.fi2.hpp.labs.queries.impl. Your class must extend AbstractQueryProcessor.

An example of an empty class:

public class SampleQueryProcessor extends AbstractQueryProcessor {

    public SampleQueryProcessor(QueryProcessorMeasure measure) {
        super(measure);
    }

    @Override
    protected void process(DebsRecord record) {
        // Process the record
    }

}

You must complete the process method to implement your query. This method is called for each DebsRecord sent by the framework. A DebsRecord contains the information for one taxi trip: pickup and dropoff coordinates, price paid, tip, and so on. The full list of fields is available in the file as well as here (Data Section).

Register your query processor

To be executed, your query processor must be registered in one or both of the main classes. Edit the files to add your own query processor:

	List<AbstractQueryProcessor> processors = new ArrayList<>();
	// Add your query processor here
	processors.add(new SimpleQuerySumEvent(measure));

Write Output

To add a result to the output file, call the writeLine(String line) method. It appends a line to the results/queryN.txt file, where N is the automatically generated identifier of your query processor.

Performance measure

The framework includes a basic measurement system. Global execution time, per-query execution time and throughput are automatically written to results/result.txt.

For some labs, specific instructions will be given to produce measurements with JMH.
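To make the relation between execution time and throughput concrete, here is a minimal sketch of how a throughput figure can be derived from a record count and an elapsed time. The class ThroughputSketch and its method are hypothetical; the framework's own measurement code may compute and report these values differently.

```java
// Hypothetical illustration; not the framework's measurement code.
final class ThroughputSketch {

    // Throughput in records per second, given a record count
    // and the elapsed time in nanoseconds.
    static double recordsPerSecond(long recordCount, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        return recordCount * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long processed = 0;
        for (int i = 0; i < 1_000_000; i++) {
            processed++; // stand-in for processing one record
        }
        // Guard against a zero reading on very coarse clocks.
        long elapsed = Math.max(1, System.nanoTime() - start);
        System.out.printf("%d records in %d ns, %.0f records/s%n",
                processed, elapsed, recordsPerSecond(processed, elapsed));
    }
}
```

A single timing taken this way is sensitive to JIT warm-up and garbage collection, which is one reason a dedicated harness such as JMH is used for more careful measurements.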

Queries per Session

Lab 1: Discovery.

Follow the installation instructions. Verify that everything works by running mvn install. Install the extra data in your project. Modify the main classes to parse the sorted_data.csv file.

Remove the existing query that counts the events.

To compare the performance of two implementations of the same feature, create the following queries:

  • StupidAveragePrice, which puts every new trip price into a list and computes the average from every number in the list
  • IncrementalAveragePrice, which uses the previous result to compute the average incrementally.

Execute both queries and measure the difference in running time and throughput, for both the streaming and non-streaming cases.
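The two strategies can be sketched independently of the framework as follows. This is a minimal illustration, assuming prices are handled as double values; the class names below are hypothetical, and in the lab the same logic would go inside the process method of each query processor, reading the price from the DebsRecord.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not tied to AbstractQueryProcessor or DebsRecord.
final class AveragePriceSketch {

    // Naive strategy: store every price and rescan the whole list
    // on each new record, so each update costs O(n).
    static final class NaiveAverage {
        private final List<Double> prices = new ArrayList<>();

        double add(double price) {
            prices.add(price);
            double sum = 0.0;
            for (double p : prices) {
                sum += p;
            }
            return sum / prices.size();
        }
    }

    // Incremental strategy: keep only a running sum and a count,
    // so each update costs O(1) and memory use stays constant.
    static final class IncrementalAverage {
        private double sum;
        private long count;

        double add(double price) {
            sum += price;
            count++;
            return sum / count;
        }
    }

    public static void main(String[] args) {
        NaiveAverage naive = new NaiveAverage();
        IncrementalAverage incremental = new IncrementalAverage();
        double[] prices = {10.0, 20.0, 30.0};
        double naiveResult = 0.0;
        double incrementalResult = 0.0;
        for (double price : prices) {
            naiveResult = naive.add(price);
            incrementalResult = incremental.add(price);
        }
        System.out.println("naive=" + naiveResult + " incremental=" + incrementalResult);
    }
}
```

Over n records the naive version performs on the order of n² additions in total, while the incremental one performs n, which is the gap the lab asks you to measure. An alternative incremental form updates the mean directly instead of keeping the sum.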

Lab 2: JMH.

TBD

Lab 3: Sorting Algorithms.

TBD

Evaluation

Evaluation will be based on the code available in your forked version of this project. No additional material will be accepted.

Contributors

jsubercaze
