pipeline-optimization's Introduction

Solution 1: Data Pipeline Prioritization.py

Prioritizes the tasks with the highest importance and, among those, the longest running time. (Importance is the sum of the minutes left in dependent tasks from the same group, plus tasks without a group.) Yields a minimum-time path for both the small and the big pipeline. The most efficient of all the solutions.
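A minimal sketch of this kind of priority scheduling, not the code from the repository. The names `downstream_minutes` and `schedule`, and the input format (a dict of task minutes plus a dict of dependencies), are assumptions made for illustration. Importance is simplified here to the total minutes of every task that transitively depends on a task, ignoring groups.

```python
import heapq

def downstream_minutes(tasks, deps):
    """Simplified importance (an assumption): minutes of all transitive dependents."""
    children = {t: [] for t in tasks}
    for task, parents in deps.items():
        for parent in parents:
            children[parent].append(task)
    memo = {}

    def descendants(task):
        if task not in memo:
            found = set()
            for child in children[task]:
                found.add(child)
                found |= descendants(child)
            memo[task] = found
        return memo[task]

    return {t: sum(tasks[d] for d in descendants(t)) for t in tasks}

def schedule(tasks, deps, cores, priority):
    """Simulate the pipeline on `cores` identical cores; return total minutes."""
    waiting = {t: set(deps.get(t, ())) for t in tasks}
    ready = [t for t in tasks if not waiting[t]]
    pending = [t for t in tasks if waiting[t]]
    running = []  # heap of (finish_time, task)
    now = 0
    while ready or running:
        # Highest priority first; the sort is stable, so ties keep their order.
        ready.sort(key=priority, reverse=True)
        while ready and len(running) < cores:
            task = ready.pop(0)
            heapq.heappush(running, (now + tasks[task], task))
        # Jump to the next finish time and collect every task ending then.
        now, task = heapq.heappop(running)
        finished = {task}
        while running and running[0][0] == now:
            finished.add(heapq.heappop(running)[1])
        # Release tasks whose dependencies are now all finished.
        for t in pending[:]:
            waiting[t] -= finished
            if not waiting[t]:
                pending.remove(t)
                ready.append(t)
    return now
```

With a hypothetical pipeline `tasks = {"X": 2, "Y": 2, "Z": 2, "W": 6}`, `deps = {"W": ["Z"]}` and two cores, using `priority=lambda t: (importance[t], tasks[t])` starts `Z` first so that the long task `W` is unblocked early, whereas a constant priority leaves `W` waiting behind `X` and `Y`.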

Solution 2: Data Pipeline Planning Monte Carlo.py

When selecting the next task to process, chooses a random task with a dependency time of 0 (that is, one that doesn't depend on other tasks, or whose dependencies have already been processed). Loops through the code 1000 times, saves the results, and prints the minimum amount of time. Not the most efficient solution, since the bigger the pipeline and the more CPUs there are, the longer it takes. There is also no guarantee that it finds the optimal solution.
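The random-selection loop can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `simulate` and `monte_carlo` are hypothetical names, tasks are given as a dict of minutes, and dependencies as a dict mapping a task to the tasks it waits for.

```python
import random

def simulate(tasks, deps, cores, rng):
    """One random run: whenever a core is free, start a random ready task."""
    waiting = {t: set(deps.get(t, ())) for t in tasks}
    running = {}  # task -> finish time
    done = set()
    now = 0
    while len(done) < len(tasks):
        # Ready = not started yet and every dependency already finished.
        ready = [t for t in tasks
                 if t not in done and t not in running and waiting[t] <= done]
        while ready and len(running) < cores:
            task = ready.pop(rng.randrange(len(ready)))
            running[task] = now + tasks[task]
        # Advance the clock to the next finish and retire the finished tasks.
        now = min(running.values())
        for task in [t for t, end in running.items() if end == now]:
            del running[task]
            done.add(task)
    return now

def monte_carlo(tasks, deps, cores, runs=1000, seed=0):
    """Repeat the random simulation and keep the smallest total time seen."""
    rng = random.Random(seed)
    return min(simulate(tasks, deps, cores, rng) for _ in range(runs))
```

The result is only the best schedule found among the sampled runs. On larger pipelines the number of possible orderings grows quickly, so a fixed number of runs covers a shrinking share of them.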

Solution 3: Data Pipeline Planning Combinations.py

When a CPU core is idle, finds all possible combinations of idle CPU cores and tasks that are ready to execute (filtering out combinations that are equivalent). When more than one combination is available, the first combination of tasks is loaded into the CPU and the rest of the combinations are saved into a list of HistoryState objects. Each HistoryState preserves the current state of the execution with another option of tasks loaded into the CPU cores. This way all combinations of all possible pipeline paths are simulated. Takes a long time when there are too few CPU cores. (Tried to optimize it with multithreading. However, since it's Python, it actually took longer to run.)
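A rough sketch of the exhaustive idea, under assumptions: saved states are plain tuples on a stack rather than the HistoryState objects described above, the function name `best_makespan` is hypothetical, and cores are treated as identical so that `itertools.combinations` already filters out equivalent assignments. The sketch only considers schedules that never leave a core idle while a task is ready.

```python
from itertools import combinations

def best_makespan(tasks, deps, cores):
    """Try every way of filling idle cores; return the minimum total minutes."""
    best = float("inf")
    # Saved state: (current time, finished tasks, running {task: finish time}).
    stack = [(0, frozenset(), {})]
    while stack:
        now, done, running = stack.pop()
        if len(done) == len(tasks):
            best = min(best, now)
            continue
        if now >= best:
            continue  # this branch cannot beat the best result so far
        ready = [t for t in tasks
                 if t not in done and t not in running
                 and set(deps.get(t, ())) <= done]
        slots = min(cores - len(running), len(ready))
        # Each combination of ready tasks is one alternative to explore later.
        for combo in combinations(ready, slots):
            started = dict(running)
            for task in combo:
                started[task] = now + tasks[task]
            end = min(started.values())
            finished = {t for t, e in started.items() if e == end}
            still_running = {t: e for t, e in started.items() if t not in finished}
            stack.append((end, done | finished, still_running))
    return best
```

The number of saved states grows with the number of ways to choose which ready tasks to start, which is largest when many tasks compete for few cores. That is consistent with the long run times noted above.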

Solution 4: Data Pipeline Planning BFS.py (Unfinished)

Attempt to use Breadth First Search. The idea was to build combinations of possible paths (objects that contain a list of Tasks) to load into a CPU core when it is available. Couldn't figure it out. Works only on pipeline_small.txt.

For all solutions, I have assumed that groups of tasks have to be executed in the order 'raw', 'feature', 'model', 'meta_models'.
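One possible reading of this assumption, sketched as a readiness check: a task may start only when no unfinished task belongs to an earlier group. The helper name `group_allows` and the treatment of tasks without a group (never blocked, never blocking) are assumptions for illustration, not taken from the solutions.

```python
GROUP_ORDER = ["raw", "feature", "model", "meta_models"]
RANK = {group: i for i, group in enumerate(GROUP_ORDER)}

def group_allows(task_group, unfinished_groups):
    """True if no unfinished task belongs to a group earlier than task_group."""
    if task_group not in RANK:
        return True  # assumption: tasks without a group are not constrained
    return all(RANK[g] >= RANK[task_group]
               for g in unfinished_groups if g in RANK)
```

For example, `group_allows("model", {"feature", "model"})` is false because a 'feature' task is still unfinished, while `group_allows("raw", {"raw", "feature"})` is true.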

Contributors

hristohr
