
pace_2024's Introduction

PACE 2024 - Arcee - Student Submission

This is our submission to the Pace Challenge 2024.

This year’s challenge is about the one-sided crossing minimization problem (OCM). This problem involves arranging the nodes of a bipartite graph on two layers (typically horizontal), with one of the layers fixed, aiming to minimize the number of edge crossings. OCM is one of the basic building blocks used for drawing hierarchical graphs. It is NP-hard, even for trees, but admits good heuristics, can be approximated within a constant factor, and can be solved in FPT time. For an extended overview, see Chapter 13.5 of the Handbook of Graph Drawing. [Pace]
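
To make the objective concrete, here is a minimal sketch of how crossings can be counted for one given order of the free layer. It is not taken from our solvers: the `Edge` type, the `countCrossings` function and the brute-force pairwise comparison are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge type: `fixedPos` is the position of the endpoint on the
// fixed layer, `freeVertex` is the id of the endpoint on the free layer.
struct Edge {
    int fixedPos;
    int freeVertex;
};

// Counts edge crossings by brute force for one order of the free layer.
// `position[v]` is the position of free vertex v in that order.
// Two edges cross iff their endpoints appear in opposite relative order on
// the two layers. Runs in O(m^2) for m edges.
long long countCrossings(const std::vector<Edge>& edges,
                         const std::vector<int>& position) {
    for (const Edge& e : edges) {
        assert(e.freeVertex >= 0 &&
               static_cast<std::size_t>(e.freeVertex) < position.size());
    }
    long long crossings = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        for (std::size_t j = i + 1; j < edges.size(); ++j) {
            const long long fixedDiff =
                static_cast<long long>(edges[i].fixedPos) - edges[j].fixedPos;
            const long long freeDiff =
                static_cast<long long>(position[edges[i].freeVertex]) -
                position[edges[j].freeVertex];
            // Opposite signs mean the two edges cross; a shared endpoint
            // gives a zero difference and therefore no crossing.
            if (fixedDiff * freeDiff < 0) ++crossings;
        }
    }
    return crossings;
}
```

With the two edges (0, 0) and (1, 1), the order that places free vertex 0 before free vertex 1 has no crossing, while the reversed order has exactly one.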

Requirements

  • CMake 3.12 or higher
  • a C++17 compiler (we used gcc/g++ 13.3.0)

Set a time limit for the heuristic solver

It may be useful to set a time limit for the heuristic solver. You can do so by editing the following lines in the src/heuristic_solver/heuristic_solver.hpp file.

 public:
    explicit HeuristicSolver(std::chrono::milliseconds limit =
                                 std::chrono::milliseconds(1000 * 60 * 5 -
                                                           1000 * 15))
        : SolutionSolver(limit) {}                      

By default, the time limit is 4 minutes and 45 seconds. You can change the expression 1000 * 60 * 5 - 1000 * 15 (a value in milliseconds) to any value you want.
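
The default expression evaluates to 285000 ms. As a sketch of a more readable spelling, the same limit can be written with std::chrono units. The constant name `kDefaultLimit` is hypothetical and does not exist in the repository.

```cpp
#include <chrono>

// Hypothetical constant, not part of the repository: the same default limit
// written with chrono units instead of raw millisecond arithmetic.
constexpr std::chrono::milliseconds kDefaultLimit =
    std::chrono::minutes(5) - std::chrono::seconds(15);

// 4 min 45 s = 285 s, which is identical to 1000 * 60 * 5 - 1000 * 15 ms.
static_assert(kDefaultLimit.count() == 1000 * 60 * 5 - 1000 * 15,
              "both spellings must describe the same limit");
```

A value such as `kDefaultLimit` could be passed wherever the constructor expects a std::chrono::milliseconds argument.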

Build

Build Heuristic Solver

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
./heuristic_solver < path/to/your/gr.file

Build Exact Solver and Parameterized Solver

Requirements

  • Google OR-Tools (Highly recommended)

Ensure you have installed Google OR-Tools on your system. Take a look at the official binaries or build and install it from source. We have tested the code with Google OR-Tools version 9.10.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_ILP_SOLVER=ON .. 
make
./feedback_edge_set_solver < path/to/your/gr.file

When everything is set up correctly, you should see the following line somewhere in the output:

...
-- Check ILP Solver: ON
...

If you cannot install Google OR-Tools on your system, you can use the following commands to build the exact solver and the parameterized solver without it (but they will be much slower). Make sure to remove the build directory before rebuilding the project with OR-Tools.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release  .. 
make
./feedback_edge_set_solver < path/to/your/gr.file

Submitting to Optil.io

Since Optil.io does not seem to build in Release mode, you can uncomment the following line in the CMakeLists.txt file to submit to Optil.io:

# Set the build type to Release (needed for optil.io)
#set(CMAKE_BUILD_TYPE Release)

Rebuilding the project

Ensure you have removed the build directory before rebuilding the project, for example when activating or deactivating the ILP solver.

rm -rf build

pace_2024's People

Contributors

jtshark, lucidluckylee, jessepalarus, spalarus, charlyhauser, gdorndorf, kimonboehmer

pace_2024's Issues

A few suggestions for the code

I've spotted a few minor efficiency issues in the code, hopefully those can scrape a little bit of performance still :)
There is no obligation to implement any of them, they are just suggestions, and they might even be wrong because I don't fully understand all the code. If anything is unclear, I'm up for discussion :)
Here's the list:

  1. RR1 will produce a ton of cache-misses because the two accesses to the crossing matrix are [a][b] and [b][a], which are unlikely to be in the same cache-line; it should be possible to do this more cache-friendly. Likewise, in RR3 and RRLarge, accesses to [a][b] will be WAY more frequent than accesses into [b][a], so make sure these accesses are cache-friendly.

    } else if (graph.crossing.matrix[b][a] == 0) {

    On the subject of cache-misses, CrossingMatrix::comparable(a,b) checks in matrix[b][a] so be aware that chaining calls to comparable(a,b) will be more efficient when grouped by b, not by a.
    return lt(a, b) || lt(b, a) || a == b;

    Speaking of crossing matrices, it seems that you're restricting the matrix to only be used when there are 10,000 or fewer nodes, in which case it would be sufficient to store uint16_t in the matrix instead of ints to save a factor of 2 on the space.

  2. std::vector<bool> is a space-optimized specialization in the standard library: it usually packs one bool per bit and hands out proxy objects instead of bool&, so every access needs extra bit manipulation and it does not behave like other vectors. I recommend std::bitset if the size is known at compile time, or use Boost's dynamic_bitset, or search the internet, or implement your own, or use mine (no guarantees, also needs 2-3 header dependencies from the same repo and needs C++20 for concepts).

    std::vector<bool> already_deleted(graph.size_free, false);

  3. In C++, a class with only public data members is called a struct

    class FeedbackEdgeHeuristicParameter {

  4. It is recommended to use std::accumulate for this; in general, loops should be avoided if there is an STL algorithm that does it already, since those are highly engineered

  5. There is no need to declare a default-constructor, C++ will generate one for you (rule of zero).

  6. std::shared_ptr are inefficient when compared to raw pointers. If you get your ownership right, then you won't need shared pointers. Indeed, the owner of the Circles and the Edges is clearly the FeedbackEdgeInstance and it is not shared between the FeedbackEdgeInstance and its Edges. As long as no Edge or Circle lives longer than its FeedbackEdgeInstance (and they really should not), you'll be fine with FeedbackEdgeInstance storing a vector (unordered_set if you need to check containment) of unique_ptr<Edge> and replacing all other shared_ptr<Edge> with Edge* which has WAY less overhead (same with Circles). Note that I didn't check for side effects (do you extract a list of shared_ptr<Edge> from an instance that you're then destructing? In that case, keep the shared_ptr).

    std::vector<std::shared_ptr<Circle>> circles;

  7. Did I get this right, that Edge has a numberOfCircles? Can this not be returned from Edge::circles::size()?

    int numberOfCircles = 0;

  8. You're already using ranges when you say for(auto& e: edges), so you can also use std::ranges::sort for sorting

  9. You have a few functions that are way too long and should be split into smaller functions with descriptive names if possible

    metaRapsConstruction(FeedbackEdgeInstance &instance,

  10. std::remove may take linear time in the vector length, are you sure you're not better off with an unordered_set from which you can remove in constant time?

    edges.erase(std::remove(edges.begin(), edges.end(), usedEdge),

  11. I will make a separate issue for const-correctness, but here's one issue I saw:

    for (int i = 1; i <= current_order.size(); ++i) {

    for (int i = 1; i <= current_order.size(); ++i) -- if you don't tell the compiler that current_order is constant, it might not be able to optimize away the call to size(), which will always return the same thing. If the compiler knew that this will not change, it could translate this into one call to size() at the beginning and then just use that value instead of calling size() each time in the loop.

  12. This should not be done with vector::insert, not even with vector::push_back but with std::iota.

    for (int i = 0; i < graph.size_free; ++i) {

  13. vector::insert may take linear time in the vector length, it may have to re-allocate and copy everything after the insertion point. Is a std::vector even the right data-structure here or should this rather be a linked-list into which we can insert in constant time? Another possibility is using a std::deque because it has better memory management when it comes to inserting items in random positions. At least, allocate the correct size of the vector in the beginning of the function using vector::reserve to avoid re-allocations.

    current_order.insert(current_order.begin() + best_position,

  14. calculatingCrossingNumber is a function that will be called extensively. You should take care that the check for whether the matrix is initialized or not is not done all the time.
    https://github.com/lucidLuckylee/pace_2024/blob/72cab00613e21c8a4372f41262458df7824560f5/src/pace_graph/pace_graph.cpp#L391C33-L391C58
    Personally, I would template the class on whether or not the number of nodes is too big, so the check will be done only once in runtime. Also, you'd never have to call initilize_if_possible because the constructor would initialize if it was possible.

  15. Here, the vector-initialization is a little strange.

    std::vector<double> nodeOffset = std::vector<double>(graph.size_free);

  16. There are checks here that don't depend on the loop variable, so they should be done outside the loop.

    if (meanPositionParameter.useJittering && iteration != 0) {

  17. Calling the vector constructor in the for-loop causes a memory allocation on each iteration; it's better to re-use the same vector and clear it via vector::clear which is much faster than destruction + construction.

    std::vector<double> newNodeOffset = std::vector<double>(nodeOffset);

  18. If you're not changing your parameters for the local search, it would be better to replace the run-time if-checks with compile-time if constexpr checks.

    switch (meanPositionParameter.meanType) {

  19. In the time-critical part of jittering, you're actually copying the neighborhood-vector of the node i instead of just getting a reference to it (actually, you should get a const-reference to it, since you're not going to modify the neighbors).

    auto neighbors_of_node = graph.neighbors_free[i];

    same issue here
    auto crossing_matrix_for_i = graph.crossing.matrix[i];

    and probably on multiple more occasions.

  20. It might be worth thinking about whether or not the crossing matrix should cache the sum of each row separately, so you wouldn't have to re-compute it every time here

    auto crossing_matrix_for_i = graph.crossing.matrix[i];

  21. maybe use structured bindings here, as you have before (auto [x, y] = ...):

    std::tie(crossing_matrix_u_v, crossing_matrix_v_u) =

  22. Maybe the class Order does not need to translate from vertices to positions.
    The translation is used here:

    int posOfV = order.get_position(v);

    and
    int posOfV = bestOrder.get_position(v);

    where I think it can be made obsolete by iterating through order instead of the free vertices of the graph
    However, I'm not so sure that it can be avoided here:
    bool usedInGlobalUB = instance.globalUBOrder.get_position(e->end) <

    I don't think there are other calls to the translation.

  23. Why not just break here if i == 0 ?
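
A few of the suggestions above (items 4, 12 and 19) can be sketched in isolation. The function names and the `neighbors` adjacency layout below are hypothetical stand-ins and not the repository's code, so treat this as an illustration of the idioms rather than a patch.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Item 12: fill a vector with 0, 1, ..., n-1 using std::iota instead of
// repeated insert/push_back calls.
std::vector<int> identityOrder(std::size_t n) {
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    return order;
}

// Item 4: sum a row with std::accumulate instead of a hand-written loop.
// The 0LL initial value makes the accumulator a long long.
long long rowSum(const std::vector<int>& row) {
    return std::accumulate(row.begin(), row.end(), 0LL);
}

// Item 19: bind a const reference to the inner vector instead of copying it.
// `neighbors` is a hypothetical adjacency list with one vector per vertex;
// vertices without neighbors get a mean of 0.0.
std::vector<double> meanNeighborPositions(
    const std::vector<std::vector<int>>& neighbors) {
    std::vector<double> means;
    means.reserve(neighbors.size());  // one allocation up front
    for (std::size_t v = 0; v < neighbors.size(); ++v) {
        const auto& nbrs = neighbors[v];  // reference, no copy
        means.push_back(nbrs.empty()
                            ? 0.0
                            : static_cast<double>(rowSum(nbrs)) /
                                  static_cast<double>(nbrs.size()));
    }
    assert(means.size() == neighbors.size());
    return means;
}
```

Writing `auto nbrs = neighbors[v];` instead of `const auto& nbrs` would compile just as well but copy the whole inner vector on every iteration, which is the cost item 19 points at.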

const-correctness

Again, just a few hints for increasing performance and correctness of code, there is no obligation whatsoever to do this.

I noticed that your code is very parsimonious with the keyword const, which is a pity because, by expressing that something is constant, you not only help the compiler optimize things (see the other issue), but you also help your future self understand an intent behind a variable or a method (in case you didn't know: methods can be annotated as const stating that they will not change the member variables; this allows calling const-methods of objects that were declared const).
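
Here is a small sketch of what const member functions buy you. The `Order` class below is a simplified, hypothetical stand-in and not the repository's class of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified, hypothetical stand-in for an order of free vertices.
class Order {
  public:
    explicit Order(std::vector<int> vertices)
        : vertices_(std::move(vertices)) {}

    // const member functions promise not to modify the object, so they can
    // be called through a const reference.
    std::size_t size() const { return vertices_.size(); }
    int vertexAt(std::size_t pos) const {
        assert(pos < vertices_.size());
        return vertices_[pos];
    }

    // Non-const member function: only callable on a mutable Order.
    void swapPositions(std::size_t a, std::size_t b) {
        std::swap(vertices_[a], vertices_[b]);
    }

  private:
    std::vector<int> vertices_;
};

// Taking the order by const reference documents that this function only
// reads it; calling order.swapPositions(...) in here would not compile.
// Assumes a non-empty order.
int lastVertex(const Order& order) {
    return order.vertexAt(order.size() - 1);
}
```

If `size()` or `vertexAt()` were not marked const, `lastVertex` could not take its argument by const reference, and the read-only intent would be lost for both the reader and the compiler.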

There are other pitfalls: if you call operator[] on an std::unordered_map with an argument that is not in the map, then an entry in the map is created (this is called "unintentional use of brackets-operator"). operator[] is not available on a const std::unordered_map, so such a call fails to compile; the const-friendly alternative at() throws std::out_of_range if the argument is not found in the map, which would be way better if you didn't intend to create a new entry in the map.
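
A minimal sketch of the difference between the two lookups, with made-up function names and map contents. Note that operator[] is a non-const member of std::unordered_map, so it cannot be called on a const map at all, whereas at() works on const maps and throws std::out_of_range for a missing key.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical lookup helper: returns the stored value, or `fallback` if the
// key is missing. at() is usable on a const map and never inserts.
int lookupOr(const std::unordered_map<std::string, int>& table,
             const std::string& key, int fallback) {
    try {
        return table.at(key);  // throws std::out_of_range if key is absent
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Demonstrates the pitfall on a copy of the map: operator[] silently
// inserts a value-initialized entry for a missing key.
std::size_t sizeAfterBracketLookup(std::unordered_map<std::string, int> table,
                                   const std::string& key) {
    (void)table[key];  // inserts {key, 0} if key was absent
    assert(table.count(key) == 1);
    return table.size();
}
```

For lookups where a missing key is expected rather than exceptional, find() avoids both the accidental insertion and the cost of an exception.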

Here's a great talk by Jason Turner on how using const helps avoid "Code Smells" in C++: https://www.youtube.com/watch?v=f_tLQl0wLUM
