nevakrien / ompify

This project forked from talkad/ompify

small change idea

Shell 0.14% JavaScript 0.02% Ruby 0.01% C++ 0.01% Python 91.88% C 4.45% ANTLR 3.49% Batchfile 0.01%

OMPify: Automated Conversion from Serial to Shared-Memory Parallelization

The full paper can be found here.

There is an ever-present need for shared-memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translating serial to parallel code. However, recent studies have shown that these compilers are impractical in many scenarios. In this work, we combine the latest advancements in AI and natural language processing (NLP) with the vast amount of open-source code to address the problem of automatic parallelization. Specifically, we propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code, given its serial version. OMPify is a Transformer-based model that leverages a graph-based representation of source code to exploit the code's inherent structure. We evaluated our tool by predicting the parallelization pragmas and attributes of a large corpus (Open-OMP-Plus) of over 54,000 snippets of serial code written in C and C++. Our results demonstrate that OMPify outperforms existing approaches, namely the popular general-purpose ChatGPT and the targeted PragFormer model, in terms of F1 score and accuracy. Specifically, OMPify achieves up to 90% accuracy on commonly used OpenMP benchmark tests such as NAS, SPEC, and PolyBench. Additionally, we performed an ablation study to assess the impact of different model components and present insights derived from that study. Lastly, we explored the potential of data augmentation and curriculum learning techniques to improve the model's robustness and generalization capabilities.
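To make the prediction task concrete, here is a minimal sketch of how an OpenMP-annotated C snippet could be split into its serial version and the labels a model would be asked to predict. This is an illustration only, not the repository's preprocessing code: the names `OMP_PRAGMA` and `split_snippet` are hypothetical, and the regex assumes each directive fits on a single line (no backslash continuations).

```python
import re

# Hypothetical helper, not taken from the OMPify code base.
# Matches a single-line directive such as
# "#pragma omp parallel for reduction(+:total)".
OMP_PRAGMA = re.compile(r"^\s*#\s*pragma\s+omp\b(.*)$")


def split_snippet(annotated_source: str):
    """Return (serial_code, directives) for an OpenMP-annotated C snippet."""
    serial_lines, directives = [], []
    for line in annotated_source.splitlines():
        match = OMP_PRAGMA.match(line)
        if match:
            # Keep only the text after "#pragma omp".
            directives.append(match.group(1).strip())
        else:
            serial_lines.append(line)
    return "\n".join(serial_lines), directives


example = """\
#pragma omp parallel for reduction(+:total)
for (int i = 0; i < n; i++) {
    total += a[i];
}"""

serial, directives = split_snippet(example)

# Labels in the spirit of the task: does the loop carry a pragma,
# and does it use a shared-memory attribute such as a reduction?
needs_pragma = bool(directives)
has_reduction = any("reduction(" in d for d in directives)

print(serial)
print(directives, needs_pragma, has_reduction)
```

The serial code is the model input in this framing, while `needs_pragma` and `has_reduction` stand in for the pragma and attribute targets.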

In this repository, you will find the dataset and source code required to reproduce the results we obtained.

Overview

Results

We compared our model, OMPify, with the baseline model PragFormer and with ChatGPT. The results are as follows:

SPEC Benchmark

| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| PragFormer | 0.445 | 0.802 | 0.572 | 0.837 |
| OMPify | 0.572 | 0.854 | 0.685 | 0.894 |

PolyBench Benchmark

| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| PragFormer | 0.703 | 0.301 | 0.422 | 0.648 |
| OMPify | 0.836 | 0.810 | 0.823 | 0.851 |

NAS Benchmark

| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| PragFormer | 0.635 | 0.734 | 0.681 | 0.634 |
| OMPify | 0.731 | 0.886 | 0.801 | 0.766 |

ChatGPT Test

| Model | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| ChatGPT | 0.401 | 0.913 | 0.557 | 0.401 |
| PragFormer | 0.8153 | 0.7215 | 0.7655 | 0.8176 |
| OMPify | 0.839 | 0.818 | 0.828 | 0.860 |
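
For readers checking the numbers, F1 is the harmonic mean of precision and recall. The sketch below recomputes it for the OMPify rows reported above; the function name `f1_score` and the `ompify_rows` dictionary are illustrative and are not part of the repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the OMPify rows above.
ompify_rows = {
    "SPEC": (0.572, 0.854, 0.685),
    "PolyBench": (0.836, 0.810, 0.823),
    "NAS": (0.731, 0.886, 0.801),
    "ChatGPT Test": (0.839, 0.818, 0.828),
}

# Recompute F1 from precision and recall, rounded to three decimals.
recomputed = {
    name: round(f1_score(p, r), 3) for name, (p, r, _) in ompify_rows.items()
}

for name, (_, _, reported) in ompify_rows.items():
    print(f"{name}: reported F1 = {reported}, recomputed F1 = {recomputed[name]}")
```

Under this definition the recomputed values agree with the reported F1 scores to three decimal places.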

Examples

  1. Load GitHub repositories within a date range and extract all for-loops from C and C++ code:
     python main.py --load "1-5-2008..10-8-2022" --parse "(c|cpp)"
  2. Show OpenMP trends on GitHub:
     python main.py --stats


Contributors

talkad
