GithubHelp home page GithubHelp logo

Performance about llvm HOT 4 OPEN

llir avatar llir commented on June 1, 2024
Performance

from llvm.

Comments (4)

pwaller avatar pwaller commented on June 1, 2024 1

It would be great to improve this in the future! And there is definitely some low hanging fruit to be found. @pwaller has gotten a ~10% speed increase by making changes to Textmapper, so that the parser uses values instead of pointers for subtype (%interface) AST nodes, which puts less preasure on the Go garbage collector.

Filed: inspirer/textmapper#29

from llvm.

mewmew avatar mewmew commented on June 1, 2024

The following comparison is based on rev 9c2e8ca.

llir is 1.85x slower than LLVM to parse shell.ll of SQLite.

llir is 1.55x slower than LLVM to parse vdir.ll of Coreutils.

This puts us within 2x of LLVM, which should be good enough for the v0.3.0 release.

It would be great to improve this in the future! And there is definitely some low hanging fruit to be found. @pwaller has gotten a ~10% speed increase by making changes to Textmapper, so that the parser uses values instead of pointers for subtype (%interface) AST nodes, which puts less preasure on the Go garbage collector.

@pwaller, would you mind creating an issue to track the Textmapper work above so we don't loose it? Do you have the diff somewhere of what was needed? :)

Edit: There is also work to reduce allocations and improve performance of LLVM IR output, by defining a WriteTo methodl (see #41).

The specifics for calculating the llir vs LLVM slowdown factor is presented below.

SQLite

LLVM

average real time: 1.69

$ time opt -verify < sqlite/test/shell.ll

real 1.69
user 1.58
sys 0.09

$ time opt -verify < sqlite/test/shell.ll

real 1.68
user 1.54
sys 0.12

$ time opt -verify < sqlite/test/shell.ll

real 1.70
user 1.57
sys 0.10

llir

average real time: 3.12

$ time l-tm sqlite/test/shell.ll
asm: parsing into AST took: 1.318785166s
asm: index AST top-level entities took: 105.619707ms
asm: type resolution took: 1.635337ms
asm: create IR top-level entities took: 49.509083ms
asm: translate AST to IR took: 1.630427198s
asm: add IR definitions to IR module took: 50.121644ms
total time for file "sqlite/test/shell.ll": 3.193751643s

real 3.22
user 3.67
sys 0.25

$ time l-tm sqlite/test/shell.ll
asm: parsing into AST took: 1.332296208s
asm: index AST top-level entities took: 105.263225ms
asm: type resolution took: 1.654263ms
asm: create IR top-level entities took: 49.985311ms
asm: translate AST to IR took: 1.628654783s
asm: add IR definitions to IR module took: 49.276926ms
total time for file "sqlite/test/shell.ll": 3.205750564s

real 3.23
user 3.64
sys 0.29

$ time l-tm sqlite/test/shell.ll
asm: parsing into AST took: 1.333703175s
asm: index AST top-level entities took: 109.893044ms
asm: type resolution took: 1.653571ms
asm: create IR top-level entities took: 51.090025ms
asm: translate AST to IR took: 1.327419262s
asm: add IR definitions to IR module took: 49.214236ms
total time for file "sqlite/test/shell.ll": 2.899078532s

real 2.92
user 3.74
sys 0.28

Coreutils

LLVM

average real time: 0.22

$ time opt -verify < coreutils/test/vdir.ll

real 0.24
user 0.20
sys 0.03

$ time opt -verify < coreutils/test/vdir.ll

real 0.22
user 0.19
sys 0.02

$ time opt -verify < coreutils/test/vdir.ll

real 0.21
user 0.20
sys 0.01

llir

average real time: 0.34

$ time l-tm coreutils/test/vdir.ll
asm: parsing into AST took: 183.7889ms
asm: index AST top-level entities took: 7.405762ms
asm: type resolution took: 135.637µs
asm: create IR top-level entities took: 4.598706ms
asm: translate AST to IR took: 147.208001ms
asm: add IR definitions to IR module took: 3.592691ms
total time for file "coreutils/test/vdir.ll": 355.543323ms

real 0.36
user 0.54
sys 0.03

$ time l-tm coreutils/test/vdir.ll
asm: parsing into AST took: 168.837893ms
asm: index AST top-level entities took: 9.344586ms
asm: type resolution took: 179.538µs
asm: create IR top-level entities took: 4.867541ms
asm: translate AST to IR took: 125.596136ms
asm: add IR definitions to IR module took: 3.579648ms
total time for file "coreutils/test/vdir.ll": 322.248448ms

real 0.32
user 0.40
sys 0.04

$ time l-tm coreutils/test/vdir.ll
asm: parsing into AST took: 166.491659ms
asm: index AST top-level entities took: 9.805071ms
asm: type resolution took: 244.146µs
asm: create IR top-level entities took: 11.506275ms
asm: translate AST to IR took: 130.762989ms
asm: add IR definitions to IR module took: 3.573189ms
total time for file "coreutils/test/vdir.ll": 332.843371ms

real 0.33
user 0.38
sys 0.05

from llvm.

mewmew avatar mewmew commented on June 1, 2024

As we've reached the hand-wavy ~2x slowdown target for the v0.3.0 release, I've changed the milestone to future, as we will constantly seek to improve the performance of llir. It would be really interesting also to explore using concurrency to spin up the cores of multicore machines. I tried a naive approach for this and it seems to be a valid approach for bringing wall clock run time improvements. We just need to figure out how to do it in a less naive way, and also to find the trade-off factors for when we should spin up Go-routines and how many to run. We can define threshold limits for these.

Challenge: beat LLVM wall time performance using concurrency

Anyone interested in experimenting a bit, have a look at asm/translate.go which defines the high-level AST to IR translation order. In it, a few good candidates for concurrent execution have been identified:

Order of translation

Note: step 3 and the substeps of 4a can be done concurrently.
Note: the substeps of 4a can be done concurrently.
Note: the substeps of 4b can be done concurrently.
Note: steps 5-7 can be done concurrently.
Note: the substeps of 8 can be done concurrently.

  1. Index AST top-level entities.

  2. Resolve IR type definitions.

    a. Index type identifiers and create scaffolding IR type definitions
    (without bodies).

    b. Translate AST type definitions to IR.

  3. Translate AST comdat definitions to IR.

Note: step 3 and the substeps of 4a can be done concurrently.

  1. Resolve remaining IR top-level entities.

    a. Index top-level identifiers and create scaffolding IR top-level
    declarations and definitions (without bodies but with types).

    Note: the substeps of 4a can be done concurrently.

    1. Index global identifiers and create scaffolding IR global
      declarations and definitions, indirect symbol definitions, and
      function declarations and definitions (without bodies but with
      types).

    2. Index attribute group IDs and create scaffolding IR attribute group
      definitions (without bodies).

    3. Index metadata names and create scaffolding IR named metadata
      definitions (without bodies).

    4. Index metadata IDs and create scaffolding IR metadata definitions
      (without bodies).

    b. Translate AST top-level declarations and definitions to IR.

    Note: the substeps of 4b can be done concurrently.

    1. Translate AST global declarations and definitions, indirect symbol
      definitions, and function declarations and definitions to IR.

    2. Translate AST attribute group definitions to IR.

    3. Translate AST named metadata definitions to IR.

    4. Translate AST metadata definitions to IR.

Note: steps 5-7 can be done concurrenty.

  1. Translate use-list orders.

  2. Translate basic block specific use-list orders.

  3. Fix basic block references in blockaddress constants.

  4. Add IR top-level declarations and definitions to the IR module in order of
    occurrence in the input.

    Note: the substeps of 8 can be done concurrently.

    a. Add IR type definitions to the IR module in alphabetical order.

    b. Add IR comdat definitions to the IR module in alphabetical order.

    c. Add IR global variable declarations and definitions, indirect symbol
    definitions, and function declarations and definitions to the IR module
    in order of occurrence in the input.

    d. Add IR attribute group definitions to the IR module in numeric order.

    e. Add IR named metadata definitions to the IR module in order of
    occurrence in the input.

    f. Add IR metadata definitions to the IR module in numeric order.

from llvm.

mewmew avatar mewmew commented on June 1, 2024

Moving performance related note from TODO.md to this issue.

from llvm.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.