llir / llvm
Library for interacting with LLVM IR in pure Go.
Home Page: https://llir.github.io/document/
License: BSD Zero Clause License
Just want to question a4f2487. Is there a good rationale for it? One rule I try to abide by is "only have one name for a thing, to the extent it is possible". So it should be type Func xor func NewFunction.
Corresponds to requirement 8, decomp/decomp#98.
LLVM distinguishes between unnamed local variables (e.g. %42), and named local variables (e.g. %"42"). To be compatible, we should too.
Example test case llvm/test/Analysis/DominanceFrontier/new_pm_test.ll:
define void @a_linear_impl_fig_1() nounwind {
0:
br label %"1"
1:
br label %"2"
2:
br label %"3"
3:
br i1 1, label %"13", label %"4"
4:
br i1 1, label %"5", label %"1"
5:
br i1 1, label %"8", label %"6"
6:
br i1 1, label %"7", label %"4"
7:
ret void
8:
br i1 1, label %"9", label %"1"
9:
br label %"10"
10:
br i1 1, label %"12", label %"11"
11:
br i1 1, label %"9", label %"8"
13:
br i1 1, label %"2", label %"1"
12:
switch i32 0, label %"1" [ i32 0, label %"9"
i32 1, label %"8"]
}
When parsing the example file above, we currently get the error:
invalid local ID in function "@a_linear_impl_fig_1", expected %12, got %13
This is because the basic block names are treated as unnamed IDs, and their order is out of place, since basic block 13 appears before 12.
I think it would be very useful to have a way of finding the operands of an instruction.
Here's an example of a code generator which makes such a function.
https://gist.github.com/pwaller/255654cd78b77484a02cdfaa6a22237c
In the end, I don't know exactly how I feel about it (especially the code generation part...).
But I guess this gets me quite close to being able to implement a simple dead code pass which can kill unused private functions.
Example use:
func loopOperands(irModule *ir.Module) {
	for _, f := range irModule.Funcs {
		for _, bb := range f.Blocks {
			for _, i := range bb.Insts {
				log.Println("inst:", i.Def())
				var tmp [16]irvalue.Value
				for _, o := range ir.Operands(tmp[:0], i) {
					log.Println(" op:", o)
				}
			}
		}
	}
}
I note that Operands() could almost return pointers to values, so that the references were mutable. However, this is broken. The only reason it is broken, as far as I can tell, is the Scope field on InstCatchPad and InstCleanupPad. I think if we want to be able to obtain mutable references to operands, those fields should become of type value.Value. I guess there are pros and cons to that. But if you want mutable references to operands, I think the alternatives are going to be much uglier.
The intention is to provide read support for LLVM IR assembly using a Gocc generated lexer and parser from a BNF grammar of the LLVM IR assembly language.
The BNF grammar is located at ast/internal/ll.bnf. The reason to keep the grammar in an internal directory is that the lexer and parser packages generated by Gocc will be considered internal packages, and should not be used by end-users directly. Instead, high-level libraries will make use of these internal packages to parse LLVM IR assembly into the data structures of the llir/llvm/ir package.
Since LLVM IR makes use of unnamed local variables and basic blocks, a context is required to keep track of and map local IDs to their associated values. A bit unfortunate, but this essentially means we cannot use syntax directed translation to translate directly from LLVM IR assembly to the data structures of the ir package. Instead, we must introduce an intermediate step which keeps the necessary information around for us to create and make use of this contextual information. Said and done, the current approach is to define an ast package for LLVM IR assembly, which will later be traversed to create the aforementioned context and translate AST nodes into their corresponding ir data types.
To get a feel for what the production action expressions of Gocc look like, see the following example.
FuncDef
: "define" OptFuncLinkage
FuncHeader FuncBody << irx.NewFuncDef($2, $3) >>
;
If anyone manages to figure out a clean way to skip this intermediate step (i.e. not having to translate from BNF grammar to AST, then from AST to ir data types) and go directly from the BNF grammar to the ir package data types using production action expressions, please let us know. This would greatly facilitate the maintainability and future development of this package!
Currently, the type of struct, array and vector constants is inferred from the elements and fields passed to their respective constructors. The idea was to make it easier for users to create these constants. However, there are valid cases where the user may wish to pass a specific type to these constructors, especially the NewStruct constructor, as struct types are equated by type identity and not structural equality.
Also, for consistency with NewInt, NewFloat and other constructors of constants, we may wish to add a type as the first argument to the constructors NewStruct, NewArray and NewVector.
I'll leave this open for discussion, so we can collect different benefits and drawbacks with the various approaches.
To be specific, this issue suggests to update the constant.NewStruct, constant.NewArray and constant.NewVector constructors, as follows:
package constant
-func NewStruct(fields ...Constant) *Struct
+func NewStruct(t *types.StructType, fields ...Constant) *Struct
-func NewArray(elems ...Constant) *Array
+func NewArray(t *types.ArrayType, elems ...Constant) *Array
-func NewVector(elems ...Constant) *Vector
+func NewVector(t *types.VectorType, elems ...Constant) *Vector
Input:
define void @fn() {
call void @"quoted1"()
call void @"quoted 2"()
ret void
}
declare void @"quoted1"()
declare void @"quoted 2"()
Output:
panic: unable to locate global identifier "\"quoted1\""
goroutine 1 [running]:
github.com/llir/llvm/asm/internal/astx.(*fixer).getGlobal(0xc421d896a8, 0xc420012381, 0x9, 0x5fa520, 0xc421d45360)
/home/dominikh/prj/src/github.com/llir/llvm/asm/internal/astx/fix.go:345 +0x12c
[...]
getGlobal and getLocal in astx/fix.go do not handle quoted names correctly. The maps store unquoted names, but the lookup includes the surrounding quotes.
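A minimal sketch of the kind of fix: normalize the identifier by stripping the surrounding quotes before the map lookup. The unquote helper and map shape below are hypothetical, not the actual fix.go code.

```go
package main

import (
	"fmt"
	"strings"
)

// unquote removes the surrounding double quotes from a quoted LLVM
// identifier such as `"quoted 2"`; unquoted names pass through as-is.
// (Escape sequences inside the quotes are ignored in this sketch.)
func unquote(name string) string {
	if len(name) >= 2 && strings.HasPrefix(name, `"`) && strings.HasSuffix(name, `"`) {
		return name[1 : len(name)-1]
	}
	return name
}

func main() {
	// The maps store unquoted names; normalize keys before lookup.
	globals := map[string]bool{"quoted1": true, "quoted 2": true}
	fmt.Println(globals[unquote(`"quoted1"`)])  // true
	fmt.Println(globals[unquote(`"quoted 2"`)]) // true
	fmt.Println(globals[unquote("missing")])    // false
}
```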
At the moment, every Def() allocates its own string builder. This results in a lot of allocation and copying overhead for building IR outputs.
I don't (yet) have a benchmark showing this to be a problem, but in terms of API it would be nice to supply a writer and have the llvm package write directly there.
This issue is a reminder to come back to this.
This idea was originally presented by @quarnster in #3 (comment). Creating a dedicated issue to track any discussions on the LLVM-dev mailing list, and implementation discussions. Please post updates in this issue if you read about the direction in which the definition of a stable C LLVM API is heading (this is still an active topic of discussion).
With go 1.5's support for creating c-shared libraries, I really like the idea of having a go generate tool which generates the standard C LLVM api (or the implemented subset anyways). That way this code could be used as a drop in replacement for anything that currently uses the LLVM C api, presuming all the functions used are implemented.
Just figured I'd mention this as it's been on my mind due to the discussion on the llvm-dev list about potentially splitting the LLVM C api into a separate project.
The documentation for NewCall says the callee may have one of the following types:
*ir.Function
*types.Param
*constant.ExprBitCast
*ir.InstBitCast
*ir.InstLoad
*ir.InlineAsm
However, ir.InlineAsm doesn't implement the value.Named interface, so trying to use it as the callee parameter results in a compile-time error about *ir.InlineAsm not implementing GetName().
Corresponds to requirement 7, decomp/decomp#97.
Ref: https://travis-ci.org/llir/llvm/builds/180995728
### gofmt
./asm/internal/token/token.go
./asm/internal/lexer/lexer.go
./asm/internal/lexer/transitiontable.go
./asm/internal/lexer/acttab.go
./asm/internal/parser/actiontable.go
./asm/internal/parser/productionstable.go
./asm/internal/parser/gototable.go
./asm/internal/parser/action.go
./asm/internal/parser/parser.go
./asm/internal/util/rune.go
./asm/internal/util/litconv.go
./asm/internal/errors/errors.go
Extract from gofmt -d
diff ./util/litconv.go gofmt/./util/litconv.go
--- /tmp/gofmt318333961 2016-12-04 12:01:27.498715220 +0100
+++ /tmp/gofmt274563540 2016-12-04 12:01:27.498715220 +0100
@@ -1,4 +1,3 @@
-
// generated by gocc; DO NOT EDIT.
//Copyright 2013 Vastech SA (PTY) LTD
diff ./util/rune.go gofmt/./util/rune.go
--- /tmp/gofmt147755799 2016-12-04 12:01:27.502048553 +0100
+++ /tmp/gofmt271031690 2016-12-04 12:01:27.502048553 +0100
@@ -1,4 +1,3 @@
-
// generated by gocc; DO NOT EDIT.
//Copyright 2013 Vastech SA (PTY) LTD
diff ./parser/parser.go gofmt/./parser/parser.go
--- /tmp/gofmt528173389 2016-12-04 12:01:27.522048553 +0100
+++ /tmp/gofmt702109256 2016-12-04 12:01:27.522048553 +0100
@@ -1,9 +1,8 @@
-
// generated by gocc; DO NOT EDIT.
package parser
-import(
+import (
"bytes"
"fmt"
@@ -20,16 +19,16 @@
// Stack
type stack struct {
- state []int
- attrib []Attrib
+ state []int
+ attrib []Attrib
}
const iNITIAL_STACK_SIZE = 100
func newStack() *stack {
- return &stack{ state: make([]int, 0, iNITIAL_STACK_SIZE),
- attrib: make([]Attrib, 0, iNITIAL_STACK_SIZE),
- }
+ return &stack{state: make([]int, 0, iNITIAL_STACK_SIZE),
+ attrib: make([]Attrib, 0, iNITIAL_STACK_SIZE),
+ }
}
func (this *stack) reset() {
@@ -42,8 +41,8 @@
this.attrib = append(this.attrib, a)
}
-func(this *stack) top() int {
- return this.state[len(this.state) - 1]
+func (this *stack) top() int {
+ return this.state[len(this.state)-1]
}
...
How to create a function with VAArg, like int printf(const char *format, ...);?
For example:
mod := ir.NewModule()
f := mod.NewFunc(
	"printf",
	types.I32,
	ir.NewParam("format", types.NewPointer(types.I8)),
)
fmt.Printf("%s\n", f.Def())
After searching, I think llir only has the va_arg instruction, but no way to create a function with variadic arguments.
This meta issue is meant to track the implementation of test cases. Ideally these test cases will be implemented after the API skeleton has been drafted but prior to the implementation of any core logic.
Just came across 1f63577 which changes the type of array/vector lengths from int64 to uint64. This broke some code I have that multiplies by the bit size of the element type, because now they have different types.
I don't mind too much which type is used, but it seems to me that whatever logic is used to choose signed vs unsigned would apply equally well to both, and the consistency breakage is a downside.
Lines 208 to 214 in acfb969
Lines 601 to 609 in acfb969
Recently, we've been porting the suite of test cases from the official LLVM project. Many have helped uncover corner cases in the grammar, and the AST to IR translation code.
One of the corner cases seems quite strange though, as it appears valid to use attribute IDs (e.g. #42) in LLVM IR modules not containing any associated attribute group definition (e.g. #42 = {...}).
For instance, test/DebugInfo/X86/parameters.ll uses #0, #1 and #2, but only contains definitions for #0 and #1. The definition for #2 is missing.
define void @_ZN7pr147634funcENS_3fooE(%"struct.pr14763::foo"* noalias sret %agg.result, %"struct.pr14763::foo"* %f) #0
declare void @llvm.dbg.declare(metadata, metadata, metadata) #1
declare void @_ZN7pr147633fooC1ERKS0_(%"struct.pr14763::foo"*, %"struct.pr14763::foo"*) #2
attributes #0 = { uwtable }
attributes #1 = { nounwind readnone }
Any ideas why this may be? Also, how shall we handle these issues? It seems to be an error that should be reported, but since Clang and opt silently ignore it, perhaps we must too.
@pwaller what are your thoughts?
The generated lexer tokenizes header: as a token distinct from LabelIdent, since header: is used as the field name of the specialized metadata node GenericDINodeField. For this reason, any input file containing a basic block named header will report a syntax error.
Example from llvm/test/Analysis/ScalarEvolution/2008-02-15-UMax.ll:
define i32 @foo(i32 %n) {
entry:
br label %header
header:
%i = phi i32 [ 100, %entry ], [ %i.inc, %next ]
%cond = icmp ult i32 %i, %n
br i1 %cond, label %next, label %return
next:
%i.inc = add i32 %i, 1
br label %header
return:
ret i32 %i
}
Just a reminder to remove the asm: parsing into AST took: 24.431252ms debug output before the v0.3.0 release.
Input file issue_27.ll:
; minimal test case adapted from the @main function of base32.ll, as part of
; coreutils in https://github.com/decomp/testdata
define i32 @main(i32, i8**) {
entry:
br label %loop_init
loop_init:
br label %loop_post
loop_cond:
%cond = icmp ult i32 %i.0, 42
br i1 %cond, label %loop_post, label %loop_exit
loop_post:
%i.1 = phi i32 [ %i.0, %loop_cond ], [ 0, %loop_init ]
%i.0 = add i32 %i.1, 1
br label %loop_cond
loop_exit:
ret i32 %i.0
}
$ lparse issue_27.ll
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x4d928a]
goroutine 1 [running]:
github.com/llir/llvm/ir.(*InstAdd).Type(0xc420707600, 0x65e400, 0xc421d17640)
/home/u/Desktop/go/src/github.com/llir/llvm/ir/inst_binary.go:48 +0x2a
github.com/llir/llvm/asm/internal/irx.(*Module).basicBlock(0xc421d87e08, 0xc420706b80, 0xc420707500)
/home/u/Desktop/go/src/github.com/llir/llvm/asm/internal/irx/translate.go:1114 +0xe69
github.com/llir/llvm/asm/internal/irx.(*Module).funcDecl(0xc421d87e08, 0xc420080320)
/home/u/Desktop/go/src/github.com/llir/llvm/asm/internal/irx/translate.go:622 +0x2fd9
github.com/llir/llvm/asm/internal/irx.Translate(0xc421d4a000, 0x179, 0x379, 0xc421d4a000)
/home/u/Desktop/go/src/github.com/llir/llvm/asm/internal/irx/translate.go:127 +0x1322
github.com/llir/llvm/asm.ParseBytes(0xc421d48000, 0x179, 0x379, 0x179, 0x379, 0x0)
/home/u/Desktop/go/src/github.com/llir/llvm/asm/asm.go:43 +0x60
github.com/llir/llvm/asm.ParseFile(0x7ffcdd84892c, 0xa, 0xc42005c060, 0xc42000a090, 0x1)
/home/u/Desktop/go/src/github.com/llir/llvm/asm/asm.go:22 +0x9f
main.parse(0x7ffcdd84892c, 0xa, 0xc4208ca240, 0xc420079f58)
/home/u/Desktop/go/src/github.com/llir/llvm/cmd/lparse/lparse.go:22 +0x39
main.main()
/home/u/Desktop/go/src/github.com/llir/llvm/cmd/lparse/lparse.go:15 +0x79
hi there,
trying to compile "llvm/asm"
I get:
$> go get -u -v github.com/llir/llvm/asm
github.com/llir/llvm (download)
github.com/pkg/errors (download)
package github.com/llir/llvm/asm/internal/lexer: cannot find package "github.com/llir/llvm/asm/internal/lexer" in any of:
/usr/lib/go/src/github.com/llir/llvm/asm/internal/lexer (from $GOROOT)
/home/binet/work/igo/src/github.com/llir/llvm/asm/internal/lexer (from $GOPATH)
it seems to me there are a few files missing in, at least, llvm/asm/internal/{lexer,parser}.
could this be fixed?
thx!
Hello all,
I would first like to say thanks for an amazing library and I appreciate the hard work and dedication to support LLVM through Go.
However, using the library in a project of my own to generate LLVM instructions for different variables, I ran into a problem where using NewFloat
did not produce the expected results. I will demonstrate using a C program as a comparison.
Using clang -S -emit-llvm main.c
on the following program:
Input C Program:
int main() {
float j = 1.1;
return 0;
}
Produces the following store instruction for the float variable:
store float 0x3FF19999A0000000, float* %2, align 4
This is a 64 bit float with the last 28 bits dropped and converted to hex. (According to: http://lists.llvm.org/pipermail/llvm-dev/2011-April/039811.html)
However, attempting to generate the same instruction using this library:
mainBlock.NewStore(constant.NewFloat(value, types.Float), mainBlock.NewAlloca(types.Float))
where value is the float literal 1.1.
I obtained the following instruction:
store float 1.1, float* %1
Putting this into LLVM to generate assembly using:
llc -march=x86 -o main.expr.assembly main.expr.ll
Generates an error of:
llc: main.expr.ll:6:14: error: floating point constant invalid for type
store float 1.1, float* %1
I can provide more information if needed, but a few questions:
I can get it to work using types.Double, and if that is the solution then so be it for now, but I'd like to investigate whether this is actually the expected output.
Again,
Thanks for the work and dedication
Today I noticed a strange use of getelementptr instructions, whose semantics I have yet to find any official documentation describing. Rather than integer values being used as indices of gep, I found an instruction which uses integer vectors. And the resulting type of the gep instruction is not a pointer type but a vector of pointers type.
From ls.ll of Coreutils:
%37 = getelementptr inbounds %struct.fileinfo, %struct.fileinfo* %20, <2 x i64> %34, !dbg !4706
...
%40 = bitcast i8** %39 to <2 x %struct.fileinfo*>*, !dbg !4708
store <2 x %struct.fileinfo*> %37, <2 x %struct.fileinfo*>* %40, align 8, !dbg !4708, !tbaa !1793
Notice that the first index of gep is <2 x i64> %34, a vector value and not an integer value.
Furthermore, notice that the type of %37 is <2 x %struct.fileinfo*>, a vector of pointers type, and not a pointer type.
@pwaller Have you seen this before, and do you know how the result type of gep is calculated?
I skimmed through https://llvm.org/docs/GetElementPtr.html and found no reference to this behaviour.
Cheers!
Robin
First of all, I love this library. The LLVM bindings were a pain to work with because of their compile times and the fact it takes away the cross compilation that go gives us. But I am running in to a problem: In the current build, is there any way to define a function in one module, and use it in another one without erroring because it isn't defined?
Any help would be appreciated!
This issue tracks code coverage for the different llir/llvm packages. We will seek to add test cases where code coverage is absent.
The list of concepts to add test cases for is presented below. It is based on rev e157748 and was constructed by assessing the output of go test -coverprofile=a.out && go tool cover -html=a.out for the asm package.
Prior to adding test cases for these concepts, the code coverage of asm was ~75%.
$ go test -cover
coverage: 75.6% of statements
Analogous to fmt.GoStringer, we could use LLString (or LLVMString) to have LLVM IR constructs print their own definition.
The main reason to switch is to free up the name Def, which we may want to use for use-def chains as part of the v0.4.0 release, which focuses on data flow analysis.
I've been unable to locate an official formal grammar for LLVM IR. If anyone has information about work in this direction, please point it out to me.
To address this issue, a formal grammar of LLVM IR will be created prior to the implementation of the LLVM IR Assembly Language parser. This work was taking place at mewlang/llvm/asm/grammar (old link superseded by https://github.com/llir/llvm/blob/master/asm/internal/ll.bnf).
Edit: For anyone who happens to stumble upon this issue: the latest version of the grammar is located in the llir/grammar repository; more specifically, see ll.tm for an EBNF grammar of LLVM IR assembly.
This issue is intended to profile the performance of the llir/llvm library, measure it against the official LLVM distribution, and evaluate different methods for improving the performance.
This is a continuation of mewspring/mewmew-l#6
The benchmark suite is at https://github.com/decomp/testdata. Specifically, the LLVM IR assembly of these projects is used in the benchmark:
Below follows a first evaluation of using concurrency to speed up parsing. The evaluation is based on a very naive implementation of concurrency, just to get some initial runtime numbers. It is based on 3011396 of the development branch, and subsets of the following patch have been applied: https://gist.github.com/mewmew/d127b562fdd8f560222b4ded739861a7
For comparison, below are the runtime results of the opt tool from the official LLVM distribution (using opt -verify foo.ll).
real 8.18
user 7.22
sys 0.88
real 1.90
user 1.73
sys 0.13
llir/llvm results:
total time for file "testdata/coreutils/testdata/yes.ll": 55.744113ms
real 11.54
user 14.70
sys 0.16
translateTopLevelEntities
total time for file "testdata/coreutils/testdata/yes.ll": 53.49785ms
real 10.28
user 16.06
sys 0.15
translateGlobals (for global and function definitions)
total time for file "testdata/coreutils/testdata/yes.ll": 55.567134ms
real 9.83
user 17.18
sys 0.17
translateTopLevelEntities and translateGlobals (for global and function definitions)
total time for file "testdata/coreutils/testdata/yes.ll": 58.474581ms
real 9.23
user 18.08
sys 0.16
total time for file "shell.ll": 3.147106433s
real 3.18
user 3.86
sys 0.32
translateTopLevelEntities
total time for file "shell.ll": 2.848574349s
real 2.88
user 4.67
sys 0.32
translateGlobals (for global and function definitions)
total time for file "testdata/sqlite/testdata/shell.ll": 2.86919391s
real 2.90
user 4.90
sys 0.32
translateTopLevelEntities and translateGlobals (for global and function definitions)
total time for file "shell.ll": 2.897873366s
real 2.93
user 4.79
sys 0.33
Given the input:
%x = type { i32 }
%1 = type { i32 }
%0 = type { %1, %2 }
%2 = type { float, double }
opt -S -o output.ll < input.ll
produces the following output:
%0 = type { %1, %2 }
%1 = type { i32 }
%2 = type { float, double }
%x = type { i32 }
As such, the order of occurrence in the input source file is not taken into consideration during output, but rather, type names are sorted alphabetically. We should do the same.
Input:
$b = comdat any
$a = comdat any
@x = global i32 42, comdat($a)
@y = global i32 42, comdat($b)
opt output:
$a = comdat any
$b = comdat any
@x = global i32 42, comdat($a)
@y = global i32 42, comdat($b)
Input:
define void @a() #0 {
ret void
}
define void @b() #0 #2 {
ret void
}
define void @c() #22 {
ret void
}
define void @d() {
ret void
}
define void @e() #2 {
ret void
}
attributes #22 = { "foobar" }
attributes #0 = { nounwind readnone "target-cpu"="hexagonv60" }
opt output:
define void @a() #0 {
ret void
}
define void @b() #0 {
ret void
}
define void @c() #1 {
ret void
}
define void @d() {
ret void
}
define void @e() {
ret void
}
attributes #0 = { nounwind readnone "target-cpu"="hexagonv60" }
attributes #1 = { "foobar" }
Note: besides sorting in numerical order, opt also renamed #22 to #1, the first attribute group ID not yet in use.
Input:
define void @a() !x !2 {
ret void
}
define void @b() !x !21 !a !2 {
ret void
}
define void @c() !x !0 {
ret void
}
define void @d() {
ret void
}
define void @e() !x !2 {
ret void
}
!21 = !{ !"foo" }
!2 = !{ !"baz" }
!0 = !{ !"bar" }
opt output:
define void @a() !x !0 {
ret void
}
define void @b() !x !1 !a !0 {
ret void
}
define void @c() !x !2 {
ret void
}
define void @d() {
ret void
}
define void @e() !x !0 {
ret void
}
!0 = !{!"baz"}
!1 = !{!"foo"}
!2 = !{!"bar"}
Note: besides sorting in numerical order, opt also renamed !21 to !0, the first metadata ID not yet in use.
Input:
@x = external global i32
@0 = global i32 42
@1 = external global i32
@a = external global i32
@b = global i32 42
opt output:
@x = external global i32
@0 = global i32 42
@1 = external global i32
@a = external global i32
@b = global i32 42
Input:
declare void @x()
define void @0() {
ret void
}
declare void @1()
declare void @a()
define void @b() {
ret void
}
opt output:
declare void @x()
define void @0() {
ret void
}
declare void @1()
declare void @a()
define void @b() {
ret void
}
Input:
@foo = global i32 42
@x = alias i32, i32* @foo
@y = ifunc void (), void ()* @bar
@0 = alias i32, i32* @foo
@1 = alias i32, i32* @foo
@2 = ifunc void (), void ()* @bar
@3 = ifunc void (), void ()* @bar
@a = alias i32, i32* @foo
@c = ifunc void (), void ()* @bar
@b = alias i32, i32* @foo
@d = ifunc void (), void ()* @bar
define void @bar() {
ret void
}
opt output:
@foo = global i32 42
@x = alias i32, i32* @foo
@0 = alias i32, i32* @foo
@1 = alias i32, i32* @foo
@a = alias i32, i32* @foo
@b = alias i32, i32* @foo
@y = ifunc void (), void ()* @bar
@2 = ifunc void (), void ()* @bar
@3 = ifunc void (), void ()* @bar
@c = ifunc void (), void ()* @bar
@d = ifunc void (), void ()* @bar
define void @bar() {
ret void
}
Input:
define void @a() !x !2 {
ret void
}
define void @b() !x !21 !a !2 {
ret void
}
define void @c() !x !0 {
ret void
}
define void @d() {
ret void
}
define void @e() !x !2 {
ret void
}
!foo = !{!2}
!bar = !{!0}
!aaa = !{!21}
!21 = !{!"foo"}
!2 = !{!"baz"}
!0 = !{!"bar"}
opt output:
define void @a() !x !0 {
ret void
}
define void @b() !x !2 !a !0 {
ret void
}
define void @c() !x !1 {
ret void
}
define void @d() {
ret void
}
define void @e() !x !0 {
ret void
}
!foo = !{!0}
!bar = !{!1}
!aaa = !{!2}
!0 = !{!"baz"}
!1 = !{!"bar"}
!2 = !{!"foo"}
This issue is intended to track discussions and experimental implementations related to use tracking.
The C++ API of LLVM defines the concepts of a Use and a User. A Use is an edge between a used value and its user. Each User has a number of operands which specify the used values. Pseudo code follows:
type Value interface {
	Uses() []Use
}

type Use interface {
	OpNum() int
	User() User
	Usee() Value
}

type User interface {
	NOps() int
	Op(i int) Value
	SetOp(i int, v Value) error
}
Anyone is invited to join the discussion. How would users of the API wish to use it? Might it be implemented by a dedicated package separate from the ir package? How would the interaction work? Could several implementations of use-tracking co-exist, and would this ever be useful?
This issue summarizes the requirements of the LLVM packages, as specified by its intended use cases.
The requirements of llgo as stated by @axw (in this issue) are as follows:
As for llgo's requirements:
- in terms of using the LLVM API for generating code, it's mostly write-only via the builder API. Bitcode and IR reading is not important (at the moment?), but writing is; one or the other is required, but preferably both.
- llgo uses the DIBuilder API for generating debug metadata (DWARF, et al.). This could be built outside of the core (it's just a matter of creating metadata nodes in a particular format), just be aware that it's pretty finicky and easy to break.
- llgo needs to be able to look up target data (arch word size, alignment, etc.) from triples
For the decompilation pipeline, the llvm packages should be able to:
Thanks for this cool project!
@mewmew I'm wondering if llir/llvm could be used to generate code in a Go program that interacts with the Go (gc) runtime... allocating objects that are gc-ed; starting goroutines that are scheduled; making blocking system calls that the runtime handles with blocking... the kinds of things that would make llir/llvm viable for implementing an interpreter in Go.
What kind of interaction (if any) are you thinking could be supported? Or is that not on the radar at all?
Given the following input:
%t1 = type {}
%t19 = type {}
%t20 = type {}
%t21 = type {}
%t2 = type {}
define void @main() {
alloca %t1
alloca %t2
alloca %t19
alloca %t20
alloca %t21
ret void
}
opt -S -o output.ll < foo.ll
produces the following output, in which type definitions are sorted using natural instead of lexicographic sorting:
%t1 = type {}
%t2 = type {}
%t19 = type {}
%t20 = type {}
%t21 = type {}
define void @main() {
%1 = alloca %t1
%2 = alloca %t2
%3 = alloca %t19
%4 = alloca %t20
%5 = alloca %t21
ret void
}
Edit: for comparison, using sort.Strings, we currently get the following output:
%t1 = type {}
%t19 = type {}
%t2 = type {}
%t20 = type {}
%t21 = type {}
define void @main() {
; <label>:0
%1 = alloca %t1
%2 = alloca %t2
%3 = alloca %t19
%4 = alloca %t20
%5 = alloca %t21
ret void
}
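The natural ordering opt uses can be sketched as follows: compare identifiers character by character, but treat embedded digit runs as numbers. The helper below is illustrative, not taken from llir/llvm.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

func isDigit(c byte) bool { return '0' <= c && c <= '9' }

// chopNum splits a leading digit run off s, returning its numeric
// value and the remainder of the string.
func chopNum(s string) (int, string) {
	i := 0
	for i < len(s) && isDigit(s[i]) {
		i++
	}
	n, _ := strconv.Atoi(s[:i])
	return n, s[i:]
}

// naturalLess reports whether a sorts before b in natural order, so
// that e.g. t2 < t19 (a sketch; ASCII digits only, ignores leading zeros).
func naturalLess(a, b string) bool {
	for len(a) > 0 && len(b) > 0 {
		if isDigit(a[0]) && isDigit(b[0]) {
			an, arest := chopNum(a)
			bn, brest := chopNum(b)
			if an != bn {
				return an < bn
			}
			a, b = arest, brest
			continue
		}
		if a[0] != b[0] {
			return a[0] < b[0]
		}
		a, b = a[1:], b[1:]
	}
	return len(a) < len(b)
}

func main() {
	names := []string{"t1", "t19", "t2", "t20", "t21"}
	sort.Slice(names, func(i, j int) bool { return naturalLess(names[i], names[j]) })
	fmt.Println(names) // [t1 t2 t19 t20 t21]
}
```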
This has come up in discussion and would enable APIs such as Operands() []*value.Value, which would allow not only use tracking, but also value replacement, as proposed by @pwaller (ref: #42).
Depending on how far we wish to take this API change, there are some benefits and drawbacks. The main drawback I can see is if we change instructions (and terminators) to take value.Value instead of *ir.BasicBlock, since then users of the API cannot make use of the basic block directly, but would have to type assert to inspect, for instance, the instructions of the basic block. This is also true for the phi instruction, for which Incoming may be redefined as follows:
// Incoming is an incoming value of a phi instruction.
type Incoming struct {
// Incoming value.
X value.Value
// Predecessor basic block of the incoming value.
- Pred *BasicBlock
+ Pred value.Value // *ir.BasicBlock
}
Another instruction that would change is catchpad, which would take a value.Value instead of the concrete type *TermCatchSwitch:
// InstCatchPad is an LLVM IR catchpad instruction.
type InstCatchPad struct {
// Name of local variable associated with the result.
LocalIdent
// Exception scope.
- Scope *TermCatchSwitch
+ Scope value.Value // *ir.TermCatchSwitch
// Exception arguments.
Args []value.Value
// extra.
// (optional) Metadata.
Metadata []*metadata.MetadataAttachment
}
Besides the phi and catchpad instructions, quite a few terminators would be updated to make use of value.Value instead of *ir.BasicBlock.
// TermCondBr is a conditional LLVM IR br terminator.
type TermCondBr struct {
// Branching condition.
Cond value.Value
// True condition target branch.
- TargetTrue *BasicBlock
+ TargetTrue value.Value // *ir.BasicBlock
// False condition target branch.
- TargetFalse *BasicBlock
+ TargetFalse value.Value // *ir.BasicBlock
// extra.
// Successor basic blocks of the terminator.
Successors []*BasicBlock
// (optional) Metadata.
Metadata []*metadata.MetadataAttachment
}
The catchret terminator would take a value.Value instead of *ir.InstCatchPad.
It is also possible we'd have to update ir.Case to take a value.Value instead of a constant.Constant, if we want to use this approach not only for read-only access to operands but also for refining/updating/replacing values.
// Case is a switch case.
type Case struct {
// Case comparand.
- X constant.Constant // integer constant or integer constant expression
+ X value.Value // integer constant or integer constant expression
// Case target branch.
Target *BasicBlock
}
I'm currently on the fence about whether this change is good or not. The data types become less exact with regard to what values they may contain, and specifically for basic blocks, users of the API would have to type assert to access the fields specific to basic blocks (such as their instructions). On the other hand, it would enable a general and quite powerful API for operand tracking and replacement.
I'll label this for the v0.4.0 release for now, as it mostly targets data analysis and the use-def chains API.
Any input is warmly welcome.
Cheers,
/u
"The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."
— Tom Cargill, Bell Labs
To encourage the development of the final 10%, a set of challenges has been produced. This is a meta-issue to track the hello challenge of the parse-me repository.
Once these challenges have been beaten, lexing, parsing, and potentially type checking of LLVM IR assembly will have been implemented. At this point, the project is ready for an API overhaul and will welcome an open discussion with other members of the community interested in finding a clean, minimal API for interacting with LLVM IR.
Note: there will exist several, almost identical, challenge issues. The main reason for this is that the developer finds childish joy in closing issues once a challenge has been beaten :)
In LLVM 7.0 the concept of ThinLTO module summaries was introduced. We currently have no IR representation of module summaries, and the grammar for module summaries has not yet been written.
A test case containing a module summary is present in llvm/test/Assembler/thinlto-summary.ll:
; ModuleID = 'thinlto-summary.thinlto.bc'
^0 = module: (path: "thinlto-summary1.o", hash: (1369602428, 2747878711, 259090915, 2507395659, 1141468049))
^1 = module: (path: "thinlto-summary2.o", hash: (2998369023, 4283347029, 1195487472, 2757298015, 1852134156))
; Check a function that makes several calls with various profile hotness, and a
; reference (also tests forward references to function and variables in calls
; and refs).
^2 = gv: (guid: 1, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 10, calls: ((callee: ^15, hotness: hot), (callee: ^17, hotness: cold), (callee: ^16, hotness: none)), refs: (^13))))
; Function with a call that has relative block frequency instead of profile
; hotness.
^3 = gv: (guid: 2, summaries: (function: (module: ^1, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 10, calls: ((callee: ^15, relbf: 256)))))
; Summaries with different linkage types.
^4 = gv: (guid: 3, summaries: (function: (module: ^0, flags: (linkage: internal, notEligibleToImport: 0, live: 0, dsoLocal: 1), insts: 1)))
; Make this one an alias with a forward reference to aliasee.
^5 = gv: (guid: 4, summaries: (alias: (module: ^0, flags: (linkage: private, notEligibleToImport: 0, live: 0, dsoLocal: 1), aliasee: ^14)))
^6 = gv: (guid: 5, summaries: (function: (module: ^0, flags: (linkage: available_externally, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
^7 = gv: (guid: 6, summaries: (function: (module: ^0, flags: (linkage: linkonce, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
^8 = gv: (guid: 7, summaries: (function: (module: ^0, flags: (linkage: linkonce_odr, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
^9 = gv: (guid: 8, summaries: (function: (module: ^0, flags: (linkage: weak_odr, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
^10 = gv: (guid: 9, summaries: (function: (module: ^0, flags: (linkage: weak, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1)))
^11 = gv: (guid: 10, summaries: (variable: (module: ^0, flags: (linkage: common, notEligibleToImport: 0, live: 0, dsoLocal: 0))))
; Test appending globel variable with reference (tests backward reference on
; refs).
^12 = gv: (guid: 11, summaries: (variable: (module: ^0, flags: (linkage: appending, notEligibleToImport: 0, live: 0, dsoLocal: 0), refs: (^4))))
; Test a referenced global variable.
^13 = gv: (guid: 12, summaries: (variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0))))
; Test a dsoLocal variable.
^14 = gv: (guid: 13, summaries: (variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 1))))
; Functions with various flag combinations (notEligibleToImport, Live,
; combinations of optional function flags).
^15 = gv: (guid: 14, summaries: (function: (module: ^1, flags: (linkage: external, notEligibleToImport: 1, live: 1, dsoLocal: 0), insts: 1)))
^16 = gv: (guid: 15, summaries: (function: (module: ^1, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1, funcFlags: (readNone: 1, noRecurse: 1))))
; This one also tests backwards reference in calls.
^17 = gv: (guid: 16, summaries: (function: (module: ^1, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 1, funcFlags: (readOnly: 1, returnDoesNotAlias: 1), calls: ((callee: ^15)))))
; Alias summary with backwards reference to aliasee.
^18 = gv: (guid: 17, summaries: (alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 1), aliasee: ^14)))
; Test all types of TypeIdInfo on function summaries.
^19 = gv: (guid: 18, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 4, typeIdInfo: (typeTests: (^24, ^26)))))
^20 = gv: (guid: 19, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 8, typeIdInfo: (typeTestAssumeVCalls: (vFuncId: (^27, offset: 16))))))
^21 = gv: (guid: 20, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 5, typeIdInfo: (typeCheckedLoadVCalls: (vFuncId: (^25, offset: 16))))))
^22 = gv: (guid: 21, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 15, typeIdInfo: (typeTestAssumeConstVCalls: (vFuncId: (^27, offset: 16), args: (42), vFuncId: (^27, offset: 24), args: (43))))))
^23 = gv: (guid: 22, summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 5, typeIdInfo: (typeCheckedLoadConstVCalls: (vFuncId: (^28, offset: 16), args: (42))))))
; Test TypeId summaries:
; Test the AllOnes resolution, and all kinds of WholeProgramDevirtResolution
; types, including all optional resolution by argument kinds.
^24 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7), wpdResolutions: ((offset: 0, wpdRes: (kind: branchFunnel)), (offset: 8, wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")), (offset: 16, wpdRes: (kind: indir, resByArg: (args: (1, 2), byArg: (kind: indir, byte: 2, bit: 3), args: (3), byArg: (kind: uniformRetVal, info: 1), args: (4), byArg: (kind: uniqueRetVal, info: 1), args: (5), byArg: (kind: virtualConstProp)))))))
; Test TypeId with other optional fields (alignLog2/sizeM1/bitMask/inlineBits)
^25 = typeid: (name: "_ZTS1B", summary: (typeTestRes: (kind: inline, sizeM1BitWidth: 0, alignLog2: 1, sizeM1: 2, bitMask: 3, inlineBits: 4)))
; Test the other kinds of type test resoultions
^26 = typeid: (name: "_ZTS1C", summary: (typeTestRes: (kind: single, sizeM1BitWidth: 0)))
^27 = typeid: (name: "_ZTS1D", summary: (typeTestRes: (kind: byteArray, sizeM1BitWidth: 0)))
^28 = typeid: (name: "_ZTS1E", summary: (typeTestRes: (kind: unsat, sizeM1BitWidth: 0)))
Update summary, 23/11/2018: This repository currently requires ~10MiB of download, which isn't ideal considering the source is only a few hundred kilobytes. @mewmew and I propose to shrink it to ~800KiB, to give a faster `go install` experience for anyone using the repository.
The reason for the blow-up is that there were some large test cases (including sqlite) measuring in the tens of MiB, and various other bits relating to parsing were also quite large. Those have now been moved into other repositories in the llir organization, so they no longer need to be downloaded if you just want to import llir.
Original issue text.
I just saw @mewmew's comment in ec48d54 but thought it would be easier to have a separate issue for discussion - the commit itself is very long so if I commented on the commit the discussion would be way down at the bottom!
First, can I clarify the question - are you asking how to remove lots of old large assets from the history of the repository?
If that is the question, the answer is: yes, you can do it, but anyone who has cloned the repository needs to know about it, otherwise they might get in a mess, since it requires rewriting history. At least, that's the best I know. See GitHub's guidance on the issue.
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
	printf("%d\n", strncmp("a", "a", 1));
	return 0;
}
clang translates this to:
%6 = call i32 @strncmp(i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str.1, i32 0, i32 0), i8* getelementptr inbounds ([2 x i8], [2 x i8]* @.str.1, i32 0, i32 0), i64 1) #3
%7 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32 %6)
Note that strncmp requires i8*, but the constant globals are i8 arrays, so it uses a constant getelementptr expression on the global to get the i8* from the static data.
As far as I can tell, I can't achieve the same effect at the moment, because *ir.Global does not implement IsConstant and so can't be fed into constant.NewGetElementPtr. Is that correct?
The grammar contains an ambiguity when parsing global variable alignment attributes. More specifically, an alignment attribute of a global variable may be interpreted either as a GlobalAttr or as a FuncAttr, and since the lists of both global attributes and function attributes may optionally be empty, this leads to a shift/reduce ambiguity in the parser.
From the ll.tm EBNF grammar:
GlobalDecl -> GlobalDecl
: Name=GlobalIdent '=' ExternLinkage Preemptionopt Visibilityopt DLLStorageClassopt ThreadLocalopt UnnamedAddropt AddrSpaceopt ExternallyInitializedopt Immutable ContentType=Type (',' Section)? (',' Comdat)? (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
;
FuncAttribute -> FuncAttribute
: AttrString
| AttrPair
# not used in attribute groups.
| AttrGroupID
# used in functions.
#| Align # NOTE: removed to resolve reduce/reduce conflict, see above.
# used in attribute groups.
| AlignPair
| AlignStack
| AlignStackPair
| AllocSize
| FuncAttr
;
Specifically, the end of the production is of interest: (',' Align)? Metadata=(',' MetadataAttachment)+? FuncAttrs=(',' FuncAttribute)+?
Given that there are no metadata attachments, the alignment attribute (align 8) of the following LLVM IR:
@a = global i32 42, align 8
may either be reduced as a global attribute (i.e. Align before MetadataAttachment) or as a function attribute (i.e. FuncAttribute after MetadataAttachment).
The solution employed by the C++ parser is the opposite of maximum munch: it will reduce rather than shift when possible.
Corresponds to requirement 4, decomp/decomp#94.
"The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time."
— Tom Cargill, Bell Labs
To encourage the development of the final 10%, a set of challenges has been produced. This is a meta-issue to track the rand challenge of the parse-me repository.
Once these challenges have been beaten, lexing, parsing, and potentially type checking of LLVM IR assembly will have been implemented. At this point, the project is ready for an API overhaul and will welcome an open discussion with other members of the community interested in finding a clean, minimal API for interacting with LLVM IR.
Note: there will exist several, almost identical, challenge issues. The main reason for this is that the developer finds childish joy in closing issues once a challenge has been beaten :)
Basically what I need is something like this:
https://godoc.org/llvm.org/llvm/bindings/go/llvm#Builder.CreateMalloc
In my case, I'm trying to create a currying function call; it would need to store the parameters in a structure.
Should we rethink sum types to allow for user-defined types? For instance, ir.Instruction currently requires the unexported isInstruction method, but there are valid use cases where users may wish to define their own instructions to put in basic blocks. One such use case seen in the wild is a comment pseudo-instruction which prints itself as ; data..., e.g.
// Comment is a pseudo-instruction that may be used for adding LLVM IR comments
// to basic blocks.
type Comment struct {
	// Line-comment contents.
	Data string
}

// IsInstruction implements the ir.Instruction interface for Comment.
func (inst *Comment) IsInstruction() {}

// LLString returns the LLVM syntax representation of the comment.
func (inst *Comment) LLString() string {
	return fmt.Sprintf("; %s", inst.Data)
}
Based on the time constraints of this project a full implementation of LLVM IR will not be developed during its time frame. The ambition is to develop a full implementation once the project is finished. With this in mind, the focus is now to implement a minimal subset required for decompilation. Any code not directly related to this subset will be removed from the repository for now, and will be added back once the project is completed. This issue will make sure to track these code changes so they can be reverted easily.
Per https://llvm.org/docs/LangRef.html#linkage-types, global variables and functions can have linkage types. For generating IR (as a compiler frontend), this is a required feature.
Unless I've been particularly blind, setting the linkage type isn't currently possible.
The callee operand of a call instruction has two ways to represent its type: as the return type of the callee, or as the complete function signature of the callee. Currently, the latter format causes a nil-pointer dereference when the callee has a void return type.
Successful parse:
declare void @g()
define void @f() {
call void @g()
ret void
}
Crash with nil-pointer dereference when parsing:
declare void @g()
define void @f() {
call void () @g()
ret void
}
To be able to implement the instructions taking floats, floats first need to be implemented. This issue tracks all work related to implementing the floating-point types and instructions.
Floating point types:
Instructions:
This notice is intended to give a heads-up to those using the llir/llvm library. The next release will include complete support for the entire LLVM IR language. The work is currently in flux; to experiment with different API designs, simplify the parser logic, and reduce code duplication in the project, a new repo has been created during this experimental phase.
At the current stage, the grammar is capable of parsing the entirety of the LLVM IR language, including specialized metadata nodes (#26).
While working on this we will also try to take into consideration previous issues that have been identified with the parser (such as the handling of quoted strings #24).
The llir/llvm/ir package will be extended to support the entire LLVM IR language, thus resolving #23, as linkage information will be present in the in-memory intermediate representation.
With the upcoming release, read support for all of the LLVM IR language concepts will have been implemented; thus resolving #15.
Similarly; we will now have a grammar covering the entire LLVM IR language; thus resolving #2.
With the addition of support for specialized metadata nodes, the second requirement of llgo will also be fully supported (#3): llgo uses the DIBuilder API for generating debug metadata (DWARF, et al.). This could be built outside of the core (it's just a matter of creating metadata nodes in a particular format); just be aware that it's pretty finicky and easy to break.
For IR construction, a similar approach will be used as before. Personally, we feel this approach has worked out well and has been quite pleasant to use. If anyone has input from their own experience using the API of the llir/llvm/ir package to construct LLVM IR, please let us know, as that could help shape the upcoming release. As for llgo, the first requirement in terms of using the LLVM API for generating code ("it's mostly write-only via the builder API. Bitcode and IR reading is not important (at the moment?), but writing is; one or the other is required, but preferably both.") is satisfied by this API, and has been for a while. Moreover, the llir/llvm/ir package will now support the entire LLVM IR language, and not just a subset; thus the requirement should be satisfied in full.
Module top-level information such as target triple and data layout has been, and will continue to be, recorded and maintained by the IR API, thus supporting the third requirement of llgo: llgo needs to be able to look up target data (arch word size, alignment, etc.) from triples.
Generating C-shared library bindings compatible with the official C library of the LLVM project is an ambitious goal that is left for a future release (#12). Anyone specifically interested in this topic, feel free to get in touch with us or continue the discussion in the dedicated issue.
Similarly, interaction with the Go runtime is targeted for a future release, and those with knowledge in this domain are happily invited to the discussion on what is needed and how to bring this about (#18).
As for use-tracking and data analysis support (#19), more thought will be required to get a clean API. This is therefore targeted for a future release.
So, to summarize, the upcoming release of the llir/llvm project will include read and write support for the entire LLVM IR language. In other words, it will be possible to parse arbitrary LLVM IR assembly files into an in-memory representation, namely the one defined in package llir/llvm/ir. The in-memory IR representation will support the entire LLVM IR language and can be converted back to LLVM IR assembly for interaction with other tools, such as the LLVM optimizer.
Any feedback is welcome, so we know we're heading in the right direction.
Cheerful regards,
/u & i
Could you provide an example of generating LLVM IR with string usage and concatenation using the API? Your main example in the README is great for integer values and variables, but I am struggling to convert it to use strings, or pointer values in general.
I know what I would like the IR to look like, but I am unclear on how to generate it using the API.
example:
From C:
#include <stdio.h>
#include <string.h>

int main() {
	char src[50], dest[50];
	strcpy(src, "This is source");
	strcpy(dest, "This is destination");
	strcat(dest, src);
	printf("Final destination string : |%s|", dest);
	return 0;
}
To desired llvm ir:
@.str = private unnamed_addr constant [15 x i8] c"This is source\00", align 1
@.str.1 = private unnamed_addr constant [20 x i8] c"This is destination\00", align 1
@.str.2 = private unnamed_addr constant [32 x i8] c"Final destination string : |%s|\00", align 1
; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca [50 x i8], align 16
%3 = alloca [50 x i8], align 16
store i32 0, i32* %1, align 4
%4 = getelementptr inbounds [50 x i8], [50 x i8]* %2, i32 0, i32 0
%5 = call i8* @strcpy(i8* %4, i8* getelementptr inbounds ([15 x i8], [15 x i8]* @.str, i32 0, i32 0)) #3
%6 = getelementptr inbounds [50 x i8], [50 x i8]* %3, i32 0, i32 0
%7 = call i8* @strcpy(i8* %6, i8* getelementptr inbounds ([20 x i8], [20 x i8]* @.str.1, i32 0, i32 0)) #3
%8 = getelementptr inbounds [50 x i8], [50 x i8]* %3, i32 0, i32 0
%9 = getelementptr inbounds [50 x i8], [50 x i8]* %2, i32 0, i32 0
%10 = call i8* @strcat(i8* %8, i8* %9) #3
%11 = getelementptr inbounds [50 x i8], [50 x i8]* %3, i32 0, i32 0
%12 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([32 x i8], [32 x i8]* @.str.2, i32 0, i32 0), i8* %11)
ret i32 0
}
; Function Attrs: nounwind
declare i8* @strcpy(i8*, i8*) #1
; Function Attrs: nounwind
declare i8* @strcat(i8*, i8*) #1
declare i32 @printf(i8*, ...) #2