mmcloughlin / avo

Generate x86 Assembly with Go

License: BSD 3-Clause "New" or "Revised" License

Go 58.38% Shell 0.97% Assembly 40.45% C 0.20%
go golang assembly x86-64 code-generation

avo's People

Contributors

cadobot[bot], cristaloleg, josharian, kalamay, klauspost, lukechampine, mmcloughlin, vsivsi, zchee

avo's Issues

testing: extend tests for register allocator

We are reliant primarily on the examples for testing at this point. It would be good to "stress" the allocator because I would be (pleasantly) surprised if it's bug-free.

  • More extensive unit testing
  • Integration test under tests/ that (for example) uses the max number of registers of a given kind and confirms the register allocator doesn't fall over
  • More specifically, handling of "casting" is a little ugly. It would be good to have a test that stresses this specifically. Perhaps, something that allocates 16 virtual general purpose registers and then accesses all sub-registers. This is clearly a possible allocation, but I could imagine the allocator messing up.
  • I'm sure other tests will come to mind

Unit tests right now are trivial:

avo/pass/alloc_test.go

Lines 9 to 46 in 7752262

func TestAllocatorSimple(t *testing.T) {
	c := reg.NewCollection()
	x, y := c.XMM(), c.YMM()
	a, err := NewAllocatorForKind(reg.KindVector)
	if err != nil {
		t.Fatal(err)
	}
	a.Add(x)
	a.Add(y)
	a.AddInterference(x, y)
	alloc, err := a.Allocate()
	if err != nil {
		t.Fatal(err)
	}
	t.Log(alloc)
	if alloc[x] != reg.X0 || alloc[y] != reg.Y1 {
		t.Fatalf("unexpected allocation")
	}
}

func TestAllocatorImpossible(t *testing.T) {
	a, err := NewAllocatorForKind(reg.KindVector)
	if err != nil {
		t.Fatal(err)
	}
	a.AddInterference(reg.X7, reg.Z7)
	_, err = a.Allocate()
	if err == nil {
		t.Fatal("expected allocation error")
	}
}

ir,build: TEXT and GLOBL symbol names

TEXT and GLOBL symbols are currently too restrictive. I think we do not correctly support:

  • Static text symbols
  • Symbols with package references

See:

p.Printf("GLOBL %s(SB), %s, $%d\n", g.Symbol, g.Attributes.Asm(), g.Size)

p.Printf("TEXT %s%s(SB)", dot, f.Name)

Related to #37

Summary

Symbols have:

  • Name
  • Package qualification: which package does it belong to? A bare . is shorthand for the current package
  • Static flag: is it visible outside this file?
  • For arcane reasons, . and / are replaced with Unicode look-alikes in asm source code

Name is required. The others are optional and we see all 4 possibilities in the standard library:

no pkg, no static: p256SubInternal
pkg, no static: sync∕atomic·LoadInt32
no pkg, static: _expand_key_192b<>
pkg, static: runtime∕internal∕atomic·kernelcas<>
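
The four combinations above can be sketched as a small rendering function. This is only an illustration under stated assumptions: the Symbol type and its Asm method are hypothetical (not avo's API), and the sketch rewrites only the / in the package path, leaving open how dots inside a package path should be handled.

```go
package main

import (
	"fmt"
	"strings"
)

// Symbol is a hypothetical description of a TEXT or GLOBL symbol.
// It is not a type from avo.
type Symbol struct {
	Pkg    string // package path; "" means unqualified, "." the current package
	Name   string // required
	Static bool   // true if the symbol is file-local (rendered with "<>")
}

// Asm renders the symbol the way it is spelled in Go assembly source, using
// the middle dot (U+00B7) and division slash (U+2215) replacements.
func (s Symbol) Asm() string {
	var b strings.Builder
	switch s.Pkg {
	case "":
		// No package qualification: the name is emitted as-is.
	case ".":
		b.WriteString("·")
	default:
		b.WriteString(strings.ReplaceAll(s.Pkg, "/", "∕"))
		b.WriteString("·")
	}
	b.WriteString(s.Name)
	if s.Static {
		b.WriteString("<>")
	}
	return b.String()
}

func main() {
	symbols := []Symbol{
		{Name: "p256SubInternal"},
		{Pkg: "sync/atomic", Name: "LoadInt32"},
		{Name: "_expand_key_192b", Static: true},
		{Pkg: "runtime/internal/atomic", Name: "kernelcas", Static: true},
	}
	for _, s := range symbols {
		fmt.Println(s.Asm())
	}
}
```

Running it reproduces the four spellings listed above, which suggests that a name, an optional package and a static flag are enough to cover the cases seen in the standard library.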

Examples

The following TEXT symbol names do occur in the standard library:

_expand_key_192b<> (static, no dot)
cmpbody<> (static, no dot)
runtime·sigprofNonGoWrapper<> (static, package qualified)
runtime∕internal∕atomic·kernelcas<> (static, package qualified)
sync∕atomic·LoadInt32 (package qualified)
p256SubInternal (no dot)

Static TEXT Symbols

https://github.com/golang/go/blob/fd752d5ede482cdf52a920c75486677cbcb441b0/src/crypto/aes/asm_amd64.s#L243
https://github.com/golang/go/blob/master/src/internal/bytealg/compare_amd64.s#L30

Almost always these do not include the dot symbol, but there are some weird exceptions.

https://github.com/golang/go/blob/14560da7e469aff46a6f1270ce84204bbd6ffdb3/src/runtime/sys_linux_ppc64x.s#L420
https://github.com/golang/go/blob/14560da7e469aff46a6f1270ce84204bbd6ffdb3/src/runtime/internal/atomic/sys_linux_arm.s#L34

Package-Qualified TEXT Symbols

https://github.com/golang/go/blob/14560da7e469aff46a6f1270ce84204bbd6ffdb3/src/runtime/race_amd64.s#L204

Note that the slash in these names is the Unicode division slash (U+2215), not an ASCII /.

Missing a dot

https://github.com/golang/go/blob/14560da7e469aff46a6f1270ce84204bbd6ffdb3/src/crypto/elliptic/p256_asm_amd64.s#L1313

Processing Code

Assembler lexer:

https://github.com/golang/go/blob/fd752d5ede482cdf52a920c75486677cbcb441b0/src/cmd/asm/internal/lex/lex.go#L104-L114
https://github.com/golang/go/blob/fd752d5ede482cdf52a920c75486677cbcb441b0/src/cmd/asm/internal/lex/tokenizer.go#L52-L67

  • The unicode characters are mapped directly to standard . and /
  • Leading unicode dot is mapped to "".

Assembler:

https://github.com/golang/go/blob/fd752d5ede482cdf52a920c75486677cbcb441b0/src/cmd/asm/internal/asm/asm.go#L79-L90
https://github.com/golang/go/blob/fd752d5ede482cdf52a920c75486677cbcb441b0/src/cmd/internal/obj/plist.go#L80-L82

  • Must be NAME_EXTERN or NAME_STATIC

asmdecl pass

https://github.com/golang/tools/blob/3ef68632349c4eab68426d81d981d131b625cafc/go/analysis/passes/asmdecl/asmdecl.go#L277-L279

  • Function names not containing <> must have Go declarations

ast: move from root to ir package

I don't think the ast.go types make much sense in the root avo directory. Consider moving them to a sub-package github.com/mmcloughlin/avo/ir.

Preference for ir (intermediate representation) since these types contain much more than a representation of the syntax, for example the results of liveness analysis or register allocation. Moreover I have a half-baked plan to write a parser for a superset of Go assembly into avo formats, so in that world ast may make more sense as a target for the parser (which is then transformed to ir).

doc: comment all public symbols

  • (root)
  • build
  • buildtags
  • gotypes
  • operand
  • pass
  • printer
  • reg
  • src
  • x86
  • internal/cmd/avogen
  • internal/gen
  • internal/inst
  • internal/load
  • internal/opcodescsv
  • internal/opcodesxml
  • internal/prnt
  • internal/stack
  • internal/test
  • package-level doc comments

Once done, enable public doc linting. Unfortunately this will require a workaround for golangci/golangci-lint#21.

off-topic: a disassembler using the same notation

In a previous life, I wrote a Z80 disassembler that generated almost-correct C for the "Small C" compiler. The combination could round-trip with a little human help, and I found it far clearer than disassembling to raw assembler notation.
This language might be an elegant disassembler target.

instructions: "ANDQ imm64 r64" and "MOVD r32 xmm" unsupported

The Go assembler supports both of the following instructions, but avo does not.

  • ANDQ imm64, r64
  • MOVD r32, xmm

I see that x86/zctors.go is autogenerated, but I'm a bit lost on how to update the generator to resolve this. To make sure that valid output would be generated I modified it by hand and the generated code was properly assembled:

diff --git a/x86/zctors.go b/x86/zctors.go
index 6d03480..f52ba55 100644
--- a/x86/zctors.go
+++ b/x86/zctors.go
@@ -1280,6 +1280,7 @@ func ANDPS(mx, x operand.Op) (*intrep.Instruction, error) {
 // 	ANDQ imm32 rax
 // 	ANDQ imm8  r64
 // 	ANDQ imm32 r64
+// 	ANDQ imm64 r64
 // 	ANDQ r64   r64
 // 	ANDQ m64   r64
 // 	ANDQ imm8  m64
@@ -1308,6 +1309,13 @@ func ANDQ(imr, mr operand.Op) (*intrep.Instruction, error) {
 			Inputs:   []operand.Op{mr},
 			Outputs:  []operand.Op{mr},
 		}, nil
+	case operand.IsIMM64(imr) && operand.IsR64(mr):
+		return &intrep.Instruction{
+			Opcode:   "ANDQ",
+			Operands: []operand.Op{imr, mr},
+			Inputs:   []operand.Op{mr},
+			Outputs:  []operand.Op{mr},
+		}, nil
 	case operand.IsR64(imr) && operand.IsR64(mr):
 		return &intrep.Instruction{
 			Opcode:   "ANDQ",
@@ -8450,6 +8458,7 @@ func MOVBWZX(mr, r operand.Op) (*intrep.Instruction, error) {
 // 	MOVD imm32 m64
 // 	MOVD r64   m64
 // 	MOVD xmm   r64
+// 	MOVD r32   xmm
 // 	MOVD r64   xmm
 // 	MOVD xmm   xmm
 // 	MOVD m64   xmm
@@ -8505,6 +8514,13 @@ func MOVD(imrx, mrx operand.Op) (*intrep.Instruction, error) {
 			Inputs:   []operand.Op{imrx},
 			Outputs:  []operand.Op{mrx},
 		}, nil
+	case operand.IsR32(imrx) && operand.IsXMM(mrx):
+		return &intrep.Instruction{
+			Opcode:   "MOVD",
+			Operands: []operand.Op{imrx, mrx},
+			Inputs:   []operand.Op{imrx},
+			Outputs:  []operand.Op{mrx},
+		}, nil
 	case operand.IsR64(imrx) && operand.IsXMM(mrx):
 		return &intrep.Instruction{
 			Opcode:   "MOVD",

printer: ensure compatibility with asmfmt

klauspost/asmfmt is the de facto standard for Go assembly formatting. It would be good to produce output that conforms to asmfmt.

I have a preference for avoiding dependencies outside the Go project itself (stdlib and sub-repos only). Therefore:

  • If it is possible to produce conforming output without actually depending on asmfmt explicitly, that would be preferred. This may actually be possible since (at the time of writing) many of the asmfmt rules simply wouldn't apply to avo output.
  • If the rules enforced by asmfmt are too complicated, then we can accept the additional dependency.

Either way, it would be good to have a check in CI to confirm that all generated files are formatted correctly. Something like find . -name '*.s' | xargs asmfmt -w and check the git repo is clean.

ports: port peachpy go projects to avo

Consider porting existing PeachPy Go projects to avo. At a minimum this would be really valuable feedback and system-level testing for avo. These could be committed to the avo examples directory or potentially committed back to the original repos if their maintainers are interested.

Repository | Description | Stars
Yawning/chacha20 | ChaCha20 cryptographic cipher. | 33
Yawning/aez | AEZ authenticated-encryption scheme. | 6
robskie/bp128 | SIMD-BP128 integer encoding and decoding. | 22
dgryski/go-marvin32 | Microsoft's Marvin32 hash function. | 7
dgryski/go-highway | Google's Highway hash function. | 55
dgryski/go-metro | MetroHash function. | 66
dgryski/go-stadtx | Stadtx hash function. See examples/stadtx | 7
dgryski/go-sip13 | SipHash 1-3 function. | 17
dgryski/go-chaskey | Chaskey MAC. | 5
dgryski/go-speck | SPECK cipher. | 7
dgryski/go-bloomindex | Bloom-filter based search index. | 79
dgryski/go-groupvariant | SSE-optimized group varint integer encoding. | 25
bwesterb/go-sha256x8 | Eight-way SHA256 | 0
gtank/ed25519 | radix51 sub-package originally generated with PeachPy | 8

lint: address issues under examples directory

Our linter is passing but we're getting errors on goreportcard at the moment:

https://goreportcard.com/report/github.com/mmcloughlin/avo

These are all golint errors:

$ golint ./...
examples/args/args.go:3:6: exported type Struct should have comment or be unexported
examples/args/args.go:18:6: exported type Sub should have comment or be unexported
examples/returns/returns.go:3:6: exported type Struct should have comment or be unexported
examples/sha1/sha1.go:8:2: exported const Size should have comment (or a comment on this block) or be unexported
examples/sha1/sha1.go:12:1: exported function Sum should have comment or be unexported
examples/stadtx/stadtx.go:30:1: exported function SeedState should have comment or be unexported
examples/stadtx/stadtx.go:64:6: exported type State should have comment or be unexported

It appears that golangci-lint excludes examples directories by default.

https://github.com/golangci/golangci-lint/blob/3345c7136f58f042f8fa952a58eec0d8ff4f02c5/pkg/packages/skip.go#L22

I would like to enable linting of examples directory and fix any issues.

ast,build: only include textflag.h if required

We currently always include textflag.h

avo/printer/goasm.go

Lines 39 to 43 in e364d63

func (p *goasm) header() {
	p.Comment(p.cfg.GeneratedWarning())
	p.NL()
	p.include("textflag.h")
}

We should be able to write a simple pass which uses ContainsTextFlags() to determine if a given file needs the include or not.

avo/attr.go

Lines 62 to 63 in e364d63

// ContainsTextFlags() returns whether the Asm() representation requires macros in "textflags.h".
func (a Attribute) ContainsTextFlags() bool {
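
A minimal sketch of such a pass follows. The Attribute type here is a stand-in that only mirrors the idea of avo's attribute bitmask, needsTextflag is a hypothetical helper, and the sketch assumes that any non-zero attribute is printed using a textflag.h macro. The constant values are the ones defined in the Go toolchain's textflag.h.

```go
package main

import "fmt"

// Attribute is a stand-in for an attribute bitmask; it is not avo's type.
type Attribute uint16

// Flag values as defined in textflag.h.
const (
	NOPROF Attribute = 1 << iota
	DUPOK
	NOSPLIT
	RODATA
	NOPTR
)

// ContainsTextFlags reports whether printing the attribute would reference a
// textflag.h macro. Assumption: every non-zero attribute does.
func (a Attribute) ContainsTextFlags() bool { return a != 0 }

// needsTextflag is a hypothetical pass over the attributes of every function
// and global in a file. It reports whether the include is required at all.
func needsTextflag(attrs []Attribute) bool {
	for _, a := range attrs {
		if a.ContainsTextFlags() {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(needsTextflag([]Attribute{0, 0}))
	fmt.Println(needsTextflag([]Attribute{0, NOSPLIT | NOPTR}))
}
```

The printer's header could then emit the include only when the pass reports true.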

build: change LABEL signature

Change this to Label and make it match the corresponding function in Context.

Related to #33 (we should keep the context and package-level functions in sync)

build: automate creation of global functions

Currently the build package consists of a Context struct with methods for incrementally building a program. Then for convenience, we have a global Context object and package-level functions operating on it. The two have to be kept in sync manually. Ideally this would be automated.

The following gist contains a start of what this could look like

https://gist.github.com/mmcloughlin/2f6ff496978ef57efce13eab84c69cd5

@myitcv suggested that godoc could be used to get most of the way:

I'd say writing a code generator in that case would be overkill. Use 'go doc' on the package to get the methods (via grep) then sed the output to generate the functions.

https://gophers.slack.com/archives/C0VPK4Z5E/p1544342455029800
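
For comparison with the go doc and sed approach, here is a rough sketch of what a generator based on go/parser might look like. It is not the code from the gist. The input source, the global variable name ctx and the wrappers function are all made up for illustration, and the sketch assumes every parameter is named.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// src is a made-up stand-in for the file that declares Context.
const src = `package build

type Context struct{}

func (c *Context) Label(name string) {}
func (c *Context) Comment(lines ...string) {}
func (c *Context) Result() (int, error) { return 0, nil }
func (c *Context) helper() {}
`

// wrappers returns one package-level forwarding function per exported method
// on *Context. The global "ctx" it forwards to is an assumed name.
func wrappers(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "context.go", source, 0)
	if err != nil {
		return nil, err
	}
	// text returns the source text spanned by a syntax node.
	text := func(n ast.Node) string {
		return source[fset.Position(n.Pos()).Offset:fset.Position(n.End()).Offset]
	}
	var out []string
	for _, d := range f.Decls {
		fn, ok := d.(*ast.FuncDecl)
		if !ok || fn.Recv == nil || !fn.Name.IsExported() {
			continue
		}
		star, ok := fn.Recv.List[0].Type.(*ast.StarExpr)
		if !ok {
			continue
		}
		if id, ok := star.X.(*ast.Ident); !ok || id.Name != "Context" {
			continue
		}
		var params, args []string
		for _, p := range fn.Type.Params.List {
			for _, name := range p.Names {
				params = append(params, name.Name+" "+text(p.Type))
				arg := name.Name
				if _, variadic := p.Type.(*ast.Ellipsis); variadic {
					arg += "..." // forward variadic parameters as-is
				}
				args = append(args, arg)
			}
		}
		results, ret := "", ""
		if fn.Type.Results != nil {
			results, ret = " "+text(fn.Type.Results), "return "
		}
		out = append(out, fmt.Sprintf("func %s(%s)%s { %sctx.%s(%s) }",
			fn.Name.Name, strings.Join(params, ", "), results, ret,
			fn.Name.Name, strings.Join(args, ", ")))
	}
	return out, nil
}

func main() {
	fns, err := wrappers(src)
	if err != nil {
		panic(err)
	}
	for _, fn := range fns {
		fmt.Println(fn)
	}
}
```

Whether this is worth it over a shell pipeline is a judgment call. The parser-based version handles variadic parameters and results without fragile text matching, at the cost of more code to maintain.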

build: include file/line in errors

avo errors are currently very hard to understand due to lack of file:line references. We should fix this.

At the same time, it may be good to consider how to make position information more widely available to other components (#6 for example).

(Reuse go/token.Position?)
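
One possible shape for this, sketched with the standard library only: capture the generator's call site with runtime.Caller and carry it as a go/token.Position. The names callerPosition and addq are hypothetical, and addq merely stands in for an instruction builder that rejects its operands.

```go
package main

import (
	"fmt"
	"go/token"
	"runtime"
)

// callerPosition returns the position of a frame on the call stack. With
// skip == 0 it describes the caller of callerPosition itself.
func callerPosition(skip int) token.Position {
	_, file, line, ok := runtime.Caller(skip + 1)
	if !ok {
		return token.Position{}
	}
	return token.Position{Filename: file, Line: line}
}

// addq is a hypothetical stand-in for an instruction builder. On failure it
// reports the file and line of the generator code that called it.
func addq(a, b string) error {
	pos := callerPosition(1) // skip addq itself
	return fmt.Errorf("%s: ADDQ: bad operands %q, %q", pos, a, b)
}

func main() {
	if err := addq("x", "y"); err != nil {
		fmt.Println(err)
	}
}
```

Reusing token.Position gives the familiar file:line rendering through its String method, and the same value could be attached to instructions so that later passes can report positions too.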

examples/stadtx: remove leftover comments

Remove leftover comments from porting the Stadtx example

avo/examples/stadtx/asm.go

Lines 110 to 118 in 9fbb71b

LABEL(labels[6]) // LABEL(labels[6])
MOVBQZX(Mem{Base: ptr, Disp: 5}, ch) // MOVZX(reg_ch, byte[reg_ptr+5])
SHLQ(U8(48), ch) // SHL(reg_ch, 48)
ADDQ(ch, v1) // ADD(reg_v1, reg_ch)
//
LABEL(labels[5]) // LABEL(labels[5])
MOVBQZX(Mem{Base: ptr, Disp: 4}, ch) // MOVZX(reg_ch, byte[reg_ptr+4])
SHLQ(U8(16), ch) // SHL(reg_ch, 16)
ADDQ(ch, v0) // ADD(reg_v0, reg_ch)

ast,build: support function/data attributes

We should support TEXT and GLOBL attributes as defined in textflag.h.

  • define attribute type
  • add attribute to Function and Global
  • define builder interface
  • integration test that confirms our attributes are in sync with textflag.h?

ci: switch to travis

Using shippable while the repo is private. Prefer to switch to Travis CI once public.

doc: examples

Provide READMEs for examples:

  • add
  • sum
  • args
  • returns
  • complex
  • data
  • fnv1a
  • dot
  • geohash
  • sha1
  • stadtx
  • root examples/ folder: links and summary for each example (maybe auto-generated)

ir,build: add support for comments

It would be good to support comments in the output. At the moment it is natural to have comments in the generator:

avo/examples/sha1/asm.go

Lines 21 to 28 in 022d24d

// Load initial hash.
h0, h1, h2, h3, h4 := GP32(), GP32(), GP32(), GP32(), GP32()
MOVL(h.Offset(0), h0)
MOVL(h.Offset(4), h1)
MOVL(h.Offset(8), h2)
MOVL(h.Offset(12), h3)
MOVL(h.Offset(16), h4)

This could just as well be replaced with Comment("Load initial hash") and it would serve to document the generator and the output.

examples: demonstrate build tags

We should provide an example of using build tags (support added in #3).

As part of this it would be good to gather a collection of the sorts of common use cases for build tags (stdlib and popular packages using Go asm). I am not yet clear that the interface introduced in #3 is sufficient.

pass: use topological sort in liveness analysis

Liveness analysis should process instructions in topological sort order. Performance is acceptable for now, but this should be addressed at some point.

avo/pass/reg.go

Lines 18 to 19 in 9fbb71b

// Process instructions in reverse: poor approximation to topological sort.
// TODO(mbm): process instructions in topological sort order
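
To make the idea concrete, here is a sketch of liveness analysis that visits nodes in postorder of the control-flow graph, so that each node is processed after the successors reached from it in the depth-first walk. A strict topological order only exists for acyclic graphs, so postorder is used here as the usual substitute when loops are present. The Node type and both functions are illustrative and are not taken from avo's pass package.

```go
package main

import "fmt"

// Node is a hypothetical instruction in a control-flow graph.
type Node struct {
	Use, Def []string // registers read and written
	Succ     []int    // indices of successor nodes
}

// postorder returns the nodes reachable from entry, successors first.
func postorder(nodes []Node, entry int) []int {
	seen := make([]bool, len(nodes))
	var order []int
	var visit func(n int)
	visit = func(n int) {
		seen[n] = true
		for _, s := range nodes[n].Succ {
			if !seen[s] {
				visit(s)
			}
		}
		order = append(order, n)
	}
	visit(entry)
	return order
}

// liveness computes live-in sets by iterating to a fixed point and reports
// how many passes over the nodes were needed.
func liveness(nodes []Node, entry int) (in []map[string]bool, passes int) {
	in = make([]map[string]bool, len(nodes))
	for i := range in {
		in[i] = map[string]bool{}
	}
	order := postorder(nodes, entry)
	for changed := true; changed; {
		changed = false
		passes++
		for _, n := range order {
			// live-out is the union of the successors' live-in sets.
			live := map[string]bool{}
			for _, s := range nodes[n].Succ {
				for v := range in[s] {
					live[v] = true
				}
			}
			// live-in = use + (live-out - def)
			for _, d := range nodes[n].Def {
				delete(live, d)
			}
			for _, u := range nodes[n].Use {
				live[u] = true
			}
			for v := range live {
				if !in[n][v] {
					in[n][v] = true
					changed = true
				}
			}
		}
	}
	return in, passes
}

func main() {
	// a := ...; b := f(a); return b
	nodes := []Node{
		{Def: []string{"a"}, Succ: []int{1}},
		{Use: []string{"a"}, Def: []string{"b"}, Succ: []int{2}},
		{Use: []string{"b"}},
	}
	in, passes := liveness(nodes, 0)
	fmt.Println(in, passes)
}
```

For straight-line code this order reaches the fixed point in one pass plus one confirming pass. Code with loops may still need additional passes.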

testing: measure test coverage

Measuring unit test coverage is trivial. We should do this anyway.

A lot of the testing at the moment is integration testing (see examples and tests directories). Can we measure the coverage of this testing too?

build: syntactic sugar for common cases

Some common use cases are a bit ugly right now. Consider adding helpers to make this nicer.

Loading Pointers

avo/examples/sha1/asm.go

Lines 14 to 15 in 7752262

h := Mem{Base: Load(Param("h"), GP64())}
m := Mem{Base: Load(Param("m").Base(), GP64())}

Perhaps offer LoadPointer that wraps in a Mem operand for you.

Loading Slices

avo/examples/sum/asm.go

Lines 13 to 14 in 7752262

ptr := Load(Param("xs").Base(), GP64())
n := Load(Param("xs").Len(), GP64())

Perhaps offer a LoadSlice that returns a Slice object with registers in it for base, len, cap? But, then what about the common case where cap is ignored? Seems you might need some mechanism for lazy loading the parameters, depending on which of the subcomponents are actually accessed? Needs more thought.

Working with Labels

It's more annoying than it should be.

avo/examples/sum/asm.go

Lines 20 to 34 in 7752262

// Loop until zero bytes remain.
Label("loop")
CMPQ(n, Imm(0))
JE(LabelRef("done"))
// Load from pointer and add to running sum.
ADDQ(Mem{Base: ptr}, s)
// Advance pointer, decrement byte count.
ADDQ(Imm(8), ptr)
DECQ(n)
JMP(LabelRef("loop"))
// Store sum to return value.
Label("done")

  • For labels defined and then used later (like "loop" above): if Label returned a LabelRef you would not need to repeat the string literal "loop"
  • For labels used and defined later (like "done" above)... not sure yet
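
The first point can be sketched with a toy builder. Everything here (Builder, its methods and the output format) is hypothetical and only illustrates the proposed signature, where Label returns a reference that later jumps can reuse.

```go
package main

import (
	"fmt"
	"strings"
)

// LabelRef is a reference to a label, usable as a jump target.
type LabelRef string

// Builder is a toy stand-in for a program builder; it is not avo's Context.
type Builder struct{ lines []string }

// Label defines a label at the current position and returns a reference to
// it, so callers do not need to repeat the string literal.
func (b *Builder) Label(name string) LabelRef {
	b.lines = append(b.lines, name+":")
	return LabelRef(name)
}

// Instr appends an arbitrary instruction line.
func (b *Builder) Instr(text string) { b.lines = append(b.lines, "\t"+text) }

// JMP appends a jump to a previously obtained label reference.
func (b *Builder) JMP(target LabelRef) { b.Instr("JMP " + string(target)) }

func main() {
	b := &Builder{}
	loop := b.Label("loop")
	b.Instr("DECQ CX")
	b.JMP(loop) // no second "loop" literal needed
	fmt.Println(strings.Join(b.lines, "\n"))
}
```

This does not help with forward references such as "done", which are used before they are defined.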

Block Register Allocations

Allocating an array of registers is fairly common.

hash := [5]Register{GP32(), GP32(), GP32(), GP32(), GP32()}

In cases like these it could be nice to have a convenience to block allocate registers. So this would be something like GP32s(5).
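
A sketch of what such a helper could look like, assuming a hypothetical allocator that hands out virtual registers with increasing IDs. Register, Allocator and GP32s below are illustrative stand-ins, not avo's types.

```go
package main

import "fmt"

// Register is a stand-in for a virtual register.
type Register struct{ ID int }

// Allocator is a hypothetical source of fresh virtual registers.
type Allocator struct{ next int }

// GP32 returns a single fresh register.
func (a *Allocator) GP32() Register {
	r := Register{ID: a.next}
	a.next++
	return r
}

// GP32s returns n fresh registers, the block form of calling GP32 n times.
func (a *Allocator) GP32s(n int) []Register {
	regs := make([]Register, n)
	for i := range regs {
		regs[i] = a.GP32()
	}
	return regs
}

func main() {
	var a Allocator
	hash := a.GP32s(5)
	fmt.Println(len(hash), hash)
}
```

Returning a slice keeps the helper simple. Callers that want a fixed-size array like the one above would need to copy the result.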

ast,build: support build tags

It should be possible to add build tags to generated files. Questions/thoughts:

  • Do we need to support different tags for stubs files?
  • Do we want to support some kind of and/or helpers, or just allow raw text?
  • Automatically add amd64?

Related: Maratyszcza/PeachPy#75
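
To illustrate the and/or question, here is a sketch of helpers that render the classic "// +build" syntax, in which comma-separated tags within a term are ANDed and space-separated terms on one line are ORed. The Term and Line types are hypothetical and are not the interface that was eventually added to avo.

```go
package main

import (
	"fmt"
	"strings"
)

// Term is a conjunction (AND) of tags, for example amd64 AND !gccgo.
type Term []string

// Line is a disjunction (OR) of terms and renders as one "// +build" line.
// Several such lines in a file are ANDed together by the go tool.
type Line []Term

// String renders the line in "// +build" syntax.
func (l Line) String() string {
	terms := make([]string, len(l))
	for i, t := range l {
		terms[i] = strings.Join(t, ",")
	}
	return "// +build " + strings.Join(terms, " ")
}

func main() {
	constraints := []Line{
		{Term{"amd64", "!gccgo"}, Term{"arm64"}},
		{Term{"!appengine"}},
	}
	for _, l := range constraints {
		fmt.Println(l)
	}
}
```

Structured helpers like these make it easy to add amd64 automatically, whereas raw text is simpler but leaves validation to the user.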

build: generate test cases for instruction builders

It would be good to generate a massive test case that covers all the instruction builder functions.

The asmtest generator works by building a huge assembly function with one line for every instruction form.

// NewAsmTest prints one massive assembly function containing a line for every
// instruction form in the database. The intention is to pass this to the Go
// assembler and confirm there are no errors, thus helping to ensure our
// database is compatible.
func NewAsmTest(cfg printer.Config) Interface {
	return &asmtest{cfg: cfg}
}

We could do the same but for the avo builders instead of going straight to assembly.

operand: cleanup handling of constant types

We currently have U8, U16, ... as well as Imm. I am concerned that Imm is confusing, so perhaps should be removed.

avo/operand/const.go

Lines 20 to 32 in 9fbb71b

// Imm returns an unsigned integer constant with size guessed from x.
func Imm(x uint64) Constant {
	// TODO(mbm): remove this function
	switch {
	case uint64(uint8(x)) == x:
		return U8(x)
	case uint64(uint16(x)) == x:
		return U16(x)
	case uint64(uint32(x)) == x:
		return U32(x)
	}
	return U64(x)
}
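
The following sketch shows why the size guessing can surprise. The immBits function is a hypothetical helper that mirrors the switch in Imm but returns the chosen width in bits instead of a Constant.

```go
package main

import "fmt"

// immBits mirrors the size guessing in Imm: it returns the width in bits of
// the smallest unsigned type that holds x.
func immBits(x uint64) int {
	switch {
	case uint64(uint8(x)) == x:
		return 8
	case uint64(uint16(x)) == x:
		return 16
	case uint64(uint32(x)) == x:
		return 32
	}
	return 64
}

func main() {
	minusOne := int64(-1)
	fmt.Println(immBits(255))              // fits in 8 bits
	fmt.Println(immBits(256))              // one more needs 16 bits
	fmt.Println(immBits(uint64(minusOne))) // a converted -1 is treated as 64-bit
}
```

The operand width changes with the value, and a negative number converted to uint64 always becomes a 64-bit immediate, even where a sign-extended 8-bit form would be intended. Explicit constructors such as U8 or U32 avoid both surprises.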
