wazero: the zero dependency WebAssembly runtime for Go developers

Home Page: https://wazero.io

License: Apache License 2.0

wazero's Introduction

WebAssembly is a way to safely run code compiled in other languages. Runtimes execute WebAssembly Modules (Wasm), which are most often binaries with a .wasm extension.

wazero is a WebAssembly Core Specification 1.0 and 2.0 compliant runtime written in Go. It has zero dependencies, and doesn't rely on CGO. This means you can run applications in other languages and still keep cross compilation.

Import wazero and extend your Go application with code written in any language!

Example

The best way to learn wazero is by trying one of our examples. The most basic example extends a Go application with an addition function defined in WebAssembly.

Runtime

wazero supports two runtime configurations: Interpreter and Compiler. Compiler is the default.

By default, e.g. with wazero.NewRuntime(ctx), the Compiler is used if supported. You can also force the interpreter like so:

r := wazero.NewRuntimeWithConfig(ctx, wazero.NewRuntimeConfigInterpreter())

Interpreter

Interpreter is a naive interpreter-based implementation of a Wasm virtual machine. Its implementation doesn't have any platform (GOARCH, GOOS) specific code, so it can be used for any compilation target available for Go (such as riscv64).

Compiler

Compiler compiles WebAssembly modules into machine code ahead of time (AOT), during Runtime.CompileModule. This means your WebAssembly functions execute natively at runtime. Compiler is faster than Interpreter, often by an order of magnitude (10x) or more. This is done without host-specific dependencies.

Conformance

Both runtimes pass WebAssembly Core 1.0 and 2.0 specification tests on supported platforms:

Runtime      Usage                                  amd64  arm64  others
Interpreter  wazero.NewRuntimeConfigInterpreter()   ✅     ✅     ✅
Compiler     wazero.NewRuntimeConfigCompiler()      ✅     ✅     ❌

Support Policy

The below support policy focuses on compatibility concerns of those embedding wazero into their Go applications.

wazero

wazero's 1.0 release happened in March 2023, and is in use by many projects and production sites.

We offer an API stability promise with semantic versioning. In other words, we promise to not break any exported function signature without incrementing the major version. This does not mean no innovation: New features and behaviors happen with a minor version increment, e.g. 1.0.11 to 1.2.0. We also fix bugs or change internal details with a patch version, e.g. 1.0.0 to 1.0.1.

You can get the latest version of wazero like this.

go get github.com/tetratelabs/wazero@latest

Please give us a star if you end up using wazero!

Go

wazero has no dependencies except Go, so the only source of conflict in your project's use of wazero is the Go version.

wazero follows the same version policy as Go's Release Policy: two versions. wazero will ensure these versions work and bugs are valid if there's an issue with a current Go version.

Additionally, wazero intentionally delays usage of language or standard library features one additional version. For example, when Go 1.29 is released, wazero can use language features or standard libraries added in 1.27. This is a convenience for embedders who have a slower version policy than Go. However, only supported Go versions may be used to raise support issues.

Platform

wazero has two runtime modes: Interpreter and Compiler. The only supported operating systems are ones we test, but that doesn't necessarily mean other operating system versions won't work.

We currently test Linux (Ubuntu and scratch), macOS and Windows as packaged by GitHub Actions, as well as compilation of 32-bit Linux and 64-bit FreeBSD.

  • Interpreter
    • Linux is tested on amd64 (native) as well as arm64 and riscv64 via emulation.
    • macOS and Windows are only tested on amd64.
  • Compiler
    • Linux is tested on amd64 (native) as well as arm64 via emulation.
    • macOS and Windows are only tested on amd64.

wazero has no dependencies and doesn't require CGO. This means it can also be embedded in an application that doesn't use an operating system. This is a main differentiator between wazero and alternatives.

We verify zero dependencies by running tests in Docker's scratch image. This approach ensures compatibility with any parent image.


wazero is a registered trademark of Tetrate.io, Inc. in the United States and/or other countries

wazero's Issues

gasm doesn't work on bls.wasm file

Here is a repro test case for ya:

package main

import (
  "fmt"
  "bytes"
  "io/ioutil"
  "testing"
  "github.com/mathetake/gasm/wasi"
  "github.com/mathetake/gasm/wasm"
  "github.com/stretchr/testify/require"
)

func Test_Repro(t *testing.T) {
  buf, _ := ioutil.ReadFile("wasm/bls.wasm")
  mod, _ := wasm.DecodeModule(bytes.NewBuffer(buf))
  _, err := wasm.NewVM(mod, wasi.Modules)
  fmt.Println("err:", err)
  require.NoError(t, err)
}

And I've attached the bls.wasm file as a b2.txt since Github won't let me upload a filetype wasm.

b2.txt

There are two other ways of getting the wasm:

  1. download the bls.wasm file by going to https://herumi.github.io/bls-wasm/browser/demo.html , open Network tab in Chrome Developer Console, then download the file in data:application/octet-stream.
  2. Go to the repo file: https://github.com/herumi/bls-wasm/blob/master/src/bls_c.js#L403 and download that long line on line 403. That's base64. Write a little Golang to decode the base64 string to a buffer and you'll have the exact same thing.
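The second option can be sketched in a few lines of Go. This is only a sketch: it assumes the long line uses standard (padded) base64, which the issue does not confirm, and decodeWasm is a hypothetical helper name. It also checks for the 4-byte "\0asm" magic so a bad copy-paste is caught early.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// wasmMagic is the 4-byte preamble every WebAssembly binary starts with ("\0asm").
var wasmMagic = []byte{0x00, 0x61, 0x73, 0x6d}

// decodeWasm (hypothetical helper) decodes a standard base64 string and
// verifies that the result begins with the Wasm magic bytes.
func decodeWasm(b64 string) ([]byte, error) {
	buf, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, fmt.Errorf("decode base64: %w", err)
	}
	if !bytes.HasPrefix(buf, wasmMagic) {
		return nil, fmt.Errorf("not a wasm binary: missing \\0asm magic")
	}
	return buf, nil
}

func main() {
	// Stand-in input: the base64 of an empty module (magic + version 1).
	// Replace it with the string copied from bls_c.js.
	buf, err := decodeWasm("AGFzbQEAAAA=")
	if err != nil {
		panic(err)
	}
	fmt.Printf("decoded %d bytes: % x\n", len(buf), buf)
}
```

The resulting buffer can be written to disk with os.WriteFile or wrapped in bytes.NewBuffer for the repro test.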

This is the error message I get

p$ go test
err: build index space: resolve imports: failed to resolve import of module name a
--- FAIL: Test_Repro (0.01s)
    repro_test.go:19:
                Error Trace:    repro_test.go:19
                Error:          Received unexpected error:
                                build index space: resolve imports: failed to resolve import of module name a
                Test:           Test_Repro
FAIL
exit status 1

Custom name section isn't that custom

The WebAssembly specification points out that "name" is a custom section, yet it also defines its contents, including the subsections. This is directly mapped to the text format, and implementations rely on this knowledge for round-tripping and debug information.

Right now, we leave "name" as a custom section because wasm.Module is defined as needing to be the binary representation. OTOH, binary is only one representation; there are also the text and abstract representations. Moreover, implementors are not required to name types exactly as the spec does. In other words, we can choose to make the type top-level in a way that helps with round-tripping from the text format as well as any future debug information.

Aside: Being flexible where it helps implementation isn't uncommon, even in standard implementations. For example, the %.wat parser is actually the wast parser in Wasm-tools, for convenience. That said, this doesn't mean declaring a type for the representation of the name section ends up more useful than decoding it each time, either!

Bug in WASISnapshotPreview1WithConfig for Environment variables

I found a small bug in the newly added Environment support in WASI:

if len(c.Environ) > 0 {
  environ := make([]string, len(c.Environ))
  for k, v := range c.Environ {
    environ = append(environ, fmt.Sprintf("%s=%s", k, v))
  }
…

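With this logic, environ always starts with len(c.Environ) empty strings (make allocates that many zero values and append adds after them), so the first item is "" and it fails validation.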

Remove naivevm interpreter implementation

In order to reduce the maintenance burden, we would like to remove the naivevm interpreter, given that the wazeroir interpreter outperforms it. My plan is to do this once we finish the baseline JIT x86 engine!

Add wasi.Errno type implementing the error interface

Hello,

I'm opening this issue to discuss changing the way error codes in https://github.com/tetratelabs/wazero/blob/main/wasi/errno.go are represented.

Instead of being simple uint32 values, how about making them a type implementing the error interface? This would be useful to return error codes from custom implementations of the wasi.FS and wasi.File interfaces, which are otherwise converted to the generic EIO code in functions like fd_read or fd_write, for example https://github.com/tetratelabs/wazero/blob/main/wasi/wasi.go#L258

The Errno type would be defined as such:

type Errno uint32

func (e Errno) Error() string { return fmt.Sprintf("wasi.Errno(%d)", uint32(e)) }

The error codes declared in errno.go would be declared as constants of the Errno type:

const (
  ESUCCESS Errno = 0
  ...
)

The signature of functions like fd_read can then be modified to return an Errno value instead of a plain uint32. I believe this should remain compatible with the current implementation since the underlying type remains unchanged, the use of reflection to discover the function signature would keep seeing return values of kind reflect.Uint32.

Let me know if you have any concerns about it or I have overlooked implementation details, I can send a pull request if we agree the change would be useful.

Run Rust WASM Module

Hey @mathetake had a question that possibly has an easy answer, possibly not, but I have no idea who else to ask :)

I got it in my head to run a Rust->wasm module in gasm. Specifically it's swc, (rust code) and they used wasm-bindgen to generate the bindings for NodeJS.

I started by spelunking the bindings:

let imports = {};
imports['__wbindgen_placeholder__'] = module.exports;
let wasm;
const { TextDecoder } = require(String.raw`util`);

let cachedTextDecoder = new TextDecoder('utf-8', { ignoreBOM: true, fatal: true });

cachedTextDecoder.decode();

let cachegetUint8Memory0 = null;
function getUint8Memory0() {
    if (cachegetUint8Memory0 === null || cachegetUint8Memory0.buffer !== wasm.memory.buffer) {
        cachegetUint8Memory0 = new Uint8Array(wasm.memory.buffer);
    }
    return cachegetUint8Memory0;
}

function getStringFromWasm0(ptr, len) {
    return cachedTextDecoder.decode(getUint8Memory0().subarray(ptr, ptr + len));
}

const heap = new Array(32).fill(undefined);

heap.push(undefined, null, true, false);

let heap_next = heap.length;

function addHeapObject(obj) {
    if (heap_next === heap.length) heap.push(heap.length + 1);
    const idx = heap_next;
    heap_next = heap[idx];

    heap[idx] = obj;
    return idx;
}

function getObject(idx) { return heap[idx]; }

let WASM_VECTOR_LEN = 0;

let cachegetNodeBufferMemory0 = null;
function getNodeBufferMemory0() {
    if (cachegetNodeBufferMemory0 === null || cachegetNodeBufferMemory0.buffer !== wasm.memory.buffer) {
        cachegetNodeBufferMemory0 = Buffer.from(wasm.memory.buffer);
    }
    return cachegetNodeBufferMemory0;
}

function passStringToWasm0(arg, malloc) {

    const len = Buffer.byteLength(arg);
    const ptr = malloc(len);
    getNodeBufferMemory0().write(arg, ptr, len);
    WASM_VECTOR_LEN = len;
    return ptr;
}

let cachegetInt32Memory0 = null;
function getInt32Memory0() {
    if (cachegetInt32Memory0 === null || cachegetInt32Memory0.buffer !== wasm.memory.buffer) {
        cachegetInt32Memory0 = new Int32Array(wasm.memory.buffer);
    }
    return cachegetInt32Memory0;
}

function dropObject(idx) {
    if (idx < 36) return;
    heap[idx] = heap_next;
    heap_next = idx;
}

function takeObject(idx) {
    const ret = getObject(idx);
    dropObject(idx);
    return ret;
}
/**
* @param {string} s
* @param {any} opts
* @returns {any}
*/
module.exports.parseSync = function(s, opts) {
    var ptr0 = passStringToWasm0(s, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
    var len0 = WASM_VECTOR_LEN;
    var ret = wasm.parseSync(ptr0, len0, addHeapObject(opts));
    return takeObject(ret);
};

/**
* @param {any} s
* @param {any} opts
* @returns {any}
*/
module.exports.printSync = function(s, opts) {
    var ret = wasm.printSync(addHeapObject(s), addHeapObject(opts));
    return takeObject(ret);
};

/**
* @param {string} s
* @param {any} opts
* @returns {any}
*/
module.exports.transformSync = function(s, opts) {
    var ptr0 = passStringToWasm0(s, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
    var len0 = WASM_VECTOR_LEN;
    var ret = wasm.transformSync(ptr0, len0, addHeapObject(opts));
    return takeObject(ret);
};

module.exports.__wbindgen_json_parse = function(arg0, arg1) {
    var ret = JSON.parse(getStringFromWasm0(arg0, arg1));
    return addHeapObject(ret);
};

module.exports.__wbindgen_json_serialize = function(arg0, arg1) {
    const obj = getObject(arg1);
    var ret = JSON.stringify(obj === undefined ? null : obj);
    var ptr0 = passStringToWasm0(ret, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
    var len0 = WASM_VECTOR_LEN;
    getInt32Memory0()[arg0 / 4 + 1] = len0;
    getInt32Memory0()[arg0 / 4 + 0] = ptr0;
};

module.exports.__wbindgen_string_new = function(arg0, arg1) {
    var ret = getStringFromWasm0(arg0, arg1);
    return addHeapObject(ret);
};

module.exports.__wbindgen_object_drop_ref = function(arg0) {
    takeObject(arg0);
};

module.exports.__wbg_new_59cb74e423758ede = function() {
    var ret = new Error();
    return addHeapObject(ret);
};

module.exports.__wbg_stack_558ba5917b466edd = function(arg0, arg1) {
    var ret = getObject(arg1).stack;
    var ptr0 = passStringToWasm0(ret, wasm.__wbindgen_malloc, wasm.__wbindgen_realloc);
    var len0 = WASM_VECTOR_LEN;
    getInt32Memory0()[arg0 / 4 + 1] = len0;
    getInt32Memory0()[arg0 / 4 + 0] = ptr0;
};

module.exports.__wbg_error_4bb6c2a97407129a = function(arg0, arg1) {
    try {
        console.error(getStringFromWasm0(arg0, arg1));
    } finally {
        wasm.__wbindgen_free(arg0, arg1);
    }
};

module.exports.__wbindgen_rethrow = function(arg0) {
    throw takeObject(arg0);
};

const path = require('path').join(__dirname, 'wasm_bg.wasm');
const bytes = require('fs').readFileSync(path);

const wasmModule = new WebAssembly.Module(bytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, imports);
wasm = wasmInstance.exports;
module.exports.__wasm = wasm;

And then I also decompiled the wasm using wasm2wat. The output is extremely large so I won't put it here, but I could see certain bits that matched the bindings.

// ...

  (import "__wbindgen_placeholder__" "__wbindgen_json_parse" (func $__wbindgen_placeholder__.__wbindgen_json_parse (type $t3)))
  (import "__wbindgen_placeholder__" "__wbindgen_json_serialize" (func $__wbindgen_placeholder__.__wbindgen_json_serialize (type $t1)))
  (import "__wbindgen_placeholder__" "__wbindgen_string_new" (func $__wbindgen_placeholder__.__wbindgen_string_new (type $t3)))
  (import "__wbindgen_placeholder__" "__wbindgen_object_drop_ref" (func $__wbindgen_placeholder__.__wbindgen_object_drop_ref (type $t2)))
  (import "__wbindgen_placeholder__" "__wbg_new_59cb74e423758ede" (func $__wbindgen_placeholder__.__wbg_new_59cb74e423758ede (type $t8)))
  (import "__wbindgen_placeholder__" "__wbg_stack_558ba5917b466edd" (func $__wbindgen_placeholder__.__wbg_stack_558ba5917b466edd (type $t1)))
  (import "__wbindgen_placeholder__" "__wbg_error_4bb6c2a97407129a" (func $__wbindgen_placeholder__.__wbg_error_4bb6c2a97407129a (type $t1)))
  (import "__wbindgen_placeholder__" "__wbindgen_rethrow" (func $__wbindgen_placeholder__.__wbindgen_rethrow (type $t2)))

// ...

(func $transformSync (export "transformSync") (type $t7) (param $p0 i32) (param $p1 i32) (param $p2 i32) (result i32)
    (local $l3 i32) (local $l4 i32) (local $l5 i32) (local $l6 i32) (local $l7 i32) (local $l8 i64)
    (global.set $g0
      (local.tee $l3
        (i32.sub
          (global.get $g0)
          (i32.const 2432))))

// ...

(func $parseSync (export "parseSync") (type $t7) (param $p0 i32) (param $p1 i32) (param $p2 i32) (result i32)
    (local $l3 i32) (local $l4 i32)
    (global.set $g0
      (local.tee $l3
        (i32.sub
          (global.get $g0)
          (i32.const 16))))

// ...

  (func $printSync (export "printSync") (type $t3) (param $p0 i32) (param $p1 i32) (result i32)
    (local $l2 i32)
    (global.set $g0
      (local.tee $l2
        (i32.sub
          (global.get $g0)
          (i32.const 16))))

// ...

  (func $__wbindgen_malloc (export "__wbindgen_malloc") (type $t6) (param $p0 i32) (result i32)
    (block $B0
      (br_if $B0
        (i32.gt_u
          (local.get $p0)
          (i32.const -4)))
      (if $I1
        (i32.eqz
          (local.get $p0))
        (then
          (return
            (i32.const 4))))
      (br_if $B0
        (i32.eqz
          (local.tee $p0
            (call $f7923
              (local.get $p0)
              (i32.shl
                (i32.lt_u
                  (local.get $p0)
                  (i32.const -3))
                (i32.const 2))))))
      (return
        (local.get $p0)))
    (unreachable))

// ...

  (func $__wbindgen_realloc (export "__wbindgen_realloc") (type $t7) (param $p0 i32) (param $p1 i32) (param $p2 i32) (result i32)
    (block $B0
      (br_if $B0
        (i32.gt_u
          (local.get $p1)
          (i32.const -4)))
      (br_if $B0
        (i32.eqz
          (local.tee $p0
            (call $f7882
              (local.get $p0)
              (local.get $p1)
              (i32.const 4)
              (local.get $p2)))))
      (return
        (local.get $p0)))
    (unreachable))

// ...

  (func $__wbindgen_free (export "__wbindgen_free") (type $t1) (param $p0 i32) (param $p1 i32)
    (if $I0
      (local.get $p1)
      (then
        (call $f2265
          (local.get $p0)))))

// ...

  (memory $memory (export "memory") 270)

So the point is, I could see what the wasm module theoretically has to offer and what it expects.
When loading the module with gasm, it certainly threw errors for missing functions, so I defined everything that the module imported, working through the errors until I had all of the return types and input types correct.

(There are some error messages which are not that helpful, like return type 0x7f != but I understand that's not top priority ^_^)

Looking at the bindings, if I want to execute transformSync then the first thing I must do is execute __wbindgen_malloc to get a block of memory in the VM space, and then I can write a string to it and call transformSync with a pointer to that string and so on.

So here's some Go code doing that...

func main() {
	code, err := ioutil.ReadFile("./node_modules/@swc/wasm/wasm_bg.wasm")
	if err != nil {
		log.Fatal(err)
	}

	mod, err := wasm.DecodeModule(bytes.NewBuffer(code))
	if err != nil {
		log.Fatal(err)
	}

	vm, err := wasm.NewVM(mod, buildMyModule())
	if err != nil {
		log.Fatal(err)
	}

	ret, retT, err := vm.ExecExportedFunction("__wbindgen_malloc", uint64(1000))
	if err != nil {
		log.Fatal(err)
	}
}

However, I get a rather inscrutable error:

panic: runtime error: index out of range [-1]

goroutine 1 [running]:
github.com/mathetake/gasm/wasm.(*VirtualMachineOperandStack).Pop(...)
	/Users/me/go/pkg/mod/github.com/mathetake/[email protected]/wasm/vm_stack.go:34
github.com/mathetake/gasm/wasm.memoryGrow(0xc000120000)
	/Users/me/go/pkg/mod/github.com/mathetake/[email protected]/wasm/vm_memory.go:138 +0x311
github.com/mathetake/gasm/wasm.(*VirtualMachine).execNativeFunction(0xc000120000)
	/Users/me/go/pkg/mod/github.com/mathetake/[email protected]/wasm/vm_func.go:102 +0x38
github.com/mathetake/gasm/wasm.(*NativeFunction).Call(0xc004e6d890, 0xc000120000)
	/Users/me/go/pkg/mod/github.com/mathetake/[email protected]/wasm/vm_func.go:92 +0x19b
github.com/mathetake/gasm/wasm.(*VirtualMachine).ExecExportedFunction(0xc000120000, 0x1151e70, 0x11, 0xc0015eef38, 0x1, 0x1, 0x0, 0xc00009af10, 0x0, 0x3, ...)
	/Users/me/go/pkg/mod/github.com/mathetake/[email protected]/wasm/vm.go:111 +0x1f7
main.main()
	/Users/me/git/gosetta/main.go:117 +0x2c4
exit status 2

I'm really not sure what to make of that. I have the right number of arguments for malloc (gasm won't let me pass anything different) and I don't know of any initialization I'm missing. The module runs great in Node, per report.

Are you aware of an incompatibility here or some step I'm missing? Thanks!

Allowing host functions to return an error instead of panicking

Right now, host functions that reach an abnormal end, or have to exit are required to panic out.

Ex. intentional return

	panic(wasi.ExitCode(exitCode))

Ex. abend (ex on I/O)

		n, err := writer.Write(b)
		if err != nil {
			panic(err)
		}

This is an alternative to allowing signatures that return an error, then propagating out anyway. The reasons this isn't done yet are:

  • Slight confusion about error because Wasm doesn't define them. We'd have to document the behaviour.
  • Panic is easier internally as it is already used for global exits (exit from any recursive or reentrant calls)
  • Less internal code (possibly more efficient to panic vs otherwise)

I believe this is confusing from a developer point-of-view, and can result in unnecessary error wrapping. An alternative may be to retain using panics internally for global exit, but use a different call site for functions with an error signature. These functions could internally panic on error to retain the same behavior.

Custom data that is passed to host calls via `HostFunctionCallContext`

Hello! I am excited about wazero as a means to use Wasm without requiring CGO. Looking forward to ARM64 support!

I was trying to add wazero as an engine to wapc-go and ran into a small snag. I need the ability to attach custom data to the module instance in order to store function invocation state (context, payload, error, etc). This option is available in other Wasm runtimes and should be easy to add to wazero.

Envisioned usage:

Instantiation

	if err := m.store.Instantiate(m.module, moduleName); err != nil {
		return nil, err
	}

	ic := invokeContext{
		ctx: context.Background(),
		// ctx and request payload are set prior to calling `store.CallFunction`
		// response payload or error is set after calling the `hostCallHandler` (below)
	}

	m.store.SetInstanceData(moduleName, &ic)

Invocation

func (i *Instance) Invoke(ctx context.Context, operation string, payload []byte) ([]byte, error) {
	*i.ic = invokeContext{
		ctx:       ctx,
		operation: operation,
		guestReq:  payload,
	}

	results, _, err := i.m.store.CallFunction(i.name, "__guest_call", uint64(len(operation)), uint64(len(payload)))

	// Inspect response payload or error in `invokeContext`
	// results[0] = 1 for success, 0 for failures
}

Host call

func (m *Module) my_host_call(ctx *wasm.HostFunctionCallContext, operationPtr, operationLen, payloadPtr, payloadLen int32) int32 {
	ic := ctx.Data.(*invokeContext)  // <--- Grabs the custom user data
	data := ctx.Memory.Buffer
	operation := string(data[operationPtr : operationPtr+operationLen])
	payload := make([]byte, payloadLen)
	copy(payload, data[payloadPtr:payloadPtr+payloadLen])

	ic.hostResp, ic.hostErr = m.hostCallHandler(ic.ctx, operation, payload)
	if ic.hostErr != nil {
		return 0
	}

	return 1
}

Response payload or error are accessed via other host calls: full example here

Possible to override resolver behaviour?

... so that instead of having to call AddGlobal, AddHostFunction etc for everything that we want to inject, that we can instead set callbacks which return function pointers, allowing us to resolve missing functions, missing globals etc?
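To make the request concrete, here is a rough sketch of what a callback-based resolver could look like. Every type and function in it (HostFunc, Resolver, Import, resolveImports) is hypothetical and does not reflect the library's API. Statically registered functions are tried first, and the callback is consulted only for imports that are still missing.

```go
package main

import "fmt"

// HostFunc is a simplified, hypothetical host function signature.
type HostFunc func(args ...uint64) []uint64

// Resolver is a hypothetical callback consulted for imports that were
// not registered up front.
type Resolver func(module, name string) (HostFunc, bool)

// Import names one function a module expects from its host.
type Import struct{ Module, Name string }

// resolveImports looks up each import in the static table first and
// falls back to the resolver callback for anything missing.
func resolveImports(imports []Import, registered map[string]HostFunc, fallback Resolver) (map[string]HostFunc, error) {
	resolved := make(map[string]HostFunc, len(imports))
	for _, im := range imports {
		key := im.Module + "." + im.Name
		if f, ok := registered[key]; ok {
			resolved[key] = f
			continue
		}
		if fallback != nil {
			if f, ok := fallback(im.Module, im.Name); ok {
				resolved[key] = f
				continue
			}
		}
		return nil, fmt.Errorf("failed to resolve import %s", key)
	}
	return resolved, nil
}

func main() {
	// A fallback that stubs every function of module "a" with a no-op.
	stubA := func(module, name string) (HostFunc, bool) {
		if module != "a" {
			return nil, false
		}
		return func(...uint64) []uint64 { return nil }, true
	}
	resolved, err := resolveImports([]Import{{Module: "a", Name: "b"}}, nil, stubA)
	fmt.Println(len(resolved), err)
}
```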

Create InstantiatedModule type

There's a lifecycle requirement that is tricky to prove here:

The start function is intended for initializing the state of a module. The module and its exports are not accessible before this initialization has completed.

https://www.w3.org/TR/wasm-core-1/#start-function%E2%91%A0

Right now, Store includes the instantiated module state internally.

	err = store.Instantiate(mod, "test")
	store.CallFunction("test", "fac-iter", in)

Due to this, to satisfy the above things, runtime checks like this have to happen internally.

	m, ok := s.ModuleInstances[moduleName]
	if !ok {
		return nil, nil, fmt.Errorf("module '%s' not instantiated", moduleName)
	}

These checks are subject to races, which also makes them hard to test. Finally, it burdens the user a little, as they can invoke something while it is in the wrong state. Ex.

	// forgot to call store.Instantiate, or some other goroutine hasn't done that yet.
	store.CallFunction("test", "fac-iter", in) // bombs

I think a cleaner design would be to make an InstantiatedModule type and move features that require start to be invoked first onto that type. Because those features are only on the correct type, people wouldn't be tempted to do something at the wrong time or when something isn't available.

Ex.

	store.CallFunction("test", "fac-iter", in) // compile error!
	imod, err = store.Instantiate(mod, "test")
	imod.CallFunction("fac-iter", in) // compile error!
	factIter, ok = imod.GetFunction("fac-iter") // if ok, that function definitely exists
	factIter.invoke(ctx, in) // naming this "call" is also fine; either way, you can use it as often as you like

Implement DWARF parser for better backtraces

Background

LLVM-based compilers for Wasm, for example C/C++, Rust, Zig and TinyGo (virtually 100% of viable languages),
emit DWARF information into .debug_* custom sections. The following are the sections contained in a TinyGo binary:

$ wasm-objdump main.wasm -h

main.go.wasm:	file format wasm 0x1

Sections:

     Type start=0x0000000b end=0x00000158 (size=0x0000014d) count: 42
   Import start=0x0000015b end=0x000003df (size=0x00000284) count: 18
 Function start=0x000003e2 end=0x000004e8 (size=0x00000106) count: 260
    Table start=0x000004ea end=0x000004ef (size=0x00000005) count: 1
   Memory start=0x000004f1 end=0x000004f4 (size=0x00000003) count: 1
   Global start=0x000004f6 end=0x000004fe (size=0x00000008) count: 1
   Export start=0x00000501 end=0x000007ac (size=0x000002ab) count: 31
     Code start=0x000007b0 end=0x0001b258 (size=0x0001aaa8) count: 260
     Data start=0x0001b25c end=0x00020862 (size=0x00005606) count: 2
   Custom start=0x00020866 end=0x00034e9a (size=0x00014634) ".debug_info"
   Custom start=0x00034e9d end=0x00035f66 (size=0x000010c9) ".debug_pubtypes"
   Custom start=0x00035f6a end=0x000431a7 (size=0x0000d23d) ".debug_loc"
   Custom start=0x000431aa end=0x00044f60 (size=0x00001db6) ".debug_ranges"
   Custom start=0x00044f62 end=0x00044fa1 (size=0x0000003f) ".debug_aranges"
   Custom start=0x00044fa4 end=0x00046ef6 (size=0x00001f52) ".debug_abbrev"
   Custom start=0x00046efa end=0x00059503 (size=0x00012609) ".debug_line"
   Custom start=0x00059507 end=0x0006510b (size=0x0000bc04) ".debug_str"
   Custom start=0x0006510f end=0x0006bf6b (size=0x00006e5c) ".debug_pubnames"
   Custom start=0x0006bf6e end=0x0006e6e8 (size=0x0000277a) "name"
   Custom start=0x0006e6eb end=0x0006e778 (size=0x0000008d) "producers"

By reading the debug sections, we can associate each Wasm instruction in a function with a specific line of the source code from which the binary was compiled.

Why?

Some of the de facto standard Wasm tools already support the DWARF format. For example, Google Chrome[3] allows users to debug Wasm programs in the browser. Another example is Wasmtime -- when you run the panic example in this repo with WASMTIME_BACKTRACE_DETAILS=1, you can see the backtrace with source code information:

$ WASMTIME_BACKTRACE_DETAILS=1 wasmtime run examples/wasm/trap.wasm --invoke cause_panic
panic: causing panic!!!!!!!!!!
Error: failed to run main module `examples/wasm/trap.wasm`

Caused by:
    0: failed to invoke `cause_panic`
    1: wasm trap: unreachable
       wasm backtrace:
           0:  0x92a - runtime.abort
                           at /usr/local/lib/tinygo/src/runtime/runtime_tinygowasm.go:63:6
                     - runtime._panic
                           at /usr/local/lib/tinygo/src/runtime/panic.go:13:7
           1:  0x9ba - main.three
                           at /home/mathetake/gasm/examples/wasm/trap.go:19:7
           2:  0x9b0 - main.two
                           at /home/mathetake/gasm/examples/wasm/trap.go:15:7
           3:  0x9a6 - main.one
                           at /home/mathetake/gasm/examples/wasm/trap.go:11:5
           4:  0x99c - cause_panic
                           at /home/mathetake/gasm/examples/wasm/trap.go:7:5

On the other hand, at the time of this writing, our backtrace does not use DWARF; it just parses the "name" custom section and attaches each function name:

panic: causing panic!!!!!!!!!!
wasm runtime error: unreachable
wasm backtrace:
	0: runtime._panic
	1: main.three
	2: main.two
	3: main.one
	4: cause_panic

DWARF will be much more useful when users run non-TinyGo Wasm binaries: function names are usually mangled by compilers (luckily TinyGo does not do this!), so they are basically not human-readable. For example, a Rust binary's backtrace built from the "name" custom section would look like this:

  0:  0x42deb - __rust_start_panic
  1:  0x42c0c - rust_panic
  2:  0x42882 - _ZN3std9panicking20rust_panic_with_hook17h072472ae3822b936E
  3:  0x32914 - _ZN3std9panicking11begin_panic28_$u7b$$u7b$closure$u7d$$u7d$17hed88036b12f483dfE
  4:  0x34891 - _ZN3std10sys_common9backtrace26__rust_end_short_backtrace17h9133fcc3e85035deE
  5:  0x32810 - _ZN3std9panicking11begin_panic17he6f6e918174263cfE
  6:  0x39eb - _ZN77_$LT$http_headers..HttpHeaders$u20$as$u20$proxy_wasm..traits..HttpContext$GT$6on_log17hde90e85ea16e616eE
  7:  0x2ae53 - _ZN10proxy_wasm10dispatcher10Dispatcher6on_log17hc6cd4fb35c538b86E
  8:  0x2d3dd - _ZN10proxy_wasm10dispatcher12proxy_on_log28_$u7b$$u7b$closure$u7d$$u7d$17h3f864ec735f41e70E
  9:  0x311bd - _ZN3std6thread5local17LocalKey$LT$T$GT$8try_with17hc87d8e9cf2d2494cE

With DWARF information, we don't need to parse the "name" custom section, so we won't suffer from these mangled symbols. Instead, we can emit each trace with human-readable function names plus source code info.

How?

The Wasm DWARF format[1] is almost the same as the standard DWARF specification (version 5?)[2]. The difference is that an address should be interpreted as an offset from the beginning of "the code section", rather than from the beginning of "the binary" as in non-Wasm formats.

So it should be simple to write a parser by drawing on other DWARF implementations.
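To make the address rule concrete, here is a minimal Go sketch that locates the code section in a Wasm binary and converts an absolute file offset into a code-section-relative address. The helper names (`codeSectionStart`, `toDWARFAddress`) are hypothetical, and the sketch assumes the offset is measured from the start of the section's contents (after the section ID and size); check that against [1] before relying on it.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// sectionIDCode is the ID of the code section in the Wasm binary format.
const sectionIDCode = 10

// codeSectionStart returns the file offset at which the code section's
// contents begin, skipping the 8-byte header and all preceding sections.
func codeSectionStart(bin []byte) (int, error) {
	if len(bin) < 8 || string(bin[:4]) != "\x00asm" {
		return 0, errors.New("not a Wasm binary")
	}
	pos := 8 // magic (4 bytes) + version (4 bytes)
	for pos < len(bin) {
		id := bin[pos]
		pos++
		// Section sizes are unsigned LEB128, which binary.Uvarint decodes.
		size, n := binary.Uvarint(bin[pos:])
		if n <= 0 {
			return 0, errors.New("invalid section size")
		}
		pos += n
		if id == sectionIDCode {
			return pos, nil
		}
		pos += int(size)
	}
	return 0, errors.New("no code section")
}

// toDWARFAddress converts an absolute file offset into an offset relative
// to the code section (assumption: relative to the section contents).
func toDWARFAddress(fileOffset, codeStart int) uint64 {
	return uint64(fileOffset - codeStart)
}

func main() {
	bin := []byte{
		0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00, // header
		0x01, 0x04, 0x01, 0x60, 0x00, 0x00, // type section: one type () -> ()
		0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
		0x0a, 0x04, 0x01, 0x02, 0x00, 0x0b, // code section: one empty body
	}
	start, err := codeSectionStart(bin)
	if err != nil {
		panic(err)
	}
	fmt.Println(start, toDWARFAddress(23, start))
}
```

The resulting address is what would be looked up in the DWARF line table; parsing the `.debug_*` custom sections themselves is left out of this sketch.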

Links

[1] https://yurydelendik.github.io/webassembly-dwarf/
[2] https://dwarfstd.org/doc/DWARF5.pdf (PDF)
[3] https://twitter.com/ChromeDevTools/status/1192803818024710145

Add support for sign-extension instructions (post MVP feature)

Hello,

I'm opening this issue to report an error I ran into where it appeared wazero was unable to instantiate an AssemblyScript program compiled with asc.

Here is the error I got:

functions: invalid function at index 106/280: invalid instruction

Tools like wasm2wat are able to read through the file without issues, so it appears correct, though I do not know whether they perform checks similar to those in wazero.

I'm attaching the program that triggered the error:
test.wasm.gz

Let me know if you need any other information.
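For context on the feature named in the title, the sign-extension operators proposal adds five opcodes after the MVP. The sketch below shows what each one computes, written as plain Go conversions. It is an illustration of the semantics only, not wazero's implementation, and the issue text itself does not confirm which opcode triggered the error.

```go
package main

import "fmt"

// Opcodes added by the sign-extension operators proposal.
const (
	opI32Extend8S  = 0xC0
	opI32Extend16S = 0xC1
	opI64Extend8S  = 0xC2
	opI64Extend16S = 0xC3
	opI64Extend32S = 0xC4
)

// Each function sign-extends the low bits of v to the full width.
func i32Extend8S(v uint32) uint32  { return uint32(int32(int8(v))) }
func i32Extend16S(v uint32) uint32 { return uint32(int32(int16(v))) }
func i64Extend8S(v uint64) uint64  { return uint64(int64(int8(v))) }
func i64Extend16S(v uint64) uint64 { return uint64(int64(int16(v))) }
func i64Extend32S(v uint64) uint64 { return uint64(int64(int32(v))) }

func main() {
	fmt.Printf("%#x\n", i32Extend8S(0x80))
	fmt.Printf("%#x\n", i64Extend32S(0x80000000))
}
```

A decoder that only knows the MVP opcode table would reject any of these bytes as an invalid instruction.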

Making it possible to "clone" an initialized store

Hi peeps, playing around with wazero for fun - it's awesome! I'm currently using wazero to run wasi based libraries within Go. The library is built in Rust, and I was testing FFI vs Wazero because, well, it sounded interesting.

After implementing the basics, I'm noticing:

  • Instantiating a new store and interpreter takes the bulk of the time. For my library, that's anywhere from 400ms to 1.5 seconds.
  • WASI and Wasm don't de-allocate memory via free(), so memory I allocate in the library to pass objects (eg. strings, structs) is never cleaned up. Memory grows linearly with usage.

So, that leaves you with a couple options to manage speed vs memory growth:

  • Create a new env in a separate goroutine when memory usage grows to a specific level, then atomically swap envs to reclaim space
  • Or, allow an instantiated Store to be cloned once initialized so that we can use the fresh state for each library invocation.
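The first option can be sketched in Go with `atomic.Pointer` (Go 1.19+). Everything here is hypothetical: `env` stands in for an instantiated store plus its module, and `memUsed` for however memory growth is measured. In practice `maybeRotate` would run in a background goroutine, since building the fresh env is the slow part.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// env is a hypothetical stand-in for an instantiated store and module.
type env struct {
	generation int
	memUsed    atomic.Int64 // bytes of guest memory currently in use
}

// pool hands out the current env and replaces it once memory grows too large.
type pool struct {
	current atomic.Pointer[env]
	limit   int64
	newEnv  func(generation int) *env
}

func newPool(limit int64, newEnv func(generation int) *env) *pool {
	p := &pool{limit: limit, newEnv: newEnv}
	p.current.Store(newEnv(0))
	return p
}

// get returns the env to use for the next library invocation.
func (p *pool) get() *env { return p.current.Load() }

// maybeRotate builds a replacement env when usage has reached the limit and
// swaps it in atomically. It reports whether a swap happened.
func (p *pool) maybeRotate() bool {
	old := p.current.Load()
	if old.memUsed.Load() < p.limit {
		return false
	}
	return p.current.CompareAndSwap(old, p.newEnv(old.generation+1))
}

func main() {
	p := newPool(1<<20, func(g int) *env { return &env{generation: g} })
	p.get().memUsed.Store(2 << 20) // pretend the guest grew past the limit
	fmt.Println(p.maybeRotate(), p.get().generation)
}
```

Calls already running against the old env keep their pointer, so the old env is only reclaimed once they finish and the garbage collector can drop it.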

I've played around with cloning a store, but it's quite a pain. Theoretically, is this something that wazero would ever enable? Is there something I'm missing here?

Also, I haven't looked into the JIT; this is only for the interpreter engine as ... well, I'm on Debian arm, and I want to produce static builds for every OS & arch 😬

Host functions called from another host function cannot know the right MemoryInstance

In the current implementation, host functions try to get the MemoryInstance of HostFunctionCallContext from <function>.ModuleInstance.Memory, where <function> is the parent stack frame's function (in the interpreter) or the function itself (in the JIT).
However, a host function has neither an associated MemoryInstance nor a wasm module in the first place. So, a host function only finds nil as its MemoryInstance when it's called from another host function.

FYI, it looks like wasmtime also doesn't provide access to memory instances when a host function is called from another host function, because there's no caller context in that case, but let me check.

memory = it.frames[len(it.frames)-1].f.funcInstance.ModuleInstance.Memory

e.execHostFunction(compiled.source.HostFunction, &wasm.HostFunctionCallContext{Memory: f.ModuleInstance.Memory})

This also prevents the WASI implementation from testing APIs in an "import-and-export-back" style, where tests call each WASI API by having a wasm module import a WASI API and re-export it to the host.
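The failure mode can be reproduced in a few lines of Go. All types below are hypothetical simplifications, not the real wasm package: `outerLosing` mimics deriving the context from the host function's own (non-existent) module, while `outerForwarding` shows one possible way out, which is to pass the caller's context through unchanged.

```go
package main

import "fmt"

// memory and callContext are hypothetical stand-ins for MemoryInstance and
// HostFunctionCallContext.
type memory struct{ buf []byte }

type callContext struct{ mem *memory }

// memSize is the inner host function: it reports the size of the memory it
// can see, or -1 when no memory is available.
func memSize(ctx *callContext) int {
	if ctx.mem == nil {
		return -1
	}
	return len(ctx.mem.buf)
}

// outerLosing builds a new context from the host function itself, which has
// no module and therefore no memory.
func outerLosing(ctx *callContext) int {
	return memSize(&callContext{})
}

// outerForwarding hands the caller's context to the inner host function.
func outerForwarding(ctx *callContext) int {
	return memSize(ctx)
}

func main() {
	ctx := &callContext{mem: &memory{buf: make([]byte, 4)}}
	fmt.Println(outerLosing(ctx), outerForwarding(ctx))
}
```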

Refactor out a codec package

Follow-up from #133 (comment)

Currently, the wasm package conflates the binary format with its codec. This has led to some confusion, for example about where logic for writing the binary format of names should live. If there were a separate package, this would be a no-brainer, as the part that decodes should also have the logic that encodes.

Since later we need to refactor out an internal package, we should probably consider refactoring a text and binary codec package also.

Note: This isn't the highest priority as we don't even support the full text format, yet, and doing this before we have a functional second format is questionable. However, it makes sense to do!
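As a small illustration of why the encoder and decoder belong together, here is a sketch of both directions for a name in the binary format (an unsigned LEB128 byte length followed by UTF-8 bytes). The function names are hypothetical and not taken from the wasm package; `binary.AppendUvarint` needs Go 1.19+.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"unicode/utf8"
)

// encodeName writes a name as a LEB128 length prefix plus its UTF-8 bytes.
func encodeName(name string) []byte {
	buf := binary.AppendUvarint(nil, uint64(len(name)))
	return append(buf, name...)
}

// decodeName reads a name written by encodeName and returns it along with
// the number of bytes consumed.
func decodeName(b []byte) (string, int, error) {
	size, n := binary.Uvarint(b)
	if n <= 0 {
		return "", 0, errors.New("invalid name length")
	}
	if size > uint64(len(b)-n) {
		return "", 0, errors.New("name exceeds input")
	}
	end := n + int(size)
	if !utf8.Valid(b[n:end]) {
		return "", 0, errors.New("name is not valid UTF-8")
	}
	return string(b[n:end]), end, nil
}

func main() {
	encoded := encodeName("cause_panic")
	name, consumed, err := decodeName(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, consumed)
}
```

Keeping the pair side by side makes a round-trip test the natural unit test for the package.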

Add lint configuration to prevent `import "C"`

wazero is by nature a tool that should never include import "C". When code accidentally includes this, it causes mysterious build failures. There's probably a linter that can fail the build on import "C" either by import name or preventing CGO another way.

Ex https://github.com/tetratelabs/wazero/actions/runs/1832310806

Run go list ./... | xargs -Ipkg go test pkg -c
  go list ./... | xargs -Ipkg go test pkg -c
  shell: /usr/bin/bash -e {0}
  env:
    GO_VERSION: 1.17
    GOROOT: /opt/hostedtoolcache/go/1.17.6/x64
    GOARCH: arm64
go: downloading github.com/twitchyliquid64/golang-asm v0.15.1
# github.com/tetratelabs/wazero/wasm/text
Error: wasm/text/abbreviation_parser.go:48:60: undefined: callbackPosition
Error: wasm/text/abbreviation_parser.go:78:6: undefined: parserPosition
Error: wasm/text/abbreviation_parser.go:106:9: undefined: callbackPositionUnhandledToken
Error: wasm/text/abbreviation_parser.go:107:10: undefined: positionInitial
Error: wasm/text/abbreviation_parser.go:287:38: undefined: callbackPosition
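If no existing linter fits, a check like this is small enough to write with the standard library. The sketch below parses only the import declarations of a source file and reports whether it imports the pseudo-package "C". The function name `importsC` is hypothetical, and wiring it into the build (for example as a test that walks the repository) is left out.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// importsC reports whether the given Go source imports the pseudo-package "C".
func importsC(filename, src string) (bool, error) {
	fset := token.NewFileSet()
	// ImportsOnly stops parsing after the import declarations.
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return false, err
	}
	for _, imp := range f.Imports {
		// Path.Value is the import path literal, including its quotes.
		if imp.Path.Value == `"C"` {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	found, err := importsC("example.go", "package p\n\nimport \"C\"\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(found)
}
```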

Move contents of wasm.Module to an internal package and replace it with an interface

Currently, we are doing some re-validation because wasm.Module is not required to be created by a ModuleDecoder. For example, we double-check certain things like the start function pointing to a valid function type. These sorts of checks can and should be done when parsing/decoding, as otherwise you don't know where the failure line/col was. Moreover, there are so many rules that it is a fool's errand to repeat each one in instantiate: we'd end up with a lot of code that isn't necessary if decoded properly. Finally, exposing fields like the CodeSection to mutation allows some pretty severe bugs. For example, someone could by accident mess up a signature order, something that decoders already enforce.

I don't think 3rd party manual instantiation of wasm.Module is viable. The only sustainable way is via validating decoders which we already maintain. I believe that we should instead move the existing logic of wasm.Module internal, guaranteeing it is only created by our trusted decoders. We'd backfill wasm.Module as a struct or interface that has no exported mutable fields, but could be implemented by or contain an internal wasm.Module created by the decoder.

Doing this allows us to move all validation possible to do during decoding to the edge and guarantee we can count on that having happened prior to instantiate. This also removes a large bug source area of random or misunderstood contents in function code or constant declarations.
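A minimal sketch of the proposed shape, with hypothetical names throughout: the exported `Module` is an interface with read-only accessors, the concrete type is unexported, and the only way to obtain one is through a validating constructor. The duplicate-export check stands in for the much larger set of rules a real decoder enforces.

```go
package main

import (
	"errors"
	"fmt"
)

// Module is the exported, read-only view of a decoded module.
type Module interface {
	// ExportNames returns a copy of the export names in declaration order.
	ExportNames() []string
}

// module is the internal representation that only decoders can create.
type module struct{ exportNames []string }

func (m *module) ExportNames() []string {
	return append([]string(nil), m.exportNames...)
}

// decodeModule is a placeholder for a validating decoder. It rejects
// duplicate export names, which the specification forbids.
func decodeModule(exportNames ...string) (Module, error) {
	seen := map[string]bool{}
	for _, name := range exportNames {
		if seen[name] {
			return nil, errors.New("duplicate export name: " + name)
		}
		seen[name] = true
	}
	return &module{exportNames: append([]string(nil), exportNames...)}, nil
}

func main() {
	m, err := decodeModule("mem", "f")
	if err != nil {
		panic(err)
	}
	names := m.ExportNames()
	names[0] = "tampered" // only changes the caller's copy
	fmt.Println(m.ExportNames())
}
```

Because callers never hold a mutable reference to the internal fields, instantiate can rely on validation having already happened.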

JIT: consider locking Goroutine thread until JIT exits

We could use runtime.LockOSThread() to prevent the Go runtime from switching the goroutine via async preemption, which can happen at any point of execution.

Even though we've tested that JIT execution is stable without locking (as in jit/engine_test.go), there might still be some condition where the JIT would fail.

This needs a more thorough investigation of how the Go runtime saves CPU registers when a switch happens (at the source code level), and more test cases to determine whether or not we need thread locking via LockOSThread.
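For reference, the locking pattern under discussion looks like this in Go. The wrapper name `callLocked` is hypothetical, and the sketch makes no claim about whether locking is sufficient or even needed for the JIT; that is exactly what the investigation above has to settle.

```go
package main

import (
	"fmt"
	"runtime"
)

// callLocked runs fn with the calling goroutine wired to its current OS
// thread, and releases the thread once fn returns.
func callLocked(fn func()) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	fn()
}

func main() {
	callLocked(func() {
		fmt.Println("running on a locked OS thread")
	})
}
```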

Flatten type hierarchy removing single-implementation types

#79 removed the naivevm interpreter implementation, leaving what we called wazeroir. We had to name the other implementation because there were multiple possibilities. In reality, the maintenance cost of multiple stores implies another will never exist. Let's try to reduce the number of types and packages, knowing there is only wazeroir. Part of this could be "denaming" wazeroir as it becomes the default and only impl. Any other opportunities to flatten package and type hierarchies, and generally concepts or terms, should be taken as well!

backfill a common documentation interface for jit compiler

Right now, compiler methods lack overview comments, and if we added them, they would need to be copy/pasted and would drift. Moreover, the compilers don't need to be exposed publicly: they can be hidden internally, with only an entry point exposed publicly where needed. If all of this were internal, we could make a documentation-only interface (with a compile-time check in a test) that has overview notes on, for example, what compileUnreachable should do in the abstract (ex. any wasm opcodes or whatever). Then, the precise implementations can re-use those docs, like

// compileUnreachable implements Compiler.compileUnreachable for the arm64 (s390x risc-v etc) architecture.
//
// Specifically, this does blah blah and blah because blah

Note: this is documentation-only, meaning we can still use structs. It just ensures we have minimally documented things with no repetition, and lets us easily identify what isn't common (ex. if compileXXX is specific to only one arch, we would see no "compileXXX implements Compiler.compileXXX" in the doc).
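In Go, the "compile check" part is usually a blank-identifier assignment. The sketch below uses hypothetical names (`compiler`, `amd64Compiler`, an `emitted` slice standing in for generated code) to show how the overview doc lives on the interface while each architecture only documents its specifics.

```go
package main

import "fmt"

// compiler is a documentation-only interface: the overview of each method
// lives here, once.
type compiler interface {
	// compileUnreachable emits code for the unreachable instruction, which
	// unconditionally traps.
	compileUnreachable() error
}

// amd64Compiler is a hypothetical architecture-specific implementation.
type amd64Compiler struct {
	emitted []string // stand-in for generated machine code
}

// compileUnreachable implements compiler.compileUnreachable for the amd64
// architecture.
func (c *amd64Compiler) compileUnreachable() error {
	c.emitted = append(c.emitted, "trap")
	return nil
}

// Compile-time check: the build fails if amd64Compiler drifts from the
// documented interface.
var _ compiler = (*amd64Compiler)(nil)

func main() {
	c := &amd64Compiler{}
	if err := c.compileUnreachable(); err != nil {
		panic(err)
	}
	fmt.Println(c.emitted)
}
```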

unreachable

Running this module triggers the unreachable opcode (0x00) where it shouldn't.
It happens in the inner if block.

(module
  (type (;0;) (func (param i32) (result i32)))
  (func (;0;) (type 0) (param i32) (result i32)
    (local i32)
    local.get 0
    if  ;; label = @1
      i32.const 0
      local.set 1
      loop  ;; label = @2
        local.get 1
        i32.const 3
        i32.gt_u
        if  ;; label = @3
          local.get 1
          return
        end
        local.get 1
        i32.const 1
        i32.add
        local.tee 1
        local.get 0
        i32.lt_u
        br_if 0 (;@2;)
      end
    end
    local.get 1)
  (memory (;0;) 1)
  (export "mem" (memory 0))
  (export "f" (func 0)))
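For reference, the exported function "f" can be read as the following Go function. This is a hand translation of the text format above, not the output of any tool, and it is meant only to state the expected result: no input should reach an unreachable instruction.

```go
package main

import "fmt"

// f mirrors the control flow of the exported Wasm function: n is local 0
// and i is local 1.
func f(n uint32) uint32 {
	var i uint32
	if n != 0 { // outer if (label @1)
		i = 0
		for { // loop (label @2)
			if i > 3 { // inner if (label @3), i32.gt_u
				return i
			}
			i++
			if !(i < n) { // br_if 0 continues the loop while i < n
				break
			}
		}
	}
	return i
}

func main() {
	for _, n := range []uint32{0, 1, 3, 4, 10} {
		fmt.Println(n, f(n))
	}
}
```

In other words, the function returns n capped at 4.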

gasm executes the following opcodes:

op=0x20
op=0x4
op=0x41
op=0x21
op=0x3
op=0x20
op=0x41
op=0x4b
op=0x4 // if
op=0x0 // <- this is wrong

See the example code (and the binary module min.wasm):
https://github.com/ktye/goissues/blob/master/gasm/7/main.go

Tracking issue for the baseline JIT engine for amd64 target

Baseline JIT engine

Post baseline

  • achieve function calls without going back to Go code (engine.exec)
  • Add tests and better documentation for mmap system calls (See #60 (comment))
  • Bunch of optimizations! (not limited to single pass)

Make wasm.HostFunctionCallContext an interface

Right now, someone can accidentally mess up the memory of the host function. I think both the context and its members should be accessed via interfaces, which can provide functions to manipulate memory instead of direct buffer access. This will ensure bounds etc. aren't violated.

In particular, this means not exporting Memory.Buffer: Memory becomes an interface with functions to perform common tasks.
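A minimal sketch of what such an interface could look like, assuming hypothetical method names: every access is bounds-checked and reports failure with a boolean instead of exposing the backing buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Memory is a hypothetical bounds-checked view of linear memory.
type Memory interface {
	// Size returns the memory size in bytes.
	Size() uint32
	// ReadUint32Le reads a little-endian uint32, or returns false if the
	// read would be out of bounds.
	ReadUint32Le(offset uint32) (uint32, bool)
	// WriteUint32Le writes a little-endian uint32, or returns false if the
	// write would be out of bounds.
	WriteUint32Le(offset, v uint32) bool
}

// memoryInstance keeps its buffer unexported.
type memoryInstance struct{ buffer []byte }

func (m *memoryInstance) Size() uint32 { return uint32(len(m.buffer)) }

// inBounds uses 64-bit arithmetic so offset+n cannot overflow.
func (m *memoryInstance) inBounds(offset, n uint32) bool {
	return uint64(offset)+uint64(n) <= uint64(len(m.buffer))
}

func (m *memoryInstance) ReadUint32Le(offset uint32) (uint32, bool) {
	if !m.inBounds(offset, 4) {
		return 0, false
	}
	return binary.LittleEndian.Uint32(m.buffer[offset:]), true
}

func (m *memoryInstance) WriteUint32Le(offset, v uint32) bool {
	if !m.inBounds(offset, 4) {
		return false
	}
	binary.LittleEndian.PutUint32(m.buffer[offset:], v)
	return true
}

func main() {
	var mem Memory = &memoryInstance{buffer: make([]byte, 8)}
	fmt.Println(mem.WriteUint32Le(4, 42))
	fmt.Println(mem.ReadUint32Le(4))
	fmt.Println(mem.WriteUint32Le(6, 1)) // would cross the end of memory
}
```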

Does gasm only support tinygo WASM modules?

I tried a little experiment to validate this but it didn't work. Below is the experiment:

package main

const wasmBytecode = `
	(module
	  (type (func (param i32 i32) (result i32)))
	  (func (type 0)
	    local.get 0
	    local.get 1
	    i32.add)
	  (export "sum" (func 0)))
`

func main() {
	wasm, _ := wasmtime.Wat2Wasm(wasmBytecode) // notice not tinygo
	mod, err := gasm.DecodeModule(bytes.NewBuffer(wasm))
	if err != nil {
		panic(err)
	}
	vm, _ := gasm.NewVM(mod, wasi.New().Modules())

	_, _, _ = vm.ExecExportedFunction("sum", uint64(5), uint64(37))
}

Which resulted in a panic:

panic: runtime error: index out of range [0] with length 0

Just wanted to clarify if what I observed is expected.

CNCF SIG-Runtime Discussion/Presentation

Hello Gasm team,

I'm one of the co-chairs of the CNCF SIG-Runtime. I'm reaching out because I think it would be great for you to present/discuss the project at one of our meetings. For example, you could discuss things such as an architecture overview and use cases.

Let me know if this is something you'd be interested in doing. If yes, please feel free to add it to our agenda or reach out to me (raravena80 at gmail.com)

Thanks!
