lunatic-solutions / lunatic Goto Github PK

View Code? Open in Web Editor NEW

4.5K 71.0 135.0 3.99 MB

Lunatic is an Erlang-inspired runtime for WebAssembly

Home Page: https://lunatic.solutions

License: Apache License 2.0

Rust 98.23% WebAssembly 1.77%

vm webassembly erlang actors rust assemblyscript runtime wasm lunatic

lunatic's People

Contributors

Stargazers

Watchers

Forkers

isgasho 8847141 kustomzone doytsujin zmilan longjohncoder stjordanis lffranca kusanagi2501 ameliak89 arnevogel imor flarelee sleipnir grippy mydnicq bkolobara zebp rusch95 benstiglitz teymour-aldridge darwin-systems yetanotheropensource rustbunker forkkit pingponglabs aksrustagi theduke linecode placrosse willemneal daana32 cruelu datalove-app lurk-lab zeta1999 spector-in-london diamantisk zabrane jarrodhroberson elssar olanod somuchspace meshx-org hoangpq ibnusina-cibening zyhong zeroows masterjj zhaopufeng waynemunro jtenner therustmonk bhatti maxblinder-godindustry frankfanslc pavelzx kokizzu drogus hellcoderz binadamu-isiyoonekana achyutha cyberflamego sinhasantos rustloverthecoder fanweixiao spread0x zhamlin pinkforest pingiun kosticmarin altoplano mdmarek markintoshz sshyran cristianstanciu astralmoonhalo ahalabs tqwewe hurricankai 5gapp pratikdhanave jeisc anupj gmh5225 shoz-f alecthomas leoshimo qwesdme squattingsocrates crt-fork clownsw redchew-fork roger countbleck ariesdevil lambdaxymox kowsheek xet7 yhm404

lunatic's Issues

What does lunatic mean?

Out of curiosity, why do you name the library as lunatic? This is totally one of the FAQs!

Unlink processes that finish normally

A linked process will propagate a failure to all links, but if a process finishes normally it will not notify others. Because of this, a process that spawns a lot of linked children that finish normally, will accumulate a list of all links that is never freed. We should also notify all links that a process finished normally, so that they can unlink it (remove from the list).

Also, the current implementation of linking is a bit fragile, maybe it would make sense to redesign the whole system. A good idea would be to research a bit on how Erlang is storing the information about what processes are linked together.

Use Wasmtime's shared host functions to improve process spawning performance

As mentioned in the issue #37, a recent addition to Wasmtime makes it possible to share host functions between all instances without adding them over and over again to each instance linker. This should greatly reduce instance creation time.

To be able to use this, we first need to figure out how to bind state/context to each instance. Currently we capture the state in a closure and add this closure as a host function for the instance. If we are going to have only global host functions, we need to do some work to fetch the state during each invocation of the host function.

As the RFC explains, we will need to switch to using the Store::set and Store::get methods together with the Caller structure to get the store for each instance, like Wasmtime's WASI implementation. Generally, this is a good approach as we always maintain one Store per instance in Lunatic, but a big obstacle for us here is the way that the uptown_funk macro works.

Store::set/get allows us to only set one context per Store/Instance. With uptown_funk we allow you to define different states belonging to a group of function. This means that we need to put all the individual states inside a common structure and during the macro generation assign unique IDs to each state structure, so that the functions can fetch the specific state belonging to them.

I misunderstood this API at first. Store::set/get works on a per type basis and should be a great fit for uptown_funk.

At this point I'm not sure what the performance implications are if we need to each time look up a state in something like a hash map for host functions that use state. I will run a few benchmarks to see if the cost of doing this is too hight.

Ideally, on each macro invocation we would assign an incremented id and be able to generate a tuple containing the state to minimise lookup speed. But as Rust macros can't keep state between invocation, this is currently not possible.

New plugin system

Lunatic already has a plugin system that is not well documented or particularly useful in the current state. It started out as a way to dynamically change host functions with WebAssembly modules, but ended up being a tool to modify WebAssembly modules before instantiating them.

I want to use this issue to discuss pros and cons of different approaches to a plugin system and figure out what the best path forward would be. In my grand vision plugins would be an easy way to extend the VM without needing to recompile it, just by adding Wasm modules to it. First of all this means extending the host functions. It could also be a playground for adding new functionality to the VM; we can test it as a plugin behind a feature flag and later provide a native implementation. This could potentially also improve development speed, because plugins can be dynamically loaded and added to environments inside one VM instance. Testing plugins or comparing performance between competing implementations becomes trivial, just spawn different environments, attach different plugins to them and run the same code. You could implement plugins in this case in any language compiling to Wasm, you would not only be limited to rust when extending the VM.

Another use case could be replacing existing functionality with custom implementations. One example of this would be to provide a custom networking layer for the distributed lunatic implementation.

We can go even further here and allow people to load native shared libraries that are exposed as host functions (an idea of @jtenner). Or even allow plugins to modify the loaded Wasm code before JIT compiling it, an approach that our heap profiler plugin is taking.

The most important questions I would like answered here are:

How do we provide a good user experience around plugins? If creating plugins is too complicated and the complexity around them is hard to manage, nobody is going to be using them.
What kind of plugin system would give us the best trade-off between user experience and performance?
Should we allow native code (that could potentially crash the whole VM) plugins?

1. Plugins that provide host functions

Originally I wanted lunatic to be just a set of core APIs (networking, filesystem, processes, ...) and every other functionality would be implemented as a plugin in WebAssembly on top of it. Around the same time Wasmtime got support for module-linking and I saw a great opportunity to actually accomplish this. The idea was simple, you could add new host functions to processes by defining them inside WebAssembly plugins. These host functions would provide higher level abstractions on top of the core APIs or even shadow/change the core APIs.

Lunatic would provide a set of "standard" plugins. WASI could be implemented that way, re-exposing lunatic's core filesystem API to guest code through a WASI compatible interface. You could also provide your own plugins by passing them to the lunatic binary: lunatic --plugin my-wasi.wasm .... That way you could provide a WASI implementation that actually uses lunatic's networking API and proxies your filesystem read/writes to a network storage.

It would give us an opportunity to keep the core API simple and small, but provide an elegant way to extend it without needing to step out of the WebAssembly space. The developer experience around creating these plugins would be simple, you just import existing host functions and use them inside new host functions that you export.

There are a few drawback to this approach. The module-linking proposal requires each module to be instantiated separately. This means that for each process we are now creating not one WebAssembly instance, but n (where n is the number of plugins we are loading). Process spawning speed is really important for most lunatic use cases and having it progressively slow down as we are extending functionality with plugins was not something I could easily accept. One solution here could be to have the module-linking implementation be lazy and only instantiate if we actually ever use plugin provided host function from a particular process, but the WebAssembly runtime we are using (Wasmtime) doesn't provide such a feature.

A bigger issue is the performance overhead when proxying calls. Let's say someone provides a wasi-socket plugin that builds on top of our core networking API to provide a WASI compatible networking interface. Now writing to the network would require us to first copy all the data from the guest process heap into the wasi-socket plugin instance. If the wasi-socket was building on top of other plugins it would even take more copies, because each layer can just talk to the next one and move data in-between. An alternative to creating all of theses copies would be a much more complex implementation of APIs that allow you to export different memories from different plugins and have a system of core APIs that also take memory "indexes". However, such a system would be much more complicated to work with. You would need to keep track of memory references and slices inside of plugins, on top of the logic you are implementing.

The interface-types proposal is not much of help here as it also assumes copying. From what I understand reading the spec, the Wasm runtime could potentially optimize out this copies, but it would only work between guest code and native host functions, and can't be optimized out between Wasm modules linked together (our case). There are some additions to interface types that could make it work (streams), but it's hard to tell at this stage how exactly everything will fall into place or when this might land.

Lunatic's design shines in I/O heavy workloads and introducing additional overhead is a no-go here. The only way forward would be our custom implementation with memory indexes and then eventually move to interface-types streams once they are ready.

2. Plugins that modify the WebAssembly bytecode

There are some use cases where you want plugins to modify the loaded WebAssembly bytecode. Our heap profiler plugin works that way. It examines the module, looks for functions with names such as malloc, alloc, or similar, and inserts additional call instructions into them that hook up to some host functions that count the memory usage.

One big issue with this approach is that it's really hard to modify WebAssembly modules correctly and depending from which host language you are compiling to WebAssembly the assumptions you are making about the generated bytecode may not hold (e.g. language doesn't have functions with the names malloc, alloc, ...).

The first implementation of this system in lunatic was just giving the whole binary module to the plugin and taking the modified one back. This required each plugin to ship with a whole WebAssembly parser inside and always re-parse the module. Also the order in which the plugins are loaded become significant in this case and can introduce weird issues.

The current implementation parses once the module and exposes hooks to plugins to query the module for functions, modify them or add new ones. I think that this is an ok way forward, it keeps the plugin modules small and we can only expose hooks that are "safe" to use (won't produce incorrect bytecode). It limits the number of ways in which you can modify the module, but this could be a good thing. I spent a lot of time writing Wasm code modification in the first versions of lunatic and it's super easy to produce broken modules.

Even it's super hard to write correct plugins by modifying the Wasm bytecode, I feel like there are always going to be cases where we need it. One good example would be providing polyfills. For some time lunatic used reference types in host functions to manage resources, but many languages don't support reference types on the guest side (rust, c, ...). What we would do is check during loading of the module if there was a mismatch between signatures on the guest and host, if yes we would polyfill the guest with wrappers that save the reference types into a local table and return the index (i32) inside of the table to the guest code that knows how to work with i32 values. If we ever want to introduce host functions that use reference types or interface types again, we will probably need to provide some polyfills for languages that don't understand these "higher level" concepts.

3. Native plugins

This idea is quite simple, but I'm not sure how it would be implemented or if it would even make sense to have it. You would be able to add a native shared library as a plugin (e.g. lunatic --plugin dangerous_stuff.so/dylib/dll ...). Of course these plugins would be OS and CPU architecture specific, but they would give you the power to call any native functions directly from Wasm modules. You would simply use them like every other host function, just import them by name.

I assume that eventually someone is going to want to hand write some assembly, use a specific peace of hardware or just want to call a library that can't be compiled to WebAssembly yet. This would be the easiest way give them access to it. On the other hand this breaks all security for the current VM instance and all processes inside of it, as the native code can do anything it wants. Once we get distributed lunatic this can be worked around by running a node just for native plugins, that would not have access to the memory space of other instances of the VM, basically isolating any damage the native plugin could do just to a small set of selected processes.

Conclusion

These 3 points should cover almost all use cases. We just need to figure out what's the right balance of power and complexity we want to actually expose to plugin writers. How can we make plugins performant and a joy to write?

I would also love to hear feedback from the community, what do you think? Are there maybe alternative approaches that would be interesting?

One-directional linking of processes

Currently, we can spawn processes using spawn or inherit_spawn, with the option to link the process using a tag. Then, if either one of the processes traps, a Signal message is sent to all linked processes. Processes who receive this Signal message have the option to:

(standard) also trap and send a Signal message to all linked nodes.
put the message in it's own mailbox to deal with it in a different way using die_when_link_dies.

When creating supervising structures, I think the best approach is to have one-directional links, where if the child dies, it sends a Signal message to the parent, who deals with it by putting it in the mailbox. If the parent process dies, then a Signal message should be sent to the children, who should deal with it by trapping as well.

Right now this setup is possible by creating another abstraction layer on top of the runtime, which has all Signal messages put into the mailbox, and which traps/signals based on it's own information. However, a problem with this approach is that we cannot search the mailbox specifically for Signal messages (At least I don't know how). This creates a problem where if the mailbox is full, and a child panics, the parent might only know this after it has processed 100 messages that were in it's queue.

I can see 2 possible ways of solving this:

Create a function like receive_signal, which only returns signal messages from its mailbox.
Have a way of creating a directional link when using spawn, which traps if the parent process dies, but puts the Signal message in the mailbox if the child dies. This would probably require some bigger changes in the runtime code, since now we would have to differentiate between different signal messages, and handle them in a different way.

Probably the first way would make more sense, this would also give more room for the systems built on top of the runtime for when to stop themselves. They could for example do some cleanup of their state (ie backup to database), before trapping themselves.

(also, I think the name Signal is confusing. What kind of signal? A better name might be TrapSignal or something along those lines. I could imagine other types of signals being added in the future)

Networking API UDP peer_addr Missing

Whilst working on lunatic-solutions/lunatic-rs#17 at lunatic-solutions/lunatic-rs#33
I realised that we don't have peer_addr() in host when we connect() and whilst using recv() and send()
If I would cache it in UdpSocket via connect() that would make UdpSocket mutable on connect() which doesn't match std's

Break the `Environment` abstraction apart

The Environment abstraction was a way of unifying process permissions and settings for Wasmtime. For each Environment we create a specific wasmtime::Engine & compile wasmtime::Modules with it.

At this point the Environment is just doing too many things:

Defines JIT compiler options & memory allocation strategies
Defines compute and memory limits
Defines a list of allowed host functions
Defines a list of accessible folders, files, cli arguments and environment variables
Allows for spawning processes on other nodes

In the past some of these capabilities were highly coupled, like the definition of allowed host functions and memory limitations, and it made sense to group everything together. Later, I just continued adding all other configuration options under the Environment because it was "easy" and fitted in nicely with most of the other stuff. Now I would like to rethink this abstraction.

Permissions

Allowing users to enable/disable host functions during runtime was a really "cool" idea originally, but it didn't turn out to be that practical. Processes that are spawned from a module that imports a particular function can't be instantiated without this function being available. To be able to remove some host functions during runtime we need to create "fake" host functions that trap on every access.

It's also really hard to model permissions based on host functions (see #73). In many cases you need to allow a whole group of host functions for one functionality (message sending). This can't be completely solved with namespacing.

The probably most used Environment feature is the ability to spawn processes with a limited amount of memory. I also see the file access permissions used fairly often. That's why I believe we should provide only higher level permissions instead of messing with enabling/disabling host functions. Being able to say:

This process can't access the filesyste
This process can't use more than 10 Mb of memory
This process can only connect to this IP subnet (domain name)
This process can only receive messages

is way more useful in practice, and none of these features can be expressed through enabling/disabling host functions. They all need higher level abstractions, like string based configurations.

I'm not sure what the best design would be around such a pattern. Could we call it a ProcessConfig and pass it to the spawn function? The spawn_inherit function would, like it does now, just inherit the configuration of the parent?

Redirecting streams

There is another use case that would be interesting to support, but that doesn't work currently. It would be great if we could replace a process i/o stream during spawning. Like, spawn me this process but redirect all stdio output to this stream. And the parent would be able to capture the stream output from the child process.

This doesn't fit completely into the ProcessConfig abstraction, because the config can be used by many children, but the stream needs to be provided on a per child basis. I'm not sure how to model this in the API yet.

Engine settings

The Environment also tries to configure the Wasmtime JIT Engine based on the max memory limit. I would simplify this setting and just provide a global Engine that can be modified through cli arguments. Many lunatic characteristics are defined through the engine, like max process count (see #79). I think we should just pick reasonable defaults, instead of allowing users to use different engine settings per process (group).

Distributed lunatic

I think we should introduce a Node abstraction. Processes should be able to look up all available nodes and spawn other processes on them. The VM would transfer the module in the background to the node, jit compile it and spawn a process. At the moment I don't have a clear picture how it would fit in with the permission design.

Feedback appreciated

I just wanted to kick off the discussion with this issue and gather some feedback from the community. Maybe there are other interesting features that I didn't think of? What do you think is the ideal permissions API?

Auto-generating documentation for host functions

I expect most people to write lunatic applications in higher level languages and use the features through a library, like we provide for Rust and AssemblyScript. To develop these libraries or write some programs directly with wat you need to use the host function (syscalls) we expose. Currently you need to look at the source code to see what host functions are available, but I would like to automatically generate an API reference that we can host somewhere (e.g. docs.lunatic.solutions).

Host functions have special comments starting with //%. Here is an example of the add_process function:

//% lunatic::message::add_process(process_id: u64) -> u64
//%
//% Adds a process resource to the next message and returns the location in the array the process
//% was added to. This will remove the process handle from the current process' resources.
//%
//% Traps:
//% * If process ID doesn't exist
//% * If it's called before the next message is created.

A tool could automatically extract the comments from the source code and generate some JSON. We would use this JSON in combination with a static site generator (Gatsby, nextjs, ...) to generate a hierarchical reference documentation.

This would make it easier for developers to build a lunatic library for their favourite language.

The Discord invite has expired

Lunatic runtime as a library / Embeddable lunatic

I think it would be useful to be able to embed the lunatic runtime as a library. This would enable native applications to use lunatic for user plugins. Many APIs still only exist on native, but guaranteed cancelable, preemptable, sandboxed user plugins are a very very desirable feature.

I would expect the API to let native applications spawn processes (by passing in a buffer containing compiled wasm) while precisely specifying their capabilities. For ease of use it should be possible to open a channel between the native side and a process inside the runtime. For performance it would also be desirable for the native process to be able to write directly into memory viewable by runtime processes, provided the runtime processes have been given the capability.

On the native side being able to cancel or pause runtime processes would also be useful. It would be a great reliability feature for many applications if they were able to kill user plugins that aren't responding.

Opening preopened_dirs repeatedly with each process spawn exceeds maximum file descriptor limit

The unit test process::recursive_count in rust-lib fails on my laptop running macOS Catalina. The cause of is issue is that the current working directory is opened repeatedly with each process spawn. With the default maximum file descriptor limit of 256, the unit test with 1000 recursion calls exceeds the limit easily.

lunatic/crates/lunatic-wasi-api/src/lib.rs

Line 22 in 1ba1c75

 let preopen_dir = Dir::open_ambient_dir(preopen_dir_path, ambient_authority())?; 

Is there a reason why preopening CWD is necessary?

Continuous Benchmarks with Other Runtimes and Native Execution

As part of the CI infrastructure can you set up benchmarks with other runtimes and native executions in marjor languages which can be make available for users to track.

How to lunch the app ?

Hello,

I try to follow the docs to try lunatic (https://crates.io/crates/lunatic)

here the code I'm working on
https://github.com/parweb/test-lunatic/

I succeed to build the app after adding extern crate lunatic;
cargo build --release --target=wasm32-wasi

But where I find the binary lunatic ? lunatic target/wasm32-wasi/release/<name>.wasm

Multiple definition linker error on fresh install

Just did a fresh install (literally installed rustup and did cargo install lunatic-runtime) and got this error:

error: linking with `cc` failed: exit status: 1
  |
  = note: "cc" "-m64" "/tmp/rustcrGJOdl/symbols.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.0.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.1.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.10.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.11.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.12.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.13.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.14.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.15.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.2.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.3.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.4.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.5.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.6.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.7.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.8.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.lunatic.c3e7b227-cgu.9.rcgu.o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9.3r91if9yi76wrqn7.rcgu.o" "-Wl,--as-needed" "-L" "/home/lotus/./.cargo/target/release/deps" "-L" "/home/lotus/./.cargo/target/release/build/psm-1a9a392a57de01c4/out" "-L" "/home/lotus/./.cargo/target/release/build/zstd-sys-c711d78fdd601bbc/out" "-L" "/home/lotus/./.cargo/target/release/build/wasmtime-fiber-c08d6472d90b0d6d/out" "-L" "/home/lotus/./.cargo/target/release/build/ittapi-rs-5ca8f1b2c40f97d0/out" "-L" "/home/lotus/./.cargo/target/release/build/wasmtime-runtime-de56bef3e6c7ff2d/out" "-L" "/home/lotus/./.cargo/target/release/build/wasmtime-runtime-fbf7fdfb213f8248/out" "-L" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-Wl,-Bstatic" "/home/lotus/.cargo/target/release/deps/liblunatic_runtime-03b482a8a341c5d9.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_registry_api-1e61444b6b80ea92.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_version_api-8972a81873d5fb9e.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_messaging_api-8973855eb3f874fb.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_networking_api-db9986b2d19fd9a3.rlib" "/home/lotus/.cargo/target/release/deps/libasync_net-3559d62c20c0facd.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_process_api-3893312dfdbc7e68.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_wasi_api-be35f836107a8a56.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_wasi-8d3da9bc7099c9ba.rlib" "/home/lotus/.cargo/target/release/deps/libwasi_cap_std_sync-ccdd6ce5912e8f6c.rlib" "/home/lotus/.cargo/target/release/deps/libis_terminal-cdefa316b034f14c.rlib" "/home/lotus/.cargo/target/release/deps/libsystem_interface-064a28ad54ff9fc7.rlib" "/home/lotus/.cargo/target/release/deps/libcap_fs_ext-dd2a504a088f818c.rlib" "/home/lotus/.cargo/target/release/deps/libcap_time_ext-30a27455ee86913c.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_stdout_capture-9ca1092da368e24f.rlib" "/home/lotus/.cargo/target/release/deps/libwiggle-81d4626939e1a453.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime-033597db0ad5bdbc.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_jit-8bdd7f88efe79e9b.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_runtime-fe3c7d4426ae828c.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_jit_debug-7bc2b3b11f933251.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_environ-e48dd5dc617c9bf7.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_types-46c1019201b1c79a.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_entity-ef68ccc872e33696.rlib" "/home/lotus/.cargo/target/release/deps/libwitx-e56c979ba2606b44.rlib" "/home/lotus/.cargo/target/release/deps/libwast-3eea21d6eaf71c6d.rlib" "/home/lotus/.cargo/target/release/deps/libwasi_common-c0a6a1fcd14e68bf.rlib" "/home/lotus/.cargo/target/release/deps/libwiggle-87543ed5a5e08537.rlib" "/home/lotus/.cargo/target/release/deps/libtracing-ed2c73608e6bf01b.rlib" "/home/lotus/.cargo/target/release/deps/libtracing_core-1b7fe123d081d012.rlib" "/home/lotus/.cargo/target/release/deps/libcap_rand-bb7a4e218386760d.rlib" "/home/lotus/.cargo/target/release/deps/libcap_std-02d92acada068d8e.rlib" "/home/lotus/.cargo/target/release/deps/libcap_primitives-b530c7fae98fea43.rlib" "/home/lotus/.cargo/target/release/deps/libipnet-806dc9ccd737a847.rlib" "/home/lotus/.cargo/target/release/deps/libmaybe_owned-0f11d887b12748a7.rlib" "/home/lotus/.cargo/target/release/deps/libfs_set_times-f77cd1885ae19e8a.rlib" "/home/lotus/.cargo/target/release/deps/libio_extras-ee128db75c7adde4.rlib" "/home/lotus/.cargo/target/release/deps/libambient_authority-d4bdae4a8ed2754e.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_error_api-934a9b0973dc971d.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_common_api-314691bd7bfa9905.rlib" "/home/lotus/.cargo/target/release/deps/liblunatic_process-a2cc9825c1de9f34.rlib" "/home/lotus/.cargo/target/release/deps/libtokio-92272a32569d64b3.rlib" "/home/lotus/.cargo/target/release/deps/libuuid-708e2e8467e131af.rlib" "/home/lotus/.cargo/target/release/deps/libhash_map_id-554c77a4ecdc0aa7.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime-ea0c1857fe63eea1.rlib" "/home/lotus/.cargo/target/release/deps/libwat-767ed27e4d0f39bb.rlib" "/home/lotus/.cargo/target/release/deps/libwast-666682f174472117.rlib" "/home/lotus/.cargo/target/release/deps/libwasm_encoder-416e4cc50ed2759e.rlib" "/home/lotus/.cargo/target/release/deps/libleb128-ae0d8a529842c665.rlib" "/home/lotus/.cargo/target/release/deps/libunicode_width-c420ccdeb5fd199d.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_cranelift-16d81e84c82b06bb.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_native-39961aefe356fa20.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_wasm-87b18b72bcf23d29.rlib" "/home/lotus/.cargo/target/release/deps/libitertools-f2ad876a50acb2d0.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_frontend-d4b286c7d0529133.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_codegen-8c99c7223f1061ec.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_codegen_shared-35daceea25e73e8e.rlib" "/home/lotus/.cargo/target/release/deps/libregalloc-0e9b321300b6cf62.rlib" "/home/lotus/.cargo/target/release/deps/librustc_hash-25b604ccd7f235a3.rlib" "/home/lotus/.cargo/target/release/deps/libsmallvec-7914fba9a73e7b81.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_bforest-3c7bebec176e3697.rlib" "/home/lotus/.cargo/target/release/deps/libpsm-c1850cb83d862cfd.rlib" "/home/lotus/.cargo/target/release/deps/librayon-477b931e8bf78a02.rlib" "/home/lotus/.cargo/target/release/deps/librayon_core-72994d7e71bf287b.rlib" "/home/lotus/.cargo/target/release/deps/libcrossbeam_deque-7d1ca271d1deba55.rlib" "/home/lotus/.cargo/target/release/deps/libcrossbeam_epoch-10c58bc96126443c.rlib" "/home/lotus/.cargo/target/release/deps/libscopeguard-91dcd07744d1313e.rlib" "/home/lotus/.cargo/target/release/deps/libcrossbeam_channel-2a3b363da691a57a.rlib" "/home/lotus/.cargo/target/release/deps/libeither-5ca4aa86e514235d.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_jit-96ca4fa525d86e92.rlib" "/home/lotus/.cargo/target/release/deps/libcpp_demangle-278605d5690f22e4.rlib" "/home/lotus/.cargo/target/release/deps/libittapi_rs-c00ba84c2d5000db.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_cache-50e88e8d11c2bed2.rlib" "/home/lotus/.cargo/target/release/deps/libbase64-fcaf4d86daae2873.rlib" "/home/lotus/.cargo/target/release/deps/libbincode-f2d2a4b3fd7631b4.rlib" "/home/lotus/.cargo/target/release/deps/libfile_per_thread_logger-687aae944fad5ab8.rlib" "/home/lotus/.cargo/target/release/deps/libenv_logger-15c42bed0f789174.rlib" "/home/lotus/.cargo/target/release/deps/libhumantime-aad0b24b99ec8652.rlib" "/home/lotus/.cargo/target/release/deps/libregex-4cca14c248349eff.rlib" "/home/lotus/.cargo/target/release/deps/libaho_corasick-c0eef6fa9f7df825.rlib" "/home/lotus/.cargo/target/release/deps/libregex_syntax-9d6db44b9c0cec5a.rlib" "/home/lotus/.cargo/target/release/deps/libtoml-10c4aa57eae7f6b1.rlib" "/home/lotus/.cargo/target/release/deps/libzstd-83d7d508ba173572.rlib" "/home/lotus/.cargo/target/release/deps/libzstd_safe-acfa6dc221366795.rlib" "/home/lotus/.cargo/target/release/deps/libzstd_sys-045b632b830e0364.rlib" "/home/lotus/.cargo/target/release/deps/libdirectories_next-507ee0ed3eccafc8.rlib" "/home/lotus/.cargo/target/release/deps/libdirs_sys_next-6fe71a06e3d6e078.rlib" "/home/lotus/.cargo/target/release/deps/libsha2-062044c1a7e84a19.rlib" "/home/lotus/.cargo/target/release/deps/libcpufeatures-8fe634dad9fe1a55.rlib" "/home/lotus/.cargo/target/release/deps/libopaque_debug-30bca6372ba015ab.rlib" "/home/lotus/.cargo/target/release/deps/libdigest-fc38524b5b316935.rlib" "/home/lotus/.cargo/target/release/deps/libblock_buffer-24e527d6b3ba59f3.rlib" "/home/lotus/.cargo/target/release/deps/libgeneric_array-9c741897873636cd.rlib" "/home/lotus/.cargo/target/release/deps/libtypenum-7c26863d84f6e6dc.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_runtime-9de67d4b5e08a324.rlib" "/home/lotus/.cargo/target/release/deps/libmemfd-fc79cb4eca6dc178.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_fiber-8884a1857f825d91.rlib" "/home/lotus/.cargo/target/release/deps/libregion-291310fb14dd551a.rlib" "/home/lotus/.cargo/target/release/deps/libbacktrace-9b566d370f48425e.rlib" "/home/lotus/.cargo/target/release/deps/libminiz_oxide-05774ee8e8228cf8.rlib" "/home/lotus/.cargo/target/release/deps/libadler-93a3aafa08255620.rlib" "/home/lotus/.cargo/target/release/deps/libobject-83123cd3489d21cf.rlib" "/home/lotus/.cargo/target/release/deps/libaddr2line-0476ba4c0dd5ab5f.rlib" "/home/lotus/.cargo/target/release/deps/librustc_demangle-f1c891421c8c36f9.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_jit_debug-5fc271d9335edbd9.rlib" "/home/lotus/.cargo/target/release/deps/librustix-59867dbcc714b349.rlib" "/home/lotus/.cargo/target/release/deps/libitoa-f5762eaa3bbe4bb3.rlib" "/home/lotus/.cargo/target/release/deps/libio_lifetimes-e8316e8e64156a97.rlib" "/home/lotus/.cargo/target/release/deps/liblinux_raw_sys-0dbf178634e06305.rlib" "/home/lotus/.cargo/target/release/deps/liblazy_static-53863b77cbb23189.rlib" "/home/lotus/.cargo/target/release/deps/librand-6036e47689e64144.rlib" "/home/lotus/.cargo/target/release/deps/librand_chacha-db67fd8528e54369.rlib" "/home/lotus/.cargo/target/release/deps/libppv_lite86-8e5c7f9497a3dd4b.rlib" "/home/lotus/.cargo/target/release/deps/librand_core-52d8c705db2cff2c.rlib" "/home/lotus/.cargo/target/release/deps/libgetrandom-3c2ca12d65a651c6.rlib" "/home/lotus/.cargo/target/release/deps/libmemoffset-b6667f27ce16838c.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_environ-78b0c86c77eb3957.rlib" "/home/lotus/.cargo/target/release/deps/libgimli-8435c1db7225820f.rlib" "/home/lotus/.cargo/target/release/deps/libfallible_iterator-702ae43ad81c13a3.rlib" "/home/lotus/.cargo/target/release/deps/libstable_deref_trait-b556d4b03bdd9327.rlib" "/home/lotus/.cargo/target/release/deps/libmore_asserts-d0ccdea0f8ecf3d1.rlib" "/home/lotus/.cargo/target/release/deps/libwasmtime_types-3fece135773c39c7.rlib" "/home/lotus/.cargo/target/release/deps/libwasmparser-e23c81319f7ad0b5.rlib" "/home/lotus/.cargo/target/release/deps/libcranelift_entity-f63a4ae4fb84ace4.rlib" "/home/lotus/.cargo/target/release/deps/libtarget_lexicon-5f61c9ae748e07bb.rlib" "/home/lotus/.cargo/target/release/deps/libthiserror-3d2247a492ac88e2.rlib" "/home/lotus/.cargo/target/release/deps/libobject-59b66ecf5143d8bb.rlib" "/home/lotus/.cargo/target/release/deps/libcrc32fast-d3deb61882f3577f.rlib" "/home/lotus/.cargo/target/release/deps/libasync_std-ae48b8945ad433d2.rlib" "/home/lotus/.cargo/target/release/deps/libasync_global_executor-92457259c5f84258.rlib" "/home/lotus/.cargo/target/release/deps/libblocking-ef890de4d4f992eb.rlib" "/home/lotus/.cargo/target/release/deps/libatomic_waker-a2c9e3e5190b8f84.rlib" "/home/lotus/.cargo/target/release/deps/libasync_executor-6652838247d0912d.rlib" "/home/lotus/.cargo/target/release/deps/libasync_task-f99b00f7841c0ef2.rlib" "/home/lotus/.cargo/target/release/deps/libcrossbeam_utils-0d819927e6db9495.rlib" "/home/lotus/.cargo/target/release/deps/libasync_process-522042b57ee93762.rlib" "/home/lotus/.cargo/target/release/deps/libsignal_hook-d0d0aedc771ceed5.rlib" "/home/lotus/.cargo/target/release/deps/libsignal_hook_registry-9f20c86a3948a02d.rlib" "/home/lotus/.cargo/target/release/deps/libasync_io-2f26ec70ac164425.rlib" "/home/lotus/.cargo/target/release/deps/libslab-3b80d8f761e90519.rlib" "/home/lotus/.cargo/target/release/deps/libpolling-f47aa9b5e6d6cf56.rlib" "/home/lotus/.cargo/target/release/deps/libsocket2-9152082a97e45544.rlib" "/home/lotus/.cargo/target/release/deps/libfutures_lite-93e2c7d0e72bf6b9.rlib" "/home/lotus/.cargo/target/release/deps/libmemchr-82ac28409f03df8b.rlib" "/home/lotus/.cargo/target/release/deps/libfastrand-deacf0164f3f7777.rlib" "/home/lotus/.cargo/target/release/deps/libwaker_fn-c20a05be103f4ee6.rlib" "/home/lotus/.cargo/target/release/deps/libparking-aaad4e9e88c721ec.rlib" "/home/lotus/.cargo/target/release/deps/libfutures_io-52ff59f513108bf0.rlib" "/home/lotus/.cargo/target/release/deps/libasync_channel-8523c3d5861ae4ce.rlib" "/home/lotus/.cargo/target/release/deps/libconcurrent_queue-2e508a8cd80809a1.rlib" "/home/lotus/.cargo/target/release/deps/libcache_padded-6458f3a9c373fa67.rlib" "/home/lotus/.cargo/target/release/deps/libasync_lock-8f2c8635bce6ece5.rlib" "/home/lotus/.cargo/target/release/deps/libevent_listener-46bb95ba4f92548a.rlib" "/home/lotus/.cargo/target/release/deps/libpin_project_lite-fde9e586f72c7c9b.rlib" "/home/lotus/.cargo/target/release/deps/libpin_utils-d6f018c56e01fe8c.rlib" "/home/lotus/.cargo/target/release/deps/libfutures_core-cc3f95c48f424582.rlib" "/home/lotus/.cargo/target/release/deps/libkv_log_macro-6221e6cfd40dc6bf.rlib" "/home/lotus/.cargo/target/release/deps/liblog-c11df26e7280ff73.rlib" "/home/lotus/.cargo/target/release/deps/libvalue_bag-b00a6f9af3abe045.rlib" "/home/lotus/.cargo/target/release/deps/libdashmap-ff8a4fa275f3a281.rlib" "/home/lotus/.cargo/target/release/deps/libnum_cpus-aea3dd5006fc8173.rlib" "/home/lotus/.cargo/target/release/deps/libcfg_if-ae9304d0b4aa9b61.rlib" "/home/lotus/.cargo/target/release/deps/libclap-baf9e6a6210992bb.rlib" "/home/lotus/.cargo/target/release/deps/libatty-ae5489d74ff09fd2.rlib" "/home/lotus/.cargo/target/release/deps/liblibc-6fce902828354ed3.rlib" "/home/lotus/.cargo/target/release/deps/libstrsim-1690bcaa0e46981f.rlib" "/home/lotus/.cargo/target/release/deps/libtermcolor-1e3f58e07a9169f7.rlib" "/home/lotus/.cargo/target/release/deps/libtextwrap-ee4bb8f572f17702.rlib" "/home/lotus/.cargo/target/release/deps/libclap_lex-69bd0881b1817e9c.rlib" "/home/lotus/.cargo/target/release/deps/libos_str_bytes-606bef0f72d16215.rlib" "/home/lotus/.cargo/target/release/deps/libindexmap-1a8bddde9a1abfd0.rlib" "/home/lotus/.cargo/target/release/deps/libhashbrown-a583d5d85e228ff2.rlib" "/home/lotus/.cargo/target/release/deps/libserde-e2b53d50a972090e.rlib" "/home/lotus/.cargo/target/release/deps/libbitflags-d4b6a1deff53cdce.rlib" "/home/lotus/.cargo/target/release/deps/libonce_cell-c3b9943d0c9975b4.rlib" "/home/lotus/.cargo/target/release/deps/libanyhow-4a2af6bbff1dbb5b.rlib" "-Wl,--start-group" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-7ca39ac42651c3df.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-62c6d032818141a1.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libobject-50484fc03eb1eb5b.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libmemchr-758be083b246d9c6.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libaddr2line-3cdf9a3c68f76e2d.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libgimli-3a1b74821c25a0e1.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_demangle-e046d82ebd84bb7f.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd_detect-a61cdd33cfa8394f.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libhashbrown-410c38f8df854235.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libminiz_oxide-f79d7458e122215f.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libadler-7d24b750ce5b22e8.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_alloc-4c2aa1ea3133ab73.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-41220dc85a7f114f.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcfg_if-230d004276c898f9.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-03ae30169a5438be.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-6f82c44b7818af35.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_std_workspace_core-1a0b7681f7efa789.rlib" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-83735dd4dae9b02c.rlib" "-Wl,--end-group" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-385029872275478f.rlib" "-Wl,-Bdynamic" "-lgcc_s" "-lutil" "-lrt" "-lpthread" "-lm" "-ldl" "-lc" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-L" "/opt/rust/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib" "-o" "/home/lotus/./.cargo/target/release/deps/lunatic-c7112cfc902fd5e9" "-Wl,--gc-sections" "-pie" "-Wl,-zrelro,-znow" "-Wl,-O1" "-nodefaultlibs"
  = note: /usr/bin/ld: /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-9de67d4b5e08a324.rlib(wasmtime_runtime-9de67d4b5e08a324.wasmtime_runtime.1178b907-cgu.4.rcgu.o): in function `resolve_vmctx_memory':
          wasmtime_runtime.1178b907-cgu.4:(.text.resolve_vmctx_memory+0x0): multiple definition of `resolve_vmctx_memory'; /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-fe3c7d4426ae828c.rlib(wasmtime_runtime-fe3c7d4426ae828c.wasmtime_runtime.2863bd74-cgu.11.rcgu.o):wasmtime_runtime.2863bd74-cgu.11:(.text.resolve_vmctx_memory+0x0): first defined here
          /usr/bin/ld: /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-9de67d4b5e08a324.rlib(wasmtime_runtime-9de67d4b5e08a324.wasmtime_runtime.1178b907-cgu.4.rcgu.o): in function `resolve_vmctx_memory_ptr':
          wasmtime_runtime.1178b907-cgu.4:(.text.resolve_vmctx_memory_ptr+0x0): multiple definition of `resolve_vmctx_memory_ptr'; /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-fe3c7d4426ae828c.rlib(wasmtime_runtime-fe3c7d4426ae828c.wasmtime_runtime.2863bd74-cgu.11.rcgu.o):wasmtime_runtime.2863bd74-cgu.11:(.text.resolve_vmctx_memory_ptr+0x0): first defined here
          /usr/bin/ld: /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-9de67d4b5e08a324.rlib(wasmtime_runtime-9de67d4b5e08a324.wasmtime_runtime.1178b907-cgu.4.rcgu.o): in function `set_vmctx_memory':
          wasmtime_runtime.1178b907-cgu.4:(.text.set_vmctx_memory+0x0): multiple definition of `set_vmctx_memory'; /home/lotus/.cargo/target/release/deps/libwasmtime_runtime-fe3c7d4426ae828c.rlib(wasmtime_runtime-fe3c7d4426ae828c.wasmtime_runtime.2863bd74-cgu.11.rcgu.o):wasmtime_runtime.2863bd74-cgu.11:(.text.set_vmctx_memory+0x0): first defined here
          collect2: error: ld returned 1 exit status
          

error: could not compile `lunatic-runtime` due to previous error
error: failed to compile `lunatic-runtime v0.9.0`, intermediate artifacts can be found at `/home/lotus/./.cargo/target`

Running on a clean Pop OS:

❯ lsb_release -a && uname -a
No LSB modules are available.
Distributor ID:	Pop
Description:	Pop!_OS 22.04 LTS
Release:	22.04
Codename:	jammy
Linux pop-os 5.17.15-76051715-generic #202206141358~1655919116~22.04~1db9e34 SMP PREEMPT Wed Jun 22 19 x86_64 x86_64 x86_64 GNU/Linux

I haven't used wasmtime before lunatic and it does a great job at keeping it hidden away from me so I don't know how to even try to debug this.

Improve process spawning performance

Lunatic encourages program architectures where it's common to spawn many short lived processes (e.g. a process per HTTP request). For this to work the process spawning overhead needs to stay low. I ran some benchmarks on my MacBook:

With the Wasmtime backend:

wasmtime instance creation                                                                             
                        time:   [26.164 us 26.284 us 26.424 us]
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe

lunatic instance creation                                                                            
                        time:   [321.65 us 323.69 us 326.78 us]
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

With the Wasmer backend:

wasmer instance creation                                                                             
                        time:   [23.603 us 23.727 us 23.863 us]
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

lunatic instance creation                                                                            
                        time:   [216.54 us 217.95 us 219.62 us]
                        change: [-32.116% -30.953% -29.620%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

Ideally we want the instance creation time to be in the single digit micro second range, matching Erlang. There are many improvements we can do to get there.

As we keep adding features the instance creation time has been getting worse, mostly because every time we add a new host function it will increase the Linker creation time. The good news is that with a recent addition to Wasmtime it's possible to define "global" host functions so we can completely skip the step of adding all host functions to the instance linker each time we spawn a process.

Another recent Wasmtime addition allows us to reuse and pool resources. We could create a pool of other resources too, like AsyncWormhole stacks. As the Wasm code can't observe the "real" stack it would be even safe to reuse it between instances without clearing it first.

Even both of this optimisations are Wasmtime specific, I believe that Wasmer is going to add similar functionality in the future. I will open separate issues for both of this approaches and keep this as a tracking issue for further ideas and discussions around spawning performance.

Http bindings?

A large portion of the AssemblyScript community could really use a way to parse http, and making bindings for something battle tested and safe would be beneficial to help aid in adoption of lunatic by users of AssemblyScript. This might affect other languages like grain which probably don't have http libraries to choose from.

These bindings can also be hidden behind a flag so that binaries remain small for rust developers.

Stdout/Stderr not "Process-safe"

Currently it seems like there is no locking of stdout and stderr between processes, which leads to intermingled output when printing from different processes.

This might of course also be related to the WASI implementation in wasmtime.

Also, this is not necessarily something that has to be fixed, I'm sure there are performance tradeoffs here.

I also noticed that trying to lock stdout, stderr handles results in nothing being printed. Which again is probably a limitation of the Rust wasi build and/or wasmtime.

Lunatic for client side?

Hi,

I don't know if it make much sense but I would like to know if, in theory, Lunatic could be used for client application (Web, Desktop, mobiles,...)

The advantage I see is that Lunatic would allow to reuse some code across platform (since it would be compiled to WASM) and also use an embeddable preemptive actor model (which is a thing that isn't easy to get) which be a nicer way to structure your app.

The potential issues I see :

Being able to call "platform" code for side-effects and be called back too.
Have some way to execute code in a specific thread (for example run UI code in the UI thread)

Yet, those don't seem impossible given enough time and effort. I would even say that it might become a nice way to write client side applications but I might miss some parts of the picture.

Anyway, did you think about Lunatic usage on the client side? Do you think it make some sense? Is it something you would like to explore at some point?

WASI payload working in Wasmtime, not working in Lunatic

Hello, I've tried to run a demo WASI application build using F# & .NET 7.0 using https://github.com/SteveSandersonMS/dotnet-wasi-sdk on Lunatic and I'm getting the following error:

$ WASMTIME_BACKTRACE_DETAILS=1 RUST_BACKTRACE=1 lunatic bin/Release/net7.0/FullstackWasmFSharpAppBackend.wasm -- --tcplisten localhost:8080 --env ASPNETCORE_URLS=http://localhost:8080

Unhandled Exception:
System.EntryPointNotFoundException: SystemNative_ReadLink
   at Interop.Sys.ReadLink(Byte& path, Byte[] buffer, Int32 bufferSize)
   at Interop.Sys.ReadLink(ReadOnlySpan`1 path)
   .....
      at Program.main(String[] args)
   [2022-06-10T21:25:40Z WARN  lunatic_process] Process 5becaa45-c926-4a8e-918b-523ef9c72714 failed, notifying: 0 links 
                            (Set ENV variable `RUST_LOG=lunatic=debug` to show stacktrace)
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Exited with i32 exit status 1
wasm backtrace:
    0: 0x35b5 - SystemNative_MAdvise
                    at /home/runner/work/dotnet-wasi-sdk/dotnet-wasi-sdk/modules/runtime/src/native/libs/System.Native/pal_io.c:978:37
', src/mode/execution.rs:104:16
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt

The application in question is here: https://github.com/delneg/FullstackWasmFSharpApp

Looking at src/native/libs/System.Native/pal_io.c:978:37 from dotnet-wasi-sdk (https://github.com/dotnet/runtime/blob/cdce8167ad107c385189b3b85fd4fc22379c4f3f/src/native/libs/System.Native/pal_io.c#L978) I don't see anything related to the above error SystemNative_ReadLink which is actually located here:
https://github.com/dotnet/runtime/blob/82a2562fc9ee0986ee20ee309ad0bc259c561683/src/native/libs/System.Native/pal_io.h#L665

From what I understand, somehow function pointers are not working as they should in this case.
If this issue is not related to lunatic, feel free to close.

Add a (unified) API for timeouts.

Timeouts are very useful!

It would be great to have an API which makes it possible to conduct operations with a timeout.

Slow process spawning, can't parallelize

I've been looking into limits of spawning processes using lunatic and I'm running into two issues. One thing is that spawning processes seem to be rather slow. According to my tests it's about 5000 spawns per second (on not the strongest CPU, but a decent one i7-9750H CPU @ 2.60GHz).

What's worse, though, is that I can't seem to parallelize spawning processes. I created a test application which spawns a number of tasks in a loop and each of those tasks spawns processes as fast as it can. I also have a separate counter and display processes, which calculate the number of spawns per second.

I was testing using lunatic from the main branch (944935e857e53dc0278cbb4f16bb8a50f097d460 at the moment) using lunatic 0.7.1 Rust lib. It may very well be my mistake as I'm not very familiar with the runtime yet, so if you have any questions or suggestions please let me know.

Hardened wasi clock_time_get and clock_time_res

This is a stub issue regarding Spectre and timing attacks raised here #26 (comment)

@jtenner rised an issue that clock_time_get and clock_time_res might be succeptible to https://en.wikipedia.org/wiki/Spectre_(security_vulnerability)

@jocaml advised that we might add a parameter to specify max resolution

Some points from Spectre wiki:

As of Firefox 57.0.4, Mozilla was reducing the resolution of JavaScript timers to help prevent timing attacks, with additional work on time-fuzzing techniques planned for future releases.
The precision of performance.now() has been reduced from 5μs to 20μs, and the SharedArrayBuffer feature has been disabled because it can be used to construct a high-resolution timer. (https://www.mozilla.org/en-US/security/advisories/mfsa2018-01/)
https://blog.mozilla.org/security/2018/01/03/mitigations-landing-new-class-timing-attack/

Add support for custom test harness

It would be nice to run tests with custom test runners (e.g. nextest), to do which Lunatic would need to support the following.

This is admittedly a low-priority thing but if I can find time on the weekend, I'll try to submit a patch.

tcp_read return values inconsistent

lunatic/src/api/networking.rs

Line 709 in 944935e

) -> Box<dyn Future<Output = Result<u32, Trap>> + Send + '_> {

Per @bkolobara 's request, I have opened this issue to document a small inconsistency in the way socket reads occur.

Per our discussion, it should probably return

Ok -> 0
Err -> 1
Ok with no bytes read (socket closed normally) -> 0
Timeout -> 9027

Thanks!

Add a testing framework to the vm

After trying to find a nice testing solution for code running on lunatic (see lunatic-solutions/lunatic-rs#8) over the last few months, I concluded that it is probably impossible to do so without direct support from the vm.

There is an issue tracking custom testing frameworks for Rust, but it didn't see almost any progress in the last 5 years and I don't see this changing soon. Other solutions relay on libraries like inventory that require specific life-before-main solutions that don't work in Wasm code. This basically leaves us with only one way forward, provide a testing mode inside of the vm and decorate Rust guest code with a custom macro (#[lunatic::test]) that exposes tests to the vm.

I would at the same time use this opportunity to introduce sub-commands in the CLI. E.g lunatic run file.wasm, lunatic test file.wasm , and maybe some diagnostics in the future lunatic inspect file.wasm . If you run lunatic run file.wasm the vm would behave like it did until now. Load the wasm file, look for a _start function, spawn a process from it and wait until the process finishes.

The new testing mode would behave differently, it would look up all exported functions starting with __lunatic_test_ and spawn a process from them. If the process finishes without trapping the test would be marked as OK, otherwise as FAIL. The testing mode would expose a few additional host functions to allow the tested process to name the test and change the configuration (should_trap=true?). This would automatically parallelise all test execution and let the schedule auto balance any waiting. This would also just be the starting point (unit tests), later we can extend the functionality to cover more complicated testing scenarios.

Maybe we should also provide some heuristics for automatically detecting the testing mode? Rust guest code can configure only one runner (e.g. lunatic run) that is also used when you run cargo test. It could be useful for the lunatic command to default to run mode, except if it detects some environment variables that indicate tests running. This would provide a seamless integration with cargo test & cargo run for Rust's use case.

I'm wonderring how other languages, like AssemblyScript could integrate with this? From what I have seen, AssemblyScript doesn't have a standardised test runner. Having a lunatic custom one wouldn't impact the developer experience? @jtenner

WASI `proc_exit(0)` calls should not be treated as failures

If the WASI proc_exit(u32) function is called inside a process, it will stop immediately execution. In lunatic this is always assumed to be a crash, no matter with what exit code the function was called. This means that all linked processes will be notified of the crash.

I would like to change this behaviour to allow you to exit a process early with the error code 0 indicating success.

The issue is that proc_exit(u32) is implemented as a trap (wiggle::Trap::I32Exit) so be able to immediately abort execution. Every trap in lunatic is handled as a failure here: https://github.com/lunatic-solutions/lunatic/blob/main/src/process.rs#L170

Forbid processes to allow namespaces that they doesn't have access to themselves

Currently, if you create an environment from a process that has the permission to call lunatic::process::allow_namespace it can give greater permissions to sub-processes spawned into the new environment than it has itself.

The lunatic::process::allow_namespace host function should first check if we have permissions for the namespace before allowing us to add it to other environments. Similar to how lunatic::process::preopen_dir works, we can't allow an environment to pre-open directories if we don't have access to them ourself.

It should be a much safer default than relaying on the developer to revoke access to lunatic::process::allow_namespace.

I would also use the opportunity to change how namespaces are defined. Currently they work on a prefix basis, e.g. if you allow the namespace lunatic::process:: all host functions underneath are available. However, an empty string "" is a valid prefix for all namespaces. This means that if you call lunatic::process::allow_namespace with an empty string, you allow all namespace. I also think that this is a confusing default behaviour, developers may assume that they forbid all host functions with an empty namespace. I would change this to using an asterisk (*) to mean match all sub namespaces (e.g. lunatic::process::*). If there is no asterisk lunatic should assume it must be an exact match. In this case * would mean match all namespaces. Developers should be more familiar with using * as a wildcard and it should better communicate intend in this situation.

Provide a means for obtaining a snapshot of the system

One very useful technique for debugging is to be able to take a complete snapshot of the system.

An outline of some possible algorithms is given in https://mk.cs.msu.ru/images/5/52/Lecture-DA-11.pdf

The process state should be generic

Currently, each process contains a state (struct ProcessState) that holds resources belonging to the process (file descriptors, tcp connections, etc.). This design doesn't allow embedders of lunatic-runtime to provide their own host functions. They can't extend the ProcessState.

The ProcessState should just implement different traits (e.g. WasiCtx, NetworkingCtx, etc.) that expose the underlaying resources bound to the state. If a process state implements the right traits, it can use host functions that depend on those resources.

Some of this work has already started with the following PR: #96. The code working with ProcessState still needs to be generalized to take any struct instead. ProcessState could become a trait that requires you to implement a function register, that registers custom or provided host functions. And we could introduce a new struct (DefaultProcessState?) that takes over the role of the current ProcessState.

Implement a discrete event simulator

As described in https://sled.rs/simulation.html.

unresolved import `lunatic::net`

Hi,

I was trying out the networking example from this blog post and encountered this error:

error[E0432]: unresolved import `lunatic::net`
 --> src/main.rs:1:24
  |
1 | use lunatic::{Process, net}; // Once WASI gets networking support you will be able to use Rust's `std::net::TcpStream` instead.
  |                        ^^^ no `net` in the root

Cargo.toml:

[dependencies]
lunatic = "0.2.0"

Building with cargo build --release --target=wasm32-wasi

Thanks,
Arne

Roadmap?

Hi team Lunatic!

This is a really exciting concept and I'll be keeping a very close eye on this project as it shapes up.

A few ideas I think would be amazing:

ability to "auto-cluster" e.g drop binary on multiple servers with a supplied key and nodes form a cluster (inside a private network)
work loads seamlessly run across cluster
networking routing and load balancing automatically across cluster, same application can be reached from any node
some kind of simple web UI to manage applications: usual Add, Edit (manage env vars per app?), remove apps, scale apps etc
simple CLI as alternative to web UI

Remove timeout support on networking read/writes

Tcp stream's read & write and udp's send & receive functions take a timeout argument. If it's greater than 0, the function will only block for the given amount in ms before returning a timeout code. I would like to remove this functionality.

As someone who has implemented some of these functions, I believe it's almost impossible to reason about code using them. All of the listed functions will return 1 or more bytes and it's hard to express behaviour like "wait max 5 sec on the next 1 Mb". You need to create a state machine and keep counting and reducing time yourself. The complexity of using them correctly is just too high. Has anyone ever successfully used them for anything?

It makes the implementation also way more complicated than it needs to be. We need to use a macro that fuses futures together, set a timer in the kernel on each read/write call, make sure all the futures are cancel safe, etc. It makes the implementation really hard to reason about without much of a benefit. I also believe that it would lead to a high performance hit if the timeout argument was used in practice, but I never benchmarked it.

The most common scenario I assume is waiting for some structured data (e.g. HTTP request) and giving up if it doesn't arrive on time. And this can't be easily expressed with the current timeout functionality.

I propose instead to handle such cases by spawning a process, collecting all the data from the TcpStream until a valid request is formed and send it back to the parent. The message receive function of the parent would still keep the timeout parameter. One of lunatic's goals is to make process spawning extremely performant, so that developers can use it without fear of a performance hit. I think this approach aligns better with lunatic's design in general.

Add `kill` host API

Processes should be able to send a kill signal to others.

what is the memory cost for Lunatic's super lightweight processes

how many kb / bytes per process?

Switch to tokio.rs from async_std

I would like to propose switching back to tokio.rs as lunatic's async executor.

Many projects are moving away from async_std (launchbadge/sqlx#1669) and some "high profile" rust community members are advocating for everyone to just adopt tokio. There are many other good reasons to switch to tokio, like more active development and bigger community & library support.

We are already using tokio's select! macro and have it as a dependency. The only reason why I decided to switch to async_std is the fact that async_std's TcpStream is Clone. This allows us to expose a simple primitive to guest code that can be shared between processes. And having one process read from the TcpStream and another write to it is a common pattern in lunatic.

As I hinted already in this comment, we could introduce a split function and ReadTcpStream & WriteTcpStream primitive to work around this. This would also provide us with more security, because you can't accidentally write to WriteTcpStream from 2 processes (exclusively owned). At the same time, many (all?) other io_uring based async runtimes also provide this split approach to TcpStreams and introducing this change wouldn't stop us from adopting another async executor in the future.

Make license more obvious

Probably best to put license files in repo.

Breakage due to legacy_derive_helpers depr. on nightly for uptown_funk

warning: derive helper attribute is used before it is introduced
   --> uptown_funk/src/lib.rs:177:35
    |
177 | #[cfg_attr(feature = "vm-wasmer", error("{message}"))]
    |                                   ^^^^^
178 | #[cfg_attr(feature = "vm-wasmer", derive(thiserror::Error))]
    |                                          ---------------- the attribute is introduced here
    |
    = note: `#[warn(legacy_derive_helpers)]` on by default
    = warning: this was previously accepted by the compiler but is being phased out; it will become a hard error in a future release!
    = note: for more information, see issue #79202 <https://github.com/rust-lang/rust/issues/79202>

   Compiling lunatic-runtime v0.3.2 (/home/rusch/lunatic)

Flipping the order of these two seems to fix the issue.

Create WIT definitions for lunatic's host functions

We should provide *.wit files for lunatic's host functions. WIT (WebAssembly Interface Types) files are definitions of higher level types and function signatures that describe an API that the Wasm host is exporting and the guest is importing. It can also be an interface between two Wasm modules in the component model.

For example, the WASI functions that are provided by lunatic have a formal spec in the *.wit file format, but none of the other ones (networking, process spawning, messages, ...) do. I think we should also create them! However, this will result in some breaking API changes.

There are two main benefits in providing such a spec:

It provides documentation.
It can be used to automatically generate guest bindings for many languages or host bindings inside of lunatic.

Even if we don't end up using the automatically generated bindings, following the canonical ABI proposal for host functions will result in wider support from guests. Lunatic is already exposing an unspecified inconsistent ABI that is fairly similar to the proposal. So, why not go a step further and just use the standard everyone is using?

This will also give us an opportunity to go over all the existing host functions and remove inconsistencies.

Add host APIs to modify WASI command line arguments and environment variables inside the `Environment` configuration

If a new Environment is created, processes inside of it will not have access to any command line arguments or environment variables by default. They need to be added to the ConfigEnv struct from which the Environment is created. Currently this is only possible from the host, but is not exposed as a host function to the guest.

I propose adding 2 new host functions: set_wasi_args(conf_id: u64, ...) & set_wasi_envs(conf_id: u64, ...) that would take the config they are changing as first argument. This config can then be used to create new Environments and processes spawned inside this Environments would specified command line arguments and environment variables.

I'm not sure in what format the arguments and environment variables should be passed to the host and am open for suggestions.

Refactor into workspace

It might be worth refactoring Lunatic so that the API is defined in separate crates, which can then be activated through feature flags (so that we can add more experimental APIs that can be excluded from more stable builds).

It might also be worth bringing the rust-lib into the main repo, so that it can be tested in tandem with the runtime.

Insufficient memory error when spawning too many processes

I get a Insufficient resources: System call failed: Cannot allocate memory (os error 12)' error when trying to spawn too many processes. The processes are almost empty - just a single sleep instruction is in each of them. I've tried setting the max memory to a high number using the Config struct, but it only got me ~16k processes. Let me know if you need any more info, but the code is basically this:

use lunatic::process::sleep;
use lunatic::{process, Config, Environment, Mailbox};

#[lunatic::main]
fn main(m: Mailbox<()>) {
    let mut config = Config::new(10_000_000_000, Some(u64::MAX));
    config.allow_namespace("");
    let mut env = Environment::new(config).unwrap();
    let module = env.add_this_module().unwrap();

    module
        .spawn_link(m, |_parent: Mailbox<()>| {
            for n in 0..1000000 {
                process::spawn_with(n, handle).unwrap();
            }
        })
        .unwrap();
    sleep(1000000);
}

fn handle(n: u64, _: Mailbox<()>) {
    println!("Spawn: {}", n);
    sleep(1000000000)
}

What's interesting is that if I try set the memory limit higher than that (like 100B or I also tried u64::MAX), it either crashes right away or very early. With the 100B example it crashed while trying to allocate 100GB of memory. Even though it's the case tuning the memory down didn't help either - I tried values just high enough to create anything at all and it didn't really get me anywhere either.

lunatic::net::resolve no longer seems to work

This looks like a very exciting project!

I tried the following example code from the repo:

use lunatic::net;

fn main() {
    let wikipedia = net::resolve("wikipedia.org:80").unwrap();
    for addr in wikipedia {
        println!("wikipedia.org resolves to {:?}", addr);
    }
}

but it fails with:

thread 'async-std/runtime' panicked at 'there is no reactor running, must be called from the context of a Tokio 1.x runtime', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.10.0/src/runtime/blocking/pool.rs:84:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'task has failed', /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/async-task-4.0.3/src/task.rs:368:45

This is using:

Lunatic 0.6.1 from releases
The lunatic crate version 0.7.0 or 0.6.1 (I tried both)
On Windows or Linux (I tried both)

Maybe it's related to the recent change to "switch from tokio.rs to async_std". But should this API still be expected to work? Is there an alternative?

Allow to switch async runtimes, possibly with ability to use io_uring

I've been playing with io_uring lately and I'm amazed at performance it can achieve for I/O operations. I'm also really interested in using lunatic for high performance applications that could also benefit from a sandboxed environment.

At the moment there are few options to run async code using io_uring: glommio, monoio and tokio-uring (although tokio has only the File API implemented). I'm not sure if there are any immediate plans to implement io_uring in async-std.

I would be interested in changing the lunatic code to allow using a different runtmie, most probably using features. There are a few problems, though:

I'm not sure if this is something that the maintainers would want? I think it's valuable even for tokio compatibility, but it comes with a more complex code. I don't want to start writing such a major change without an OK from maintainers, so I wanted to ask first.
How hard would it be to do it? I quickly skimmed through the code and it doesn't seem like there is a lot of places that would need to be customized, but that's more of a guess than informed opinion.
Would it even be possible to plug thread per core runtimes like monoio? As WASM modules are isolated my guess would be that it's possible, but again, I'm not familiar with the code

Rust-native async code

Is there any plans to support rust async functions inside processes? Like ability to pass an async function to Process::spawn or some other way of properly exposing async runtime? Or setting up the runtime by yourself (like calling smol::block_on inside process function) is the right way?

Actually I guess it's practically impossible to implement since process function is likely needs to be a raw pointer to actual process entry point and given the fact that processes don't share any memory, so I'd like to know is spawning a runtime per process is efficient enough or should be avoided?

Support socket access management

It seems that socket access management is missing by far, and socket operation is unrestricted. Untrusted libraries could bind any socket, so the host environment could be hacked. Does any further plan has been put on the agenda?

Recently, wasmtime-wasi had released v0.34.0 which provides the basic networking(bytecodealliance/wasmtime#3711) with access management API like preopened_socket. I think we could support socket access via it.

possible to provide the benchmark and usage costs?

e.g. how much memory (in kb?) does 1 green "thread" cost? how much in ns speed?
benchmark against... libaco etc?

Typo on main README page

The word “WebAssembly” is misspelled in the project’s main README.md.

Difference between Lunatic and Bastion

Hi,

I'm very new to both of these tools and both look very much awesome to me. But I'm not able to understand deeply what is difference between them except that the Lunatic target is wasm while Bastion stays in Rust. Could anyone help me to understand this better? Thanks in advance.

OOM on 15k+ concurrent - suspect VIRT?

Whilst I was stress testing the UDP stack - I wrote medium about it:
https://missmissm.medium.com/play-ping-pong-with-lunatic-udp-ef557a22a604

Running lunatic target/wasm32-wasi/debug/udp_ping_pong.wasm
thread 'main' panicked at 'Failed to spawn a process: Insufficient resources: System call failed: Cannot allocate memory (os error 12)', /home/foobar/lunatic/rust-lib/src/mailbox.rs:245:25

repro here:

git clone https://github.com/pinkforest/lunatic-udp-examples.git
cd lunatic-udp-examples; cargo run --bin udp_ping_pong

I hit OOM at around 15k flows on debug target - I suspect OOM (suspect virt mem) will not go away if target is release?

    FLOWS            VIRT      RES      SHR       %CPU     %MEM         TIME+   COMMAND
     1.5k             2.2t   155860    33700      15.6      0.5         0:22.77 lunatic  
     2k               16.4t   192336    42276      18.0      0.6         0:31.89 lunatic                          
     3k               23.5t   252900    56708      21.7      0.8         0:49.47 lunatic                        
     4k               31.9t   326300    73988      27.6      1.0         1:13.94 lunatic                                    
     5k               39.2t   388876    88852      28.2      1.2         1:39.30 lunatic   
     10k              81.9t   755756   176452      39.3      2.3         4:41.42 lunatic

$ ulimit -a

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127764
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 100240
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 127764
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

lunatic-solutions/lunatic-rs#33
lunatic-solutions/lunatic-rs#17