luukdegram / zwld

Experimental wasm linker

License: MIT License

Zig 100.00%

linker wasm wasmlinker wasm-ld wasm-linker zig webassembly ziglang

zwld

Note: This repository has been archived, as all development now happens in zld and the Zig toolchain.

Experimental linker for wasm object files. The idea is to implement a linker that stays close to wasm-ld with regard to features ~~so that one day this could potentially be used within the Zig self-hosted compiler to incrementally link Zig code with other wasm object files.~~ Now that zwld has been upstreamed, the main development of the linker happens directly within the Zig compiler. Features and improvements will be backported to zwld at some point. Until then, this repository is mostly inactive.

While there's no official specification for linking, zwld follows the wasm tool-conventions closely. The initial goal is to support MVP features and to have a base skeleton that gives us enough information on how to integrate this within the Zig compiler. The first step is to make static linking work as specified by the tool-conventions. Once that is completed, dynamic linking will be tackled.

Usage

Usage: zwld [options] [files...] -o [path]

Options:
-h, --help                         Print this help and exit
-o [path]                          Output path of the binary
--entry <entry>                    Name of entry point symbol
--global-base=<value>              Value from where the global data will start
--import-memory                    Import memory from the host environment
--import-table                     Import function table from the host environment
--initial-memory=<value>           Initial size of the linear memory
--max-memory=<value>               Maximum size of the linear memory
--merge-data-segments              Enable merging data segments
--no-entry                         Do not output any entry point
--stack-first                      Place stack at start of linear memory instead of after data
--stack-size=<value>               Specifies the stack size in bytes
--features=<value>                 Comma-delimited list of used features, inferred by object files if unset

Building

zwld uses the latest Zig, which you can either build from source or download as a prebuilt binary. zwld can then be built by running the following command:

zig build [-Denable-logging]

Right now zwld only contains debug logging, which is hidden behind the enable-logging flag. It is set to false by default.

zwld's Issues

Write output to file

Once we have finished all steps for linking the objects into the binary in memory, we must convert every object into its binary representation according to the spec and write that binary form to the file whose path was originally specified on the CLI.
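
As a rough illustration of this step, the sketch below serializes already-linked sections into the wasm binary layout (magic, version, then each section as id, LEB128 size, payload) and writes the result to a path. It is written in Python for brevity, not in zwld's Zig, and `encode_uleb128`, `emit_module`, and `write_output` are hypothetical names rather than zwld functions. It assumes the caller passes the sections in the order the spec requires.

```python
WASM_MAGIC = b"\x00asm"
WASM_VERSION = (1).to_bytes(4, "little")


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def emit_module(sections):
    """Serialize (section_id, payload) pairs into a wasm binary.

    Sections with an empty payload are skipped (a choice made for this sketch).
    """
    out = bytearray(WASM_MAGIC + WASM_VERSION)
    for section_id, payload in sections:
        if not payload:
            continue
        out.append(section_id)
        out += encode_uleb128(len(payload))
        out += payload
    return bytes(out)


def write_output(path, sections):
    """Write the serialized module to the path given with -o."""
    with open(path, "wb") as output_file:
        output_file.write(emit_module(sections))
```

For example, `write_output("out.wasm", [(1, b"\x01\x60\x00\x00")])` would write a module containing only a type section with a single `() -> ()` function type.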

Synthesizing functions to call constructors and perform other initialization

Save indexes of entries of certain sections

Each symbol in the symbol table has an index field.
This index is the position of the object the symbol refers to within its index space, which counts imports first and definitions after them. With 2 imported functions and 2 defined functions, the highest function index is therefore 3. Currently, we only store imports in the import section, so the function section holds just the 2 defined functions, and looking up a symbol with index 2 in it causes an out-of-bounds panic. To fix this, we must store an index on the entries of each section (such as the function section) that accounts for the imports as well as the section's own entries.
Then, while parsing the symbol table, we can find the function a symbol refers to by matching the index.

This would need to be done for:

functions
memory
globals
tables

Another possibility is to count imports for that symbol type, and subtract that amount from the symbol index. This may be simpler, but the first option may provide us with more information during other processes. Time will tell what the final decision will be based on the usefulness of saving the index on the section entry itself.
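
A minimal sketch of the second option (subtracting the import count), written in Python for brevity rather than zwld's Zig. `resolve_defined_index` is a hypothetical helper, not part of zwld:

```python
def resolve_defined_index(symbol_index, import_count, defined_count):
    """Map an index-space index to a position among the defined entries.

    Returns None when the symbol refers to an import, because imports
    occupy the first `import_count` slots of the index space.
    """
    if symbol_index < 0:
        raise IndexError("symbol index must not be negative")
    if symbol_index < import_count:
        return None
    local_index = symbol_index - import_count
    if local_index >= defined_count:
        raise IndexError(
            f"symbol index {symbol_index} is outside the index space "
            f"({import_count} imports + {defined_count} definitions)"
        )
    return local_index
```

With 2 imported and 2 defined functions, symbol index 2 resolves to the first defined function (position 0) and symbol index 3 to the second, while indexes 0 and 1 are reported as imports.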

Merging of table sections (re-numbering tables)

Merging of data segments (re-positioning data)

Restructure wasm parser

Currently, we rely on wasmparser, an existing project I created earlier. However, it was intended for parsing final wasm binaries.
This means we do an initial pass and then parse the custom sections separately afterwards, which not only causes extra work but also duplicates memory.
A big benefit of the current linking convention (as well as the wasm spec) is that all parsing can be done in a single pass, which lets us perform the linking a lot quicker. We will therefore extract the parser from the wasmparser project and modify it directly within the zwld project, so we can iterate faster and specialize it rather than keep it generic.

New architecture:

Wasm struct, which owns the wasm object file and contains a parse function that reads from the file and parses each section in a single pass.
Linker struct, which owns the output file and contains the logic to link the object files into a final binary.

TODO: Verify if we can perhaps utilize an iterator pattern while parsing each wasm object file and call this from within the linker.
This would allow us to not only parse everything we need in a single pass, but we could perhaps already handle some of the linking on each iteration.
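
The iterator idea could look roughly like the following sketch. It is Python for brevity, not zwld's Zig, and `read_uleb128` and `iter_sections` are hypothetical names. It relies only on the wasm binary layout: an 8-byte header followed by sections, each encoded as an id byte, a LEB128 size, and a payload, where custom sections (id 0) start their payload with a name.

```python
import io


def read_uleb128(stream):
    """Decode one unsigned LEB128 value from a binary stream."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of input inside a LEB128 value")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result
        shift += 7


def iter_sections(data):
    """Yield (section_id, custom_name, payload) in a single pass."""
    stream = io.BytesIO(data)
    if stream.read(4) != b"\x00asm":
        raise ValueError("not a wasm binary: bad magic")
    if stream.read(4) != b"\x01\x00\x00\x00":
        raise ValueError("unsupported wasm version")
    while True:
        id_byte = stream.read(1)
        if not id_byte:
            return  # clean end of file
        size = read_uleb128(stream)
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("section payload is truncated")
        name = None
        if id_byte[0] == 0:
            # Custom section: split the leading name from the contents.
            inner = io.BytesIO(payload)
            name_length = read_uleb128(inner)
            name = inner.read(name_length).decode("utf-8")
            payload = inner.read()
        yield id_byte[0], name, payload
```

Because `iter_sections` is a generator, a linker loop can act on each section as soon as it is yielded, including custom sections such as `linking`, instead of parsing the whole file first and revisiting the custom sections afterwards.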

Resolving undefined external references

Merging of globals sections (re-numbering globals)

Merging of sections

Merging of function sections (re-numbering functions)
Merging of globals sections (re-numbering globals)
Merging of event sections (re-numbering events)
Merging of table sections (re-numbering tables)
Merging of data segments (re-positioning data)
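
To illustrate what re-numbering means for functions, here is a simplified sketch in Python (not zwld's Zig). `merge_function_sections` and the dictionary layout of an object are assumptions made for the example. It also ignores symbol resolution: an import stays an import even if another object defines it.

```python
def merge_function_sections(objects):
    """Merge per-object functions and return old-to-new index maps.

    Each object is a dict with 'imports' (list of import names) and
    'functions' (list of function bodies). Imports come first in the
    merged index space and are deduplicated by name.
    """
    merged_imports = []
    import_slot = {}
    for obj in objects:
        for name in obj["imports"]:
            if name not in import_slot:
                import_slot[name] = len(merged_imports)
                merged_imports.append(name)

    merged_functions = []
    remaps = []
    for obj in objects:
        remap = {}
        for old_index, name in enumerate(obj["imports"]):
            remap[old_index] = import_slot[name]
        first_defined = len(obj["imports"])
        for position, body in enumerate(obj["functions"]):
            # New index = all merged imports + functions emitted so far.
            remap[first_defined + position] = len(merged_imports) + len(merged_functions)
            merged_functions.append(body)
        remaps.append(remap)
    return merged_imports, merged_functions, remaps
```

The per-object `remaps` are what a later relocation step would consult to rewrite function indexes in the code section. The other section kinds in the list above would follow the same pattern with their own index spaces.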

Verify feature compatibility

Each wasm object file can contain a "features" section that describes which features are required or used.
We must verify for each module that those match or else generate an appropriate error.
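
One possible shape for this check, sketched in Python rather than zwld's Zig. It assumes each object's features have already been parsed into a mapping from feature name to a prefix, where `+` marks a feature the object uses and `-` marks one it disallows, following the prefix scheme of the tool-conventions `target_features` section. `FeatureMismatch` and `merge_features` are hypothetical names.

```python
class FeatureMismatch(Exception):
    """Raised when two object files disagree about a feature."""


def merge_features(objects):
    """Return the sorted list of used features, or raise on a conflict.

    `objects` is a list of (object_name, {feature: prefix}) pairs.
    """
    used = {}        # feature -> first object that uses it
    disallowed = {}  # feature -> first object that disallows it
    for object_name, features in objects:
        for feature, prefix in features.items():
            if prefix == "+":
                used.setdefault(feature, object_name)
            elif prefix == "-":
                disallowed.setdefault(feature, object_name)
            else:
                raise ValueError(f"unknown feature prefix {prefix!r} in {object_name}")
    conflicts = sorted(set(used) & set(disallowed))
    if conflicts:
        feature = conflicts[0]
        raise FeatureMismatch(
            f"feature '{feature}' is used by {used[feature]} "
            f"but disallowed by {disallowed[feature]}"
        )
    return sorted(used)
```

The returned list corresponds to the set a linker could infer when `--features` is not given on the command line.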

Handle start section

Although we do not need to handle the merging of start sections (that would be a linking error), we may have to re-index the start section to point to the new index of a function.

For now, we're not going to verify if we're building an executable or library. We'll handle that once we merge into zld.

Merging of function sections (re-numbering functions)

Perform relocations

After the sections are merged, their entries are renumbered.
This means we must perform relocations, overwriting each old index (such as a function index) with its new index.
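
A small sketch of what patching a function index could look like, in Python rather than zwld's Zig. In relocatable object files, LEB128 index relocations occupy 5 bytes (the value is padded), so the new index can be written in place without moving any code. `write_padded_uleb128` and `apply_function_relocations` are hypothetical names. As a simplification, each relocation here carries the old function index directly, whereas real relocation entries reference a symbol.

```python
def write_padded_uleb128(buffer, offset, value):
    """Overwrite 5 bytes at `offset` with `value` as a padded ULEB128."""
    if not 0 <= value < (1 << 32):
        raise ValueError("value does not fit in 32 bits")
    if offset < 0 or offset + 5 > len(buffer):
        raise IndexError("relocation offset is outside the buffer")
    for i in range(5):
        byte = value & 0x7F
        value >>= 7
        if i < 4:
            byte |= 0x80  # continuation bit on all but the last byte
        buffer[offset + i] = byte


def apply_function_relocations(code, relocations, remap):
    """Patch `code` (a bytearray) in place.

    `relocations` is a list of (offset, old_index) pairs and `remap`
    maps old function indexes to new ones.
    """
    for offset, old_index in relocations:
        write_padded_uleb128(code, offset, remap[old_index])
```

For instance, given the bytes `10 81 80 80 80 00 0b` (a `call` to function 1 followed by `end`), applying the relocation `(1, 1)` with the map `{1: 3}` rewrites the operand so the call targets function 3.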

Merging of event sections (re-numbering events)

Personally, this has the lowest priority for me, as the only kind of event that exists right now comes from the exception handling proposal.
Unfortunately, that proposal is already at phase 3, which means we must implement it at some point, once it reaches phase 4 or 5.