Ewasm Design Overview and Specification
License: Apache License 2.0
I think that since Byzantium has finally introduced `returndatacopy` and `returndatasize`, the original result buffers of `call*` are obsolete (`create` is a different beast).
It would clean up the semantics and implementations a lot if we removed the result buffer support from the `call`s and required clients to use `returndata*`.
Related #12
eth_interface.md lists the methods exported to the Wasm interface. These methods will be effectively available in the languages used to write contracts, such as C, C++ and anything else having an LLVM backend.
These methods are made available via import statements, e.g. `(import $codeSize "ethereum" "codeSize" (result i64))`.
Every import statement has the following properties:
Currently the environment in ewasm is set to `ethereum`.
There is an easy way to compile C or C++ code online: http://mbebenita.github.io/WasmExplorer/
This LLVM compiler has `env` hardcoded as the environment, and the method names are mangled according to C/C++ rules.
For C input, it is the raw method name:
extern unsigned long long codeSize();
int main() {
codeSize();
}
becomes in C99 mode:
(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $codeSize "env" "codeSize" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $codeSize)
    (return
      (i32.const 0)
    )
  )
)
while in C++14 mode:
(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $_Z8codeSizev "env" "_Z8codeSizev" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $_Z8codeSizev)
    (return
      (i32.const 0)
    )
  )
)
Similarly, C++ namespaces are part of the name mangling:
namespace ethereum {
extern unsigned long long codeSize();
}
int main() {
ethereum::codeSize();
}
becomes
(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $_ZN8ethereum8codeSizeEv "env" "_ZN8ethereum8codeSizeEv" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $_ZN8ethereum8codeSizeEv)
    (return
      (i32.const 0)
    )
  )
)
I'm not sure the environment field will be accessible from C/C++, but it could easily become an attribute: `__attribute__((environment("ethereum")))`
Unless that happens, and considering the complexity it might cause for other languages, perhaps the best choice is to stick to the standard environment and include `ethereum` in the method names:
ethereum_codeSize
ethereum_caller
In C/C++ this means importing them with the C naming convention:
extern "C" {
unsigned long long ethereum_codeSize();
}
The main issue with the cost of e.g. `getBalance()` is the fact that an account lookup in the database might be needed. The idea is to split the account loading from the accessing of account metadata.
Instead of
getBalance(address, resultOffset);
we should have
handle = loadAccount(address);
getBalance(handle, resultOffset);
Similarly to `getBalance()`, there should be getters for code hash, code, code size and others.
`loadAccount` would append a new entry to the array.
Alternative 1: Just dump the account metadata to memory: `loadAccount(memoryOffset)`. This is simpler, but might waste some memory when the contract is not interested in all the data. It is also not extensible, i.e. we cannot change the account representation in the future.
Alternative 2: An extension of alternative 1 where the contract specifies a bit mask of the account fields it is interested in, e.g. `loadAccount(BALANCE | NONCE, memoryOffset)`. This at least allows adding more fields to the account in the future. But the output would be a mess, especially when getting the account code is considered.
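The handle-based variant (`loadAccount` returning a handle, getters taking that handle) could be sketched roughly as follows. This is a host-side mock in C, not the EEI itself: the `Account` layout, `MAX_LOADED`, `db_lookup`, the 64-bit balance and the use of a C pointer instead of a memory offset are all assumptions made only for illustration.

```c
#include <stdint.h>
#include <string.h>

#define MAX_LOADED 16  /* hypothetical cap on accounts loaded per execution */

/* Hypothetical, simplified account record (real balances are wider than 64 bits). */
typedef struct {
    uint8_t  address[20];
    uint64_t balance;
    uint64_t nonce;
} Account;

static Account loaded[MAX_LOADED];
static int loaded_count = 0;

/* Stand-in for the database lookup: the expensive step, paid once per account. */
static Account db_lookup(const uint8_t address[20]) {
    Account a;
    memcpy(a.address, address, 20);
    a.balance = address[19] * 1000u;  /* fake data derived from the address */
    a.nonce = 0;
    return a;
}

/* Appends the account to the array and returns its index as the handle, or -1 if full. */
int loadAccount(const uint8_t address[20]) {
    if (loaded_count == MAX_LOADED) return -1;
    loaded[loaded_count] = db_lookup(address);
    return loaded_count++;
}

/* Cheap getter: no database access, only a bounds check and an array read. */
int getBalance(int handle, uint64_t *result) {
    if (handle < 0 || handle >= loaded_count) return -1;
    *result = loaded[handle].balance;
    return 0;
}
```

With this split, the lookup cost could be charged in `loadAccount` only, and each getter could be priced as a plain memory read.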
Relaxed queues for async messages: http://www.faculty.idc.ac.il/gadi/MyPapers/2015ST-RelaxedDataStructures.pdf
This is a planning issue for next Wednesday's weekly meeting.
See the conversation at Binaryen: WebAssembly/binaryen#663
We need to investigate how this affects eWASM.
My understanding is that the only function of merkleizing the entire storage space is thin client support. I wonder whether this is an opportunity to introduce an optional storage space which is not summarized at all. The burden of proof is on the dapp dev if they want to use this space and roll their own app-level thin client proof solution.
For some apps the limiting factor for transaction bandwidth is plain old CPU speed on one thread. Sharding is not enough to be competitive with centralized exchanges or architectures like BitShares.
In case WASM does not support asynchronous methods, we might need to take an alternative approach.
A simple way is to have each operation work with polling. `storageLoad` would become:
ethereum.storageLoad(args)
ethereum.storageLoadResult() -> i32
(where 1 means the result is written to the specified memory location)
And the following pseudocode in WASM:
ethereum.storageLoad()
do {
result = ethereum.storageLoadResult()
} while(!result)
The JS interface would need to do the `sleep` then. It's not efficient at all, but there's no way to sleep in WASM.
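The polling scheme can be tried out with a small mock. The sketch below is in C and simulates the host with a countdown; the three-poll latency, the fake slot contents and the use of a C pointer in place of a memory offset are assumptions of this example, not part of the proposal.

```c
#include <stdint.h>
#include <string.h>

/* Mock host state: one pending request that completes after a few polls. */
static int      polls_remaining = -1;          /* -1 means no request in flight */
static uint8_t *pending_dst = 0;
static uint8_t  stored_value[32] = { 0x2a };   /* fake storage slot contents */

/* Starts the load and returns immediately; the result arrives later. */
void storageLoad(uint8_t *resultOffset) {
    pending_dst = resultOffset;
    polls_remaining = 3;  /* hypothetical latency, measured in polls */
}

/* Returns 1 once the result has been written to the requested location, else 0. */
int32_t storageLoadResult(void) {
    if (polls_remaining < 0) return 0;
    if (polls_remaining > 0) { polls_remaining--; return 0; }
    memcpy(pending_dst, stored_value, 32);
    polls_remaining = -1;
    return 1;
}

/* Contract-side busy loop from the pseudocode; returns how many polls it took. */
int load_blocking(uint8_t out[32]) {
    int polls = 0;
    int32_t result;
    storageLoad(out);
    do {
        result = storageLoadResult();
        polls++;
    } while (!result);
    return polls;
}
```

Every iteration of the loop is executed (and metered) code, which is the inefficiency the text points out.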
eWASM defines metering as an optional layer to accommodate for these use cases.
https://github.com/ewasm/design/blob/master/rationale.md
While I haven't read this in further detail yet, assuming that this is an argument to a function: should it be optional but enabled by default unless explicitly disabled, for greater security? Alternatively, it could be required to state explicitly whether it is enabled or disabled.
In EVM the caller must define the available space for return values.
In the current EVM2 design this is carried over, however it could be improved:
- `call` doesn't define the return value space
- `return` doesn't write the value to the caller's memory space (I understand this today can depend on the VM anyway)
- `return` places the values into an intermediate, in-memory storage (`callstore`) and the contract is charged according to the size of this
- `callResultCopy` is introduced for copying between `callstore` and the caller's memory space (and `callResultSize` to retrieve the total size)
- `callstore` is erased when a `call` is executed
It is a rough design, but it could be ironed out.
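To make the rough design a bit more concrete, here is a minimal host-side sketch in C. The function signatures, the `callee_return` and `callstore_clear` helpers and the bounds-check behaviour are assumptions for illustration; only the names `callstore`, `callResultCopy` and `callResultSize` come from the proposal.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Intermediate buffer holding the callee's return data (the "callstore"). */
static uint8_t *callstore = NULL;
static uint32_t callstore_len = 0;

/* Erases the callstore; in this design that happens whenever a call is executed. */
void callstore_clear(void) {
    free(callstore);
    callstore = NULL;
    callstore_len = 0;
}

/* Callee side: `return` places the data into the callstore, not into caller memory. */
int callee_return(const uint8_t *data, uint32_t len) {
    callstore_clear();
    if (len == 0) return 0;
    callstore = malloc(len);
    if (!callstore) return -1;
    memcpy(callstore, data, len);
    callstore_len = len;
    return 0;
}

/* Total size of the data currently held in the callstore. */
uint32_t callResultSize(void) { return callstore_len; }

/* Copies callstore[offset .. offset+len) into caller memory; -1 if out of bounds. */
int callResultCopy(uint8_t *dst, uint32_t offset, uint32_t len) {
    if (offset > callstore_len || len > callstore_len - offset) return -1;
    if (len > 0) memcpy(dst, callstore + offset, len);
    return 0;
}
```

The caller no longer reserves space up front: it asks for the size first and then copies only the part it needs.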
Based on the library design described in #17, I suggest supporting a small standard library at address `0x00..0f` with the following exports:
memcpy(dst:i32, src:i32, len:i32)
memset(dst:i32, val:i32, len:i32)
The only features ewasm would need to expose are:
- `useGas`
- `calldatacopy` / `calldatasize`
- `return` / `revert`
This could be a way to get wasm VMs implemented and experimented with in a more controlled environment on the main chain.
It is not clear whether the precompiles would have "magic gas calculation rules" or just use "a metering process" on them.
There are three ways to achieve backwards compatibility with EVM1's gas prices in EVM2 in a secure manner. Currently in the prototype we are just leaving EVM1 gas prices unmodified, but this is not secure if some severe mismatch in the gas price of an EVM1 opcode is found in the future that doesn't affect EVM2.
This issue is to explore option 3.
As the recent DoS attacks have shown, metering in EVM1 can be inaccurate. Therefore EVM1 contracts need to be isolated from EVM2 contracts if they are metered differently. Having different metering types for EVM1 and EVM2 would allow nodes to change gas limits and accept different gas prices for each type. In case of a DoS attack on EVM1, the gas limit could be lowered just for EVM1 without affecting the operation of EVM2 contracts.
name | binary encoding |
---|---|
EVM1 | 0x00 |
EVM1 (after EIP 150) | 0x01 |
EVM2 | 0x02 |
Change the `gasLimit` and `gasUsed` fields to an array containing `[meterType, gasLimit, gasUsed]`. If one of the metering types is omitted, then this block doesn't contain any computation that requires that type. The gas limit is calculated independently for each gas type using the canonical gas limit calculation as defined in the Yellow Paper.
In the tx, replace the `gasLimit` and `gasPrice` fields with an array that contains the elements `[[meterType, gasPrice, gasLimit] ... ]` where each `type` must be a unique metering type. If one of the metering types is omitted, then this tx will not fund any computation that requires that type.
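As a rough illustration of the per-transaction array, the C sketch below models the `[meterType, gasPrice, gasLimit]` entries and the two rules stated above (unique types, omitted type means no funding). The struct, the function names and the 64-bit field widths are assumptions of this example; the encoding values are taken from the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Metering types with the binary encodings from the table. */
typedef enum {
    METER_EVM1        = 0x00,
    METER_EVM1_EIP150 = 0x01,
    METER_EVM2        = 0x02
} MeterType;

/* One [meterType, gasPrice, gasLimit] element of the tx array (widths simplified). */
typedef struct {
    MeterType type;
    uint64_t  gasPrice;
    uint64_t  gasLimit;
} TxMeter;

/* Returns the entry for `type`, or NULL if the tx omits it (no such computation is funded). */
const TxMeter *tx_find_meter(const TxMeter *entries, size_t n, MeterType type) {
    for (size_t i = 0; i < n; i++)
        if (entries[i].type == type) return &entries[i];
    return NULL;
}

/* Returns 1 if every metering type appears at most once, 0 otherwise. */
int tx_meters_unique(const TxMeter *entries, size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (entries[i].type == entries[j].type) return 0;
    return 1;
}
```

A node would presumably reject a tx for which `tx_meters_unique` fails, and treat a NULL lookup as "this tx cannot pay for that kind of execution".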
This has been suggested by @poemm as a possible solution for updating gas costs in deployed contracts.
The current proposal is to meter contracts at deployment time, which would lock in gas costs from that point on. In this method any "metering statements" (aka. `call $useGas`) do not receive any special handling and are just treated as regular calls.
With re-metering we could have two options:
Our layer one compression will be slightly different from the wasm canonical layer one compression. Since we have global knowledge of all the code in the Ethereum state, we can deduplicate against that global state. This makes layer one compression more efficient.
EVM1 provides no option to convey why an execution was stopped. We should think about ways of supporting different exceptions, perhaps even with messages.
This isn't only a change for the VM, but the protocol: transaction receipts should also include the execution outcome.
What do they do when values are out of bounds? (Many of them currently in Hera throw an exception.)
We also need to define what the in-bounds values are.
This proposes the process of introducing changes in the EEI specification to move it from revision 2 to revision 3.
Each of the methods in EEI should be reviewed and discussed (with alternative solutions proposed) in an individual discussion thread. See examples:
When a consensus about the changes is reached, the changes are applied to the document.
In some cases we might want to reach out to https://ethresear.ch or https://ethereum-magicians.org.
In the end, revision 3 is published as a draft to be reviewed by a broader audience.
In what form do you want to keep the documents? Do we want to keep the obsoleted revisions?
Disallowing the start function and using a `main` function was decided based on a limitation in the Wasm JavaScript API.
I think it is time to review this limitation and check whether it still holds.
From here: https://github.com/ewasm/design/blob/master/eth_interface.md#selfdestruct
I don't understand what this means:
Note: multiple invocations will overwrite the benficiary address.
Does it mean the beneficiary changes to the address specified by the most recent `SELFDESTRUCT` call?
Testing in remix, it seems the first address is the beneficiary.
This is a planning issue for next Monday's weekly meeting. Here is the link: https://meet.jit.si/DizzyGorillasRejoiceAlone
@gcolvin says that 10:00 AM is too early. If we move to 11:00 AM, will that be too late for @axic @chriseth?
Also feel free to add comments for agenda items.
Abort execution and store a reason.
Parameters
- `reasonCode` i32 the reason code
- `descriptionOffset` i32 the memory offset to load the reason text from
- `descriptionLength` i32 the length of the reason text (limited to 32 bytes)
Returns
nothing
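A host could handle these three parameters roughly as in the C sketch below. The `AbortReason` struct, the `read_abort_reason` name and the decision to reject over-long or out-of-memory descriptions are assumptions of this example; the proposal only states the parameters and the 32-byte limit.

```c
#include <stdint.h>
#include <string.h>

#define MAX_REASON_LEN 32u  /* limit stated in the parameter list */

/* Hypothetical record of what the host stores when execution is aborted. */
typedef struct {
    int32_t  code;
    uint8_t  text[MAX_REASON_LEN];
    uint32_t text_len;
} AbortReason;

/*
 * Hypothetical host-side handler: reads the reason text out of the contract's
 * linear memory. Returns 0 on success, -1 if the text is longer than 32 bytes
 * or does not lie inside memory (rejecting is an assumption of this sketch).
 */
int read_abort_reason(const uint8_t *memory, uint32_t memory_size,
                      int32_t reasonCode,
                      uint32_t descriptionOffset, uint32_t descriptionLength,
                      AbortReason *out) {
    if (descriptionLength > MAX_REASON_LEN) return -1;
    if (descriptionOffset > memory_size ||
        descriptionLength > memory_size - descriptionOffset) return -1;
    out->code = reasonCode;
    out->text_len = descriptionLength;
    if (descriptionLength > 0)
        memcpy(out->text, memory + descriptionOffset, descriptionLength);
    return 0;
}
```

The bounds check is written as `len > size - offset` rather than `offset + len > size` so that it cannot overflow for large i32 values.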
There is an easy way to compile C or C++ code online: http://mbebenita.github.io/WasmExplorer/
Two-level imports aren't supported yet: WebAssembly/design#522 (comment)
There is an ongoing effort to have the Rust compiler output WASM:
(wasm/wast toolkit written in Rust: https://gitlab.com/DanielKeep/wasm)
In EVM1 the SELFDESTRUCT is messy and complex due to the additional ETH transfer coupled with it.
It would be good to have an FAQ, with topics like
https://github.com/ewasm/design/blob/master/eth_interface.md
We should be trying to follow the format of the WASM spec here, instead of having our own style/format. This will force us to disambiguate a lot of the behaviour here which is potentially ambiguous.
I can do this work if people would like, or at least start the branch doing the work. Just want to make sure there is agreement that this should happen.
The word ABI is overused in ethereum and I think there are several levels to it. Not all of them are properly documented:
Since eWASM changes the word size from 256 bit to at most 64 bit, it is important to state whether it will follow the same ABI for contract data passing or define a new one, more appropriate to its word size.
Currently unreachable works just like an invalid opcode in EVM1, but it could provide a nicer way of handling reverts
The system contracts are currently defined as precompiles and live at addresses `0x000...0` to `0x000...b`.
We should research how these can be moved into being part of the Ethereum state.
The stack size and stack depth may be different on different engines (or on the machines the engine is executed on), especially in the case of a JIT engine. The target machine's stack size could have a big influence and potentially introduce non-determinism.
The number of locals can influence the amount of memory used in a stack frame, and the depth of the call frame may be different.
Current Solidity code is tied to emitting EVM opcodes. Investigate the possibility of moving it to LLVM or emitting WASM directly.
This is a planning issue for next Wednesday's weekly meeting.
Since we have been having trouble with Jitsi, let's try Google Hangouts.
https://hangouts.google.com/hangouts/_/qgkj5jkxzjhnbhm567f54ui6tqe?authuser=0&hl=en
I followed this link from the readme, and got the standard github 'page not found'
https://github.com/ewasm/rust-libeth
We have several options for backward compatibility with EVM1
Source: https://webassembly.github.io/spec/core/_download/WebAssembly.pdf
p. 48:
$\mathrm{idiv\_u}_N(i_1, i_2)$
- If $i_2$ is 0, then the result is undefined.
$\mathrm{idiv\_s}_N(i_1, i_2)$
- If $i_2$ is 0, then the result is undefined.
- Else if $i_1$ divided by $i_2$ is $2^{N-1}$, then the result is undefined.
$\mathrm{irem\_u}_N(i_1, i_2)$
- If $i_2$ is 0, then the result is undefined.
$\mathrm{irem\_s}_N(i_1, i_2)$
- If $i_2$ is 0, then the result is undefined.
p. 61:
$\mathrm{trunc\_u}_{M,N}(z)$
(Not a problem since this is floating point only.)
p. 62:
$\mathrm{trunc\_s}_{M,N}(z)$
(Not a problem since this is floating point only.)
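For a deterministic VM these undefined cases have to be pinned down explicitly. Below is a minimal C sketch of how an implementation might do that for signed 32-bit division and remainder, assuming the undefined cases are reported as traps (modelled here as a status code) and that the remainder of INT32_MIN by -1 is defined as 0. The names `DivStatus`, `i32_div_s` and `i32_rem_s` are made up for this example.

```c
#include <stdint.h>

/* Outcome of a checked operation; anything but DIV_OK would become a trap. */
typedef enum { DIV_OK = 0, DIV_BY_ZERO, DIV_OVERFLOW } DivStatus;

/* Signed division with both undefined cases made explicit. */
DivStatus i32_div_s(int32_t a, int32_t b, int32_t *out) {
    if (b == 0) return DIV_BY_ZERO;
    /* INT32_MIN / -1 would be 2^31, which is not representable (and is UB in C). */
    if (a == INT32_MIN && b == -1) return DIV_OVERFLOW;
    *out = a / b;
    return DIV_OK;
}

/* Signed remainder; only division by zero is undefined. */
DivStatus i32_rem_s(int32_t a, int32_t b, int32_t *out) {
    if (b == 0) return DIV_BY_ZERO;
    /* Special-case -1 so that INT32_MIN % -1 never reaches the C operator (UB there). */
    *out = (b == -1) ? 0 : a % b;
    return DIV_OK;
}
```

Checking before dividing matters here because the host language (C in this sketch) has its own undefined behaviour for exactly the same inputs.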
(raised by @holiman)
Continuation of the discussion started here:
What does "the contract shall halt execution after this call" mean? Does it mean that the contract code is supposed to somehow return control after calling selfDestruct? What if it doesn't do so? Why not just trap?
@axic :
Next time please open an issue on the repo. It is way harder to track it that way.
It just means that selfDestruct effectively just sets a marked-for-deletion buffer. Any subsequent calls overwrite that buffer. Any successful halt will enforce the selfdestruct.
Combining it with a trap would put the condition detection onus on the VM implementation, and not all (especially browser) VMs make that easy.
Hm, I had the impression that VMs usually provide a way to trap from inside a host function, with the ability to distinguish between different host traps.
For example, binaryen implements traps with exceptions, so you can just implement `selfDestruct` to trap with a special string or just throw your own exception. WAVM also lets you do the same by throwing and catching exceptions, which may be created by the embedder.
As for the browser: trapping inside the browser is definitely possible! Traps can be implemented by JS exceptions, and JS exceptions are easy to discriminate.
For example, this is how `abort` is implemented in an experimental wasm musl implementation.
- `TerminateWasmException`
- In `abort`, this exception is thrown.
- `TerminateWasmException` is caught here.
So it seems easy to me. Maybe I misunderstood you?
And about the current approach:
what should happen if the contract called `selfDestruct` and then tried to touch the storage (read or write), or called `create` or `*call`?
On account creation, if the code is invalid, should it OOG?
EVM doesn't have a real concept of libraries. Rather, it was added retroactively with DELEGATECALL, and Solidity ensures that a contract defined as a library cannot make use of `SSTORE`/`SLOAD`. The VM however still needs to consider it the same as other contracts and ensure that a proper rollback mechanism is in place.
A piece of WASM code is called a `module`, which defines the memory needed and has one or more `functions`.
One of the premises of using WASM is that we wouldn't need precompiled contracts given the speed loss caused by the bytecode is insignificant compared to EVM. Not using precompiles can also lead to a lot of code duplication.
I think it could be useful supporting a way to store WASM modules on the blockchain, which could be loaded by contracts during the linking stage. Perhaps these modules would be special contracts, which are not meant to be executed.
Ideas are welcome on how this could fit into the blockchain model we have.
The problem is that `return` is usually a keyword in languages, and therefore in most of them the user has to use an alternative name when importing the `return` EEI method.
The benefit of the change is that this won't be a problem anymore.
Each test case should have:
The following methods must be tested:
I'd suggest starting with `return`; the others can then use it to return data, which reduces the complexity of checking the test's output.
It would be fairly simple to move this from an isolated test into a more useful testing framework by adjusting one of the VM implementations.
The adjustment would be to run the eWASM VM when a contract's bytecode starts with the WASM bytecode signature (`\0asm`).
This can be easily achieved by using `ethereumjs-vm`, which would then provide a state and a full blockchain.
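The dispatch on the signature is simple enough to sketch. The C snippet below is only an illustration of the check itself; the `VmKind` enum and `select_vm` name are hypothetical and do not correspond to anything in ethereumjs-vm.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The Wasm binary signature: the four bytes "\0asm". */
static const uint8_t WASM_MAGIC[4] = { 0x00, 'a', 's', 'm' };

/* Hypothetical tag for which VM should execute a contract. */
typedef enum { VM_EVM1, VM_EWASM } VmKind;

/* Chooses the eWASM VM when the bytecode starts with the signature, EVM1 otherwise. */
VmKind select_vm(const uint8_t *code, size_t len) {
    if (len >= sizeof WASM_MAGIC &&
        memcmp(code, WASM_MAGIC, sizeof WASM_MAGIC) == 0)
        return VM_EWASM;
    return VM_EVM1;
}
```

Code shorter than four bytes falls through to EVM1, so empty accounts keep their current behaviour.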
Depending on the ABI decision (#1), wrapper methods for `callCode` and `delegateCall` might be needed to transform between the new and current ABIs.
Currently sstore/sload writes/reads in 256-bit chunks, similarly to EVM.
They could be changed to have a third parameter for `length`:
- `length` must be > 0
At some point we need to update the fee schedule with accurate gas prices.
One way to do this would be to see how many cycles each equivalent opcode takes on physical hardware.