design's Issues

EEI: calls and return data

I think since Byzantium has finally introduced returndatacopy and returndatasize the original result buffers of call* are obsolete (create is a different beast).

It would clean up the semantics and the implementations a lot if we removed the result buffer support from calls and required clients to use returndata*.

Related #12

Wasm interface/environment naming

eth_interface.md lists the methods exported to the Wasm interface. These methods will effectively be available in the languages used to write contracts, such as C, C++ and anything else with an LLVM backend.

These methods are made available via import statements (e.g. (import $codeSize "ethereum" "codeSize" (result i64)))

Every import statement has the following properties:

  • local identifier (in the scope of the wasm code)
  • environment
  • method name
  • (return value)

Currently the environment in ewasm is set to ethereum.

There is an easy way to compile C or C++ code online: http://mbebenita.github.io/WasmExplorer/

This LLVM-based compiler has env hardcoded as the environment, and the method names are mangled according to C/C++ rules.

For C input, it is the raw method name:

extern unsigned long long codeSize();

int main() {
  codeSize();
}

becomes in C99 mode:

(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $codeSize "env" "codeSize" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $codeSize)
    (return
      (i32.const 0)
    )
  )
)

while in C++14 mode:

(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $_Z8codeSizev "env" "_Z8codeSizev" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $_Z8codeSizev)
    (return
      (i32.const 0)
    )
  )
)

Similarly, C++ namespaces are part of the name mangling:

namespace ethereum {
  extern unsigned long long codeSize();
}

int main() {
  ethereum::codeSize();
}

becomes

(module
  (memory 1)
  (export "memory" memory)
  (type $FUNCSIG$j (func (result i64)))
  (import $_ZN8ethereum8codeSizeEv "env" "_ZN8ethereum8codeSizeEv" (result i64))
  (export "main" $main)
  (func $main (result i32)
    (call_import $_ZN8ethereum8codeSizeEv)
    (return
      (i32.const 0)
    )
  )
)
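The mangled import names above follow the Itanium C++ ABI scheme. As a rough illustration only, the sketch below reproduces it for the narrow case seen in these examples: a function with no parameters, optionally nested in plain namespaces. The helper name mangle is made up for this sketch, and templates, parameters and every other part of the scheme are ignored.

```python
def mangle(qualified_name: str) -> str:
    """Mangle a zero-argument function name (Itanium C++ ABI, simplified).

    Hypothetical helper: only plain identifiers separated by '::' are handled.
    """
    parts = qualified_name.split("::")
    # Each identifier is encoded as <length><identifier>.
    encoded = "".join(f"{len(part)}{part}" for part in parts)
    if len(parts) > 1:
        # Names inside namespaces are wrapped in N ... E.
        encoded = f"N{encoded}E"
    # _Z is the mangling prefix; the trailing 'v' is the empty parameter list.
    return f"_Z{encoded}v"


print(mangle("codeSize"))
print(mangle("ethereum::codeSize"))
```

This shows why an environment-independent naming scheme matters: the imported name changes with the language mode and the namespace, while an extern "C" declaration keeps the raw name.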

I'm not sure the environment field will be accessible from C/C++, but it could easily become an attribute: __attribute__((environment("ethereum")))

Unless that happens and considering the complexity it might cause for other languages, perhaps the best choice is to stick to the standard environment and include ethereum in the method names:

  • ethereum_codeSize
  • ethereum_caller
  • and so on

In C/C++ this means importing them with the C name convention:

extern "C" {
  unsigned long long ethereum_codeSize();
}

EEI: Account handle

The main issue with the cost of e.g. getBalance() is that an account lookup in the database might be needed. The idea is to separate loading the account from accessing the account metadata.

Instead of

getBalance(address, resultOffset);

we should have

handle = loadAccount(address);
getBalance(handle, resultOffset);

Similarly to getBalance() there should be getters for code hash, code, code size and others.

Pros

  1. The cost of account lookup is separated from the cost of the getter.
  2. Getting information about the current account should be cheap because it is already loaded (the handle to the current account could be predefined).
  3. The cost of accessing the same account multiple times is lower.

Cons

  1. Handles must be deterministic as they may leak. A simple solution would be to keep an array of loaded accounts and return the index into this array. Each call to loadAccount would append a new entry to the array.
  2. Contracts are responsible for account management.
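A minimal sketch of the array-of-loaded-accounts idea, written as a host-side model. The class name AccountTable, the lookups counter and the zero balance for unknown addresses are assumptions made for illustration. For brevity the getter returns the value instead of writing it to a resultOffset in memory.

```python
class AccountTable:
    """Hypothetical host-side table of loaded accounts.

    A handle is simply the index of the entry appended by load_account,
    which makes handles deterministic across implementations.
    """

    def __init__(self, database):
        self._database = database  # address -> {"balance": int}
        self._loaded = []          # handle -> account record
        self.lookups = 0           # number of (expensive) database reads

    def load_account(self, address):
        # Every call appends a new entry, even for an already loaded address.
        self.lookups += 1
        account = self._database.get(address, {"balance": 0})  # assumption
        self._loaded.append(account)
        return len(self._loaded) - 1

    def get_balance(self, handle):
        if not 0 <= handle < len(self._loaded):
            raise IndexError("invalid account handle")
        # Cheap: no database access, only a table read.
        return self._loaded[handle]["balance"]


table = AccountTable({"0xaa": {"balance": 7}})
handle = table.load_account("0xaa")
total = table.get_balance(handle) + table.get_balance(handle)
print(handle, total, table.lookups)
```

Reading the balance twice costs one database lookup here, which is the saving the proposal is after.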

Alternatives

  1. Just dump the account metadata to memory: loadAccount(memoryOffset). This is simpler, but might waste some memory when the contract is not interested in all the data. It is also not extensible, i.e. we cannot change the account representation in the future.

  2. An extension of alternative 1 where the contract specifies a bit mask of the account fields it is interested in, e.g. loadAccount(BALANCE | NONCE, memoryOffset). This at least allows adding more fields to the account in the future. But the output would be a mess, especially when getting the account code is considered.

Specify a license for this repo

Probably Apache 2.0 makes sense given its consideration for patents.

(Also webassembly/design uses Apache 2.0.)

@wanderer @gcolvin are you fine with Apache 2.0? It is only three of us as contributors to this repo so far.

optionally merkle-ized storage

My understanding is that the only function of merkleizing the entire storage space is thin client support. I wonder whether this is an opportunity to introduce an optional storage space which is not summarized at all. The burden of proof is on the dapp developers if they want to use this space and roll their own app-level thin client proof solution.

For some apps the limiting factor for transaction bandwidth is plain old cpu speed on one thread. Sharding is not enough to be competitive with centralized exchanges or architectures like bitshares

Design a poll-based interface

In case WASM does not support asynchronous methods, we might need to take an alternative approach.

A simple way is to have each operation work with polling, storageLoad would become:

  • ethereum.storageLoad(args)
  • ethereum.storageLoadResult() -> i32 (where 1 means result is written to the specified memory location)

And the following pseudocode in WASM:

ethereum.storageLoad()
do { 
  result = ethereum.storageLoadResult()
} while(!result)

The JS interface would need to do the sleep then. It's not efficient at all, but there's no way to sleep in WASM.
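The polling loop can be modelled roughly as follows. This is only a sketch: the class PollingHost, the latency_polls parameter (standing in for the time the host needs to fetch the value) and the argument order of storage_load are assumptions, and stored values are assumed to be exactly 32 bytes long.

```python
class PollingHost:
    """Hypothetical host that completes a storage load after a few polls."""

    def __init__(self, storage, latency_polls=3):
        self._storage = storage        # key -> 32-byte value
        self._latency = latency_polls  # polls needed before the result is ready
        self._pending = None

    def storage_load(self, key, memory, result_offset):
        # Only registers the request; nothing is written to memory yet.
        self._pending = [self._latency, key, memory, result_offset]

    def storage_load_result(self):
        # Returns 1 once the value has been written to memory, otherwise 0.
        if self._pending is None:
            return 0
        self._pending[0] -= 1
        if self._pending[0] > 0:
            return 0
        _, key, memory, offset = self._pending
        memory[offset:offset + 32] = self._storage.get(key, bytes(32))
        self._pending = None
        return 1


memory = bytearray(64)
host = PollingHost({b"k": b"\x01" * 32})
host.storage_load(b"k", memory, 16)
polls = 0
while True:
    polls += 1
    if host.storage_load_result():
        break
print(polls, memory[16:48].hex())
```

The busy loop on the contract side is exactly the inefficiency mentioned above; the contract cannot yield, so every poll is a full host call.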

Arbitrary call return sizes

In EVM the caller must define the available space for return values.

In the current EVM2 design this is carried over, however it could be improved:

  • call doesn't define the return value space
  • return doesn't write the value to the caller's memory space (I understand that today this can depend on the VM anyway)
  • return places the values into an intermediate, in-memory storage (callstore) and the contract is charged according to the size of this
  • a new opcode, callResultCopy is introduced for copying between callstore and the caller's memory space (and callResultSize to retrieve the total size)
  • the callstore is erased when a call is executed

It is a rough design, but it could be ironed out.
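To make the rough design a little more concrete, here is a sketch of the callstore as a host-side object. The class name CallStore, the method names and the flat GAS_PER_BYTE charge are assumptions; the proposal only says that the contract is charged according to the size.

```python
GAS_PER_BYTE = 1  # hypothetical price, not part of the proposal


class CallStore:
    """Hypothetical intermediate buffer holding the last call's return data."""

    def __init__(self):
        self._data = b""

    def begin_call(self):
        # The callstore is erased whenever a call is executed.
        self._data = b""

    def finish_return(self, data, gas_left):
        # The returning contract pays for the size of the returned data.
        cost = GAS_PER_BYTE * len(data)
        if cost > gas_left:
            raise RuntimeError("out of gas")
        self._data = bytes(data)
        return gas_left - cost

    def call_result_size(self):
        return len(self._data)

    def call_result_copy(self, memory, dest_offset, src_offset, length):
        if src_offset + length > len(self._data):
            raise IndexError("read past the end of the callstore")
        if dest_offset + length > len(memory):
            raise IndexError("write past the end of memory")
        memory[dest_offset:dest_offset + length] = \
            self._data[src_offset:src_offset + length]


store = CallStore()
store.begin_call()
gas_left = store.finish_return(b"hello world", gas_left=100)
memory = bytearray(8)
store.call_result_copy(memory, 0, 6, store.call_result_size() - 6)
print(gas_left, bytes(memory))
```

The caller decides after the call how much of the result it wants, so it no longer has to guess the size up front.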

"Standard library" as "system library"

Based on the library design described in #17, I suggest supporting a small standard library at address 0x00..0f with the following exports:

  • memcpy(dst:i32, src:i32, len:i32)
  • memset(dst:i32, val:i32, len:i32)
  • TBD
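As a reference for the intended behaviour, the two exports could be modelled over a linear memory like this. It is a sketch under assumptions: memory is a bytearray, out-of-bounds accesses raise an error, and memcpy tolerates overlapping ranges (the issue does not say whether it should).

```python
def _check(memory, offset, length):
    if offset < 0 or length < 0 or offset + length > len(memory):
        raise IndexError("access outside linear memory")


def memcpy(memory, dst, src, length):
    _check(memory, dst, length)
    _check(memory, src, length)
    # The slice on the right is copied first, so overlap is harmless here.
    memory[dst:dst + length] = memory[src:src + length]


def memset(memory, dst, val, length):
    _check(memory, dst, length)
    # Only the low byte of the i32 value is used, as in C's memset.
    memory[dst:dst + length] = bytes([val & 0xFF]) * length


memory = bytearray(b"abcdefgh")
memcpy(memory, 4, 0, 4)
after_copy = bytes(memory)
memset(memory, 0, 0x2A, 2)
print(after_copy, bytes(memory))
```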

Propose an ewasm subset for precompiles on the main chain

The only features ewasm would need to expose are:

  • useGas
  • calldata access (calldatacopy/calldatasize)
  • return / revert

This could be a way to get wasm VMs implemented and experimented with in a more controlled environment on the main chain.

It is not clear whether the precompiles would have "magic gas calculation rules" or just use "a metering process" on them.

Backwards Compatibility; Secure Metering Isolation

Overview

There are three ways to achieve backwards compatibility with EVM1's gas prices in EVM2 in a secure manner. Currently in the prototype we are just leaving EVM1 gas prices unmodified, but this is not secure under the assumption that some severe mismatch in the gas price of an EVM1 opcode is found in the future that doesn't affect EVM2.

  1. Meter all the EVM1 contracts with the EVM2 gas prices, which target 0.5 second processing time. This is perhaps the cleanest way, but it has the disadvantage that some EVM1 opcodes and precompiles will cost more than they currently do, which could break some contracts (it will also force all contracts to run in wasm VMs; there will be no way to fall back to an EVM1 implementation when running an EVM1 contract).
  2. Lower all the EVM2 gas prices to the point that all the EVM1 opcodes and precompiles cost less than or the same as what they cost now. This might have the effect of lowering the overall gasLimit for EVM1 contracts and may make some EVM1 contracts unusable.
  3. Have separate gas prices for EVM1 and EVM2 contracts. This is the most pluralistic option but is more complex than 1 and 2.

This issue is to explore option 3.

Rationale

As the recent DoS attacks have shown, metering in EVM1 can be inaccurate; therefore EVM1 contracts need to be isolated from EVM2 contracts if they are metered differently. Having different metering types for EVM1 and EVM2 would allow nodes to change gas limits and accept different gas prices for each type. In case of a DoS attack on EVM1, the gas limit could be lowered just for EVM1 without affecting the operation of EVM2 contracts.

Metering Types

name                  binary encoding
EVM1                  0x00
EVM1 (after EIP 150)  0x01
EVM2                  0x02

BlockHeader structure

Change the gasLimit and gasUsed fields to an array containing [meterType, gasLimit, gasUsed]. If one of the metering types is omitted, then this block doesn't contain any computation that requires that type. The gas limit is calculated independently for each metering type using the canonical gas limit calculation as defined in the Yellow Paper.

Tx structure

In the tx, replace the gasLimit and gasPrice fields with an array that contains the elements [[meterType, gasPrice, gasLimit] ... ], where each entry must have a unique metering type. If one of the metering types is omitted, then this tx will not fund any computation that requires that type.
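A sketch of how a node might validate the per-type gas entries of a transaction. The function names and the error handling are assumptions made for illustration; only the three metering types and the [meterType, gasPrice, gasLimit] layout come from the proposal above.

```python
METER_TYPES = {0x00: "EVM1", 0x01: "EVM1 (after EIP 150)", 0x02: "EVM2"}


def parse_gas_entries(entries):
    """Turn [[meterType, gasPrice, gasLimit], ...] into a lookup table."""
    budgets = {}
    for meter_type, gas_price, gas_limit in entries:
        if meter_type not in METER_TYPES:
            raise ValueError(f"unknown metering type {meter_type:#04x}")
        if meter_type in budgets:
            raise ValueError(f"duplicate metering type {meter_type:#04x}")
        budgets[meter_type] = (gas_price, gas_limit)
    return budgets


def gas_limit_for(budgets, meter_type):
    # An omitted metering type funds no computation of that type.
    return budgets.get(meter_type, (0, 0))[1]


budgets = parse_gas_entries([[0x01, 20, 21000], [0x02, 5, 100000]])
print(gas_limit_for(budgets, 0x02), gas_limit_for(budgets, 0x00))
```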

Re-metering

This has been suggested by @poemm as a possible solution for updating gas costs in deployed contracts.

The current proposal is to meter contracts at deployment time, which would lock in gas costs from that point on. In this method "metering statements" (i.e. call $useGas) do not receive any special handling and are just treated as regular calls.

With re-metering we could have two options:

  1. update the constants in previously inserted metering statements (with the special rule of handling the first statement in each block)
  2. always remove metering statements prior to metering

specify layer 1 compression

Our layer one compression will be slightly different from wasm's canonical layer one compression. Since we have global knowledge of all the code in the Ethereum state, we can deduplicate against that global state. This makes layer one compression more efficient.

ref

Exception (or error reporting) system

EVM1 provides no way to convey why an execution was stopped. We should think about ways of supporting different exceptions, perhaps even with messages.

This isn't only a change for the VM, but the protocol: transaction receipts should also include the execution outcome.

EEI redesign process

This proposes the process of introducing changes in the EEI specification to move it from the revision 2 to the revision 3.

Each of the methods in the EEI should be reviewed and discussed (with alternative solutions proposed) in an individual discussion thread. See examples:

When a consensus about the changes is reached, changes are applied to the document.
In some cases we might want to reach out to https://ethresear.ch or https://ethereum-magicians.org.

In the end, revision 3 is published as a draft to be reviewed by a broader audience.

In what form do you want to keep the documents? Do we want to keep the obsoleted revisions?

New EEI method: abort

Abort execution and store a reason.

Parameters

  • reasonCode i32 the reason code
  • descriptionOffset i32 the memory offset to load the reason text from
  • descriptionLength i32 the length of the reason text (limited to 32 bytes)

Returns

nothing
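A sketch of how a host could implement this method, assuming the abort is surfaced to the embedder as an exception. The names Abort and MAX_DESCRIPTION_LENGTH are hypothetical, and rejecting an over-long description (rather than truncating it) is an assumption; the proposal only states the 32-byte limit.

```python
MAX_DESCRIPTION_LENGTH = 32  # limit stated in the proposal


class Abort(Exception):
    """Hypothetical host exception carrying the stored abort reason."""

    def __init__(self, reason_code, description):
        super().__init__(f"abort {reason_code}: {description!r}")
        self.reason_code = reason_code
        self.description = description


def abort(memory, reason_code, description_offset, description_length):
    if description_length > MAX_DESCRIPTION_LENGTH:
        raise ValueError("reason text is limited to 32 bytes")  # assumption
    end = description_offset + description_length
    if description_offset < 0 or end > len(memory):
        raise IndexError("reason text lies outside memory")
    # Never returns normally: execution stops and the reason is kept.
    raise Abort(reason_code, bytes(memory[description_offset:end]))


memory = bytearray(b"out of funds" + bytes(20))
try:
    abort(memory, 7, 0, 12)
except Abort as exc:
    outcome = (exc.reason_code, exc.description)
print(outcome)
```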

Do not transfer ETH with selfDestruct

In EVM1 the SELFDESTRUCT is messy and complex due to the additional ETH transfer coupled with it.

  1. Simplify selfDestruct by removing its address argument and do not transfer ether. The ether is destroyed with the account.
  2. Add a transfer() function that only transfers ether to another account. The target account's code is not executed. This function is needed to implement SELFDESTRUCT in evm2wasm.
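The split can be sketched with a toy state model. Everything here is illustrative: the State class and the helper evm1_selfdestruct are made-up names, accounts are reduced to balances, and the deletion happens immediately instead of at the end of the transaction.

```python
class State:
    """Hypothetical minimal world state: address -> balance."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, sender, recipient, amount):
        # Moves ether only; the recipient's code is never executed.
        if amount < 0 or self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount

    def self_destruct(self, account):
        # No beneficiary: whatever ether is left is destroyed with the account.
        return self.balances.pop(account, 0)


def evm1_selfdestruct(state, account, beneficiary):
    """EVM1-style SELFDESTRUCT expressed with the two simpler primitives."""
    state.transfer(account, beneficiary, state.balances.get(account, 0))
    state.self_destruct(account)


state = State({"contract": 10, "owner": 1})
evm1_selfdestruct(state, "contract", "owner")
print(state.balances)
```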

add FAQ

It would be good to have an FAQ, with topics like

  • what are the high level goals of this project
  • what's the timeline
  • will this be compatible with WASM
  • how will gas work
  • how will solidity and serpent work

Define ABI

The word ABI is overused in ethereum and I think there are several levels to it. Not all of them are properly documented:

  • the way contracts pass data between themselves (was depending on the language, now it seems to be converging), this includes precompiled contracts
  • the way external inputs are entered to the contract (both during calls and creation) - it is defined in the Contract ABI
  • the way storage is used (specific to the language)

Since eWASM changes the word size from 256 bit to at most 64 bit, it is important to state whether it will follow the same ABI for contract data passing or define a new one, more appropriate to its word size.

Call stack metering / deterministic depth restriction

The stack size and stack depth may be different on different engines (or on the machines the engine is executed on), especially in the case of a JIT engine. The target machine's stack size could have a big influence and potentially introduce non-determinism.

The number of locals can influence the amount of memory used in a stack frame, and the depth of the call frame may be different.

eWASM prototype todo

  • test RPC prototype
    • VM run levels
      • tx
      • block
      • code
    • Store blocks
    • Make blocks
    • Make Txs - reuse ethereumjs-tx
  • AST transformation
    • create streaming transformation for gas injections (currently not streaming) - on hold till stable binary
    • add streaming ewasm validator (restricts wasm to i32 and i64's) - on hold till stable binary
    • basic
  • test Network
    • Simple PoS? / PoW
  • EVM -> WASM transpiler

Backward compatibility

We have several options for backward compatibility with EVM1:

  1. run both VMs and use wasm's magic number to determine which VM to run the code in
  2. write an EVM1 interpreter in EVM2
  3. transpile EVM1 contracts to EVM2

Option 1 would be the easiest to do, but it comes with the additional concern of doubling the surface area for consensus-breaking bugs.

Non-determinism with division by zero

Source: https://webassembly.github.io/spec/core/_download/WebAssembly.pdf

p. 48:

idiv_u_N(i1, i2)
  • If i2 is 0, then the result is undefined.

idiv_s_N(i1, i2)
  • If j2 is 0, then the result is undefined.
  • Else if j1 divided by j2 is 2^(N-1), then the result is undefined.

irem_u_N(i1, i2)
  • If j2 is 0, then the result is undefined.

irem_s_N(i1, i2)
  • If i2 is 0, then the result is undefined.

p. 61:

trunc_u_M,N(z)
(Not a problem since this is floating point only.)

p. 62:

trunc_s_M,N(z)
(Not a problem since this is floating point only.)

(raised by @holiman)
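One deterministic reading of the undefined cases is to trap, which as far as we understand is how the WebAssembly specification treats undefined results of these operators. The sketch below models that reading for the two division operators; the Trap class and the helper signed are names made up for this example, and operands are passed as unsigned N-bit integers.

```python
class Trap(Exception):
    """Hypothetical stand-in for a WebAssembly trap."""


def signed(i, n):
    # Signed interpretation of an n-bit unsigned value.
    return i - (1 << n) if i >= 1 << (n - 1) else i


def idiv_u(i1, i2, n=32):
    if i2 == 0:
        raise Trap("integer divide by zero")
    return i1 // i2


def idiv_s(i1, i2, n=32):
    j1, j2 = signed(i1, n), signed(i2, n)
    if j2 == 0:
        raise Trap("integer divide by zero")
    quotient = abs(j1) // abs(j2)  # truncate toward zero
    if (j1 < 0) != (j2 < 0):
        quotient = -quotient
    if quotient == 1 << (n - 1):
        raise Trap("integer overflow")  # e.g. INT_MIN / -1
    return quotient % (1 << n)  # back to the unsigned representation


print(idiv_u(7, 2), hex(idiv_s(0xFFFFFFF9, 2)))
```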

selfDestruct clarification

Continuation of the discussion started here:

@pepyakin :

What does "the contract shall halt execution after this call" mean? Does it mean that the contract code is supposed to somehow return control after calling selfDestruct? What if it doesn't do so? Why not just trap?

@axic :

Next time please open an issue on the repo - it is way harder to track it that way.

It just means that selfDestruct effectively only marks the account for deletion in a buffer. Any subsequent call overwrites that buffer. Any successful halt will enforce the selfdestruct.

Combining it with a trap would put the onus of detecting the condition on the VM implementation - not all (especially browser) VMs make that easy.

Hm, I had the impression that VMs usually provide a way to trap from inside a host function, with the ability to distinguish between different host traps.

For example, binaryen implements traps with exceptions; you can just implement selfDestruct to trap with a special string or throw your own exception. WAVM also lets you do the same by throwing and catching exceptions, which can be created by the embedder.

As for the browser: trapping inside the browser is definitely possible! Traps can be implemented with JS exceptions, and JS exceptions are easy to discriminate.

For example, this is how abort is implemented in an experimental wasm musl implementation.

  • here is definition of TerminateWasmException
  • upon a call to abort this exception is thrown.
  • the code of main start executing here in a try block.
  • TerminateWasmException is caught here

So it seems easy to me. Maybe I misunderstood you?

And about the current approach:
What should happen if a contract called selfDestruct and then tried to touch the storage (read or write), or made a call to create or *call?

WASM modules as libraries

EVM doesn't have a real concept of libraries; rather, it was added retroactively with DELEGATECALL and with Solidity ensuring that a contract defined as a library cannot make use of SSTORE/SLOAD. The VM however still needs to treat it the same as other contracts and ensure that a proper rollback mechanism is in place.

A WASM code is called a module, which defines the memory needed and has one or more functions.

One of the premises of using WASM is that we wouldn't need precompiled contracts given the speed loss caused by the bytecode is insignificant compared to EVM. Not using precompiles can also lead to a lot of code duplication.

I think it could be useful supporting a way to store WASM modules on the blockchain, which could be loaded by contracts during the linking stage. Perhaps these modules would be special contracts, which are not meant to be executed.

Ideas are welcome on how this could fit into the blockchain model we have.

Rename "return" to "finish"

The problem is that return is usually a keyword in languages and therefore in most of them when importing the return EEI method the user has to use an alternative name.

The benefit of the change is that this won't be a problem anymore.

Test suite for the EEI

Each test case should have:

  • wasm bytecode
  • account / block / tx state
  • expected return or revert data

The following methods must be tested:

  • useGas
  • getAddress
  • getBalance
  • getBlockHash
  • call
  • callDataCopy
  • getCallDataSize
  • callCode
  • callDelegate
  • storageStore
  • storageLoad
  • getCaller
  • getCallValue
  • codeCopy
  • getCodeSize
  • getBlockCoinbase
  • create
  • getBlockDifficulty
  • externalCodeCopy
  • getExternalCodeSize
  • getGasLeft
  • getBlockGasLimit
  • getTxGasPrice
  • log
  • getBlockNumber
  • getTxOrigin
  • return
  • selfDestruct
  • getBlockTimestamp

I'd suggest starting with return; the others can then use it to return data, which reduces the complexity of checking the test's output.

Bootstrapping into useful testing phase

It would be fairly simple to move this from an isolated test into a more useful testing framework by adjusting one of the VM implementations.

The adjustment would be to run the eWASM VM when a contract's bytecode starts with the WASM bytecode signature (\0asm).
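The dispatch itself is a prefix check. A minimal sketch, where the function name select_vm and the returned labels are placeholders for whatever the host VM uses:

```python
WASM_MAGIC = b"\x00asm"  # the \0asm signature at the start of a wasm binary


def select_vm(code: bytes) -> str:
    """Pick the VM for a contract based on its bytecode prefix."""
    return "ewasm" if code.startswith(WASM_MAGIC) else "evm1"


print(select_vm(b"\x00asm\x01\x00\x00\x00"))
print(select_vm(bytes.fromhex("6060604052")))
```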

This can be easily achieved by using ethereumjs-vm, which then would provide a state and full blockchain.

Depending on ABI decision (#1), wrapper methods for callCode and delegateCall might be needed to transform between the new and current ABIs.

sstore/sload without fixed field size

Currently sstore/sload write/read in 256 bit chunks - similarly to EVM.

They could be changed to take a third parameter for the length:

  • length must be > 0
  • gas should be calculated according to the length
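A sketch of length-aware storage access. The pricing rule (a flat GAS_PER_WORD for every started 32-byte word) and the zero padding of short values on load are assumptions chosen only to make the example concrete; the issue leaves both open. Memory bounds checks are omitted for brevity.

```python
GAS_PER_WORD = 100  # hypothetical price per started 32-byte word


class Storage:
    """Hypothetical storage with variable-length values."""

    def __init__(self):
        self._slots = {}
        self.gas_used = 0

    def _charge(self, length):
        if length <= 0:
            raise ValueError("length must be > 0")
        words = (length + 31) // 32  # round up to whole words
        self.gas_used += words * GAS_PER_WORD

    def storage_store(self, key, memory, offset, length):
        self._charge(length)
        self._slots[key] = bytes(memory[offset:offset + length])

    def storage_load(self, key, memory, offset, length):
        self._charge(length)
        value = self._slots.get(key, b"")[:length].ljust(length, b"\x00")
        memory[offset:offset + length] = value


memory = bytearray(b"hello" + bytes(59))
storage = Storage()
storage.storage_store(b"greeting", memory, 0, 5)   # 1 word
storage.storage_store(b"blob", memory, 0, 40)      # 2 words
storage.storage_load(b"greeting", memory, 32, 5)   # 1 word
print(storage.gas_used, bytes(memory[32:37]))
```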

Determine Gas Price for opcodes

At some point we need to update the fee schedule with accurate gas prices.
One way to do this would be to measure how many cycles each equivalent opcode takes on physical hardware.
