spacemeshos / poet
Spacemesh PoET service reference implementation
License: MIT License
When a round broadcast fails, the round stays in the "unbroadcasted" state, and only the recovery mechanism can initiate another broadcast attempt. This is not enough; we should introduce delayed retries after the initial failure.
Add a server param - iter - specifying the number of sequential iterations per Hx() call when using sha256(), so that the total CPU time of proof building shifts to sha256() computations.
// Hash computes Hx(data...) and then applies h.iters additional sequential
// sha256 passes, each keyed with h.x, so that most of the proof-building CPU
// time is spent inside sha256.
func (h *sha256Hash) Hash(data ...[]byte) []byte {
	h.hash.Reset()
	h.hash.Write(h.x)
	for _, d := range data {
		_, _ = h.hash.Write(d)
	}
	temp := h.hash.Sum([]byte{})

	// Each iteration depends on the previous digest, so the extra work is
	// strictly sequential and cannot be parallelized.
	for i := 0; i < h.iters; i++ {
		h.hash.Reset()
		h.hash.Write(h.x)
		h.hash.Write(temp)
		temp = h.hash.Sum(h.emptySlice)
	}
	return temp
}
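For context, a minimal sketch of how such an iter parameter might be wired in; the flag registration and constructor below are illustrative assumptions (not the actual poet CLI), with crypto/sha256 and flag imports assumed:

// Hypothetical wiring; the real server has its own config/flag handling.
var iterFlag = flag.Int("iter", 1, "number of sequential sha256 iterations per Hx() call")

// newSha256Hash is an assumed constructor for the hasher shown above.
func newSha256Hash(x []byte, iters int) *sha256Hash {
	return &sha256Hash{
		hash:       sha256.New(),
		x:          x,
		iters:      iters,
		emptySlice: make([]byte, 0, sha256.Size),
	}
}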
Need to figure out why the actual DAG storage size for n >= 30 is double the expected size T = 2^n.
The prover key-value storage file is cleaned after execution finishes and the NIP has been generated.
The file should also be cleaned when the execution is interrupted and doesn't finish.
We need logs to validate that the poet behaves as expected.
The round schedule is controlled by initialduration (mandatory flag) and duration (optional flag). If round 2's scheduled start happens before round 1's end of execution, there would be 2 rounds executing in parallel. If after, there would be a period in which no round would be executing. The current behaviour is to start round 2's execution at round 1's end of execution if the scheduling time is after it or not defined (so there's always at least 1 execution). Option 1 is currently implemented. This creates an additional parallel round execution for every server restart attempt, and thus should be changed.
The main consideration is whether we plan to use round scheduling or just start executing each round at the previous round's end of execution. If the former, options 2-4 are viable. If the latter, option 2 is the fallback and needs to be implemented anyway.
Please provide feedback on the desired behaviour.
One case where we're not thread safe is when closing a round -- we may lose the last membership submissions if they come in at the last moment.
A thorough review is in order.
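A minimal sketch of one way to make the close path safe, with hypothetical names (the real round type lives in the service code):

import (
	"errors"
	"sync"
)

type round struct {
	mu         sync.Mutex
	closed     bool
	challenges [][]byte
}

// submit adds a challenge unless the round has already been closed.
func (r *round) submit(challenge []byte) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.closed {
		return errors.New("round already closed")
	}
	r.challenges = append(r.challenges, challenge)
	return nil
}

// close marks the round as closed and returns the final membership list.
// A submission racing with close either lands before the flag is set (and is
// included) or observes the flag and is rejected -- nothing is silently dropped.
func (r *round) close() [][]byte {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.closed = true
	return r.challenges
}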
Client should submit challenges for rounds with an expected duration.
Build the client/server protocol over the PoET core implementation to serve multiple miners with a single sequential work.
At the moment everything is saved in memory, including the round's challenges list.
The goal is to benchmark increasing values of n until we test and bench a ~7-day poet.
This takes place in the loadConfig function.
In order to parse poet log messages with ES (Elasticsearch), we need those logs as JSON.
Use the logs infra from go-spacemesh to print the log messages as JSON when the --test-mode execution parameter is configured.
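A sketch of what that could look like; go-spacemesh's log package is built on zap, and zap's production config already emits structured JSON (the function name here is illustrative, not the actual go-spacemesh wiring):

import "go.uber.org/zap"

// newTestModeLogger returns a JSON-emitting logger, roughly what --test-mode should switch to.
func newTestModeLogger() (*zap.Logger, error) {
	cfg := zap.NewProductionConfig() // uses the JSON encoder by default
	return cfg.Build()
}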
Extract the verification code from the internal package to a separated verifier package. Keep the internal package as the prover, and extract shared code to a shared/common package.
Both on the Service type and on the gRPC server.
We want to support multiple PoET services and allow a miner to discover them and select between them.
Important: Issues will be assigned to developers in the order of their application and by proficiency level according to the task's complexity. We will not assign tasks to developers who haven't introduced themselves on our Gitter dev channel.
Fork develop to your own repo and work in your repo's develop branch.
With the introduction of dependabot a small issue popped up: PRs created by dependabot fail their CI builds (e.g. this one: #117).
The reason is that dependabot doesn't have access to the same set of secrets as PRs created by others, so it cannot authenticate against Dockerhub and the build fails.
The question here is: should we move the "Docker build & push" step out of the CI workflow and into one that is only triggered AFTER a PR is merged, instead of on every commit? At the moment a new image is released every time the image build succeeds, even if tests failed.
The number of leaves to prove depends on the security parameter constant (T).
A merkle-tree can be used to create a more efficient proof of multiple leaves, especially for the verifier, which currently uses a key-value cache to detect repeating nodes in the tree.
The verifier currently crashes (panics) if it is given incorrect values (an invalid proof). These errors need to be reported gracefully instead.
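A minimal sketch of the intended behaviour, where a hypothetical wrapper validates inputs and converts any residual panic from the existing verification logic (represented by verify below) into an error:

import "fmt"

// VerifySafe validates inputs up front and converts any residual panic from the
// existing verification code into a returned error instead of crashing the process.
func VerifySafe(proof []byte) (err error) {
	if len(proof) == 0 {
		return fmt.Errorf("invalid proof: empty")
	}
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("invalid proof: %v", r)
		}
	}()
	// verify stands in for the current, panic-prone verification logic (assumption).
	return verify(proof)
}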
Upgrade the poet server broadcaster to use the new API. A new service was added to the API for the poet in spacemeshos/api#118 and implemented in spacemeshos/go-spacemesh#2159.
Alpine Linux (which we use to run our tests in docker) has a funny version of lsof installed that doesn't tell the truth:
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550 | grep LISTEN | awk '{print $2}' | xargs kill -9
kill: you need to specify whom to kill
/go/src/github.com/spacemeshos/go-spacemesh # apk add busybox-extras
(1/1) Installing busybox-extras (1.31.1-r19)
Executing busybox-extras-1.31.1-r19.post-install
Executing busybox-1.31.1-r16.trigger
OK: 160 MiB in 44 packages
/go/src/github.com/spacemeshos/go-spacemesh # busybox-extras telnet localhost 18550
Connected to localhost
@hello
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550
1 /bin/busybox /dev/pts/0
1 /bin/busybox /dev/pts/0
1 /bin/busybox /dev/pts/0
1 /bin/busybox /dev/tty
2941 /tmp/poet/poet /dev/null
2941 /tmp/poet/poet /dev/null
2941 /tmp/poet/poet pipe:[865420]
2941 /tmp/poet/poet /root/.poet/logs/poet.log
2941 /tmp/poet/poet anon_inode:[eventpoll]
2941 /tmp/poet/poet pipe:[862641]
2941 /tmp/poet/poet pipe:[862641]
2941 /tmp/poet/poet socket:[862642]
2941 /tmp/poet/poet socket:[862646]
2941 /tmp/poet/poet socket:[864392]
2941 /tmp/poet/poet socket:[863516]
2941 /tmp/poet/poet /tmp/poet/data/1680/challengesDb/LOCK
2941 /tmp/poet/poet /tmp/poet/data/1680/challengesDb/LOG
2941 /tmp/poet/poet /tmp/poet/data/1680/challengesDb/MANIFEST-000000
2941 /tmp/poet/poet /tmp/poet/data/1680/challengesDb/000001.log
2941 /tmp/poet/poet /tmp/poet/data/1679/layercache_0.bin (deleted)
/go/src/github.com/spacemeshos/go-spacemesh # apk add lsof
(1/1) Installing lsof (4.93.2-r0)
Executing busybox-1.31.1-r16.trigger
OK: 160 MiB in 45 packages
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
poet 2941 root 11u IPv4 862642 0t0 TCP localhost:18550 (LISTEN)
poet 2941 root 13u IPv4 864392 0t0 TCP localhost:50380->localhost:18550 (ESTABLISHED)
poet 2941 root 14u IPv4 863516 0t0 TCP localhost:18550->localhost:50380 (ESTABLISHED)
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550 | grep LISTEN | awk '{print $2}'
2941
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550 | grep LISTEN | awk '{print $2}' | xargs kill -9
/go/src/github.com/spacemeshos/go-spacemesh # lsof -i tcp:18550
/go/src/github.com/spacemeshos/go-spacemesh #
This causes problems inside the harness (see line 132 in commit 79d0b52).
I see this error from several of the go-spacemesh tests in github.com/spacemeshos/go-spacemesh/cmd/node
including this one:
/go/src/github.com/spacemeshos/go-spacemesh # go test -timeout 0 -p 1 -count 1 -v github.com/spacemeshos/go-spacemesh/cmd/node -run Test_PoETHarnessSanity
=== RUN Test_PoETHarnessSanity
Test_PoETHarnessSanity: app_test.go:72:
Error Trace: app_test.go:72
Error: Received unexpected error:
error during killing process: exit status 123 | kill: you need to specify whom to kill
Test: Test_PoETHarnessSanity
--- FAIL: Test_PoETHarnessSanity (1.16s)
FAIL
FAIL github.com/spacemeshos/go-spacemesh/cmd/node 1.184s
FAIL
This code should make sure the output of the shell command here is not empty before passing it to kill.
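A sketch of what that fix could look like (the function name and shell pipeline below are illustrative of the issue above, not the exact harness code):

import (
	"fmt"
	"os/exec"
	"strings"
)

// killPoetListener looks up the pid listening on the port and only calls kill
// when the lookup actually produced output, so `kill` never runs without a pid.
func killPoetListener(port int) error {
	lookup := fmt.Sprintf("lsof -i tcp:%d | grep LISTEN | awk '{print $2}'", port)
	out, err := exec.Command("sh", "-c", lookup).Output()
	if err != nil {
		return fmt.Errorf("pid lookup failed: %w", err)
	}
	pid := strings.TrimSpace(string(out))
	if pid == "" {
		return nil // nothing is listening; skip the kill
	}
	return exec.Command("kill", "-9", pid).Run()
}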
The client shouldn't send the requests too early.
This is an epic - it should be broken down into several tasks.
The PoET period should consist of an execution part, where the PoSW is executed, and an idle period, during which the poet waits for the beginning of the next execution round. The purpose of the idle period is to allow the miners to generate a PoST proof and get ready for the next round of PoET (assuming that a miner will mostly work with the same PoET for many epochs).
@noamnelke wrote:
When all PoET server attempts to broadcast a proof fail, we currently can't re-broadcast the proof without restarting the PoET server. This can easily be improved by initiating a re-broadcast whenever new gateway nodes are set via the API.
Either a memory leak or an out-of-memory issue in the current code-base. n=33, 32GB RAM. Crashes at 75% DAG generation. Blocks further benchmarks. We need a better way to generate the DAG than the naive in-memory depth-first traversal that is currently implemented. @zalmen @moshababo @noamnelke
This should include the multiple memberships proof & its verification.
sorry for private link, but basically see comments that start at https://spacemesh.slack.com/archives/C01BBB2U64U/p1664858910998709
We ran into an issue today where after #105 was merged, go-spacemesh system tests started failing because they don't pin the poet version (see spacemeshos/go-spacemesh#2189). In order to facilitate pinning, we should add versioning here and set up automatic dockerbuild/push when a new version is merged or a new release is cut.
Now that PoET sends membership and PoET proofs via gossip, we should clean up the old gRPC endpoints for obtaining proofs. This may require changing how some tests work.
When a round execution fails, it writes the error into errChan. However, nobody reads from that channel anywhere. Because a write to a full channel blocks, this leads to goroutines hanging indefinitely and leaking resources.
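A minimal sketch of one possible fix -- giving errChan a dedicated reader so writers can never block (names are illustrative):

import "log"

// drainRoundErrors consumes errors from errChan so that a failing round's
// goroutine is never stuck on the channel write, and the failure gets surfaced.
func drainRoundErrors(errChan <-chan error) {
	go func() {
		for err := range errChan {
			log.Printf("round execution failed: %v", err)
		}
	}()
}

Alternatively, the write side could use a buffered channel plus a non-blocking select, but the error still needs to reach a log or the caller either way.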
At the moment, the merkle tree is entirely re-constructed for every membership proof request.
It is caused by a broken import of the log package.
Add support for updating the PoET service's gateway nodes after it has already started.
At the moment one must restart the server in order to do this.
There's currently no documentation on how to run the POET server. Add at least a basic "# Running the server" section to the README. It should contain instructions on how to connect go-spacemesh to it.
Happy to work on this.
@moshababo, does anything special need to be done? When I run poet locally, I noticed that it's looking for a poet.conf. Can we add a sample of this to the repo? Do I need to supply any command-line args? Or should go-spacemesh connect to it (on the default port) with no special params/config required?
service.go:Info() reads from openRound and executingRounds while the goroutine started in Start writes to these variables. There's even protection (a mutex) on executingRounds inside Start, but none when reading in Info().
It is important to note that Info() can be called at any time, as it is triggered by an API call.
cc: @moshababo
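A minimal sketch of the fix, taking the same lock in Info() that Start's goroutine takes for writes (the mutex field name and return shape are assumptions):

func (s *Service) Info() (openRoundID string, executingRoundIDs []string) {
	s.mu.Lock() // same mutex that guards writes from the goroutine started in Start
	defer s.mu.Unlock()

	executingRoundIDs = make([]string, 0, len(s.executingRounds))
	for id := range s.executingRounds {
		executingRoundIDs = append(executingRoundIDs, id)
	}
	return s.openRound.ID, executingRoundIDs
}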
Especially the service rounds' duration, to be used from the test client.
In verifier.Verify, the optimization that marks the proof as valid if we have already handled one of the siblings on the way to the root is wrong. The fact that we got to a known sibling says nothing about the validity of the path we have already calculated (i.e. from the leaf to the current position). A correct optimization requires storing sibling pairs in the cache (instead of only one of them); if we get to a cached pair, we can indeed stop the verification and mark the proof as valid.
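A sketch of the corrected cache, keyed by sibling pairs as described above (type and function names are illustrative, sha256-sized hashes assumed):

import "crypto/sha256"

// pairKey identifies a (left, right) sibling pair whose combined hash was already
// followed all the way up to the expected root while verifying a previous leaf.
type pairKey [2 * sha256.Size]byte

func makePairKey(left, right []byte) pairKey {
	var k pairKey
	copy(k[:sha256.Size], left)
	copy(k[sha256.Size:], right)
	return k
}

// While climbing from a leaf towards the root:
//
//	if _, ok := verifiedPairs[makePairKey(left, right)]; ok {
//		// The remainder of this path was already verified for an earlier leaf,
//		// so verification of the current leaf can stop here and succeed.
//	}
//
// Pairs are added to verifiedPairs only after the climb reaches the root and
// the computed root matches the expected one.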
PoET is used as a dependency of go-spacemesh. Since go-spacemesh is built for Linux, macOS and Windows, PoET should also be able to be built on all 3 platforms.
hash(concat(membershipRoot, poetId))
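For illustration only, assuming sha256 and simple byte concatenation (the protocol's actual hash and encoding may differ), the expression above could be computed along these lines:

import "crypto/sha256"

// statement computes hash(concat(membershipRoot, poetId)) per the expression above;
// sha256 is an assumption here, not necessarily the protocol's hash.
func statement(membershipRoot, poetID []byte) [sha256.Size]byte {
	return sha256.Sum256(append(append([]byte{}, membershipRoot...), poetID...))
}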