netflix / p2plab
Performance benchmark infrastructure for IPLD DAGs.
License: Apache License 2.0
In the future, we'd like to benchmark more than just data exchange. For example, given a cluster, write a scenario that can benchmark a DHT lookup. Right now the report is hardcoded to bitswap and libp2p bandwidth metrics, but we should turn that into an arbitrary Metrics slice.
Over time there will be clusters that have been idling for a day or more. They should not be deleted; instead, all the ASGs in those clusters should scale down to size 1. Perhaps we should add a new command to scale them back to the size in their definition, or running a benchmark should do that automatically.
$ go version
go version go1.14.7 linux/amd64
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/eagr/.cache/go-build"
GOENV="/home/eagr/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/eagr/go"
GOPRIVATE=""
GOPROXY="https://goproxy.io,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build497021889=/tmp/go-build -gno-record-gcc-switches"
GOROOT/bin/go version: go version go1.14.7 linux/amd64
GOROOT/bin/go tool compile -V: compile version go1.14.7
uname -sr: Linux 4.19.104-microsoft-standard
Distributor ID: Ubuntu
Description: Ubuntu 18.04.5 LTS
Release: 18.04
Codename: bionic
After running go get -u github.com/Netflix/p2plab/cmd/labd, I got the compilation errors below:
# github.com/Netflix/p2plab/peer
go/pkg/mod/github.com/!netflix/[email protected]/peer/peer.go:170:11: p.swarm.Filters undefined (type *swarm.Swarm has no field or method Filters)
go/pkg/mod/github.com/!netflix/[email protected]/peer/peer.go:196:11: p.swarm.Filters undefined (type *swarm.Swarm has no field or method Filters)
The file adder in peer/peer.go is currently hardcoded to use the default size chunker, sha256 hash func, etc. We should plumb through chunking options from the scenario's object definitions, so that we can start testing these options out.
Depending on the CLI command, there is a reasonable printer to pick as the default. Each invocation of the CommandPrinter should therefore pass in a default value too, so commands have control over what auto will use.
Currently, we have unix and json printers. We should also have a table printer like the output of docker ps. Fields should be customizable per metadata type.
Deploy instances via a terraform subprocess and serialize data into the cluster's boltdb storage.
Implement a labctl node ssh command to connect to the NFLX bastion, and then into the nodes.
Actions are executable units performed against matched nodes of a query.
Implement the basic actions:
shard-[start]-[end]-[object]
shard-0-100-[object] is just the action [object]. spread-[object] is equivalent to shard-0-33-[object], shard-33-66-[object], and shard-66-100-[object].
I ran the benchmark locally, and the returned output is slightly different from what is displayed in the README. Obviously the values themselves will differ, since they were produced on different machines, etc. However, some values that are filled in for my result are missing in the README example. Primarily, in my run the bitswap nodes all have some value for BLOCKSRECV, whereas in the README only one node has a value for BLOCKSRECV.
Is there some sort of documentation within the codebase explaining how the values are calculated? Thanks.
My example:
# Bandwidth
+-------------------+----------------------+---------+----------+---------+---------+
| QUERY | NODE | TOTALIN | TOTALOUT | RATEIN | RATEOUT |
+-------------------+----------------------+---------+----------+---------+---------+
| (not 'neighbors') | bp6qj8t85vcoqnd7imu0 | 442 MB | 226 kB | 73 MB/s | 37 kB/s |
+-------------------+----------------------+---------+----------+---------+---------+
| - | bp6qj8t85vcoqnd7imug | 1.3 kB | 203 MB | 0 B/s | 36 MB/s |
+ +----------------------+ +----------+ +---------+
| | bp6qj8t85vcoqnd7imv0 | | 252 MB | | 37 MB/s |
+-------------------+----------------------+---------+----------+---------+---------+
| TOTAL | 442 MB | 455 MB | 73 MB/s | 73 MB/s |
+-------------------+----------------------+---------+----------+---------+---------+
# Bitswap
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| QUERY | NODE | BLOCKSRECV | BLOCKSSENT | DUPBLOCKS | DATARECV | DATASENT | DUPDATA |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| (not 'neighbors') | bp6qj8t85vcoqnd7imu0 | 1,887 | 0 | 5 | 486 MB | 0 B | 531 kB |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| - | bp6qj8t85vcoqnd7imug | 1,227 | 873 | 28 | 313 MB | 225 MB | 4.2 MB |
+ +----------------------+------------+------------+-----------+----------+----------+---------+
| | bp6qj8t85vcoqnd7imv0 | 1,199 | 1,075 | 0 | 309 MB | 274 MB | 0 B |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
Now compare this with the one from the readme:
# Bandwidth
+-------------------+----------------------+---------+----------+----------+----------+
| QUERY | NODE | TOTALIN | TOTALOUT | RATEIN | RATEOUT |
+-------------------+----------------------+---------+----------+----------+----------+
| (not 'neighbors') | bp3ept7ic6vdctur3dag | 397 MB | 204 kB | 106 MB/s | 54 kB/s |
+-------------------+----------------------+---------+----------+----------+----------+
| - | bp3eptfic6vdctur3db0 | 96 kB | 188 MB | 26 kB/s | 52 MB/s |
+ +----------------------+---------+----------+----------+----------+
| | bp3eptvic6vdctur3dbg | 109 kB | 212 MB | 28 kB/s | 55 MB/s |
+-------------------+----------------------+---------+----------+----------+----------+
| TOTAL | 397 MB | 400 MB | 106 MB/s | 107 MB/s |
+-------------------+----------------------+---------+----------+----------+----------+
# Bitswap
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| QUERY | NODE | BLOCKSRECV | BLOCKSSENT | DUPBLOCKS | DATARECV | DATASENT | DUPDATA |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| (not 'neighbors') | bp3ept7ic6vdctur3dag | 1,866 | 0 | 2 | 481 MB | 0 B | 263 kB |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| - | bp3eptfic6vdctur3db0 | 0 | 888 | 0 | 0 B | 228 MB | 0 B |
+ +----------------------+ +------------+ + +----------+ +
| | bp3eptvic6vdctur3dbg | | 978 | | | 252 MB | |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
| TOTAL | 1,866 | 1,866 | 2 | 481 MB | 481 MB | 263 kB |
+-------------------+----------------------+------------+------------+-----------+----------+----------+---------+
Peers that fulfill bitswap wants are much more likely to have other blocks you might want. Bitswap sessions implement this by carrying context over the whole exchange of an IPLD DAG.
https://github.com/ipfs/go-merkledag/blob/master/merkledag.go#L154-L157
Hello,
I want to use p2plab to test and compare different versions/branches of IPFS and the underlying projects (like bitswap). I found this repo here, and have been using it and reading the code for a couple of days.
So far I'm able to create scenarios and clusters and run benchmarks; however, this is all with the same version of IPFS. I can change the transport protocol (tcp, ws, quic), or change the commit a peer is using (I was hopeful this was the commit that signalled which IPFS version to use, but it refers to the p2plab commit history).
I was hoping to use this repo to compare IPFS versions, am I missing something here?
Please correct me if this question is in the wrong place, or if I should reach you through another channel (eg: email).
Thank you.
Queries can be executed using labctl cluster query <cluster> <query>. In addition, you should be able to use the --add and --remove flags to add or remove labels on the nodes matched by the query. Labels have type []string. The stored labels should be unique and sorted.
Trying out the example and getting a few different errors.
Binary build fails due to a dependency issue:
go: github.com/ipfs/[email protected] requires
github.com/golangci/[email protected] requires
github.com/go-critic/[email protected]: invalid pseudo-version: does not match version-control timestamp (2019-05-26T07:48:19Z)
This can be fixed with the following in go.mod:
replace github.com/golangci/golangci-lint => github.com/golangci/golangci-lint v1.18.0
replace github.com/go-critic/go-critic v0.0.0-20181204210945-ee9bf5809ead => github.com/go-critic/go-critic v0.3.5-0.20190526074819-1df300866540
Running the benchmark fails:
{"level":"debug","path":"/healthcheck","method":"GET","time":1582148787210,"message":"Registering route"}
{"level":"debug","path":"/update","method":"PUT","time":1582148787210,"message":"Registering route"}
{"level":"debug","time":1582148787361,"message":"Build initialized"}
{"level":"info","addrs":["/ip4/127.0.0.1/udp/43218/quic","/ip4/172.20.2.110/udp/43218/quic","/ip4/192.168.122.1/udp/43218/quic","/ip4/192.168.39.1/udp/43218/quic","/ip4/172.17.0.1/udp/43218/quic","/ip4/127.0.0.1/tcp/36871","/ip4/172.20.2.110/tcp/36871","/ip4/192.168.122.1/tcp/36871","/ip4/192.168.39.1/tcp/36871","/ip4/172.17.0.1/tcp/36871","/ip4/127.0.0.1/tcp/41985/ws","/ip4/172.20.2.110/tcp/41985/ws","/ip4/192.168.122.1/tcp/41985/ws","/ip4/192.168.39.1/tcp/41985/ws","/ip4/172.17.0.1/tcp/41985/ws"],"time":1582148787362,"message":"IPFS listening"}
{"level":"debug","path":"/healthcheck","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/{name}/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/create","method":"POST","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/label","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/delete","method":"DELETE","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/{name}/nodes/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/{name}/nodes/{id}/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/{name}/nodes/label","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/clusters/{name}/nodes/update","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/scenarios/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/scenarios/{name}/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/scenarios/create","method":"POST","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/scenarios/label","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/scenarios/delete","method":"DELETE","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/{id}/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/{id}/report/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/create","method":"POST","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/label","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/benchmarks/delete","method":"DELETE","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/experiments/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/experiments/{id}/json","method":"GET","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/experiments/create","method":"POST","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/experiments/label","method":"PUT","time":1582148787362,"message":"Registering route"}
{"level":"debug","path":"/experiments/delete","method":"DELETE","time":1582148787362,"message":"Registering route"}
{"level":"info","addr":":7001","time":1582148787362,"message":"daemon listening"}
{"level":"info","bid":"my-cluster-neighbors-1582148789310816531","time":1582148789310,"message":"Retrieving nodes in cluster"}
{"level":"info","bid":"my-cluster-neighbors-1582148789310816531","time":1582148789310,"message":"Resolving git references"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","ref":"HEAD","resolved":"81519eabe22e57cb131f6454fa91b87930fdaf69","time":1582148790008,"message":"Resolved remote git ref"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","ref":"HEAD","resolved":"81519eabe22e57cb131f6454fa91b87930fdaf69","time":1582148790008,"message":"Resolved remote git ref"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","ref":"HEAD","resolved":"81519eabe22e57cb131f6454fa91b87930fdaf69","time":1582148790055,"message":"Resolved remote git ref"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","references":1,"time":1582148790055,"message":"Resolved unique references"}
{"level":"info","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"time":1582148790055,"message":"Building p2p app(s)"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["init"],"time":1582148790822,"message":"Initialized empty Git repository in /home/solidity/Code/Netflix/p2plab/bin/tmp/labd/builder/81519eabe22e57cb131f6454fa91b87930fdaf69828899841/p2plab/.git/"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["fetch","--depth","1","origin","81519eabe22e57cb131f6454fa91b87930fdaf69"],"time":1582148790870,"message":"From /home/solidity/Code/Netflix/p2plab/bin/tmp/labd/builder/p2plab"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["fetch","--depth","1","origin","81519eabe22e57cb131f6454fa91b87930fdaf69"],"time":1582148790870,"message":" * branch 81519eabe22e57cb131f6454fa91b87930fdaf69 -> FETCH_HEAD"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["reset","--hard","81519eabe22e57cb131f6454fa91b87930fdaf69"],"time":1582148790878,"message":"HEAD is now at 81519ea Add additional godocs for top-level interfaces"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["-o","build","./cmd/labapp"],"time":1582148791560,"message":"go: github.com/ipfs/[email protected] requires"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["-o","build","./cmd/labapp"],"time":1582148791560,"message":"\tgithub.com/golangci/[email protected] requires"}
{"level":"debug","bid":"my-cluster-neighbors-1582148789310816531","commits":["81519eabe22e57cb131f6454fa91b87930fdaf69"],"exec":["-o","build","./cmd/labapp"],"time":1582148791560,"message":"\tgithub.com/go-critic/[email protected]: invalid pseudo-version: does not match version-control timestamp (2019-05-26T07:48:19Z)"}
{"level":"debug","error":"failed to update cluster: failed to build commit \"81519eabe22e57cb131f6454fa91b87930fdaf69\": exit status 1","time":1582148791567,"message":"failed request"}
2020/02/19 13:46:31 http: superfluous response.WriteHeader call from github.com/opentracing-contrib/go-stdlib/nethttp.(*statusCodeTracker).WriteHeader (status-code-tracker.go:19)
In the benchmark report, we should include properties of the DAG of transformed objects. For example, the number of blocks, min/max depth, and any other properties that will help understand the shape of the DAG.
Similar to how containerd does GRPC service plugins, we'd like to provide a single function that, when given an implementation that fulfills the application interface, will create a working labapp that is benchmarkable. This way, we can build wrappers for go-ipfs and js-ipfs and test against releases and RCs.
Instrument clients and servers (labd, labagent, labapp) with opentracing via jaeger.
labd
:
lab-agents.
lab-agent:
Assuming you have the following nodes:
{
  "nodes": [
    {
      "name": "apple",
      "labels": [
        "everyone",
        "apple",
        "slowdisk",
        "region=us-west-2"
      ]
    },
    {
      "name": "banana",
      "labels": [
        "everyone",
        "banana",
        "region=us-west-2"
      ]
    },
    {
      "name": "cherry",
      "labels": [
        "everyone",
        "cherry",
        "region=us-east-1"
      ]
    }
  ]
}
Then executing the queries should return:
apple
["apple"]
(not 'apple')
["banana","cherry"]
(and 'slowdisk' 'region=us-west-2')
["apple"]
(or 'region=us-west-2' 'region=us-east-1')
["apple","banana","cherry"]
(or (not 'slowdisk') 'banana')
["banana","cherry"]
Arbitrarily print resources via their JSON representation or a "standard output" (unix friendly or markdown tables).
Before George and I implement the Experiments API, we need to pin down some concrete implementation details. My main area of uncertainty is how the Experiments API will integrate with the rest of p2plab.
So far I have the following notes on what the Experiments API needs to do:
While looking through the codebase I noticed experiment.go, which leads me to think that the Experiments API doesn't need to be written from scratch, but can instead be modified to allow the aforementioned functionality in an automation-capable way. If this is true, how it integrates with the rest of p2plab is still a bit unclear.
The Experiments API has been written, and it allows running multiple trials/benchmarks concurrently. While working on this, a race condition was triggered when updating node session state during tracing. This has been disabled for now, as detailed in a code comment.
We want to be able to generate flamegraphs for benchmarks. Expose a /pprof endpoint so that labd can collect profiles and display them somewhere.
The codahale/hdrhistogram repo was transferred under the GitHub HdrHistogram umbrella, with help from the original author, in September 2020 (new repo URL: https://github.com/HdrHistogram/hdrhistogram-go). The main reasons were to group all implementations under the same roof and to enable more active contribution from the community, as the original repository had been archived several years ago.
The dependency URL should be modified to point to the new repository. The tag v0.9.0 was applied at the point of transfer and reflects the exact code that was frozen in the original repository.
If you are using Go modules, you can update to the exact point of transfer using the v0.9.0 tag:
go mod edit -replace github.com/codahale/hdrhistogram=github.com/HdrHistogram/hdrhistogram-go@v0.9.0
From the point of transfer up until now (Mon 16 Aug 2021), we've released 3 versions that aim to support the standard HdrHistogram serialization/exposition formats and to deeply improve read performance. We recommend updating to the latest version.
A query is executed against a cluster and returns a set of matching nodes. The grammar of a query is similar to Lisp because it is simple to parse. A developer can run labctl cluster query <cluster> <query> to test their queries before adding them to a scenario.
query := <label> | (<func> <query|label>)
func  := not | and | or
label := '<label-glob>'
For instance, consider the following cluster:
{
  "nodes": [
    {
      "name": "apple",
      "labels": [
        "everyone",
        "apple",
        "slowdisk",
        "region=us-west-2"
      ]
    },
    {
      "name": "banana",
      "labels": [
        "everyone",
        "banana",
        "region=us-west-2"
      ]
    },
    {
      "name": "cherry",
      "labels": [
        "everyone",
        "cherry",
        "region=us-east-1"
      ]
    }
  ]
}
Then the following queries and their results would be:
apple
["apple"]
(not 'apple')
["banana","cherry"]
(and 'slowdisk' 'region=us-west-2')
["apple"]
(or 'region=us-west-2' 'region=us-east-1')
["apple","banana","cherry"]
(or (not 'slowdisk') 'banana')
["banana","cherry"]
The signature should look something like:
type Query interface {
Match(ctx context.Context, nset NodeSet) (NodeSet, error)
}
Deploy in-memory instances using libp2p mocknet and serialize clusters in boltdb store.
Currently, when walking a merkledag, the walker doesn't look ahead as much as it could. See: ipfs/go-ipld-format#53
We should implement a walker that prefetches as quickly as possible to maximize bandwidth.