
trudi-group / ipfs-crawler


A crawler for the IPFS network, code for our paper (https://arxiv.org/abs/2002.07747). Also holds scripts to evaluate the obtained data and make similar plots as in the paper.

License: MIT License

Go 41.99% Makefile 3.14% Shell 1.33% R 31.33% Python 9.40% Lua 4.90% TeX 6.53% Dockerfile 1.21% Perl 0.17%
ipfs ipfs-network libp2p kademlia-dht crawler

ipfs-crawler's People

Contributors

daviddias, dependabot[bot], harlequix, jorropo, mrd0ll4r


ipfs-crawler's Issues

Crawling exits immediately

When I try to run a crawl with ./start_crawl, I get no results: the crawler just exits immediately, as shown in the output below.
I'm running it on Ubuntu 18.04.4 with go version go1.14.3.

INFO[19:59:22] Thank you for running our IPFS Crawler!
INFO[19:59:22] Checking whether weak RSA keys are allowed...  weak_RSA_keys=true
INFO[19:59:22] Creating workers...                           numberOfWorkers=1
INFO[19:59:22] Adding cached peer to crawl queue.            amount=0
INFO[19:59:22] Starting crawl...
INFO[19:59:24] Stopping crawl...
INFO[19:59:24] Crawl finished. Summary of results.           connectable nodes=0 end time:="20-08-20--19:59:24" number of nodes=0 start time="20-08-20--19:59:22"
INFO[19:59:24] Online nodes saved in cache                   File=nodes.cache

Null nodes found

Hello!
I have a problem running the crawler. Although I have tried every case I could think of, I always get the output below after executing the following command:

export LIBP2P_ALLOW_WEAK_RSA_KEYS="" && go run cmd/ipfs-crawler/main.go

INFO[20:04:07] Thank you for running our IPFS Crawler!
INFO[20:04:07] Checking whether weak RSA keys are allowed... weak_RSA_keys=true
INFO[20:04:16] Creating workers... numberOfWorkers=5
DEBUG[20:04:16] Size of Queue capacity=2000 maxCap=2000 sumCap=2000
DEBUG[20:04:16] Size of Queue QueueSize=2000
DEBUG[20:04:16] Size of Queue capacity=2000 maxCap=2000 sumCap=4000
DEBUG[20:04:16] Size of Queue QueueSize=4000
DEBUG[20:04:16] Size of Queue capacity=2000 maxCap=2000 sumCap=6000
DEBUG[20:04:16] Size of Queue QueueSize=6000
DEBUG[20:04:16] Size of Queue capacity=2000 maxCap=2000 sumCap=8000
DEBUG[20:04:16] Size of Queue QueueSize=8000
DEBUG[20:04:17] Size of Queue capacity=2000 maxCap=2000 sumCap=10000
DEBUG[20:04:17] Size of Queue QueueSize=10000
INFO[20:04:17] Adding cached peer to crawl queue. amount=0
INFO[20:04:17] Starting crawl...
DEBUG[20:04:17] Adding bootstraps
DEBUG[20:04:17] Dispatch crawler request node=QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN
DEBUG[20:04:17] Dispatch crawler request node=QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa
DEBUG[20:04:17] Dispatch crawler request node=QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb
DEBUG[20:04:17] Dispatch crawler request node=QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:04:17] Dispatch crawler request node=QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
DEBUG[20:04:17] Dispatch crawler request node=QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/tcp/4001]}"
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/udp/4001/quic]}"
DEBUG[20:04:17] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:04:17] Could not connect. IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/udp/4001/quic]}" err="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: no good addresses"
DEBUG[20:04:17] Error while crawling Error="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: no good addresses"
DEBUG[20:04:18] Could not connect. IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/tcp/4001]}" err="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: all dials failed\n * [/ip4/104.131.131.82/tcp/4001] failed to negotiate stream multiplexer: remote error: tls: bad certificate"
DEBUG[20:04:18] Error while crawling Error="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: all dials failed\n * [/ip4/104.131.131.82/tcp/4001] failed to negotiate stream multiplexer: remote error: tls: bad certificate"
DEBUG[20:04:27] Could not connect. IPFSWorkerID=0 destAddr="{QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: no good addresses"
DEBUG[20:04:27] Error while crawling Error="failed to dial QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: no good addresses"
DEBUG[20:04:27] Could not connect. IPFSWorkerID=0 destAddr="{QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: no good addresses"
DEBUG[20:04:27] Error while crawling Error="failed to dial QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: no good addresses"
DEBUG[20:04:27] Could not connect. IPFSWorkerID=0 destAddr="{QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: no good addresses"
DEBUG[20:04:27] Error while crawling Error="failed to dial QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: no good addresses"
DEBUG[20:04:27] Could not connect. IPFSWorkerID=0 destAddr="{QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: no good addresses"
DEBUG[20:04:27] Error while crawling Error="failed to dial QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: no good addresses"
INFO[20:04:27] Stopping crawl...
INFO[20:04:27] Crawl finished. Summary of results. connectable nodes=0 end time:="18-10-20--20:04:27" number of nodes=0 start time="18-10-20--20:04:07"
INFO[20:04:27] Online nodes saved in cache File=nodes.cache
christos@christos-VirtualBox:~/Desktop/Ipfscrawler/ipfs-crawler$ export LIBP2P_ALLOW_WEAK_RSA_KEYS="" && go run cmd/ipfs-crawler/main.go > outp.txt
INFO[20:06:02] Thank you for running our IPFS Crawler!
INFO[20:06:02] Checking whether weak RSA keys are allowed... weak_RSA_keys=true
INFO[20:06:11] Creating workers... numberOfWorkers=5
DEBUG[20:06:11] Size of Queue capacity=2000 maxCap=2000 sumCap=2000
DEBUG[20:06:11] Size of Queue QueueSize=2000
DEBUG[20:06:11] Size of Queue capacity=2000 maxCap=2000 sumCap=4000
DEBUG[20:06:11] Size of Queue QueueSize=4000
DEBUG[20:06:12] Size of Queue capacity=2000 maxCap=2000 sumCap=6000
DEBUG[20:06:12] Size of Queue QueueSize=6000
DEBUG[20:06:12] Size of Queue capacity=2000 maxCap=2000 sumCap=8000
DEBUG[20:06:12] Size of Queue QueueSize=8000
DEBUG[20:06:12] Size of Queue capacity=2000 maxCap=2000 sumCap=10000
DEBUG[20:06:12] Size of Queue QueueSize=10000
INFO[20:06:12] Adding cached peer to crawl queue. amount=0
INFO[20:06:12] Starting crawl...
DEBUG[20:06:12] Adding bootstraps
DEBUG[20:06:12] Dispatch crawler request node=QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN
DEBUG[20:06:12] Dispatch crawler request node=QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa
DEBUG[20:06:12] Dispatch crawler request node=QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb
DEBUG[20:06:12] Dispatch crawler request node=QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt
DEBUG[20:06:12] Dispatch crawler request node=QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
DEBUG[20:06:12] Dispatch crawler request node=QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/udp/4001/quic]}"
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/tcp/4001]}"
DEBUG[20:06:12] IPFSWorker connecting to IPFSWorkerID=0 destAddr="{QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: [/dnsaddr/bootstrap.libp2p.io]}"
DEBUG[20:06:12] Could not connect. IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/udp/4001/quic]}" err="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: no good addresses"
DEBUG[20:06:12] Error while crawling Error="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: no good addresses"
DEBUG[20:06:13] Could not connect. IPFSWorkerID=0 destAddr="{QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: [/ip4/104.131.131.82/tcp/4001]}" err="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: all dials failed\n * [/ip4/104.131.131.82/tcp/4001] failed to negotiate stream multiplexer: remote error: tls: bad certificate"
DEBUG[20:06:13] Error while crawling Error="failed to dial QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ: all dials failed\n * [/ip4/104.131.131.82/tcp/4001] failed to negotiate stream multiplexer: remote error: tls: bad certificate"
DEBUG[20:06:22] Could not connect. IPFSWorkerID=0 destAddr="{QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: no good addresses"
DEBUG[20:06:22] Error while crawling Error="failed to dial QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb: no good addresses"
DEBUG[20:06:22] Could not connect. IPFSWorkerID=0 destAddr="{QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: no good addresses"
DEBUG[20:06:22] Error while crawling Error="failed to dial QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt: no good addresses"
DEBUG[20:06:22] Could not connect. IPFSWorkerID=0 destAddr="{QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: no good addresses"
DEBUG[20:06:22] Error while crawling Error="failed to dial QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa: no good addresses"
DEBUG[20:06:22] Could not connect. IPFSWorkerID=0 destAddr="{QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: [/dnsaddr/bootstrap.libp2p.io]}" err="failed to dial QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: no good addresses"
DEBUG[20:06:22] Error while crawling Error="failed to dial QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN: no good addresses"
INFO[20:06:22] Stopping crawl...
INFO[20:06:22] Crawl finished. Summary of results. connectable nodes=0 end time:="18-10-20--20:06:22" number of nodes=0 start time="18-10-20--20:06:02"
INFO[20:06:22] Online nodes saved in cache File=nodes.cache
