c29r3 / solana-snapshot-finder

Automatic search and download of snapshots for Solana

License: GNU General Public License v3.0

Dockerfile 2.59% Python 97.41%
solana crypto cryptocurrency sol web3

solana-snapshot-finder's Introduction

solana-snapshot-finder

What exactly does the script do:

  1. Finds all available RPCs
  2. Gets the number of the current slot
  3. In multi-threaded mode, checks the slot numbers of all snapshots on all RPCs
    *Starting from version 0.1.3, only the first 10 RPCs are speed-tested in a loop.
  4. Sorts the list of RPCs, lowest first, by latency or by slots_diff, where slots_diff = current_slot - snapshot_slot
  5. Checks the download speed from the RPC with the most recent snapshot. If download_speed < min_download_speed, it checks the speed of the next node.
  6. Downloads the snapshot
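
Steps 3 to 5 amount to a filter-and-sort over candidate nodes. The following is a minimal sketch, not the script's actual code. It assumes each candidate is a dict with `slots_diff` and `latency` keys (the names mirror the script's log output); the function name `rank_candidates` and the sample addresses are hypothetical.

```python
def rank_candidates(candidates, max_snapshot_age, max_latency, sort_order="latency"):
    """Drop nodes whose snapshot is too old or whose latency is too high, then sort.

    sort_order is either "latency" or "slots_diff", as in the --sort_order option.
    """
    if sort_order not in ("latency", "slots_diff"):
        raise ValueError(f"unsupported sort order: {sort_order}")
    suitable = [
        c for c in candidates
        if c["slots_diff"] <= max_snapshot_age and c["latency"] <= max_latency
    ]
    # Lowest value first: either the closest node or the freshest snapshot.
    return sorted(suitable, key=lambda c: c[sort_order])


# Hypothetical sample data for illustration only.
nodes = [
    {"snapshot_address": "10.0.0.1:8899", "slots_diff": 959, "latency": 1.8},
    {"snapshot_address": "10.0.0.2:8899", "slots_diff": 120, "latency": 45.0},
    {"snapshot_address": "10.0.0.3:8899", "slots_diff": 5000, "latency": 0.9},
]
by_latency = rank_candidates(nodes, max_snapshot_age=1300, max_latency=100)
by_age = rank_candidates(nodes, max_snapshot_age=1300, max_latency=100, sort_order="slots_diff")
```

The third node is discarded because its snapshot is older than `max_snapshot_age`, no matter how low its latency is.
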
options:
  -h, --help            show this help message and exit
  -t THREADS_COUNT, --threads-count THREADS_COUNT
                        the number of concurrently running threads that check snapshots for rpc nodes
  -r RPC_ADDRESS, --rpc_address RPC_ADDRESS
                        RPC address of the node from which the current slot number will be taken
                        https://api.mainnet-beta.solana.com
  --slot SLOT           search for a snapshot with a specific slot number (useful for network restarts)
  --version VERSION     search for a snapshot from a specific version node
  --max_snapshot_age MAX_SNAPSHOT_AGE
                        How many slots ago the snapshot was created (in slots)
  --min_download_speed MIN_DOWNLOAD_SPEED
                        Minimum average snapshot download speed in megabytes
  --max_download_speed MAX_DOWNLOAD_SPEED
                        Maximum snapshot download speed in megabytes - https://github.com/c29r3/solana-
                        snapshot-finder/issues/11. Example: --max_download_speed 192
  --max_latency MAX_LATENCY
                        The maximum value of latency (milliseconds). If latency > max_latency --> skip
  --with_private_rpc    Enable adding and checking RPCs with the --private-rpc option.This slow down
                        checking and searching but potentially increases the number of RPCs from which
                        snapshots can be downloaded.
  --measurement_time MEASUREMENT_TIME
                        Time in seconds during which the script will measure the download speed
  --snapshot_path SNAPSHOT_PATH
                        The location where the snapshot will be downloaded (absolute path). Example:
                        /home/ubuntu/solana/validator-ledger
  --num_of_retries NUM_OF_RETRIES
                        The number of retries if a suitable server for downloading the snapshot was not
                        found
  --sleep SLEEP         Sleep before next retry (seconds)
  --sort_order SORT_ORDER
                        Priority way to sort the found servers. latency or slots_diff
  -ipb IP_BLACKLIST, --ip_blacklist IP_BLACKLIST
                        Comma separated list of ip addresse (ip:port) that will be excluded from the
                        scan. Example: -ipb 1.1.1.1:8899,8.8.8.8:8899
  -b BLACKLIST, --blacklist BLACKLIST
                        If the same corrupted archive is constantly downloaded, you can exclude it.
                        Specify either the number of the slot you want to exclude, or the hash of the
                        archive name. You can specify several, separated by commas. Example: -b
                        135501350,135501360 or --blacklist 135501350,some_hash
  -v, --verbose         increase output verbosity to DEBUG
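
The `-b` and `-ipb` options both take comma-separated values. Here is a sketch of how such a list could be parsed and matched against archive names. It assumes names of the form `snapshot-<slot>-<hash>.tar.zst`, as seen in the script's logs; `parse_list` and `is_blacklisted` are hypothetical helpers, not functions from the script.

```python
def parse_list(value):
    """Split a comma-separated option value into clean entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def is_blacklisted(archive_name, blacklist):
    """True if any blacklisted slot number or hash is a field of the archive name."""
    stem = archive_name.lstrip("/").split(".tar")[0]
    fields = stem.split("-")  # e.g. ["snapshot", "<slot>", "<hash>"]
    return any(entry in fields for entry in blacklist)


blacklist = parse_list("135501350, some_hash")
skip = is_blacklisted("/snapshot-135501350-abc123.tar.zst", blacklist)
keep = is_blacklisted("/snapshot-135501360-abc123.tar.zst", blacklist)
```

Matching whole dash-separated fields, rather than substrings, avoids excluding slot 1355013500 when only 135501350 was blacklisted.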

Without docker

Install requirements

sudo apt-get update \
&& sudo apt-get install python3-venv git -y \
&& git clone https://github.com/c29r3/solana-snapshot-finder.git \
&& cd solana-snapshot-finder \
&& python3 -m venv venv \
&& source ./venv/bin/activate \
&& pip3 install -r requirements.txt

Start script
Mainnet

python3 snapshot-finder.py --snapshot_path $HOME/solana/validator-ledger

$HOME/solana/validator-ledger/ - path to your validator-ledger

TdS (testnet)

python3 snapshot-finder.py --snapshot_path $HOME/solana/validator-ledger -r http://api.testnet.solana.com

Run via docker

Mainnet

sudo docker pull c29r3/solana-snapshot-finder:latest; \
sudo docker run -it --rm \
-v ~/solana/validator-ledger:/solana/snapshot \
--user $(id -u):$(id -g) \
c29r3/solana-snapshot-finder:latest \
--snapshot_path /solana/snapshot

~/solana/validator-ledger - path to the validator-ledger, where snapshots are stored

TdS (testnet)

sudo docker pull c29r3/solana-snapshot-finder:latest; \
sudo docker run -it --rm \
-v ~/solana/validator-ledger:/solana/snapshot \
--user $(id -u):$(id -g) \
c29r3/solana-snapshot-finder:latest \
--snapshot_path /solana/snapshot \
-r http://api.testnet.solana.com

Update

sudo docker pull c29r3/solana-snapshot-finder:latest

solana-snapshot-finder's People

Contributors

c29r3, meyerbro, redref, unordered-set

solana-snapshot-finder's Issues

Download incremental snapshot after full one

Hello,

I can see people recommending downloading both the full and the incremental snapshot when you don't have any snapshot on your machine:

wget --trust-server-names http://145.40.67.83/snapshot.tar.bz2
wget --trust-server-names http://145.40.67.83/incremental-snapshot.tar.bz2

I believe we should add that too since otherwise we will be too far from the chain...

Thanks!
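
One way to pair the two files is by name. In the script's logs further down this page, archives are named `snapshot-<slot>-<hash>.tar.zst` and `incremental-snapshot-<base_slot>-<slot>-<hash>.tar.zst`, where `base_slot` equals the slot of the matching full snapshot. The sketch below relies on that naming; `matching_incrementals` is a hypothetical helper.

```python
import re

FULL_RE = re.compile(r"^snapshot-(\d+)-(\w+)\.tar\.(?:zst|bz2)$")
INCR_RE = re.compile(r"^incremental-snapshot-(\d+)-(\d+)-(\w+)\.tar\.(?:zst|bz2)$")


def matching_incrementals(full_name, names):
    """Return incremental archives built on top of full_name, oldest first."""
    full = FULL_RE.match(full_name.lstrip("/"))
    if full is None:
        return []
    base_slot = full.group(1)
    found = []
    for name in names:
        incr = INCR_RE.match(name.lstrip("/"))
        if incr and incr.group(1) == base_slot:
            found.append((int(incr.group(2)), name))
    return [name for _, name in sorted(found)]


# File names taken from the log in the "Exception in download() func" issue.
files = [
    "/incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst",
    "/snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst",
]
increments = matching_incrementals(files[1], files)
```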

AttributeError: 'str' object has no attribute 'text'

This happens in the latest version, 0.2.2 (2b6d132899ac).

docker run -it --rm -v /mnt/snapshots:/solana/snapshot --user $(id -u):$(id -g) c29r3/solana-snapshot-finder:latest --snapshot_path /solana/snapshot -r http://api.testnet.solana.com --min_download_speed 50 --max_snapshot_age 300 --measurement_time 5
Version: 0.2.2
https://github.com/c29r3/solana-snapshot-finder
RPC='http://api.testnet.solana.com'
MAX_SNAPSHOT_AGE_IN_SLOTS=300
MIN_DOWNLOAD_SPEED_MB=50
SNAPSHOT_PATH='/solana/snapshot'
THREADS_COUNT=1000
NUM_OF_MAX_ATTEMPTS=5
WITH_PRIVATE_RPC=False
SORT_ORDER='latency'
get_current_slot()
Traceback (most recent call last):
  File "/solana/snapshot-finder.py", line 379, in <module>
    current_slot = get_current_slot()
  File "/solana/snapshot-finder.py", line 152, in get_current_slot
    if 'result' in str(r.text):
AttributeError: 'str' object has no attribute 'text'
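
The traceback suggests that the request helper returned a plain string (for example an error message) where a response object was expected. One defensive pattern, sketched here under that assumption, is to pass only the body text to a parser that returns `None` for anything that is not a valid `getSlot` JSON-RPC reply. `extract_slot` is a hypothetical name, not a function from the script.

```python
import json


def extract_slot(response_text):
    """Return the slot from a getSlot JSON-RPC response body, or None."""
    try:
        payload = json.loads(response_text)
    except (TypeError, ValueError):
        # Not a string at all, or not valid JSON (e.g. an error message).
        return None
    if not isinstance(payload, dict):
        return None
    result = payload.get("result")
    if isinstance(result, int) and not isinstance(result, bool):
        return result
    return None


slot = extract_slot('{"jsonrpc": "2.0", "result": 138523758, "id": 1}')
failed = extract_slot("Connection timed out")
```

The caller can then treat `None` as "could not get the slot" without touching attributes of an object of unknown type.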

Improvement

Hey, first I would like to thank you for this great work!

I was wondering if it would make sense to add an improvement to your script.

I managed to build a very simple bash script that pings all RPC nodes and then orders them by latency, so that I try downloading from the closest ones first.

It's slow (I know), it takes around 2 minutes, but once it starts downloading the wait pays off, since some nodes offer me 10 Gbps links and the download is much faster...

I was wondering if you could add this improvement based on this script:

#!/bin/bash

RPC_LIST=$(solana gossip --output json | jq -r '.[] | select(.rpcHost != null).rpcHost')
LATENCY=100
PING_TIMEOUT=$(awk -v var1=$LATENCY -v var2=1000 'BEGIN { print  ( var1 / var2 ) }')

LIST=""
for RPC in $RPC_LIST;
do
	curl -f -s -m1 -I $RPC/health > /dev/null || continue

	RPC_IP=$(echo $RPC | cut -d":" -f1)
	PING=$(ping -c1 -W$PING_TIMEOUT $RPC_IP | tail -1| awk '{print $4}' | cut -d '/' -f 2 | cut -d"." -f1)
	test $PING || continue

	if [[ "$PING" -le $LATENCY ]] ; then
		LIST+=$(echo "$RPC $PING\n")
	fi
done

GOOD_RPCS=$(echo -e $LIST | head -c -1)

echo -e "GOOD_RPCS:\n$GOOD_RPCS"

ORDERED_RPCS=$(echo -e "$GOOD_RPCS" | sort -n -k2 | head -n 10)
echo -e "\nORDERED_RPCS:\n$ORDERED_RPCS"

echo -e "SCRIPT:\n#!/bin/bash\n"
echo "SNAPSHOTS_FOLDER=/mnt/snapshots"
echo -e "cd \$SNAPSHOTS_FOLDER\n"

#for RPC in $(echo $ORDERED_RPCS);
while IFS= read -r RPC
do
	GOOD_RPC=$(echo $RPC | cut -d" " -f1)
	LATENCY=$(echo $RPC | cut -d" " -f2)
	echo -e "# $LATENCY ms\nwget --trust-server-names http://$GOOD_RPC/snapshot.tar.bz2 && exit 0\n"
done <<< "$ORDERED_RPCS"

echo -e "exit 1"

measurement_time might not be working

Hello, I usually pass --measurement_time so that I let it take a little more time to truly check if the speed is indeed present since I have a 10Gbps connection. But I just tested now passing 15 seconds and it looks like there's a problem with the calculation. The origin host can't provide more than 116Mbps but it showed 360, which is impossible...
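
Part of the confusion may come from units: the help text states speeds in megabytes, while link capacity is usually quoted in megabits per second. The report does not say which unit the displayed 360 was in, so this is only a sketch of the arithmetic. The function names are hypothetical, and a megabyte is taken as 1,000,000 bytes.

```python
def average_speed_mb(samples):
    """Average speed in MB/s over a window.

    samples is a list of (timestamp_seconds, total_bytes_received) pairs;
    only the first and last sample are used.
    """
    (t_start, b_start), (t_end, b_end) = samples[0], samples[-1]
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("measurement window must be longer than zero seconds")
    return (b_end - b_start) / elapsed / 1_000_000


def mbps_to_mb_per_s(megabits_per_second):
    """Convert a link rate in Mbit/s to MB/s (8 bits per byte)."""
    return megabits_per_second / 8


link_limit = mbps_to_mb_per_s(116)                          # 14.5 MB/s
measured = average_speed_mb([(0.0, 0), (5.0, 72_500_000)])  # 14.5 MB/s
```

Averaging over the whole window, rather than over the last chunk, keeps a short initial burst from inflating the result.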

Feature Request

It might be interesting to cache the best servers, or at least the ones that have already served snapshots, together with their announced/tested speed. On the next run the script could then try those servers first, before searching the full list of nodes. Just an idea!

Thanks for your great work!
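
A minimal sketch of such a cache, assuming a small JSON file that maps server address to the last measured speed. The file name, its layout, and all function names are hypothetical.

```python
import json
import os
import tempfile


def load_cache(path):
    """Return {address: speed_mb}; an empty dict if the file is missing or invalid."""
    try:
        with open(path) as fh:
            data = json.load(fh)
    except (FileNotFoundError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def remember_server(path, address, speed_mb):
    """Store or update the measured speed for one server."""
    cache = load_cache(path)
    cache[address] = speed_mb
    with open(path, "w") as fh:
        json.dump(cache, fh)


def cached_servers_fastest_first(path):
    """Addresses from the cache, highest measured speed first."""
    cache = load_cache(path)
    return sorted(cache, key=cache.get, reverse=True)


# Usage with a throwaway directory; the addresses are placeholders.
with tempfile.TemporaryDirectory() as tmp_dir:
    cache_file = os.path.join(tmp_dir, "servers.json")
    remember_server(cache_file, "10.0.0.1:8899", 50.0)
    remember_server(cache_file, "10.0.0.2:8899", 95.8)
    order = cached_servers_fastest_first(cache_file)
```

Cached entries would still need a fresh snapshot-age check on each run, since a fast server may be serving an old snapshot.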

OVH issue: limitation due to maximum download speed

Hosting provider OVH gives a bandwidth of 2 gigabits per second, while the network interface operates in 10 gigabit mode and the speed is actually not limited. Because of this, when downloading about 19-20 GB of data at maximum speed, OVH sends an "attack" notification to the server and filters the traffic
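
The `--max_download_speed` option exists for this case. How the script enforces the cap is not shown on this page; the sketch below illustrates one generic way to hold average throughput under a limit, by sleeping whenever the download runs ahead of schedule. `throttle_delay` is a hypothetical helper, and a megabyte is taken as 1,000,000 bytes.

```python
def throttle_delay(bytes_received, elapsed_seconds, max_mb_per_s):
    """Seconds to pause so the average rate stays at or below max_mb_per_s."""
    # Minimum time the transfer should have taken at the capped rate.
    min_duration = bytes_received / (max_mb_per_s * 1_000_000)
    return max(0.0, min_duration - elapsed_seconds)


# 192 MB received in 0.5 s with a 192 MB/s cap: wait another 0.5 s.
pause = throttle_delay(192_000_000, 0.5, 192)
# Already slower than the cap: no pause needed.
no_pause = throttle_delay(192_000_000, 2.0, 192)
```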

Don't assume wget is located at /usr/bin

Great utility.

One issue: it would be good to not assume wget is in /usr/bin, or at least provide a way to override this without modifying the script. For example on NixOS wget is located elsewhere.
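
A sketch of one way to do this: look `wget` up on `PATH` with `shutil.which`, and allow an explicit override. The environment variable name `WGET_PATH` and the function `find_wget` are assumptions made for illustration; the script does not necessarily support them.

```python
import os
import shutil


def find_wget(env_var="WGET_PATH"):
    """Return the wget executable path: explicit override first, then PATH lookup."""
    override = os.environ.get(env_var)
    if override:
        return override
    path = shutil.which("wget")
    if path is None:
        raise FileNotFoundError(
            f"wget was not found on PATH; install it or set {env_var}"
        )
    return path
```

On NixOS, for example, the `PATH` lookup finds wget wherever the current profile exposes it, so no hard-coded `/usr/bin` is needed.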

Solana 1.10.26: cannot find a snapshot

The log repeats as shown below:

 docker pull c29r3/solana-snapshot-finder:latest; \
> sudo docker run -it --rm \
> -v /data/sol/download_snapshot:/solana/snapshot \
> --user $(id -u):$(id -g) \
> c29r3/solana-snapshot-finder:latest \
> --snapshot_path /solana/snapshot
latest: Pulling from c29r3/solana-snapshot-finder
99046ad9247f: Pull complete
dae61f727682: Pull complete
466485ee6277: Pull complete
de739b056673: Pull complete
1374279231a5: Pull complete
14c6393d521b: Pull complete
4f4fb700ef54: Pull complete
0572cf8dc607: Pull complete
8d8e63246265: Pull complete
Digest: sha256:dcdf2f5584568c53c00aed8988758373e47269e99e1bc3b370ffc19e2c9873af
Status: Downloaded newer image for c29r3/solana-snapshot-finder:latest
docker.io/c29r3/solana-snapshot-finder:latest
2022-06-22 05:15:02,104 [INFO] Version: 0.2.8
2022-06-22 05:15:02,104 [INFO] https://github.com/c29r3/solana-snapshot-finder

2022-06-22 05:15:02,104 [INFO] RPC='https://api.mainnet-beta.solana.com'
MAX_SNAPSHOT_AGE_IN_SLOTS=1300
MIN_DOWNLOAD_SPEED_MB=60
SNAPSHOT_PATH='/solana/snapshot'
THREADS_COUNT=1000
NUM_OF_MAX_ATTEMPTS=5
WITH_PRIVATE_RPC=False
SORT_ORDER='slots_diff'
2022-06-22 05:15:02,224 [INFO] Attempt number: 1. Total attempts: 5
2022-06-22 05:15:02,409 [INFO] RPC servers in total: 723 | Current slot number: 138523758

2022-06-22 05:15:02,410 [INFO] Can't find any full local snapshots in this path /solana/snapshot --> the search will be carried out on full snapshots
Searching information about snapshots on all found RPCs
 99%|█████████▋| 718/723 [00:07<00:00, 39.77it/s]
2022-06-22 05:15:11,320 [INFO] Found suitable RPCs: 0
2022-06-22 05:15:11,321 [INFO] No snapshot nodes were found matching the given parameters: args.max_snapshot_age=1300
2022-06-22 05:15:11,322 [INFO] Sleeping 30 seconds before next try
2022-06-22 05:15:41,628 [INFO] Attempt number: 2. Total attempts: 5
2022-06-22 05:15:42,126 [INFO] RPC servers in total: 723 | Current slot number: 138523818

2022-06-22 05:15:42,126 [INFO] Can't find any full local snapshots in this path /solana/snapshot --> the search will be carried out on full snapshots
Searching information about snapshots on all found RPCs
 99%|█████████▉| 719/723 [00:07<00:00, 34.67it/s]
2022-06-22 05:15:51,019 [INFO] Found suitable RPCs: 0
2022-06-22 05:15:51,020 [INFO] No snapshot nodes were found matching the given parameters: args.max_snapshot_age=1300
2022-06-22 05:15:51,021 [INFO] Sleeping 30 seconds before next try

ModuleNotFoundError: No module named 'distutils' (Python 3.12)

Hi + thanks for your great work!

ran into this issue ....

python3.12 snapshot-finder.py --snapshot_path fo
Traceback (most recent call last):
  File "/home/solana/solana-snapshot-finder/snapshot-finder.py", line 1, in <module>
    from distutils.log import debug
ModuleNotFoundError: No module named 'distutils'

tl;dr: in Python 3.12 the distutils module has been removed from the standard library, which results in the error above.

Workaround:

  • stick with Python 3.10

Solution:

  • remove from distutils.log import debug
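
If the imported `debug` function is actually called somewhere in the script (not verified here), the standard `logging` module is a drop-in source for the same kind of call on any Python 3 version. This is a sketch: the logger name is an assumption, and the format string is chosen to resemble the log lines shown on this page.

```python
import logging

# Configure the root logger; this is a no-op if handlers are already set up.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s [%(levelname)s] %(message)s",
)
logger = logging.getLogger("snapshot-finder")  # hypothetical logger name
logger.setLevel(logging.DEBUG)

# Replaces a call such as debug("...") from distutils.log.
logger.debug("checking snapshot on RPC node")
```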

Background:

Can't get current slot

docker run -it --rm -v /mnt/solana/snapshots:/solana/snapshot --user $(id -u):$(id -g) c29r3/solana-snapshot-finder:latest --snapshot_path /solana/snapshot -r http://api.testnet.solana.com --min_download_speed 50 --max_snapshot_age 1000 --measurement_time 5 --num_of_retries 100
Version: 0.2.3
https://github.com/c29r3/solana-snapshot-finder


RPC='http://api.testnet.solana.com'
MAX_SNAPSHOT_AGE_IN_SLOTS=1000
MIN_DOWNLOAD_SPEED_MB=50
SNAPSHOT_PATH='/solana/snapshot'
THREADS_COUNT=1000
NUM_OF_MAX_ATTEMPTS=100
WITH_PRIVATE_RPC=False
SORT_ORDER='latency'
get_current_slot()
Can't get current slot

I believe this shouldn't just close the process; it should try again after 30 seconds... Thanks!
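
The requested behaviour can be sketched as a small retry wrapper. It assumes the slot lookup returns `None` on failure. The `retry` helper and its parameters are hypothetical, and the `sleep` argument is injectable so the example runs without waiting.

```python
import time


def retry(func, attempts, delay_seconds, sleep=time.sleep):
    """Call func until it returns something other than None, or attempts run out."""
    for attempt in range(1, attempts + 1):
        result = func()
        if result is not None:
            return result
        if attempt < attempts:
            # Wait before the next try; no wait after the final failure.
            sleep(delay_seconds)
    return None


# Simulated lookup that fails twice and then succeeds.
responses = iter([None, None, 138523758])
waits = []
slot = retry(lambda: next(responses), attempts=5, delay_seconds=30, sleep=waits.append)
```

Here `waits` records two 30-second pauses before the third call succeeds.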

MismatchedSlotHash

Most of the time (75% in my case) I get "datapoint: panic program="validator" thread="main" one=1i message="panicked at 'Load from snapshot failed: MismatchedSlotHash..." when I download a snapshot on mainnet using this tool...

Is this expected? Can we avoid it?

[ERROR] Exception in download() func

2022-06-28 01:45:26,460 [INFO] Suitable snapshot server found: rpc_node={'snapshot_address': '5.9.59.245:8899', 'slots_diff': 959, 'latency': 1.824, 'files_to_download': ['/incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst', '/snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst'], 'cost': 506.5} down_speed_mb='95.8 MB'
2022-06-28 01:45:26,460 [INFO] Downloading http://5.9.59.245:8899/snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst snapshot to /solana/snapshot
snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 0.99G/0.99G [00:10<00:00, 101MiB/s]
2022-06-28 01:45:36,918 [INFO] Rename the downloaded file /solana/snapshot/tmp-snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst --> snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst
2022-06-28 01:45:36,919 [ERROR] Exception in download() func
[Errno 18] Invalid cross-device link: '/solana/snapshot/tmp-snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst' -> 'snapshot-138832821-2Njp4xoMAuBgiVpDpBRHXgXABsZQYLBXbkcvdic49Squ.tar.zst'
2022-06-28 01:45:36,919 [INFO] Downloading http://5.9.59.245:8899/incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst snapshot to /solana/snapshot
incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst: 100%|████████████████████████████████████████████████████████████████████████████| 78.3M/78.3M [00:00<00:00, 89.6MiB/s]
2022-06-28 01:45:37,836 [INFO] Rename the downloaded file /solana/snapshot/tmp-incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst --> incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst
2022-06-28 01:45:37,837 [ERROR] Exception in download() func
[Errno 18] Invalid cross-device link: '/solana/snapshot/tmp-incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst' -> 'incremental-snapshot-138832821-138839789-7yuKnzNdBbVdd2oauS9xazoFKwfF5xchr4jk3Qjk3jzL.tar.zst'
2022-06-28 01:45:37,837 [INFO] Done

Looks like it's not downloading a full snapshot because of an error and then it proceeds downloading the incremental and fails again, leaving tmp-files there...

[Errno 18] Invalid cross-device link...

This is testnet, but possibly related to the latest change to fix the other bug in mainnet...
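
In the log above, the rename source is an absolute path inside /solana/snapshot while the destination is a bare file name. One plausible reading is that the destination resolves against the process's working directory, which can sit on a different filesystem than the mounted volume; a rename across filesystems fails with Errno 18 (EXDEV). The sketch below, under that assumption, keeps both paths in the same directory. `finalize_download` is a hypothetical helper, and the file names use a placeholder instead of a real hash.

```python
import os
import tempfile


def finalize_download(tmp_path, final_name):
    """Rename a finished tmp- file to its final name inside the same directory."""
    final_path = os.path.join(os.path.dirname(tmp_path), final_name)
    # Same directory means same filesystem, so the rename cannot hit EXDEV.
    os.replace(tmp_path, final_path)
    return final_path


with tempfile.TemporaryDirectory() as snapshot_dir:
    tmp_file = os.path.join(snapshot_dir, "tmp-snapshot-138832821-example.tar.zst")
    with open(tmp_file, "wb") as fh:
        fh.write(b"placeholder")
    final_file = finalize_download(tmp_file, "snapshot-138832821-example.tar.zst")
    renamed_ok = os.path.exists(final_file) and not os.path.exists(tmp_file)
```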
