Download logic:
- Download the names of all files in a subfolder of a bucket into a txt file
- Download the data files to the local machine, then delete the txt file that contains the names
- If the data is in csv.gz format, unzip it and delete the original file
- The downloaded folder structure will be the same as on Wasabi / AWS
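The local part of this logic can be sketched roughly as follows. This is a minimal illustration, not the code from the scripts: the helper names (`local_path_for_key`, `gunzip_and_remove`, `post_process`) are hypothetical, and it assumes the txt file holds one object key per line and that the data files have already been downloaded.

```python
import gzip
import shutil
from pathlib import Path
from typing import List


def local_path_for_key(root: Path, key: str) -> Path:
    """Map an object key such as 'a/b/file.csv.gz' to root/a/b/file.csv.gz."""
    return root.joinpath(*key.split("/"))


def gunzip_and_remove(path: Path) -> Path:
    """Decompress 'x.csv.gz' to 'x.csv' next to it, then delete the archive."""
    target = path.with_suffix("")  # strips only the trailing '.gz'
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return target


def post_process(root: Path, names_file: Path) -> List[Path]:
    """Unzip every .csv.gz listed in names_file, then delete names_file."""
    results = []
    for line in names_file.read_text().splitlines():
        key = line.strip()
        if not key or key.endswith("/"):
            continue  # skip blank lines and folder placeholder keys
        local = local_path_for_key(root, key)
        if local.name.endswith(".csv.gz"):
            local = gunzip_and_remove(local)
        results.append(local)
    names_file.unlink()  # the txt file with the names is no longer needed
    return results
```

Keeping the key-to-path mapping in one place is what preserves the bucket's folder structure locally.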
- Install the AWS CLI
- Configure the AWS CLI, then check that the directory ~/.aws (macOS) exists and inspect its contents. It should contain 2 files:
- config
- credentials
Config should have 2 sections:
- [default]
- [profile wasabi]
Credentials should have 2 sections:
- [default]
- [wasabi]
This configuration is important because the scripts use the AWS CLI configuration directly for access.
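A quick way to verify the layout described above is a small check with Python's standard `configparser`. This is only a sketch: the function name `missing_sections` is hypothetical, and it checks that the sections exist, not that the keys inside them are valid.

```python
import configparser
from pathlib import Path

# Sections expected in each file, as listed above.
EXPECTED = {
    "config": ["default", "profile wasabi"],
    "credentials": ["default", "wasabi"],
}


def missing_sections(aws_dir: Path) -> dict:
    """Return {file name: [missing section names]} for the two AWS CLI files."""
    problems = {}
    for file_name, sections in EXPECTED.items():
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(aws_dir / file_name)  # a missing file is silently skipped
        missing = [s for s in sections if not parser.has_section(s)]
        if missing:
            problems[file_name] = missing
    return problems


# Example usage (reads the real directory, so it is left commented out):
# print(missing_sections(Path.home() / ".aws"))
```

An empty dictionary means both files contain the expected sections.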
- Find the location of the files on Wasabi / AWS S3 (subfolder, etc.)
- Open a terminal, cd to the root folder, and run:
pip install -r requirements.txt
- Run the script download_wasabi.py after modifying the function default values, etc. Have your MFA token (6 digits) ready. More specifically, modify the following:
params_init = {
    'bucket_name': 'trades-data',
    'end_point_url': 'https://s3.us-east-2.wasabisys.com',
    'aws_arn': 'iam::100000052685:user/zhenning.li'
}
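How the script combines `aws_arn` with the 6-digit MFA token is not shown here. A common pattern is to request temporary credentials from STS, so the sketch below only illustrates that idea. Both helper names are hypothetical, and the mapping from the user ARN to an MFA device serial (`...:mfa/<user>`) is an assumption that may not match your account setup.

```python
import re


def mfa_serial_from_user_arn(aws_arn: str) -> str:
    """ASSUMPTION: 'iam::<account>:user/<name>' maps to 'arn:aws:iam::<account>:mfa/<name>'."""
    match = re.fullmatch(r"(?:arn:aws:)?iam::(\d+):user/(.+)", aws_arn)
    if match is None:
        raise ValueError(f"unexpected user ARN format: {aws_arn!r}")
    account, user = match.groups()
    return f"arn:aws:iam::{account}:mfa/{user}"


def session_token_kwargs(aws_arn: str, token: str, duration_seconds: int = 3600) -> dict:
    """Validate the 6-digit MFA token and build arguments for STS get_session_token."""
    if not re.fullmatch(r"\d{6}", token):
        raise ValueError("the MFA token must be exactly 6 digits")
    return {
        "SerialNumber": mfa_serial_from_user_arn(aws_arn),
        "TokenCode": token,
        "DurationSeconds": duration_seconds,
    }


# Example usage with boto3 (needs network access and valid credentials,
# so it is left commented out):
# import boto3
# sts = boto3.client("sts")
# creds = sts.get_session_token(**session_token_kwargs(aws_arn, "123456"))["Credentials"]
```

Validating the token before the call avoids spending a request on an obviously mistyped code.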
- Check the result stored in ~/database_wasabi_mfa
For AWS S3, I didn't use a randomly generated encrypted key, so the script is easier to run but also less secure. A correct configuration is a must; AWS S3 is stricter than Wasabi in this respect.
- Run the script download_aws_s3.py; modifying only the bucket name and folder name is enough.
bucket_name = 'dumps-kaiko'
folder_name = 'markets/aggregated_trades/'
- Prepare your MFA token (6 digits)
- Check the result stored in ~/database_aws_mfa
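To illustrate how `bucket_name` and `folder_name` select the files to download, here is a minimal sketch that lists every object key under a prefix. It is not taken from download_aws_s3.py: the function name `iter_keys` is hypothetical, and it assumes a boto3 S3 client is passed in.

```python
from typing import Iterator


def iter_keys(s3_client, bucket_name: str, folder_name: str) -> Iterator[str]:
    """Yield every object key under folder_name, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=folder_name):
        # 'Contents' is absent when a page has no objects.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith("/"):  # skip folder placeholder objects
                yield key


# Example usage with boto3 (needs network access and valid credentials,
# so it is left commented out):
# import boto3
# client = boto3.client("s3")
# for key in iter_keys(client, "dumps-kaiko", "markets/aggregated_trades/"):
#     print(key)
```

Using the paginator matters because a single listing call returns at most 1000 keys.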
Author: Zhenning Li
Last update: 2023-2-20