Comments (9)
I think I understand what you have in mind here and I agree with the idea, but can you elaborate on how you would implement this?
Something like

    class SentinelAPI:
        def _progress_bar(self, iterable):
            # Lazy import: tqdm is only needed when the default bar is used
            from tqdm import tqdm
            for package in tqdm(iterable):
                yield package

        def download(self):
            packages = []  # placeholder for the products to download
            for package in self._progress_bar(packages):
                _download_package(package)

so I can do

    api = SentinelAPI()
    api._progress_bar = some_qgis_progress_bar_func
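A slightly fuller, runnable take on the same idea, as a rough sketch only: `DemoAPI`, `recording_progress_bar` and the product names are made up for illustration and are not part of sentinelsat. tqdm is still imported lazily; falling back to plain iteration when it is not installed is an assumption on my part.

```python
class DemoAPI:
    """Hypothetical stand-in for SentinelAPI, only to show the hook."""

    def _progress_bar(self, iterable):
        # Lazy import: tqdm is only needed if this default is actually used.
        try:
            from tqdm import tqdm
        except ImportError:
            # Assumed fallback: iterate without any progress display.
            return iter(iterable)
        return tqdm(iterable)

    def download(self, packages):
        downloaded = []
        for package in self._progress_bar(packages):
            downloaded.append(package)  # placeholder for the real download
        return downloaded


seen = []


def recording_progress_bar(iterable):
    """Drop-in replacement, e.g. where a GUI progress bar would be updated."""
    for item in iterable:
        seen.append(item)
        yield item


api = DemoAPI()
# Assigning on the instance stores a plain function, so it is called
# without `self` and only receives the iterable.
api._progress_bar = recording_progress_bar
result = api.download(["product_a", "product_b"])
```

Because the replacement is set on the instance rather than the class, other `DemoAPI` objects keep the default tqdm-based bar.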
from sentinelsat.
I do not have much experience with (the shortcomings of) homura, but I can see several advantages of getting rid of that dependency chain, one being that we/I/someone could easily make a QGIS sentinelsat plugin.
So the progress bar with tqdm (very nice choice) would be based on (parallel) iteration over download chunks?
Ideally for me, the progress bar should be a lazy dependency and replaceable by a drop-in (e.g. by overwriting an API class method).
The maximum number of download threads would be two, obviously?
from sentinelsat.
Great idea @valgur!
From what I can remember, homura was implemented by @willemarcel mostly due to its convenience (integrated progress bar etc.) - so it is not set in stone as a dependency.
- removing a compiled dependency is a big, big plus, especially on Windows systems or for novice Python users (and enables easier integration into other tools like QGIS, etc.)
- we need to keep an eye on performance, but curl vs. requests benchmarks look like large file transfers should be about the same speed, and small requests in the form of searches are already done with requests anyway
- for most applications I don't see the benefit of parallel download support. On all systems I tested, the throughput is saturated with a single transfer anyway. Since scihub limits each account to two connections at a time, we would lose the ability to download and perform searches at the same time. This could pose problems should we change the query to a generator (#64). It would be nice to benchmark whether concurrent downloads are faster, but I doubt it.
- better logging would be nice, also paving the way for a --verbose or --log option for the CLI, which would help us analyze issues people are having (e.g. in #89)
- I'm always in favour of better testing
I'm all in favour of replacing homura with requests, mostly to get rid of the curl dependency. Better logging, testing and progress bars are added benefits.
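To make the logging point a bit more concrete, here is a minimal sketch of how a --verbose and a --log option could be wired to the standard logging module. The CLI name `demo-cli`, the logger name `demo` and the exact semantics of both flags are assumptions for illustration, not the actual sentinelsat CLI.

```python
import argparse
import logging


def build_parser():
    parser = argparse.ArgumentParser(prog="demo-cli")  # hypothetical CLI name
    parser.add_argument("--verbose", action="store_true",
                        help="print debug messages")
    parser.add_argument("--log", metavar="FILE", default=None,
                        help="also write log messages to FILE")
    return parser


def configure_logging(args):
    logger = logging.getLogger("demo")  # hypothetical logger name
    # Debug output only when --verbose is given, warnings and errors otherwise.
    logger.setLevel(logging.DEBUG if args.verbose else logging.WARNING)
    logger.handlers.clear()
    logger.addHandler(logging.StreamHandler())  # writes to stderr
    if args.log:
        logger.addHandler(logging.FileHandler(args.log))
    return logger


args = build_parser().parse_args(["--verbose"])
logger = configure_logging(args)
logger.debug("starting download")
```

A user reporting a problem could then rerun the same command with --verbose and attach the output to the issue.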
from sentinelsat.
Ideally for me, the progress bar should be a lazy dependency and replaceable by a drop-in (e.g. by overwriting an API class method).
I think I understand what you have in mind here and I agree with the idea, but can you elaborate on how you would implement this?
we need to keep an eye on performance, but curl vs. requests benchmarks look like large file transfers should be about the same speed, and small requests in the form of search is done already with requests anyway
I definitely agree. I've looked at the same results you linked to, but they measure a slightly different use case. They measure the speed of running new queries against the server, while in our case we repeatedly receive chunks from the same open connection. I suspect pycurl and requests perform relatively close in such a case, but I think I'll run some benchmarks myself to verify that.
Regarding parallel downloads, the number of download threads would be configurable and would indeed default to two, for obvious reasons. In my experience the download speeds vary greatly depending on the load on the servers and on which specific server you happen to connect to. I suspect the latter because I've noticed the download speed toggling between a reasonable 8 MB/s and a meager 1 MB/s when restarting a download within just a minute.
It's also worth keeping in mind that sentinelsat can be used with other hosts besides the main Copernicus Data Hub. I have some experience using it with the Finnish Data Hub. The Finnish Hub does not limit the number of concurrent downloads or the download speed at all. I did some testing and I needed two parallel downloads there to max out my local connection. I assume the other national hosts are similar in being quite relaxed in their download limits.
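As a rough illustration of what a configurable thread count could look like, here is a sketch using a thread pool from the standard library. The names `fake_download`, `download_all` and `n_concurrent_dl` are hypothetical, and the "download" just sleeps instead of touching the network.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

_active = 0
peak_active = 0  # highest number of simultaneous "downloads" observed
_lock = threading.Lock()


def fake_download(product_id):
    """Placeholder for a real transfer; sleeps instead of using the network."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.05)
    with _lock:
        _active -= 1
    return product_id


def download_all(product_ids, n_concurrent_dl=2):
    # The default of two follows the per-account connection limit of the
    # main hub; other hubs could be given a higher value.
    with ThreadPoolExecutor(max_workers=n_concurrent_dl) as pool:
        # map() returns results in the same order as the input ids.
        return list(pool.map(fake_download, product_ids))


results = download_all(["a", "b", "c", "d"])
```

The pool never runs more than `n_concurrent_dl` transfers at once, so the same code would respect a hub's limit or use a more relaxed one depending on a single parameter.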
Thanks for the feedback, both of you. I appreciate it.
from sentinelsat.
It's also worth keeping in mind that sentinelsat can be used with other hosts besides the main Copernicus Data Hub.
Good point. I just tested up to 4 connections to another hub.
If you want to implement threaded downloads, I think you should. Giving the users the option to set their perfect/allowed/preferred number of connections is a good idea. I was worried about the programming effort necessary to implement it (plus the tests for one thread failing while the other survives, etc.) - but once the tests are in place maintenance shouldn't be an issue. So if you want to do that - knock yourself out 👍
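For the "one thread fails while the other survives" case, something along these lines might be a starting point for a test. It is only a sketch: `flaky_download` and the product ids are invented, and the failure is simulated rather than coming from a real server.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def flaky_download(product_id):
    """Placeholder transfer that fails for one made-up product id."""
    if product_id == "broken":
        raise ConnectionError("simulated failure for " + product_id)
    return product_id


def download_all(product_ids, max_workers=2):
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(flaky_download, pid): pid for pid in product_ids}
        for future in as_completed(futures):
            pid = futures[future]
            try:
                succeeded.append(future.result())
            except Exception as exc:
                # One failed transfer is recorded; the others keep running.
                failed[pid] = exc
    return succeeded, failed


succeeded, failed = download_all(["a", "broken", "b"])
```

Collecting the exceptions per product, instead of letting the first one propagate, keeps the remaining downloads running and leaves it to the caller to decide whether to retry.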
from sentinelsat.
Just curious, how is this coming @valgur?
from sentinelsat.
I have not started working on it yet. Don't have much time to spare right now, unfortunately. Maybe I'll find a couple of days to hammer this out in the coming months, but you should consider this idea to be on hold unless someone else feels like working on it.
from sentinelsat.
@valgur Don't stress yourself out implementing this. I think we can earmark this as one implementation towards the 1.0 milestone, or even > 1.0. While the curl dependency is not ideal, it looks like the download is working stably as is for most users right now.
from sentinelsat.
Included in v0.11 release.
from sentinelsat.