
Comments (8)

avalentino commented on June 12, 2024

Hi @massimozanetti, apparently I have the opposite problem.
The tool seems to download less than expected:

$ python3 -m sentinelsat --name S2B_MSIL2A_20230419T094029_N0509_R036_T34SBJ_20230419T113128 --exclude-pattern '*GRANULE/*' --path tmp --download
Found 1 products
Will download 1 products using 4 workers
Downloading MTD_MSIL2A.xml: 100%|███████████████████████████████████████████████████████████████████████████████████| 55.0k/55.0k [00:00<00:00, 211kB/s]
Downloading INSPIRE.xml: 100%|█████████████████████████████████████████████████████████████████████████████████████| 18.7k/18.7k [00:00<00:00, 33.9kB/s]
Downloading UserProduct_index.html: 100%|██████████████████████████████████████████████████████████████████████████| 8.50k/8.50k [00:00<00:00, 43.7kB/s]
Downloading UserProduct_index.xsl: 100%|███████████████████████████████████████████████████████████████████████████| 10.0k/10.0k [00:00<00:00, 23.6kB/s]
Downloading MTD_DS.xml: 100%|██████████████████████████████████████████████████████████████████████████████████████| 21.2M/21.2M [00:11<00:00, 1.80MB/s]
Downloading FORMAT_CORRECTNESS.xml: 100%|██████████████████████████████████████████████████████████████████████████| 3.93k/3.93k [00:00<00:00, 3.69MB/s]
Downloading GENERAL_QUALITY.xml: 100%|█████████████████████████████████████████████████████████████████████████████| 5.90k/5.90k [00:00<00:00, 5.08MB/s]
Downloading GEOMETRIC_QUALITY.xml: 100%|███████████████████████████████████████████████████████████████████████████| 8.03k/8.03k [00:00<00:00, 7.21MB/s]
Downloading RADIOMETRIC_QUALITY.xml: 100%|█████████████████████████████████████████████████████████████████████████| 6.98k/6.98k [00:00<00:00, 6.90MB/s]
Downloading SENSOR_QUALITY.xml: 100%|██████████████████████████████████████████████████████████████████████████████| 4.33k/4.33k [00:00<00:00, 2.86MB/s]
Downloading products: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:31<00:00, 31.18s/product]
Successfully downloaded 1/1 products.                                                                                                                   
$ ls -1 tmp/S2B_MSIL2A_20230419T094029_N0509_R036_T34SBJ_20230419T113128.SAFE/
DATASTRIP
HTML
INSPIRE.xml
MTD_MSIL2A.xml
manifest.safe

from sentinelsat.

massimozanetti commented on June 12, 2024

I see, it seems there is a problem with filtered download of S2 data


avalentino commented on June 12, 2024

Actually it is strange, because the filtering part does not depend on the server behaviour but only on the content of the manifest file.
Unfortunately I did not have the time to investigate further.
By the way the entire logic should be in

    def _filter_nodes(self, manifest, product_info, nodefilter=None):
        nodes = {}
        xmldoc = etree.parse(manifest)
        data_obj_section_elem = xmldoc.find("dataObjectSection")
        for elem in data_obj_section_elem.iterfind("dataObject"):
            dataobj_info = _xml_to_dataobj_info(elem)
            node_info = self._dataobj_to_node_info(dataobj_info, product_info)
            if nodefilter is not None and not nodefilter(node_info):
                continue
            node_path = node_info["node_path"]
            nodes[node_path] = node_info
        return nodes
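
For anyone who wants to try the manifest-driven filtering in isolation, here is a minimal sketch. It is not the sentinelsat implementation: the manifest below is a hypothetical, heavily simplified stand-in for a real manifest.safe, and `filter_nodes` and `exclude_granule` are illustrative names. It only shows that the set of downloadable nodes comes from the `dataObject` entries in the manifest, so anything not listed there never reaches the filter.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical, simplified manifest; a real manifest.safe carries more
# sections and attributes than shown here.
MANIFEST = """\
<XFDU>
  <dataObjectSection>
    <dataObject ID="S2_Level-2A_Product_Metadata">
      <byteStream><fileLocation href="./MTD_MSIL2A.xml"/></byteStream>
    </dataObject>
    <dataObject ID="INSPIRE_Metadata">
      <byteStream><fileLocation href="./INSPIRE.xml"/></byteStream>
    </dataObject>
    <dataObject ID="IMG_DATA_Band_B02">
      <byteStream><fileLocation href="./GRANULE/L2A_T34SBJ/IMG_DATA/B02.jp2"/></byteStream>
    </dataObject>
  </dataObjectSection>
</XFDU>
"""


def filter_nodes(manifest_text, nodefilter=None):
    """Collect the nodes listed in the manifest, keeping those the filter accepts."""
    root = ET.fromstring(manifest_text)
    nodes = {}
    for elem in root.find("dataObjectSection").iterfind("dataObject"):
        href = elem.find("byteStream/fileLocation").get("href")
        node_info = {"id": elem.get("ID"), "node_path": href}
        if nodefilter is not None and not nodefilter(node_info):
            continue
        nodes[href] = node_info
    return nodes


def exclude_granule(node_info):
    # Equivalent in spirit to --exclude-pattern '*GRANULE/*'
    return not fnmatch.fnmatchcase(node_info["node_path"], "*GRANULE/*")


kept = filter_nodes(MANIFEST, exclude_granule)
print(sorted(kept))
```

With this toy manifest only the two top-level XML files survive the filter; the GRANULE entry is dropped.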


massimozanetti commented on June 12, 2024

I think the problem is here. If I am not wrong, fnmatch normalizes the path and therefore makes it lowercase:

    def node_filter(node_info):
        match = fnmatch.fnmatch(node_info["node_path"], pattern)
        return not match if exclude else match

    return node_filter

Should we fix this by using fnmatchcase instead?
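
A rough sketch of what that change could look like, assuming the filter is built by a small factory function. The name `make_node_filter` is hypothetical and only there to make the snippet self-contained; `fnmatch.fnmatchcase` is the standard-library function that compares without any case normalization.

```python
import fnmatch


def make_node_filter(pattern, exclude=False):
    """Hypothetical stand-in for the factory that wraps node_filter."""

    def node_filter(node_info):
        # fnmatchcase never normalizes case, on any platform
        match = fnmatch.fnmatchcase(node_info["node_path"], pattern)
        return not match if exclude else match

    return node_filter


keep = make_node_filter("*GRANULE/*", exclude=True)
print(keep({"node_path": "./GRANULE/L2A/IMG_DATA/B02.jp2"}))  # False: excluded
print(keep({"node_path": "./MTD_MSIL2A.xml"}))                # True: kept
print(keep({"node_path": "./granule/readme.txt"}))            # True: case differs, no match
```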


avalentino commented on June 12, 2024

The normalisation should happen only on Windows (according to https://docs.python.org/3/library/os.path.html#os.path.normcase).
My example is on macOS, while I assume that you are using Windows, correct?
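
The platform difference can be checked from any machine, because the standard library exposes both flavours of `normcase` through the `ntpath` and `posixpath` modules (`os.path` is one or the other depending on the OS). A small sketch, using a made-up path:

```python
import ntpath
import posixpath

path = "./GRANULE/IMG_DATA/B02.jp2"  # illustrative path only

# What os.path.normcase does on Windows: lowercase, and "/" becomes "\"
windows_form = ntpath.normcase(path)

# What os.path.normcase does on POSIX systems: the path is returned unchanged
posix_form = posixpath.normcase(path)

print(windows_form)
print(posix_form)
```

Since fnmatch.fnmatch applies os.path.normcase to both the name and the pattern, a case-sensitive pattern such as '*GRANULE/*' would only be matched case-insensitively on Windows.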

By the way I agree that using fnmatchcase is more correct.

Does it solve the issue for you?


massimozanetti commented on June 12, 2024

Nope, I run the Python code on an Ubuntu server. Btw, I will try making the fnmatchcase change and let you know.


avalentino commented on June 12, 2024

I just checked again, and I made a mistake when I reported an error in the example above.
The example, indeed, works as expected in the sense that it downloads all the components listed in the manifest excluding the ones matching the specified pattern.

The strange thing is that not all the files included in the toplevel folder of the S2 product are listed in the manifest. This is unexpected to me and I'm not sure that this is in line with the SAFE specs.

To conclude, on my side all works as expected from the code point of view.

To verify I modified the filter function as follows to print the full list of files processed by the filter:

    def node_filter(node_info):
        print(node_info["node_path"])
        match = fnmatch.fnmatch(node_info["node_path"], pattern)
        return not match if exclude else match

Could I kindly ask you to repeat the exercise on your side, or to just provide the name of one of the products for which you experience the problem?


massimozanetti commented on June 12, 2024

I realized that my sentinelsat installation was outdated (there was a .lower() after fnmatch..., probably from an old issue). After upgrading, the .lower() is gone and it works fine, except that some top-level directories are not copied, as you say, because they are not listed in the manifest.

Btw, for Windows users fnmatch still causes a problem anyway. It should probably be fixed with fnmatchcase.
Thank you

