Comments (10)

hanxiao commented on August 15, 2024

I just checked. There is no problem with the MD5 sum value.

On my mac:

$ curl -O http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
$ md5 t10k-images-idx3-ubyte.gz
MD5 (t10k-images-idx3-ubyte.gz) = bef4ecab320f06d8554ea6380940ec79
$ ls -lth t10k-images-idx3-ubyte.gz
-rw-r--r--  1 hanhxiao   4.2M Mar 12 20:37 t10k-images-idx3-ubyte.gz

from fashion-mnist.

leftthomas commented on August 15, 2024

@hanxiao, could you check it on Ubuntu 16.04 and Windows? I also checked on a Mac, where the checksum is right, but on Ubuntu and Windows it is wrong. I have also raised this problem in the PyTorch torchvision GitHub issues.
[screenshot from 2018-03-12 20-48-10]

leftthomas commented on August 15, 2024

@hanxiao I also tested on my new MacBook Pro. You can see that the download link is wrong: it redirects to MNIST!
[screenshot: qq20180313-110213]

hanxiao commented on August 15, 2024

I've tested on a Linux machine (in Shenzhen, China), and it gives the correct result, as follows:

hanhxiao:~# wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
--2018-03-13 11:21:08--  http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Connecting to 10.223.133.20:52107... connected.
Proxy request sent, awaiting response... 200 OK
Length: 4422102 (4.2M) [binary/octet-stream]
Saving to: ‘t10k-images-idx3-ubyte.gz’

100%[===================================================================================================>] 4,422,102    605KB/s   in 19s

2018-03-13 11:21:29 (223 KB/s) - ‘t10k-images-idx3-ubyte.gz’ saved [4422102/4422102]

hanhxiao:~# md5sum t10k-images-idx3-ubyte.gz
bef4ecab320f06d8554ea6380940ec79  t10k-images-idx3-ubyte.gz

I saw a 302 redirect response in your screenshot, which is really suspicious. We host the data files as static files in an Amazon S3 bucket, and we have not set up any redirect there.

hanxiao commented on August 15, 2024

@leftthomas Can you run the following in bash and check where the 302 happens?

curl -v -L http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz -IX GET

or,

wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz -qSO /dev/null

For example, on my Linux machine the second method gives:

hanhxiao:~# wget http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz -qSO /dev/null
  HTTP/1.1 200 OK
  x-amz-id-2: DK6MTacPtZm8r6WJMeSzrmG9DFQRTjnLc7irkMSCHcIF8UkZE5H2K2LSLwgTJs31AaEP5bZSoTk=
  x-amz-request-id: 2C990AB5ADEAA5D1
  Date: Tue, 13 Mar 2018 03:41:48 GMT
  Last-Modified: Thu, 31 Aug 2017 12:17:53 GMT
  ETag: "bef4ecab320f06d8554ea6380940ec79"
  Content-Type: binary/octet-stream
  Content-Length: 4422102
  Server: AmazonS3
  X-Cache: MISS from TENCENT
  X-Cache-Lookup: MISS from TENCENT
  Via: 1.1 TENCENT (squid/3.3.8)
  Connection: keep-alive

Notice that ETag gives the correct MD5 value. And there is no redirect.
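
The same comparison can be scripted. Below is a minimal Python sketch that hashes a local file and compares it with an ETag string like the one in the header dump above. The function names are hypothetical, and it assumes the ETag is a plain MD5 digest, which is the case for S3 objects uploaded in a single part but not for multipart uploads.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def etag_matches_file(etag, path):
    """Compare an ETag header value (with or without quotes) to the file's MD5."""
    # Assumption: the ETag is a bare MD5 hex digest (single-part S3 upload).
    return etag.strip().strip('"').lower() == md5_of_file(path)
```

For an intact download, `etag_matches_file('"bef4ecab320f06d8554ea6380940ec79"', "t10k-images-idx3-ubyte.gz")` should return `True`, going by the values reported in this thread.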

FYI, here is an example where a 302 happens on purpose:

hanhxiao:~# wget https://httpstat.us/302 -qSO /dev/null
  HTTP/1.1 302 Found
  Cache-Control: private
  Content-Length: 9
  Content-Type: text/plain; charset=utf-8
  Location: https://httpstat.us
  Server: Microsoft-IIS/10.0
  X-AspNetMvc-Version: 5.1
  Access-Control-Allow-Origin: *
  X-AspNet-Version: 4.0.30319
  X-Powered-By: ASP.NET
  Set-Cookie: ARRAffinity=7943fc4fbbc26574a46521f2ec212be4ee889bc2a88a9742ab65676f5bd7e9e6;Path=/;HttpOnly;Domain=httpstat.us
  Date: Tue, 13 Mar 2018 03:41:12 GMT
  Connection: close
  HTTP/1.1 200 OK
  Cache-Control: private
  Content-Length: 7958
  Content-Type: text/html; charset=utf-8
  Server: Microsoft-IIS/10.0
  X-AspNetMvc-Version: 5.1
  Access-Control-Allow-Origin: *
  X-AspNet-Version: 4.0.30319
  X-Powered-By: ASP.NET
  Date: Tue, 13 Mar 2018 03:41:12 GMT
  Connection: close
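
If neither curl nor wget is at hand, roughly the same check can be done from Python. This is only a sketch using the standard library: `fetch_status` and `trace_redirects` are hypothetical names, redirects are followed by hand so that every hop is visible, and the response body is never read.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_status(url, timeout=10):
    """Send one GET without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()  # headers only; the body is not read
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_status, max_hops=5):
    """Return the list of (url, status) hops, following Location headers manually."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace_redirects(...)` on the dataset URL should give a single hop with status 200 on a clean network; any 302 hop in the list shows where the request was diverted.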

leftthomas commented on August 15, 2024

@hanxiao I just tried those two commands on my Mac, and here are the results:
[screenshot: qq20180313-122533]
[screenshot: qq20180313-122656]

leftthomas commented on August 15, 2024

@hanxiao Furthermore, I found that not only t10k-images-idx3-ubyte.gz but also train-images-idx3-ubyte.gz is wrong when I download the data with curl on my Mac:
[screenshot: qq20180313-123626]
And if I use wget to download the data on my Mac and on Ubuntu, only train-images-idx3-ubyte.gz is wrong:
[screenshot: qq20180313-124409]

hanxiao commented on August 15, 2024

@leftthomas First of all, don't panic: all links are correct. I've checked on multiple devices with different ISPs, and everything is fine.

Now to your specific problem: it looks like something is redirecting your request to 10.6.0.123. You can see that in your curl log. The redirection is definitely not from us. It could be done by your ISP or your smart router, for the following reasons:

  1. 10.6.0.123 is an internal (private) IP address. It is not hosted by any public service, so your dataset must have been downloaded from the intranet, not the internet!
  2. It looks like either your ISP or your smart router is doing some name-based caching. As you previously downloaded LeCun's data (whose files have exactly the same names), it automatically redirects you to the cached files, not the real files. That would also explain the high download speed.

Here are my suggestions:

  1. If you are using a smart router, resetting it or clearing its cache might help. Make a backup first; no guarantee from my side.
  2. Meanwhile, please try visiting http://10.6.0.123 directly and see what it is. Is it a router? Your ISP? A proxy? As I said, 10.6.0.123 is an internal IP address specific to your network, so I cannot see it.
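
To double-check that an address from a redirect really is internal, Python's standard `ipaddress` module can classify it. A small sketch follows; the helper name `is_private_address` is made up for illustration.

```python
import ipaddress


def is_private_address(host):
    """Return True if `host` is an IP literal in a private (non-public) range."""
    return ipaddress.ip_address(host).is_private


print(is_private_address("10.6.0.123"))  # True: 10.0.0.0/8 is reserved by RFC 1918
print(is_private_address("8.8.8.8"))     # False: publicly routable
```

A `True` result for the redirect target means the file was served from inside the local network rather than from the S3 bucket.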

leftthomas commented on August 15, 2024

@hanxiao Thanks for your help. I'll report this problem to our informatization office.

YaoQ commented on August 15, 2024

I downloaded t10k-images-idx3-ubyte.gz from the download link in the README, but its md5sum is 9fb629c4189551a2d022fa330f9573f3, which is not the value given in the README. I deleted it and downloaded it again five times, and it is still wrong.

I get the same issue when downloading t10k-images-idx3-ubyte.gz through the CMCC ISP: the md5sum is 9fb629c4189551a2d022fa330f9573f3, i.e. the file has been swapped for the MNIST test set. But when I download t10k-images-idx3-ubyte.gz again from another server with a different ISP (China Telecom), the file is OK and the md5sum is bef4ecab320f06d8554ea6380940ec79.
