
zalandoresearch / fashion-mnist

11.6K stars · 334 watchers · 3.0K forks · 105.85 MB

A MNIST-like fashion product database. Benchmark :point_down:

Home Page: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/

License: MIT License

Languages: Python 70.32%, CSS 1.10%, HTML 13.68%, JavaScript 13.69%, Dockerfile 1.21%
Topics: mnist, deep-learning, benchmark, machine-learning, dataset, computer-vision, fashion, fashion-mnist, gan, zalando

fashion-mnist's People

Contributors: afagarap, apaleyes, gchhablani, hanxiao, heitorrapela, himanshunitrr, kashif, leo-ml, mikayelh, rvollgraf, sayakpaul, xialeizhou

fashion-mnist's Issues

Duplicate samples and overlap between train and test

I hope I have got this right, but it seems that there are 43 duplicated samples in the training set and 1 duplicated sample in the test set. There are also 10 training samples that appear in the test set. The comparison was done at the byte level.

Here is a list of the duplicates:

Training set duplicates:
[601, 39865]
[831, 24228]
[1826, 23718]
[2024, 53883]
[4974, 6293]
[5520, 49165]
[5790, 11845]
[5822, 33399]
[6139, 37731]
[6280, 41036]
[8485, 31238]
[8841, 28184]
[12571, 56657]
[14096, 32343]
[14710, 22159]
[15587, 28635]
[19308, 20114]
[19668, 21571]
[19760, 39489]
[19888, 24443]
[21072, 32800]
[22852, 28789]
[23052, 57107]
[23413, 33731]
[24785, 46015]
[25297, 40077]
[25629, 49588]
[26314, 49351]
[27045, 40033]
[27421, 31627]
[32113, 38337]
[32300, 33730]
[32303, 56840]
[32888, 41918]
[32922, 54584]
[36634, 39841]
[38261, 41877]
[42756, 53842]
[46667, 57724]
[46782, 54829]
[47929, 54185]
[48480, 59607]
[48955, 51368]
Test set duplicates:
[6334, 8569]
Training set samples overlapping with test set:
Train samples [3763] overlap with test samples [7243]
Train samples [4944] overlap with test samples [7781]
Train samples [6168] overlap with test samples [9227]
Train samples [12404] overlap with test samples [4037]
Train samples [15943] overlap with test samples [6659]
Train samples [22403] overlap with test samples [7762]
Train samples [34617] overlap with test samples [4990]
Train samples [35772] overlap with test samples [7216]
Train samples [48228] overlap with test samples [5867]
Train samples [52205] overlap with test samples [9560]

The code required to generate the above output is as follows (assuming the input images are in the variables train_X and test_X):

def sample_bytes(x):
    # Hash each sample by its raw byte representation.
    return [x[i].tobytes() for i in range(len(x))]

train_h = sample_bytes(train_X)
test_h = sample_bytes(test_X)

train_dict = {}
test_dict = {}
for i, h in enumerate(train_h):
    train_dict.setdefault(h, []).append(i)
for i, h in enumerate(test_h):
    test_dict.setdefault(h, []).append(i)

print('Training set duplicates:')
for k, v in train_dict.items():
    if len(v) > 1:
        for j in range(1, len(v)):
            assert (train_X[v[0]] == train_X[v[j]]).all()
        print(v)

print('Test set duplicates:')
for k, v in test_dict.items():
    if len(v) > 1:
        for j in range(1, len(v)):
            assert (test_X[v[0]] == test_X[v[j]]).all()
        print(v)

print('Training set samples overlapping with test set:')
for k, v in train_dict.items():
    if k in test_dict:
        assert (train_X[v[0]] == test_X[test_dict[k][0]]).all()
        print('Train samples {} overlap with test samples {}'.format(v, test_dict[k]))

overlap = set(train_h).intersection(set(test_h))
print(len(overlap))  # prints 10; asserting that this set is empty would fail

Is it possible to get the colored version?

I wanted to ask if you could please also publish the colored version of the dataset; to be precise, after processing step 5 (extending) and before step 6 (negating). This could, for example, be helpful for image translation tasks.

S3 bucket is serving different files than the repo

Noticed in this repo, but it happens directly in the browser for me as well.

MD5 hashes

Manually downloaded from the S3 links in the repo:

7edbbf1fc824916c442268ac4dc845cd  - ./t10k-images-idx3-ubyte.gz.md5
b9859d5936603c782c6eb8dd14198360  - ./t10k-labels-idx1-ubyte.gz.md5
053aba987904a004d52cb333753041a3  - ./train-images-idx3-ubyte.gz.md5
7864864ad9592b0ffcc53c942eb67b24  - ./train-labels-idx1-ubyte.gz.md5

Downloaded from the repo directly:

bef4ecab320f06d8554ea6380940ec79  - ./t10k-images-idx3-ubyte.gz.md5
bb300cfdad3c16e7a12a480ee83cd310  - ./t10k-labels-idx1-ubyte.gz.md5 
8d4fb7e6c68d591d4c3dfef9ec88bf0d  - ./train-images-idx3-ubyte.gz.md5
25c81989df183df01b3e8a0aad5dffbe  - ./train-labels-idx1-ubyte.gz.md5
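To reproduce the comparison, a minimal Python sketch that computes the MD5 of each downloaded archive (assuming the four .gz files are in the current directory):

import hashlib

files = ['train-images-idx3-ubyte.gz', 'train-labels-idx1-ubyte.gz',
         't10k-images-idx3-ubyte.gz', 't10k-labels-idx1-ubyte.gz']
for name in files:
    with open(name, 'rb') as f:
        print(hashlib.md5(f.read()).hexdigest(), name)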

Benchmark: ConvNet 0.932 on Fashion-MNIST, 0.994 on MNIST

Preprocessing
Normalization

Architecture in Keras

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 1, 28, 28)         0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 1, 28, 28)         112       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 24, 24)        1664      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 64, 12, 12)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 512, 8, 8)         819712    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 512, 4, 4)         0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 8192)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               1048704   
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_2 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650       
=================================================================
Total params: 1,879,098
Trainable params: 1,879,042
Non-trainable params: 56
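For reference, a minimal Keras sketch that matches the summary above. The activations, dropout rates, and optimizer are assumptions inferred from the output shapes and parameter counts, not confirmed by the notebook:

from keras.models import Sequential
from keras.layers import (BatchNormalization, Conv2D, MaxPooling2D,
                          Flatten, Dense, Dropout)

model = Sequential([
    # 112 params: the default BatchNormalization axis normalizes the last dim (width 28)
    BatchNormalization(input_shape=(1, 28, 28)),
    Conv2D(64, (5, 5), activation='relu', data_format='channels_first'),
    MaxPooling2D((2, 2), data_format='channels_first'),
    Conv2D(512, (5, 5), activation='relu', data_format='channels_first'),
    MaxPooling2D((2, 2), data_format='channels_first'),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),   # rate is an assumption
    Dense(64, activation='relu'),
    Dropout(0.5),   # rate is an assumption
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])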

Training Time
12+ mins

Accuracy
Fashion: 0.932
MNIST: 0.994

Notebook
https://github.com/Xfan1025/Fashion-MNIST/blob/master/fashion-mnist.ipynb

Benchmark: Conv Net - Accuracy: 90.26%

Network is as follows:

  • No pre-processing.

  • Convolutional layer with 16 feature maps of size 5 x 5 with ELU activation.

  • Max Pooling layer of size 2 x 2.

  • Convolutional layer with 32 feature maps of size 5 x 5 with ELU activation.

  • Max Pooling layer of size 2 x 2.

Accuracy achieved on Test Dataset is 90.26 %.
I know that 2-layer CNNs are already present in the benchmarks, but those were implemented in Keras and TensorFlow. This network has been implemented in PyTorch, if that counts.

The code can be found here.
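For readers without the linked code, a hedged PyTorch sketch of the topology described above; the classifier head after the second pooling layer is not specified in the issue, so the single linear layer is an assumption:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5),    # 16 feature maps, 5x5: 28 -> 24
    nn.ELU(),
    nn.MaxPool2d(2),                    # 24 -> 12
    nn.Conv2d(16, 32, kernel_size=5),   # 32 feature maps, 5x5: 12 -> 8
    nn.ELU(),
    nn.MaxPool2d(2),                    # 8 -> 4
    nn.Flatten(),
    nn.Linear(32 * 4 * 4, 10),          # assumed classifier head
)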

Simple convolutional neural network with 93.43% accuracy on the test set

Hi Han, I evaluated some architectures and parameters. I ended up with an accuracy of 93.43% on the Fashion-MNIST test set. The same network reached an accuracy of 99.43% on MNIST.

https://github.com/cmasch/zalando-fashion-mnist

The architecture in code:

from keras.models import Sequential
from keras.layers import (InputLayer, BatchNormalization, Conv2D,
                          MaxPooling2D, Dropout, Flatten, Dense)
from keras import optimizers

img_height, img_width, num_classes = 28, 28, 10

cnn = Sequential()

cnn.add(InputLayer(input_shape=(img_height, img_width, 1)))

# Normalization
cnn.add(BatchNormalization())

# Conv + Maxpooling
cnn.add(Conv2D(64, (4, 4), padding='same', activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))

# Dropout
cnn.add(Dropout(0.1))

# Conv + Maxpooling
cnn.add(Conv2D(64, (4, 4), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))

# Dropout
cnn.add(Dropout(0.3))

# Converting 3D feature maps to a 1D feature vector
cnn.add(Flatten())

# Fully connected layer
cnn.add(Dense(256, activation='relu'))

# Dropout
cnn.add(Dropout(0.5))

# Fully connected layer
cnn.add(Dense(64, activation='relu'))

# Normalization
cnn.add(BatchNormalization())

cnn.add(Dense(num_classes, activation='softmax'))
cnn.compile(loss='categorical_crossentropy',
            optimizer=optimizers.Adam(),
            metrics=['accuracy'])

I tried to keep it simple. Additionally, I used data augmentation to increase the training data (a sketch of such an augmentation step follows below).
Thanks for the great dataset!

Best
Christopher
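As a side note, a hedged sketch of the augmentation step mentioned above, using Keras' ImageDataGenerator; the transform parameters are hypothetical, since the author's exact settings are not given:

from keras.preprocessing.image import ImageDataGenerator

# Hypothetical augmentation settings; train_X has shape (n, 28, 28, 1).
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             zoom_range=0.1)
cnn.fit_generator(datagen.flow(train_X, train_y, batch_size=64),
                  steps_per_epoch=len(train_X) // 64, epochs=50)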

Benchmark: Conv Net - Accuracy: 92.56%

Tried this network topology that can be summarized as follows:

  • Convolutional layer with 32 feature maps of size 5×5.
  • Pooling layer taking the max over 2×2 patches.
  • Convolutional layer with 64 feature maps of size 5×5.
  • Pooling layer taking the max over 2×2 patches.
  • Convolutional layer with 128 feature maps of size 1×1.
  • Pooling layer taking the max over 2×2 patches.
  • Flatten layer.
  • Fully connected layer with 1024 neurons and rectifier activation.
  • Dropout layer with a probability of 50%.
  • Fully connected layer with 510 neurons and rectifier activation.
  • Dropout layer with a probability of 50%.
  • Output layer.

I used Normalization as Preprocessing and 5-fold cross-validation to evaluate the model.
Accuracy scores: [0.92433, 0.92133, 0.923581, 0.92391, 0.92466]
Mean Accuracy: 0.923567
Stdev Accuracy: 0.001175
Final Accuracy: 92.56%

You can find the code here.

AlexNet with Triplet loss Benchmark

I use AlexNet with a triplet loss for feature extraction and embedding creation, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9946
Test: 0.8989

GoogleNet with Cross Entropy Benchmark

I use GoogleNet with a cross-entropy loss for feature extraction and embedding creation, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9980
Test: 0.9365

3 Conv 2 FC layer Benchmark

I use a network with 3 conv and 2 FC layers, trained with a cosine loss, for feature extraction and embedding creation, and a linear SVM as the classifier.
I got the following accuracies:
Train: 0.9838
Test: 0.9072
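All three entries above follow the same recipe: train a network as an embedding extractor, then fit a linear SVM on the embeddings. A hedged scikit-learn sketch of the second stage, where embed() is a hypothetical stand-in for a forward pass through the trained network up to the embedding layer:

from sklearn.svm import LinearSVC

# embed() is not part of the original issues; it returns embeddings
# as a numpy array of shape (n_samples, embedding_dim).
train_feats = embed(train_X)
test_feats = embed(test_X)

clf = LinearSVC(C=1.0)   # C is an assumption
clf.fit(train_feats, train_y)
print('Train:', clf.score(train_feats, train_y))
print('Test:', clf.score(test_feats, test_y))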

Benchmark: ResNet18 and Simple Conv Net

Tried a simple 2-layer conv net and ResNet18 on MNIST and Fashion-MNIST. Accuracy is as follows:

Model      MNIST  Fashion-MNIST
ResNet18   0.979  0.949
SimpleNet  0.971  0.919

Preprocessing

Normalization, random horizontal flip, random vertical flip, random translation, random rotation.

You can find the code here.

benchmark: I used it for training my proposed model

I proposed a GRU+SVM model at the university as my undergraduate research; the proposal paper may be read here. Simply put, my proposal is to use an SVM as the classification function for the GRU instead of the 'conventional' softmax.
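To make the idea concrete, here is a minimal TensorFlow 1.x sketch of a GRU whose softmax head is replaced by an L2-SVM (squared hinge) loss. The placeholder shapes and the ±1 label mapping are my assumptions for illustration, not the author's exact code:

import tensorflow as tf

# Each 28x28 image is fed to the GRU as a sequence of 28 rows.
x = tf.placeholder(tf.float32, [None, 28, 28])
y = tf.placeholder(tf.float32, [None, 10])   # one-hot labels
y_svm = 2.0 * y - 1.0                        # map {0, 1} -> {-1, +1} for the SVM loss

cell = tf.nn.rnn_cell.GRUCell(256)           # CELL_SIZE = 256
outputs, _ = tf.nn.static_rnn(cell, tf.unstack(x, axis=1), dtype=tf.float32)

w = tf.Variable(tf.random_normal([256, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(outputs[-1], w) + b

# L2-SVM objective: weight regularizer plus C times the squared hinge loss.
svm_c = 1.0
hinge = tf.square(tf.maximum(0.0, 1.0 - y_svm * logits))
loss = (0.5 * tf.reduce_sum(tf.square(w))
        + svm_c * tf.reduce_mean(tf.reduce_sum(hinge, axis=1)))
train_op = tf.train.AdamOptimizer(0.01).minimize(loss)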

Here are the results of the training using your dataset:

Epoch : 0 completed out of 10, loss : 274.4726867675781, accuracy : 0.7890625
Epoch : 1 completed out of 10, loss : 201.513671875, accuracy : 0.87890625
Epoch : 2 completed out of 10, loss : 201.049072265625, accuracy : 0.859375
Epoch : 3 completed out of 10, loss : 155.28115844726562, accuracy : 0.890625
Epoch : 4 completed out of 10, loss : 145.94015502929688, accuracy : 0.9140625
Epoch : 5 completed out of 10, loss : 148.19613647460938, accuracy : 0.88671875
Epoch : 6 completed out of 10, loss : 155.27915954589844, accuracy : 0.87890625
Epoch : 7 completed out of 10, loss : 192.79263305664062, accuracy : 0.8671875
Epoch : 8 completed out of 10, loss : 151.52243041992188, accuracy : 0.90234375
Epoch : 9 completed out of 10, loss : 171.39292907714844, accuracy : 0.8984375
Accuracy : 0.8878000378608704

The source may be found in my GitHub Gist. The hyper-parameters used were as follows:

BATCH_SIZE = 256
CELL_SIZE = 256
EPOCHS = 10
LEARNING_RATE = 0.01
NUM_CLASSES = 10
SVM_C = 1

Trained using tf.train.AdamOptimizer(), and used tf.nn.static_rnn().

Link in arXiv abstract has an extra period at the end

Hey guys,

Thank you for publishing this new dataset.
Not really an issue with the dataset, but the link that you have in the arXiv abstract has an extra period at the end.

Regards

docs: Additional measure for benchmarking

Perhaps it would be good to include in the benchmarks not only the architecture used, preprocessing, training accuracy, and test accuracy, but also the time it took to train on Fashion-MNIST vs. MNIST. What do you think?

Clustering performance

@hanxiao
I wonder what's the clustering performance of the state-of-the-art clustering algorithms on fashion-mnist.
I tested my algorithm and got an accuracy of 0.59 and NMI of 0.63.
Have you collected other clustering results?
Thanks.
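For anyone comparing numbers, here is a sketch of how clustering accuracy and NMI are commonly computed: accuracy uses the best one-to-one mapping between cluster ids and class labels (Hungarian algorithm). This is the usual convention, not code from the issue; labels and cluster_ids are placeholder arrays:

import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    # Contingency matrix: rows are cluster ids, columns are class labels.
    k = max(y_true.max(), y_pred.max()) + 1
    cost = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1
    # Hungarian algorithm on the negated matrix maximizes matched samples.
    rows, cols = linear_sum_assignment(-cost)
    return cost[rows, cols].sum() / len(y_true)

acc = clustering_accuracy(labels, cluster_ids)
nmi = normalized_mutual_info_score(labels, cluster_ids)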

benchmark: Update on GRU+SVM with Dropout on MNIST dataset

Since I did 100 epochs for GRU+SVM with Dropout on the Fashion dataset, I also did 100 epochs on MNIST (the old run was only 10 epochs). The following were the hyper-parameters used:

BATCH_SIZE = 256
CELL_SIZE = 256
DROPOUT_P_KEEP = 0.85
EPOCHS = 100
LEARNING_RATE = 1e-3
NUM_CLASSES = 10
SVM_C = 1

The following is the result:

Epoch : 0 completed out of 100, loss : 141.47535705566406, accuracy : 0.9140625
Epoch : 1 completed out of 100, loss : 67.05036926269531, accuracy : 0.96875
Epoch : 2 completed out of 100, loss : 51.171600341796875, accuracy : 0.9765625
Epoch : 3 completed out of 100, loss : 72.32965850830078, accuracy : 0.9609375
Epoch : 4 completed out of 100, loss : 37.7554817199707, accuracy : 0.98046875
Epoch : 5 completed out of 100, loss : 24.296039581298828, accuracy : 0.98828125
Epoch : 6 completed out of 100, loss : 37.4559211730957, accuracy : 0.984375
Epoch : 7 completed out of 100, loss : 38.10890197753906, accuracy : 0.97265625
Epoch : 8 completed out of 100, loss : 33.97040939331055, accuracy : 0.97265625
Epoch : 9 completed out of 100, loss : 25.034709930419922, accuracy : 0.99609375
Epoch : 10 completed out of 100, loss : 27.721952438354492, accuracy : 0.98046875
Epoch : 11 completed out of 100, loss : 8.290353775024414, accuracy : 1.0
Epoch : 12 completed out of 100, loss : 25.927515029907227, accuracy : 0.9921875
Epoch : 13 completed out of 100, loss : 11.549110412597656, accuracy : 0.99609375
Epoch : 14 completed out of 100, loss : 34.728797912597656, accuracy : 0.98046875
Epoch : 15 completed out of 100, loss : 21.197731018066406, accuracy : 0.98828125
Epoch : 16 completed out of 100, loss : 11.47766399383545, accuracy : 0.9921875
Epoch : 17 completed out of 100, loss : 13.01932144165039, accuracy : 0.98828125
Epoch : 18 completed out of 100, loss : 4.497049808502197, accuracy : 1.0
Epoch : 19 completed out of 100, loss : 12.586877822875977, accuracy : 0.9921875
Epoch : 20 completed out of 100, loss : 6.10440731048584, accuracy : 0.99609375
Epoch : 21 completed out of 100, loss : 8.886781692504883, accuracy : 0.99609375
Epoch : 22 completed out of 100, loss : 7.0670166015625, accuracy : 1.0
Epoch : 23 completed out of 100, loss : 16.550621032714844, accuracy : 0.98828125
Epoch : 24 completed out of 100, loss : 7.014737129211426, accuracy : 0.99609375
Epoch : 25 completed out of 100, loss : 29.812110900878906, accuracy : 0.98828125
Epoch : 26 completed out of 100, loss : 2.2193398475646973, accuracy : 1.0
Epoch : 27 completed out of 100, loss : 14.020920753479004, accuracy : 0.9921875
Epoch : 28 completed out of 100, loss : 8.520711898803711, accuracy : 0.9921875
Epoch : 29 completed out of 100, loss : 2.4392218589782715, accuracy : 1.0
Epoch : 30 completed out of 100, loss : 25.517803192138672, accuracy : 0.98828125
Epoch : 31 completed out of 100, loss : 11.551563262939453, accuracy : 0.9921875
Epoch : 32 completed out of 100, loss : 10.277920722961426, accuracy : 0.99609375
Epoch : 33 completed out of 100, loss : 10.18214225769043, accuracy : 0.99609375
Epoch : 34 completed out of 100, loss : 2.365241289138794, accuracy : 1.0
Epoch : 35 completed out of 100, loss : 8.35222053527832, accuracy : 0.99609375
Epoch : 36 completed out of 100, loss : 1.9403200149536133, accuracy : 1.0
Epoch : 37 completed out of 100, loss : 4.6265153884887695, accuracy : 1.0
Epoch : 38 completed out of 100, loss : 4.685805797576904, accuracy : 0.99609375
Epoch : 39 completed out of 100, loss : 5.235599040985107, accuracy : 0.99609375
Epoch : 40 completed out of 100, loss : 32.585182189941406, accuracy : 0.98046875
Epoch : 41 completed out of 100, loss : 13.55459213256836, accuracy : 0.98828125
Epoch : 42 completed out of 100, loss : 3.7068004608154297, accuracy : 1.0
Epoch : 43 completed out of 100, loss : 5.768912315368652, accuracy : 0.99609375
Epoch : 44 completed out of 100, loss : 5.215768814086914, accuracy : 0.99609375
Epoch : 45 completed out of 100, loss : 8.629631042480469, accuracy : 0.9921875
Epoch : 46 completed out of 100, loss : 7.393224716186523, accuracy : 0.9921875
Epoch : 47 completed out of 100, loss : 17.475631713867188, accuracy : 0.9921875
Epoch : 48 completed out of 100, loss : 4.962292194366455, accuracy : 0.99609375
Epoch : 49 completed out of 100, loss : 4.288407802581787, accuracy : 1.0
Epoch : 50 completed out of 100, loss : 3.06554913520813, accuracy : 1.0
Epoch : 51 completed out of 100, loss : 2.9363889694213867, accuracy : 1.0
Epoch : 52 completed out of 100, loss : 7.425971031188965, accuracy : 0.99609375
Epoch : 53 completed out of 100, loss : 7.003169536590576, accuracy : 0.99609375
Epoch : 54 completed out of 100, loss : 13.44936466217041, accuracy : 0.99609375
Epoch : 55 completed out of 100, loss : 5.4362664222717285, accuracy : 0.99609375
Epoch : 56 completed out of 100, loss : 10.022172927856445, accuracy : 0.98828125
Epoch : 57 completed out of 100, loss : 2.9892423152923584, accuracy : 1.0
Epoch : 58 completed out of 100, loss : 1.7155311107635498, accuracy : 1.0
Epoch : 59 completed out of 100, loss : 2.44166898727417, accuracy : 1.0
Epoch : 60 completed out of 100, loss : 4.870673656463623, accuracy : 0.99609375
Epoch : 61 completed out of 100, loss : 1.7088404893875122, accuracy : 1.0
Epoch : 62 completed out of 100, loss : 21.897991180419922, accuracy : 0.98828125
Epoch : 63 completed out of 100, loss : 2.563978672027588, accuracy : 1.0
Epoch : 64 completed out of 100, loss : 1.151407241821289, accuracy : 1.0
2017-09-14 12:21:34.450895: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 14379760 get requests, put_count=14379777 evicted_count=2000 eviction_rate=0.000139084 and unsatisfied allocation rate=0.000141657
Epoch : 65 completed out of 100, loss : 1.0514287948608398, accuracy : 1.0
Epoch : 66 completed out of 100, loss : 10.431646347045898, accuracy : 0.99609375
Epoch : 67 completed out of 100, loss : 10.04415512084961, accuracy : 0.99609375
Epoch : 68 completed out of 100, loss : 9.506088256835938, accuracy : 0.99609375
Epoch : 69 completed out of 100, loss : 8.011089324951172, accuracy : 0.99609375
Epoch : 70 completed out of 100, loss : 0.9643533229827881, accuracy : 1.0
Epoch : 71 completed out of 100, loss : 9.283774375915527, accuracy : 0.9921875
Epoch : 72 completed out of 100, loss : 2.125692129135132, accuracy : 1.0
Epoch : 73 completed out of 100, loss : 21.240196228027344, accuracy : 0.9921875
Epoch : 74 completed out of 100, loss : 2.5445051193237305, accuracy : 1.0
Epoch : 75 completed out of 100, loss : 9.342909812927246, accuracy : 0.99609375
Epoch : 76 completed out of 100, loss : 29.229848861694336, accuracy : 0.98828125
Epoch : 77 completed out of 100, loss : 1.9726190567016602, accuracy : 1.0
Epoch : 78 completed out of 100, loss : 8.080221176147461, accuracy : 0.99609375
Epoch : 79 completed out of 100, loss : 7.3532915115356445, accuracy : 0.99609375
Epoch : 80 completed out of 100, loss : 1.3384674787521362, accuracy : 1.0
Epoch : 81 completed out of 100, loss : 6.711606025695801, accuracy : 0.99609375
Epoch : 82 completed out of 100, loss : 0.9907960891723633, accuracy : 1.0
Epoch : 83 completed out of 100, loss : 1.1378357410430908, accuracy : 1.0
Epoch : 84 completed out of 100, loss : 7.504663467407227, accuracy : 0.9921875
Epoch : 85 completed out of 100, loss : 1.9658554792404175, accuracy : 1.0
Epoch : 86 completed out of 100, loss : 1.3581955432891846, accuracy : 1.0
Epoch : 87 completed out of 100, loss : 2.964240789413452, accuracy : 1.0
Epoch : 88 completed out of 100, loss : 3.54362154006958, accuracy : 1.0
Epoch : 89 completed out of 100, loss : 1.5963693857192993, accuracy : 1.0
Epoch : 90 completed out of 100, loss : 4.597883224487305, accuracy : 1.0
Epoch : 91 completed out of 100, loss : 1.353342890739441, accuracy : 1.0
Epoch : 92 completed out of 100, loss : 2.763561964035034, accuracy : 1.0
Epoch : 93 completed out of 100, loss : 4.88947057723999, accuracy : 0.99609375
Epoch : 94 completed out of 100, loss : 4.4988112449646, accuracy : 0.99609375
Epoch : 95 completed out of 100, loss : 11.898427963256836, accuracy : 0.99609375
Epoch : 96 completed out of 100, loss : 1.51198410987854, accuracy : 1.0
Epoch : 97 completed out of 100, loss : 0.946499764919281, accuracy : 1.0
Epoch : 98 completed out of 100, loss : 5.954292297363281, accuracy : 0.99609375
Epoch : 99 completed out of 100, loss : 1.6741831302642822, accuracy : 1.0

The test accuracy was 0.9884001016616821, and this is without learning rate decay.

benchmark: GRU+SVM for MNIST dataset

I saw you updated the README to report a comparison between the accuracy on your dataset and on MNIST, so I ran a benchmark of my proposed GRU+SVM model (issue #8) on MNIST as well.

Here are the results:

Epoch : 0 completed out of 10, loss : 35.062599182128906, accuracy : 0.9296875
Epoch : 1 completed out of 10, loss : 20.101139068603516, accuracy : 0.9609375
Epoch : 2 completed out of 10, loss : 11.310111999511719, accuracy : 0.984375
Epoch : 3 completed out of 10, loss : 14.316896438598633, accuracy : 0.96875
Epoch : 4 completed out of 10, loss : 13.816293716430664, accuracy : 0.9609375
Epoch : 5 completed out of 10, loss : 8.049131393432617, accuracy : 0.984375
Epoch : 6 completed out of 10, loss : 10.147947311401367, accuracy : 0.984375
Epoch : 7 completed out of 10, loss : 8.27488899230957, accuracy : 0.9921875
Epoch : 8 completed out of 10, loss : 21.153032302856445, accuracy : 0.9609375
Epoch : 9 completed out of 10, loss : 10.881654739379883, accuracy : 0.9765625
Accuracy : 0.9658001661300659

The source may be found here, in my GitHub Gist. The hyper-parameters used were as follows:

BATCH_SIZE = 128
CELL_SIZE = 32
HM_EPOCHS = 10
LEARNING_RATE = 0.01
NUM_CLASSES = 10
SVM_C = 0.5

Trained using tf.train.AdamOptimizer(), and used tf.nn.static_rnn().

Benchmark: dual path network with wide resnet 28-10 as backbone

Classifier: dual path network with WideResNet28-10 as the backbone network (47.75M).
Preprocessing: standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips).
Fashion test accuracy: 95.73%
Code: https://github.com/Queequeg92/DualPathNet
References:
[1] Chen, Yunpeng, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, and Jiashi Feng. "Dual Path Networks." arXiv preprint arXiv:1707.01629 (2017).
[2] Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." arXiv preprint arXiv:1605.07146 (2016).

benchmark: update on GRU+SVM with Dropout

Hey @hanxiao, it's me again. I saw an update to the dataset regarding duplicate samples. I did another training run using my GRU+SVM (with Dropout) model (from #8) on the updated dataset. Here's the result:

Epoch : 0 completed out of 100, loss : 316.9036560058594, accuracy : 0.734375
Epoch : 1 completed out of 100, loss : 201.2646026611328, accuracy : 0.83984375
Epoch : 2 completed out of 100, loss : 253.3709259033203, accuracy : 0.796875
Epoch : 3 completed out of 100, loss : 257.7744140625, accuracy : 0.8359375
Epoch : 4 completed out of 100, loss : 179.52682495117188, accuracy : 0.8671875
Epoch : 5 completed out of 100, loss : 224.97421264648438, accuracy : 0.83984375
Epoch : 6 completed out of 100, loss : 212.19381713867188, accuracy : 0.859375
Epoch : 7 completed out of 100, loss : 200.80978393554688, accuracy : 0.859375
Epoch : 8 completed out of 100, loss : 187.77052307128906, accuracy : 0.85546875
Epoch : 9 completed out of 100, loss : 190.96389770507812, accuracy : 0.86328125
Epoch : 10 completed out of 100, loss : 185.72314453125, accuracy : 0.85546875
Epoch : 11 completed out of 100, loss : 189.3765411376953, accuracy : 0.8515625
Epoch : 12 completed out of 100, loss : 130.086669921875, accuracy : 0.89453125
Epoch : 13 completed out of 100, loss : 151.38232421875, accuracy : 0.8828125
Epoch : 14 completed out of 100, loss : 159.71595764160156, accuracy : 0.88671875
Epoch : 15 completed out of 100, loss : 218.80592346191406, accuracy : 0.84375
Epoch : 16 completed out of 100, loss : 131.5895233154297, accuracy : 0.9140625
Epoch : 17 completed out of 100, loss : 162.96995544433594, accuracy : 0.8671875
Epoch : 18 completed out of 100, loss : 155.52630615234375, accuracy : 0.890625
Epoch : 19 completed out of 100, loss : 159.76901245117188, accuracy : 0.88671875
Epoch : 20 completed out of 100, loss : 137.74642944335938, accuracy : 0.890625
Epoch : 21 completed out of 100, loss : 162.48875427246094, accuracy : 0.890625
Epoch : 22 completed out of 100, loss : 179.6526336669922, accuracy : 0.8828125
Epoch : 23 completed out of 100, loss : 127.58981323242188, accuracy : 0.8984375
Epoch : 24 completed out of 100, loss : 185.6982421875, accuracy : 0.8671875
Epoch : 25 completed out of 100, loss : 159.8983612060547, accuracy : 0.8828125
Epoch : 26 completed out of 100, loss : 160.69525146484375, accuracy : 0.89453125
Epoch : 27 completed out of 100, loss : 173.42813110351562, accuracy : 0.859375
Epoch : 28 completed out of 100, loss : 166.0702667236328, accuracy : 0.87890625
Epoch : 29 completed out of 100, loss : 157.59085083007812, accuracy : 0.87109375
Epoch : 30 completed out of 100, loss : 127.72993469238281, accuracy : 0.9140625
Epoch : 31 completed out of 100, loss : 136.65415954589844, accuracy : 0.90234375
Epoch : 32 completed out of 100, loss : 172.4806365966797, accuracy : 0.8515625
Epoch : 33 completed out of 100, loss : 139.81488037109375, accuracy : 0.8984375
Epoch : 34 completed out of 100, loss : 144.55099487304688, accuracy : 0.85546875
Epoch : 35 completed out of 100, loss : 122.90949249267578, accuracy : 0.8984375
Epoch : 36 completed out of 100, loss : 150.0441131591797, accuracy : 0.890625
Epoch : 37 completed out of 100, loss : 153.2085723876953, accuracy : 0.88671875
Epoch : 38 completed out of 100, loss : 143.91455078125, accuracy : 0.8984375
Epoch : 39 completed out of 100, loss : 117.63712310791016, accuracy : 0.91796875
Epoch : 40 completed out of 100, loss : 93.80998229980469, accuracy : 0.92578125
Epoch : 41 completed out of 100, loss : 136.52537536621094, accuracy : 0.87109375
Epoch : 42 completed out of 100, loss : 137.24530029296875, accuracy : 0.90625
Epoch : 43 completed out of 100, loss : 108.73893737792969, accuracy : 0.921875
Epoch : 44 completed out of 100, loss : 106.48686218261719, accuracy : 0.9296875
Epoch : 45 completed out of 100, loss : 104.41219329833984, accuracy : 0.92578125
Epoch : 46 completed out of 100, loss : 101.19454956054688, accuracy : 0.94140625
Epoch : 47 completed out of 100, loss : 127.536376953125, accuracy : 0.91015625
Epoch : 48 completed out of 100, loss : 109.94172668457031, accuracy : 0.9296875
Epoch : 49 completed out of 100, loss : 85.25288391113281, accuracy : 0.94140625
Epoch : 50 completed out of 100, loss : 112.01800537109375, accuracy : 0.91796875
Epoch : 51 completed out of 100, loss : 107.6760482788086, accuracy : 0.91015625
Epoch : 52 completed out of 100, loss : 121.9848403930664, accuracy : 0.921875
Epoch : 53 completed out of 100, loss : 101.01953887939453, accuracy : 0.9375
Epoch : 54 completed out of 100, loss : 69.95838165283203, accuracy : 0.94921875
Epoch : 55 completed out of 100, loss : 119.3257827758789, accuracy : 0.91796875
Epoch : 56 completed out of 100, loss : 102.73481750488281, accuracy : 0.921875
Epoch : 57 completed out of 100, loss : 89.11821746826172, accuracy : 0.94921875
Epoch : 58 completed out of 100, loss : 110.71992492675781, accuracy : 0.9140625
Epoch : 59 completed out of 100, loss : 105.85194396972656, accuracy : 0.9375
Epoch : 60 completed out of 100, loss : 114.6805648803711, accuracy : 0.921875
Epoch : 61 completed out of 100, loss : 99.33323669433594, accuracy : 0.92578125
Epoch : 62 completed out of 100, loss : 128.26809692382812, accuracy : 0.90625
Epoch : 63 completed out of 100, loss : 117.59638214111328, accuracy : 0.9140625
Epoch : 64 completed out of 100, loss : 86.27313995361328, accuracy : 0.9453125
Epoch : 65 completed out of 100, loss : 114.16581726074219, accuracy : 0.92578125
Epoch : 66 completed out of 100, loss : 102.78227233886719, accuracy : 0.94921875
Epoch : 67 completed out of 100, loss : 88.23193359375, accuracy : 0.9375
Epoch : 68 completed out of 100, loss : 60.24769592285156, accuracy : 0.953125
Epoch : 69 completed out of 100, loss : 97.67103576660156, accuracy : 0.94140625
Epoch : 70 completed out of 100, loss : 86.58494567871094, accuracy : 0.91796875
Epoch : 71 completed out of 100, loss : 98.33272552490234, accuracy : 0.921875
Epoch : 72 completed out of 100, loss : 77.44849395751953, accuracy : 0.94921875
Epoch : 73 completed out of 100, loss : 114.52888488769531, accuracy : 0.9296875
Epoch : 74 completed out of 100, loss : 94.6647720336914, accuracy : 0.9453125
Epoch : 75 completed out of 100, loss : 106.62199401855469, accuracy : 0.921875
Epoch : 76 completed out of 100, loss : 116.0970230102539, accuracy : 0.91015625
Epoch : 77 completed out of 100, loss : 78.5435791015625, accuracy : 0.953125
Epoch : 78 completed out of 100, loss : 125.43787384033203, accuracy : 0.91796875
Epoch : 79 completed out of 100, loss : 112.84344482421875, accuracy : 0.9296875
Epoch : 80 completed out of 100, loss : 65.7440185546875, accuracy : 0.95703125
Epoch : 81 completed out of 100, loss : 115.66653442382812, accuracy : 0.91796875
Epoch : 82 completed out of 100, loss : 76.14566040039062, accuracy : 0.9375
Epoch : 83 completed out of 100, loss : 72.91943359375, accuracy : 0.95703125
Epoch : 84 completed out of 100, loss : 56.55884552001953, accuracy : 0.95703125
Epoch : 85 completed out of 100, loss : 87.09599304199219, accuracy : 0.93359375
Epoch : 86 completed out of 100, loss : 80.97771453857422, accuracy : 0.93359375
Epoch : 87 completed out of 100, loss : 94.14187622070312, accuracy : 0.9453125
Epoch : 88 completed out of 100, loss : 80.44708251953125, accuracy : 0.94140625
Epoch : 89 completed out of 100, loss : 52.18363952636719, accuracy : 0.96875
Epoch : 90 completed out of 100, loss : 93.15214538574219, accuracy : 0.9296875
Epoch : 91 completed out of 100, loss : 97.51387023925781, accuracy : 0.9296875
Epoch : 92 completed out of 100, loss : 82.44243621826172, accuracy : 0.9375
Epoch : 93 completed out of 100, loss : 60.52445983886719, accuracy : 0.96484375
Epoch : 94 completed out of 100, loss : 57.100406646728516, accuracy : 0.96484375
Epoch : 95 completed out of 100, loss : 89.62207794189453, accuracy : 0.94140625
Epoch : 96 completed out of 100, loss : 86.14447784423828, accuracy : 0.9375
Epoch : 97 completed out of 100, loss : 75.90823364257812, accuracy : 0.953125
Epoch : 98 completed out of 100, loss : 65.80587768554688, accuracy : 0.9609375
Epoch : 99 completed out of 100, loss : 114.98580169677734, accuracy : 0.92578125
Accuracy : 0.897300124168396

The hyper-parameters used were as follows:

BATCH_SIZE = 256
CELL_SIZE = 256
DROPOUT_P_KEEP = 0.85
EPOCHS = 100
LEARNING_RATE = 1e-3
SVM_C = 1

Trained using tf.train.AdamOptimizer(), with tf.nn.dynamic_rnn(). The source may still be found here.

The graph from TensorBoard, tracking the training (accuracy at the top, loss at the bottom):

[screenshot from 2017-09-01 22-54-52]

The improvement in accuracy may not be huge, but I suppose it's still a considerable difference, i.e. ~85.5% vs. ~89.7%.

Benchmark: Wide ResNet and DenseNets

WRN40-4 lands at 3.93% error using standard preprocessing (mean/std subtraction/division) and augmentation (random crops/horizontal flips), Nesterov momentum with a half-wave cosine annealing schedule and an initial LR of 0.1, trained for 300 epochs. DenseNet-BC with k=12 and depth 100 lands at 4.64% error with the same settings and 100 epochs of training; a 300-epoch run is currently in progress. The WRN has 8.9M params, the DenseNet-BC 768K.
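For reference, a minimal sketch of a half-wave cosine annealing schedule as described above; this is the standard formula, not the author's exact implementation:

import math

def cosine_lr(epoch, total_epochs=300, initial_lr=0.1):
    # Half-wave cosine: decays smoothly from initial_lr to 0 over the run.
    return 0.5 * initial_lr * (1 + math.cos(math.pi * epoch / total_epochs))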

Is there a colored dataset?

I'm not an expert, but shouldn't a canonical fashion dataset, which seems to be the objective of this repo, be in color?

Not that I'm complaining much; this is a great contribution. But I think it would be an even greater contribution if it were in color and a bit bigger, with grayscale conversion left as a preprocessing step.

Three Layer CNN With 90.33% accuracy

I tried to implement a three-layer CNN with batch normalisation and max pooling. I got an accuracy of 90.33% after 5000 iterations, using batch normalisation after every layer to accelerate training.

Also, I got an accuracy of 99.04% on MNIST with the same network.

The architecture is as follows:

1. Convolutional layer 1 with 16 output feature maps, 5×5 kernel, ReLU activation
   • MaxPooling 1
   • BatchNormalisation 1
2. Convolutional layer 2 with 32 output feature maps, 5×5 kernel, ReLU activation
   • MaxPooling 2
   • BatchNormalisation 2
3. Convolutional layer 3 with 64 output feature maps, 5×5 kernel, ReLU activation
   • MaxPooling 3
   • BatchNormalisation 3
4. Fully connected layer.

Code

benchmark: Try HOG + SVM

Just out of curiosity, I tried an old-fashioned HOG+SVM approach, with almost no tuning of the HOG parameters.

Training time:
50 minutes

Test accuracy:
0.926

You can find the code here (Requires OpenCV!)
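A hedged sketch of such a HOG+SVM pipeline; the HOG parameters below are plausible defaults for 28×28 images and the linear SVM is an assumption, since the issue specifies neither:

import cv2
import numpy as np
from sklearn.svm import LinearSVC

# HOG over the full 28x28 window: 14x14 blocks, 7x7 stride and cells, 9 bins.
hog = cv2.HOGDescriptor((28, 28), (14, 14), (7, 7), (7, 7), 9)

def hog_features(images):
    # images: uint8 array of shape (n, 28, 28)
    return np.array([hog.compute(img).ravel() for img in images])

clf = LinearSVC()
clf.fit(hog_features(train_X), train_y)
print('Test accuracy:', clf.score(hog_features(test_X), test_y))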

Shape of train images returns 55000 although it says 60000. Did I misunderstand something?

When I try to print the dataset, it gives 55000 as the shape. Is there something wrong that I did? I downloaded the data from GitHub and put it in data/fashion as instructed.

from tensorflow.examples.tutorials.mnist import input_data
fashion_mnist = input_data.read_data_sets('data/fashion', one_hot=True)

Extracting data/fashion\train-images-idx3-ubyte.gz
Extracting data/fashion\train-labels-idx1-ubyte.gz
Extracting data/fashion\t10k-images-idx3-ubyte.gz
Extracting data/fashion\t10k-labels-idx1-ubyte.gz

print('Features of Fashion MNIST dataset')
print('Shape of training set images: ', fashion_mnist.train.images.shape)
print('Shape of training set labels', fashion_mnist.train.labels.shape)

Features of Fashion MNIST dataset
Shape of training set images:  (55000, 784)
Shape of training set labels (55000, 10)
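This is expected behavior rather than a data problem: read_data_sets holds out 5,000 of the 60,000 training images as a validation split by default (available under fashion_mnist.validation). Passing validation_size=0 keeps all 60,000 images in the training split:

fashion_mnist = input_data.read_data_sets('data/fashion', one_hot=True,
                                          validation_size=0)
print(fashion_mnist.train.images.shape)  # (60000, 784)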

Suggestion: Rename the repository from MNIST to something else

Someone commented about this issue on Reddit (pasted below), and I think you should seriously consider changing the name of the benchmark to something else while it's still early on.

MNIST stands for "Modified National Institute of Standards and Technology", and the "National Institute of Standards and Technology" might not be too happy with their name being used. Call it something else, especially since it's an entirely new dataset and not a modification/extension of the original NIST dataset.

Plain 9-layer CNN for Benchmark

I have several experimental results with different activation functions and learning rates, using a CNN architecture like this: C(3,32)-C(3,32)-P2-C(3,64)-C(3,64)-P2-FC64-FC64-S10

Activation  Learning Rate  MNIST   Fashion-MNIST
ReLU        0.01           0.9874  0.9883
ReLU        0.001          0.9388  0.9368
SELU        0.01           0.9871  0.9819
SELU        0.001          0.9490  0.8202

For now, my best results are 98.74% and 98.83% for MNIST and Fashion-MNIST, respectively.
The train-val curves can be found in my repository: https://github.com/JMingKuo/fashion-mnist

Loaders which use cropping cause issues, since Fashion-MNIST and MNIST differ in terms of pixels that are always black

Hi,

There is a difference between the two datasets in terms of pixels that are always black across all 60000 training images: the original MNIST data has 67 pixels (out of 28×28) which are zero across all 60000 train images.

I wrote a MATLAB script to check this; please use code.zip and run it on both datasets separately.

This makes all the parsers/loaders which remove padding not work optimally with Fashion-MNIST data. E.g. the MATLAB loader for MNIST images mentioned in README.md doesn't work for Fashion-MNIST data:
https://de.mathworks.com/matlabcentral/fileexchange/27675-read-digits-and-labels-from-mnist-database?focused=5154133&tab=function

Please remove from README.md all loaders which remove padding, since for the pixels in question there exists at least one image in the Fashion-MNIST training data where the pixel is non-black.
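For readers without MATLAB, a quick numpy equivalent of the check (a sketch, assuming train_X is a (60000, 28, 28) array):

import numpy as np

# Count pixel positions that are zero in every training image.
always_black = (train_X == 0).all(axis=0)
print(always_black.sum())  # the issue reports 67 for original MNIST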
