fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

License: MIT License

Jupyter Notebook 100.00%

deep-learning-with-python-notebooks's Introduction

Companion Jupyter notebooks for the book "Deep Learning with Python"

This repository contains Jupyter notebooks implementing the code samples found in the book Deep Learning with Python, 2nd Edition (Manning Publications).

For readability, these notebooks only contain runnable code blocks and section titles, and omit everything else in the book: text paragraphs, figures, and pseudocode. If you want to be able to follow what's going on, I recommend reading the notebooks side by side with your copy of the book.

These notebooks use TensorFlow 2.6.

deep-learning-with-python-notebooks's People

Contributors

bfaissal, derekchia, fchollet, iamaziz, kuz-man, rama100, var-nan

deep-learning-with-python-notebooks's Issues

Unclear how to download data used in 6.3

The instructions in 6.3 say the data is

recorded at the Weather Station at the Max-Planck-Institute for Biogeochemistry in Jena, Germany: http://www.bgc-jena.mpg.de/wetter/.

The data file name in the code is jena_climate_2009_2016.csv. The website allows us to download data in 6-month increments, so the period 2009-2016 is split into 16 files.

Were these files concatenated to form the file used in the notebook? If so, a sentence to that effect might clarify where the data came from. Alternatively, if jena_climate_2009_2016.csv isn't the concatenation of these files, I think it's unclear where a reader would find the data.
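If the notebook's file is indeed a plain concatenation of the half-year exports, it could be rebuilt roughly as follows. This is a minimal sketch under stated assumptions: every export is assumed to share the same header row, the column names and values are made up, and `concatenate_csv` is a hypothetical helper rather than anything from the notebooks. In-memory strings stand in for the downloaded files.

```python
import csv
import io

def concatenate_csv(parts):
    """Concatenate CSV texts that share one header, keeping the header once."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for text in parts:
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            continue
        if header is None:
            header = rows[0]
            writer.writerow(header)
        elif rows[0] != header:
            raise ValueError("header mismatch between parts")
        writer.writerows(rows[1:])
    return out.getvalue()

# Two tiny stand-ins for half-year exports (made-up values).
first_half = '"Date Time","T (degC)"\n01.01.2009 00:10:00,-8.02\n'
second_half = '"Date Time","T (degC)"\n01.07.2009 00:10:00,15.31\n'
combined = concatenate_csv([first_half, second_half])
```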

Output heatmap visualisation (5.4) changes every time I load my network

I tested notebook 5.4 and it works as expected. However, when I change the network from VGG16 to my own network (MobileNet followed by 2 dense layers and dropout layers), the heatmaps change every time I load the network (loading the model in Keras from an .h5 file). Any idea why this happens? The output of the network (the softmax probabilities) is the same for every load, by the way.

3.4.1 Run slowness - [=====>...] - ETA: 59s✓✓✓✓16105472/17464789

I am having an issue with keras leading to my processor seemingly getting bogged down while working through examples.

In the 3.4.1 IMDB data set example, for instance, running the script:
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

Produces an output looking something like:
[=====>...] - ETA: 59s✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓16105472/17464789
That updates increasingly slowly as the numbers get larger and move toward completion.

I'm assuming my installation of keras/Tensorflow/CUDA/cuDNN is to blame, but curious if you know of anything obvious that would solve the issue.

Thanks!

why keep discriminator_model.trainable=False all the time?

I have the same issue as: mjdietzx/SimGAN#6

refiner_model.compile(optimizer=sgd, loss=self_regularization_loss)
discriminator_model.compile(optimizer=sgd, loss=local_adversarial_loss)
discriminator_model.trainable = False
combined_model.compile(optimizer=sgd, loss=[self_regularization_loss, local_adversarial_loss])

I think that, when training the discriminator network, trainable should be switched between True and False alternately.
I haven't dug deeply into the mechanics of Keras, so here is my understanding; I'm not sure whether it is right.

I guess that the trainable state is fixed when a model is compiled: the discriminator was compiled while it was still trainable, so that compiled model keeps training even after discriminator_model.trainable = False is set.
The combined model, compiled afterwards, consists of the trainable refiner model and the untrainable discriminator model. So we don't need to toggle the trainable state during training, unlike many other examples such as: https://github.com/osh/KerasGAN/blob/master/MNIST_CNN_GAN_v2.ipynb

(5.4) Handling 100% probability predictions causes all 0 gradients?

I'm applying this notebook to my own network to great effect, however when my model predicts a particular class with confidence of 1. (presumably this is an overflowed/rounded float32 ??) my call to K.gradients(class_loss_output, last_conv.output)[0] returns a Tensor with all 0s.

I've tested as thoroughly as I know how; this problem ONLY ever arises with 100% confidence predictions.
I've tried directly modifying the class_loss_output Tensor via clipping, normalization, and simply subtracting a very small scalar (1e-5) and verified that each of these truncate the value of this 1-element output Tensor. However, none of the above have resolved the issue.

Is K.gradients() using some particular Tensor attribute in its computation which I'm ignorant of?
Or perhaps is there some simple function available that's meant to handle these instances?
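One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is softmax saturation. The derivative of the softmax output p_i with respect to logit z_j is p_i * (delta_ij - p_j), so once p_i rounds to exactly 1 the factor (1 - p_i) is exactly 0 and the other terms are vanishingly small. Subtracting a constant from the output afterwards does not change these derivatives. The pure-Python sketch below only illustrates the arithmetic; it does not call Keras.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian_row(probs, i):
    """d probs[i] / d logits[j] for every j: p_i * (delta_ij - p_j)."""
    return [probs[i] * ((1.0 if i == j else 0.0) - p) for j, p in enumerate(probs)]

confident = softmax([50.0, 0.0, 0.0])   # class 0 saturates to exactly 1.0
uncertain = softmax([2.0, 1.0, 0.0])

grad_confident = softmax_jacobian_row(confident, 0)
grad_uncertain = softmax_jacobian_row(uncertain, 0)
```

If this is what is happening, taking the gradient of the pre-softmax score for the class instead of the probability might avoid the problem, though that is untested here.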

Ch. 2.2.2, Typo

In the example in chapter 2.2.2, the code creates an array containing four entries. The section right beneath it starts with "This vector has five entries ..." and uses 5 throughout. I believe that should be 4 in all cases?

[4.4-overfitting] which model is better? (original model vs L2-regularized model)

image
This example illustrates the impact of L2 regularization. I can see that the L2-regularized model overfits more slowly than the original model. But the peak validation loss of the L2-regularized model is greater than that of the original model. Thus, for this example, the original model is still better than the L2-regularized model, right?

Training Steps per Epoch Chapter 6.3

The training generator is set up with a batch size of 128, and when fitting the models, the code uses 500 steps per training epoch.

history = model.fit_generator(train_gen,
                              steps_per_epoch=500,
                              epochs=20,
                              validation_data=val_gen,
                              validation_steps=val_steps)

Doesn't this mean the model is only training on 128 * 500 = 64000 training data points rather than all 200000? For validation, the steps per validation epoch are set to val_steps where

# This is how many steps to draw from `val_gen`
# in order to see the whole validation set:
val_steps = (300000 - 200001 - lookback) // batch_size

I probably am misunderstanding some part of this example, but why are we not using all the training examples every epoch? It seems like we could define the number of training steps as

train_steps = (200000 - lookback) // batch_size

in order to use all the training points every epoch.
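The arithmetic in this question can be checked directly. The sketch below uses the numbers quoted above; `lookback` is not given in the issue, so the value 1440 is an assumption for illustration only.

```python
# Values quoted in the issue.
batch_size = 128
steps_per_epoch = 500
# Assumed: lookback is not stated in the issue; 1440 is used for illustration.
lookback = 1440

samples_per_epoch = batch_size * steps_per_epoch
train_steps = (200000 - lookback) // batch_size
val_steps = (300000 - 200001 - lookback) // batch_size

print(samples_per_epoch, train_steps, val_steps)
```

Note that steps_per_epoch only sets how many batches Keras counts as one epoch, so a smaller value mainly changes how often validation runs and how the history is reported.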

Typo in R notebook?

Perhaps I'm misunderstanding, but in https://github.com/jjallaire/deep-learning-with-r-notebooks/blob/master/notebooks/6.1-one-hot-encoding-of-words-or-characters.Rmd , shouldn't

for (i in 1:length(samples)) {
  sample <- samples[[i]]
  words <- head(strsplit(sample, " ")[[1]], n = max_length)
  for (j in 1:length(words)) {
    # Hash the word into a "random" integer index
    # that is between 0 and 1,000
    index <- abs(spooky.32(words[[i]])) %% dimensionality
    results[[i, j, index]] <- 1
  }
}

instead be

for (i in 1:length(samples)) {
  sample <- samples[[i]]
  words <- head(strsplit(sample, " ")[[1]], n = max_length)
  for (j in 1:length(words)) {
    # Hash the word into a "random" integer index
    # that is between 0 and 1,000
    index <- abs(spooky.32(words[[j]])) %% dimensionality
    results[[i, j, index]] <- 1
  }
}

where you hash the word j in each sentence, rather than the sentence i index, so that you get the same hash for the same word across sentences?
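The intended behaviour (the same word always maps to the same index) can be sketched in Python as follows. This is an illustration only: it swaps in `hashlib.md5` for `spooky.32`, uses nested lists instead of an array, and `hashed_one_hot` is a hypothetical helper name.

```python
import hashlib

def hashed_one_hot(samples, dimensionality=1000, max_length=10):
    """Hashing-trick encoding: results[i][j][index] = 1 for word j of sample i."""
    results = [[[0] * dimensionality for _ in range(max_length)] for _ in samples]
    for i, sample in enumerate(samples):
        words = sample.split()[:max_length]
        for j, word in enumerate(words):
            # Hash the word itself (position j), not the sample position i.
            digest = hashlib.md5(word.encode("utf-8")).hexdigest()
            index = int(digest, 16) % dimensionality
            results[i][j][index] = 1
    return results

samples = ["The cat sat on the mat.", "The dog ate my homework."]
encoded = hashed_one_hot(samples)
```

Both samples start with "The", so `encoded[0][0]` and `encoded[1][0]` hold the same one-hot row.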

Chapter 5.2: validation_generator always causes training to fail at the 2nd epoch

Following the Jupyter notebook "5.2-using-convnets-with-small-datasets", we run into an issue with the code below:

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)

fit_generator stops at the 2nd epoch and Keras throws the following message:

StopIteration: Asked to retrieve element 50, but the Sequence has length 50
And the error message indicates:

~\AppData\Local\Continuum\anaconda3\envs\ml\lib\site-packages\keras-2.0.8-py3.5.egg\keras\preprocessing\image.py in __getitem__(self, idx)
    718 'has length {length}'.format(idx=idx,
--> 719 length=len(self)))
    720 if self.seed is not None:

ValueError: Asked to retrieve element 50, but the Sequence has length 50

I am using Keras (v2.0.8) with TensorFlow (v1.3.0) on Windows 10.

2.1 on Google Colab

I uploaded chapter 2.1 onto Google Colab, and was going through the snippets of code. The first difference is in the version of TensorFlow, reported as 2.1.6, so it's likely that this issue has to do with that. Nevertheless, I thought I'd report it, because others are likely going to do the same as what I'm doing.

When executing
network.fit(train_images, train_labels, epochs=5, batch_size=128)
I get an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-8b04f355273b> in <module>()
----> 1 network.fit(train_images, train_labels, epochs=5, batch_size=128)

/usr/local/lib/python3.6/dist-packages/keras/models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1000                               initial_epoch=initial_epoch,
   1001                               steps_per_epoch=steps_per_epoch,
-> 1002                               validation_steps=validation_steps)
   1003 
   1004     def evaluate(self, x=None, y=None,

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1628             sample_weight=sample_weight,
   1629             class_weight=class_weight,
-> 1630             batch_size=batch_size)
   1631         # Prepare validation data.
   1632         do_validation = False

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
   1478                                     output_shapes,
   1479                                     check_batch_axis=False,
-> 1480                                     exception_prefix='target')
   1481         sample_weights = _standardize_sample_weights(sample_weight,
   1482                                                      self._feed_output_names)

/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in _standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    111                         ': expected ' + names[i] + ' to have ' +
    112                         str(len(shape)) + ' dimensions, but got array '
--> 113                         'with shape ' + str(data_shape))
    114                 if not check_batch_axis:
    115                     data_shape = data_shape[1:]

ValueError: Error when checking target: expected dense_2 to have 2 dimensions, but got array with shape (60000, 10, 2)

Is there an easy workaround? Thanks.
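A target shape of (60000, 10, 2) is what you would get if labels that were already one-hot encoded were encoded a second time, for example by re-running the label-preparation cell in the notebook. That is only one possible cause and is not confirmed by the report. The sketch below reproduces the shape with a small pure-Python stand-in; `one_hot` and `shape` are hypothetical helpers, not Keras functions.

```python
def one_hot(labels, num_classes):
    """Minimal stand-in for a categorical encoder; works on nested lists of ints."""
    if isinstance(labels, int):
        return [1 if k == labels else 0 for k in range(num_classes)]
    return [one_hot(item, num_classes) for item in labels]

def shape(nested):
    """Shape of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

labels = [5, 0, 4, 1]          # four digit labels
once = one_hot(labels, 10)     # shape (4, 10): one row of 10 per label
twice = one_hot(once, 2)       # shape (4, 10, 2): encoded a second time
```

If that is the cause, reloading the dataset and running the encoding cell exactly once should bring the targets back to two dimensions.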

Chapter 2.4.3 momentum issue

This is a naive implementation of gradient descent with momentum, and it uses
velocity = past_velocity * momentum + learning_rate * gradient,
but I think it should be velocity = past_velocity * momentum - learning_rate * gradient
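For comparison, here is a minimal sketch of the common textbook form of momentum, in which the velocity accumulates negative gradients and is then added to the weight. It is not the book's listing; the toy loss, the hyperparameter values, and the function name are assumptions made for illustration.

```python
def minimize_with_momentum(grad_fn, w, learning_rate=0.1, momentum=0.9, steps=300):
    """Plain momentum SGD: the velocity accumulates negative gradients."""
    velocity = 0.0
    for _ in range(steps):
        gradient = grad_fn(w)
        velocity = momentum * velocity - learning_rate * gradient
        w = w + velocity
    return w

# Toy loss (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = minimize_with_momentum(lambda w: 2.0 * (w - 3.0), w=0.0)
```

With the sign flipped to a plus, the velocity would point uphill and would have to be subtracted from the weight instead, so the two conventions must be kept consistent within one update rule.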

5.4 - wrong predictions

Hi!
So, I'm running the following code, copied and pasted straight from the notebook:

from keras.applications.vgg16 import VGG16

# Note that we are including the densely-connected classifier on top;
# all previous times, we were discarding it.
model = VGG16(weights='imagenet')

from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np

# The local path to our target image
img_path = 'path_to_my_image.jpg'

# `img` is a PIL image of size 224x224
img = image.load_img(img_path, target_size=(224, 224))

# `x` is a float32 Numpy array of shape (224, 224, 3)
x = image.img_to_array(img)

# We add a dimension to transform our array into a "batch"
# of size (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)

# Finally we preprocess the batch
# (this does channel-wise color normalization)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])

But my results are really strange:

Predicted: [('n03788365', 'mosquito_net', 0.076942876), ('n15075141', 'toilet_tissue', 0.034389611), ('n03887697', 'paper_towel', 0.018613255)]

Any tips?

Rewriting seed text in Listing 8.7 in Chapter 8.1.4

In the book and jupyter notebook (8.1-text-generation-with-lstm.ipynb) there is code that picks a random seed text to feed to the LSTM model:

    # Select a text seed at random
    start_index = random.randint(0, len(text) - maxlen - 1)
    generated_text = text[start_index: start_index + maxlen]
    print('--- Generating with seed: "' + generated_text + '"')

But in the next few lines as we loop through the different temperatures and predict new samples we are changing the variable generated_text and thus feeding in a different seed text at every temperature.

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)  # Printing seed text

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(generated_text):
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            generated_text += next_char
            generated_text = generated_text[1:]  # Redefining seed text

In the book it shows some generated text at different temperatures all with the same seed text but in the notebook there is different seed text for every temperature. Code and book text samples are on page 277.

I think this can be fixed by adding a couple new lines:

    for temperature in [0.2, 0.5, 1.0, 1.2]:
        new_text = generated_text  # Set new_text to original seed text
        print('------ temperature:', temperature)
        sys.stdout.write(generated_text)  # Printing seed text 

        # We generate 400 characters
        for i in range(400):
            sampled = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(new_text):   # Vectorizing new_text
                sampled[0, t, char_indices[char]] = 1.

            preds = model.predict(sampled, verbose=0)[0]
            next_index = sample(preds, temperature)
            next_char = chars[next_index]

            new_text += next_char
            new_text = new_text[1:]

5.3-using-a-pretrained-convnet

Hi,

I am trying to run the 5.3-using-a-pretrained-convnet notebook with my set of images (50x180, RGB) but I get the following error:

train_features, train_labels = extract_features(train_dir, 25000)
  File "F:/Experiments/CNN_chollet/exp02.py", line 156, in extract_features
    features[i * batch_size : (i + 1) * batch_size] = features_batch
ValueError: could not broadcast input array from shape (20,1,5,512) into shape (20,4,4,512)
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x000000000921FEF0>>
Traceback (most recent call last):
  File "D:\Users\lueck\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 582, in __del__
UnboundLocalError: local variable 'status' referenced before assignment

Any suggestion what I am doing wrong?

Thanks in advance
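The two shapes in the error are consistent with a convolutional base that halves the spatial size five times, as VGG16 does with its five 2x2 max-pooling stages. Assuming that is the base in use (the 512 channels suggest it, but the report does not say so), the feature-map size for a given input can be sketched as below; `vgg16_feature_map_size` is a hypothetical helper.

```python
def vgg16_feature_map_size(height, width, num_pools=5):
    """Spatial size after five 2x2, stride-2 max-pooling stages (floor division)."""
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return height, width

notebook_size = vgg16_feature_map_size(150, 150)  # input size used in the notebook
issue_size = vgg16_feature_map_size(50, 180)      # input size from this issue

print(notebook_size, issue_size)
```

So a features array preallocated as (sample_count, 4, 4, 512) fits 150x150 inputs only; for 50x180 inputs it would need to be (sample_count, 1, 5, 512), and any later flattening step would change accordingly.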

[API DESIGN REVIEW] TimeSeriesSequence for timeseries data generation

Doc link: https://docs.google.com/document/d/1GcMWE4sEf205Fbklt3w1l3HL1URSYc7lJIZ3AGh3EA0

Inspired by fchollet@’s proposal on the Projects page, this doc proposes implementation of TimeSeriesSequence as a subclass of keras.utils.Sequence. Given a time-series prediction problem (such as this example), TimeSeriesSequence would take in the temporal dataset and various parameters such as step-size, prediction-delay, lookback etc. to produce data in accordance with Sequence’s interface.

3.7-predicting-house-prices: Potential contamination of the validation data

In section 3.7, the numerical predictors are centered and scaled (separately for the training and test data) - outside the k-fold cross validation loops. Only afterward is the training data split further into training and validation subsets when the k-fold cross-validation is set up.

I believe it is highly recommended to perform all data-dependent transformations within the cross-validation loop. (See this blog post for additional information.)

While this may not affect performance in this case, performing data-dependent transformations outside the cross-validation loop is potentially dangerous. It allows for information learned from the full training data to leak across the cross-validation folds. (See e.g. this example using the Boston housing dataset.)

As the audience of this book includes beginners in the field of machine learning, it would be good to point out this potential pitfall (or, even better, to move this step into the cross-validation loop).
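Moving the step into the loop could look roughly like the sketch below: the mean and standard deviation are computed on each fold's training part only and then applied to that fold's validation part. The feature values are made up, and `standardize` and `k_fold_indices` are hypothetical helpers, not code from the notebook.

```python
import statistics

def standardize(train, other):
    """Scale both lists with the mean and std computed on `train` only."""
    mean = statistics.mean(train)
    std = statistics.pstdev(train)
    return ([(x - mean) / std for x in train],
            [(x - mean) / std for x in other])

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous folds."""
    fold = n // k
    for i in range(k):
        val = list(range(i * fold, (i + 1) * fold))
        train = [j for j in range(n) if j not in val]
        yield train, val

feature = [1.0, 2.0, 3.0, 4.0, 100.0, 6.0, 7.0, 8.0]  # made-up values
folds = []
for train_idx, val_idx in k_fold_indices(len(feature), k=4):
    train = [feature[j] for j in train_idx]
    val = [feature[j] for j in val_idx]
    # Statistics come from this fold's training part only.
    train_scaled, val_scaled = standardize(train, val)
    folds.append((train_scaled, val_scaled))
```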

No chapter 7? Want to cross check layers.Embedding

Want to cross-check the code here with the book, but couldn't find chapter 7.

Why is the layer created as

embedded_text = layers.Embedding(64, text_vocabulary_size)(text_input)

Shouldn't the vocabulary size be the 1st argument and 64 (the embedding dimension) the 2nd?
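For reference, the Keras signature is Embedding(input_dim, output_dim), where input_dim is the vocabulary size. The toy lookup table below mirrors that argument order to show why it matters; `ToyEmbedding` is a hypothetical stand-in, not the Keras layer, and the vocabulary size of 10000 is an assumed value.

```python
import random

class ToyEmbedding:
    """Toy lookup table mirroring the (input_dim, output_dim) argument order."""

    def __init__(self, input_dim, output_dim, seed=0):
        rng = random.Random(seed)
        # One row per token index, one column per embedding dimension.
        self.table = [[rng.uniform(-0.05, 0.05) for _ in range(output_dim)]
                      for _ in range(input_dim)]

    def __call__(self, token_ids):
        return [self.table[t] for t in token_ids]

text_vocabulary_size = 10000  # assumed value for illustration
embedding = ToyEmbedding(text_vocabulary_size, 64)
vectors = embedding([3, 9999])  # largest valid index is vocabulary size - 1
```

With the arguments swapped, the table would have only 64 rows, so most token indices would fall outside it.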

5.3-using-a-pretrained-convnet.ipynb: AttributeError: module 'scipy' has no attribute 'ndimage'

Hi,

I have been trying to run the notebook 5.3-using-a-pretrained-convnet.ipynb, which runs perfectly fine on my personal laptop.
But when I try to run it on a GPU machine, the portion of the code below throws an error:

Code:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    # This is the target directory
    train_dir,
    # All images will be resized to 150x150
    target_size=(150, 150),
    batch_size=20,
    # Since we use binary_crossentropy loss, we need binary labels
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

model.compile(loss='binary_crossentropy',
              optimizer=keras.optimizers.RMSprop(lr=2e-5),
              metrics=['acc'])

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50,
    verbose=0)

Error:

Found 244 images belonging to 2 classes.
Found 153 images belonging to 2 classes.

AttributeError Traceback (most recent call last)
~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/utils/data_utils.py in get(self)
577 while self.is_running():
--> 578 inputs = self.queue.get(block=True).get()
579 self.queue.task_done()

~/anaconda3/envs/fastai/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
643 else:
--> 644 raise self._value
645

~/anaconda3/envs/fastai/lib/python3.6/multiprocessing/pool.py in worker(inqueue, outqueue, initializer, initargs, maxtasks, wrap_exception)
118 try:
--> 119 result = (True, func(*args, **kwds))
120 except Exception as e:

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/utils/data_utils.py in get_index(uid, i)
400 """
--> 401 return _SHARED_SEQUENCES[uid][i]
402

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras_preprocessing/image.py in __getitem__(self, idx)
1295 self.batch_size * (idx + 1)]
-> 1296 return self._get_batches_of_transformed_samples(index_array)
1297

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras_preprocessing/image.py in _get_batches_of_transformed_samples(self, index_array)
1778 params = self.image_data_generator.get_random_transform(x.shape)
-> 1779 x = self.image_data_generator.apply_transform(x, params)
1780 x = self.image_data_generator.standardize(x)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras_preprocessing/image.py in apply_transform(self, x, transform_parameters)
1142 channel_axis=img_channel_axis,
-> 1143 fill_mode=self.fill_mode, cval=self.cval)
1144

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras_preprocessing/image.py in apply_affine_transform(x, theta, tx, ty, shear, zx, zy, row_axis, col_axis, channel_axis, fill_mode, cval)
334 mode=fill_mode,
--> 335 cval=cval) for x_channel in x]
336 x = np.stack(channel_images, axis=0)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras_preprocessing/image.py in <listcomp>(.0)
334 mode=fill_mode,
--> 335 cval=cval) for x_channel in x]
336 x = np.stack(channel_images, axis=0)

AttributeError: module 'scipy' has no attribute 'ndimage'

The above exception was the direct cause of the following exception:

StopIteration Traceback (most recent call last)
<ipython-input> in <module>()
41 validation_data=validation_generator,
42 validation_steps=50,
---> 43 verbose=0)

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your ' + object_name +
90 ' call to the Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1413 use_multiprocessing=use_multiprocessing,
1414 shuffle=shuffle,
-> 1415 initial_epoch=initial_epoch)
1416
1417 @interfaces.legacy_generator_methods_support

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
175 batch_index = 0
176 while steps_done < steps_per_epoch:
--> 177 generator_output = next(output_generator)
178
179 if not hasattr(generator_output, '__len__'):

~/anaconda3/envs/fastai/lib/python3.6/site-packages/keras/utils/data_utils.py in get(self)
582 except Exception as e:
583 self.stop()
--> 584 six.raise_from(StopIteration(e), e)
585
586 def _send_sequence(self):

~/anaconda3/envs/fastai/lib/python3.6/site-packages/six.py in raise_from(value, from_value)

StopIteration: module 'scipy' has no attribute 'ndimage'

Running on keras gpu version.

Would appreciate any help on this.

Thanks
Vishal

6.3-advanced-usage-of-recurrent-neural-networks: Missing arguments in Keras documentation?

Hi Everybody:

I am using Keras 2.0.5.
Here are code samples for section 6.3 of book,
"Advanced use of recurrent neural network" pertaining to GRU:

from keras.models import Sequential
from keras import layers
from keras.optimizers import RMSprop

model = Sequential()
model.add(layers.GRU(32, input_shape=(None, float_data.shape[-1])))
model.add(layers.Dense(1))

model.compile(optimizer=RMSprop(), loss='mae')
history = model.fit_generator(train_gen,
                              steps_per_epoch=500,
                              epochs=20,
                              validation_data=val_gen,
                              validation_steps=val_steps)

My question refers to this code:
model.add(layers.GRU(32, input_shape=(None, float_data.shape[-1])))

When I looked up the documentation for layers.GRU()
at https://keras.io/layers/recurrent/#gru,
I found no mention of an input_shape argument.
The same is true of layers.LSTM().
This makes learning very hard.

I also tried looking at the source code
for class Recurrent(Layer) and class GRU(Recurrent) in recurrent.py
for Keras 2.0.5; input_shape does not seem to exist as an argument
for these classes!

What did I miss?

Why is argument input_shape not listed as one of the many arguments of LSTM() and GRU()
at https://keras.io/layers/recurrent/#lstm and at https://keras.io/layers/recurrent/#gru ?

How do I go about locating specific information on all arguments that are used by these classes?

Thank you.
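As far as I can tell, input_shape is not listed for GRU or LSTM because it is a generic keyword argument that the base Layer class accepts through **kwargs, so it is documented with the core layer behaviour rather than on each subclass. The toy classes below only illustrate that pattern; they are not the Keras source, and the feature count of 14 is an assumed value.

```python
class Layer:
    """Toy base class: accepts `input_shape` through **kwargs."""

    def __init__(self, **kwargs):
        self.input_shape = kwargs.pop("input_shape", None)
        if kwargs:
            raise TypeError("unexpected keyword arguments: %s" % sorted(kwargs))

class GRU(Layer):
    """Toy subclass: its own signature lists `units` only."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

# `input_shape` is not in GRU's signature, yet the call is valid.
layer = GRU(32, input_shape=(None, 14))
```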

5.4: the difference between Windows and Linux

from keras.models import load_model
from keras.preprocessing import image
import numpy as np
import matplotlib.pyplot as plt
from keras import models
import sys
from keras.applications.imagenet_utils import preprocess_input

if __name__ == '__main__':
    sys.path.append(r"./")
    model = load_model(r"./outputs/cats_and_dogs_small_2.h5")
    model.summary()

    img_path = r'./cats_and_dogs_small/test/cats/cat.1700.jpg'
    img = image.load_img(img_path, target_size=(150, 150))
    img_tensor = image.img_to_array(img)
    img_tensor = np.expand_dims(img_tensor, axis=0)
    # Remember that the model was trained on inputs
    # that were preprocessed in the following way:
    img_tensor = preprocess_input(img_tensor)

    # Extracts the outputs of the top 8 layers:
    layer_outputs = [layer.output for layer in model.layers[:8]]
    # Creates a model that will return these outputs, given the model input:
    activation_model = models.Model(inputs=model.input, outputs=layer_outputs)

    # This will return a list of 5 Numpy arrays:
    # one array per layer activation
    activations = activation_model.predict(img_tensor)
    first_layer_activation = activations[0]
    print(first_layer_activation.shape)
    print(first_layer_activation)

This code runs fine on Linux, but on Windows it produces the following errors.

Traceback (most recent call last):
  File "D:/python/python codes/dogs_and_cats/visualization.py", line 29, in <module>
    activations = activation_model.predict(img_tensor)
  File "D:\python\anaconda\lib\site-packages\keras\engine\training.py", line 1167, in predict
    steps=steps)
  File "D:\python\anaconda\lib\site-packages\keras\engine\training_arrays.py", line 294, in predict_loop
    batch_outs = f(ins_batch)
  File "D:\python\anaconda\lib\site-packages\keras\backend\tensorflow_backend.py", line 2666, in __call__
    return self._call(inputs)
  File "D:\python\anaconda\lib\site-packages\keras\backend\tensorflow_backend.py", line 2635, in _call
    session)
  File "D:\python\anaconda\lib\site-packages\keras\backend\tensorflow_backend.py", line 2587, in _make_callable
    callable_fn = session._make_callable_from_options(callable_opts)
  File "D:\python\anaconda\lib\site-packages\tensorflow\python\client\session.py", line 1480, in _make_callable_from_options
    return BaseSession._Callable(self, callable_options)
  File "D:\python\anaconda\lib\site-packages\tensorflow\python\client\session.py", line 1441, in __init__
    session._session, options_ptr, status)
  File "D:\python\anaconda\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: input_1:0 is both fed and fetched.
Exception ignored in: <bound method BaseSession._Callable.__del__ of <tensorflow.python.client.session.BaseSession._Callable object at 0x000001E29716B160>>
Traceback (most recent call last):
  File "D:\python\anaconda\lib\site-packages\tensorflow\python\client\session.py", line 1464, in __del__
    self._session._session, self._handle, status)
  File "D:\python\anaconda\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: No such callable handle: 2072641053968

Confusion about the generator definition in Chapter 6.3

Dear author,
In addition to the model prediction question from Chapter 6.3 raised in the previous issue
("Next prediction for 6.3, temperature after data set ends? Also, testing set?"), I do not understand how the generator function is defined.
In supervised learning, we train the model on training data (pairs of predictors and responses). In the regression example in Chapter 3.6, the predictor is a vector and the response is a numerical value. But in Chapter 6.3,
indices = range(rows[j] - lookback, rows[j], step)
samples[j] = data[indices]
targets[j] = data[rows[j] + delay][1]
the predictor for a sample time point is a matrix and the response is a numerical value.
Why did you define the regression in this way? I understand that in image classification, as in the example in Chapter 5.2, you use the generator to automatically generate the training data: a 3D image corresponds to a label, so you do not need to store all the images in memory.
Also, why is batch_size = 128? Is it because you want val_steps to be an integer? In a real application, how do we choose a good batch_size?
Thank you very much!
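To make the three quoted lines concrete, here is a small sketch of how one (sample, target) pair is cut out of the series: the sample is a window of past timesteps (a matrix of timesteps by features) and the target is a single future value from column 1. The toy data and parameter values are made up, and `make_sample` is a hypothetical helper.

```python
def make_sample(data, row, lookback, delay, step, target_column=1):
    """Build one (sample, target) pair the way the quoted generator lines do."""
    indices = range(row - lookback, row, step)
    sample = [data[i] for i in indices]          # matrix: timesteps x features
    target = data[row + delay][target_column]    # scalar: one future value
    return sample, target

# Toy series: 40 timesteps, 3 features each (made-up numbers).
data = [[t, 10.0 + t, 100.0 + t] for t in range(40)]
sample, target = make_sample(data, row=20, lookback=12, delay=5, step=3)
```

Here the window covers timesteps 8, 11, 14 and 17, and the target is column 1 at timestep 25.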

Change in activation function in 8.5

I used the code given in 8.5 to produce new images, but the images looked like noise to me. When I changed the activation function of the generator's last layer from "tanh" to "sigmoid", it worked like magic.
The reason, I think, is that the generator output needs to be in the range [0, 1], but "tanh" gives output in the range [-1, 1],
while "sigmoid" gives output in the range [0, 1].
Am I right?
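The range argument can be checked numerically. A tanh output can be mapped to [0, 1] with (y + 1) / 2, which is the same as sigmoid(2x), so the two activations differ only by this rescaling. Whether a range mismatch is what caused the noisy images here is an assumption; the sketch only shows the relationship between the two functions.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_to_unit_range(y):
    """Map a tanh output in [-1, 1] to [0, 1]."""
    return (y + 1.0) / 2.0

xs = [-3.0, -0.5, 0.0, 0.5, 3.0]
rescaled = [tanh_to_unit_range(math.tanh(x)) for x in xs]
via_sigmoid = [sigmoid(2.0 * x) for x in xs]
```

In practice this means the training images and the generator output should use the same range: either scale the images to [-1, 1] and keep tanh, or keep them in [0, 1] and use sigmoid.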

6.3 using LSTM layers instead of GRU layers gives nan, why?

When I use LSTM instead of GRU, as the "Going even further" part suggests,
both the loss and the validation loss of the stacked LSTM model are nan:
2018-04-25 14-57-01
Why do LSTM and GRU differ so much, and where does the nan come from?
When I train the stacked LSTM on a GPU, the loss is no longer nan, but it is a very large number.
Why do the GPU and CPU results differ so much?
2018-04-25 15-00-28
When I change RMSProp to Adam on the GPU, the loss behaves strangely too,
going from 0.8** to 0.7** to 5*****.*** to 4*****.*** to a very large number, 2***********.****
2018-04-25 15-14-06

Issue with "Class Activation Map" (CAM) visualization - zero mean intensity of gradient

I am using Keras with tensorflow backend and I have fine-tuned the last Conv layer and FC layer of my network based on VGG weights. Now I am using CAM technique to visualize which parts of my image triggered the prediction and I get all zeros for mean intensity of the gradient over a specific feature map channel.

I have 4 classes. For my test sample this is the prediction:

preds_sample = model.predict(x)
output>> array([[1., 0., 0., 0.]], dtype=float32)

# This is the "sample image" entry in the prediction vector    
image_0 = model.output[:, 0]

last_conv_layer = model.get_layer('conv2d_13')
grads = K.gradients(image_0, last_conv_layer.output)[0]
grads.shape
output>> TensorShape([Dimension(None), Dimension(512), Dimension(14), Dimension(14)])

Since I am using theano image ordering - when I calculate the mean of grads my axis is (0,2,3)

from keras import backend as K
K.set_image_dim_ordering('th')

pooled_grads = K.mean(grads, axis=(0,2,3))
pooled_grads.shape
output>> TensorShape([Dimension(512)])

iterate = K.function([model.input], [pooled_grads, last_conv_layer.output[0]])
pooled_grads_value, conv_layer_output_value = iterate([x])
pooled_grads_value.shape, conv_layer_output_value.shape
output>> ((512,), (512, 14, 14))

pooled_grads_value is all zero. Any thoughts/help appreciated.
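
One possible cause, offered as an assumption rather than a diagnosis: the prediction shown above is exactly [1., 0., 0., 0.] in float32, and a saturated softmax has a vanishing gradient. The NumPy sketch below (not Keras code; the logits are made up) evaluates the softmax Jacobian for a moderate and a saturated case.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability.
    shifted = logits - logits.max()
    e = np.exp(shifted)
    return e / e.sum()

def softmax_jacobian(p):
    # d p_i / d z_j = p_i * (delta_ij - p_j)
    return np.diag(p) - np.outer(p, p)

# Made-up logits; float32 mirrors the dtype in the post.
moderate = softmax(np.array([2.0, 1.0, 0.5, 0.1], dtype=np.float32))
saturated = softmax(np.array([200.0, 1.0, 0.5, 0.1], dtype=np.float32))

# Gradient of the class-0 probability with respect to the logits.
grad_moderate = softmax_jacobian(moderate)[0]
grad_saturated = softmax_jacobian(saturated)[0]

print(saturated)
print(grad_moderate)
print(grad_saturated)
```

If this is what happens here, every gradient that flows back through the softmax is multiplied by (near) zero. Taking the gradient of the pre-softmax score instead of `model.output` may be worth trying.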

Ch. 2.2.5, Typo on page 33

About the middle of page 33, the text says " ... an array of 60000 matrices of 28 x 8 integers." I believe that should be 28 x 28 integers instead?

5.4-visualizing-what-convnets-learn: heatmap vs saliency map

In this notebook, in the subsection "Visualizing heatmaps of class activation", a heatmap of the last conv layer is shown, and at the bottom it says "In particular, it is interesting to note that the ears of the elephant cub are strongly activated".

Out of curiosity, alongside the heatmap example, I made a "saliency map". The result is the following:
[image: elephants_saliency]

Looking at the saliency map, it seems that pixels from both elephants contribute almost equally to the final classification result. I can't tell that the ears of the elephant cub contribute more.
So why is there such an incongruity between the heatmap and the saliency map?

The following code is used to build and display the saliency map (borrowed from keras-team/keras#1777 and modified a bit; I am quite new to Python, so please forgive the naivety of the code):

def compile_saliency_function(model):
    """
    Compiles a function to compute the saliency maps and predicted classes
    for a given minibatch of input images.
    """
    inp = model.layers[0].input
    outp = model.layers[-1].output
    max_outp = K.max(outp, axis=1)
    saliency = K.gradients(max_outp[0], inp)[0]
    max_class = K.argmax(outp, axis=1)
    return K.function([inp], [saliency])

sal = compile_saliency_function(model)([x])
sal_gray = np.max(sal[0], axis=3) 
sal_gray = np.maximum(sal_gray, 0)
sal_gray /= np.max(sal_gray)
plt.imshow(sal_gray[0],cmap="gray")

Chapter 6.3 - why shuffle training data for time-series RNN?

In listing 6.33, which builds an RNN for the Jena weather dataset, the 'generator' function has a 'shuffle' parameter:

if shuffle:
    rows = np.random.randint(
        min_index + lookback, max_index, size=batch_size)

I don't understand this - given it's time data, why would you want to randomize its order? Can anybody explain?

Thanks!
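
A small sketch of what the quoted lines appear to randomize. The data is synthetic (each row stores its own time index), lookback = 1440 and step = 6 are assumed from the book, and `rng.integers` stands in for `np.random.randint`. The random draw only picks where each window ends; inside each window the timesteps stay in chronological order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each row holds its own time index, so the ordering is easy to inspect.
data = np.arange(3000, dtype=float).reshape(-1, 1)

lookback, step, batch_size = 1440, 6, 4  # assumed / illustrative values
min_index, max_index = 0, 2500

# Shuffled case: random end positions for the windows in one batch.
rows = rng.integers(min_index + lookback, max_index, size=batch_size)

samples = np.stack([data[r - lookback:r:step] for r in rows])
time_idx = samples[:, :, 0]

print(rows)
# Within every window, consecutive entries are exactly `step` apart.
print(np.all(np.diff(time_idx, axis=1) == step))
```

So the order of samples across batches is randomized, not the order of observations within a sample.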

Next prediction for 6.3, temperature after data set ends? Also, testing set?

Hello, I was wondering how to adapt the code in 6.3 to actually make a genuinely new prediction, based upon the final few data points (lookback) in the Jena temperature challenge. As in, what would the next temperature point likely be after the data set ended? Every time I try I get a dimension error, so what would prediction = model.predict(?) look like?

Also, how do I write the code for a single-pass test step, as was recommended, to avoid overfitting through model tweaking on the validation results? Thank you!!
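
A minimal sketch of how the input for one new prediction could be shaped. It assumes the model from 6.3 was trained on windows of lookback = 1440 timesteps sampled every step = 6, with 14 features; `float_data` here is random, and `model` is not defined in this snippet.

```python
import numpy as np

rng = np.random.default_rng(0)
float_data = rng.normal(size=(5000, 14))  # synthetic stand-in, assumed already normalized

lookback, step = 1440, 6  # assumed values from the book's example

# Take the last `lookback` timesteps, keeping every `step`-th one.
last_window = float_data[-lookback::step]

# Keras expects a leading batch axis: (batch, timesteps, features).
batch = last_window[np.newaxis, ...]
print(batch.shape)

# prediction = model.predict(batch)  # `model` is the trained network, not defined here
```

A dimension error is often a missing batch axis or a window whose length differs from the one used in training, though that is only a guess about the error seen here.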

One-Hot Encoding

In section '6.1.1 One-hot encoding of words and characters' (as well as section '3.4.2 Preparing the data'), the encoding produced by 'Listing 6.3 Using Keras for word-level one-hot encoding' does not appear to be one-hot encoding as described in Listing 6.1 and 6.2.

The code from Listing 6.1 produces this encoding:

array([[[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],

       [[0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]])

The code for 6.3 produces this encoding:

array([[0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 1., 1., 1., 1.]])

Why are they different?

Thanks!
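
One way to read the difference, as a sketch: the first array has one one-hot row per word position (samples x positions x vocabulary), while the second has one row per sample that marks which words occur anywhere in it. The NumPy code below uses a simple hand-written tokenizer that lowercases and strips punctuation, which is an assumption and not the Keras Tokenizer. Its per-word array therefore differs slightly from the Listing 6.1 output, but collapsing it over the position axis reproduces the second array.

```python
import numpy as np

samples = ["The cat sat on the mat.", "The dog ate my homework."]

def tokenize(text):
    # Hypothetical helper: lowercase and strip trailing punctuation.
    return [w.strip(".,!?").lower() for w in text.split()]

# Assign indices from 1 in order of first appearance; index 0 stays unused.
word_index = {}
for sample in samples:
    for word in tokenize(sample):
        if word not in word_index:
            word_index[word] = len(word_index) + 1

max_length = 10  # word positions kept per sample
num_words = 10   # vocabulary columns

# Per-word one-hot: shape (samples, positions, vocabulary).
per_word = np.zeros((len(samples), max_length, num_words))
for i, sample in enumerate(samples):
    for j, word in enumerate(tokenize(sample)[:max_length]):
        per_word[i, j, word_index[word]] = 1.0

# Collapsing the position axis gives one "multi-hot" row per sample.
per_sample = per_word.max(axis=1)
print(per_sample)
```

The per-sample form drops word order and repetition, which is why "the" appears only once in the first row.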

5.2, "Building our network" step gives me ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_1/MaxPool'

Ubuntu 16.04, Keras 2.1.3, Tensorflow 1.4.1, Tensorflow-gpu 1.4.1. Using the GPU.
Previous steps are fine, but at "Building our network" I have a problem.
I commented out code so that only these lines execute:

from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))

And I get the following trace


InvalidArgumentError Traceback (most recent call last)
~/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
685 graph_def_version, node_def_str, input_shapes, input_tensors,
--> 686 input_tensors_as_shapes, status)
687 except errors.InvalidArgumentError as err:

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
472 compat.as_text(c_api.TF_Message(self.status.status)),
--> 473 c_api.TF_GetCode(self.status.status))
474 # Delete the underlying status object from memory otherwise it stays alive

InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_2/MaxPool' (op: 'MaxPool') with input shapes: [?,32,148,1].

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in ()
5 model.add(layers.Conv2D(32, (3, 3), activation='relu',
6 input_shape=(150, 150, 3)))
----> 7 model.add(layers.MaxPooling2D((2, 2)))
8 #model.add(layers.Conv2D(64, (3, 3), activation='relu'))
9 #model.add(layers.MaxPooling2D((2, 2)))

~/.local/lib/python3.5/site-packages/keras/models.py in add(self, layer)
490 output_shapes=[self.outputs[0]._keras_shape])
491 else:
--> 492 output_tensor = layer(self.outputs[0])
493 if isinstance(output_tensor, list):
494 raise TypeError('All layers in a Sequential model '

~/.local/lib/python3.5/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
615
616 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 617 output = self.call(inputs, **kwargs)
618 output_mask = self.compute_mask(inputs, previous_mask)
619

~/.local/lib/python3.5/site-packages/keras/layers/pooling.py in call(self, inputs)
156 strides=self.strides,
157 padding=self.padding,
--> 158 data_format=self.data_format)
159 return output
160

~/.local/lib/python3.5/site-packages/keras/layers/pooling.py in _pooling_function(self, inputs, pool_size, strides, padding, data_format)
219 output = K.pool2d(inputs, pool_size, strides,
220 padding, data_format,
--> 221 pool_mode='max')
222 return output
223

~/.local/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in pool2d(x, pool_size, strides, padding, data_format, pool_mode)
3652 x = tf.nn.max_pool(x, pool_size, strides,
3653 padding=padding,
-> 3654 data_format=tf_data_format)
3655 elif pool_mode == 'avg':
3656 x = tf.nn.avg_pool(x, pool_size, strides,

~/.local/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py in max_pool(value, ksize, strides, padding, data_format, name)
1956 padding=padding,
1957 data_format=data_format,
-> 1958 name=name)
1959
1960

~/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py in _max_pool(input, ksize, strides, padding, data_format, name)
2804 _, _, _op = _op_def_lib._apply_op_helper(
2805 "MaxPool", input=input, ksize=ksize, strides=strides, padding=padding,
-> 2806 data_format=data_format, name=name)
2807 _result = _op.outputs[:]
2808 _inputs_flat = _op.inputs

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
785 op = g.create_op(op_type_name, inputs, output_types, name=scope,
786 input_types=input_types, attrs=attr_protos,
--> 787 op_def=op_def)
788 return output_structure, op_def.is_stateful, op
789

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
2956 op_def=op_def)
2957 if compute_shapes:
-> 2958 set_shapes_for_outputs(ret)
2959 self._add_op(ret)
2960 self._record_op_seen_by_control_dependencies(ret)

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in set_shapes_for_outputs(op)
2207 shape_func = _call_cpp_shape_fn_and_require_op
2208
-> 2209 shapes = shape_func(op)
2210 if shapes is None:
2211 raise RuntimeError(

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py in call_with_requiring(op)
2157
2158 def call_with_requiring(op):
-> 2159 return call_cpp_shape_fn(op, require_shape_fn=True)
2160
2161 _call_cpp_shape_fn_and_require_op = call_with_requiring

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in call_cpp_shape_fn(op, require_shape_fn)
625 res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
626 input_tensors_as_shapes_needed,
--> 627 require_shape_fn)
628 if not isinstance(res, dict):
629 # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).

~/.local/lib/python3.5/site-packages/tensorflow/python/framework/common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
689 missing_shape_fn = True
690 else:
--> 691 raise ValueError(err.message)
692
693 if missing_shape_fn:

ValueError: Negative dimension size caused by subtracting 2 from 1 for 'max_pooling2d_2/MaxPool' (op: 'MaxPool') with input shapes: [?,32,148,1].
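
The input shape in the error, [?,32,148,1], hints at a possible cause: if Keras is configured for 'channels_first', then input_shape=(150, 150, 3) is read as 150 channels of 150 x 3 pixels. The pure-Python sketch below traces the spatial sizes under both settings. The helper names are hypothetical and it only does shape arithmetic for 'valid' padding.

```python
def conv_valid(size, kernel):
    # Output length of a 'valid' convolution with stride 1.
    return size - kernel + 1

def trace_shapes(input_shape, data_format):
    """Return the spatial size after Conv2D(3x3) and whether a 2x2 pool still fits."""
    if data_format == "channels_last":
        height, width, _channels = input_shape
    else:  # "channels_first"
        _channels, height, width = input_shape
    height, width = conv_valid(height, 3), conv_valid(width, 3)
    pool_fits = min(height, width) >= 2
    return (height, width), pool_fits

print(trace_shapes((150, 150, 3), "channels_last"))
print(trace_shapes((150, 150, 3), "channels_first"))
```

Under 'channels_first' the width after the convolution is 1, so a 2 x 2 pooling window cannot fit, which matches "subtracting 2 from 1". Checking `K.image_data_format()` or the "image_data_format" entry in ~/.keras/keras.json would confirm or rule this out.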

How to actually get the Temperature prediction - weather example chapter 6

After training the GRU architecture in the book on the Jena weather dataset (ch#6), I am having difficulties understanding the prediction phase:

The last layer - Dense with no activation - outputs as expected, a stream of numbers:
Dimensions: Num of rows X 1.
I guess these are the predictions.

The problem is that the input is num rows X num cols (14 parameters), and the prediction output cannot be reshaped into a 14-column array like the one the weather data set contains.

Aren't the predictions supposed to have the same dimensions as the input?

If the predictions cover all 14 parameters and have been squeezed through a Dense layer with ONE unit, shouldn't this prediction "stream" be reshapeable back to num rows X num cols?

Thx
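
A sketch of one way to read the (num rows X 1) output. It assumes, as in the book's generator, that the target is only column 1 (temperature) and that the data was normalized with the training mean and standard deviation. All arrays here are synthetic, and `preds` stands in for the result of `model.predict`.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(loc=10.0, scale=8.0, size=(1000, 14))  # synthetic "weather" data

# Normalize with statistics from the training portion only (800 rows is illustrative).
mean = raw[:800].mean(axis=0)
std = raw[:800].std(axis=0)
normalized = (raw - mean) / std

# Stand-in for model.predict(...): shape (num_samples, 1), normalized temperature only.
preds = normalized[-5:, 1].reshape(-1, 1)

# Undo the normalization for the temperature column to get real units back.
temperature = preds[:, 0] * std[1] + mean[1]
print(preds.shape, temperature)
```

Under that assumption the output has one column because the model predicts a single quantity, not all 14 inputs.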

Chapter 6.3 How Would We Implement Preprocessing Using Pandas?

In this section we don't use pandas for preprocessing. I was wondering how we would do the same preprocessing with pandas, since it also handles time series data and is overall much smoother. If anyone can help me on this point, I'd greatly appreciate it.
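
A possible starting point, as a sketch only: it normalizes each column with the mean and standard deviation of the training rows, which is what the NumPy version in the book does. The DataFrame is synthetic, with three made-up columns standing in for the CSV, and `train_rows` is a hypothetical split point.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for the CSV: a 10-minute DatetimeIndex and a few numeric columns.
index = pd.date_range("2009-01-01", periods=1000, freq="10min")
df = pd.DataFrame(
    rng.normal(size=(1000, 3)),
    index=index,
    columns=["p (mbar)", "T (degC)", "rho (g/m**3)"],
)

train_rows = 800  # hypothetical split point
train = df.iloc[:train_rows]

# ddof=0 matches NumPy's default std; pandas defaults to ddof=1.
mean = train.mean()
std = train.std(ddof=0)
normalized = (df - mean) / std

# Back to a plain float array for the generator.
float_data = normalized.to_numpy()
print(float_data.shape)
```

With the real file, the DataFrame would come from `pd.read_csv` with the date column as index; the rest should carry over unchanged.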

Training and validation loss plot uses wrong history.history keys

Notebook 3.5-classifying-movie-reviews

The code that is supposed to generate the training and validation loss plots side by side uses the wrong history.history keys:

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

which results in the following errors:
KeyError: 'acc'
KeyError: 'val_acc'

If you examine the history.history keys, you can see that they have different names:
history_dict = history.history
history_dict.keys()

dict_keys(['loss', 'val_loss', 'binary_accuracy', 'val_binary_accuracy'])

This is probably due to the use of metrics.binary_accuracy as the metric for evaluating the model. The following change in keys would fix the error:
acc = history.history['binary_accuracy']
val_acc = history.history['val_binary_accuracy']
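
A small sketch of a lookup that does not hard-code the metric name. The helper `find_metric_keys` is hypothetical, and the dictionary below imitates `history.history` with made-up values.

```python
def find_metric_keys(history_dict):
    """Return (train_key, val_key) for the first non-loss metric that has a val_ twin."""
    for key in history_dict:
        if key == "loss" or key.startswith("val_"):
            continue
        if "val_" + key in history_dict:
            return key, "val_" + key
    raise KeyError("no metric with a matching val_ entry found")

# Made-up values, same keys as in the post.
history_dict = {
    "loss": [0.50, 0.30],
    "val_loss": [0.60, 0.40],
    "binary_accuracy": [0.80, 0.90],
    "val_binary_accuracy": [0.75, 0.85],
}

acc_key, val_acc_key = find_metric_keys(history_dict)
acc = history_dict[acc_key]
val_acc = history_dict[val_acc_key]
print(acc_key, val_acc_key)
```

This keeps the plotting code working whether the metric was registered as 'acc', 'accuracy', or 'binary_accuracy'.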

5.3 Using a pretrained convnet

When doing 5.3, the version where you add to the existing convolutional base, I accidentally forgot to set the trainable flag to False. Much to my surprise, the model trained fine anyway and pretty much reproduced the results from the book, with about 97% accuracy on the test set.

I later realized my mistake and tried again with the flag set to False, but then the model trained much worse, ending up with an accuracy of just 90%.

This seems weird to me, but I can't figure out what's wrong. Has anyone else had this experience?

Question about 'label' in training of GAN

Code below is part of '8.5-introduction-to-gans.ipynb' .

# Combine them with real images
stop = start + batch_size
real_images = x_train[start: stop]
combined_images = np.concatenate([generated_images, real_images])

# Assemble labels discriminating real from fake images
labels = np.concatenate([np.ones((batch_size, 1)),
                         np.zeros((batch_size, 1))])
# Add random noise to the labels - important trick!
labels += 0.05 * np.random.random(labels.shape)

# Train the discriminator
d_loss = discriminator.train_on_batch(combined_images, labels)

The label of generated_images for training the discriminator is '1' (np.ones((batch_size, 1))).

As I understand GANs, the output of the discriminator is the probability that the image is real.
That means the label of generated_images for the discriminator should be '0', because they are fake. However, the above code does not do this.

Thus, I think the labels should be as below:
labels = np.concatenate([np.zeros((batch_size, 1)),
                         np.ones((batch_size, 1))])

If this is wrong, could you tell me why?

Thanks :)
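
One reading of the quoted snippet, as a sketch: since generated_images come first in combined_images and receive np.ones, the notebook seems to use 1 for "generated" and 0 for "real". That is the reverse of the usual convention, but it should be equally valid as long as the generator's targets are flipped to match. The constants and the `misleading_targets` name below are assumptions about how the rest of the training loop would be written.

```python
import numpy as np

batch_size = 4

# Convention implied by the quoted snippet (assumption): 1 = generated, 0 = real.
FAKE, REAL = 1.0, 0.0

# Discriminator labels, in the same order as [generated_images, real_images].
labels = np.concatenate([
    np.full((batch_size, 1), FAKE),
    np.full((batch_size, 1), REAL),
])

# When training the generator through the frozen discriminator, the targets
# claim "these are real", which under this convention means zeros.
misleading_targets = np.full((batch_size, 1), REAL)

print(labels.ravel())
print(misleading_targets.ravel())
```

Under this convention the discriminator output reads as the probability that an image is generated, not that it is real.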

Use of K.update_add leads to NoneType in K.gradients

I tried the code related to Deep Dream and ran into this warning:

WARNING:tensorflow:Variable += will be deprecated. Use variable.assign_add if you want assignment to the variable value or 'x = x + y' if you want a new python Tensor object.

This warning is triggered by this line:
loss += coeff * K.sum(K.square(activation[:, 2: -2, 2: -2, :])) / scaling

So I tried to use K.update_add() instead. But after that, "grads = K.gradients(loss, dream)[0]" gives NoneType. From what I found by searching, a non-differentiable "loss" can cause this, so I suspect .update_add() is somehow responsible. When I switch back to "+=", K.gradients() returns the correct result.

Is this a bug in Keras?

TensorBoard example in 7.2.2 fails

When I add the TensorBoard callback to my model.fit(), I get the following error after the first epoch:

018-01-16 20:19:43.503627: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: You must feed a value for placeholder tensor 'embed_input' with dtype float and shape [?,500]
[[Node: embed_input = Placeholderdtype=DT_FLOAT, shape=[?,500], _device="/job:localhost/replica:0/task:0/gpu:0"]]

model.fit() works fine without the callback. What is the solution? I am unable to find a workaround.

Also, I think chapter 7 should have its own Jupyter notebook like the other chapters. Your Jupyter notebooks rock!

class activation heat map using resnet50?

Hello,

I'm trying to use ResNet50 for a class activation heat map using the last conv layer. I chose 'bn5c_branch2c', but the heat map came out completely black. Can you advise me on which ResNet50 layer I should use?

I did successfully get the vgg16 heatmap working (using 'block5_conv3').

Thank you!
Xinxin
