keras-team / tf-keras

Do you want to contribute a PR? (yes/no): If someone can tell me how to do it, I will do it.

Extra GPU-CPU memory transfer when broadcasting operations between integer tensors

I was asked to cross-post this issue here,
tensorflow/tensorflow#54197

Thank you

Please go to TF Forum for help and support:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

If you open a GitHub issue, here is our policy:

It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
The form below must be filled out.

Here's why we have that policy:

Keras developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras):
OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
TensorFlow installed from (source or binary):
TensorFlow version (use command below):
Python version:
Bazel version (if compiling from source):
GPU model and memory:
Exact command to reproduce:

You can collect some of this information using our environment capture script:

You can obtain the TensorFlow version with:
python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"

Describe the problem.

Describe the problem clearly here. Be sure to convey here why it's a bug in Keras or why the requested feature is needed.

Describe the current behavior.

Describe the expected behavior.

https://colab.research.google.com/gist/amahendrakar/8b65a688dc87ce9ca07ffb0ce50b84c7/44199.ipynb#scrollTo=fEjmSrKIqiiM

Do you want to contribute a PR? (yes/no):
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/Jupyter/any notebook.

Source code / logs.

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.

Unable to restore a layer of class TextVectorization - Text Classification

Moving user issue from: tensorflow/tensorflow#45231

Describe the problem.

When I run the official TensorFlow "Basic text classification" example, everything runs fine until the model is saved. But when I load the model, I get this error:

RuntimeError: Unable to restore a layer of class TextVectorization. Layers of class TextVectorization require that the class be provided to the model loading code, either by registering the class using @keras.utils.register_keras_serializable on the class def and including that file in your program, or by passing the class in a keras.utils.CustomObjectScope that wraps this load call.

Model should be loaded successfully and process raw input

Example Link: https://tensorflow.google.cn/tutorials/keras/text_classification

Model consuming RaggedTensors fails during evaluation in a distributed setting

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab and Debian 10
TensorFlow installed from (source or binary): Binary
TensorFlow version (use command below): 2.6.0
Python version:
Bazel version (if compiling from source):
GPU model and memory: V100 (16 GB)
Exact command to reproduce:

Describe the problem.

We have a model that consumes multiple ragged tensors in a batch. Our model runs perfectly fine on a single GPU. But the moment we introduce distributed training, its evaluation fails.

Note that training in the distributed setting proceeds smoothly; it is only during evaluation that it fails. Since we cannot provide the original data and model, we are providing a minimal snippet in the following notebook that reproduces the issue. You can reproduce the issue on Colab as well as on a multi-GPU machine. We have verified on both, and the issue persists.

Describe the current behavior.

Model consuming RaggedTensors fails during evaluation in a distributed setting.

Describe the expected behavior.

The model should run during evaluation without any errors when exposed to a distributed setting.

Do you want to contribute a PR? (yes/no): No.
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

Colab Notebook: https://colab.research.google.com/drive/1U9oeed5OMAH1KvN5T455kAsB2Nsh1-KF?usp=sharing.

Source code / logs.

ValueError: in user code:

    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1330 test_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:1319 step_function  **
        data = next(iterator)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:693 __next__
        return self.get_next()
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:744 get_next
        self, self._strategy, return_per_replica=False)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:611 _get_next_as_optional
        iterator._iterators[i].get_next_as_list())  # pylint: disable=protected-access
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:1990 get_next_as_list
        strict=True,
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/deprecation.py:549 new_func
        return func(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/control_flow_ops.py:1254 cond
        return cond_v2.cond_v2(pred, true_fn, false_fn, name)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/cond_v2.py:95 cond_v2
        op_return_value=pred)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py:1007 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:1989 <lambda>
        lambda: _dummy_tensor_fn(data.element_spec),
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:1853 _dummy_tensor_fn
        return nest.map_structure(create_dummy_tensor, value_structure)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/nest.py:869 map_structure
        structure[0], [func(*x) for x in entries],
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/nest.py:869 <listcomp>
        structure[0], [func(*x) for x in entries],
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/input_lib.py:1849 create_dummy_tensor
        dummy_tensor, (row_splits,) * spec._ragged_rank, validate=False)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/ragged/ragged_tensor.py:745 from_nested_row_splits
        result = cls.from_row_splits(result, splits, validate=validate)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/ragged/ragged_tensor.py:454 from_row_splits
        return cls._from_row_partition(values, row_partition, validate=validate)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/ragged/ragged_tensor.py:348 _from_row_partition
        return cls(values=values, internal=True, row_partition=row_partition)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/ragged/ragged_tensor.py:294 __init__
        values.shape.with_rank_at_least(1)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/tensor_shape.py:1078 with_rank_at_least
        raise ValueError("Shape %s must have rank at least %d" % (self, rank))

    ValueError: Shape () must have rank at least 1

Cc: @Nilabhra

tf.keras computes incorrect loss values with Masking

This issue is copied from tensorflow/tensorflow#34491.

The issue is moved here for better tracking since the Keras code has been moved to the keras-team/keras repo.

Scipy affine transform

System information.

TensorFlow version (you are using):
master
Are you willing to contribute it (Yes/No) :
I need more detail
Describe the feature and the current behavior/state.
I think that we need to cover core image processing transformations with TF native ops.

Currently, a core transformation in preprocessing still relies on the NumPy/SciPy implementation.
https://github.com/keras-team/keras/blob/master/keras/preprocessing/image.py#L2622

Describe the feature clearly here. Be sure to convey here why the requested feature is needed. Any brief description about the use-case would help.

Will this change the current api? How?

Who will benefit from this feature?

Do you want to contribute a PR? (yes/no):
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

How are the shapes of convolutional layers calculated

Hello Keras-Team :),

I use TF 2.8 with tf.keras.
For a web app I need to precalculate the shapes of the layers a user is using. I have derived formulas for nearly all layers so far, but I am stuck on the convolutional layers.

For convolutional layers it is not allowed to have both dilation_rate > 1 and strides > 1. Why is that, and why is it possible for other convolutional layers like SeparableConv or DepthwiseConv?
From my understanding, a dilation rate > 1 can be understood as a greater kernel size with gaps in it. So it should also be possible to move that "greater kernel" by a given step size (which is the strides), or not?

So far I have come up with the following formula, which works for any convolutional layer (except transpose, of course) as long as either dilation_rate or strides is 1. (You don't have to get really into the following formula; what would really help me is just the correct formula, but for completeness' sake I paste it here.)


/*
        For a given dimension:
        p - is the previous_shape dimension value
        k - is kernel_size
        d - dilation rate
        s - strides rate
*/


if(p < k) return "invalid";
if(d > 1) k = k + (k-1) * (d-1);

if(padding === "valid") {
        const kernel_poses = p-(k-1)-s; // the theoretical amount of positions we can place the kernel if strides (s) === 1
        if(s===1) return Math.ceil(kernel_poses);
        else return Math.ceil(kernel_poses/s);   // here the returned value sometimes differs from what I get when I call model.summary()
} else if(padding === "same") {
        const kernel_poses = p; // the theoretical amount of positions we can place the kernel if strides (s) === 1
        if(s===1) return Math.ceil(kernel_poses);
        else return Math.ceil(kernel_poses/s);
}
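
For comparison, here is a minimal Python sketch of the commonly used output-length formula for one spatial dimension. It assumes the usual convention: the effective (dilated) kernel size is k + (k - 1) * (d - 1), "valid" padding gives p - k_eff + 1 kernel positions at stride 1, "same" padding gives p positions, and the stride is applied as a ceiling division. The name conv_output_length and its arguments are chosen for illustration only. Whether a particular layer accepts a given strides/dilation_rate combination is a separate question that this sketch does not answer.

```python
def conv_output_length(p, k, padding, s=1, d=1):
    """Output size along one dimension of a (non-transposed) convolution.

    p: input size, k: kernel size, s: stride, d: dilation rate.
    padding: "valid" or "same".
    """
    k_eff = k + (k - 1) * (d - 1)  # effective kernel size including dilation gaps
    if padding == "same":
        positions = p
    elif padding == "valid":
        positions = p - k_eff + 1  # kernel placements at stride 1
        if positions <= 0:
            raise ValueError("effective kernel is larger than the input")
    else:
        raise ValueError(f"unsupported padding: {padding!r}")
    return (positions + s - 1) // s  # integer form of ceil(positions / s)


print(conv_output_length(32, 3, "valid"))        # 30
print(conv_output_length(32, 3, "valid", s=2))   # 15
print(conv_output_length(32, 3, "valid", d=2))   # 28
print(conv_output_length(32, 3, "same", s=2))    # 16
```

Compared with the snippet above, the only difference in the "valid" branch is that the number of stride-1 positions is p - (k - 1) rather than p - (k - 1) - s.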

To summarize my questions and needs:

I would like to understand why dilation_rate > 1 together with strides > 1 is not allowed.
I would like to understand why other convolutional layers (SeparableConv, DepthwiseConv) allow setting both parameters > 1 (although the documentation states something different).
My main need is formulas for the shape calculation of each convolutional layer.

Thx in advance <3

Plotting a model (with model_to_dot) fails if the inputlabels/outputlabels contain brackets

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Big Sur
TensorFlow installed from (source or binary): pip
TensorFlow version (use command below): 2.7
Python version: 3.8
Bazel version (if compiling from source):
GPU model and memory:
Exact command to reproduce:

tf.keras.utils.plot_model(model,
                          to_file='model_dir/model.png',
                          show_shapes=True,
                          show_dtype=True,
                          show_layer_names=True)

Describe the problem.

When I try to plot a model that contains layers with dictionary inputs, I get an error in the
model_to_dot function in vis_utils.py
that says: Error: invalid label format.
This error comes from having an invalid graphviz label name (defined here: https://github.com/keras-team/keras/blob/master/keras/utils/vis_utils.py#L293)
My input shape is a dictionary, and is in the form: {'a': (None, 1), 'b': (None, 2)}.
If I call plot_model with show_shapes=True, then the shape will be added to the label name for graphviz.
The problem is that the brackets in the inputlabels and outputlabels need to be escaped so that the node's label can be interpreted correctly by graphviz. (otherwise, graphviz interprets it as a nested label: https://graphviz.org/doc/info/shapes.html#record)

I fixed the issue by adding:

inputlabels = inputlabels.replace('{', '\\{')
inputlabels = inputlabels.replace('}', '\\}')
outputlabels = outputlabels.replace('{', '\\{')
outputlabels = outputlabels.replace('}', '\\}')

before https://github.com/keras-team/keras/blob/master/keras/utils/vis_utils.py#L293
but there must be a more elegant way of fixing this.

Describe the current behavior.
Plotting the model fails.

Describe the expected behavior.
The model gets plotted correctly

tensorflow/tensorflow#53394

Do you want to contribute a PR? (yes/no): yes
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):
Maybe we could have a list of characters to escape ('{', '}', '|', etc.), and escape all of these at once in the label to be plotted?
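
A minimal sketch of that idea, assuming the characters to escape are the record-label metacharacters '{', '}', '|', '<' and '>'. The helper name escape_record_label and the exact character set are assumptions for illustration; they are not taken from vis_utils.py.

```python
import re

# Characters treated as special in this sketch. The set is an assumption
# based on graphviz record-label syntax, not on the Keras source.
_RECORD_SPECIAL = re.compile(r'([{}|<>])')


def escape_record_label(text):
    """Backslash-escape record-label metacharacters in `text`."""
    return _RECORD_SPECIAL.sub(r'\\\1', text)


shape = str({'a': (None, 1), 'b': (None, 2)})
print(escape_record_label(shape))
```

Such a helper would have to be applied to the shape strings before they are embedded in the node label, because the label's own structural braces and bars must stay unescaped.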

keras.models.load_model resets the optimizer's state

(Moving an issue from the tf repo)

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes, mostly based on the example from https://www.tensorflow.org/guide/keras/save_and_serialize
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): google colab (Linux 59a52e5448f6 5.4.104+ #1 SMP Sat Jun 5 09:50:34 PDT 2021 x86_64 x86_64 x86_64 GNU/Linux)
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: no
TensorFlow installed from (source or binary): google colab version
TensorFlow version (use command below): v2.6.0-0-g919f693420e 2.6.0
Python version: 3.7.12 (default, Sep 10 2021, 00:21:48) [GCC 7.5.0]
Bazel version (if compiling from source): no
GCC/Compiler version (if compiling from source): no
CUDA/cuDNN version: 11.2
GPU model and memory: Tesla K80, 11441MiB

Describe the current behavior

When restoring a keras model with keras.models.load_model, the returned model's optimizer is in the reset state (e.g. its weights attribute is empty).

Describe the expected behavior

The original call:

reconstructed_model = tf.keras.models.load_model("my_model")

should have restored and kept the optimizer's weights.

Standalone code to reproduce the issue

import tensorflow as tf
import numpy as np

def get_model():
    # Create a simple model.
    inputs = tf.keras.Input(shape=(32,))
    outputs = tf.keras.layers.Dense(1)(inputs)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")

# It can be used to reconstruct the model identically.
reconstructed_model = tf.keras.models.load_model("my_model")

print(reconstructed_model.optimizer.weights)

output:

4/4 [==============================] - 1s 4ms/step - loss: 0.1829
INFO:tensorflow:Assets written to: my_model/assets
[]

If we additionally provide a compile=False argument, the optimizer's weights are restored:

reconstructed_model = tf.keras.models.load_model("my_model", compile=False)
for w in reconstructed_model.optimizer.weights:
    print(w.shape)

output:

(32, 1)
(1,)
(32, 1)
(1,)

However, trying to use the restored optimizer fails with an exception:

reconstructed_model.compile(reconstructed_model.optimizer, loss="mean_squared_error")
reconstructed_model.fit(test_input, test_target)

output:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-3-22a4ff24818b> in <module>()
      1 reconstructed_model.compile(reconstructed_model.optimizer, loss="mean_squared_error")
----> 2 reconstructed_model.fit(test_input, test_target)

9 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    992           except Exception as e:  # pylint:disable=broad-except
    993             if hasattr(e, "ag_error_metadata"):
--> 994               raise e.ag_error_metadata.to_exception(e)
    995             else:
    996               raise

NotImplementedError: in user code:

    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:853 train_function  *
        return step_function(self, iterator)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:842 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:1286 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2849 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3632 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:835 run_step  **
        outputs = model.train_step(data)
    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:791 train_step
        self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    /usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/optimizer_v2.py:522 minimize
        return self.apply_gradients(grads_and_vars, name=name)
    /usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/optimizer_v2.py:660 apply_gradients
        apply_state)
    /usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/optimizer_v2.py:707 _distributed_apply
        var, apply_grad_to_update_var, args=(grad,), group=False)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2595 update
        var, fn, args=args, kwargs=kwargs, group=group)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2473 _replica_ctx_update
        return replica_context.merge_call(merge_fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3064 merge_call
        return self._merge_call(merge_fn, args, kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3071 _merge_call
        return merge_fn(self._strategy, *args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2471 merge_fn  **
        return self.update(var, fn, merged_args, merged_kwargs, group=group)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:2592 update
        return self._update(var, fn, args, kwargs, group)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3646 _update
        return self._update_non_slot(var, fn, (var,) + tuple(args), kwargs, group)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py:3652 _update_non_slot
        result = fn(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/optimizer_v2.py:689 apply_grad_to_update_var  **
        update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
    /usr/local/lib/python3.7/dist-packages/keras/optimizer_v2/optimizer_v2.py:1241 _resource_apply_dense
        raise NotImplementedError("Must be implemented in subclasses.")

    NotImplementedError: Must be implemented in subclasses.

Support relative (in addition to absolute) `min_delta` parameters in `keras.callbacks.ReduceLROnPlateau`

System information

TensorFlow version (you are using): 2.7.0
Are you willing to contribute it (Yes/No): YES

Describe the feature and the current behavior/state

I am reopening the issue tensorflow/tensorflow#33675 to assess community interest in that feature and to discuss possible implementations.
Quoting the original issue, the requested feature can be described as:

Regarding tf.keras.callbacks.ReduceLROnPlateau: The min_delta parameter is currently an absolute number which indicates when a meaningful reduction in the monitored loss has accrued. It makes no sense to use an absolute number for two reasons -

Every loss has a different dynamic range and hence a different definition for a meaningful reduction

A "meaningful reduction" decreases as the training progresses. The higher the epoch the smaller of a change in loss is expected.
For these two reasons I think that a percentage of change in the monitored loss is much more useful.

I do not have access to the code that I used anymore, but the problem that I was trying to solve was:

[...] the loss is ~1e5 at the beginning of training, while the goal is to achieve a loss as low as 10 at the end. One can easily see that min_delta=10 has very different meanings in the beginning and in the end.

I was able to solve this by implementing a custom version of ReduceLROnPlateau that accepted relative min_deltas.

Just for the record, PyTorch supports this feature by accepting the parameters threshold (equivalent to Keras's min_delta) and threshold_mode, which specifies whether threshold should be considered an absolute or a relative change.

Will this change the current API? How?

Yes: a new parameter should be added to the initializer of ReduceLROnPlateau. This addition can be done in a backward-compatible manner with a sensible choice of default values.

Who will benefit from this feature?

Anyone who uses the ReduceLROnPlateau callback, especially people working with models whose loss varies a lot during training.

Contributing

Do you want to contribute a PR? (yes/no): YES
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

I currently have two candidate solutions:

1. Add a new parameter min_delta_mode: Literal['absolute', 'relative']: passing min_delta_mode='absolute' (the default behavior) instructs Keras to consider min_delta as an absolute change, as in the current behavior; passing min_delta_mode='relative' instructs Keras to consider min_delta as a relative change.
2. Add a new parameter min_delta_rel: Optional[float]: the user must pass either min_delta or min_delta_rel (but not both). Passing min_delta is the current option; passing min_delta_rel achieves the new behavior.

Note that both candidates are equivalent; it's just a matter of choosing the best interface. Supposing that we choose option 1, the ReduceLROnPlateau._reset method would be changed so that self.monitor_op is defined depending on self.mode and self.min_delta_mode according to the following table:

`mode`	`min_delta_mode`	`monitor_op`
`'min'`	`'absolute'`	`lambda current, best: np.less(current, best - self.min_delta)`
`'max'`	`'absolute'`	`lambda current, best: np.greater(current, best + self.min_delta)`
`'min'`	`'relative'`	`lambda current, best: np.less(current, (1 - self.min_delta)*best)`
`'max'`	`'relative'`	`lambda current, best: np.greater(current, (1 + self.min_delta)*best)`
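
The table can be sketched as a small standalone function. This is only an illustration of the proposed selection logic: make_monitor_op is a hypothetical name, and plain comparison operators stand in for np.less and np.greater so that the example has no dependencies.

```python
def make_monitor_op(mode, min_delta, min_delta_mode="absolute"):
    """Return f(current, best) that is True when `current` is a meaningful improvement."""
    if mode not in ("min", "max"):
        raise ValueError(f"unknown mode: {mode!r}")
    if min_delta_mode not in ("absolute", "relative"):
        raise ValueError(f"unknown min_delta_mode: {min_delta_mode!r}")
    if mode == "min":
        if min_delta_mode == "absolute":
            return lambda current, best: current < best - min_delta
        return lambda current, best: current < (1 - min_delta) * best
    if min_delta_mode == "absolute":
        return lambda current, best: current > best + min_delta
    return lambda current, best: current > (1 + min_delta) * best


absolute = make_monitor_op("min", min_delta=10)
relative = make_monitor_op("min", min_delta=0.01, min_delta_mode="relative")

# Early in training (loss near 1e5), a drop of 50 passes the absolute
# threshold but is far below a 1% relative change.
print(absolute(99_950, 100_000), relative(99_950, 100_000))  # True False

# Late in training (loss near 10), a drop of 0.5 is a 5% improvement
# but can never pass an absolute threshold of 10.
print(absolute(9.5, 10.0), relative(9.5, 10.0))  # False True
```

The two printed lines mirror the scenario quoted earlier, where min_delta=10 means very different things at a loss of about 1e5 and at a loss of about 10.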

Using StringLookup as 1st layer in a Sequential model raises UnimplementedError since TF 2.8.0

System information.

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04 (Google Colab)
TensorFlow installed from (source or binary): binary (preinstalled on Google Colab)
TensorFlow version (use command below): v2.8.0-0-g3f878cff5b6 2.8.0
Python version: 3.7.12
Bazel version (if compiling from source): N/A
GPU model and memory: N/A
Exact command to reproduce:

import tensorflow as tf

cat = ["Paris", "Singapore", "Auckland"]
str_lookup_layer = tf.keras.layers.StringLookup()
str_lookup_layer.adapt(cat)
lookup_and_embed = tf.keras.Sequential([
    str_lookup_layer,
    tf.keras.layers.Embedding(input_dim=str_lookup_layer.vocabulary_size(),
                              output_dim=2)
])
lookup_and_embed(tf.constant([["Paris"], ["Singapore"], ["Auckland"]]))  # ERROR!

This code is available in this gist.

Describe the problem.

Since TF 2.8.0, using a tf.keras.layers.StringLookup layer as the first layer in a Sequential model raises an exception when calling the model: UnimplementedError: Exception encountered when calling layer "sequential" (type Sequential). Cast string to int64 is not supported [Op:Cast]. The problem did not exist in TF 2.7.1.

Full stacktrace:

---------------------------------------------------------------------------
UnimplementedError                        Traceback (most recent call last)
<ipython-input-2-ee4b4b94a15e> in <module>()
      7                               output_dim=2)
      8 ])
----> 9 lookup_and_embed(tf.constant([["Paris"], ["Singapore"], ["Auckland"]]))

1 frames
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   7184 def raise_from_not_ok_status(e, name):
   7185   e.message += (" name: " + name if name is not None else "")
-> 7186   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   7187 
   7188 

UnimplementedError: Exception encountered when calling layer "sequential" (type Sequential).

Cast string to int64 is not supported [Op:Cast]

Call arguments received:
  • inputs=tf.Tensor(shape=(3, 1), dtype=string)
  • training=None
  • mask=None

Describe the expected behavior.

In TF 2.7.1, the code works and gives an output similar to this one:

<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[-0.02887753, -0.01268407],
       [ 0.04601531, -0.02668235],
       [ 0.03409723, -0.03205377]], dtype=float32)>

tensorflow/core/util/gpu_launch_config.h:129] Check failed: work_element_count > 0 (-1018167296 vs. 0)

I have written a custom Keras CNN-based GAN for synthesizing tabular datasets. The code works fine with a reasonable batch size (generally 64 to 1024). However, users are allowed to specify the batch size, and when they choose a large one, I try to handle it by catching ResourceExhaustedError and stepping the batch size down. I found that doing this eventually leads to the "Check failed" error in the post title. I can't catch it as an exception; the process just dies. This occurs in the following environments:

Windows
TensorFlow 2.6.0 (pip install)
CUDA 11.3
Titan RTX 24GB Founders Edition card
Driver: 465.89

AWS Linux (RHEL7)
TensorFlow 2.6.1 (pip install)
CUDA 11.3 and now 11.5
V100
Driver: 495.29.05

Batch shape is (10240, 328) so at least 2 samples (per related post).
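
The negative value in the check message is consistent with a signed 32-bit overflow, although nothing in this report confirms that. As a purely arithmetic illustration, the sketch below reinterprets an element count as int32. The count 3,276,800,000 is a hypothetical value chosen because it wraps to the reported number; it is not taken from any log.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    return ctypes.c_int32(n & 0xFFFFFFFF).value


hypothetical_count = 3_276_800_000  # assumption: illustrative value only
print(hypothetical_count > INT32_MAX)  # True
print(as_int32(hypothetical_count))    # -1018167296
```

Any count that differs from this one by a multiple of 2**32 wraps to the same value, so the actual size cannot be recovered from the message alone.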
train_step function:
@tf.function()
def _train_step(self, real_data):
    noise = tf.random.normal([self._batch_size, self._noise_dim])

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        synthetic_data = self._generator(noise, training=True)
        real_data_pred = self._discriminator(real_data, training=True)
        synth_data_pred = self._discriminator(synthetic_data, training=True)
        gen_loss = self.generator_loss(synth_data_pred)
        disc_loss = self.discriminator_loss(real_data_pred, synth_data_pred)

    gradients_of_generator = gen_tape.gradient(gen_loss, self._generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, self._discriminator.trainable_variables)
    self.generator_optimizer.apply_gradients(zip(gradients_of_generator, self._generator.trainable_variables))
    self.discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator,
                                                     self._discriminator.trainable_variables))

This also happens both with and without mixed precision.
Any help would be greatly appreciated.
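
For reference, the step-down strategy described above can be sketched in plain Python. Everything here is hypothetical: run_with_fallback, fake_train, and the stand-in OutOfMemory exception are illustration names, and in the real code the caught type would be tf.errors.ResourceExhaustedError. As the report observes, the failed check kills the process instead of raising a Python exception, so a loop like this cannot recover from it.

```python
class OutOfMemory(Exception):
    """Stand-in for tf.errors.ResourceExhaustedError in this sketch."""


def run_with_fallback(train_once, batch_size, min_batch_size=1, oom_error=OutOfMemory):
    """Call train_once(batch_size), halving batch_size after each OOM error."""
    while True:
        try:
            return batch_size, train_once(batch_size)
        except oom_error:
            if batch_size <= min_batch_size:
                raise  # nothing smaller left to try
            batch_size = max(min_batch_size, batch_size // 2)


def fake_train(batch_size):
    # Pretend that anything above 1024 samples per batch does not fit in memory.
    if batch_size > 1024:
        raise OutOfMemory(f"batch size {batch_size} too large")
    return "ok"


print(run_with_fallback(fake_train, 10240))  # (640, 'ok')
```

Halving is only one possible schedule; the point of the sketch is that the retry logic handles ordinary exceptions and nothing else.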

Not able to restore the model from config with tf.einsum operation

System information.

OS Platform and Distribution: Linux Ubuntu 18.04
TensorFlow installed from (source or binary): Binary
TensorFlow version (use command below): 2.4.3 (also reproduced with 2.5.0 and 2.7.0)
Python version: 3.8.10
CUDA/cuDNN version: 11.0

Describe the current behavior
Restoring the model from config fails with:
ValueError: Got 0 inputs for equation "bmhwf,bmoh->bmowf", expecting 2
However, if the tf.einsum op is wrapped in a Keras Lambda layer, it works (the model can be dumped to config and restored).

Describe the expected behavior
Should be able to restore the model from config.

Do you want to contribute a PR? (yes/no): Yes
Briefly describe your candidate solution(if contributing): Not sure what the solution might look like.

Standalone code to reproduce the issue
https://colab.research.google.com/drive/10X2dDb_EGLL64w-MyMU4g9dSrHLw9PvI?usp=sharing

import tensorflow as tf
from tensorflow import keras


x1 = keras.Input(shape=(2, 4, 4, 1))
x2 = keras.Input(shape=(2, 2, 4))
x = tf.einsum('bmhwf,bmoh->bmowf', x1, x2)
model = keras.Model(inputs=[x1, x2], outputs=x)
model = tf.keras.Model.from_config(model.get_config())

Source code / logs.

Log from colab

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-5a94aa47793c> in <module>()
      7 x = tf.einsum('bmhwf,bmoh->bmowf', x1, x2)
      8 model = keras.Model(inputs=[x1, x2], outputs=x)
----> 9 model = tf.keras.Model.from_config(model.get_config())

4 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in from_config(cls, config, custom_objects)
   2446     with generic_utils.SharedObjectLoadingScope():
   2447       input_tensors, output_tensors, created_layers = (
-> 2448           functional.reconstruct_from_config(config, custom_objects))
   2449       # Initialize a model belonging to `cls`, which can be user-defined or
   2450       # `Functional`.

/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in reconstruct_from_config(config, custom_objects, created_layers)
   1336         while layer_nodes:
   1337           node_data = layer_nodes[0]
-> 1338           if process_node(layer, node_data):
   1339             layer_nodes.pop(0)
   1340           else:

/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in process_node(layer, node_data)
   1280         input_tensors = (
   1281             base_layer_utils.unnest_if_single_tensor(input_tensors))
-> 1282       output_tensors = layer(input_tensors, **kwargs)
   1283 
   1284       # Update node index map.

/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/special_math_ops.py in _einsum_v2_parse_and_resolve_equation(equation, input_shapes)
   1279   if len(input_shapes) != len(input_labels):
   1280     raise ValueError('Got {} inputs for equation "{}", expecting {}'.format(
-> 1281         len(input_shapes), equation, len(input_labels)))
   1282 
   1283   # Special case: if there are no '->', then we create output subscripts from

ValueError: Exception encountered when calling layer "tf.einsum" (type TFOpLambda).

Got 0 inputs for equation "bmhwf,bmoh->bmowf", expecting 2

Call arguments received:
  • equation='bmhwf,bmoh->bmowf'
  • inputs=<class 'inspect._empty'>
  • kwargs=<class 'inspect._empty'>
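
Besides the Lambda wrapper mentioned above, another way to keep the op serializable is to place it in a small custom layer that stores the equation in its config. The sketch below is a workaround under assumptions, not a fix for TFOpLambda itself: the class name Einsum is hypothetical, and the layer has to be passed through custom_objects when restoring.

```python
import tensorflow as tf
from tensorflow import keras


class Einsum(keras.layers.Layer):
    """Hypothetical wrapper layer: applies tf.einsum to a list of tensors."""

    def __init__(self, equation, **kwargs):
        super().__init__(**kwargs)
        self.equation = equation

    def call(self, inputs):
        # `inputs` is the list of tensors passed when the layer is called.
        return tf.einsum(self.equation, *inputs)

    def get_config(self):
        # Store the equation so that from_config can rebuild the layer.
        config = super().get_config()
        config["equation"] = self.equation
        return config


x1 = keras.Input(shape=(2, 4, 4, 1))
x2 = keras.Input(shape=(2, 2, 4))
x = Einsum("bmhwf,bmoh->bmowf")([x1, x2])
model = keras.Model(inputs=[x1, x2], outputs=x)

# The custom class must be made known to the deserializer.
restored = keras.Model.from_config(
    model.get_config(), custom_objects={"Einsum": Einsum}
)
print(restored.output_shape)
```

Because the equation lives in the layer config rather than in the call arguments, the restore path no longer depends on how TFOpLambda records its inputs.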

Keras/tensorflow failed when specifying class_weight in model.fit()

System information.

OS Platform and Distribution: Linux.
TensorFlow version: Keras/tensorflow version 2.8.0.
Python version: Python 3.7
GPU model and memory: Use CPU (no GPU).

Describe the problem.

My network model works well without specifying class_weight in model.fit().

However, when I specify class_weight in model.fit(), no matter what weight values I give, Keras/TensorFlow fails with the following error:

  File "/opt/local/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 55, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

indices[9] = 16 is not in [0, 10)
         [[{{node GatherV2}}]]
         [[IteratorGetNext]] [Op:__inference_train_function_43205]

Keras/TensorFlow fails with the above error even when I give all classes an equal weight of 1.0 (which is equivalent to no class weights), as follows (I have 10 classes):

class_weights_dict = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0, 5: 1.0, 6: 1.0, 7: 1.0, 8: 1.0, 9: 1.0}

history = model.fit(train_input,
                    train_true_labels,
                    class_weight=class_weights_dict,
                    validation_split=validation_split,
                    shuffle=True,
                    epochs=epochs,
                    batch_size=batch_size)

I also verified that my true labels array train_true_labels contains only the integers 0-9, as follows:

values = np.unique(train_true_labels)
print(values)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

However, when I do not specify class_weight in model.fit(), the training for my model works just fine.

So it looks like I just cannot use class_weight in training. But my classes are highly imbalanced; not using class weights would train a useless model.

I would greatly appreciate any solution for this issue.

Thank you very much!
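
One hypothesis for the error above, offered as an assumption and not a confirmed diagnosis: the message describes a lookup of index 16 in a 10-entry table. As far as can be told from the Keras source around 2.8, the class-weight path treats a 2-D target with more than one column as one-hot and takes the argmax along axis 1, so integer labels shaped (samples, steps) would yield column positions, not class ids. The plain-Python sketch below mimics that logic; both function names are hypothetical and are not Keras APIs.

```python
def class_ids_for_weighting(labels):
    """Mimic how a label batch might be turned into class ids for weighting."""
    ids = []
    for row in labels:
        if isinstance(row, (list, tuple)):
            if len(row) > 1:
                # Row is read as one-hot: use the position of its largest value.
                ids.append(max(range(len(row)), key=row.__getitem__))
            else:
                ids.append(int(row[0]))
        else:
            ids.append(int(row))
    return ids


def gather_weights(class_weight, ids):
    """Look up one weight per id, failing like a bounds-checked gather."""
    table = [class_weight[k] for k in sorted(class_weight)]
    weights = []
    for position, idx in enumerate(ids):
        if not 0 <= idx < len(table):
            raise IndexError(
                f"indices[{position}] = {idx} is not in [0, {len(table)})"
            )
        weights.append(table[idx])
    return weights


class_weight = {k: 1.0 for k in range(10)}

# 1-D integer labels: every id is a valid class, so the lookup succeeds.
print(gather_weights(class_weight, class_ids_for_weighting([3, 9, 0])))

# One sample with 20 per-step labels whose largest value sits at position 16.
seq_labels = [[0] * 16 + [9] + [0] * 3]
try:
    gather_weights(class_weight, class_ids_for_weighting(seq_labels))
except IndexError as exc:
    print(exc)
```

If this is what is happening, printing train_true_labels.shape would show a second dimension larger than 1 even though np.unique reports only the values 0-9.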

Convert Functional API to Model Subclassing with Normalization Layers

Could anyone please show me how to convert the Functional API code in this official TensorFlow tutorial to Model subclassing? I suppose an elegant chunk of code would be one that combines the Normalization layer with the remaining layers. I went through the tutorial, trying to reproduce the process to fit my purpose. I have only 1 predictor (age) and 1 target variable (se); both are continuous variables.

Here's how I refactored the code:

from tensorflow.data import Dataset
from tensorflow.keras import Input
from tensorflow.keras.backend import clear_session
from tensorflow.keras.layers import Normalization


def df_to_dataset(dataframe, batch_size, shuffle=True):
  dataframe = dataframe.copy()
  labels = dataframe.pop("se")
  ds = Dataset.from_tensor_slices((dict(dataframe), labels))
  if shuffle:
    ds = ds.shuffle(buffer_size=len(dataframe))
  ds = ds.batch(batch_size)
  ds = ds.prefetch(batch_size)
  return ds

def get_normalization_layer(name, dataset):
  normalizer = Normalization(axis=None)
  feature_ds = dataset.map(lambda x, y: x[name])
  normalizer.adapt(feature_ds)
  return normalizer


test_ds = df_to_dataset(test, batch_size=5, shuffle=False)

[(train_features, label_batch)] = test_ds.take(1)
print(f"Every Feature: {list(train_features.keys())}")
print(f"A batch of ages: {train_features['age']}")
print(f"A batch of targets: {label_batch}")

batch_size = 128
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, batch_size=batch_size, shuffle=False)
test_ds = df_to_dataset(test, batch_size=batch_size, shuffle=False)

all_inputs = []
encoded_features = []

# Numeric features
clear_session()
for header in ["age"]:
  numeric_col = Input(shape=(1, ), name=header)
  normalization_layer = get_normalization_layer(header, train_ds)
  encoded_numeric_col = normalization_layer(numeric_col)
  all_inputs.append(numeric_col)
  encoded_features.append(encoded_numeric_col)

Below is the Functional API part, which worked as expected, as in the tutorial:

import tensorflow as tf
from tensorflow import math
from tensorflow.keras import Model, Input
from tensorflow.keras.backend import clear_session
from tensorflow.keras.losses import Loss
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Normalization, Concatenate, Dropout
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping


class QuantileLoss(Loss):

  def __init__(self, quantiles):
    super().__init__()
    self.quantiles = tf.convert_to_tensor(quantiles)
  
  def call(self, y_true, y_pred):
    y_true = tf.convert_to_tensor(y_true)
    y_pred = tf.convert_to_tensor(y_pred)
    errors = math.subtract(y_true, y_pred)
    loss = math.reduce_mean(
        math.maximum(
            math.multiply(self.quantiles, errors),
            math.multiply(
                math.subtract(
                    self.quantiles, 1
                ),
                errors
            )
        ),
        axis=-1
    )    
    return loss

clear_session()

earlystopping = EarlyStopping(patience=10)
lr_schedule = ReduceLROnPlateau(
    patience=5, 
    monitor="val_loss",
    verbose=1
)
callbacks = [lr_schedule, earlystopping]
quantiles = [0.021, 0.157, 0.5, 0.841, 0.977, 0.998]
all_features = Concatenate()(encoded_features)
print(all_features.shape)
x = Dense(256, activation="selu")(all_features)
x = Dropout(0.3)(x)
x = Dense(64, activation="selu")(x)
output = Dense(len(quantiles))(x)
model = Model(all_inputs, output)
quantile_loss = QuantileLoss(quantiles)
model.compile(optimizer=Adam(learning_rate=0.001), loss=quantile_loss)
history = model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=callbacks)

I was trying to refactor the Functional implementation to make the code look more elegant.

class QuantileRegressor(Model):

  def __init__(self, quantiles, hidden_units):
    super().__init__()
    self.quantiles = quantiles
    self.concatenate = Concatenate()
    self.normalizer = Normalization(axis=None)
    self.hidden_dense = Dense(hidden_units, activation="selu")
    self.dropout = Dropout(0.3)
    self.output_dense = Dense(len(quantiles))
  
  def call(self, inputs):
    self.normalizer.adapt(inputs["age"])
    # The line above gave me an error! 
    # Is it a good idea to place encoded_features, all_inputs here?
    # inputs here seemed to be a dictionary. 
    return None

earlystopping = EarlyStopping(patience=10)
lr_schedule = ReduceLROnPlateau(
    patience=5, 
    monitor="val_loss",
    verbose=1
)
callbacks = [lr_schedule, earlystopping]
quantiles = [0.021, 0.157, 0.5, 0.841, 0.977, 0.998]
hidden_units = 256
clear_session()
model = QuantileRegressor(quantiles, hidden_units)
quantile_loss = QuantileLoss(quantiles)
model.compile(optimizer=Adam(learning_rate=0.001), loss=quantile_loss)
history = model.fit(train_ds, validation_data=val_ds, epochs=10, callbacks=callbacks)

self.normalizer.adapt(inputs["age"]) in the call method resulted in

RuntimeError: in user code:

    /usr/local/lib/python3.7/dist-packages/keras/engine/training.py:853 train_function  *
        return step_function(self, iterator)
    <ipython-input-24-3f4f9f8eec72>:18 call  *
        self.normalizer.adapt(inputs["age"])
    /usr/local/lib/python3.7/dist-packages/keras/engine/base_preprocessing_layer.py:230 adapt  **
        _disallow_inside_tf_function('adapt')
    /usr/local/lib/python3.7/dist-packages/keras/engine/base_preprocessing_layer.py:591 _disallow_inside_tf_function
        raise RuntimeError(error_msg)

    RuntimeError: Detected a call to `PreprocessingLayer.adapt` inside a `tf.function`. `PreprocessingLayer.adapt is a high-level endpoint that manages its own `tf.function`. Please move the call to `PreprocessingLayer.adapt` outside of all enclosing `tf.function`s. Note that you can call a `PreprocessingLayer` directly on `Tensor`s inside a `tf.function` like: `layer(x)`, or update its state like: `layer.update_state(x)`.

That's why I am asking the question. What is the standard or preferred way to adapt a Normalization layer when using Model subclassing?
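
The error message itself points to one option: call adapt outside of any tf.function, that is, before fit, and only apply the layer inside call. The sketch below is one way to arrange that, not necessarily the preferred pattern. The helper method adapt_normalizer is a hypothetical name, the age values are synthetic, and the network is cut down to a single hidden layer.

```python
import numpy as np
import tensorflow as tf


class QuantileRegressor(tf.keras.Model):

    def __init__(self, quantiles, hidden_units):
        super().__init__()
        self.normalizer = tf.keras.layers.Normalization(axis=None)
        self.hidden_dense = tf.keras.layers.Dense(hidden_units, activation="selu")
        self.dropout = tf.keras.layers.Dropout(0.3)
        self.output_dense = tf.keras.layers.Dense(len(quantiles))

    def adapt_normalizer(self, ages):
        # Hypothetical helper: runs eagerly, before fit(), so adapt() is not
        # traced inside the training tf.function.
        self.normalizer.adapt(ages)

    def call(self, inputs, training=False):
        # Inside call, the already-adapted layer is only applied.
        x = self.normalizer(inputs["age"])
        x = self.hidden_dense(x)
        x = self.dropout(x, training=training)
        return self.output_dense(x)


# Synthetic stand-in for the "age" column of the training data.
ages = np.random.uniform(20, 80, size=(64, 1)).astype("float32")

model = QuantileRegressor(quantiles=[0.1, 0.5, 0.9], hidden_units=16)
model.adapt_normalizer(ages)
preds = model({"age": tf.constant(ages)})
print(preds.shape)
```

With a tf.data pipeline like the one above, the same helper could be fed train_ds.map(lambda x, y: x["age"]), mirroring get_normalization_layer.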

Keras `predict_step` is not preserved across save and restore

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04 and macOS
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.6.0 and nightly
Python version: 3.7, 3.8 and 3.9

Describe the current behavior

When implementing custom prediction logic for Keras models using predict_step as explained here, saving and restoring the Keras model with the saved model format ignores the custom prediction logic. Unfortunately the code silently fails and doesn't inform the user that this is not supported, which could lead to detrimental bugs.

The issue is explained in detail with a minimal example in this colab notebook.

I know I can save a custom serving function using

class MyModel(tf.keras.Model):
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
    def serve(self, data):
        ...

as described here.
But I feel the current behaviour breaks user expectations: the saved model format is now the default saving format, yet it doesn't support all of the features and might fail silently, resulting in unexpected behaviour.
This makes it necessary for users to break the abstraction and start using low level TF APIs instead, which I think doesn't fit well with the progressive disclosure of complexity that Keras tends to strive for.

Describe the expected behavior

Keras models should preserve custom predict_step logic when saving and restoring models.

Standalone code to reproduce the issue

import tensorflow as tf
import numpy as np

class FullyConnectedModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(10)

    def predict_step(self, data):
        logits = self(data, training=False)
        return tf.argmax(logits, axis=-1)

    def call(self, inputs):
        return self.dense(inputs)

x, y = np.random.uniform(size=(128, 20)).astype(np.float32), np.random.randint(0, 10, size=(128))

model = FullyConnectedModel()
model.compile(optimizer="sgd", loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.fit(x, y, epochs=2, batch_size=32)

model.save("/tmp/model", save_traces=True)
reloaded_model = tf.keras.models.load_model("/tmp/model")

y_pred = model.predict(x)
reloaded_y_pred = reloaded_model.predict(x)

np.testing.assert_allclose(reloaded_y_pred, y_pred)

See this notebook for more information.

Also check out tensorflow/tensorflow#48149, which was originally posted to TF before the move to keras-team/keras.

Model with nested input spec fails

Cross post: tensorflow/tensorflow#50721

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.5.0
Python version: 3.9

Describe the current behavior
The logic to test if a structure is nested is wrong. Example of failing case: {"a": {"b": 1}}

Do you want to contribute a PR? (yes/no): yes
Briefly describe your candidate solution(if contributing):
Change this line from:

if (isinstance(self._nested_inputs, (dict, list, tuple)) and
        len(self._nested_inputs) != len(self.inputs)):

to:

if max([len(path) for path in nest.yield_flat_paths(
        self._nested_inputs)]) > 1:
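
To see why the two conditions disagree on {"a": {"b": 1}}, here is a plain-Python sketch. The function yield_flat_paths below is a simplified stand-in written for this illustration, assumed to behave like the nest utility of the same name for dicts, lists and tuples; old_check and proposed_check are hypothetical names for the two conditions.

```python
def yield_flat_paths(structure, prefix=()):
    """Yield the key/index path to every leaf of a nested structure."""
    if isinstance(structure, dict):
        for key in sorted(structure):
            yield from yield_flat_paths(structure[key], prefix + (key,))
    elif isinstance(structure, (list, tuple)):
        for index, item in enumerate(structure):
            yield from yield_flat_paths(item, prefix + (index,))
    else:
        yield prefix


def old_check(nested_inputs, flat_inputs):
    # Current condition: compare top-level length with the flat input count.
    return (isinstance(nested_inputs, (dict, list, tuple))
            and len(nested_inputs) != len(flat_inputs))


def proposed_check(nested_inputs):
    # Proposed condition: any leaf deeper than one level means nesting.
    return max(len(path) for path in yield_flat_paths(nested_inputs)) > 1


failing = {"a": {"b": 1}}          # one top-level key, one leaf
working = {"a": {"b": 1, "c": 2}}  # one top-level key, two leaves

print(old_check(failing, [1]), proposed_check(failing))
print(old_check(working, [1, 2]), proposed_check(working))
```

In the failing case the top-level length and the number of flat inputs are both 1, so the length comparison cannot see the extra level, while the path-depth test can.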

Standalone code to reproduce the issue

import tensorflow as tf
import numpy as np

input_tensor_shape = [16]
random_tensor = np.random.random([1]+input_tensor_shape)

def sequential():
  layers = [tf.keras.layers.InputLayer(input_shape=input_tensor_shape),
            tf.keras.layers.Dense(8)]
  return tf.keras.Sequential(layers=layers)

network = sequential()
network2 = sequential()

nested_input = {'input': {'sub_input1': network.input,
                          'sub_input2': network2.input}}

model = tf.keras.Model(inputs=nested_input, outputs=network.output)

input = {'input': {'sub_input1': random_tensor,
                   'sub_input2': random_tensor}}
# Works
model(input)

fail_nested_input = {'input': {'sub_input': network.input}}
fail_model = tf.keras.Model(inputs=fail_nested_input, outputs=network.output)

input = {'input': {'sub_input': random_tensor}}
# Fails
fail_model(input)

Reshape layer drops mask from previous layers

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes, very basic code
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ran it on google colab notebook (https://colab.research.google.com/) as well as CentOS 7
TensorFlow installed from (source or binary): Binary / however it is installed in colab
TensorFlow version (use command below): v2.6.0-0-g919f693420e 2.6.0
Python version: 3.7
Bazel version (if compiling from source): N/A
GPU model and memory: N/A
Exact command to reproduce:

import tensorflow as tf
import numpy as np

num = 10

embedding_size = 5
window_size = 2

emb = tf.keras.layers.Embedding(
    num, embedding_size, input_length=1, mask_zero=True
)

td = tf.keras.layers.TimeDistributed(emb)

inp = tf.constant(
    np.array([
              [0,1],[3,0],[4,0]
    ])
)

inp = tf.keras.layers.Reshape((window_size, 1))(inp)
print(inp)

out = td(inp)
print(out.shape,out._keras_mask)

out2 = tf.keras.layers.Reshape((window_size,embedding_size ))(out)
print(out2.shape)

#The following throws an error, because Reshape dropped the mask from the previous tensor:
print(out2._keras_mask)

Error from last line:

AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute '_keras_mask'

Describe the problem.

Applying the Reshape layer to a tensor that has a mask (e.g., from the Embedding Layer) gets rid of the mask, instead of reshaping it.
This is a problem because in order to re-use an Embedding Layer for different outputs, I need to apply reshape after certain transformations, before passing to other layers (e.g., LSTM), but instead of also reshaping the mask of the tensor (as a user would expect), Reshape apparently discards the mask entirely, so I cannot use it.

To resolve this I would have to write a custom version of the Reshape layer to replace it.

Describe the current behavior.
Reshape layer applied to a tensor with a mask discards the mask instead of reshaping it.

Describe the expected behavior.
Reshape layer should not discard the mask of a tensor when reshaping it, but instead should correspondingly reshape the mask as well (so the mask now correctly applies to the reshaped tensor and can be passed to subsequent layers like LSTM).

I.e., in the code snippet above, for 'out', I have a tensor with shape (3, 2, 1, 5) and mask =

tf.Tensor(
[[[False]
  [ True]]

 [[ True]
  [False]]

 [[ True]
  [False]]], shape=(3, 2, 1), dtype=bool)

After the Reshape application above, the output is shape (3,2,5) - I would expect the mask to be correspondingly reshaped like:

new_mask = tf.keras.layers.Reshape((window_size,))(out._keras_mask)

So that the new mask is now shape (3,2) =

<tf.Tensor: shape=(3, 2), dtype=bool, numpy=
array([[False,  True],
       [ True, False],
       [ True, False]])>

Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.
See the code block at the beginning.

Source code / logs.

N/A

MobileNetV3 models can't infer the static shape

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): any
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.6.0
Python version: 3.8
Bazel version (if compiling from source): no
GPU model and memory: no
Exact command to reproduce:
https://colab.research.google.com/drive/1geUcvgluev88zRG-hGhRgJ4YbaA6pt2Y?usp=sharing

Describe the problem.
MobileNetV3 models cannot compute the output shape of intermediate layers because some functions (activations like hard_swish, I suppose) are not wrapped in layers.

Describe the current behavior.
An exception is raised when compute_output_shape is executed.

Describe the expected behavior.
Just like ALL other models in keras.applications, MobileNetV3* models should be able to compute their output shapes.

Do you want to contribute a PR? (yes/no): no
Briefly describe your candidate solution(if contributing): no

Standalone code to reproduce the issue.

See the Colab notebook linked under "Exact command to reproduce" above.

Source code / logs.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in compute_output_shape(self, input_shape)
    782         try:
--> 783           outputs = self(inputs, training=False)
    784         except TypeError as e:

9 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in __call__(self, *args, **kwargs)
    976       return self._functional_construction_call(inputs, args, kwargs,
--> 977                                                 input_list)
    978 

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _functional_construction_call(self, inputs, args, kwargs, input_list)
   1114       outputs = self._keras_tensor_symbolic_call(
-> 1115           inputs, input_masks, args, kwargs)
   1116 

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _keras_tensor_symbolic_call(self, inputs, input_masks, args, kwargs)
    847     else:
--> 848       return self._infer_output_signature(inputs, args, kwargs, input_masks)
    849 

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in _infer_output_signature(self, inputs, args, kwargs, input_masks)
    887           inputs = self._maybe_cast_inputs(inputs)
--> 888           outputs = call_fn(inputs, *args, **kwargs)
    889 

/usr/local/lib/python3.7/dist-packages/keras/layers/core.py in _call_wrapper(*args, **kwargs)
   1349     def _call_wrapper(*args, **kwargs):
-> 1350       return self._call_wrapper(*args, **kwargs)
   1351     self.call = tf.__internal__.decorator.make_decorator(function, _call_wrapper)

/usr/local/lib/python3.7/dist-packages/keras/layers/core.py in _call_wrapper(self, *args, **kwargs)
   1381       kwargs.pop('name', None)
-> 1382       result = self.function(*args, **kwargs)
   1383     self._check_variables(created_variables, tape.watched_variables())

/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py in wrapper(*args, **kwargs)
    205     try:
--> 206       return target(*args, **kwargs)
    207     except (TypeError, ValueError):

TypeError: _add_dispatch() missing 1 required positional argument: 'y'

The above exception was the direct cause of the following exception:

NotImplementedError                       Traceback (most recent call last)
<ipython-input-3-fe10d8214bfb> in <module>()
      1 base_model = mobilenet_v3.MobileNetV3Large(include_top=False, weights=None)
----> 2 base_model.compute_output_shape(input_shape=[224, 224, 3])

/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py in compute_output_shape(self, input_shape)
    468           layer_input_shapes = tf_utils.convert_shapes(
    469               layer_input_shapes, to_tuples=True)
--> 470           layer_output_shapes = layer.compute_output_shape(layer_input_shapes)
    471           # Convert back to TensorShapes.
    472           layer_output_shapes = tf_utils.convert_shapes(

/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py in compute_output_shape(self, input_shape)
    787               'layer\'s output. Please implement the '
    788               '`compute_output_shape` method on your layer (%s).' %
--> 789               self.__class__.__name__) from e
    790       return tf.nest.map_structure(lambda t: t.shape, outputs)
    791     raise NotImplementedError(

NotImplementedError: We could not automatically infer the static shape of the layer's output. Please implement the `compute_output_shape` method on your layer (TFOpLambda).

[BUG] Model state is not correct after saving and loading

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.8.0
Python version: 3.7.10
Bazel version (if compiling from source):
GPU model and memory: cpu
Exact command to reproduce:

Describe the problem.

Keras model does not converge after saving and loading.

Describe the current behavior.
After calling model.save(...) and model = tf.keras.models.load_model(...), the model fails to converge.

Describe the expected behavior.
Adding model.save(...) and model = tf.keras.models.load_model(...) should not affect the training process.

Do you want to contribute a PR? (yes/no): No

Standalone code to reproduce the issue.

import tensorflow as tf
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
########### added lines ###########
model.save("/tmp/mnist_model")
model = tf.keras.models.load_model('/tmp/mnist_model')
################################

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Source code / logs.

After adding saving and loading, the model does not converge:

2022-02-08 09:03:19.403489: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-02-08 09:03:19.698194: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2983 - accuracy: 0.1004
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1451 - accuracy: 0.0992
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1069 - accuracy: 0.0990
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0881 - accuracy: 0.0991
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0754 - accuracy: 0.0989
313/313 [==============================] - 1s 2ms/step - loss: 0.0775 - accuracy: 0.0992

With the saving and loading code removed, the model converges as expected:

2022-02-08 09:05:18.683461: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch 1/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.3010 - accuracy: 0.9133   
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1452 - accuracy: 0.9574
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1099 - accuracy: 0.9666
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0878 - accuracy: 0.9729
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0746 - accuracy: 0.9766
313/313 [==============================] - 1s 2ms/step - loss: 0.0744 - accuracy: 0.9770
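
In both logs the loss falls almost identically (0.0754 versus 0.0746 at epoch 5), so optimization appears to proceed; only the reported accuracy differs, staying near 0.10. One possible reading, stated as an assumption: after reloading, the string 'accuracy' is resolved to a variant that takes the argmax of y_true as if it were one-hot. For integer labels of shape (batch, 1) that argmax is always 0, so the metric reduces to the fraction of predictions equal to class 0. The plain-Python sketch below illustrates the effect; the function names are hypothetical and are not the Keras implementations.

```python
def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)


def sparse_categorical_accuracy(y_true, y_prob):
    # Compare each integer label with the predicted class.
    hits = [int(label == argmax(probs)) for label, probs in zip(y_true, y_prob)]
    return sum(hits) / len(hits)


def categorical_accuracy_on_sparse_labels(y_true, y_prob):
    # Each integer label is treated as a length-1 "one-hot" row,
    # so argmax of the label side is always 0.
    hits = [int(argmax([label]) == argmax(probs))
            for label, probs in zip(y_true, y_prob)]
    return sum(hits) / len(hits)


num_classes = 10
y_true = [i % num_classes for i in range(100)]
# A perfect classifier: all probability mass on the true class.
y_prob = [[1.0 if c == label else 0.0 for c in range(num_classes)]
          for label in y_true]

print(sparse_categorical_accuracy(y_true, y_prob))            # 1.0
print(categorical_accuracy_on_sparse_labels(y_true, y_prob))  # 0.1
```

If this is the cause, compiling with metrics=['sparse_categorical_accuracy'] instead of 'accuracy' may be worth trying, since it removes the ambiguity in how the metric name is resolved.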

Why is the loss (mse) calculated by Keras not the same as mine?

I want to verify the mse loss in Keras by computing it myself. However, the two results differ. The definition of mse is here: https://en.wikipedia.org/wiki/Mean_squared_error

The test code is below:

from keras.datasets import boston_housing
import numpy as np
(train_data, train_targets), (test_data, test_targets) = boston_housing.load_data()


x_train = train_data.astype(np.float32)

from keras import models 
from keras import layers

model = models.Sequential() 
model.add(layers.Dense(64, activation='relu', input_shape=(13,))) 
model.add(layers.Dense(64, activation='relu')) 
model.add(layers.Dense(1))
model.compile(optimizer='rmsprop',loss='mse', metrics=['mae'])

y_train = train_targets.astype(np.float32)
# y_test = test_targets.astype(np.float32)

model.fit(x_train,y_train,epochs=1,batch_size=404)

print(np.mean((y_train - model.predict(x_train).ravel()) ** 2))

Keras reports a loss of around 816. However, computing mse from its definition gives around 704. Why are the results different here?
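
One likely explanation, assuming standard Keras training behaviour: the loss shown by fit is computed on each batch before that batch's weight update, while model.predict uses the weights after the update. With batch_size=404 and epochs=1, and if the training split holds 404 samples, the single reported value comes from the initial weights. The plain-Python sketch below shows the same gap with a hypothetical one-parameter model and one gradient step; the data and learning rate are made up for illustration.

```python
def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of mse with respect to w."""
    return sum(-2 * x * (y - w * x) for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with w = 2
w = 0.0                    # initial weight
learning_rate = 0.01

# What a progress bar would report: loss from the forward pass, before the update.
loss_reported = mse(w, xs, ys)

# One full-batch gradient step, i.e. one epoch with a single batch.
w -= learning_rate * mse_grad(w, xs, ys)

# What a manual check on predictions sees: loss with the updated weight.
loss_after = mse(w, xs, ys)

print(loss_reported, loss_after)
```

Running a second epoch, or calling model.evaluate(x_train, y_train) after fit, should give a value that matches the manual computation, since both then use the same weights.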

Support input of temporal sample_weights for model training on ragged tensors

tensorflow/tensorflow#50401
Created: 2021-06-22T15:34:15Z

System information

TensorFlow version (you are using): 2.5.0
Are you willing to contribute it (Yes/No): Yes

Describe the feature and the current behavior/state.
Currently tensorflow throws an error if we input temporal sample_weights for a model that's fitting inputs/outputs that are in the format of ragged tensors. Example:

#input in general has shape (N_inputs, variable length, N_input_channels)    
X = [[[4.,3,2],[2,1,3],[-1,2,1]],
     [[1,2,3],[3,2,4]]]
X = tf.ragged.constant(X, ragged_rank=1, dtype=tf.float64)

#output in general has shape (N_inputs, variable but same as corresponding input, N_classification_classes)
Y = [[[0,0,1],[0,1,0],[1,0,0]],
     [[0,0,1],[1,0,0]]]
Y = tf.ragged.constant(Y, ragged_rank=1)

#Documentation says for temporal data we can pass 2D array with shape (samples, sequence_length)
weights = [[100,1,1],
           [100,1]]
weights = np.array(weights)

model = SimpleModel(width=16, in_features=3, out_features=3)
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X,Y) #works fine
model.fit(X,Y, sample_weight=weights) #throws error

The error thrown is ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list). If we do the equivalent operation for non-ragged tensors:

#input in general has shape (N_inputs, 2, N_input_channels)    
X = [[[4.,3,2],[2,1,3]],
     [[1,2,3],[3,2,4]]]
X = tf.constant(X, dtype=tf.float64)

#output in general has shape (N_inputs, 2, N_classification_classes)
Y = [[[0,0,1],[0,1,0]],
     [[0,0,1],[1,0,0]]]
Y = tf.constant(Y)

#Documentation says for temporal data we can pass 2D array with shape (samples, sequence_length)
weights = [[100,1],
           [100,1]]
weights=np.array(weights)

model = SimpleModel(width=16, in_features=3, out_features=3)
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X,Y) #works fine
model.fit(X,Y, sample_weight=weights) #also works fine

Everything works fine. The desired feature would allow passing of sample_weights for ragged tensors in the same way we could pass sample_weights for non-ragged tensors

Will this change the current api? How?
This would change the tf.keras.Model.fit api so that ragged sample_weights are supported

Who will benefit with this feature?
People working with variable length data. This occurs in areas like computer vision and applications of deep learning to particle physics. This feature would allow people working with ragged tensors to deal with underrepresented classes in temporal data via reweighing.

Any Other info.
Definition of SimpleLayer and SimpleModel used above

class SimpleLayer(tf.keras.layers.Layer):
    """Just dummy layer to illustrate sample_weight for layer"""
    def __init__(self, in_features, out_features, n):
        super(SimpleLayer, self).__init__()
        self.out_features = out_features
        self.in_features = in_features

        self.Gamma = self.add_weight(name='Gamma'+str(n),
                shape=(in_features, out_features),
                initializer='glorot_normal', trainable=True)

    def call(self, inputs):
        #uses ragged map_flat_values for Ragged tensors to handle
        #variable number of jet
        xG = tf.ragged.map_flat_values(tf.matmul, inputs, self.Gamma)
        return xG

   
class SimpleModel(tf.keras.Model):
    """Composes SimpleLayer above to create simple network for ragged tensors"""
    def __init__(self, width, in_features, out_features, Sigma=tf.nn.leaky_relu):
        super(SimpleModel, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.width = width
        self.first_layer = SimpleLayer(self.in_features, self.width, 0)
        self.hidden = SimpleLayer(self.width, self.width, 1)
        self.last_layer = SimpleLayer(self.width, self.out_features, 2)
        self.Sigma = Sigma

    def call(self, inputs):
        #use map_flat_values to apply activation to ragged tensor
        x = tf.ragged.map_flat_values(self.Sigma, self.first_layer(inputs))
        x = tf.ragged.map_flat_values(self.Sigma, self.hidden(x))
        x = tf.ragged.map_flat_values(tf.nn.softmax, self.last_layer(x))
        return x

Feature request: re-add activation histograms

System information.

TensorFlow version: 2.8.0-rc1

Describe the feature and the current behavior/state.

The tf.keras.callbacks.TensorBoard documentation describes the histogram_freq argument as follows:

histogram_freq: frequency (in epochs) at which to compute activation and weight histograms for the layers of the model. If set to 0, histograms won't be computed. Validation data (or split) must be specified for histogram visualizations.

But I am fairly certain activation histograms are not written, see also tensorflow/tensorflow#39755 and tensorflow/tensorflow#42027.

Solutions:

1. Re-implement the documented feature (it existed before).
2. Un-document the unimplemented feature.

Will this change the current api? How?

1 will not change the API; 2 will change the docs.

Who will benefit from this feature?

Everyone

TensorBoard callback does not collect data per batch (accidentally removed)

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Debian Stable
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): 2.8.0
Python version: 3.9

Describe the problem.

According to https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard, the value update_freq=N should collect logs every N batches. However, no logs are generated after batches, only after an epoch.

Describe the current behavior.

No logs are generated after N training batches.

Describe the expected behavior.

Logs should be generated after N training batches.

Details

The feature for collecting batch_* summaries was removed in keras-team/keras@7d06227 -- see how write_scalar_summaries was removed from keras/engine/training.py.

It should be enough to just revert the given commit.

Keras changes names of tensorflow operators

Hello,
When I create a Keras model, the layer names differ from the names I gave to the TensorFlow operators. Moreover, the naming in TensorFlow itself does not seem to work as expected.
I have example code to reproduce the issue. I installed TensorFlow 2.7 and Keras 2.7 on a Windows 10 machine (version 21H1, build 19043.1348). I expected the operators/layers to be named "Space2Depth", "Multiplication", and "Depth2Space", but that is not the case.
Could you take a look at this issue? I also opened an issue in the TensorFlow GitHub repository: tensorflow/tensorflow#53045
Thank you very much.

import tensorflow as tf

def sample_network(input_layer):
    s2d = tf.nn.space_to_depth(input_layer, block_size=2, name="Space2Depth")
    mul = tf.multiply(s2d, 10.0, name="Multiplication")
    d2s = tf.nn.depth_to_space(mul, block_size=2, name="Depth2Space")
    print(s2d.name, d2s.name, mul.name)
    return d2s


if __name__ == "__main__":

    input_net = tf.keras.Input(shape=(64, 64, 3), dtype=tf.float32, name="inputLayer")
    output = sample_network(input_net)
    model = tf.keras.Model(inputs=input_net, outputs=output)
    for layer in model.layers:
        print("keras:", layer.name)

It prints out:

tf.nn.space_to_depth/SpaceToDepth:0 tf.nn.depth_to_space/DepthToSpace:0 tf.math.multiply/Mul:0
keras: inputLayer
keras: tf.nn.space_to_depth
keras: tf.math.multiply
keras: tf.nn.depth_to_space

Cannot load back model with no-op Concatenate layer

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Big Sur 11.6
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): v2.6.0-rc2-32-g919f693420e 2.6.0
Python version: 3.9.7
Bazel version (if compiling from source): N/A
GPU model and memory: N/A
Exact command to reproduce:

Describe the problem.

When I create a simple model with a dummy Concatenate layer (i.e. the concatenation receives one single element), I am able to save it successfully, but the subsequent model loading fails.

Describe the current behavior.

Loading a trained model fails.

Describe the expected behavior.

The model loading should finish without errors.

Do you want to contribute a PR? (yes/no): No
Briefly describe your candidate solution(if contributing): N/A

Standalone code to reproduce the issue.

import tensorflow as tf

if __name__ == "__main__":
    input_layer = tf.keras.Input(shape=[100])
    dense_layer = tf.keras.layers.Dense(1)(input_layer)
    concatenate_layer = tf.keras.layers.Concatenate()([dense_layer])
    model = tf.keras.Model([input_layer], [concatenate_layer])
    model.compile(optimizer="adam", loss="mean_absolute_error")
    model.save("model.h5")
    loaded_model = tf.keras.models.load_model("model.h5")

Source code / logs.

Full traceback:

Traceback (most recent call last):
  File "/Users/stefan/workspace/tierra/bug.py", line 10, in <module>
    loaded_model = tf.keras.models.load_model("model.h5")
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/saving/save.py", line 200, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/saving/hdf5_format.py", line 180, in load_model_from_hdf5
    model = model_config_lib.model_from_config(model_config,
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/saving/model_config.py", line 52, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/layers/serialization.py", line 208, in deserialize
    return generic_utils.deserialize_keras_object(
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/utils/generic_utils.py", line 674, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/functional.py", line 662, in from_config
    input_tensors, output_tensors, created_layers = reconstruct_from_config(
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/functional.py", line 1283, in reconstruct_from_config
    process_node(layer, node_data)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/functional.py", line 1231, in process_node
    output_tensors = layer(input_tensors, **kwargs)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/base_layer.py", line 976, in __call__
    return self._functional_construction_call(inputs, args, kwargs,
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/base_layer.py", line 1114, in _functional_construction_call
    outputs = self._keras_tensor_symbolic_call(
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/base_layer.py", line 848, in _keras_tensor_symbolic_call
    return self._infer_output_signature(inputs, args, kwargs, input_masks)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/base_layer.py", line 886, in _infer_output_signature
    self._maybe_build(inputs)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/engine/base_layer.py", line 2659, in _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/utils/tf_utils.py", line 259, in wrapper
    output_shape = fn(instance, input_shape)
  File "/Users/stefan/workspace/tierra/.env/lib/python3.9/site-packages/keras/layers/merge.py", line 489, in build
    raise ValueError('A `Concatenate` layer should be called '
ValueError: A `Concatenate` layer should be called on a list of at least 1 input.

Memory leak in saving and loading a keras model containing CategoricalEncoding and Lookup layers

System information

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution: CentOS Linux release 7.9.2009 (Core)
TensorFlow installed from: binary
TensorFlow version: 2.6.0 (v2.6.0-rc2-32-g919f693420e), 2.7.0 (v2.7.0-rc1-69-gc256c071bb2)
Python version: 3.7.11, 3.8.12, 3.9.6
Bazel version: not applicable
GPU model and memory: no GPU available
Exact command to reproduce: See code below

Description of the problem

To solve a binary classification problem, I have a Keras model that processes categorical input (as well as numeric input).
I need to save (model.save) and load (tf.keras.models.load_model) the model multiple times (training the model in between).

I expect the model to consume constant disk space and constant RAM every time I load it, since the architecture does not change (only the parameter values change).
This is not what happens when the model contains an IntegerLookup layer followed by a CategoryEncoding layer.

The issue can be reproduced without training the model at all.
Here is a minimal code example that creates a model and saves it to disk:

import tensorflow as tf
import numpy as np

input_layer = tf.keras.Input(shape=(1,), dtype="int32") 
index = tf.keras.layers.IntegerLookup(max_values=2)
index.adapt(np.array(range(2)))
encoder = tf.keras.layers.CategoryEncoding(max_tokens=index.vocab_size())
encoded_layer = encoder(index(input_layer))
output_layer = tf.keras.layers.Dense(2, activation=tf.keras.activations.softmax)(encoded_layer)
model = tf.keras.Model(input_layer, output_layer)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.save("model")

Every time I execute

model = tf.keras.models.load_model("model")
model.save("model")

the space that the model consumes on disk increases by approx. 8 kB.
Even worse: when I load the model, the RAM usage increases by approx. 9 MB in each iteration.
So after 100 iterations, the model needs approx. 1 MB on disk and 950 MB of RAM (which is problematic).
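
One way to quantify the on-disk growth is to record the size of the model directory after every load/save cycle. The following is a minimal sketch using only the standard library. directory_size_bytes, track_growth and fake_roundtrip are hypothetical helpers, and fake_roundtrip merely imitates the roughly 8 kB growth per cycle so that the sketch runs without TensorFlow.

```python
import os
import tempfile


def directory_size_bytes(path):
    """Total size of all regular files below `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def track_growth(path, roundtrip, iterations):
    """Call `roundtrip(path)` repeatedly and record the size after each call."""
    sizes = [directory_size_bytes(path)]
    for _ in range(iterations):
        roundtrip(path)
        sizes.append(directory_size_bytes(path))
    return sizes


def fake_roundtrip(path):
    # Stand-in for the real cycle, which in the setup from the report would be:
    #     model = tf.keras.models.load_model(path); model.save(path)
    with open(os.path.join(path, "saved_model.pb"), "ab") as f:
        f.write(b"\0" * 8 * 1024)  # pretend each cycle adds about 8 kB


with tempfile.TemporaryDirectory() as model_dir:
    sizes = track_growth(model_dir, fake_roundtrip, iterations=3)
    deltas = [after - before for before, after in zip(sizes, sizes[1:])]
    print(sizes, deltas)
```

With a real model, a constant non-zero entry in deltas would confirm the linear growth described above.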

This also happens if I start a new python process in each iteration.

In my application, the memory consumption grows even faster because the model has several input layers and also several inner layers.
This makes the model unusable after some iterations because I cannot load it anymore.

Additionally, of course, the load and save cycles get slower with each repetition.

So far, I could reproduce this issue on tensorflow versions 2.6 and 2.7 running on python 3.7, 3.8 or 3.9. The behavior is identical.

I originally posted this problem as a tensorflow issue.

Model stops training with variable-size dataset

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): yes, but very simple case
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
TensorFlow installed from (source or binary): Colab default
TensorFlow version (use command below): v2.7.0-0-gc256c071bb2 2.7.0
Python version: 3
Bazel version (if compiling from source): no
GPU model and memory: no
Exact command to reproduce: https://colab.research.google.com/drive/1fY4v9WBRxfsywDyKKidu-lmFpaPdAn9D?usp=sharing

Describe the problem.

In my real use case I train the model on a tf.data.Dataset instance (based on tensorflow_datasets).
One big difference from the default examples of keras.Model.fit with a Dataset is that the dataset length is unknown (variable).
In my case the dataset length varies (by about ±20%) because I apply random augmentations and filter out some of the results. See the provided Colab link for what I mean.

As a result, when the first epoch finishes (the dataset has raised OutOfRangeError), Keras remembers the current step count, and if the same dataset is shorter in a later epoch, all model training is stopped.

Describe the current behavior.
The model stops training if the second/third/etc. dataset iterator is shorter than the first one.

Describe the expected behavior.
The model should not stop training. It can print a warning, but it should not stop.

Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.
https://colab.research.google.com/drive/1fY4v9WBRxfsywDyKKidu-lmFpaPdAn9D?usp=sharing

Source code / logs.

model.fit(dataset, epochs=15)

# Epoch 1/15
# 819/819 [==============================] - 2s 1ms/step - loss: 1.3987
# Epoch 2/15
# 819/819 [==============================] - 1s 1ms/step - loss: 1.0563
# Epoch 3/15
# 819/819 [==============================] - 1s 1ms/step - loss: 1.0262
# Epoch 4/15
# 819/819 [==============================] - 1s 1ms/step - loss: 1.0156
# Epoch 5/15
# 782/819 [===========================>..] - ETA: 0s - loss: 1.0146WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 12285 batches). You may need to use the repeat() function when building your dataset.
# 819/819 [==============================] - 1s 1ms/step - loss: 1.0161
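
The behaviour described above can be modelled in a few lines of plain Python. This is a simplified simulation of what the report describes, not the actual Keras implementation; simulate_fit and its return values are hypothetical.

```python
def simulate_fit(epoch_lengths):
    """Simulate the step bookkeeping described in the report.

    epoch_lengths[i] is the number of batches the dataset yields in epoch i.
    Returns (number of fully completed epochs, status string).
    """
    inferred_steps = None  # unknown until the first epoch is exhausted
    completed = 0
    for length in epoch_lengths:
        if inferred_steps is None:
            # The first pass over the data fixes the expected step count.
            inferred_steps = length
        elif length < inferred_steps:
            # A shorter later epoch looks like "input ran out of data".
            return completed, "interrupted"
        # In this model, longer epochs are simply cut off at inferred_steps.
        completed += 1
    return completed, "finished"


print(simulate_fit([819, 819, 819, 819, 782]))  # fifth epoch is shorter
print(simulate_fit([819, 830, 819, 901, 819]))  # never shorter than the first
```

The warning text itself suggests repeat(). Combined with an explicit steps_per_epoch, that would likely remove the dependence on the length of the first epoch, at the cost of epochs no longer corresponding to exact passes over the dataset.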

Cannot Build Intermediate Model to Nested Layers

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
TensorFlow installed from (source or binary): pip
TensorFlow version (use command below): v2.4.0-49-g85c8b2a817f 2.4.1 (but also tested on 2.8.0)
Python version: 3.8
GPU model and memory: NVIDIA GeForce RTX 2080 SUPER
Exact command to reproduce: see code below

Describe the problem.
Note that I previously reported this issue for TF 2.0. Back then the TensorFlow team suggested a solution that worked under 2.0 but no longer works.

Here is the problem: using the functional API, one can build an intermediate model starting and ending at any of the original model's layers. This, however, does not work when layers are encapsulated in an inner model (let's say, some tf.keras.Sequential). The graph will differ due to the additional Input layer, but the computations should be the same. However, when trying to build an intermediate model of a nested model up to an inner layer, a "Graph disconnected" error is thrown (see below). Previously, one could circumvent this by building not to final_model.get_layer("inner_model").get_layer("id_1").output but to final_model.get_layer("inner_model").get_layer("id_1").get_output_at(1) (see the full example below).

Standalone code to reproduce the issue.

import os
import tensorflow as tf
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# NOT NESTED
inp = tf.keras.Input((4,))
y = tf.keras.layers.Dense(4, name="od_1")(inp)
y = tf.keras.layers.Dense(2, name="od_2")(y)
y = tf.keras.layers.Dense(4, name="id_1")(y)
y = tf.keras.layers.Dense(10, name="od_3")(y)
y = tf.keras.layers.Dense(10, name="od_4")(y)
final_model = tf.keras.Model(inputs=[inp], outputs=[y])
final_model.summary()

sub_model = tf.keras.Model(inputs=[final_model.input], outputs=[final_model.get_layer("id_1").output])
sub_model.summary()

# NESTED
inp_1 = tf.keras.Input(shape=(2,))
x = tf.keras.layers.Dense(4, name="id_1")(inp_1)
inner_model = tf.keras.Model(inputs=[inp_1], outputs=[x], name="inner_model")

inp_outer = tf.keras.Input((4,))
y = tf.keras.layers.Dense(4, name="od_1")(inp_outer)
y = tf.keras.layers.Dense(2, name="od_2")(y)
y = inner_model(y)
y = tf.keras.layers.Dense(10, name="od_3")(y)
y = tf.keras.layers.Dense(10, name="od_4")(y)
final_model = tf.keras.Model(inputs=[inp_outer], outputs=[y])
final_model.summary()

sub_model = tf.keras.Model(inputs=[final_model.input], outputs=[final_model.get_layer("inner_model").get_layer("id_1").output])
previously_working_sub_model = tf.keras.Model(
    inputs=[final_model.input],
    outputs=[final_model.get_layer("inner_model").get_layer("id_1").get_output_at(1)])

The previously_working_sub_model line now throws ValueError: Asked to get output at node 1, but the layer has only 1 inbound nodes. The sub_model line throws ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 2), dtype=tf.float32, name='input_2'), name='input_2', description="created by layer 'input_2'") at layer "id_1". The following previous layers were accessed without issue: []

Expected behavior.
To allow for accessing intermediate activations, it is crucial to be able to build intermediate models to (and preferably from) anywhere within the model.

Tokenizer converts padding integers to OOV when oov_token is not None

System information.

TensorFlow version (you are using): 2.4.1 (also re-produced in 2.6)
Are you willing to contribute it (Yes/No) : No

Describe the feature and the current behavior/state.

When used with padding, Tokenizer.sequences_to_texts() converts padding tokens to oov_token when oov_token is not None. This does not happen when oov_token is None: in that case sequences_to_texts() skips padding integers as well as OOV integers.

This behaviour is perhaps expected, since the padding value is not part of the vocabulary.
However, I think it would make more sense if sequences_to_texts() took an optional padding_value argument and did not decode these integers to oov_token.

To reproduce:

import tensorflow as tf

vocab_size = 5
seq_len = 5

text = "hello world test"
oov_token = "<OOV>"

tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words = vocab_size, oov_token = oov_token)
tokenizer.fit_on_texts([text])

tokenized = tokenizer.texts_to_sequences([text])
padded = tf.keras.preprocessing.sequence.pad_sequences(tokenized, maxlen = seq_len, value = 0)

print('Non padded tokenization result:', tokenized)
print("Non padded de-tokenization result:", tokenizer.sequences_to_texts(tokenized))
print("\n")
print('Padded tokenization result:', padded)
print("Padded de-tokenization result:", tokenizer.sequences_to_texts(padded))

Non padded tokenization result: [[2, 3, 4]]
Non padded de-tokenization result: ['hello world test']

Padded tokenization result: [[0 0 2 3 4]]
Padded de-tokenization result: ['<OOV> <OOV> hello world test']

What it will de-tokenize to with this feature implemented:
Feature implemented padded de-tokenization result: ['hello world test']

Will this change the current api? How?
tf.keras.preprocessing.text.Tokenizer.sequences_to_texts() function will take an optional padding_value argument, which is None by default.
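
The proposed behaviour can be sketched as a stand-alone function in plain Python. This is only an illustration of the request, not the Tokenizer implementation: it ignores num_words filtering, and the padding_value parameter is the hypothetical argument proposed above.

```python
def sequences_to_texts(sequences, index_word, oov_token=None, padding_value=None):
    """Decode integer sequences to strings, optionally skipping padding."""
    texts = []
    for seq in sequences:
        words = []
        for idx in seq:
            idx = int(idx)
            if padding_value is not None and idx == padding_value:
                continue  # proposed: drop padding instead of mapping it to OOV
            word = index_word.get(idx)
            if word is not None:
                words.append(word)
            elif oov_token is not None:
                words.append(oov_token)  # unknown index and OOV token is set
        texts.append(" ".join(words))
    return texts


# Index mapping consistent with the output shown above (OOV token at index 1).
index_word = {1: "<OOV>", 2: "hello", 3: "world", 4: "test"}
padded = [[0, 0, 2, 3, 4]]

print(sequences_to_texts(padded, index_word, oov_token="<OOV>"))
print(sequences_to_texts(padded, index_word, oov_token="<OOV>", padding_value=0))
```

Keeping padding_value=None as the default leaves the existing behaviour untouched, so the change would be backward compatible.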

Who will benefit from this feature?
Those who use tf.keras.preprocessing.text.Tokenizer to tokenize strings with padding and de-tokenize padded sequences back to words.

Do you want to contribute a PR? (yes/no): No

Incorrect time per step estimation when fitting model

Hi there,

System information.
I observed the behavior both in a Colab notebook (TF v2.8.0-0-g3f878cff5b6 2.8.0) and in a custom Docker image (Ubuntu 20.04, TF v2.5.1-97-g957590ea15c 2.5.2). The issue can easily be reproduced by using validation sets of different sizes and I made an example notebook (see below).

Describe the problem.
The time/step reported when calling fit is not the time per training step. The phrasing can be misleading, for instance when trying to design your training scheme based on time constraints.

Describe the current behavior.
Currently, the time for a full epoch (including validation) divided by the number of training steps is reported. If the validation takes a significant amount of time, the time per training step might be way smaller than reported.

Describe the expected behavior.
I feel that reporting the time per training step (excluding validation) would be more informative.
For instance for the following output (from the Colab notebook linked below):

Epoch 1/3
5/5 [==============================] - 2s 613ms/step - loss: 0.3212 - val_loss: 0.1224

I would expect that using a single step instead of 5 would make each epoch take around 613ms. However each epoch would still take 2s as most of it is spent on the validation set.
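
The effect can be reproduced with simple arithmetic. The sketch below follows the description in this report (epoch time including validation, divided by the number of training steps); it is not the Keras progress bar code, and all timings are made-up values.

```python
def epoch_ms(train_ms_per_step, train_steps, validation_ms):
    """Wall time of one epoch: all training steps plus validation."""
    return train_ms_per_step * train_steps + validation_ms


def reported_ms_per_step(train_ms_per_step, train_steps, validation_ms):
    """Time per step as described in the report: epoch time / training steps."""
    return epoch_ms(train_ms_per_step, train_steps, validation_ms) / train_steps


# Hypothetical timings: fast training steps, slow validation.
train_ms, val_ms = 100, 2500

print(reported_ms_per_step(train_ms, 5, val_ms))  # inflated by validation time
print(epoch_ms(train_ms, 1, val_ms))              # epoch with a single step
```

With these numbers the reported value is 600 ms/step although a training step takes 100 ms, and an epoch with a single step still takes 2600 ms rather than 600 ms.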

Standalone code to reproduce the issue.

See this colab notebook:
https://colab.research.google.com/drive/1YGWstYcbFwkPY4ezZ2C-krDPahS36nnm?usp=sharing

Cheers.

Can not run all Keras tests successfully.

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04.3 LTS
TensorFlow installed from (source or binary): Got the TF docker container tensorflow/tensorflow:2.8.0-gpu
TensorFlow version (use command below): v2.8.0-rc1-32-g3f878cff5b6 2.8.0
Python version: 3.8.10
Bazel version (if compiling from source): 4.2.1 (used for running Keras unit tests)
GPU model and memory: Two GPUs. Both are Tesla P100-PCIE with 16GB memory
Exact command to reproduce: bazel test (more information is described below)

Describe the problem.
When I am trying to run the Keras unit-tests, one test fails. That unit test is //keras/distribute:minimize_loss_test_gpu

Describe the current behavior.

......
//keras/optimizer_v2:rmsprop_test_gpu                                    PASSED in 25.5s
//keras/tests:saver_test_gpu                                             PASSED in 9.1s
//keras/utils:multi_gpu_utils_test_gpu                                   PASSED in 16.3s
//keras/distribute:minimize_loss_test_gpu                                FAILED in 360.7s
  /root/.cache/bazel/_bazel_root/0b555d6a82cf650cedde1ae5c5212680/execroot/org_keras/bazel-out/k8-opt/testlogs/keras/distribute/minimize_loss_test_gpu/test.log

Executed 72 out of 72 tests: 71 tests pass and 1 fails locally.

Describe the expected behavior.
All tests should pass successfully. Am I missing something in launching the tests or setting up the environment?

Standalone code to reproduce the issue.
Run the following in a container from tensorflow/tensorflow:2.8.0-gpu

set -eux
pip3 uninstall keras
git clone -b r2.8 https://github.com/keras-team/keras.git
cd keras
sed -i "s/tf-nightly/#tf-nightly/g" requirements.txt
pip3 install -r requirements.txt
TF_TESTS_PER_GPU=1
N_BUILD_JOBS=$(grep -c ^processor /proc/cpuinfo)
N_TEST_JOBS=1
bazel test \
      --jobs=${N_BUILD_JOBS} \
      --local_test_jobs=${N_TEST_JOBS} \
      --test_output=errors \
      --test_sharding_strategy=disabled \
      --test_timeout 300,450,1200,3600 \
      --test_output=errors \
      --keep_going \
      --define=use_fast_cpp_protos=false \
      --build_tests_only \
      --build_tag_filters=-no_oss \
      --test_tag_filters=gpu,-no_oss,-oss_serial,-no_rocm,-no-gpu,-benchmark-test,-v1only \
      keras/...

Source code / logs.
Part of the /root/.cache/bazel/_bazel_root/0b555d6a82cf650cedde1ae5c5212680/execroot/org_keras/bazel-out/k8-opt/testlogs/keras/distribute/minimize_loss_test_gpu/test.log.

[ RUN      ] MinimizeLossStepTest.testRunStepsWithOutputContext_test_distribution_Mirrored2GPUsNoMergeCall_optimizerfn_AdagradV1_mode_graph_istpu_False
2022-02-09 05:30:19.457428: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 15401 MB memory:  -> device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:86:00.0, compute capability: 6.0
2022-02-09 05:30:19.457729: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:1 with 15401 MB memory:  -> device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:af:00.0, compute capability: 6.0
INFO:tensorflow:Using MirroredStrategy with devices ('/replica:0/task:0/device:GPU:0', '/replica:0/task:0/device:GPU:1')
I0209 05:30:19.461880 139974298908480 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/replica:0/task:0/device:GPU:0', '/replica:0/task:0/device:GPU:1')
2022-02-09 05:30:19.557960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 15401 MB memory:  -> device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:86:00.0, compute capability: 6.0
2022-02-09 05:30:19.558183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 15401 MB memory:  -> device: 1, name: Tesla P100-PCIE-16GB, pci bus id: 0000:af:00.0, compute capability: 6.0
2022-02-09 05:30:19.582256: W tensorflow/core/grappler/utils/graph_view.cc:836] No registered 'MultiDeviceIteratorFromStringHandle' OpKernel for GPU devices compatible with node {{node MultiDeviceIteratorFromStringHandle}}
	.  Registered:  device='CPU'

2022-02-09 05:30:19.583421: W tensorflow/core/grappler/utils/graph_view.cc:836] No registered 'MultiDeviceIteratorGetNextFromShard' OpKernel for GPU devices compatible with node {{node MultiDeviceIteratorGetNextFromShard}}
	.  Registered:  device='CPU'

2022-02-09 05:30:19.595629: W tensorflow/core/grappler/utils/graph_view.cc:836] No registered 'MultiDeviceIteratorFromStringHandle' OpKernel for GPU devices compatible with node {{node MultiDeviceIteratorFromStringHandle}}
	.  Registered:  device='CPU'

2022-02-09 05:30:19.596083: W tensorflow/core/grappler/utils/graph_view.cc:836] No registered 'MultiDeviceIteratorGetNextFromShard' OpKernel for GPU devices compatible with node {{node MultiDeviceIteratorGetNextFromShard}}
	.  Registered:  device='CPU'

INFO:tensorflow:Collective all_reduce tensors: 2 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.NCCL, num_packs = 1
I0209 05:30:19.736172 139974298908480 cross_device_ops.py:1152] Collective all_reduce tensors: 2 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.NCCL, num_packs = 1
INFO:tensorflow:Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
I0209 05:30:19.804530 139974298908480 cross_device_ops.py:1152] Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
INFO:tensorflow:Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
I0209 05:30:19.831649 139974298908480 cross_device_ops.py:1152] Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
INFO:tensorflow:Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
I0209 05:30:19.884937 139974298908480 cross_device_ops.py:1152] Collective all_reduce tensors: 1 all_reduces, num_devices = 2, group_size = 2, implementation = CommunicationImplementation.RING, num_packs = 1
2022-02-09 05:30:20.068069: E tensorflow/core/common_runtime/base_collective_executor.cc:249] BaseCollectiveExecutor::StartAbort INTERNAL: NCCL: unhandled cuda error. Set NCCL_DEBUG=WARN for detail.
2022-02-09 05:30:20.068157: W tensorflow/core/nccl/nccl_manager.cc:858] NcclManager already aborted, ignoring subsequent StartAbort with CANCELLED: op cancelled
INFO:tensorflow:time(__main__.MinimizeLossStepTest.testRunStepsWithOutputContext_test_distribution_Mirrored2GPUsNoMergeCall_optimizerfn_AdagradV1_mode_graph_istpu_False): 0.62s
I0209 05:30:20.068961 139974298908480 test_util.py:2373] time(__main__.MinimizeLossStepTest.testRunStepsWithOutputContext_test_distribution_Mirrored2GPUsNoMergeCall_optimizerfn_AdagradV1_mode_graph_istpu_False): 0.62s
[  FAILED  ] MinimizeLossStepTest.testRunStepsWithOutputContext_test_distribution_Mirrored2GPUsNoMergeCall_optimizerfn_AdagradV1_mode_graph_istpu_False

Inconsistent behavior of tf.keras.losses.binary_crossentropy

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): no
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian testing
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 2.7
Python version: 3.9
Bazel version (if compiling from source): 4.1
GPU model and memory: NA
Exact command to reproduce:

Describe the problem.
tf.keras.losses.binary_crossentropy behaves inconsistently when broadcasting is used.

Describe the current behavior.
The following example works:

y = tf.random.uniform((10, 1))
tf.keras.losses.binary_crossentropy(0.5, y)

Whereas this one fails:

tf.keras.losses.binary_crossentropy(0.5, y, from_logits=True)

The reason is that the latter internally calls tf.nn.sigmoid_cross_entropy_with_logits, which does not support broadcasting.

Describe the expected behavior.
I would expect seamless broadcasting in both cases.
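
For reference, the expected result can be written down in plain Python using the numerically stable formula from the tf.nn.sigmoid_cross_entropy_with_logits documentation, max(x, 0) - x * z + log(1 + exp(-|x|)). The function binary_crossentropy_from_logits below is a hypothetical stand-in that broadcasts a scalar label by hand; it is not the Keras implementation.

```python
import math


def sigmoid_cross_entropy_with_logits(label, logit):
    """Numerically stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def binary_crossentropy_from_logits(y_true, logits):
    """Broadcast a scalar label over a (batch, n) nested list of logits.

    Returns one value per row: the mean loss over the last axis.
    """
    return [
        sum(sigmoid_cross_entropy_with_logits(y_true, x) for x in row) / len(row)
        for row in logits
    ]


logits = [[0.0], [2.0], [-2.0]]  # shape (3, 1), analogous to y in the report
print(binary_crossentropy_from_logits(0.5, logits))
```

In TensorFlow itself, explicitly expanding the label first, for example with tf.broadcast_to(0.5, tf.shape(y)), should sidestep the shape mismatch until broadcasting is supported in both code paths.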

Using the same seed kwarg returns different values between GlorotUniform and GlorotUniformV2

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04.5 LTS
TensorFlow installed from (source or binary): Pip (binary)
TensorFlow version (use command below): v2.6.0-0-g919f693420e 2.6.0
Python version: 3.7.12
Bazel version (if compiling from source):
GCC/Compiler version (if compiling from source):
CUDA/cuDNN version: NA
GPU model and memory: NA
Exact command to reproduce: Colab notebook

Describe the problem.
The random values produced by GlorotUniform differ between the V1 and V2 APIs when the same seed argument is used. As a consequence, it is not possible to reproduce in V2 the exact results obtained with V1.

Describe the current behavior.
Setting the operation seed returns different tensors for GlorotUniform depending on whether it is imported from tf.compat.v1 or from tensorflow.keras.initializers.

Describe the expected behavior.
Both tensors should be equal.

Do you want to contribute a PR? (yes/no): No
Briefly describe your candidate solution(if contributing): NA

Standalone code to reproduce the issue.
Colab link

Source code / logs.
NA

P.S.: Submitting this bug here in the Keras repo because of comments in the TF repo, specifically a comment in TF issue 52294.

At `build()` only two last axis are numerics

Hello,

I created weights at build():

def build(self, input_shape):
        super(PositionalEmbedding, self).build(input_shape)

        print(input_shape)
        self.position = self.add_weight(
            name="position",
            shape=(1, input_shape[1], input_shape[2], self.units),
            initializer=TruncatedNormal(stddev=0.02),
            trainable=True,
        )

The input's shape: (64, 7, 25, 81) # (batches, timesteps, patches, features)
input_shape of build(): (None, None, 25, 81) # (batches, timesteps, patches, features)

Error

ValueError: in user code:

    File "/Users/martin/miniforge3/lib/python3.9/site-packages/keras/engine/training.py", line 1021, in train_function  *
        return step_function(self, iterator)
    File "/Users/martin/miniforge3/lib/python3.9/site-packages/keras/engine/training.py", line 1010, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/Users/martin/miniforge3/lib/python3.9/site-packages/keras/engine/training.py", line 1000, in run_step  **
        outputs = model.train_step(data)
    File "/var/folders/7v/fqqcktvs23qc8fwgftjpz_gh0000gn/T/ipykernel_6464/643090246.py", line 97, in train_step
        y_pred, _ = self([inputs, targets_inputs], training=True)
    File "/Users/martin/miniforge3/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None

    ValueError: Exception encountered when calling layer "transformer_111" (type Transformer).
    
    in user code:
    
        File "/var/folders/7v/fqqcktvs23qc8fwgftjpz_gh0000gn/T/ipykernel_6464/643090246.py", line 52, in call  *
            x_e = self.pos_embs_0(x_e, training=training)
        File "/Users/martin/miniforge3/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/var/folders/7v/fqqcktvs23qc8fwgftjpz_gh0000gn/T/ipykernel_6464/3588400932.py", line 14, in build
            self.position = self.add_weight(
    
        ValueError: Can't convert Python sequence with mixed types to Tensor.
    
    
    Call arguments received:
      • inputs=['tf.Tensor(shape=(None, None, 25, 81), dtype=float32)']
      • training=True
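
The traceback above is consistent with `input_shape[1]` being `None` when `build()` runs: the shape tuple passed to `add_weight` then mixes `None` with integers. A minimal sketch in plain Python of a guard that fails early with a clearer message follows. The helper `require_static_dims` and the unit count of 16 are hypothetical and not part of Keras.

```python
def require_static_dims(input_shape, axes):
    """Return the sizes of `axes`, raising if any of them is unknown (None).

    Hypothetical helper, not a Keras API: it only illustrates that a weight
    shape built from `input_shape` needs every axis it uses to be static.
    """
    dims = []
    for axis in axes:
        size = input_shape[axis]
        if size is None:
            raise ValueError(
                f"Axis {axis} of input_shape {tuple(input_shape)} is None; "
                "a weight shape cannot contain unknown dimensions."
            )
        dims.append(int(size))
    return tuple(dims)


# Shape reported inside build() in the issue: the timesteps axis is unknown.
try:
    require_static_dims((None, None, 25, 81), axes=(1, 2))
except ValueError as err:
    print(err)

# With a defined timesteps axis, the weight shape can be formed
# (16 stands in for self.units and is an assumption of this sketch).
print((1, *require_static_dims((None, 7, 25, 81), axes=(1, 2)), 16))
```

Whether the timesteps axis can be made static depends on how the model's inputs are declared, which the issue does not show.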

dtype of RNN cell's state is changed to tf.float32 during reset_states

Have I written custom code (as opposed to using a stock example script
provided in Keras): yes
OS Platform and Distribution: both win10 and CentOS Linux
TensorFlow installed from: pip
TensorFlow version: 2.6.0
Python version: 3.8
Exact command to reproduce: tf.keras.layers.RNN(cell)

Describe the problem.

I have implemented a recurrent cell which is to be wrapped within a tf.keras.layers.RNN. The cell has a state whose data type is not tf.float32 but tf.complex64. However, each time when layer.reset_states() is invoked, the data type of the state is changed to tf.float32. As a result, a value error is thrown during the initial symbolic call. See attached stack trace.

Describe the current behavior.
The program crashes at the construction of the RNN layer. See attached stack trace.

I assume the cause is lines 933 and 934 of the function reset_states in class RNN, in file keras/layers/recurrent.py:

      flat_states_variables = tf.nest.map_structure(
          backend.variable, flat_init_state_values)

Here, the initialized state values are stored in flat_init_state_values and backend.variable is called on each of the states. However, no dtype argument is passed to backend.variable. As a consequence, it defaults to tf.float32 for all states.
I would recommend the following patch, which solves the issue for me:

      flat_states_variables = tf.nest.map_structure(
          lambda var: backend.variable(var, var.dtype), flat_init_state_values)

I also tried to run the example after replacing the files affected by the latest commit regarding mixed precision. Unfortunately, that did not solve the issue for me.

Standalone code to reproduce the issue.

Currently, the example fails at the construction of the RNN layer.

import tensorflow as tf

class RecurrentCell(tf.keras.layers.Layer):
    def __init__(self, state_size):
        super(RecurrentCell, self).__init__()
        self.state_size = state_size

    def build(self, input_shape):
        super(RecurrentCell, self).build(input_shape)

    def get_initial_state(self, inputs=None, batch_size=None, dtype=None):
        # explicit initialization with tf.complex64
        return tf.zeros((self.state_size, ), dtype=tf.complex64)

    @tf.function
    def call(self, inputs, states):
        # toy example
        x = inputs
        xfd = tf.signal.rfft(x)[..., :self.state_size]
        yfd = tf.multiply(xfd, states)
        return tf.signal.irfft(yfd), states


recCell = RecurrentCell(state_size=5)

inp = tf.keras.Input(shape=(None, 8),
                     batch_size=32)
out = tf.keras.layers.RNN(recCell,  # crashes
                          return_sequences=True,
                          stateful=True,
                          return_state=False)(inp)
model = tf.keras.Model(inputs=[inp], outputs=[out])

y = model.predict(tf.random.normal((32, 16, 8)))

Source code / Logs
Stacktrace:
stacktrace.txt

Models with custom metrics cannot be loaded from trace

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): YES
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 + google-colab (???)
TensorFlow installed from (source or binary): Win10: binary (pip install) / google-colab: pre-installed (presumably also from binary)
TensorFlow version (use command below): Win 10: 2.7.1 / google-colab: 2.8.0
Python version: Win 10: 3.8.10 / google-colab: 3.7.12
GPU model and memory: N/A
Exact command to reproduce: N/A

Problem:
Models using a custom metric cannot be loaded after being saved to disk.
The Keras serialization guide seems to indicate that all custom layers/objects are saved to the TensorFlow SavedModel format (emphasis mine):

SavedModel is the more comprehensive save format that saves the model architecture, weights, and the traced Tensorflow subgraphs of the call functions. This enables Keras to restore both built-in layers as well as custom objects.

That seems to be correct for a lot of custom objects, but I can't get this to work with custom metrics. If trying to load such a model without supplying it as a custom object, I get the following error:

ValueError: Unable to restore custom object of type _tf_keras_metric. Please make sure that any custom layers are included in the custom_objects arg when calling load_model() and make sure that all layers implement get_config and from_config.

Further down in the docs there's a brief statement about the SavedModel limitations, but that doesn't mention custom metrics.

Describe the current behavior.
Keras throws a ValueError when loading a model with custom metrics, if not specified as custom object.
I know I can circumvent this by supplying custom_objects to the load function, but my real issue is that the documentation does not make clear for which kinds of custom objects (or under what conditions) loading from traces is actually supported and for which it is not. If anyone can shed some light on that, please do so!

Describe the expected behavior.
Either this is a bug that must be fixed, or the documentation is not correct. A full breakdown of which custom objects (or in what conditions) can be loaded from traces and which not would be helpful.

Contributing.

Do you want to contribute a PR? (yes/no): no
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.
https://colab.research.google.com/drive/1XK4HJq52Zhf-ekLNOKGoM3Gk9JJBColL?usp=sharing

model doesn't learn when using shuffle='batch'

Describe the problem.

Moving this issue from tensorflow/tensorflow#45197

Describe the current behavior.
Because I train my model with data from HDF5 files, I used model.fit(, , , shuffle='batch').

https://www.tensorflow.org/api_docs/python/tf/keras/Sequential:
"shuffle | Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). This argument is ignored when x is a generator. 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None."
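
The documented behavior ("shuffles in batch-sized chunks") can be pictured as permuting whole batches while keeping the order inside each batch. The sketch below uses plain Python. The name `batch_shuffle_indices` is illustrative, and leaving a trailing partial batch at the end is an assumption of this sketch rather than a statement about the Keras implementation.

```python
import random


def batch_shuffle_indices(num_samples, batch_size, seed=None):
    """Shuffle sample indices in batch-sized chunks (illustrative sketch).

    Whole batches are permuted; indices inside a batch stay contiguous,
    which keeps reads from an HDF5 file sequential within each batch.
    A trailing partial batch is left at the end (an assumption here).
    """
    rng = random.Random(seed)
    full_batches = num_samples // batch_size
    chunks = [
        list(range(b * batch_size, (b + 1) * batch_size))
        for b in range(full_batches)
    ]
    rng.shuffle(chunks)
    tail = list(range(full_batches * batch_size, num_samples))
    return [i for chunk in chunks for i in chunk] + tail


print(batch_shuffle_indices(10, 3, seed=0))
```

Every index appears exactly once, so this kind of shuffling changes only the order in which batches are visited, not which samples are seen.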

Before I upgraded my hardware, I used TensorFlow 1.x as the backend for Keras 2.x, and my model trained with shuffle='batch' without any problems. Now I have a new machine, so I need to port my code. However, the new code doesn't work anymore.

Describe the expected behavior
I used the MNIST dataset to show what happened. Code from (https://www.machinecurve.com/index.php/2020/04/13/how-to-use-h5py-and-keras-to-train-with-data-from-hdf5-files/)

Do you want to contribute a PR? (yes/no):
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

import h5py
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D
from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.optimizers import Adam

# Model configuration
batch_size = 50
img_width, img_height, img_num_channels = 28, 28, 1
loss_function = sparse_categorical_crossentropy
no_classes = 10
no_epochs = 25
optimizer = Adam()
validation_split = 0.2
verbosity = 1

# Load MNIST data
f = h5py.File('train.hdf5', 'r')
input_train = f['image'][...]
label_train = f['label'][...]
f.close()
f = h5py.File('test.hdf5', 'r')
input_test = f['image'][...]
label_test = f['label'][...]
f.close()

# Reshape data
input_train = input_train.reshape((len(input_train), img_width, img_height, img_num_channels))
input_test  = input_test.reshape((len(input_test), img_width, img_height, img_num_channels))

# Determine shape of the data
input_shape = (img_width, img_height, img_num_channels)

# Create the model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(no_classes, activation='softmax'))

# Display a model summary
model.summary()

# Compile the model
model.compile(loss=loss_function,
              optimizer=optimizer,
              metrics=['accuracy'])

# Fit data to model
history = model.fit(input_train, label_train,
            batch_size=batch_size,
            epochs=no_epochs,
            verbose=verbosity,
            shuffle='batch',
            validation_split=validation_split)

# Generate generalization metrics
score = model.evaluate(input_test, label_test, verbose=0)
print(f'Test loss: {score[0]} / Test accuracy: {score[1]}')


Other info / logs
The output looks like this:
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_9 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 24, 24, 64)        18496     
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 22, 22, 128)       73856     
_________________________________________________________________
flatten_3 (Flatten)          (None, 61952)             0         
_________________________________________________________________
dense_6 (Dense)              (None, 128)               7929984   
_________________________________________________________________
dense_7 (Dense)              (None, 10)                1290      
=================================================================
Total params: 8,023,946
Trainable params: 8,023,946
Non-trainable params: 0
_________________________________________________________________
Epoch 1/25
960/960 [==============================] - 3s 3ms/step - loss: 6.4279 - accuracy: 0.1099 - val_loss: 2.3022 - val_accuracy: 0.1060
Epoch 2/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1141 - val_loss: 2.3012 - val_accuracy: 0.1060
Epoch 3/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3011 - accuracy: 0.1149 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 4/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3010 - accuracy: 0.1142 - val_loss: 2.3021 - val_accuracy: 0.1060
Epoch 5/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3010 - accuracy: 0.1141 - val_loss: 2.3019 - val_accuracy: 0.1060
Epoch 6/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3010 - accuracy: 0.1162 - val_loss: 2.3019 - val_accuracy: 0.1060
Epoch 7/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1139 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 8/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3013 - accuracy: 0.1128 - val_loss: 2.3025 - val_accuracy: 0.1060
Epoch 9/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3013 - accuracy: 0.1131 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 10/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3011 - accuracy: 0.1156 - val_loss: 2.3021 - val_accuracy: 0.1060
Epoch 11/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3013 - accuracy: 0.1127 - val_loss: 2.3022 - val_accuracy: 0.1060
Epoch 12/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3010 - accuracy: 0.1143 - val_loss: 2.3024 - val_accuracy: 0.1060
Epoch 13/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1131 - val_loss: 2.3025 - val_accuracy: 0.1060
Epoch 14/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3009 - accuracy: 0.1148 - val_loss: 2.3019 - val_accuracy: 0.1060
Epoch 15/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1152 - val_loss: 2.3019 - val_accuracy: 0.1060
Epoch 16/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3009 - accuracy: 0.1149 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 17/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3009 - accuracy: 0.1143 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 18/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3014 - accuracy: 0.1125 - val_loss: 2.3022 - val_accuracy: 0.1060
Epoch 19/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3011 - accuracy: 0.1147 - val_loss: 2.3019 - val_accuracy: 0.1060
Epoch 20/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3011 - accuracy: 0.1144 - val_loss: 2.3020 - val_accuracy: 0.1060
Epoch 21/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3013 - accuracy: 0.1128 - val_loss: 2.3022 - val_accuracy: 0.1060
Epoch 22/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1122 - val_loss: 2.3024 - val_accuracy: 0.1060
Epoch 23/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3003 - accuracy: 0.1163 - val_loss: 2.3021 - val_accuracy: 0.1060
Epoch 24/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3011 - accuracy: 0.1151 - val_loss: 2.3021 - val_accuracy: 0.1060
Epoch 25/25
960/960 [==============================] - 3s 3ms/step - loss: 2.3012 - accuracy: 0.1131 - val_loss: 2.3021 - val_accuracy: 0.1060
Test loss: 2.3010358810424805 / Test accuracy: 0.11349999904632568


If I change shuffle='batch' to shuffle=True or shuffle=False, training converges, with results like this:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 22, 22, 128)       73856     
_________________________________________________________________
flatten (Flatten)            (None, 61952)             0         
_________________________________________________________________
dense (Dense)                (None, 128)               7929984   
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 8,023,946
Trainable params: 8,023,946
Non-trainable params: 0
_________________________________________________________________
Epoch 1/25
960/960 [==============================] - 5s 3ms/step - loss: 2.3020 - accuracy: 0.9032 - val_loss: 0.0738 - val_accuracy: 0.9786
Epoch 2/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0502 - accuracy: 0.9853 - val_loss: 0.0621 - val_accuracy: 0.9824
Epoch 3/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0333 - accuracy: 0.9896 - val_loss: 0.0811 - val_accuracy: 0.9792
Epoch 4/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0216 - accuracy: 0.9936 - val_loss: 0.0851 - val_accuracy: 0.9805
Epoch 5/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0244 - accuracy: 0.9922 - val_loss: 0.0757 - val_accuracy: 0.9832
Epoch 6/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0139 - accuracy: 0.9956 - val_loss: 0.1344 - val_accuracy: 0.9752
Epoch 7/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0202 - accuracy: 0.9935 - val_loss: 0.1379 - val_accuracy: 0.9779
Epoch 8/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0141 - accuracy: 0.9956 - val_loss: 0.0919 - val_accuracy: 0.9818
Epoch 9/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0125 - accuracy: 0.9962 - val_loss: 0.1184 - val_accuracy: 0.9811
Epoch 10/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0154 - accuracy: 0.9956 - val_loss: 0.1157 - val_accuracy: 0.9832
Epoch 11/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0176 - accuracy: 0.9952 - val_loss: 0.1221 - val_accuracy: 0.9803
Epoch 12/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0101 - accuracy: 0.9976 - val_loss: 0.1170 - val_accuracy: 0.9822
Epoch 13/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0124 - accuracy: 0.9969 - val_loss: 0.1216 - val_accuracy: 0.9846
Epoch 14/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0094 - accuracy: 0.9974 - val_loss: 0.1048 - val_accuracy: 0.9848
Epoch 15/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0067 - accuracy: 0.9982 - val_loss: 0.1130 - val_accuracy: 0.9835
Epoch 16/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0122 - accuracy: 0.9974 - val_loss: 0.1463 - val_accuracy: 0.9835
Epoch 17/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0091 - accuracy: 0.9976 - val_loss: 0.1685 - val_accuracy: 0.9833
Epoch 18/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0110 - accuracy: 0.9977 - val_loss: 0.1224 - val_accuracy: 0.9840
Epoch 19/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0036 - accuracy: 0.9989 - val_loss: 0.1733 - val_accuracy: 0.9838
Epoch 20/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0109 - accuracy: 0.9978 - val_loss: 0.1539 - val_accuracy: 0.9859
Epoch 21/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0074 - accuracy: 0.9982 - val_loss: 0.1791 - val_accuracy: 0.9826
Epoch 22/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0085 - accuracy: 0.9986 - val_loss: 0.2264 - val_accuracy: 0.9830
Epoch 23/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0124 - accuracy: 0.9979 - val_loss: 0.1722 - val_accuracy: 0.9840
Epoch 24/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0089 - accuracy: 0.9984 - val_loss: 0.1472 - val_accuracy: 0.9851
Epoch 25/25
960/960 [==============================] - 3s 3ms/step - loss: 0.0048 - accuracy: 0.9988 - val_loss: 0.2005 - val_accuracy: 0.9847
Test loss: 0.18761441111564636 / Test accuracy: 0.9829999804496765

Scoring for metrics and losses

Disclaimer: I'm an engineer by training, not a statistician, so I may get this wrong. I don't think the general thrust has technical gaps, but I could be off in some of the details. If I am, please go talk to your friendly neighborhood stats professor and get their thoughts on the idea.

Background:
I hang out (and interact, and have for years, and learn tons) at CrossValidated, a stack-exchange site.

One of the most substantial families of threads there is about why accuracy is not ideal in many settings. Many of the people in those discussions are fantastic PhDs in academia and industry who have been teaching or working for decades, so they are a very important source of technical wisdom.

Here are some of the threads there:

They like these things called "strictly proper score functions" or "strictly proper scoring rules".

Here are references on strictly proper scoring rules:

When I go to the Keras loss and metrics pages, I don't see those scoring rules explicitly, and I think it's a miss. Some may be in there, but I must have missed them.

Current losses from documentation:

binary/categorical cross-entropy (I think this is related to logloss)
KL divergence
Poisson class

Current metrics from documentation:

Accuracy
Binary/Categorical/TopK accuracy
Binary/categorical crossentropy
AUC/and the TF PN measures

Recommendation/Suggestion:
I think you should add the following "strictly proper scoring rules" to Keras because it can make it easier for new users (and their pointy-haired bosses) to use technically exemplary approaches in some of their problem solving.

Some rules to consider:

Brier/quadratic scoring rule
Hyvarinen scoring rule
Spherical Scoring Rule
Logarithmic scoring rule (log-probability)
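
For concreteness, here is a sketch in plain Python of three of the rules listed above, for a single categorical forecast. It uses the standard textbook definitions and is not a proposed Keras API; the function names are illustrative. The Hyvarinen rule is omitted because it is defined for continuous densities. For a single observed label, the logarithmic score in this form matches the per-sample categorical cross-entropy.

```python
import math


def brier_score(probs, label):
    """Quadratic (Brier) score for one forecast; lower is better."""
    return sum(
        (p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs)
    )


def log_score(probs, label):
    """Negative log-probability of the observed class; lower is better."""
    return -math.log(probs[label])


def spherical_score(probs, label):
    """Observed-class probability over the L2 norm; higher is better."""
    return probs[label] / math.sqrt(sum(p * p for p in probs))


forecast = [0.7, 0.2, 0.1]  # predicted class probabilities
observed = 0                # index of the class that occurred

print(brier_score(forecast, observed))
print(log_score(forecast, observed))
print(spherical_score(forecast, observed))
```

Averaging any of these over a dataset gives a candidate metric; the Brier and logarithmic forms are oriented so that smaller is better, and could also serve as losses.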

Keras ModelCheckpoint Callback does not save the checkpoint proto file correctly resulting in Tensorflow returning incorrect checkpoint paths

Describe the problem.

Currently when we are saving the weights using the ModelCheckpoint Callback during training, we do not get the list of checkpoint files correctly from the tensorflow api tf.train.get_checkpoint_state(ckpt_folder).all_model_checkpoint_paths. I am raising this issue in Keras because the checkpoint proto is incorrectly written

Describe the current behavior.
Currently, tf.train.get_checkpoint_state(ckpt_folder).all_model_checkpoint_paths returns only the last checkpoint saved instead of all the checkpoints.

Describe the expected behavior.
The tensorflow API tf.train.get_checkpoint_state(ckpt_folder).all_model_checkpoint_paths should return all the checkpoints saved

a10g_nvidia_smi.log
t4_tesla_nvidia_smi.log

Do you want to contribute a PR? (yes/no): No

Standalone code to reproduce the issue.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os
import shutil

def get_uncompiled_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model

def get_compiled_model():
    model = get_uncompiled_model()
    model.compile(
        optimizer="rmsprop",
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    return model

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Preprocess the data (these are NumPy arrays)
x_train = x_train.reshape(60000, 784).astype("float32") / 255
x_test = x_test.reshape(10000, 784).astype("float32") / 255

y_train = y_train.astype("float32")
y_test = y_test.astype("float32")

# Reserve 10,000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

model = get_compiled_model()

ckpt_folder = os.path.join(os.getcwd(), 'ckpt')
if os.path.exists(ckpt_folder):
    shutil.rmtree(ckpt_folder)

ckpt_path = os.path.join(ckpt_folder, 'mymodel_{epoch}')


callbacks = [
    keras.callbacks.ModelCheckpoint(
        # Path where the weights are saved.
        # The saved name includes the current epoch. With
        # `save_best_only=False`, a checkpoint is written every epoch,
        # regardless of whether the `val_loss` score has improved.
        filepath=ckpt_path,
        save_best_only=False,
        save_weights_only=True,
        verbose=1,
    )
]

model.fit(
    x_train, y_train, epochs=3, batch_size=1, callbacks=callbacks, validation_split=0.2, steps_per_epoch=1
)

ckpts = tf.train.get_checkpoint_state(ckpt_folder).all_model_checkpoint_paths
print(ckpts)
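
Until the checkpoint state lists every entry, one possible workaround is to enumerate the checkpoints from the files on disk. The sketch below uses only the standard library. It assumes that each checkpoint in the TensorFlow format leaves a `<prefix>.index` file in the folder and that the prefix ends in the epoch number, as with `mymodel_{epoch}` above. The helper `list_checkpoint_prefixes` is hypothetical.

```python
import glob
import os
import re


def list_checkpoint_prefixes(ckpt_folder):
    """Return checkpoint prefixes found in `ckpt_folder`, lowest epoch first.

    Assumes each checkpoint wrote a `<prefix>.index` file and that the
    prefix ends in the epoch number (both are assumptions of this sketch).
    """
    prefixes = [
        path[: -len(".index")]
        for path in glob.glob(os.path.join(ckpt_folder, "*.index"))
    ]

    def epoch_of(prefix):
        match = re.search(r"(\d+)$", prefix)
        return int(match.group(1)) if match else -1

    return sorted(prefixes, key=epoch_of)


# Prints an empty list if the folder does not exist yet.
print(list_checkpoint_prefixes(os.path.join(os.getcwd(), "ckpt")))
```

Sorting by the parsed epoch number rather than by name keeps `mymodel_10` after `mymodel_2`.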

Model serializer path is incompatible with Windows

System information.

OS Platform and Distribution: Windows Version 10.0.19044
TensorFlow version: 2.7
Python version: 3.9.12

Describe the problem.

Both the serializer and deserializer construct the temp_dir path using the hard coded prefix "ram://". This works for Unix-based systems, but the prefix is not valid on Windows.

Describe the current behavior.

Produces the following error, which stems from the fact that f (L75) is invalid because dest_path (L74) is not a valid memory address.

Traceback (most recent call last):
  File "C:\Users\Hope\Anaconda3\envs\Association\lib\site-packages\keras\saving\pickle_utils.py", line 77, in serialize_model_as_bytecode
    info.size = f.size()
  File "C:\Users\Hope\Anaconda3\envs\Association\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 99, in size
    return stat(self.__name).length
  File "C:\Users\Hope\Anaconda3\envs\Association\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 910, in stat
    return stat_v2(filename)
  File "C:\Users\Hope\Anaconda3\envs\Association\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 926, in stat_v2
    return _pywrap_file_io.Stat(compat.path_to_str(path))
tensorflow.python.framework.errors_impl.NotFoundError

Suggested fix.

Implement something like the _get_temp_folder() method from scikeras
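
A minimal sketch of what such a helper could look like, written from the description in this issue rather than copied from scikeras: on Windows it falls back to an on-disk temporary directory, and elsewhere it keeps the "ram://" prefix. The function name `choose_temp_dir` and the use of a UUID for the in-memory path are assumptions.

```python
import os
import tempfile
import uuid


def choose_temp_dir():
    """Pick a scratch location for model serialization (illustrative sketch).

    On Windows, return a real temporary directory on disk, since the
    "ram://" prefix is reported not to work there. On other systems,
    return a unique in-memory path with the "ram://" prefix.
    """
    if os.name == "nt":
        return tempfile.mkdtemp()
    return f"ram://{uuid.uuid4().hex}"


print(choose_temp_dir())
```

With this approach the caller would need to delete the on-disk directory on Windows after use, for example with `shutil.rmtree`.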

model.save fails with ValueError __inference_conv2d_transpose_layer_call_fn_4530 when Conv2DTranspose is quantization aware

Originally I posted this bug #54753 on tensorflow/tensorflow and was advised to repost it here.

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
TensorFlow installed from (source or binary): pip install
TensorFlow version:
tf=2.7.0, tensorflow_model_optimization=0.7.1
tf=2.8.0, tensorflow_model_optimization=0.7.1
tf-nightly=2.9.0dev20211222, tensorflow_model_optimization=0.7.1
Python version: 3.7.12

Describe the problem
We save a quantization-aware keras-model in a .pb model format using model.save(). This operation fails with ValueError: __inference_conv2d_transpose_layer_call_fn_4530 when our model contains a Conv2DTranspose layer.

The error is reproducible with tf.keras.models.save_model() too
The error is reproducible when we quantize the entire model using tfmot.quantization.keras.quantize_model()
The error is also reproducible when we annotate layers using tf.keras.models.clone_model() and apply quantization using tfmot.quantization.keras.quantize_apply(). Our current workaround is to not annotate Conv2DTranspose but this prevents us from having a fully quantization-aware model.
The error is reproducible in tf2.7.0, tf2.8.0 and tf-nightly

Saving the same model as .h5 works (unfortunately this workaround is not suitable for us because our technical requirement is to save a .pb-model).

Describe the expected behavior
model.save() saves a QAT model with a Conv2DTranspose layer in a .pb-format successfully.

Standalone code to reproduce the issue
Here are the Colabs to reproduce the issue using a very simple model with a Conv2DTranspose layer and the two ways of making a model quantization aware mentioned above:
- Colab with tf2.7.0
- Colab with tf2.8.0

Other info / logs
Similar issue #868

Traceback
ValueError                                Traceback (most recent call last)
<ipython-input-7-dc1f93a93afb> in <module>()
      2 annotated_model = tf.keras.models.clone_model(base_model, clone_function=apply_quantization)
      3 q_aware_model = tfmot.quantization.keras.quantize_apply(annotated_model)
----> 4 q_aware_model.save('/output_folder/q_aware_model') # save keras model as .pb, fails

1 frames
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/saved_model/save.py in map_resources(self)
    402           if capture_constant_value is None:
    403             raise ValueError(
--> 404                 f"Unable to save function {concrete_function.name} because it "
    405                 f"captures graph tensor {capture} from a parent function which "
    406                 "cannot be converted to a constant with `tf.get_static_value`.")

ValueError: Unable to save function b'__inference_conv2d_transpose_layer_call_fn_4530' because it captures graph tensor 
Tensor("model/quant_conv2d_transpose/transpose_1:0", shape=(3, 3, 16, 16), dtype=float32) from a parent function which 
cannot be converted to a constant with `tf.get_static_value`.

Mixed precision doesn't work properly on NVIDIA A10G GPUs

A large U-Net 3D model configured with mixed precision fails with No algorithm worked! (see full a10g.log attached) when running inference on an NVIDIA A10G 20GB GPU (compute capability 8.6).

Using tensorflow/tensorflow:nightly-gpu Docker image, the error points to an out-of-memory issue (see full log a10g_tf_nightly.log attached):

No algorithm worked!  Error messages:
  Profiling failure on CUDNN engine 1#TC: RESOURCE_EXHAUSTED: Allocating 4718624784 bytes exceeds the memory limit of 4294967296 bytes.
  Profiling failure on CUDNN engine 1: RESOURCE_EXHAUSTED: Allocating 4718624784 bytes exceeds the memory limit of 4294967296 bytes.
         [[{{node model/conv3d_transpose_3/conv3d_transpose}}]] [Op:__inference_predict_function_1150]

I'm able to overcome the issue by using full precision instead (i.e., by setting mixed_precision.set_global_policy("float32")).

The same model configured with mixed precision works fine on the previous-generation Tesla T4 GPU (compute capability 7.5), which has even less GPU memory, 16GB (see full t4_tesla.log attached).

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04.3 LTS (GNU/Linux 5.11.0-1019-aws x86_64)
TensorFlow installed from (source or binary): Official tensorflow:latest-gpu Docker image (sha256@fc5eb0604722c7bef7b499bb007b3050c4beec5859c2e0d4409d2cca5c14d442)
TensorFlow version (use command below): 2.7.0
Python version: 3.8.10
CUDA/cuDNN version: 11.2.1 / 8.1.0.77-1
GPU model and memory: A10G (20GB) and Tesla T4 (16GB).
NVIDIA driver: 470.82
nvidia-smi outputs for both GPU types provided in attachments.

Describe the expected behavior

Mixed precision mode should not exhaust all GPU memory on the newest generation of NVIDIA A10G.

Standalone code to reproduce the issue
Steps to reproduce:

Start instance with A10G GPU
Start interactive Docker container and pass test.py (copy from Colab)

$ docker run --gpus all -v /path/to/test.py:/srv/test.py -it tensorflow/tensorflow:latest-gpu /bin/bash

Run script

python /srv/test.py

Repeat steps using Tesla T4 (no error obtained)

Other info / logs
a10g.log
a10g_tf_nightly.log
t4_tesla.log

[Contributors Wanted] Better `sample_weights` test coverage for Metrics

It seems that in at least one case in a Metric object, the sample_weight argument wasn't being tested. See: keras-team/keras#15997 keras-team/keras#15939

We should add coverage for sample_weight in a systematic manner to the tests for all metrics. Right now it seems we have generic tests for sample weights, but not systematic tests for each specific metric.

LossScaleOptimizer fails to prevent overflow

The loss scale optimizer currently does not reduce the loss scale below 1:

https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/mixed_precision/loss_scale_optimizer.py#L238-L239

My model is stuck not learning for the first ~1M gradient steps with the loss scale at its lower bound of 1.
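To illustrate the behavior, here is a simplified pure-Python model of dynamic loss scaling. It is not the Keras implementation: the class name, the default values, and the min_scale parameter are assumptions made for this sketch. It only shows how a floor of 1 stops the scale from shrinking further when gradients keep overflowing.

```python
class DynamicLossScale:
    """Hypothetical, simplified model of a dynamic loss scale."""

    def __init__(self, initial_scale=2.0 ** 15, growth_steps=2000,
                 multiplier=2.0, min_scale=1.0):
        self.scale = initial_scale
        self.growth_steps = growth_steps
        self.multiplier = multiplier
        self.min_scale = min_scale  # assumed lower bound on the scale
        self.good_steps = 0

    def update(self, grads_finite):
        """Adjust the scale; return True if the step should be applied."""
        if grads_finite:
            self.good_steps += 1
            if self.good_steps >= self.growth_steps:
                # Enough consecutive finite steps: grow the scale.
                self.scale *= self.multiplier
                self.good_steps = 0
            return True
        # Non-finite gradients: shrink the scale, but never below the floor.
        self.scale = max(self.scale / self.multiplier, self.min_scale)
        self.good_steps = 0
        return False


clamped = DynamicLossScale()                     # floor of 1
unclamped = DynamicLossScale(min_scale=2.0 ** -10)
for _ in range(20):                              # 20 overflowing steps in a row
    clamped.update(grads_finite=False)
    unclamped.update(grads_finite=False)
```

With the floor at 1, the first scaler stops shrinking after 15 halvings, while the second keeps going to 2**-5. In this model every overflowing step is skipped, which is consistent with the model in the report making no progress while the scale sits at its lower bound.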

AttributeError: 'VocabWeightHandler' object has no attribute 'name'

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras):
https://colab.research.google.com/drive/1BD9G1o7CNxAaKKyihCKx0wa1zW-tmn0r?usp=sharing
OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
TensorFlow installed from (source or binary): pip
TensorFlow version (use command below): 2.6.0
Python version: 3.8
Bazel version (if compiling from source):
GPU model and memory:
Exact command to reproduce:

Describe the problem.
The TensorFlow profiler crashes when the model contains a string categorical layer, which makes it impossible to profile models that use those layers.

Maybe the same issue

SparseKerasTensor loses shape after casting

System information.

Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Multiple, checked on Linux Ubuntu 18.04.6 LTS and MacOS 12.2.1 (21D62)
TensorFlow installed from (source or binary): binary
TensorFlow version (use command below): Multiple, checked on 2.5, 2.6 and v2.8.0-rc1-32-g3f878cff5b6 2.8.0
Python version: Multiple, checked on 3.6.9 and 3.8.2
Bazel version (if compiling from source): NA
GPU model and memory: NA
Exact command to reproduce: tf.cast(tf.keras.Input(3, 3, sparse=True, dtype=tf.bool), tf.int32).shape

Describe the problem.

Describe the current behavior.
The shape of the symbolic sparse tensor is lost after casting: every dimension becomes None. With a and b defined as in the reproduction code below:

>>> a.shape
TensorShape([3, 3])
>>> b.shape
TensorShape([None, None])

Describe the expected behavior.
The shape should be preserved after casting.

Do you want to contribute a PR? (yes/no): no
If yes, please read this page for instructions
Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/Jupyter/any notebook.

import tensorflow as tf

a = tf.keras.Input(3, 3, dtype=tf.bool, sparse=True)
b = tf.cast(a, dtype=tf.int32)
assert a.shape == b.shape, \
    f"a.shape ({a.shape}) is different from b.shape ({b.shape})"

Colab link

Source code / logs.

image_dataset_from_directory uses wrong directory when labels is list

Describe the problem.

The docs for image_dataset_from_directory say the following about the directory argument:

Directory where the data is located. If labels is "inferred", it should contain subdirectories, 
each containing images for a class. Otherwise, the directory structure is ignored.

This means that when labels is a list/tuple, we should ignore the directory structure (this makes sense, as the directory structure would only be used to generate labels).

Describe the current behavior.

However, this is not what happens - instead, see the following code snippet from dataset_utils.py:

  if labels is None:
    # in the no-label case, index from the parent directory down.
    subdirs = ['']
    class_names = subdirs
  else:
    subdirs = []
    for subdir in sorted(tf.io.gfile.listdir(directory)):

We only ignore the subdirectory structure if labels is None, instead of when labels != 'inferred'. This means that when labels is a list/tuple, we expect a subdirectory structure (when none exists), causing image_dataset_from_directory to fail in this case.

Describe the expected behavior.

We should ignore the subdirectory structure if labels is anything other than "inferred" (i.e. make the code match what the documentation says should happen). This should be a one-line change, and I'd be happy to make a PR.

However, the existence of this issue suggests the use case where labels is a list/tuple is not unit tested, so it would probably be good to write a test. Would love a suggestion from someone more familiar with the codebase about how best to do this.
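A minimal sketch of the proposed condition, assuming the only change is which value of labels triggers subdirectory scanning. select_subdirs is a hypothetical helper, and os.listdir stands in for the tf.io.gfile calls used in dataset_utils.py; the two temporary directories at the end show the shape a unit test for the list/tuple case could take.

```python
import os
import tempfile


def select_subdirs(directory, labels):
    """Hypothetical helper: subdirectories to index for a given `labels`."""
    inferred = isinstance(labels, str) and labels == "inferred"
    if not inferred:
        # labels is None or an explicit list/tuple:
        # index from the parent directory down and ignore its structure.
        return [""]
    # labels == "inferred": each subdirectory is treated as one class.
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )


# Flat directory with explicit labels: no subdirectories are expected.
with tempfile.TemporaryDirectory() as flat_dir:
    for name in ("a.png", "b.png"):
        open(os.path.join(flat_dir, name), "w").close()
    flat_result = select_subdirs(flat_dir, [0, 1])

# Class subdirectories with inferred labels: class names come from the tree.
with tempfile.TemporaryDirectory() as class_dir:
    for name in ("dog", "cat"):
        os.mkdir(os.path.join(class_dir, name))
    inferred_result = select_subdirs(class_dir, "inferred")
```

The isinstance check avoids comparing a list or array of labels to a string directly.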

Keras bisect and editable install

System information.

TensorFlow version (you are using):
Are you willing to contribute it (Yes/No) :
Just if we have a clear path on what will be accepted in the repo
Describe the feature and the current behavior/state.

I want to use git-bisect on different Keras commits to execute third party library tests.
This is hard to achieve currently as Keras doesn't support editable installs (e.g. pip install -e .) and we need to build and install the wheel on every single commit.
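For context, the current workflow can be sketched as a script driven by git bisect run, which treats exit status 0 as good, 125 as "skip this commit", and other statuses from 1 to 127 as bad. The function names are hypothetical, and the build and test commands are placeholders that the caller must supply; nothing here is specific to the Keras build.

```python
import subprocess
import sys

SKIP = 125  # git bisect run: this commit cannot be tested


def run(cmd):
    """Run a command (list of arguments); True if it exited with status 0."""
    return subprocess.run(cmd).returncode == 0


def bisect_status(build_ok, tests_ok):
    """Map build and test outcomes to a git bisect run exit status."""
    if not build_ok:
        return SKIP  # a broken build says nothing about the regression
    return 0 if tests_ok else 1


def check_commit(build_cmd, test_cmd):
    """Build and install the current commit, then run the third-party tests."""
    build_ok = run(build_cmd)
    tests_ok = build_ok and run(test_cmd)
    return bisect_status(build_ok, tests_ok)


# Example with trivial placeholder commands (always "good"):
status = check_commit(
    [sys.executable, "-c", "pass"],  # placeholder for the wheel build + install
    [sys.executable, "-c", "pass"],  # placeholder for the third-party test run
)
```

A wrapper script would end with sys.exit(check_commit(...)) and be passed to git bisect run. With an editable install, the build step on every commit would no longer be needed.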

Describe the feature clearly here. Be sure to convey here why the requested feature is needed. Any brief description about the use-case would help.

I want to move between multiple Keras commits to execute third party library tests.
To achieve this we need something like an editable Keras install (pip install -e .).

Will this change the current api? How?
No

Who will benefit from this feature?
All the developers and third party libraries that need to execute tests on multiple Keras commits