keras-team / tf-keras

The TensorFlow-specific implementation of the Keras API, which was the default Keras from 2019 to 2023.

License: Apache License 2.0

Dockerfile 0.01% Shell 0.11% Starlark 2.52% Python 97.37%

tf-keras's People

Contributors

abhaikollara, chenmoneygithub, edersantana, farizrahman4u, fchollet, frightera, gabrieldemarmiesse, haifeng-jin, hazemessamm, k-w-w, lukewood, matsuyamax, mattdangerw, maxpumperla, nkovela1, nzw0301, old-school-kid, ozabluda, pavithrasv, phreeza, qlzh727, rchao, sachinprasadhs, sampathweb, samuelmarks, taehoonlee, tdhd, tensorflower-gardener, the-moliver, wxs

tf-keras's Issues

InvalidArgumentError: Graph execution error:

I am trying to build an image classification program with transfer learning, following a video tutorial. I ran into some problems along the way; some I think I solved, others I couldn't. The video was mostly fine up to this point, but I think it is outdated.
I searched the web but couldn't find anything that solves my problem. Any help would be appreciated.

Here is my code:

    !pip install -q -U "tensorflow-gpu==2.2.0"
    !pip install -q -U tensorflow_hub
    !pip install -q -U tensorflow_datasets

    import time
    import numpy as np
    import matplotlib.pylab as plt

    import tensorflow as tf
    import tensorflow_hub as hub
    import tensorflow_datasets as tfds
    tfds.disable_progress_bar()

    from tensorflow.keras import layers

    #Here was supposed to be a split function to split the data 80% (train), 20% (validation), I don't know what I did but in the line below I did "split=['train[:80%]', 'train[20%:]']" is it ok? or should I change something there?
    splits, info = tfds.load('cats_vs_dogs', with_info=True, as_supervised=True, split=['train[:80%]', 'train[20%:]'])

    (train_examples, validation_examples) = splits

    def format_image(image, label):
        images = tf.image.resize(image, (IMAGE_RES, IMAGE_RES))//255.0
        return image, label

    num_examples = info.splits['train'].num_examples

    BATCH_SIZE = 32
    IMAGE_RES = 224

    train_batches      = train_examples.cache().shuffle(num_examples//4).map(format_image).batch(BATCH_SIZE).prefetch(1)
    validation_batches = validation_examples.cache().map(format_image).batch(BATCH_SIZE).prefetch(1)

    URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
    feature_extractor = hub.KerasLayer(URL, input_shape=(IMAGE_RES,IMAGE_RES,3))

    feature_extractor.trainable = False

    model = tf.keras.Sequential([
        feature_extractor,
        layers.Dense(2, activation='softmax')
    ])

    model.summary()

    model.compile(
        optimizer='adam',
        loss=tf.losses.SparseCategoricalCrossentropy(),
        metrics=['accuracy'])

    EPOCHS = 2
    history = model.fit(train_batches,
                        epochs=EPOCHS,
                        validation_data=validation_batches) #From here I get the problem

Model Summary: (screenshot in the original issue)

Error:

InvalidArgumentError: Graph execution error:

Detected at node 'IteratorGetNext' defined at (most recent call last):
    File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel_launcher.py", line 16, in <module>
      app.launch_new_instance()
    File "/usr/local/lib/python3.8/dist-packages/traitlets/config/application.py", line 992, in launch_instance
      app.start()
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelapp.py", line 612, in start
      self.io_loop.start()
    File "/usr/local/lib/python3.8/dist-packages/tornado/platform/asyncio.py", line 149, in start
      self.asyncio_loop.run_forever()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
      self._run_once()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 1859, in _run_once
      handle._run()
    File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
      self._context.run(self._callback, *self._args)
    File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 690, in <lambda>
      lambda f: self._run_callback(functools.partial(callback, future))
    File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 743, in _run_callback
      ret = callback()
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 787, in inner
      self.run()
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 748, in run
      yielded = self.gen.send(value)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 365, in process_one
      yield gen.maybe_future(dispatch(*args))
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
      yield gen.maybe_future(handler(stream, idents, msg))
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 543, in execute_request
      self.do_execute(
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/ipkernel.py", line 306, in do_execute
      res = shell.run_cell(code, store_history=store_history, silent=silent)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/zmqshell.py", line 536, in run_cell
      return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2854, in run_cell
      result = self._run_cell(
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2881, in _run_cell
      return runner(coro)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
      coro.send(None)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3057, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
      if (await self.run_code(code, result,  async_=asy)):
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-11-ce74159d340b>", line 7, in <module>
      history = model.fit(train_batches,
    File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1039, in step_function
      data = next(iterator)
Node: 'IteratorGetNext'
2 root error(s) found.
  (0) INVALID_ARGUMENT:  Cannot batch tensors with different shapes in component 0. First element had shape [408,500,3] and element 1 had shape [360,343,3].
	 [[{{node IteratorGetNext}}]]
	 [[IteratorGetNext/_2]]
  (1) INVALID_ARGUMENT:  Cannot batch tensors with different shapes in component 0. First element had shape [408,500,3] and element 1 had shape [360,343,3].
	 [[{{node IteratorGetNext}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_24172]
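
For what it's worth, the traceback points at the input pipeline rather than the model: format_image assigns the resized tensor to images but returns the original, un-resized image, so the dataset still contains images of different shapes and batching fails. A minimal sketch of the likely fix (my assumption, not verified against the video being followed); separately, for a 20% validation split the second split entry should probably be 'train[80%:]' rather than 'train[20%:]':

    def format_image(image, label):
        # Resize to a fixed size and scale to [0, 1]; note true division, not //
        image = tf.image.resize(image, (IMAGE_RES, IMAGE_RES)) / 255.0
        return image, label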

Support for StableHLO generation with JAX backend

System information.

TensorFlow version (you are using): N/A
Are you willing to contribute it (Yes/No): Not immediately.

Describe the feature and the current behavior/state.

Some of the Keras Core examples use the JAX backend. IIUC this flow JIT-compiles via XLA, and I assume the lowering must generate StableHLO as an IR before it is consumed by XLA. If that is truly the case, is it viable to produce the StableHLO while using JAX as the backend? It would be useful for compiling the same model with IREE instead of XLA.

Existing flow: Keras Core model --> JAX.JIT (XLA)
Desired flow: Keras Core model --> JAX.JIT --> side output is StableHLO --> IREE
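
For context, JAX can already expose the StableHLO produced for a jitted function, so a side output along these lines seems feasible today. A rough sketch, assuming the Keras Core model is wrapped in a plain function (the IREE hand-off itself is out of scope here, and the input shape is just an example):

import jax
import jax.numpy as jnp

def forward(x):
    return model(x, training=False)  # `model` is a Keras Core model built with the JAX backend

sample = jnp.ones((1, 224, 224, 3), jnp.float32)
lowered = jax.jit(forward).lower(sample)
stablehlo_module = lowered.compiler_ir(dialect="stablehlo")  # MLIR module; str() gives the text
print(stablehlo_module)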

Will this change the current api? How?

I am not sure.

Who will benefit from this feature?

IREE users.

Contributing

  • Do you want to contribute a PR? (yes/no): no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Cannot deserialize Adam optimizer on M1/M2 mac

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Mac arm M2 Ventura 13.3
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.13.0
  • Python version: 3.11.4
  • Bazel version (if compiling from source):
  • GPU model and memory: Apple M2 Pro 16 GB
  • Exact command to reproduce:

Describe the problem.

Cannot deserialize a model compiled with the tf.keras.optimizers.Adam optimizer on M1/M2 Macs.

Describe the current behavior.

When creating a Keras model on an M1/M2 Mac, the following messages are displayed, indicating that the default optimizer tf.keras.optimizers.Adam runs slowly on M1/M2 Macs. Keras then "falls back" to the legacy optimizer tf.keras.optimizers.legacy.Adam.

WARNING:absl:At this time, the v2.11+ optimizer `tf.keras.optimizers.Adam` runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at `tf.keras.optimizers.legacy.Adam`.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., `tf.keras.optimizers.legacy.Adam`.

Then, when serializing and trying to deserialize the model the following error is raised:

self = <keras.src.optimizers.legacy.adam.Adam object at 0x29b503a90>
name = 'build'

    def __getattribute__(self, name):
        """Overridden to support hyperparameter access."""
        try:
>           return super().__getattribute__(name)
E           AttributeError: 'Adam' object has no attribute 'build'

This indicates that the legacy optimizer cannot be deserialized because it does not implement the build method. On all other platforms the model can be serialized and deserialized, because tf.keras.optimizers.Adam is used rather than the legacy version.

Describe the expected behavior.

The expected behaviour is that the model can be deserialized without raising an attribute error.

Standalone code to reproduce the issue.

import tensorflow as tf

import pickle


model = tf.keras.Model(inputs={"a": tf.keras.Input(shape=(10,))}, outputs={"b": tf.keras.Input(shape=(10,))})
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mse')

with open('test.pkl', 'wb') as f:
    pickle.dump(model, f)

with open('test.pkl', 'rb') as f:
    opt = pickle.load(f)

kernel dimensions in DepthwiseConv1D()

Hi,
I noticed that in the DepthwiseConv1D() documentation the definition of kernel_size says "An integer, specifying the height and width of the 1D convolution window". Since this is a 1D convolution window, isn't the height of the kernel always 1?

class_weight in .fit fails with InvalidArgumentError: Graph execution error:

A simple model throws "InvalidArgumentError: Graph execution error" when using the class_weight parameter; without this parameter, the model trains without any issues. I conducted multiple experiments; the error is reproducible both on a local PC and in Google Colab.
Keras version: '2.10.0'

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Flatten

# X_train_original = np.random.rand(1000, 100, 4)
X_train = np.random.rand(1000, 400)
y_train = np.random.randint(0, 2, size=(1000, 100))

model_test = Sequential([
    # Flatten(input_shape=(4, 100)),  # input_shape changed
    Dense(10, activation='relu'),
    Dense(100, activation='sigmoid')  # Output layer size stays the same
], name='model_test')

model_test.compile(loss='binary_crossentropy', metrics=['categorical_accuracy'])

model_test.fit(X_train, y_train, epochs=5, class_weight={0: 1., 1: 760.})  # - ERROR
# model_test.fit(X_train, y_train, epochs=5)  # OK

rnn with initial_state model can't be loaded with load_model

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

2.13.0

Custom code

Yes

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

A simple RNN model with an LSTMCell.
I want to initialize its states with initial_state_h and initial_state_c.

batch_size= 16
inputs = tf.keras.layers.Input(shape=(20,5),batch_size=batch_size)
units = 8
lstm_cell_fw = tf.keras.layers.LSTMCell(units)

initial_state_h = tf.random.normal(shape = (batch_size,units), mean=0., stddev=10., dtype=tf.dtypes.float32)
initial_state_c = tf.random.normal(shape = (batch_size,units), mean=0., stddev=10., dtype=tf.dtypes.float32)
lstm_layer_fw = tf.keras.layers.RNN(lstm_cell_fw, stateful=True, return_state=True, return_sequences=False)
outputs,states_h_fw, states_c_fw= lstm_layer_fw(inputs,initial_state = [initial_state_h,initial_state_c])

lstm_dense1 = tf.keras.layers.Dense(16, activation = 'relu')
lstm_dense2 = tf.keras.layers.Dense(2, activation = 'softmax')
out=lstm_dense2(lstm_dense1(outputs))

model = tf.keras.models.Model(inputs, out)

After compiling and training, the model is saved with model.save('my_model_test.keras').

model.compile(optimizer='adam', loss='categorical_crossentropy',metrics=['accuracy'])
model.summary()

xTrain = np.random.rand(96,20,5)
yTrain = np.random.rand(96,2)

for i in range(10):
  model.fit(xTrain, yTrain,batch_size=batch_size)

model.save('my_model_test.keras')

But when I try to load it with load_model = tf.keras.models.load_model('my_model_test.keras'), it gives an error:

13 frames
/usr/local/lib/python3.10/dist-packages/keras/src/backend.py in int_shape(x)
   1530     """
   1531     try:
-> 1532         shape = x.shape
   1533         if not isinstance(shape, tuple):
   1534             shape = tuple(shape.as_list())

AttributeError: 'float' object has no attribute 'shape'

I tried saving in other formats (.h5, .json, etc.); all give the same error.

But if I don't pass initial_state, i.e. outputs, states_h_fw, states_c_fw = lstm_layer_fw(inputs), everything works fine and there is no problem with load_model.
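
As a possible workaround (just a sketch based on that observation, not something I have verified end to end): build and save the graph without initial_state, and push the desired state values into the stateful layer through its reset_states() method, which accepts state values.

# Build the RNN call without initial_state so the model serializes cleanly
outputs, states_h_fw, states_c_fw = lstm_layer_fw(inputs)

# ... define, compile, train and save the model exactly as above ...

# After (re)loading, set the desired initial states on the stateful layer
loaded = tf.keras.models.load_model('my_model_test.keras')
loaded.get_layer(lstm_layer_fw.name).reset_states(
    [initial_state_h.numpy(), initial_state_c.numpy()])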

Standalone code to reproduce the issue

https://colab.research.google.com/gist/sushreebarsa/df202f7ea6ad3c85bdf4184cc8e1c9a1/rnn_save_model.ipynb

Relevant log output

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-8e0130abf25e> in <cell line: 1>()
----> 1 load_model = tf.keras.models.load_model('my_model_test.keras')

13 frames
/usr/local/lib/python3.10/dist-packages/keras/src/backend.py in int_shape(x)
   1530     """
   1531     try:
-> 1532         shape = x.shape
   1533         if not isinstance(shape, tuple):
   1534             shape = tuple(shape.as_list())

AttributeError: 'float' object has no attribute 'shape'

Missing methods to easily access reset_state and states within keras.Model

System information.

TensorFlow version (you are using): 2.11.0
Are you willing to contribute it (Yes/No) : yes

Describe the feature and the current behavior/state.

Stateful RNN layers have the method layer.reset_state(), and the states themselves can be fetched through layer.states. However, when I have a model consisting of many RNN layers mixed with other layers, it becomes tedious to loop through all layers and reset them manually. So the current state of the art is something like:

for l in model.layers:
	if hasattr(l, 'reset_state'):
		l.reset_state()

This becomes really cumbersome when you use bidirectional RNNs, because then you also need to check whether the layer has l.forward_layer and l.backward_layer and reset the states in those as well.

Therefore my proposal is to add reset_state, get_states and set_states to keras.Model. The last two work similarly to get_weights() and set_weights(). A possible implementation could be:

def reset_state(self):
    def _reset(l):
        if hasattr(l, 'reset_state'):
            l.reset_state()
        if hasattr(l, 'forward_layer'):
            _reset(l.forward_layer)
        if hasattr(l, 'backward_layer'):
            _reset(l.backward_layer)

    for l in self.layers:
        _reset(l)


def get_states(self):
    states = []

    def _collect(l):
        if hasattr(l, 'states'):
            states.extend(l.states)
        if hasattr(l, 'forward_layer'):
            _collect(l.forward_layer)
        if hasattr(l, 'backward_layer'):
            _collect(l.backward_layer)

    for l in self.layers:
        _collect(l)

    return states


def set_states(self, states):
    it = iter(states)

    def _assign(l):
        if hasattr(l, 'states'):
            for s in l.states:
                s.assign(next(it))
        if hasattr(l, 'forward_layer'):
            _assign(l.forward_layer)
        if hasattr(l, 'backward_layer'):
            _assign(l.backward_layer)

    for l in self.layers:
        _assign(l)
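
If added, usage could look like this (method names exactly as proposed above; val_ds stands in for whatever validation data is used):

snapshot = model.get_states()   # capture all RNN states before evaluation
model.reset_state()             # start evaluation from a clean state
model.evaluate(val_ds)
model.set_states(snapshot)      # restore the training-time states afterwards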

Will this change the current api? How?
Yes, it adds the methods reset_state, get_states and set_states to class keras.Model, so people don't need to loop through the Keras data structures.

Who will benefit from this feature?
All people that use stateful RNN layers ;)

Contributing

  • Do you want to contribute a PR? (yes/no): yes, if my employer is willing to sign the CLA
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing): see above

Wrong weight names after deserializing model from config

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 13.4.1
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v2.13.0-rc2-7-g1cb1a030a62 2.13.0
  • Python version: 3.11
  • GPU model and memory: no
  • Do you want to contribute a PR? (yes/no): no

Describe the problem.

Models with custom layers lose their weight name prefixes after being restored from config.

Describe the current behavior.

See the example below. When replacing BatchNormalization with a subclass of it (the only thing that changes is core vs. custom class), we get wrong weight names after restoring from config.

Describe the expected behavior.

Weight names of custom layers should be preserved just like built-in ones.

Standalone code to reproduce the issue.

import tensorflow as tf
from keras import layers, models
from keras.saving import register_keras_serializable
    

inputs = layers.Input(shape=[None, None, 3], dtype='float32')
x = layers.Conv2D(32, 3, padding='same', name='conv')(inputs)
x = layers.BatchNormalization(name='bn')(x)  # !!! <-- core class
model = models.Model(inputs=inputs, outputs=x)
print([w.name for w in model.weights])

model2 = models.Model.from_config(model.get_config())
print([w.name for w in model2.weights])

# ========= Failure case
@register_keras_serializable(package='MyPackage>Normalization')
class CustomBatchNormalization(layers.BatchNormalization):
    pass

inputs = layers.Input(shape=[None, None, 3], dtype='float32')
x = layers.Conv2D(32, 3, padding='same', name='conv')(inputs)
x = CustomBatchNormalization(name='cbn')(x)  # !!! <-- custom class
model = models.Model(inputs=inputs, outputs=x)
print([w.name for w in model.weights])

model2 = models.Model.from_config(model.get_config())
print([w.name for w in model2.weights])

Source code / logs.

Code posted above will print:

['conv/kernel:0', 'conv/bias:0', 'bn/gamma:0', 'bn/beta:0', 'bn/moving_mean:0', 'bn/moving_variance:0']
['conv/kernel:0', 'conv/bias:0', 'bn/gamma:0', 'bn/beta:0', 'bn/moving_mean:0', 'bn/moving_variance:0']
['conv/kernel:0', 'conv/bias:0', 'cbn/gamma:0', 'cbn/beta:0', 'cbn/moving_mean:0', 'cbn/moving_variance:0']
['conv/kernel:0', 'conv/bias:0', 'gamma:0', 'beta:0', 'moving_mean:0', 'moving_variance:0']

Documentation for GitHub Codespaces

Hi 👋

I am Samruddhi, a Software Engineer for GitHub, working with the GitHub Codespaces team, and a maintainer for the devcontainers org 👩‍💻

I was looking at the keras-team/keras repo which has a pretty cool devcontainer configuration setup ✨

GitHub Codespaces is a cloud-based development environment that uses your dev container to provide great contributor and maintainer experiences for all users of your project 🪄

🙅‍♀️ No one would ever have to worry about setting up their local machine, adding tools/runtimes for a specific project (which often conflicts with their current environment), or spending hours before even starting their actual work 🙅‍♀️

Would you be willing to accept improvements to your current devcontainer config and documentation to improve the GitHub Codespaces experience? 💁‍♀️

Thanks,
Samruddhi

// cc @craiglpeters

keras_nlp RuntimeError: Exception encountered when calling MultiSegmentPacker.call().


System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 2.13.0
  • Python version: 3.10.12
  • Bazel version (if compiling from source):
  • GPU model and memory: TPU
  • Exact command to reproduce:


Describe the problem.


import keras_nlp

preprocessor = keras_nlp.models.BertPreprocessor.from_preset(
    preset="bert_large_en",   # Name of the model
    sequence_length=200,      # Max sequence length, will be padded if shorter
)

outs = preprocessor(df.options.iloc[0])

Source code / logs.
RuntimeError Traceback (most recent call last)
in <cell line: 1>()
----> 1 outs = preprocessor(df.options.iloc[0]) # Process options for the first row
2
3 # Display the shape of each processed output
4 for k, v in outs.items():
5 print(k, ":", v.shape)

7 frames
/usr/local/lib/python3.10/dist-packages/tensorflow_text/python/ops/trimmer_ops.py in (.0)
345 )
346 return [
--> 347 ragged_tensor.RaggedTensor.from_row_splits(m, s)
348 for m, s in zip(o_values, o_splits)
349 ]
RuntimeError: Exception encountered when calling MultiSegmentPacker.call().

Arguments received by MultiSegmentPacker.call():
• inputs=[[['5979', '1104', '1103', '1378', '8477', '14702', '4856', '1103', '3772', '1104', '12556', '22293', '8102', '1811', '25082', '113', '150', '11414', '2137', '114', '1113', '1103', '4379', '107', '3764', '2927', '15136', '1596', '3367', '107', '6187', '1874', '10224', '3457', '1107', '15593', '13687', '136', '150', '11414', '2137', '1110', '170', '2749', '1115', '13822', '1103', '4379', '3764', '2927', '15136', '1596', '3367', '1107', '15593', '13687', '1118', '2112', '10164', '1103', '3796', '1104', '170', '1207', '1532', '1104', '2187', '1270', '107', '22520', '1843', '2187', '119', '107'], ['5979', '1104', '1103', '1378', '8477', '14702', '4856', '1103', '3772', '1104', '12556', '22293', '8102', '1811', '25082', '113', '150', '11414', '2137', '114', '1113', '1103', '4379', '107', '3764', '2927', '15136', '1596', '3367', '107', '6187', '1874', '10224', '3457', '1107', '15593', '13687', '136', '150', '11414', '2137', '1110', '170', '2749', '1115', '6986', '1103', '6187', '1874', '10224', '3457', '1206', '1103', '4379', '3764', '2927', '15136', '1596', '3367', '1107', '15593', '13687', '1105', '1103', '7140', '10537', '4267', '20623', '14971', '1121', '170', '5318', '1104', '1213', '1275', '1106', '170', '5318', '1104', '1164', '1406', '119'], ['5979', '1104', '1103', '1378', '8477', '14702', '4856', '1103', '3772', '1104', '12556', '22293', '8102', '1811', '25082', '113', '150', '11414', '2137', '114', '1113', '1103', '4379', '107', '3764', '2927', '15136', '15...

keras create/load model fails on Apple M2 CPU; same code works on Ryzen

tf_env.txt

You can obtain the TensorFlow version with:
python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
result:
v2.13.0-rc2-7-g1cb1a030a62 2.13.0

I can't load a Keras model that I just created. The issue is not specific to Adam; all of the optimizers have the same issue.

Describe the problem.

code:

  from tensorflow import keras
  
  
  optimizer = keras.optimizers.Adam()
  
  vh  = keras.Input(shape=(2,3), name = 'vh')
  v1  = keras.layers.Dense(512)(vh)
  
  output  = keras.layers.Dense(1, activation='softmax', name='prediction')(v1)
  
  model = keras.Model(inputs=vh, outputs=[output], name="antibody_model")
  
  model.compile(optimizer=optimizer )
  
  model.save('nn_model.keras')
  test = keras.models.load_model('nn_model.keras')

output:

The legacy Adam is missing the "build" method. This same code works on non-Mac platforms.

WARNING:absl:At this time, the v2.11+ optimizer tf.keras.optimizers.Adam runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at tf.keras.optimizers.legacy.Adam.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., tf.keras.optimizers.legacy.Adam.
WARNING:absl:At this time, the v2.11+ optimizer tf.keras.optimizers.Adam runs slowly on M1/M2 Macs, please use the legacy Keras optimizer instead, located at tf.keras.optimizers.legacy.Adam.
WARNING:absl:There is a known slowdown when using v2.11+ Keras optimizers on M1/M2 Macs. Falling back to the legacy Keras optimizer, i.e., tf.keras.optimizers.legacy.Adam.
Traceback (most recent call last):
  File "/Users/matthewclark/eclipse-workspace/antibody_classification/test/test.py", line 17, in <module>
    test = keras.models.load_model('nn_model.keras')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/saving/saving_api.py", line 230, in load_model
    return saving_lib.load_model(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/saving/saving_lib.py", line 275, in load_model
    raise e
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/saving/saving_lib.py", line 240, in load_model
    model = deserialize_keras_object(
            ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/saving/serialization_lib.py", line 710, in deserialize_keras_object
    instance.compile_from_config(compile_config)
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/engine/training.py", line 3582, in compile_from_config
    self.optimizer.build(self.trainable_variables)
    ^^^^^^^^^^^^^^^^^^^^
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/optimizers/legacy/optimizer_v2.py", line 997, in __getattribute__
    raise e
  File "/Users/Shared/anaconda3/envs/ability/lib/python3.11/site-packages/keras/src/optimizers/legacy/optimizer_v2.py", line 987, in __getattribute__
    return super().__getattribute__(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Adam' object has no attribute 'build'
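
Since the traceback shows the failure happens inside compile_from_config(), one possible workaround (a sketch only, not verified beyond the obvious) is to load without compiling and re-compile manually:

test = keras.models.load_model('nn_model.keras', compile=False)
test.compile(optimizer=keras.optimizers.legacy.Adam())  # or whatever optimizer works locally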

Feature: Flops calculation

Reopening from

Currently, the adopted community solution uses the TF profiler:

import tensorflow as tf
from tensorflow.python.profiler import model_analyzer, option_builder

model = tf.keras.applications.Xception(
    weights='imagenet',
    input_shape=(150, 150, 3),
    include_top=False
) 

input_signature = [
    tf.TensorSpec(
        shape=(1, *params.shape[1:]), 
        dtype=params.dtype, 
        name=params.name
    ) for params in model.inputs
]
forward_graph = tf.function(model, input_signature).get_concrete_function().graph
options = option_builder.ProfileOptionBuilder.float_operation()
graph_info = model_analyzer.profile(forward_graph, options=options)
flops = graph_info.total_float_ops // 2
flops # 1925897756 

And using official facebookresearch,

New optimizers fail to load CUDA installed through conda

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04 (WSL)
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.11
  • Python version: 3.9
  • Bazel version (if compiling from source): N/A
  • GPU model and memory: RTX 2080 Ti
  • Exact command to reproduce:
  1. Create a new environment, following the official installation instructions from here https://www.tensorflow.org/install/pip#linux:
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/
pip install tensorflow
  2. Run the beginner MNIST tutorial (or any other tutorial that calls fit) from here https://keras.io/examples/vision/mnist_convnet/

Describe the problem.

An error is raised:

libdevice not found at ./libdevice.10.bc

Note that if you switch to using the legacy optimizers, by switching this line

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

to this

model.compile(loss="categorical_crossentropy", optimizer=keras.optimizers.legacy.Adam(), metrics=["accuracy"])

then the example runs successfully.

Describe the current behavior.

An error occurs when running the example.

Describe the expected behavior.

The example should run without error, as it does when using the legacy optimizers.

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.

https://keras.io/examples/vision/mnist_convnet/

Source code / logs.

Full stack trace of the error:

    File ".../tmp.py", line 47, in <module>
      model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/engine/training.py", line 1650, in fit
      tmp_logs = self.train_function(iterator)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/engine/training.py", line 1249, in train_function
      return step_function(self, iterator)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/engine/training.py", line 1233, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/engine/training.py", line 1222, in run_step
      outputs = model.train_step(data)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/engine/training.py", line 1027, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
      self.apply_gradients(grads_and_vars)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
      return super().apply_gradients(grads_and_vars, name=name)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
      iteration = self._internal_apply_gradients(grads_and_vars)
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
      return tf.__internal__.distribute.interim.maybe_merge_call(
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
      distribution.extended.update(
    File "/home/drasmuss/mambaforge/envs/tmp2/lib/python3.9/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
      return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_4'
libdevice not found at ./libdevice.10.bc
         [[{{node StatefulPartitionedCall_4}}]] [Op:__inference_train_function_1026]
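
A commonly suggested workaround for this libdevice error (an assumption on my part, not verified in this exact conda setup) is to point XLA at a directory that actually contains libdevice.10.bc before TensorFlow is imported:

import os

# Path is an assumption; adjust to wherever libdevice.10.bc actually lives in the env.
os.environ["XLA_FLAGS"] = f"--xla_gpu_cuda_data_dir={os.environ['CONDA_PREFIX']}"

import tensorflow as tf  # must come after XLA_FLAGS is set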

Likely related to:

Histograms fail on multiple fits

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux / Colab
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.11 and 2.12
  • Python version: 3.10
  • Bazel version (if compiling from source):
  • GPU model and memory: n/a
  • Exact command to reproduce:

Describe the problem.

tf.summary.histogram is useful for understanding how a model trains, and it works with Keras. However, it fails when .fit() is called more than once. Calling .fit() more than once is often useful for fine-tuning or for running multiple evaluations per fitting.

Describe the current behavior.
When running the following pseudo code, the last command fails:

model = ...
model.compile(...)
model.fit(...) # success
model.fit(...) # failure

The following error is produced:

Node: 'model/histogram/foo/write_summary'
Resource localhost/_AnonymousVar17/N10tensorflow22SummaryWriterInterfaceE does not exist.
	 [[{{node model/histogram/foo/write_summary}}]] [Op:__inference_train_function_1429]

Describe the expected behavior.
The model should fit multiple times.

Contributing.

  • Do you want to contribute a PR? (yes/no): no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

See https://colab.research.google.com/drive/1xiEZkpu0Ojc7Mw7FF2wq7rRqHtWk7a03?usp=sharing

Source code / logs.

It appears that the summary writer resource is recycled, but the summary write op doesn't grab a new summary writer.

Although I haven't been able to find a workaround for our larger model, in this reduced use case recompiling the model allows for additional fits:

model = ...
model.compile(...)
model.fit(...) # success
model.compile(...)
model.fit(...) # success

Again, this works in the reduced use case, but not on larger models; I haven't been able to reproduce the larger-model failure in a reduced use case.

Further, I have tried two approaches for the histogram, a Lambda layer as well as a regular layer. Both suffer from the same failure.

Serializing a `tf.keras.Model` in half-precision

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version (use command below): 2.11.0
  • Python version: 3.8.16
  • Bazel version (if compiling from source):
  • GPU model and memory:
  • Exact command to reproduce:

Describe the problem.

After setting the precision policy to mixed_float16, the serialized model is not in half precision.

Is this expected? For larger models (Stable Diffusion, for example), allowing users to serialize models in half-precision can be quite beneficial.

Describe the current behavior.

The model gets serialized in full precision.

Describe the expected behavior.

Users should have a way to serialize a model in half-precision.
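
In the meantime, a manual sketch of the kind of thing I mean (assuming only the weights need to be stored in half precision; save_weights_fp16 is a hypothetical helper, not a Keras API):

import numpy as np
import tensorflow as tf

def save_weights_fp16(model, path):
    # Dump every weight as a float16 NumPy array, keyed by a sanitized weight name.
    arrays = {w.name.replace('/', '_').replace(':', '_'): w.numpy().astype(np.float16)
              for w in model.weights}
    np.savez(path, **arrays)

model = tf.keras.applications.MobileNetV2()
save_weights_fp16(model, 'mobilenet_v2_fp16.npz')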

Contributing.

  • Do you want to contribute a PR? (yes/no): no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.

Colab Gist: https://colab.research.google.com/gist/sayakpaul/c7abb6f8057ba21857985de5e26a22d8/scratchpad.ipynb

Support for `MelSpectrogram` layer

System information.

TensorFlow version (you are using): 2.11.0
Are you willing to contribute it (Yes/No) : Yes

Describe the feature and the current behavior/state.

A MelSpectrogram layer would allow users to easily extract spectrogram features from raw audio data, which is a crucial step for many audio-related tasks. Currently, users have to perform this feature extraction as a separate data processing task on the CPU with custom functions, which can be quite time-consuming and creates a bottleneck in the overall training pipeline. By providing a MelSpectrogram layer that can compute spectrograms directly from raw audio on GPU/TPU/CPU within a model, TensorFlow Keras users would be able to significantly speed up their audio-related workflows and more effectively utilize available computing resources.

It is worth noting that other popular audio-focused deep learning libraries, such as nnaudio and torchaudio in PyTorch, already provide this feature. Thus, I believe it would be a valuable addition to TensorFlow Keras, particularly for users working with audio data.

Will this change the current api? How?: N/A

Who will benefit from this feature?
If implemented, this layer would be a powerful tool for all the researchers and practitioners in the audio domain, enabling them to extract and use audio spectrogram features seamlessly within their TensorFlow Keras models. It will certainly reduce the preprocessing bottleneck described above.

Contributing

  • Do you want to contribute a PR? (yes/no): Yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

In a nutshell, the code will look like the following, which uses tf.signal for the transformation:

class MelSpectrogram(tf.keras.layers.Layer):
    def __init__(self, nfft=2048, window=2048, stride=512, rate=16000, mels=128, 
                 fmin=20, fmax=8000, top_db=80, **kwargs):
        super(MelSpectrogram, self).__init__(**kwargs)
        self.nfft = nfft
        self.window = window
        self.stride = stride
        self.rate = rate
        self.mels = mels
        self.fmin = fmin
        self.fmax = fmax
        self.top_db = top_db
        
    def call(self, input):
        spec = self.spectrogram(input) # audio to sftt spectrogram
        spec = self.melscale(spec) # to_melscale
        spec = self.dbscale(spec) # log scale
        spec = tf.linalg.matrix_transpose(spec) # [time, mel] to [mel, time]
        return spec

Note: the above code is compatible with both single and batched inputs, so it can be used as a layer within a model very efficiently.
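
The spectrogram, melscale and dbscale helpers referenced in call() are not shown above; a rough sketch of how they could look using tf.signal (parameter wiring follows the constructor, the exact numerics are an assumption):

    def spectrogram(self, audio):
        # Magnitude STFT: [..., time] -> [..., frames, nfft // 2 + 1]
        stft = tf.signal.stft(audio,
                              frame_length=self.window,
                              frame_step=self.stride,
                              fft_length=self.nfft)
        return tf.abs(stft)

    def melscale(self, spec):
        # Project linear-frequency bins onto self.mels mel bands.
        mel_matrix = tf.signal.linear_to_mel_weight_matrix(
            num_mel_bins=self.mels,
            num_spectrogram_bins=self.nfft // 2 + 1,
            sample_rate=self.rate,
            lower_edge_hertz=self.fmin,
            upper_edge_hertz=self.fmax)
        return tf.matmul(spec, mel_matrix)

    def dbscale(self, spec):
        # Convert to decibels and clip the dynamic range to self.top_db.
        log_spec = 10.0 * (tf.math.log(tf.maximum(spec, 1e-10)) / tf.math.log(10.0))
        return tf.maximum(log_spec, tf.reduce_max(log_spec) - self.top_db)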

Serialization/deserialization of h5 model failed with custom initializer (TF 2.13)

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.13.0
  • Python version: 3.10

Describe the problem.
When saving and loading a model in h5 format with a custom kernel initializer, loading fails with TypeError: Error when deserializing class. Exception encountered: Unknown initializer. The same code works correctly in TF 2.12.

The same code with the model saved in .keras format works well in both TF 2.12 and 2.13. I didn't check with other types of custom objects, e.g. Constraint or Regularizer.

Describe the current behavior.
Loading a h5 model containing a custom kernel initializer fails with the following error:

TypeError: Error when deserializing class 'Dense' using config=<config dict here>
Exception encountered: Unknown initializer: <initializer name>

Note that an h5 model saved with TF 2.12 is correctly loaded in TF 2.13. However, a model saved in TF 2.13 cannot be loaded in either TF 2.12 or TF 2.13. The serialization in TF 2.13 might be the root of the problem; the deserialization failure is only a consequence.

Describe the expected behavior.
I expect that the saved model in h5 format can be loaded correctly even with a custom initializer registered in Keras serializable objects.

Contributing

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.

The following standalone code fails in TF 2.13 but works in previous TF versions.
The custom initializer code is taken from the Keras documentation.

import tensorflow as tf

@tf.keras.saving.register_keras_serializable()
class ExampleRandomNormal(tf.keras.initializers.Initializer):
    def __init__(self, mean=0.0, stddev=1.0):
        self.mean = mean
        self.stddev = stddev

    def __call__(self, shape, dtype=None, **kwargs):
        return tf.random.normal(shape, mean=self.mean, stddev=self.stddev, dtype=dtype)

    def get_config(self):  # To support serialization
        return {"mean": self.mean, "stddev": self.stddev}


# Create model with custom kernel initializer
lay = tf.keras.layers.Dense(
    10, kernel_initializer=ExampleRandomNormal(), input_shape=(32,)
)
model = tf.keras.Sequential([lay])

# Save model to h5 format
model_path = "model.h5"  # with "model.keras", it works
model.save(model_path)

# Load h5 model => fail "TypeError: Error when deserializing class Dense using config"
# Exception encountered: Unknown initializer
model2 = tf.keras.models.load_model(model_path)

tf.keras.utils.get_file option 'cache_dir' is not properly set for values that do not expand from the user directory

Problem:
The tf.keras.utils.get_file option 'cache_dir' is not properly set for values that do not expand from the user directory.
For instance, on Windows the following input falls back to '/tmp/.keras':

"D:/myCustomCacheDir"
e.g.:
path_to_downloaded_file = tf.keras.utils.get_file( origin="https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz", cache_dir=R"D:/myCustomCacheDir", untar=True)

looking in https://github.com/keras-team/keras/blob/v2.11.0/keras/utils/data_utils.py#L227

we see that in all cases (user input or the default ~/.keras), os.path.expanduser(cache_dir) is called.
This causes inputs such as Windows' "D:/myCustomCacheDir" to not be set up as documented.

if cache_dir is None:
    cache_dir = os.path.join(os.path.expanduser("~"), ".keras")
if md5_hash is not None and file_hash is None:
    file_hash = md5_hash
    hash_algorithm = "md5"
datadir_base = os.path.expanduser(cache_dir)
if not os.access(datadir_base, os.W_OK):
    datadir_base = os.path.join("/tmp", ".keras")
datadir = os.path.join(datadir_base, cache_subdir)
_makedirs_exist_ok(datadir)

TF-version: 2.10 (CPU with directml-plugin)
python-version: 3.10
Windows11

thanks!

`build_from_config()` gets a dictionary with wrong types when loading a `keras_v3` model

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Fedora 38 x86_64
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.13
  • Python version: 3.11

Describe the problem.

When loading back a custom tf.keras.Model saved in the keras_v3 format, its build() method receives an input_shape argument of type list rather than tf.TensorShape, as stated in the documentation.

This is a problem whenever the build() method uses attributes of the tf.TensorShape object, such as rank, because an exception is raised and the model cannot be loaded.

Describe the current behavior.

The following line calls build_from_config() on the custom model, passing a build_config dictionary whose input_shape key has a value of type list.

Describe the expected behavior.

build_from_config() receives a dictionary with proper types.

Contributing.

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.

import tensorflow as tf

# Define a custom model
@tf.keras.saving.register_keras_serializable()
class CustomModel(tf.keras.models.Model):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def build(self, input_shape):
        assert isinstance(input_shape, tf.TensorShape)

    def call(self, *args, **kwargs):
        return tf.random.uniform([1], maxval=5)

# Instantiate and build it
x = tf.random.uniform([1, 10, 10])
model = CustomModel()
model(x)

# Save the model in the keras_v3 format
model.save("./model_v3.keras", save_format="keras_v3")

# Try to load it back
model = tf.keras.models.load_model("./model_v3.keras") # Raises AssertionError

Source code / logs.

AssertionError                            Traceback (most recent call last)
Cell In[36], line 1
----> 1 model = tf.keras.models.load_model("./model_v3.keras")
      2 model(x)

File ~/redacted/temp-venv/lib64/python3.11/site-packages/keras/src/saving/saving_api.py:230, in load_model(filepath, custom_objects, compile, safe_mode, **kwargs)
    225     if kwargs:
    226         raise ValueError(
    227             "The following argument(s) are not supported "
    228             f"with the native Keras format: {list(kwargs.keys())}"
    229         )
--> 230     return saving_lib.load_model(
    231         filepath,
    232         custom_objects=custom_objects,
    233         compile=compile,
    234         safe_mode=safe_mode,
    235     )
    237 # Legacy case.
    238 return legacy_sm_saving_lib.load_model(
    239     filepath, custom_objects=custom_objects, compile=compile, **kwargs
    240 )

File ~/redacted/temp-venv/lib64/python3.11/site-packages/keras/src/saving/saving_lib.py:275, in load_model(filepath, custom_objects, compile, safe_mode)
    272             asset_store.close()
    274 except Exception as e:
--> 275     raise e
    276 else:
    277     return model

File ~/redacted/temp-venv/lib64/python3.11/site-packages/keras/src/saving/saving_lib.py:240, in load_model(filepath, custom_objects, compile, safe_mode)
    238 # Construct the model from the configuration file in the archive.
    239 with ObjectSharingScope():
--> 240     model = deserialize_keras_object(
    241         config_dict, custom_objects, safe_mode=safe_mode
    242     )
    244 all_filenames = zf.namelist()
    245 if _VARS_FNAME + ".h5" in all_filenames:

File ~/redacted/temp-venv/lib64/python3.11/site-packages/keras/src/saving/serialization_lib.py:707, in deserialize_keras_object(config, custom_objects, safe_mode, **kwargs)
    705 build_config = config.get("build_config", None)
    706 if build_config:
--> 707     instance.build_from_config(build_config)
    708 compile_config = config.get("compile_config", None)
    709 if compile_config:

File ~/redacted/temp-venv/lib64/python3.11/site-packages/keras/src/engine/base_layer.py:2341, in Layer.build_from_config(self, config)
   2339 input_shape = config["input_shape"]
   2340 if input_shape is not None:
-> 2341     self.build(input_shape)

Cell In[33], line 9, in CustomModel.build(self, input_shape)
      8 def build(self, input_shape):
----> 9     assert isinstance(input_shape, tf.TensorShape)

Workaround

Do not use any of the tf.TensorShape attributes and methods, treating its instances as lists (e.g. using len(input_shape) instead of input_shape.rank).
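
Alternatively, one line at the top of build() restores the documented type (a sketch, mirroring the CustomModel above):

def build(self, input_shape):
    input_shape = tf.TensorShape(input_shape)  # normalize list -> TensorShape
    assert isinstance(input_shape, tf.TensorShape)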

Keras functional API cannot save the right weights in h5 files


System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras):Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): ubuntu 20.04 LST
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.11.0
  • Python version: 3.9.0
  • Bazel version (if compiling from source):
  • GPU model and memory: RTX 3090, 24GB memory.
  • Exact command to reproduce:

I used the functional API to build a ResNet model. However, this model does not save its weights correctly: the validation results during training are very good, but the inference results after reloading the saved weights are very bad.

When I use tf.config.run_functions_eagerly(True) during training, the inference results are very good; otherwise they are very bad. To tackle this problem I looked at sample code from keras.applications, but that did not help with saving the model weights.

The implemented ResNet code is shown below:

Describe the problem.

import tensorflow.compat.v2 as tf
from keras.regularizers import l2
from keras import layers
from keras.engine import sequential
from keras.engine import training as training_lib
import keras as K

identitys=None


def Bottleneck(inputs, out_channel, name, downsample, strides=1):
    expansion = 4
    key = out_channel * expansion
    identity = inputs
    global  identitys

    if downsample:
        identitys = layers.Conv2D(key, kernel_size=1, strides=strides,
                                  use_bias=False, kernel_initializer='he_normal',
                                  padding="SAME", kernel_regularizer=l2(1.e-5), name=name + "ds_conv")(identity)
        identitys = layers.BatchNormalization(momentum=0.9, epsilon=1e-5, name=name + "ds_normal")(identitys)
    else:
        identitys = inputs

    xb = layers.Conv2D(out_channel, kernel_size=1, use_bias=False, kernel_initializer='he_normal',
                       kernel_regularizer=l2(1.e-4), name=name + "Conv2D_1")(inputs)
    xb = layers.BatchNormalization(momentum=0.9,
                                   epsilon=1e-5, name=name + "BN_1")(xb)
    xb = layers.Activation(tf.keras.activations.swish, name=name + "ACT_1")(xb)

    xb = layers.Conv2D(out_channel, kernel_size=3, use_bias=False, strides=strides, padding="SAME",
                       kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4), name=name + "Conv2D_2")(xb)
    xb = layers.BatchNormalization(momentum=0.9,
                                   epsilon=1e-5, name=name + "BN_3")(xb)
    xb = layers.ReLU(name=name + "ReLU")(xb)

    xb = layers.Conv2D(key, kernel_size=1, use_bias=False,
                       kernel_initializer='he_normal',
                       kernel_regularizer=l2(1.e-4),
                       name=name + "Conv2D_3")(xb)

    xb = layers.BatchNormalization(momentum=0.9,
                                   epsilon=1e-5,
                                   name=name + "BN_4")(xb)

    xb = layers.Add(name=name + "addition")([identitys, xb])
    xb = layers.BatchNormalization(momentum=0.9,
                                   epsilon=1e-5,
                                   name=name + "Last_BN")(xb)
    xb = layers.ReLU(name=name + "LastReLU")(xb)

    return xb


def _make_layer(inputs, make_block, channel, block_num, layer_name, strides, down_sample):
    i = 0
    name = layer_name + f"block_{i + 1}_"

    xm = make_block(inputs=inputs, out_channel=channel, name=name, strides=strides, downsample=down_sample)

    for i in range(1, block_num):
        i += 1
        name = layer_name + f"block_{i}_"
        xm = make_block(inputs=xm, out_channel=channel, name=name, strides=1, downsample=False)

    return xm


class ResnetBuilder(object):
    @staticmethod
    def build(block, blocks_num, im_width=224, im_height=224, num_classes=1000):
        img_input = layers.Input(shape=(im_width, im_height, 3),
                                 dtype="float32",
                                 name="layers_inputs")

        x = layers.Conv2D(filters=64, kernel_size=7, strides=2,
                          padding="SAME", use_bias=False,
                          name="layers_conv1")(img_input)  # 把这一行替换成ContourOperator

        x = layers.BatchNormalization(momentum=0.9, epsilon=1e-5, name="x_FBN")(x)
        x = layers.ReLU(name="FReLU")(x)

        x = layers.DepthwiseConv2D(kernel_size=3, padding="SAME", use_bias=False,
                                   depthwise_initializer=tf.keras.initializers.TruncatedNormal(mean=0.0,
                                                                                               stddev=0.05, seed=None)
                                   , kernel_regularizer=l2(1.e-4), name="FDW")(x)

        x = layers.BatchNormalization(momentum=0.9, epsilon=1e-5, name="FBN")(x)

        x = layers.Activation(tf.keras.activations.swish, name="FirstACT")(x)

        x = layers.MaxPool2D(pool_size=3, strides=2, padding="SAME", name="f_MP")(x)

        x = _make_layer(x, block, 64, block_num=blocks_num[0], layer_name="ml1", strides=1, down_sample=True)

        x = _make_layer(x, block, 128, block_num=blocks_num[1], layer_name="ml2", strides=2, down_sample=True)

        x = _make_layer(x, block, 256, block_num=blocks_num[2], layer_name="ml3",  strides=2, down_sample=True)

        x = _make_layer(x, block, 512, block_num=blocks_num[3], layer_name="ml4",  strides=2, down_sample=True)

        x = layers.GlobalAvgPool2D(name="GAP2D")(x)  # pool + flatten

        x = layers.Dense(num_classes, name="logits")(x)

        predict = layers.Softmax(name="SoftMax")(x)

        model = training_lib.Model(inputs=img_input,
                                   outputs=predict, name="model")


        return model

    @staticmethod
    def resnet101(im_width=448, im_height=448,
                  include_top=True, num_classes=5):
        return ResnetBuilder.build(Bottleneck, [3, 4, 23, 3],
                                   im_width, im_height, num_classes)

    @staticmethod
    def resnet50(im_width=448, im_height=448,
                 include_top=True,
                 num_classes=5, **kwargs):
        return ResnetBuilder.build(Bottleneck, [3, 4, 6, 3],
                                   im_width, im_height, num_classes)


Describe the current behavior.

The inference result does not match the validation result.
The inference code is shown below:

from keras import Model
from keras.utils import image_utils
import tensorflow as tf
import numpy as np
import os
from test_code import ResnetBuilder

gpus = tf.config.experimental.list_physical_devices(device_type='GPU')
tf.config.experimental.set_visible_devices(devices=gpus[1], device_type='GPU')


def preprocess_image(img_path, target_size=(448, 448)):
    """Preprocess the image by reshape and normalization.

    Args:
        img_path:  A string.
        target_size: A tuple, reshape to this size.
    Return:
        An image ndarray.
    """
    img = image_utils.load_img(img_path, target_size=target_size)
    img = image_utils.img_to_array(img)
    img /= 255.0

    return img


def load_trained_model():
    model = ResnetBuilder.resnet50(448, 448, 5)
    model_name = r"./model.30-.h5"
    model.load_weights(model_name, by_name=True)
    print('model load success.')
    return model


def get_category_name(full_image_path, model):
    img = preprocess_image(full_image_path)
    img_tensor = np.expand_dims(img, axis=0)

    heatmap_model = Model([model.inputs], [model.output])

    predictions = heatmap_model(img_tensor)
    category_id = np.argmax(predictions[0])
    label_name = ['A1', 'A2', 'A3', "A4", "A5"]
    category_name = label_name[category_id]

    return category_name


model = load_trained_model()
model.summary()

image_folder = r"[Image_Path]"
save_path = r"[Save_Path]"

name_list = os.listdir(image_folder)
for file_name in name_list:
    full_image_name = image_folder + "/" + file_name
    category_name = get_category_name(full_image_name, model)
    save_name = category_name + "_" + file_name # just print result, not save image with a new name.
    print(save_name)

Describe the expected behavior.

The model should save the correct weights to the .h5 file, and inference from the saved weights should produce the correct results.

Standalone code to reproduce the issue.

Provide a reproducible test case that is the bare minimum necessary to generate
the problem. If possible, please share a link to Colab/Jupyter/any notebook.

Sorry, it is inconvenient to provide one.
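
As a first sanity check (a sketch, not a confirmed fix), it can help to rule out the by_name weight matching and compare predictions immediately before saving and after reloading; here model is assumed to be the trained ResNet and sample a small batch of preprocessed images:

import numpy as np

before = model.predict(sample)
model.save_weights("weights_check.h5")

fresh = ResnetBuilder.resnet50(448, 448, num_classes=5)
fresh.load_weights("weights_check.h5")       # no by_name=True
after = fresh.predict(sample)
print(np.max(np.abs(before - after)))        # should be ~0 if the weights round-trip correctly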

The model does not save and load correctly when it contains a `tf.keras.layers.experimental.preprocessing.StringLookup` layer

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v2.14.0-rc0-34-gdd01672d9a9 2.14.0-rc1
  • Python version:
  • Bazel version (if compiling from source):
  • GPU model and memory:
  • Exact command to reproduce:

Describe the problem.

The model does not save and load correctly when it contains a tf.keras.layers.experimental.preprocessing.StringLookup layer.
It seems that the vocabulary is not saved or loaded correctly: it is empty when loading the model.
I manually checked the saved model files and found that:
The vocabulary is saved as a constant value in saved_model.pb and all layer arguments are dumped in keras_metadata.pb, but it is not saved in variables/variables.data-00000-of-00001 or variables/variables.index (I cannot find the strings aaaa or bbbb in these two files).
If the saved model files are correct, there must be something wrong when loading the model.

I've reported this issue in tensorflow/tensorflow#61779, but the contributor suggested reporting it here.
This behavior may also relate to tensorflow/tensorflow#61369, but in a different API endpoint.

Describe the current behavior.

Loading the saved model throws an error, which is caused by the empty vocabulary argument of the StringLookup layer.

Describe the expected behavior.

Saving and loading the model should work correctly for the StringLookup layer.

Standalone code to reproduce the issue.

import pickle
import tensorflow as tf
print(tf.version.GIT_VERSION, tf.version.VERSION, flush=True)

model_input = tf.keras.Input(shape=(1,), dtype=tf.int64)
lookup = tf.keras.layers.experimental.preprocessing.StringLookup(vocabulary=['aaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbb'])(model_input)
output = tf.keras.layers.Dense(10)(lookup)
full_model = tf.keras.Model(model_input, output)

# this part works
try:
    print(full_model.layers[1].get_config())
    model_bytes = pickle.dumps(full_model)
    model_recovered = pickle.loads(model_bytes)
except Exception as e:
    print("Failed! Error:", e, flush=True)
else:
    print("Success!", flush=True)

# this part throws an error
try:
    full_model.save("/tmp/temp_model")
    full_model_loaded = tf.keras.models.load_model("/tmp/temp_model")
    print(full_model_loaded.layers[1].get_config())
    model_bytes = pickle.dumps(full_model_loaded)
    model_recovered = pickle.loads(model_bytes)
except Exception as e:
    print("Failed! Error:", e, flush=True)
else:
    print("Success!", flush=True)

Source code / logs.

v2.14.0-rc0-34-gdd01672d9a9 2.14.0-rc1
{'name': 'string_lookup', 'trainable': True, 'dtype': 'int64', 'invert': False, 'max_tokens': None, 'num_oov_indices': 1, 'oov_token': '[UNK]', 'mask_token': None, 'output_mode': 'int', 'sparse': False, 'pad_to_max_tokens': False, 'idf_weights': None, 'vocabulary': ListWrapper(['aaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbb']), 'vocabulary_size': 3, 'encoding': 'utf-8'}
Success!
{'name': 'string_lookup', 'trainable': True, 'dtype': 'int64', 'invert': False, 'max_tokens': None, 'num_oov_indices': 1, 'oov_token': '[UNK]', 'mask_token': None, 'output_mode': 'int', 'sparse': False, 'pad_to_max_tokens': False, 'idf_weights': None, 'vocabulary': ListWrapper([]), 'vocabulary_size': 3, 'encoding': 'utf-8'}
Failed! Error: Error when deserializing class 'StringLookup' using config={'name': 'string_lookup', 'trainable': True, 'dtype': 'int64', 'invert': False, 'max_tokens': None, 'num_oov_indices': 1, 'oov_token': '[UNK]', 'mask_token': None, 'output_mode': 'int', 'sparse': False, 'pad_to_max_tokens': False, 'idf_weights': None, 'vocabulary': [], 'vocabulary_size': 3, 'encoding': 'utf-8'}.

Exception encountered: Cannot set an empty vocabulary, you passed [].

Data Generator extending keras.utils.Sequence uses index=0 twice at the very first iteration

I am using tensorflow 2.11.

Having a Data Generator like so:

class DataGenerator(keras.utils.Sequence):

    def __init__(self, ...):
        super().__init__()

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        # Generate data
        X, y = next(self.looper)
        print(index, len(X))
        return X, y

prints:

0 6
Epoch 1/10
0 6
1 6
3 6
4 6
5 6
6 6
1/8 [==>...........................] - ETA: 11s - loss: 0.7264 6

Thus the generator of only 8 items runs into a StopIteration exception. Therefore we need to use np.floor(size/number_of_batches) in __len__, which effectively skips the last batch (if it is smaller than batch_size). This is why it should actually be np.ceil().
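
For reference, a sketch of an index-based generator (hypothetical names) that sidesteps both problems: __getitem__ uses the index directly, so the extra call with index=0 is harmless, and __len__ can safely use ceil so the last, smaller batch is still served:

import math
from tensorflow import keras

class ArrayDataGenerator(keras.utils.Sequence):
    def __init__(self, x, y, batch_size=6):
        super().__init__()
        self.x, self.y = x, y
        self.batch_size = batch_size

    def __len__(self):
        # ceil keeps the final partial batch instead of dropping it
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, index):
        # slicing by index means a repeated __getitem__(0) probe is harmless
        sl = slice(index * self.batch_size, (index + 1) * self.batch_size)
        return self.x[sl], self.y[sl]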

Unclear documentation for `use_multiprocessing`

The documentation is too broad for the use_multiprocessing argument for e.g. Model.predict.

What is being parallelized? Is it only data loading? Let's say we're using GPU for inference. Does this argument control whether data is loaded in multiple processes while the GPU is being used for inference? Or does this mean that data is loaded in multiple processes and inference is being run in multiple processes? Or does this mean that data is loaded in a single process but inference is being run in multiple processes? Or does this mean something similar to tf.config.threading.set_intra_op_parallelism_threads or tf.config.threading.set_inter_op_parallelism_threads in which the model execution is parallelized?

Please clarify what use_multiprocessing is parallelizing.

Unable to deserialize Sequential model from config

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 13.4.1
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v2.13.0-rc2-7-g1cb1a030a62 2.13.0
  • Python version: 3.11
  • GPU model and memory: no gpu
  • Exact command to reproduce: see below
  • Do you want to contribute a PR? (yes/no): no

Describe the problem.

Unable to deserialize Sequential model from config.

Describe the current behavior.

Got exception during deserialization.

Describe the expected behavior.

Since the Sequential model is a subtype of the regular Model, it should be serializable and deserializable.

Standalone code to reproduce the issue.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, 3, padding='same', name='conv'),
    layers.BatchNormalization(name='bn'),   
])
_ = model(tf.zeros([1, 16, 16, 3]))  # aka "build"

model2 = models.Model.from_config(model.get_config())

Source code / logs.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/.pyenv/versions/3.11.2/lib/python3.11/site-packages/keras/src/engine/training.py:3244, in Model.from_config(cls, config, custom_objects)
   3243 try:
-> 3244     model = cls(**config)
   3245 except TypeError as e:

File ~/.pyenv/versions/3.11.2/lib/python3.11/site-packages/tensorflow/python/trackable/base.py:204, in no_automatic_dependency_tracking.<locals>._method_wrapper(self, *args, **kwargs)
    203 try:
--> 204   result = method(self, *args, **kwargs)
    205 finally:

File ~/.pyenv/versions/3.11.2/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:

File ~/.pyenv/versions/3.11.2/lib/python3.11/site-packages/keras/src/utils/generic_utils.py:514, in validate_kwargs(kwargs, allowed_kwargs, error_message)
    513 if kwarg not in allowed_kwargs:
--> 514     raise TypeError(error_message, kwarg)

TypeError: ('Keyword argument not understood:', 'layers')

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
Cell In[3], line 13
      9 _ = model(tf.zeros([1, 16, 16, 3]))
     11 # print([w.name for w in model.weights])
---> 13 model2 = models.Model.from_config(model.get_config())
     14 # [w.name for w in model2.weights]

File ~/.pyenv/versions/3.11.2/lib/python3.11/site-packages/keras/src/engine/training.py:3246, in Model.from_config(cls, config, custom_objects)
   3244         model = cls(**config)
   3245     except TypeError as e:
-> 3246         raise TypeError(
   3247             "Unable to revive model from config. When overriding "
   3248             "the `get_config()` method, make sure that the "
   3249             "returned config contains all items used as arguments "
   3250             f"in the  constructor to {cls}, "
   3251             "which is the default behavior. "
   3252             "You can override this default behavior by defining a "
   3253             "`from_config(cls, config)` class method to specify "
   3254             "how to create an "
   3255             f"instance of {cls.__name__} from its config.\n\n"
   3256             f"Received config={config}\n\n"
   3257             f"Error encountered during deserialization: {e}"
   3258         )
   3259 return model

TypeError: Unable to revive model from config. When overriding the `get_config()` method, make sure that the returned config contains all items used as arguments in the  constructor to <class 'keras.src.engine.training.Model'>, which is the default behavior. You can override this default behavior by defining a `from_config(cls, config)` class method to specify how to create an instance of Model from its config.

Received config={'name': 'sequential_1', 'layers': [{'module': 'keras.layers', 'class_name': 'InputLayer', 'config': {'batch_input_shape': (1, 16, 16, 3), 'dtype': 'float32', 'sparse': False, 'ragged': False, 'name': 'conv_input'}, 'registered_name': None}, {'module': 'keras.layers', 'class_name': 'Conv2D', 'config': {'name': 'conv', 'trainable': True, 'dtype': 'float32', 'filters': 32, 'kernel_size': (3, 3), 'strides': (1, 1), 'padding': 'same', 'data_format': 'channels_last', 'dilation_rate': (1, 1), 'groups': 1, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'module': 'keras.initializers', 'class_name': 'GlorotUniform', 'config': {'seed': None}, 'registered_name': None}, 'bias_initializer': {'module': 'keras.initializers', 'class_name': 'Zeros', 'config': {}, 'registered_name': None}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}, 'registered_name': None, 'build_config': {'input_shape': (1, 16, 16, 3)}}, {'module': 'keras.layers', 'class_name': 'BatchNormalization', 'config': {'name': 'bn', 'trainable': True, 'dtype': 'float32', 'axis': [3], 'momentum': 0.99, 'epsilon': 0.001, 'center': True, 'scale': True, 'beta_initializer': {'module': 'keras.initializers', 'class_name': 'Zeros', 'config': {}, 'registered_name': None}, 'gamma_initializer': {'module': 'keras.initializers', 'class_name': 'Ones', 'config': {}, 'registered_name': None}, 'moving_mean_initializer': {'module': 'keras.initializers', 'class_name': 'Zeros', 'config': {}, 'registered_name': None}, 'moving_variance_initializer': {'module': 'keras.initializers', 'class_name': 'Ones', 'config': {}, 'registered_name': None}, 'beta_regularizer': None, 'gamma_regularizer': None, 'beta_constraint': None, 'gamma_constraint': None}, 'registered_name': None, 'build_config': {'input_shape': (1, 16, 16, 32)}}]}

Error encountered during deserialization: ('Keyword argument not understood:', 'layers')
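
A possible workaround (a sketch; it simply deserializes with the class that produced the config rather than the base Model class):

# Deserialize with Sequential itself, which understands the 'layers' key.
model2 = models.Sequential.from_config(model.get_config())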

Non deterministic training of LSTM with recurrent_dropout>0

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras):
  • OS Platform and Distribution: Linux Ubuntu 22.04.2 LTS
  • TensorFlow installed from: binary
  • TensorFlow version: v2.12.0-rc1-12-g0db597d0d75 2.12.0
  • Python version: 3.10.6
  • Exact command to reproduce: run the code below

Describe the problem.
The training of an LSTM model with recurrent_dropout>0 is not deterministic: in fact, even with the op_determinism configuration enabled (https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism), different runs of creating and training the same model produce different models (different training histories).
In contrast, deterministic training is obtained with recurrent_dropout=0, and also with dropout>0.

The code below has been used to reproduce the bug.
Note that the det_session function is called each time before the creation of the model.
This function contains the suggested settings from https://www.tensorflow.org/api_docs/python/tf/config/experimental/enable_op_determinism, but I have also tested this code, with the same results, using the following version, in which suggested determinism settings from various issues and sites have been added:

import os
import random as rn
import numpy as np
import tensorflow as tf


def det_session():
    os.environ['PYTHONHASHSEED'] = str(1)
    rn.seed(1)
    np.random.seed(1)
    tf.random.set_seed(1)
    tf.keras.utils.set_random_seed(1)
    tf.config.experimental.enable_op_determinism()

Describe the current behavior.
The training of an LSTM model with recurrent_dropout>0 is not deterministic.

Describe the expected behavior.
The training of an LSTM model with recurrent_dropout>0 should be deterministic.

Contributing.

  • Do you want to contribute a PR? no

Standalone code to reproduce the issue.

import tensorflow as tf
from tensorflow.keras.losses import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import RMSprop


def det_session():
    tf.keras.utils.set_random_seed(1)
    tf.config.experimental.enable_op_determinism()


def create_model(inputs_shape):
    # Define the model
    model = Sequential()
    model.add(tf.keras.layers.LSTM(4, input_shape=(inputs_shape[1], inputs_shape[2]), dropout=0.0, recurrent_dropout=0.1))
    # Compile the model
    model.compile(optimizer=RMSprop(1e-3), loss=mean_squared_error)
    # Give a summary
    model.summary()
    return model


if __name__ == '__main__':
    inputs = tf.random.normal([32, 10, 8])
    outputs = tf.random.normal([32, 4])

    # Create and train the model the first time
    det_session()
    model = create_model(inputs.shape)
    history_0 = model.fit(inputs, outputs, epochs=10)
    # Create and train the model more times and check the loss history
    for _ in range(10):
        # Create and train the model again and check the history loss
        det_session()
        model = create_model(inputs.shape)
        history = model.fit(inputs, outputs, epochs=10)
        assert history.history['loss'] == history_0.history['loss'], 'Losses history does not corresponds'

The issue was first opened on the tensorflow repo (tensorflow/tensorflow#60170).
The gist is here (https://colab.research.google.com/gist/tilakrayal/740c36db1b70b0946d60e7984a61f254/untitled1059.ipynb).

Source code / logs.

image_dataset_from_directory uses wrong directory when labels is list

Describe the problem.

The docs for image_dataset_from_directory say the following about the directory argument:

Directory where the data is located. If labels is "inferred", it should contain subdirectories, 
each containing images for a class. Otherwise, the directory structure is ignored.

This means that when labels is a list/tuple, we should ignore the directory structure (this makes sense, as the directory structure would only be used to generate labels).

Describe the current behavior.

However, this is not what happens - instead, see the following code snippet from dataset_utils.py:

  if labels is None:
    # in the no-label case, index from the parent directory down.
    subdirs = ['']
    class_names = subdirs
  else:
    subdirs = []
    for subdir in sorted(tf.io.gfile.listdir(directory)):

We only ignore the subdirectory structure if labels is None, instead of when labels != 'inferred'. This means that when labels is a list/tuple, we expect a subdirectory structure (when none exists), causing image_dataset_from_directory to fail in this case.

Describe the expected behavior.

We should ignore the subdirectory structure if labels is anything other than inferred (i.e. make the code match what the documentation says should happen). This should be a one-line change, and I'd be happy to make a PR.
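
A sketch of what the proposed one-line change in dataset_utils.py could look like (the condition is the point; how class_names is derived for explicit labels may need further adjustment):

  if labels != 'inferred':
    # labels is None or an explicit list/tuple: index from the parent
    # directory down and ignore any subdirectory structure.
    subdirs = ['']
    class_names = subdirs
  else:
    subdirs = []
    for subdir in sorted(tf.io.gfile.listdir(directory)):
      ...  # unchanged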

However, the existence of this issue suggests the use case where labels is a list/tuple is not unit tested, so it would probably be good to write a test. Would love a suggestion from someone more familiar with the codebase about how best to do this.

Internal seed of keras.layers.Dropout gets increased by +1 every time a tf.function gets created from the model

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux CentOS
  • TensorFlow installed from (source or binary): PYPI
  • TensorFlow version (use command below): 2.11.0
  • Python version: 3.7
  • Bazel version (if compiling from source):
  • GPU model and memory:
  • Exact command to reproduce:

Describe the problem.

When using keras.layers.Dropout with a user-defined seed, the seed gets incremented by +1 every time a tf.function gets created (i.e. in make_train_function, ...).

See the code and log below. As you can see, every time I create a tf.function, the seed of the Keras dropout layer gets increased, BUT the seed of the manually crafted dropout using tf.random.uniform does not. So I assume it's a Keras problem. I tried to pinpoint the error and found the following code:

https://github.com/keras-team/keras/blob/d2a6b9e0efdd01dd1bc6cf0bb52336d0009aba6c/keras/backend.py#L2007-L2025

Here the seed gets increased "When user didn't provide any original seed". However, the check is wrong, as self._seed is the user-defined seed. It gets set in the constructor and also here: https://github.com/keras-team/keras/blob/d2a6b9e0efdd01dd1bc6cf0bb52336d0009aba6c/keras/backend.py#L1974-L1976
https://github.com/keras-team/keras/blob/d2a6b9e0efdd01dd1bc6cf0bb52336d0009aba6c/keras/backend.py#L2027-L2033

So I think the make_legacy_seed function needs to be fixed. I'm not sure about this comment "... it is important to generate different seed for stateful ops ...", because the underlying tf.random.uniform stores the state in the PhiloxRandom C++ class, which gets incremented throughout the training process and only gets reset when triggered by tf.random.set_seed(...). I don't see why the seed needs to be manually increased here.

Describe the current behavior.
Every time we create a tf.function, the seed of keras.layers.Dropout gets increased.

Describe the expected behavior.
The seeds should be stable, so we can have deterministic behavior.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Personally, I would remove the make_legacy_seed method, but I'm not sure if there was a special purpose for this behavior. Alternatively, we could store self.user_provided_seed = seed is not None and check for it within make_legacy_seed.
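
A sketch of that alternative (an illustrative toy class only, not the actual Keras RandomGenerator source):

class SeedTracker:
    # Remember whether the user supplied a seed, so the legacy-seed bump is
    # applied only to auto-generated seeds and user seeds stay stable.
    def __init__(self, seed=None):
        self.user_provided_seed = seed is not None
        self._seed = seed if seed is not None else 0

    def make_legacy_seed(self):
        if not self.user_provided_seed:
            self._seed += 1
        return self._seed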

Standalone code to reproduce the issue.

import tensorflow as tf
import numpy as np

tf.random.set_seed(123)

rate = 0.5
inp = tf.keras.Input((5,))
x = tf.keras.layers.Dropout(rate=rate, seed=345)(inp)
y = tf.cast(tf.random.uniform(shape=tf.shape(inp), dtype=inp.dtype, seed=345) >= rate, inp.dtype) * inp * (1/rate)
model = tf.keras.Model(inputs=[inp], outputs=[x, y])

args = np.ones((1, 5,), np.float32)
counter = 0

def print_seeds():
    global counter
    print(f"Run #{counter}")
    counter += 1

    print("\tModel:")
    for l in model.layers:
        if hasattr(l, 'seed'):
            print("\t\t", l.name, l.seed)

    print("\tGraph:")
    func = tf.function(model.__call__)
    graph = func.get_concrete_function(args, training=True).graph
    for node in graph.get_operations():
        if node.op_def.name == "RandomUniform":
            print("\t\t", node.name, 'Seed:', node.get_attr('seed'), 'Seed2:', node.get_attr('seed2'))
    print()

print_seeds()
print_seeds()
print_seeds()

Source code / logs.

Run #0
        Model:
                 dropout 345
        Graph:
                 model/tf.random.uniform/random_uniform/RandomUniform Seed: 123 Seed2: 345
                 model/dropout/dropout/random_uniform/RandomUniform Seed: 123 Seed2: 345

Run #1
        Model:
                 dropout 345
        Graph:
                 model/tf.random.uniform/random_uniform/RandomUniform Seed: 123 Seed2: 345
                 model/dropout/dropout/random_uniform/RandomUniform Seed: 123 Seed2: 346

Run #2
        Model:
                 dropout 345
        Graph:
                 model/tf.random.uniform/random_uniform/RandomUniform Seed: 123 Seed2: 345
                 model/dropout/dropout/random_uniform/RandomUniform Seed: 123 Seed2: 347

[Feature Request]: Warn user directly when custom loss is not differentiable

System information.

TensorFlow version (you are using): TF 2.11
Are you willing to contribute it (Yes/No) : Yes

Describe the feature and the current behavior/state.
Some people write custom loss functions from scratch that are not differentiable. This leads to None gradients during training. Warning users directly that the loss is not differentiable could be a useful feature, instead of just saying that no gradients are provided.

There are way too many Stackoverflow posts about that, I'll include some of them:
https://stackoverflow.com/questions/63874265/keras-custom-loss-function-error-no-gradients-provided
https://stackoverflow.com/questions/73197501/raise-valueerror-no-gradients-provided-for-any-variables-custom-loss-function
https://stackoverflow.com/questions/59292992/tensorflow-2-custom-loss-no-gradients-provided-for-any-variable-error
https://stackoverflow.com/questions/65619581/no-gradients-provided-for-any-variable-for-custom-loss-function
https://stackoverflow.com/questions/70537503/custom-loss-function-error-valueerror-no-gradients-provided-for-any-variable
https://datascience.stackexchange.com/questions/116645/custom-loss-function-for-binary-classificatio-in-keras-gets-error-no-gradients
https://stackoverflow.com/questions/74074934/error-no-gradients-provided-for-any-variable-while-using-custom-loss
https://stackoverflow.com/questions/75738678/gradienttape-returning-none-with-custom-csi-loss-function
https://stackoverflow.com/questions/72259489/valueerror-no-gradients-provided-for-any-variable-custom-loss-function
...

Will this change the current api? How?
This will change the current API by adding some checks on the loss function before training starts; an error/warning can then be thrown.
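
A sketch of what such a check could look like (a hypothetical helper run on a single sample batch before fit; not an existing Keras API):

import tensorflow as tf

def check_loss_differentiable(model, loss_fn, sample_x, sample_y):
    # Run one forward/backward pass and fail early if the custom loss
    # yields no gradients for any trainable variable.
    with tf.GradientTape() as tape:
        preds = model(sample_x, training=True)
        loss = loss_fn(sample_y, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    if all(g is None for g in grads):
        raise ValueError(
            "The custom loss produced no gradients for any trainable variable; "
            "it is probably not differentiable with respect to the model output.")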

Contributing

TF 2.13: error loading h5 with tf.keras.initializers.Constant of a numpy value

System information.

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): colab
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): v1.12.1-97744-g379184fe5ea 2.14.0-dev20230801
  • Python version: 3.10.12
  • Exact command to reproduce: colab code here

Describe the problem.
In TF 2.13, saving and loading a model in h5 format fails when using a tf.keras.initializers.Constant whose value is a NumPy array:

Exception encountered: Unknown object: '__numpy__'. Please ensure you are using a `keras.utils.custom_object_scope` and that this object is included in the scope. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

Describe the current behavior.
The load raises an error.

Describe the expected behavior.
The load should succeed, as it does in TF 2.12.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Standalone code to reproduce the issue.
colab code here
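
Since the colab itself is not included above, here is a hypothetical minimal reproduction based on the description (layer shapes and names are assumptions):

import numpy as np
import tensorflow as tf

# Kernel of shape (3, 3, 3, 16) matches a Conv2D with 16 filters on 3-channel input.
init = tf.keras.initializers.Constant(np.random.rand(3, 3, 3, 16).astype("float32"))
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, kernel_initializer=init, input_shape=(8, 8, 3)),
])
model.save("model.h5")
reloaded = tf.keras.models.load_model("model.h5")  # raises the '__numpy__' error described above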

Source code / logs.

TypeError: Error when deserializing class 'Conv2D' using config={'name': 'conv2d_1', 'trainable': True, 'dtype': 'float32', 'filters': 16, 'kernel_size': [3, 3], 'strides': [1, 1], 'padding': 'valid', 'data_format': 'channels_last', 'dilation_rate': [1, 1], 'groups': 1, 'activation': 'linear', 'use_bias': True, 'kernel_initializer': {'module': 'keras.initializers', 'class_name': 'Constant', 'config': {'value': {'class_name': '__numpy__', 'config': {'value': [[[[0.18208567798137665, 0.3790721595287323, 0.8748882412910461, 0.7090641260147095, 0.3427969515323639, 0.2481449544429779, 0.4105699956417084, 0.5523862242698669, 0.9493476152420044, 0.9243046045303345, 0.08798110485076904, 0.8428385853767395, 0.6696479916572571, 0.24372068047523499, 0.7578450441360474, 0.7334632873535156], [0.4542553424835205, 0.6673123240470886, 0.09779606759548187, 0.03850134089589119, 0.36950162053108215, 0.6762547492980957, 0.3222072124481201, 0.0801936611533165, 0.6936928629875183, 0.8400056958198547, 0.9771584272384644, 0.765477180480957, 0.8348147869110107, 0.9998912811279297, 0.27731889486312866, 0.8404866456985474], [0.5638918280601501, 0.37882205843925476, 0.6372887492179871, 0.993785560131073, 0.5887098908424377, 0.6531947255134583, 0.18976713716983795, 0.20709063112735748, 0.11805081367492676, 0.18692167103290558, 0.23746803402900696, 0.34046757221221924, 0.5119008421897888, 0.055337220430374146, 0.012531446292996407, 0.8681045770645142], [0.48455944657325745, 0.963996410369873...

Exception encountered: Unknown object: '__numpy__'. Please ensure you are using a `keras.utils.custom_object_scope` and that this object is included in the scope. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

`Cannot iterate over a scalar tensor.` with split_dataset

I try to use something like:

sentences = tf.data.TextLineDataset('/path/to/sentences.txt')
train_s, test_s = tf.keras.utils.split_dataset(data_zipped, left_size=0.98)

And get an error like:
Cannot iterate over a scalar tensor.

So, is it actually possible to use this utility with simple 1D arrays?

Getting an error when calculating entropy for each image in the batch in a custom layer in TensorFlow/Keras

I am working on a problem in which I have to create a custom layer in Keras that takes the output of a conv layer of a pre-trained model as input. This custom layer's job is to select the K best feature maps, based on Shannon entropy, for each image in the input tensor, and then output a final tensor with k feature maps for each image, so that this output tensor can be passed to the next conv layer in the model.

Let the input tensor from a conv layer have shape (None, 224, 224, 128), and suppose I want to take the 64 best feature maps out of 128 based on Shannon entropy. Then the output tensor shape should be (None, 224, 224, 64).


System information:
python version = '3.9.12'
tensorflow version = '2.10.0'
Platform = Windows 11
Running on CPU

Below is the code snippet :


          class select_k_fmap(tf.keras.layers.Layer):
                def __init__(self):
                    super(select_k_fmap, self).__init__()

                def build(self, input_shape):
                    pass

                def call(self, inputs):
                    """
                    tensor shape = (batch_size, img_height, img_width, no. of filters)
                    """    
                    shape = tf.shape(inputs)
                    batch_size = shape[0]
                    print('batch size **** ', batch_size)
                    num_filters = shape[-1]
                    img_height = shape[1]
                    img_width = shape[2]

                    k = 4 # no. of best feature maps to select

                    def k_best_fmap(image):
                        """
                        returns k best feature maps of an image having n feature maps
                        """    
                        def shannon_entropy(feature_map):
                            """
                            returns entropy of a image in a input batch
                            """
                            value_ranges = [0.0, 1.0] 
                            nbins = 256 
                            histogram_bin_indexes = tf.histogram_fixed_width_bins(image, value_ranges, nbins)
                            _, _, count = tf.unique_with_counts(histogram_bin_indexes) 
                            prob = count/tf.reduce_sum(count)
                            prob = tf.cast(prob, tf.float32)
                            entropy = (-tf.reduce_sum(prob * tf.math.log(prob)))/(tf.math.log(2.0)) 
                            return entropy

                        final_image = tf.zeros_like(image) #shape = (img_height, img_width, no. of filters)
                        entropy = []
                        num_featuremaps = tf.shape(image)[-1]
                        for j in range(int(num_featuremaps)):
                            image_ith_filter = image[:,:,j]
                            image_ith_filter_entropy = shannon_entropy(image_ith_filter)
                            entropy.append(tf.get_static_value(image_ith_filter_entropy).item()) 


                        entropy_array = tf.argsort(entropy, direction='DESCENDING')
                        k_best_entropy_sort_index = tf.get_static_value(entropy_array[:k]).tolist()

                        for index, element in enumerate(k_best_entropy_sort_index):
                            final_image[:,:,index] = image[:,:,element]


                    output_tensor = tf.map_fn(fn=k_best_fmap, elems=inputs)

                    return output_tensor

But when I ran this code I got an error:

[Screenshot of the error message]

I have tried many solutions available on the internet, but nothing worked.
I think this error is due to the input tensor having None as the first value in its shape (None, 224, 224, 128), but I am unable to resolve it. I want the layer to work on a dynamic-batch tensor as input.

It would be a great help if any of you could assist me. Thanks in advance.

Here is the link to the Google Colab gist.
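
For what it's worth, a minimal sketch of one way the selection could be expressed with tensor ops only (a hypothetical SelectKFeatureMaps layer; k, the bin count and the [0, 1] value range are assumptions), which avoids Python-level iteration over the dynamic batch and channel dimensions:

import tensorflow as tf

class SelectKFeatureMaps(tf.keras.layers.Layer):
    def __init__(self, k=64, nbins=256, **kwargs):
        super().__init__(**kwargs)
        self.k = k
        self.nbins = nbins

    def call(self, inputs):  # inputs: (batch, H, W, C)
        def per_image(image):  # image: (H, W, C)
            fmaps = tf.transpose(image, [2, 0, 1])  # (C, H, W)

            def entropy(fm):
                # Shannon entropy of one feature map, assuming values in [0, 1].
                idx = tf.histogram_fixed_width_bins(fm, [0.0, 1.0], nbins=self.nbins)
                counts = tf.cast(
                    tf.math.bincount(idx, minlength=self.nbins, maxlength=self.nbins),
                    tf.float32)
                prob = counts / tf.reduce_sum(counts)
                prob = tf.where(prob > 0, prob, tf.ones_like(prob))  # avoid log(0)
                return -tf.reduce_sum(prob * tf.math.log(prob)) / tf.math.log(2.0)

            entropies = tf.map_fn(entropy, fmaps)                      # (C,)
            best = tf.argsort(entropies, direction='DESCENDING')[:self.k]
            return tf.gather(image, best, axis=-1)                     # (H, W, k)

        return tf.map_fn(per_image, inputs)                            # (batch, H, W, k)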

Unable to create Keras Package on Windows

Error while creating the Keras package on Windows from source. This issue is the same as #340.

System information.

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version: TF 2.12.rc1
  • Python version: 3.9
  • Bazel version (if compiling from source): 5.4.0
  • GPU model and memory: NA
  • Exact command to reproduce: .\bazel-bin\keras\tools\pip_package\build_pip_package C:\tmp\keras_package

I am getting an error while trying to create the Keras package on Windows.

Error in text format:

(venv39) D:\user\mraunak\priv_keras\frameworks.ai.keras>.\bazel-bin\keras\tools\pip_package\build_pip_package C:\tmp\keras_package
Thu Mar 9 13:34:20 PST 2023 : === Preparing sources in dir: /tmp/tmp.KBfCiElRiO
cp: cannot stat 'bazel-bin/keras/tools/pip_package/build_pip_package.runfiles/org_keras/keras': No such file or directory

I was successfully able to build from the source but got an error while creating the Keras package.

Adam (and other) optimizers miscalculate the momentum update for complex variables

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS Linux 7.9.2009
  • TensorFlow installed from (source or binary): Binary
  • TensorFlow version (use command below): 2.12.1
  • Python version: 3.11.4

Describe the problem.

When the Adam optimizer is used to minimize a variable of dtype complex64 or complex128, the momenta calculated are incorrect, causing slower or incorrect updates.

For example, suppose we are trying to find the roots of the complex polynomial $f(z)=(z-(1+3j))^2$ with respect to an absolute square loss function. $f$ has a root at $z=1+3j$, so we expect the minimization to converge to this.

There are currently two approaches. In the first approach, we take as input variables the real scalars $a$ and $b$, and inside the function to minimize we combine these to $z = a+bj$. In the second approach we directly define a variable $z$ of complex datatype as an input to the function to minimise. In the first approach, the momenta calculated for a and b are different, so we expect the optimization to behave slightly differently, but they both should converge in a similar manner with roughly the same rate.

This issue has been raised before (#38541), and also in PyTorch, where it was eventually decided to adjust the computations (see pytorch/pytorch#65711). I understand that a fix might not be desired due to implications for real-valued variables; however, in that case I would expect usage with complex variables to at least raise a warning that they currently have unexpected behavior.

Describe the current behavior.

As can be seen in the attached code, this is not the case: the first approach converges much faster, while the second approach tends to oscillate around the minimum before settling. Furthermore, the first approach does not actually follow the gradient in the complex loss landscape due to the independent momenta for $a$ and $b$.

Describe the expected behavior.

In the expected behavior, both results should converge in the same way and follow the gradient of the loss landscape. If we plot the updates of the variables in the complex plane together with the loss landscape, we get the following figure (generated in the colab document):

[Figure: trajectories of the optimized variables in the complex plane over the loss landscape]

The expected behavior is shown by the fix: the optimised variable does not move around the minimum but obeys the symmetry of the loss function.

Contributing.

  • Do you want to contribute a PR? (yes/no): Yes
  • Briefly describe your candidate solution(if contributing):

The update function in the tf.keras.optimizers.Adam class currently computes the second moment as tf.square(gradient) * (1 - self.beta_2). For complex values, this should be gradient * tf.math.conj(gradient) * (1 - self.beta_2). This should still work for real valued variables, but I don't know if there are any performance related issues. An alternative would be to output a warning when using the optimiser on complex variables.
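
A tiny standalone illustration of the difference (just the two formulas described above applied to a single complex gradient, not the optimizer internals):

import tensorflow as tf

g = tf.constant(1.0 + 2.0j, dtype=tf.complex64)
beta_2 = 0.999

# Current formula: the square of a complex gradient is complex (here -3+4j),
# so the second moment no longer tracks the squared magnitude |g|^2.
v_current = tf.square(g) * (1 - beta_2)

# Proposed formula: g * conj(g) == |g|^2 is real and non-negative (here 5).
v_proposed = g * tf.math.conj(g) * (1 - beta_2)

print(v_current.numpy(), v_proposed.numpy())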

Standalone code to reproduce the issue.

import tensorflow as tf

f = lambda z: (z-(1+3j))**2

optimizer1 = tf.keras.optimizers.Adam(learning_rate=0.1)
a = tf.Variable(0.5, dtype=tf.float32)
b = tf.Variable(1.0, dtype=tf.float32)

def loss1():
    z2 = tf.complex(a, b)
    return tf.abs(f(z2))**2

optimizer2 = tf.keras.optimizers.Adam(learning_rate=0.1)
z = tf.Variable(0.5 + 1.0j, dtype=tf.complex64)

def loss2():
    return tf.abs(f(z))**2

for i in range(50):
  optimizer1.minimize(loss1, [a,b])
  optimizer2.minimize(loss2, [z])

print(f"[{a.value()}, {b.value()}], loss: {loss1()}")
# Outputs: [0.015215082094073296, 0.01610667072236538], loss: 0.004847094416618347
print(f"[{tf.math.real(z).numpy()}, {tf.math.imag(z).numpy()}], loss {loss2()}" )
# Outputs: [1.1107741594314575, 0.8542921543121338], loss 9.064789772033691

Colab link: https://colab.research.google.com/drive/1DKsktAM7MOUQhFxbr2kElDtDJE1AunK-?usp=sharing

Source code / logs.

N/A

tf.keras.metrics.F1Score produces ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): RHELS 7.9
  • TensorFlow installed from (source or binary): Pip, binary
  • TensorFlow version (use command below): 2.13.0
  • Python version: 3.9
  • GPU model and memory: NVIDIA A100-SXM4-40GB
  • Exact command to reproduce:

Describe the problem.

During training, the model monitors tf.keras.metrics.F1Score; however, when F1Score.update_state is called, a ValueError is thrown.

ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32: <tf.Tensor 'cond/Identity_4:0' shape=(None,) dtype=int32>

which is the result of the following line of code in the FBetaScore class:

 y_true = tf.convert_to_tensor(y_true, dtype=self.dtype)

Describe the current behavior.

F1Score metric unable to update_state. Error thrown. Unable to train model.

Describe the expected behavior.

I would expect F1Score to update_state based on a y_true tensor with an int32 datatype and a y_pred tensor of float32 datatype without throwing an error.

In the tfa.metrics.FBetaScore code, the corresponding line is:

y_true = tf.cast(y_true, self.dtype)

Is it possible that the new tf.keras.metrics code should be using tf.cast(...) instead of tf.convert_to_tensor(...)?
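
Until that is resolved, a workaround sketch for a custom train_step/test_step like the one below is to cast the integer targets to the metric's dtype before updating it (this only addresses the dtype error):

        # Workaround sketch: cast int targets to the metric dtype before update_state.
        self.f1.update_state(tf.cast(targets, self.f1.dtype), predictions)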

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.

Cannot share full code. Can share custom model init / train_step which causes the error.

class CustomModel(keras.Model):
    def __init__(self, model_type, val_samples, threshold, *args, **kwargs):
      super(CustomModel, self).__init__(*args,**kwargs)
      self.loss_tracker = tf.keras.metrics.Mean(name='Loss')
      self.val_samples = val_samples
      self.precision = tf.keras.metrics.Precision(name='Precision')
      self.recall = tf.keras.metrics.Recall(name='Recall')
      self.f1 = tf.keras.metrics.F1Score(name="F1", threshold=threshold)
      if model_type == 'binary':
          self.accuracy = tf.keras.metrics.BinaryAccuracy(name='accuracy', threshold=threshold)
      else:
          self.accuracy = tf.keras.metrics.CategoricalAccuracy(name='accuracy')
      

    def train_step(self, data):
        inputs, targets = data
        
        with tf.GradientTape() as tape:
            predictions = self(inputs, training=True)
            loss = self.compiled_loss(targets, predictions, regularization_losses=self.losses)

        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        self.loss_tracker.update_state(loss)
        self.accuracy.update_state(targets, predictions)
        self.precision.update_state(targets,predictions)
        self.recall.update_state(targets,predictions)
        self.f1.update_state(targets, predictions)
        return {"Accuracy":self.accuracy.result(),"Loss":self.loss_tracker.result(), 
                "Precision":self.precision.result(), "Recall":self.recall.result(),
                "F1":self.f1.result()}

    def test_step(self, data):
        inputs, targets = data

        predictions = []
        
        for _ in range(self.val_samples):
            predictions.append(self(inputs, training=False))
        
        predictions = tf.math.reduce_mean(tf.stack(predictions, axis=0), axis=0)

        loss = self.compiled_loss(targets, predictions, regularization_losses=self.losses)
        
        self.loss_tracker.update_state(loss)
        self.accuracy.update_state(targets, predictions)
        self.precision.update_state(targets,predictions)
        self.recall.update_state(targets,predictions)
        self.f1.update_state(targets, predictions)
        return {"Accuracy":self.accuracy.result(),"Loss":self.loss_tracker.result(),
                "Precision":self.precision.result(), "Recall":self.recall.result(),
                "F1":self.f1.result()}
    
    @property
    def metrics(self):
        return [self.accuracy,self.loss_tracker, self.precision, self.recall, self.f1]

Source code / logs.

Epoch 1/2000
Traceback (most recent call last):
  File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/kraken/john.fischer/projects/passive_sonar_models/scripts/../../passive_sonar_models/__main__.py", line 110, in <module>
    main()
  File "/data/kraken/john.fischer/projects/passive_sonar_models/scripts/../../passive_sonar_models/__main__.py", line 37, in main
    train_model.main(args)
  File "/data/kraken/john.fischer/projects/passive_sonar_models/scripts/../../passive_sonar_models/task/train_model.py", line 139, in main
    model.fit(train_data, epochs=args.num_epochs, validation_data=validate_data, #steps_per_epoch=steps_per_epoch, validation_steps=val_steps,
  File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/__autograph_generated_file3ywpkuyj.py", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
  File "/data/kraken/john.fischer/projects/passive_sonar_models/scripts/../../passive_sonar_models/models/mc_dropout_CNN.py", line 42, in train_step
    self.f1.update_state(targets,predictions)
ValueError: in user code:

    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/engine/training.py", line 1338, in train_function  *
        return step_function(self, iterator)
    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/engine/training.py", line 1322, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/engine/training.py", line 1303, in run_step  **
        outputs = model.train_step(data)
    File "/data/kraken/john.fischer/projects/passive_sonar_models/scripts/../../passive_sonar_models/models/mc_dropout_CNN.py", line 42, in train_step
        self.f1.update_state(targets,predictions)
    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/utils/metrics_utils.py", line 77, in decorated
        update_op = update_state_fn(*args, **kwargs)
    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/metrics/base_metric.py", line 140, in update_state_fn
        return ag_update_state(*args, **kwargs)
    File "/home/john.fischer/.conda/envs/psonar2/lib/python3.9/site-packages/keras/src/metrics/f_score_metrics.py", line 176, in update_state  **
        y_true = tf.convert_to_tensor(y_true, dtype=self.dtype)

    ValueError: Tensor conversion requested dtype float32 for Tensor with dtype int32: <tf.Tensor 'cond/Identity_4:0' shape=(None,) dtype=int32>

L1 penalty set small weights to 0

Dear team,

I have noticed that the L1 penalty does not strictly force weights to 0. I wonder what your view is on the tentative approach below, which forces small weights to exactly 0?

import tensorflow as tf
from keras import layers, regularizers
from keras.constraints import Constraint


class SmallWeightsToZero(Constraint):
    def __call__(self, w):
        mask = tf.abs(w) < tf.keras.backend.epsilon()
        w = tf.where(mask, tf.zeros_like(w), w)
        return w


model.add(layers.Dense(units=1,
                       kernel_regularizer=regularizers.L1(l1=l1),
                       kernel_constraint=SmallWeightsToZero()))

MultiHeadAttention quietly ignores masking when annotated with tf.function

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 13.2.1
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.10
  • Python version: 3.10
  • Bazel version (if compiling from source): NA
  • GPU model and memory: Apple Silicone M2 Max
  • Exact command to reproduce:

Run the following script in a Jupyter notebook. The outputs of with_mask and without_mask are the same: the mask is not working when @tf.function is used.

import tensorflow as tf
class ApplyMHA(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__()
        self.mha = tf.keras.layers.MultiHeadAttention(**kwargs)

    @tf.function
    def call(self, x):
        return self.mha(query=x, key=x, value=x)
x = tf.convert_to_tensor(
    [[1,2,0,0],
     [2,3,1,0]
    ]
)

# Create 2 embedding tables, one with masking, one without.
# They use the same seed as initializer so the two tables should be identical except masking.
initializer = tf.keras.initializers.RandomUniform(
    minval=-0.05, maxval=0.05, seed=123)

embedding_table_with_mask = tf.keras.layers.Embedding(input_dim=100, output_dim=3, mask_zero=True, embeddings_initializer=initializer)
embedding_table_without_mask = tf.keras.layers.Embedding(input_dim=100, output_dim=3, mask_zero=False, embeddings_initializer=initializer)
embedding_with_mask = embedding_table_with_mask(x)
print("embedding_with_mask:", embedding_with_mask)
embedding_without_mask = embedding_table_without_mask(x)
print("embedding_without_mask:", embedding_without_mask)

mha = ApplyMHA(num_heads=2, key_dim=3)
print("===After applying MHA====")

# The outputs of with_mask and without_mask are the same. Mask is not working.
print("with_mask:", mha(embedding_with_mask))
print("without_mask:", mha(embedding_without_mask))

Describe the problem.

In the script provided above, when MultiHeadAttention is called from a function annotated with @tf.function, it quietly ignores the masks from the input tensor. This is demonstrated by the results from "with_mask" and "without_mask" being the same.

Notice this problem only happens when calling from a context that's using @tf.function. Removing @tf.function will make this issue disappear.

IIUC the root cause is that MultiHeadAttention (MHA) relies on the _keras_mask attached to the input tensor (code). _keras_mask is not available in the context of @tf.function. To fix the problem, MHA should rely on the mask variable passed to the call function, but this would be a pretty big change.
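
Until that is changed, a workaround sketch (reusing the names from the script above, with the attention_mask shape as documented for MultiHeadAttention) is to build the attention mask explicitly from the embedding layer and pass it in, instead of relying on the implicit _keras_mask:

# Derive the padding mask explicitly and pass it to MHA.
padding_mask = embedding_table_with_mask.compute_mask(x)                          # (batch, seq), bool
attention_mask = padding_mask[:, tf.newaxis, :] & padding_mask[:, :, tf.newaxis]  # (batch, seq, seq)

mha_layer = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=3)
out = mha_layer(query=embedding_with_mask, key=embedding_with_mask,
                value=embedding_with_mask, attention_mask=attention_mask)

Nothing here depends on the implicit _keras_mask, so it behaves the same with or without @tf.function.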

Describe the current behavior.
When MultiHeadAttention is called from a function annotated with @tf.function, it quietly ignores the masks from the input tensor.

Describe the expected behavior.
MultiHeadAttention should respect the masking from inputs when being called from a function annotated with @tf.function

Contributing.

  • Do you want to contribute a PR? (yes/no):
    yes.
    Change the call signature of MHA to accept an array of inputs and their masks. Do not rely on tensor._keras_mask

Standalone code to reproduce the issue.

Mentioned in Exact command to reproduce section. Here is a colab notebook to demonstrate the issue.

Source code / logs.

Mentioned in Exact command to reproduce section.

Accuracy() does not work, but 'accuracy' does

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): not really
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 22.04
  • TensorFlow installed from (source or binary): pip install tensorflow
  • TensorFlow version (use command below): 2.13.0
  • Python version: 3.11.5
  • Bazel version (if compiling from source): -
  • GPU model and memory: does not matter
  • Exact command to reproduce: run script below

Describe the problem.
If I put Accuracy() in the metrics list, it does not work, but the string 'accuracy' does. According to the docs, both should work. See the example code below.

Describe the current behavior.

469/469 [==============================] - 3s 6ms/step - loss: 0.2548 - accuracy: 0.0000e+00

Describe the expected behavior.

469/469 [==============================] - 3s 6ms/step - loss: 0.2540 - accuracy: 0.9260

Contributing.

  • Do you want to contribute a PR? (yes/no): no
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing): -

Standalone code to reproduce the issue.

import numpy as np
from keras import Sequential
from keras.datasets import mnist
from keras.src import activations
from keras.src.layers import Dense
from keras.src.losses import CategoricalCrossentropy
from keras.src.metrics import Accuracy
from keras.src.optimizers import RMSprop
from keras.src.utils import to_categorical


def preprocess_images(images):
    s = images.shape
    return images.reshape((s[0], s[1] * s[2])).astype(dtype=np.float32) / 255


if __name__ == '__main__':
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
    processed_train_images = preprocess_images(train_images)
    processed_test_images = preprocess_images(test_images)
    processed_train_labels = to_categorical(y=train_labels)
    processed_test_labels = to_categorical(y=test_labels)

    network = Sequential()
    network.add(layer=Dense(units=512, activation=activations.relu, input_shape=(28 * 28, )))
    network.add(layer=Dense(units=10, activation=activations.softmax))

    network.compile(optimizer=RMSprop(), loss=CategoricalCrossentropy(), metrics=[Accuracy()])  # this does not work
    # network.compile(optimizer=RMSprop(), loss=CategoricalCrossentropy(), metrics=['accuracy'])  # this works

    network.fit(x=processed_train_images, y=processed_train_labels, epochs=1, batch_size=128)

Source code / logs.
Nothing.
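
A likely explanation, for context: tf.keras.metrics.Accuracy checks exact equality between y_true and y_pred, so with one-hot labels and softmax probabilities it reports 0, whereas the string 'accuracy' is resolved by compile() to a metric matching the loss (here categorical accuracy). The explicit-class equivalent of the working line would therefore be something like:

from keras.src.metrics import CategoricalAccuracy  # or tf.keras.metrics.CategoricalAccuracy

network.compile(optimizer=RMSprop(), loss=CategoricalCrossentropy(), metrics=[CategoricalAccuracy()])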

Nested `WideDeepModel` fails when saved due to a bug on the optimizer serialization

Describe the problem.

A model that contains a nested WideDeepModel submodel throws an error when saved. The problem goes back at least to TF v2.9. I've tested the latest nightly and the issue persists on the latest master. I've also reproduced this successfully on both Linux and MacOS.

Error message:

  File "./keras/saving/legacy/saving_utils.py", line 211, in model_metadata
    "config": model.optimizer.get_config(),
AttributeError: 'ListWrapper' object has no attribute 'get_config'

Describe the current behavior.

I tried to investigate the source of the issue and I think there are a few problems in the current implementation:

  1. The implementation of model_metadata() doesn't anticipate receiving a list of optimizers.
  2. The model.save(..., include_optimizer=False) parameter is not respected and model_metadata() is called with its default True value.

Describe the expected behavior.

I believe the expected behaviour is:

  1. The model_metadata() can handle lists of optimizers.
  2. The initially declared include_optimizer value should be respected.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

It's been years since my last contribution to Keras and I'm very rusty with TensorFlow, so take what I say with a grain of salt. I believe that the model_metadata() needs to handle both single optimizers and lists of optimizers. Optionally the original value of include_optimizer should be passed around, so that it's respected. Unfortunately, the latter may require API changes, so perhaps we should find a workaround.

Standalone code to reproduce the issue.

I've added a quick and dirty test to the wide_deep_test.py file:

    def test_config_nested(self):
        input1 = input_layer.Input(shape=(1,))
        output1 = linear.LinearModel()(input1)
        linear_model = keras.Model(input1, output1)

        input2 = input_layer.Input(shape=(1,))
        output2 = core.Dense(units=1)(input2)
        dnn_model = keras.Model(input2, output2)

        wide_deep_model = wide_deep.WideDeepModel(linear_model, dnn_model)
        wide_deep_model.compile(optimizer=['adam', 'adam'])

        output = wide_deep_model([input1, input2])
        model = keras.Model([input1, input2], output)
        model.compile()

        model.save("./deleteme", save_format="tf", include_optimizer=False)

Source code / logs.

Full trace log:

Traceback (most recent call last):
  File "./keras/premade_models/wide_deep_test.py", line 315, in test_config_nested
    model.save("./deleteme", save_format="tf", include_optimizer=False)
  File "./keras/utils/traceback_utils.py", line 61, in error_handler
    return fn(*args, **kwargs)
  File "./keras/engine/training.py", line 2985, in save
    saving_api.save_model(
  File "./keras/saving/saving_api.py", line 163, in save_model
    return legacy_sm_saving_lib.save_model(
  File "./keras/utils/traceback_utils.py", line 61, in error_handler
    return fn(*args, **kwargs)
  File "./keras/saving/legacy/save.py", line 168, in save_model
    saved_model_save.save(
  File "./keras/saving/legacy/saved_model/save.py", line 103, in save
    metadata = generate_keras_metadata(saved_nodes, node_paths)
  File "./keras/saving/legacy/saved_model/save.py", line 132, in generate_keras_metadata
    metadata=node._tracking_metadata,
  File "./keras/engine/base_layer.py", line 3484, in _tracking_metadata
    return self._trackable_saved_model_saver.tracking_metadata
  File "./keras/saving/legacy/saved_model/base_serialization.py", line 54, in tracking_metadata
    return json_utils.Encoder().encode(self.python_properties)
  File "./keras/saving/legacy/saved_model/layer_serialization.py", line 37, in python_properties
    return self._python_properties_internal()
  File "./keras/saving/legacy/saved_model/model_serialization.py", line 41, in _python_properties_internal
    saving_utils.model_metadata(
  File "./keras/saving/legacy/saving_utils.py", line 211, in model_metadata
    "config": model.optimizer.get_config(),
AttributeError: 'ListWrapper' object has no attribute 'get_config'

Add a target_width parameter to keras.utils.timeseries_dataset_from_array

Feature request:

It would be nice to have a parameter target_width (a.k.a. label_width) for keras.utils.timeseries_dataset_from_array that allows the targets to be sequences longer than the single timestep it currently assumes. Compare with the WindowGenerator class in https://www.tensorflow.org/tutorials/structured_data/time_series, which has a label_width.

This would simplify the code for the case where we want to generate all intermediate timesteps when predicting with a target shift, like predicting weather at hour 48 from the sequence between hours 0 and 24. That is, I want the target to be the full sequence of hours 25-48 instead of just hour 48.

As it is now, I have to make two calls to timeseries_dataset_from_array, and as a result shuffling cannot be used, as in the following code:

def datasetgen(dataframe, input_width=24, label_width=1, shift=1, batch_size=128,
  label_columns=None, start_index=None, end_index=None, shuffle=False):
  """
  Generate timeseries dataset from the given dataframe containing a data sequence.

  Parameters:
  - dataframe: The source time sequence dataframe.
  - input_width: Number of time steps in each input sequence.
  - label_width: Number of time steps in each label sequence.
  - shift: How many steps to shift the end of the input to get the label.
  - batch_size: Size of batches to generate.
  - label_columns: List of column names to extract as labels. If None, all columns are used.
  - start_index: Start index from the dataframe to consider data. Default is the start of the dataframe.
  - end_index: End index from the dataframe to consider data. Default is the end of the dataframe.
  - shuffle: Whether to shuffle the generated batches. Note: shuffling won't work in the current implementation.

  Returns:
  - A TensorFlow Dataset containing input and label sequences.
  """

  # If end index or start index is not given, assign them to the end or start of the dataframe respectively.
  if end_index is None:
      end_index = len(dataframe) - 1
  if start_index is None:
      start_index = 0

  # Generate a input timeseries dataset from the dataframe using keras.utils.timeseries_dataset_from_array
  input_ds = tf.keras.utils.timeseries_dataset_from_array(
                  dataframe, targets=None, sequence_length=input_width,
                  sequence_stride=1, sampling_rate=1, batch_size=batch_size, shuffle=shuffle,
                  start_index=start_index, end_index=(end_index-(input_width+shift-1)))

  # Fetch the indices of the label columns from the dataframe.
  label_columns_indices = get_label_columns_indices(dataframe,label_columns) # get the selected columns
  targetsdf = dataframe[list(label_columns_indices)]

  # Generate a timeseries dataset of label sequences from the dataframe.
  # Here we assume that label_width should be less than or equal to shift.
  target_ds = tf.keras.utils.timeseries_dataset_from_array(
                  targetsdf, targets=None, sequence_length=label_width,
                  sequence_stride=1, sampling_rate=1, batch_size=batch_size, shuffle=shuffle,
                  start_index=(start_index+(input_width+shift-label_width)),
                  end_index=end_index-input_width+1)

  # Combine input and target datasets to form a single dataset.
  # Dataset.zip expects a (nested) structure of datasets.
  train_ds = tf.data.Dataset.zip((input_ds, target_ds))

  return train_ds
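
For illustration, a short usage example of the helper above (the column names are made up, get_label_columns_indices is a hypothetical stand-in for the undefined helper, and the printed shapes are what I would roughly expect, not verified output):

import numpy as np
import pandas as pd
import tensorflow as tf

def get_label_columns_indices(dataframe, label_columns=None):
    # Hypothetical helper: map selected label column names to their indices.
    if label_columns is None:
        label_columns = list(dataframe.columns)
    return {name: dataframe.columns.get_loc(name) for name in label_columns}

# Toy hourly data: predict the next 24 hours of temperature from the previous 24.
df = pd.DataFrame({"temperature": np.random.randn(500),
                   "pressure": np.random.randn(500)})
ds = datasetgen(df, input_width=24, label_width=24, shift=24,
                batch_size=32, label_columns=["temperature"])
inputs, labels = next(iter(ds))
print(inputs.shape, labels.shape)  # roughly (32, 24, 2) and (32, 24, 1)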

Add a dedicated file for Keras' security policy

[At @divyashreepathihalli's recommendation, I'm moving this issue and its associated PR from /keras to /tf-keras]

Describe the feature and the current behavior/state.

Keras' security policy is currently at the very end of CONTRIBUTING.md. I suggest this information be moved to a dedicated SECURITY.md file.

GitHub treats SECURITY.md as a special file. If it exists in your repository, a new "issue type" is created that redirects users to the policy if they've discovered a vulnerability. The policy also appears in the project's Security Panel.

An even better solution would be to set the policy in the https://github.com/keras-team/.github repository. The policy will then be treated as the default policy for all projects in the keras-team organization.

Will this change the current api? How?
No.

Who will benefit from this feature?
Security researchers will have an easier time safely reporting vulnerabilities, therefore increasing Keras' overall security.

  • Do you want to contribute a PR? (yes/no): YES

  • Briefly describe your candidate solution(if contributing):

I will simply move the security policy to a SECURITY.md file at the root of the repository.

I would have preferred to send this PR to the keras-team/.github repository (therefore applying the policy to all repos), but since the repository is empty, I cannot create a fork and send a PR there.

Race Condition in RNN Layers using (Recurrent-)Dropout

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS
  • TensorFlow installed from (source or binary): PYPI
  • TensorFlow version (use command below): v2.11.0
  • Python version: 3.7

Describe the problem.

RNN layers using dropout can fall victim to a race condition when executed in graph mode. The problem comes from here:
https://github.com/keras-team/keras/blob/master/keras/backend.py#L5016-L5021

This is the outer loop over the RNN cells, and it allows parallel execution. However, the dropout/recurrent_dropout masks are generated inside this loop. Depending on how the iterations are scheduled, one random op gets executed before the other, producing non-deterministic random numbers.

The tricky part is that the total number of random calls is still correct, so the global RNG state is incremented correctly and consecutive executions of the RNN can still appear correct.

I therefore built a reproducer with the rough structure:

for o in range(1000):
	reset_seed()
	result = []
	for i in range(10):
		result.append(model(data, training=True))
	if reference == result:
		print('passed')
	else:
		print_differences()

See the full code below. When I run it on my server, I get 50-80 failures within 1000 iterations of the outer loop. There is no pattern to when the error occurs, which matches my theory of a race condition.

www.tensorflow.org/api_docs/python/tf/while_loop further explains that "For correct programs, while_loop should return the same result for any parallel_iterations > 0" (although I think they mean > 1). This assumption is certainly violated for RNNs that use dropout/recurrent_dropout, as the state cannot be computed independently.

Describe the current behavior.
Running an RNN with dropout enabled can produce wrong results.

Describe the expected behavior.
RNN results should be identical between runs when properly seeded.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing):

Change 32 to 1 (i.e. force parallel_iterations=1) in https://github.com/keras-team/keras/blob/a63bef2ac504d651568c8800a9c1fcdcbd1cb41c/keras/backend.py#L5019 ;)
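
For reference, a minimal, self-contained illustration of the underlying pattern (not the actual Keras code): a stateful RNG call inside a tf.while_loop body. With parallel_iterations > 1 the per-step draws may be interleaved non-deterministically; parallel_iterations=1 serializes them, which is what the proposed one-character change would do.

import tensorflow as tf

def body(i, acc):
    mask = tf.random.uniform((3,))  # stateful RNG call inside the loop body
    return i + 1, acc + mask

_, out = tf.while_loop(
    lambda i, acc: i < 10,
    body,
    loop_vars=(0, tf.zeros((3,))),
    parallel_iterations=1,  # proposed fix: run iterations sequentially
)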

Standalone code to reproduce the issue.

import tensorflow as tf
import numpy as np
from termcolor import colored

tf.keras.utils.set_random_seed(123)

# OPTIONS ----------------------------------------------------------------------
kwargs = {
	'units':				3,
	'dropout':				0.3,
	'recurrent_dropout':	0.7
}
runs			= 1000
batch_size		= 1
seq_len			= 2
use_graph_mode	= True
# /OPTIONS ---------------------------------------------------------------------

#-------------------------------------------------------------------------------
inp		= tf.keras.Input((seq_len, 3,))
rnn		= tf.keras.layers.SimpleRNN	(**kwargs)(inp)
gru		= tf.keras.layers.GRU		(**kwargs)(inp)
lstm	= tf.keras.layers.LSTM		(**kwargs)(inp)
model	= tf.keras.Model(inputs=[inp], outputs=[
	rnn,
	gru,
	lstm,
])

args = np.ones((batch_size, seq_len, 3,), np.float32)
reference = None
failed = 0

if use_graph_mode:  func = tf.function(model.__call__).get_concrete_function(args, training=True)
else:               func = model.__call__

for o in range(runs):
	print(f"Run #{o}: ", end='')

	tf.keras.utils.set_random_seed(123)

	results = {}
	for _ in range(10):
		res = func(args, training=True)
		output = model.output
		if not isinstance(res, (list, tuple)):
			output, res = [model.output], [res]

		for k, v in zip(output, res):
			results.setdefault(k.name, []).append(np.reshape(v.numpy(), -1))

	if reference is None:
		print('using as reference')
		reference = results
	else:
		def equal_list(A, B):
			assert len(A) == len(B)
			for a, b in zip(A, B):
				if not (a == b).all():
					return False
			return True

		def equal_dict(A, B):
			assert len(A) == len(B)
			for a, b in zip(A.values(), B.values()):
				if not equal_list(a, b):
					return False				
			return True

		if equal_dict(reference, results):
			print(colored('passed', 'green'))
		else:
			failed += 1
			print(colored('failed', 'red'))

			for (k, a), b in zip(reference.items(), results.values()):
				if not equal_list(a, b):
					print('\t', k)
					for i, (c, d) in enumerate(zip(a, b)):
						passed = (c == d).all()
						print(colored(f'\t\t{"" if passed else ">> "}#{i}: {c} {d}', 'green' if passed else 'red'))

print(f"## {failed}/{runs} failed!")

Source code / logs.

The >> markers indicate the lines where the reference (left) does not match the current output (right).

Run #0: using as reference
Run #1: passed
Run #2: passed
Run #3: passed
Run #4: passed
Run #5: passed
Run #6: failed
         lstm/strided_slice_3:0
                #0: [-0.04476803  0.03647946  0.43398207] [-0.04476803  0.03647946  0.43398207]
                #1: [-0.06646455  0.08540117  0.4261022 ] [-0.06646455  0.08540117  0.4261022 ]
                #2: [-0.05694697 -0.03903455  0.31383437] [-0.05694697 -0.03903455  0.31383437]
                #3: [-0.05401477  0.00309605  0.22946905] [-0.05401477  0.00309605  0.22946905]
                #4: [-0.06022511  0.02117318  0.33324468] [-0.06022511  0.02117318  0.33324468]
                #5: [ 0.0327878  -0.02470545  0.31550086] [ 0.0327878  -0.02470545  0.31550086]
                #6: [-0.15186085  0.03566789  0.21747145] [-0.15186085  0.03566789  0.21747145]
                >> #7: [-0.06862105  0.03145761  0.39546916] [-0.10350011  0.01307995  0.40501142]
                #8: [-0.05030819  0.02323429  0.42084986] [-0.05030819  0.02323429  0.42084986]
                #9: [-0.06160946  0.01366921  0.31102264] [-0.06160946  0.01366921  0.31102264]
Run #7: passed
Run #8: passed
Run #9: passed
Run #10: passed
Run #11: passed
Run #12: passed
Run #13: passed
Run #14: passed
Run #15: passed
Run #16: passed
Run #17: passed
Run #18: passed
Run #19: passed
Run #20: passed
Run #21: passed
Run #22: passed
Run #23: passed
Run #24: passed
Run #25: passed
Run #26: passed
Run #27: passed
Run #28: failed
         simple_rnn/strided_slice_3:0
                >> #0: [ 0.8512929  -0.7614667   0.11939589] [0.90060455 0.55757344 0.99093455]
                #1: [-0.9390723  0.5154745  0.3419465] [-0.9390723  0.5154745  0.3419465]
                #2: [-0.4668007  0.9807295 -0.526169 ] [-0.4668007  0.9807295 -0.526169 ]
                #3: [ 0.9664895  -0.77306604  0.5689427 ] [ 0.9664895  -0.77306604  0.5689427 ]
                #4: [ 0.84656304 -0.01103731  0.62985945] [ 0.84656304 -0.01103731  0.62985945]
                #5: [ 0.84656304 -0.01103731  0.62985945] [ 0.84656304 -0.01103731  0.62985945]
                #6: [ 0.9915579   0.55109423 -0.42523235] [ 0.9915579   0.55109423 -0.42523235]
                #7: [-0.82279384  0.68517345  0.38375577] [-0.82279384  0.68517345  0.38375577]
                #8: [0.94490975 0.5255155  0.9951209 ] [0.94490975 0.5255155  0.9951209 ]
                #9: [0.98280257 0.66127515 0.99306947] [0.98280257 0.66127515 0.99306947]
Run #29: passed
Run #30: passed
Run #31: passed
Run #32: passed
...

Using a custom layer, the loaded model cannot give the same predicted value as the original model [results not reproducible]

System Info

  • Tensorflow Version: 2.4.3
  • Custom Code: Yes
  • OS Platform and Distribution: CentOS Linux release 8.2.2004
  • Python version: 3.8
  • CUDA/cuDNN version: CUDA11, cuDNN8
  • GPU model and memory: RTX 3090, 24268MiB

Current Behaviour:
I implemented a custom layer and used it to build a model. After training, the original model gives a predicted value A and is saved to an h5 file. When I load the model from the h5 file, the loaded model gives a different predicted value B, which means the results are not reproducible. Normally, the two models should give the same predicted value.
The custom layer is as simple as a Dense layer, and I have already determined that it causes the problem: if I comment out the line with the custom layer and uncomment the line below it (a stock tf Dense layer), both the original model and the loaded model give the same results.
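
For context, a minimal sketch (not the reporter's layer, which is only available in the linked colab) of a Dense-like custom layer that creates its weights via add_weight and implements get_config, both of which are needed for the layer to round-trip through save/load:

import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
    """Minimal Dense-like custom layer, for illustration only."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Weights created with add_weight are tracked and stored in the h5 file.
        self.kernel = self.add_weight(
            name="kernel", shape=(input_shape[-1], self.units),
            initializer="glorot_uniform", trainable=True)
        self.bias = self.add_weight(
            name="bias", shape=(self.units,),
            initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel) + self.bias

    def get_config(self):
        # Needed so the layer can be re-instantiated when loading the model.
        config = super().get_config()
        config.update({"units": self.units})
        return config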

Standalone code to reproduce the issue
https://colab.research.google.com/drive/19_8DqzfC2JadKM9ZykJRLDEcxEkzjrT7?usp=sharing
Can also refer to the link where the issue is firstly posted: tensorflow/tensorflow#59041

Relevant log output
2022-12-29 10:33:52.034627: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.136713: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-12-29 10:33:53.138072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2022-12-29 10:33:53.198743: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-12-29 10:33:53.198834: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.203166: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:53.203273: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:53.204328: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-12-29 10:33:53.204657: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-12-29 10:33:53.209031: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-12-29 10:33:53.209857: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-12-29 10:33:53.210030: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-12-29 10:33:53.212456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-12-29 10:33:53.335970: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-12-29 10:33:53.348062: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-12-29 10:33:53.349549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:3d:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2022-12-29 10:33:53.349604: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.349644: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:53.349654: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:53.349664: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-12-29 10:33:53.349673: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-12-29 10:33:53.349682: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-12-29 10:33:53.349691: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-12-29 10:33:53.349701: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-12-29 10:33:53.352041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-12-29 10:33:53.352076: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-12-29 10:33:53.873305: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-12-29 10:33:53.873357: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2022-12-29 10:33:53.873366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2022-12-29 10:33:53.877072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22430 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:3d:00.0, compute capability: 8.6)
/usr/local/lib64/python3.8/site-packages/tensorflow/python/data/ops/dataset_ops.py:3503: UserWarning: Even though the tf.config.experimental_run_functions_eagerly option is set, this option does not apply to tf.data functions. tf.data functions are still traced and executed as graphs.
warnings.warn(
2022-12-29 10:33:53.990568: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2022-12-29 10:33:53.997553: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2600000000 Hz
2022-12-29 10:33:54.015124: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-12-29 10:33:54.679072: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-12-29 10:33:54.679260: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
625/625 [==============================] - 5s 7ms/step - loss: 0.4327
[[0.44245386]
[0.6916534 ]
[0.49306032]
[0.98741436]
[0.7631112 ]]
[[0.48893338]
[0.54947186]
[0.40105245]
[0.56347597]
[0.32270208]]

Slow training for tf.keras.layers.Embedding with the new optimizers

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04.6 LTS
  • TensorFlow installed from (source or binary): binary, presumably
  • TensorFlow version (use command below): v2.12.0-rc1-12-g0db597d0d75
  • Python version: 3.10.12 (main, Jun 7 2023, 12:45:35) [GCC 9.4.0]
  • Bazel version (if compiling from source): n/a
  • GPU model and memory: NVIDIA Tesla T4 16GB
  • Exact command to reproduce: Code in colab here.

Describe the problem.

I'm seeing a significant performance regression when testing training with large embedding tables (tf.keras.layers.Embedding) with some of the new optimizers compared to the legacy optimizers. Times are logged from the last cell of the colab notebook; for larger tables this can be a severalfold slowdown. In this tiny test I see the issue with SGD, Adadelta, Adagrad, Adamax, and Ftrl. Relative to legacy, the new Nadam optimizer performs similarly and the new Adam and RMSProp optimizers speed up.

As a side-note, I'm not certain if this should be filed as a separate issue (and perhaps to XLA directly) but I am also seeing slowdowns when passing jit_compile=True to model.compile with some of the new optimizers. In the colab notebook, altering one line:
model.compile(optimizer=o, loss='binary_crossentropy')
to
model.compile(optimizer=o, loss='binary_crossentropy', jit_compile=True)
in the final cell displays a further slowdown for SGD, Adagrad, and FTRL optimizers, for example.

Describe the current behavior.

Rough performance figures summarized from this test (all ms/epoch for the same type of random data):

  • SGD: 97.2 legacy vs 605.1 new
  • Adadelta: 69.6 legacy vs 842.1 new
  • Adagrad: 66.9 legacy vs 88.2 new
  • Adamax: 72.7 legacy vs 898.6 new
  • FTRL: 71.5 legacy vs 983.2 new

Standalone code to reproduce the issue.

Code in colab here.
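
Since the benchmark lives in the external colab, here is a minimal sketch of the kind of comparison being made (model, table size, and data are assumptions, not the actual benchmark):

import time
import numpy as np
import tensorflow as tf

def benchmark(optimizer, vocab=200_000, dim=64, batch=256, steps=100):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=vocab, output_dim=dim),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=optimizer, loss="binary_crossentropy")
    x = np.random.randint(0, vocab, size=(batch * steps, 16))
    y = np.random.randint(0, 2, size=(batch * steps, 1))
    start = time.time()
    model.fit(x, y, batch_size=batch, epochs=1, verbose=0)
    return time.time() - start

print("new SGD   :", benchmark(tf.keras.optimizers.SGD(0.01)))
print("legacy SGD:", benchmark(tf.keras.optimizers.legacy.SGD(0.01)))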

Feature request: Add a "pandas field selection layer" to allow saving a specification of what inputs are needed in what order / what the output of the model corresponds to

System information.

TensorFlow version (you are using): 2.11.0 (though it should not play a role)
Are you willing to contribute it (Yes/No) : No (I am not familiar enough with the Keras internals, it would take me too much time to get familiar with these)

Describe the feature and the current behavior/state.

I regularly perform error correction / post-processing of data where the data are available as a big pandas dataframe, with each "entry to process" as a row and each data field as a column. Usually the input dataframe has many more columns than I end up using - effectively, I use only a few of the features in the end. To remember which columns I use, and in which order, I end up having to save, alongside the Keras model, a specification of the ordered list of feature columns used as input. This is tiring and error prone to do by hand: keeping the correct spec alongside the correct Keras model dump, and so on.

This reminds me of the "problem" faced when normalizing / denormalizing the model's inputs and outputs. That used to be a pain (the means and stds had to be saved separately and managed by hand), but it is now easy to manage thanks to the Normalization layers: since they are part of the network, the user no longer needs to store, restore, and apply these coefficients by hand, nor manage extra files that must be kept alongside the Keras model dump (simple in theory but error prone in practice). These simple Normalization layers are a huge gain, and I would like to leverage the same idea for feature selection / output labeling.

Therefore, my question is: could we add a new "pandas field selection layer" that makes the specification of which columns to use from a pandas dataframe, and in which order, part of the Keras model itself? This would remove the tiring, error-prone bookkeeping of saving and restoring this spec that users currently have to do by hand.

This could also be used "in reverse" to automatically turn the Keras model output into a pandas dataframe with named column(s). This implicitly documents the model as a whole: things like "what is it producing and in what units" can be embedded in the network through the names of the output column(s).

I am not an expert, but an API along the following lines could be useful, partially copied from https://keras.io/api/layers/preprocessing_layers/numerical/normalization/ (open to discussion / suggestions for improvement, of course :) ):

tf.keras.layers.PandasSelection(
    list_columns, invert=False, **kwargs
)

with arguments:

  • list_columns: the list of pandas columns, like ["column_name_feature_1", "column_name_feature_2", ...]
  • invert: if False, the layer can be used as the input layer to the model, and takes in a pandas dataframe, and will generate the individual samples with the ordered features corresponding to list_columns. If True, the layer can be used as the output of the model, and transforms the purely numeric output of the keras model into a pandas with column names for each output as specified by list_columns.

The layer would raise a runtime exception if the requested columns cannot be found in the input pandas dataframe. It would also expose a couple of members: a .list_columns attribute returning the list_columns list, and a .reverse method returning the "reversed" version of the layer.

So my models would now look like the following (of course, something other than a fully connected layer could be used at the start and end of the "real" neural network layers):

# defining the inputs / outputs to use
pandas_column_extraction_layer = layers.PandasSelection(["column_name_1", "column_name_2", ...], invert=False)  # this is new!!
pandas_inv_labeling_layer = layers.PandasSelection(["output_name_1", "output_name_2", "output_name_3"], invert=True)  # this is new!!

# preparing my normalization / denormalization
labels_inv_normalization_layer = layers.Normalization(invert=True, input_shape=[len(pandas_inv_labeling_layer.list_columns),], axis=None)
labels_inv_normalization_layer.adapt(pandas_inv_labeling_layer.reverse(pandas_training_data))
#
predictors_normalization_layer = layers.Normalization()
predictors_normalization_layer.adapt(pandas_column_extraction_layer(pandas_training_data))

# the model itself; everything is part of the model spec and is fully saved / loaded with the default keras functions
input_layer = pandas_column_extraction_layer  # this is new!! takes in a pandas with well labeled columns
normalized_input = predictors_normalization_layer(input_layer)
fully_connected_start = keras.layers.Dense(60, activation="relu")(normalized_input)
... [the internals of the network]
fully_connected_end = keras.layers.Dense(60, activation="relu")(previous_internal_layer)
internal_output = keras.layers.Dense(len(pandas_inv_labeling_layer.list_columns))(fully_connected_end)
denormalized_output = labels_inv_normalization_layer(internal_output)
pandas_output = pandas_inv_labeling_layer(denormalized_output)  # this is new!! outputs a pandas with well labeled columns

keras_model = keras.Model(inputs=input_layer, outputs=pandas_output)

Now calling:

pandas_out = keras_model(pandas_in)

would work out of the box, and pandas_out is a pandas dataframe with the same number of rows as pandas_in, and the set of columns defined in pandas_inv_labeling_layer.list_columns, and all of this metadata is saved / restored with the save and load_model API.

Will this change the current api? How?

This will not change any existing API, this will only add an extra layer that can be used if the user wants and provides "automagic" management of metadata and inputs and outputs specs by leveraging pandas datasets labeling.

Who will benefit from this feature?

Potentially, all users who use pandas as an input to their Keras model, and use a given subset / ordering of the pandas file as an input. These users will not need any longer to implement the bookkeeping themselves, and can delegate it to a Keras layer that is part of the model spec, dump, load, etc.

Contributing

  • Do you want to contribute a PR? (yes/no): No (I am not familiar enough with the Keras internals, it would take me quite a bit of time to get familiar with these)

tf.keras.mixed_precision.LossScaleOptimizer causes Graph execution error when using tfa.optimizers.MultiOptimizer and mixed_precision

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): google colab
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.11.0
  • Python version: 3.8
  • Bazel version (if compiling from source):
  • GPU model and memory: no idea what is used in google colab

Describe the problem.
I want to use mixed_precision and multi_optimizer at the same time.

Describe the current behavior.
TensorFlow crashes with a Graph execution error when using mixed precision with the MultiOptimizer.

Describe the expected behavior.
No crash.

Contributing.

  • Do you want to contribute a PR? (yes/no): no (I have no understanding of the involved code at all)

Standalone code to reproduce the issue.

https://colab.research.google.com/drive/1dk9SXd88aVwWHs7mshnX8sR8FVoEJOt-?usp=sharing

import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_addons as tfa
from tensorflow.keras import mixed_precision

policy = mixed_precision.Policy('mixed_float16')

# this line triggers the bug; comment it out to avoid it
mixed_precision.set_global_policy(policy)

ds_train, = tfds.load('mnist',  split=['train'], as_supervised=True,)

ds_train = ds_train.cache()
ds_train = ds_train.batch(32)
ds_train = ds_train.prefetch(tf.data.AUTOTUNE)

model = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10)
])
optimizers = [
    tf.keras.optimizers.Adam(learning_rate=0.001),
    tf.keras.optimizers.Adam(learning_rate=0.002)
]

optimizers_and_layers = [
    (optimizers[0], model.layers[:2]), 
    (optimizers[1], model.layers[2:])
]
optimizer = tfa.optimizers.MultiOptimizer(optimizers_and_layers)

model.compile(optimizer=optimizer, loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
model.fit(ds_train, epochs=1)
InvalidArgumentError: Graph execution error:

Detected at node 'cond_1/AssignAddVariableOp' defined at (most recent call last):
    File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel_launcher.py", line 16, in <module>
      app.launch_new_instance()
    File "/usr/local/lib/python3.8/dist-packages/traitlets/config/application.py", line 992, in launch_instance
      app.start()
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelapp.py", line 612, in start
      self.io_loop.start()
    File "/usr/local/lib/python3.8/dist-packages/tornado/platform/asyncio.py", line 149, in start
      self.asyncio_loop.run_forever()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 570, in run_forever
      self._run_once()
    File "/usr/lib/python3.8/asyncio/base_events.py", line 1859, in _run_once
      handle._run()
    File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
      self._context.run(self._callback, *self._args)
    File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 690, in <lambda>
      lambda f: self._run_callback(functools.partial(callback, future))
    File "/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py", line 743, in _run_callback
      ret = callback()
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 787, in inner
      self.run()
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 748, in run
      yielded = self.gen.send(value)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 365, in process_one
      yield gen.maybe_future(dispatch(*args))
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 268, in dispatch_shell
      yield gen.maybe_future(handler(stream, idents, msg))
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/kernelbase.py", line 543, in execute_request
      self.do_execute(
    File "/usr/local/lib/python3.8/dist-packages/tornado/gen.py", line 209, in wrapper
      yielded = next(result)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/ipkernel.py", line 306, in do_execute
      res = shell.run_cell(code, store_history=store_history, silent=silent)
    File "/usr/local/lib/python3.8/dist-packages/ipykernel/zmqshell.py", line 536, in run_cell
      return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2854, in run_cell
      result = self._run_cell(
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 2881, in _run_cell
      return runner(coro)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/async_helpers.py", line 68, in _pseudo_sync_runner
      coro.send(None)
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3057, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3249, in run_ast_nodes
      if (await self.run_code(code, result,  async_=asy)):
    File "/usr/local/lib/python3.8/dist-packages/IPython/core/interactiveshell.py", line 3326, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "<ipython-input-6-287dce801bee>", line 18, in <module>
      model.fit(ds_train, epochs=1)
    File "/usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1650, in fit
      tmp_logs = self.train_function(iterator)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1249, in train_function
      return step_function(self, iterator)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1233, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1222, in run_step
      outputs = model.train_step(data)
    File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1027, in train_step
      self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
    File "/usr/local/lib/python3.8/dist-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 588, in minimize
      return self.apply_gradients(grads_and_vars, name=name)
    File "/usr/local/lib/python3.8/dist-packages/keras/mixed_precision/loss_scale_optimizer.py", line 837, in apply_gradients
      maybe_apply_op = tf.__internal__.smart_cond.smart_cond(
    File "/usr/local/lib/python3.8/dist-packages/keras/mixed_precision/loss_scale_optimizer.py", line 821, in do_not_apply_fn
      return self._optimizer.iterations.assign_add(1, read_value=False)
Node: 'cond_1/AssignAddVariableOp'
Cannot update variable with shape [0] using a Tensor with shape [], shapes must be equal.
	 [[{{node cond_1/AssignAddVariableOp}}]] [Op:__inference_fn_with_cond_1266]

tf.keras.models.model_from_json() missing a safe_mode parameter

When loading a model with a lambda layer from a json model config, TensorFlow/Keras 2.13 raises the following error:

ValueError: Requested the deserialization of a Lambda layer with a Python 'lambda' inside it. This carries a potential risk of arbitrary code execution and thus it is disallowed by default. If you trust the source of the saved model, you can pass 'safe_mode=False' to the loading function in order to allow Lambda layer loading.

However, tf.keras.models.model_from_json() does not have a safe_mode parameter, so there does not seem to be a way to load models with Lambda layers from a json config. This issue does not appear in TensorFlow/Keras 2.12.

Standalone code to reproduce the issue

import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input, Lambda

inputs = Input(shape=(1,))
x = Lambda(lambda x: x*2)(inputs)
out = Dense(1)(x)
model = Model(inputs=inputs,outputs=out)

model_config = model.to_json()
tf.keras.models.model_from_json(model_config)

Errors in loading TF model

Issue Type

Bug

Source

binary

Tensorflow Version

tensorflow-macos 2.9.2, colab 2.8.2, linux 2.8.2

Custom Code

No

OS Platform and Distribution

mac m1, linux, colab

Mobile device

No response

Python version

3.8

Bazel version

No response

GCC/Compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current Behaviour?

Error when loading the model:

  1. model of tf.keras.layers.Add with a const in the second arg:
     save_format tf - ok; save_format h5 - error
  2. model of tf.keras.layers.Add with a const in the first arg:
     save_format tf - error; save_format h5 - error
  3. model of x1 == x2:
     save_format tf - error; save_format h5 - error

Standalone code to reproduce the issue

Colab Code Link

1

import numpy as np
import tensorflow as tf
x1 = tf.keras.layers.Input(shape=(1, 2, 3))
x2 = tf.constant(np.ones([1,1,1,3]))
x = tf.keras.layers.Add()([x1, x2])
model = tf.keras.Model(inputs=[x1], outputs=[x])
model.compile()
print(model.predict(np.random.rand(1, 1, 2, 3)))
model.save("Mymodel", save_format='tf')
loaded_model = tf.keras.models.load_model("Mymodel") #ok
model.save("model.h5")
model = tf.keras.models.load_model("model.h5") #error

2

import numpy as np
import tensorflow as tf
x1 = tf.keras.layers.Input(shape=(1, 2, 3))
x2 = tf.constant(np.ones([1,1,3]))
x = tf.keras.layers.Add()([x2, x1])
model = tf.keras.Model(inputs=[x1], outputs=[x])
model.compile()
print(model.predict(np.random.rand(1, 1, 2, 3)))
model.save("Mymodel",save_format='tf')
loaded_model = tf.keras.models.load_model("Mymodel") #error
model.save("model.h5")
model = tf.keras.models.load_model("model.h5") #error

3

import numpy as np
import tensorflow as tf
x1 = tf.keras.layers.Input(shape=(1, 2, 3))
x2 = tf.keras.layers.Input(shape=(1, 2, 3))
x = x1 == x2
model = tf.keras.Model(inputs=[x1, x2], outputs=[x])
model.compile()
data= np.random.rand(1, 1, 2, 3)
print(model.predict([data, data]))
model.save("Mymodel",save_format='tf')
loaded_model = tf.keras.models.load_model("Mymodel") #error
model.save("model.h5")
model = tf.keras.models.load_model("model.h5") #error

Relevant log output

1

[[[[1.4474285 1.5162606 1.9524314]
   [1.6375002 1.3645701 1.248564 ]]]]
INFO:tensorflow:Assets written to: Mymodel/assets
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-df88e1175bcb> in <module>()
      8 loaded_model = tf.keras.models.load_model("Mymodel") #ok
      9 model.save("model.h5")
---> 10 model = tf.keras.models.load_model("model.h5") #error

1 frames
/usr/local/lib/python3.7/dist-packages/keras/layers/merge.py in <setcomp>(.0)
     94                        f'Got {len(input_shape)} inputs. '
     95                        f'Full input_shape received: {input_shape}')
---> 96     batch_sizes = {s[0] for s in input_shape if s} - {None}
     97     if len(batch_sizes) > 1:
     98       raise ValueError(

TypeError: unhashable type: 'list'

2

WARNING:tensorflow:5 out of the last 5 calls to <function Model.make_predict_function.<locals>.predict_function at 0x7fc230dc1c20> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.
[[[[1.2783022 1.9072516 1.3729118]
   [1.1954207 1.6057456 1.5275352]]]]
INFO:tensorflow:Assets written to: Mymodel/assets
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-8-9aaa3c26f4cc>](https://localhost:8080/#) in <module>()
      6 print(model.predict(np.random.rand(1, 1, 2, 3)))
      7 model.save("Mymodel",save_format='tf')
----> 8 loaded_model = tf.keras.models.load_model("Mymodel") #error
      9 model.save("model.h5")
     10 model = tf.keras.models.load_model("model.h5") #error

1 frames
[/usr/local/lib/python3.7/dist-packages/keras/backend.py](https://localhost:8080/#) in ndim(x)
   1499 
   1500   """
-> 1501   return x.shape.rank
   1502 
   1503 

AttributeError: Exception encountered when calling layer "add_2" (type Add).

'list' object has no attribute 'shape'

Call arguments received:
  • inputs=[[[['tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)', 'tf.Tensor(shape=(), dtype=float32)']]], 'tf.Tensor(shape=(None, 1, 2, 3), dtype=float32)']

3

[[[[ True  True  True]
   [ True  True  True]]]]
INFO:tensorflow:Assets written to: Mymodel/assets
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-6ed972a3dad2> in <module>()
      7 print(model.predict([data, data]))
      8 model.save("Mymodel",save_format='tf')
----> 9 loaded_model = tf.keras.models.load_model("Mymodel") #error
     10 model.save("model.h5")
     11 model = tf.keras.models.load_model("model.h5") #error

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py in op_dispatch_handler(*args, **kwargs)
   1074         if iterable_params is not None:
   1075           args, kwargs = replace_iterable_params(args, kwargs, iterable_params)
-> 1076         result = api_dispatcher.Dispatch(args, kwargs)
   1077         if result is not NotImplemented:
   1078           return result

TypeError: Missing required positional argument

Silent coercion from logits to probabilities

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): M1 Mac
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): unknown 2.10.0
  • Python version: Python 3.10.5
  • Bazel version (if compiling from source): N/A
  • GPU model and memory: M1 Mac
  • Exact command to reproduce: N/A

Describe the problem.

The cross-entropy code silently rescales the output vector so that it sums to one. It is very easy to forget the from_logits parameter (which is False by default) and mistakenly pass logits to cross entropy; they don't sum to one, but they get silently rescaled to sum to one regardless.

The code snippet in question: https://github.com/keras-team/keras/blob/master/keras/backend.py#L5589

def categorical_crossentropy(target, output, from_logits=False, axis=-1):
    ...
    # Adjust the predictions so that the probability of
    # each class for every sample adds up to 1
    # This is needed to ensure that the cross entropy is
    # computed correctly.
    output = output / tf.reduce_sum(output, axis, True)

This sort of error is easy to catch (by checking that the output isn't outrageously far from summing to 1), and doing so would prevent a lot of common mistakes by Keras beginners.

Describe the current behavior.

If I mistakenly pass logits to cross entropy loss and forget about the from_logits parameter, I will get no error but will get incorrect results

Describe the expected behavior.

If I pass something to cross entropy loss that's obviously not probabilities, it's probably a user error and the command should fail.

Contributing.

  • Do you want to contribute a PR? (yes/no): yes
  • If yes, please read this page for instructions
  • Briefly describe your candidate solution(if contributing): Add an assertion that fails if the output sums to a value that isn't within a tolerance of 1 (see the sketch below).
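
A minimal sketch of the kind of check proposed (hypothetical helper, not the actual Keras implementation):

import tensorflow as tf

def check_probabilities(output, axis=-1, atol=1e-3):
    """Raise if `output` is clearly not a probability distribution along `axis`."""
    sums = tf.reduce_sum(output, axis=axis)
    tf.debugging.assert_near(
        sums, tf.ones_like(sums), atol=atol,
        message="categorical_crossentropy received values that do not sum to 1; "
                "did you mean to pass from_logits=True?")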

Standalone code to reproduce the issue.

import tensorflow as tf
import numpy as np
y_true = [[0, 1, 0], [0, 0, 1]]
y_logits = [[0.15, 0.05, 0], [0.01, 0.08, 0.01]]
# Assert that the `y_logits` don't sum to 1 (so they're not valid probabilities)
assert not np.any(np.array(y_logits).sum(axis=1) == 1)
# But we can still pass them to categorical_crossentropy regardless ):
# no error is emitted
loss = tf.keras.losses.categorical_crossentropy(y_true, y_logits)
# The loss is still calculated fine, even though it's meaningless
assert np.isclose(
    loss.numpy(),
    np.array([1.3862944, 2.302585 ])
).all()

Thanks!

Fail to load model with == op (operator.eq)

System information.

  • Have I written custom code (as opposed to using a stock example script provided in Keras): NO
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Colab
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.13 / 2.15.0-dev20230903

Describe the problem.

When the model contains the built-in equality operator (== or operator.eq), loading fails, but when using tf.math.equal it works.

Describe the expected behavior.

The model should load when it uses the built-in == operator, just as the other built-in operators (+, -, *, /) do.

  • Do you want to contribute a PR? (yes/no): no

Standalone code to reproduce the issue.

gist here
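
Since the gist is external, a minimal sketch of the repro I believe it contains, pieced together from the traceback (the helper name comes from the trace; the rest is assumed):

import operator
import numpy as np
import tensorflow as tf

def create_model_save_predict_load(eq_func):
    x1 = tf.keras.layers.Input(shape=(1, 2, 3))
    x2 = tf.keras.layers.Input(shape=(1, 2, 3))
    out = eq_func(x1, x2)  # == / operator.eq vs. tf.math.equal
    model = tf.keras.Model(inputs=[x1, x2], outputs=[out])
    model.save("model.keras")

    data = np.random.rand(1, 1, 2, 3)
    model.predict([data, data])

    tf.keras.models.load_model("model.keras")  # fails here for == / operator.eq

create_model_save_predict_load(tf.math.equal)  # loads fine
create_model_save_predict_load(operator.eq)    # raises TypeError on load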

Source code / logs.

TypeError                                 Traceback (most recent call last)
[<ipython-input-5-db50d76d7b09>](https://localhost:8080/#) in <cell line: 2>()
      1 # fails load model
----> 2 create_model_save_predict_load(operator.eq)

9 frames
[<ipython-input-3-c5cb0ac5e529>](https://localhost:8080/#) in create_model_save_predict_load(eq_func)
     12     model.predict([data1, data2])
     13 
---> 14     tf.keras.models.load_model("model.keras")  # fails here when is ==  or operator.eq

[/usr/local/lib/python3.10/dist-packages/keras/src/saving/saving_api.py](https://localhost:8080/#) in load_model(filepath, custom_objects, compile, safe_mode, **kwargs)
    252                 f"with the native Keras format: {list(kwargs.keys())}"
    253             )
--> 254         return saving_lib.load_model(
    255             filepath,
    256             custom_objects=custom_objects,

[/usr/local/lib/python3.10/dist-packages/keras/src/saving/saving_lib.py](https://localhost:8080/#) in load_model(filepath, custom_objects, compile, safe_mode)
    279 
    280     except Exception as e:
--> 281         raise e
    282     else:
    283         return model

[/usr/local/lib/python3.10/dist-packages/keras/src/saving/saving_lib.py](https://localhost:8080/#) in load_model(filepath, custom_objects, compile, safe_mode)
    244             # Construct the model from the configuration file in the archive.
    245             with ObjectSharingScope():
--> 246                 model = deserialize_keras_object(
    247                     config_dict, custom_objects, safe_mode=safe_mode
    248                 )

[/usr/local/lib/python3.10/dist-packages/keras/src/saving/serialization_lib.py](https://localhost:8080/#) in deserialize_keras_object(config, custom_objects, safe_mode, **kwargs)
    726     safe_mode_scope = SafeModeScope(safe_mode)
    727     with custom_obj_scope, safe_mode_scope:
--> 728         instance = cls.from_config(inner_config)
    729         build_config = config.get("build_config", None)
    730         if build_config:

[/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py](https://localhost:8080/#) in from_config(cls, config, custom_objects)
   3328                 # Revive Functional model
   3329                 # (but not Functional subclasses with a custom __init__)
-> 3330                 inputs, outputs, layers = functional.reconstruct_from_config(
   3331                     config, custom_objects
   3332                 )

[/usr/local/lib/python3.10/dist-packages/keras/src/engine/functional.py](https://localhost:8080/#) in reconstruct_from_config(config, custom_objects, created_layers)
   1503                 while layer_nodes:
   1504                     node_data = layer_nodes[0]
-> 1505                     if process_node(layer, node_data):
   1506                         layer_nodes.pop(0)
   1507                     else:

[/usr/local/lib/python3.10/dist-packages/keras/src/engine/functional.py](https://localhost:8080/#) in process_node(layer, node_data)
   1443                     input_tensors
   1444                 )
-> 1445             output_tensors = layer(input_tensors, **kwargs)
   1446 
   1447             # Update node index map.

[/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py](https://localhost:8080/#) in error_handler(*args, **kwargs)
     68             # To get the full stack trace, call:
     69             # `tf.debugging.disable_traceback_filtering()`
---> 70             raise e.with_traceback(filtered_tb) from None
     71         finally:
     72             del filtered_tb

[/usr/local/lib/python3.10/dist-packages/tensorflow/python/util/dispatch.py](https://localhost:8080/#) in op_dispatch_handler(*args, **kwargs)
   1252         if iterable_params is not None:
   1253           args, kwargs = replace_iterable_params(args, kwargs, iterable_params)
-> 1254         result = api_dispatcher.Dispatch(args, kwargs)
   1255         if result is not NotImplemented:
   1256           return result

TypeError: Missing required positional argument

TensorFlow 'NoneType' object is not subscriptable during fit()

I am building a TensorFlow model that takes 4 inputs and gives 2 outputs.
I first start with a pd.DataFrame:


train_targets = train_features[["content", "wording"]]
train_features = train_features[["text", "prompt_question", "prompt_title", "prompt_text"]]

Then, I use a generator to create the TensorFlow Dataset:

def generator():
    for text, prompt_question, prompt_title, prompt_text, content, wording in zip(train_features["text"], train_features["prompt_question"], train_features["prompt_title"], train_features["prompt_text"], train_targets["content"], train_targets["wording"]):
      yield {"text": text_vectorization(text), "prompt_question": text_vectorization(prompt_question), "prompt_title": text_vectorization(prompt_title), "prompt_text": text_vectorization(prompt_text)}, {"content": content, "wording": wording}

train_ds = tf.data.Dataset.from_generator(generator, output_types=({"text": tf.int64, "prompt_question": tf.int64, "prompt_title": tf.int64, "prompt_text": tf.int64}, {"content": tf.float32, "wording": tf.float32}))

Here is what one "row" of train_ds looks like:
({"text": [1, 0, 0, ..., 1, 1], "prompt_question": [1, 0, 0, ..., 1, 1], "prompt_title": [1, 0, 0, ..., 1, 1], "prompt_text": [1, 0, 0, ..., 1, 1]}, {"content": 2, "wording": 1}))
Every value is a tensor.


Type of text is <class 'tensorflow.python.framework.ops.EagerTensor'>
Type of prompt_question is <class 'tensorflow.python.framework.ops.EagerTensor'>
Type of prompt_title is <class 'tensorflow.python.framework.ops.EagerTensor'>
Type of prompt_text is <class 'tensorflow.python.framework.ops.EagerTensor'>
Type of content is <class 'tensorflow.python.framework.ops.EagerTensor'>
Type of wording is <class 'tensorflow.python.framework.ops.EagerTensor'>

Note that I use a TextVectorization layer from keras:


max_tokens = 20000
max_length = 600

text_vectorization = keras.layers.TextVectorization(
 max_tokens=max_tokens,
 output_mode="int",
 output_sequence_length=max_length,
)

text_vectorization.adapt(a list of all my texts)

At this point, train_ds contains no None values.

Here is my model:

def generate_model():
    input_names = train_features.columns.tolist()

    inputs = []

    for name in input_names:
        inputs.append(keras.layers.Input(shape=(None,), dtype="int64", name=name))

    concatenate = keras.layers.concatenate(inputs)
    
    embedded = keras.layers.Embedding(input_dim=max_tokens, output_dim=256, mask_zero=True)(concatenate)
    
    x = keras.layers.Bidirectional(keras.layers.LSTM(32))(embedded)
    x = keras.layers.Dropout(0.5)(x)

    output_content = keras.layers.Dense(1, activation="linear", name="content")(x)
    output_wording = keras.layers.Dense(1, activation="linear", name="wording")(x)

    model = keras.models.Model(inputs=inputs, outputs=[output_content, output_wording])

    return model

I compile it like this:
model.compile(loss={"content": "mean_squared_error", "wording": "mean_squared_error"}, optimizer="adam")

The error occurs when I try to fit the model:
history = model.fit(x=train_ds, batch_size=32, epochs=20, callbacks=[callbacks])
Here is the traceback:

TypeError: Exception encountered when calling layer 'forward_lstm_5' (type LSTM).
    
    'NoneType' object is not subscriptable
    
    Call arguments received by layer 'forward_lstm_5' (type LSTM):
      • inputs=tf.Tensor(shape=, dtype=float32)
      • mask=tf.Tensor(shape=, dtype=bool)
      • training=True
      • initial_state=None

It appears that the error arises from the LSTM layer:
x = keras.layers.Bidirectional(keras.layers.LSTM(32))(embedded)
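
One likely cause (an assumption based on the symptoms, not a confirmed diagnosis): Dataset.from_generator is called with output_types only, so every tensor has a completely unknown shape, and the dataset is never batched (the batch_size passed to fit does not batch a tf.data.Dataset), which leaves the LSTM without the shape information it needs. A sketch of the change I would try first:

output_signature = (
    {name: tf.TensorSpec(shape=(max_length,), dtype=tf.int64)
     for name in ["text", "prompt_question", "prompt_title", "prompt_text"]},
    {"content": tf.TensorSpec(shape=(), dtype=tf.float32),
     "wording": tf.TensorSpec(shape=(), dtype=tf.float32)},
)
train_ds = tf.data.Dataset.from_generator(generator, output_signature=output_signature)
train_ds = train_ds.batch(32)
history = model.fit(x=train_ds, epochs=20, callbacks=[callbacks])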
