giuseppegrieco / keras-tuner-cv

Extension for Keras Tuner that adds a set of classes implementing cross-validation techniques.

License: GNU General Public License v3.0

Language: Python (100%)
Topics: cross-validation, keras, keras-tuner, keras-tuner-cross-validation
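
For orientation, a minimal usage sketch assembled from the examples and issue reports further down this page; build_model, x_train, and y_train are placeholders, and inner_cv wraps any Keras Tuner tuner class, taking the hypermodel and a scikit-learn splitter as its first two positional arguments:

from keras_tuner import RandomSearch, Objective
from keras_tuner_cv.inner_cv import inner_cv
from sklearn.model_selection import KFold

# build_model(hp) is an ordinary Keras Tuner hypermodel function.
tuner = inner_cv(RandomSearch)(
    build_model,                                          # hypermodel first
    KFold(n_splits=5, random_state=12345, shuffle=True),  # cross-validator second
    objective=Objective("val_accuracy", direction="max"),
    save_output=True,    # keras-tuner-cv options used throughout the examples below
    save_history=True,
    directory="./out/inner-cv/",
    project_name="demo",
    max_trials=2,
)
tuner.search(x_train, y_train, validation_split=0.2, epochs=2)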

keras-tuner-cv's Introduction

Hi there 👋

Hi, I'm Giuseppe Grieco and I am a computer science enthusiast with a passion for solving complex problems using cutting-edge technologies. I am particularly interested in algorithms, DevOps, machine learning, competitive programming, and backend development.

In my free time, I enjoy staying up-to-date with the latest developments in the field and participating in online coding challenges on Codeforces.

If you have any questions or just want to chat about computer science, feel free to reach out to me on LinkedIn or Twitter. I'd love to connect!

keras-tuner-cv's People

Contributors

feheragyar, giuseppegrieco, jaalu


keras-tuner-cv's Issues

Type error when running inner cross-validation

When trying examples/inner_cv, I got a TypeError after calling tuner.search(...) at lines 52-60:

INFO:tensorflow:
------------------------------
Inner Cross-Validation 5/5
------------------------------

Epoch 1/2
1/1 [==============================] - 1s 1s/step - loss: 2.3470 - accuracy: 0.1125 - val_loss: 2.1534 - val_accuracy: 0.2590
Epoch 2/2
1/1 [==============================] - 0s 260ms/step - loss: 2.1707 - accuracy: 0.2352 - val_loss: 2.0040 - val_accuracy: 0.4114
1/1 [==============================] - 0s 219ms/step
1/1 [==============================] - 0s 82ms/step
1/1 [==============================] - 0s 305ms/step - loss: 2.0043 - accuracy: 0.4132
1/1 [==============================] - 0s 184ms/step - loss: 2.0040 - accuracy: 0.4114
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [20], line 1
----> 1 tuner.search(
      2     x_train,
      3     y_train,
      4     validation_split=0.2,
      5     batch_size="full-batch",
      6     validation_batch_size="full-batch",
      7     epochs=2,
      8     verbose=True,
      9 )

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner_cv/inner_cv.py:78, in inner_cv.<locals>.InnerCV.search(self, *fit_args, **fit_kwargs)
     68     warnings.warn(
     69         "`Tuner.run_trial()` returned None. It should return one of "
     70         "float, dict, keras.callbacks.History, or a list of one "
   (...)
     75         stacklevel=2,
     76     )
     77 else:
---> 78     metrics = tuner_utils.convert_to_metrics_dict(
     79         results, self.oracle.objective, "Tuner.run_trial()"
     80     )
     81     metrics.update(get_metrics_std_dict(results))
     82     self.oracle.update_trial(
     83         trial.trial_id,
     84         metrics,
     85     )

TypeError: convert_to_metrics_dict() takes 2 positional arguments but 3 were given
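
Not an official fix, but a hedged compatibility shim, under the assumption (discussed in the duplicate issue further down this page) that newer keras-tuner releases dropped the third positional argument of convert_to_metrics_dict() while keras_tuner_cv/inner_cv.py still passes it. Because inner_cv.py resolves the function through the tuner_utils module at call time, as the traceback above shows, re-pointing that attribute before constructing the tuner should let the old call site keep working:

import inspect
from keras_tuner.engine import tuner_utils

_orig_convert = tuner_utils.convert_to_metrics_dict

if len(inspect.signature(_orig_convert).parameters) == 2:
    def _compat_convert_to_metrics_dict(results, objective, *_legacy):
        # Swallow the legacy third argument ("Tuner.run_trial()") that
        # keras-tuner-cv still passes on newer keras-tuner versions.
        return _orig_convert(results, objective)

    tuner_utils.convert_to_metrics_dict = _compat_convert_to_metrics_dict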

GPU issues?

I have been using your extension on CPUs and it runs perfectly. I recently moved over to using a GPU, and the loss calculation now looks completely chaotic. Are there issues in the implementation that prohibit the use of GPUs?
Here is a snippet showing the problems: the best loss is None; the recovered weights produce previously unseen loss values after early stopping; and because the best loss is None, the best hyperparameters remain as they were set for the very first trial:


Inner Cross-Validation 5/5

Epoch 1/50
6/6 [==============================] - 5s 575ms/step - loss: 0.5369 - mean_squared_error: 0.5369 - mean_absolute_error: 0.6359 - mean_absolute_percentage_error: 263.9126 - root_mean_squared_error: 0.7327 - val_loss: 0.0721 - val_mean_squared_error: 0.0721 - val_mean_absolute_error: 0.2148 - val_mean_absolute_percentage_error: 22.1264 - val_root_mean_squared_error: 0.2685
Epoch 2/50
6/6 [==============================] - 3s 475ms/step - loss: 0.1652 - mean_squared_error: 0.1652 - mean_absolute_error: 0.3106 - mean_absolute_percentage_error: 323.5719 - root_mean_squared_error: 0.4065 - val_loss: 0.0850 - val_mean_squared_error: 0.0850 - val_mean_absolute_error: 0.2492 - val_mean_absolute_percentage_error: 25.4391 - val_root_mean_squared_error: 0.2915
Epoch 3/50
6/6 [==============================] - 3s 478ms/step - loss: 0.1079 - mean_squared_error: 0.1079 - mean_absolute_error: 0.2405 - mean_absolute_percentage_error: 256.0751 - root_mean_squared_error: 0.3284 - val_loss: 0.0103 - val_mean_squared_error: 0.0103 - val_mean_absolute_error: 0.0714 - val_mean_absolute_percentage_error: 7.3397 - val_root_mean_squared_error: 0.1013
Epoch 4/50
6/6 [==============================] - 3s 478ms/step - loss: 0.1035 - mean_squared_error: 0.1035 - mean_absolute_error: 0.1980 - mean_absolute_percentage_error: 354.6868 - root_mean_squared_error: 0.3217 - val_loss: 0.0538 - val_mean_squared_error: 0.0538 - val_mean_absolute_error: 0.2179 - val_mean_absolute_percentage_error: 22.2260 - val_root_mean_squared_error: 0.2319
Epoch 5/50
6/6 [==============================] - 3s 481ms/step - loss: 0.1149 - mean_squared_error: 0.1149 - mean_absolute_error: 0.2556 - mean_absolute_percentage_error: 254.6845 - root_mean_squared_error: 0.3389 - val_loss: 0.0229 - val_mean_squared_error: 0.0229 - val_mean_absolute_error: 0.1178 - val_mean_absolute_percentage_error: 12.0714 - val_root_mean_squared_error: 0.1513
Epoch 6/50
6/6 [==============================] - 2s 381ms/step - loss: 0.0978 - mean_squared_error: 0.0978 - mean_absolute_error: 0.2223 - mean_absolute_percentage_error: 208.5932 - root_mean_squared_error: 0.3127 - val_loss: 0.0734 - val_mean_squared_error: 0.0734 - val_mean_absolute_error: 0.2140 - val_mean_absolute_percentage_error: 22.2007 - val_root_mean_squared_error: 0.2710
Epoch 7/50
6/6 [==============================] - 1s 225ms/step - loss: 0.0789 - mean_squared_error: 0.0789 - mean_absolute_error: 0.2038 - mean_absolute_percentage_error: 213.5430 - root_mean_squared_error: 0.2808 - val_loss: 0.0186 - val_mean_squared_error: 0.0186 - val_mean_absolute_error: 0.0969 - val_mean_absolute_percentage_error: 10.0373 - val_root_mean_squared_error: 0.1364
Epoch 8/50
6/6 [==============================] - 1s 228ms/step - loss: 0.0708 - mean_squared_error: 0.0708 - mean_absolute_error: 0.1652 - mean_absolute_percentage_error: 276.1188 - root_mean_squared_error: 0.2662 - val_loss: 0.0087 - val_mean_squared_error: 0.0087 - val_mean_absolute_error: 0.0701 - val_mean_absolute_percentage_error: 7.1587 - val_root_mean_squared_error: 0.0935
Epoch 9/50
6/6 [==============================] - 1s 219ms/step - loss: 0.0676 - mean_squared_error: 0.0676 - mean_absolute_error: 0.1503 - mean_absolute_percentage_error: 282.9794 - root_mean_squared_error: 0.2600 - val_loss: 0.0090 - val_mean_squared_error: 0.0090 - val_mean_absolute_error: 0.0536 - val_mean_absolute_percentage_error: 5.5848 - val_root_mean_squared_error: 0.0950
Epoch 10/50
6/6 [==============================] - 2s 409ms/step - loss: 0.0663 - mean_squared_error: 0.0663 - mean_absolute_error: 0.1536 - mean_absolute_percentage_error: 242.2759 - root_mean_squared_error: 0.2574 - val_loss: 0.0151 - val_mean_squared_error: 0.0151 - val_mean_absolute_error: 0.0738 - val_mean_absolute_percentage_error: 7.7006 - val_root_mean_squared_error: 0.1227
Epoch 11/50
6/6 [==============================] - 3s 481ms/step - loss: 0.0696 - mean_squared_error: 0.0696 - mean_absolute_error: 0.1742 - mean_absolute_percentage_error: 183.5706 - root_mean_squared_error: 0.2638 - val_loss: 0.0395 - val_mean_squared_error: 0.0395 - val_mean_absolute_error: 0.1167 - val_mean_absolute_percentage_error: 12.3000 - val_root_mean_squared_error: 0.1986
Epoch 12/50
6/6 [==============================] - 2s 269ms/step - loss: 0.0635 - mean_squared_error: 0.0635 - mean_absolute_error: 0.1620 - mean_absolute_percentage_error: 193.5781 - root_mean_squared_error: 0.2520 - val_loss: 0.0258 - val_mean_squared_error: 0.0258 - val_mean_absolute_error: 0.0838 - val_mean_absolute_percentage_error: 8.8847 - val_root_mean_squared_error: 0.1606
Epoch 13/50
6/6 [==============================] - 2s 409ms/step - loss: 0.0594 - mean_squared_error: 0.0594 - mean_absolute_error: 0.1509 - mean_absolute_percentage_error: 208.7011 - root_mean_squared_error: 0.2438 - val_loss: 0.0404 - val_mean_squared_error: 0.0404 - val_mean_absolute_error: 0.1378 - val_mean_absolute_percentage_error: 14.4424 - val_root_mean_squared_error: 0.2011
Restoring model weights from the end of the best epoch.
Epoch 00013: early stopping
1/1 [==============================] - 1s 579ms/step
1/1 [==============================] - 0s 500ms/step
1/1 [==============================] - 1s 1s/step - loss: 0.0499 - mean_squared_error: 0.0499 - mean_absolute_error: 0.1130 - mean_absolute_percentage_error: 234.8392 - root_mean_squared_error: 0.2234
1/1 [==============================] - 1s 609ms/step - loss: 0.1864 - mean_squared_error: 0.1864 - mean_absolute_error: 0.2046 - mean_absolute_percentage_error: 106.4081 - root_mean_squared_error: 0.4317
Trial 1 Complete [00h 02m 55s]

Best val_loss So Far: None
Total elapsed time: 00h 02m 55s
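
Not a fix, but a first diagnostic under the assumption that part of the chaos is GPU nondeterminism: seeding everything and enabling TensorFlow's deterministic ops (available from TF 2.8/2.9) makes runs repeatable, which helps separate ordinary GPU nondeterminism from a genuine bug such as the None best loss:

import random
import numpy as np
import tensorflow as tf

seed = 12345
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
tf.config.experimental.enable_op_determinism()  # TF >= 2.8; raises on older versions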

Error updating hyperparameters in "Best so far"

Hello, I found a hyperparameter error in "Best so far" after using RandomSearch. As the screenshot shows, when num_layers = 11 in "Best so far", units_11 should be None (with 11 layers, only units_0 through units_10 are active), but the opposite is true. I don't know why this happens, and I would really appreciate your help.
The relevant code is as follows:
def build_model(hp):
    model = keras.Sequential()
    for i in range(hp.Int('num_layers', 1, 12)):
        # model.add(layers.Dense(units=hp.Choice('units_' + str(i), values=[4,8,16,32,64,128,256,512]), activation='relu'))
        model.add(layers.Dense(units=hp.Int('units_' + str(i), min_value=16, max_value=512, step=16), activation='relu'))
    model.add(Dropout(rate=hp.Float('dropout_rate', min_value=0.0, max_value=0.5, step=0.1)))
    batch_size = hp.Int('batch_size', min_value=4, max_value=64, step=4)
    model.add(Dense(1, kernel_initializer=initializer))
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=[keras.metrics.RootMeanSquaredError()])
    return model

tuner_RandomSearch = inner_cv(RandomSearch)(
    build_model,
    KFold(n_splits=10, random_state=2024, shuffle=True),
    save_output=True,
    save_history=True,
    objective=keras_tuner.Objective("val_root_mean_squared_error", direction="min"),
    directory='DNN240503',
    project_name='randomsearch_Int_L1_112',
    seed=2024,
    max_trials=100,
    overwrite=True,
    allow_new_entries=True,
)
my_callbacks = [
    keras.callbacks.EarlyStopping(monitor='val_root_mean_squared_error',
                                  patience=10, restore_best_weights=True)
]
tuner_RandomSearch.search(
    train_validation_X.values,
    train_validation_Y.values,
    validation_split=0.25,
    epochs=500,
    callbacks=my_callbacks,  # my_callbacks is already a list; [my_callbacks] would nest it
    verbose=True)

(screenshot: the "Best so far" column shows a value for units_11 even though num_layers = 11)
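
A possible explanation, hedged: Keras Tuner keeps every hyperparameter ever registered in the search space, so the display can show a stale value for units_11 even in trials where the layer loop never reached index 11. One way to declare the per-layer units as conditional is Keras Tuner's hp.conditional_scope; a minimal sketch, mirroring the snippet above:

def build_model(hp):
    model = keras.Sequential()
    num_layers = hp.Int('num_layers', 1, 12)
    for i in range(num_layers):
        # units_i is declared active only when num_layers > i, so trials
        # that never build layer i can report None instead of a stale value.
        with hp.conditional_scope('num_layers', list(range(i + 1, 13))):
            units = hp.Int('units_' + str(i), min_value=16, max_value=512, step=16)
        model.add(layers.Dense(units=units, activation='relu'))
    model.add(Dropout(rate=hp.Float('dropout_rate', min_value=0.0, max_value=0.5, step=0.1)))
    model.add(Dense(1, kernel_initializer=initializer))
    model.compile(optimizer='adam', loss='mean_squared_error',
                  metrics=[keras.metrics.RootMeanSquaredError()])
    return model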

tuner initialisation causes error in example/inner_cv.py

Thank you for this useful extension of KerasTuner. Unfortunately, I had an issue while reproducing your examples:

The initialisation of the tuner in example/inner_cv.py (lines 49-50) causes an error because the positional arguments are given in the wrong order.
It should be:

tuner = inner_cv(RandomSearch)(
    build_model,                                           # hypermodel first
    KFold(n_splits=5, random_state=12345, shuffle=True),   # crossvalidator second
    save_output=True,
    save_history=True,
    objective=Objective("val_accuracy", direction="max"),
    project_name="0",
    directory="./out/inner-cv/",
    seed=12345,
    overwrite=False,
    max_trials=2,
)

Hyperband tuner cannot load model weights of previous rounds

In its second round, the Hyperband tuner loads model weights from the first round to continue the training. With keras-tuner-cv, this does not work:

Search: Running Trial #35

Value             |Best Value So Far |Hyperparameter
0.02              |0.02              |l2_regularization
0.99              |0.99              |bn_momentum
0.2               |0.2               |dropout_ti
0.2               |0.2               |dropout_context
0.2               |0.2               |dropout_trunk
cnn               |cnn               |layertype_ti
5                 |5                 |n_dim_emb_vvvo
0.01              |0.01              |lr
nadam             |nadam             |optimizer_type
2048              |2048              |batch_size
causal            |causal            |cnn_padding
20                |20                |cnn_filters
2                 |2                 |cnn_layers
9                 |3                 |tuner/epochs
3                 |0                 |tuner/initial_epoch
3                 |3                 |tuner/bracket
1                 |0                 |tuner/round
0025              |None              |tuner/trial_id

2023-09-04 16:01:33.885857: W tensorflow/core/util/tensor_slice_reader.cc:97] Could not open ./log/devpipeline_20230904144942/kt_log/devpipeline_20230904144942/trial_0025/checkpoint: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
Traceback (most recent call last):
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner_cv/inner_cv.py", line 119, in _try_run_and_update_trial
    self._run_and_update_trial(trial, *fit_args, **fit_kwargs)
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner_cv/inner_cv.py", line 84, in _run_and_update_trial
    results = self.run_trial(trial, *fit_args, **fit_kwargs)
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner_cv/inner_cv.py", line 228, in run_trial
    history, model = self._build_and_fit_model(
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner_cv/inner_cv.py", line 348, in _build_and_fit_model
    model = self._try_build(hp)
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner/engine/tuner.py", line 155, in _try_build
    model = self._build_hypermodel(hp)
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras_tuner/tuners/hyperband.py", line 432, in _build_hypermodel
    model.load_weights(self._get_checkpoint_fname(trial_id))
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/h5py/_hl/files.py", line 567, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/home/vzg/miniconda3/envs/tf/lib/python3.9/site-packages/h5py/_hl/files.py", line 231, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
Trial 35 Complete [00h 00m 01s]

Best val_mae So Far: 1.2140199661254882
Total elapsed time: 01h 11m 47s

Other tuners do not load weights during HPO, therefore, this issue is specific to Hyperband.
In tests with small hyperparameter spaces, this problem probably does not come up because Hyperband stops after the first round (see keras-tuner issue 676).

With keras-tuner-cv, the weights of a specific split's model would have to be loaded.
At the moment, Hyperband._build_hypermodel() (v1.3.5) looks for the saved weights in the trial directory (trial_xxxx) and expects a single set of weights with the default file name (see https://github.com/keras-team/keras-tuner/blob/v1.3.5/keras_tuner/tuners/hyperband.py#L432), but the directory contains several sets, one per split.
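
A hedged sketch of one possible direction, not a tested fix: override _build_hypermodel() so the reload is split-aware. The per-split checkpoint suffix and the attribute exposing the current split index below are hypothetical; keras-tuner-cv would need to provide both for this to actually work:

from keras_tuner.tuners import Hyperband

class SplitAwareHyperband(Hyperband):
    def _build_hypermodel(self, hp):
        # Build from scratch instead of calling super()._build_hypermodel(),
        # which would try to load the single default checkpoint.
        model = self.hypermodel.build(hp)
        trial_id = hp.values.get("tuner/trial_id")
        if trial_id is not None:
            base = self._get_checkpoint_fname(trial_id)  # .../trial_xxxx/checkpoint
            split = getattr(self, "_current_split", 0)   # hypothetical: active CV split index
            model.load_weights(f"{base}_split_{split}")  # hypothetical per-split naming
        return model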

cannot import name 'tuner_utils' from 'keras_tuner.engine'

Hello, when I excitedly installed keras-tuner-cv and prepared to run K-fold validation on the hyperparameters to improve the generalization ability of the model, I failed at the first step.
Specifically, when I run from keras_tuner_cv.outer_cv import OuterCV, I get the following error on the line from keras_tuner.engine import tuner_utils:
ImportError: cannot import name 'tuner_utils' from 'keras_tuner.engine' (D:\Anaconda\envs\py310\lib\site-packages\keras_tuner\engine\__init__.py)
My keras-tuner version is 1.0.5 and tensorflow-gpu is 2.9.0.
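
A hedged note, assuming the module layout is the culprit: keras-tuner-cv imports tuner_utils from keras_tuner.engine, which your error shows is not importable on the rather old 1.0.5 release, so upgrading keras-tuner (other issues on this page reference 1.3.x) is the likely fix. A quick check:

import keras_tuner
print(keras_tuner.__version__)  # 1.0.5 here; try: pip install --upgrade keras-tuner

# This is the import keras-tuner-cv performs; it should succeed after upgrading.
from keras_tuner.engine import tuner_utils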

TypeError with convert_to_metrics_dict() due to Argument Mismatch

I recently encountered a TypeError when using the inner_cv function from the library. The error suggests that the convert_to_metrics_dict() function is receiving an unexpected number of arguments.

TypeError: convert_to_metrics_dict() takes 2 positional arguments but 3 were given

Upon investigating, I noticed a recent commit titled "Fix: convert_to_metrics_dict no longer accepts a third argument" that seems to have removed the third argument from the call. I am unsure what exactly I'm doing wrong here.

from keras_tuner import (
    HyperParameters,
    BayesianOptimization,
    RandomSearch,
    Objective,
)
from keras_tuner_cv.outer_cv import OuterCV
from keras_tuner_cv.inner_cv import inner_cv
from keras_tuner_cv.utils import pd_inner_cv_get_result
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
import tensorflow as tf

def search_cnn_lstm_model(hp: HyperParameters):
    # Hyperparameters
    lr = hp.Float("lr", min_value=1e-4, max_value=1e-2, sampling="LOG")
    use_batch_norm = hp.Boolean("use_batch_norm", default=False)

    conv_filters = hp.Int("conv_filters", min_value=32, max_value=128, step=16)
    conv_kernel_size = hp.Int("conv_kernel_size", min_value=3, max_value=7, step=2)
    lstm_units = hp.Int("lstm_units", min_value=32, max_value=256, step=8)
    dense_units = hp.Int("dense_units", min_value=16, max_value=128, step=4)
    add_dense_layer = hp.Boolean("add_dense_layer", default=False)
    dense_dropout = hp.Float("dense_dropout", min_value=0.0, max_value=0.5, step=0.05)

    model = tf.keras.models.Sequential()

    model.add(
        tf.keras.layers.Conv1D(
            filters=conv_filters,
            kernel_size=conv_kernel_size,
            activation="relu",
            input_shape=(X_train.shape[1], X_train.shape[2]),
        )
    )
    model.add(tf.keras.layers.MaxPooling1D(pool_size=2))

    model.add(
        tf.keras.layers.LSTM(
            lstm_units,
            return_sequences=False,
        )
    )

    if use_batch_norm:
        model.add(tf.keras.layers.BatchNormalization())

    if add_dense_layer:
        model.add(tf.keras.layers.Dense(dense_units))

    model.add(tf.keras.layers.Dropout(dense_dropout))

    model.add(tf.keras.layers.Flatten())
    model.add(
        tf.keras.layers.Dense(
            units=1,
            activation="sigmoid",
        )
    )

    # Compile the model
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
    model.compile(
        optimizer=optimizer,
        loss=tf.keras.losses.BinaryCrossentropy(),
        metrics=[tf.keras.metrics.BinaryAccuracy()],
    )

    tf.keras.utils.plot_model(
        model,
        to_file="model/architecture_plots/cnn_lstm_model.png",
        show_shapes=True,
        show_layer_activations=True,
    )

    return model

(
    X_train,
    X_test,
    y_train,
    y_test,
    X_predict,
) = get_train_and_val_sets(sequence_length=21)

tuner = inner_cv(BayesianOptimization)(
    search_cnn_lstm_model,
    TimeSeriesSplit(n_splits=2),
    objective="val_loss",
    # objective=Objective("val_binary_accuracy", direction="max"),
    save_output=True,
    save_history=True,
    max_trials=max_trials,
    seed=42,
    executions_per_trial=2,
    directory="tmp/tb",
    project_name="cnn_lstm_vl_innercv",
)

# Tuning
tuner.search(
    X_train,
    y_train,
    epochs=120,
    shuffle=False,
    validation_data=(X_test, y_test),
    batch_size=72,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor="val_binary_accuracy", patience=10, mode="max"
        ),
    ],
    verbose=True,
)
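
A hedged pre-flight check before a long search, on the assumption that the installed keras-tuner is newer than the one the extension was written against: inspect the installed signature, and either pin keras-tuner to an older 1.3.x release or apply a shim like the one sketched under the first TypeError issue above:

import inspect
from keras_tuner.engine import tuner_utils

n_params = len(inspect.signature(tuner_utils.convert_to_metrics_dict).parameters)
print(f"convert_to_metrics_dict takes {n_params} parameters")
# 3 -> matches keras-tuner-cv's call; 2 -> the TypeError above will occur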
