larq / zoo

Reference implementations of popular Binarized Neural Networks

License: Apache License 2.0

Python 100.00%

larq binarized-neural-networks deep-learning machine-learning keras tensorflow pretrained-models neural-networks reproducible-research python

zoo's Introduction

Larq Zoo

For more information, see larq.dev/zoo.

Larq Zoo is part of a family of libraries for BNN development; you can also check out Larq for building and training BNNs and Larq Compute Engine for deployment on mobile and edge devices.

Requirements

Before installing Larq Zoo, please install:

Python version 3.8, 3.9, or 3.10
TensorFlow version 2.4 up to 2.12 (latest at time of writing).

Installation

You can install Larq Zoo with Python's pip package manager:

pip install larq-zoo

About

Larq Zoo is being developed by a team of deep learning researchers and engineers at Plumerai to help accelerate both our own research and the general adoption of Binarized Neural Networks.

zoo's People

Contributors

Stargazers

Watchers

Forkers

adamhillier jwfromm lkindrat-xmos lgeiger michea inventor71 sanyaade-teachings jawaechan happy-ngh ironteen sfalkena huangjunying kerrilu davidse7enlynch h-acker xushangnjlh franciscoandreo

zoo's Issues

Update Larq/Zoo to work with the breaking changes of Larq/Zookeeper

Zookeeper has been updated so that it can support a variety of computer vision tasks. However, the update to Zookeeper contained a breaking change (see larq/zookeeper#40). Therefore, before the version of Zookeeper for Zoo can be updated (see #55), the breaking changes must be incorporated.

Add documentation about how to add new models

Support TensorFlow 2.2

Describe the bug

TensorFlow 2.2 doesn't rely on the keras_applications PyPI package anymore. We should either import it from the TensorFlow version or explicitly include keras_applications as a dependency.
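One way to approach the first option is a guarded import that tries several module paths in order. The sketch below is hypothetical: `import_first` is not part of Larq Zoo, and the TensorFlow-internal path shown in the comment is an assumption that would need to be checked against the installed TensorFlow version. The demo call uses standard-library modules so the snippet runs without TensorFlow.

```python
import importlib


def import_first(candidates):
    """Return the first importable attribute from (module_path, attribute) pairs."""
    errors = []
    for module_path, attribute in candidates:
        try:
            module = importlib.import_module(module_path)
            return getattr(module, attribute)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_path}.{attribute}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))


# In larq_zoo this might look like the following (the second path is an assumption):
# _obtain_input_shape = import_first([
#     ("keras_applications.imagenet_utils", "_obtain_input_shape"),
#     ("tensorflow.python.keras.applications.imagenet_utils", "obtain_input_shape"),
# ])

# Runnable demo with standard-library modules: the first candidate does not exist,
# so the helper falls through to the second one.
sqrt = import_first([("no_such_module_xyz", "sqrt"), ("math", "sqrt")])
print(sqrt(16.0))
```

The second option from the issue (declaring keras_applications as an explicit dependency) needs no code change, only a packaging change.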

To Reproduce

Install larq zoo on a fresh Python env and try to import a model:

  File "/Users/runner/hostedtoolcache/Python/3.7.6/x64/lib/python3.7/site-packages/larq_zoo/core/utils.py", line 8, in <module>
    from keras_applications.imagenet_utils import _obtain_input_shape
ModuleNotFoundError: No module named 'keras_applications'

Environment

TensorFlow version: 2.2.0rc0
Larq version: 0.9.1
Larq-Zoo version: 1.0.0b3

Add XNOR Net

https://arxiv.org/abs/1603.05279

Make ordering of docstrings consistent

Parameters to the models are now keyword-only arguments. While the order doesn't matter for the code, the docstrings should be ordered consistently to show up nicely on the website.
E.g: QuickNet has the following signature:

def QuickNet(
    *,  # Keyword arguments only
    input_shape: Optional[Sequence[Optional[int]]] = None,
    input_tensor: Optional[tf.Tensor] = None,
    weights: Optional[str] = "imagenet",
    include_top: bool = True,
    num_classes: int = 1000,
) -> tf.keras.models.Model:
    """Instantiates the QuickNet architecture.
    Optionally loads weights pre-trained on ImageNet.
    ```netron
    quicknet-v0.1.0/quicknet.json
    ```
    # Arguments
    include_top: whether to include the fully-connected layer at the top of the network.
    weights: one of `None` (random initialization), "imagenet" (pre-training on
        ImageNet), or the path to the weights file to be loaded.
    input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as
        image input for the model.
    input_shape: optional shape tuple, only to be specified if `include_top` is False,
        otherwise the input shape has to be `(224, 224, 3)`.
        It should have exactly 3 inputs channels.
    classes: optional number of classes to classify images into, only to be specified
        if `include_top` is True, and if no `weights` argument is specified.
    # Returns
    A Keras model instance.
    # Raises
    ValueError: in case of invalid argument for `weights`, or invalid input shape.
    """
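A check like the one described could be automated. The following is a minimal sketch, assuming docstrings follow the `# Arguments` heading style shown above with one `name:` entry per argument and indented continuation lines; `documented_args`, `check_order`, and `toy_model` are hypothetical names, not part of Larq Zoo.

```python
import inspect
import re


def documented_args(doc):
    """Return argument names listed under a '# Arguments' heading, in order."""
    names, in_arguments = [], False
    for line in doc.splitlines():
        if line.startswith("# "):
            # A new heading starts; only '# Arguments' is of interest.
            in_arguments = line.strip() == "# Arguments"
        elif in_arguments:
            # Entries start at column 0; continuation lines are indented.
            match = re.match(r"(\w+):", line)
            if match:
                names.append(match.group(1))
    return names


def check_order(fn):
    """Compare signature order with docstring order."""
    signature_names = list(inspect.signature(fn).parameters)
    docstring_names = documented_args(inspect.getdoc(fn) or "")
    return signature_names == docstring_names, signature_names, docstring_names


def toy_model(*, input_shape=None, weights="imagenet", include_top=True):
    """Toy stand-in for a zoo model.

    # Arguments
    include_top: whether to include the top layer.
    weights: one of `None` or "imagenet".
    input_shape: optional shape tuple,
        only used if `include_top` is False.

    # Returns
    None
    """


ok, signature_names, docstring_names = check_order(toy_model)
print(ok, signature_names, docstring_names)
```

Run against the QuickNet signature above, a check of this kind would also flag that the docstring documents `classes` while the signature uses `num_classes`.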

Add contributing guide

This should detail how to add new models and contribute fixes to the Zoo

QuickNet model and flip_ratio metric do not work together

Describe the bug

When using a model that includes QuickNet with flip_ratio metric,
model creation fails because of mismatched dimensions -
Dimensions must be equal, but are 64 and 128 for 'Equal' (op: 'Equal') with input shapes: [3,3,64,64], [3,3,128,128].

My suspicion is that one quantizer is created and reused for the entire model, and flip_ratio looks at the same quantizer with inputs of different shapes and fails because of this.
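The suspected failure mode can be illustrated without TensorFlow. The sketch below is a toy model of the hypothesis only; `ToyFlipRatio` is a hypothetical class and does not reflect how larq implements `flip_ratio`. It shows that a stateful metric shared between two layers breaks once it sees kernels of different sizes, while one metric per layer does not.

```python
class ToyFlipRatio:
    """Toy metric: fraction of entries that changed since the previous call."""

    def __init__(self):
        self.previous = None

    def update(self, values):
        values = list(values)
        if self.previous is None:
            self.previous = values
            return 0.0
        if len(values) != len(self.previous):
            raise ValueError(
                f"Dimensions must be equal, but are {len(self.previous)} and {len(values)}"
            )
        flips = sum(old != new for old, new in zip(self.previous, values))
        self.previous = values
        return flips / len(values)


kernel_a = [1] * 64    # stands in for a [3,3,64,64] kernel
kernel_b = [-1] * 128  # stands in for a [3,3,128,128] kernel

# One metric object shared by two layers: the second call sees a different size.
shared = ToyFlipRatio()
shared.update(kernel_a)
try:
    shared.update(kernel_b)
    shared_failed = False
except ValueError as exc:
    shared_failed = True
    print("shared metric failed:", exc)

# One metric object per layer: each only ever sees its own kernel.
per_layer = {"conv_a": ToyFlipRatio(), "conv_b": ToyFlipRatio()}
per_layer["conv_a"].update(kernel_a)
per_layer["conv_b"].update(kernel_b)
flipped = per_layer["conv_a"].update([-1] + [1] * 63)
print("conv_a flip ratio:", flipped)
```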

To Reproduce

from functools import partial
from typing import Callable, Tuple

import larq
import tensorflow as tf
from tensorflow import keras
from larq_zoo.sota import QuickNet

INPUT_SHAPE = 32
CLASSES_NUM = 10

EPOCHS = 100
BATCH_SIZE = 128
LEARNING_RATE = 5e-3


def quicknet(input_shape: int, num_classes: int) -> keras.Model:
    quicknet_spatial_reduce_factor = 32
    global_pool_shape = input_shape // quicknet_spatial_reduce_factor  # integer division: pool_size must be an int

    quicknet_pretrained_base = QuickNet(input_shape=(input_shape, input_shape, 3), include_top=False)
    quicknet_pretrained_base.trainable = True

    return tf.keras.models.Sequential([
        quicknet_pretrained_base,

        keras.layers.AveragePooling2D(pool_size=(global_pool_shape, global_pool_shape)),
        keras.layers.Flatten(),
        keras.layers.Dense(num_classes, kernel_initializer="glorot_normal"),
        tf.keras.layers.Activation("softmax", dtype="float32")
    ])


def get_dataset(batch_size: int, preprocessing: Callable) -> Tuple[tf.data.Dataset, tf.data.Dataset]:
    train, test = tf.keras.datasets.cifar10.load_data()

    train_dataset = (
        tf.data.Dataset.from_tensor_slices(train).cache()
            .shuffle(10 * batch_size, reshuffle_each_iteration=True)
            .map(partial(preprocessing, training=True))
            .batch(batch_size)
    )

    test_dataset = (
        tf.data.Dataset.from_tensor_slices(test).cache()
            .map(preprocessing)
            .batch(batch_size)
    )

    return train_dataset, test_dataset


def identity_preprocess(x, y, training=False):
    return x, y


if __name__ == "__main__":
    with larq.context.metrics_scope(['flip_ratio']):
        model = quicknet(INPUT_SHAPE, CLASSES_NUM)

    larq.models.summary(model)

    train_dataset, test_dataset = get_dataset(BATCH_SIZE, identity_preprocess)

    optimizer = keras.optimizers.Adam(LEARNING_RATE)
    loss = keras.losses.SparseCategoricalCrossentropy()

    model.compile(
        optimizer=optimizer, loss=loss, metrics=[keras.metrics.SparseCategoricalAccuracy()]
    )

    model.fit(
        train_dataset, epochs=EPOCHS, validation_data=test_dataset
    )

Expected behavior

I would have expected the example to run without problems, as happens when quicknet is replaced with a series of binary operations, but instead I get the following error:

Traceback (most recent call last):
  File "C:/Users/User/PycharmProjects/BNN-Playground/bug_replicate.py", line 59, in <module>
    model = quicknet(INPUT_SHAPE, CLASSES_NUM)
  File "C:/Users/User/PycharmProjects/BNN-Playground/bug_replicate.py", line 21, in quicknet
    quicknet_pretrained_base = QuickNet(input_shape=(input_shape, input_shape, 3), include_top=False)
  File "C:\Users\User\Anaconda3\lib\site-packages\larq_zoo\sota\quicknet.py", line 327, in QuickNet
    num_classes=num_classes,
  File "C:\Users\User\Anaconda3\lib\site-packages\zookeeper\core\factory.py", line 20, in wrapped_fn
    result = fn(factory_instance)
  File "C:\Users\User\Anaconda3\lib\site-packages\larq_zoo\sota\quicknet.py", line 189, in build
    model = super().build()
  File "C:\Users\User\Anaconda3\lib\site-packages\zookeeper\core\factory.py", line 20, in wrapped_fn
    result = fn(factory_instance)
  File "C:\Users\User\Anaconda3\lib\site-packages\larq_zoo\sota\quicknet.py", line 156, in build
    x = self.residual_block(x, use_squeeze_and_excite)
  File "C:\Users\User\Anaconda3\lib\site-packages\larq_zoo\sota\quicknet.py", line 104, in residual_block
    x = self.conv_block(x, infilters, use_squeeze_and_excite)
  File "C:\Users\User\Anaconda3\lib\site-packages\larq_zoo\sota\quicknet.py", line 92, in conv_block
    )(x)
  File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 773, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in converted code:

    C:\Users\User\Anaconda3\lib\site-packages\larq\layers_base.py:37 call  *
        return super().call(inputs)
    C:\Users\User\Anaconda3\lib\site-packages\larq\layers_base.py:153 call  *
        return super().call(inputs)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\keras\layers\convolutional.py:209 call
        outputs = self._convolution_op(inputs, self.kernel)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py:1135 __call__
        return self.conv_op(inp, filter)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py:640 __call__
        return self.call(inp, filter)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py:239 __call__
        name=self.name)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\nn_ops.py:2011 conv2d
        name=name)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_nn_ops.py:969 conv2d
        data_format=data_format, dilations=dilations, name=name)
    C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\op_def_library.py:486 _apply_op_helper
        (input_name, err))

    ValueError: Tried to convert 'filter' to a tensor and failed. Error: in converted code:
    
        C:\Users\User\Anaconda3\lib\site-packages\larq\quantizers.py:249 call  *
            return super().call(outputs)
        C:\Users\User\Anaconda3\lib\site-packages\larq\quantizers.py:160 call  *
            self.add_metric(self.flip_ratio(inputs))
        C:\Users\User\Anaconda3\lib\site-packages\larq\metrics.py:43 __call__  *
            return super().__call__(inputs, **kwargs)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\keras\metrics.py:196 __call__
            replica_local_fn, *args, **kwargs)
        C:\Users\User\Anaconda3\lib\site-packages\larq\metrics.py:71 update_state  *
            unchanged_values = tf.math.count_nonzero(
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\util\dispatch.py:180 wrapper
            return target(*args, **kwargs)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\math_ops.py:1305 equal
            return gen_math_ops.equal(x, y, name=name)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py:3240 equal
            name=name)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\op_def_library.py:742 _apply_op_helper
            attrs=attr_protos, op_def=op_def)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\func_graph.py:595 _create_op_internal
            compute_device)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py:3322 _create_op_internal
            op_def=op_def)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py:1786 __init__
            control_input_ops)
        C:\Users\User\Anaconda3\lib\site-packages\tensorflow_core\python\framework\ops.py:1622 _create_c_op
            raise ValueError(str(e))
    
        ValueError: Dimensions must be equal, but are 64 and 128 for 'Equal' (op: 'Equal') with input shapes: [3,3,64,64], [3,3,128,128].

Environment

TensorFlow version: 2.1.0
Larq version: 0.9.3
Larq-Zoo version: 1.0b4

Add Ternary Net

https://arxiv.org/abs/1605.04711

Help, no logs are printed!

Describe the bug

When I run lqz TrainR2BStrongBaseline, after the training process has started, no logs are printed.
To ensure that the training process starts correctly, I have written a Callback:

class PrintCheckCallback(tf.keras.callbacks.Callback): 
    def on_train_batch_begin(self, batch, logs=None): 
        print("training on batch: {:4d}".format(batch))

This callback prints the current batch index correctly, so I can be sure that the training process has started.
Is this a bug, or are no logs printed by design?

To Reproduce

I cloned the repository and ran pip install -e . to install the package locally.

Expected behavior

I wish to monitor the training process, so it's necessary to have training logs printed.

Environment

TensorFlow version: 2.3.1
Larq version: 0.10.2
Larq-Zoo version: 2.0.1 (build locally)

Add BinaryNet

https://arxiv.org/abs/1602.02830

PR: #2

Unexpected behavior of the "include_top" argument

Describe the bug

For backbone models defined in "densenet.py" and "resnet_e.py", the current implementation adds the relu activation layer only if the "include_top" argument is True.

https://github.com/larq/zoo/blob/v0.5.0/larq_zoo/densenet.py#L70-L74
https://github.com/larq/zoo/blob/v0.5.0/larq_zoo/resnet_e.py#L71-L75

This is different from the behavior of keras-applications.

https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/densenet.py#L228-L229
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/inception_resnet_v2.py#L329-L330
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/inception_v3.py#L360-L361
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/mobilenet_v2.py#L384-L386
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/mobilenet.py#L244-L256
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/nasnet.py#L242-L243
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/resnet_common.py#L381-L382
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/resnet50.py#L257-L258
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/vgg16.py#L177-L180
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/vgg19.py#L189-L192
https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/xception.py#L270-L271

To Reproduce

Just check the source code :-)

Expected behavior

The relu activation layer should be added outside the if/else statement.
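The difference between the two behaviours can be sketched with plain lists of layer names. This is a toy illustration only: `head_layers` and the layer names are hypothetical placeholders, not the actual Larq Zoo or keras-applications code.

```python
def head_layers(include_top, relu_outside_branch):
    """Return the toy layer names that follow the last backbone layer."""
    layers = ["last_backbone_layer"]
    if relu_outside_branch:
        # Expected behaviour: the activation is always applied.
        layers.append("relu")
        if include_top:
            layers += ["global_avg_pool", "dense_softmax"]
    else:
        # Reported behaviour: the activation only exists together with the top.
        if include_top:
            layers += ["relu", "global_avg_pool", "dense_softmax"]
    return layers


print(head_layers(include_top=False, relu_outside_branch=False))
print(head_layers(include_top=False, relu_outside_branch=True))
```

With the top included, both variants produce the same layers; they only differ when `include_top` is False, which is exactly the case where the model is used as a backbone.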

Environment

TensorFlow version: 1.15.0
Larq version: 0.8.4
Larq-Zoo version: 0.5.0

Speech Models

Binarized speech models would be an awesome addition to the zoo if you support them. Please add them if you have any canonical models.

Reproducing R2B model

Hi,
I tried to reproduce the r2b model results, but I couldn't reach 65.04% top-1 accuracy.
In fact, I couldn't even reproduce the first stage, which is training the original ResNet18 model (expected to achieve 70.32% validation accuracy (#196 (comment))).
It seems that the default hyper-parameter settings provided in the model zoo are not optimal.
Can you provide the hyper-parameter settings for each stage of training the r2b model?

P.S. The training curve below is what we obtained by running the default script provided in the larq model zoo. We could only achieve 66.03% top-1 validation accuracy with the default settings.

XNOR Net uses input quantization of ste_sign

Hi,

Thanks a lot for the larq framework and pretrained models. It is a huge help to the deep learning community.

I was looking through the code for the XNOR Net implementation and was wondering how the input scaling factor 'K' (average L1 norm of inputs) is implemented. In the default hyperparameter list, the input quantizer is given as ste_sign and only the kernels are quantized using XNOR scaling. I also did the calculations myself, and they confirmed that the inputs are not being scaled in the Larq implementation.

Is there a reason why the input quantizer is set to ste_sign, or am I missing some detail?

Thanks again.

Update weights and parameters in docstrings

See larq/docs#71; this was updated for the docs in larq/docs#85 and needs to be updated in the docstrings here too.

Add Bi-Real Net

https://arxiv.org/abs/1808.00278

Add pre-made/canned models

This is more an idea for the future. See tensorflow/community#95

About RealToBinaryNet model

I read on https://docs.larq.dev/zoo/ that RealToBinaryNet reaches 65% accuracy, which is state of the art.
I really appreciate this and want to train the model to learn more about it.
I also read the code and saw that the method is included, but errors appeared when I tried to train the model, and the code seems to be incomplete.
I would like to fix it, but it seems to take a lot of effort.
If it is available, would you please share the training code?

Unexpected behavior of the "preprocess_input" function

Describe the bug

The "larq_zoo.data.preprocess_input" function is defined in the following sections:

https://github.com/larq/zoo/blob/v0.5.0/larq_zoo/data.py#L18-L32
https://github.com/larq/zoo/blob/v0.5.0/larq_zoo/data.py#L238-L253

If the argument "image" is a numpy array, the data flow is CPU -> GPU -> CPU, and the data is then fed to the GPU again for training.
This slows down training significantly if one uses a data generator of numpy arrays.

To Reproduce

Just check the source code :-)

Expected behavior

Actually, you don't need to implement this function from scratch.
The "keras_applications.imagenet_utils.preprocess_input" would work just fine with the "mode" argument set to "torch".

https://github.com/keras-team/keras-applications/blob/1.0.8/keras_applications/imagenet_utils.py#L157
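For reference, the arithmetic performed by the "torch" mode is small enough to write out. The sketch below applies it to a single RGB pixel in plain Python so that it runs without TensorFlow or NumPy; `preprocess_pixel_torch` is a hypothetical helper name, and the constants are the ImageNet channel means and standard deviations used by that mode.

```python
# ImageNet channel statistics used by mode="torch" in keras_applications.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def preprocess_pixel_torch(rgb):
    """Scale an RGB pixel from [0, 255] to [0, 1], then normalise each channel."""
    return tuple((value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD))


white = preprocess_pixel_torch((255, 255, 255))
black = preprocess_pixel_torch((0, 0, 0))
print(white)
print(black)
```

Because this is pure element-wise arithmetic, applying it to a numpy array keeps the data on the CPU, which is the point made in the issue.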

Environment

TensorFlow version: 1.15.0
Larq version: 0.8.4
Larq-Zoo version: 0.5.0

Add Readme and Docs

Link to paper
Link to implementation
Table to compare models
- Top-1 accuracy
- Top-5 accuracy
- Number of parameters
- Number of MAC operations
Publish on github pages: https://plumerai.github.io/larq/

Support 4-D input in preprocess function

This should bring us in line with what Keras supports for the preprocess_input function.

BiRealnet has 19 instead of 18 layers

BiReal-Net has 19 instead of 18 layers.

The error in the code is here: https://github.com/larq/zoo/blob/master/larq_zoo/birealnet.py#L57

Model needs to be retrained before modifying this.

Drop Python 3.6 support

Pytest 7.1.0 (March 2022) has dropped support for Python 3.6, causing our CI to fail. As a result, we haven't updated Pytest since. Perhaps we should also drop Python 3.6 support in general or in CI only?

Any thoughts, @lgeiger?

Explicit layer naming

Feature motivation

When using the ready-made models as parts of bigger networks, it can be necessary to get the outputs of specific layers. One example is encoder-decoder style networks with skip connections. The Keras model.get_layer API allows getting layers by name or by index. While getting layers by index works, often specific layers are needed regardless of the exact network architecture; the last layer of a block before a resolution change, for instance. Currently, switching architectures (e.g. Densenet28 -> Densenet45) would change both the layer indices and the names of the layers.

Feature description

By passing explicit names when constructing the keras layers we could make sure that these specific layers always have the same predictable names, and can therefore easily be retrieved by name even when changing architectures. Retrieving them by name would be less error prone than by index. Currently, since no explicit names are passed, the layer names are very general (e.g. add_idx where idx depends on the total number of keras layers of that type that have been constructed so far).
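The naming problem can be shown with a toy network builder. This is a sketch under stated assumptions: `layer_names` and the `stageN_out` naming scheme are hypothetical and only mimic how automatically numbered names shift when the number of blocks changes.

```python
def layer_names(blocks_per_stage, explicit):
    """Return {stage: name of its last add layer} for a toy network."""
    last_add = {}
    auto_index = 0
    for stage, n_blocks in enumerate(blocks_per_stage, start=1):
        for block in range(n_blocks):
            if explicit:
                # Explicit scheme: the last add of a stage gets a fixed name.
                is_last = block == n_blocks - 1
                name = f"stage{stage}_out" if is_last else f"stage{stage}_block{block + 1}_add"
            else:
                # Automatic scheme: add, add_1, add_2, ... in creation order.
                name = "add" if auto_index == 0 else f"add_{auto_index}"
                auto_index += 1
            last_add[stage] = name
    return last_add


small_auto = layer_names([2, 2], explicit=False)
large_auto = layer_names([3, 3], explicit=False)
small_named = layer_names([2, 2], explicit=True)
large_named = layer_names([3, 3], explicit=True)
print(small_auto[1], large_auto[1])    # differ between architectures
print(small_named[1], large_named[1])  # identical between architectures
```

With explicit names, code that builds skip connections can ask for the same layer name regardless of which architecture variant is in use.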

Feature implementation

An example implementation can be seen in the keras applications code.

Considerations

Depending on what the zoo models will be used for, this might add more bloat than it prevents. However, the code that it adds also helps in documenting the models to some extent, for instance by explicitly passing block names that are also in corresponding papers.

Snapshot tests of model summaries

We should do the above.

QuickNet(Large) models don't match released h5 files

Describe the bug

The QuickNet and QuickNetLarge models that are released as h5 files on this repo and used for the LCE benchmarks do not match the architectures described in their code. The h5 files use average pool and reshape rather than GlobalAveragePool, see this comment.

The model architectures should be changed to correspond to the uploaded files, which will significantly speed up their inference time (currently, QuickNetLarge constructed from larq-zoo is roughly twice as slow as its h5 counterpart).

To Reproduce

Simply construct a QuickNet model from larq-zoo and compare it to the released h5 models.

RFC: structure change

In adding the R2B KD code to zoo I ran into the issue that the needed code does not really fit the current directory structure.

The training procedure requires:

the models (R2BNet(), StrongBaseLineNet()), which do fit into the current structure
a teacher-student model class which contains the logic for tying together two models, adds teacher-student losses and metrics, and does some model loading
the loss implementations (which I think would be neater to have in a separate file)
some (simplified) multi-stage experiment code and the experiment definitions (which might be a bit too bulky to put in experiments.py)

In a chat with @koenhelwegen the following structure was proposed. Please comment if you disagree, as it would be a breaking change (without some __init__ magic).

larq_zoo/
  training/
    basic_experiments.py
    kd_experiments.py
    teacher_student.py
  core/
    datasets.py
    losses.py
    layers.py
  literature/
    r2bnet.py
      class R2BNet()
      class StrongBaseLineNet()
  sota/

Using data.cache() causes the process to run out of memory

Hi, when I run the command lqz TrainR2BStrongBaseline, the process runs out of memory. My server has 64GB of memory.
I checked the code and found that it caches both train_data and validation_data, which I think is why my memory is exhausted.
I didn't see any issue related to this problem, so I am curious: do you have enough memory to cache the whole ImageNet dataset (up to 150GB)?
For now, I have deleted the code that caches the dataset, and training runs smoothly. Is there any way to cache only part of the training data?

Data directory

Hi, if I want to train the model from scratch with my own dataset, how/where do I modify the code to set the directory?

Reference implementations for initial release

Models for ImageNet:

Binary Net (@lgeiger, #2)
- https://arxiv.org/abs/1602.02830
Bi-Real Net (@koenhelwegen)
- https://arxiv.org/abs/1808.00278
XNOR Net (@jamescook106)
- https://arxiv.org/abs/1603.05279

Add training code and guides for the pretrained models

To fully leverage the existing methods, we need the training procedure for the pre-trained models.

No 'sota' module

Describe the bug

ModuleNotFoundError: No module named 'larq_zoo.sota'

To Reproduce

!pip install larq-zoo
from larq_zoo.sota.quicknet import QuickNet

Expected behavior

Should have downloaded the QuickNet architecture.

Environment

TensorFlow version: 1.15.2
Larq version: 0.9.4
Larq-Zoo version: 0.5.0

import larq_zoo as lq
dir(lq)
results in:
['BiRealNet', 'BinaryAlexNet', 'BinaryDenseNet28', 'BinaryDenseNet37', 'BinaryDenseNet37Dilated', 'BinaryDenseNet45', 'BinaryResNetE18', 'DoReFaNet', 'XNORNet', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'binarynet', 'birealnet', 'data', 'decode_predictions', 'densenet', 'dorefanet', 'preprocess_input', 'resnet_e', 'utils', 'xnornet']

Add model accuracies to docstrings

Feature motivation

Since the docs are maintained separately from the repo, it would be good to state the ImageNet accuracy of each model (and potentially other stats such as number of params and model size) in the model docstring. This mitigates the risk of confusion and/or bugs should the docs ever become out of sync.

Feature description

Assuming the stats on https://docs.larq.dev/zoo/ are currently correct, we can just copy those into the relevant docstrings.

SOTA pretrained models in larq_zoo

Hi,
I really appreciate your framework larq and the pretrained weights in larq_zoo.
Currently, there are 9 pretrained weights for the selected models including AlexNet and DenseNet.
Is there any plan to expand the number of uploaded pretrained weights for other BNN models?

I recently found a SOTA BNN reference called MeliusNet:
https://arxiv.org/abs/2001.05936

Thank you.

Add model summaries to documentation

Feature motivation

Oftentimes when I want to find something like a standard model's layer name or its shape, I've had to build the model, wait for it to show up in my terminal, and then ^C out to stop it from training. It'd be much more convenient if these were saved somewhere for our standard models.

Feature description

A folder of .txt files with the output of that model's larq model summary, with links to those files on relevant docs pages.

Feature implementation

This could be implemented similar to the activation plotting we currently already have in the docs.

Intermediate results of training R2B model

Hi,

It is so nice that you almost reproduced the r2b model (65.04% top-1 accuracy).
Can you also provide the intermediate results of training process of r2b model?
For instance, the top-1 accuracy of each stage might be very helpful for me.

Thanks!

Not a bug: reference to old repo location needs to be updated

zoo/larq_zoo/utils.py

Line 16 in 89c8fa3

root_url = "https://github.com/plumerai/larq-zoo/releases/download/"

The utility function to download weights has a link to the old repo location. It still works because GitHub forwards the request but we should probably update this.
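Since the repository now lives at github.com/larq/zoo, the updated constant would presumably point there. The sketch below is illustrative only: `weights_url`, the release tag, and the file name are hypothetical placeholders, not the actual helper or artifacts in larq_zoo/utils.py.

```python
# Assumed new location, based on the current repository path larq/zoo.
ROOT_URL = "https://github.com/larq/zoo/releases/download/"


def weights_url(release_tag, file_name):
    """Join the release root with a release tag and a file name."""
    return f"{ROOT_URL}{release_tag}/{file_name}"


# Placeholder tag and file name, for illustration only.
print(weights_url("example-v0.1.0", "example_weights.h5"))
```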

Add option to download pretrained weights

This can be done analogous to https://github.com/keras-team/keras-applications/blob/master/keras_applications/resnet50.py#L278-L295

QuickNet no-top models pretrained weights are not working as expected

Describe the bug

Tried freezing the no-top quicknet models, and training a linear classifier on top of them, in order to classify images from the Imagenette dataset (10 easy classes from ImageNet).

Because the pretrained zoo models are trained on the superset of this dataset, I expected the pretrained embedders to perform very well, but they did not succeed in reaching above 50% accuracy.

However, when I manually cut the full models, the embedders work as expected and reach 95% easily, hinting the problem is with the no-top pretrained weights.

To Reproduce

Run the code below with the following configurations:
QuickNetBugTest cut_full_model=True
QuickNetBugTest cut_full_model=False
QuickNetLargeBugTest cut_full_model=True
QuickNetLargeBugTest cut_full_model=False
QuickNetXLBugTest cut_full_model=True
QuickNetXLBugTest cut_full_model=False

(cut_full_model param determines if the pretrained no-top model is used, or the pretrained full model is taken and cut before the global pooling. The models are trained for 3 epochs in the example but even if trained more the no-top model does not improve much.)

import tensorflow as tf
from tensorflow import keras
import tensorflow_datasets as tfds
from larq_zoo.training.data import preprocess_image_bytes
from larq_zoo.sota import QuickNet, QuickNetLarge, QuickNetXL
from zookeeper import cli, task, Field
from typing import Callable, Tuple, Optional


class EmbedderWrapperModel(keras.Model):
    def __init__(self, zoo_class: Callable[..., keras.Model],
                 input_shape: int, num_classes: int, dynamic=False,
                 finetune_basenet=True, pretrained_basenet=True, cut_layer_name: Optional[str] = None):
        super(EmbedderWrapperModel, self).__init__(dynamic=dynamic)

        self.basenet = self._get_basenet(zoo_class, input_shape, finetune_basenet, pretrained_basenet, cut_layer_name)
        global_pool_shape = self.basenet.output_shape[1], self.basenet.output_shape[2]

        self.batch_norm = keras.layers.BatchNormalization(momentum=0.9, epsilon=1e-5)
        self.global_pool = keras.layers.AveragePooling2D(pool_size=global_pool_shape)
        self.dense_softmax = keras.layers.Dense(num_classes, activation=tf.nn.softmax)

    def _get_basenet(self, zoo_class: Callable[..., keras.Model], input_shape: int,
                     finetune_basenet: bool, pretrained_basenet: bool, cut_layer_name: Optional[str]) -> keras.Model:
        weights = "imagenet" if pretrained_basenet else None

        if not cut_layer_name:
            basenet = zoo_class(input_shape=(input_shape, input_shape, 3), include_top=False, weights=weights)
        else:
            full_zoo_model = zoo_class(input_shape=(input_shape, input_shape, 3), include_top=True, weights=weights)
            inputs, outputs = full_zoo_model.inputs, full_zoo_model.get_layer(cut_layer_name).output
            basenet = keras.Model(inputs=inputs, outputs=outputs)

        basenet.trainable = finetune_basenet

        return basenet

    def call(self, inputs, training=False, mask=None):
        x = self.basenet(inputs, training=training)
        x = self.batch_norm(x, training=training)
        x = self.global_pool(x)
        x = keras.layers.Flatten()(x)
        x = self.dense_softmax(x)

        return x


def wrap_preprocessing(preprocessing: Callable, training=False) -> Callable:
    return lambda x, y: (preprocessing(x, training), y)


def get_imagenette_dataset(batch_size: int, preprocessing: Callable,
                           parallel=True) -> Tuple[tf.data.Dataset, tf.data.Dataset]:
    decoders = {"image": tfds.decode.SkipDecoding()}
    total_dataset = tfds.load('imagenette', split=None, shuffle_files=True, as_supervised=True, decoders=decoders)
    train, test = total_dataset['train'], total_dataset['validation']

    parallelism = tf.data.experimental.AUTOTUNE if parallel else None

    train_dataset = (
        train.cache()
            .shuffle(10 * batch_size, reshuffle_each_iteration=True)
            .map(wrap_preprocessing(preprocessing, training=True), num_parallel_calls=parallelism)
            .batch(batch_size)
    )

    test_dataset = (
        test.cache()
            .map(wrap_preprocessing(preprocessing), num_parallel_calls=parallelism)
            .batch(batch_size)
    )

    return train_dataset, test_dataset


class BugTest:
    INPUT_SHAPE = 224
    CLASSES_NUM = 10

    EPOCHS = 3
    BATCH_SIZE = 256
    LEARNING_RATE = 1e-2

    LR_DECAY = 0.1
    DECAY_EVERY = 30

    FINETUNE_BASENET = False
    PRETRAINED_BASENET = True

    def test_bug(self, cut_full_model: bool, model_class: Callable):
        # `activation` is last relu layer before global pooling
        cut_layer_name = "activation" if cut_full_model else None

        model = EmbedderWrapperModel(model_class, self.INPUT_SHAPE, self.CLASSES_NUM,
                                     finetune_basenet=self.FINETUNE_BASENET, pretrained_basenet=self.PRETRAINED_BASENET,
                                     cut_layer_name=cut_layer_name)

        train_dataset, test_dataset = get_imagenette_dataset(self.BATCH_SIZE, preprocess_image_bytes)

        optimizer = keras.optimizers.Adam(self.LEARNING_RATE)
        loss = keras.losses.SparseCategoricalCrossentropy()
        metrics = [keras.metrics.SparseCategoricalAccuracy()]

        model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

        model.fit(train_dataset, epochs=self.EPOCHS, validation_data=test_dataset)


@task
class QuickNetBugTest(BugTest):
    cut_full_model: bool = Field(False)

    def run(self):
        self.test_bug(self.cut_full_model, QuickNet)


@task
class QuickNetLargeBugTest(BugTest):
    cut_full_model: bool = Field(False)

    def run(self):
        self.test_bug(self.cut_full_model, QuickNetLarge)


@task
class QuickNetXLBugTest(BugTest):
    cut_full_model: bool = Field(False)

    def run(self):
        self.test_bug(self.cut_full_model, QuickNetXL)


if __name__ == "__main__":
    cli()

Expected behavior

Expected the pretrained no-top models and the cut pretrained full models to perform the same, instead got the following discrepancy:

Model          no_top accuracy   full_model_cut accuracy
QuickNet       47.5%             95.4%
QuickNetLarge  30.2%             96%
QuickNetXL     27.7%             97.7%

Environment

TensorFlow version: 2.2.0rc3
tensorflow-datasets version: 3.0.0
Larq version: 0.9.4
Larq-Zoo version: 1.0.b4
