Example Doubt about keras [CLOSED]

emi-dm commented on June 7, 2024
Example Doubt

from keras.

Comments (3)

emi-dm commented on June 7, 2024

Thank you so much @sineeli!!! I hadn't dug deeply enough into the original paper, and that is what caused my doubt! Really appreciated :)

sineeli commented on June 7, 2024

Hi @emi-dm,

This design is inherited from the Transformer model for text, and we use it throughout the main
paper. An initial attempt at using only image-patch embeddings, globally average-pooling (GAP)
them, followed by a linear classifier—just like ResNet’s final feature map—performed very poorly.
However, we found that this is neither due to the extra token, nor to the GAP operation. Instead,
the difference in performance is fully explained by the requirement for a different learning-rate

Taken from the ViT paper.
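The two readout options compared in the quote can be sketched with plain array operations. This is only an illustration of the shapes involved, not code from the paper: the array sizes are made up, and the random `tokens_with_cls` array stands in for the output of a real encoder.

```python
import numpy as np

# Hypothetical encoder output: 2 images, 1 CLS token + 196 patch tokens, d_model = 8.
rng = np.random.default_rng(0)
tokens_with_cls = rng.normal(size=(2, 197, 8))

# Option A: read the classification feature from the CLS position (index 0).
cls_feature = tokens_with_cls[:, 0]

# Option B: no CLS token; globally average-pool (GAP) the patch tokens.
patch_tokens = tokens_with_cls[:, 1:]  # stand-in for the output of a CLS-free encoder
gap_feature = patch_tokens.mean(axis=1)

# Both options yield one d_model-sized vector per image for the linear classifier.
print(cls_feature.shape, gap_feature.shape)
```

Either feature can then be fed to the same linear classification head; per the quote, the GAP variant needs its own learning rate to perform comparably.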

As per the paper, a ViT can be built either with or without a CLS token. If you want to use a CLS token, create an extra token embedding of the ViT hidden dimension (d_model) and prepend it to the projected patches.

The prepended embedding can be implemented as a separate Keras layer that holds a single weight vector, and this works with all backends.

Example

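import keras


class TokenLayer(keras.layers.Layer):
    """Prepends a learnable CLS token to a sequence of patch embeddings."""

    def build(self, input_shape):
        # One trainable vector of size d_model, shared by every sample in the batch.
        self.cls_token = self.add_weight(
            name='cls',
            shape=(1, 1, input_shape[-1]),
            initializer='zeros',
        )

    def call(self, inputs):
        # Adding zeros of shape (batch, 1, d_model) broadcasts the token over the batch.
        cls_token = self.cls_token + keras.ops.zeros_like(inputs[:, 0:1])
        # Concatenate with the op instead of creating a new Concatenate layer on every call.
        return keras.ops.concatenate([cls_token, inputs], axis=1)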

Thanks and hope this helps.

google-ml-butler commented on June 7, 2024

Are you satisfied with the resolution of your issue?
