Simple Model Training Issue (deq, CLOSED)

locuslab commented on August 22, 2024
Simple Model Training Issue

from deq.

Comments (4)

jerrybai1995 commented on August 22, 2024

There seem to be multiple problems with your implementation. For instance,

  1. L179-194: Why did you put the forward function call (DEQFunc.apply), as well as z, x and z0, in the __init__ function? __init__ is only called once, when the model is initialized. You will want to put them in forward(), like what you did in L166.

  2. L185-187: z, x and z0 are NOT parameters. They are actual inputs. Therefore, your DEQ.parameters() (in line 218) shouldn't include these.

  3. You didn't define and call the DEQ module with well-defined forward and backward passes. If you look at your deq.py (not mine, but your refactored one), the _solve_equi in DEQForward and _solve_back in DEQBackward are both undefined, as they are abstract methods that you need to implement in a subclass. One example is my DEQ-Transformer: https://github.com/locuslab/deq/blob/master/DEQModel/models/transformers/deq_transformer_forward_backward.py#L13
    where I subclassed the DEQForward object. Then, I called these two modules in training (self.deq for the forward pass, self.deqback for the backward pass): https://github.com/locuslab/deq/blob/master/DEQModel/models/transformers/deq_transformer.py#L372
    You need to do the same for your DEQAdder module. In your current code, you don't have a well-defined backward pass (DEQFunc.apply does not have a backward pass; that is in DummyDEQFunc).

Since your code only applies the forward root solving once and has no backward pass, I'm not surprised that the model doesn't learn anything at all.
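
To make the three points above concrete, here is a minimal sketch in plain Python rather than PyTorch. It is a toy scalar layer, not the code from the deq repository: the names `ScalarDEQ`, `f`, `forward` and `backward` are hypothetical, and the plain fixed-point iteration stands in for whatever root solver the real module uses. It only illustrates the structure: the weight is the sole parameter, the solve runs on every forward call, and the backward pass differentiates through the fixed point implicitly.

```python
import math


def f(w, z, x):
    """One layer application: z_next = tanh(w * z + x)."""
    return math.tanh(w * z + x)


class ScalarDEQ:
    """Toy scalar DEQ layer (hypothetical; not the API of the deq repository)."""

    def __init__(self, w):
        # Only the trainable weight lives here. The input x and the initial
        # guess z0 are NOT parameters, so they are not created in __init__.
        self.w = w

    def forward(self, x, z0=0.0, tol=1e-12, max_iter=500):
        # Solve z = f(w, z, x) by plain fixed-point iteration on every call.
        z = z0
        for _ in range(max_iter):
            z_next = f(self.w, z, x)
            converged = abs(z_next - z) < tol
            z = z_next
            if converged:
                break
        self.z_star = z  # cache the fixed point for backward()
        return z

    def backward(self, grad_z):
        # Implicit differentiation at the fixed point z* = f(w, z*, x):
        #   dz*/dw = (df/dw) / (1 - df/dz),  dz*/dx = (df/dx) / (1 - df/dz)
        s = 1.0 - self.z_star ** 2  # tanh'(w z* + x), because z* = tanh(w z* + x)
        df_dz, df_dw, df_dx = s * self.w, s * self.z_star, s
        scale = grad_z / (1.0 - df_dz)
        return scale * df_dw, scale * df_dx  # gradients w.r.t. w and x


layer = ScalarDEQ(w=0.5)           # |w| < 1 keeps the iteration contractive
z_star = layer.forward(x=0.3)      # forward root solve, run per input
grad_w, grad_x = layer.backward(grad_z=1.0)
```

The gradients can be checked against finite differences of `forward`, which is a cheap way to confirm that a hand-written backward pass matches the forward solve.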

Let me know if this helps, or if you need more concrete help on making it work (I may only be able to help you with actual coding a bit later this week, though).

Diego-Bit-0 commented on August 22, 2024

Thank you for the suggestions! They have been really helpful and have even answered some prior questions I had concerning some of the code. My team is interested in applying your DEQ method to our deep feedforward network. Do you happen to have any code for a DEQ model in a simple feedforward case?

jerrybai1995 commented on August 22, 2024

No, I don't have it for the simplest feedforward setting, but it shouldn't be hard to write. The only thing you may need to slightly adjust is the deq.py module, which currently assumes you are working on sequences (so 3-D). What kind of task are you trying to solve (e.g., image, etc.)?
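
As a rough illustration of the non-sequence case, the sketch below solves the fixed point for a 2-D input of shape (batch, features), with no sequence axis. It is an assumption-laden toy in plain Python, not an adaptation of deq.py: `layer`, `solve_batch`, the tanh update and the small weight matrix are all hypothetical choices made so the simple iteration converges.

```python
import math


def layer(W, z, x):
    """Apply z_next = tanh(W z + x) to one feature vector (no sequence axis)."""
    return [
        math.tanh(sum(W[i][j] * z[j] for j in range(len(z))) + x[i])
        for i in range(len(x))
    ]


def solve_batch(W, X, tol=1e-10, max_iter=500):
    """Solve the fixed point independently for each row of a (batch, features) input."""
    out = []
    for x in X:
        z = [0.0] * len(x)  # initial guess z0 = 0 for every sample
        for _ in range(max_iter):
            z_next = layer(W, z, x)
            converged = max(abs(a - b) for a, b in zip(z_next, z)) < tol
            z = z_next
            if converged:
                break
        out.append(z)
    return out


W = [[0.2, -0.1], [0.1, 0.3]]               # small weights so the iteration contracts
X = [[0.5, -0.2], [1.0, 0.4], [0.0, 0.0]]   # shape (batch=3, features=2)
Z = solve_batch(W, X)                        # same shape as X
```

Each row of `Z` satisfies `z = layer(W, z, x)` up to the tolerance, so the output keeps the (batch, features) shape of the input.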

For the rest, do the following:

I should be able to create a cleaner tutorial version for you some time later (not sure when). But feel free to communicate with me further on this by emailing me at [email protected].

Diego-Bit-0 commented on August 22, 2024

No worries. The task we're working on is a simultaneous binary classification task mapping a matrix of features as soft values to a matrix of binary values of the same size. Each classification depends on the classifications of the other matrix entries. The data is unordered, so sequence models and graphical models do not apply, but luckily a transformer can. We would like to benchmark such results against a deep feed-forward network. The DEQ model captures both quite well.
