Comments (5)

jrzaurin avatar jrzaurin commented on May 28, 2024

Hey @rruizdeaustri

I will have a more detailed look, but in general, here are some comments:

  1. Use a simpler model: forget about the wide component and simply use a deeptabular component with defaults. (Review the code in your example, since the optimizers and schedulers are not correctly defined; the Trainer not throwing an error is intentional, though I might change that.) Just define your Trainer as
    trainer = Trainer(
        model,
        objective="binary",
        callbacks=[ModelCheckpoint(filepath="model_weights/wd_out")],
        metrics=[Accuracy],
    )
  2. The results with Transformer-based models depend A LOT on the hyperparameters, far more than with GBMs, where XGBoost, LightGBM and CatBoost all perform close to their best out of the box. You could have a look at this relatively old post to see if it helps.

I hope this helps. Let me know how you get on with this, and I'll see if I can help more.

from pytorch-widedeep.

rruizdeaustri avatar rruizdeaustri commented on May 28, 2024

Hi @jrzaurin,

I have made the modifications you suggested and the results
make more sense now. I'm optimising the hyper-parameters of the
ResNet and Transformer models with Optuna, but the results are still
far from those obtained with LightGBM: AUC ~0.93 versus ~0.98 for LightGBM.
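(As a side note, an AUC comparison like this is only meaningful when both models are scored on exactly the same held-out labels. A minimal sketch of that check with scikit-learn, using synthetic score vectors rather than the actual model outputs from this thread:)

```python
# Sketch: comparing the AUC of two models on the SAME held-out labels.
# The score vectors are synthetic stand-ins, not real model predictions.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, 1000)

# Hypothetical predicted probabilities: the "GBM" scores separate the
# classes more cleanly than the "DL" scores, mimicking the gap above.
scores_dl = np.clip(0.5 * y_true + rng.normal(0.25, 0.25, 1000), 0.0, 1.0)
scores_gbm = np.clip(0.8 * y_true + rng.normal(0.10, 0.15, 1000), 0.0, 1.0)

print(f"DL  AUC: {roc_auc_score(y_true, scores_dl):.3f}")
print(f"GBM AUC: {roc_auc_score(y_true, scores_gbm):.3f}")
```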

Thanks !

Rbt

jrzaurin avatar jrzaurin commented on May 28, 2024

Hey @rruizdeaustri

Thanks for sharing the results :)

A 0.05 AUC gap is perhaps a bit too much; I could look at some examples if you were willing to share them. However, I am afraid that this is the "brutal" reality for most (true) real-world cases when it comes to DL vs GBMs.

You could try some other libraries to see whether their implementations give you better results.

In my experience, I have used DL for tabular data on a few occasions, but I never aimed to beat GBMs, since I knew it was a lost battle.

rruizdeaustri avatar rruizdeaustri commented on May 28, 2024

Hi @jrzaurin,

Yes, that is too big a difference!

I could share with you the files I'm using to train, as well as the data, if you like. Let me know!

Thanks !

jrzaurin avatar jrzaurin commented on May 28, 2024

Hey @rruizdeaustri !

I am travelling at the moment, but if you join the Slack channel we can move the conversation there and share the files. I'll see if I have time to give it a go myself! :)

Thanks!
