Comments (7)
Also, do you have any pointers for hyperparameters while scaling up? The full dataset I am trying to run is ~40M rows; I'm tuning the hyperparameters on this 10% sample before applying them to the full dataset.
from tabnet.
Your train/val plot looks suspicious to me: it is strange that train and validation have exactly the same scores at every epoch.
Have you tried learning rate decay?
I'm using OneCycleLR right now, open to suggestions to try (scheduler or lr values) and I can follow up with results here.
Also, I have been using log-cosh loss as my objective and MASE as my eval metric. My regression target is heavily right-skewed, so I recently tried RMSLE, but that didn't change the dynamic you see above.
Here is an example where I trained with log-cosh loss as my objective and used MAE as my eval metric (ignore the legend); this is a bit better but still pretty volatile.
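For readers unfamiliar with the losses and metrics named in this comment, here is a minimal plain-Python sketch of log-cosh, MAE and RMSLE. The thread does not say how MASE is scaled, so `scaled_mae` is a hypothetical helper that simply divides MAE by a baseline MAE supplied by the caller. All function names are illustrative; a loss passed to pytorch-tabnet would need to be written with torch tensors instead.

```python
import math


def log_cosh_loss(y_true, y_pred):
    """Mean of log(cosh(error)), computed without overflow for large errors."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        a = abs(p - t)
        # Identity: log(cosh(a)) == a + log1p(exp(-2a)) - log(2)
        total += a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)
    return total / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmsle(y_true, y_pred):
    """Root mean squared log error; every value must be greater than -1."""
    squared = [(math.log1p(p) - math.log1p(t)) ** 2
               for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def scaled_mae(y_true, y_pred, baseline_mae):
    """MAE divided by a baseline MAE (assumption: the baseline is chosen by the caller)."""
    if baseline_mae <= 0:
        raise ValueError("baseline_mae must be positive")
    return mae(y_true, y_pred) / baseline_mae
```

Log-cosh behaves like squared error for small residuals and like absolute error for large ones, which is one reason it is sometimes chosen for skewed targets.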
I tried reducing the number of epochs and pct_start in OneCycleLR and got the following:
Much more stable but still not seeing the training MASE getting much better than validation.
More playing yielded more of the same. XGBoost is often able to get down to <0.4 MASE but I can't seem to get tabnet below ~0.45
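To make the effect of `pct_start` concrete, here is a rough plain-Python sketch of a one-cycle learning rate curve with cosine annealing. It approximates the shape of PyTorch's `OneCycleLR` (two phases, with that scheduler's documented defaults for `div_factor` and `final_div_factor`), but it is not guaranteed to match PyTorch step for step. The `max_lr` and step counts in the usage lines are made-up values, not the ones used in this thread.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (1.0 + math.cos(math.pi * pct))


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Approximate one-cycle learning rate at a given step (0-indexed)."""
    if total_steps < 2 or not 0.0 < pct_start < 1.0:
        raise ValueError("need total_steps >= 2 and 0 < pct_start < 1")
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    # Last step of the warm-up phase; clamped so tiny schedules stay valid.
    warmup_end = max(1.0, pct_start * total_steps - 1)
    last_step = total_steps - 1
    if step <= warmup_end:
        return cosine_anneal(initial_lr, max_lr, step / warmup_end)
    pct = (step - warmup_end) / (last_step - warmup_end)
    return cosine_anneal(max_lr, min_lr, pct)


# Illustrative values only: a smaller pct_start reaches the peak earlier
# and spends more of the run decaying the learning rate.
peak_default = max(range(100), key=lambda s: one_cycle_lr(s, 100, 0.02, 0.3))
peak_early = max(range(100), key=lambda s: one_cycle_lr(s, 100, 0.02, 0.1))
```

A shorter warm-up leaves more steps at a decaying learning rate, which is consistent with the smoother curves reported above, though the sketch does not by itself explain them.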
A large batch size often plays the role of a regularization method because of the batch norm used during training. At the cost of a longer training time you can try to significantly lower batch_size and virtual_batch_size (like 64). I'm not sure you'll get better validation performance but you should be able to see some overfitting.
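The suggestion above could be sketched as a small helper that builds keyword arguments for `fit`. The helper name and the default values are illustrative assumptions; `batch_size`, `virtual_batch_size`, `max_epochs` and `patience` are parameters of pytorch-tabnet's `fit`. The divisibility check reflects the library's documented recommendation as far as I know, and `patience=0` is meant to disable early stopping so that any overfitting has time to show up.

```python
def small_batch_fit_kwargs(batch_size=64, virtual_batch_size=64,
                           max_epochs=100, patience=0):
    """Build fit() keyword arguments for a small-batch overfitting experiment."""
    if virtual_batch_size > batch_size:
        raise ValueError("virtual_batch_size must not exceed batch_size")
    if batch_size % virtual_batch_size != 0:
        # Ghost batch norm splits each batch into virtual batches.
        raise ValueError("virtual_batch_size should divide batch_size")
    return {
        "batch_size": batch_size,
        "virtual_batch_size": virtual_batch_size,
        "max_epochs": max_epochs,
        # Assumption: 0 turns early stopping off in pytorch-tabnet.
        "patience": patience,
    }


# Usage sketch (requires pytorch-tabnet and your own arrays):
# from pytorch_tabnet.tab_model import TabNetRegressor
# model = TabNetRegressor()
# model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)],
#           **small_batch_fit_kwargs())
```

Expect training to be much slower at this batch size, especially on a multi-million-row sample.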
Doing this, I still was unable to get the model to overfit... Are there any other hyperparameters I should be looking to change to help this?
Try larger n_d and n_a and a larger number of steps (n_steps): more model capacity should make it possible for the model to overfit.
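One way to act on this is to step through configurations of increasing capacity and watch for the point where the training metric pulls away from validation. The sketch below only builds the constructor arguments; the helper name and the specific widths and step counts are illustrative assumptions, while `n_d`, `n_a` and `n_steps` are pytorch-tabnet constructor parameters.

```python
def capacity_ladder(widths=(8, 16, 32, 64), steps=(3, 5, 7)):
    """Return TabNet constructor kwargs ordered from small to large capacity."""
    configs = []
    for n_steps in steps:
        for width in widths:
            # n_d and n_a are kept equal here, a common simplification.
            configs.append({"n_d": width, "n_a": width, "n_steps": n_steps})
    return configs


# Usage sketch (requires pytorch-tabnet):
# from pytorch_tabnet.tab_model import TabNetRegressor
# for config in capacity_ladder():
#     model = TabNetRegressor(**config)
#     ...fit and compare train vs. validation scores...
```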