Comments (9)
Thanks for your answer; that may be one possible reason. I also have a question about max-pooling: on SSL-pretrained ViT features, its performance even surpasses ABMIL, and I observed the same phenomenon in my own experiments. Can this be explained by the feature representations already being good enough that there is no need for additional learnable parameters?
from acmil.
Certainly, I share your viewpoint. I have obtained similar results on the CAMELYON dataset with SSL-pretrained ViT embeddings.
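To make the comparison concrete, here is a minimal sketch of parameter-free max-pooling aggregation over instance embeddings. All shapes and names are illustrative (e.g. 500 patches with 384-dim ViT-S features), not taken from the acmil code:

```python
import numpy as np

rng = np.random.default_rng(0)

def max_pool_bag(instance_feats):
    # instance_feats: (num_instances, feat_dim) -> (feat_dim,) bag embedding
    # Note: no learnable parameters at all -- the aggregator cannot overfit;
    # everything rests on the quality of the pretrained instance features.
    return instance_feats.max(axis=0)

bag = rng.normal(size=(500, 384))  # a bag of 500 patch embeddings, dim 384
z = max_pool_bag(bag)              # bag-level embedding of shape (384,)
```

If the SSL-pretrained features are already strongly discriminative per instance, this parameter-free reduction can match or beat learned pooling.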
I share a similar question and haven't fully unraveled it. My attempt at an explanation is as follows: while TransMIL makes strides in modeling the correlation between instances, its two stacked attention layers introduce additional parameters, making TransMIL more susceptible to overfitting. In such cases, the quality of the representation becomes pivotal for using TransMIL effectively, and a better representation benefits TransMIL more. Notably, the three groups of results used ResNet18 (our paper), ResNet50 (TransMIL paper), and SSL-pretrained ViT (our paper) backbones, and this choice significantly impacted the final outcomes.
Furthermore, it's crucial to note that the implementation details and the selection of hyperparameters can also influence performance. In response to this, I've updated the implementations of ABMIL, TransMIL, and other baselines. I hope this provides some clarity.
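For contrast with parameter-free max-pooling, here is a rough numpy sketch of the extra learnable parameters that ABMIL-style attention pooling introduces (TransMIL's stacked self-attention layers add far more). The matrices V and w and all dimensions are illustrative, not the acmil implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def abmil_pool(H, V, w):
    # H: (N, D) instance embeddings
    # V: (L, D) and w: (L,) are the learnable attention parameters
    scores = np.tanh(H @ V.T) @ w       # one unnormalized score per instance
    a = np.exp(scores - scores.max())
    a = a / a.sum()                     # softmax attention weights over instances
    return a @ H                        # (D,) attention-weighted bag embedding

H = rng.normal(size=(500, 384))         # 500 patch embeddings, dim 384
V = rng.normal(size=(128, 384)) * 0.01  # hidden dim L = 128, illustrative
w = rng.normal(size=128)
z = abmil_pool(H, V, w)
```

These extra parameters must be fit on typically small WSI datasets, which is one way to see why weaker features plus more parameters can underperform max-pooling.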
Also, could the authors open-source their splits for reproducing the results? Since the original work only says "randomly split", the results are not easy to reproduce.
Thank you for your suggestion. I have uploaded the split files with five seeds for the CAMELYON dataset; please check the files in splits/camelyon.
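For anyone who wants to regenerate comparable per-seed splits rather than use the uploaded files, a sketch of seeded random splitting. The `make_split` helper and the train/val/test fractions are hypothetical, not the repo's actual procedure:

```python
import numpy as np

def make_split(slide_ids, seed, val_frac=0.1, test_frac=0.2):
    # Deterministic random split for a given seed; fractions are illustrative.
    rng = np.random.default_rng(seed)
    ids = np.array(slide_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    return {"test": ids[:n_test].tolist(),
            "val": ids[n_test:n_test + n_val].tolist(),
            "train": ids[n_test + n_val:].tolist()}

# One split per seed, mirroring the five-seed protocol mentioned above.
splits = {s: make_split([f"slide_{i}" for i in range(100)], seed=s)
          for s in range(5)}
```

Fixing the seed makes each split reproducible, which is exactly what sharing split files achieves without requiring readers to rerun this.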
Hi, I have one more question: what learning rate should be used to reproduce the results of ABMIL and TransMIL?
We set the learning rate to 0.0001 for ViT-based features and 0.0002 for ResNet-based features, for all MIL models.
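In code form, the reported settings amount to something like the following; `pick_lr` and the lookup table are hypothetical helpers, not part of the acmil repo:

```python
# Learning rates reported in this thread:
# 1e-4 for ViT-based features, 2e-4 for ResNet-based features, for all MIL models.
LEARNING_RATES = {"vit": 1e-4, "resnet": 2e-4}

def pick_lr(feature_type):
    # feature_type: "vit" or "resnet"; raises KeyError otherwise
    return LEARNING_RATES[feature_type]
```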
Thank you for your reply. I notice the paper mentions training the model for 100 epochs, so which model do you use for testing: the one with the minimal loss, or the one from the last epoch?
We observed that the model with the minimal validation loss and the one from the last epoch exhibit similar performance across three datasets. Following established studies, we chose to use the model with minimal validation loss. Thank you.
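The selection rule described above can be sketched as follows; `select_checkpoint` is an illustrative helper, not the repo's code:

```python
def select_checkpoint(val_losses):
    # Index of the epoch with minimal validation loss, e.g. over the
    # 100 training epochs mentioned in the thread. Ties go to the
    # earliest epoch, matching Python's min() behavior.
    return min(range(len(val_losses)), key=val_losses.__getitem__)
```

In practice one would save a checkpoint per epoch (or only when the validation loss improves) and load the checkpoint at this index for testing.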