EGL throws error (small-text issue, closed)

webis-de commented on July 20, 2024
EGL throws error

from small-text.

Comments (4)

chschroeder commented on July 20, 2024

The EGL strategy is tailored to KimCNN and has only been scientifically shown to be effective for that model. You are using a transformer model (judging from SequenceClassifierOutput). (I will improve the documentation and add this restriction.)

This could be adapted to work for transformers. I investigated this myself, although I was a little less experienced at the time: querying based on gradients or gradient length did not seem to be effective at all for transformer models. Since it didn't really work, I gave up on the idea.

If I had to try that again, I would try the EGL variant that operates on the gradient of the last layer. If you are interested, I also have some old code for this, but if you are looking for a query strategy that works well, I would advise against this.
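
For readers unfamiliar with the last-layer idea, here is a minimal sketch of how such a variant could be scored. It is not the old code mentioned above and not part of small-text's API. It assumes a softmax output layer trained with cross-entropy, and that you can obtain the penultimate-layer embeddings and predicted class probabilities for the unlabeled pool; the function names `egl_last_layer_scores` and `query_top_n` are hypothetical. Under these assumptions the gradient of the loss with respect to the last-layer weights and bias has a closed form, so no backward pass is needed.

```python
import numpy as np


def egl_last_layer_scores(embeddings, probas):
    """Expected gradient length restricted to the final linear layer.

    Assumes logits z = W h + b followed by softmax and cross-entropy loss.
    For a hypothesized label y, dL/dz = p - onehot(y), so
    dL/dW = (p - onehot(y)) h^T and dL/db = p - onehot(y).
    The joint gradient norm is ||p - onehot(y)|| * sqrt(||h||^2 + 1).
    """
    embeddings = np.asarray(embeddings, dtype=float)  # shape (n, d)
    probas = np.asarray(probas, dtype=float)          # shape (n, C)
    num_classes = probas.shape[1]

    # residuals[i, y, :] = p_i - onehot(y) for every hypothesized label y
    residuals = probas[:, None, :] - np.eye(num_classes)[None, :, :]
    residual_norms = np.linalg.norm(residuals, axis=2)  # shape (n, C)

    # the "+ 1" accounts for the bias term of the last layer
    feature_norms = np.sqrt((embeddings ** 2).sum(axis=1) + 1.0)

    # expectation over labels under the model's own predictive distribution
    return (probas * residual_norms).sum(axis=1) * feature_norms


def query_top_n(embeddings, probas, n):
    """Return indices of the n pool instances with the largest score."""
    scores = egl_last_layer_scores(embeddings, probas)
    return np.argsort(-scores)[:n]
```

Instances with confident predictions receive scores near zero, while uncertain instances with large embedding norms score highest. Whether this helps for transformers is exactly what the comment above questions, so treat it as a baseline to compare against, not a recommendation.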

emilysilcock commented on July 20, 2024

Interesting! That's good to know. This paper has some success with some implementation of EGL on text, but I'm not sure how that differs - gradient-based methods are really not my speciality, just comparing as a benchmark

chschroeder commented on July 20, 2024

That's right, this may be one of the few exceptions. I may have expressed this imprecisely, since this paper is a counterexample to my statement and does indeed use EGL. It seems slightly superior to random sampling, so technically it is working, but would you use it judging from that paper? It is considerably more expensive, but does not really improve upon the simpler strategies. In general, the original paper seems to be where EGL is most successful.

My own negative results on EGL might also be due to my benchmark datasets, which were mostly balanced. This is exactly the setting where Ein-Dor et al. report results that are not statistically significant for EGL.

I know this paper and I remember thinking about this as well. It seems strange that they do not provide an implementation for EGL, although they published their code for the other strategies. Still, if you look at similar papers from that time (e.g., Yuan et al., 2020; Margatina et al., 2021; or Zhang et al., 2021), EGL is no longer a common point of comparison. I'm not saying this is a good thing: in a perfect world we wouldn't be as compute-limited as is currently the case, and evaluations could include many more combinations.

emilysilcock commented on July 20, 2024

Thanks for this - this was really helpful!
