Comments (12)

le-big-mac commented on July 17, 2024

Hi @WingCode, we're very glad you've taken an interest in this issue! The way the unitaryHack bounty tracking works means that we'll assign this issue to you if you are the first one to open a PR that solves it. You can work on it without being assigned and open a PR on this repo, after which we'll assign the issue to you and close it if it solves the problem!

from lambeq.

dimkart commented on July 17, 2024

@WingCode Note that more than one user can work on the same issue, in which case the maintainers decide which solution is the best (or they, that is, we, can even split the bounty).

dimkart commented on July 17, 2024

This is now completed. Thank you all for your work!

WingCode commented on July 17, 2024

@le-big-mac / @le-big-mac I would like to take a stab at this issue. Could you assign it to me?

mithunpaul08 commented on July 17, 2024

@dimkart Why not use a trained FFNN/MLP that learns a mapping from new/unknown words to their FastText equivalents, as @nikhilkhatri suggested in his Master's thesis? I am using it, and it's brilliant.

dimkart commented on July 17, 2024

@mithunpaul08

> @dimkart Why not use a trained FFNN/MLP that learns a mapping from new/unknown words to their FastText equivalents, as @nikhilkhatri suggested in his Master's thesis? I am using it, and it's brilliant.

You are right, that would be much preferable, but it would be too much for this hackathon. We didn't want to add any tasks that involve real experiments.

ACE07-Sev commented on July 17, 2024

Greetings,

I have code prepared for exactly that, but it is a function I defined, not a RewriteRule class. The reason is to let the user apply it to the diagrams with respect to the dataset they are using. I read the source code, and my understanding (just what I understand for now, not saying it's impossible, hehe) is that the other rewrite rules, especially the determiner and punctuation ones, are possible because the words they apply to are defined beforehand, whereas the UNK words change with each dataset.

Proof of work: [image] to [image]

There are two functions. One applies UNK rewriting to the training data, replacing words that occur fewer than some threshold number of times. The other applies it to a test sentence, and has to check each word against the entire vocabulary of words the model has seen before.
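The two selection steps described above can be sketched in plain Python. This is only an illustration under assumptions, not the code from the notebook or PR: the function names `rare_words` and `unseen_words`, the `min_count` parameter, and the whitespace tokenisation are all hypothetical, and the vocabulary is assumed to be the training words that survive the threshold.

```python
from collections import Counter


def rare_words(train_sentences, min_count=2):
    """Return words seen fewer than `min_count` times in the training sentences."""
    counts = Counter(
        word for sentence in train_sentences for word in sentence.split()
    )
    return {word for word, n in counts.items() if n < min_count}


def unseen_words(test_sentence, vocabulary):
    """Return words of a test sentence that are absent from the vocabulary."""
    return {word for word in test_sentence.split() if word not in vocabulary}


train = ["Alice likes Bob", "Alice likes cats", "Bob hates dogs"]

# Training time: words below the threshold become UNK candidates.
rare = rare_words(train, min_count=2)

# Assumption: the model's vocabulary is every training word that was kept.
vocabulary = {word for s in train for word in s.split()} - rare

# Test time: anything outside that vocabulary becomes an UNK candidate.
unseen = unseen_words("Alice likes parrots", vocabulary)
```

Either resulting set can then be handed to whatever performs the actual box replacement on the diagrams.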

Shall I make my PR in the form of a Jupyter notebook demonstrating the approach?

ACE07-Sev commented on July 17, 2024

I finished my Jupyter notebook (with irrelevant details such as other tokenizers and other ansatzes removed). The link is below; based on feedback, I'll make a PR if requested.

My understanding of the problem:
"The overall goal of this task is to develop a DisCoPy functor that takes a list of unknown words to be replaced with UNK, and that, when passed a diagram, replaces all the boxes containing an unknown word with an UNK box corresponding to the same pregroup type."

So in my function, I define a DisCoPy Functor given the unknown words and the desired mode (low-occurrence words or never-seen-before words), and then apply the functor to diagrams to rewrite them. I think this should be OK; my only hesitation at the moment is that it does not follow the same pattern as the other rewrite rules, which I'll work on now.

https://github.com/ACE07-Sev/Quantum-Natural-Language-Processing-with-Lambeq/blob/main/QNLP-UNK.ipynb
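The idea of a functor that swaps unknown-word boxes for UNK boxes of the same pregroup type can be illustrated without any library. The sketch below is a toy stand-in, not DisCoPy or lambeq code and not the notebook's implementation: `Box`, `make_unk_functor`, and the string representation of pregroup types are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Toy stand-in for a word box: a name plus its pregroup type as a string."""
    name: str
    type: str


def make_unk_functor(unknown_words, unk_token="UNK"):
    """Build a function that rewrites a diagram, modelled here as a list of boxes."""
    unknown = set(unknown_words)

    def map_box(box):
        # Only the name changes; the pregroup type is carried over unchanged.
        return Box(unk_token, box.type) if box.name in unknown else box

    def functor(diagram):
        # Apply the box mapping to every box, preserving their order.
        return [map_box(box) for box in diagram]

    return functor


diagram = [Box("Alice", "n"), Box("likes", "n.r @ s @ n.l"), Box("parrots", "n")]
rewritten = make_unk_functor({"parrots"})(diagram)
```

Because the type is copied from the original box, the rewritten diagram composes exactly as the original did, which is the property the task description asks for.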

ACE07-Sev commented on July 17, 2024

By the way @mithunpaul08, I'd love to help you with implementing that for Lambeq.

dimkart commented on July 17, 2024

@ACE07-Sev Hi, unfortunately we cannot review code that is not part of a PR in this repository. So if you want to participate, you will have to open a proper PR here. Note, though, that we are not asking for a notebook, but for a functor, rewrite rule, or method that is available from lambeq's public interface. If in the end more than one PR is open for the same issue, we will select the solution we consider the best (or split the bounty, as mentioned above).

ACE07-Sev commented on July 17, 2024

@dimkart dear, I have made the PR. I wrote two versions of the code: one is the version I opened the PR for; the other is closer to what you would write (same structure and pattern as the other classes), but I couldn't really test it to see if it works, so I opened the PR for the one I was able to test.

I don't like my current PR, precisely because it's not a class. I am certain the idea is correct, but there is a syntax error somewhere in the class version that I can't find, hehe. I'll try to see if I can fix that as well.

from lambeq.

ACE07-Sev commented on July 17, 2024

@dimkart dear, I have made the PR in the class format. I added the rule to the Rewriter class's _available_rules; to use it, we simply pass the words and apply it to the diagram.
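For readers unfamiliar with the class-based pattern, here is a rough, library-free sketch of what "pass the words and apply it" could look like. It is not the code from the PR and does not use lambeq: `UnknownWordsRule` and `ToyRewriter` are hypothetical names, words stand in for diagram boxes, and the named-rule registry is left out for brevity.

```python
class UnknownWordsRule:
    """Hypothetical rule: replaces each listed word with an UNK token."""

    def __init__(self, unknown_words, unk_token="UNK"):
        self.unknown_words = set(unknown_words)
        self.unk_token = unk_token

    def matches(self, word):
        # The rule fires only for words it was told are unknown.
        return word in self.unknown_words

    def rewrite(self, word):
        return self.unk_token


class ToyRewriter:
    """Hypothetical rewriter that applies each rule, in order, to every word."""

    def __init__(self, rules):
        self.rules = list(rules)

    def __call__(self, words):
        result = []
        for word in words:
            for rule in self.rules:
                if rule.matches(word):
                    word = rule.rewrite(word)
            result.append(word)
        return result


rewriter = ToyRewriter([UnknownWordsRule(["parrots"])])
output = rewriter(["Alice", "likes", "parrots"])
```

Unlike rules whose target words are fixed in advance, this rule has to be constructed with the dataset-specific word list, which is the difference from the determiner and punctuation rules raised earlier in the thread.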
