Comments (2)

pbontrager commented on June 9, 2024

If you look in the TextVQA config, you can see that answers are processed with the "simple_word" tokenizer found here. Also, the vocabulary should be in textvqa/defaults/extras/vocabs/fixed_answer_vocab_textvqa_5k.txt.

from mmf.

soonchangAI commented on June 9, 2024

Thanks, @pbontrager. I have tried it, but several words still differ. I think it's probably because there are 10 ground-truth answers available for each question, while I only use the first one. The following is my code:

```python
from collections import Counter


def word_tokenize(word, remove=None):
    # Lowercase, drop "," and "?", and split off possessive "'s".
    if remove is None:
        remove = [",", "?"]
    word = word.lower()

    for item in remove:
        word = word.replace(item, "")
    word = word.replace("'s", " 's")

    return word.strip()


# `imdb` is assumed to be loaded already; entry 0 is skipped (the loop starts at 1).
answers = []
for i in range(1, len(imdb)):
    # Only the first ground-truth answer of each entry is used here.
    words = imdb[i]["answers"][0].split()

    ans_word = [word_tokenize(word) for word in words]
    clean_word = []
    for w in ans_word:
        clean_word += w.split()
    answers += clean_word

# Count every token in one pass instead of calling answers.count() per word.
word_count = Counter(answers)
print(len(word_count))

# Sort by ascending frequency and keep the 5000 most frequent tokens.
sort_word_count = {k: v for k, v in sorted(word_count.items(), key=lambda item: item[1])}
freq_words = list(sort_word_count.keys())[-5000:]
```
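One way to check the hypothesis above (that the mismatch comes from using only the first of the 10 ground-truth answers) is to count tokens over all answers and compare the resulting vocabulary with the first-answer-only one. The following is a minimal sketch, not a confirmed description of how the vocabulary file was produced. The helper names `count_answer_tokens`, `top_k_vocab`, and `vocab_diff` are hypothetical, and the toy entries only mimic the `answers` field used in the snippet above.

```python
from collections import Counter


def word_tokenize(word, remove=None):
    # Same normalization as in the snippet above.
    if remove is None:
        remove = [",", "?"]
    word = word.lower()
    for item in remove:
        word = word.replace(item, "")
    word = word.replace("'s", " 's")
    return word.strip()


def count_answer_tokens(entries, use_all_answers=True):
    """Count tokens over each entry's answers (hypothetical helper)."""
    counts = Counter()
    for entry in entries:
        # Either every ground-truth answer, or only the first one.
        answers = entry["answers"] if use_all_answers else entry["answers"][:1]
        for answer in answers:
            for word in answer.split():
                counts.update(word_tokenize(word).split())
    return counts


def top_k_vocab(counts, k=5000):
    """Return the k most frequent tokens, most frequent first."""
    return [token for token, _ in counts.most_common(k)]


def vocab_diff(candidate, reference):
    """Return (tokens only in candidate, tokens only in reference)."""
    candidate, reference = set(candidate), set(reference)
    return sorted(candidate - reference), sorted(reference - candidate)


# Toy data standing in for imdb[1:]; real entries are assumed to have
# an "answers" list of strings, as in the snippet above.
toy_entries = [
    {"answers": ["Coca Cola", "coca cola", "coke"]},
    {"answers": ["Nike's", "nike", "nike"]},
]

first_only = count_answer_tokens(toy_entries, use_all_answers=False)
all_answers = count_answer_tokens(toy_entries, use_all_answers=True)

only_first, only_all = vocab_diff(first_only, all_answers)
print("tokens missing when only the first answer is used:", only_all)
```

With real data, `toy_entries` would be replaced by `imdb[1:]`, and the second argument of `vocab_diff` by the tokens read from the vocabulary file (assuming it stores one token per line, which I have not verified).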
