Comments (12)

gujiuxiang commented on August 16, 2024

The organizers have not provided a ground-truth (GT) JSON file for evaluation, and the demo they gave is unclear. I wrote a baseline; please point out any problems you find.
https://github.com/gujiuxiang/chinese_im2text.pytorch/blob/master/caption_eval/data/coco_val_caption_validation_annotations_20170910.json

from ai_challenger_2017.

wjb123 commented on August 16, 2024

There is another problem: the evaluation code is not very user-friendly. For example, users have to convert the reference data into the following format themselves:
{
"caption": "\u4e00\u4e2a \u957f \u5934\u53d1 \u7684 \u5973\u4eba \u5750\u5728 \u6d77\u8fb9 \u6c99\u6ee9 \u7684 \u6905\u5b50 \u4e0a \u770b\u7740 \u84dd\u5929 \u5927\u6d77",
"id": 1,
"image_id": 6678090646845985471
},
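This involves word-segmenting the references and re-hashing the IDs. Couldn't you provide a script that automatically converts the competition references into this format?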
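The conversion requested above could look roughly like the following sketch. It is not the organizers' script. The input layout (a list of records with an `image_id` file name and a `caption` list), the SHA-256 based ID hash, and the helper names `char_segment`, `get_segmenter`, `hash_image_id`, and `to_coco_annotations` are all assumptions made for illustration. It uses jieba when it is installed and falls back to per-character splitting otherwise.

```python
import hashlib
import json


def char_segment(text):
    """Fallback segmenter: one token per non-space character."""
    return [ch for ch in text if not ch.isspace()]


def get_segmenter():
    """Return a jieba-based segmenter if jieba is installed, else the fallback."""
    try:
        import jieba  # third-party; optional in this sketch
    except ImportError:
        return char_segment
    return lambda text: [tok for tok in jieba.cut(text) if tok.strip()]


def hash_image_id(name):
    """Map an image file name to a non-negative integer below 2**63.

    Assumed scheme for illustration; the official hash may differ.
    """
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16) % (2 ** 63 - 1)


def to_coco_annotations(records, segment=None):
    """Convert raw caption records to COCO-style annotation dicts.

    Each record is assumed to look like
    {"image_id": "xxx.jpg", "caption": ["...", "..."]}.
    """
    segment = segment or get_segmenter()
    annotations = []
    next_id = 1
    for record in records:
        image_id = hash_image_id(record["image_id"])
        for caption in record["caption"]:
            tokens = segment(caption.strip())
            annotations.append({
                "caption": " ".join(tokens),  # space-separated tokens
                "id": next_id,                # running annotation id
                "image_id": image_id,
            })
            next_id += 1
    return annotations


if __name__ == "__main__":
    demo = [{"image_id": "example.jpg", "caption": ["蓝天大海"]}]
    print(json.dumps(to_coco_annotations(demo), ensure_ascii=True, indent=2))
```

Passing the segmenter as a parameter keeps the choice of tokenizer explicit, which matters here because the references and the predictions must be segmented the same way.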

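AIChallenger commented on August 16, 2024

@wjb123 That is a good point. Segmenting only once and then modifying the PTBTokenizer part of the evaluation code would also work. Our original reasoning was to minimize mistakes: we adapted our input format as far as possible and changed the COCO evaluation code as little as possible, so that a partially modified code path would not introduce unknown errors. Thanks!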

wjb123 commented on August 16, 2024

@AIChallenger OK. Which word segmenter do you use, then? Will the corresponding part of the code be changed?

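AIChallenger commented on August 16, 2024

@wjb123 We currently use jieba for segmentation. The PTBTokenizer part will not be changed for now. We will change it later only if it really does produce incorrect scores; if it does not affect the scores, we will leave it as is.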

wjb123 commented on August 16, 2024

All right, but since different segmenters do affect the final evaluation metrics, shouldn't you test this?

wangheda commented on August 16, 2024

"if it really does produce incorrect scores"

So are the evaluation scores currently consistent across different segmenters?

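AIChallenger commented on August 16, 2024

[Correction] As confirmed by the organizers' judging committee, to ensure the fairness of the competition and the comparability of the evaluation standard, the AI Challenger Chinese image captioning competition will uniformly use jieba 0.38 for word segmentation. We wish all participants good results!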
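Since the announcement pins a specific segmenter version, a local evaluation setup might check the installed version before scoring. The sketch below uses only the standard library; the helper names `installed_version` and `segmenter_matches` are hypothetical, and the assumption is that the package is installed under the distribution name `jieba`.

```python
from importlib import metadata

# Version named in the organizers' correction above.
REQUIRED_JIEBA = "0.38"


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def segmenter_matches(required=REQUIRED_JIEBA, package="jieba"):
    """True only if the installed package version equals the required one."""
    return installed_version(package) == required


if __name__ == "__main__":
    if not segmenter_matches():
        print("jieba", REQUIRED_JIEBA, "not found; installed:",
              installed_version("jieba"))
```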

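happygds commented on August 16, 2024

@AIChallenger We previously trained on input segmented with thulac. During training the validation metrics looked high, but they dropped considerably once we ran this evaluation code, and we believe the different segmentation is the cause. I therefore switched the segmenter in the evaluation code from jieba to thulac, and the metrics changed a lot: CIDEr, for example, rose from a little over 0.7 to a little over 1.7. The sentences are identical; only the segmentation method differs. This is quite interesting. My question is: in the final evaluation, if different segmentation methods are in play, how will you decide which metric values to use?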

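chenghuige commented on August 16, 2024

Having everyone evaluated with the same segmentation standard is perfectly reasonable. Different segmentation granularities will of course give different metrics: coarse-grained, fine-grained, or even single-character segmentation can produce very different results. As long as everyone is evaluated with the same segmentation standard, it is fair.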
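The effect of granularity can be illustrated with a toy example. The sketch below computes clipped n-gram precision, the building block of BLEU-style metrics, for one candidate and one reference under word-level and character-level segmentation. It is not CIDEr and does not reproduce the numbers reported in this thread; the sentences and the word boundaries are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference, with clipping."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Same two sentences, segmented at two granularities (hand-picked boundaries).
word_ref = "一个 女人 坐在 椅子 上".split()
word_cand = "一个 女人 坐在 沙滩 上".split()
char_ref = list("".join(word_ref))
char_cand = list("".join(word_cand))

if __name__ == "__main__":
    for n in (1, 2):
        print(n,
              clipped_precision(word_cand, word_ref, n),
              clipped_precision(char_cand, char_ref, n))
```

The sentences are identical in both runs, yet the precision values differ between the two granularities, which is why scores are only comparable when every submission is segmented the same way.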

fword commented on August 16, 2024

Is there a script for generating the id_to_words.json file?

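Xiong-can commented on August 16, 2024

"The organizers have not provided a ground-truth (GT) JSON file for evaluation, and the demo they gave is unclear. I wrote a baseline; please point out any problems you find. https://github.com/gujiuxiang/chinese_im2text.pytorch/blob/master/caption_eval/data/coco_val_caption_validation_annotations_20170910.json"

The link is dead. Could you share the file again? Thank you.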