hello, I see your file for keypoint eval---' keypoint_eval.py' , and I think yo

key point eval error about ai_challenger_2017 HOT 13 CLOSED

aichallenger commented on August 16, 2024

key point eval error

from ai_challenger_2017.

Comments (13)

foolwood commented on August 16, 2024

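@AIChallenger @tensorboy It looks like the organizing committee won't fix this bug today. The first biweekly contest is certainly turning out to be exciting.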

AIChallenger commented on August 16, 2024

@zhaishengfu Thanks for posting the issue. We did consider using oks_num += np.min(oks.shape), but decided against it because it makes the evaluation unfair. For example, one way to game it is to predict ONLY ONE person per image, the one with the highest confidence score. In that case oks_num is always equal to 1, so the final score equals the single OKS of that highest-confidence prediction instead of the average over all OKS values in the image.
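To make the loophole concrete, here is a minimal sketch for a single image at a single threshold. It is not the official script: the helper `image_score`, the OKS values, and the matrix layout (rows are annotations, columns are predictions) are all assumptions chosen for illustration.

```python
import numpy as np

def image_score(oks, denom="min", threshold=0.5):
    """Simplified single-image, single-threshold score (hypothetical helper).

    oks is assumed to have shape (num_annotations, num_predictions).
    """
    matched = np.max(oks, axis=0)  # best OKS found for each prediction
    oks_num = np.min(oks.shape) if denom == "min" else np.max(oks.shape)
    return float(np.sum(matched > threshold)) / oks_num

# Three annotated people, but only the single most confident prediction is submitted.
oks_one_prediction = np.array([[0.9], [0.1], [0.0]])

score_min = image_score(oks_one_prediction, denom="min")  # 1 hit / 1
score_max = image_score(oks_one_prediction, denom="max")  # 1 hit / 3
print(score_min, score_max)
```

With the `min` denominator the two missed people cost nothing, while the `max` denominator charges for them.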

Again, thanks for the advice. Good luck!

zhaishengfu commented on August 16, 2024

I understand your point, but this is also unfair to some models. Take the following image as an example:

Your ground truth annotates only 1 person, but my model can also predict the other person, which is better than your label, isn't it? Yet the result is worse than that of a model that predicts only 1 person. So the best score does not mean the best model, and vice versa. I think this is really bad for your competition, because I believe this situation is very common in your dataset!
As for the case you mentioned, I think it should be handled as follows:
For each annotation in your ground truth, look through all the predictions and take the best OKS score, and always fix oks_num to the number of your annotations. For example, if your annotation for an image has the joints array A and I predict A1 and B1, you would compare A against A1 and B1, take the better of the two OKS values as the similarity result, and fix oks_num at 1.
I think this method handles the case you mentioned and can still pick out the really good model!
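A rough sketch of this proposal for one image, assuming the OKS matrix has one row per annotation and one column per prediction. The function name `per_annotation_score` and the numbers are made up for illustration.

```python
import numpy as np

def per_annotation_score(oks, threshold=0.5):
    """Proposed rule (sketch): best prediction per annotation, denominator = annotation count."""
    best_per_annotation = np.max(oks, axis=1)  # one value per ground-truth person
    oks_num = oks.shape[0]                     # fixed to the number of annotations
    return float(np.sum(best_per_annotation > threshold)) / oks_num

# One annotated person A; predictions A1 (close to A) and B1 (an unlabeled person).
oks = np.array([[0.8, 0.05]])

proposal_score = per_annotation_score(oks)  # B1 is ignored, A1 counts as a hit

# For contrast, a max(shape) denominator charges for the extra prediction B1.
max_shape_score = float(np.sum(np.max(oks, axis=0) > 0.5)) / np.max(oks.shape)
print(proposal_score, max_shape_score)
```

Under these assumed numbers the extra prediction B1 does not lower the proposed score, but it halves the score computed with the `max(shape)` denominator.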
Thanks, and I look forward to your reply!

AIChallenger commented on August 16, 2024

@zhaishengfu Again, we did consider the case you describe. If we set oks_num equal to the number of annotated human bodies and took the best OKS score from the submitted results, that would create a different problem. Hypothetically, every possible prediction for one human body could be submitted at once, because the evaluation script would always pick the best prediction and the rest would have no negative impact on the mAP at all.

To prevent both cheating cases (1 prediction per image, or too many predictions per human body), we carefully chose the current evaluation metric, where oks_num += np.max(oks.shape).
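The second cheating case can be sketched the same way. The values below are invented, the matrix layout (annotations by predictions) is an assumption, and the numerator here takes the best OKS per annotation purely for illustration.

```python
import numpy as np

threshold = 0.5

# One annotated person, five different guesses submitted for that same body.
oks_many = np.array([[0.2, 0.4, 0.9, 0.3, 0.1]])

hits = np.sum(np.max(oks_many, axis=1) > threshold)  # the best guess clears the threshold

# Denominator fixed to the annotation count: the four bad guesses are free.
fixed_denominator_score = float(hits) / oks_many.shape[0]

# Denominator max(shape): every extra guess is charged.
max_shape_score = float(hits) / np.max(oks_many.shape)
print(fixed_denominator_score, max_shape_score)
```

In this toy case the fixed denominator gives 1.0 and `np.max(oks.shape)` gives 0.2, which is the penalty the organizers describe.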

Thanks. Good luck!

luohuan2uestc commented on August 16, 2024

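I was just about to ask about this too. Honestly, most people in this competition are Chinese, so why are two Chinese people communicating in English?
What I would rather explain on the organizers' behalf is this: we are unwilling to change it simply because we are lazy.
Also, the impact really is large: on val it can make a difference of ten percent!
In a sense, releasing the visible bounding boxes, so that we can compute IoU and discard the extra people, might be fairer.
Thanks.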

zhaishengfu commented on August 16, 2024

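According to the official explanation, the goal is to prevent cheating. Releasing the boxes would make the task much easier. I did follow their explanation and it makes sense, but it still does not solve the problem I raised: if the given labels are inaccurate and the model is more accurate, the result gets worse. I don't think your suggestion is very feasible. Ideally a solution would address both of the cheating cases they mentioned and my problem as well. I'll post again when I think of something.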

AIChallenger commented on August 16, 2024

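@zhaishengfu Hello, our test dataset has gone through multiple rounds of manual review by a professional data annotation team, and the annotation quality meets the high-quality standard generally accepted in the industry. So there is no need to worry about the case where "the given labels are inaccurate and the model is more accurate".

Thank you for supporting AI Challenger. We wish you good results!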

foolwood commented on August 16, 2024

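@AIChallenger Many thanks to the organizing committee for providing the data and this discussion platform. I hope the committee will seriously reconsider the evaluation metric.

Please change this line in the evaluation code:
oks_all = np.concatenate((oks_all, np.max(oks, axis=0)), axis=0)

to
oks_all = np.concatenate((oks_all, np.max(oks, axis=1)), axis=0)

The reason is as follows:
With np.max(oks, axis=0), the committee is effectively computing the score on a precision basis, and that leaves a bug. Because precision is the metric, I can predict just one target (assume this prediction matches the label exactly), duplicate that result 1,000,000 times, and submit it. Then, according to the committee's evaluation code, and assuming the number of real labels is on the order of 100,000, the resulting mean OKS is (1,000,000 × 1 + 99,999 × 0) / 1,099,999, which is approximately 0.9.

The problem is that duplicate matches are not eliminated. One labeled target can be matched by multiple predictions, whereas with np.max(oks, axis=1) each prediction can match only one of the committee's labels (matching two labels is possible, but when that happens the OKS is very low anyway, so it does not matter).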
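A scaled-down sketch of this exploit, not the official script. It assumes `oks` has shape (annotations, predictions), that annotations without any prediction each add a zero and count once in `oks_num`, and it uses a single threshold. The helper `toy_score` and the counts are hypothetical.

```python
import numpy as np

def toy_score(axis, n_copies=1000, n_missed=99, threshold=0.5):
    """Toy dataset: one perfectly predicted person, duplicated n_copies times."""
    # Image 0: one annotation, n_copies identical perfect predictions.
    oks = np.ones((1, n_copies))
    # axis=0 yields one entry per prediction, axis=1 one entry per annotation.
    oks_all = np.max(oks, axis=axis)
    oks_num = np.max(oks.shape)
    # Every other annotation has no prediction: it adds a zero and counts once.
    oks_all = np.concatenate((oks_all, np.zeros(n_missed)), axis=0)
    oks_num += n_missed
    return float(np.sum(oks_all > threshold)) / oks_num

score_axis0 = toy_score(axis=0)  # 1000 hits out of 1099
score_axis1 = toy_score(axis=1)  # 1 hit out of 1099
print(score_axis0, score_axis1)
```

With `axis=0` the duplicated prediction is counted 1000 times, reproducing the roughly 0.9 figure at a smaller scale. With `axis=1` it is counted once.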

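I ask the committee to fix this as soon as possible so that it does not affect the first biweekly contest. Thank you very much.

The committee can also run a test on the backend first. Whether anyone has been exploiting the loophole will be obvious at a glance.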

foolwood commented on August 16, 2024

@AIChallenger I suggest the committee refer to the COCO evaluation code as much as possible.

foolwood commented on August 16, 2024

@AIChallenger Reminding the committee once again to fix the bug.

foolwood commented on August 16, 2024

@AIChallenger Committee, please give serious consideration to fixing the bug.

tensorboy commented on August 16, 2024

I agree with @foolwood!

AIChallenger commented on August 16, 2024

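@foolwood Thank you very much for pointing out the oversight in the code. We have fixed the problem and have also updated the evaluation script on GitHub. Thank you again for your great help and support, and we wish you good results in the competition!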
