
Comments (14)

rafaelpadilla commented on July 21, 2024

Dear @ybcc2015,

Thanks for your comment.

The point you mentioned in the paper is implemented in our project. Note that when there is more than one detection, we order them by confidence level, as was done with detections D and E. Since D has a confidence of 71% and E has 54%, detection D comes first, as shown in this table.

The paper says (page 12, 2nd paragraph)

"Detections output by a method were assigned to ground truth objects satisfying the overlap criterion in order ranked by the (decreasing) confidence output."

We should not take the one with the highest confidence and discard the others. We should consider all detections ordered by their confidence level.

Further, the paper says:

"Multiple detections of the same object in an image were considered false detections e.g. 5 detections of a single object counted as 1 correct detection and 4 false detections – it was the responsibility of the participant’s system to filter multiple detections from its output."

This means that participants in the PASCAL VOC challenge are responsible for filtering duplicate detections themselves. Otherwise, the evaluation considers every detection (as we did in the example with detections D and E).

We also followed the official PASCAL VOC implementation.

I hope I was clear. My best regards.

from object-detection-metrics.

ybcc2015 commented on July 21, 2024

Thanks for your reply.

"Detections output by a method were assigned to ground truth objects satisfying the overlap criterion in order ranked by the (decreasing) confidence output."

My understanding of this sentence is: if there are multiple detections of the same object in an image, we first sort them by decreasing confidence and then choose the first detection that satisfies the overlap criterion as the TP.

"In some images there are more than one detection overlapping a ground truth (Images 2, 3, 4, 5, 6 and 7). For those cases the detection with the highest IOU is taken, discarding the other detections."

I may be confused by the phrase "with the highest IOU is taken".

For example, Image 2 has three detections: D (71%), E (54%) and F (74%). After sorting, their order is F -> D -> E. Suppose D and E satisfy the overlap criterion (IOU >= 30%), F does not, and D's IOU is smaller than E's. In this case, I think we should choose D as the TP because D is the first one 'seen', and the others should be FP.
You choose E as the TP because only E satisfied the overlap criterion, right?

Hope you can understand what I mean.
Best regards.

rafaelpadilla commented on July 21, 2024

Hi @ybcc2015,

I think you are right. The sentence in my text that says

"In some images there are more than one detection overlapping a ground truth (Images 2, 3, 4, 5, 6 and 7). For those cases the detection with the highest IOU is taken, discarding the other detections."

seems a bit confusing. Thank you for your feedback. I will correct it.

I think the sentence should be: "In some images there are more than one detection overlapping a ground truth (Images 2, 3, 4, 5, 6 and 7). For those cases, the detection with the highest IOU is considered TP, and the others are considered FP."

During your analysis, keep in mind the order of the following steps:

[step 1] The IOU between a detection and a ground truth only determines whether the detection is classified as TP or FP. If more than one detection overlaps the same object, we consider the one with the highest IOU as TP and the other(s) as FP.
[step 2] Afterwards, we order the detections by their confidence level to calculate the Precision and Recall values.
[step 3] Plot and choose a method (11-point interpolation or all points) to obtain the AP.
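
Steps 2 and 3 can be sketched roughly as follows. This is a minimal illustration and not the project's code: the function names `precision_recall`, `ap_all_points` and `ap_11_points` are hypothetical, and the input is assumed to be a list of TP/FP flags already sorted by decreasing confidence.

```python
def precision_recall(tp_flags, num_ground_truths):
    """tp_flags: 1 for TP, 0 for FP, already sorted by decreasing confidence."""
    precisions, recalls = [], []
    tp_cum = 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp_cum += flag
        precisions.append(tp_cum / rank)               # TP / (TP + FP) so far
        recalls.append(tp_cum / num_ground_truths)     # TP / all ground truths
    return precisions, recalls


def ap_all_points(precisions, recalls):
    """Area under the interpolated precision x recall curve."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Replace each precision with the maximum precision at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the rectangles wherever the recall actually changes.
    return sum((mrec[i] - mrec[i - 1]) * mpre[i]
               for i in range(1, len(mrec)) if mrec[i] != mrec[i - 1])


def ap_11_points(precisions, recalls):
    """Mean of the interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for k in range(11):
        level = k / 10
        candidates = [p for p, r in zip(precisions, recalls) if r >= level]
        total += max(candidates) if candidates else 0.0
    return total / 11


# Three detections (TP, FP, TP) evaluated against two ground truths.
precisions, recalls = precision_recall([1, 0, 1], num_ground_truths=2)
print(ap_all_points(precisions, recalls))
print(ap_11_points(precisions, recalls))
```

The two interpolation methods generally give slightly different AP values for the same detections, so it matters which one is reported.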

To answer your question ("You choose E as the TP because only E satisfied the overlap criterion, right?"): yes. As I mentioned above (step 1), we must consider the one with the highest IOU. The first "seen" criterion is uncertain, since we cannot tell which detection was "seen" first. Note that the same criterion was used in images 2, 3, 4, 5, 6 and 7.

Best regards,
Rafael

ybcc2015 commented on July 21, 2024

Hi @rafaelpadilla ,

Does [step 1] mean that only the IOU is considered when assigning TP or FP, regardless of confidence, so we should select the one with the highest IOU as TP?

If so, I think it is not correct. I have read your code. In line 90 of Evaluator.py, you first sort the detections by decreasing confidence:

dects = sorted(dects, key=lambda conf: conf[2], reverse=True)

Then, in lines 111 to 117:

if iouMax >= IOUThreshold:
    if det[dects[d][0]][jmax] == 0:
        TP[d] = 1  # count as true positive
        det[dects[d][0]][jmax] = 1  # flag as already 'seen'
        # print("TP")
    else:
        FP[d] = 1  # count as false positive

The code above indicates that, when a GT has multiple detections, the first detection (after sorting) that satisfies the IOU criterion (>= threshold) is selected as the TP (just as your comment says: "flag as already 'seen'"), instead of the one with the highest IOU.

I think your code is correct; it does exactly what I mean.
But I think your sentence is inconsistent with the code. (Maybe my understanding is wrong ~~)

rafaelpadilla commented on July 21, 2024

@ybcc2015,

Thanks for your reply. I like when comments like yours come up. :)

You are right. We first sort the detections by their confidence, and only afterwards do we use the IOU criterion to determine which detections are TP or FP.

Taking a look at the code, the steps are:

  1. Order all detections by decreasing confidence.
  2. For each detection:
    2.1) Find the image this detection belongs to and get all ground truth bounding boxes that belong to its image.
    2.2) Among all ground truths found in the previous step, get the one with the highest IOU.
    2.3) If the highest IOU satisfies the threshold and that ground truth has not already been matched to an earlier detection ('seen'), set the detection as TP. Otherwise, set it as FP.
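
The steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from Evaluator.py: the names `iou` and `classify` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and class labels are ignored.

```python
def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def classify(detections, ground_truths, iou_threshold=0.5):
    """detections: list of (image_id, confidence, box).
    ground_truths: dict mapping image_id to a list of boxes.
    Returns (image_id, confidence, 'TP' or 'FP') in decreasing confidence."""
    # One 'seen' flag per ground truth box.
    seen = {img: [False] * len(boxes) for img, boxes in ground_truths.items()}
    results = []
    # Step 1: order all detections by decreasing confidence.
    for image_id, conf, box in sorted(detections, key=lambda d: d[1], reverse=True):
        # Step 2.1: ground truths that belong to this detection's image.
        gts = ground_truths.get(image_id, [])
        # Step 2.2: the ground truth with the highest IOU.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gts):
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_idx = value, idx
        # Step 2.3: TP only if the IOU passes and the ground truth is still free.
        if best_idx >= 0 and best_iou >= iou_threshold and not seen[image_id][best_idx]:
            seen[image_id][best_idx] = True
            results.append((image_id, conf, "TP"))
        else:
            results.append((image_id, conf, "FP"))
    return results


gts = {"img": [(0, 0, 10, 10)]}
dets = [("img", 0.8, (0, 0, 10, 10)),   # IOU 1.0, lower confidence
        ("img", 0.9, (0, 0, 10, 8))]    # IOU 0.8, higher confidence
print(classify(dets, gts))
```

In this sketch the 'seen' flag belongs to the ground truth, so the order in which detections are visited decides which of several overlapping detections can claim it.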

In your previous message, you said: "... after the detections is sorted, the first one that satisfies the IOU criterion is selected as the TP". Actually, we do not take the first one that satisfies the IOU criterion. We loop over all ground truths within the image and take the one with the highest IOU (lines 104 to 109). That's why, in the case where a GT has multiple detections, I don't think the confidence matters when we classify detections as TP or FP. Can you come up with a case where the confidence matters?

A different situation occurs when a detection overlaps two ground truths (like in image 7).

ybcc2015 commented on July 21, 2024

Thanks for the great reply. It is very interesting to discuss this with you. My English is poor and some sentences may be unclear; I hope that does not bother you. (•ᴗ•)

Yes, you are right. For a specific detection, we loop over all GTs within the image and take the one with the highest IOU.

What I mean is: for a GT (suppose there is only one GT in the image), we loop over all detections (after sorting) and take the first one that satisfies the IOU criterion as the TP.
If "take the first one that satisfies the IOU criterion" is replaced with "take the one with the highest IOU", then it is wrong, right?

Best regards.

rafaelpadilla commented on July 21, 2024

@ybcc2015 ,

I also like this discussion. :) Your English is very good.

The correct answer is "take the one with the highest IOU", because that is what the code does and also what is done in the README example.

If we take the one which first satisfies the IOU criterion, we would have different results depending on which is seen first.

Maybe the case that makes a difference is when a detection overlaps two ground truths (like in image 7).

ybcc2015 commented on July 21, 2024

@rafaelpadilla
Thanks for your compliment. In fact, this is the first time I have communicated with others in English. (Thanks Google Translate ヾ(๑╹◡╹)ノ" )

Yes, when a detection overlaps multiple ground truths, we take the GT with the highest IOU.

Let's look at image 2, which has three detections: D (71%), E (54%) and F (74%). Assume their IOUs with the bottom-left GT are 0.4, 0.7 and 0 respectively (obviously, their IOUs with the top-right GT are all 0), and the IOU threshold is 0.3.

In this case, D and E overlap the same GT. The code will take D (higher confidence than E) as the bottom-left GT's TP instead of E (higher IOU than D). That's exactly what I mean.
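
The example above can be checked with a small sketch that compares the two candidate rules on the same numbers. The confidences and IOUs are the assumed values from this comment, only the bottom-left GT is modelled, and the function names `first_seen` and `highest_iou` are hypothetical.

```python
IOU_THRESHOLD = 0.3

# name: (confidence, assumed IOU with the bottom-left ground truth)
detections = {"D": (0.71, 0.4), "E": (0.54, 0.7), "F": (0.74, 0.0)}


def first_seen(dets, threshold):
    """Visit detections by decreasing confidence; the first one that
    passes the IOU threshold claims the ground truth."""
    gt_taken = False
    labels = {}
    ranked = sorted(dets.items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (_conf, overlap) in ranked:
        if overlap >= threshold and not gt_taken:
            labels[name] = "TP"
            gt_taken = True
        else:
            labels[name] = "FP"
    return labels


def highest_iou(dets, threshold):
    """Ignore confidence; the detection with the largest IOU above the
    threshold claims the ground truth."""
    eligible = {n: o for n, (_conf, o) in dets.items() if o >= threshold}
    winner = max(eligible, key=eligible.get) if eligible else None
    return {n: ("TP" if n == winner else "FP") for n in dets}


print(first_seen(detections, IOU_THRESHOLD))   # {'F': 'FP', 'D': 'TP', 'E': 'FP'}
print(highest_iou(detections, IOU_THRESHOLD))  # {'D': 'FP', 'E': 'TP', 'F': 'FP'}
```

Both rules produce one TP and two FPs here, but they attach the TP to different detections, which changes where the TP falls in the confidence ranking and therefore the precision x recall curve.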

rafaelpadilla commented on July 21, 2024

Hi @ybcc2015,

Yes, you are right! In the example you gave, the TP will be detection D, even though its IOU (0.4) is less than the IOU of E (0.7). This happens because detection D is the first one 'seen', just as you said.

Could you please check the official PASCAL VOC implementation here and confirm whether their algorithm is the same as the one implemented in our code? It seems their page is down today, but it is also available here.

Best regards

ybcc2015 commented on July 21, 2024

Hi @rafaelpadilla ,

Unfortunately, I don't understand the MATLAB code. However, I have read your code in detail, and I think it is a correct implementation.

Best regards.(•ᴗ•)

rafaelpadilla commented on July 21, 2024

@ybcc2015 ,

Can I close this issue?

ybcc2015 commented on July 21, 2024

@rafaelpadilla ,

of course you can.

kulkarnivishal commented on July 21, 2024

Shouldn't you also compare the detected class with the GT class, in addition to comparing the IOU?
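
One common way to take the class into account is to group detections and ground truths by class and run the matching separately for each group. The sketch below only illustrates that idea; the record layout (dicts with a `label` key) and the helper `group_by_class` are assumptions, and it does not describe how this project handles classes.

```python
from collections import defaultdict


def group_by_class(items):
    """items: iterable of dicts with a 'label' key (hypothetical layout)."""
    groups = defaultdict(list)
    for item in items:
        groups[item["label"]].append(item)
    return dict(groups)


detections = [
    {"label": "cat", "confidence": 0.9},
    {"label": "dog", "confidence": 0.8},
]
ground_truths = [{"label": "cat"}]

dets_by_class = group_by_class(detections)
gts_by_class = group_by_class(ground_truths)

for label, dets in dets_by_class.items():
    # A detection may only be matched against ground truths of its own
    # class; with no candidates, every detection of that class is an FP.
    candidates = gts_by_class.get(label, [])
    print(label, len(dets), "detection(s),", len(candidates), "ground truth(s)")
```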

rafaelpadilla commented on July 21, 2024

Dear @ybcc2015,

Today I was reading the README and noticed the incorrect explanation of the criterion used when 2 or more detections overlap the same ground truth. You are right in your point. We consider the first one as TP. So, the one as the highest IOU is considered.

Sorry for the misunderstanding. Thank you for your comment.
