
dbolya / tide

699 stars, 17 watchers, 114 forks, 12.28 MB

A General Toolbox for Identifying Object Detection Errors

Home Page: https://dbolya.github.io/tide

License: MIT License

Python 100.00%
object-detection instance-segmentation evaluation toolbox errors error-detection

tide's People

Contributors

dbolya, hyperparameters


tide's Issues

Even when using the ground truth values as predicted values, there is error.

Hi,
First thanks for developing this tool.

Instead of using the model predictions, I tried feeding the ground truth in as both the ground truth and the predictions. In that case, in theory, every error should be zero, because the predicted and ground-truth values are exactly the same. The tool, however, reports some values under Missed error, which is unexpected. I attempted to modify the code by commenting out missed errors, background errors, and other errors; nonetheless, the tool still indicates that there are some Missed errors.

Experiment 1: predicted values = ground truth
[screenshot of TIDE summary]

Experiment 2: predicted values = model predictions
[screenshot of TIDE summary]
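For reference, a minimal sketch of how this setup can be reproduced (all file names here are hypothetical): convert the ground-truth annotations into a COCO-style results list with score 1.0 and evaluate that against the original ground truth.

    # Hypothetical repro sketch: evaluate the ground truth against itself.
    import json

    from tidecv import TIDE, datasets

    with open('instances_val.json') as f:                 # hypothetical GT file
        gt_json = json.load(f)

    # COCO-style results list built directly from the GT annotations, score 1.0.
    results = [{'image_id': ann['image_id'],
                'category_id': ann['category_id'],
                'bbox': ann['bbox'],
                'score': 1.0} for ann in gt_json['annotations']]

    with open('gt_as_results.json', 'w') as f:
        json.dump(results, f)

    tide = TIDE()
    tide.evaluate(datasets.COCO(path='instances_val.json'),
                  datasets.COCOResult(path='gt_as_results.json'),
                  mode=TIDE.BOX)
    tide.summarize()   # in theory, every dAP entry should be 0 here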

Your explanation is highly appreciated.

Thanks

float division by zero error in td.plot() and td.summarize() when all errors are 0 and all mAPs are 0

in validate(net, val_data, ctx, eval_metric)
37 if mean_ap[-1]>0.001:
38 td.summarize()
---> 39 td.plot()
40 return map_name,mean_ap

~/SageMaker/PICV/Segmentation Job2/tide_metric.py in plot(self)
20 return self.tide.summarize()
21 def plot(self):
---> 22 self.tide.plot()
23 def update(self, pred_bboxes, pred_labels, pred_scores,
24 gt_bboxes, gt_labels):

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/tidecv/quantify.py in plot(self, out_dir)
588 # Do the plotting now
589 for run_name, run in self.runs.items():
--> 590 self.plotter.make_summary_plot(out_dir, errors, run_name, run.mode, hbar_names=True)
591
592

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/tidecv/plotting.py in make_summary_plot(self, out_dir, errors, model_name, rec_type, hbar_names)
118 error_types = list(errors['main'][model_name].keys()) + list(errors['special'][model_name].keys())
119 error_sum = sum([e for e in errors['main'][model_name].values()])
--> 120 error_sizes = [e / error_sum for e in errors['main'][model_name].values()] + [0, 0]
121 fig, ax = plt.subplots(1, 1, figsize=(11, 11), dpi=high_dpi)
122 patches, outer_text, inner_text = ax.pie(error_sizes, colors=self.colors_main.values(), labels=error_types,

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/tidecv/plotting.py in <listcomp>(.0)
118 error_types = list(errors['main'][model_name].keys()) + list(errors['special'][model_name].keys())
119 error_sum = sum([e for e in errors['main'][model_name].values()])
--> 120 error_sizes = [e / error_sum for e in errors['main'][model_name].values()] + [0, 0]
121 fig, ax = plt.subplots(1, 1, figsize=(11, 11), dpi=high_dpi)
122 patches, outer_text, inner_text = ax.pie(error_sizes, colors=self.colors_main.values(), labels=error_types,

ZeroDivisionError: float division by zero
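A workaround sketch at the call site, until the division at the line shown in the traceback is guarded inside plotting.py (file names are hypothetical): keep the summary and simply skip the plot when it would divide by zero.

    # Workaround sketch: skip the plot when every main error is zero.
    from tidecv import TIDE, datasets

    tide = TIDE()
    tide.evaluate(datasets.COCO(path='instances_val.json'),       # hypothetical files
                  datasets.COCOResult(path='detections.json'),
                  mode=TIDE.BOX)
    tide.summarize()
    try:
        tide.plot()
    except ZeroDivisionError:
        print('All main errors are zero; skipping the summary plot.')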

Keypoint Evaluation

Is it also possible to do keypoint evaluation with TIDE? What would I need to change to enable keypoint evaluation?

Per class plot

How can I modify tide.plot() to separate classes?

I only have 3 classes in my dataset, and they differ drastically in difficulty, so I would like to see this per-class breakdown.
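One possible workaround (a per-class option is not built into tide.plot(), as far as I can tell): filter the COCO-format ground truth and results down to a single category and run a separate TIDE evaluation per class. File names below are hypothetical.

    # Hypothetical per-class breakdown by filtering the COCO-format files per category.
    import json

    from tidecv import TIDE, datasets

    with open('instances_val.json') as f:
        gt = json.load(f)
    with open('detections.json') as f:
        dets = json.load(f)

    for cat in gt['categories']:
        cid = cat['id']
        gt_c = dict(gt, categories=[cat],
                    annotations=[a for a in gt['annotations'] if a['category_id'] == cid])
        dets_c = [d for d in dets if d['category_id'] == cid]

        with open('gt_one_class.json', 'w') as f:
            json.dump(gt_c, f)
        with open('dets_one_class.json', 'w') as f:
            json.dump(dets_c, f)

        tide = TIDE()
        tide.evaluate(datasets.COCO(path='gt_one_class.json'),
                      datasets.COCOResult(path='dets_one_class.json'),
                      mode=TIDE.BOX)
        print('---', cat['name'], '---')
        tide.summarize()
        tide.plot()   # one summary plot per class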

Use case where no segmentation field is present in a COCO file and where COCO result file contains all information about the dataset

Currently I work with a detection model. None of the COCO annotation files I need contain a "segmentation" field. Furthermore, at the inference stage I generate a new COCO file with the predictions; this new file needs to carry over all previous information about the dataset (images, categories) for the next stage of the pipeline. However, I'd like to use TIDE to check the quality of my model using evaluate_range with TIDE.BOX mode. To use TIDE under these conditions, some minor modifications of dataset.py are needed. I believe this feature may be interesting for the community and I'd like to share my code.
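As a stopgap that avoids touching dataset.py, one might pre-process the annotation file so every annotation carries an (empty) segmentation field before loading it; whether this is sufficient depends on exactly where the stock loader expects the field, so treat it as a hypothetical sketch.

    # Hypothetical pre-processing sketch: add empty 'segmentation' / 'iscrowd' fields
    # so a box-only COCO annotation file can pass through the stock COCO loader.
    import json

    with open('annotations_box_only.json') as f:       # hypothetical file name
        coco = json.load(f)

    for ann in coco['annotations']:
        ann.setdefault('segmentation', [])
        ann.setdefault('iscrowd', 0)

    with open('annotations_patched.json', 'w') as f:
        json.dump(coco, f)

    # Then load it with datasets.COCO(path='annotations_patched.json') and evaluate in TIDE.BOX mode.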

Where is the dataset argument defined?

In the Error.show() method there is a dataset argument with some functions (such as get_img_with_anns and cat_name). I don't know how to define it. Could you please tell me how to use it?

Getting bbox AP @ 50: 0.00

I have trained a model, and the predictions look correct when plotted on an image. Still, I am getting 0 AP from this tool. Can you explain the root cause of this?

ZeroDivisionError

I am getting ZeroDivisionError when using tide.summarize() for two custom datasets in COCO format.

 File "/home/diego/Projects/tfm/detr/util/plot_utils.py", line 25, in plot_tide
    tide.summarize()
  File "/home/diego/.pyenv/versions/master/lib/python3.6/site-packages/tidecv/quantify.py", line 494, in summarize
    main_errors    = self.get_main_errors()
  File "/home/diego/.pyenv/versions/master/lib/python3.6/site-packages/tidecv/quantify.py", line 603, in get_main_errors
    for error, value in run.fix_main_errors().items()
  File "/home/diego/.pyenv/versions/master/lib/python3.6/site-packages/tidecv/quantify.py", line 349, in fix_main_errors
    new_ap = _ap_data.get_mAP()
  File "/home/diego/.pyenv/versions/master/lib/python3.6/site-packages/tidecv/ap.py", line 150, in get_mAP
    return sum(aps) / len(aps)
ZeroDivisionError: division by zero

Why is this happening exactly?

Code for the Comparison of Scales

Hi,
How can I compare detections across bounding-box areas, similar to Fig. 5 (Comparison of Scales between HTC and TridentNet) in your paper?
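Not a built-in TIDE feature as far as I know, but a rough approximation is to split the ground truth and detections into COCO-style small/medium/large area ranges and run one evaluation per range. A hypothetical sketch (file names assumed, and not necessarily the exact protocol used for Fig. 5):

    # Hypothetical scale comparison: split GT and detections by box area and
    # evaluate each COCO-style area range separately.
    import json

    from tidecv import TIDE, datasets

    RANGES = {'small': (0, 32 ** 2), 'medium': (32 ** 2, 96 ** 2), 'large': (96 ** 2, float('inf'))}

    def box_area(box):
        return box[2] * box[3]          # COCO boxes are [x, y, w, h]

    with open('instances_val.json') as f:
        gt = json.load(f)
    with open('detections.json') as f:
        dets = json.load(f)

    for name, (lo, hi) in RANGES.items():
        gt_r = dict(gt, annotations=[a for a in gt['annotations']
                                     if lo <= a.get('area', box_area(a['bbox'])) < hi])
        dets_r = [d for d in dets if lo <= box_area(d['bbox']) < hi]

        with open(f'gt_{name}.json', 'w') as f:
            json.dump(gt_r, f)
        with open(f'dets_{name}.json', 'w') as f:
            json.dump(dets_r, f)

        tide = TIDE()
        tide.evaluate(datasets.COCO(path=f'gt_{name}.json'),
                      datasets.COCOResult(path=f'dets_{name}.json'),
                      mode=TIDE.BOX)
        print('---', name, '---')
        tide.summarize()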

Thanks!

Dupe detection seems to be working incorrectly

When testing an object detector on my custom dataset, I found that the most prevalent error is duplicate bboxes; it is clearly visible when I visualize the detected bboxes. But TIDE doesn't recognize that and always reports a tiny, almost zero amount of duplicates.

I also noticed that there is no example in your paper or notebooks where the Dupe category is a significant fraction of all errors, which looks dubious. Are you sure there is no bug here? For example, what is ex.gt_used_cls, and is it defined properly?

https://github.com/dbolya/tide/blob/master/tidecv/quantify.py#L251

Use of pos_threshold

Hello, thank you for your work.

I am having a hard time determining the use of pos_threshold.
I saw in the code that use_for_errors is only true when a threshold is equal to pos_threshold.

So the errors are only calculated for the AP corresponding to pos_threshold.
What is the link between those errors and the computation of the mAP? Shouldn't the errors be calculated at each threshold?

Then how should one choose pos_threshold? In my case, due to my application, I usually evaluate my mAP at thresholds between 0 and 0.3. What pos_threshold should I use?

Could you give more insight into this parameter?
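For what it's worth, a sketch of comparing the error breakdown at two working thresholds, assuming pos_threshold can be passed to the TIDE constructor as in quantify.py (file names are hypothetical):

    # Hypothetical comparison of error attribution at two different pos_threshold values.
    from tidecv import TIDE, datasets

    for thresh in (0.5, 0.3):
        tide = TIDE(pos_threshold=thresh)
        tide.evaluate(datasets.COCO(path='instances_val.json'),
                      datasets.COCOResult(path='detections.json'),
                      mode=TIDE.BOX)
        print(f'--- errors attributed at pos_threshold={thresh} ---')
        tide.summarize()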


How to implement TIDE for a custom dataset?

In my dataset, one half is the COCO dataset and the other half is a custom dataset I added. How should I check the performance of the model on it? Can you please explain step by step?
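A minimal sketch of one way to do this (all field values below are assumed): put the custom half into COCO-format ground-truth and results JSONs, merge them with the COCO half if desired, and evaluate with the standard loaders.

    # Hypothetical end-to-end example with a tiny hand-written COCO-format dataset.
    import json

    from tidecv import TIDE, datasets

    gt_coco = {
        'images':      [{'id': 0, 'width': 640, 'height': 480, 'file_name': 'img_0.jpg'}],
        'categories':  [{'id': 1, 'name': 'my_class'}],
        'annotations': [{'id': 0, 'image_id': 0, 'category_id': 1,
                         'bbox': [10, 20, 100, 50], 'area': 5000, 'iscrowd': 0}],
    }
    detections = [{'image_id': 0, 'category_id': 1,
                   'bbox': [12, 18, 98, 52], 'score': 0.9}]

    with open('custom_gt.json', 'w') as f:
        json.dump(gt_coco, f)
    with open('custom_dets.json', 'w') as f:
        json.dump(detections, f)

    tide = TIDE()
    tide.evaluate(datasets.COCO(path='custom_gt.json'),
                  datasets.COCOResult(path='custom_dets.json'),
                  mode=TIDE.BOX)
    tide.summarize()
    tide.plot()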

Can BoxError and ClassError match with used detections?

Hey,

Thanks for this project; it seems to be a really useful tool for providing understandable insight into the performance of a model.

However, when looking at your code, I noticed that you use gt_cls_iou and gt_noncls_iou when matching IoU for BoxErrors and ClassErrors respectively. My understanding is that these IoUs are the base IoUs, without removing the IoU of already-matched annotations, as those would be gt_unused_cls and gt_unused_noncls respectively.
Wouldn't this mean that you potentially assign an FP detection as a BoxError even though the annotation it poorly localises is already matched by another (TP) detection? Shouldn't that detection then become a BackgroundError, since there already is a TP detection for that annotation but this one is not localised well enough to become a DuplicateError?
The same goes for ClassErrors, though there it cannot be a DuplicateError because of the wrong class, and thus it can only be a BackgroundError.

Let me know your thoughts about this.

Crashes with a ZeroDivisionError when there are no detections

Badly trained models might not return any detections at all. TIDE should return meaningful results in this case instead of crashing on the line:

return sum(aps) / len(aps)

With a quick search, I see several other places in the code that perform unchecked divisions. TIDE should check for zero and either return meaningful results or a meaningful error message in all of them.
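A minimal sketch of the kind of guard this suggests, mirroring the quoted line from tidecv/ap.py (a proposal, not the current implementation):

    # Sketch: average the per-class APs only when there is something to average.
    def safe_mAP(aps):
        return sum(aps) / len(aps) if aps else 0.0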

TIDE outputs vs. what pycocotools outputs

Hi, first things first: this lib is amazing and helped a lot in understanding the errors related to the detections.
I was using this project for the initial evaluation, but since there was no support for recall, I decided to use pycocotools for evaluation as well.

Now, during the comparison I got different results for AP[0.50:0.95]:
pycocotools gives 0.460
TIDE gives 41.33

Also, for AP @ 50:
pycocotools gives 0.804
TIDE gives 70.93 (extracted from the summary table)

I was wondering where the difference comes from; for now I am exploring how the TP, FP, and FN are calculated.

TIDE output interpretation

Hi @dbolya,

I was testing out TIDE with two of my models (with slightly different augmentations between them).
The results are:

Model 1

 mask AP @ 50: 50.43

                         Main Errors
=============================================================
  Type      Cls      Loc     Both     Dupe      Bkg     Miss  
-------------------------------------------------------------
   dAP     5.05     5.61     0.21     0.00     3.73    14.52  
=============================================================

        Special Error
=============================
  Type   FalsePos   FalseNeg  
-----------------------------
   dAP       8.64      28.71  
=============================

Model 2

mask AP @ 50: 45.71

                         Main Errors
=============================================================
  Type      Cls      Loc     Both     Dupe      Bkg     Miss  
-------------------------------------------------------------
   dAP     5.09     3.76     0.05     0.00     3.54    14.56  
=============================================================

        Special Error
=============================
  Type   FalsePos   FalseNeg  
-----------------------------
   dAP       8.75      25.02  
=============================

I am a little confused that the dAP values (except Miss) of Model 2 (with 45.71 AP) are significantly lower than those of Model 1 (with 50.43 AP).
Is there a good intuition or interpretation of these results? I would think Model 1 is better (given its mAP), but TIDE seems to suggest otherwise.

list input can be bounding box (Nx4) or RLEs ([RLE])

Hi,
I tried to run evaluate_range with TIDE.MASK but got the error:
--> list input can be bounding box (Nx4) or RLEs ([RLE])

My dataset dict looked like this:

{'_id': 0,
 'bbox': [365.0, 436.0, 657.0, 331.0],
 'class': 59,
 'ignore': False,
 'image': 5,
 'mask': [[536.4705882352941,
           436.1764705882353,
           610.5882352941177,
           439.70588235294116]],
 'score': 1}

Is anything wrong with the structure?
Thank you in advance.
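Going by the error message, the mask field is expected to be either an Nx4 box array or an RLE rather than a raw polygon list. A hedged sketch of converting a polygon to RLE with pycocotools follows; it assumes the polygon is in absolute pixel coordinates and the image size is known. Note also that COCO-style polygons usually contain at least three (x, y) pairs, while the example above has only two.

    # Sketch: polygon list -> single RLE dict, using pycocotools.
    import pycocotools.mask as mask_utils

    def polygons_to_rle(polygons, height, width):
        rles = mask_utils.frPyObjects(polygons, height, width)   # one RLE per polygon
        return mask_utils.merge(rles)                            # merged into one RLE

    # e.g. entry['mask'] = polygons_to_rle(entry['mask'], img_h, img_w)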

Is it right to evaluate performance on PASCAL VOC via the COCO metric?

The title states my concern.
Looking at dataset.py, it seems that TIDE uses the COCO metric to compute mAP on the PASCAL VOC dataset.
However, I've compared the official VOC evaluation code with TIDE (which follows the COCO evaluation code exactly), and the protocols for assigning TP/FP labels to predicted boxes differ. Given the same scores and bboxes, VOC and COCO output different mAPs.
I think that will be a problem. What do you think? @dbolya

Mismatch of AP compared to pycocotools due to a mismatched number of categories

Hi @dbolya,

I modified this awesome library for my own use case and tested it on a new dataset. However, I found that the AP @ IoU 0.5 is different from what I get when using pycocotools. The root cause of this issue is the mismatched number of categories between the ground truth and the predictions. For example, I defined 10 categories in the categories dictionary (in the JSON), but the ground-truth annotations only involve 8 of them. In that situation, both TIDE and pycocotools output the same AP, calculated as sum(APs) / 8, as long as the predictions cover 8 or fewer categories.

However, the AP differs if the predictions involve all 10 (or more than 8) categories. Let's assume that the summed AP @ IoU 0.5 is 100.

What I get from cocoeval:
AP: 100 / 8 = 12.5

What I get from TIDE:
AP: 100 / 10 = 10

The main reason for this result is that pycocotools only considers the number of classes present in the ground truth, which is 8. TIDE, on the other hand, considers all 10 classes, because the 2 corresponding ClassedAPDataObject entries are not empty (len(self.data_points) > 0).

This use case happens when the training set has 10 classes but the validation set only contains 8 of them. I am training my model with 10 classes, and it sometimes outputs all 10 classes while inferring on the validation set.
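To make the two averaging conventions above concrete, a tiny illustration using the assumed numbers from this report (a summed AP of 100 spread over the 8 classes that appear in the ground truth):

    # Illustration of the two averaging conventions (assumed numbers).
    per_class_ap = {c: ap for c, ap in enumerate([12.5] * 8 + [0.0, 0.0])}  # 10 classes
    classes_in_gt = set(range(8))                                           # only 8 appear in the GT

    ap_like_pycocotools = sum(per_class_ap[c] for c in classes_in_gt) / len(classes_in_gt)  # 100 / 8  = 12.5
    ap_like_tide        = sum(per_class_ap.values()) / len(per_class_ap)                    # 100 / 10 = 10.0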

What do you think of this mismatch of results?

Thank you in advance.

Encountered a problem with PASCAL VOC

Hi @dbolya, I'm interested in this work, but I encountered a problem on the PASCAL VOC dataset: a very low mAP, 71.1 in mmdetection vs. 5.7 in TIDE. I have tried to find the reason for several days, but failed. Could you kindly give some suggestions? Thanks a lot!

The code related to TIDE is the following:

gt = datasets.Pascal(path='pascal_test2007.json')
pred = datasets.COCOResult(path='pre.json.bbox.json')
tide = TIDE()
tide.evaluate_range(pred, gt, mode=TIDE.BOX)
tide.summarize()

I convert the detection results to COCO JSON style with the following code:
[screenshot of the conversion code]

The results are as follows:
[screenshot of the TIDE output]

The sum of mAP is not 100

[screenshot of the TIDE summary]
As the picture shows, the sum of all the dAP values is not 100.
If you can give me some insight about this issue, I will appreciate it.

Recall Metrics

Hi

I am trying to get the recall as an output. If I compute
1 - len(obj.false_negatives) / obj.num_gt_positives
where obj is an APDataObject, is that correct?
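For context, a sketch wrapping the formula proposed here, assuming an APDataObject really does expose false_negatives (as a list) and num_gt_positives as described in this issue:

    # Sketch based on the attributes named above (not a confirmed tidecv API).
    def recall(obj):
        if obj.num_gt_positives == 0:
            return 0.0
        return 1 - len(obj.false_negatives) / obj.num_gt_positives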

How to handle images with no ground truth box

Hey @dbolya,
I used my own dataset following #9. However, I got slightly better results compared to pycocotools because GT images with no objects are ignored. How can I include these empty images in the score calculation?
Thank you!

Feature Request: Finish statically typing the API

To start, the type annotations in your code are a huge advantage of this over pycocotools; thanks for adding them!

That said, some arguments to API functions are not fully statically typed. The type instability in the pycocotools API is unfortunate, but in your wrapper you could use Union types to, e.g., represent the switching between compressed and non-compressed masks in the overloaded Data.add* methods; see the sketch below.
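A hypothetical illustration of what that could look like (names and signatures are illustrative, not the current tidecv API):

    # Illustrative only: Union + overload for a Data.add_* style method that accepts
    # either a polygon list or a compressed-RLE dict as the mask.
    from typing import List, Union, overload

    Polygon = List[List[float]]
    RLE = dict                       # {'size': [h, w], 'counts': ...}

    class Data:
        @overload
        def add_ground_truth(self, image_id: int, class_id: int, *, mask: Polygon) -> None: ...
        @overload
        def add_ground_truth(self, image_id: int, class_id: int, *, mask: RLE) -> None: ...

        def add_ground_truth(self, image_id: int, class_id: int, *,
                             mask: Union[Polygon, RLE]) -> None:
            ...  # dispatch on the mask type internally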

Thanks!

What does fix_main_errors do?

Hi,

Great Work!

I am customising this code for my needs, and I would like to know what fix_main_errors actually does.
Before entering this function I can see many errors, i.e. background errors, class errors, and other errors, but after it returns I can see only class errors populated in the summary report.

Dataset which is not COCO but is in COCO format

Is it possible to apply TIDE as-is to a custom dataset which is not COCO but is in the exact COCO format? My model also outputs the same kind of results file. But at the moment I get:

Traceback (most recent call last):
File "..../mAP_evaluation.py", line 91, in evaluate_coco
tide.summarize()
File "..../lib/python3.7/site-packages/tidecv/quantify.py", line 494, in summarize
main_errors = self.get_main_errors()
File "..../lib/python3.7/site-packages/tidecv/quantify.py", line 603, in get_main_errors
for error, value in run.fix_main_errors().items()
File "..../lib/python3.7/site-packages/tidecv/quantify.py", line 349, in fix_main_errors
new_ap = _ap_data.get_mAP()
File "...../lib/python3.7/site-packages/tidecv/ap.py", line 150, in get_mAP
return sum(aps) / len(aps)
ZeroDivisionError: division by zero

Thanks a lot in advance!

ValueError: negative dimensions are not allowed

I am facing this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-4f5ad051ad2b> in <module>()
      9 tide.evaluate(datasets.COCO(gt_path), datasets.COCOResult(det_path), mode=TIDE.BOX) # Use TIDE.MASK for masks
     10 tide.summarize()  # Summarize the results as tables in the console
---> 11 tide.plot()

/opt/conda/lib/python3.6/site-packages/tidecv/quantify.py in plot(self, out_dir)
    566 
    567                 for run_name, run in self.runs.items():
--> 568                         self.plotter.make_summary_plot(out_dir, errors, run_name, run.mode, hbar_names=True)
    569 
    570 

/opt/conda/lib/python3.6/site-packages/tidecv/plotting.py in make_summary_plot(self, out_dir, errors, model_name, rec_type, hbar_names)
    180                 lpad, rpad = int(np.ceil((pie_im.shape[1] - summary_im.shape[1])/2)), \
    181                                         int(np.floor((pie_im.shape[1] - summary_im.shape[1])/2))
--> 182 		summary_im = np.concatenate([np.zeros((summary_im.shape[0], lpad, 3)) + 255,
    183                                                                         summary_im,
    184 									np.zeros((summary_im.shape[0], rpad, 3)) + 255], axis=1)

ValueError: negative dimensions are not allowed

I attached pdb and found that lpad is negative:

import pdb; pdb.pm()

> /opt/conda/lib/python3.6/site-packages/tidecv/plotting.py(182)make_summary_plot()
-> summary_im = np.concatenate([np.zeros((summary_im.shape[0], lpad, 3)) + 255,
(Pdb)  lpad
-30

Solution

  • check the widths of pie_im and summary_im
  • pad the one with the smaller width (see the sketch below)
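A sketch of that fix, based only on the variables visible in the traceback above: compute the padding from the wider of the two images so lpad/rpad can never go negative.

    # Sketch: pad an image with white columns up to a target width (never negative padding).
    import numpy as np

    def hpad_to(im, target_width):
        def white(w):
            return np.zeros((im.shape[0], w, 3)) + 255
        missing = max(target_width - im.shape[1], 0)
        lpad, rpad = int(np.ceil(missing / 2)), int(np.floor(missing / 2))
        return np.concatenate([white(lpad), im, white(rpad)], axis=1)

    # target = max(pie_im.shape[1], summary_im.shape[1])
    # pie_im, summary_im = hpad_to(pie_im, target), hpad_to(summary_im, target)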
