Precision-recall curves (pROC issue, closed)

xrobin commented on June 20, 2024
Precision-recall curves

from proc.

Comments (6)

xrobin commented on June 20, 2024

I believe that for the coords function it would be pretty easy: I just need to add precision and recall to the list of returned coordinates. Confidence intervals would then follow automatically, and plotting would be easy.

I assume that you're more interested in the AUC. At the moment the code is very ROC-centric and requires the curve to increase monotonically. That's not the case for PR, so a new method would have to be written. Once we have that, it should be possible to bootstrap and calculate variance, p-values, etc. I have been working on cleaning up the current mess in my bootstrapping code, but it's still very convoluted (I need to keep the parameters used to build the ROC curve, such as partial AUC, smoothing and direction, and handle both stratified and non-stratified sampling).
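To make the monotonicity point concrete, here is a minimal base R sketch (not pROC code) that builds precision-recall points from raw scores and integrates them with the trapezoidal rule. The helpers `pr_points` and `pr_auc` and the toy data are hypothetical, and linear interpolation between PR points is only one possible choice; it can be optimistic.

```r
# Sketch only: `pr_points` and `pr_auc` are hypothetical helpers, not pROC functions.
pr_points <- function(labels, scores) {
  # labels: 0/1 vector (1 = positive); higher scores mean "more positive".
  thresholds <- sort(unique(scores), decreasing = TRUE)
  n_pos <- sum(labels == 1)
  out <- t(sapply(thresholds, function(th) {
    predicted <- scores >= th
    tp <- sum(predicted & labels == 1)
    c(threshold = th, recall = tp / n_pos, precision = tp / sum(predicted))
  }))
  as.data.frame(out)
}

pr_auc <- function(pr) {
  # Trapezoidal rule over recall; ties in recall are ordered by decreasing
  # threshold and contribute zero width. Precision need not be monotonic.
  pr <- pr[order(pr$recall, -pr$threshold), ]
  sum(diff(pr$recall) * (head(pr$precision, -1) + tail(pr$precision, -1)) / 2)
}

labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
pr <- pr_points(labels, scores)
print(pr)
print(pr_auc(pr))
```

On this toy data precision drops to 2/3 and then rises to 0.75 as recall grows, so the curve is not monotonic. The area only covers the observed recall range, which here starts at 1/3 rather than 0.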

Smoothing is out of the picture; I'd have no idea how to integrate it.

I guess that makes it a pretty big rewrite. Except for §1 it's unlikely I'll have time to do it in the foreseeable future unless someone else steps in.

topepo commented on June 20, 2024

Everything is simple for the person not doing it =]

I guess that makes it a pretty big rewrite. Except for §1 it's unlikely I'll have time to do it in the foreseeable future unless someone else steps in.

Thanks for the assessment.

sebastienwood commented on June 20, 2024

Hi,
sorry for unearthing this thread. I would be very interested if the coords function worked with PR curves as well. The use case would be to simply call
pROC::coords(roc(something), "bestPR", "threshold")
Is there any chance this would be considered for the package? Thanks! :)

xrobin commented on June 20, 2024

Precision and recall were implemented in version 1.10.0. I don't know why it wasn't mentioned here.

co <- coords(rocobj, "all", ret = c("recall", "precision"))
plot(precision ~ recall, t(co), type="l")

Now regarding the "bestPR" bit, I don't know what that would mean, how "best" is defined on a PR curve, or whether it is defined at all. PR curves aren't very intuitive, to say the least, and many things that "work" for ROC just make no sense in PR. Do you have a reliable reference for this? If so I will re-open the issue, or feel free to open a new one.
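For illustration only, one candidate definition of "best" on a PR curve is the threshold that maximises the F1 score (the harmonic mean of precision and recall). The base R sketch below assumes that definition; `best_f1` is a hypothetical helper, "bestPR" is not an existing pROC option, and nothing here settles which definition, if any, is the right one.

```r
# Sketch only: `best_f1` is a hypothetical helper, not part of pROC.
best_f1 <- function(labels, scores) {
  # labels: 0/1 vector (1 = positive); predict positive when score >= threshold.
  thresholds <- sort(unique(scores))
  n_pos <- sum(labels == 1)
  f1 <- sapply(thresholds, function(th) {
    predicted <- scores >= th
    tp <- sum(predicted & labels == 1)
    if (tp == 0) return(0)  # F1 is taken as 0 when there are no true positives
    precision <- tp / sum(predicted)
    recall <- tp / n_pos
    2 * precision * recall / (precision + recall)
  })
  best <- which.max(f1)  # first maximum in case of ties
  c(threshold = thresholds[best], f1 = f1[best])
}

labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
res_f1 <- best_f1(labels, scores)
print(res_f1)
```

On this toy data the maximum is reached at threshold 0.6, where precision is 0.75 and recall is 1.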

sebastienwood commented on June 20, 2024

Thanks for the input :) I would've thought that the approach would be the same as in ROC curve analysis: drawing a line from the "ideal" corner and gradually moving it until it is tangent to the PR curve (EER if I'm not mistaken). This link offers a few options, and they seem legit from what I understand: https://stats.stackexchange.com/questions/7718/how-to-choose-a-good-operation-point-from-precision-recall-curves
They also mention another interesting approach, a user-defined cost function, but I don't know if it would be easy to implement.

xrobin commented on June 20, 2024

There are a lot of legitimate things that can be done. The real question is which ones are worth implementing.

Equal Error Rate (EER) is not something that pROC can do with ROC curves at the moment. It's tricky to calculate, as one needs to interpolate sensitivity and specificity together, and the result will most likely not correspond to an actual threshold. I am not aware that it's ever used in practice. However, if the interest is there, it can be done. For PR curves I don't know if the equation has a single solution, but again this can be worked out.
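As a rough illustration of the interpolation involved on the ROC side, the base R sketch below finds the first cut-off where the false negative rate reaches the false positive rate and interpolates both rates linearly between that point and the previous one. `eer_sketch` is a hypothetical helper, not pROC functionality, and linear interpolation is an assumption.

```r
# Sketch only: `eer_sketch` is a hypothetical helper, not part of pROC.
eer_sketch <- function(labels, scores) {
  # Candidate cut-offs: every observed score, plus Inf (nothing predicted positive).
  thresholds <- c(sort(unique(scores)), Inf)
  fnr <- sapply(thresholds, function(th) mean(scores[labels == 1] < th))
  fpr <- sapply(thresholds, function(th) mean(scores[labels == 0] >= th))
  d <- fnr - fpr         # non-decreasing in the cut-off, from -1 up to +1
  i <- which(d >= 0)[1]  # first cut-off where FNR >= FPR
  if (d[i] == 0) {
    return(list(eer = fnr[i], on_threshold = TRUE))
  }
  # Linear interpolation between points i - 1 and i, applied to both rates;
  # w is the fraction of the way from i - 1 to i where FNR equals FPR.
  w <- -d[i - 1] / (d[i] - d[i - 1])
  list(eer = fnr[i - 1] + w * (fnr[i] - fnr[i - 1]), on_threshold = FALSE)
}

labels <- c(1, 1, 0, 1, 0)
scores <- c(0.9, 0.8, 0.6, 0.4, 0.2)
res_eer <- eer_sketch(labels, scores)
print(res_eer)
```

On this toy data the two error rates cross between the cut-offs 0.6 and 0.8, so the interpolated EER of 1/3 does not sit on an observed threshold.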

You can already specify the prevalence and relative cost for mis-classifications, with a formula given by Perkins and Schisterman.
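As a hedged illustration of that weighting, the base R sketch below maximises sensitivity + r * specificity with r = (1 - prevalence) / (cost * prevalence), where cost is the relative cost of a false negative compared with a false positive. This is my reading of the weighted Youden criterion described in the pROC documentation for `best.weights`; `weighted_youden` and the toy data are hypothetical, and the package documentation remains the reference.

```r
# Sketch only: `weighted_youden` is a hypothetical helper, not part of pROC.
weighted_youden <- function(labels, scores, cost = 1, prevalence = 0.5) {
  # cost: relative cost of a false negative compared with a false positive.
  r <- (1 - prevalence) / (cost * prevalence)
  thresholds <- sort(unique(scores))
  se <- sapply(thresholds, function(th) mean(scores[labels == 1] >= th))
  sp <- sapply(thresholds, function(th) mean(scores[labels == 0] < th))
  best <- which.max(se + r * sp)  # first maximum in case of ties
  c(threshold = thresholds[best], sensitivity = se[best], specificity = sp[best])
}

labels <- c(1, 1, 0, 1, 0, 0)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)
fn_costly <- weighted_youden(labels, scores, cost = 2)    # false negatives cost more
fp_costly <- weighted_youden(labels, scores, cost = 0.5)  # false positives cost more
print(fn_costly)
print(fp_costly)
```

With false negatives weighted more heavily the chosen threshold is 0.6 (full sensitivity); with false positives weighted more heavily it moves up to 0.8 (full specificity). In pROC itself the corresponding call should look like coords(rocobj, "best", best.method = "youden", best.weights = c(cost, prevalence)).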

Please feel free to open a new feature request with the specific feature you'd like to see implemented. It would also help if you can provide some evidence that it's been used in published research, and an algorithm to calculate it.
