xrobin / pROC
Display and analyze ROC curves in R and S+
Home Page: https://cran.r-project.org/web/packages/pROC/
License: GNU General Public License v3.0
The area under the ROC curve can be calculated directly from a vector of predictions and a vector of binary labels using the Mann-Whitney U statistic. Since this algorithm does not require calculating the ROC curve, it can provide a significant performance increase. My benchmarks show that, on 10,000 observations, this algorithm is roughly 1,000 times faster than calculating the AUROC with your package (2,100 milliseconds vs 2.3 milliseconds).
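For reference, here is a minimal sketch of the rank-based computation (my own illustration, not code from pROC):

```r
# AUC via the Mann-Whitney U statistic: no ROC curve is built.
# Mid-ranks from rank() handle ties in the scores correctly.
auc_rank <- function(labels, scores) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_rank(c(0, 0, 1, 1), c(0.1, 0.4, 0.35, 0.8))  # 0.75
```

Sorting dominates the cost, so this runs in O(n log n) regardless of how many distinct thresholds the data would produce.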
Would you be interested in adding a C++ implementation of this algorithm to your package? The speedup that this algorithm provides would be valuable for users who need to evaluate hundreds to thousands of models (e.g. with a grid search over a feature / hyper-parameter space).
If you are interested in this contribution to your package, please let me know.
Any chance that you might add these?
Is there an easy way to have ggroc plot the false positive rate (1 - specificity) on the x-axis?
By default, ggroc plots specificity on a reversed x-axis over [1, 0], instead of the perhaps more familiar 1 - specificity over [0, 1].
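For what it's worth, ggroc() accepts a legacy.axes argument (like plot.roc) in recent pROC versions; a quick sketch using the bundled aSAH data:

```r
# legacy.axes = TRUE plots 1 - specificity (FPR) on a conventional
# [0, 1] x-axis instead of reversed specificity.
library(pROC)
library(ggplot2)
data(aSAH)
ggroc(roc(aSAH$outcome, aSAH$s100b), legacy.axes = TRUE) +
  labs(x = "1 - specificity", y = "sensitivity")
```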
Implement vertical, horizontal and threshold averaging, as described by Fawcett (2006).
Input: a list of ROC curves.
Output: an object that behaves like a ROC curve: it can be plotted and its AUC calculated, but no CI.
See also this Stack Overflow question for possible visualization.
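A sketch of the vertical-averaging case, following Fawcett's description: sample each curve's TPR at a fixed grid of FPR values, then average pointwise (vertical_average is a hypothetical helper, not a pROC function):

```r
# Vertical averaging: sample every curve at the same FPR grid, then
# average the TPRs pointwise across curves.
vertical_average <- function(fpr_list, tpr_list, grid = seq(0, 1, 0.01)) {
  sampled <- mapply(function(fpr, tpr) {
    # Linear interpolation of each curve at the common FPR grid;
    # rule = 2 clamps at the curve's endpoints.
    approx(fpr, tpr, xout = grid, ties = max, rule = 2)$y
  }, fpr_list, tpr_list)
  list(fpr = grid, tpr = rowMeans(sampled))
}

# Two toy curves given as (FPR, TPR) coordinate vectors:
avg <- vertical_average(fpr_list = list(c(0, 0.2, 1), c(0, 0.5, 1)),
                        tpr_list = list(c(0, 0.8, 1), c(0, 0.6, 1)))
```

Horizontal averaging is the transpose (average FPR at fixed TPR levels), and threshold averaging pairs points by threshold instead of by coordinate.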
Hi!
I am computing ROC curves to compare a new score to previously developed scores. Visually, the new score seems to outperform all the old scores. One of the old scores performs quite badly and resembles the letter S (AUC = 0.52, with half of the curve under the line of identity and the other half on top of it). I receive an error message when trying to analyze it with DeLong's test:
roc.test(score_new, score_old_3, method = "delong")
"Warning message:
In roc.test.roc(score_new, score_old_3, method = "delong") :
DeLong's test should not be applied to ROC curves with a different direction."
According to DeLong's test the new score is better than the other scores (p < 0.05), but against this badly performing score_old_3, the p-value is 0.08. The problem remains with the bootstrap and venkatraman methods. I do not trust the results. How would you recommend I analyze this?
Thanks for the help,
Oscar
Coords returns a matrix with thresholds in columns and the measurements in rows.
This has always been a bit weird, but it is becoming problematic with pipelines, where a data.frame in transposed form would be better suited.
Expected behavior
> library(dplyr)
> roc(aSAH, outcome, wfns) %>% coords()
Setting levels: control = Good, case = Poor
Setting direction: controls < cases
threshold specificity sensitivity
-Inf 0.0000000 1.0000000
1.5 0.5138889 0.9512195
2.5 0.7916667 0.6585366
3.5 0.8333333 0.6341463
4.5 0.9444444 0.4390244
Inf 1.0000000 0.0000000
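Until coords() itself changes, one possible workaround (a sketch against the matrix output current at the time of this issue) is to transpose and convert that matrix:

```r
library(pROC)
data(aSAH)
r <- roc(aSAH$outcome, aSAH$wfns)
# Transpose the thresholds-in-columns matrix and convert it, giving one
# row per threshold, which fits dplyr-style pipelines:
df <- as.data.frame(t(coords(r, x = r$thresholds, input = "threshold")))
head(df)
```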
Hi,
we use multiclass.roc in mlr here:
https://github.com/berndbischl/mlr
It seems that auc.roc etc. are S3 methods in your package, but you do not mark them as such in your NAMESPACE, which is probably incorrect.
In mlr this now triggers a bug: we requireNamespace("pROC"), then call multiclass.roc, which then does not find auc.roc, although that function lives in the same package.
Could this please be fixed?
Surfaced with issue #25.
r <- roc(c("A", "B"), c(0, 1))
ci(r)
r <- roc(c("A", "B", "A", "A"), c(0, 1, 0.5, 1.1))
var(r)
The problem happens when one group has only one observation.
I'm trying to calculate the AUC with the pROC package. I call:
auc(set_temp$def_woe,set_temp$total_pymnt_woe)
Unfortunately, for some variables, I am getting an error:
Error in if (thresholds[tie.idx] == unique.candidates[tie.idx -1]) { : argument is of length zero
Hi,
I am trying to plot two ROC curves in the same figure. I would like each to have a different colour as well as a different linetype. However, I can only set one at a time, not both simultaneously. That is, I can have different colours but the same linetype:
`ggroc(list(myrocglm, myrocrf), legacy.axes = T) + geom_abline(intercept = 0,slope = 1)`
or different linetypes but the same colour:
`ggroc(list(myrocglm, myrocrf), aes= "linetype", legacy.axes = T) + geom_abline(intercept = 0,slope = 1)`
And if I try to add the color parameter to the function above, it only works with a single value, i.e. color="red". With more values I get the following error:
"Error: Aesthetics must be either length 1 or the same as the data (353): colour"
Thanks,
John
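If the installed ggroc() accepts a vector for its aes argument (this is the case in recent pROC versions; treat it as an assumption otherwise), both aesthetics can be mapped to the curves at once, and the colours then set with a manual scale:

```r
library(pROC)
library(ggplot2)
data(aSAH)
# Stand-ins for the glm/rf curves from the question:
rocs <- list(glm = roc(aSAH$outcome, aSAH$s100b),
             rf  = roc(aSAH$outcome, aSAH$ndka))
# Map both colour and linetype to the curve names, then fix the colours:
ggroc(rocs, aes = c("colour", "linetype"), legacy.axes = TRUE) +
  geom_abline(intercept = 0, slope = 1) +
  scale_colour_manual(values = c("red", "blue"))
```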
They are set with par() only on new plots. Not sure if this is a bug, a feature, or has no effect.
It may or may not cause or be related to issue #9.
Something goes wrong when setting par(mar=...), then calling plot.roc, axis, and plot.roc again with add=TRUE. It is visible only when xlim/ylim are set (or maybe also with massive margins?).
Compare:
roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c( 4, 4.5, 1, 1 ))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56,0.86), xaxt="n")
axis(side=1)
plot(roc2, add=T)
With:
roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c( 4, 4.5, 1, 1 ))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56,0.86), xaxt="n")
plot(roc2, add=TRUE)
or:
roc1 <- roc(aSAH$outcome, aSAH$wfns)
roc2 <- roc(aSAH$outcome, aSAH$ndka)
par(mar=c( 4, 4.5, 1, 1 ))
plot(roc1, xlim=c(0.96, 0.66), ylim=c(0.56,0.86), xaxt="n")
plot(roc2, add=TRUE)
axis(side=1)
Describe the bug
While re-running ci.auc() (repeating a colleague's analysis), I received the following error message:
"pROC: error in calculating DeLong's theta: got 0.65441176470588235947 instead of 0.63622994652406417160", and was asked to report the bug. (Sorry if the report is not perfect: it is my first bug report, written under time pressure.)
To Reproduce
EDIT: posted the wrong list originally.
R version 3.5.0 (2018-04-23)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE Leap 42.3
Matrix products: default
BLAS: /usr/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/R/lib/libRlapack.so
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] pROC_1.12.1 epiR_0.9-96 survival_2.41-3 tableone_0.9.3
[5] xtable_1.8-2 doBy_4.6-1 ggplot2_2.2.1 someR_1.5.1
What command did you run?
Command: ci.auc()
What data did you use? Use save(myData, file="data.RData")
or save.image("data.RData")
Error in delongPlacements(roc) :
pROC: error in calculating DeLong's theta: got 0.65441176470588235947 instead of 0.63622994652406417160. Diagnostic data saved in pROC_bug.RData. Please report this bug to https://github.com/xrobin/pROC/issues.
CI is broken for multiclass.roc:
data(aSAH)
multiclass.roc(aSAH$gos6, aSAH$s100b, ci=TRUE)
Error in roc.default(response, predictor, levels = X, percent = percent, :
formal argument "ci" matched by multiple actual arguments
It is also not possible to calculate a CI on an existing object:
ci(multiclass.roc(aSAH$gos6, aSAH$s100b))
Error in roc.default(response, predictor, ...) : No valid data provided.
This should work easily for the univariate multiclass.roc. The new mv.multiclass.roc might need a bit more work.
We are contacting you because you are the maintainer of pROC, which imports ggplot2 and uses vdiffr to manage visual test cases. The upcoming release of ggplot2 includes several improvements to plot rendering, including the ability to specify lineend and linejoin in geom_rect() and geom_tile(), and improved rendering of text. These improvements will result in subtle changes to your vdiffr doppelgangers when the new version is released.
Because vdiffr test cases do not run on CRAN by default, your CRAN checks will still pass. However, we suggest updating your visual test cases with the new version of ggplot2 as soon as possible to avoid confusion. You can install the development version of ggplot2 using remotes::install_github("tidyverse/ggplot2").
If you have any questions, let me know!
Thank you very much for your very useful pROC package.
I've noticed a curious result to which I'd like to draw your attention: when simulating samples with the same classifier score distribution for cases and controls, an AUC of 0.5 is expected on average. However, the auc function of pROC yields a slightly biased mean > 0.5, whereas the ROC and fbroc packages both yield an identical mean value closer to 0.5 than pROC's.
When comparing the individual AUCs estimated by the 3 packages, pROC yields in some cases the same AUCs as the 2 other packages, but in other cases the result is different. The 2 other packages always give the same AUC estimate.
The following code illustrates this:
##############################################
rm(list=ls())
library(pROC)
library(ROC)
library(fbroc)
nsim <- 1000
result <- matrix(ncol=3,nrow=nsim)
n.cases <- n.controls <- 150
for (i in 1:nsim){
response.cases <- rnorm(n.cases, 6,50)
response.controls <- rnorm(n.controls, 6,50)
#############################################
pROC <- roc(controls=response.controls, cases=response.cases)
#############################################
ROC <- rocdemo.sca(truth=c(rep(1, n.cases), rep(0, n.controls)), data=c(response.cases, response.controls))
#############################################
fbroc <- boot.roc(pred=c(response.cases, response.controls), true.class=c(rep(TRUE, n.cases), rep(FALSE, n.controls)))
#############################################
result[i,] <- c(auc(pROC), AUC(ROC), fbroc$auc)
}
apply(result, 2, mean)
##############################################
Many thanks if you can look into this issue.
Best regards,
Jacques
I believe the only supported use case of bootstrapping thresholds is with x = "best". For all other cases, pROC should produce a useful error message, not garbage like:
> ci.coords(roc1, x=0.8, input = "sensitivity", ret=c("specificity", "ppv", "tp", "thr"))
Error in apply(sapply(perfs, c), 1, quantile, probs = c(0 + (1 - conf.level)/2, :
dim(X) must have a positive length
In addition: Warning message:
In ci.coords.roc(roc1, x = 0.8, input = "sensitivity", ret = c("specificity", :
NA value(s) produced during bootstrap were ignored.
or
> ci.coords(roc1, x=0.9, input = "sensitivity", ret="t")
95% CI (2000 stratified bootstrap replicates):
2.5% 50% 97.5%
sensitivity 0.9: threshold NA NA NA
Warning message:
In ci.coords.roc(roc1, x = 0.9, input = "sensitivity", ret = "t") :
NA value(s) produced during bootstrap were ignored.
In the longer term, work should continue on the interpolate branch, which will ultimately support this feature by interpolating thresholds.
I'd like to compute the Generalized AUC for comparing methods that should separate 3 different categories (e.g. no, mild, severe disease).
https://stats.stackexchange.com/questions/112383/roc-for-more-than-2-outcome-categories
Does your package provide that kind of statistic?
Thanks
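pROC does provide this: multiclass.roc() computes the generalized AUC of Hand & Till (2001) for a single predictor and a response with more than two levels, for example on the bundled aSAH data:

```r
library(pROC)
data(aSAH)
# gos6 has several outcome levels; s100b is a single numeric predictor.
mroc <- multiclass.roc(aSAH$gos6, aSAH$s100b)
auc(mroc)  # multi-class AUC as defined by Hand and Till (2001)
```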
pROC generates an error whenever the list of case values contains Inf. I suspect this is related to issue #25 . I am using the most recent GitHub version of pROC (as of May 11).
To Reproduce
Steps to reproduce the behavior:
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.10
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4 yaml_2.1.18
Error in delongPlacements(roc) :
pROC: error in calculating DeLong's theta: got 0.73333333333333328152 instead of 0.65000000000000002220. Diagnostic data saved in pROC_bug.RData. Please report this bug to https://github.com/xrobin/pROC/issues.
Hi,
First, thank you for this wonderful package.
I am trying to use the function 'power.roc.test' from the development version of the package. I would like to compute the sample size needed to compare a single AUC to a theoretical value.
Is it possible to define the theoretical value of the AUC (for example, if the expected AUC is 0.9 and its theoretical value is 0.8)?
Best,
David
Hello,
I have a problem installing the pROC package on my Debian testing system:
install.packages("pROC")
Installing package into
‘/home/l/R/x86_64-pc-linux-gnu-library/3.0’
(as ‘lib’ is unspecified)
trying URL 'http://cran.mirror.garr.it/mirrors/CRAN/src/contrib/pROC_1.7.1.tar.gz'
Warning in download.file(url, destfile, method, mode = "wb", ...) :
connected to 'cran.mirror.garr.it' on port 80.
Warning in download.file(url, destfile, method, mode = "wb", ...) :
-> GET /mirrors/CRAN/src/contrib/pROC_1.7.1.tar.gz HTTP/1.0
Host: cran.mirror.garr.it
User-Agent: R (3.0.3 x86_64-pc-linux-gnu x86_64 linux-gnu)
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- HTTP/1.1 200 OK
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Server: nginx/1.4.7
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Date: Fri, 28 Mar 2014 16:54:04 GMT
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Content-Type: text/plain
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Content-Length: 91857
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Last-Modified: Fri, 21 Feb 2014 04:39:58 GMT
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Connection: close
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- ETag: "5306d89e-166d1"
Warning in download.file(url, destfile, method, mode = "wb", ...) :
<- Accept-Ranges: bytes
Warning in download.file(url, destfile, method, mode = "wb", ...) :
Code 200, content-type 'text/plain'
Content type 'text/plain' length 91857 bytes (89 Kb)
opened URL
downloaded 89 Kb
Loading required package: splines
Warning in library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
there is no package called ‘pROC’
Warning: package ‘yapomif’ in options("defaultPackages") was not found
tools:::.install_packages()
- installing source package ‘pROC’ ...
** package ‘pROC’ successfully unpacked and MD5 sums checked
Warning in writeLines(paste0(c(out[is_not_empty]), eor), file) :
invalid character string in output conversion
** libs
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c RcppExports.cpp -o RcppExports.o
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c delong.cpp -o delong.o
g++ -I/usr/share/R/include -DNDEBUG -I"/usr/lib/R/site-library/Rcpp/include" -fpic -O3 -pipe -g -c perfsAll.cpp -o perfsAll.o
Loading required package: splines
Error in library.dynam(lib, package, package.lib) :
shared object ‘pROC.so’ not found
Warning: package ‘yapomif’ in options("defaultPackages") was not found
g++ -shared -o pROC.so RcppExports.o delong.o perfsAll.o > Rcpp:::LdFlags() > > -L/usr/lib/R/lib -lR
Loading required package: splines
Error in library.dynam(lib, package, package.lib) :
shared object ‘pROC.so’ not found
Warning: package ‘yapomif’ in options("defaultPackages") was not found
g++: error: >: No such file or directory
g++: error: Rcpp:::LdFlags(): No such file or directory
g++: error: >: No such file or directory
g++: error: >: No such file or directory
make: *** [pROC.so] Error 1
ERROR: compilation failed for package ‘pROC’
- removing ‘/home/l/R/x86_64-pc-linux-gnu-library/3.0/pROC’
Warning in install.packages("pROC") :
installation of package ‘pROC’ had non-zero exit status
It seems to be a compilation problem, but the Rcpp version is greater than the one required (0.10.5):
packageVersion("Rcpp")
[1] ‘0.11.0’
A few details...
sysname
"Linux"
release
"3.10-2-amd64"
version
"#1 SMP Debian 3.10.7-1 (2013-08-17)"
nodename
"np350v5c"
machine
"x86_64"
login
"l"
user
"l"
effective_user
"l"
R.version
_
platform x86_64-pc-linux-gnu
arch x86_64
os linux-gnu
system x86_64, linux-gnu
status
major 3
minor 0.3
year 2014
month 03
day 06
svn rev 65126
language R
version.string R version 3.0.3 (2014-03-06)
nickname Warm Puppy
Any hint to solve the problem?
thank you,
Luca
I'm computing multiple ROC-curves. Two out of 12 computations give me a warning message and won't compute the desired variables without throwing an error.
Output of the "roc" function with rm.remove=TRUE for the two problematic computations:
"True Positive Rate: from Inf to -Inf
False Positive Rate: from Inf to -Inf
Area under Curve:
Maximum F1 Score: -Inf
Warning messages:
1: In min(x$tp) : no non-missing arguments to min; returning Inf
2: In max(x$tp) : no non-missing arguments to max; returning -Inf
3: In min(x$fp) : no non-missing arguments to min; returning Inf
4: In max(x$fp) : no non-missing arguments to max; returning -Inf
5: In max(x$F1) : no non-missing arguments to max; returning -Inf"
problematicRoc$auc gives "NULL".
Everything works as expected with the two computations when I explicitly state that I want auc and ci computed, as in roc(....., auc=TRUE, ci=TRUE).
I have no conflicting packages installed, and to my knowledge the data I'm running roc on in the two problematic instances is not that different from the other ten instances where it works as expected.
I'm not sure how to reproduce the error but I'm glad to provide more detail.
(Thanks for this great package by the way!)
I think there is an issue when using coords with a smoothed curve. The format of the results differs between smoothed and unsmoothed curves, and I suspect that the threshold is being returned in place of the specificity when smoothing is used.
For example:
library(pROC)
data(aSAH)
roc_orig <- roc(aSAH$outcome, aSAH$s100b)
roc_smooth <- roc(aSAH$outcome, aSAH$s100b, smooth = TRUE)
## plots are not extremely different
plot(roc(aSAH$outcome, aSAH$s100b, smooth = TRUE))
plot(roc(aSAH$outcome, aSAH$s100b), add = TRUE, col = "red")
coord_orig <- t(coords(roc_orig, seq(0, 1, 0.01)))
coord_smooth <- t(coords(roc_smooth, seq(0, 1, 0.01)))
coord_smooth2 <- t(coords(smooth(roc_orig), seq(0, 1, 0.01)))
The results are very different:
> head(coord_orig)
threshold specificity sensitivity
0 0.00 0.00000000 1.0000000
0.01 0.01 0.00000000 1.0000000
0.02 0.02 0.00000000 1.0000000
0.03 0.03 0.00000000 1.0000000
0.04 0.04 0.00000000 0.9756098
0.05 0.05 0.06944444 0.9756098
> head(coord_smooth)
specificity sensitivity
0 0.00 1.0000000
0.01 0.01 0.9970265
0.02 0.02 0.9942254
0.03 0.03 0.9914151
0.04 0.04 0.9885741
0.05 0.05 0.9856905
Thanks,
Max
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] pROC_1.8
loaded via a namespace (and not attached):
[1] plyr_1.8.4 tools_3.3.1 Rcpp_0.12.5
I tried to reproduce the sample size calculations in Table 4 of the Obuchowski paper (2004) for a single ROC curve. For a significance level of 0.05, an expected AUC of 0.7, a desired power of 0.9 and kappa = 1, the sample size calculation should result in 33 patients for each of the two groups.
However,
power.roc.test(auc=0.7, sig.level=0.05, power=0.9, kappa=1.0)
gives ncases = ncontrols = 40.21369 as a result.
Maybe the problem is that, inside the function, the z-value for the significance level is calculated by
zalpha <- qnorm(sig.level)
which gives the lower alpha percentile (-1.64 instead of 1.64), not the upper one. I think it should be:
zalpha <- qnorm(sig.level, lower.tail = F)
or, of course
zalpha <- qnorm(1 - sig.level)
Thank you very much for your work and for maintaining this great package!
Describe the bug
This is a regression due to the fix in #25
To Reproduce
> response <- rbinom(1E5, 1, .5)
> predictor <- rnorm(1E5)
> rocobj <- roc(response, predictor)
Error: cannot allocate vector of size 74.5 Gb
4: outer(thresholds, predictor, `==`) at roc.utils.R#119
3: roc.utils.thresholds(c(controls, cases), direction) at roc.R#316
2: roc.default(response, predictor) at roc.R#21
1: roc(response, predictor)
This is caused by the check for identical values:
if (any(o <- outer(thresholds, predictor, `==`))) {
There must be another way to safely test for exact equality between two vectors.
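One memory-frugal possibility (a sketch, not necessarily the fix that was adopted) is a match()-based membership test, which needs O(n + m) memory instead of an n-by-m logical matrix:

```r
# %in% (i.e. match()) checks exact equality without materialising
# the full outer() comparison matrix.
thresholds <- c(-Inf, 0.5, 1.5, Inf)
predictor  <- c(0.2, 0.5, 0.9, 1.7)
any(thresholds %in% predictor)    # TRUE: 0.5 occurs in both vectors
which(thresholds %in% predictor)  # 2, the index of the offending threshold
```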
Calculate FPR and TPR of ROC curve in coords.
Example usage: https://stackoverflow.com/questions/16643917/
This is trivially calculated as:
Returning with the "TPR" and "FPR" labels might require a bit more work.
Ways to deal with this include:
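For completeness, both values are simple transforms of what the roc object already stores (divide by 100 first when percent = TRUE):

```r
library(pROC)
data(aSAH)
r <- roc(aSAH$outcome, aSAH$s100b)
tpr <- r$sensitivities        # TPR = sensitivity
fpr <- 1 - r$specificities    # FPR = 1 - specificity
head(cbind(fpr, tpr))
```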
Hi,
I am trying to use pROC in RStudio Cloud. The data I'm dealing with can only be accessed in a secure "datalab" environment designed by Statistics New Zealand, so using R on my personal computer is not possible and I doubt that Stats NZ will be able to support a different implementation just for me.
Using the DeLong method to get a confidence interval works no problem, but when I try something like:
ci.auc(roc_object, method = "bootstrap")
I get the error:
Error in structure(.External(.C_dotTclObjv, objv), class = "tckObj") : [tcl] invalid command name "toplevel".
This appears to me to be similar to this issue in the old RStudio community. It appears that the tcltk package is the reason it doesn't work? Is that the case for the pROC package too?
Thanks for the great package, would love any feedback on whether I'm mistaken or whether a work around is possible!
The following piece of code in plot.roc, which handles legacy.axes, fails to rescale with xlim:
lab.at <- seq(1, 0, -0.2)
if (x$percent)
lab.at <- lab.at * 100
lab.labels <- lab.at
if (legacy.axes)
lab.labels <- rev(lab.labels)
Too much time is spent in roc.utils.perfs.all.fast (roc.utils.R:60):
dups.sesp <- duplicated(matrix(c(se, sp), ncol=2), MARGIN=1)
There must be a better way to do it. Here is some benchmarking code:
n <- 1e6
dat <- data.frame(x = rnorm(n), y = sample(c(0:1), size = n, replace = TRUE))
library(profvis)
profvis({
for (i in 1:10) {
pROC::roc(dat$y, dat$x, algorithm = 2)
}
})
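One possible speedup, assuming the se/sp pairs are generated in sorted threshold order so that duplicates are always adjacent (a sketch, not a tested replacement):

```r
# diff() on each coordinate flags consecutive duplicate pairs in O(n),
# avoiding duplicated() on a 2-column matrix.
se <- c(1, 1, 0.8, 0.8, 0.5)
sp <- c(0, 0, 0.3, 0.3, 0.7)
dups.sesp <- c(FALSE, diff(se) == 0 & diff(sp) == 0)
dups.sesp  # FALSE TRUE FALSE TRUE FALSE
```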
Followup of issue #40.
It might be useful to return every possible coordinate in coords. This could be done by adding a special ret value of "all" (verbatim).
Warning: this value cannot be abbreviated, as that would change the current behavior of ret="a", which is to return the accuracy. It cannot be mixed with any other value, so only an exact match with a vector of length 1 should be allowed.
I think there may be an issue with the DeLong confidence interval for AUC. When the sample size gets large, the CI goes to either 0-0 or 1-1. Here is an example:
predictor1 <- c(runif(12000,0,0.5), runif(14472-12000, 0.5,0.75))
response1 <- rbinom(14472, size=1, p=predictor1)
roc1 <- roc(response1, predictor1)
predictor2 <- c(runif(3 * 12000,0,0.5), runif(3 * (14472-12000), 0.5,0.75))
response2 <- rbinom(3 * 14472, size=1, p=predictor2)
roc2 <- roc(response2, predictor2)
predictor3 <- c(runif(10 * 12000,0,0.5), runif(10 * (14472-12000), 0.5,0.75))
response3 <- rbinom(10 * 14472, size=1, p=predictor3)
roc3 <- roc(response3, predictor3)
auc(roc1)
Area under the curve: 0.7586
ci.auc(roc1)
95% CI: 0.7506-0.7667 (DeLong)
auc(roc2)
Area under the curve: 0.7584
ci.auc(roc2)
95% CI: 0.7537-0.7631 (DeLong)
auc(roc3)
Area under the curve: 0.7561
ci.auc(roc3)
95% CI: 1-1 (DeLong)
See the confusion on SO. A similar behavior is featured in the new cutpointr package. It is time to be more explicit about automatic choices.
Todo:
This should start with simple operations like var and later ci.coords. The following sub-steps will have to be taken:
Additional considerations:
ggroc() does not show subtitle or caption label
To Reproduce
a <- 1:10
b <- rep(c(TRUE, FALSE), 5)
ggroc(roc(b ~ a)) + labs(title = "stairs", subtitle = "leading upstairs", caption = "from right to left leading downstairs")
Expected behavior
A graph of a step function displaying the contents of the subtitle and caption arguments somewhere.
For a future version of pROC:
Currently I am unable to use pROC::ci in my package without requiring the whole package.
# somewhere in function definition
#' @importFrom pROC ci ci.auc ci.roc roc
pROC::ci(factor(c(0, 1, 0, 1)), c(0.1, 0.2, 0.3, 0.4), of = 'auc')
# WARNING: Error in UseMethod("ci") :
# no applicable method for 'ci' applied to an object of class "factor"
library(pROC)
pROC::ci(factor(c(0, 1, 0, 1)), c(0.1, 0.2, 0.3, 0.4), of = 'auc')
#95% CI: 0.05705-1 (DeLong)
Probably an issue with namespacing within method dispatch.
Hello, I've been using pROC for the last few days. It is very nice and works well for getting the AUC out. However, I can't seem to find a way to extract the AUC values into txt or csv files. I was hoping to loop through several columns of an input data file in order to calculate the AUC for each variable.
Many thanks for your help
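A possible loop, illustrated with the bundled aSAH data (the column names stand in for your own variables): as.numeric() strips the auc class so plain numbers get written out.

```r
library(pROC)
data(aSAH)
vars <- c("s100b", "ndka", "wfns")  # predictor columns to evaluate
aucs <- sapply(vars, function(v) as.numeric(auc(aSAH$outcome, aSAH[[v]])))
# One row per variable, written to a csv file:
write.csv(data.frame(variable = vars, auc = aucs),
          "aucs.csv", row.names = FALSE)
```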
roc(aSAH$outcome, aSAH$ndka, smooth=TRUE, smooth.method="density")
Error in match.fun(paste("bw", bw, sep = "."))(roc$predictor) :
need at least 2 data points
This is because roc$predictor is not set at the time smooth.roc is called.
Make sure to re-enable tests by removing the skip_if call in test-roc.R once fixed.
Users seem confused by the auto-detection of the direction of a ROC curve. See this discussion and others. This issue discusses whether to change the default to '<'.
Pros:
Cons:
The plyr package is old, and newer, better options exist for parallel execution. The foreach package seems to be the way to go, with different backends available, and the doRNG package for reproducible parallel calculations.
The interface from the user perspective would look like:
cl <- makeCluster(2) # 2 cores
registerDoParallel(cl)
registerDoRNG(1234)
ci(...)
stopCluster(cl)
Internally we would simply have:
resampled.values <- foreach(i=1:boot.n) %dopar% { stratified.bootstrap.test(...) }
instead of
resampled.values <- laply(1:boot.n, stratified.bootstrap.test, ...)
Things to consider:
coords(r.s100b, c(0.51, 0.2), input = "threshold", ret = "specificity", drop = TRUE)
coords(r.s100b, "local maximas", input = "threshold", ret = "specificity", drop = TRUE)
Both return a matrix with 1 row. Note: this is tested in test-coords.R, but the test is skipped as it fails.
The documentation only mentions dropping over length(x), and it doesn't state that coords won't drop if length(ret) == 1. The doc should either be updated to mention that there is no dropping over ret, or updated to state that dropping happens over ret too, with the code changed accordingly.
This, however, is an API change and too close to the 1.14 release.
I've probably missed this, but is there an option in pROC for comparing more than two ROC curves at the same time? I see that DeLong et al. 1988 is a reference, but it seems pROC is missing this ability. If so, could this be considered a feature request? It would also be amazing to be able to test multiple (>2) pAUCs at the same time. Thanks for the great addition to R!
I want to calculate the AUC for many subgroups, one at a time (in a foreach loop).
From my understanding, the direction can change every time depending on the relation of 0 vs. 1 in the outcome, so I would not see whether the AUC falls below 0.5.
Is it possible to specify the "negative outcome" or the "positive outcome" in advance?
I am currently using a workaround like
direction_i <- if(mean(df_i[["outcome"]]) < 0.5) {">"} else {"<"}
in every step, which is ok for my use case.
However, in general I would find your package much more appealing, if this could be specified directly.
Am I missing something obvious?
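Yes: roc() takes levels (which response value is the control and which the case) and direction arguments, so both can be fixed in advance. A sketch on simulated data:

```r
library(pROC)
set.seed(1)
outcome   <- rbinom(200, 1, 0.5)
predictor <- rnorm(200) + outcome   # simulated scores
r <- roc(outcome, predictor,
         levels = c(0, 1),  # first level = controls, second = cases
         direction = "<")   # controls expected to have lower values
auc(r)  # with a fixed direction, the AUC can now fall below 0.5
```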
There seems to be a lot of confusion around this function, what it does and how to use it.
In particular, it seems people would like to pass a "multiclass predictor", a matrix containing probabilities of each datapoint belonging to a class. See for instance this question on StackOverflow
Not sure anything can be saved here.
I was running a piece of code and my code threw this error message:
Error in delongPlacements(roc) :
A problem occured while calculating DeLong's theta: got 0.50057161522129678399 instead of 0.50032663726931247972. This is a bug in pROC, please report it to the maintainer.
Does anyone know what I should do?
Thanks
Followup of issue #40.
These values are calculated but never returned to the user.
They would be "special" as they would be weighted, unlike all other returned coordinate values.
response <- rbinom(1E5, 1, .5)
predictor <- rnorm(1E5)
r <- roc(response, predictor)
system.time(coords(r, "a"))
user system elapsed
47.791 0.088 47.867
I would expect it to complete more or less instantly.
It should be possible to calculate the significance of a single ROC curve.
This would test H_0: AUC = 0.5.
For a full AUC this should correspond to the Wilcoxon Test. For partial AUC we need to use bootstrapping. Something like this:
roc.test(roc(aSAH$outcome, aSAH$ndka))
roc.test(roc(aSAH$outcome, aSAH$ndka, partial.auc = c(1, 0.9)))
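Until that lands, the full-AUC case is equivalent to a Wilcoxon rank-sum test on the predictor grouped by the response, which base R's stats package already provides:

```r
library(pROC)
data(aSAH)
# H0: AUC = 0.5, i.e. ndka does not discriminate between outcomes:
wilcox.test(ndka ~ outcome, data = aSAH)
```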