juliatrustworthyai / ConformalPrediction.jl

Predictive Uncertainty Quantification through Conformal Prediction for Machine Learning models trained in MLJ.

Home Page: https://juliatrustworthyai.github.io/ConformalPrediction.jl/

License: MIT License

Language: Julia (100%)
Topics: conformal-prediction, machine-learning, julia, predictive-uncertainty

conformalprediction.jl's Introduction


ConformalPrediction

ConformalPrediction.jl is a package for Predictive Uncertainty Quantification (UQ) through Conformal Prediction (CP) in Julia. It is designed to work with supervised models trained in MLJ (Blaom et al. 2020). Conformal Prediction is easy to understand, easy to use, and model-agnostic, and it works under minimal distributional assumptions.

🏃 Quick Tour

First time here? Take a quick interactive tour to see what this package can do right on JuliaHub (To run the notebook, hit login and then edit).

This Pluto.jl 🎈 notebook won 2nd Prize in the JuliaCon 2023 Notebook Competition.

Local Tour

To run the tour locally, just clone this repo and start Pluto.jl as follows:

] add Pluto    # type ] in the Julia REPL to enter Pkg mode
using Pluto
Pluto.run()

All notebooks are contained in docs/pluto.

📖 Background

Don’t worry, we’re not about to deep-dive into methodology. But just to give you a high-level description of Conformal Prediction (CP) upfront:

Conformal prediction (a.k.a. conformal inference) is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models. Critically, the sets are valid in a distribution-free sense: they possess explicit, non-asymptotic guarantees even without distributional assumptions or model assumptions.

— Angelopoulos and Bates (2022)

Intuitively, CP works under the premise of turning heuristic notions of uncertainty into rigorous uncertainty estimates through repeated sampling or the use of dedicated calibration data.
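To make this intuition concrete, the snippet below sketches the split (inductive) conformal recipe for regression. It is a minimal sketch only: predict_point, Xcal and ycal are illustrative placeholders (a generic point predictor and a held-out calibration set), not part of the package API, and the finite-sample correction is omitted.

using Statistics

# Split conformal regression (sketch): calibrate on held-out data, then pad the
# point prediction by the calibration quantile of the nonconformity scores.
function split_conformal_interval(predict_point, Xcal, ycal, xnew; coverage=0.95)
    scores = abs.(ycal .- predict_point(Xcal))  # nonconformity scores: absolute residuals
    qhat = quantile(scores, coverage)           # empirical quantile of the scores
    ŷ = predict_point(xnew)
    return (ŷ - qhat, ŷ + qhat)
end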

Conformal Prediction in action: prediction intervals at varying coverage rates. As coverage grows, so does the width of the prediction interval.

The animation above is lifted from a small blog post that introduces Conformal Prediction and this package in the context of regression. It shows how the prediction interval and the set of test points that it covers vary in size as the user-specified coverage rate changes.

🚩 Installation

You can install the latest stable release from the general registry:

using Pkg
Pkg.add("ConformalPrediction")

The development version can be installed as follows:

using Pkg
Pkg.add(url="https://github.com/juliatrustworthyai/ConformalPrediction.jl")

🔍 Usage Example

To illustrate the intended use of the package, let’s have a quick look at a simple regression problem. We first generate some synthetic data and then determine indices for our training and test data using MLJ:

using MLJ

# Inputs:
N = 600
xmax = 3.0
using Distributions
d = Uniform(-xmax, xmax)
X = rand(d, N)
X = reshape(X, :, 1)

# Outputs:
noise = 0.5
fun(X) = sin(X)
ε = randn(N) .* noise
y = @.(fun(X)) + ε
y = vec(y)

# Partition:
train, test = partition(eachindex(y), 0.4, 0.4, shuffle=true)  # 40% train, 40% test; the remaining 20% is not used here

We then import the SRRegressor from SymbolicRegression.jl, following the standard MLJ procedure.

regressor = @load SRRegressor pkg=SymbolicRegression
model = regressor(
    niterations=50,
    binary_operators=[+, -, *],
    unary_operators=[sin],
)

To turn our conventional model into a conformal model, we just need to declare it as such using the conformal_model wrapper function. The resulting conformal model instance can then be bound to data in a machine. Finally, we fit the machine on the training data using the generic fit! method:

using ConformalPrediction
conf_model = conformal_model(model)
mach = machine(conf_model, X, y)
fit!(mach, rows=train)

Predictions can then be computed using the generic predict method. The code below produces predictions for the first five test samples. Each tuple contains the lower and upper bound of the prediction interval.

show_first = 5
Xtest = selectrows(X, test)
ytest = y[test]
ŷ = predict(mach, Xtest)
ŷ[1:show_first]
5-element Vector{Tuple{Float64, Float64}}:
 (-0.04087262272113379, 1.8635644669554758)
 (0.04647464096907805, 1.9509117306456876)
 (-0.24248802236397216, 1.6619490673126376)
 (-0.07841928163933476, 1.8260178080372749)
 (-0.02268628324126465, 1.881750806435345)

For simple models like this one, we can call a custom Plots recipe on our instance, fit result and data to generate the chart below:

using Plots
zoom = 0
plt = plot(mach.model, mach.fitresult, Xtest, ytest, lw=5, zoom=zoom, observed_lab="Test points")
xrange = range(-xmax+zoom,xmax-zoom,length=N)
plot!(plt, xrange, @.(fun(xrange)), lw=2, ls=:dash, colour=:darkorange, label="Ground truth")

We can evaluate the conformal model using the standard MLJ workflow with a custom performance measure. You can use either emp_coverage for the overall empirical coverage (correctness) or ssc for the size-stratified coverage rate (adaptiveness).

_eval = evaluate!(mach; measure=[emp_coverage, ssc], verbosity=0)
display(_eval)
println("Empirical coverage: $(round(_eval.measurement[1], digits=3))")
println("SSC: $(round(_eval.measurement[2], digits=3))")
PerformanceEvaluation object with these fields:
  model, measure, operation, measurement, per_fold,
  per_observation, fitted_params_per_fold,
  report_per_fold, train_test_rows, resampling, repeats
Extract:
┌──────────────────────────────────────────────┬───────────┬─────────────┬──────
│ measure                                      │ operation │ measurement │ 1.9 ⋯
├──────────────────────────────────────────────┼───────────┼─────────────┼──────
│ ConformalPrediction.emp_coverage             │ predict   │ 0.953       │ 0.0 ⋯
│ ConformalPrediction.size_stratified_coverage │ predict   │ 0.953       │ 0.0 ⋯
└──────────────────────────────────────────────┴───────────┴─────────────┴──────
                                                               2 columns omitted

Empirical coverage: 0.953
SSC: 0.953
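As a quick sanity check, the empirical coverage on the test set can also be computed by hand from the interval tuples returned by predict above. This minimal sketch uses only objects defined earlier in this example; note that evaluate! resamples internally, so the two numbers need not match exactly.

using Statistics
covered = [lo <= yi <= hi for ((lo, hi), yi) in zip(ŷ, ytest)]  # is each test point inside its interval?
println("Manual empirical coverage: $(round(mean(covered), digits=3))")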

📚 Read on

If, after reading the usage example above, you are just left with more questions about the topic, that's normal. Below we have collected a number of further resources to help you get started with this package and the topic itself:

  1. Blog post introducing conformal classifiers: [Quarto], [TDS], [Forem].
  2. Blog post applying CP to a deep learning image classifier: [Quarto], [TDS], [Forem].
  3. The package docs and in particular the FAQ.

External Resources

  • A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification by Angelopoulos and Bates (2022) (pdf).
  • Predictive inference with the jackknife+ by Barber et al. (2021) (pdf)
  • Awesome Conformal Prediction repository by Valery Manokhin (repo).
  • Documentation for the Python package MAPIE.

🔁 Status

This package is in its early stages of development and therefore still subject to changes to the core architecture and API.

Implemented Methodologies

The following CP approaches have been implemented (a sketch showing how to select one follows the lists below):

Regression:

  • Inductive
  • Naive Transductive
  • Jackknife
  • Jackknife+
  • Jackknife-minmax
  • CV+
  • CV-minmax

Classification:

  • Inductive
  • Naive Transductive
  • Adaptive Inductive
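A specific approach can typically be selected when the atomic model is wrapped. The sketch below reuses model, X, y and train from the usage example above; the method keyword and the symbol shown are assumptions to be checked against the package documentation.

# Sketch only: `method` keyword and symbol name are assumptions.
conf_model = conformal_model(model; method=:jackknife_plus)
mach = machine(conf_model, X, y)
fit!(mach, rows=train)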

The package has been tested for the following supervised models offered by MLJ.

Regression:

keys(tested_atomic_models[:regression])
KeySet for a Dict{Symbol, Expr} with 8 entries. Keys:
  :ridge
  :lasso
  :evo_tree
  :nearest_neighbor
  :decision_tree_regressor
  :quantile
  :random_forest_regressor
  :linear

Classification:

keys(tested_atomic_models[:classification])
KeySet for a Dict{Symbol, Expr} with 5 entries. Keys:
  :nearest_neighbor
  :evo_tree
  :random_forest_classifier
  :logistic
  :decision_tree_classifier

Implemented Evaluation Metrics

To evaluate conformal predictors we are typically interested in correctness and adaptiveness. The former can be evaluated by looking at the empirical coverage rate, while the latter can be assessed through metrics that address the conditional coverage (Angelopoulos and Bates 2022). To this end, the following metrics have been implemented:

  • emp_coverage (empirical coverage)
  • ssc (size-stratified coverage)

There is also a simple Plots.jl recipe that can be used to inspect the set sizes. In the regression case, the interval width is stratified into discrete bins for this purpose:

bar(mach.model, mach.fitresult, X)

🛠 Contribute

Contributions are welcome! A good place to start is the list of outstanding issues. For more details, see also the Contributor’s Guide. Please follow the SciML ColPrac guide.

🙏 Thanks

To build this package I have read and re-read both Angelopoulos and Bates (2022) and Barber et al. (2021). The Awesome Conformal Prediction repository (Manokhin 2022) has also been a fantastic place to get started. Thanks also to @aangelopoulos, @valeman and others for actively contributing to discussions here. Quite a few people have also started using and contributing to the package, for which I am very grateful. Finally, many thanks to Anthony Blaom (@ablaom) for many helpful discussions about how to interface this package with MLJ.jl.

🎓 References

Angelopoulos, Anastasios N., and Stephen Bates. 2022. “A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification.” https://arxiv.org/abs/2107.07511.

Barber, Rina Foygel, Emmanuel J. Candès, Aaditya Ramdas, and Ryan J. Tibshirani. 2021. “Predictive Inference with the Jackknife+.” The Annals of Statistics 49 (1): 486–507. https://doi.org/10.1214/20-AOS1965.

Blaom, Anthony D., Franz Kiraly, Thibaut Lienart, Yiannis Simillides, Diego Arenas, and Sebastian J. Vollmer. 2020. “MLJ: A Julia Package for Composable Machine Learning.” Journal of Open Source Software 5 (55): 2704. https://doi.org/10.21105/joss.02704.

Manokhin, Valery. 2022. “Awesome Conformal Prediction.” Zenodo. https://doi.org/10.5281/zenodo.6467205.

conformalprediction.jl's People

Contributors

ceferisbarov, github-actions[bot], john-waczak, mojifarmanbar, pat-alt, pitmonticone, pricklypointer, rikhuijzer


conformalprediction.jl's Issues

Wrap models not machines?

@pat-alt

Congratulations on the launch of this new package 🎉 Great to have the integration with MLJ!

I'm not familiar with conformal prediction, but I nevertheless wonder why this package wraps MLJ machines rather than models. If you wrap models, then you will buy into MLJ's model composition. So, a "conformally wrapped model" will behave like any other model: you can insert it into a pipeline, wrap it in a tuning strategy, and so forth.

New models in MLJ generally implement the "basement level" model API. Machines are a higher level abstraction for: (i) user interaction; and (ii) syntax for building learning networks which are ultimately "exported" as standalone model types.

Here are other examples of model wrapping in MLJ: EnsembleModel (docs), BinaryThresholdPredictor, TunedModel, IteratedModel. What makes things a little complicated is the model hierarchy: the model supertype for the wrapped model depends on the supertype of the atomic model. So for example, we don't just have EnsembleModel we have DeterministicEnsembleModel (for ordinary point predictors) and ProbabilisticEnsembleModel (for probabilistic predictors) but the user only sees a single constructor EnsembleModel; see here. (A longer term goal is to drop the hierarchy in favour of pure trait interface, which will simplify things, but that's a little ways off yet.)

Happy to provide further guidance.

cc @azev77

[maybe] Refactor Jackknife and CV methods (DRY)

Currently we have separate constructors for Jackknife and CV methods, e.g. JackknifePlusRegressor and CVPlusRegressor. Since the former is just a special case of the latter (for which nfold=nobs), technically this makes the former constructor redundant. By getting rid of it we could make the code base more DRY ("don't repeat yourself").

The problem with this idea is that at instantiation the models have no access to data, so nobs is unknown. So if we want to keep a separate model type (JackknifePlusRegressor), then making the code more DRY would come with its own complications.

Currently undecided, so won't fix myself, but leaving this here for discussion.

Add support for ConfTr

This ICLR 2022 paper shows how to train conformal classifiers.

  • Add losses for prediction step (prediction step)
  • Streamline (need separate score method for dealing with MLJFlux) - done in b4c7140
  • Add support for differentiable quantile computations (calibration step)
  • Implement batch training procedure
  • Test and document

[Testing] Ensure that empirical coverages check out

To ensure that all methods are implemented correctly, we should verify that the empirical coverage rates check out. To some extent this has already been done in the documentation, but we can be a bit more serious about it.

  • Add a guide to docs/explanation that runs all implemented methods and compares the empirical coverage to the theoretical coverage (see this table for reference)
  • Add these sanity checks as unit tests (a minimal sketch follows below)
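A minimal version of such a sanity check might look as follows. This is a sketch only, reusing the names from the usage example in the README above; the coverage keyword and the tolerance are assumptions rather than package guarantees.

using Test
cov = 0.95
conf_model = conformal_model(model; coverage=cov)  # `coverage` keyword assumed
mach = machine(conf_model, X, y)
fit!(mach, rows=train)
_eval = evaluate!(mach; measure=[emp_coverage], verbosity=0)
# Empirical coverage should be close to (and typically no less than) the nominal level.
@test _eval.measurement[1] >= cov - 0.05           # tolerance is an assumption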

[Feature request] Conformal Predictive Distributions

Hi & thanks for this package.
I've been waiting for a package for conformal prediction...

Here is some sample code from my test drive which may or may not be useful for docs:

using Pkg
Pkg.add.(["MLJ" "EvoTrees" "Plots"])
Pkg.add(url="https://github.com/pat-alt/ConformalPrediction.jl")
using MLJ, EvoTrees, ConformalPrediction, Plots, Random;
########################################
rng=MersenneTwister(49); #rng=Random.GLOBAL_RNG;
n= 100_000; p=7; σ=0.1;
X = [ones(n) randn(rng, n, p-1)]
θ = randn(rng, p)
y = X * θ .+ σ .* randn(rng, n)
train, calibration, test = partition(eachindex(y), 0.4, 0.4)
########################################
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees
model = EvoTreeRegressor() 
mach = machine(model, X, y)
fit!(mach, rows=train)
pr_y = predict(mach, rows=test)
########################################
conf_mach = conformal_machine(mach)
calibrate!(conf_mach, selectrows(X, calibration), y[calibration])
pr = predict(conf_mach, X[test,:]; coverage=0.95)

pr_lower = [pr[j][1][2][] for j in 1:length(test)]
pr_upper = [pr[j][2][2][] for j in 1:length(test)]

###########################################
plot()
plot!(y[test], lab="y test")
plot!(pr_y, lab="y prediction")
plot!(pr_lower, lab = "y 95% prediction lower bound")
plot!(pr_upper, lab = "y 95% prediction upper bound")

mean(pr_lower .<= y[test] .<= pr_upper)

Full module for Conformal Training

  • move existing things into their own module (may use separate package in the future)
  • implement complete algorithm for training
  • tests
  • docs

Set-valued predictions for MLJ

The goal for this package is to seamlessly interact with MLJ. To that end all conformal models implement the compulsory MMI.fit and MMI.predict methods (following guidelines set out here).

With respect to downstream tasks (in particular evaluation) we are facing a problem: predictions for standard conformal classifiers are set-valued. Currently MLJ supports Interval-valued predictions, but not Sets (see related discussion #20).

A long-term goal is to align MLJ and this package. Any thoughts/comments/help welcome!

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Package stability

Hi,

I want to introduce this package to teaching. I know you warn that the API is not stable, but I wanted to check whether this warning is serious or just a safety belt, i.e. whether you have mostly worked out the required design?

Conformal Training

Finally add full support for conformal training.

  • Add differentiable sort

.vscode folder

Maybe I am missing something, but why do we need to track the .vscode folder?

CP for LLMs

Issue set up for Experiment Week

  • Study this paper
  • Get data from hugging face
  • Train small transformer model from scratch
  • Look at fine-tuning of pre-trained model

Do you really need MLJ dependency?

I notice that CP takes a while to load/precompile. I'd be surprised if you really need MLJ as a dependency. MLJ essentially just collects these components, and you probably don't need all of them:

  • MLJBase
  • MLJModels (maybe, if you use some of the built-in transformers)
  • MLJEnsembles (unlikely)
  • MLJIteration (unlikely)
  • MLJTuning

Ordinary model interfaces just need MLJModelInterface, which is very lightweight. But you will need MLJBase if you are using composite model tools like learning networks, pipelines, etc. And if you are extending measures (but this will ultimately move out). You need it for machines too, but I thought that was factored out now, yes?

Add support for Quantile regression

MLJLinearModels.jl has a QuantileRegressor that (I presume) returns an interval. Section 2.2 outlines how Conformalized Quantile Regression (CQR) can be implemented; a sketch of the CQR score is included below. As with #32, I'm not sure how easy it is to simply adapt the score functions so that all the different approaches to conformalizing regression can still be used. Probably easier than #32, and it would definitely be desirable in this case.
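For reference, here is an illustrative sketch of the CQR nonconformity score and calibration step described in the referenced section. The names q_lo and q_hi stand for lower and upper quantile predictions; none of this is part of the package API.

using Statistics

cqr_score(q_lo, q_hi, y) = max(q_lo - y, y - q_hi)  # positive when y falls outside the quantile band

# Calibrate: widen the quantile band by the empirical quantile of the calibration
# scores (finite-sample correction omitted for brevity).
function cqr_interval(q_lo, q_hi, scores; coverage=0.9)
    qhat = quantile(scores, coverage)
    return (q_lo - qhat, q_hi + qhat)
end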

Add traits for custom measures (and later contribute for general use)

    @pat-alt Great to hear about your progress!

Q1: Firstly, should I extend MMI.evaluate to assert that users only use one of the two applicable custom measures?

Generally the kind of target proxy the measure is used for is articulated with the prediction_type trait. (Measures have traits, just like models. The manual mentions this, but you'll want to look also here if you're contributing new measures.) So, you would do something like:

StatisticalTraits.prediction_type(::Type{<:YourMeasureType}) = :probabilistic_set

edited: The model version of this trait is already suitably overloaded here:

https://github.com/JuliaAI/MLJModelInterface.jl/blob/d9e9703947fc04b0a5e63680289e41d0ba0d65bd/src/model_traits.jl#L27

The evaluate apparatus in MLJBase should check the model matches the measure and throw an error if it doesn't. Possibly, as this is a new target proxy type, the behaviour at MLJBase may need to be adjusted. The relevant logic lives approximately here:

https://github.com/JuliaAI/MLJBase.jl/blob/d79f29b78c5068377e25363884e2ea1c4b4a149a/src/resampling.jl#L600

Q2:

Do you always see this rubbish, or just for your custom measure? Where are you viewing this? Is it in an ordinary terminal or VSCode, notebook, other? Could you please try MLJ.color_off() and see if that helps?

Originally posted by @ablaom in #40 (comment)

[Docs] Add more how-to guides

In #42 I've just added a guide explaining how to conformalize a deep learning image classifier. It would be great to have more guides like this. Ideas for use cases and contributions are welcome.

Refactor type hierarchy

We are already looking at quite a few abstract types. It may be more useful to reduce their use and instead rely on THTT.
