tisimst / soerp

Second Order ERror Propagation

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

soerp's Introduction

`soerp` Package Documentation

Overview

soerp is the Python implementation of the original Fortran code SOERP by N. D. Cox, which applies a second-order analysis to error propagation (or uncertainty analysis). The soerp package allows you to easily and transparently track the effects of uncertainty through mathematical calculations. Advanced mathematical functions, similar to those in the standard math module, can also be evaluated directly.

In order to correctly use soerp, the first eight statistical moments of the underlying distribution are required. These are the mean, variance, and then the standardized third through eighth moments. These can be input manually in the form of an array, but they can also be conveniently generated using either the nice constructors or directly by using the distributions from the scipy.stats sub-module. See the examples below for both input methods. Every calculation yields a mean, a variance, and standardized skewness and kurtosis coefficients.

Required Packages

ad : For first- and second-order automatic differentiation (install this first).
NumPy : Numeric Python
SciPy : Scientific Python (the nice distribution constructors require this)
Matplotlib : Python plotting library

Basic examples

Let's begin by importing all the available constructors:

>>> from soerp import *   # uv, N, U, Exp, etc.

Now, we can see that there are several equivalent ways to specify a statistical distribution, say a Normal distribution with a mean value of 10 and a standard deviation of 1:

Manually input the first 8 moments (mean, variance, and 3rd-8th standardized central moments):
```
>>> x = uv([10, 1, 0, 3, 0, 15, 0, 105])
```
Use the rv kwarg to input a distribution from the scipy.stats module:
```
>>> import scipy.stats as ss
>>> x = uv(rv=ss.norm(loc=10, scale=1))
```
Use a built-in convenience constructor (typically the easiest if you can):
```
>>> x = N(10, 1)
```
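The moment vector passed to uv above can be checked by hand. For a normal distribution the odd standardized central moments vanish and the even ones equal the double factorial (k - 1)!!. The sketch below is plain Python and does not use soerp; the helper name `normal_moments` is purely illustrative.

```python
def normal_moments(mu, sigma):
    """Mean, variance, then standardized central moments 3 through 8 of a normal."""
    moments = [mu, sigma**2]
    for k in range(3, 9):
        if k % 2:
            # Odd standardized central moments of a normal are zero
            moments.append(0)
        else:
            # Even ones equal (k - 1)!! = (k - 1) * (k - 3) * ... * 1
            value = 1
            for j in range(k - 1, 0, -2):
                value *= j
            moments.append(value)
    return moments

print(normal_moments(10, 1))  # [10, 1, 0, 3, 0, 15, 0, 105]
```

The printed list matches the array used in the manual-input example.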

A Simple Example

Now let's walk through an example of a three-part assembly stack-up:

>>> x1 = N(24, 1)  # normally distributed
>>> x2 = N(37, 4)  # normally distributed
>>> x3 = Exp(2)  # exponentially distributed
>>> Z = (x1*x2**2)/(15*(1.5 + x3))

We can now see the results of the calculations in two ways:

The usual print statement (or simply the object if in a terminal):

>>> Z  # "print" is optional at the command-line
uv(1176.45, 99699.6822917, 0.708013052944, 6.16324345127)

The describe method that explains briefly what the values are:

>>> Z.describe()
SOERP Uncertain Value:
 > Mean...................  1176.45
 > Variance...............  99699.6822917
 > Skewness Coefficient...  0.708013052944
 > Kurtosis Coefficient...  6.16324345127

Distribution Moments

The eight moments of any input variable (and four of any output variable) can be accessed using the moments method, as in:

>>> x1.moments()
[24.0, 1.0, 0.0, 3.0000000000000053, 0.0, 15.000000000000004, 0.0, 105.0]
>>> Z.moments()
[1176.45, 99699.6822917, 0.708013052944, 6.16324345127]

Correlations

Statistical correlations are correctly handled, even after calculations have taken place:

>>> x1 - x1
0.0
>>> square = x1**2
>>> square - x1*x1
0.0

Derivatives

Derivatives with respect to original variables are calculated via the ad package and are accessed using the following intuitive methods:

>>> Z.d(x1)  # dZ/dx1
45.63333333333333

>>> Z.d2(x2)  # d^2Z/dx2^2
1.6

>>> Z.d2c(x1, x3)  # d^2Z/dx1dx3 (order doesn't matter)
-22.816666666666666

When we need multiple derivatives at a time, we can use the gradient and hessian methods:

>>> Z.gradient([x1, x2, x3])
[45.63333333333333, 59.199999999999996, -547.6]

>>> Z.hessian([x1, x2, x3])
[[0.0, 2.466666666666667, -22.816666666666666], [2.466666666666667, 1.6, -29.6], [-22.816666666666666, -29.6, 547.6]]
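The derivative values above can be reproduced by differentiating Z by hand, and the diagonal of the Hessian is enough to check the reported mean. The sketch below is plain Python without soerp or ad. It assumes independent inputs and the usual second-order Taylor estimate of the mean, f(mu) + 1/2 * sum(d2f/dxi2 * var_i); that this is exactly what the package computes internally is an assumption, although the numbers agree with the output shown earlier.

```python
# Means and variances of the three inputs from the stack-up example
x1, x2, x3 = 24.0, 37.0, 0.5         # Exp(2) has mean 1/2
var1, var2, var3 = 1.0, 16.0, 0.25   # Exp(2) has variance 1/4

def f(a, b, c):
    """The stack-up formula evaluated on plain floats."""
    return (a * b**2) / (15 * (1.5 + c))

d = 1.5 + x3  # shorthand for the term in the denominator

gradient = [
    x2**2 / (15 * d),            # dZ/dx1
    2 * x1 * x2 / (15 * d),      # dZ/dx2
    -x1 * x2**2 / (15 * d**2),   # dZ/dx3
]
hessian_diagonal = [
    0.0,                            # d^2Z/dx1^2
    2 * x1 / (15 * d),              # d^2Z/dx2^2
    2 * x1 * x2**2 / (15 * d**3),   # d^2Z/dx3^2
]

# Second-order estimate of the mean for independent inputs
mean = f(x1, x2, x3) + 0.5 * sum(
    h * v for h, v in zip(hessian_diagonal, (var1, var2, var3))
)
print(gradient)
print(hessian_diagonal)
print(mean)
```

The nominal value f(24, 37, 0.5) is 1095.2, and the curvature terms add 81.25, which accounts for the reported mean of 1176.45.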

Error Components/Variance Contributions

Another useful feature is available through the error_components method, which has various ways of representing the first- and second-order variance components:

>>> Z.error_components(pprint=True)
COMPOSITE VARIABLE ERROR COMPONENTS
uv(37.0, 16.0, 0.0, 3.0) = 58202.9155556 or 58.378236%
uv(24.0, 1.0, 0.0, 3.0) = 2196.15170139 or 2.202767%
uv(0.5, 0.25, 2.0, 9.0) = -35665.8249653 or 35.773258%

Advanced Example

Here's a slightly more advanced example, estimating the statistical properties of volumetric gas flow through an orifice meter:

>>> from soerp.umath import *  # sin, exp, sqrt, etc.
>>> H = N(64, 0.5)
>>> M = N(16, 0.1)
>>> P = N(361, 2)
>>> t = N(165, 0.5)
>>> C = 38.4
>>> Q = C*sqrt((520*H*P)/(M*(t + 460)))
>>> Q.describe()
SOERP Uncertain Value:
 > Mean...................  1330.99973939
 > Variance...............  58.210762839
 > Skewness Coefficient...  0.0109422068056
 > Kurtosis Coefficient...  3.00032693502

This seems to indicate that even though there are products, divisions, and the usage of sqrt, the result resembles a normal distribution (i.e., Q ~ N(1331, 7.63), where the standard deviation = sqrt(58.2) = 7.63).
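As a rough cross-check of the numbers above, a first-order estimate can be worked out by hand. Because Q is proportional to the square root of a product and quotient of the inputs, each relative variance enters with a factor of 1/4. The sketch below is plain Python and assumes independent inputs; it is not how soerp computes the result, only a sanity check on its output.

```python
import math

# Means and standard deviations from the orifice-meter example
H, sH = 64.0, 0.5
M, sM = 16.0, 0.1
P, sP = 361.0, 2.0
t, st = 165.0, 0.5
C = 38.4

# Flow evaluated at the input means
q_nominal = C * math.sqrt((520 * H * P) / (M * (t + 460)))

# First-order variance: Q^2 / 4 times the sum of squared relative errors
# (the additive constant 460 carries no uncertainty, so t + 460 has sd st)
rel_var = (sH / H)**2 + (sM / M)**2 + (sP / P)**2 + (st / (t + 460))**2
var_first_order = q_nominal**2 * rel_var / 4

print(q_nominal, var_first_order, math.sqrt(var_first_order))
```

The first-order variance comes out close to 58.21 and the standard deviation close to 7.63, so the second-order terms contribute very little in this case.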

Main Features

- Transparent calculations with derivatives automatically calculated. Little or no modification to existing code required.
- Basic NumPy support without modification. Vectorized calculations are built into the ad package.
- Nearly all standard math module functions supported through the soerp.umath sub-module. If you think a function is in there, it probably is.
- Nearly all derivatives calculated analytically using ad functionality.
- Easy continuous distribution constructors:
  - N(mu, sigma) : Normal distribution
  - U(a, b) : Uniform distribution
  - Exp(lamda, [mu]) : Exponential distribution
  - Gamma(k, theta) : Gamma distribution
  - Beta(alpha, beta, [a, b]) : Beta distribution
  - LogN(mu, sigma) : Log-normal distribution
  - Chi2(k) : Chi-squared distribution
  - F(d1, d2) : F-distribution
  - Tri(a, b, c) : Triangular distribution
  - T(v) : T-distribution
  - Weib(lamda, k) : Weibull distribution

The location, scale, and shape parameters follow the notation in the respective Wikipedia articles. Discrete distributions are not recommended for use at this time. If you need discrete distributions, try the mcerp Python package instead.

Installation

Make sure you install the ad package first! (If you use options 3 or 4 below, this should be done automatically.)

You have several easy, convenient options to install the soerp package (administrative privileges may be required):

1. Download the package files below, unzip to any directory, and run:
```
$ [sudo] python setup.py install
```
2. Simply copy the unzipped soerp-XYZ directory to any other location where Python can find it and rename it soerp.

3. If setuptools is installed, run:

$ [sudo] easy_install [--upgrade] soerp

4. If pip is installed, run:
```
$ [sudo] pip install [--upgrade] soerp
```

Uninstallation

To remove the package, there are two good ways to do this:

1. Go to the site-packages or dist-packages folder and simply delete the soerp and soerp-XYZ-egg-info folders.
2. If pip is installed, run:
```
$ [sudo] pip uninstall soerp
```

Contact

Please send feature requests, bug reports, or feedback to Abraham Lee.

Acknowledgements

The author wishes to thank Eric O. LEBIGOT who first developed the uncertainties python package (for first-order error propagation), from which many inspiring ideas (like maintaining object correlations, etc.) are re-used and/or have been slightly evolved. If you don't need second order functionality, his package is an excellent alternative since it is optimized for first-order uncertainty analysis.

References

N.D. Cox, 1979, Tolerance Analysis by Computer, Journal of Quality Technology, Vol. 11, No. 2, pp. 80-87

soerp's Issues

Bug in Tri() constructor for triangular distribution

The constructor for triangular distributions, Tri(a, b, c) is calling scipy.stats.triang as follows:

return uv(rv=ss.triang(c, loc=a, scale=b-a), tag=tag)

where a and b are respectively the min and max of the distribution and c is the peak. However, even if this is not clear from the SciPy documentation, the first argument to scipy.stats.triang should be between 0 and 1. That line should read something like

return uv(rv=ss.triang((c-a)/float(b-a), loc=a, scale=b-a), tag=tag)
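The rescaling proposed above can be illustrated without SciPy. The sketch below maps a (min, max, peak) triple to the (shape, loc, scale) arguments that scipy.stats.triang expects; the helper name `triang_args` is hypothetical and not part of soerp.

```python
def triang_args(a, b, c):
    """Map (min, max, peak) to the (shape, loc, scale) arguments of scipy.stats.triang."""
    if a == b or not a <= c <= b:
        raise ValueError("expected a <= c <= b with a < b")
    # The shape parameter is the peak position as a fraction of the range
    return (c - a) / float(b - a), a, b - a

shape, loc, scale = triang_args(a=2, b=10, c=4)

# scipy.stats.triang(shape, loc=loc, scale=scale) places its mode at
# loc + shape * scale, which recovers the requested peak
peak = loc + shape * scale
print(shape, peak)
```

Passing c directly as the first argument, as the current constructor does, only works when the distribution already spans the interval from 0 to 1.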

Add CITATION.cff file

Github supports CITATION.cff files, so I was thinking you might want to add one:
https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files

Weib distribution

Your __init__.py file references the Weibull distribution on line 620 but in reality calls the exponentiated Weibull distribution (exponweib function) from scipy.stats on line 1035. This can be confusing to an unsuspecting user. The attached python script demonstrates the problem.

# import libraries
import scipy.stats as sps
import soerp as srp
import scipy.special as spp

# define lamda and k parameters
lamda=4
k=6

# calculate EXPONENTIATED Weibull distribution mean using soerp
res1 = srp.Weib(lamda=lamda, k=k).mean

# calculate Weibull distribution mean using the analytic solution
res2 = lamda*spp.gamma(1+1/k)

# calculate Weibull distribution mean using scipy.stats
res3 = sps.weibull_min.mean(c=k, scale=lamda, loc=0)

# calculate Weibull distribution mean using soerp's uv function
res4 = srp.uv(rv=sps.weibull_min(k, scale=lamda, loc=0)).mean

# and print the results
print(res1, "\t", res2, "\t", res3, "\t", res4)

I think that the following patch rectifies the situation.

--- __init__.py.OLD	2017-02-25 09:24:55.965982600 -0600
+++ __init__.py	2017-03-19 14:58:39.587601800 -0500
@@ -582,7 +582,7 @@
     +---------------------------+-------------+-------------------+-----+---------+
     | Student-T(v)              | t           | v                 |     |         |
     +---------------------------+-------------+-------------------+-----+---------+
-    | Weibull(lamda, k)         | exponweib   | lamda, k          |     |         |
+    | Weibull(lamda, k)         | weibull_min | k                 |     | lamda   |
     +---------------------------+-------------+-------------------+-----+---------+
     
     Thus, each distribution above would have the same call signature::
@@ -1032,7 +1032,7 @@
         The shape parameter
     """
     assert lamda>0 and k>0, 'Weibull scale and shape parameters must be greater than zero'
-    return uv(rv=ss.exponweib(lamda, k), tag=tag)
+    return uv(rv=ss.weibull_min(k, loc=0, scale=lamda), tag=tag)
 
 ###############################################################################

I have tested it minimally. Unfortunately it could break existing code but I believe that these changes conform to the principle of least surprise based on your documentation.

Probability density from soerp

Hi there,

I was just wondering if there is an easy way to use the result of soerp operations as random variables. For example

x=N(0,1)
y=N(1,2)
z=x+y

Then, can I get the pdf of z easily?

EDIT: Actually I see now (from a plot message) that this is not yet possible.

Did you have anything in mind about how you might go about doing this? Something like edgeworth expansions?
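One way such an expansion could look, given only the four output moments that soerp reports, is a Gram-Charlier A series (a close relative of the Edgeworth expansion). This is a sketch in plain Python, not a soerp feature, and the function name `gram_charlier_pdf` is hypothetical. A truncated series like this can go negative in the tails when the skewness or excess kurtosis is large.

```python
import math

def gram_charlier_pdf(x, mean, variance, skewness, kurtosis):
    """Approximate density from the first four moments (Gram-Charlier A series)."""
    sigma = math.sqrt(variance)
    z = (x - mean) / sigma
    normal_pdf = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
    he3 = z**3 - 3 * z          # probabilists' Hermite polynomial He3
    he4 = z**4 - 6 * z**2 + 3   # probabilists' Hermite polynomial He4
    correction = 1 + skewness / 6 * he3 + (kurtosis - 3) / 24 * he4
    return normal_pdf * correction

# z = x + y with x = N(0, 1) and y = N(1, 2), assumed independent:
# mean 1, variance 1 + 4 = 5, skewness 0, kurtosis 3
print(gram_charlier_pdf(1.0, 1.0, 5.0, 0.0, 3.0))
```

With zero skewness and a kurtosis of 3 the correction factor is 1, so the result reduces to the ordinary normal density.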

Add shape param for LogN distribution

It is currently not possible to set the "usual" loc and scale params on the log scale for the LogNormal distribution.

loc is usually 0, and scale should usually be set to exp(mu)

For details, see http://nbviewer.ipython.org/url/xweb.geos.ed.ac.uk/~jsteven5/blog/lognormal_distributions.ipynb
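The mapping described above can be written out explicitly. The sketch below is plain Python; `lognorm_args` and `lognorm_mean` are hypothetical helpers, and the returned keyword names are those of scipy.stats.lognorm, where s is the standard deviation of the underlying normal.

```python
import math

def lognorm_args(mu, sigma):
    """Keyword arguments for scipy.stats.lognorm such that ln(X) ~ Normal(mu, sigma)."""
    return {"s": sigma, "loc": 0.0, "scale": math.exp(mu)}

def lognorm_mean(mu, sigma):
    """Closed-form mean of the log-normal distribution."""
    return math.exp(mu + 0.5 * sigma**2)

args = lognorm_args(mu=1.0, sigma=0.5)
# scipy.stats.lognorm(**args) would then describe the same distribution
print(args, lognorm_mean(1.0, 0.5))
```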

Error in estimating error of (a**x)

Hello,
I am trying to estimate the error of a formula, and the code goes into infinite recursion.
After some attempts I found that the problem lies in formulas of the form a**x, which are handled by the __rpow__ method.

A simple example that triggers the recursion:

alpha=N(20,1)
10**alpha
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  (984 loops skipped...)
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/soerp/__init__.py", line 426, in __rpow__
    return _make_UF_compatible_object(ADF.__rpow__(self, val))
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 787, in __rpow__
    return to_auto_diff(val)**self
  File "/opt/anaconda3/lib/python3.7/site-packages/ad/__init__.py", line 39, in to_auto_diff
    if isinstance(x, ADF):
RecursionError: maximum recursion depth exceeded while calling a Python object

In soerp/__init__.py:
def __rpow__(self, val):
    return _make_UF_compatible_object(ADF.__rpow__(self, val))

In ad/__init__.py:
def __rpow__(self, val):
    return to_auto_diff(val)**self

Please solve this problem, thanks!

Sincerely,
Peter
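The traceback appears consistent with a general Python rule: when the right operand of a binary operator is an instance of a subclass of the left operand's type and overrides the reflected method, Python tries the subclass's reflected method first. The toy classes below are stand-ins invented for illustration (they are not the real ad or soerp classes) and reproduce the same loop, together with one possible way of breaking it by calling `__pow__` explicitly.

```python
class Value:
    """Toy stand-in for a base numeric wrapper; holds a single float."""

    def __init__(self, x):
        self.x = float(x)

    def __pow__(self, other):
        exponent = other.x if isinstance(other, Value) else float(other)
        return Value(self.x ** exponent)

    def __rpow__(self, base):
        # Same shape as the ad code quoted above
        return to_value(base) ** self


def to_value(obj):
    """Wrap plain numbers, pass existing Value objects through."""
    return obj if isinstance(obj, Value) else Value(obj)


class Uncertain(Value):
    def __rpow__(self, base):
        # Delegates to the base class, whose `**` dispatches straight back
        # here because Uncertain is a subclass that overrides __rpow__
        return Uncertain(Value.__rpow__(self, base).x)


class SafeUncertain(Value):
    def __rpow__(self, base):
        # Calling __pow__ directly skips the reflected-operator dispatch
        return SafeUncertain(to_value(base).__pow__(self).x)


try:
    10 ** Uncertain(2)
except RecursionError:
    print("Uncertain: recursion, as in the report")

print((10 ** SafeUncertain(2)).x)  # 100.0
```

Whether the same change is appropriate inside soerp or ad depends on how their derivative bookkeeping is done, which this toy does not model.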

Python 3 Compatibility

When trying to use error_components there are some dictionary errors. These result from Python 3 removing the 'has_key' method of dictionary objects. They can be corrected by replacing 'dictionary.has_key(key)' with 'key in dictionary'.
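The replacement described above looks like this in practice. The dictionary is made up for illustration and is not the structure soerp uses internally.

```python
# Hypothetical mapping from a variable label to its variance contribution
variance_contributions = {"x1": 2196.15, "x2": 58202.92}

# Python 2 only: variance_contributions.has_key("x1")
# Python 2 and 3: use the `in` operator instead
if "x1" in variance_contributions:
    print(variance_contributions["x1"])

# dict.get avoids the membership test when a default value is acceptable
missing = variance_contributions.get("x3", 0.0)
print(missing)
```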

Documentation/technical description

I was wondering whether there is, apart from the docstrings, any documentation or technical explanation of how the package works from a mathematical point of view. Or could you point to a Wikipedia article or other online resource that describes the rationale behind the package?
