Comments (7)
Requesting help to understand the error in more detail.
The `UserWarning` is telling you that some of the learned equations are all 0's. With the STLSQ optimizer, the `threshold` you set determines the smallest coefficient that is retained. You probably shouldn't need to set it as low as 0.00005. See the suggestions below for ways of dealing with this issue.
The `LinAlgWarning` is saying that the matrix of library functions, `Theta(X)`, probably has columns that are very close to linearly dependent. This can occur if your state variables are highly correlated or can be expressed as polynomial functions of other state variables (since you're using a polynomial library). I've seen the latter come up when the exact solution of the underlying differential equations involved sines and cosines, due to trig identities (e.g. if x1(t) = sin(t) and x2(t) = cos(t) are solutions, then x1^2 + x2^2 = 1, so the columns of `Theta(X)` corresponding to 1, x1^2, and x2^2 will be linearly dependent).
A few things worth trying:
- Normalize the features in the feature library with `ps.STLSQ(normalize=True)`. Different scales in the input features can affect the relative sizes of the coefficients.
- For the poor conditioning problem: increase the regularization strength, `alpha`, with `ps.STLSQ(alpha=0.1)`. You can try different values of `alpha` and see what works best.
- Experiment with different optimizers: SR3 or maybe a scikit-learn method.
- You may need to tweak the way you're computing `Dx_train`. If the data are noisy, be sure to use a differentiation method that is robust to noise.
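To make the interplay between the threshold and normalization concrete, here is a minimal NumPy sketch of sequentially thresholded least squares (the idea behind STLSQ). This is a simplification of PySINDy's actual optimizer, and the toy data and column scales are made up for illustration:

```python
import numpy as np

def stlsq(theta, dx, threshold, n_iter=10):
    """Sequentially thresholded least squares: repeatedly solve the
    least-squares problem and zero out coefficients below `threshold`."""
    xi = np.linalg.lstsq(theta, dx, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():
            xi[big] = np.linalg.lstsq(theta[:, big], dx, rcond=None)[0]
    return xi

rng = np.random.default_rng(0)
# Two library columns on very different scales: the second is 1000x
# larger, so its true coefficient is 1000x smaller.
theta = np.column_stack([rng.normal(size=200),
                         1e3 * rng.normal(size=200)])
xi_true = np.array([1.0, 1e-3])
dx = theta @ xi_true

# A uniform threshold of 0.1 wrongly discards the second term...
xi_raw = stlsq(theta, dx, threshold=0.1)

# ...but normalizing the columns first puts both coefficients on the
# same scale, so the same threshold keeps both.
norms = np.linalg.norm(theta, axis=0)
xi_norm = stlsq(theta / norms, dx, threshold=0.1) / norms
```

Here `xi_raw` loses the small-coefficient term entirely, while `xi_norm` recovers both, which is the effect `normalize=True` is meant to provide.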
from pysindy.
The `Dx_train` derivatives of the state variables are obtained from the ODE system model (which is stiff) itself, by plugging in the solution and evaluating the RHS. I am trying to recover the ODE system with SINDy, but am unable to do so.
I notice that there are indeed some highly correlated state variables in my `x_train` (they have nearly the same qualitative variation with time).
I have 7 states in `x_train = [x1(t), x2(t), x3(t), x4(t), x5(t), x6(t), x7(t)]`.
The following code, I guess, includes all products `x*y` of the 7 state variables (some of which are again correlated):
```python
library_functions = [
    lambda x: x,
    lambda x, y: x * y,
]
library_function_names = [
    lambda x: x,
    lambda x, y: '(' + x + '*' + y + ')',
]
```
But suppose that, of all the combinations, I wish to select only 2 or 3 specific ones, such as `x2*x3` or `x3*x4`, which are not strongly correlated. How do I proceed, to avoid including all the combinations?
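One workaround is to skip the combinatorial library and assemble only the columns you want by hand; the resulting matrix can then be fed to any sparse regressor. (Recent PySINDy versions also have a `GeneralizedLibrary` whose `inputs_per_library` argument restricts which states feed each sub-library, if your version includes it.) A NumPy sketch of the by-hand approach, with random data standing in for the 7 states:

```python
import numpy as np

rng = np.random.default_rng(1)
x_train = rng.normal(size=(100, 7))  # stand-in for [x1(t), ..., x7(t)]
x2, x3, x4 = x_train[:, 1], x_train[:, 2], x_train[:, 3]

# Hand-picked library: the 7 linear terms plus only the two products
# actually wanted, instead of all 21 pairwise combinations.
theta = np.column_stack([x_train, x2 * x3, x3 * x4])
feature_names = [f"x{i}" for i in range(1, 8)] + ["(x2*x3)", "(x3*x4)"]
```

`theta` here has only 9 columns, and any optimizer (STLSQ, Lasso, etc.) can be run directly on `theta` and `Dx_train`.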
Did anything I suggested help at all? Increasing the regularization parameter, `alpha`, should help. If the system is stiff, it is likely that there are at least two different timescales coming into play, so you may be able to improve your results by using a non-uniform time sampling scheme. See this paper for further information. You may also be able to get away with simply increasing the sampling rate.
Hello sir,
Thanks for pointing me to the paper; I will go through it as soon as possible, and will also check the effect of increasing the sampling rate.
Based on the earlier suggestions, here is what I found.
- Increasing `alpha` and setting `normalization=True` helped in spitting out some (inaccurate) non-zero terms for a couple of the state variables, but the other state variables still remain all zero.
- SR3 (and also STLSQ with large `alpha`) does not give any error or warning, but gives results that are not physically explainable (the terms printed by the model are not possible).
- Lasso also produced some (inaccurate) non-zero terms for a couple of the state variables, but it gives a warning to increase the maximum number of iterations, even after setting it to 2000.
Here is the explanation of the model: `[x1(t), x2(t), ..., x7(t)]` are state variables taken from a high-dimensional full model of chemical reactions, which has even more state variables. It has been shown in the literature that a reduced-order model can be constructed using only these 7 state variables; the equivalent reduced-order model, with only 4 parameters, is given below (it only approximately represents the actual full model).
```
[k1, k2, k3, k4] = [6.93938e+20, 3.49444e+03, 1.48266e+08, 2.82183e-02]
    # inaccurate parameters for the reduced-order model,
    # obtained from optimization methods like curve fitting
[x1, x2, x3, x4, x5, x6, x7] @ (t = 0) = [100e-12, 0.0, 1.6e-07, 2.0e-08, 0.0, 1.4e-06, 0.0]
    # units (M)
t_span = 0:800    # time span for the ODE system, 800 s

x1' = -k1 (x1 x2 x3 x4)
x2' = -k1 (x1 x2 x3 x4) + k2 (x1 x3 x6) + k3 (x5 x6) - k4 x2
x3' = -k1 (x1 x2 x3 x4)
x4' = -k1 (x1 x2 x3 x4)
x5' =  k1 (x1 x2 x3 x4)
x6' = -k2 (x1 x3 x6) - k3 (x5 x6)
x7' =  k4 x2
```
I was trying to verify that the above equations hold using SINDy, which might spit out more accurate values of the parameters `[k1, k2, k3, k4]`, for which the reduced-order model accurately represents the full model.
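For concreteness, the model above written as a single RHS function, which is also how `Dx_train` is obtained here (by evaluating the RHS along the solution). The constants and initial condition are the ones stated above:

```python
import numpy as np

k1, k2, k3, k4 = 6.93938e20, 3.49444e3, 1.48266e8, 2.82183e-2

def rhs(t, x):
    """RHS of the 7-state reduced-order model."""
    x1, x2, x3, x4, x5, x6, x7 = x
    r1 = k1 * x1 * x2 * x3 * x4
    r2 = k2 * x1 * x3 * x6
    r3 = k3 * x5 * x6
    return np.array([
        -r1,                      # x1'
        -r1 + r2 + r3 - k4 * x2,  # x2'
        -r1,                      # x3'
        -r1,                      # x4'
        r1,                       # x5'
        -r2 - r3,                 # x6'
        k4 * x2,                  # x7'
    ])

x0 = np.array([100e-12, 0.0, 1.6e-7, 2.0e-8, 0.0, 1.4e-6, 0.0])

# Given a solution array x_train of shape (n_times, 7), Dx_train is
# obtained by evaluating the RHS at each sample:
#   Dx_train = np.array([rhs(t, x) for t, x in zip(t_train, x_train)])
dx0 = rhs(0.0, x0)
```

At t = 0 only the k2 reaction is active (x2 and x5 start at zero), so `dx0` is non-zero only in the x2' and x6' components, with equal magnitude and opposite sign.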
Thanks for the detailed explanation. A couple of things stand out to me here:
- Many of the terms you're hoping to learn involve fourth degree terms, so you should use a polynomial library of degree four.
- I think you will see a good deal of improvement if you can increase the sampling rate. Even using a naive uniform time sampling strategy, you should be able to get better results. I suspect that one second between samples might be too long for PySINDy to compute an accurate derivative.
- If the coefficients k1-k4 really do span 22 orders of magnitude, SINDy is going to have a hard time picking up the smaller ones. You might consider trying the ConstrainedSR3 optimizer on the SR3Enhanced_variableThresholding branch with custom thresholds for different library terms (#78). This should allow you to set, for example, a small threshold for the `x2` term, which you expect to have a smaller coefficient, and a larger threshold for, say, `(x1 x2 x3 x4)`.
- Sometimes orthogonal matching pursuit gives better results than LASSO.
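For context, orthogonal matching pursuit is a greedy method: it repeatedly adds the library column most correlated with the current residual and refits by least squares on the selected columns. A bare-bones NumPy sketch on synthetic data (scikit-learn's `OrthogonalMatchingPursuit` is the production version):

```python
import numpy as np

def omp(theta, dx, n_nonzero):
    """Greedy orthogonal matching pursuit: add one library column at a
    time, refitting by least squares after each addition."""
    residual = dx.copy()
    support = []
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        corr = np.abs(theta.T @ residual)
        corr[support] = -np.inf  # never re-pick a chosen column
        support.append(int(np.argmax(corr)))
        coef_s = np.linalg.lstsq(theta[:, support], dx, rcond=None)[0]
        residual = dx - theta[:, support] @ coef_s
    coef = np.zeros(theta.shape[1])
    coef[support] = coef_s
    return coef

rng = np.random.default_rng(2)
theta = rng.normal(size=(100, 10))   # synthetic library, 10 candidate terms
true = np.zeros(10)
true[[2, 7]] = [1.5, -0.8]           # only two terms are truly active
coef = omp(theta, theta @ true, n_nonzero=2)
```

Unlike LASSO, the number of non-zero terms is controlled directly, which can help when the penalty needed to suppress spurious terms would also shrink the real ones.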
Hi,
A polynomial library of degree four was used, but I was unable to get the correct result. It really was because of the parameters `k` spanning many orders of magnitude. Increasing the sampling rate did not help. I also tried tuning the ConstrainedSR3 optimizer with different thresholds for different library terms, but was unable to recover the equations.
Recently I came across a nice paper that tackles exactly this issue of parameters spanning orders of magnitude (as in chemical reaction networks). The key is to scale the linear system before minimizing it. Attaching the link here for reference; I will check how this fares with SINDy.
Rapid data-driven model reduction of nonlinear dynamical systems including chemical reaction networks using ℓ1-regularization
https://aip.scitation.org/doi/10.1063/1.5139463
Thanks for the reference. Setting `normalization=True` should rescale each of the columns, but you might have better luck doing the rescaling by hand via the approach in the paper you linked. I'm curious to hear whether it works!
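A rough NumPy sketch of that idea (not the paper's exact procedure): rescale each library column to unit norm, threshold in the scaled coordinates *relative to the largest coefficient*, then map back, so that a k of order 1e+20 and one of order 1e-2 are treated on an equal footing. The data here are synthetic, with scales loosely mimicking the chemical model above:

```python
import numpy as np

def scaled_threshold_lstsq(theta, dx, rel_threshold):
    """Normalize each library column, solve the scaled system, drop
    coefficients small relative to the largest one, map back to
    physical units."""
    norms = np.linalg.norm(theta, axis=0)
    xi = np.linalg.lstsq(theta / norms, dx, rcond=None)[0]
    xi[np.abs(xi) < rel_threshold * np.abs(xi).max()] = 0.0
    return xi / norms

rng = np.random.default_rng(3)
n = 200
quartic = 1e-30 * rng.normal(size=n)   # like x1*x2*x3*x4 in molar units
linear = 1e-8 * rng.normal(size=n)     # like x2
spurious = 1e-8 * rng.normal(size=n)   # a candidate term not in the model
theta = np.column_stack([quartic, linear, spurious])
dx = 1e20 * quartic + 1e-2 * linear    # true k's span 22 orders of magnitude

xi = scaled_threshold_lstsq(theta, dx, rel_threshold=0.1)
```

In the scaled coordinates, each coefficient measures how much that column actually contributes to `dx`, so the relative threshold rejects the spurious term while keeping both real terms despite the 22-order-of-magnitude gap between them.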