Comments (7)
Note that this parameter should also be added to `var`.
from scikit-fda.
I understand that `ddof` should also be 1 by default in `var`. This is going to change the default behavior of `var`, as it now normalizes by `N`.
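To make the difference concrete, here is a minimal sketch using plain NumPy rather than the scikit-fda `var` function (whose exact signature is not shown in this thread). The sample values are made up for illustration; the point is only that `ddof=0` divides by `N` and `ddof=1` divides by `N - 1`.

```python
import numpy as np

# Hypothetical sample; the values are illustrative only.
x = np.array([1.0, 2.0, 3.0, 4.0])
n = x.size

# ddof=0 divides the sum of squared deviations by N.
var_population = np.var(x, ddof=0)

# ddof=1 divides the same sum by N - 1.
var_sample = np.var(x, ddof=1)

# The two estimates differ only by the factor N / (N - 1).
ratio = var_sample / var_population
print(var_population, var_sample, ratio)
```

Any code that relied on the old default therefore sees its variances scaled by `N / (N - 1)` once the default changes.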
from scikit-fda.
That is right. I do not think that changing it will cause problems, as it is not a widely used function.
from scikit-fda.
Some tests are failing because of the modification in `var`.
================================== short test summary info ===================================
FAILED skfda/ml/clustering/_kmeans.py::skfda.ml.clustering._kmeans.FuzzyCMeans
FAILED skfda/tests/test_neighbors.py::TestNeighbors::test_score_functional_response - AssertionError:
FAILED skfda/tests/test_scoring.py::TestScoreFunctionsGrid::test_all - AssertionError:
FAILED skfda/tests/test_scoring.py::TestScoreFunctionGridBasis::test_all - AssertionError: 0.9859361757845293 != 0.992968013919044 within 2 places (0.00703183813451...
FAILED skfda/tests/test_scoring.py::TestScoreZeroDenominator::test_zero_r2 - AssertionError:
I am going to write `ddof=0` in the calls to `var` that appear in the code:
scikit-fda/skfda/misc/scoring.py
Line 77 in 415796b
scikit-fda/skfda/ml/clustering/_kmeans.py
Line 150 in 415796b
This solves the issues with the tests.
from scikit-fda.
Do not "just" write `ddof=0`. It would be best to analyze the intended denominator in each place.
from scikit-fda.
Sorry, I thought those functions were already intentionally using
First case
A function `_var` is defined:
scikit-fda/skfda/misc/scoring.py
Lines 70 to 82 in 415796b
Here:
scikit-fda/skfda/misc/scoring.py
Lines 245 to 250 in 415796b
And here:
scikit-fda/skfda/misc/scoring.py
Lines 1011 to 1020 in 415796b
The `stats.var` function is called iff `sample_weight=None`, in which case the variable `ss_res` is also normalized by `N` through `mean`, so I think `ddof` should be 0 in this first case.
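The argument can be checked with a small sketch. This is not the scikit-fda scoring code (the referenced lines are not reproduced in this thread); `r2_score_sketch` is a hypothetical helper for scalar responses, and the data are made up. It assumes the residual term is averaged with `mean`, as described above, and shows that only `ddof=0` in the variance recovers the classical ratio of sums of squares.

```python
import numpy as np


def r2_score_sketch(y_true, y_pred, ddof=0):
    """Illustrative R^2 for scalar responses (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # The residual term is averaged with mean, i.e. divided by N.
    ss_res = np.mean((y_true - y_pred) ** 2)
    # The total term needs the same denominator for the N factors to cancel.
    ss_tot = np.var(y_true, ddof=ddof)
    return 1.0 - ss_res / ss_tot


# Made-up data for illustration.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

# Classical definition: 1 - sum(residuals^2) / sum((y - mean(y))^2).
r2_reference = 1.0 - (
    np.sum((y_true - y_pred) ** 2)
    / np.sum((y_true - y_true.mean()) ** 2)
)

r2_ddof0 = r2_score_sketch(y_true, y_pred, ddof=0)
r2_ddof1 = r2_score_sketch(y_true, y_pred, ddof=1)
print(r2_reference, r2_ddof0, r2_ddof1)
```

With `ddof=1` the denominator grows by `N / (N - 1)` while the numerator does not, so the score is inflated, which is the kind of shift one would expect to break exact-value tests.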
Second case
scikit-fda/skfda/ml/clustering/_kmeans.py
Line 150 in 415796b
Here the variance is used only to calculate the tolerance used by the K-Means algorithm to check for convergence:
scikit-fda/skfda/ml/clustering/_kmeans.py
Lines 149 to 153 in 415796b
I believe that `ddof=0` can be acceptable here, but I do not have any arguments against `ddof=1` other than the fact that the tests were built taking into account the previous definition of tolerance, which used the variance with `ddof=0`.
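As a rough illustration of how little the choice matters here, the following sketch assumes the tolerance is obtained by scaling a relative `tol` by the mean variance of the data, which is my reading of the description above and not a copy of the scikit-fda code. `tolerance_sketch` is a hypothetical helper, and the data are random.

```python
import numpy as np


def tolerance_sketch(data, tol, ddof=0):
    """Hypothetical helper: scale a relative tolerance by the mean variance."""
    # Variance of each column across samples, then averaged over columns.
    return np.mean(np.var(data, axis=0, ddof=ddof)) * tol


rng = np.random.default_rng(0)
n_samples = 50
data = rng.normal(size=(n_samples, 10))  # 50 samples, 10 evaluation points

tol_ddof0 = tolerance_sketch(data, tol=1e-4, ddof=0)
tol_ddof1 = tolerance_sketch(data, tol=1e-4, ddof=1)

# The two tolerances differ only by the factor N / (N - 1).
print(tol_ddof1 / tol_ddof0, n_samples / (n_samples - 1))
```

Under that assumption the convergence threshold changes by a factor of `N / (N - 1)`, so for moderately large samples the effect is small, although it can still change the number of iterations and hence the exact values checked in tests.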
from scikit-fda.
I think that your analysis is correct.
from scikit-fda.