Comments (5)

andersy005 commented on July 18, 2024

Thinking about it a moment more, @matt-long, shouldn't we also check that the shapes of weights and x are compatible before doing any computation, so that a mismatch is caught early?

from esmlab.

matt-long commented on July 18, 2024

@andersy005, do you have a proposed check? I think x must have all of the dimensions in avg_over_dims_v (or its equivalent) for the computation to be valid; put another way, x must have all of weights.dims, but not the other way around.
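A minimal sketch of what such a check could look like, assuming both x and weights are xarray DataArrays. The helper name check_weights_dims is hypothetical and not part of esmlab:

```python
import numpy as np
import xarray as xr


def check_weights_dims(x, weights):
    """Hypothetical helper: raise if weights cannot be applied to x."""
    # Every dimension of weights must also be a dimension of x.
    missing = [d for d in weights.dims if d not in x.dims]
    if missing:
        raise ValueError(f"weights has dimensions not found in x: {missing}")
    # Shared dimensions must agree in length.
    for d in weights.dims:
        if weights.sizes[d] != x.sizes[d]:
            raise ValueError(
                f"size mismatch along {d!r}: "
                f"weights has {weights.sizes[d]}, x has {x.sizes[d]}"
            )


x = xr.DataArray(np.zeros((2, 3, 4)), dims=("time", "lat", "lon"))
good = xr.DataArray(np.ones((3, 4)), dims=("lat", "lon"))
bad = xr.DataArray(np.ones(5), dims=("depth",))

check_weights_dims(x, good)  # passes: x has every dimension of weights
```

The check is deliberately one-directional: x may carry extra dimensions (here, time) that weights does not.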

andersy005 commented on July 18, 2024

So far, I have this:

def _get_weights_and_dims(x, weights=None, dim=None):
    if dim and isinstance(dim, str):
        dims = [dim]
    elif isinstance(dim, list):
        dims = dim
    else:
        dims = [k for k in x.dims]

    op_over_dims = [k for k in dims if k in x.dims]
    if not op_over_dims:
        raise ValueError("Unexpected dimensions for variable {0}".format(x.name))

    dims_shape = tuple(i for i in x[op_over_dims].dims.values())
    if weights is None:
        weights = xr.DataArray(np.ones(dims_shape), dims=op_over_dims)
    else:
        w = np.array(weights)
        assert w.shape == dims_shape
        weights = xr.DataArray(w, dims=op_over_dims)

    return weights, op_over_dims

def weighted_sum(x, weights=None, dim=None):
    if weights is None:
        warn("Computing sum with equal weights for all data points")

    weights, op_over_dims = _get_weights_and_dims(x, weights, dim)
    x_w_sum = (x * weights).sum(op_over_dims)

    original_attrs, original_encoding = get_original_attrs(x)
    return update_attrs(x_w_sum, original_attrs, original_encoding)

This line

dims_shape = tuple(i for i in x[op_over_dims].dims.values())

doesn't work on a DataArray; it only works on a Dataset.

andersy005 commented on July 18, 2024

Let me know if you have a solution for me :)

matt-long commented on July 18, 2024

So I think you can use something like this.

dims_shape = tuple(l for i, l in enumerate(x.shape) if x.dims[i] in op_over_dims)
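As a quick illustration of that expression on a DataArray, here is a small sketch with made-up dimension names and sizes. It also shows a name-based variant using DataArray.sizes, which is an alternative and not something proposed in this thread:

```python
import numpy as np
import xarray as xr

# Toy DataArray; the dimension names and sizes are illustrative only.
x = xr.DataArray(np.zeros((2, 3, 4)), dims=("time", "lat", "lon"))
op_over_dims = ["lat", "lon"]

# Expression from the comment above: lengths follow the order of x.dims.
dims_shape = tuple(l for i, l in enumerate(x.shape) if x.dims[i] in op_over_dims)
print(dims_shape)  # (3, 4)

# Alternative: look sizes up by name, so lengths follow the order of op_over_dims.
dims_shape_by_name = tuple(x.sizes[d] for d in op_over_dims)
print(dims_shape_by_name)  # (3, 4)
```

The two agree here, but they can differ when op_over_dims lists dimensions in a different order than x.dims. Since the weights array is later built with dims=op_over_dims, the name-based variant may be the safer choice in that case.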

I would suggest changing this

    ...
    else:
        w = np.array(weights)
        assert w.shape == dims_shape
        weights = xr.DataArray(w, dims=op_over_dims)

to

    ...
    else:
        assert weights.shape == dims_shape

I don't see any reason to convert to numpy and then back.

There is one issue to be aware of, however. We currently have the following:

def _apply_nan_mask(x, weights, avg_over_dims_v):
    weights = weights.where(x.notnull())
    ...

which is used in weighted_mean and related functions. This step effectively broadcasts weights against x, adding to weights every dimension of x that it does not already have. I think this has desirable effects; for instance, when averaging data whose missing values vary in time, we are doing the right thing.
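To make the effect concrete, here is a small sketch with toy data (the values and dimension names are assumptions for illustration, and this is not the esmlab implementation). Masking expands the 1-D weights to the dimensions of x, and the masked weights give the expected mean where x has a missing value:

```python
import numpy as np
import xarray as xr

# x has a missing value at time=0, lat=1; weights vary along lat only.
x = xr.DataArray(np.array([[1.0, np.nan], [3.0, 4.0]]), dims=("time", "lat"))
weights = xr.DataArray(np.array([1.0, 3.0]), dims=("lat",))

# Same operation as in _apply_nan_mask: weights picks up the "time" dimension.
masked = weights.where(x.notnull())
print(sorted(masked.dims))  # ['lat', 'time']

# Weighted mean over lat, normalizing by the masked weights.
mean_masked = (x * masked).sum("lat") / masked.sum("lat")

# Without the mask, the weight of the missing point still enters the denominator.
mean_unmasked = (x * weights).sum("lat") / weights.sum("lat")

print(mean_masked.values)    # [1.   3.75]
print(mean_unmasked.values)  # [0.25 3.75]
```

At time=0 only one valid point remains, so the masked mean equals that point (1.0), while the unmasked version divides by the full weight sum and returns 0.25.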

One possibility is to fold _apply_nan_mask into your new _get_weights_and_dims function, so we can ensure that the masking happens last.
