Weighted Reductions

Weighted Reductions#

Xarray supports weighted reductions.

For demonstration, we will create a “weights” array proportional to cosine of latitude. Modulo a normalization, this is the correct area-weighting factor for data on a regular lat-lon grid.

import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

%config InlineBackend.figure_format='retina'
ds = xr.tutorial.load_dataset("ersstv5")
ds
<xarray.Dataset>
Dimensions:    (lat: 89, lon: 180, time: 624, nbnds: 2)
Coordinates:
  * lat        (lat) float32 88.0 86.0 84.0 82.0 ... -82.0 -84.0 -86.0 -88.0
  * lon        (lon) float32 0.0 2.0 4.0 6.0 8.0 ... 352.0 354.0 356.0 358.0
  * time       (time) datetime64[ns] 1970-01-01 1970-02-01 ... 2021-12-01
Dimensions without coordinates: nbnds
Data variables:
    time_bnds  (time, nbnds) float64 9.969e+36 9.969e+36 ... 9.969e+36 9.969e+36
    sst        (time, lat, lon) float32 -1.8 -1.8 -1.8 -1.8 ... nan nan nan nan
Attributes: (12/37)
    climatology:               Climatology is based on 1971-2000 SST, Xue, Y....
    description:               In situ data: ICOADS2.5 before 2007 and NCEP i...
    keywords_vocabulary:       NASA Global Change Master Directory (GCMD) Sci...
    keywords:                  Earth Science > Oceans > Ocean Temperature > S...
    instrument:                Conventional thermometers
    source_comment:            SSTs were observed by conventional thermometer...
    ...                        ...
    creator_url_original:      https://www.ncei.noaa.gov
    license:                   No constraints on data access or use
    comment:                   SSTs were observed by conventional thermometer...
    summary:                   ERSST.v5 is developed based on v4 after revisi...
    dataset_title:             NOAA Extended Reconstructed SST V5
    data_modified:             2022-06-07
weights = np.cos(np.deg2rad(ds.lat))
display(weights.dims)

weights.plot()
('lat',)
[<matplotlib.lines.Line2D at 0x7f51803b09d0>]
../_images/1cf104813e97db64222189723c5b9c52b46a4a6e2a0dfa642e276c40af22ea90.png

Manual weighting#

Thanks to the automatic broadcasting and alignment discussed earlier, if we multiply this by SST, it “just works,” and the arrays are broadcasted properly:

(ds.sst * weights).dims
('time', 'lat', 'lon')

We could imagine computing the weighted spatial mean of SST manually.

sst_mean = (ds.sst * weights).sum(dim=("lon", "lat")) / weights.sum(dim="lat")
sst_mean.plot()
plt.title("This is wrong!")
Text(0.5, 1.0, 'This is wrong!')
../_images/847c4fdcf59df8828d0732f30219379512a7ccb9c0a2137c25f3e5a919b97fac.png

That would be wrong, however, because the denominator (weights.sum(dim='lat')) needs to be expanded to include the lon dimension and modified to account for the missing values (land points).

The weighted method#

In general, weighted reductions on multidimensional arrays are complicated. To make it a bit easier, Xarray provides a mechanism for weighted reductions.

It does this by creating a special intermediate DataArrayWeighted object, to which different reduction operations can applied.

sst_weighted = ds.sst.weighted(weights)
sst_weighted
DataArrayWeighted with weights along dimensions: lat
Xarrays' computation methods (groupby, groupby_bins, rolling, coarsen, weighted) all return special objects that represent the basic underlying computation pattern. For e.g. `sst_weighted` above is a `DatasetWeighted` object that represents the weighting by `weights` of the data in `ds.sst`. It is usually helpful to save and reuse these objects for multiple operations (e.g. a mean and standard deviation calculation).
sst_weighted.mean(dim=("lon", "lat")).plot()
plt.title("Correct Global Mean SST");
../_images/36979e44a5dc0d0bc026b25c4aad90952660496ad07b52ccc3d0166833c280d2.png

A handful of reductions have been implemented: mean, sum, std, var.