Fast vectorization with Numba#
np.vectorize
is a very convenient function but is unfortunately slow. It is only marginally faster than writing a for loop in Python and looping.
A common way to get around this is to write a base interpolation function that can handle nD arrays in a compiled language like C or Fortran and then pass that to apply_ufunc
.
Another option is to use the numba package which provides two very convenient decorators to build numpy universal functions or ufuncs:
vectorize
for functions that act on scalars, andguvectorize
for functions that operates on subsets of the array along core-dimensions. Any decorated function gets compiled and will loop over the loop dimensions in parallel when necessary.
For apply_ufunc
the key concept is that we must provide vectorize=False
(the default) when using Numba vectorized functions.
Numba handles the vectorization (or looping) and apply_ufunc
handles converting Xarray objects to bare arrays and handling metadata.
Load data#
Exception reporting mode: Minimal
<xarray.DataArray (x: 3, y: 4)> Size: 96B array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) Coordinates: * x (x) int64 24B 12 13 14 Dimensions without coordinates: y Attributes: foo: bar
vectorize
#
Our squared_error
example from earlier works element-by-element, and is a great example for vectorize
from numba import vectorize, float64
@vectorize([float64(float64, float64)])
def squared_error(x, y):
return (x - y) ** 2
See the numba documentation to understand @vectorize([float64(float64, float64)])
Now use apply_ufunc
to apply it.
xr.apply_ufunc(squared_error, da, 1)
<xarray.DataArray (x: 3, y: 4)> Size: 96B array([[ 1., 0., 1., 4.], [ 9., 16., 25., 36.], [ 49., 64., 81., 100.]]) Coordinates: * x (x) int64 24B 12 13 14 Dimensions without coordinates: y
guvectorize
#
guvectorize
is for functions that work on small subsets of the data. Quoting the Numba documentation
While
vectorize()
allows you to write ufuncs that work on one element at a time, theguvectorize()
decorator takes the concept one step further and allows you to write ufuncs that will work on an arbitrary number of elements of input arrays, and take and return arrays of differing dimensions. The typical example is a running median or a convolution filter.
This description should remind you of apply_ufunc
!
We will use the example function g
from the numba docs, which adds a scalar y
to a 1D vector x
. The res
argument here will contain the output (this is a Numba detail).
array([2, 3, 4, 5, 6])
Unlike squared_error
we cannot pass an Xarray object to g
directly.
g(da, 1)
NotImplementedError: <ufunc 'g'> not supported: xarray objects do not directly implement generalized ufuncs. Instead, use xarray.apply_ufunc or explicitly convert to xarray objects to NumPy arrays (e.g., with `.values`).
Now use apply_ufunc
.
xr.apply_ufunc(
g,
da,
1,
input_core_dims=[["x"], []],
output_core_dims=[["x"]],
)
<xarray.DataArray (y: 4, x: 3)> Size: 96B array([[ 1, 5, 9], [ 2, 6, 10], [ 3, 7, 11], [ 4, 8, 12]]) Coordinates: * x (x) int64 24B 12 13 14 Dimensions without coordinates: y
Notice the following:
The
guvectorize
decorator includes the concept of “core dimensions”:'(n),()->(n)'
. This string means that theg
takes a 1D vector of sizen
, a scalar, and returns a 1D vector of sizen
. There is one core dimension for the input, and one core dimension for the output. Both core dimensions have the same size.That string translates to
input_core_dims=[["x"], []], output_core_dims=[["x"]]
inapply_ufunc
.We don’t provide
vectorize=True
toapply_ufunc
sincenumba
will handle the vectorization in compiled code automatically.
With dask#
Use the chunked DataArray
da_dask = da.chunk({"y": 1})
da_dask
<xarray.DataArray (x: 3, y: 4)> Size: 96B dask.array<xarray-<this-array>, shape=(3, 4), dtype=int64, chunksize=(3, 1), chunktype=numpy.ndarray> Coordinates: * x (x) int64 24B 12 13 14 Dimensions without coordinates: y Attributes: foo: bar
Exercise
Apply g
to da_dask
Solution
xr.apply_ufunc(
g,
da_dask,
1,
input_core_dims=[["x"], []],
output_core_dims=[["x"]],
dask="parallelized",
)
Next#
For more, see the numpy.interp end-to-end example in the left sidebar.