(complex-output)=
# Handling complex output

We've seen how to use `apply_ufunc` to handle relatively simple functions that transform every element, or reduce along a single dimension.

This lesson shows how to handle three kinds of more complex output:
1. Adding a new dimension by specifying `output_core_dims`
1. Changing the size of an existing dimension by specifying `exclude_dims` in addition to `output_core_dims`
1. Returning multiple arrays from the applied function, with one entry in `output_core_dims` per returned array


## Introduction

A good example of a function that returns relatively complex output is numpy's 1D interpolate function `numpy.interp`:

```
    Signature: np.interp(x, xp, fp, left=None, right=None, period=None)
    Docstring:
        One-dimensional linear interpolation.

    Returns the one-dimensional piecewise linear interpolant to a function
    with given discrete data points (`xp`, `fp`), evaluated at `x`.
```

This function expects a 1D array as input, and returns a 1D array as output. That is, `numpy.interp` has one core dimension.
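
As a minimal sketch of that behavior, the following uses made-up sample points (the arrays `xp`, `fp`, and `x` are illustrative and not part of the tutorial dataset) to show that the length of the output follows `x`, not `xp`:

```python
import numpy as np

# Illustrative data: three known points on the line fp = 10 * xp
xp = np.array([0.0, 1.0, 2.0])
fp = np.array([0.0, 10.0, 20.0])

# Five locations at which to evaluate the interpolant
x = np.linspace(0.0, 2.0, 5)

result = np.interp(x, xp, fp)
print(result.shape)  # 1D output with the same length as x
```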


```{tip}
We'll reduce the length of error messages using `%xmode minimal`. See the [ipython documentation](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-xmode) for details.
```


In [None]:
%xmode minimal

import xarray as xr
import numpy as np

np.set_printoptions(threshold=10, edgeitems=2)
xr.set_options(display_expand_data=False)

air = (
    xr.tutorial.load_dataset("air_temperature")
    .air.sortby("lat")  # np.interp needs coordinate in ascending order
    .isel(time=0, lon=0)  # choose a 1D subset
)
air

In [None]:
# Our goal is to densify from 25 to 100 coordinate values:
newlat = np.linspace(15, 75, 100)
np.interp(newlat, air.lat.data, air.data)

(interp-add-new-dim)=
## Adding a new dimension

1D interpolation transforms the size of the input along a single dimension.

Logically, we can think of this as removing the old dimension and adding a new dimension.

We provide this information to `apply_ufunc` using the `output_core_dims` keyword argument:

```
   output_core_dims : List[tuple], optional
        List of the same length as the number of output arguments from
        ``func``, giving the list of core dimensions on each output that were
        not broadcast on the inputs. By default, we assume that ``func``
        outputs exactly one array, with axes corresponding to each broadcast
        dimension.

        Core dimensions are assumed to appear as the last dimensions of each
        output in the provided order.
```

For `interp` we expect one returned output with one new core dimension that we will call `"lat_interp"`.

Specify this using `output_core_dims=[["lat_interp"]]`.

In [None]:
newlat = np.linspace(15, 75, 100)

xr.apply_ufunc(
    np.interp,  # function to apply
    newlat,  # 1st input to np.interp
    air.lat,  # 2nd input to np.interp
    air,  # 3rd input to np.interp
    input_core_dims=[["lat_interp"], ["lat"], ["lat"]],  # one entry per function input, 3 in total!
    output_core_dims=[["lat_interp"]],
)
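
Because `newlat` is passed as a plain NumPy array, the new `"lat_interp"` dimension should come back without coordinate values. The sketch below, which uses a small synthetic array (`da` is a stand-in for `air`, not the tutorial data), shows one way to attach them afterwards with `assign_coords`:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for `air`: values are a linear function of latitude
lat = np.linspace(15.0, 75.0, 25)
da = xr.DataArray(2.0 * lat, dims="lat", coords={"lat": lat})

newlat = np.linspace(15.0, 75.0, 100)

interped = xr.apply_ufunc(
    np.interp,
    newlat,
    da.lat,
    da,
    input_core_dims=[["lat_interp"], ["lat"], ["lat"]],
    output_core_dims=[["lat_interp"]],
)

# Label the new dimension with the locations we interpolated to
interped = interped.assign_coords(lat_interp=newlat)
print(interped.dims, interped.sizes["lat_interp"])
```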

````{exercise}
:label: newdim

Apply the following function using `apply_ufunc`. It adds a new dimension to the input array, let's call it `newdim`. Specify the new dimension using `output_core_dims`. Do you need any `input_core_dims`?

```python
def add_new_dim(array):
    return np.expand_dims(array, axis=-1)
```
````

````{solution} newdim
:class: dropdown

```python
def add_new_dim(array):
    return np.expand_dims(array, axis=-1)


xr.apply_ufunc(
    add_new_dim,
    air,
    output_core_dims=[["newdim"]],
)
```
````

(complex-output-change-size)=
## Dimensions that change size

Imagine that you want the output to keep the same dimension name `"lat"`, i.e. applying `np.interp` changes the size of the `"lat"` dimension.

We get an error if we specify `"lat"` in `output_core_dims`:

In [None]:
newlat = np.linspace(15, 75, 100)

xr.apply_ufunc(
    np.interp,  # first the function
    newlat,
    air.lat,
    air,
    input_core_dims=[["lat"], ["lat"], ["lat"]],
    output_core_dims=[["lat"]],
)

As the error message points out,
```
Only dimensions specified in ``exclude_dims`` with xarray.apply_ufunc are allowed to change size.
```

Looking at the docstring, we need to specify `exclude_dims` as a set:

```
exclude_dims : set, optional
        Core dimensions on the inputs to exclude from alignment and
        broadcasting entirely. Any input coordinates along these dimensions
        will be dropped. Each excluded dimension must also appear in
        ``input_core_dims`` for at least one argument. Only dimensions listed
        here are allowed to change size between input and output objects.
```


In [None]:
newlat = np.linspace(15, 75, 100)

xr.apply_ufunc(
    np.interp,  # first the function
    newlat,
    air.lat,
    air,
    input_core_dims=[["lat"], ["lat"], ["lat"]],
    output_core_dims=[["lat"]],
    exclude_dims={"lat"},
)
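
As the docstring notes, input coordinates along excluded dimensions are dropped, so the resized `"lat"` dimension has no coordinate values on the result. A minimal sketch with synthetic data (`da` is an illustrative stand-in for `air`) that checks this and re-attaches the new latitudes:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for `air`: values are a linear function of latitude
lat = np.linspace(15.0, 75.0, 25)
da = xr.DataArray(2.0 * lat, dims="lat", coords={"lat": lat})

newlat = np.linspace(15.0, 75.0, 100)

resized = xr.apply_ufunc(
    np.interp,
    newlat,
    da.lat,
    da,
    input_core_dims=[["lat"], ["lat"], ["lat"]],
    output_core_dims=[["lat"]],
    exclude_dims={"lat"},
)

# The old "lat" coordinate was dropped along with the excluded dimension
has_lat_coord = "lat" in resized.coords

# Attach the interpolation locations as the new "lat" coordinate
resized = resized.assign_coords(lat=newlat)
print(has_lat_coord, resized.sizes["lat"])
```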

## Returning multiple variables

Another common, but more complex, case is to handle multiple outputs returned by the function.

As an example we will write a function that returns the minimum and maximum value along the last axis of the array.

We will work with a 2D array, and apply the function `minmax` along the `"lat"` dimension:
```python
def minmax(array):
    return array.min(axis=-1), array.max(axis=-1)
```

In [None]:
def minmax(array):
    return array.min(axis=-1), array.max(axis=-1)


air2d = xr.tutorial.load_dataset("air_temperature").air.isel(time=0)
air2d

By default, Xarray assumes one array is returned by the applied function.

Here we have two returned arrays, and the input core dimension `"lat"` is removed (or reduced over).

So we provide `output_core_dims=[[], []]` i.e. an empty list of core dimensions for each of the two returned arrays.

In [None]:
minda, maxda = xr.apply_ufunc(
    minmax,
    air2d,
    input_core_dims=[["lat"]],
    output_core_dims=[[], []],
)
minda
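
To make the result easier to check by hand, here is a small sketch on a synthetic 3x4 array (the data and coordinate values are invented for illustration). Since `"lat"` is the core dimension, it is moved to the last axis and reduced, leaving only `"lon"` on both outputs:

```python
import numpy as np
import xarray as xr


def minmax(array):
    return array.min(axis=-1), array.max(axis=-1)


# Synthetic data: 3 latitudes x 4 longitudes, values 0..11
small = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=("lat", "lon"),
    coords={"lat": [10, 20, 30], "lon": [0, 1, 2, 3]},
)

small_min, small_max = xr.apply_ufunc(
    minmax,
    small,
    input_core_dims=[["lat"]],
    output_core_dims=[[], []],  # one (empty) entry per returned array
)
print(small_min.dims, small_min.values, small_max.values)
```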

````{exercise}
:label: generalize

We presented the concept of "core dimensions" as the "smallest unit of data the function could handle." Do you understand how the above use of `apply_ufunc` generalizes to an array with more than one dimension? 

Try applying the `minmax` function to a 3D air temperature dataset:
```python
air3d = xr.tutorial.load_dataset("air_temperature").air
```
Your goal is to have a minimum and maximum value of temperature across all latitudes for a given time and longitude.
````

````{solution} generalize
:class: dropdown

We always want `minmax` to compute the minimum and maximum along the `"lat"` dimension, regardless of how many dimensions the input has, so we specify `input_core_dims=[["lat"]]`. The output does not contain the `"lat"` dimension, but we expect two returned variables. We therefore pass an empty list `[]` for each returned array: `output_core_dims=[[], []]`, just as before.


```python
minda, maxda = xr.apply_ufunc(
    minmax,
    air3d,
    input_core_dims=[["lat"]],
    output_core_dims=[[], []],
)
```
````