Example with ARGO data¶
We first need to download some ARGO data for example.
[1]:
import requests
url = "https://www.ncei.noaa.gov/data/oceans/argo/gadr/data/coriolis/69022/nodc_69022_prof.nc"
r = requests.get(url, allow_redirects=True)
with open("ARGO_example.nc", "wb") as f:
f.write(r.content)
[2]:
# ruff: noqa: F401
# ruff: noqa: I001
import xarray as xr
import cf_xarray
import gsw_xarray as gsw
[3]:
ds = xr.open_dataset("ARGO_example.nc")
[4]:
ds
[4]:
<xarray.Dataset> Size: 656kB
Dimensions: (n_prof: 130, n_param: 3, n_levels: 56,
n_calib: 1, n_history: 0)
Dimensions without coordinates: n_prof, n_param, n_levels, n_calib, n_history
Data variables: (12/65)
data_type object 8B ...
format_version object 8B ...
handbook_version object 8B ...
reference_date_time object 8B ...
date_creation object 8B ...
date_update object 8B ...
... ...
history_parameter (n_history, n_prof) object 0B ...
history_start_pres (n_history, n_prof) float32 0B ...
history_stop_pres (n_history, n_prof) float32 0B ...
history_previous_value (n_history, n_prof) float32 0B ...
history_qctest (n_history, n_prof) object 0B ...
crs int32 4B ...
Attributes: (12/49)
title: Argo float vertical profile
institution: FR GDAC
source: Argo float
history: 2019-10-20T09:53:12Z boyer convAGDAC.f90...
references: https://www.nodc.noaa.gov/argo/
user_manual_version: 3.1
... ...
time_coverage_end: 2005-09-19T02:13:46Z
time_coverage_duration: point
time_coverage_resolution: point
gadr_ConventionVersion: GADR-3.0
gadr_program: convAGDAC.f90
gadr_programVersion: 1.2We can rely on cf-xarray to see what variables have standard names in our dataset:
[5]:
ds.cf
[5]:
Coordinates:
CF Axes: X, Y, Z, T: n/a
CF Coordinates: longitude, latitude, vertical, time: n/a
Cell Measures: area, volume: n/a
Standard Names: n/a
Bounds: n/a
Grid Mappings: n/a
Data Variables:
Cell Measures: area, volume: n/a
Standard Names: latitude: ['latitude']
longitude: ['longitude']
sea_water_pressure: ['pres', 'pres_adjusted']
sea_water_salinity: ['psal', 'psal_adjusted']
sea_water_temperature: ['temp', 'temp_adjusted']
time: ['juld']
Bounds: n/a
Grid Mappings: latitude_longitude: ['crs']
The dataset contains multiple time the same variable (e.g. ‘pres_adjusted’ and ‘pres’ both have the standard name ‘sea_water_pressure’). For the accessor to work, only 1 variable or each standard name must be present, explicitely stated when calling the function, or the gsw option set_cf_name_preference must be set. For this example we will only retain the adjusted variables. We set the global option, but we could also use context, i.e.
[6]:
with gsw.set_cf_name_preference(
sea_water_pressure="pres_adjusted",
sea_water_practical_salinity="psal_adjusted",
sea_water_temperature="temp_adjusted",
):
# do the computation
pass
gsw.set_cf_name_preference(
sea_water_pressure="pres_adjusted",
sea_water_practical_salinity="psal_adjusted",
sea_water_temperature="temp_adjusted",
)
# We can check the options we have set:
gsw.get_options()
[6]:
{'non_cf_name': {},
'cf_name_preference': {'sea_water_pressure': 'pres_adjusted',
'sea_water_practical_salinity': 'psal_adjusted',
'sea_water_temperature': 'temp_adjusted'}}
In the following sections we will demonstrate each features of gsw-xarray. We will focus on computing the potential density anomaly.
Basic usage as drop in replacement of gsw¶
[7]:
help(gsw.sigma0)
Help on function sigma0 in module gsw._wrapped_ufuncs:
sigma0(SA, CT)
Calculates potential density anomaly with reference pressure of 0 dbar,
this being this particular potential density minus 1000 kg/m^3. This
function has inputs of Absolute Salinity and Conservative Temperature.
This function uses the computationally-efficient expression for
specific volume in terms of SA, CT and p (Roquet et al., 2015).
Parameters
----------
SA : array-like
Absolute Salinity, g/kg
CT : array-like
Conservative Temperature (ITS-90), degrees C
Returns
-------
sigma0 : array-like, kg/m^3
potential density anomaly with
respect to a reference pressure of 0 dbar,
that is, this potential density - 1000 kg/m^3.
Notes
-----
Note that this 75-term equation has been fitted in a restricted range of
parameter space, and is most accurate inside the "oceanographic funnel"
described in McDougall et al. (2003). The GSW library function
"gsw_infunnel(SA,CT,p)" is available to be used if one wants to test if
some of one's data lies outside this "funnel".
References
----------
IOC, SCOR and IAPSO, 2010: The international thermodynamic equation of
seawater - 2010: Calculation and use of thermodynamic properties.
Intergovernmental Oceanographic Commission, Manuals and Guides No. 56,
UNESCO (English), 196 pp. Available from https://www.teos-10.org/
See Eqn. (A.30.1) of this TEOS-10 Manual.
McDougall, T.J., D.R. Jackett, D.G. Wright and R. Feistel, 2003:
Accurate and computationally efficient algorithms for potential
temperature and density of seawater. J. Atmosph. Ocean. Tech., 20,
pp. 730-741.
Roquet, F., G. Madec, T.J. McDougall, P.M. Barker, 2015: Accurate
polynomial expressions for the density and specific volume of seawater
using the TEOS-10 standard. Ocean Modelling., 90, pp. 29-43.
We need Absolute Salinity and Conservative Temperature, so 1st we need to do some conversions:
[8]:
SA = gsw.SA_from_SP(
SP=ds.psal_adjusted, p=ds.pres_adjusted, lon=ds.longitude, lat=ds.latitude
)
CT = gsw.CT_from_t(SA=SA, t=ds.temp_adjusted, p=ds.pres_adjusted)
sigma0 = gsw.sigma0(SA=SA, CT=CT)
sigma0
[8]:
<xarray.DataArray 'sigma0' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[26.64758465, 26.64656536, 26.64667857, ..., 27.77078682,
27.77467073, 27.77756609],
[26.48498586, 26.48409899, 26.48432005, ..., 27.76600562,
27.76941237, 27.76895082],
[26.25518807, 26.25449663, 26.25488776, ..., 27.7603578 ,
27.76478937, 27.76924889],
...,
[25.31136543, 25.31674362, 25.42998838, ..., 27.77414777,
27.76859076, nan],
[25.30657435, 25.30409728, 25.29751108, ..., 27.78551733,
27.77778253, nan],
[25.43166004, 25.43155161, 25.43251587, ..., 27.7946225 ,
27.79664872, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_sigma_t
units: kg/m^3Using Pint and pint-xarray to handle units¶
[9]:
import pint_xarray
import cf_xarray.units
/tmp/ipykernel_1440/3142801479.py:2: UserWarning: Import(s) unavailable to set up matplotlib support...skipping this portion of the setup.
import cf_xarray.units
[10]:
ds_pint = ds.pint.quantify()
ds_pint
[10]:
<xarray.Dataset> Size: 656kB
Dimensions: (n_prof: 130, n_param: 3, n_levels: 56,
n_calib: 1, n_history: 0)
Dimensions without coordinates: n_prof, n_param, n_levels, n_calib, n_history
Data variables: (12/65)
data_type object 8B b'Argo profile '
format_version object 8B b'3.1 '
handbook_version object 8B b'1.2 '
reference_date_time object 8B b'19500101000000'
date_creation object 8B b'20040428194950'
date_update object 8B b'20190626110153'
... ...
history_parameter (n_history, n_prof) object 0B
history_start_pres (n_history, n_prof) float32 0B [dbar]
history_stop_pres (n_history, n_prof) float32 0B [dbar]
history_previous_value (n_history, n_prof) float32 0B
history_qctest (n_history, n_prof) object 0B
crs int32 4B -2147483647
Attributes: (12/49)
title: Argo float vertical profile
institution: FR GDAC
source: Argo float
history: 2019-10-20T09:53:12Z boyer convAGDAC.f90...
references: https://www.nodc.noaa.gov/argo/
user_manual_version: 3.1
... ...
time_coverage_end: 2005-09-19T02:13:46Z
time_coverage_duration: point
time_coverage_resolution: point
gadr_ConventionVersion: GADR-3.0
gadr_program: convAGDAC.f90
gadr_programVersion: 1.2We compute again sigma0, using the ds_pint dataset, i.e. variables have a physical dimension
[11]:
SA = gsw.SA_from_SP(
SP=ds_pint.psal_adjusted,
p=ds_pint.pres_adjusted,
lon=ds_pint.longitude,
lat=ds_pint.latitude,
)
CT = gsw.CT_from_t(SA=SA, t=ds_pint.temp_adjusted, p=ds_pint.pres_adjusted)
sigma0 = gsw.sigma0(SA=SA, CT=CT)
sigma0
[11]:
<xarray.DataArray 'sigma0' (n_prof: 130, n_levels: 56)> Size: 58kB
<Quantity([[26.64758465 26.64656536 26.64667857 ... 27.77078682 27.77467073
27.77756609]
[26.48498586 26.48409899 26.48432005 ... 27.76600562 27.76941237
27.76895082]
[26.25518807 26.25449663 26.25488776 ... 27.7603578 27.76478937
27.76924889]
...
[25.31136543 25.31674362 25.42998838 ... 27.77414777 27.76859076
nan]
[25.30657435 25.30409728 25.29751108 ... 27.78551733 27.77778253
nan]
[25.43166004 25.43155161 25.43251587 ... 27.7946225 27.79664872
nan]], 'kilogram / meter ** 3')>
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_sigma_tgsw-xarray converts the units (if necessary) when using pint quantities:
[12]:
# start to convert the pressure into Pascal
pressure_in_pascal = ds_pint.pres_adjusted.pint.to("Pa")
pressure_in_pascal
[12]:
<xarray.DataArray 'pres_adjusted' (n_prof: 130, n_levels: 56)> Size: 29kB
<Quantity([[7.0000e+03 5.8000e+04 1.5700e+05 ... 1.4253e+07 1.4752e+07 1.5013e+07]
[2.7000e+04 5.5000e+04 1.5200e+05 ... 1.4249e+07 1.4751e+07 1.4920e+07]
[3.6000e+04 5.4000e+04 1.5100e+05 ... 1.4249e+07 1.4749e+07 1.5041e+07]
...
[4.5000e+04 1.2900e+05 2.2700e+05 ... 1.4726e+07 1.5011e+07 nan]
[4.8000e+04 1.3100e+05 2.3100e+05 ... 1.4725e+07 1.4979e+07 nan]
[4.5000e+04 1.3200e+05 2.2900e+05 ... 1.4726e+07 1.5021e+07 nan]], 'pascal')>
Dimensions without coordinates: n_prof, n_levels
Attributes:
long_name: Sea water pressure, equals 0 at sea-level
standard_name: sea_water_pressure
valid_min: 0.0
valid_max: 12000.0
C_format: %7.1f
FORTRAN_format: F7.1
resolution: 1.0
axis: ZCompute again density, using the pressure in Pascal. No worries as the conversion to dbar is automatic!
[13]:
SA = gsw.SA_from_SP(
SP=ds_pint.psal_adjusted,
p=pressure_in_pascal,
lon=ds_pint.longitude,
lat=ds_pint.latitude,
)
CT = gsw.CT_from_t(SA=SA, t=ds_pint.temp_adjusted, p=ds_pint.pres_adjusted)
sigma0 = gsw.sigma0(SA=SA, CT=CT)
sigma0
[13]:
<xarray.DataArray 'sigma0' (n_prof: 130, n_levels: 56)> Size: 58kB
<Quantity([[26.64758465 26.64656536 26.64667857 ... 27.77078682 27.77467073
27.77756609]
[26.48498586 26.48409899 26.48432005 ... 27.76600562 27.76941237
27.76895082]
[26.25518807 26.25449663 26.25488776 ... 27.7603578 27.76478937
27.76924889]
...
[25.31136543 25.31674362 25.42998838 ... 27.77414777 27.76859076
nan]
[25.30657435 25.30409728 25.29751108 ... 27.78551733 27.77778253
nan]
[25.43166004 25.43155161 25.43251587 ... 27.7946225 27.79664872
nan]], 'kilogram / meter ** 3')>
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_sigma_t[ ]:
Using the accessor to simplify the workflow¶
Common case¶
gsw-xarray adds ths gsw accessor to datasets. This accessor makes it easy to run the gsw functions on variables from a dataset.
A first solution is to use the accessor exactly as when using gsw:
[14]:
ds.gsw.SA_from_SP(
SP=ds.psal_adjusted, p=ds.pres_adjusted, lon=ds.longitude, lat=ds.latitude
)
[14]:
<xarray.DataArray 'SA' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[35.81755188, 35.81657471, 35.81659823, ..., 35.08564178,
35.08162835, 35.0796239 ],
[35.80757275, 35.8065829 , 35.80559104, ..., 35.09481071,
35.09079742, 35.08778979],
[36.03182228, 36.03082693, 36.0308492 , ..., 35.10502264,
35.1010208 , 35.10101815],
...,
[35.79041046, 35.77693492, 35.76322478, ..., 35.20835127,
35.19176101, nan],
[35.78753964, 35.78651775, 35.77824776, ..., 35.25467841,
35.22770745, nan],
[35.86339942, 35.86237223, 35.85926688, ..., 35.27570263,
35.26426658, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_absolute_salinity
units: g/kgThis is however not very useful… A better option is to simply give the name of the variables from the dataset:
[15]:
ds.gsw.SA_from_SP(
SP="psal_adjusted", p="pres_adjusted", lon="longitude", lat="latitude"
)
[15]:
<xarray.DataArray 'SA' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[35.81755188, 35.81657471, 35.81659823, ..., 35.08564178,
35.08162835, 35.0796239 ],
[35.80757275, 35.8065829 , 35.80559104, ..., 35.09481071,
35.09079742, 35.08778979],
[36.03182228, 36.03082693, 36.0308492 , ..., 35.10502264,
35.1010208 , 35.10101815],
...,
[35.79041046, 35.77693492, 35.76322478, ..., 35.20835127,
35.19176101, nan],
[35.78753964, 35.78651775, 35.77824776, ..., 35.25467841,
35.22770745, nan],
[35.86339942, 35.86237223, 35.85926688, ..., 35.27570263,
35.26426658, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_absolute_salinity
units: g/kgIt is even possible to go 1 step further and rely on the usage of standard name! In this case, you don’t need to provide any argument for the variables with the proper standard name.
With this method, it is way faster to compute the density:
[16]:
# WITHOUT any detection
SA = gsw.SA_from_SP(
SP=ds.psal_adjusted, p=ds.pres_adjusted, lon=ds.longitude, lat=ds.latitude
)
CT = gsw.CT_from_t(SA=SA, t=ds.temp_adjusted, p=ds.pres_adjusted)
sigma0 = gsw.sigma0(SA=SA, CT=CT)
# WITH autodetection
ds = ds.merge(ds.gsw.SA_from_SP())
ds = ds.merge(ds.gsw.CT_from_t())
ds = ds.merge(ds.gsw.sigma0())
You can also use brackets to get either 1 or multiple variables computed:
[17]:
ds.gsw["sigma0"] # Returns a DataArray
ds.gsw[["sigma0"]] # Returns a Dataset
ds.gsw[["sigma0", "alpha", "beta", "sigma1", "rho"]] # With multiple outputs
[17]:
<xarray.Dataset> Size: 291kB
Dimensions: (n_prof: 130, n_levels: 56)
Dimensions without coordinates: n_prof, n_levels
Data variables:
sigma0 (n_prof, n_levels) float64 58kB 26.65 26.65 26.65 ... 27.8 nan
alpha (n_prof, n_levels) float64 58kB 0.0002081 0.0002082 ... nan
beta (n_prof, n_levels) float64 58kB 0.0007438 0.0007437 ... nan
sigma1 (n_prof, n_levels) float64 58kB 31.01 31.01 31.01 ... 32.38 nan
rho (n_prof, n_levels) float64 58kB 1.027e+03 1.027e+03 ... nanOf course any kind of mixture between all the solutions is possible:
[18]:
# Give a value for SP
# Take p from dataset
# Automatically get lon and lat based on standard names
ds.gsw.SA_from_SP(SP=35, p="pres_adjusted")
[18]:
<xarray.DataArray 'SA' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[35.16540604, 35.16543255, 35.16545565, ..., 35.16805541,
35.16807052, 35.16807842],
[35.16541107, 35.16542513, 35.16544852, ..., 35.16805194,
35.16806717, 35.1680723 ],
[35.1654079 , 35.16541651, 35.16543824, ..., 35.16804504,
35.16806022, 35.16806909],
...,
[35.16533263, 35.16534798, 35.16535502, ..., 35.16820029,
35.16823002, nan],
[35.16532498, 35.16533771, 35.16534623, ..., 35.16817466,
35.16820349, nan],
[35.1653088 , 35.16532382, 35.16533425, ..., 35.16819766,
35.16823758, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_absolute_salinity
units: g/kgAnd it is also possible to use automatic discovery of argument with pint datasets:
[19]:
ds_pint.gsw.SA_from_SP()
[19]:
<xarray.DataArray 'SA' (n_prof: 130, n_levels: 56)> Size: 58kB
<Quantity([[35.81755188 35.81657471 35.81659823 ... 35.08564178 35.08162835
35.0796239 ]
[35.80757275 35.8065829 35.80559104 ... 35.09481071 35.09079742
35.08778979]
[36.03182228 36.03082693 36.0308492 ... 35.10502264 35.1010208
35.10101815]
...
[35.79041046 35.77693492 35.76322478 ... 35.20835127 35.19176101
nan]
[35.78753964 35.78651775 35.77824776 ... 35.25467841 35.22770745
nan]
[35.86339942 35.86237223 35.85926688 ... 35.27570263 35.26426658
nan]], 'gram / kilogram')>
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_absolute_salinity[ ]:
Case with some argument without standard name¶
Some functions have argument without any standard name. In this case, it is possible to refer to these arguments using gsw-xarray options.
Let’s take the function gsw.SP_salinometer that has 2 arguments: Rt without standard name, and t the in situ temperature.
A 1st option is to explicitely provide a value or the name from the dataset (we will create some fake data for the purpose of this example):
[20]:
ds["salinometer_Rt"] = 35
[21]:
ds.gsw.SP_salinometer(Rt="salinometer_Rt")
[21]:
<xarray.DataArray 'SP' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[14852.44767448, 14852.37691945, 14852.34156055, ...,
15304.61363721, 15308.13947294, 15310.42597351],
[14827.44354407, 14827.40896543, 14827.51267175, ...,
15298.31168787, 15301.51050946, 15302.44067949],
[14768.33013327, 14768.33013327, 14768.33013327, ...,
15291.27112184, 15294.91511554, 15297.07554699],
...,
[14673.397394 , 14675.12423965, 14689.35099574, ...,
15258.27973762, 15261.63122146, nan],
[14673.09995293, 14672.86208721, 14672.77288815, ...,
15246.89048002, 15252.84600545, nan],
[14680.80248747, 14680.83244267, 14681.16223826, ...,
15243.57738327, 15248.32719727, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_practical_salinity
units: 1
reference_scale: PSS-78A 2nd solution is to use gsw-xarray option with function set_non_cf_name. This way if you need to compute multiple times functions that use these arguments without a standard name, you only need to provide once the mapping.
[22]:
help(gsw.set_non_cf_name)
Help on class set_non_cf_name in module gsw_xarray._options:
class set_non_cf_name(set_options)
| set_non_cf_name(**kwargs)
|
| Set `non_cf_name` options for gsw_xarray in a controlled context.
|
| Parameters
| ----------
| Provide the name in dataset of arguments that don't have a standard name
| e.g. entropy='entropy_name_in_ds', SA_seaice='salinity_of_sea_ice_name_in_ds'
|
| Using `set_non_cf_name` is equivalent to using `set_options`
| with argument 'set_non_cf_name', but is provided as a shorter method.
|
| You can use `set_non_cf_name` either as a context manager (using `with`)
| or to set global options
|
| Method resolution order:
| set_non_cf_name
| set_options
| builtins.object
|
| Methods defined here:
|
| __init__(self, **kwargs)
| Initialize self. See help(type(self)) for accurate signature.
|
| ----------------------------------------------------------------------
| Methods inherited from set_options:
|
| __enter__(self)
|
| __exit__(self, type, value, traceback)
|
| ----------------------------------------------------------------------
| Data descriptors inherited from set_options:
|
| __dict__
| dictionary for instance variables
|
| __weakref__
| list of weak references to the object
Using a context manager:
[23]:
with gsw.set_non_cf_name(Rt="salinometer_Rt"):
ds.gsw.SP_salinometer()
Or globally:
[24]:
gsw.set_non_cf_name(Rt="salinometer_Rt")
ds.gsw.SP_salinometer()
[24]:
<xarray.DataArray 'SP' (n_prof: 130, n_levels: 56)> Size: 58kB
array([[14852.44767448, 14852.37691945, 14852.34156055, ...,
15304.61363721, 15308.13947294, 15310.42597351],
[14827.44354407, 14827.40896543, 14827.51267175, ...,
15298.31168787, 15301.51050946, 15302.44067949],
[14768.33013327, 14768.33013327, 14768.33013327, ...,
15291.27112184, 15294.91511554, 15297.07554699],
...,
[14673.397394 , 14675.12423965, 14689.35099574, ...,
15258.27973762, 15261.63122146, nan],
[14673.09995293, 14672.86208721, 14672.77288815, ...,
15246.89048002, 15252.84600545, nan],
[14680.80248747, 14680.83244267, 14681.16223826, ...,
15243.57738327, 15248.32719727, nan]], shape=(130, 56))
Dimensions without coordinates: n_prof, n_levels
Attributes:
standard_name: sea_water_practical_salinity
units: 1
reference_scale: PSS-78[ ]:
Note on performance¶
It is still very slightly faster to provide the arguments than rely on autodetect, however for large data sets this difference should be negligible compared to the internal computation time of the gsw functions.
[25]:
%%timeit
gsw.SA_from_SP(
SP=ds.psal_adjusted, p=ds.pres_adjusted, lon=ds.longitude, lat=ds.latitude
)
760 μs ± 1.47 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[26]:
%%timeit
ds.gsw.SA_from_SP(
SP="psal_adjusted", p="pres_adjusted", lon="longitude", lat="latitude"
)
807 μs ± 1.17 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[27]:
%%timeit
ds.gsw.SA_from_SP()
817 μs ± 2.75 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Compare with upstream gsw
[28]:
import gsw as gsw_upstream
[29]:
%%timeit
gsw_upstream.SA_from_SP(
SP=ds.psal_adjusted, p=ds.pres_adjusted, lon=ds.longitude, lat=ds.latitude
)
671 μs ± 3.24 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)