Data Tools

Contains a collection of functions for working with RAMS data in Python.

class pyrams.data_tools.DataInfo(variable, longname, unit)[source]

Deprecated. Please use DataVar()

A class to handle model data

variable

The name of the variable as found in the data files (e.g. “RTP”)

Type

str

longname

The long name of the variable (e.g. “Total Water Mixing Ratio”)

Type

str

unit

The unit of the variable (e.g. “kg/kg”)

Type

str

data

The data array for the variable

Type

numpy.ndarray

get_data(datadir, simulation)[source]
Parameters
  • datadir (str) – The path of the data files

  • simulation (str) – Name of the subfolder that the data is found in (e.g. “feb2014_control”)

Returns

data – The data for the desired variable

Return type

numpy.ndarray

class pyrams.data_tools.DataVar(varname, longname=None, unit=None)[source]

A class to manage a variable, its names, and its unit (replaces DataInfo)

Parameters
  • varname (str) – The variable name as found in the data files (e.g. “RTP”)

  • longname (str) – Optional. The long name of the variable (e.g. “Total Water Mixing Ratio”)

  • unit (str) – Optional. The unit of the variable (e.g. “kg/kg”)

varname

The variable name as found in the data files (e.g. “RTP”)

Type

str

longname

The long name of the variable (e.g. “Total Water Mixing Ratio”)

Type

str

unit

The unit of the variable (e.g. “kg/kg”)

Type

str

data

The data for the variable

Type

numpy.ndarray

get_data(flist)[source]

Pulls data from a list of files (flist) and puts it into a single array

Parameters

flist (list) – A list of sorted file paths

pyrams.data_tools.build_mfdataset(path, **kwargs)[source]

Builds an xarray dataset with a time dimension

Parameters
  • path (string) – Path to folder containing files

  • **kwargs – Additional arguments to pass to xarray

pyrams.data_tools.calc_height(topt, ztn)[source]

Calculates the height of each grid box

Parameters
  • topt (numpy.ndarray) – The 2-D topographic height information

  • ztn (numpy.ndarray) – The ztn variable from the *head.txt files output from RAMS

Returns

z – A 3-D array of the heights of each gridbox

Return type

numpy.ndarray
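
As an illustration of how such a height array can be built, the following is a minimal sketch and not the pyrams implementation. It assumes the usual RAMS sigma-z terrain-following coordinate and assumes that the model top equals the last ztn value; the function name terrain_following_heights is hypothetical.

```python
import numpy as np

def terrain_following_heights(topt, ztn):
    """Sketch: height of each grid box above sea level (sigma-z assumption)."""
    topt = np.asarray(topt, dtype=float)   # 2-D topography, shape (ny, nx)
    ztn = np.asarray(ztn, dtype=float)     # 1-D model levels, shape (nz,)
    ztop = ztn[-1]                         # assumption: model top is the last ztn level
    # Broadcast (nz, 1, 1) against (1, ny, nx) to obtain shape (nz, ny, nx)
    return topt[None, :, :] + ztn[:, None, None] * (1.0 - topt[None, :, :] / ztop)

# Hypothetical inputs: a 2x2 topography and four model levels
topt = np.array([[0.0, 500.0], [1000.0, 250.0]])
ztn = np.array([0.0, 1000.0, 5000.0, 20000.0])
z = terrain_following_heights(topt, ztn)
print(z.shape)
```

Under these assumptions the levels are compressed over high terrain and match ztn exactly where the topography is zero.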

pyrams.data_tools.domain_mean_netcdf(ds_with_metadata, outfile, vars=None)[source]

Writes the x/y domain average of an xarray dataset to outfile as NetCDF.

Parameters
  • ds_with_metadata (xr.Dataset) – An xarray dataset created with pyrams.datatools.create_xr_dataset()

  • outfile (str) – Name of output file

  • vars (list (optional)) – List of variable names to write. Default is to process and write all variables
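
The averaging step itself can be pictured with plain NumPy. This sketch assumes a field with dimensions ordered (time, z, y, x), which is an assumption made for illustration; it does not reproduce the metadata handling or the file writing done by the function.

```python
import numpy as np

# Hypothetical field with dimensions (time, z, y, x)
field = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)

# Average over the last two axes (y and x), leaving a (time, z) profile
domain_mean = field.mean(axis=(-2, -1))
print(domain_mean.shape)
```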

pyrams.data_tools.fix_duplicate_dims(ds, duped_dims, phony_dim)[source]

Fixes duplicate dimensions (which often occur when the domain has the same number of x and y gridpoints), for use with xarray.open_mfdataset and xarray.combine_nested.

Parameters
  • ds (xarray.Dataset) – The dataset to be fixed

  • duped_dims (list of str) – List of dimensions that are duplicated, in order (e.g. [‘y’, ‘x’])

  • phony_dim (string) – Name of duplicate dimension in ds, often ‘phony_dim_0’

Returns

ds_new – New dataset with fixed dimension names

Return type

xarray.Dataset
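
The renaming logic can be sketched without xarray by working on a variable's tuple of dimension names. This is an illustrative assumption about the approach, not the library's code; rename_duplicate_dims is a hypothetical helper, and it only renames when the phony dimension appears exactly as many times as there are replacement names.

```python
def rename_duplicate_dims(dims, duped_dims, phony_dim):
    """Sketch: replace repeated occurrences of phony_dim with duped_dims, in order."""
    if dims.count(phony_dim) != len(duped_dims):
        # Not a duplicated case we know how to resolve: leave the names untouched
        return tuple(dims)
    replacements = iter(duped_dims)
    return tuple(next(replacements) if d == phony_dim else d for d in dims)

# Hypothetical 3-D variable whose y and x dimensions share one phony name
fixed = rename_duplicate_dims(
    ("phony_dim_1", "phony_dim_0", "phony_dim_0"), ["y", "x"], "phony_dim_0"
)
print(fixed)
```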

pyrams.data_tools.flist_to_times(flist)[source]

Creates a list of datetimes from a list of RAMS output files.

Function uses regex to find the pattern “YYYY-mm-dd-HHMMSS” in the file path and converts to a np.datetime64 object.

Parameters

flist (list) – A list of files

Returns

times – A list of times in np.datetime64 format.

Return type

list
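
The regex approach described above can be sketched as follows. This is a minimal illustration and not the library's source; the helper name paths_to_times and the example file paths are hypothetical.

```python
import re
from datetime import datetime

import numpy as np

# Matches the "YYYY-mm-dd-HHMMSS" pattern anywhere in a path
PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}-\d{6}")

def paths_to_times(flist):
    """Sketch: extract one np.datetime64 per file path."""
    times = []
    for path in flist:
        match = PATTERN.search(path)
        if match is None:
            raise ValueError(f"No timestamp found in {path!r}")
        parsed = datetime.strptime(match.group(0), "%Y-%m-%d-%H%M%S")
        times.append(np.datetime64(parsed))
    return times

# Hypothetical file names following the timestamp pattern
flist = [
    "/data/run/a-A-2014-02-07-120000-g1.h5",
    "/data/run/a-A-2014-02-07-123000-g1.h5",
]
times = paths_to_times(flist)
print(times)
```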

pyrams.data_tools.habit_count(habits, tmax)[source]

Takes 3D habit data and tmax (number of time steps) and returns the number of each habit at each time step.
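
One way to picture this counting is the sketch below. It assumes that the first axis of habits is time and that habits are stored as integer codes in the range 0 to n_habits - 1; both are assumptions, and count_habits and n_habits are hypothetical names that are not part of pyrams.

```python
import numpy as np

def count_habits(habits, tmax, n_habits):
    """Sketch: number of grid points holding each habit code at each time step."""
    counts = np.zeros((tmax, n_habits), dtype=int)
    for t in range(tmax):
        # Flatten the spatial slice and count occurrences of each integer code
        counts[t] = np.bincount(habits[t].ravel(), minlength=n_habits)
    return counts

# Hypothetical data: 2 time steps on a 2x2 grid with habit codes 0, 1, 2
habits = np.array([[[0, 1], [1, 2]],
                   [[2, 2], [2, 0]]])
counts = count_habits(habits, tmax=2, n_habits=3)
print(counts)
```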

pyrams.data_tools.press_level(pressure, heights, plevels, no_time=False)[source]

Calculates geopotential heights at the given pressure levels

Parameters
  • pressure (numpy.ndarray) – The 3-D pressure field (a time dimension is assumed; set no_time=True if there is none)

  • heights (numpy.ndarray) – The 3-D array of gridbox heights

  • plevels (list) – List of pressure levels to interpolate to

  • no_time (bool) – Optional. Set to True to indicate that there is no time dimension. Defaults to False.

Returns

press_height – The geopotential heights at the specified pressure levels

Return type

numpy.ndarray
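
For a single vertical column, interpolating heights to pressure levels might look like the sketch below. Interpolating linearly in log(pressure) is an assumption chosen for illustration, and the library may use a different method; heights_at_pressure is a hypothetical helper.

```python
import numpy as np

def heights_at_pressure(pressure_col, height_col, plevels):
    """Sketch for one column: interpolate height linearly in log(pressure)."""
    logp = np.log(np.asarray(pressure_col, dtype=float))
    z = np.asarray(height_col, dtype=float)
    # np.interp needs increasing x values; pressure decreases with height,
    # so both arrays are reversed. Levels outside the column's pressure
    # range are clamped to the end values by np.interp.
    return np.interp(np.log(plevels), logp[::-1], z[::-1])

# Hypothetical column: pressure in hPa, heights in metres, bottom to top
pressure_col = [1000.0, 850.0, 700.0, 500.0]
height_col = [100.0, 1500.0, 3000.0, 5500.0]
press_height = heights_at_pressure(pressure_col, height_col, [850.0, 775.0, 700.0])
print(press_height)
```

Applying this to a full field would mean repeating it for every column and, when present, every time step.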

pyrams.data_tools.rewrite_to_netcdf(flist, output_path, duped_dims, phony_dim, prefix='dimfix', single_file=False, compression_level=None)[source]

Rewrites RAMS standard output files as netCDF4 with fixed dimension data, using data_tools.fix_duplicate_dims()

Parameters
  • flist (list of str) – A list of file paths (recommend using sorted(glob.glob('/path/to/files/*g1.h5')) or similar)

  • output_path (str) – Path where new files will be written

  • duped_dims (list of str) – List of dimensions that are duplicated, in order (e.g. ['y', 'x'])

  • phony_dim (string) – Name of duplicate dimension in ds, often phony_dim_0

  • prefix (string) – Prefix for output files, defaults to dimfix

  • single_file (bool (optional)) – If True, will combine all files into a single file with name <prefix>.nc. Defaults to False.

  • compression_level (int) – If specified, data will be compressed at the given level (valid values are 0-9). Defaults to None

pyrams.data_tools.z_levels_2d(ztn, topt)[source]

Calculates the gridbox heights for a 2-D grid

Parameters
  • ztn (list) – List of ztn values from RAMS *head.txt output

  • topt (numpy.ndarray) – 1-D array of topography height values

Returns

zheight – 2-D array of gridbox heights

Return type

numpy.ndarray

pyrams.data_tools.z_levels_3d(ztn, topt)[source]

Calculates the gridbox heights for a 3-D grid

Parameters
  • ztn (list) – List of ztn values from RAMS *head.txt output

  • topt (numpy.ndarray) – 2-D array of topography height values

Returns

zheight – 3-D array of gridbox heights

Return type

numpy.ndarray