Introducing xCDAT, a new climate data analysis package!

Hi Pangeo Community,

My name is Tom Vo and I am the lead developer for xCDAT (Xarray Climate Data Analysis Tools). xCDAT is an extension of xarray for climate data analysis on structured grids. It serves as a modern successor to the CDAT (Community Data Analysis Tools) library.

The goal of xCDAT is to provide features and utilities for simple and robust analysis of climate data. xCDAT’s design philosophy is focused on reducing the overhead required to accomplish certain tasks in xarray. Some key xCDAT features are inspired by or ported from the core CDAT library, while others leverage powerful libraries in the xarray ecosystem (e.g., xESMF and cf_xarray) to deliver robust APIs.

As xCDAT matures, we are seeking early adopters to help us improve and grow xCDAT! If you have a moment, please check out the repository and view our documentation on Read the Docs. The package is available to install in an Anaconda environment through conda-forge/xcdat.

We’re looking forward to hearing about your experience with xCDAT!

Thank you,

Tom and the xCDAT core team (Stephen Po-Chedley, Jason Boutte, Jill Zhang, Jiwoo Lee)

xCDAT Overview

Available Features

  • Extension of xarray’s open_dataset() and open_mfdataset() with post-processing options
    • Generate bounds for axes supported by xCDAT if they don’t exist in the Dataset
    • Optional selection of a single data variable to keep in the Dataset (bounds are also kept if they exist)
    • Optional decoding of time coordinates
      • In addition to CF time units, also decodes common non-CF time units (“months since …”, “years since …”)
    • Optional centering of time coordinates using time bounds
    • Optional conversion of longitudinal axis orientation between [0, 360) and [-180, 180)
  • Temporal averaging
    • Time series averages (single snapshot and grouped), climatologies, and departures
    • Weighted or unweighted
    • Optional seasonal configuration (e.g., DJF vs. JFD, custom seasons)
  • Geospatial weighted averaging
    • Supports rectilinear grids
    • Optional specification of regional domain
  • Horizontal structured regridding
    • Supports rectilinear and curvilinear grids
    • Python implementation of regrid2 for handling Cartesian latitude-longitude grids
    • API that wraps xESMF
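
To make a few of the features above concrete, here is a minimal plain-Python sketch of three of the underlying ideas: generating bounds from coordinate points, converting longitude orientation, and area-weighted averaging over latitude. It is not xCDAT code and does not use the xCDAT API. The function names are hypothetical, and the midpoint rule for bounds and the sin(latitude) weighting are assumptions chosen for illustration.

```python
import math


def generate_bounds(points):
    """Infer [lower, upper] bounds for a 1-D list of coordinate points.

    Assumption: each interior bound is the midpoint between neighboring
    points, and the two outer bounds mirror the nearest interior bound.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to infer bounds")
    mids = [(a + b) / 2 for a, b in zip(points, points[1:])]
    first = points[0] - (mids[0] - points[0])
    last = points[-1] + (points[-1] - mids[-1])
    edges = [first] + mids + [last]
    return [[lo, hi] for lo, hi in zip(edges, edges[1:])]


def to_180(lon):
    """Map a longitude in degrees onto [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def to_360(lon):
    """Map a longitude in degrees onto [0, 360)."""
    return lon % 360.0


def latitude_weights(lat_bounds):
    """Weight each latitude band by the difference of sin(lat) across its bounds.

    Assumption: this is proportional to the band's area on a sphere.
    """
    return [
        abs(math.sin(math.radians(hi)) - math.sin(math.radians(lo)))
        for lo, hi in lat_bounds
    ]


def weighted_mean(values, weights):
    """Weighted average of values; weights need not be normalized."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Usage with made-up values: three latitude bands and four longitudes.
lats = [-60.0, 0.0, 60.0]
lat_bnds = generate_bounds(lats)
zonal_means = [1.0, 2.0, 3.0]  # hypothetical data, one value per band
global_mean = weighted_mean(zonal_means, latitude_weights(lat_bnds))
lons_180 = [to_180(lon) for lon in [0.0, 90.0, 190.0, 350.0]]
```

The weighting step is why bounds matter: without them, the area each grid cell represents is unknown, so a spatial average cannot be weighted properly.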

Planned Features

  • Vertical structured regridding
    • Supports rectilinear grids

Things we are striving for:

  • xCDAT supports CF-compliant datasets, but will also strive to support datasets with common non-CF-compliant metadata (e.g., time units in “months since …” or “years since …”)
    • xCDAT leverages cf_xarray to interpret CF attributes on xarray objects
  • Robust handling of dimensions and their coordinates and coordinate bounds
    • Coordinate variables are retrieved with cf_xarray using CF axis names or coordinate names found in xarray object attributes
    • Bounds are retrieved with cf_xarray using the “bounds” attribute
    • Ability to operate on both longitudinal axis orientations, [0, 360) and [-180, 180)
  • Support for lazy operations and parallelism using dask where it is both possible and sensible
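
As an illustration of the non-CF time units mentioned above, here is a rough sketch of how offsets in “months since …” or “years since …” could be decoded with the Python standard library. It is not how xCDAT does it. The function name `decode_time` is hypothetical, and the sketch assumes integer offsets, a standard Gregorian calendar, and a reference date written as YYYY-MM-DD.

```python
import calendar
from datetime import datetime


def decode_time(offsets, units):
    """Decode integer offsets for units like 'months since 2000-01-01'.

    Assumptions: only 'months' and 'years' are handled, offsets are whole
    numbers, and the calendar is the standard Gregorian one.
    """
    unit, sep, ref = units.partition(" since ")
    if not sep or unit not in ("months", "years"):
        raise ValueError(f"unsupported units: {units!r}")
    ref_date = datetime.strptime(ref.strip(), "%Y-%m-%d")
    months_per_step = 1 if unit == "months" else 12

    decoded = []
    for offset in offsets:
        # Count months from year 0 of the reference, then split back out.
        total = (ref_date.month - 1) + int(offset) * months_per_step
        year = ref_date.year + total // 12
        month = total % 12 + 1
        # Clamp the day so that, e.g., day 31 stays valid in shorter months.
        day = min(ref_date.day, calendar.monthrange(year, month)[1])
        decoded.append(datetime(year, month, day))
    return decoded


# Usage with made-up offsets.
monthly = decode_time([0, 1, 12, 13], "months since 2000-01-01")
yearly = decode_time([1], "years since 1999-06-15")
```

Month and year steps have no fixed length in days, so they cannot be expressed as a constant time delta. The sketch does calendar arithmetic on the year and month fields for that reason.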