Pandas dtypes: now free from nanosecond limitation

Hi everyone,
I’m not a regular contributor on this forum, but I wanted to draw attention to some recent and exciting developments on the pandas side of things, which could have ripple effects in the pangeo community.

As some of you may know, pandas and NumPy developers have been working to generalize the datetime64 dtype so that it can accommodate non-nanosecond resolutions. The reason you might care is that timestamps are stored as 64-bit integers, so nanosecond resolution limits datetime objects to a comparatively short span of about 584 years (roughly 1677 to 2262). If you were interested in representing Earth processes farther into the past or future (e.g. for PMIP-style endeavors), the dependence on ns dtypes used to severely limit what could be done with pandas.
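For anyone wondering where the 584-year figure comes from, here is a rough back-of-the-envelope sketch using only the standard library. It assumes timestamps are counted as signed 64-bit nanosecond offsets from the 1970 epoch, and it uses a Julian year for the estimate; the variable names are just for illustration.

```python
from datetime import datetime, timedelta

NS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, used only for a rough estimate

# A 64-bit integer can count 2**64 distinct nanoseconds in total.
span_years = 2**64 / NS_PER_SECOND / SECONDS_PER_YEAR

# Half of that range lies on either side of the 1970 epoch.
# datetime only resolves microseconds, so the bounds are truncated slightly.
epoch = datetime(1970, 1, 1)
half_range = timedelta(microseconds=2**63 // 1000)
lower = epoch - half_range
upper = epoch + half_range

print(f"span: about {span_years:.1f} years")
print(f"representable range: {lower.year} to {upper.year}")
```

Switching the unit from nanoseconds to seconds multiplies that span by a factor of 10**9, which is why coarser resolutions are enough for paleoclimate timescales.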

Those of you who care about these longer time intervals may be happy to hear that non-ns dtypes have been enabled and will be fully functional in the 2.0.0 release of pandas (I’m already playing with them in the beta version, for Pyleoclim development). Xarray has so far circumvented the timestamp limitation by using cftime, but this may no longer be necessary. There are probably many other things that the pangeo community can do with pandas that used to be impossible.
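As a minimal sketch of what this looks like in practice (assuming pandas 2.0 or later and NumPy are installed; the dates and values below are made up for illustration), a second-resolution index can hold dates well outside the old nanosecond bounds:

```python
import numpy as np
import pandas as pd

# Illustrative dates far outside the nanosecond range (made-up values).
dates = np.array(
    ["0500-01-01", "1000-06-15", "3000-12-31"],
    dtype="datetime64[s]",  # second resolution instead of nanosecond
)

# With pandas >= 2.0 the second resolution is preserved rather than
# being converted to nanoseconds (which would fail for these dates).
idx = pd.DatetimeIndex(dates)
series = pd.Series([0.1, 0.2, 0.3], index=idx)

print(idx.dtype)
print(list(idx.year))
print(series)
```

On older pandas versions the same construction is expected to raise an out-of-bounds error, so this is only a sketch of the new behavior, not something that works everywhere.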

Since the extension to non-ns dtypes was supported in part by an EarthCube grant awarded to @khider /LinkedEarth, we’re quite curious what this community will do with this new capability, and would love to hear about where you take this. One of the greatest joys of my professional life is to see some of our work re-used for use cases we hadn’t anticipated, so please do some non-nanosecond science with pandas and let us know about it! You may do so here or in the LinkedEarth discourse forum.

I’d also like to use this opportunity to invite anyone who wants to collaborate on interoperability between Xarray and Pyleoclim to get in touch with us.

All the best,
Julien, for LinkedEarth

Thanks for highlighting this important development! This is a common gotcha for many people working with climate-related data and modelling. Although there are solutions and workarounds, it is great that we will soon no longer be constrained by the datetime64[ns] limitation in pandas.

We’re starting to encounter some functions that are not supported when the datetime is no longer at nanosecond resolution. Where would be the best place to report them? GitHub? One caveat is that, right now, we are using the pandas nightly build for testing. The new datetime support won’t be fully integrated until pandas 2.0.

Most likely the pandas tracking issue, “API/DES: Non-Nanosecond Tracker” (Issue #46587 in pandas-dev/pandas on GitHub), if it isn’t already posted elsewhere.

Thanks! I added it as an issue on the xarray GitHub, which is really where the problems begin.