This week, the Pangeo Showcase is excited to welcome Hauke Schulz, who will talk about the tradeoffs of compressing datasets using information theory.
Meeting Logistics
Title: Xbitinfo: Compress datasets based on their information content
Invited Speaker: Hauke Schulz at CICOES/University of Washington
Speaker contact:
Hauke Schulz: (GitHub: observingClouds | Twitter: @meteo_hauke | ORCID iD: 0000-0001-5468-1137)
Aaron Spring: (Max Planck Institute for Meteorology | ORCID iD: 0000-0003-0216-2241)
Milan Klöwer: (University of Oxford | ORCID iD: 0000-0002-3920-4356)
When: Wednesday, October 26th, 4 PM EDT
Where: Launch Meeting - Zoom
Abstract:
Xbitinfo adds compression capabilities to the Pangeo Python workflow. Building on the Julia package BitInformation.jl, Xbitinfo compresses xarray datasets based on their information content. With compression rates that outperform formats traditionally used for compression, such as GRIB, Xbitinfo makes it possible to save datasets in cloud-ready formats like Zarr without sacrificing compression rates or performance.
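For readers new to the idea, here is a minimal sketch of the rounding step that this kind of compression relies on. It is not Xbitinfo's API: `bitround` is a hypothetical helper written with plain NumPy, and `keepbits` is picked by hand here, whereas Xbitinfo derives it from the bitwise information content of the data. The sketch assumes float32 input and uses zlib as a stand-in lossless compressor.

```python
import zlib

import numpy as np


def bitround(values, keepbits):
    """Round float32 values to `keepbits` mantissa bits (hypothetical helper)."""
    if not 1 <= keepbits <= 23:
        raise ValueError("keepbits must be between 1 and 23 for float32")
    data = np.asarray(values, dtype=np.float32)
    drop = 23 - keepbits  # number of trailing mantissa bits to zero out
    if drop == 0:
        return data.copy()
    bits = data.view(np.uint32)
    half = np.uint32(1 << (drop - 1))                    # half of the last kept bit
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)  # keeps sign, exponent, top mantissa bits
    # Add half a unit, then truncate: round-to-nearest on the bit pattern.
    return ((bits + half) & mask).view(np.float32)


# Synthetic data, only for illustration.
rng = np.random.default_rng(0)
original = rng.normal(size=10_000).astype(np.float32)
rounded = bitround(original, keepbits=7)

# Trailing zero bits make the byte stream far more compressible.
size_original = len(zlib.compress(original.tobytes()))
size_rounded = len(zlib.compress(rounded.tobytes()))
print(size_original, size_rounded)
```

With 7 mantissa bits kept, each value stays within a relative error of 2^-8 of the original, while the lossless compressor sees long runs of zero bits.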
Relevant material: observingClouds/xbitinfo on GitHub, a Python wrapper of BitInformation.jl to easily compress xarray datasets based on their information content
Agenda:
- 5-15 minutes - Community showcase
- 5-15 minutes - Q&A / Community check-in
- 20-35 minutes - Agenda and open discussion