Hi all!
I figured this forum is a great place to raise awareness about a new Python package I created called WxData (first released in November 2025 and available on Anaconda and PyPI). WxData uses the requests, pandas and xarray libraries to download, pre-process and post-process various types of weather data. It is especially geared towards people who want to build automated workflows for downloading and processing weather data.
This package has some safety features automatically built in. These features are:
- A scanner that checks whether the files on your local machine are up to date with the latest files on the data servers, which minimizes unnecessary repeat downloads (a common mistake made by new developers or people new to automating workflows).
- To preserve disk space, old data files are automatically deleted before the new data files are downloaded.
- To free up further disk space, users can change the optional argument clear_recycle_bin=False to clear_recycle_bin=True in WxData >= 1.2.5. In WxData < 1.2.5, clear_recycle_bin=True is the default setting in the client function.
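To illustrate the scanner idea from the first bullet, here is a minimal sketch. It is not WxData's actual implementation: `is_local_copy_current` is a hypothetical helper, and it assumes the data server reports a modification time (for example through an HTTP `Last-Modified` header) that can be compared with the timestamp of the local file.

```python
import os
from datetime import datetime, timezone


def is_local_copy_current(local_path, remote_modified):
    """Return True when the local file exists and is at least as new as the remote file.

    remote_modified is a timezone-aware datetime (hypothetical input; a real
    client would read it from the data server).
    """
    if not os.path.exists(local_path):
        return False  # nothing on disk yet, so a download is needed
    local_modified = datetime.fromtimestamp(
        os.path.getmtime(local_path), tz=timezone.utc
    )
    return local_modified >= remote_modified
```

A workflow would only start a download when this check returns False, which is what keeps repeated runs from hitting the servers for files that are already on disk.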
This package also extracts the variables for each GRIB ‘typeOfLevel’ into its own dataset and then merges those datasets into a single dataset containing all the variables. This avoids Dataset Build Errors and makes all the data available.
This package also remaps all the GRIB variable keys into a plain-language format so people who are learning data analysis in Python don’t need to spend lots of time figuring out what each variable key means (e.g. ‘r2’ -> ‘2m_relative_humidity’). This is especially helpful because different level types (e.g. surface vs. pressure) can use the same variable keys.
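As a rough illustration of why the level type matters when re-mapping keys, here is a small sketch. Only the 'r2' -> '2m_relative_humidity' pair comes from this post; the level-type strings, the other entries and the `plain_name` helper are assumptions made up for the example and are not WxData's actual names.

```python
# Illustrative lookup table keyed by (level type, GRIB key). Only the
# 'r2' -> '2m_relative_humidity' pair comes from the post; the other
# entries are made-up examples, not WxData's actual names.
PLAIN_NAMES = {
    ("heightAboveGround", "r2"): "2m_relative_humidity",
    ("isobaricInhPa", "t"): "air_temperature",
    ("surface", "t"): "surface_temperature",
}


def plain_name(level_type, key):
    """Return the plain-language name, or the original key if unmapped."""
    return PLAIN_NAMES.get((level_type, key), key)


print(plain_name("heightAboveGround", "r2"))  # 2m_relative_humidity
print(plain_name("surface", "t"))             # surface_temperature
print(plain_name("surface", "unknown"))       # unknown
```

Keying the table on both the level type and the GRIB key is what lets the same key ('t' here) resolve to different plain-language names.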
Another thing that makes WxData unique is that it works for users who are on VPN/PROXY connections, which prevents SSL Certificate Errors. Users define their own VPN/PROXY address and port as a dictionary and then pass that in as an optional argument. WxData defaults to proxies=None, so this will need to be modified by users who are on a VPN/PROXY connection. For information on how to configure a VPN/PROXY connection, please see the WxData documentation below.
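For anyone unsure what such a dictionary looks like, here is a sketch. It assumes WxData accepts the same format as the requests library (a scheme-to-URL mapping); `build_proxies`, the host name and the port are placeholders, so please check the WxData documentation for the exact form your client expects.

```python
def build_proxies(host, port):
    """Build a proxies dictionary in the format the requests library accepts."""
    address = f"http://{host}:{port}"
    # requests looks up the proxy by URL scheme, so both keys are provided
    return {"http": address, "https": address}


# Placeholder address and port; replace with your own VPN/PROXY details.
proxies = build_proxies("proxy.example.com", 8080)
print(proxies)
```

The resulting dictionary is what would be passed in place of the default proxies=None.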
WxData Documentation & Jupyter Lab Examples
WxData Github Repository
I think this package will make downloading and processing weather data easier for everyone, from students first learning to work with weather data in Python to people setting up automated workflows on VPN/PROXY connections, and everyone in between.
If you made it this far in my post, thank you for reading and I hope you find WxData useful!
Regards,
Eric J. Drewitz
Hi all!
Several updates to WxData have launched since I made this post, so I figured I’d push this back to the top of the queue with information about what has been added.
The current version is WxData 1.4.
Here is a list of all the available clients in WxData:
- GFS0p25
- GFS0p25 Secondary Parameters
- GFS0p50
- AIGFS
- HGEFS
- AIGEFS (Pressure Parameters)
- AIGEFS (Surface Parameters)
- AIGEFS Single (Ensemble Mean & Ensemble Spread)
- ECMWF IFS
- ECMWF AIFS
- ECMWF IFS Wave
- ECMWF IFS Ensemble
- ECMWF AIFS Ensemble
- ECMWF IFS Wave Ensemble
- RTMA
- RTMA Comparison
- NDFD Grids/SPC Outlooks
- CPC Outlooks
- RAWS Data
- METAR Data
- Observed Upper-Air Data
get_gridded_data() - User defines their own data source (gridded data) for automated retrieval
get_csv_data() - User defines their own data source (csv data) for automated retrieval.
get_xmacis_data() - xmACIS2 Climate Data - Also available in xmacis2py
WxData also has post-processors to process all this data and re-map variable keys from GRIB codes to plain-language.
WxData Analysis Tools:
linear_anti_aliasing() - Interpolates n data points between two points.
cyclic_point() - Resolves the gap that appears along the 0 or 180 degree meridian when plotting gridded data.
shift_longitude() - Converts longitude from 0 to 360 degrees to -180 to 180 degrees.
pixel_query() - Extracts gridded data at a given point (point forecasts/forecast soundings).
line_query() - Extracts gridded data along a line between points A and B (cross-sections).
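The longitude conversion that shift_longitude() performs can be written as a one-line formula. This is a sketch of the arithmetic only, not WxData's code; `to_signed_longitude` is a hypothetical name and works on a single value rather than a dataset.

```python
def to_signed_longitude(lon):
    """Convert a longitude in [0, 360) to the range [-180, 180)."""
    return ((lon + 180.0) % 360.0) - 180.0


print(to_signed_longitude(190.0))  # -170.0
print(to_signed_longitude(90.0))   # 90.0
```

With gridded data, the longitude coordinate usually also needs to be re-sorted after the conversion so that it increases from west to east.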
WxData Workflow Tools:
run_external_scripts() - Executes scripts for the user in the order the user defines them. The user passes a list of strings, with each element being the full path to a script that needs to run.
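The general pattern of running scripts in a fixed order looks roughly like the sketch below. It is not the code behind run_external_scripts(): `run_scripts_in_order` is a hypothetical helper, and it assumes the scripts are Python files and that the chain should stop at the first failure, neither of which is stated above.

```python
import subprocess
import sys


def run_scripts_in_order(paths):
    """Run each Python script in turn; stop at the first failure."""
    for path in paths:
        # check=True raises CalledProcessError if a script exits non-zero
        subprocess.run([sys.executable, path], check=True)
```

Each script only starts once the previous one has finished, which is the behavior you want when a plotting script depends on a download script having completed.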
WxData 1.4 Additions
- Added the ecmwf-opendata package as a dependency and wrapped it in WxData, which makes ecmwf-opendata usable for those on VPN/PROXY connections. I also added post-processing that re-maps GRIB variable keys from their GRIB codes into plain language, and the safety scanner is in place to prevent too many requests to the servers.
This addition brings clients that allow sub-setting of the data prior to download to speed up the process, and adds the capability to download ECMWF data from Amazon AWS servers. The new clients are ECMWF IFS, ECMWF IFS Ensemble, ECMWF AIFS, ECMWF AIFS Ensemble, ECMWF IFS Wave and ECMWF IFS Wave Ensemble.
- Added a tool called linear_anti_aliasing() that creates n data points between existing data points. This is useful for those who want to plot a vertical profile in matplotlib, since ax.scatter() works much better than ax.plot() when using a dynamic colormap (cmap) as opposed to a single static color (color). For users who wish to do this, the tool produces a scatter plot that appears as a line and allows for much smoother color transitions.
- Added a tool called cyclic_point(), which simplifies the use of add_cyclic_point() from cartopy. This is useful for those plotting GRIB/netCDF data on a polar-stereographic projection, as it resolves the gap along the 0 or 180 degree meridian.
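The interpolation idea behind linear_anti_aliasing() can be sketched as follows. This is not the WxData implementation: `densify` is a hypothetical helper that works on a plain list rather than a DataFrame column, and it assumes n means the number of points inserted between each pair of neighbouring values.

```python
def densify(values, n):
    """Insert n evenly spaced, linearly interpolated points between each pair of neighbours."""
    if len(values) < 2:
        return list(values)
    dense = []
    for start, end in zip(values[:-1], values[1:]):
        step = (end - start) / (n + 1)
        # keep the original point, then add the n in-between points
        dense.extend(start + step * i for i in range(n + 1))
    dense.append(values[-1])
    return dense


print(densify([0.0, 10.0], 4))  # [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
```

Feeding the denser values to ax.scatter() is what makes the individual markers overlap enough to look like a continuous line.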
WxData 1.3 Additions
- Added a client for the new HGEFS data.
- Added a client to download, process and calibrate the probabilities of the latest NOAA Climate Prediction Center Outlooks.
WxData 1.3 & 1.4 Improvements
- Ability to subset ECMWF data by variables (String List) and, for pressure-level data (when level_type=‘pressure’), by levels (Integer List) prior to download.
- Added progress bars to all clients that download gridded data.
- Changed the visibility of all imports in files that directly interact with the user from public to private to prevent clutter in the drop-downs in VS Code and in tab completion in Jupyter Lab.
- Added the ability to disable the scanner safety feature (for more advanced users). Disabling this feature turns off the scanner that prevents repetitive and unnecessary downloads. The default setting, clear_data=False, keeps this safety feature enabled.
- The default setting clear_recycle_bin=True has been changed to clear_recycle_bin=False based on constructive community feedback. The feature is still there for those who wish to free up as much disk space as possible, but they will now need to manually set clear_recycle_bin=True.
Here is a link to the Documentation & WxData Github Repository for those interested in learning more about how to use WxData to help you in your automated weather data workflow: edrewitz/WxData: A Python library that acts as a client to download, pre-process and post-process weather data. Friendly for users on VPN/PROXY connections.
Here is a link to several Jupyter Lab Tutorials demonstrating the use of WxData: edrewitz/WxData: A Python library that acts as a client to download, pre-process and post-process weather data. Friendly for users on VPN/PROXY connections.
Here is a link to the Github Repository: edrewitz/WxData: A Python library that acts as a client to download, pre-process and post-process weather data. Friendly for users on VPN/PROXY connections.
Finally, here are some graphics I created in these notebooks using WxData for data retrieval and post-processing before plotting:
- NKX sounding without using linear_anti_aliasing()
- NKX sounding using linear_anti_aliasing(df, 'TEMP', 100) - 100x anti-aliasing
- Downloading and plotting a polar-stereographic view of the ECMWF IFS Initial Analysis using cyclic_point() to resolve the gap along the meridian.
- Using get_cpc_outlook() to download the latest 6-10 Day NOAA/CPC Probabilistic Precipitation Forecast, calibrate the probabilities and then plot our cleaned-up geopandas.GeoDataFrame
I hope this finds you all well.
Regards,
Eric J. Drewitz