ERA5 is used in thousands of climate studies every year. But validation against actual weather station observations (structured, reproducible, and exportable) is rarely done. This notebook changes that.
What it does: Five stations across five climate zones (Oslo, Madrid, Nairobi, Toronto, Sydney). Daily 2m air temperature, 2015–2020. Zero credentials required: Open-Meteo + Meteostat, both free, no API keys.
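The Open-Meteo half of the zero-credential pipeline can be sketched with nothing but the standard library. The endpoint and parameter names below follow my reading of Open-Meteo's public archive API docs; treat them as assumptions and check the current docs before relying on them (the notebook's actual fetch code may differ).

```python
import json
import urllib.parse
import urllib.request

ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"

def era5_request(lat: float, lon: float, start: str, end: str):
    """Build the request URL and parameters for daily ERA5-backed
    2m mean temperature. Parameter names assumed from Open-Meteo docs."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "start_date": start,   # ISO dates, e.g. "2015-01-01"
        "end_date": end,
        "daily": "temperature_2m_mean",
        "timezone": "UTC",
    }
    return ARCHIVE_URL, params

def fetch_era5_daily(lat, lon, start, end):
    """Fetch and unpack the daily series (makes a network call)."""
    url, params = era5_request(lat, lon, start, end)
    full_url = url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(full_url, timeout=30) as resp:
        daily = json.load(resp)["daily"]
    return daily["time"], daily["temperature_2m_mean"]
```

Splitting the request builder from the network call keeps the parameter logic testable offline.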
Outputs:
Per-station metric scorecard (RMSE, MAE, Mean Bias, Pearson r, Taylor Skill Score)
Multi-panel ERA5 vs station time series
World map of stations sized and colored by RMSE (pure matplotlib, no cartopy)
Climate zone comparison bar chart
Exportable HTML/JSON/Markdown validation report
150+ lines of boilerplate without climval. 10 lines with it. Same rigour. Exportable results.
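For reference, the scorecard metrics reduce to a few lines of NumPy. This is an illustrative sketch, not climval's actual API; the Taylor skill score in particular has several variants, and the form below assumes Taylor (2001), eq. 4, with the maximum attainable correlation R0 = 1.

```python
import numpy as np

def scorecard(model: np.ndarray, obs: np.ndarray) -> dict:
    """Per-station validation metrics for paired daily series.
    Sketch only: function name and return format are hypothetical."""
    diff = model - obs
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    mae = float(np.mean(np.abs(diff)))
    bias = float(np.mean(diff))          # mean bias: model minus obs
    r = float(np.corrcoef(model, obs)[0, 1])
    # Taylor skill score, one common form (Taylor 2001, eq. 4, R0 = 1):
    #   S = 4 (1 + R) / ((sigma_hat + 1/sigma_hat)^2 * (1 + R0))
    sigma_hat = float(np.std(model) / np.std(obs))
    taylor = 4.0 * (1.0 + r) / ((sigma_hat + 1.0 / sigma_hat) ** 2 * 2.0)
    return {"rmse": rmse, "mae": mae, "bias": bias, "r": r, "taylor": taylor}
```

A series that is the observations shifted by a constant scores perfectly on r and Taylor skill but shows up in bias, which is exactly why the scorecard reports all five metrics side by side.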
This is awesome work, thank you for putting your time into this.
The only thing I’d like to flag here is that ERA5’s 2m temperature outputs should be height-adjusted to account for discrepancies between the reanalysis grid’s orography and the station’s actual elevation. Usually the lowest-hanging fruit is a standard lapse-rate correction of ~6.5 K/km. If your code does that already, even better!
Thanks Tman, really good point! The current version doesn’t apply a lapse-rate correction, so the bias values at stations with significant elevation mismatch (especially Oslo and Nairobi, where ERA5 orography deviates most from station altitude) will include a systematic height-related component.
We’ll add an optional height correction using the ~6.5 K/km standard lapse rate in the next update, comparing ERA5 orography elevation to station metadata elevation via Meteostat and applying the adjustment before validation. Good candidate for a climval preprocessing utility too.
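A minimal sketch of that correction, assuming a constant standard-atmosphere lapse rate and both elevations in metres (the function name is hypothetical, not an existing climval utility):

```python
LAPSE_RATE_K_PER_M = 0.0065  # standard-atmosphere lapse rate, ~6.5 K/km

def height_correct(t_era5_c: float, z_era5_m: float, z_station_m: float,
                   lapse: float = LAPSE_RATE_K_PER_M) -> float:
    """Adjust ERA5 2m temperature from grid-cell orography elevation
    to station elevation before computing validation metrics.

    Temperature decreases with height, so a station sitting above the
    ERA5 orography gets a cooler adjusted value, and vice versa.
    """
    return t_era5_c - lapse * (z_station_m - z_era5_m)
```

For a station 300 m above the ERA5 grid-cell elevation, this cools the ERA5 value by about 1.95 K before the comparison, removing that much systematic bias from the scorecard.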
Appreciate the feedback; this is exactly the kind of thing that makes the validation more defensible.