I’m working with a GeoDataFrame of about 4 million records. My problem is read/write speed, and also the memory limit, though memory is my second priority.
I was wondering if there is a package like Dask that could read/write the data fast and also work out-of-memory?
There is dask-geopandas (https://github.com/geopandas/dask-geopandas), a new implementation of GeoPandas on top of Dask. If you write to Parquet with .to_parquet(), the speed might be acceptable.
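A minimal sketch of that workflow, assuming the source is a file readable by pyogrio/fiona (the filename "data.gpkg", output directory, and partition count are placeholders to tune for your data):

```python
import dask_geopandas

# Read the source in parallel chunks; npartitions controls how many
# pieces the ~4M rows are split into, so each fits in memory.
ddf = dask_geopandas.read_file("data.gpkg", npartitions=16)

# Write partitioned Parquet; each partition becomes its own file,
# so the full dataset never has to be held in memory at once.
ddf.to_parquet("data_parquet/")

# Later, read it back lazily / out-of-core.
ddf2 = dask_geopandas.read_parquet("data_parquet/")
```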
There is a proposed spec for more efficiently encoding the geometry column(s), but I don’t know how expensive the geometry encoding is relative to everything else. In your case, you could write the file both with and without the geometry column and see what the difference is.
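If you want to try that comparison, here is a rough sketch using plain GeoPandas (file names are hypothetical; dropping the active geometry column yields an ordinary pandas DataFrame, whose .to_parquet() skips the geometry serialization entirely):

```python
import time
import geopandas as gpd

gdf = gpd.read_file("data.gpkg")  # placeholder source file

# Write with the geometry column included.
t0 = time.perf_counter()
gdf.to_parquet("with_geom.parquet")
print("with geometry:   ", time.perf_counter() - t0, "s")

# Write only the attribute columns, no geometry.
t0 = time.perf_counter()
gdf.drop(columns="geometry").to_parquet("no_geom.parquet")
print("without geometry:", time.perf_counter() - t0, "s")
```

The gap between the two timings tells you how much of the write cost is the geometry encoding versus the rest of the data.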