Is there an out-of-core package like Dask for reading/writing GeoDataFrames?

I’m working with a GeoDataFrame of about 4 million records. My main problem is read/write speed; the memory limit matters too, but it’s my second priority.
I was wondering if there is a package like Dask that can read and write geospatial data quickly and out-of-core?



There is dask-geopandas (GitHub - geopandas/dask-geopandas), which combines geopandas with Dask for partitioned, out-of-core GeoDataFrames. If you write to Parquet with `.to_parquet()`, the speed might be acceptable.

There is also a proposed spec for encoding the geometry column(s) more efficiently, but I don’t know how expensive the geometry encoding is relative to everything else. In your case you could write the data both with and without the geometry column and compare the timings to see what the difference is.
