Zarr operations on IPLD

rabernat · June 7, 2022, 5:43pm

I agree that the immutable / content-addressable framework poses some interesting challenges. In general, Zarr expects that the parent object (array or group) will get created before the chunks. With IPLD, however, the children need to be created first, because the parent can’t be constructed without the children’s hashes. It will be interesting to think through how this will work. I’m sure it is solvable; it just requires some thought and creativity.
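To make the ordering issue concrete, here is a minimal sketch of the children-first construction. It is not real IPLD or Zarr code: the dict-backed `store`, the `put` and `build_array` helpers, and the `chunk_links` layout are all hypothetical, and plain SHA-256 hex digests stand in for CIDs.

```python
import hashlib
import json


def put(store: dict, data: bytes) -> str:
    """Store a block under the hash of its content and return that hash."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def build_array(store: dict, metadata: dict, chunks: dict) -> str:
    """Write the chunks first, then the parent node that links to them."""
    # Children first: each chunk's hash must exist before the parent does.
    links = {name: put(store, payload) for name, payload in sorted(chunks.items())}
    # Only now can the parent be serialized, since it embeds the child hashes.
    node = {"zarray": metadata, "chunk_links": links}
    return put(store, json.dumps(node, sort_keys=True).encode())


store = {}
meta = {"shape": [4], "chunks": [2], "dtype": "<i4"}
root = build_array(store, meta, {"0": b"\x00" * 8, "1": b"\x01" * 8})
```

Because the root hash depends on every chunk hash, changing any chunk yields a different root. That is the inverse of the usual Zarr flow, where the array metadata is written up front and chunks are filled in afterwards.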

This is a really interesting suggestion. It is actually a bit similar to the “Zarr hacking” I proposed in the thread “Using to_zarr(region=) and extending the time dimension?”

In general, yes, I think content-addressable storage (CAS) will require some specialized utilities to merge / extend / append to datasets. As you noted, it should be possible to do all of this at the level of pure metadata, without ever rewriting any of the actual data chunks. That would be very cool to see!
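As a rough illustration of what a metadata-only append could look like, here is a sketch under the same kind of assumptions: a dict stands in for the content-addressed store, SHA-256 hex digests stand in for CIDs, and `put`, `make_root`, and `append_chunks` are hypothetical helpers, not part of Zarr or any IPLD library.

```python
import hashlib
import json


def put(store: dict, data: bytes) -> str:
    """Store a block under the hash of its content and return that hash."""
    key = hashlib.sha256(data).hexdigest()
    store[key] = data
    return key


def make_root(store: dict, metadata: dict, links: dict) -> str:
    """Serialize a parent node that points at already-stored chunks."""
    node = {"zarray": metadata, "chunk_links": links}
    return put(store, json.dumps(node, sort_keys=True).encode())


def append_chunks(store: dict, root: str, new_chunks: dict, new_shape: list) -> str:
    """Return a new root that reuses the old chunk hashes and adds new ones."""
    old = json.loads(store[root])
    links = dict(old["chunk_links"])  # existing chunks are linked, not rewritten
    for name, payload in new_chunks.items():
        if name in links:
            raise ValueError(f"chunk {name!r} already exists")
        links[name] = put(store, payload)
    metadata = {**old["zarray"], "shape": new_shape}
    return make_root(store, metadata, links)


store = {}
meta = {"shape": [4], "chunks": [2], "dtype": "<i4"}
links_v1 = {"0": put(store, b"\x00" * 8), "1": put(store, b"\x01" * 8)}
v1 = make_root(store, meta, links_v1)

# Extend the array by one chunk; only one chunk and one root are written.
v2 = append_chunks(store, v1, {"2": b"\x02" * 8}, new_shape=[6])
```

Both `v1` and `v2` remain valid roots, and they share the first two chunks by hash, so the append costs one new chunk block plus one new metadata block.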

In your opinion, what next steps are needed to make progress?
