Xarray is very slow

I needed to read 10 geotiff files (each of 10000 by 10000 size) and calculate mean across 10 images

import glob
import threading
import numpy as np
import rasterio, xarray, rioxarray
files = glob.glob("/data/*.tif")
xda_lists = []
for fi in files:
  xdaObj = rioxarray.open_rasterio(fi,masked=True,chunks= {'band':1,'y':10000, 'x':10000},lock=threading.Lock())
xda_lists.append(xdaObj)
xdaMerged = xarray.concat(xda_lists,dim="band",join="override")
xdaMedian = xdaMerged.median(dim='band',skipna=True)
xdaMedian.rio.to_raster('out.tif',tiled=True,compress="lzw",lock=threading.Lock())

However, its very slow. How can I accelerate it ?? The input images were not tiled.

You could look at something like this as a start, see how it goes :- median() for a dask-backed xarray ยท GitHub