Strange error using pangeo-forge-recipes/apache beam in parallel

I’m encountering an error that I’m not really sure what to do with when running apache-beam (via pangeo-forge-recipes) using the direct runner with multiprocessing enabled:

terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::erase: __pos (which is 16) > this->size() (which is 15)

which, as far as I can tell, occurs just before the actual data processing in the pipeline.
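For reference, “direct runner with multiprocessing enabled” here means an invocation along these lines (a minimal sketch, not my exact script: the option names are Beam’s standard DirectRunner flags, the worker count is just an example, and recipe stands in for the transform chain defined in the script linked below):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# direct runner with multi-processing enabled
options = PipelineOptions(
    runner="DirectRunner",
    direct_num_workers=4,
    direct_running_mode="multi_processing",
)

with beam.Pipeline(options=options) as p:
    p | recipe  # `recipe` is the pangeo-forge transform chain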

I’m just curious if anyone has seen this before; a trawl through GitHub, Stack Overflow, etc. hasn’t been wonderfully helpful so far.

Some more detail is available here: Create example for the UKCEH GEAR-1hrly dataset · Issue #3 · NERC-CEH/object_store_tutorial · GitHub

and the recipe I’m trying to run here: object_store_tutorial/scripts/convert_GEAR_beam.py at GEAR · NERC-CEH/object_store_tutorial · GitHub

Hey @matbro. I don’t have a concrete idea about this error, but in my experience with beam debugging (which is just gnarly in general), it might be helpful to work out which step the pipeline was at when this occurs.

Adding on to what @jbusecke said. Debugging beam pipelines can be a real PITA. One trick I’ve found helpful is adding a | beam.Map(print) PTransform to stages in your pipeline to see what the previous PTransform is outputting, e.g.:

import apache_beam as beam
from pangeo_forge_recipes.transforms import OpenWithXarray

recipe = (
    beam.Create(pattern.items())
    | OpenWithXarray(file_type=pattern.file_type)
    | beam.Map(print)  # print each element emitted by OpenWithXarray
)
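
One caveat with that trick: beam.Map(print) emits None downstream (print returns None), so it works best as the last step, as above. If you want the rest of the pipeline to keep running after the debug print, a pass-through variant does the same job without swallowing the elements. A minimal sketch (the debug helper is just illustrative, not part of pangeo-forge-recipes):

import apache_beam as beam
from pangeo_forge_recipes.transforms import OpenWithXarray

def debug(label):
    # print each element with a stage label, then forward it unchanged
    # (print returns None, so `or item` passes the original element through)
    return beam.Map(lambda item: print(label, item) or item)

recipe = (
    beam.Create(pattern.items())
    | OpenWithXarray(file_type=pattern.file_type)
    | debug("after OpenWithXarray")
    # ... further transforms (e.g. StoreToZarr) continue from here
)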

Thanks both!
I’ve narrowed it down to being something to do with newer versions of pyarrow (>8.0.1) and/or its dependencies, but other than that, who knows!
I’ve parked it for now, given I’ve managed to assemble a working environment with older versions of pyarrow, but it will be something I’m likely to revisit in future when it comes to deploying this workflow/recipe in anger!

Sounds good @matbro. I also wanted to flag that we have reorganized bi-weekly working group meetings for PGF. That might be a good spot to talk in more detail!