Geosoft GDB conversion at scale

Notes from some brief looks last weekend: GitHub - RichardScottOZ/Geosoft-GDB-Conversion: Work looking at converting Geosoft GDBs to useable open format for analysis, machine learning and AI

This is geophysics data [magnetics etc.]. A simple, large chunk of it is a plane flying in a line [of sorts] and taking readings of various types with [X,Y,Z] as it goes. There are various other types, but that is a representative example.

It is a proprietary format from a company that has been bought by a company and then by another company, i.e. it gets more expensive.

https://help.seequent.com/Oasismontaj/2023.2/Content/ss/prepare_om/work_with_databases/c/oasis_databases.htm

This is a zarr conversion example [for what it could save simply, as far as something more multidimensional goes].

The fiducials in lines can be all the same where they overlap, but not always.

The channels (variables) are often the same in each line, but not always - they were in this case.

This is just a simple first look :-

Dimensions:     (line: 108, fiducial: 9091)
Dimensions without coordinates: line, fiducial
Data variables: (12/20)
    Altitude    (line, fiducial) float32 ...
    CompMag     (line, fiducial) float32 ...
    DCMag       (line, fiducial) float32 ...
    DTM         (line, fiducial) float32 ...
    Date        (line, fiducial) float32 ...
    Diurnal     (line, fiducial) float32 ...
    ...          ...
    Long_wgs84  (line, fiducial) float32 ...
    Radar       (line, fiducial) float32 ...
    RawMag      (line, fiducial) float32 ...
    Time        (line, fiducial) float32 ...
    x_wgs84     (line, fiducial) float32 ...
    y_wgs84     (line, fiducial) float32 ...
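
The dimensions in that summary give a quick upper bound on what a fully padded grid costs before any compression. A rough sketch of the arithmetic, assuming all 20 variables are float32 on the same (line, fiducial) grid as the listing suggests:

```python
# Dimensions and variable count are taken from the dataset summary above.
n_lines = 108
n_fiducials = 9091
n_channels = 20
bytes_per_value = 4  # float32


def dense_size_bytes(n_lines, n_fiducials, n_channels, bytes_per_value):
    """Uncompressed size of a fully padded (line, fiducial) grid for all channels."""
    return n_lines * n_fiducials * n_channels * bytes_per_value


size = dense_size_bytes(n_lines, n_fiducials, n_channels, bytes_per_value)
print(f"{size:,} bytes (~{size / 1e6:.1f} MB)")  # 78,546,240 bytes (~78.5 MB)
```

This is the in-memory, uncompressed figure only. What actually lands on disk depends on chunking and the compressor, which this sketch does not model.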

Data can be 1D or 2D - but mostly the latter for things of interest.

Lines are variable length - and this was just an easy first-pass, half-hour export data trial. So the best format to store these in is a good question. Zarr output can be small, which matters as a converted copy is an additional storage cost.
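
One way to get variable-length lines onto a rectangular (line, fiducial) grid is to pad the short ones. A minimal sketch with NumPy, assuming NaN is acceptable as the fill value (i.e. the real readings contain no NaN of their own); `pad_lines` and the sample values are made up for illustration and are not from the repo:

```python
import numpy as np


def pad_lines(lines, fill_value=np.nan):
    """Stack variable-length 1D arrays into a (line, fiducial) array, padding the tails."""
    max_len = max(len(values) for values in lines)
    padded = np.full((len(lines), max_len), fill_value, dtype=np.float32)
    for i, values in enumerate(lines):
        padded[i, : len(values)] = values
    return padded


# Hypothetical readings for three flight lines of different lengths.
raw_mag = [
    np.array([50010.0, 50012.5, 50011.0, 50009.5], dtype=np.float32),
    np.array([50020.0, 50021.5], dtype=np.float32),
    np.array([50030.0, 50031.0, 50029.5], dtype=np.float32),
]

grid = pad_lines(raw_mag)
fill_fraction = float(np.isnan(grid).mean())
print(grid.shape)  # (3, 4)
print(f"{fill_fraction:.2f} of cells are padding")  # 0.25 of cells are padding
```

The padding fraction is the thing to watch: if a few lines are much longer than the rest, most of the grid is fill, and the saving then relies on the compressor handling the repeated fill values well.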

X, Y, Z hence will vary similarly.

HDF5/tree structures are a possibility.