Add GeoTransform as implemented by GDAL #19
@christophenoel -- I've altered my terminology here to avoid the |
@dblodgett-usgs can you identify 1-2 reviewers who you feel would have substantive input? Is this something for which we can already prototype interoperability?
I don't have permissions to add reviewers formally, but perhaps @christophenoel, @rouault, @edzer (others in the R spatial community?), @snowman2 (others in the python spatial community)? We could certainly prototype it -- it's actually more or less in use in the wild since it's adopting what GDAL already does.
Could you attach an example zarr file then?
Sounds like a good idea. Relevant references:
@briannapagan -- I'm not really in a position to be able to create a zarr mock up. Maybe someone a little closer to the implementation could do it? The NetCDF CDL representation would look like:
@snowman2 -- so what I take away from your references is that this is basically already supported in more than just GDAL? What do you think about a space-separated string vs. an actual vector of doubles? Based on this, I think it is sensible to just move ahead with this proposal. The only catch is that GDAL uses space-separated strings, and it may be good to support both strings and explicit vector attributes?
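To make the "support both" option concrete, here is a minimal sketch of a reader that accepts either encoding. The function name `parse_geotransform` and the sample coefficients are illustrative assumptions, not part of the proposal text.

```python
from typing import Sequence, Union


def parse_geotransform(value: Union[str, Sequence[float]]) -> tuple:
    """Return six floats from a space-delimited string or a numeric sequence."""
    # A string is split on whitespace; anything else is treated as a sequence.
    parts = value.split() if isinstance(value, str) else list(value)
    if len(parts) != 6:
        raise ValueError(f"expected 6 GeoTransform coefficients, got {len(parts)}")
    return tuple(float(p) for p in parts)


# Illustrative values only: both encodings resolve to the same coefficients.
as_string = parse_geotransform("499980 30 0 8500020 0 -30")
as_vector = parse_geotransform([499980.0, 30.0, 0.0, 8500020.0, 0.0, -30.0])
```

A reader written this way stays compatible with the string form that existing tools write, while tolerating a numeric array if a writer chooses one.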
The approach rioxarray has taken is to maximize compatibility. So, the format that works with GDAL is the version used by rioxarray. It would be nice if the storage format here is compatible with both netCDF and zarr in GDAL.
Per updates and discussion on the community call today, we generally have agreement on the current PR. Does anyone tagged here have interest in reviewing or expressing support? The text now describes that the origin is intended to be a cell corner and shows the equation to resolve cell centers. We have settled on space-delimited ASCII for compatibility with existing implementations. We are using a
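For readers without the PR text at hand, the corner-origin convention can be sketched as follows. This assumes the coefficient order documented by GDAL (x origin, pixel width, row rotation, y origin, column rotation, pixel height); the function name and the sample numbers are made up for illustration.

```python
def cell_center(gt, row, col):
    """Map a (row, col) index to the x/y coordinate of that cell's center.

    gt is assumed to be in GDAL order:
    (x_origin, pixel_width, row_rotation, y_origin, col_rotation, pixel_height)
    """
    # The origin is the outer corner of cell (0, 0), so shift by half a cell.
    c = col + 0.5
    r = row + 0.5
    x = gt[0] + c * gt[1] + r * gt[2]
    y = gt[3] + c * gt[4] + r * gt[5]
    return x, y


# Illustrative grid: 10-unit cells, origin corner at (100, 200), no rotation.
gt = (100.0, 10.0, 0.0, 200.0, 0.0, -10.0)
first_center = cell_center(gt, 0, 0)   # (105.0, 195.0)
other_center = cell_center(gt, 2, 3)   # (135.0, 175.0)
```

Dropping the `+ 0.5` offsets gives the cell's origin corner instead of its center.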
It seems to me that this PR aligns with the desire to support multiple coordinate encodings (not only coordinate arrays), which I fully support. Note that for indicating the projection of an n-D data variable, the current convention is to define in the data array a property called grid_mapping, indicating the name of the auxiliary variable (a Zarr array which is neither data nor a coordinate) that contains the attributes describing this grid mapping. The auxiliary variable named by "grid_mapping" is an empty array and contains the crs attributes (as shown in the above example). Therefore, we could also use this grid_mapping property, as proposed, for describing origin_offset coordinates (using the GeoTransform). However, I think that for consistency, each dimension of a variable should map to a coordinate variable. For origin_offset, instead of containing a 1D array of all coordinate values, the Zarr array would contain either:
To maximise compatibility with map tools and generic clients, I would suggest preserving the grid_mapping when provided, but imposing the origin_offset tuple for each coordinate variable.
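As a rough illustration of the grid_mapping indirection described above, the lookup could be sketched like this. The variable names `band1` and `crs` are hypothetical, and placing a `GeoTransform` attribute on the grid mapping variable is an assumption modelled on how GDAL writes netCDF files, not something this thread has settled.

```python
# Hypothetical attribute dictionaries, keyed by array name within one group.
attrs = {
    "band1": {"grid_mapping": "crs"},
    "crs": {
        "grid_mapping_name": "transverse_mercator",
        "GeoTransform": "499980 30 0 8500020 0 -30",
    },
}


def find_geotransform(attrs, var_name):
    """Follow a variable's grid_mapping attribute to its GeoTransform, if any."""
    gm_name = attrs[var_name].get("grid_mapping")
    if gm_name is None:
        return None  # the variable does not reference a grid mapping variable
    text = attrs[gm_name].get("GeoTransform")
    if text is None:
        return None  # grid mapping present, but no GeoTransform attribute
    return tuple(float(v) for v in text.split())


coefficients = find_geotransform(attrs, "band1")
```

A client that only understands coordinate variables would simply never perform this lookup, which is the compatibility concern raised above.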
Here's a rough draft at a suggestion (assuming an extension of the Zarr v3 spec). The specific granule I am using here is Something to consider.
import rasterio
rast = rasterio.open("sample-data/HLS.S30.T11XNE.2024176T201849.v2.0.B01.tiff")
# Creating a `zarr.json` dict "by hand".
zarr_meta = {
    "zarr_format": 3,
    "shape": rast.shape,  # (3660, 3660)
    # Other stuff...
    "chunk_grid": {
        "name": "raster",  # NOTE: custom grid type
        "configuration": {
            "crs": {
                # Either an EPSG code
                "epsg": rast.crs.to_epsg(),  # 32611
                # ...or a WKT string
                "wkt": rast.crs.to_wkt()  # 'PROJCS["WGS 84 / UTM zone 11N",GEOGCS[<etc...>]]'
            },
            "transform": list(rast.transform),  # [30.0, 0.0, 499980.0, 0.0, -30.0, 8500020.0, 0.0, 0.0, 1.0]
            # Is each pixel a point or an area? See https://docs.ogc.org/is/19-008r4/19-008r4.html#RasterSpace
            # See also https://docs.ogc.org/is/19-008r4/19-008r4.html#_cookbook_for_defining_transformations
            "area_or_point": rast.tags()["AREA_OR_POINT"]  # 'Area'
        }
    }
}
Some additional notes:
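One thing worth flagging about the `transform` entry: the nine values listed in the comment are in row-major affine order (a, b, c, d, e, f, 0, 0, 1), whereas a GDAL-style GeoTransform orders the same six coefficients as (c, a, b, f, d, e). A minimal sketch of the reordering, using the values from the comment above; the helper name `affine_to_gdal` is hypothetical (the `affine` package offers `Affine.to_gdal()` for the same purpose, if it is available).

```python
def affine_to_gdal(coeffs):
    """Reorder row-major affine coefficients (a, b, c, d, e, f, ...) to GDAL order."""
    a, b, c, d, e, f = coeffs[:6]
    return (c, a, b, f, d, e)


# Values copied from the `transform` comment in the draft above.
row_major = [30.0, 0.0, 499980.0, 0.0, -30.0, 8500020.0, 0.0, 0.0, 1.0]
gdal_order = affine_to_gdal(row_major)

# Space-delimited form, matching the string encoding discussed in this thread.
geotransform_attr = " ".join(repr(v) for v in gdal_order)
```

Whichever ordering the spec adopts, stating it explicitly would avoid readers silently swapping the origin and scale terms.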
Where do you get that rotation in affine terms is common? It's extremely rare, and in most cases it's really better to work with an extent, which is simpler to use and think about. No one uses a transform to specify the output grid for regridding, for example, though that's probably how the format will store it. (An affine transform is great, of course, and a massive step forward for xarray and Zarr when rectilinear is degenerate, which is extremely common, or when curvilinear is degenerate, which is less common but out there and even more cryptic.)
Hello, This looks like an interesting idea. However, I am not yet familiar with the attributes in Zarr V3. Based on V2, in a dataset (as outlined in the current draft of the specification), either the Zarr group of the dataset or a child (empty) Zarr array (such as grid_mapping) would hold such metadata in the .zattrs file (zarr.json in V3).
@mdsumner I may be wrong! I thought it was fairly common for things like airborne and (lower-level) satellite data, where the rotation is oriented in the direction of the flight path / orbit and you can save some space by not having big corners of
@christophenoel Agreed that exactly where these metadata should live is a bit of an open question. I don't have strong opinions about it. I do think that we need to be proactive in supporting Zarr v3, though, so that GeoZarr doesn't immediately become outdated. If it's not too much work to also support v2, of course we should do that, too.
I think there are some satellite streams that have this rotation (you need to have it as a capability!) but in 20 years I've only encountered it a couple of times. FWIW it's not about north-south orientation; consider a map of Antarctica in Polar Stereographic. It's just a basic offset and scale situation in units-of-the-projection (same as any other regular grid, in Mercator, LAEA, or longlat etc).
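The "offset and scale" case can be sketched as a pair of conversions between an extent plus grid shape and a rotation-free GeoTransform. This assumes GDAL coefficient order and a top-left corner origin; the function names are illustrative, and the numbers reuse the 3660 x 3660, 30 m grid from the draft earlier in the thread.

```python
def geotransform_from_extent(xmin, ymin, xmax, ymax, nrows, ncols):
    """Build a rotation-free GDAL-order GeoTransform from an extent and grid shape."""
    pixel_width = (xmax - xmin) / ncols
    pixel_height = (ymin - ymax) / nrows  # negative: rows run from ymax downward
    return (xmin, pixel_width, 0.0, ymax, 0.0, pixel_height)


def extent_from_geotransform(gt, nrows, ncols):
    """Recover (xmin, ymin, xmax, ymax); only valid when both rotation terms are 0."""
    xmin, ymax = gt[0], gt[3]
    xmax = xmin + ncols * gt[1]
    ymin = ymax + nrows * gt[5]
    return (xmin, ymin, xmax, ymax)


gt = geotransform_from_extent(499980, 8390220, 609780, 8500020, 3660, 3660)
extent = extent_from_geotransform(gt, 3660, 3660)
```

The round trip only holds while the two rotation coefficients are zero, which is the common case described above.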
This indeed seems totally feasible. In the coming weeks I hope to study whether supporting both would be straightforward.
It belongs in xarray (and hopefully the cross lang library that becomes ...). They're working on it.
For consideration. It's rough, but I think this is more or less what we are after in #17 ??
I think the logic to just adopt what's already in GDAL makes a lot of sense.
This isn't incompatible with CF: data that uses this approach simply wouldn't carry georeferencing information in a client that expects normal coordinate variables.