-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add raster support #6
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you put in a really great effort on this PR! Thanks for all the time you put into this.
I haven't had the chance to run the code yet, but I'll try to do so one of the next few days.
As a general comment; I'm not sure what kind of tests, if any, would be sensible to write for the code. Perhaps we could put an effort into some documentation in a future PR which downloads and plots some example rasters (and geojson files) to showcase usage. If we set strict==true
when building docs, that would hopefully catch most issues, at least every time tests are run.
In that same future PR, we could add some docstrings as well e.g. to the NaturalEarthRaster
struct and ne_raster
method.
function naturalearth(dataset_name::String) | ||
if dataset_name ∈ available_artifacts | ||
file_path = @artifact_str("$dataset_name/$dataset_name.geojson") | ||
return GeoJSON.read(read(file_path, String)) | ||
# elseif dataset_name ∈ available_rasters | ||
end | ||
|
||
@assert isfile(file_path) """ | ||
`$dataset_name` is not a valid NaturalEarth.jl artifact! | ||
Please search https://www.naturalearthdata.com for available | ||
datasets. | ||
""" | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this rewrite related to support for rasters, or just experimentation? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly to add support for Rasters in the future...wanted to have a template to build off.
# Implement the RasterDataSources.jl interface. | ||
# With this, you can call e.g. # Rasters.Raster(NaturalEarthRaster{:latest, 10, :HYP_HR}, nothing) | ||
|
||
struct NaturalEarthRaster{Version, Scale, Name} <: RasterDataSources.RasterDataSource end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is having the version, scale and name as type parameters a stylistic choice, performance gain (vs field names), or something else? .. I'm just curious!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way RasterDataSources works is that you have to pass a type in to RasterDataSources.getraster
, and Rasters.jl has the same dispatch (Raster(T::Type{<: RasterDataSources.RasterDataSource}, layer; kwargs...)
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do wish we could keep them as fields, since that would allow a lot more validation, but oh well 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put less in the type parameter!
RasterDataSources.jl has the layer name as the second argument, and scale would be a keyword.
Otherwise you get compilation every time you use it. Having the name as second argument means a tuple of names can make a stack.
You can put as much of that as you want in the keywords really.
You just need at least one type for dispatch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, when there are multiple type parameters in RasterDataSources its usually because the data source is structurally different - they have different keyword arguments and separate dataset handling code. WorldClim{Weather}
and WorldClim{Climate}
dispatch to totally separate methods in different files.
Here you don't actually use them to dispatch different code, suggesting they probably should be keywords.
There are also a few in RasterDataSources.jl like ALWB
that were just design mistakes and have unnecessary things in the type that could be keywords.
I'm also a little confused at why the tests fail attempting to download the |
Yeah, I'm kind of confused as well...will have to check that out locally. |
return string(Name) * ".tif" | ||
end | ||
|
||
function RasterDataSources.getraster(T::Type{NaturalEarthRaster{Version, Scale, Name}}, layer = nothing) where {Version, Scale, Name} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't layer
be Name
? that's how the rest of RasterDataSources.jl works unless I misunderstand something.
A NTuple{N,Symbol}
for layer
will then give you a RasterStack
automatically, or just a NamedTuple
returned from getraster
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ne = NaturalEarth{10 #= scale =#}
Raster(ne, :HY_SR)
do you mean something like this, where each type represents a scale, and has its own set of layers we can pull from? That makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just have a scale
keyword so there is no recompilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raster(NaturalEarth, :HY_SR; scale=10)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so turns out that wont work! Because Rasters.jl has a fixed list of keywords to pass to getraster
which does not include scale
.
That's not going to "scale". I think we need a RasterDataSource
to declare its keywords to make this more flexible.
Like:
getraster_keywords(NaturalEarth) = (:scale, :version)
Then Rasters.jl can just pass those through and keep the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'd also like to discuss the getraster
interface more generally...the reliance on types kind of confused me.
Would love to chat about it at MakieCon :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With no types at all we have no dispatch. But do you mean types rather than constructed objects?
Partly its just historical, I didnt start the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could rewrite it all to use structs with fields.
Basically I organised and standardised the few initial sources as much as possible, added a few more, and have had very little time to put into it since.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
_zipfile_to_read(raster_name, zf) = first(filter(f -> splitdir(f.name)[2] == raster_name, zf.files)) | ||
|
||
function RasterDataSources.zippath(::Type{NaturalEarthRaster{Version, Scale, Name}}) where {Version, Scale, Name} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods will recompile with any change at all, probably not what you want.
I would guess Version
will hardly ever lead to recompilation as people would stick to one version. But then you have the problem of the default, which is easier as a keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I didn't pay too much attention to keyword options, so will definitely look at that!
|
||
struct NaturalEarthRaster{Version, Scale, Name} <: RasterDataSources.RasterDataSource end | ||
|
||
function ne_raster(; version = :latest, scale::Int = 10, name::Int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you instead pass these keywords through and not put them in the type, except maybe at most version? As that's more tied to the definition of the dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure. This probably ties in with the question of which dataset to get anyway.
raster_path = RasterDataSources.rasterpath(T) | ||
if !isfile(raster_path) | ||
zip_path = RasterDataSources.zippath(T) | ||
RasterDataSources._maybe_download(RasterDataSources.zipurl(T), zip_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extending methods with underscores is risky, that's marking them as internal so I can change them in a patch version.
Maybe we should may maybe_download
part of the API instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be nice, but as a utility function I could probably replicate it (I think it's just isfile(zip_path) ? zip_path : download(zipurl, zip_path)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed review @rafaqz! I'll definitely look into the layer/kwarg approach. |
Co-authored-by: Rafael Schouten <[email protected]>
This is a basic implementation of RasterDataSources.jl for NaturalEarth. With this, the code:
just works!