Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add applyreduce method #40

Merged
merged 8 commits into from
Jan 12, 2024
Merged

add applyreduce method #40

merged 8 commits into from
Jan 12, 2024

Conversation

rafaqz
Copy link
Member

@rafaqz rafaqz commented Jan 2, 2024

This PR adds an applyreduce method, like apply, but reducing rather than rebuilding.

applyreduce is a variation on mapreduce as its kind of similar except for the nesting of apply. It has a function f that is applied to each nested geom, and a reducing op as the second argument.

I'm implementing it for area, centroid and maybe distance. The idea is you could take the area of a whole dataframe column of FeatureCollections, Feature, geometries, whatever in one go (threaded if you like). centroid will also take the centroid across the whole column/whatever by combining centroids and areas fractionally in the reducing op.

distance could take the minimum distance from any collection of objects to a point? I'm not sure if thats as useful as the othere but it may as well?

It could also be used for e.g. polygon union over any object.

@rafaqz rafaqz marked this pull request as ready for review January 3, 2024 02:37
@rafaqz rafaqz requested a review from skygering January 3, 2024 02:37
@@ -77,8 +83,11 @@ computed slighly differently for different geometries:
Result will be of type T, where T is an optional argument with a default value
of Float64.
"""
signed_area(geom, ::Type{T} = Float64) where T <: AbstractFloat =
_signed_area(T, GI.trait(geom), geom)
function signed_area(geom, ::Type{T} = Float64; threaded=false) where T <: AbstractFloat
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing is that I didn't define the signed area for multi-geometries. What does it really mean since different polygons could have different winding orders. If I understand the code correctly, this does define signed area for multi-geometries

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to define it for multi-geometries, we should probably put a warning.

Copy link
Member Author

@rafaqz rafaqz Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I did wonder about that. So it only works if the winding is consistent.

I guess we can not do this for signed area? Just area? Honestly I implemented it to be complete along with distance and signed distance, and it does make sense in that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean to aggregate signed areas? is that a useful thing to have?

We could make sure that at the collection and multi-geometry level and children have the same sign in the area?

Like all polygons must have positive signed areas, or all netagive, or it doesn't make sense. We could check that and throw a warning if it doesn't hold?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I don't see the use case. We still check if the polygon or shape is CW or CCW under the hood, so we aren't saving the user any computation by giving them a signed area if that is what they want to know (in addition to area obviously). We are also adding in a check and potentially throwing a warning, which will slow things down a bit. It seems like if a user wants the area and wants to know if they all have the same winding order, they should just get the signed area of each polygon, or the area and then check the winding order themselves. However, if we do have it, I think a warning is good to avoid a misunderstanding. Because the issue is if we have two polygons, one CW with an area of 50 and one CCW with an area of 50, the total signed area of the multipolygon will be 0. Which to me seems pretty confusing and now the user has none of the information that we calculated under the hood. So they lose the area and winding order information entirely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, lets just keep area using applyreduce then and make it clear in the docs that signed_area has a more limited application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of my main intentions with these methods is to standardize everything and make all of our functions work on the same set of possible inputs, and to do that in a centralized, structured way rather than actually writing all that code.

But the generality will break in some places.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree with that. Just not in this case.

@rafaqz rafaqz merged commit 49af1cc into main Jan 12, 2024
4 checks passed
@rafaqz rafaqz deleted the applyreduce branch January 12, 2024 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants