Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spatial join area #1713

Open
wants to merge 46 commits into
base: master
Choose a base branch
from

Conversation

Jonathan24680
Copy link
Collaborator

@Jonathan24680 Jonathan24680 commented Jan 17, 2025

This merge request adds the possibility to use arbitrary WKT Geometries (i.e. areas) in SpatialJoin queries. Right now, areas can get approximated as the center point or the true distance between them is estimated. In addition to that a future PR will add pre computed bounding boxes to the index generation to enable even faster query times (like it already exists for geopoints)

@Jonathan24680 Jonathan24680 requested a review from joka921 January 17, 2025 10:00
Copy link

codecov bot commented Jan 21, 2025

Codecov Report

Attention: Patch coverage is 94.32990% with 11 lines in your changes missing coverage. Please review.

Project coverage is 90.06%. Comparing base (d7a70c7) to head (e42d9d0).
Report is 12 commits behind head on master.

Files with missing lines Patch % Lines
src/engine/SpatialJoinAlgorithms.cpp 93.71% 5 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1713      +/-   ##
==========================================
+ Coverage   90.03%   90.06%   +0.02%     
==========================================
  Files         395      396       +1     
  Lines       37904    38062     +158     
  Branches     4263     4277      +14     
==========================================
+ Hits        34128    34280     +152     
- Misses       2479     2486       +7     
+ Partials     1297     1296       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial general review of everything but the tests.
When you have addressed the initial principle suggestions I will have a closer look

src/engine/SpatialJoinAlgorithms.h Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
Comment on lines 83 to 85
} catch (...) {
return std::nullopt;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does boost geometry have a mode or overload of readWkt that doesn't use exceptions? then use that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. At least I'm not aware of such a function or overload and could not find one.

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pass on SpatialJoinAlgoirthms.cpp (where most of the code lies anyway) :)

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
// ____________________________________________________________________________
double SpatialJoinAlgorithms::convertDegreesToMeters(
AnyGeometry geometry1, AnyGeometry geometry2) const {
return boost::apply_visitor(DistanceVisitor(), geometry1, geometry2) * 78.630;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly is the multiplication doing here, the magic constant should be explained.
Is the function name actually computeDistanceInMeters (you first compute it, and then do some magic multiplication to convert)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a very verbose explanation of this is in the header file. But you're right, the function should be called different, as it first gets the distance in degrees and then converts it. I called it computeDist, since that's what the other compute distance functions are called. Then it's consistent and depending on the parameters available, one can call one or the other computeDist function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification, I am much less confused now:)

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
Comment on lines 134 to 138
AnyGeometry geom1 = geo1.geometry.value();
AnyGeometry geom2 = geo2.geometry.value();
if (geo1.geoPoint.has_value()) {
geom1 = convertGeoPointToPoint(geo1.geoPoint.value());
} else if (geo2.geoPoint.has_value()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is stored e.g. in geo1.geometry.value() if geo1 points to a POINT already? This last part looks a bit confusing.

Copy link
Collaborator Author

@Jonathan24680 Jonathan24680 Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one line above, there is a comment, which states "one point and one area". So this is the case, where we have one GeoPoint and one area. Then geom1 and geom2 get initialized to be the geometries (which is wrong in one case. When the geometry is a GeoPoint, then geomX.geometry is std::nullopt). The if below checks, which of those two is not a geometry but a geopoint. Then it converts geopoint to boost point (as boost requires both types to be boost geometries). You're right, that they should not be initialized with .value, since that is std::nullopt for at least one of them. I still have to write a test, which checks that, but it should work now

boost::geometry::read_wkt(getAreaString(idtable, row, col), geometry);
} else {
geometry = Point(point.value().getLng(), point.value().getLat());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easier control flow:
if (point.has_value) { return point; } else { return getAnyGeometry(...)} Everything else then is not nested in the if`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'(with my suggestion for the AnyGeometry)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand what you mean

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (point.has_value()) {
  return std::optional{AnyGeometry{convertPoint...()}};
}
return std::optional{getAnyGeometry(...);

You can drop the std::optional if you explicitly set the return type of the lambda to std::optional<AnyGeometry>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course only relevant, if we still have this code at all.

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another round of reviews.

src/engine/SpatialJoinAlgorithms.cpp Show resolved Hide resolved
// ____________________________________________________________________________
double SpatialJoinAlgorithms::convertDegreesToMeters(
AnyGeometry geometry1, AnyGeometry geometry2) const {
return boost::apply_visitor(DistanceVisitor(), geometry1, geometry2) * 78.630;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification, I am much less confused now:)

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
boost::geometry::read_wkt(getAreaString(idtable, row, col), geometry);
} else {
geometry = Point(point.value().getLng(), point.value().getLat());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (point.has_value()) {
  return std::optional{AnyGeometry{convertPoint...()}};
}
return std::optional{getAnyGeometry(...);

You can drop the std::optional if you explicitly set the return type of the lambda to std::optional<AnyGeometry>

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another pass

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
Comment on lines 114 to 116
const auto& areaBox = bbox.value();
Point p = calculateMidpointOfBox(areaBox);
return GeoPoint(p.get<1>(), p.get<0>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think below the code gets much more readable if you do
if (!useMidpoint) { return computeDist(index1, index2);}; //.
Then it becomes much clearer that the whole complexity is just for the midpoint case (is this actually useful still?)
If you want to keep the midpoint algorithm then you can move the code into a separate function computeDistUsingMidpoints and then the actual computeDist becomes very short and readable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't get what you mean. The if else code blocks in the current implementation are equally large and complex for both cases.

src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.cpp Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
src/engine/SpatialJoinAlgorithms.h Outdated Show resolved Hide resolved
@sparql-conformance
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants