Skip to content
This repository has been archived by the owner on Jun 22, 2023. It is now read-only.

[Enhancement] Performance-Friendly IMathTransform Alternative #32

Open
airbreather opened this issue Oct 11, 2017 · 13 comments
Open

[Enhancement] Performance-Friendly IMathTransform Alternative #32

airbreather opened this issue Oct 11, 2017 · 13 comments
Labels
breaking change Doing this would break either source or binary compatibility with existing versions. enhancement
Milestone

Comments

@airbreather
Copy link
Member

The GeoAPI.CoordinateSystems.Transformations.IMathTransform API is sub-par. Unless the implementer and consumer mutually agree on a protocol that would allow outputs to be pooled (which is stricter than what GeoAPI demands), it pretty much forces all implementations to allocate a whole new object on the managed heap for every call to hold the return value.

This is pretty bad, and TransformList methods don't help this at all... in fact, they can make it worse, because in exchange for reducing virtual calls compared to the alternatives, they force all the inputs and outputs to be allocated at once, which (beyond a certain list size) will guarantee that they get tenured.

For a while, we were using our own Geotools.NET at work, and we solved this with just a void Transform(ref double x, ref double y). PackedDoubleCoordinateSequence.GetRawCoordinates() gives us what we need to use that method to project / unproject without any temporary allocations.

Obviously, it's a lot more complicated to do this in GeoAPI / ProjNet4GeoAPI, since there are external consumers. Any thoughts about how we could make this better?

@DGuidi
Copy link
Contributor

DGuidi commented Oct 11, 2017

looks ok to me if we use simply doubles as you suggested.
if you're worried about breaking changes, maybe we can use Obsolete attribute to old methods and add documentation about why new methods are preferable.
probably, dotspatial guys should implement newer methods, too.

@airbreather
Copy link
Member Author

Yeah, I'm mainly worried about breaking changes.

I think the reason why each point is either a double[] or a Coordinate is because IMathTransform might be intended to work with XYZ[M?] inputs. So just passing in ref double x, ref double y isn't enough... but does it really make sense to pass in ref double x, ref double y, ref double z, ref double m when a significant fraction of callers will only transform X/Y?

A new interface, or set of interfaces, might be ideal for this, but I'm not sure.

@FObermaier
Copy link
Member

FObermaier commented Oct 12, 2017

For geocentric transformations you need the z-ordinate.
Just a thought:
As ICoordinateTransformations and their IMathTransform objects are usually created using a factory method it should be possible to tweak the transformation chain to make a copy only on the first transform and then work on the input data.

This would have to be done on the implementation side without changing the GeoAPI IMathTransform interface.

@airbreather
Copy link
Member Author

airbreather commented Oct 13, 2017

As ICoordinateTransformations and their IMathTransform objects are usually created using a factory method it should be possible to tweak the transformation chain to make a copy only on the first transform and then work on the input data.

This would have to be done on the implementation side without changing the GeoAPI IMathTransform interface.

Wouldn't that just contain the damage to "only" one allocation per point per call?

An alternative might be to add void Transform(double[] coord), void Transform(Coordinate coord), void Transform(IList<double[]>), void Transform(IList<Coordinate>), void Transform(ICoordinateSequence sequence) methods to the interface. This should make it clear that the input values are changed.

Yeah, something like that might be viable. The problem with adding more stuff to an interface that's already been shipped is that everyone downstream of GeoAPI who implements that interface will have to add those methods. There's a proposal out there targeted at C# 8 that would let us implement those "in-place" methods in terms of the problematic ones to avoid the breaking change while still letting subclasses do better, but even if we had that today, it requires runtime support, so it wouldn't work for all GeoAPI target platforms.

I'm wondering if it would make sense to just create a separate, focused interface for each "kind" of coordinate that just has a single in-place transformation method?

public interface ICoordinateTransformXY
{
    void Transform(ref double x, ref double y);
}

public interface ICoordinateTransformXYZ
{
    void Transform(ref double x, ref double y, ref double z);
}

public interface ICoordinateTransformXYM
{
    void Transform(ref double x, ref double y, ref double m);
}

public interface ICoordinateTransformXYZM
{
    void Transform(ref double x, ref double y, ref double z, ref double m);
}

Having those four independent interfaces makes it clear in the type system exactly "what" is being transformed at compile-time (instead of needing external context or runtime checks), and you could use any one of the four to implement (parts of) the current IMathTransform, which could remain as-is.

ProjNet4GeoAPI could then reorganize itself a little so that the "core" of the math calculations are exposed as implementations of that ICoordinateTransformXY, which performance-sensitive users could get from a new property that we add to MapProjection.

edit: simple map projections are the only use case I have for ProjNet4GeoAPI, so the last paragraph sorta assumes that... obviously, the other stuff in ProjNet4GeoAPI could expose a similar thing with higher-dimension transforms if there's something performance-sensitive about those as well.

@FObermaier
Copy link
Member

I don't know how familiar you are with the coordinate transformation steps in ProjNet[4GeoAPI], but is possible (and I supsect usual) to have a transition from 2D to 3D coordinates somewhere along the transformation chain (GeocentricTransformation).

So for this to work, I'd say we need one interface specifying both 2D and 3D functions. I think we can omit the measure value here:

public interface ICoordinateTransfromationEx {
    void Transform(byref double x, byref double y); 
    void Transform(byref double x, byref double y, byref double z); 
}

Nonetheless, that would be a major effort in Proj4Net because internally MapProjection work on and with double[] (MeterToDegrees, DegreesToMeters, MetersToRadians, RadiansToMeters). We'd still have to create a double array.

@airbreather
Copy link
Member Author

So for this to work, I'd say we need one interface specifying both 2D and 3D functions. I think we can omit the measure value here:

Not necessarily. That could just be a different interface to add to the list above:

public interface ICoordinateTransformXYToXYZ
{
    void Transform(ref double x, ref double y, out double z);
}

We'd still have to create a double array.

Yeah, at least for compat, we'd need to keep the double[] stuff around. The actual math stuff could move out into implementations of these new interfaces, though:

public abstract class MapProjection
{
    // ...

    // maybe protected set, maybe virtual get accessor, maybe something else
    public ICoordinateTransformationXY RadiansToMetersTransformer { get; protected set; }
    public ICoordinateTransformationXY MetersToRadiansTransformer { get; protected set; }

    // ...

    // if we aren't worried about external subclasses, then these can become private.
    protected virtual double[] RadiansToMeters(double[] lonlat)
    {
        double[] result = (double[])lonlat.Clone();
        this.RadiansToMetersTransformer.Transform(ref result[0], ref result[1]);
        return result;
    }

    protected virtual double[] MetersToRadians(double[] p)
    {
        double[] result = (double[])p.Clone();
        this.MetersToRadiansTransformer.Transform(ref result[0], ref result[1]);
        return result;
    }
}

Nonetheless, that would be a major effort in Proj4Net because internally MapProjection work on and with double[] (MeterToDegrees, DegreesToMeters, MetersToRadians, RadiansToMeters).

Yeah, it would definitely shake things up on the ProjNet4GeoAPI side, especially since I didn't put "reverse" methods on those sample interfaces. Even if the interfaces did have "reverse" methods, as long as there are concerns about compat, then [Serializable] means that we have to tread carefully over there.

@FObermaier
Copy link
Member

A solution without a whole bunch of new interfaces would be

  • Add void Transform(ref double[] xy, ref double[] z) to the IMathTransform interface and implement it. xy is an array like new []{x0, y0, x1, y1, ... xn, yn}, z is null or an array like new double {z0, z1, ... zn}
  • Add abstract RadiansToMeters(ref double[] lonlat, ref double[] alt) and MetersToRadians(ref double[] xy, ref double[] z) to MapProjection.

It also has the benefit of less virtual function calls.

@airbreather
Copy link
Member Author

airbreather commented Mar 6, 2019

Including Zs in the interface sounds like a good idea overall.

ref double[] xy
ref double[] lonlat

Small tweak: ref DoublePair[] xy / ref DoublePair[] lonlat

DoublePair would be:

public struct DoublePair
{
    public double First;
    public double Second;
}

usage:

for (int i = 0; i < xy.Length; i++)
{
    xy[i].First = CalculateX();
    xy[i].Second = CalculateY();
    z[i] = CalculateZ();
}

However maybe it should also allow for SoA layouts? void Transform(double[] xs, double[] ys, double[] zs) has the potential to be faster (see benchmarks I did over here... AVX2 looks a lot better).

We could have the abstract method be one or the other, and then have a virtual method that transforms a call to one into calls to the other one-by-one.

Also, Span<T> instead of T[].

@airbreather
Copy link
Member Author

airbreather commented Mar 6, 2019

To put it a bit more precisely what I'm currently thinking of:

public abstract (double x, double y, double z) Transform(double x, double y, double z);
public virtual void TransformAoS(
    ReadOnlySpan<DoublePair> inXY, ReadOnlySpan<double> inZ,
    Span<DoublePair> outXY, Span<double> outZ)
{
    // validate lengths all the same
    for (int i = 0; i < inXY.Length; i++)
    {
        (outXY[i].X, outXY[i].Y, outZ[i]) = Transform(inXY[i].X, inXY[i].Y, inZ[i]);
    }
}

public virtual void TransformSoA(
    ReadOnlySpan<double> inXs, ReadOnlySpan<double> inYs, ReadOnlySpan<double> inZs,
    Span<double> outXs, Span<double> outYs, Span<double> outZs)
{
    // validate lengths all the same
    for (int i = 0; i < inXs.Length; i++)
    {
        (outXs[i], outYs[i], outZs[i]) = Transform(inXs[i], inYs[i], inZs[i]);
    }
}

There could, of course, be methods that work with arrays of {x0, y0, z0, x1, y1, z1, ..., xn, yn, zn} or something as it suits us... perhaps something with measures too in case we want to avoid copying for a PackedDoubleCoordinateSequence with Dimension = 4 / Measures = 1.

@FObermaier
Copy link
Member

@airbreather
Copy link
Member Author

Please have a look at https://github.com/NetTopologySuite/ProjNet4GeoAPI/tree/perf/Transform_with_Span

Main things that stand out to me (ignoring smaller things I would identify for a "real" PR):

  • Why ref Span<T>? Just Span<T> should be fine... letting the callee overwrite the entire span at once seems strange, and I can't think of any important use cases that this enables.
  • I personally favor accepting ReadOnlySpan<T> inputs / Span<T> outputs; not all callers will want to transform in-place, and those callers who want to do so could pass in the same span both times.
  • I'm in favor of making the transform implementation only need to implement a one-by-one method.
    • If all we have is that, then we can make a decent default implementation of all the "multiples" that make sense for us to implement.
    • Yes, our default "call abstract method in a loop" would be slower than if we forced the subclass to implement an "all at once" variety, but I'm actually not too worried about it. I mainly want to give the implementation the opportunity to optimize "multiples" without requiring it to write boilerplate.

I can submit a PR targeting that branch later to give an idea of what I'm talking about, in case the things I'm saying are a bit too far removed from actual code for them to make sense...

@airbreather
Copy link
Member Author

(@FObermaier I've pushed a change to let it build on other people's machines and to remove the junk that was only needed when we were multi-targeting).

@airbreather airbreather added breaking change Doing this would break either source or binary compatibility with existing versions. and removed help wanted labels Mar 7, 2019
@airbreather
Copy link
Member Author

I can submit a PR targeting that branch later to give an idea of what I'm talking about, in case the things I'm saying are a bit too far removed from actual code for them to make sense...

NetTopologySuite/ProjNet4GeoAPI#42

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
breaking change Doing this would break either source or binary compatibility with existing versions. enhancement
Projects
None yet
Development

No branches or pull requests

3 participants