-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance ArchGDAL OGR objects with parametric composite types #266
base: master
Are you sure you want to change the base?
Conversation
I have changed the FeatureDefn type parameter of ArchGDAL OGR objects from FDType = Tuple{NTuple{NG,GType} where NG,NTuple{NF,FType} where NF} which is the same as FDType = Tuple{
Tuple{Vararg{GType, NG}} where NG,
Tuple{Vararg{FType, NF}} where NF,
} to (thanks to https://discourse.julialang.org/t/parametric-type-with-a-constrained-namedtuple/34511) FDType = Tuple{
NamedTuple{NG,<:Tuple{Vararg{GType}}} where NG,
NamedTuple{NF,<:Tuple{Vararg{FType}}} where NF,
}
The total performance gain with with the addition of avoiding most of the calls to Current master:@benchmark Tables.columntable(AG.getlayer(AG.read("../../tmp/road/road.shp"), 0)) seconds = 30
BenchmarkTools.Trial: 142 samples with 1 evaluation.
Range (min … max): 182.404 ms … 335.852 ms ┊ GC (min … max): 0.00% … 10.29%
Time (median): 196.571 ms ┊ GC (median): 0.00%
Time (mean ± σ): 213.022 ms ± 37.221 ms ┊ GC (mean ± σ): 3.27% ± 4.88%
▃█▅▄▂▄ ▄
▆██████▇██▇▃▃▁▃▁▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▃▁▁▃▃▅▄▇▁▃▄▆▃▅▃▁▁▁▁▁▁▁▁▁▃ ▃
182 ms Histogram: frequency by time 318 ms <
Memory estimate: 8.98 MiB, allocs estimate: 373869. Shapefile.jl:julia> @benchmark Tables.columntable(Shapefile.Table("../../tmp/road/road.shp")) seconds = 30BenchmarkTools.Trial: 198 samples with 1 evaluation.
Range (min … max): 142.824 ms … 163.894 ms ┊ GC (min … max): 0.00% … 5.48%
Time (median): 152.970 ms ┊ GC (median): 4.10%
Time (mean ± σ): 152.064 ms ± 4.070 ms ┊ GC (mean ± σ): 2.95% ± 2.46%
▃▂▂▄ ▄ ▇▂█▂▄▄▅
▃▁▁▃▁▃▃▅▆▅▆▆▅█▇█████▅██▃▃▅▃▃▅▁█▆██▆▇████████▆▃▆▇▇▆▅▅▁▁▁▃▃▁▁▁▃ ▃
143 ms Histogram: frequency by time 161 ms <
Memory estimate: 35.89 MiB, allocs estimate: 243040. Current master with geometry stealing:julia> @benchmark Tables.columntable(AG.getlayer(AG.read("../../tmp/road/road.shp"), 0)) seconds = 30
BenchmarkTools.Trial: 144 samples with 1 evaluation.
Range (min … max): 178.243 ms … 312.567 ms ┊ GC (min … max): 0.00% … 13.64%
Time (median): 191.755 ms ┊ GC (median): 0.00%
Time (mean ± σ): 208.443 ms ± 40.801 ms ┊ GC (mean ± σ): 3.39% ± 5.23%
█▂▁▆▅▂▂
▃████████▅▄▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▃▁▁▃▃▃▄▃▅▁▃▄▃▃▃▃ ▃
178 ms Histogram: frequency by time 312 ms <
Memory estimate: 8.98 MiB, allocs estimate: 373879. New ArchGDAL OGR types with geometry stealing:julia> @benchmark Tables.columntable(AG.getFDPlayer(AG.read("../../tmp/road/road.shp"), 0)) seconds = 30
BenchmarkTools.Trial: 169 samples with 1 evaluation.
Range (min … max): 158.547 ms … 245.862 ms ┊ GC (min … max): 0.00% … 7.86%
Time (median): 169.471 ms ┊ GC (median): 0.00%
Time (mean ± σ): 178.152 ms ± 22.915 ms ┊ GC (mean ± σ): 1.48% ± 2.67%
▁▃▃▆▄▇█▁▃▂▆▁
▇████████████▇▄▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃▅▇▃▃▆▄▅▃▃▁▃▁▁▁▃ ▃
159 ms Histogram: frequency by time 243 ms <
Memory estimate: 7.75 MiB, allocs estimate: 279394. Beyond the call to |
@yeesian I'm not able to push the commit associated to my previous comment into branch Enhance_ArchGDAL_types. I got the following message:
Do you know what I did wrong ? |
I've updated the branch protection settings now, can you try again? Thanks! |
It stills fails, but I have a different message:
|
Shall I try a force push with lease ? |
I have further relaxed the branch protection rules -- I'm so sorry for the inconvenience so far -- can you try yet again? I'll get to your PRs soon; have been on the road for the past few days. |
Great, it works. I was able to push commit a146c79 above |
Continuing to shave off time in layer to table conversion, I noticed that New convert functionsfunction convert(::Type{OGRwkbGeometryType}, gogtinst::GDAL.OGRwkbGeometryType)
return OGRwkbGeometryType(Integer(gogtinst))
end
function convert(::Type{GDAL.OGRwkbGeometryType}, ogtinst::OGRwkbGeometryType)
return GDAL.OGRwkbGeometryType(Integer(ogtinst))
end The convert functions code looks simpler and faster like this, and could be extended to other similar conversions between ArchGDAL Enum and GDAL CEenum.Cenum This has the prerequisite to align Results
julia> geom = AG.fromWKT("POINT (1 2)")
Geometry: POINT (1 2)
julia> @benchmark AG._infergeomtype($(geom.ptr))
BenchmarkTools.Trial: 10000 samples with 949 evaluations.
Range (min … max): 83.511 ns … 209.497 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 97.937 ns ┊ GC (median): 0.00%
Time (mean ± σ): 95.838 ns ± 13.089 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█ ▆ ▂▂ ▄█▂▁▅ ▂▂ ▁▁▁▃▁▁ ▁ ▂ ▂
█▂█▄██▆▆▅▄█▆██████████▇███████▇▇▇█▇▇▇▇▇██▆▆▇▇█▆▆▆▇▆▅▆▆▄▅▅▆▅▆ █
83.5 ns Histogram: log(frequency) by time 143 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> geom = AG.fromWKT("POINT (1 2)")
Geometry: POINT (1 2)
julia> @benchmark AG._infergeomtype($(geom.ptr))
BenchmarkTools.Trial: 10000 samples with 997 evaluations.
Range (min … max): 18.790 ns … 78.432 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 19.063 ns ┊ GC (median): 0.00%
Time (mean ± σ): 19.519 ns ± 2.273 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▇█▄ ▅ ▁ ▂ ▂ ▁ ▂
███▅██▄█▃▅▄▃▄▁▁▄▃▁▃▁▃▁▁▁▁▁▃▁▁▁█▆█▁▁▁▁▁▁▁▁▁▃▁▁▆█▅█▆▆▇▆▅▅▆▆▆▇ █
18.8 ns Histogram: log(frequency) by time 29.9 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
NextI will go saving time with a trial to implement a specialized version of |
I have copied and specialized Tables.jl/src/fallback.jl and Tables.jl/src/utils.jl. I had to use GeneralizedGenerated.jl to solve a world age problem with generated functions julia> display(@benchmark DataFrame(AG.getFDPlayer(AG.read("../../tmp/road/road.shp"), 0)) seconds = 30)
display(@benchmark DataFrame(Shapefile.Table("../../tmp/road/road.shp")) seconds = 30)
BenchmarkTools.Trial: 181 samples with 1 evaluation.
Range (min … max): 146.779 ms … 251.964 ms ┊ GC (min … max): 0.00% … 6.67%
Time (median): 157.170 ms ┊ GC (median): 0.00%
Time (mean ± σ): 165.837 ms ± 26.339 ms ┊ GC (mean ± σ): 0.99% ± 2.04%
▂▇▅█▅▁▁ ▆▁▁
▅███████▇███▅▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▁▃▁▄▁▄▃▃▃▃ ▃
147 ms Histogram: frequency by time 251 ms <
Memory estimate: 6.32 MiB, allocs estimate: 134898.
BenchmarkTools.Trial: 194 samples with 1 evaluation.
Range (min … max): 146.609 ms … 164.070 ms ┊ GC (min … max): 0.00% … 7.49%
Time (median): 152.686 ms ┊ GC (median): 0.00%
Time (mean ± σ): 154.683 ms ± 5.141 ms ┊ GC (mean ± σ): 2.93% ± 3.05%
▂ ▄▃▂▅▃█▄▂▅ ▄▂ ▇ ▃
▃▁▅▁█▆▁█████████▇▆▅▆▃▇▅▁▁▁▁▁▁▁▅▁▁▁▁▃▃▅▅▆██████▇▇█▆█▆▃█▅▅▃▃▃▁▃ ▃
147 ms Histogram: frequency by time 164 ms <
Memory estimate: 36.71 MiB, allocs estimate: 243065. |
…of `Tables.eachcolumns` for ArchGDAL
@visr you mentioned in comment #238 (comment) that you had a case where layer to table conversion took a longtime
Could you hand me the vector file example, to test it with this PR ? |
I believe in that case I was reading the HydroATLAS / BasinATLAS data. This is a direct link to the download: https://figshare.com/articles/dataset/HydroATLAS_version_1_0/9890531?file=20087237 One nice property of this dataset is that is has detail levels 1 to 12, which gradually get larger. Especially the higher levels started to get really slow I noticed. |
@visr Ok thanks. I will test the PR with this file set |
@visr I have reimplemented Results
Performance with HydroATLAS basin data level 6Shapefile.jlUncompiled Shapefile: 7.719764 seconds (10.36 M allocations: 792.434 MiB, 6.68% gc time)
Compiled Shapefile: BenchmarkTools.Trial: 50 samples with 1 evaluation.
Range (min … max): 1.790 s … 2.263 s ┊ GC (min … max): 1.03% … 10.02%
Time (median): 1.979 s ┊ GC (median): 0.92%
Time (mean ± σ): 1.999 s ± 94.965 ms ┊ GC (mean ± σ): 1.11% ± 1.29%
▁▃▁█ ▁ ▃ ▁
▇▁▁▄▁▁▄▁▁▄▄▄▄▁▁▁▄▄▁████▄▄▁▄▄▄▁▁▄▄█▇█▄█▄▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▁
1.79 s Histogram: frequency by time 2.26 s <
Memory estimate: 501.84 MiB, allocs estimate: 5446511. ArchGDAL 0.7.4Uncompiled AG 0.7.4 : 155.592979 seconds (69.32 M allocations: 3.783 GiB, 0.67% gc time, 93.75% compilation time)
Compiled AG 0.7.4 : BenchmarkTools.Trial: 5 samples with 1 evaluation.
Range (min … max): 9.593 s … 9.887 s ┊ GC (min … max): 2.33% … 2.87%
Time (median): 9.750 s ┊ GC (median): 2.73%
Time (mean ± σ): 9.739 s ± 124.606 ms ┊ GC (mean ± σ): 2.68% ± 0.22%
█ █ █ █ █
█▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ ▁
9.59 s Histogram: frequency by time 9.89 s <
Memory estimate: 443.71 MiB, allocs estimate: 10380858. FeatureLayer with stealgeom and new (Tables.)columnsUncompiled AG normal: 3.652381 seconds (11.98 M allocations: 291.840 MiB, 1.87% gc time, 38.22% compilation time)
Compiled AG normal: BenchmarkTools.Trial: 50 samples with 1 evaluation.
Range (min … max): 2.418 s … 7.823 s ┊ GC (min … max): 7.82% … 6.07%
Time (median): 2.496 s ┊ GC (median): 7.93%
Time (mean ± σ): 2.690 s ± 812.183 ms ┊ GC (mean ± σ): 7.66% ± 0.72%
█▄▄
███▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▁
2.42 s Histogram: log(frequency) by time 7.82 s <
Memory estimate: 179.96 MiB, allocs estimate: 10161324. FDP_FeatureLayer with (Tables.)columns using plain
|
…yer` and `AbstractFeatureLayer`
Great that you were able to improve the performance so much! I don't really have experience with generated / RuntimeGeneratedFunctions / GeneralizedGenerated, so it's hard for me to comment on it, other than that it's probably nice if we don't have to add a dependency. Do any of the limitations of generated functions from Base cause issues, like the world-age issues or closures? What was the reason to look into those packages? |
I had the world age issue when trying to call directly But the overhead of calling So the use of I’ll make a PR on master for the new |
- Reverted from `stealgeom` to `getgeom` in `getcolumn` to repair Tables unit tests - `Tables.schema` back to return `nothing` and `Schema` extraction functions from FeatureDefn called `gdal_schema` since the result is not a reliable `Table.Schema` (see Shapefile driver handling of mixed Line and MultiLine handling)
Repairing Tables methods unit testing, I had to revert from The impact on performance is limited for HydroATLAS basin test files and French main roads test file (see tests results below). FDP layer to table conversion is on par or faster (for big files) compared to Shapefile.jl
HydroATLAS Basin level 12Shapefile 1st run: 177.759011 seconds (347.88 M allocations: 20.670 GiB, 12.77% gc time)
Shapefile 2nd run: 186.639426 seconds (342.95 M allocations: 20.386 GiB, 17.35% gc time)
AG FDP pg 1st run: 103.441506 seconds (353.07 M allocations: 9.472 GiB, 9.15% gc time, 0.20% compilation time)
AG FDP pg 2nd run: 97.793840 seconds (348.00 M allocations: 9.178 GiB, 8.61% gc time, 0.01% compilation time) HydroATLAS Basin level 6Shapefile 1st run: 4.370634 seconds (10.36 M allocations: 792.293 MiB, 11.53% gc time)
Compiled Shapefile: BenchmarkTools.Trial: 16 samples with 1 evaluation.
Range (min … max): 1.782 s … 2.311 s ┊ GC (min … max): 0.00% … 14.18%
Time (median): 1.922 s ┊ GC (median): 4.01%
Time (mean ± σ): 1.944 s ± 113.983 ms ┊ GC (mean ± σ): 5.11% ± 3.67%
▃ █
▇▁▁▁▁▁▁▁▇▁▇▇▁█▇█▇▁▇▇▁▁▇▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
1.78 s Histogram: frequency by time 2.31 s <
Memory estimate: 501.84 MiB, allocs estimate: 5446511.
AG FDP pg 1st run: 4.993936 seconds (10.46 M allocations: 450.400 MiB, 2.49% gc time, 3.47% compilation time)
Compiled AG FDP pg: BenchmarkTools.Trial: 17 samples with 1 evaluation.
Range (min … max): 1.553 s … 1.979 s ┊ GC (min … max): 1.75% … 1.98%
Time (median): 1.839 s ┊ GC (median): 1.91%
Time (mean ± σ): 1.813 s ± 132.176 ms ┊ GC (mean ± σ): 1.95% ± 0.71%
█ ▁ ▁ ▁ █▁▁ █ ▁ ▁ ▁ ▁ █
█▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁███▁█▁▁▁█▁█▁▁█▁█▁▁▁▁▁█ ▁
1.55 s Histogram: frequency by time 1.98 s <
Memory estimate: 148.99 MiB, allocs estimate: 5379083. French main roads fileShapefile 1st run: 2.446287 seconds (5.19 M allocations: 329.013 MiB, 5.75% gc time)
Compiled Shapefile: BenchmarkTools.Trial: 192 samples with 1 evaluation.
Range (min … max): 147.897 ms … 170.355 ms ┊ GC (min … max): 0.00% … 0.00%
Time (median): 157.005 ms ┊ GC (median): 4.41%
Time (mean ± σ): 156.399 ms ± 4.845 ms ┊ GC (mean ± σ): 2.94% ± 2.80%
▂ ▁ ▂ ▁▇█▁ ▂ ▁ ▁ ▂ ▄▄▂▁▂ ▅
▃▃▃▃▃▃█▅█▆█████████▃███▁▅▁▁▅▁▁▃▁▅▃█▅█▆▅▃█▆██████▃██▁█▆▃▅▅▅▆▁▅ ▃
148 ms Histogram: frequency by time 165 ms <
Memory estimate: 36.71 MiB, allocs estimate: 243067.
AG FDP pg 1st run: 2.479176 seconds (5.13 M allocations: 295.079 MiB, 4.28% gc time, 8.29% compilation time)
Compiled AG FDP pg: BenchmarkTools.Trial: 153 samples with 1 evaluation.
Range (min … max): 176.988 ms … 274.031 ms ┊ GC (min … max): 0.00% … 7.41%
Time (median): 187.198 ms ┊ GC (median): 0.00%
Time (mean ± σ): 196.996 ms ± 24.662 ms ┊ GC (mean ± σ): 1.46% ± 2.73%
▁█▇ ▂ ▁
▃▄█████▇▇█▄▅▅▇▃▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▃▃▃▃▃▃▃▃▃▃▃▃▁▃▃ ▃
177 ms Histogram: frequency by time 264 ms <
Memory estimate: 8.17 MiB, allocs estimate: 292898. |
- Fixed `NamedTuple` creation in `Tables.columns` for Julia >= 1.7 - Fixed `_gtnames` for Julia >= 1.7
I'm really sorry, but I'm going through a period of bereavement due to a personal loss, and nominate @visr to review this PR in my absence. If he might be unavailable to do so, I'll review it when I'm in the appropriate state to do so. (At a high level, performance matters to me: I'm partial to both this PR and #243 and will allow for them to introduce breaking/experimental changes since it's still pre-1.0 and might give some directional information/guidance on workflows and APIs that enable performance gains. I'd review with an eye towards complexity and maintainability, but my preferences are ultimately non-blocking and I'm willing to postpone them until after @mathieu17g has explored the space of generated functions and unexploited GDAL functions that are important for optimization and might have implications for the API.) |
Really sorry to hear that @yeesian. Of course take all the time you need, I wish you strength during a difficult time. I can probably help out a bit here with reviewing. As it is right now I don't fully understand the PR, but that also has to do with my limited experience with generated functions and since I didn't properly dive into the code yet. @mathieu17g hopefully when you come out of the (impressive) rabbit hole here the result will be sufficiently simple. |
@yeesian, I'm so sorry for your loss. Thank you for having taken the time to share your views with us in those difficult times. As of this PR and |
…ness between ArchGDAL and Shapefile
@visr I have:
Could you please have a look to it, and test its performance on your own.
|
@mathieu17g I'm a little bit lost. It would be of great help to me if you could write a high level summary of (1) the issue that this PR is trying to fix, and (2) how the proposed changes help fix this issue. I have only a rough understanding about ArchGDAL Tables performance issues. If there is a simple change that solves the largest performance bottleneck, but leaves another 10% on the table, that may well be preferred. I see you introduce new parametrized structs besides the original unparametrized ones. Is that for now to isolate the changes, or is there a reason we'd have to keep both around? |
@visr sure, I will make a summary |
@visr this PR tries to solve two issues and introduces some more limited enhancements:
The parametric types introduced to solve the 1st issue can be:
Since Follow-ups besides the parametric types handling:
1.
|
Thanks a lot @mathieu17g for taking the time to write down all your findings! I'll get back to it when I have some more time. |
Hi @visr, did you had any time lately to look into this PR ? In the mean time, I proposed a PR in GDAL to be able to extend the geometry stealing performance gain beyond the the first geometry. It is still under review. Once merged, I will see to propagate it all the way to ArchGDAL. |
@mathieu17g thanks for the bump, and sorry I did not yet sit down to get the proper look that this work deserves. I hope to get to this soon. I see OSGeo/gdal#5196 landed, nice work there! Regarding this bit of your writeup:
Perhaps it's a good idea to go ahead and raise that issue, with a reproducible example for them to test the performance? I've seen the same functions show up in my earlier profiles, but couldn't really figure out how to improve it. |
@visr I have proposed in JuliaData/Tables.jl#273 solution to overcome the performance issue at first run. But the current PR implementation of |
Awesome, I tried your inlining change from JuliaData/Tables.jl#273 (comment) as well. On level 4 BasinATLAS I see similar results, around 100x speedup of compilation, and no real effect on runtime, only some less allocations and less time in gc. Noted this was just a simple repeated at-time exercise.
Do you want to go ahead and submit that change as a PR to Tables.jl? You'll probably get more feedback that way. And if it lands, we get a free 100x in ArchGDAL! Good to know that this PR would still improve the situation futher at least 5x. Though we'd have to weigh it against the added complexity of course, it's not a one line change that gives 100x. So let's start there? |
I have made the PR JuliaData/Tables.jl#275 to solve the compilation performance issue JuliaData/Tables.jl#273 and another one seen during the investigation on column ordering when using |
Great, thanks! Though perhaps better to split it into two separate PR's since the two changes are unrelated. The inlining change is straightforward. For the dicts I guess the main question is do they want to take on OrderedCollections.jl as a new dependency. |
Ok let's wait for some feedback from the @quinnj |
Sorry for the slowness; I've been slammed w/ non-julia things lately. I'll try to dig in tonight. |
It seems like a lot of work has gone into this PR. What's the current state? This shouldn't go to waste :) |
Some of this is superceded by #316 There the enum integers are used in the type directly. Its simple and works. But we could also swap them to real types, that would be a minor change (but no performance improvement). I tried to use this PR but it contains a lot of unrelated ideas. It should be cleanly separated into separate PRs |
Ok, to me it looks as if #316 will be easier to merge and comes first. @mathieu17g can you somehow separate the ideas you had into separate PRs, which are more digestible on their own and hence easier to merge? E.g. one for table interface improvement, one for geometry stealing, one for better convert functions (if they can be separated like that). This can maybe be done best after #316 has been merged. Or do you have any objections against merging #316 and arguments for going further with your version (which looks more complicated after a superficial read)? |
I think there's a lot of great ideas here (for not only improvements but also diagrams and benchmarking) that I'm happy to keep it as-is (without deleting or merging) and closing it only after we feel we've made progress on the items that we'd like to incorporate. |
Here is a draft for review, demonstrating:
ogr_f_stealgeometry
for a gain of an additionnal 7% gain on layer to table conversion when there is only one geometry fieldgetfield
(to fix issue Consider turning the enums into types for dispatch #246) which brings 5% performance gain to the latterPerformance results
Current master
With Shapefile.jl
Current master with
ogr_f_stealgeometry
used ingetcolumn
With new OGR parametric types and with
ogr_f_stealgeometry
used ingetcolumn
1. ArchGDAL OGR related types moved to parametric composite types
Trying to dispatch on types required:
1.1 Creation of two parametric singleton types for fields and geomfields, and of a featuredefn like type alias
FType
for fields, which is paramaterized by a tuple ofOGRFieldType
andOGRFieldSubType
GType
for geomfields, which is parameterized by anOGRwkbGeometryType
FDType
for featuredefn, which is a 2-tuple of a tuple ofGType
and a tuple ofFType
Note 1: I have added convert functions from
FType
andDataType
which yields a more readable correspondance between Julia DataTypes and GDAL Field type ids, than the current separated convert functions fromOGRFieldType
andOGRFieldSubType
toDataType
Note 2: I have also added a commented draft of new
Geometry
parametric singleon type (namedGeom
) based onGType
but not used it since I have not identified any benefit1.2 Duplication of most of OGR related composite types and parameterization with
FType
,GType
andFDType
The duplicated types have the
FDP
prefixFDP_GeomFieldDefn
andFDP_FieldDefn
types are parameterized byGType
andFType
FDP_FeatureDefn
,FDP_Feature
andFDP_FeatureLayer
types are parameterized byFDType
All of them have been set as subtypes of corresponding abstract supertypes
DUAL_AbstractXXX
, in order to limit the duplication of functions code when not necessary.Note: A plain FeatureDefn parametric composite type could later be used as type parameter for Feature and FeatureLayer
1.3 Modification of a few functions, used to show a layer and convert it to a table
getFDPlayer
to handle a feature layer data source with all the newly introduced parametric composite typesFDType
information, have just been extended to acceptDUAL_AbstractXXX
FDType
information, have been duplicated as generated functionsNote: In the event of the unlikely use case where there would be too many generated functions (a lot of different data sources with different featuredefn) for compilation to be performed in a reasonable time or even to succeed, the generated functions could be moved to optionally-generated functions. The fallbacks being the legacy functions
1.4 Side additions
I have added a macro to export
Enum
type id for user convenience. Disabled in the PR because it is not convenient when developping using revise.jl because it redefines the related constants. I have not found a way to inhibit the latterI have generalized the
ownedby
field in feature layer related types which may serve as a basis to fix issue Model and handle relationships between GDAL objects systematically #215, but @yeesian is tackling this one. For all intents and purposes here are below the relationships which are proposed:Ownership relations graph
destroy
functions forFDP_XXX
types have been created accordingly2. Steal geometry instead of cloning for the first geometry in features
There appears that
Tables.rows(layer)
makes a copy of the layer.Since
ogr_f_stealgeometry
is faster than the composition ofogr_f_getgeomfieldref
andogr_g_clone
, I have modifiedTables.getcolumn
to use astealgeom
function instead of the currentgetgeom
function.Unfortunately,
ogr_f_getgeomfieldref
is implemented in GDAL C interface only for the default (first) geometry. Datasources with only one geometry per feature is the general case, but that could be an issue to raise in GDAL.3. Further performance optimizations to investigate, regarding the layer to table conversion performance
FDType
to avoid callingogr_f_getfieldindex
Count the number of calls toogr_l_getnextfeature
to ensure there is no unnecessary call to itEDIT: Done only one extra call to
iterate
compared to the number of features when converting a layer to table