Add more descriptive docs + some experiments (#108)
* Add more descriptive docs + some experiments

* Update docs project to include experiment packages

* Add more benchmark files (still raw)

* Update apply return type docs

Co-authored-by: Rafael Schouten <[email protected]>

* Update docs/src/paradigms.md

Co-authored-by: Rafael Schouten <[email protected]>

* Update docs/src/paradigms.md

Co-authored-by: Rafael Schouten <[email protected]>

* Update docs/src/peculiarities.md

* Add code for `orient` demo

This is a dashboard that's meant to be run.  Will add an animation later as well.

* Add examples from issue

* Add a true summary figure to the docs

* Import the relevant Chairmarks/BenchmarkTools functions

* Update Project.toml

* Write the GeometryOps HackMD call notes to the docs

As a hidden page for now but that can always change!

* Add MultiFloats

* Add NaturalEarth.jl devbranch when building docs

* make Julia actually execute the code

* Add Statistics, fix namespacing error

* `geometry_providers.jl`: Remove redundancy, add comments

* `vector_benchmark_plot.jl`: add a comment on top

* rearrange file

* Add warning that BoolsAsTypes are not public API

---------

Co-authored-by: Rafael Schouten <[email protected]>
asinghvi17 and rafaqz authored Apr 23, 2024
1 parent c671b34 commit 8f46d15
Showing 9 changed files with 533 additions and 0 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/CI.yml
@@ -48,6 +48,8 @@ jobs:
      - uses: julia-actions/setup-julia@v1
        with:
          version: '1'
      - name: Add custom versions of packages
        run: julia --project=docs -e 'using Pkg; Pkg.add(PackageSpec(; url = "https://github.com/JuliaGeo/NaturalEarth.jl", rev = "as/scratchspaces"))'
      - uses: julia-actions/julia-buildpkg@v1
      - uses: julia-actions/julia-docdeploy@v1
        env:
182 changes: 182 additions & 0 deletions benchmarks/geometry_providers.jl
@@ -0,0 +1,182 @@
#=
# Geometry providers
This file benchmarks GeometryOps methods on every GeoInterface.jl implementation we can find, in order to test:
a. genericness, i.e., does GeometryOps work correctly with all GeoInterface.jl implementations?
b. performance, i.e., how does GeometryOps compare to the native implementation?
c. performance issues in the packages' implementations of GeoInterface
=#

# First, we import GeoInterface and GeometryOps,
import GeometryOps as GO, GeoInterface as GI
# then the geometry providers (`PROVIDERS` refers to `GI.Wrappers`, so `GI` must already be defined):
using ArchGDAL, LibGEOS, Shapefile, GeoJSON, WellKnownGeometry, GeometryBasics, GeoInterface, GeoFormatTypes
PROVIDERS = (ArchGDAL, LibGEOS, GeometryBasics, GI.Wrappers)
# Finally, we import some utility benchmarking, plotting and data munging packages!
using BenchmarkTools, Chairmarks, CairoMakie, MakieThemes, DataFrames, Proj
using CoordinateTransformations, Rotations
import Statistics # needed for the `Statistics.mean` / `Statistics.std` calls in the plotting section


# Polylabel.jl is a package that finds the "pole of inaccessibility" of a polygon,
# i.e., the point within it that is furthest away from its boundaries.

# It depends on GeometryOps, but in this instance, we'll grab some of its test geometries
# to use.
import Polylabel

# TODO: we convert to LibGEOS as an intermediate step here so that the
# linear rings of the WKG polygons are interpreted correctly. Unfortunately,
# that does not work when the polygons are read directly; there is an open issue for this.
water1 = GeoFormatTypes.WellKnownText(GeoFormatTypes.Geom(), readchomp(joinpath(dirname(dirname(pathof(Polylabel))), "test", "data", "water1.wkt")) |> String) |> x -> GI.convert(LibGEOS, x) |> GO.tuples
water2 = GeoFormatTypes.WellKnownText(GeoFormatTypes.Geom(), readchomp(joinpath(dirname(dirname(pathof(Polylabel))), "test", "data", "water2.wkt")) |> String) |> x -> GI.convert(LibGEOS, x) |> GO.tuples
# To fix these polygons is a complicated task, and even then LibGEOS gets it wrong:
# water1 |> x -> LibGEOS.makeValid(GI.convert(LibGEOS, x)) |> GI.getgeom |> collect |> x -> filter(y -> GI.trait(y) isa Union{GI.PolygonTrait, GI.MultiPolygonTrait}, x) |> first |> GO.tuples # hide

f, a, p = poly(water1; axis = (; title = "water1")); poly(f[1, 2], water2; axis = (; title = "water2")); f
# Now, we rotate the `water1` polygon about its centroid, so we can use it to
# test the time it takes to intersect complex polygons:
water1r = GO.transform(
    Translation(GO.centroid(water1)) ∘ LinearMap(Makie.rotmatrix2d(π/2)) ∘ Translation((-).(GO.centroid(water1))),
    water1
)
f, a, p = poly(water1; label = "Original")
poly!(water1r; label = "Rotated")
axislegend(a)
f
# WARNING: does not work
@b GO.union($(water1), $(water1r); target = GI.PolygonTrait()) seconds=3
@b LibGEOS.union($(GI.convert(LibGEOS, water1)), $(GI.convert(LibGEOS, water1r))) seconds=3
@b ArchGDAL.union($(GI.convert(ArchGDAL, water1)), $(GI.convert(ArchGDAL, water1r))) seconds=3

poly(GO.union(water1, water1r; target = GI.PolygonTrait()))

GI.getgeom(water1, 3) |> GI.trait

# We can benchmark each provider and see if any of them have glaring issues.

water1_centroid_suite = BenchmarkGroup()

for provider in PROVIDERS
    @info "Benchmarking $provider"
    geom = GI.convert(provider, water1)
    water1_centroid_suite[string(provider)] = @be GO.centroid($geom) seconds=3
end
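
# A minimal sketch (not part of the original benchmark) of one way to summarize a
# suite like the one above: the median sample time per key, in seconds.
# `median_times` is a hypothetical helper. It assumes only that each entry exposes
# `.samples` whose elements have a `.time` field, which is how Chairmarks results
# are accessed elsewhere in this file.
import Statistics
function median_times(suite)
    # Collect the per-sample times for each key and reduce them to a single median.
    return Dict(k => Statistics.median(getproperty.(suite[k].samples, :time)) for k in keys(suite))
end
# Usage, e.g.: median_times(water1_centroid_suite)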


# ## Tables.jl performance in `apply`
#=
This code checks how Tables.jl performs when using `apply`.
We use two sources for this: `Shapefile.jl` and `DataFrames.jl`.
More will be coming in the future!
=#
shp_file = "/Users/anshul/Downloads/ne_10m_admin_0_countries (1)/ne_10m_admin_0_countries.shp"
table = Shapefile.Table(shp_file)
go_df = DataFrame(table)
go_df.geometry = GO.tuples(go_df.geometry);

table_suite = BenchmarkGroup()


ll2moll = Proj.Transformation("+proj=longlat +datum=WGS84", "+proj=moll")

# First, we try reprojecting the geometries using Proj,
reproject_suite = table_suite["reproject"] = BenchmarkGroup(["title:Reproject", "subtitle:All country borders from Natural Earth, 1:10m res."])

reproject_suite["Shapefile.Table"] = @be GO.reproject($table, $ll2moll) seconds=3
reproject_suite["DataFrame (Shapefile)"] = @be GO.reproject($(DataFrame(table)), $ll2moll) seconds=3
reproject_suite["DataFrame (GO)"] = @be GO.reproject($(go_df), $ll2moll) seconds=3
reproject_suite["Shapefile geoms"] = @be GO.reproject($(table.geometry), $ll2moll) seconds=3
reproject_suite["GeometryOps geoms"] = @be GO.reproject($(GO.tuples(table.geometry)), $ll2moll) seconds=3

# then transforming, just to see the difference in runtime
# between calling out to C vs pure Julia,
function _scaleby5(x)
    return x .* 5
end

transform_suite = table_suite["transform"] = BenchmarkGroup(["title:Transform", "subtitle:All country borders from Natural Earth, 1:10m res."])
transform_suite["Shapefile.Table"] = @be GO.transform($_scaleby5, $table) seconds=3
transform_suite["DataFrame (Shapefile)"] = @be GO.transform($_scaleby5, $(DataFrame(table))) seconds=3
transform_suite["DataFrame (GO)"] = @be GO.transform($_scaleby5, $(go_df)) seconds=3
transform_suite["Shapefile geoms"] = @be GO.transform($_scaleby5, $(table.geometry)) seconds=3
transform_suite["GeometryOps geoms"] = @be GO.transform($_scaleby5, $(GO.tuples(table.geometry))) seconds=3

# and finally, calling `applyreduce` to find the area of each
# polygon.
area_suite = table_suite["area"] = BenchmarkGroup(["title:Area", "subtitle:All country borders from Natural Earth, 1:10m res."])

area_suite["Shapefile.Table"] = @be GO.area($(table)) seconds=3
area_suite["DataFrame (Shapefile)"] = @be GO.area($(DataFrame(table))) seconds=3
area_suite["DataFrame (GO)"] = @be GO.area($(go_df)) seconds=3
area_suite["Shapefile geoms"] = @be GO.area($(table.geometry)) seconds=3
area_suite["GeometryOps geoms"] = @be GO.area($(GO.tuples(table.geometry))) seconds=3

ts = getproperty.(area_suite["Shapefile.Table"].samples, :time)
boxplot(ones(length(ts)), ts)
violin(ones(length(ts)), ts; npoints = 3500, axis = (; yscale = log10,))


# ## Plotting
function Makie.convert_arguments(::Makie.PointBased, xs, bs::AbstractVector{<: Chairmarks.Benchmark})
    ts = getproperty.(Statistics.mean.(bs), :time)
    return (xs, ts)
end

function Makie.convert_arguments(::Makie.PointBased, bs::AbstractVector{<: Chairmarks.Benchmark})
    ts = getproperty.(Statistics.mean.(bs), :time)
    return (1:length(bs), ts)
end

function Makie.convert_arguments(::Makie.SampleBased, b::Chairmarks.Benchmark)
    ts = getproperty.(b.samples, :time)
    return (ones(length(ts)), ts)
end

function Makie.convert_arguments(::Makie.SampleBased, n::Number, b::Chairmarks.Benchmark)
    ts = getproperty.(b.samples, :time)
    return (fill(n, length(ts)), ts)
end

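function Makie.convert_arguments(::Makie.SampleBased, labels::AbstractVector{<: AbstractString}, bs::AbstractVector{<: Chairmarks.Benchmark})
    ts = map(b -> getproperty.(b.samples, :time), bs)
    # Repeat each label once per sample, so that labels and timings line up one-to-one
    # (the same flattening that is done by hand for `area_suite` further below).
    repeated_labels = reduce(vcat, [fill(l, length(t)) for (l, t) in zip(labels, ts)])
    return (repeated_labels, reduce(vcat, ts))
end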

function Makie.convert_arguments(::Type{Makie.Errorbars}, xs, bs::AbstractVector{<: Chairmarks.Benchmark})
    ts = map(b -> getproperty.(b.samples, :time), bs)
    means = map(Statistics.mean, ts)
    stds = map(Statistics.std, ts)
    # Error bars are centred on the mean, with the standard deviation as the symmetric error.
    return (xs, means, stds)
end

ks = keys(area_suite) |> collect .|> identity

bs = getindex.((area_suite,), ks)
b_lengths = length.(getproperty.(bs, :samples))
b_timing_flattened = collect(Iterators.flatten(Iterators.map(b -> getproperty.(b.samples, :time), bs)))
k_strings = Iterators.flatten((fill(k, bl) for (k, bl) in zip(ks, b_lengths))) |> collect

f = Figure()
ax = Axis(f[1, 1];
convert_dim_1=Makie.CategoricalConversion(; sortby=nothing),
)
# Plot the raw timings; the log scaling is applied through `ax.yscale` below.
violin!(ax, k_strings, b_timing_flattened)
f
ax.yscale = log10
ax.xticklabelrotation = π/12
f


bs = values(area_suite) |> collect .|> identity
labels = ["ST", "DS", "DG", "SG", "GG"]


using AlgebraOfGraphics

boxplot(b1)
boxplot!.(1:5, values(area_suite) |> collect .|> identity)
Makie.current_figure()
Makie.current_axis().yscale = log10

data((; x = labels, y = bs)) * mapping(:y => verbatim, :x, :y) * visual(BoxPlot) |> draw
125 changes: 125 additions & 0 deletions benchmarks/vector_benchmark_plot.jl
@@ -0,0 +1,125 @@
#=
# `vector-benchmark` result plot
This code plots the results of the `kadyb/vector-benchmark` repository,
and needs the MakieTeX SVG PR for now.
The unique feature (and the reason it takes so many lines of code) is that
the scatter markers for each language are SVGs of that language's logo! This
makes the plot eye-catching and lets readers quickly grasp language-wise
performance.
Stepwise, here's what is going on:
1. It loads the benchmark data from a CSV file into a DataFrame.
2. It defines color and marker mappings for each package, where the markers are SVG logos of the respective programming languages.
3. It uses the beeswarm function from the SwarmMakie package to create a scatter plot, where the x-axis represents the different benchmark tasks, and the y-axis represents the median execution time (in seconds) on a log scale.
4. The scatter points are colored and marked according to the package and programming language, using the predefined color and marker mappings.
5. It adds a legend to the plot, displaying the package names and their corresponding language logos.
=#

using CairoMakie, MakieTeX, SwarmMakie

using CSV, DataFrames, CategoricalArrays
using DataToolkit

path_to_makietex_datatoml = joinpath(dirname(dirname(@__DIR__)), "MakieTeX", "docs", "Data.toml")
data = DataToolkit.load(path_to_makietex_datatoml)


using DataToolkit, DataFrames, StatsBase
using CairoMakie, SwarmMakie #=beeswarm plots=#, Colors
using MakieTeX # for SVG icons

function svg_icon(name::String)
    # Define `path` up front so that the fallback lookups below can always use it.
    path = "svg/$name.svg"
    if name == "go"
        icon = d"go-logo-solid::IO"
    else
        icon = get(d"file-icons::Dict{String,IO}", path, nothing)
    end
    if isnothing(icon)
        icon = get(d"file-icons-mfixx::Dict{String,IO}", path, nothing)
    end
    if isnothing(icon)
        icon = get(d"file-icons-devopicons::Dict{String,IO}", path, nothing)
    end
    isnothing(icon) && return missing
    return CachedSVG(read(seekstart(icon), String))
end

const colours_vibrant = range(LCHab(60,70,0), stop=LCHab(60,70,360), length=36)
const colours_dim = range(LCHab(25,50,0), stop=LCHab(25,50,360), length=36)

const julia_logo = svg_icon("Julia")
const r_logo = svg_icon("R")
const python_logo = svg_icon("python")

marker_map = Dict(
    "geometryops" => julia_logo,
    # "gdal-jl" => julia_logo,
    "sf" => r_logo,
    "terra" => r_logo,
    "geos" => r_logo,
    "s2" => r_logo,
    "geopandas" => python_logo,
)


color_map = Dict(
    # R packages
    "sf" => Makie.wong_colors()[1],
    "s2" => Makie.wong_colors()[5],
    "terra" => Makie.wong_colors()[6],
    "geos" => Makie.wong_colors()[4],
    # Python package
    "geopandas" => Makie.wong_colors()[2],
    # Julia package
    "geometryops" => Makie.wong_colors()[3],
)

path_to_vector_benchmark = "/Users/anshul/git/vector-benchmark"
timings_df = CSV.read(joinpath(path_to_vector_benchmark, "timings.csv"), DataFrame)
replace!(timings_df.package, "sf-project" => "sf", "sf-transform" => "sf")
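
# For illustration only (a sketch, not part of the original script): `replace!` with
# several `old => new` pairs rewrites matching elements in place, which is how the two
# `sf-*` variants above are folded into a single "sf" entry.
# `example_packages` is a made-up vector, not data from the benchmark.
example_packages = ["sf-project", "sf-transform", "terra"]
replace!(example_packages, "sf-project" => "sf", "sf-transform" => "sf")
# `example_packages` is now ["sf", "sf", "terra"]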

# now plot

task_ca = CategoricalArray(timings_df.task)
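
# A rough sketch of what the categorical encoding provides for plotting, written with
# plain Base functions rather than CategoricalArrays: each task name is mapped to an
# integer position (analogous to `task_ca.refs`) within a list of distinct names
# (analogous to `task_ca.pool.levels`). The task names below are made up, and the
# sorted level order is an assumption about CategoricalArrays' default for strings.
example_tasks = ["buffer", "area", "buffer", "centroid"]
example_levels = sort(unique(example_tasks))
example_refs = indexin(example_tasks, example_levels)
# The integer refs serve as x positions, and the level names as the tick labels.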

group_marker = [MarkerElement(; color = color_map[package], marker = marker_map[package], markersize = 12) for package in keys(marker_map)]
names_marker = collect(keys(marker_map))
lang_markers = ["R" => r_logo, "Python" => python_logo, "Julia" => julia_logo]
group_package = [MarkerElement(; marker, markersize = 12) for (lang, marker) in lang_markers]
names_package = first.(lang_markers)


f, a, p = beeswarm(
    task_ca.refs, timings_df.median;
    marker = getindex.((marker_map,), timings_df.package),
    color = getindex.((color_map,), timings_df.package),
    markersize = 10,
    axis = (;
        xticks = (1:length(task_ca.pool.levels), task_ca.pool.levels),
        xlabel = "Task",
        ylabel = "Median time (s)",
        yscale = log10,
        title = "Benchmark vector operations",
        xgridvisible = false,
        xminorgridvisible = true,
        yminorgridvisible = true,
        yminorticks = IntervalsBetween(5),
        ygridcolor = RGBA{Float32}(0.0f0,0.0f0,0.0f0,0.05f0),
    )
)
leg = Legend(
    f[1, 2],
    [group_marker, group_package],
    [names_marker, names_package],
    ["Package", "Language"],
    tellheight = false,
    tellwidth = true,
    gridshalign = :left,
)
resize!(f, 650, 450)
a.spinewidth[] = 0.5
f
5 changes: 5 additions & 0 deletions docs/Project.toml
@@ -1,4 +1,5 @@
[deps]
AccurateArithmetic = "22286c92-06ac-501d-9306-4abd417d9753"
Base64 = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
@@ -10,6 +11,8 @@ DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterVitepress = "4710194d-e776-4893-9690-8d956a29c365"
DoubleFloats = "497a8b3b-efae-58df-a0af-a86822472b78"
ExactPredicates = "429591f6-91af-11e9-00e2-59fbe8cec110"
GeoDatasets = "ddc7317b-88db-5cb5-a849-8449e5df04f9"
GeoInterface = "cf35fbd7-0cd7-5166-be24-54bfbe79505f"
GeoInterfaceMakie = "0edc0954-3250-4c18-859d-ec71c1660c08"
@@ -20,7 +23,9 @@ LibGEOS = "a90b1aa1-3769-5649-ba7e-abc5a9d163eb"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
MakieThemes = "e296ed71-da82-5faf-88ab-0034a9761098"
MultiFloats = "bdf0d083-296b-4888-a5b6-7498122e68a5"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
Proj = "c94c279d-25a6-4763-9509-64d165bea63e"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Shapefile = "8e980c4a-a4fe-5da2-b3a7-4b4b0353a2f4"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
7 changes: 7 additions & 0 deletions docs/make.jl
@@ -73,6 +73,9 @@ withenv("JULIA_DEBUG" => "Literate") do # allow Literate debug output to escape
# TODO: We should probably fix the above in `process_literate_recursive!`.
end

# Now that the Literate stuff is done, we also download the call notes from HackMD:
download("https://hackmd.io/kpIqAR8YRJOZQDJjUKVAUQ/download", joinpath(@__DIR__, "src", "call_notes.md"))

# Finally, make the docs!
makedocs(;
    modules=[GeometryOps],
@@ -91,6 +94,10 @@ makedocs(;
    pages=[
        "Introduction" => "introduction.md",
        "API Reference" => "api.md",
        "Explanations" => [
            "Paradigms" => "paradigms.md",
            "Peculiarities" => "peculiarities.md",
        ],
        "Source code" => literate_pages,
    ],
    warnonly = true,