Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: DashMap based catalog #1585

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

CommanderStorm
Copy link
Collaborator

@CommanderStorm CommanderStorm commented Nov 16, 2024

This PR changes the internal datastructure to be Dashmap<..> as suggested in #1179 (review)
I thought that the data/configuration/.. split might be more managble in terms of reviwing and shipping this.

Sadly two changes did cascade a bit:

  • &dyn Source -> Box<dyn Source>
  • DashMap does not guarantee sorting of keys => I need to manually do so

I had some free time and this seemed like a nice issue to progress on, I hope you (@sharkAndshark) don't mind me stealing this part ^^

@CommanderStorm
Copy link
Collaborator Author

CommanderStorm commented Nov 16, 2024

Here is a benchmark of the versions "requesting" tiles (i.e. without the network/file io overhead)

branch features image
master - image
dashmap-based-catalog inline image
dashmap-based-catalog rayon image
dashmap-based-catalog rayon
inline
image

=> it is a significant performance drop, but 60ns to enable updating is likely an okay tradeoff. What do you people think?

@sharkAndshark
Copy link
Collaborator

sharkAndshark commented Nov 17, 2024

I'm really happy to see you pick this up again. It's a looooooong requested feature. And Martin needs more contributor. Go ahead! : D @CommanderStorm

@sharkAndshark
Copy link
Collaborator

Consider making the use of dashmap and /refresh API optional (with a default)? @nyurik This would help users like me who don't mind rebooting Martin to reflect database updates.

@CommanderStorm
Copy link
Collaborator Author

CommanderStorm commented Nov 17, 2024

Consider making dashmap optional

No, splitting the implementation on one of the core datastructures is a great way to get bugs and maintainability issues.
The 60ns performance drop should not matter to you (remember that 60ns = 0.000060ms)

/refresh is not included in this PR, but will be disablable in the config as far as I read the spec

@sharkAndshark
Copy link
Collaborator

sharkAndshark commented Nov 17, 2024

60ns means nothing to me. I'm totally Ok the performance part.
So yes let's use dashmap and only make /refresh optional(in the next PRs):D

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work!! Seriously impressive :)

A few things:

  • could you explain a bit why you need boxed source? Can we work around it?
  • please move the sorting output into a separate PR - this way test changes do not get in the way of the review
  • this also goes for anything unrelated to this pr - switching to dashmap is a complex PR - so the fewer changes we have in it, the easier it is to spot issues

plus a few minor inlines

martin/Cargo.toml Outdated Show resolved Hide resolved
martin/src/fonts/mod.rs Outdated Show resolved Hide resolved
@@ -392,7 +392,7 @@ fn parse_encoding(encoding: &str) -> MartinCpResult<AcceptEncoding> {
async fn init_schema(
mbt: &Mbtiles,
conn: &mut SqliteConnection,
sources: &[&dyn Source],
sources: &[Box<dyn Source>],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not certain why you need a boxed type here - what caused it to change? Also, you might want to use TileInfoSource for consistency

martin/src/sprites/mod.rs Outdated Show resolved Hide resolved
@@ -198,18 +195,18 @@ async fn parse_sprite(
}

pub async fn get_spritesheet(
sources: impl Iterator<Item = &SpriteSource>,
paths: impl Iterator<Item = &PathBuf>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this relevant to this PR? I am usually ok to mix in a bunch of unrelated changes into smaller / linter / cleanup PRs, but in case of a the complex PRs that already do a lot of fundamental system changes, it is better to move unrelated stuff to a separate PR

Comment on lines +45 to 52
pub fn get_source(&self, id: &str) -> actix_web::Result<TileInfoSource> {
Ok(self
.0
.get(id)
.ok_or_else(|| ErrorNotFound(format!("Source {id} does not exist")))?
.as_ref())
.value()
.clone())
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets provide the rationale why the Box change was done:

The change to the boxed variant comes from here and cascades pretty far sadly.
Because accessing a value returns Ref<..> => a temporary variable

image

@CommanderStorm CommanderStorm marked this pull request as draft December 13, 2024 16:24
CommanderStorm added a commit that referenced this pull request Dec 13, 2024
This PR pulls the changes to `Box<dyn Source>` vs `TileInfoSource` from
- #1585
CommanderStorm added a commit that referenced this pull request Dec 15, 2024
This PR splits out the test reproducibility fix from #1585

Currently, these fields not being sorted does not pose an issue, but
with #1585, it would at least be differnt
=> we likely should not rely upon this in our CI
@CommanderStorm CommanderStorm marked this pull request as ready for review December 15, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants