feat(postgres-estimated-rows): pg Estimated Rows on Data Warehouse Sync #27634
Size Change: +33 B (0%) Total Size: 1.16 MB
📸 UI snapshots have been updated: 1 snapshot change in total (0 added, 1 modified, 0 deleted). Triggered by this commit.
@Gilbert09 I tried this out on Metabase and saw that every 'estimate' returned zero, so I assume this isn't going to work in our (or all) environments. I'm going to switch to a more traditional generated count(*) query. Unfortunately, I cannot join literals and use them with table names in a subquery, so I'll have to generate a second union-style query to accomplish this. On the plus side it would no longer be an
A couple of questions before approving, but this looks good
@@ -20,6 +20,8 @@ export interface LemonTableColumn<T extends Record<string, any>, D extends keyof
    /** Tooltip to display on title hover. An info icon ("i" in circle) is shown when a tooltip is available. */
    tooltip?: string
    key?: string
    /** If true, the column is not displayed. Optional, defaults to not disabled. */
    is_disabled?: boolean
Maybe `is_visible` is better suited here - we use `disabled` elsewhere for when a UI element is still visible but not interactive.
That makes sense. I do want to keep it as a negative for semantic and optional parameterization reasons, so I'll switch to `isHidden`.
if not tables:
    return {}
union = [
    f"SELECT '{table[0]}' AS table_name, COUNT(*) AS row_count FROM {schema}.{table[0]}" for table in tables
Is it possible to SQL inject here? Can we parameterize this query instead? https://www.psycopg.org/psycopg3/docs/basic/params.html
I considered this. The injection would need to be in the table name, which I don't think is possible unless there is some crazy escape-sequence hack. I'd still like to be more confident, so I'm going to add some substitutions when creating the query expression. I think the substituted approach reads a bit better as well, so no complaints there either.
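For illustration, a minimal sketch of what a substituted version of the union query could look like. This is not the code that was merged: `quote_ident` and `build_row_count_query` are hypothetical helpers, the use of `UNION ALL` is an assumption, and the hand-rolled quoting only stands in for a driver-provided facility (psycopg's `sql.Identifier` would likely be the more robust choice in practice). The table-name literal is passed as a `%s` parameter, while the schema and table identifiers are quoted by doubling embedded double quotes.

```python
def quote_ident(name: str) -> str:
    """Quote a Postgres identifier by doubling embedded double quotes (hypothetical helper)."""
    if "\x00" in name:
        raise ValueError("identifier contains a NUL byte")
    return '"' + name.replace('"', '""') + '"'


def build_row_count_query(schema: str, tables: list[str]) -> tuple[str, list[str]]:
    """Return (sql, params) for a UNION ALL of per-table COUNT(*) queries (hypothetical helper)."""
    if not tables:
        return "", []
    parts = [
        # The table name shown in the result set is a bound parameter (%s);
        # only the quoted identifiers are interpolated into the SQL text.
        f"SELECT %s AS table_name, COUNT(*) AS row_count "
        f"FROM {quote_ident(schema)}.{quote_ident(table)}"
        for table in tables
    ]
    return " UNION ALL ".join(parts), list(tables)
```

The returned pair would then be handed to the driver together, for example `cursor.execute(sql, params)`, so that the literal values never pass through string formatting.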
@@ -309,6 +309,45 @@ def filter_postgres_incremental_fields(columns: list[tuple[str, str]]) -> list[t
    return results


def get_postgres_row_count(
This is all fine to live here right now, but we need to figure out a longer-term solution for these SQL sources. They're all kind of all over the place right now. Some future pipeline work.
Yeah, I noticed it's a little chaotic with all of these definitions. Certainly worth consolidating some base interfaces for the corresponding providers.
This looks great!
…nto feat(postgres-estimated-rows) # Conflicts: # frontend/__snapshots__/replay-player-success--second-recording-in-list--dark.png
* master: (103 commits)
  feat(postgres-estimated-rows): pg Estimated Rows on Data Warehouse Sync (#27634)
  fix: revert darkmode class toggle, updated content on fills (#27783)
  chore: upgrade posthog-js (#27790)
  chore(editor-3001): add back join actions (#27740)
  feat: Add person distinct ID overrides squash job (as dagster job) (#27710)
  fix(created-by-sources): Adding `created_by` to sources (#27751)
  Revert "feat(data-warehouse): V2 pipeline release " (#27791)
  fix: typo for feature flags (#27786)
  fix(defer-unmounting): Defer unmounting of react elements (#27742)
  feat(data-warehouse): V2 pipeline release (#27732)
  fix(data-warehouse): Ensure dates are actual datetime formats (#27777)
  fix: enable hot reload for the products dir (#27746)
  fix: assignee selector when null (#27737)
  chore: clarify rrweb imports (#27776)
  chore(deps): Update posthog-js to 1.207.3 (#27779)
  feat(retention): filters on start/return event (#27770)
  fix(experiments): only show supported math functions (#27589)
  feat(web-analytics): Set unique conversions graph when adding conversions goal (#27774)
  chore: color design system part 1: banner and accents (#27756)
  chore(experiments): Add tests for funnel attribution options (#27752)
  ...
…nc (#27634) Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Problem
We did not provide row estimates for Postgres sources, so users could not effectively estimate cost.
Changes
Does this work well for both Cloud and self-hosted?
Yes
How did you test this code?
Used a real Postgres instance and got some row estimates.