Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the sampling of the pgsa query #255

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Fix the sampling of the pgsa query #255

wants to merge 1 commit into from

Conversation

marco44
Copy link
Contributor

@marco44 marco44 commented Jan 22, 2025

It wrongly computes the sampling: the "total" column is not correct in this context: we (most of the time) have more than 1 record per sample. If you have 100 sessions, the sampling will be off by a factor of 100, and you'll end up with only 1 sample

It's a bit ugly to fix, because we need another subquery...

Closes #254

@marco44 marco44 added the bug label Jan 22, 2025
@marco44 marco44 requested a review from rjuju January 22, 2025 09:02
@marco44 marco44 self-assigned this Jan 22, 2025
It wrongly computes the sampling: the "total" column is not correct in
this context: we (most of the time) have more than 1 record per sample.
If you have 100 sessions, the sampling will be off by a factor of 100,
and you'll end up with only 1 sample

It's a bit ugly to fix, because we need another subquery...

Closes #254
@@ -517,9 +517,9 @@ def BASE_QUERY_PGSA_SAMPLE(per_db=False):

# We use dense_rank() as we need ALL the records for a specific ts
return """
(SELECT *,max(number) OVER () AS total FROM
(SELECT pgsa_history.srvid,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you indent this as this is now a sub query?

@@ -546,7 +546,7 @@ def BASE_QUERY_PGSA_SAMPLE(per_db=False):
AND pgsac.srvid = %(server)s
) AS pgsa_history
{extra}
) AS pgsa
) AS temp) AS pgsa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ideally we should avoid keywords here even if that seems to be allowed. maybe just tmp or something

@rjuju
Copy link
Member

rjuju commented Feb 8, 2025

yeah the need for a sub query is annoying.

For the record some other db like duckdb now support a QUALIFY clause (https://duckdb.org/docs/sql/query_syntax/qualify.html) which fixes that problems. I implemented a prototype for postgres some months ago but never sent it. It's not part of the standard so I don't think it will be accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sampling doesn't work on the Global activity/Parallel activity/Backend age views
2 participants