Guidance on prediction intervals for schemes #127

Open
StatsRhian opened this issue Oct 11, 2024 · 4 comments
Assignees
Labels
help wanted Extra attention is needed must

Comments

@StatsRhian
Member

StatsRhian commented Oct 11, 2024

Apparently some schemes are adding together P10/P90 values to get prediction intervals across multiple rows in the spreadsheet. This is not correct: percentiles of separate distributions cannot be added to give the percentile of the total. Only the means can be added.
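To illustrate the point, here is a minimal simulation sketch. The two rows, their means and standard deviations, and the number of runs are made-up values for illustration, not model output, and both rows are assumed to be normal and independent.

```r
# Illustrative simulation: all values below are made up, not model output.
set.seed(42)
n_runs <- 100000

# Two independent rows, each simulated as a normal distribution (assumption)
row_a <- rnorm(n_runs, mean = 70, sd = 12)
row_b <- rnorm(n_runs, mean = 80, sd = 12)

# Incorrect: add the P10s (and the P90s) of the individual rows
p10_added <- quantile(row_a, 0.10) + quantile(row_b, 0.10)
p90_added <- quantile(row_a, 0.90) + quantile(row_b, 0.90)

# Correct: add the rows run by run, then take percentiles of the total
total <- row_a + row_b
p10_total <- quantile(total, 0.10)
p90_total <- quantile(total, 0.90)

# Compare the width of the two intervals
c(
  added_width = unname(p90_added - p10_added),
  total_width = unname(p90_total - p10_total)
)
```

Under these assumptions the added interval comes out wider than the interval of the total, because a low (or high) run in one row is unlikely to coincide with an equally low (or high) run in the other. The means, by contrast, do add up.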

I’ve spoken to Jake, Karen and Peter today and the suggestion is:

Update the guidance at https://connect.strategyunitwm.nhs.uk/nhp/project_information/user_guide/glossary.html#prediction-intervals to state more strongly that adding these values together is wrong, rather than just different from what’s observed in the S curve.

Provide some R code in the documentation so that schemes have the option to do this themselves. A simple example from Mohammed:

# Given values for two distributions (P10, mean, P90)
P10_A <- 54   # example value for P10 of A
Mean_A <- 70  # example value for mean of A
P90_A <- 86   # example value for P90 of A

P10_B <- 63   # example value for P10 of B
Mean_B <- 80  # example value for mean of B
P90_B <- 95   # example value for P90 of B

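# Calculate standard deviation for each distribution
# (this assumes each distribution is approximately normal)
z_p10 <- qnorm(0.10)  # standard normal quantile at 10%
z_p90 <- qnorm(0.90)  # standard normal quantile at 90%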

# Standard deviation for distribution A and B
sd_A <- (P90_A - P10_A) / (z_p90 - z_p10)
sd_B <- (P90_B - P10_B) / (z_p90 - z_p10)

# Sum of means
Mean_sum <- Mean_A + Mean_B

# Standard deviation of the sum of two independent distributions
sd_sum <- sqrt(sd_A^2 + sd_B^2)

# Calculate P10 and P90 for the summed distribution
P10_sum <- Mean_sum + z_p10 * sd_sum
P90_sum <- Mean_sum + z_p90 * sd_sum

# Output results
list(
  Mean_sum = Mean_sum,
  sd_sum = sd_sum,
  P10_sum = P10_sum,
  P90_sum = P90_sum
)

# Incorrect percentiles: simply adding the P10s and the P90s
list(
  P10_incorrect_sum = P10_A + P10_B,
  P90_incorrect_sum = P90_A + P90_B
)
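
The same calculation can be sketched for any number of rows. `combine_intervals()` is a hypothetical helper written for this illustration, not a function from any package, and it carries the same assumptions as the example above: each row is approximately normal and the rows are independent of each other.

```r
# Hypothetical helper (not part of any package): combine the P10/P90
# intervals of several rows into one interval for their total.
# Assumes each row is approximately normal and the rows are independent.
combine_intervals <- function(means, p10s, p90s) {
  stopifnot(
    length(means) == length(p10s),
    length(means) == length(p90s)
  )

  z_p10 <- qnorm(0.10)
  z_p90 <- qnorm(0.90)

  # Standard deviation of each row, recovered from its P10-P90 width
  sds <- (p90s - p10s) / (z_p90 - z_p10)

  # Means add; variances (not standard deviations) add
  mean_sum <- sum(means)
  sd_sum <- sqrt(sum(sds^2))

  c(
    mean = mean_sum,
    p10 = mean_sum + z_p10 * sd_sum,
    p90 = mean_sum + z_p90 * sd_sum
  )
}

# The two example rows from above
res <- combine_intervals(
  means = c(70, 80),
  p10s = c(54, 63),
  p90s = c(86, 95)
)
res
```

For the two example rows this gives a P10 of about 127.4 and a P90 of about 172.6 around a mean of 150, compared with 117 and 181 from adding the percentiles directly. Note that row B is not symmetric about its mean (80 is not midway between 63 and 95), so the normal assumption is only an approximation there.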
@StatsRhian StatsRhian self-assigned this Oct 11, 2024
@StatsRhian
Member Author

I'm going to write a short lay summary and a worked example (like we did for the interaction term) for project information and share with MRMs and Mohammed / Steven

@StatsRhian
Member Author

Just add the code initially.

@StatsRhian
Member Author

StatsRhian commented Nov 13, 2024

I had a go at running this code on the example and I wasn't getting the same numbers (which I think is probably right but I confused myself)

Can look again but ideally with a bit more headspace!

@StatsRhian
Member Author

StatsRhian commented Dec 24, 2024

I didn't manage to look at this :(
I've moved this to sprint 3 for consideration, but maybe we want to give it to someone else? Gabriel? Or bounce it back to Mohammed?

@StatsRhian StatsRhian removed their assignment Jan 6, 2025
@yiwen-h yiwen-h added help wanted Extra attention is needed must labels Jan 7, 2025