-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add chain node timing metrics #11011
Conversation
dd71f6a
to
d28eb12
Compare
Deploying agoric-sdk with
|
Latest commit: |
a149714
|
Status: | ✅ Deploy successful! |
Preview URL: | https://74a4058c.agoric-sdk.pages.dev |
Branch Preview URL: | https://chain-node-timing-metrics.agoric-sdk.pages.dev |
d28eb12
to
0f0d765
Compare
@mhofman I don't think you can approve a PR that started off as yours, so let's handle it less formally and I'll apply my approval as a proxy for yours (once granted). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall.
I agree that switching to performance.now()
might be better and I think you make the necessary refactoring for that. Should we go all the way in this PR?
I'm not sure after all we should calculate block lag after the "after commit hangover". It might be better to just measure as close to the cosmos triggers as possible.
@@ -496,6 +568,12 @@ export async function launch({ | |||
initialQueueLengths, | |||
}); | |||
|
|||
const blockMetrics = Object.fromEntries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have proper typing for this.fromEntries
tends to erase key types at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
const blockLag = times.previousAfterCommitBlockPosix - blockTime * 1000; | ||
times.previousAfterCommitBlockPosix = toPosix(hangoverTimestamp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name here is somewhat confusing. I'm actually wondering if it wouldn't be better to calculate the lag as purely the cosmos lag, aka ignoring the afterCommitHangover
time, and use the beginBlockTimestamp
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Not just yet; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
7dc75d4
to
95fbd76
Compare
Deploying agoric-sdk with
|
Latest commit: |
95fbd76
|
Status: | ✅ Deploy successful! |
Preview URL: | https://d28cbc07.agoric-sdk.pages.dev |
Branch Preview URL: | https://chain-node-timing-metrics.agoric-sdk.pages.dev |
await background tasks closer to attribution so we can measure them
…s for clarity and accuracy
…ails The buckets have so much overlap that it makes sense to use a single shared definition for all of them except blockLag (which now extends that definition to include more boundaries beyond 30 seconds).
The buckets have so much overlap that it makes sense to use a single shared definition for all of them except blockLag (which now extends that definition to include more boundaries beyond 30 seconds).
95fbd76
to
a913df3
Compare
Deploying agoric-sdk with
|
Latest commit: |
a913df3
|
Status: | ✅ Deploy successful! |
Preview URL: | https://340589de.agoric-sdk.pages.dev |
Branch Preview URL: | https://chain-node-timing-metrics.agoric-sdk.pages.dev |
closes: #10960
Description
Formalizes some chain timing measurements and report them as explicit open telemetry metrics.
Security Considerations
None
Scaling Considerations
Histograms create a few raw metrics (depending on bucketing), and this reports 8 new histograms metrics, resulting in about 100 new raw values.
Documentation Considerations
None external.
Testing Considerations
TBD
Upgrade Considerations
None of this affects consensus, but being part of the chain software, this requires deploying new chain software.