Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): logging stack traces for db errors + allow async stack traces through env var #3819

Merged
merged 1 commit into from
Jan 14, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions packages/server/knexfile.ts
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@ import {
postgresMaxConnections,
isDevOrTestEnv,
postgresConnectionAcquireTimeoutMillis,
postgresConnectionCreateTimeoutMillis
postgresConnectionCreateTimeoutMillis,
knexAsyncStackTracesEnabled
} from '@/modules/shared/helpers/envHelper'
import { dbLogger as logger } from '@/logging/logging'
import { Knex } from 'knex'
Expand Down Expand Up @@ -84,7 +85,8 @@ const configArgs: KnexConfigArgs = {
logger,
maxConnections: postgresMaxConnections(),
connectionAcquireTimeoutMillis: postgresConnectionAcquireTimeoutMillis(),
connectionCreateTimeoutMillis: postgresConnectionCreateTimeoutMillis()
connectionCreateTimeoutMillis: postgresConnectionCreateTimeoutMillis(),
asyncStackTraces: knexAsyncStackTracesEnabled()
}

const config: Record<string, Knex.Config> = {
Expand Down
5 changes: 4 additions & 1 deletion packages/server/logging/knexMonitoring.ts
Original file line number Diff line number Diff line change
Expand Up @@ -212,14 +212,17 @@ const initKnexPrometheusMetricsForRegionEvents = async (params: {
sqlNumberBindings: data.bindings?.length || -1
})
.observe(durationSec)

const trace = (new Error().stack || '').split('\n').slice(1).join('\n').trim()
params.logger.info(
{
region,
sql: data.sql,
sqlMethod: normalizeSqlMethod(data.method),
sqlQueryId: queryId,
sqlQueryDurationMs: toNDecimalPlaces(durationMs, 0),
sqlNumberBindings: data.bindings?.length || -1
sqlNumberBindings: data.bindings?.length || -1,
trace
},
'DB query successfully completed after {sqlQueryDurationMs} ms'
)
Expand Down
6 changes: 6 additions & 0 deletions packages/server/modules/shared/helpers/envHelper.ts
Original file line number Diff line number Diff line change
Expand Up @@ -433,3 +433,9 @@ export const shouldRunTestsInMultiregionMode = () =>
export function shutdownTimeoutSeconds() {
return getIntFromEnv('SHUTDOWN_TIMEOUT_SECONDS', '300')
}

export const knexAsyncStackTracesEnabled = () => {
const envSet = process.env.KNEX_ASYNC_STACK_TRACES_ENABLED
if (!envSet) return undefined
return getBooleanFromEnv('KNEX_ASYNC_STACK_TRACES_ENABLED')
}
16 changes: 14 additions & 2 deletions packages/shared/src/environment/multiRegionConfig.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ import { z } from 'zod'
import fs from 'node:fs/promises'
import { Knex, knex } from 'knex'
import { Logger } from 'pino'
import { isUndefined } from '#lodash'

const regionConfigSchema = z.object({
postgres: z.object({
Expand Down Expand Up @@ -92,6 +93,12 @@ export type KnexConfigArgs = {
applicationName: string
connectionAcquireTimeoutMillis: number
connectionCreateTimeoutMillis: number
/**
* If set to any value - true or false - will explicitly enable or disable async stack traces
* that show where queries are launched from. If not set, will default to true in dev
* and test environments
*/
asyncStackTraces?: boolean
}

export const createKnexConfig = ({
Expand All @@ -103,11 +110,16 @@ export const createKnexConfig = ({
maxConnections,
caCertificate,
connectionAcquireTimeoutMillis,
connectionCreateTimeoutMillis
connectionCreateTimeoutMillis,
asyncStackTraces
}: {
connectionString?: string | undefined
caCertificate?: string | undefined
} & KnexConfigArgs): Knex.Config => {
const shouldEnableAsyncStackTraces = isUndefined(asyncStackTraces)
? isDevOrTestEnv
: asyncStackTraces

return {
client: 'pg',
migrations: {
Expand Down Expand Up @@ -137,7 +149,7 @@ export const createKnexConfig = ({
},
// we wish to avoid leaking sql queries in the logs: https://knexjs.org/guide/#compilesqlonerror
compileSqlOnError: isDevOrTestEnv,
asyncStackTraces: isDevOrTestEnv,
asyncStackTraces: shouldEnableAsyncStackTraces,
pool: {
min: 0,
max: maxConnections,
Expand Down
5 changes: 5 additions & 0 deletions utils/helm/speckle-server/templates/_helpers.tpl
Original file line number Diff line number Diff line change
Expand Up @@ -772,6 +772,11 @@ Generate the environment variables for Speckle server and Speckle objects deploy
- name: POSTGRES_CONNECTION_ACQUIRE_TIMEOUT_MILLIS
value: {{ .Values.db.connectionAcquireTimeoutMillis | quote }}

{{- if .Values.db.knexAsyncStackTracesEnabled }}
- name: KNEX_ASYNC_STACK_TRACES_ENABLED
value: {{ .Values.db.knexAsyncStackTracesEnabled | quote }}
{{- end}}

- name: PGSSLMODE
value: "{{ .Values.db.PGSSLMODE }}"

Expand Down
5 changes: 5 additions & 0 deletions utils/helm/speckle-server/values.schema.json
Original file line number Diff line number Diff line change
Expand Up @@ -241,6 +241,11 @@
"description": "The maximum time in milliseconds to wait for a new connection to be created in the connection pool. Should be less than the acquisition timeout, as a new connection may need to be created then acquired.",
"default": 5000
},
"knexAsyncStackTracesEnabled": {
"type": "boolean",
"description": "If enabled, will provide better stack traces for errors arising out of knex operations",
"default": false
},
"databaseName": {
"type": "string",
"description": "(Optional) The name of the Postgres database to which Speckle will connect. Only required for the Database Monitoring utility when the connection string is to a database connection pool and multi-region is disabled, otherwise this value is ignored.",
Expand Down
3 changes: 3 additions & 0 deletions utils/helm/speckle-server/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,9 @@ db:
## @param db.connectionCreationTimeoutMillis The maximum time in milliseconds to wait for a new connection to be created in the connection pool. Should be less than the acquisition timeout, as a new connection may need to be created then acquired.
##
connectionCreationTimeoutMillis: 5000
## @param db.knexAsyncStackTracesEnabled If enabled, will provide better stack traces for errors arising out of knex operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd nest this, like so:

    knexAsyncStackTraces:
         enabled: false

It allows us for more flexibility later, should we wish to append any more configuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iainsproat gergo already merged it, but in any case I'm not aware of any other configuration options relating to async stack traces

##
knexAsyncStackTracesEnabled: false

## @param db.databaseName (Optional) The name of the Postgres database to which Speckle will connect. Only required for the Database Monitoring utility when the connection string is to a database connection pool and multi-region is disabled, otherwise this value is ignored.
databaseName: ''
Expand Down
Loading