Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduler reminder app problem - etcdserver: request timed out #245

Open
elena-kolevska opened this issue Jan 7, 2025 · 3 comments
Open

Comments

@elena-kolevska
Copy link
Contributor

elena-kolevska commented Jan 7, 2025

I noticed the scheduler remidner stopped triggering and found this in the scheduler logs; looks like etcd is experiencing delays in processing ReadIndex requests. We need to investigate why.

{"level":"warn","ts":"2025-01-05T09:11:23.551261Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:24.051853Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:24.552351Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:25.002321Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:18.001498Z","time spent":"7.000790212s","remote":"[::1]:42392","response type":"/etcdserverpb.KV/Put","request count":1,"request size":109,"response count":0,"response size":0,"request content":"key:\"dapr/counters/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder\" value_size:15 "}
{"level":"warn","ts":"2025-01-05T09:11:25.002628Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out"}
{"level":"error","ts":"2025-01-05T09:11:25.002701Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:180","msg":"failure marking job for next trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder","error":"etcdserver: request timed out","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:180\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
{"level":"warn","ts":"2025-01-05T09:11:25.052967Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:25.553909Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:26.055066Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:26.555571Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:27.056203Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:27.556388Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:28.057307Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:28.557727Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:29.058863Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:29.559090Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536073,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:30.051301Z","caller":"etcdserver/server.go:1237","msg":"Failed to check current member's leadership","error":"context deadline exceeded"}
{"level":"warn","ts":"2025-01-05T09:11:30.051999Z","caller":"etcdserver/v3_server.go:932","msg":"timed out waiting for read index response (local node might have slow network)","timeout":"7s"}
{"level":"warn","ts":"2025-01-05T09:11:31.053912Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:31.554505Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:32.006054Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:25.005681Z","time spent":"7.000349152s","remote":"[::1]:42392","response type":"/etcdserverpb.KV/Put","request count":1,"request size":109,"response count":0,"response size":0,"request content":"key:\"dapr/counters/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder\" value_size:15 "}
{"level":"warn","ts":"2025-01-05T09:11:32.006232Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out"}
{"level":"error","ts":"2025-01-05T09:11:32.006352Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:180","msg":"failure marking job for next trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder","error":"etcdserver: request timed out","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:180\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
{"level":"warn","ts":"2025-01-05T09:11:32.054743Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:32.554900Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:33.055690Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:33.556301Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:34.056903Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:34.558132Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:35.059709Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:35.560616Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:36.060893Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:36.561973Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:37.062540Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536086,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:37.553957Z","caller":"etcdserver/v3_server.go:932","msg":"timed out waiting for read index response (local node might have slow network)","timeout":"7s"}
{"level":"warn","ts":"2025-01-05T09:11:37.553978Z","caller":"etcdserver/server.go:1237","msg":"Failed to check current member's leadership","error":"context deadline exceeded"}
{"level":"warn","ts":"2025-01-05T09:11:38.054295Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:38.554906Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:39.002339Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:32.001749Z","time spent":"7.000554617s","remote":"[::1]:42392","response type":"/etcdserverpb.KV/Put","request count":1,"request size":105,"response count":0,"response size":0,"request content":"key:\"dapr/counters/actorreminder||longhaul-test||playerActorType||player-1||healthReminder\" value_size:16 "}
{"level":"warn","ts":"2025-01-05T09:11:39.002538Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out"}
{"level":"error","ts":"2025-01-05T09:11:39.002671Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:180","msg":"failure marking job for next trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthReminder","error":"etcdserver: request timed out","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:180\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
{"level":"warn","ts":"2025-01-05T09:11:39.008042Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:32.007619Z","time spent":"7.000343603s","remote":"[::1]:42392","response type":"/etcdserverpb.KV/Put","request count":1,"request size":109,"response count":0,"response size":0,"request content":"key:\"dapr/counters/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder\" value_size:15 "}
{"level":"warn","ts":"2025-01-05T09:11:39.008199Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: request timed out"}
{"level":"error","ts":"2025-01-05T09:11:39.008290Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:180","msg":"failure marking job for next trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder","error":"etcdserver: request timed out","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:180\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
{"level":"warn","ts":"2025-01-05T09:11:39.055061Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:39.556214Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:40.057419Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:40.558976Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:41.059086Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:41.175294Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:36.174699Z","time spent":"5.000588363s","remote":"[::1]:42392","response type":"/etcdserverpb.Lease/LeaseRevoke","request count":-1,"request size":-1,"response count":-1,"response size":-1,"request content":""}
{"level":"warn","ts":"2025-01-05T09:11:41.175291Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2025-01-05T09:11:41.562191Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:42.062437Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:42.562791Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:43.063298Z","caller":"etcdserver/v3_server.go:920","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":2173430840744536090,"retry-timeout":"500ms"}
{"level":"warn","ts":"2025-01-05T09:11:43.520094Z","caller":"etcdserver/server.go:1237","msg":"Failed to check current member's leadership","error":"context deadline exceeded"}
{"level":"warn","ts":"2025-01-05T09:12:16.867496Z","caller":"etcdserver/v3_server.go:932","msg":"timed out waiting for read index response (local node might have slow network)","timeout":"7s"}
{"level":"warn","ts":"2025-01-05T09:12:16.867834Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2025-01-05T09:11:39.009658Z","time spent":"37.858114845s","remote":"[::1]:42392","response type":"/etcdserverpb.KV/Put","request count":1,"request size":109,"response count":0,"response size":0,"request content":"key:\"dapr/counters/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder\" value_size:15 "}
{"level":"warn","ts":"2025-01-05T09:12:16.871246Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2025-01-05T09:12:16.872546Z","logger":"etcd-client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00013a5a0/[::]:2379","attempt":0,"error":"rpc error: code = Canceled desc = context canceled"}
{"level":"warn","ts":"2025-01-05T09:12:16.872285Z","caller":"wal/wal.go:805","msg":"slow fdatasync","took":"58.870717526s","expected-duration":"1s"}
{"level":"warn","ts":"2025-01-05T09:11:43.520196Z","caller":"etcdserver/server.go:1177","msg":"Ignore the lease revoking request because current member isn't a leader","local-member-id":16674004417951637033}
{"level":"info","ts":"2025-01-05T09:12:16.872636Z","logger":"diagrid-cron.garbage-collector","caller":"garbage/collector.go:98","msg":"Shutting down garbage collector"}
{"level":"error","ts":"2025-01-05T09:12:16.874195Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:180","msg":"failure marking job for next trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthDecayReminder","error":"context deadline exceeded","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:180\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
{"level":"error","ts":"2025-01-05T09:12:16.875131Z","logger":"diagrid-cron.queue","caller":"queue/queue.go:195","msg":"failure failing job for next retry trigger","name":"dapr/jobs/actorreminder||longhaul-test||playerActorType||player-1||healthReminder","error":"context canceled","stacktrace":"github.com/diagridio/go-etcd-cron/internal/queue.(*Queue).handleTrigger\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:195\ngithub.com/diagridio/go-etcd-cron/internal/queue.(*Queue).Run.(*Queue).executeFn.func1.1\n\t/home/runner/go/pkg/mod/github.com/diagridio/[email protected]/internal/queue/queue.go:231"}
time="2025-01-05T09:12:16.891426906Z" level=info msg="EtcdCron shutting down" instance=dapr-scheduler-server-0 scope=dapr.scheduler.server.cron type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891475695Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/scheduler-workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891506854Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-workflow.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891516798Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-workflow.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891531163Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/scheduler-actor-reminders-server." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891547984Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/playerActorType." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891564021Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-actor-reminders-server.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891577029Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-actor-reminders-server.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891591476Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/scheduler-actor-reminders-client." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891611801Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-actor-reminders-client.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891624713Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-actor-reminders-client.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.89163875Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/scheduler-jobs." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891657958Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-jobs.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.89166961Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.scheduler-jobs.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891685112Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/workflow-gen." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891702741Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.workflow-gen.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891711981Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.workflow-gen.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891729811Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/validation-worker." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891748463Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.validation-worker.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891759798Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.validation-worker.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891774513Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/message-analyzer." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891793242Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.message-analyzer.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.89180314Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.message-analyzer.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891818499Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/pubsub-workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891837071Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.pubsub-workflow.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891847919Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.pubsub-workflow.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.891864251Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/feed-generator." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897866621Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.feed-generator.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897896834Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.feed-generator.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897923785Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/snapshot." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897954978Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.snapshot.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897963165Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.snapshot.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.89797556Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/hashtag-counter." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.897991313Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.hashtag-counter.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.898000616Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.hashtag-counter.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.898012701Z" level=info msg="Removing a Sidecar connection from Scheduler for appID: longhaul-test/hashtag-actor." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.898026779Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/HashTagActor." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.898034562Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.hashtag-actor.workflow." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
time="2025-01-05T09:12:16.898042382Z" level=info msg="Removing a Sidecar connection from Scheduler for actor type: longhaul-test/dapr.internal.longhaul-test.hashtag-actor.activity." instance=dapr-scheduler-server-0 scope=dapr.runtime.scheduler type=log ver=1.15.0-rc.1
{"level":"warn","ts":"2025-01-05T09:12:16.920569Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"95.642µs","request":"header:<ID:2173430840744536094 > lease_revoke:<id:1e2993d596730e04>","response":"size:30","error":"lease not found"}
{"level":"warn","ts":"2025-01-05T09:12:16.920677Z","caller":"etcdserver/server.go:1198","msg":"failed to revoke lease","lease-id":"1e2993d596730e04","error":"lease not found"}

@elena-kolevska
Copy link
Contributor Author

It looks like etcd wasn't able to flush to disk properly ("msg":"slow fdatasync","took":"58.870717526s","expected-duration":"1s") which locked up the instance and all calls to it failed.
That's not a code problem per se, but since Scheduler is completely dependent on etcd, I think we should panic if it fails.
If you agree @cicoyle @JoshVanL let's move this issue to dapr/dapr.

@JoshVanL
Copy link
Contributor

We should never panic, but should gracefully error exit.

@elena-kolevska
Copy link
Contributor Author

Yeah, mind lapse, sorry; that's what I meant. exit 1 :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants