bgpd crash (frr 8.4.2, FreeBSD 13.1p5) #12693

Closed
1 of 2 tasks
opsec opened this issue Jan 26, 2023 · 14 comments
Labels
triage Needs further investigation

Comments

@opsec

opsec commented Jan 26, 2023

Describe the bug

  • bgpd crashed after approx. 12 hours of operation
  • from the FreeBSD ports tree, frr version 8.4.2
  • running on FreeBSD 13.1p5, amd64 architecture
  • CPU: Intel(R) Xeon(R) Gold 5315Y CPU @ 3.20GHz
  • NICs: 2x Intel(R) X550-T2 and 2x Mellanox MCX512A-ACAT
BGP: Received signal 6 at 1674728565 (si_addr 0x0); aborting...
BGP: in thread work_queue_run scheduled from lib/workqueue.c:137 work_queue_schedule()
  • Did you check if this is a duplicate issue? Yes, found no crash for 8.4.2
  • Did you test it on the latest FRRouting/frr master branch? No. I will try to build it, but this is a production system

To Reproduce

It's unclear if or how this can be reproduced. It was not an out-of-memory condition.

Expected behavior

bgpd should not crash

Versions

  • OS Version: FreeBSD 13.1p5
  • Kernel: FreeBSD 13.1p3
  • FRR Version: 8.4.2

Additional context
14 ipv4 peers
8 ipv6 peers

@opsec opsec added the triage Needs further investigation label Jan 26, 2023
@ton31337
Member

Possible to get a full stack trace of the crash?

@opsec
Author

opsec commented Jan 26, 2023

I'm trying to find a way to get a stack trace. /var/tmp/frr/ contains some stuff, but no core dumps.

@donaldsharp
Member

The backtrace might be in one of the log files. Look in /var/log/frr as well.

@opsec
Author

opsec commented Jan 26, 2023

I think it ran into the assert(wq) on line 246, inside work_queue_run(), so abort() was called. I am trying to find out how to make FreeBSD produce a stack trace when abort() is called. I checked /var/log/frr/bgpd and found only the two lines I quoted.

@opsec
Author

opsec commented Jan 27, 2023

olivier@, the port maintainer on FreeBSD, gave me two hints:

sysctl kern.corefile=/var/crash/%N.%P.%U.core

This tells the kernel where to write the core file. Also, the port is currently built with --disable-backtrace; we'll test a build with --enable-backtrace.

@opsec
Author

opsec commented Feb 4, 2023

The problem has not reappeared so far. Still waiting for it to happen again.

@ocochard
Contributor

Still nothing?

@opsec
Author

opsec commented Apr 18, 2023

Sorry, nothing. Or rather: a different crash, see
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270910
and https://nepustil.net/support/frr/

@pautiina

See #13346.
Issue can be closed.

@opsec
Author

opsec commented Apr 21, 2023

I'm pretty sure this was not the same crash cause, because the log entry is very different from the one in 13346.

@pautiina

I'm pretty sure this was not the same crash cause, because the log entry is very different from the one in 13346.

If the error occurs again, you can open a new issue.

@opsec
Author

opsec commented Apr 21, 2023

That is why this issue was still open: it's a different crash.

@ton31337 ton31337 reopened this Apr 21, 2023
@github-actions

This issue is stale because it has been open 180 days with no activity. Comment or remove the autoclose label in order to avoid having this issue closed.

@frrbot

frrbot bot commented Oct 19, 2023

This issue will be automatically closed in the specified period unless there is further activity.

@frrbot frrbot bot closed this as completed Oct 26, 2023
@frrbot frrbot bot removed the autoclose label Oct 26, 2023