Core dump when building on Ubuntu24.04 #32

Open
AlexanderWells-diamond opened this issue Jan 10, 2025 · 9 comments

Comments

@AlexanderWells-diamond
Contributor

AlexanderWells-diamond commented Jan 10, 2025

The PythonSoftIOC CI weekly run, which runs using "ubuntu-latest" and the master branches of all its various dependencies, is seeing a core dump in its latest runs.

The issue appears to be that "ubuntu-latest" now resolves to "24.04", upgraded from "22.04". When epicscorelibs is built on this system, it core dumps the first time we try to call into EPICS C code.

I'm unfortunately not sure what the actual cause of the failure is. My guess is that the version of the C runtime has been updated, but I would not have expected that to be a problem when epicscorelibs is built on the system itself.

I have a branch and PR where I have been investigating the issue here.

@mdavidsaver
Member

This is probably another manifestation of epics-base/epics-base#514

Un-defining the C preprocessor macro _FORTIFY_SOURCE would be one conclusive test. Including ci-core-dumper should also give clear signs in the stack trace.

@AlexanderWells-diamond
Contributor Author

I've run ci-core-dumper here and found this failing thread, which confirms the issue is in dbAllocRecord:

Thread 1 (Thread 0x7ff339573080 (LWP 4898)):
  #0  0x00007ff33929eb1c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
  #1  0x00007ff33924526e in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #2  <signal handler called>
  #3  0x00007ff33929eb1c in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
  #4  0x00007ff33924526e in raise () from /lib/x86_64-linux-gnu/libc.so.6
  #5  0x00007ff3392288ff in abort () from /lib/x86_64-linux-gnu/libc.so.6
  #6  0x00007ff3392297b6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #7  0x00007ff339336c19 in __fortify_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #8  0x00007ff3393365d4 in __chk_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #9  0x00007ff336aca9a9 in dbAllocRecord () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #10 0x00007ff336ab8613 in dbCreateRecord () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #11 0x00007ff336abd5b3 in dbRecordHead.part.0 () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #12 0x00007ff336ac0042 in dbReadCOM () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
  #13 0x00007ff336a93cb2 in dbLoadDatabase () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
...
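
As a reading aid for the trace above: the frame that matters is the one directly below `__chk_fail`, since that is the function whose fortified call tripped the check. A minimal sketch of picking it out automatically, assuming the gdb-style frame layout shown above (`parse_frames` and `caller_of` are hypothetical helper names, not part of ci-core-dumper):

```python
import re

# Matches gdb-style frames such as:
#   #9  0x00007ff336aca9a9 in dbAllocRecord () from /path/libdbCore.so
# Lines like "#2  <signal handler called>" deliberately do not match.
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(?P<func>\S+)\s+\(\)"
    r"(?:\s+from\s+(?P<lib>\S+))?"
)


def parse_frames(backtrace):
    """Return a list of (frame_number, function, library) tuples."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match["num"]), match["func"], match["lib"]))
    return frames


def caller_of(frames, callee="__chk_fail"):
    """Return the frame that called `callee`, or None if it is not present."""
    for index, frame in enumerate(frames):
        if frame[1] == callee and index + 1 < len(frames):
            return frames[index + 1]
    return None


SAMPLE = """\
  #7  0x00007ff339336c19 in __fortify_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #8  0x00007ff3393365d4 in __chk_fail () from /lib/x86_64-linux-gnu/libc.so.6
  #9  0x00007ff336aca9a9 in dbAllocRecord () from /home/runner/.local/lib/python3.12/site-packages/epicscorelibs/lib/libdbCore.so.7.0.7.99.1
"""

culprit = caller_of(parse_frames(SAMPLE))
print(culprit[1])
```

For the three frames copied from the trace, this reports `dbAllocRecord`, matching the conclusion drawn by hand.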

I'm afraid I don't know how to undefine a macro in EPICS from epicscorelibs build process.

Does that linked, merged PR imply that this issue will be fixed with the next EPICS release? Is there anything I can do in the meantime to resolve this issue (aside from not using Ubuntu 24.04)?

@mdavidsaver
Member

> I'm afraid I don't know how to undefine a macro in EPICS from epicscorelibs build process.

hmm. I don't think that I have had to deal with this either, until now. Annoyingly, this is handled by undef_macros=, separate from define_macros=. None of the existing mechanics in epicscorelibs.config are passing undef_macros= through to dependent module builds. I'm not even sure if setuptools-dso will handle this correctly. Sigh...

@mdavidsaver
Member

> ... I'm not even sure if setuptools-dso will handle this correctly. Sigh...

Surprisingly, it looks like this may not be difficult. Apparently define_macros= accepts, in addition to the documented forms ('NAME', 'value') -> -DNAME=value and ('NAME', None) -> -DNAME, an undocumented third form ('NAME',) -> -UNAME.
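
To make the three tuple forms concrete, here is a minimal pure-Python sketch of the mapping as described in this comment. `macro_flags` is a hypothetical helper written for illustration; it is not the setuptools or setuptools-dso implementation, only a restatement of the tuple-to-flag rules stated above.

```python
def macro_flags(macros):
    """Translate macro tuples into GCC-style preprocessor flags.

    ('NAME', 'value') -> -DNAME=value
    ('NAME', None)    -> -DNAME
    ('NAME',)         -> -UNAME
    """
    flags = []
    for macro in macros:
        if len(macro) == 1:
            # One-element tuple: undefine the macro.
            flags.append(f"-U{macro[0]}")
        elif len(macro) == 2:
            name, value = macro
            # None means "define without a value".
            flags.append(f"-D{name}" if value is None else f"-D{name}={value}")
        else:
            raise ValueError(f"unsupported macro specification: {macro!r}")
    return flags


print(macro_flags([("NAME", "value"), ("NAME", None), ("NAME",)]))
```

Order matters for the use case in this issue: an undefine followed by a define of the same name leaves the macro defined with the new value.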

@mdavidsaver
Member

@AlexanderWells-diamond Could you test #33 ?

@AlexanderWells-diamond
Contributor Author

No obvious luck when just dropping it into the existing CI run, see here.

No success either when testing in my own Ubuntu 24.04 image: exactly the same error appears at exactly the same point.

@mdavidsaver
Member

> exactly the same error appears at exactly the same point.

Please re-test. When attempting to test for _FORTIFY_SOURCE, I forgot that this builtin C macro is only defined when optimization is enabled, so a check run without optimization never sees it.

With pip install -v ... you should see:

Detect _FORTIFY_SOURCE 3
Bypass _FORTIFY_SOURCE

then later many repetitions of -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2.
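
The detect-then-bypass behaviour reported in that log can be sketched roughly as follows. This is not the code from #33; `detect_fortify_level` and `bypass_macros` are hypothetical names, and the threshold (cap anything above level 2 back to 2) is an assumption inferred from the `Detect _FORTIFY_SOURCE 3` line and the resulting `-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2` flags. The input is assumed to be the compiler's predefined-macro dump, for example from `gcc -O2 -dM -E -x c /dev/null`, with optimization enabled as noted above.

```python
import re

# Matches a line such as "#define _FORTIFY_SOURCE 3" in a macro dump.
FORTIFY_RE = re.compile(r"^#define[ \t]+_FORTIFY_SOURCE[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def detect_fortify_level(predefined_macros):
    """Return the builtin _FORTIFY_SOURCE level, or None if it is not defined."""
    match = FORTIFY_RE.search(predefined_macros)
    return int(match.group(1)) if match else None


def bypass_macros(level, max_level=2):
    """Return macro tuples that cap _FORTIFY_SOURCE at max_level (assumed policy).

    The one-element tuple undefines the builtin value; the two-element
    tuple then redefines it at the lower level.
    """
    if level is None or level <= max_level:
        return []  # Nothing to bypass.
    return [("_FORTIFY_SOURCE",), ("_FORTIFY_SOURCE", str(max_level))]


dump = "#define __GNUC__ 13\n#define _FORTIFY_SOURCE 3\n"
level = detect_fortify_level(dump)
print("Detect _FORTIFY_SOURCE", level)
print(bypass_macros(level))
```

If the dump is taken without an optimization flag, the macro is absent and `detect_fortify_level` returns `None`, which is the failure mode described in this comment.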

@AlexanderWells-diamond
Contributor Author

AlexanderWells-diamond commented Jan 16, 2025

We no longer see an abort, and the tests pass fine. Thank you for fixing this!

@mdavidsaver
Member

#33 is merged. FYI, this is a workaround. The real fix will come with the next merge from epics-base into epicscorelibs.
