Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] PX4 doesn't have a default compiler anymore #23963

Open
PetervdPerk-NXP opened this issue Nov 18, 2024 · 12 comments
Open

[Bug] PX4 doesn't have a default compiler anymore #23963

PetervdPerk-NXP opened this issue Nov 18, 2024 · 12 comments

Comments

@PetervdPerk-NXP
Copy link
Member

PetervdPerk-NXP commented Nov 18, 2024

Describe the bug

#23869 removes the installation of GCC 9.3.1
Instead install OS provided GCC.
Now we're all using different compilers

Host Compiler
Ubuntu 24.04 GCC 13.2
Ubuntu 22.04 GCC 10.3
Ubuntu 20.04 GCC 9-2019-q4
CI Logs show GCC 9.3.1
Any machine that used old ubuntu.sh GCC 9.3.1

https://launchpad.net/ubuntu/+source/gcc-arm-none-eabi

To Reproduce

No response

Expected behavior

No response

Screenshot / Media

No response

Flight Log

No response

Software Version

No response

Flight controller

No response

Vehicle type

None

How are the different components wired up (including port information)

No response

Additional context

No response

@dagar
Copy link
Member

dagar commented Nov 18, 2024

What I would like to do going forward is have the official/supported compiler version that we use for CI and to produce QGC flashable builds is to follow the current Ubuntu LTS.

Unofficially I would like to keep things working back to the previous LTS version. With newer CI resources we can now afford to do a few builds across a larger matrix of distros + versions. However if/when someone hits an issue we need to point them at the supported (current LTS) environment.

@PetervdPerk-NXP
Copy link
Member Author

PetervdPerk-NXP commented Nov 18, 2024

Wouldn't it just be easier to select a compiler that everyone should use?
I think you can install the ARM release of the GCC 13.2 on Ubuntu 22.04 as well.
So that would ensure that everyone is using GCC 13.2.

Reason I'm bringing this up because on a clean Ubuntu 24.04 it will use GCC 13.2 which fails to compile on various targets.

Also do we really want to go into runtime errors caused by different compiler versions?

@dagar
Copy link
Member

dagar commented Nov 18, 2024

We're still choosing a version that CI and production builds will use and what I will strongly recommend people use. The difference is we won't make a mess of your system trying to force an exact version of something.

Another piece to be aware of is that it's more than just a specific GCC version. There are different edge cases with cmake versions, python pain, and then other random tools.

The point of this is to try and be more tolerant of a wider array of setups. There have always been many people using Ubuntu versions in between an LTS, or Fedora, RHEL, Arch, etc, etc. They hit super minor things like compiler warnings or other annoyances, some of them get frustrated and waste a lot of time, others end up hacking these fixes in private and power through. Let's try to minimize the barrier to entry for PX4 development, because whether we like it or not people are doing it regardless.

Reason I'm bringing this up because on a clean Ubuntu 24.04 it will use GCC 13.2 which fails to compile on various targets.

It didn't happen the way I would have preferred, but things are in flux at the moment and will soon be reconciled with everything on 24.04 with packaged GCC (clean simple setup script, default px4-dev container, CI and QGC builds). On the side we'll have a minimal set of builds across all Ubuntu versions (and ideally other distros next).

@PetervdPerk-NXP
Copy link
Member Author

We're still choosing a version that CI and production builds will use and what I will strongly recommend people use. The difference is we won't make a mess of your system trying to force an exact version of something.

Okay that sounds reasonable to me, it's just that at the moment it's quite vague hence this issue whereas before we were fixed on GCC 9.3.1.

@dagar
Copy link
Member

dagar commented Nov 18, 2024

at the moment it's quite vague hence this issue whereas before we were fixed on GCC 9.3.1.

Let's try to get it wrapped up this week. Do you have any specific issues to share while we're here?

With the current setup script and in tree Dockerfile we're now working through validating builds and updating everything in CI.

@PetervdPerk-NXP
Copy link
Member Author

Some compilation errors with GCC 13.2 triggering on werror i.e.

/home/peter/src/PX4-Autopilot/platforms/common/spi.cpp: In function 'int px4_find_spi_bus(uint32_t)':
/home/peter/src/PX4-Autopilot/platforms/common/spi.cpp:90:59: error: the address of 'px4_spi_buses' will never be NULL [-Werror=address]
   90 |         for (int i = 0; ((px4_spi_bus_t *) px4_spi_buses) != nullptr && i < SPI_BUS_MAX_BUS_ITEMS; ++i) {
      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
compilation terminated due to -Wfatal-errors.
cc1plus: all warnings being treated as errors
[536/1335] Building CXX object platforms/common/uORB/CMakeFiles/uORB.dir/uORB.cpp.obj
ninja: build stopped: subcommand failed.
make: *** [Makefile:232: px4_fmu-v4] Error 1
peter@SMW015329:~/src/PX4-Autopilot$

@niklaut
Copy link
Contributor

niklaut commented Nov 19, 2024

Regardless of the specific version, for the compiler we recommend and use in the CI, I would like to see the official ARM compiler used. That ensures that on different host platforms ({Windows, Linux, Darwin} x {Intel, ARM}), the exact same compiler version is used with the exact same patches.

@alexcekay
Copy link
Member

Just a quick note. I have been using GCC 13 since months as I did not want to use the old GCC 9.
Apart from the spi.cpp problem described by @PetervdPerk-NXP I did not have to do any changes.

GCC 13 also has the advantage of using less FLASH for the same build.
For example https://github.com/Auterion/PX4_firmware_private/actions/runs/11930030958/job/33250019827?pr=2707 fails with GCC9, but runs fine with GCC13 locally as it uses 10K less FLASH.

@MaEtUgR
Copy link
Member

MaEtUgR commented Nov 20, 2024

Some compilation errors with GCC 13.2 triggering on werror i.e.

#23994

@PetervdPerk-NXP
Copy link
Member Author

@dagar can we pick a new number/version of GCC going forward?

It seems that V6X-RT has some weird startup issues with the newer GCC compilers (10.3, 13.2) whereas GCC 9.3.1 works fine. I want to look into this issue, however I don't want to debug all potential compiler versions, would be nice to bump the version of GCC and match CI/Releases with that.

@mrpollo
Copy link
Contributor

mrpollo commented Jan 6, 2025

Agreed, we will be picking a default compiler as a minimum version once things settle on our end with CI.

@alexcekay
Copy link
Member

alexcekay commented Jan 8, 2025

Just wanted to quickly add a few more comparisons between the GCC9 used by the CI and a local GCC13.3.1 (from ARM).
I think it would really help us with the FLASH problems we are facing.

FLASH (v5x)
Here is the output from bloaty (bloaty -s vm -d sections v5x-gcc13.elf -- v5x-gcc-ci.elf).

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.2%      +7  +0.2%      +7    .data
  -9.4%      -5  [ = ]       0    .ARM.attributes
  -9.2%      -7  [ = ]       0    .comment
  -3.3% -64.4Ki  [ = ]       0    .debug_abbrev
  +2.0% +3.20Ki  [ = ]       0    .debug_aranges
  +2.6% +12.1Ki  [ = ]       0    .debug_frame
  -2.7%  -737Ki  [ = ]       0    .debug_info
   +15%  +650Ki  [ = ]       0    .debug_line
  [NEW]    +423  [ = ]       0    .debug_line_str
  [DEL] -5.35Mi  [ = ]       0    .debug_loc
  [NEW] +3.46Mi  [ = ]       0    .debug_loclists
  [DEL] -1.25Mi  [ = ]       0    .debug_ranges
  [NEW]  +579Ki  [ = ]       0    .debug_rnglists
  +2.5% +86.4Ki  [ = ]       0    .debug_str
   +11%     +23  [ = ]       0    .shstrtab
 -102.2% -27.8Ki  [ = ]       0    [4 Others]
  [ = ]       0  [DEL]     -52    [ELF Header]
  [ = ]       0  -0.1%     -64    .bss
  [ = ]       0  [DEL]    -128    [ELF Program Headers]
  [DEL] -31.8Ki  [DEL] -31.8Ki    [LOAD #1 [RWX]]
  -2.1% -41.2Ki  -2.1% -41.2Ki    .text
  -5.8% -2.73Mi  -3.5% -73.3Ki    TOTAL

FLASH (v6x)
Here is the output from bloaty (bloaty -s vm -d sections v6x-gcc13.elf -- v6x-gcc-ci.elf).

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  -9.4%      -5  [ = ]       0    .ARM.attributes
  -9.2%      -7  [ = ]       0    .comment
  -3.4% -64.6Ki  [ = ]       0    .debug_abbrev
  +2.1% +3.20Ki  [ = ]       0    .debug_aranges
  +2.6% +11.8Ki  [ = ]       0    .debug_frame
  -2.8%  -731Ki  [ = ]       0    .debug_info
   +15%  +618Ki  [ = ]       0    .debug_line
  [NEW]    +423  [ = ]       0    .debug_line_str
  [DEL] -5.14Mi  [ = ]       0    .debug_loc
  [NEW] +3.32Mi  [ = ]       0    .debug_loclists
  [DEL] -1.19Mi  [ = ]       0    .debug_ranges
  [NEW]  +553Ki  [ = ]       0    .debug_rnglists
  +2.5% +85.2Ki  [ = ]       0    .debug_str
   +10%     +22  [ = ]       0    .shstrtab
  +0.7% +4.25Ki  [ = ]       0    .strtab
  -3.3% -19.3Ki  [ = ]       0    .symtab
  +5.0%     +40  [ = ]       0    [ELF Section Headers]
 -93.9% -68.6Ki  [ = ]       0    [Unmapped]
  [ = ]       0  -0.1%     -64    .bss
  -2.1% -39.4Ki  -2.1% -39.4Ki    .text
  -5.9% -2.67Mi  -2.0% -39.4Ki    TOTAL

CPU LOAD (v5x)
Test setup: v5x running on the bench and armed.

GCC9 max CPU load:  0.749
GCC13 max CPU load: 0.724

The results are very similar here, indicating that the FLASH savings are not gained by a space-time tradeoff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants