-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support "cold" data and code #9754
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -122,6 +122,14 @@ config COMPILER_INLINE_FUNCTION_OPTION | |||
help | |||
When enabled, -fno-inline-function option is not passed to compiler | |||
|
|||
config COLD_STORE_EXECUTE_DRAM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this concept of the cold store. i.e. we keeps things in the cold store until we need to use them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgirdwood rather "unless," than "until." In most cases e.g. with "cold base firmware code" it's never copied to SRAM, it's marked as non-performance critical and is always executed in DRAM. Same with LLEXT module code. By default the same is done with data. But with data we plan an optimisation where one can explicitly request a copy of "cold" data in SRAM for a limited period of time.
src/ipc/ipc4/handler.c
Outdated
@@ -131,14 +131,14 @@ static inline const struct ipc4_pipeline_set_state_data *ipc4_get_pipeline_data( | |||
/* | |||
* Global IPC Operations. | |||
*/ | |||
static int ipc4_new_pipeline(struct ipc4_message_request *ipc4) | |||
__cold static int ipc4_new_pipeline(struct ipc4_message_request *ipc4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the IPC irq TEXT is SRAM, but IRQ thread TEXT is DRAM and its stack is SRAM ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgirdwood this is just an example - it's marked "DNM" - I marked more or less random functions with __cold
for a test. When we do merge this, we'll make careful choices - which one goes where
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgirdwood this version can actually be merged - it does convert several functions and objects to "cold" - those, that seem safe to be converted. For data we don't have many objects that we can move over, but for functions we should add more gradually, after the base is merged
30a86f1
to
c34c76c
Compare
@lyakh can you rebase and force push to pick up the GH action fix. Thanks ! |
@@ -46,6 +46,7 @@ CONFIG_SOF_TELEMETRY_IO_PERFORMANCE_MEASUREMENTS=y | |||
CONFIG_SOF_TELEMETRY_PERFORMANCE_MEASUREMENTS=y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: enabld
To execute a non-performance critical part of SOF code base in DRAM we create an additional "cold" module, that will stay in DRAM next to the boot module. This patch only adds a Kconfig option and module manifests in TOML. Signed-off-by: Guennadi Liakhovetski <[email protected]>
This extends LLEXT cold data and code support to the entire SOF code base. Using the same __cold and __cold_rodata qualifies any code or data can be marked to be used directly in DRAM without being copied to SRAM. Signed-off-by: Guennadi Liakhovetski <[email protected]>
A lot of code isn't performance-critical and can be used directly in DRAM, without being copied into scarce SRAM. This commit selects two functions as first candidates for that. Moving data to DRAM is more difficult. The largest data blobs are audio processing coefficients and they're usually used during audio processing, i.e. when performance is critical. A good candidate for such data relocation is the src component, which has many coefficient sets, of which only some are used at run-time. Follow up work will switch to keeping all src coefficients in DRAM and only copying used ones into dynamically allocated SRAM buffers. This commit only moves several conversion function selection arrays into DRAM. Those arrays are small so this won't free a lot of SRAM, but at least this will serve as the first test. Signed-off-by: Guennadi Liakhovetski <[email protected]>
Leave non-performance critical code and data in DRAM, saving SRAM and sacrificing some performance. Signed-off-by: Guennadi Liakhovetski <[email protected]>
The LP64-WIP test is also failing with insufficient disc space errors. Apply the same work-around as for other tests. Signed-off-by: Guennadi Liakhovetski <[email protected]>
Depends on zephyrproject-rtos/zephyr#83275 - cannot be merged before it. Examples here are illustrative, will be dropped or replaced with something meaningful when cleaning up.
This extends the two primitives:
__cold
and__cold_rodata
to be usable universally - whether the code is built into the base firmware or as LLEXT. In both cases respectively marked code and data will remain in DRAM and won't be copied to SRAM. #9721 will be adapted to this after this is merged