-
Notifications
You must be signed in to change notification settings - Fork 518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CH6178] Checkpoints/diagnostics #1369
Conversation
…ed_system section size
…hen built by C (as opposed to C++) compiler. Replace const variables like OS_THREAD_INVALID_HANDLE in concurrent_hal.h with defines when being built by C compiler
…d of passing an integer argument to __builtin_return_address
void* stack; | ||
void* stack_start; | ||
void* stack_end; | ||
} os_thread_dump_info_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted to expand this structure later while retaining backwards compatibility, what would we need to add/change now? A reserved void * at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's currently only used within a single module and is not exported, so that shouldn't be a concern at the moment. If we later decide to export it, the first reserved
becomes a version
field as usual.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out @avtolstoy, sounds good.
This great feature is also going to need some docs to ensure users understand how to use it. |
Andrey, please fix up the conflicts so we can rebase for 0.8.0-rc.2 release. |
We also need to rename all the mentions of diagnostic to trace/tracing as originally mentioned in [CH8404] to avoid the confusion. |
Superseded by #1451
Problem
When a device hard faults, crashes, freezes or is otherwise misbehaving, it is presently difficult to know what the last action the device took is and whether the fault lies in application code, system code, peripheral code etc..
A checkpoints API would allow application and system firmware to record their current execution progress and provide an indication of where execution halted prior to the crash. This information is then published to the cloud on next successful connection.
It is also possible to get stack trace of all the threads using a naive stack unwinder, that just scans through the thread stack, looking for flash addresses (within system and user part bounds) and checks if there is a branching instruction at that location.
It would be useful to store per-thread stack traces in addition to checkpoint info whenever the device enters a hardfault, a panic state, or on demand. This will provide a better overview of the system state and help understand the cause of the crash or deadlock for difficult to reproduce issues.
Solution
System code may use the following macros defined in
diagnostic.h
:DIAGNOSTIC_CHECKPOINT()
- regular instruction address type checkpointDIAGNOSTIC_PANIC_CHECKPOINT()
- special instruction address type checkpoint which should only be called from a panic handler, which will prevent further modification of diagnostic dataDIAGNOSTIC_CRASH_CHECKPOINT(pc)
- special instruction address type checkpoint which should only be called from a crash (hardfault) handler, which will prevent further modification of diagnostic data. An instruction address needs to be manually passed as an argument.DIAGNOSTIC_UPDATE()
- forces a full system state update (including stacktraces).All
LOG
statements, except forLOG_DUMP
, make a call toDIAGNOSTIC_CHECKPOINT()
.Both hardfault and panic handlers make a call to their respective macros:
DIAGNOSTIC_CRASH_CHECKPOINT()
,DIAGNOSTIC_PANIC_CHECKPOINT()
Application code may use the following macros defined in
spark_wiring_diagnostic.h
:CHECKPOINT()
- a standard variant ofCHECKPOINT()
macro. Implementation depends on a preprocessor macroDIAGNOSTIC_ELF_AVAILABLE
.CHECKPOINT()
macro internally callsDIAGNOSTIC_CHECKPOINT()
, making it an instruction address type checkpointCHECKPOINT()
macro internally makes a call toDIAGNOSTIC_TEXT_CHECKPOINT()
, which saves the location of the checkpoint in textual format:__FILENAME__:__LINE__
CHECKPOINT(text)
- a specialized variant ofCHECKPOINT()
macro, which forces a textual checkpoint with the text passed as an argumentAll
Logger
class calls, except forprint
,printf
,write
anddump
, internally include a call to_LOG_CHECKPOINT()
macro defined in spark_wiring_diagnostic.h, which is only enabled whenDIAGNOSTIC_ELF_AVAILABLE == 1
. It acquires an address of the calling function and uses that as the checkpoint instruction address.See CH6178 for implementation details.
Steps to Test
Example App
References
Completeness
Feature
[PR #1369]
[Photon/P1/Electron]
Added Checkpoints/diagnostics