Error handling (kernel vs firmware first) #71

Open
andreiw opened this issue Sep 7, 2023 · 11 comments

Comments

@andreiw
Collaborator

andreiw commented Sep 7, 2023

Current language for BRS-I mentions firmware-first (APEI) handling. We need to close on whether that's really doable for v1 of the BRS or whether we should just remove any requirement on error handling implementation for now.

@andreiw andreiw mentioned this issue Sep 7, 2023
@dhaval-rivos

Let us discuss this in BRS, but we already have a spec WIP as part of the RAS POC.

@adurbin-rivos
Collaborator

Sounds like there are in-flight changes to ACPI and UEFI (CPER) to be tracked in PRS. @dhaval-rivos is indicating that we need these changes to have a more functional APEI. What is the time frame on these change proposals?

@andreiw
Collaborator Author

andreiw commented Sep 14, 2023

This is what I got back from Himanshu at Ventana: APEI doesn't require any additional hardware capabilities.

So there's some disagreement here. To be investigated.

@dhaval-rivos

dhaval-rivos commented Sep 14, 2023

Hi Andrei, I am not exactly sure what Himanshu meant, but unless I misunderstood the question, let me provide a specific example of why I said APEI/CPER changes are tied to HW. We have a notification method for RAS events, and this notification method is HW dependent (RAS exception). Similarly, the CPER table contains information that comes from RV HW. So if the question was whether we could define APEI/CPER independently of the RERI spec being defined, I still think there is a dependency.

@avpatel
Collaborator

avpatel commented Sep 14, 2023

The things required for APEI/CPER are:

  • RAS local IRQs, which are already standardized/ratified by the AIA spec.
  • RAS non-maskable interrupts, defined by the RNMI spec, which is in its final stages.
  • RAS exception codes, which will be standardized as part of the Priv v1.13 spec by the end of this year.

APEI/CPER, as being implemented for the RISC-V RAS software stack, are independent of RERI, and the above-mentioned dependencies will be resolved soon.

@hschauhan Please add if I missed anything.

@hschauhan

APEI only captures the information about how events are propagated. APEI doesn't depend on the hardware because it doesn't actually deliver the event. The propagation is dependent on hardware, but the information about it is not. These two things are not the same.

@dhaval-rivos

dhaval-rivos commented Sep 14, 2023 via email

@avpatel
Collaborator

avpatel commented Sep 14, 2023

@dhaval-rivos We can always do APEI-related ECRs incrementally. The first set of APEI-related ECRs can be very basic, with no dependency on HW specs.

@dhaval-rivos

dhaval-rivos commented Sep 14, 2023 via email

@andreiw
Collaborator Author

andreiw commented Oct 16, 2023

So it sounds like there may be an AIA dependency (perhaps AIA thus implies compliance with the RAS specs), but yeah, I see that the RAS specs are still in flight, so we can hold our horses for sure...

@andreiw
Collaborator Author

andreiw commented Apr 11, 2024

Deferring on this (but still keeping it around for context).


5 participants