Fixing imx6 recovery install timeout issue #314

Closed
wants to merge 1 commit into from

Conversation

kiya956
Contributor

@kiya956 kiya956 commented Jul 30, 2024

Description

Because different platforms have different install and system boot times,
hard-coding the timeout to 600 is not appropriate. The device system boot check should support a configurable timeout.
How to use:
    in job.yaml, provide

    provision_data:
        boot_timeout: <time in seconds>
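
As a rough illustration of the behaviour described above, a device connector could read the value from the job data and fall back to the existing default when it is absent. This is a minimal sketch, not the code in this PR: the helper name `get_boot_timeout` and the dict-shaped `job_data` are assumptions, and the default of 600 is taken to be in seconds because `boot_timeout` is described that way.

```python
# Hypothetical helper; names are illustrative and not taken from this PR.
DEFAULT_BOOT_TIMEOUT = 600  # the hard-coded value mentioned above, assumed to be seconds


def get_boot_timeout(job_data: dict) -> int:
    """Return provision_data.boot_timeout in seconds, or the default if unset."""
    provision_data = job_data.get("provision_data") or {}
    raw = provision_data.get("boot_timeout", DEFAULT_BOOT_TIMEOUT)
    try:
        timeout = int(raw)
    except (TypeError, ValueError):
        raise ValueError(
            f"boot_timeout must be an integer number of seconds, got {raw!r}"
        ) from None
    if timeout <= 0:
        raise ValueError(f"boot_timeout must be positive, got {timeout}")
    return timeout
```

Rejecting non-numeric or non-positive values early means a typo in job.yaml fails at submission time rather than after a long provisioning wait.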

Resolved issues

Fixing imx6 recovery install timeout issue

Documentation

Web service API changes

Tests

Submitted a task YAML with boot_timeout and verified that the boot timeout is updated.

@jocave jocave requested a review from plars August 8, 2024 16:55
Collaborator

@jocave jocave left a comment


Leaving Paul to review the code, but at a minimum the documentation should be updated.

Contributor

@plars plars left a comment


Yeah, it definitely needs documentation updates as Jonathan mentioned. However, I'm curious why is this on oemrecovery? I would have guessed this device might use something like muxpi/sdwire/zapper instead? I'm just wondering if we need to support this more generically for other device connectors also, or even if perhaps this wasn't the connector you meant to implement it on.

@kiya956 kiya956 force-pushed the timeout branch 2 times, most recently from b33242e to b73bd49 Compare August 9, 2024 02:26
Description:
    Because different platforms have different install and system boot times,
    hard-coding the timeout to 600 is not appropriate. The device system boot check should support a configurable timeout.
    How to use:
        in job.yaml, provide

        provision_data:
            boot_timeout: <time in seconds>

Resolved issues:
    Fixing imx6 recovery install timeout issue

Documentation:
    N/A
Web service API changes:
    N/A
Tests
    Submitted a task YAML with boot_timeout and verified that the boot timeout is updated.

Signed-off-by: ChunAn Wu <[email protected]>
@kiya956
Contributor Author

kiya956 commented Aug 9, 2024

> Yeah, it definitely needs documentation updates as Jonathan mentioned. However, I'm curious why is this on oemrecovery? I would have guessed this device might use something like muxpi/sdwire/zapper instead? I'm just wondering if we need to support this more generically for other device connectors also, or even if perhaps this wasn't the connector you meant to implement it on.

Hi Paul,
I have updated the documentation, but I am not sure whether it is appropriate.
As for the question "why is this on oemrecovery?": our current SRU testing on IoT devices running Ubuntu Core only uses oemrecovery. I haven't checked the other methods, so I am not sure whether this change is appropriate for them. @kevinyehk, please correct me if I am wrong.

@plars
Contributor

plars commented Aug 15, 2024

@kiya956 I talked to @kevinyehk and he confirmed that we have several IoT devices using oemrecovery (which is still very surprising to me! can these not boot from an sd card or usb?)

My biggest concern here is that it relies on the test requestor knowing some magic value that is needed for each specific device. The only other options I see are:

  1. make it a config option for the device connector - this is good because it allows us to set the appropriate value for each device without the test requestor needing to know it, but bad because it will require configs to be stored in c3, generated by our config generator, etc.
  2. set a larger default timeout - this might be the most sane thing to do, but we already wait an hour to see if the device was provisioned. How long were you thinking that we need to wait on these devices for the reinstall to work properly? Are we sure there's not something wrong with the device if it takes so long? slow storage for example?
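
To make the trade-off between these options concrete, here is a minimal sketch of how the sources could be layered: a value in the job wins, then a per-device connector config value (option 1), then a built-in default (option 2). The function name `resolve_boot_timeout`, the `boot_timeout` key in the device config, and the dict-shaped inputs are all assumptions for illustration; nothing here is taken from the actual connector code.

```python
# Hypothetical precedence: job.yaml value > device connector config > default.
DEFAULT_BOOT_TIMEOUT = 600  # assumed to be seconds; option 2 would simply raise this


def resolve_boot_timeout(job_data: dict, device_config: dict,
                         default: int = DEFAULT_BOOT_TIMEOUT) -> int:
    """Pick the boot timeout from the most specific source that sets one."""
    job_value = (job_data.get("provision_data") or {}).get("boot_timeout")
    if job_value is not None:
        return int(job_value)

    # Option 1: the value lives with the device, so the test requestor
    # does not need to know it.
    config_value = device_config.get("boot_timeout")
    if config_value is not None:
        return int(config_value)

    return default
```

With this layering, a slow device could carry its own value in config while every other job keeps the default without any change to job.yaml.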

@kiya956
Contributor Author

kiya956 commented Aug 15, 2024

Hi @plars, thank you for your suggestion. Requiring the test requester to know the recovery time is indeed an issue.
To me, the first option is more practical: the system on the device will not change, so the recovery time shouldn't change much. Continually enlarging the default timeout doesn't seem like a good idea to me (we will have uc22/uc24 on imx6, which will take longer).
What do you think?

@plars
Contributor

plars commented Aug 20, 2024

@kiya956 Option 1 is definitely harder for us and requires a lot more changes and tracking of these special cases in some place like c3. I'd really like to avoid it if we can. Option 2 doesn't necessarily mean that other devices will take longer to provision, just that we might give them a bit more time until we give up trying. But what I'm really curious to know, is - how long do you think the timeout needs to be for these devices?

@kiya956
Contributor Author

kiya956 commented Aug 21, 2024

@plars It's around 90 to 100 minutes.
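
For scale: if `boot_timeout` is expressed in seconds, as the description states, 90 to 100 minutes corresponds to 5400 to 6000 seconds, roughly ten times the hard-coded 600. The sketch below shows that conversion alongside a generic deadline-based polling loop. Both `minutes_to_seconds` and `wait_for_boot` are hypothetical helpers written for illustration, not functions from the connector.

```python
import time
from typing import Callable


def minutes_to_seconds(minutes: float) -> int:
    """Convert minutes to whole seconds (100 minutes -> 6000 seconds)."""
    return int(minutes * 60)


def wait_for_boot(check: Callable[[], bool], timeout: float, interval: float = 10,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        # Never sleep past the deadline, so the final check happens on time.
        sleep(min(interval, deadline - clock()))
```

A caller would pass something like `minutes_to_seconds(100)` as `timeout`, together with whatever boot check the connector already performs. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.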

@kiya956 kiya956 closed this Nov 7, 2024