Speed up CI tests #58

johnaohara · 2022-06-13T10:33:33Z

By slightly re-architecting the testsuite, I have reduced the time taken to run it in CI:
a) sanity-tests: 13min -> 1min
b) full testsuite: ~55min -> ~2min

Part of this work depends on the PR to delete experiments (#57). Another speed-up comes from splitting the Docker build into a multi-part build: a base image is built with all the required libraries etc., and a second stage copies the service files into a second image. This reduces the Docker build time in CI from 8min 42s to 33s, as the base image does not need to be rebuilt for every test. The only downside is that a separate process needs to build and push the base image to quay.io. The base image should only need to change for component upgrades.

Before I open a PR, does anyone have an opinion on these changes?

dinogun · 2022-06-13T10:50:45Z

@chandrams PTAL

johnaohara · 2022-06-13T10:56:42Z

@chandrams for ease of review, I have pushed all the related changes into one branch: https://github.com/johnaohara/hpo/commits/faster-ci

This isn't ready for a PR just yet, but these are the changes that allowed me to speed up the CI execution (in addition to #57).

chandrams · 2022-06-13T11:36:48Z

Thanks @johnaohara.

I took a quick look at the changes. I think moving the HPO deploy outside the loop is good; however, some of the tests look for error messages in the HPO service logs. These error messages might also be outdated, as the tests were added a long time ago. I will check and update them.

johnaohara · 2022-06-13T11:45:10Z

@chandrams yes, these changes mean that there is now a single service log rather than multiple log files. To handle the error cases you mentioned, there are a number of possible solutions:

1. Revert the change to start the HPO service once, and return to multiple log files. This would preserve the current behaviour, but slow the test suite down.
2. Return exceptions to the client when an error occurs. That would allow assertions to be applied to the error response from the service, although returning error details in a client response could introduce a security issue.
3. Note the current line number in the service log before the test is run and again at the end of the test, and assert against the part of the log file that was written during the test execution.
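
To make the third option concrete, here is a minimal Python sketch of the idea. It is only an illustration under assumptions: the helper names `count_log_lines` and `log_lines_since`, the log file name, and the error text are all hypothetical and not taken from the HPO testsuite, which may implement this differently.

```python
import itertools
import os
import tempfile


def count_log_lines(path):
    """Return the number of lines currently in the log (0 if the file does not exist yet)."""
    if not os.path.exists(path):
        return 0
    with open(path, "r", errors="replace") as f:
        return sum(1 for _ in f)


def log_lines_since(path, start_line):
    """Return only the lines written after the first `start_line` lines of the log."""
    with open(path, "r", errors="replace") as f:
        return list(itertools.islice(f, start_line, None))


if __name__ == "__main__":
    # Hypothetical stand-in for the shared service log.
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "service.log")
        with open(log_path, "w") as f:
            f.write("service started\n")

        # 1. Note the current position before the test runs.
        mark = count_log_lines(log_path)

        # 2. The test runs; here we fake the service logging an error (made-up text).
        with open(log_path, "a") as f:
            f.write("ERROR: made-up failure message\n")

        # 3. Assert only against what was logged during this test.
        new_lines = log_lines_since(log_path, mark)
        assert any("made-up failure message" in line for line in new_lines)
```

Because each test only inspects the lines appended after its own mark, error messages left in the shared log by earlier tests cannot cause a false match.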

chandrams · 2022-06-15T05:55:11Z

Thanks @johnaohara

We had implemented the third solution for a similar issue in the autotune tests. I have now made the same changes to the HPO tests and am testing them. Can I push these changes into #57?

johnaohara · 2022-06-15T06:32:19Z

@chandrams sure, if you already have a solution, push it into #57

johnaohara mentioned this issue Jun 13, 2022

Failures in full testsuite #59

Closed
