Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up CI tests #58

Open
johnaohara opened this issue Jun 13, 2022 · 6 comments
Open

Speed up CI tests #58

johnaohara opened this issue Jun 13, 2022 · 6 comments

Comments

@johnaohara
Copy link
Contributor

By slightly re-architecturing the testsuite, I have reduced the time taken to run the testsuite in CI from
a) sanity-tests: 13min -> 1min
b) full testsuite: ~55min -> ~2min

In order for part of this work, the PR to delete experiments is required (#57). Another speed-up is obtained by splitting the Docker build into a multi-part build, build a base image with all the required libaries etc, and a second stage where the service files are copied into a second image. This reduces the docker build time in CI from 8min 42s to 33s as the base image does not need to be rebuilt for every test. The only downside to this is a separate process needs to build and push the base image to quay.io. The base image should only need to change for component upgrades

Before I open a PR, does anyone have an opinion on these changes?

@dinogun
Copy link
Contributor

dinogun commented Jun 13, 2022

@chandrams PTAL

@johnaohara
Copy link
Contributor Author

@chandrams for ease or review, I have pushed all the related changes into one branch: https://github.com/johnaohara/hpo/commits/faster-ci

This isn't ready for a PR just yet, but are these are the changes that allowed me to speed up the CI execution (in addition to #57)

@chandrams
Copy link
Contributor

Thanks @johnaohara.

I took a quick look at the changes, I think moving hpo deploy outside the loop is good, however some of the tests look for error messages in the HPO service logs. These error messages might be outdated too as the tests were added long back, will check and update those.

@johnaohara
Copy link
Contributor Author

@chandrams yes, these changes means that there is one log for each test-case. To handle the error cases you mentioned, there could be a number of possible solutions;

  1. Revert the change to start the hpo service once, and return to multiple log files. This would preserve current behaviour, but speed the test suite down
  2. Return exceptions to the client when an error occurs. That way would allow assertions to be applied to the error response from the service, although introduce a security issue returning error details in a client response
  3. Note the current line in the service logs before the test is run, and the log line number at the end of the test, and assert against the contents of the log file that changed during the test execution.

@chandrams
Copy link
Contributor

Thanks @johnaohara

We had implemented the third solution for a similar issue in autotune tests. Have made the same changes to HPO tests now, testing it. Can I push these changes into #57 ?

@johnaohara
Copy link
Contributor Author

@chandrams sure, if you already have a solution, push it into #57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants