Refactor test apps to use unit-test framework #4014
…e new framework (reason: because parallel flag is not set, doh!)
…, skip essential tests, list tests
…ce returning value
…use of errors when running on Windows virtual machine
…t wait the previous test to complete)
…ing, and running unit test. This can be used by all test apps. pjlib-util-test has been ported to use this utilities
…sential test because it does not exist in features test (in pjlib-test)
…up from 45m originally to 15m using 10 worker threads
…4:30 minutes with 10 worker threads, from 45:42m originally)
…provements due to exclusive tests
… automatic error reporting) hopefully make it easier to use
…e with unit-test logging (see unittest.md)
…args are set in GitHub action variables
… into unittest-framework
… into unittest-framework
I think this is done. TODO probably as follow up PRs:
For CI Windows, should we also add a 64-bit version?
Rather than doubling the number of tests, perhaps we can just make 64-bit the default for testing (i.e. make every test config build for win64) instead? Most machines should be 64-bit nowadays.
Wholeheartedly agree on Win64 (I already wrote it as a TODO in my comment above). Unfortunately the pjsip/third_party_libs only has x86 as the target for most libs (some do have x64 as a target). So rather than just building and testing a basic exe, I was thinking of doing the work in a separate PR to create a more elaborate CI.
Win CI sometimes failed this week (it was very stable last week):
Will keep watching for these issues.
This PR contains modifications to PJSIP test apps (pjlib-test, pjlib-util-test, pjnath-test, pjmedia-test, and pjsip-test) to use the new unit test framework (#4007) with the main objective to make them complete faster.
Timing Results
Let's get straight into it. Below are the test time improvements from the original with the new framework using several worker thread settings (1-10).
GitHub CI timings:
Notes on timing
Settings with three worker threads (four threads in total, counting the main thread) are significant because GitHub Ubuntu and Windows runners have 4 vCPUs (see the linked article). Mac-latest has 3 vCPUs.
Some test apps cannot be made faster than a certain limit no matter how many worker threads are added, because the total run time can never be shorter than the longest single test case in that app.
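The limit described above can be illustrated with a toy scheduling model. This is only a sketch, not the framework's actual scheduler: the durations are made up, `parallel_run_time` is a hypothetical helper, and `workers` here counts every thread that runs tests.

```python
import heapq


def parallel_run_time(durations, workers):
    """Return the total run time when each test, taken in order,
    is handed to whichever worker becomes free first."""
    if workers < 1:
        raise ValueError("need at least one worker")
    # Min-heap of the time at which each worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for duration in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


# Hypothetical test durations in seconds: one long test and several short ones.
durations = [180, 30, 20, 20, 10, 10, 5, 5]

for workers in (1, 2, 4, 8):
    print(workers, parallel_run_time(durations, workers))
```

In this model the run time drops from 280 seconds with one worker to 180 seconds with two, and then stays at 180 seconds however many workers are added, because the 180-second test cannot be split.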
General look and feel
All test apps have a common look and feel with uniform command line options, which look something like this:
The test outputs are also uniform and look something like this:
Running the tests
With the Makefile build system, it is easier to run the tests with the make command. The Makefile accepts two environment variables: CI_ARGS contains arguments for the test apps, and CI_MODE indicates that we're running under GitHub CI (#3374). Sample invocation:
Otherwise (e.g. on Windows), run each of the apps directly. Use -h to get help.
GitHub CI modifications
… CI_UBUNTU_ARGS, CI_WIN_ARGS, CI_MAC_ARGS, and CI_MODE.
I think the CI should incorporate more elaborate tests and cover more features/scenarios, but that is outside the scope of this PR.
Tips on troubleshooting errors
When the logging does not convey sufficient info about the error, use --log-no-cache to display logs as they are written, most likely together with -w 0 to disable worker threads and avoid cluttering the output.
But sometimes a problem only arises with a specific number of worker threads and a specific test order. In this case, troubleshooting will be challenging indeed. :) Use -v, --verbose to display when tests are started and ended. This way you know which tests had been started when the failed test was running. After that, you can try running only these tests rather than all tests to reproduce the problem.
Test shuffling (--shuffle arg) is used by default on GitHub CI via repository variables (see above). To reproduce the error, make note of the seed value used when running the failed test (it is printed in the output), and re-run the test locally using the --shuffle --seed N args.
The --stop-err option is useful to avoid waiting for all tests to complete when debugging an error.
Open issues
Reproducibility
As mentioned above, we're supposed to be able to reproduce the test sequence by using --shuffle with a specific --seed value. But this is not always the case. Even with the same seed, the test sequence can differ between machines. We even use our own pseudo-random number generator in unittest.c, but sometimes this does not fix the problem.
Test app modifications
General
There is a new utility file, pjlib/src/pjlib-test/test_util.h, that is shared by all test apps to parse command line arguments, show usage, register tests, and control the unit testing process.
The main front-end files (main.c) were modified to behave better as command line apps.
The main modification in the test body (test.c) is to use the unit-test framework.
Some test code was changed, replacing manual checks with PJ_TEST_XXX() macros, mainly to exercise these macros and to make the tests nicer. But since this made the PR very big, I didn't continue the effort except where it was necessary for debugging some problems.
In general, large tests needed to be split into smaller ones so that they can run in parallel. But major problems arose, mainly because the tests share global state or manipulate common objects.
More specific changes are discussed below.
pjlib-test notes
pjlib-test has "special" arrangements in test.c, because it needs to test the unit-test (UT) framework first, before running the rest of the tests using the UT framework. But before testing the UT framework, it needs to test the components that the UT framework depends on, such as list, fifobuf, and OS. And so on. That's why the test output is different from that of the other test apps.
Other than that, the modifications to the test functions are not too major, at least compared to pjnath-test and pjsip-test, and I think the test time is quite satisfactory.
pjlib-util-test notes
We couldn't speed it up further because tests such as resolver_test() and http_client_test() take about three minutes to complete, and they couldn't be split up without major effort due to their use of global state. Since the test time is already quite satisfactory, I didn't pursue further optimizations.
pjnath-test notes
pjnath-test requires large modifications to make the tests run in parallel, as follows:
… mem pool factory, since many tests validate memory leaks in the pool factory, therefore having a single pool factory will not work
… server.c so that the server can be instantiated multiple times simultaneously (this was the motivation behind "API to get DNS server's bound address to allow specifying zero as port number", #3999).
… (ice_test, turn_sock_test, concur_test) into an individual test for each configuration, making them parallelizable.
As a result, there are 70 smaller test items in pjnath-test, and with 7 worker threads, we can save 40 minutes of test time!
pjmedia-test notes
pjmedia-test has the fewest modifications because it has very few tests. The original duration was 4m18.691s, and it has come down a little to 2m8.363s with 1 worker thread.
Having said that, some minor modifications were done:
… pjmedia_endpt_create() with pjmedia_endpt_create2() (similarly ..destroy() with ..destroy2()) in mips_test() and codec_test_vectors(), to avoid inadvertently initializing pjmedia_aud_subsys, which on Ubuntu emits lots of debugging messages during initialization (although the messages should have been suppressed in the code).
… printf with log in the jbuf test to make the output tidy, and renamed the jbuf_main function to jbuf_test to be consistent.
pjsip-test notes
pjsip-test has gone through the biggest and most difficult modifications to make the tests parallelizable, which involve:
… pjsip_tpselector to bind the transaction (and tdata in the case of a stateless request) to a specific loop transport, otherwise the transaction/tdata may find another instance of the loop transport
… (tsx_uac_test failed because the UA layer has now been registered before the test)
… tsx_basic_test, tsx_uac_test, tsx_uas_test to take the index to the parameters rather than the parameter itself, to make the test output more informative.