[tests] [unit] Fix the CLI test so it doesn't leave a bunch of temp files behind #832
base: main
Conversation
60a2b78 to 666f36d (Compare)
Two commits in the middle are awesome, two commits at the edges are questionable (please see my comments inline).
@@ -928,10 +928,6 @@ def test_a_user_sees_error_when_they_provide_a_non_cachi2_sbom_for_a_merge(
@pytest.mark.parametrize(
    "sbom_files_to_merge",
    [
I strongly suggest not skipping simpler tests. The idea behind having a progression of complexity is to help with detecting a breaking point. If some update breaks the two-SBOM case then it will likely break the three-SBOM case as well. If, however, something breaks the three-SBOM case but keeps the two-SBOM case intact, that immediately narrows things down quite a bit. It usually does not make sense to go beyond three, which is why I have never added more cases here.
I strongly suggest not skipping simpler tests. The idea behind having a progression of complexity is to help with detecting a breaking point. If some update breaks the two-SBOM case then it will likely break the three-SBOM case as well.
May be true in general, but common sense tells me otherwise in this particular case. If you look at the actual functional code:
cachi2/cachi2/interface/cli.py, lines 459 to 460 at 2235854:

    sbom = sum(sboms_to_merge[1:], start=start_sbom)
    sbom_json = sbom.model_dump_json(indent=2, by_alias=True, exclude_none=True)
We are really only doing a string concatenation of JSONs and calling into the JSON formatter exactly once. I don't think testing both 2- and 3-SBOM merges brings us any value because of this. Again, it might make sense elsewhere, but I don't think it does here. Additionally, if you look at our test suite, there are only ever going to be more tests, so we really need to be careful and pragmatic about the number of variants we test, so that the matrix doesn't get out of hand and the test suite doesn't lose its purpose: providing (ideally instant) local feedback, fast! This also goes in line with the recently filed issue #803, which is a very reasonable ask and a very real case; the tests are still taking ~17 minutes (give or take). Do they really need to? Or can we get the feedback faster and be more productive in the meantime?
So, if possible, I'd like to keep this particular change in, because IMO the parameter matrix as it is currently defined is not well justified.
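For illustration only, here is a minimal sketch of the fold that the cli.py snippet above relies on; the Merged class is a hypothetical stand-in for the real SBOM model, used purely to show that merging two, three, or N inputs goes through the exact same single code path:

```python
from dataclasses import dataclass, field


@dataclass
class Merged:
    # Hypothetical stand-in for an SBOM model; only here to illustrate the fold.
    parts: list[str] = field(default_factory=list)

    def __add__(self, other: "Merged") -> "Merged":
        return Merged(self.parts + other.parts)


sboms = [Merged(["a"]), Merged(["b"]), Merged(["c"])]
# Same shape as the cli.py snippet: fold the tail onto the first SBOM via __add__.
merged = sum(sboms[1:], start=sboms[0])
assert merged.parts == ["a", "b", "c"]
```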
If you look at the actual functional code
This is exactly how tests get coupled to implementations. What happens inside a unit is the unit's own business; a test writer should abstract that away.
I am sorry, I don't follow the IT (integration test) argument: this is a unit test. The unit tests run for ~10 seconds on my machine, which I'd say is pretty fast. The full nox suite runs for about a minute, which is a regression from tox, where the --parallel flag would make the entire test matrix finish in about 12 seconds.
I won't block on this one, but I still believe that small incremental tests (especially when each takes a tiny fraction of a second) are better than a lump of all features crammed into a single one. If tests are code, then lumped tests are the equivalent of an enterprise-grade god class, which is demonstrably bad.
I thought about this in the background a bit and realized that smaller, more compact test cases, followed by expanded tests that build on the previous ones, act as a form of induction-proof steps for me.
tests/unit/test_cli.py (outdated)
@@ -935,79 +935,51 @@ def test_a_user_sees_error_when_they_provide_a_non_cachi2_sbom_for_a_merge(
        ],
    ],
)
-def test_a_user_can_successfully_merge_several_cachi2_sboms(
+def test_merge_several_cachi2_sboms(
Please don't. It is totally fine to have duplications in test cases. Test cases are not code, test cases exist to state boring facts about the behavior of the program's units. Test case names are to tell a summary of what is going to happen in a test, to tell someone what to expect from this test. This summary falls apart when one has to go back and forth and assemble it from ids and test names. I insisted on not modifying pre-existing tests when SPDX was added and I insist on that now. Tests must not be touched unless they are non-valuable tests which do not actually test anything. Having multiple test cases gradually adding features to be tested could greatly help during refactoring.
I'll try to respond inline and let's see if we can meet somewhere opinion-wise.
It is totally fine to have duplications in test cases. Test cases are not code
I categorically disagree. Tests are still code (just not production code) and any duplication in any code simply smells. The impact on tests is that any refactoring/fixes to the actual code will result in many identical changes to several tests, which not only may (and by Murphy's law will) get in the way of any other productive work, but more importantly, one may forget to update ALL of the tests and reviewers might simply not catch it either.
Pytest is a widely used test framework, and the developers gave us a powerful tool to avoid code duplication (presumably because the market demanded it): parametrize. So while I am a big proponent of code readability, one should always approach things with common sense. In this particular case, the test cases are literally 3-4 lines of code calling into a subprocess; there really is no good reason why this could not be consolidated into a single test case IMO. When consolidation ends up generating a bunch of conditions modifying the code at runtime to satisfy all the parametrize variations, that's a different story! Then again, one should use common sense, and the contributor and the reviewer should find common ground that works for both parties, so as not to end up with completely unproductive work.
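As an illustration of what I mean, here is a minimal sketch of such a consolidated, parametrized test; the file names, the merge-sboms invocation, and the invoke_cli helper are assumptions made for the example, not the actual fixtures used in test_cli.py:

```python
import subprocess
from dataclasses import dataclass

import pytest


@dataclass
class Result:
    exit_code: int


def invoke_cli(args: list[str]) -> Result:
    # Hypothetical stand-in for the CLI runner used in test_cli.py;
    # it simply shells out to the cachi2 entry point (command name assumed).
    proc = subprocess.run(["cachi2", *args], capture_output=True)
    return Result(proc.returncode)


@pytest.mark.parametrize(
    "sbom_files_to_merge",
    [
        ["sbom_one.json", "sbom_two.json", "sbom_three.json"],
        ["sbom_one.spdx.json", "sbom_two.spdx.json", "sbom_three.spdx.json"],
    ],
)
def test_merge_several_cachi2_sboms(sbom_files_to_merge: list[str]) -> None:
    # The body stays the same 3-4 lines for every variant; only the inputs differ.
    result = invoke_cli(["merge-sboms", *sbom_files_to_merge])
    assert result.exit_code == 0
```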
Test case names are to tell a summary of what is going to happen in a test, to tell someone what to expect from this test.
I'm not objecting, and frankly I wouldn't mind keeping the test names as they are because they are descriptive (maybe even too much), but unfortunately, once you condense several tests into a single one, that action alone invalidates this premise.
This summary falls apart when one has to go back and forth and assemble it from ids and test names
How is this a problem? Consider the following:
FAILED tests/unit/test_cli.py::TestMergeSboms::test_a_user_sees_error_when_they_provide_a_non_json_file_for_a_merge[sbom_files_to_merge0-does not look like] - assert False
FAILED tests/unit/test_cli.py::TestMergeSboms::test_save_merged_sboms_to_file[merge_our_own_cyclonedx] - assert False
FAILED tests/unit/test_cli.py::TestMergeSboms::test_save_merged_sboms_to_file[merge_spdx_format] - assert False
I'd argue that you're not missing any important piece of information needed to debug the test. Quite the contrary: the fact that the original test (the top one) is defined with parametrize and yet doesn't provide an ID (a common antipattern in our old tests) makes it more confusing.
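As a small illustration (the test body here is a trivial placeholder and the parameter values are made up; only the ids are meant to match the failure lines above), the bracketed part of each FAILED line comes straight from pytest.param ids:

```python
import pytest


@pytest.mark.parametrize(
    "sbom_format",
    [
        pytest.param("cyclonedx", id="merge_our_own_cyclonedx"),
        pytest.param("spdx", id="merge_spdx_format"),
    ],
)
def test_save_merged_sboms_to_file(sbom_format: str) -> None:
    # A failure is reported as, e.g.,
    #   test_save_merged_sboms_to_file[merge_spdx_format]
    # rather than an opaque positional index such as [sbom_files_to_merge0].
    assert sbom_format in {"cyclonedx", "spdx"}
```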
I insisted on not modifying pre-existing tests when SPDX was added and I insist on that now. Tests must not be touched unless those are non-valuable tests which do not actually test anything.
That's fair, and so I won't pursue that change now and will drop it. However, when the time comes and we need to update the tests because of a code change, I will insist on making the change then and getting rid of this trivial code duplication.
Having multiple test cases gradually adding features to be tested could greatly help during refactoring.
How is this in direct contrast with parametrizing tests?
Oh, one more thing: using and not using parametrize at the same time is also confusing and very unusual IMO, which was another motivator for the change in question :).
Tests are still code
Treating them as regular code (and not a very special kind) implies that they need tests. This is a bit paradoxical.
any refactoring ... will result in many identical changes to several tests
I'd say this is an indication of brittle tests which are tightly coupled to an implementation. Brittleness aside, this usually indicates that the tests are, in fact, just a crude reimplementation of existing logic and exist solely to bump coverage numbers, which is something to avoid. Tests which test through a contract, on the other hand, don't need to be and must not be changed during refactoring at all. Tests are supposed to verify external behavior; how this behavior is implemented is immaterial.
The pytest part reads a bit like an argumentum ad populum. While parametrization is an awesome tool for one-dimensional substitutions and should be used for that, I think multidimensional expansion is a tool that has to be used sparingly. Parametrization and overgeneralization lead to tests like those for gomod. It is a bit hard to tell what is going on in some of those because of the multitude of parameters supplied.
The other thing which I find problematic is that here we are effectively lumping together two distinct features into one test. Merging different types of SBOMs represents distinct features that were introduced separately in separate user stories; that's why I think it is worth keeping their tests separate. The fact that they now share an implementation is, again, an implementation detail and should not affect how the tests are written.
when the time comes and we need to update the tests because of a code change
This means that our tests are tightly coupled to our implementation and that we are checking how we wrote some piece of code vs checking if it does what it is expected to do.
This all stems from the fact that we do not have many business rules on one hand and use a lot of external services on the other. This is somewhat similar to microservices, which are usually heavily tested with integration tests and only sporadically with unit and e2e tests. I believe we should revisit our approach to testing and, more importantly, to coverage calculation and interpretation.
This is definitely a bigger discussion than could (and should) fit into a cleanup PR thread.
There really is no difference between merging 2 or 3 SBOMs, so let's just test the more complex case of merging 3 and reduce the matrix. Signed-off-by: Erik Skultety <[email protected]>
Fixes: b4a413d Signed-off-by: Erik Skultety <[email protected]>
Cosmetic change, but if anyone ever needs to interactively debug this, it's going to be easier to identify the tmp files based on our tool name, the test domain, and the test case which generated them. Signed-off-by: Erik Skultety <[email protected]>
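For illustration, a minimal sketch of that naming idea; the exact prefix format, the fixture name, and the use of the request fixture are assumptions for this example, not necessarily what the commit does:

```python
import pytest


@pytest.fixture
def merged_sbom_output(tmp_path, request):
    # Hypothetical fixture: pytest cleans tmp_path up on its own, and the file name
    # encodes the tool name, the test domain, and the test case that generated it,
    # which makes any leftover files easy to attribute during interactive debugging.
    return tmp_path / f"cachi2-merge-sboms-{request.node.name}.json"
```

A test would then write its merged SBOM to merged_sbom_output instead of an anonymous tempfile path.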
666f36d to 1b1879e (Compare)
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)