Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue/662 default storage class #745

Open
wants to merge 64 commits into
base: main
Choose a base branch
from

Conversation

fraugabel
Copy link
Contributor

fixed two bugs:

  1. make sure sonobuoy result file can be written
  2. cleanup test-pod at failure aswell, so the next run won't crash

@fraugabel fraugabel force-pushed the issue/662_default_storage_class branch from be2c002 to 7e9061a Compare September 9, 2024 11:33
Copy link
Member

@martinmo martinmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of review done, see inline comments.

Some general observations:

If the connection to the cluster fails, the script tries to continue anyway with the all the steps, which it shouldn't do.

The script still doesn't check if a provisioner such as cinder-csi is used.

Tests/requirements.in Outdated Show resolved Hide resolved
kubernetes_asyncio
python-dateutil
PyYAML
openstacksdk
requests
tomli

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment on lines 68 to 71
directory_path = os.path.dirname(f"./{test_name}.result.yaml")
os.makedirs(directory_path, exist_ok=True)

with open(f"./{test_name}.result.yaml", "w") as file:
yaml.dump(result_file, file)
with open(f"./{test_name}.result.yaml", "w") as file:
yaml.dump(result_file, file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still creates a file hierarchy replicating the absolute path to the script file, e.g. for me it creates:

/home/martinmorgenstern/workspace/scs/standards/Tests/home/martinmorgenstern/workspace/scs/standards/Tests/kaas/k8s-default-storage-class

The reason is that __file__ (which is the absolute path to the script) is used as the test_file_name argument passed to gen_sonobuoy_result_file and used without further processing.

In main(), you should use os.path.basename(__file__) to get only the script filename without the directories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you're right. ithought i already changed that

Tests/kaas/k8s-default-storage-class/helper.py Outdated Show resolved Hide resolved
Copy link
Contributor

@mbuechse mbuechse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I can follow, but it does seem to be work in progress anyway.

Comment on lines -84 to -87
namespace = "default"
pvc_name = "test-pvc"
pv_name = "test-pv"
pod_name = "test-pod"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please turn these into arguments with default values, i.e.,

namespace=NAMESPACE, pvc_name=PVC_NAME, ...

this makes the function more versatile, and we don't litter the code with these allcaps identifiers

Comment on lines 268 to 274
if isinstance(exc_value, SCSTestException):
self.return_message = exc_value.args[0]
self.return_code = exc_value.return_code
print(f"SCSTestException occurred with return_code: {self.return_code}")
else:
# No specific exception, handle normally
print(f"Exiting the context with return_code: {self.return_code}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems really convoluted. The exceptions are being caught in the main function anyway. And why should the test environment carry the burden of storing the return code? That violates the single-responsibility principle (aka separation of concerns).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because in this case I assumed that these exceptions are only intercepted afterwards. i'll check this again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the return code is kept there to feed it into the sonobuoy result, which is created in the exit, to make sure it gets created, after any error whatsoever. This is actually the reason why i created this context management to avoid having the same exception (SCSTestException) raised in 5 different places.

@mbuechse
Copy link
Contributor

@berendt Hi Christian, can you provide us with a k8s cluster for testing this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like a mistake, this file does not belong to this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file seems to have been added in error as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strange.. at checkout they should be reseted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll fix this asap

Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Matthias Büchse <[email protected]>
Signed-off-by: Matthias Büchse <[email protected]>
Signed-off-by: Matthias Büchse <[email protected]>
Copy link
Contributor

@mbuechse mbuechse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is going into the right direction. I'm afraid a few problems are still left to tackle. I also altered a few things myself (see the commits).


def check_csi_provider(k8s_core_api, allowed_csi_prov=ALLOWED_CSI_PROV):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this good for, and how was the list ALLOWED_CSI_PROV determined? I can't see any mention of this in the standard nor the testing notes. A comment in the code is required, but also some explanation in the testing notes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was a suggestion by @martinmo to provide a list of allowed csi-providers, that work accordingly to the standard

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but this doesn't address any of my points.

return_code=33,
)
if not csi_list:
logger.info("CSI-Provider: No CSI Provider found.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not an error? (I really don't know. Maybe a comment would help)
Also please note that the indentation is wrong here.

Copy link
Contributor Author

@fraugabel fraugabel Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not an error, in case a kind cluster is used (where no csi privider is used)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should simply filter out "wrong" csi providers

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if Kind is supposed to pass the test. Why have a special case for Kind? But this whole thing should AT MOST produce a warning anyway for the standard makes not mention of it whatsoever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not an error, if e.g. a kind cluster is used.. we only test the csi-providers in use (these should sadisfy the standard), if no csi-providers are involved than this should not be an issue, this test should just filter out csi-providers in use, that are not sadisfing the standard

logger.error(f"Error preparing Environment: {e}")
return False

def clean(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get rid of (big parts of) prepare as well as the variable cleanup by making this method ignore any error that says "resource not found". To me, this would be very much preferable. (It would in principle also reduce the risk of race conditions.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good idea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, i can catch a "404 resource not found" now and then skip the clean() execution, but if only one of both (pvc,pod) is missing the other one does not get deleted. so i am thinking, that i should ratehr catch that in the clean() function itself. what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposal was to call delete_namespace_pod and delete_namespaced_persistent_volume_claim unconditionally, and to swallow any exception that indicates that deletion was unsuccessful because the resource didn't exist in the first place.

Copy link
Contributor Author

@fraugabel fraugabel Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have done so now.. this leads to another problem: when the 404 is triggered, it doesn' t distinguish between the resources that are not found, so if e.g. the pvc is already built and only the pod is not found it will not delete the pvc.. so i will need to establish a flag or catch the ApiException on another level

Comment on lines +346 to +348
env.return_code = 1
logger.debug("check_default_storageclass() failed")
return env.return_code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
env.return_code = 1
logger.debug("check_default_storageclass() failed")
return env.return_code
logger.debug("check_default_storageclass() failed")
return 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately env.return_code has to be set otherwise it's not detected as an error in the exit part.. remember: i need the env.return_code variable everywhere in the context to print it in the report afterwards

Comment on lines 358 to 360
env.return_code = 3
env.return_message = "(409) conflicting resources"
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the return is in error here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it needs the return to get to the exit where the error is handled and printed out (however, this whole exception might be unnedeeded as the conflicted resources is already dealed with in the prepare part )


def __exit__(self, exc_type, exc_value, traceback):
if self.cleanup:
self.clean()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(as reported by flake8: indention is wrong here as well)

Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file seems to be here in error.

fraugabel and others added 17 commits November 18, 2024 11:34
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Signed-off-by: Katharina Trentau <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Conformance check for scs-0211-v1 (and v2) is broken
3 participants