Upstream Image Encryption Standardization #560
Comments
I have discussed and outlined the spec with @markus-hentsch and am currently writing it. |
On the user side we'd need to be able to create LUKS-encrypted files. Nova devs made us aware that the QEMU tooling is reportedly able to do so. Here is the Nova implementation:
Here's a shortened example:
echo "muchsecretsuchwow" > secret_file.key
qemu-img convert -f raw -O luks --object secret,id=sec,file=secret_file.key -o key-secret=sec \
-o cipher-alg=aes-256 -o cipher-mode=xts -o hash-alg=sha256 \
-o ivgen-alg=plain64 -o ivgen-hash-alg=sha256 \
$INPUT_FILE $OUTPUT_LUKS_FILE
It works within a Docker container running Ubuntu LTS but interestingly it doesn't on my NixOS setup, failing with:
Versions of qemu-img are 6.2.0 (Ubuntu) and 8.1.5 (NixOS) respectively. I wonder if there is another system dependency in play here or if it's the version difference. We should investigate because this impacts the ability to create images on the client side. EDIT: Tried it on |
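One way to narrow down whether the failure comes from the build rather than the version is to check whether the local qemu-img lists luks among its supported formats at all. The following is a minimal sketch, not a confirmed diagnosis: it assumes that `qemu-img --help` prints a "Supported formats:" line (which may differ between builds), and the function name and the `QEMU_IMG` override variable are made up for illustration.

```shell
# Return 0 if the qemu-img build lists "luks" among its supported formats.
# QEMU_IMG is a hypothetical override so another binary (or a stub) can be used.
qemu_img_supports_luks() {
    # Take the "Supported formats:" line, put one word per line,
    # then look for an exact "luks" entry.
    "${QEMU_IMG:-qemu-img}" --help 2>/dev/null \
        | grep '^Supported formats:' \
        | tr ' ' '\n' \
        | grep -qx 'luks'
}
```

Running `qemu_img_supports_luks && echo "luks available"` on both the Ubuntu container and the NixOS host would show whether the two builds differ in block driver support.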
As I was adding restore instructions to the user data backup guide in SovereignCloudStack/docs#176, I reproduced the process of creating volumes from previously encrypted (LUKS) images originating from Cinder itself on my DevStack and noticed that Cinder does not verify that the target volume actually has an encrypted type:
$ file image.raw
image.raw: LUKS encrypted file, ver 1 [aes, xts-plain64, sha256]
$ file image.key
image.key: data
$ openstack secret store --algorithm aes --bit-length 256 --mode cbc \
--secret-type symmetric --file image.key --name restored-image-key
$ openstack secret list -f value -c "Secret href" -c "Name"
http://10.0.1.116/key-manager/v1/secrets/6ea6b0a8-de50-45b8-90b7-9470c4dd201a restored-image-key
$ export SECRET_ID=6ea6b0a8-de50-45b8-90b7-9470c4dd201a
$ openstack image create --file image.raw \
--property cinder_encryption_key_id=$SECRET_ID \
--property cinder_encryption_key_deletion_policy=on_image_deletion \
restored-image
$ openstack volume create --size 1 \
--image restored-image volume-restored-notype
$ openstack volume show -f value -c type volume-restored-notype
lvmdriver-1 # <- this is an *unencrypted* volume type! (which contains LUKS blocks now)
$ openstack volume create --size 1 \
--image restored-image --type lvmdriver-1-LUKS \
volume-restored-lukstype
$ openstack volume show -f value -c type volume-restored-lukstype
lvmdriver-1-LUKS
$ openstack server create \
--volume volume-restored-notype \
... server-from-untyped-volume
$ openstack server create \
--volume volume-restored-lukstype \
... server-from-luks-volume
This results in the server
It seems that this is an oversight in Cinder in its current implementation? |
I tested this with a simple encrypted volume to encrypted image to volume:
A seemingly unencrypted volume is created from the encrypted image. But as far as we know, there is no decrypting mechanism implemented in Cinder to go from such a Cinder-specific LUKS encryption in the image to an unencrypted volume. We should definitely file a bug in Cinder for this one. |
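Until Cinder checks this itself, a client-side guard could at least avoid the untyped case shown above. This is only a sketch under assumptions: the function names and the `OPENSTACK` override variable are hypothetical, it relies on the image properties output containing the string `cinder_encryption_key_id`, and it does not verify that the volume type passed in is actually an encrypted one.

```shell
# Return 0 if the image's properties mention a Cinder encryption key.
# OPENSTACK is a hypothetical override for the CLI binary (or a stub).
image_is_cinder_encrypted() {
    "${OPENSTACK:-openstack}" image show -f value -c properties "$1" \
        | grep -q 'cinder_encryption_key_id'
}

# Create a volume from an image, but refuse if the image is encrypted
# and no volume type was given explicitly.
# Usage: safe_volume_from_image IMAGE VOLUME_NAME SIZE_GB [VOLUME_TYPE]
safe_volume_from_image() {
    svfi_image=$1
    svfi_name=$2
    svfi_size=$3
    svfi_type=${4:-}
    if image_is_cinder_encrypted "$svfi_image" && [ -z "$svfi_type" ]; then
        echo "refusing: $svfi_image is encrypted, pass an encrypted volume type" >&2
        return 1
    fi
    if [ -n "$svfi_type" ]; then
        "${OPENSTACK:-openstack}" volume create --size "$svfi_size" \
            --image "$svfi_image" --type "$svfi_type" "$svfi_name"
    else
        "${OPENSTACK:-openstack}" volume create --size "$svfi_size" \
            --image "$svfi_image" "$svfi_name"
    fi
}
```

With the names from the transcript above, `safe_volume_from_image restored-image volume-restored-notype 1` would be rejected, while adding `lvmdriver-1-LUKS` as a fourth argument would go through.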
How to test if a server booted
When creating a server from a volume, as in my case above, it will be shown as active even though it cannot actually have booted.
To show the boot log of a server the following command can be used:
The log will then look something like:
But if the server was created from an unencrypted volume that came from an encrypted image, the log remains empty:
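The empty-log check can be scripted. The sketch below assumes that `openstack console log show` is the command used to fetch the boot log; the function name and the `OPENSTACK` override variable are hypothetical. An empty log is only a hint that the guest never booted, not proof.

```shell
# Return 0 if the server produced any console output, 1 if the log is empty.
# OPENSTACK is a hypothetical override for the CLI binary (or a stub).
server_has_console_output() {
    shco_log=$("${OPENSTACK:-openstack}" console log show "$1" 2>/dev/null)
    # Ignore whitespace-only output so a log of blank lines counts as empty.
    [ -n "$(printf '%s' "$shco_log" | tr -d '[:space:]')" ]
}
```

For example, `server_has_console_output server-from-untyped-volume || echo "no boot output"` would flag the server created from the untyped volume.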
|
I uploaded the new spec to gerrit: https://review.opendev.org/c/openstack/glance-specs/+/915726 |
I reported the bug uncovered in #560 (comment) at https://bugs.launchpad.net/cinder/+bug/2061154 |
I raised this in the IRC, the following was the answer:
|
We are currently evaluating whether we will also need a spec for Cinder. What we definitely need is a blueprint for Cinder to track all development. I have written a blueprint and tried to include all points where implementation work will be needed in Cinder: |
We got a review for the spec (I am currently updating it) and there is one last question to discuss: We wanted to introduce a new container format "encrypted" so that it would be easily visible to anyone that an image is encrypted. To identify the underlying format, we introduced a property: "os_decrypt_container_format". Now Nova and Cinder would both like the container format to show up in the original property, and Cinder would additionally need some parameter to know whether the encrypted image is compressed or not. We could check that encrypted images are always qcow or raw when container_format and decrypt_container_format are set. However, the metadata could be set after creating an image, in the upload step, so it might still be possible to create an encrypted image with a format that neither Cinder nor Nova could use. We would like to avoid such a bad user experience. So we want to ask the following questions in the Glance meeting this Thursday:
|
From the Cinder meeting today I got the wish to also create a small Cinder spec: https://etherpad.opendev.org/p/cinder-dalmatian-meetings |
I created the Cinder spec today: https://review.opendev.org/c/openstack/cinder-specs/+/919499 |
We got reviews on the spec, and Markus and I are going through them and answering questions. |
I adjusted the Glance spec and removed the container_format 'encrypted', because this was one of the most discussed parts of the spec: https://review.opendev.org/c/openstack/glance-specs/+/915726 |
I got feedback on the Glance patch asking what to do when the image conversion plugin is activated. As I understand it, image conversion is not something that can be triggered by a CLI command. Instead, when the plugin is activated, ALL images that are created will be automatically converted to the format specified in the config after uploading and before being stored. This raises a few questions regarding encrypted images:
As this optional feature of Glance may already have problems with 1, and we would need to at least forbid uploading encrypted images when the target format is vmdk (maybe also when the target format is qcow2), this would mean a lot more implementation work. So for now we will declare this out of scope for the spec. Maybe this can be done after the image encryption is in place, if it is needed by operators and users. |
Seems like this is part of the "Interoperable Image Import" [1], which references the following methods [2]:
This makes me wonder if this is applicable to images uploaded to Glance by Nova or Cinder at all. This is limited to images from external sources I think. Nonetheless, at the latest when the user is initiating such an image upload we need to make sure that we account for this concerning the image encryption. Footnotes |
I am looking into the Cinder side a little bit more because we got a comment from Dan Smith regarding image conversion and Glance checking a few image parameters:
The image conversion only seems to be triggered when a user uploads an image, not when Cinder or Nova upload one. Forbidding it would put more responsibility on the user side, but we also do that with the image encryption. The check for the virtual size can be omitted, because we only allow two types of encrypted images, qcow2 and raw, and we want to introduce the "os_decrypt_size" parameter that should describe the size of the unencrypted image. We may even mandate using this parameter. The format of encrypted raw images can only be checked when decrypting the image. So while this is a valid point to discuss, as Glance will reject images that do not have the format they claim to have, the Glance team did not want to have the power to encrypt or decrypt images back when we discussed gpg-based image encryption. I doubt that this has changed, and I also do not see a good reason why Glance should be able to do this. |
I edited the glance spec to forbid image conversion in a more explicit way and included some more comments on the glance spec. |
I adjusted the key management part of the spec to clarify the behavior of deleting keys and added a part about putting images into ERROR state when they are encrypted and image conversion is enabled and would need to be performed. |
I raised attention again for all the patches in the Glance IRC channel and in the pop-up team meeting. |
Cinder and Glance patchsets have been updated to only deprecate |
I fixed some errors in the Glance DB data-migration patch and looked through the Zuul logs for the Nova and Cinder patches. |
Upstream, the Cinder Tempest tests in Zuul currently fail during volume cloning of encrypted volumes [1]:
Source code of the Tempest test: https://github.com/openstack/cinder-tempest-plugin/blob/1aa0a56fca85ce23343950431e28f41c4fa811c5/cinder_tempest_plugin/scenario/test_volume_encrypted.py#L150-L157
I started debugging this on my DevStack but so far haven't been able to reproduce a volume in ERROR state when cloning an existing encrypted one. I happened to notice one difference though:
I don't think cloning the secret name without any suffix (1) is desired at all because this is confusing for users when their image secret appears multiple times in Barbican by name. This also seems to hint at differences of secret clone handling between (1) and (2). I will keep investigating. Footnotes |
The recheck for the Nova Patch led to a green zuul pipeline.
Both should not be related to the migration patch. I am still waiting for feedback from the Glance team, because the migration is not triggered on DevStack. |
I adjusted the Cinder patchset to remove the name when cloning a secret. |
We got a review on the Glance patchset requesting the following changes:
I'll look into those. |
Possible changes to disk_format
Initially, we proposed to introduce a new
On 2024-08-28, @josephineSei and I couldn't participate in the weekly Glance IRC meeting [2] due to a conflicting meeting, but the image encryption patchsets were discussed nonetheless and an interesting point was raised:
Then we received the following comment on the Glance patchset from Dan Smith (Nova):
As a result, we once again need to rethink and rework our encryption format approach in the Cinder and Glance patchsets to address this, it seems. Footnotes |
After reading through the Spec from Dan and looking into our own spec: https://review.opendev.org/c/openstack/glance-specs/+/915726/11/specs/2024.2/approved/glance/standardized_image_encryption.rst I would like to propose an update to our specs for Cinder and Glance. We should definitely state it there, because people will find these specs also when searching for documentation. Here is what we could do in Glance: https://review.opendev.org/c/openstack/glance-specs/+/927819/1/specs/2024.2/approved/glance/standardized_image_encryption.rst |
In addition to your spec updates and Dan Smith's comment:
I don't think we could or should go ahead and simply introduce some arbitrary
I participated in today's Cinder IRC meeting, raised awareness of the topic again and asked people to attend next Monday's popup meeting. |
I attended today's Glance meeting: seems like we're back to the drawing board regarding the image metadata attributes associated with encrypted images and need to discuss this in the next PTG with the teams of Nova, Glance and Cinder:
Source: https://meetings.opendev.org/meetings/glance/2024/glance.2024-09-05-14.00.log.html Update:
|
I think it would be good to have this conversation at the PTG in a cross-project session together with Cinder (and maybe Nova) to get everyone to agree on one way. But even more important: I think we should reach out to Dan Smith (I can try to do so) to sketch out one or more possible ways of implementing the image encryption, all BEFORE the PTG happens. He is the one working on the image format checker, so he knows best what information is needed to verify that there is no malicious image upload. |
@josephineSei and I held today's image encryption popup team meeting on IRC [1]. About formats:
About vulnerabilities:
(past CVEs were based on qemu encountering a VMDK image with special instructions and executing them; but in this case, only the outer LUKS layer should actually be visible to qemu but that needs to be verified) Footnotes |
We took part in the PTG this week and discussed the impact of the CVEs on this topic with Cinder, Glance and some Nova people. Our discussion resulted in adjustments we need to make to the spec and the patches:
For now I adjusted the spec accordingly: https://review.opendev.org/c/openstack/glance-specs/+/927819 |
Just to complement this, here are my notes from the meeting about what we agreed on:
|
I was just starting to work on dissecting the hex values of a LUKS header in order to add proper image inspector support in Glance when I happened to notice that the implementation was moved to oslo.utils in August [1]. Furthermore, support for LUKSv1 header inspection for a
It seems that this is a direct result of the PTG session and one less thing to consider in our patchsets. Footnotes |
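For reference, the part of the header that identifies a LUKS file is small: a 6-byte magic (`LUKS` followed by 0xBA 0xBE) and a 2-byte big-endian version. The sketch below only illustrates that check on the client side; it is not the oslo.utils inspector, the function name is made up, and it assumes `head -c` and `od` are available.

```shell
# Print the LUKS version (1 or 2) of a file, or return 1 if the file
# does not start with a LUKS header.
luks_version() {
    # First 8 bytes as a hex string: 6-byte magic plus 2-byte version.
    lv_hdr=$(head -c 8 "$1" | od -An -v -tx1 | tr -d ' \n')
    case "$lv_hdr" in
        4c554b53babe0001) echo 1 ;;
        4c554b53babe0002) echo 2 ;;
        *) return 1 ;;
    esac
}
```

Calling `luks_version image.raw` on an image like the one from the restore transcript earlier in this thread should print the same version that `file` reports.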
I revised the patchsets for Glance and Cinder today. I added @josephineSei I removed
In both cases, the
As a result, I started wondering whether trying to properly identify the cipher used (either by inspecting the header in the OSC during user upload or trying to guess it from the Volume Type encryption metadata in case of Cinder's volume-to-image) is worth the risk of getting it wrong, considering it has no functional purpose and is just informational metadata at this point. I think we could keep things a bit simpler by dropping
Note: I still need to add |
I think you are correct, we do not lose information when removing the
We still have the following metadata:
The ones that MUST be set are:
The other ones are: The I think the decrypt format is needed, but it is also new (due to the disk_format being LUKS), and not added yet to the patches, right? @markus-hentsch do you want to discuss this point? Maybe it is not needed, because we only allow raw/gpt images in LUKS blocks... But I am not sure about this. |
I am trying to get more attention for all of this and added reviewers to the patches. |
I updated the Glance implementation patchset and addressed review comments by adding release notes and unit test coverage for secret consumer exceptions. |
I started testing the changed architecture and ran into a failing
For this purpose, I started a patchset for python-openstackclient, which I also labelled with the "LUKS-image-encryption" topic: https://review.opendev.org/c/openstack/python-openstackclient/+/934672 |
After our discussion about 'os_decrypt_format', I removed the mention of it from the spec and answered Abhishek's comment on this. I also included that we expect all LUKS images to be 'raw' after encryption. |
Cinder internally rewriting format detection from 'luks' to 'raw' for qemu-img conversions
During testing of the changed implementation, I discovered that the shift in direction to actually introduce
However, they use an override to treat images received from Glance that
In all cases, we actually have the same format on both sides (luks) but Cinder treats its own LUKS data (e.g. volumes) as 'raw'. We don't want any encryption or conversion here but simply have the data copied over. At first I tried removing the override and let Cinder actually detect its own data as 'luks'. However, this led to some very tricky pitfalls because as soon as 'luks' is on some side of a It turned out to be actually easier to keep treating Cinder's internal data as 'raw' and have isolated instances of function flags that allow overriding the |
I adjusted unit tests and added release notes to the Cinder patchset. Regular pipelines look good but IBM and StorPool integration tests fail. @josephineSei I added the os_decrypt_format/gpt topic along with a regular update and pipeline questions to https://etherpad.opendev.org/p/cinder-epoxy-meetings for discussion tomorrow. |
Compatibility with image compression (
|
I did some QA to the Cinder patchset. I added a bunch of unit tests to cover the intricate parts of the patchset where we do special handling for the luks format internally. With that, test coverage of the new behaviors in Cinder should be quite good now. |
I have looked into the Zuul tests and found one failing unit test. I tried to fix it and also put the image encryption on the Glance team's agenda to get more reviews and drive this forward. |
We got reviews on the spec (with a +2 :)) and on the Glance patch. We may need to split up the Glance patch into smaller patches, but most of the comments were on smaller issues. |
The patch set regarding the image encryption is in a good state: |
Currently there is the possibility in Cinder to encrypt volumes and in Nova to use qcow2 encrypted images (still under development).
Both can lead to and use LUKS-encrypted images, but those are different and not aligned:
@markus-hentsch also found out in #541 that uploading a LUKS-encrypted image (that was created from a volume) to another cloud in combination with setting a few parameters (cinder_encryption_key, etc...) will result in an image that can be used to create an encrypted and functional volume.
As a user it would be good to have a streamlined operation to use encrypted images in OpenStack for both volumes and ephemeral storage and to also allow interoperability between clouds.
Therefore we need and will propose standardized parameters to describe and detect an encrypted image, which might be similar to the parameters described here but will use LUKS encryption: https://specs.openstack.org/openstack/cinder-specs/specs/zed/image-encryption.html
So those encrypted images could be natively mounted in Nova or just turned into a volume (raw LUKS images can be used directly, qcow images need to be flattened).
This way, encrypted backup images can be easily downloaded and transferred to another cloud.
This is a result from a lengthy discussion at the PTG with Nova, Cinder and Glance ( https://etherpad.opendev.org/p/dalmatian-ptg-cinder#L376 )
Followup tasks may be to implement re-encryption to fully change keys for LUKS volumes and images.