fix(vm): use generic model with explicit features for Discovery cpu type
- Use kvm64 model for Discovery and Features types.
- Patch kubevirt to prevent node selector on VM Pods for kvm64 model.
- Add internal documentation describing problems with Discovery type.

Signed-off-by: Ivan Mikheykin <[email protected]>
diafour committed Jan 14, 2025
1 parent e7db733 commit 8f9b0ae
Showing 5 changed files with 76 additions and 2 deletions.
20 changes: 20 additions & 0 deletions docs/internal/cpu_model_type_discovery.md
@@ -0,0 +1,20 @@
# CPUModel type Discovery

## Problem

The first approach was to use host-model with a common set of features. This was a mistake:
libvirt resolves host-model to the specific host CPU model, which can differ between nodes,
so migration does not work.

The second approach was to use an "Empty" model in the cpu-map directory. It works for some CPU combinations,
but other combinations lead to migration problems. These combinations are unpredictable, so there is no workaround.
The error might be a bug in libvirt when it compares features after resolving the target CPU model (this still
needs investigation).

The current approach is to use the kvm64 model for the Discovery and Features types. This model contains a small
set of features, and migration works well.

## Solution

1. Use the kvm64 model for the Discovery and Features vmclass types.
2. Add a patch for kubevirt that prevents adding a nodeSelector for the "kvm64" CPU model.
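
Both steps come down to one decision: whether a VM pod should get a cpu-model nodeSelector. A minimal Go sketch of that decision follows. The helper names `cpuModelLabel` and `needsCPUModelSelector` are hypothetical, and the label prefix `cpu-model.node.kubevirt.io/` is an assumption. The sketch only mirrors the condition in the patch and is not the kubevirt implementation.

```go
package main

// Assumption: CPU model node labels are built from this prefix plus the model name.
const cpuModelLabelPrefix = "cpu-model.node.kubevirt.io/"

// genericCPUModel is the model used for the Discovery and Features types.
const genericCPUModel = "kvm64"

// cpuModelLabel builds the node label for a CPU model (hypothetical helper).
func cpuModelLabel(model string) string {
	return cpuModelLabelPrefix + model
}

// needsCPUModelSelector reports whether a nodeSelector should be added for
// the given CPU model label. No selector is added when the label is unset,
// for host-model and host-passthrough, or for the generic kvm64 model.
func needsCPUModelSelector(label string) bool {
	switch label {
	case "",
		cpuModelLabel("host-model"),
		cpuModelLabel("host-passthrough"),
		cpuModelLabel(genericCPUModel):
		return false
	}
	return true
}
```

With this check, a VM that uses "kvm64" can be scheduled on any node, while a VM with a named model such as "Skylake-Server" still gets a selector for its label.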
@@ -0,0 +1,44 @@
diff --git a/pkg/virt-controller/services/nodeselectorrenderer.go b/pkg/virt-controller/services/nodeselectorrenderer.go
index 390f359d2a..c21caf97dd 100644
--- a/pkg/virt-controller/services/nodeselectorrenderer.go
+++ b/pkg/virt-controller/services/nodeselectorrenderer.go
@@ -23,6 +23,9 @@ type NodeSelectorRenderer struct {

type NodeSelectorRendererOption func(renderer *NodeSelectorRenderer)

+// DeckhouseVirtualizationPlatformGenericCPUModel is the name of the generic CPU model used for the Discovery and Features types of VMClass.
+const DeckhouseVirtualizationPlatformGenericCPUModel = "kvm64"
+
func NewNodeSelectorRenderer(
vmiNodeSelectors map[string]string,
clusterWideConfNodeSelectors map[string]string,
@@ -51,7 +54,8 @@ func (nsr *NodeSelectorRenderer) Render() map[string]string {
if nsr.hyperv {
maps.Copy(nsr.podNodeSelectors, hypervNodeSelectors(nsr.vmiFeatures))
}
- if nsr.cpuModelLabel != "" && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostModel) && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostPassthrough) {
+ // Prevent adding a node selector for host-model, host-passthrough, and the generic kvm64 CPU model.
+ if nsr.cpuModelLabel != "" && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostModel) && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostPassthrough) && nsr.cpuModelLabel != cpuModelLabel(DeckhouseVirtualizationPlatformGenericCPUModel) {
nsr.enableSelectorLabel(nsr.cpuModelLabel)
}
for _, cpuFeatureLabel := range nsr.cpuFeatureLabels {
diff --git a/pkg/virt-launcher/virtwrap/live-migration-source.go b/pkg/virt-launcher/virtwrap/live-migration-source.go
index 5cc14a1f85..6bd0ba3d9d 100644
--- a/pkg/virt-launcher/virtwrap/live-migration-source.go
+++ b/pkg/virt-launcher/virtwrap/live-migration-source.go
@@ -230,6 +230,15 @@ func migratableDomXML(dom cli.VirDomain, vmi *v1.VirtualMachineInstance, domSpec
return "", err
}

+ // Put back common model if specified in VMI.
+ vmiCPU := vmi.Spec.Domain.CPU
+ if vmiCPU != nil && vmiCPU.Model == "kvm64" {
+ if domcfg.CPU.Model != nil {
+ domcfg.CPU.Model.Value = vmiCPU.Model
+ domcfg.CPU.Model.Fallback = "allow"
+ }
+ }
+
return domcfg.Marshal()
}

6 changes: 6 additions & 0 deletions images/virt-artifact/patches/README.md
@@ -134,3 +134,9 @@ To force delivery of packages to only one VM pod, the special label `network.dec
When the migration completes, the label is removed and the target pod becomes accessible via network.

d8-cni-cilium ensures that once the label is removed from the target pod, only the target pod remains accessible over the network (while the source pod does not).

#### `031-prevent-adding-node-selector-for-dvp-generic-cpu-model.patch`

- Do not add the cpu-model nodeSelector for the "kvm64" model. This selector prevents VMs from starting because node-labeler does not label nodes with the "kvm64" model.

- Overwrite the calculated model on migration: put "kvm64" back for the Discovery and Features vmclass types.
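
The second point can be illustrated with a small Go sketch that rewrites the `<model>` element of a domain XML before migration. It rests on loud assumptions: the structs below are a hypothetical, heavily trimmed view of a libvirt domain, so every element other than `<cpu><model>` is dropped on re-marshal, and `restoreGenericModel` is an illustrative name. The patch itself works on the full domain configuration.

```go
package main

import "encoding/xml"

// genericCPUModel mirrors the model name used by the patch.
const genericCPUModel = "kvm64"

// cpuModel is a trimmed, hypothetical view of <model fallback="...">name</model>.
type cpuModel struct {
	Fallback string `xml:"fallback,attr,omitempty"`
	Value    string `xml:",chardata"`
}

// cpu is a trimmed, hypothetical view of the <cpu> element.
type cpu struct {
	Mode  string    `xml:"mode,attr,omitempty"`
	Model *cpuModel `xml:"model"`
}

// domain keeps only <cpu>; all other elements are lost (illustration only).
type domain struct {
	XMLName xml.Name `xml:"domain"`
	CPU     *cpu     `xml:"cpu"`
}

// restoreGenericModel puts the generic model back into the domain XML when
// the VMI asked for it, and allows fallback, as the patch does.
func restoreGenericModel(domainXML, vmiModel string) (string, error) {
	var d domain
	if err := xml.Unmarshal([]byte(domainXML), &d); err != nil {
		return "", err
	}
	// Only touch the model when the VMI requested kvm64 and a model is present.
	if vmiModel == genericCPUModel && d.CPU != nil && d.CPU.Model != nil {
		d.CPU.Model.Value = vmiModel
		d.CPU.Model.Fallback = "allow"
	}
	out, err := xml.Marshal(&d)
	if err != nil {
		return "", err
	}
	return string(out), nil
}
```

For a VMI with any other model, the function leaves the `<model>` element as libvirt calculated it.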
2 changes: 1 addition & 1 deletion images/virt-launcher/werf.inc.yaml
@@ -147,7 +147,7 @@ shell:
- mv /usr/bin/virt-launcher-monitor /usr/bin/virt-launcher-monitor-orig
- cp /scripts/virt-launcher-monitor-wrapper.sh /usr/bin/virt-launcher-monitor
- chmod +x /usr/bin/virt-launcher-monitor
-# Configure liboverride globally.
+# Configure liboverride globally. It should be done in the last stage (setup) to not break stapel commands.
- cp /etc/ld.so.preload.in /etc/ld.so.preload
# Create qemu group and user.
- groupadd --gid 107 qemu && useradd qemu --uid 107 --gid 107 --shell /bin/bash --create-home
@@ -42,6 +42,9 @@ import (
const (
CloudInitDiskName = "cloudinit"
SysprepDiskName = "sysprep"

+// GenericCPUModel specifies the base CPU model for Features and Discovery CPU model types.
+GenericCPUModel = "kvm64"
)

type KVVMOptions struct {
@@ -109,7 +112,8 @@ func (b *KVVM) SetCPUModel(class *virtv2.VirtualMachineClass) error {
cpu.Model = virtv1.CPUModeHostPassthrough
case virtv2.CPUTypeModel:
cpu.Model = class.Spec.CPU.Model
-case virtv2.CPUTypeFeatures, virtv2.CPUTypeDiscovery:
+case virtv2.CPUTypeDiscovery, virtv2.CPUTypeFeatures:
+cpu.Model = GenericCPUModel
features := make([]virtv1.CPUFeature, len(class.Status.CpuFeatures.Enabled))
for i, feature := range class.Status.CpuFeatures.Enabled {
policy := "require"
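
The hunk above is cut off, so the following is only a sketch of the idea it shows: the Discovery and Features types get the generic model plus an explicit feature list. The types `cpuFeature` and `cpuSpec` and the function `genericCPUSpec` are hypothetical stand-ins, and the sketch assumes every enabled feature gets the "require" policy, which the truncated source does not confirm.

```go
package main

// genericCPUModel mirrors GenericCPUModel from the diff.
const genericCPUModel = "kvm64"

// cpuFeature and cpuSpec are hypothetical stand-ins for the real API types.
type cpuFeature struct {
	Name   string
	Policy string
}

type cpuSpec struct {
	Model    string
	Features []cpuFeature
}

// genericCPUSpec builds a CPU spec with the generic model and one feature
// entry per enabled feature. Assumption: each feature uses policy "require".
func genericCPUSpec(enabled []string) cpuSpec {
	features := make([]cpuFeature, len(enabled))
	for i, name := range enabled {
		features[i] = cpuFeature{Name: name, Policy: "require"}
	}
	return cpuSpec{Model: genericCPUModel, Features: features}
}
```

Calling `genericCPUSpec([]string{"aes", "avx2"})` yields a spec with model "kvm64" and two required features, so the guest CPU is defined by the explicit list rather than by a host-dependent model.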