fix(vm): use generic model with explicit features for Discovery cpu type #580

20 changes: 20 additions & 0 deletions docs/internal/cpu_model_type_discovery.md
@@ -0,0 +1,20 @@
# CPUModel type Discovery

## Problem

The first approach was to use host-model with a common set of features. This was a mistake:
libvirt resolves host-model to the specific host CPU model, which can differ between nodes,
so migration does not work.

The second approach was to use an "Empty" model in the cpu-map directory. It works for some CPU combinations,
but other combinations lead to migration problems. The failing combinations are unpredictable, so there is
no workaround. The error might be a bug in libvirt when it compares features after resolving the target
CPU model (this still needs investigation).

The current approach is to use the kvm64 model for the Discovery and Features types. This model contains
a small set of features, and migration works well.

## Solution

1. Use the kvm64 model for the Discovery and Features vmclass types.
2. Add a patch for kubevirt to prevent adding a nodeSelector for the cpu model "kvm64".
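
The rule from step 2 can be sketched as a small predicate. This is only an illustration under stated assumptions, not the kubevirt code: `needsCPUModelSelector` and the constant names are hypothetical, and the sketch works on plain model names, whereas the patched `NodeSelectorRenderer.Render` compares node labels.

```go
package main

import "fmt"

// Illustrative constants. The string values are the ones that appear in the
// patch; the constant names themselves are hypothetical.
const (
	hostModel       = "host-model"
	hostPassthrough = "host-passthrough"
	genericCPUModel = "kvm64"
)

// needsCPUModelSelector is a hypothetical helper. It reports whether a VM pod
// should receive a cpu-model node selector for the given CPU model.
// No selector is wanted for an unset model, for host-model and
// host-passthrough, or for the generic kvm64 model.
func needsCPUModelSelector(model string) bool {
	switch model {
	case "", hostModel, hostPassthrough, genericCPUModel:
		return false
	default:
		return true
	}
}

func main() {
	for _, m := range []string{"kvm64", "host-model", "Nehalem"} {
		fmt.Printf("%s: selector needed = %v\n", m, needsCPUModelSelector(m))
	}
}
```

Only a named model other than kvm64 (here `Nehalem`, used purely as an example) ends up with a selector.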
@@ -0,0 +1,44 @@
diff --git a/pkg/virt-controller/services/nodeselectorrenderer.go b/pkg/virt-controller/services/nodeselectorrenderer.go
index 390f359d2a..c21caf97dd 100644
--- a/pkg/virt-controller/services/nodeselectorrenderer.go
+++ b/pkg/virt-controller/services/nodeselectorrenderer.go
@@ -23,6 +23,9 @@ type NodeSelectorRenderer struct {

type NodeSelectorRendererOption func(renderer *NodeSelectorRenderer)

+// DeckhouseVirtualizationPlatformGenericCPUModel is the name of the generic CPU model used for the Discovery and Features types of VMClass.
+const DeckhouseVirtualizationPlatformGenericCPUModel = "kvm64"
+
func NewNodeSelectorRenderer(
vmiNodeSelectors map[string]string,
clusterWideConfNodeSelectors map[string]string,
@@ -51,7 +54,8 @@ func (nsr *NodeSelectorRenderer) Render() map[string]string {
if nsr.hyperv {
maps.Copy(nsr.podNodeSelectors, hypervNodeSelectors(nsr.vmiFeatures))
}
- if nsr.cpuModelLabel != "" && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostModel) && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostPassthrough) {
+ // Prevent adding a node selector for host-model, host-passthrough, and the generic (kvm64) CPU model.
+ if nsr.cpuModelLabel != "" && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostModel) && nsr.cpuModelLabel != cpuModelLabel(v1.CPUModeHostPassthrough) && nsr.cpuModelLabel != cpuModelLabel(DeckhouseVirtualizationPlatformGenericCPUModel) {
nsr.enableSelectorLabel(nsr.cpuModelLabel)
}
for _, cpuFeatureLabel := range nsr.cpuFeatureLabels {
diff --git a/pkg/virt-launcher/virtwrap/live-migration-source.go b/pkg/virt-launcher/virtwrap/live-migration-source.go
index 5cc14a1f85..6bd0ba3d9d 100644
--- a/pkg/virt-launcher/virtwrap/live-migration-source.go
+++ b/pkg/virt-launcher/virtwrap/live-migration-source.go
@@ -230,6 +230,15 @@ func migratableDomXML(dom cli.VirDomain, vmi *v1.VirtualMachineInstance, domSpec
return "", err
}

+ // Put back common model if specified in VMI.
+ vmiCPU := vmi.Spec.Domain.CPU
+ if vmiCPU != nil && vmiCPU.Model == "kvm64" {
+ if domcfg.CPU.Model != nil {
+ domcfg.CPU.Model.Value = vmiCPU.Model
+ domcfg.CPU.Model.Fallback = "allow"
+ }
+ }
+
return domcfg.Marshal()
}

6 changes: 6 additions & 0 deletions images/virt-artifact/patches/README.md
@@ -134,3 +134,9 @@ To force delivery of packages to only one VM pod, the special label `network.dec
When the migration completes, the label is removed and the target pod becomes accessible via network.

d8-cni-cilium ensures that once the label is removed from the target pod, only the target pod remains accessible over the network (while the source pod does not).

#### `031-prevent-adding-node-selector-for-dvp-generic-cpu-model.patch`

- Do not add the cpu-model nodeSelector for the "kvm64" model. This selector prevents VMs from starting, because node-labeler does not label nodes with the "kvm64" model.

- Overwrite the calculated model on migration: put back "kvm64" for the Discovery and Features vmclass types.
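
The migration-time overwrite described above can be sketched as follows. This is a simplified illustration, not the patched `migratableDomXML`: the `cpu` and `cpuModel` types are cut-down stand-ins for the domain XML types, `restoreGenericModel` is a hypothetical helper, and `Skylake-Client` is just an example of a model that libvirt might have resolved.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// cpuModel and cpu are simplified stand-ins for the libvirt domain XML types.
type cpuModel struct {
	Fallback string `xml:"fallback,attr,omitempty"`
	Value    string `xml:",chardata"`
}

type cpu struct {
	XMLName xml.Name  `xml:"cpu"`
	Mode    string    `xml:"mode,attr,omitempty"`
	Model   *cpuModel `xml:"model"`
}

// restoreGenericModel is a hypothetical helper. If the VMI requested the
// generic kvm64 model, it writes that model back into the domain CPU config
// and allows fallback, replacing whatever model was calculated earlier.
// Any other requested model, or a config without a model element, is left
// untouched.
func restoreGenericModel(c *cpu, requested string) {
	if requested != "kvm64" || c.Model == nil {
		return
	}
	c.Model.Value = requested
	c.Model.Fallback = "allow"
}

func main() {
	c := &cpu{Mode: "custom", Model: &cpuModel{Fallback: "forbid", Value: "Skylake-Client"}}
	restoreGenericModel(c, "kvm64")

	out, err := xml.Marshal(c)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The nil check mirrors the guard in the patch: the model is only rewritten when the domain config already carries a model element.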
2 changes: 1 addition & 1 deletion images/virt-launcher/werf.inc.yaml
@@ -147,7 +147,7 @@ shell:
- mv /usr/bin/virt-launcher-monitor /usr/bin/virt-launcher-monitor-orig
- cp /scripts/virt-launcher-monitor-wrapper.sh /usr/bin/virt-launcher-monitor
- chmod +x /usr/bin/virt-launcher-monitor
-# Configure liboverride globally.
+# Configure liboverride globally. It should be done in the last stage (setup) to not break stapel commands.
- cp /etc/ld.so.preload.in /etc/ld.so.preload
# Create qemu group and user.
- groupadd --gid 107 qemu && useradd qemu --uid 107 --gid 107 --shell /bin/bash --create-home
@@ -42,6 +42,9 @@ import (
const (
CloudInitDiskName = "cloudinit"
SysprepDiskName = "sysprep"

// GenericCPUModel specifies the base CPU model for Features and Discovery CPU model types.
GenericCPUModel = "kvm64"
)

type KVVMOptions struct {
@@ -109,7 +112,8 @@ func (b *KVVM) SetCPUModel(class *virtv2.VirtualMachineClass) error {
cpu.Model = virtv1.CPUModeHostPassthrough
case virtv2.CPUTypeModel:
cpu.Model = class.Spec.CPU.Model
-case virtv2.CPUTypeFeatures, virtv2.CPUTypeDiscovery:
+case virtv2.CPUTypeDiscovery, virtv2.CPUTypeFeatures:
+cpu.Model = GenericCPUModel
features := make([]virtv1.CPUFeature, len(class.Status.CpuFeatures.Enabled))
for i, feature := range class.Status.CpuFeatures.Enabled {
policy := "require"