In today's computing landscape, ensuring live migration of virtual machines (VMs) is essential for maintaining high availability and minimizing downtime during maintenance tasks. KubeVirt, an extension of Kubernetes, integrates VM management into the containerized world, enabling unified control over both virtualized and containerized workloads.
In this guide, we’ll walk you through setting up a scalable infrastructure on AWS that supports the Live Migration feature in KubeVirt, utilizing AWS FSx for NetApp ONTAP and OpenShift Container Platform (OCP). By the end of this tutorial, you’ll have the tools to live migrate VMs effortlessly within your Kubernetes cluster, ensuring high availability and reliability in your cloud environment.
The following technologies are crucial to this setup:
- OpenShift Container Platform (OCP) running on AWS with bare-metal worker nodes
- KubeVirt, which runs virtual machines as Kubernetes workloads
- AWS FSx for NetApp ONTAP, providing shared Multi-AZ NFS storage
- NetApp Trident, the CSI driver that exposes FSx for ONTAP volumes to the cluster
Before proceeding, ensure you have:
- An AWS account with permissions to create EC2, VPC, FSx, and CloudFormation resources
- A public Route 53 hosted zone for the cluster's base domain, plus a Red Hat pull secret and an SSH key
- The aws, oc, kubectl, virtctl, and helm CLIs installed locally
First, download the OpenShift installer for your platform (the example below fetches the macOS ARM64 build; use the matching archive for Linux or Intel macOS):
curl -fsSLO https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.14.4/openshift-install-mac-arm64-4.14.4.tar.gz
tar xzf openshift-install-mac-arm64-4.14.4.tar.gz
./openshift-install --help
Create a directory for your OpenShift manifests:
mkdir -p ocp-manifests-dir/
Next, create an install-config.yaml file inside this directory with content similar to the following. Note the c5.metal instance type for the workers: KubeVirt needs bare-metal nodes so the VMs get hardware virtualization support.
apiVersion: v1
baseDomain: yourdomain.com
compute:
- name: worker
  platform:
    aws:
      type: c5.metal
  replicas: 2
controlPlane:
  name: master
  platform: {}
  replicas: 3
metadata:
  name: ocp-demo
networking:
  networkType: OVNKubernetes
platform:
  aws:
    region: your-aws-region
publish: External
pullSecret: 'your-pull-secret'
sshKey: 'your-ssh-key'
Note: Replace placeholders like yourdomain.com, your-aws-region, your-pull-secret, and your-ssh-key with your actual values.
Generate the OpenShift manifests:
./openshift-install create manifests --dir ocp-manifests-dir
Back up the generated manifests (the installer consumes them during cluster creation):
cp -r ocp-manifests-dir/ ocp-manifests-dir-bkp
Start the cluster installation:
./openshift-install create cluster --dir ocp-manifests-dir --log-level debug
Live migration needs storage that can be mounted ReadWriteMany from multiple nodes, and a Multi-AZ FSx for NetApp ONTAP file system provides that over NFS while remaining available across Availability Zones. Navigate to the fsx directory and create the FSx file system using AWS CloudFormation:
cd fsx
aws cloudformation create-stack \
  --stack-name FSXONTAP \
  --template-body file://./netapp-cf-template.yaml \
  --region your-aws-region \
  --parameters \
    ParameterKey=Subnet1ID,ParameterValue=subnet-xxxxxxxx \
    ParameterKey=Subnet2ID,ParameterValue=subnet-yyyyyyyy \
    ParameterKey=myVpc,ParameterValue=vpc-zzzzzzzz \
    ParameterKey=FSxONTAPRouteTable,ParameterValue=rtb-aaaaaaa,rtb-bbbbbbb \
    ParameterKey=FileSystemName,ParameterValue=myFSxONTAP \
    ParameterKey=ThroughputCapacity,ParameterValue=256 \
    ParameterKey=FSxAllowedCIDR,ParameterValue=0.0.0.0/0 \
    ParameterKey=FsxAdminPassword,ParameterValue=YourFSxAdminPassword \
    ParameterKey=SvmAdminPassword,ParameterValue=YourSvmAdminPassword \
  --capabilities CAPABILITY_NAMED_IAM
Note: Replace the parameter values with your actual AWS resource IDs and desired passwords.
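Stack creation takes a while. You can poll the status with the command below; once it reports CREATE_COMPLETE, note the management and NFS DNS names of the SVM (from the FSx console, or from the stack outputs if the template exposes them), as the Trident backend configuration needs them later:
aws cloudformation describe-stacks \
  --stack-name FSXONTAP \
  --region your-aws-region \
  --query "Stacks[0].StackStatus"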
Once the OpenShift installation completes, point your kubeconfig at the new cluster and verify that the nodes are ready:
export KUBECONFIG=$(pwd)/ocp-manifests-dir/auth/kubeconfig
kubectl get nodes
Create the trident namespace and install the Trident CSI driver:
oc create ns trident
curl -L -o trident-installer.tar.gz https://github.com/NetApp/trident/releases/download/v22.10.0/trident-installer-22.10.0.tar.gz
tar -xvf trident-installer.tar.gz
cd trident-installer/helm
helm install trident -n trident trident-operator-22.10.0.tgz
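Before configuring a backend, verify that the Trident operator and CSI pods are running and that Trident reports an installed version:
oc get pods -n trident
oc get tridentversions -n trident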
Create a svm_secret.yaml file with the following content:
apiVersion: v1
kind: Secret
metadata:
  name: backend-fsx-ontap-nas-secret
  namespace: trident
type: Opaque
stringData:
  username: vsadmin
  password: YourSvmAdminPassword
Apply the secret:
oc apply -f svm_secret.yaml
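Confirm the secret was created:
oc get secret backend-fsx-ontap-nas-secret -n trident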
Edit backend-ontap-nas.yaml in the fsx directory, replacing the placeholders with your FSx for ONTAP management DNS name, NFS DNS name, and SVM name. The file defines a TridentBackendConfig that references the secret created above:
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-fsx-ontap-nas
  namespace: trident
spec:
  version: 1
  storageDriverName: ontap-nas
  managementLIF: management-dns-name
  dataLIF: nfs-dns-name
  svm: svm-name
  credentials:
    name: backend-fsx-ontap-nas-secret
Apply the backend configuration:
oc apply -f fsx/backend-ontap-nas.yaml
Verify the backend status:
oc get tridentbackends -n trident
Create a storage class by applying storage-class-csi-nas.yaml:
oc apply -f fsx/storage-class-csi-nas.yaml
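For reference, a Trident NAS storage class generally looks like the sketch below; the class name trident-csi is an assumption here, so use whatever name the repository's storage-class-csi-nas.yaml actually defines:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: trident-csi            # assumed name; match the file in the repo
provisioner: csi.trident.netapp.io
parameters:
  backendType: "ontap-nas"     # selects the ontap-nas backend created above
allowVolumeExpansion: true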
Verify the storage class:
oc get sc
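Live migration requires the VM disks to sit on ReadWriteMany (RWX) volumes, which is exactly what the ontap-nas backend provides over NFS. As a quick sanity check, you can create a throwaway RWX claim against the new storage class (the PVC name is illustrative and the class name is assumed from the sketch above):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-test                 # illustrative name
  namespace: default
spec:
  accessModes:
    - ReadWriteMany              # required for KubeVirt live migration
  resources:
    requests:
      storage: 1Gi
  storageClassName: trident-csi  # assumed class name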
Install KubeVirt in the openshift-cnv namespace:
echo '
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-cnv
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kubevirt-hyperconverged-group
  namespace: openshift-cnv
spec:
  targetNamespaces:
  - openshift-cnv
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hco-operatorhub
  namespace: openshift-cnv
spec:
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  name: kubevirt-hyperconverged
  startingCSV: kubevirt-hyperconverged-operator.v4.14.0
  channel: "stable"' | oc apply -f -
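The operator pods take a few minutes to start; you can watch them come up with:
oc get pods -n openshift-cnv -w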
Wait for all pods in openshift-cnv to be ready, then create the HyperConverged resource:
echo '
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec: {}' | oc apply -f -
Verify the installation:
oc get csv -n openshift-cnv
oc get kubevirt -n openshift-cnv
oc get HyperConverged -n openshift-cnv
You have two VMs running on separate bare-metal worker nodes in your OpenShift cluster. You need to perform maintenance on WorkerA and want to live-migrate its VM to WorkerB without downtime.
KubeVirt VMs run inside Kubernetes pods. During a live migration a new pod is created on the destination node, so the VM's primary (pod network) IP address changes, disrupting connectivity.
To maintain continuous network connectivity during migration, we'll add a second network interface to the VMs using a NetworkAttachmentDefinition (NAD). This secondary interface will have a static IP, ensuring seamless communication post-migration.
Create a namespace for your VMs:
oc create ns vm-test
The NAD configuration (virtualization/nad.yaml) looks like this:
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: static-eth1
  namespace: vm-test
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "bridge",
    "bridge": "br1",
    "ipam": {
      "type": "static"
    }
  }'
Apply the NAD:
oc apply -f virtualization/nad.yaml
Create two VMs, each with two network interfaces:
oc apply -f virtualization/vm-rhel-9-dual-nic.yaml
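For reference, the part of the VM spec that matters here is the pairing of the default pod network with the static-eth1 NAD. A minimal sketch of those fields (the full spec lives in virtualization/vm-rhel-9-dual-nic.yaml):
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
          - name: default
            masquerade: {}     # primary pod network; its IP changes on migration
          - name: secondary
            bridge: {}         # secondary interface backed by the NAD
      networks:
      - name: default
        pod: {}
      - name: secondary
        multus:
          networkName: static-eth1   # the NAD created above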
Verify the VMs are running:
oc get vm -n vm-test
Access each VM console and assign static IPs to eth1:
VM A:
virtctl console -n vm-test rhel9-dual-nic-a
Inside the VM:
sudo ip addr add 192.168.1.10/24 dev eth1
VM B:
virtctl console -n vm-test rhel9-dual-nic-b
Inside the VM:
sudo ip addr add 192.168.1.11/24 dev eth1
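Note that ip addr add does not persist across guest reboots. If you want the address to survive a reboot, one option on RHEL 9 is to configure it with NetworkManager instead (shown here for VM B; use 192.168.1.10/24 on VM A):
sudo nmcli connection add type ethernet ifname eth1 con-name eth1-static ipv4.method manual ipv4.addresses 192.168.1.11/24
sudo nmcli connection up eth1-static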
From VM A, ping VM B:
ping 192.168.1.11
From VM B, ping VM A:
ping 192.168.1.10
Successful replies confirm network connectivity over the secondary interfaces.
Now, initiate live migration of VM A. KubeVirt creates a VirtualMachineInstanceMigration and the scheduler places the VM on the other bare-metal worker (WorkerB):
virtctl migrate rhel9-dual-nic-a -n vm-test
Monitor the migration status:
oc get vmim -n vm-test
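Once the migration reports Succeeded, confirm that the VM instance is now running on the other worker node:
oc get vmi -n vm-test -o wide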
In conclusion, by integrating AWS FSx for NetApp ONTAP with OpenShift and KubeVirt, we've enabled live migration of virtual machines within a Kubernetes cluster. A secondary network interface with a static IP kept network connectivity intact during migrations, allowing maintenance and scaling operations without disrupting running applications.
This setup combines AWS managed services with open-source technologies to deliver a scalable, resilient infrastructure for modern cloud-native workloads, with high availability and operational efficiency.