How to Enable Live Migration in KubeVirt with AWS FSx and OpenShift
In today's computing landscape, ensuring live migration of virtual machines (VMs) is essential for maintaining high availability and minimizing downtime during maintenance tasks. KubeVirt, an extension of Kubernetes, integrates VM management into the containerized world, enabling unified control over both virtualized and containerized workloads.
In this guide, we’ll walk you through setting up a scalable infrastructure on AWS that supports the Live Migration feature in KubeVirt, utilizing AWS FSx for NetApp ONTAP and OpenShift Container Platform (OCP). By the end of this tutorial, you’ll have the tools to live migrate VMs effortlessly within your Kubernetes cluster, ensuring high availability and reliability in your cloud environment.
Tech Stack Overview
The following technologies are crucial to this setup:
- AWS Bare Metal EC2 Instances
Provide the physical hardware required for running KubeVirt, as KubeVirt requires deployment on metal instances.
- AWS FSx for NetApp ONTAP
Offers a fully managed shared file system with ReadWriteMany access, essential for live migration.
- OpenShift Container Platform (OCP)
A Kubernetes-based container orchestration platform that simplifies application deployment and management.
- KubeVirt
Extends Kubernetes by allowing it to manage virtual machines as native Kubernetes resources.
- Trident CSI Driver
A Container Storage Interface (CSI) driver from NetApp that integrates with Kubernetes to manage storage provisioning.
Prerequisites
Before proceeding, ensure you have:
- Basic knowledge of Kubernetes, OpenShift, and AWS services.
- A clone of the project repository, which contains all the necessary configuration files and templates:
git clone https://github.com/kloia/aws-ocp-kubevirt-fsx.git
cd aws-ocp-kubevirt-fsx
- An AWS account with the necessary permissions to create EC2 instances, FSx file systems, and VPC configurations.
- The command-line tools used in this guide installed on your workstation: the OpenShift installer, kubectl, oc, helm, virtctl, and the AWS CLI.
Deployment
- Setting Up the Environment
- Installation
- Deploying the Trident CSI Driver
- Deploying KubeVirt
- Live Migration of VMs with KubeVirt
Setting Up the Environment
Download the OpenShift Installer
First, download the OpenShift installer for your platform:
curl -fsSLO https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.14.4/openshift-install-mac-arm64-4.14.4.tar.gz
tar xzf openshift-install-mac-arm64-4.14.4.tar.gz
./openshift-install --help
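The download above is specific to Apple Silicon Macs. As a minimal sketch, the helper below maps the output of uname to a tarball name for other workstations. The function name installer_tarball is hypothetical, and only the three asset names listed in the case statement are assumed to exist on the mirror; check the mirror directory for your version before relying on it.

```shell
# Sketch: map `uname` output to an installer tarball name for a given version.
# Only these three platform suffixes are assumed to be published on the mirror.
installer_tarball() {
  local os="$1" arch="$2" version="$3" suffix
  case "${os}/${arch}" in
    Linux/x86_64)  suffix="linux" ;;
    Darwin/x86_64) suffix="mac" ;;
    Darwin/arm64)  suffix="mac-arm64" ;;
    *) echo "unsupported platform: ${os}/${arch}" >&2; return 1 ;;
  esac
  printf 'openshift-install-%s-%s.tar.gz\n' "$suffix" "$version"
}

# Usage (not executed here):
#   tarball=$(installer_tarball "$(uname -s)" "$(uname -m)" 4.14.4)
#   curl -fsSLO "https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.14.4/${tarball}"
```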
Create the Installation Configuration
Create a directory for your OpenShift manifests:
mkdir -p ocp-manifests-dir/
Then create an install-config.yaml file in that directory with the following content:
apiVersion: v1
baseDomain: yourdomain.com
compute:
- name: worker
  platform:
    aws:
      type: c5.metal
  replicas: 2
controlPlane:
  name: master
  platform: {}
  replicas: 3
metadata:
  name: ocp-demo
networking:
  networkType: OVNKubernetes
platform:
  aws:
    region: your-aws-region
publish: External
pullSecret: 'your-pull-secret'
sshKey: 'your-ssh-key'
Note: Replace placeholders like yourdomain.com, your-aws-region, your-pull-secret, and your-ssh-key with your actual values.
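Because a forgotten placeholder only surfaces later as an installer error, a quick check before generating manifests can save a failed run. The sketch below greps for the four placeholder strings from the sample configuration; the function name check_placeholders is hypothetical.

```shell
# Sketch: fail if the file still contains any of the sample placeholder values.
check_placeholders() {
  local file="$1"
  # grep -n prints each offending line with its line number.
  if grep -nE 'yourdomain\.com|your-aws-region|your-pull-secret|your-ssh-key' "$file"; then
    echo "placeholders still present in ${file}" >&2
    return 1
  fi
  echo "no placeholders found in ${file}"
}

# Usage (not executed here):
#   check_placeholders ocp-manifests-dir/install-config.yaml
```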
Generate Manifests
Generate the OpenShift manifests:
./openshift-install create manifests --dir ocp-manifests-dir
Installation
Install the OpenShift Cluster
Back up your installation configuration:
cp -r ocp-manifests-dir/ ocp-manifests-dir-bkp
Start the cluster installation:
./openshift-install create cluster --dir ocp-manifests-dir --log-level debug
Provision AWS FSx for NetApp ONTAP
Live migration needs a shared file system that supports ReadWriteMany access; this guide uses a Multi-AZ FSx for NetApp ONTAP deployment spanning two subnets. Navigate to the fsx directory and create the FSx file system using AWS CloudFormation:
cd fsx
aws cloudformation create-stack \
--stack-name FSXONTAP \
--template-body file://./netapp-cf-template.yaml \
--region your-aws-region \
--parameters \
ParameterKey=Subnet1ID,ParameterValue=subnet-xxxxxxxx \
ParameterKey=Subnet2ID,ParameterValue=subnet-yyyyyyyy \
ParameterKey=myVpc,ParameterValue=vpc-zzzzzzzz \
ParameterKey=FSxONTAPRouteTable,ParameterValue=rtb-aaaaaaa\\,rtb-bbbbbbb \
ParameterKey=FileSystemName,ParameterValue=myFSxONTAP \
ParameterKey=ThroughputCapacity,ParameterValue=256 \
ParameterKey=FSxAllowedCIDR,ParameterValue=0.0.0.0/0 \
ParameterKey=FsxAdminPassword,ParameterValue=YourFSxAdminPassword \
ParameterKey=SvmAdminPassword,ParameterValue=YourSvmAdminPassword \
--capabilities CAPABILITY_NAMED_IAM
Note: Replace the parameter values with your actual AWS resource IDs and desired passwords.
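Stack creation returns immediately while the file system is still being provisioned. As a rough sketch, the helper below blocks until the stack is complete and then prints its outputs. The function name wait_for_fsx_stack is hypothetical, and the output keys depend on netapp-cf-template.yaml, so none are assumed here.

```shell
# Sketch: wait for the CloudFormation stack, then list whatever outputs it defines.
wait_for_fsx_stack() {
  local stack="$1" region="$2"
  # Returns non-zero if the stack fails or the waiter times out.
  aws cloudformation wait stack-create-complete \
    --stack-name "$stack" --region "$region" || return 1
  aws cloudformation describe-stacks \
    --stack-name "$stack" --region "$region" \
    --query 'Stacks[0].Outputs' --output table
}

# Usage (not executed here):
#   wait_for_fsx_stack FSXONTAP your-aws-region
```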
Deploying the Trident CSI Driver
Set KUBECONFIG Environment Variable
export KUBECONFIG=$(pwd)/ocp-manifests-dir/auth/kubeconfig
kubectl get nodes
Install Trident Operator
Create the trident namespace and install the Trident CSI driver:
oc create ns trident
curl -L -o trident-installer.tar.gz https://github.com/NetApp/trident/releases/download/v22.10.0/trident-installer-22.10.0.tar.gz
tar -xvf trident-installer.tar.gz
cd trident-installer/helm
helm install trident -n trident trident-operator-22.10.0.tgz
Create Secrets for Backend Access
Create a svm_secret.yaml file with the following content:
apiVersion: v1
kind: Secret
metadata:
  name: backend-fsx-ontap-nas-secret
  namespace: trident
type: Opaque
stringData:
  username: vsadmin
  password: YourSvmAdminPassword
Apply the secret:
oc apply -f svm_secret.yaml
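If you would rather not keep the SVM password in a file, the same Secret can be created directly from a shell variable. This is a sketch of that alternative, not the repository's method; the function name create_svm_secret and the variable SVM_ADMIN_PASSWORD are hypothetical.

```shell
# Sketch: create the same Secret without writing the password to disk.
create_svm_secret() {
  local password="$1"
  oc create secret generic backend-fsx-ontap-nas-secret \
    --namespace trident \
    --from-literal=username=vsadmin \
    --from-literal=password="$password"
}

# Usage (not executed here):
#   create_svm_secret "$SVM_ADMIN_PASSWORD"
```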
Deploy the Trident Backend Configuration
Edit backend-ontap-nas.yaml in the fsx directory, replacing placeholders with your FSx for ONTAP details:
version: 1
storageDriverName: ontap-nas
managementLIF: management-dns-name
dataLIF: nfs-dns-name
svm: svm-name
username: vsadmin
password: YourSvmAdminPassword
Apply the backend configuration:
oc apply -f fsx/backend-ontap-nas.yaml
Verify the backend status:
oc get tridentbackends -n trident
Create a Storage Class
Create a storage class by applying storage-class-csi-nas.yaml:
oc apply -f fsx/storage-class-csi-nas.yaml
Verify the storage class:
oc get sc
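Since live migration depends on ReadWriteMany volumes, it can be worth confirming that the new storage class actually provisions one before installing KubeVirt. The sketch below prints a small test claim. The claim name rwx-smoke-test and the function name are hypothetical, and the storage class name must be taken from the oc get sc output because this guide does not state it.

```shell
# Sketch: print a 1Gi ReadWriteMany PVC manifest for the given storage class.
render_rwx_pvc() {
  local storage_class="$1" namespace="${2:-default}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-smoke-test
  namespace: ${namespace}
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ${storage_class}
  resources:
    requests:
      storage: 1Gi
EOF
}

# Usage (not executed here):
#   render_rwx_pvc <storage-class-name> | oc apply -f -
#   oc get pvc rwx-smoke-test -n default
#   oc delete pvc rwx-smoke-test -n default
```

If the class uses immediate volume binding, the claim should reach the Bound state shortly after it is applied.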
Deploying KubeVirt
Install KubeVirt in the openshift-cnv namespace:
echo '
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-cnv
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: kubevirt-hyperconverged-group
  namespace: openshift-cnv
spec:
  targetNamespaces:
    - openshift-cnv
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hco-operatorhub
  namespace: openshift-cnv
spec:
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  name: kubevirt-hyperconverged
  startingCSV: kubevirt-hyperconverged-operator.v4.14.0
  channel: "stable"' | oc apply -f -
Wait for all pods in openshift-cnv to be ready, then create the HyperConverged resource:
echo '
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:' | oc apply -f -
Verify the installation:
oc get csv -n openshift-cnv
oc get kubevirt -n openshift-cnv
oc get HyperConverged -n openshift-cnv
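The verification commands above show a point-in-time status. As a sketch, the helper below instead blocks until the HyperConverged resource reports the Available condition. The function name wait_for_hco and the 15-minute default are assumptions chosen for illustration.

```shell
# Sketch: block until the HyperConverged resource reports condition Available.
wait_for_hco() {
  local timeout="${1:-15m}"
  oc wait hyperconverged kubevirt-hyperconverged \
    --namespace openshift-cnv \
    --for=condition=Available \
    --timeout="$timeout"
}

# Usage (not executed here):
#   wait_for_hco 20m
```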
Live Migration of VMs with KubeVirt
Scenario
You have two VMs running on separate bare-metal worker nodes in your OpenShift cluster. You need to perform maintenance on WorkerA and want to live-migrate its VM to WorkerB without downtime.
Challenge
Each KubeVirt VM runs inside a Kubernetes pod. During a migration, a new pod is created on the target node and receives a new IP address, which disrupts connectivity on the pod network.
Solution
To maintain continuous network connectivity during migration, we'll add a second network interface to the VMs using a NetworkAttachmentDefinition (NAD). This secondary interface will have a static IP, ensuring seamless communication post-migration.
Create NetworkAttachmentDefinition (NAD)
Create a namespace for your VMs:
oc create ns vm-test
The NAD configuration in virtualization/nad.yaml looks like this:
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: static-eth1
  namespace: vm-test
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "bridge",
    "bridge": "br1",
    "ipam": {
      "type": "static"
    }
  }'
Apply the NAD:
oc apply -f virtualization/nad.yaml
Create VMs with Dual NICs
Create two VMs, each with two network interfaces:
oc apply -f virtualization/vm-rhel-9-dual-nic.yaml
Verify the VMs are running:
oc get vm -n vm-test
Assign IP Addresses to Secondary NICs
Access each VM console and assign static IPs to eth1:
VM A:
virtctl console -n vm-test rhel9-dual-nic-a
Inside the VM:
sudo ip addr add 192.168.1.10/24 dev eth1
VM B:
virtctl console -n vm-test rhel9-dual-nic-b
Inside the VM:
sudo ip addr add 192.168.1.11/24 dev eth1
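Addresses added with ip addr add do not survive a reboot of the guest. If the VMs need to keep their addresses, one option is a NetworkManager profile, sketched below under the assumption that the RHEL 9 guest uses NetworkManager and nmcli. The function name set_static_ip and the profile naming scheme are hypothetical.

```shell
# Run inside the VM. Sketch: create and activate a persistent profile for a NIC.
set_static_ip() {
  local ifname="$1" cidr="$2" profile="static-${1}"
  # Create a profile bound to the interface with a manually assigned address.
  sudo nmcli connection add type ethernet ifname "$ifname" con-name "$profile" \
    ipv4.addresses "$cidr" ipv4.method manual || return 1
  sudo nmcli connection up "$profile"
}

# Usage (not executed here):
#   set_static_ip eth1 192.168.1.10/24   # on VM A
#   set_static_ip eth1 192.168.1.11/24   # on VM B
```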
Connectivity Test
From VM A, ping VM B:
ping 192.168.1.11
From VM B, ping VM A:
ping 192.168.1.10
Successful replies confirm network connectivity over the secondary interfaces.
Live Migration
Now, initiate live migration of VM A. With only two worker nodes, the scheduler places it on WorkerB:
virtctl migrate rhel9-dual-nic-a -n vm-test
Monitor the migration status:
oc get vmim -n vm-test
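To script the check rather than watch the list, the sketch below polls a single migration object until its phase is Succeeded or Failed. The function name wait_for_migration, the five-second interval, and the default of 60 attempts are assumptions for illustration; take the migration name from the oc get vmim output.

```shell
# Sketch: poll a VirtualMachineInstanceMigration until it succeeds or fails.
wait_for_migration() {
  local name="$1" namespace="$2" tries="${3:-60}" phase
  while [ "$tries" -gt 0 ]; do
    phase=$(oc get vmim "$name" -n "$namespace" -o jsonpath='{.status.phase}')
    case "$phase" in
      Succeeded) echo "migration ${name} succeeded"; return 0 ;;
      Failed)    echo "migration ${name} failed" >&2; return 1 ;;
    esac
    tries=$((tries - 1))
    sleep 5
  done
  echo "timed out waiting for ${name}" >&2
  return 1
}

# Usage (not executed here):
#   wait_for_migration <vmim-name> vm-test
#   oc get vmi rhel9-dual-nic-a -n vm-test -o jsonpath='{.status.nodeName}'
```

The second usage line prints the node the VM instance is currently running on, which should differ from the node it started on once the migration has succeeded.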
In conclusion, by integrating AWS FSx for NetApp ONTAP with OpenShift and KubeVirt, we've successfully enabled live migration of virtual machines (VMs) within a Kubernetes cluster. Utilizing a secondary network interface with a static IP ensured continuous network connectivity during migrations, allowing for seamless maintenance and scaling operations without disrupting running applications.
This robust setup harnesses the power of AWS managed services and open-source technologies to deliver a scalable, resilient infrastructure ideal for modern cloud-native workloads, ensuring high availability and operational efficiency.
Bilal Unal
Platform Engineer @kloia