Difference between revisions of "Moving A Virtual Machine"

From Nuclear Physics Group Documentation Pages
Latest revision as of 19:22, 9 July 2020

When you want to work on the hardware or underlying operating system of a VM server, you should consider moving all the VMs off to another server to avoid long downtimes for everyone. This is one of the main reasons for having VMs in the first place, so you can easily move them to other hardware. Moving a virtual machine is not so difficult, BUT, you must be careful.

Completely Cool Way to Move a VM

The issue with moving a VM is not so much the move itself, but making sure that the old node and the new node are identical and that downtime is minimized. I have not tried the "live move", nor do I trust it. The problem I see with the live move is its underlying assumption that the hardware and OS version of the two servers are identical, which for us is never the case.

Step 1: Synchronizing

  • I assume you have ripped one of the two RAID-1 drives from Endeavour and plugged it into Gourd. Note that Endeavour is a Xeon CPU running RHEL6 and Gourd is AMD running RHEL7, so they are not identical.
  • For some of the VMs, brief downtime and/or a short glitch period in the logs is not a big deal. In that case, follow the "Simple Move" recipe below.
  • First, get the xml setup file over to Gourd:

virsh dumpxml einstein.unh.edu  > /kvm/einstein.unh.edu.xml

  • Edit this file, to make sure it is sane.
    • change: <type arch='x86_64' machine='rhel5.4.0'>hvm</type> to <type arch='x86_64' machine='pc'>hvm</type>
    • change: <source bridge='br0'/> to <source bridge='farmbr'/>
    • change: <source bridge='br1'/> to <source bridge='unhbr'/>
  • You now want the qcow file on Gourd to be 100% in sync with the file on Endeavour, but on Endeavour, the VM is still running!
  • Mount the qcow file on gourd:

guestmount -a einstein.unh.edu.qcow -m /dev/sda2 /mnt/einstein

If you don't know which device to mount, run "virt-filesystems -a einstein.unh.edu.qcow --filesystems", which lists /dev/sda1 and /dev/sda2 in the case of einstein.unh.edu.qcow.
  • Run an update compare between the running einstein.unh.edu and your local disk image:

rsync -a --delete -vv einstein:/ --one-file-system  /mnt/einstein/

This will run much faster than copying the whole qcow file, plus, it works on a running Einstein. Speedup is about 1000x. It still takes 9 minutes if there are almost no differences.
  • You can do a last-second additional sync of only /var after turning off mail (/etc/init.d/postfix stop, /etc/init.d/dovecot stop) and then shut Einstein down. Alternatively, take Einstein down, guestmount the drive on Endeavour as well, and do a final sync of only those things that may have changed:
    • /var/log, /var/spool, /var/db, /var/lib/ldap
  • Shut down Einstein: run "/sbin/shutdown -h now" on Einstein itself.
  • Unmount /mnt/einstein:

guestunmount /mnt/einstein

  • Start up Einstein at the other end:
 
virsh define einstein.unh.edu.xml
virsh start einstein.unh.edu
virsh console einstein.unh.edu

The console part is recommended, so you can see if it boots properly.
Note that if something is not configured correctly, you need to first "virsh undefine einstein.unh.edu", then edit the xml file, then "virsh define einstein.unh.edu.xml" again.
  • If all went well, Einstein is now running on Gourd.
  • You need to now UNDEFINE Einstein on Endeavour, so we never run 2 copies!
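
The last-second sync of the directories listed above could be scripted roughly as follows. This is only a sketch: final_sync is a hypothetical helper name, the host name and mount point are passed in as arguments, and the DRY_RUN switch is an addition for checking the commands before running them. It assumes the image is still guestmounted on Gourd and that mail has already been stopped on Einstein.

```shell
#!/bin/sh
# Sketch: final sync of only the directories that may have changed.
# final_sync is a hypothetical helper, not part of rsync or libguestfs.
# With DRY_RUN=1 the rsync commands are printed instead of executed.
final_sync() {
    src_host="$1"      # running VM to copy from, e.g. einstein
    dest_root="$2"     # guestmount point of the image, e.g. /mnt/einstein
    for dir in /var/log /var/spool /var/db /var/lib/ldap; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -a --delete $src_host:$dir/ $dest_root$dir/"
        else
            # Trailing slashes make rsync mirror the directory contents.
            rsync -a --delete "$src_host:$dir/" "$dest_root$dir/" || return 1
        fi
    done
}
```

For example, after setting DRY_RUN=1, "final_sync einstein /mnt/einstein" prints the four rsync commands; unset DRY_RUN to actually run them.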

Simple Move

The recipe for a simple move. Assume a move from Endeavour to Gourd.

On Endeavour:

virsh destroy machine.name  # Stop the old machine (immediate power-off; "virsh shutdown" is the graceful alternative).
virsh dumpxml machine.name  > /tmp/machine.name.xml
emacs -nw  /tmp/machine.name.xml # Fix the ethernet adaptors: br0 -> farmbr, br1 -> unhbr, also set machine="pc", i.e. generic computer. Also check storage location.
scp /tmp/machine.name.xml gourd:/kvm   # Copy the file to gourd.
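
If you would rather not make the XML fixes by hand in an editor, the same three substitutions can be sketched with sed. This is a minimal sketch, not a tested part of the recipe: fix_vm_xml is a hypothetical helper name, it assumes the single-quoted attribute style shown in the dumpxml examples on this page, and it does not check the storage location, which you still need to verify yourself.

```shell
#!/bin/sh
# Sketch: apply the bridge and machine-type fixes to a dumped libvirt XML file.
# fix_vm_xml is a hypothetical helper name, not a virsh command.
fix_vm_xml() {
    xml_file="$1"
    # Keep the original dump so the changes can be reviewed with diff.
    cp "$xml_file" "$xml_file.orig" || return 1
    sed -e "s|<source bridge='br0'/>|<source bridge='farmbr'/>|" \
        -e "s|<source bridge='br1'/>|<source bridge='unhbr'/>|" \
        -e "s|\(<type arch='x86_64' machine='\)[^']*'|\1pc'|" \
        "$xml_file.orig" > "$xml_file"
}
```

A possible use is "fix_vm_xml /tmp/machine.name.xml" followed by "diff /tmp/machine.name.xml.orig /tmp/machine.name.xml" to confirm that only the intended lines changed before copying the file to Gourd.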

Most likely you can just copy the machine.name.qcow or machine.name.qcow2 file from Endeavour to Gourd, and be done with it. If you moved half of the kvm RAID, then the file is already there, but perhaps a tiny bit out of sync. Determine whether that time lag matters and whether copying the whole file would be too time consuming. If so, use the "Completely Cool Way" to move the VM (i.e. for Einstein).

IF the new disk is not accepted at the other end, you may need to clone it:

virt-clone -o original_machine_name -n new_machine_name -f /net/data/gourd/kvm/new_machine_storage.qcow2

On Gourd:

virsh define /kvm/machine.name.xml   # Define machine
virsh start machine.name # Start it. This is where you are most likely to see an error.

virsh console machine.name # Connect to the console so you can see the boot progress!

Yes, if you cloned the disk you end up with a small inconsistency: the machine and virsh correctly think that the name is machine.name, while the files are all named machine_clone. This can be fixed by renaming the files. You can also keep it as a reminder that the machine was moved.

CPU Issues

Moving from Pumpkin to Gourd means going from Intel to AMD. To accommodate this, read: https://www.berrange.com/posts/2010/02/15/guest-cpu-model-configuration-in-libvirt-with-qemukvm/
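
Before a move between hosts it can help to confirm which CPU vendor each host reports. The sketch below reads the XML that "virsh capabilities" prints and extracts the first vendor element. host_cpu_vendor is a hypothetical helper name; it only tells you that the vendors differ and does not choose a guest CPU model for you.

```shell
#!/bin/sh
# Sketch: print the host CPU vendor from `virsh capabilities` XML on stdin.
# host_cpu_vendor is a hypothetical helper name, not a virsh command.
host_cpu_vendor() {
    # Print the text of the first <vendor>...</vendor> element seen.
    sed -n 's:.*<vendor>\([^<]*\)</vendor>.*:\1:p' | head -n 1
}
```

Run "virsh capabilities | host_cpu_vendor" on both the old and the new host and compare the results.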