Gourd/Einstein Migration Plan

From Nuclear Physics Group Documentation Pages
Revision as of 19:08, 15 June 2010 by Aduston (talk | contribs)

This page is for notes on the steps needed in order to fully migrate from the current Einstein system to the new Gourd hardware and Einstein System.

System Diagrams

These notes and diagrams describe the migration from the old system layout with the original hardware Einstein to the current Gourd hardware / Einstein VM design.

First Proposed Redesign

This updated diagram incorporates Maurik's suggested changes:

Proposed redesign 01.png


Questions / Concerns

This design attempts to minimize the number of virtual machines, which frees up a couple of host names. I made Corn a backup DNS server that would run on Taro because this is the best way to fail over in case something happens with Jalapeno. Even if Einstein goes down entirely, we'll at least have one DNS server still running.

Corn is running Debian and is designed to be a standalone Bugzilla appliance. It is not known whether this will make it difficult to add the DNS functionality (it should be as simple as installing the service and copying the configuration from Jalapeno). This needs to be investigated. An alternative is leaving Corn as a standalone appliance and using the Okra virtual machine to provide printing as well as acting as a backup DNS.

It is not known yet whether the old Einstein will be usable as a mirror / failover for the new Einstein. This should be investigated. If it is, we should also investigate whether it will need a Red Hat license, or if CentOS will be sufficient.

Second Proposed Redesign

This design is deprecated, but left here for reference purposes.


System layout future.png

Gourd

Gourd will serve as the file server for home folders and mail, as well as the virtualization host for Einstein and other virtual machines such as Roentgen and Corn.

Migration Checklist for Gourd

  1. Drives and RAID
    1. Configure hard drives and RAID arrays as outlined here
      • Copy the 250GB system drive pass-thru disk to a 250GB RAID 1 volume on two 750GB disks (Slots 1 and 2) done
      • Remaining 500GB on each drive spanned to a 1TB RAID 0, mounted on /scratch done
      • Two 750GB disks as pass-thru, set up as software RAID (Slots 3 and 4) done
      • 500GB RAID 1 for home folders (/dev/md0) mounted on /home done
      • 100GB RAID 1 for Einstein's /var/spool/mail (/dev/md1) mounted on /mail done
      • 150GB RAID 1 for virtual machines (/dev/md2) mounted on /vmware and added as a local datastore in VMWare done
      • Two 750GB drives in Slots 7 and 8 as hot spares done
  2. System Setup
    1. NFS
      • Set up NFS Shares for /home and /mail done - Currently /mail share accessible by Tomato, need to change to Einstein at switchover
      • Create npghome.unh.edu alias interfaces on Gourd done
        • Add to DNS configs done - Assigned farm IP of 10.0.0.240
        • Needs to be added to Servers in LDAP for iptables to work done on Tomato
    2. Change Automount configuration in LDAP (possibly also on clients) to use npghome:/home instead of einstein for /net/home done on tomato
      • Ran into some trouble with this setup on feynman: could log in, but apps wouldn't run and Gnome would eventually freeze. Tested several possibilities:
        • Setting npghome to Einstein's IP address in hosts file worked
        • Bringing up npghome alias interfaces on Einstein worked
        • On a hunch, tried bringing down the firewall on Gourd, and then I could log in and mount /net/home to npghome with no issues. Fixed the firewall configuration: ports were set incorrectly, eth0.2 was added as the unh interface instead of eth1, and the iptables script now goes to tomato's LDAP, since it contains the entry for npghome which needed to be added to the firewall. Automount to npghome is now working on feynman, parity, gourd, and tomato without issue as of 01/16.
    3. Backups
      • Change rsync-backup.conf so that /mail and /home get backed up
      • Create new LDAP group for backups so that gourd doesn't get backed up a second time as npghome - change backup script in Lentil to use new group
    4. Virtual Machines
      • Copy virtual machines from Taro to /vmware on gourd
        • Corn done
        • Roentgen
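
The drive layout in the "Drives and RAID" item above can be sanity-checked with a little arithmetic. The following is a minimal sketch in Python 3 (an assumption; it is not part of the original setup) that uses only the sizes quoted in the checklist and confirms that each 750GB disk is fully allocated.

```python
# Sizes in GB, taken from the "Drives and RAID" checklist.
DISK_GB = 750

# Slots 1 and 2: a 250GB RAID 1 system volume; the rest of each disk
# is spanned into a RAID 0 volume mounted on /scratch.
system_raid1_gb = 250  # mirrored, so each disk gives up 250GB
scratch_per_disk_gb = DISK_GB - system_raid1_gb
scratch_raid0_gb = 2 * scratch_per_disk_gb  # striped across both disks

# Slots 3 and 4: software RAID 1 volumes; each volume takes the same
# amount of space on both disks.
software_raid1_gb = {"/home": 500, "/mail": 100, "/vmware": 150}
used_per_disk_gb = sum(software_raid1_gb.values())

print(f"/scratch RAID 0 size: {scratch_raid0_gb} GB")
print(f"Slots 3 and 4 used per disk: {used_per_disk_gb} of {DISK_GB} GB")
```

With the numbers above, /scratch comes out at 1000GB (the 1TB in the checklist) and the three software RAID 1 volumes use exactly 750GB on each of the disks in Slots 3 and 4.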


Einstein VM ( Currently Tomato )

  1. VM Setup done
    1. Create the Virtual Machine, Install / setup OS done
      • Tomato is currently a CentOS 5.4 machine running on gourd
        • Ran into issues with RHN since we don't seem to have a spare license to register tomato. We used CentOS so that we could install and update the necessary packages and test out the new configuration. If needed, we can set up Tomato with Einstein's license once that is free, and then copy configs over from the current machine.
      • Tomato virtual machine is set up to boot when Gourd boots. Tested this setup and gourd comes up successfully after a reboot. Initial login on gourd is a bit sluggish as you have to wait for tomato to finish booting, but works fine after a few seconds.
  2. LDAP Configuration done
    1. Copied LDAP configuration from Einstein. Have tested authentication with tomato's LDAP on feynman, gluon, parity, gourd, and tomato itself. Seems to work as well as Einstein.
  3. Firewall setup done
    • Used old Einstein's iptables-npg config. Probably need to clean up some of the unused rules from the old machine, though.
  4. Mail Setup
    1. Copy over configs for Dovecot, Postfix, Spamassassin, Mailman and Squirrelmail done
      • Set up mail services using Einstein's current setup. Copied over CMUSieve plugin from Einstein.
      • Dovecot seems to have some issues accessing mail over NFS mounts. Initially received the following error in /var/log/maillog - "dovecot: Mailbox indexes in /var/spool/mail/aduston are in NFS mount. You must set mmap_disable=yes to avoid index corruptions. If you're sure this check was wrong, set nfs_check=no" Changed the mmap_disable setting in /etc/dovecot.conf to yes. Considering making other changes according to the Dovecot wiki article on NFS. Will test to make sure they don't break anything.
      • Mailman still needs to be set up
      • Need to set up Apache for Squirrelmail. Also need /var/www from old einstein for automount. Should websites from Einstein run on the new VM, or move to Roentgen? done - websites will be left alone for now but should probably stop being served from einstein
        • Squirrelmail works, had some trouble caused by incorrect permissions on config files, now fixed.
        • Copied /var/www/html from Einstein. Need to add entry in export for automount
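
The Dovecot NFS fix above comes down to one setting in /etc/dovecot.conf. As a rough sketch (Python 3, with hypothetical helper names `dovecot_setting` and `nfs_ready` that are not part of Dovecot), the config text can be scanned to confirm that mmap_disable is set to yes before mail is served from the NFS mount. It assumes the plain `key = value` layout with `#` comments and ignores nesting inside `{ }` blocks.

```python
def dovecot_setting(conf_text, key):
    """Return the last uncommented value assigned to key, or None."""
    value = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if "=" not in line:
            continue
        name, _, raw = line.partition("=")
        if name.strip() == key:
            value = raw.strip()  # later assignments win
    return value


def nfs_ready(conf_text):
    """True if mmap_disable is explicitly set to yes."""
    return dovecot_setting(conf_text, "mmap_disable") == "yes"


# Example with an inline config fragment (hypothetical content):
sample = """
#mmap_disable = no
mmap_disable = yes
"""
print(nfs_ready(sample))
```

On the real machine the text would come from reading /etc/dovecot.conf. The same helper could be reused for any other NFS-related settings that get changed later.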

Day of Migration Checklist

  1. Need to send out e-mails reminding users of necessary steps on their end to prepare for the migration on Sunday. Sent initial e-mail, currently drafting the reminder before Friday night
  2. Prepare Tomato to take over for Einstein
    • We should do an initial rsync of mail and homes on Friday night or Saturday so that the day-of sync will take less time.
    • Tomato can be reconfigured to prepare for the switchover ahead of time. We will need to log in via the VMWare interface and shut down the network interfaces to avoid conflicts with the current Einstein, and then make configuration changes so that it identifies itself as Einstein (listed below)
    • At 1pm Sunday we will bring down mail on the current Einstein and then boot the newly reconfigured Einstein VM.
      • Test e-mail, webmail, ldap etc - login from several workstations to make sure things are authenticating correctly.
  3. If this is successful we can begin the process of switching automount to use npghome on workstations.
    • This may not require a reboot. I tested it on feynman and all I had to do was make the changes to auto.net and then restart autofs while only logged in as root. We can try it this way but be prepared to reboot if there's some issue with nfs.
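
The "test e-mail, webmail, ldap" step above could be backed by a quick reachability check before logging in from workstations. This is only a sketch in Python 3: the port numbers are the standard ones for SMTP, IMAP, HTTP and LDAP (whether Dovecot here serves IMAP on 143 is an assumption), and a successful TCP connection does not prove that authentication works.

```python
import socket

# Standard ports; which services are actually exposed is an assumption.
SERVICES = {"smtp": 25, "imap": 143, "http": 80, "ldap": 389}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host, services=SERVICES):
    """Map each service name to True/False for reachability on host."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Calling `check_services("einstein")` from a workstation (the host name is an assumption) returns a dictionary such as `{"smtp": True, ...}`. Any False entry is worth investigating before moving on to the automount changes.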

List of configuration files where "tomato" needs to be changed back to "einstein"

Tomato

/etc/sysconfig/network -- change IP addresses in network-scripts

/etc/openldap/slapd.conf done

/etc/httpd/conf/httpd.conf done

/etc/httpd/conf.d/mailman.conf done

/etc/dovecot.conf done

/etc/postfix/main.cf done

/etc/mailman/mm_cfg.py done

Gourd

/usr/local/bin/netgroup2iptables.pl -- set to authenticate to tomato. Change back to einstein.

/etc/exports -- Change /mail share so that it allows connections to Einstein's IP addresses, not tomato's
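
To double-check that no "tomato" references are left behind after the switchover, the files listed for Tomato can be scanned for the old name. The sketch below assumes a Python 3 interpreter (newer than what CentOS 5 ships) and uses a hypothetical helper, `find_leftovers`; it only reports matches and changes nothing.

```python
import re
from pathlib import Path

# Files listed above for the Tomato VM.
TOMATO_FILES = [
    "/etc/sysconfig/network",
    "/etc/openldap/slapd.conf",
    "/etc/httpd/conf/httpd.conf",
    "/etc/httpd/conf.d/mailman.conf",
    "/etc/dovecot.conf",
    "/etc/postfix/main.cf",
    "/etc/mailman/mm_cfg.py",
]


def find_leftovers(paths, word="tomato"):
    """Yield (path, line_number, line) for each line containing word."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    for path in paths:
        p = Path(path)
        if not p.is_file():
            continue  # skip files that do not exist on this host
        lines = p.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, 1):
            if pattern.search(line):
                yield str(p), number, line.strip()
```

Running `for hit in find_leftovers(TOMATO_FILES): print(hit)` as root on the VM would list any remaining occurrences. The same function can be pointed at the Gourd files by passing a different list.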


Other Machines

Other workstations and servers need /etc/auto.net changed so that home is mounted via npghome and not einstein. Other configs can be left the same, except on machines whose LDAP configurations were temporarily switched to use tomato for testing (probably just Gourd, Feynman, Gluon and possibly Parity). done for clients