Pumpkin

From Nuclear Physics Group Documentation Pages

Pumpkin is our new 8-CPU, 24-disk monster machine. It is really, really nice. Currently it is tied only to the "corn" IP address. Because of this, it seems like we're going to have Corn be the big physical machine, and Pumpkin be the virtualized machine.

Basic Setup

  • We will run Xen on this machine so that it can have 2 personalities: Pumpkin, 64-bit, and Corn, 32-bit, RHEL5.
  • The RAID is currently split. This allows for much easier maintenance and in the future possible upgrades.
    • Disks 1 to 11 are in RAID Set 0, which holds the RAID Volumes: System (300GB, RAID6, SCSI:0.0.0), System1 (300GB, RAID6, SCSI:0.0.1), Data1 (6833GB, RAID5, SCSI:0.0.2)
    • Disks 11 to 22 are in RAID Set 1, which holds the RAID Volume: Data2 (7499GB, RAID5, SCSI:0.0.3)
    • Disks 23 and 24 are passthrough (single disks) at SCSI:0.0.6 and SCSI:0.0.7. These can be used as spares, as backup, or to expand the other RAID sets later on. Currently they are seen as /dev/sde* and /dev/sdf*. /dev/sdf1 and /dev/sdf2 hold the old RHEL4 install the system came with.
    • The RAID card can be monitored at http://10.0.0.99/. Log in as "admin" with a password that is the same as the door combo.
    • To use this card with Linux you need a driver: arcmsr. This must be part of the initrd for the kernel, else you cannot boot from the RAID. You can also install from the CDs, if you have a driver floppy. The installer will then add the arcmsr driver to the initrd for you. You will still always need to have this driver!
      • The kernel module can be built from the sources located in /usr/src/kernels/Acera_RAID. Just run make.
  • There exists a temporary drive which holds a RHEL5 distro and the original RHEL4 distro from the manufacturer. It is currently disconnected from pumpkin. This drive was mirrored to /dev/sdf*, and /dev/sde{1,2} also has an old RHEL5 distro. We can try to use these as temporary drives for cloning other systems.
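
Since the machine cannot boot from the RAID without arcmsr, a quick check that the driver is present in the running kernel can be useful after a kernel or module rebuild. The following is only a minimal sketch: the function name module_loaded is made up for illustration, and it inspects the currently loaded modules (as listed in /proc/modules), not the contents of an initrd, so it does not prove that a newly installed kernel will boot.

```shell
# Return 0 if the named module appears in a /proc/modules-style listing.
# $1: module name (e.g. arcmsr)
# $2: listing file to read; defaults to /proc/modules (hypothetical helper).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Example: complain if the RAID driver is not loaded in the running kernel.
# module_loaded arcmsr || echo "arcmsr is not loaded" >&2
```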

Virtual Host: Corn

We run a 32-bit personality as "corn" using the Xen virtualization system. Corn is RHEL5, para-virtualized with a 32-bit kernel. It is a fully separate system (that could be booted as the main system with a few modifications to config files. Hint: don't do that!). This means that any system software installed on Pumpkin needs to be installed on Corn separately.

  • The system runs the alternate 2nd personality as "Corn".
  • This system is set up with Xen on the System1 drive (/dev/sdb). The system "Pumpkin" is the master host, or domain0.
  • Subscription issue: A virtual host needs special setup. The exact procedure is not yet clear, but at a minimum: subscribe only to the "base channel" and "Tools" (others may be OK, check!), then install the redhad-virtualization-host package. The system should then show up as a "Virtual Host" on the RHN licensing page; however, it still consumes a real license. I could not figure out how to move it, and the RHN documentation is rather sparse. At least we are closer to getting this to be a virtual license.
  • The virtual host needs to have both ethernets bridged. This is done by replacing the /etc/xen/scripts/network-bridge script with network-bridge-two, which calls the original twice. For the host, create two interfaces, first the one to xenbr1, then the one to xenbr0, so that the first one ends up being eth1 and the second one eth0. Yes, this seems backwards, but it now works. The key is to have the lines
alias eth0 xennet
alias eth1 xennet

in the /etc/modprobe.conf file. This is now working.
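
A wrapper that "calls the original twice" could look roughly like the sketch below. This is not a copy of the network-bridge-two script on pumpkin. The function name bridge_two is made up, and the pairing of eth0 with xenbr0 and eth1 with xenbr1 is an assumption. The vifnum=, netdev= and bridge= arguments are the ones accepted by the stock Xen 3 network-bridge script.

```shell
# Run a Xen bridge script once per physical interface.
# $1: path to the original script; all remaining arguments
#     (normally "start" or "stop") are passed through unchanged.
# Assumption: eth0 is bridged to xenbr0 and eth1 to xenbr1.
bridge_two() {
    bt_script="$1"
    shift
    "$bt_script" "$@" vifnum=0 netdev=eth0 bridge=xenbr0 &&
    "$bt_script" "$@" vifnum=1 netdev=eth1 bridge=xenbr1
}

# Inside a wrapper script this would be called as, for example:
# bridge_two /etc/xen/scripts/network-bridge "$@"
```

The second bridge is only attempted if the first call succeeds, so a failure on eth0 is not hidden by a later success on eth1.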

Creating a new Virtual Host from a previously installed disk

It seems one can do the following to create a clone of a physical system as a virtual host. This still needs to be tested better!

  • Stick the disk with the operating system on it in slot 23 or 24.
  • Create a new fully virtualized host:
    • In virtual machine manager, click create new.
    • Choose a name, as in: VHost23_sde
    • Choose a fully virtualized host; this allows more flexibility for the kernels etc.
    • Choose to install from the image at /data1/. This is a RHEL5 image. Choose the operating system version.
    • Choose the disk: /dev/sde or /dev/sdf
    • Set ethernet to xenbr1
    • Set memory (probably 1024) and cpus (probably 2)
    • Save config (click finish)
  • You can now boot the virtual system. (It will do this automatically.) When prompted, do not install a new system (idiot!); instead type "linux rescue".
  • When the rescue boots, it will look for installed operating systems.
  • From the rescue console you can figure out what the hardware signature is. Now some files on the operating system need to be adapted to the new (temporary) hardware signature. You could also do this ahead of time by mounting the disks on pumpkin and modifying the files there.
Backup all files you modify to a *_Physical version, so you can undo this before sticking it back into the physical system. Keep track of your changes on the wiki!
Hard disk will be /dev/hda*   ==> Modify grub.conf (or use LABEL=ROOT and label your partition), with LVM systems you should be OK.
                              ==> Modify /etc/fstab (probably change /dev/sda* to /dev/hda*)
Ethernet                      ==> Modify /etc/modules.conf or /etc/modprobe.conf and alias eth0 8139cp (REALTEK 8139cp driver) Needs test
                              ==> Same for eth1
  • Exit the console, or shut down the machine. Now add another ethernet card to the config, hooking it up to xenbr0.
  • Restart your virtual system. Make sure all VM disks are unmounted from pumpkin.
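
The "back up to *_Physical, then modify" step for /etc/fstab can be scripted. The sketch below is one possible way to do it, not a tested procedure. The function names are made up, and it assumes the only change needed is renaming /dev/sda* to /dev/hda*. From the rescue console the installed system is normally mounted under /mnt/sysimage, so the file to pass would be /mnt/sysimage/etc/fstab.

```shell
# Save a pristine copy as <file>_Physical unless one already exists.
backup_physical() {
    [ -e "${1}_Physical" ] || cp -p "$1" "${1}_Physical"
}

# Rewrite /dev/sdaN device names to /dev/hdaN in an fstab-style file.
# The result is always regenerated from the _Physical copy, so running
# this twice is harmless. $1: path to the file to adapt.
sda_to_hda() {
    backup_physical "$1" || return 1
    sed 's|/dev/sda|/dev/hda|g' "${1}_Physical" > "$1"
}

# Undo the change before the disk goes back into the physical machine.
restore_physical() {
    [ -e "${1}_Physical" ] && mv "${1}_Physical" "$1"
}
```

Entries that use LABEL= are left alone, which matches the note above that labeled partitions do not need editing.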

If you need to fix things, or poke around, and want to boot from the iso image again, change the disk line in the /etc/xen configuration file to:

boot="d"
disk = [ 'phy:/dev/sdf,hda,w','file:/data1/rhel-5.1.-server-i386-dvd.iso,hdc:cdrom,r']   

Then run "xm create hvmconfig_file" to load your changes and boot the new config. It will boot from the cdrom image.

This seems to work! Remember to change boot="d" back to boot="c" to boot from your disk.
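
Because forgetting to switch boot="d" back is an easy mistake, the toggle can be wrapped in a small helper. This is a sketch only: set_boot_device is a made-up name, and it assumes the config file has a single line that starts with boot, as in the example above.

```shell
# Set the boot="..." line of a Xen HVM config file.
# $1: config file; $2: c (boot from hard disk) or d (boot from CD-ROM image).
set_boot_device() {
    case "$2" in
        c|d) ;;
        *) echo "usage: set_boot_device FILE c|d" >&2; return 1 ;;
    esac
    # Write to a temporary file first so a failed sed cannot truncate the config.
    sed "s/^boot *= *\"[a-z]*\"/boot=\"$2\"/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

After changing the setting, for example with set_boot_device on the file under /etc/xen, run "xm create hvmconfig_file" as described above to boot with the new config.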

To Do

  • Set up SNMP for cacti monitoring.
  • Add the new systems to the lentil backup script.
  • There must be other things....
  • Set up sensors so that we can monitor the system. This will have to wait for a kernel that supports it.

Done

  • Set up ethernet.
  • Set up RAID volumes.
  • Set up partitions and create file systems.
  • Move the system to the System drive and remove the current temp drive.
  • Set up mount points for the data drives.
  • Set up LDAP for users to log in.
  • Set up exports, so other systems can see the drives. There were issues with the firewall, so I modeled the firewall after taro's. It seems to be working: I can successfully ls /net/data/pumpkin1 and ls /net/data/pumpkin2 on einstein.
  • Set up autofs so that it can see other drives. What other drives? It's working for einstein:/home. Other drives such as data drives.
  • Set up smartd so we will know when a disk is going bad. This can be done inside the RAID card, using a system that sends SNMP and e-mail, but it needs to be done. E-mail seems to be set up; let's see if we get any through npg-admins.
  • Restrict access (/etc/security/access.conf).
  • Set up sudo on both pumpkin and corn.