Pumpkin

From Nuclear Physics Group Documentation Pages
Revision as of 14:14, 2 January 2008 by Steve (talk | contribs) (→‎To Do)

Pumpkin is our new 8-CPU, 24-disk monster machine. It is really, really nice. Currently it is only tied to the "corn" IP address.

Basic Setup

  • We will run Xen on this machine so that it can have two personalities: Pumpkin (64-bit) and Corn (32-bit), both RHEL5.
  • The RAID is currently split into two RAID sets. This allows for much easier maintenance and, in the future, possible upgrades.
    • Disks 1 to 11 are in RAID Set 0, which holds the RAID volumes System (300GB, RAID6, SCSI:0.0.0), System1 (300GB, RAID6, SCSI:0.0.1), and Data1 (6833GB, RAID5, SCSI:0.0.2).
    • Disks 12 to 22 are in RAID Set 1, which holds the RAID volume Data2 (7499GB, RAID5, SCSI:0.0.3).
    • Disks 23 and 24 are passthrough (single disks) at SCSI:0.0.6 and SCSI:0.0.7. These can be used as spares, as backup, or to expand the other RAID sets later on.
    • The RAID card can be monitored at http://10.0.0.99/. Log in as "admin"; the password is the same as the door combo.
    • To use this card with Linux you need the arcmsr driver. It must be part of the initrd for the kernel, or you cannot boot from the RAID.
      • The kernel module can be built from the sources located in /usr/src/kernels/Acera_RAID. Just run make.
  • Currently we have a temporary drive in the system on the onboard SATA which holds a RHEL5 distro and the original RHEL4 distro from the manufacturer.
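
As a sanity check on the layout above, the listed volume sizes are consistent with two RAID sets of 11 disks each. The sketch below assumes 750 GB drives (the drive size is not stated on this page; it is inferred from the Data2 figure) and the usual rule that RAID5 gives up one disk's worth of space to parity and RAID6 two. The function names are made up for illustration.

```python
# Parity overhead, in disks, for the two RAID levels used on this machine.
PARITY_DISKS = {"RAID5": 1, "RAID6": 2}


def per_disk_slice_gb(volume_gb, n_disks, level):
    """Space a volume of the given usable size takes on each member disk."""
    return volume_gb / (n_disks - PARITY_DISKS[level])


def max_volume_gb(free_per_disk_gb, n_disks, level):
    """Largest usable volume that fits in the free space left on each disk."""
    return free_per_disk_gb * (n_disks - PARITY_DISKS[level])


DISK_GB = 750       # ASSUMPTION: drive size, not given in the notes
DISKS_PER_SET = 11  # disks in each RAID set

# RAID Set 0: two 300GB RAID6 volumes (System, System1), remainder is Data1.
used_per_disk = 2 * per_disk_slice_gb(300, DISKS_PER_SET, "RAID6")
data1_gb = max_volume_gb(DISK_GB - used_per_disk, DISKS_PER_SET, "RAID5")

# RAID Set 1: a single RAID5 volume (Data2) across the whole set.
data2_gb = max_volume_gb(DISK_GB, DISKS_PER_SET, "RAID5")

print(round(data1_gb), round(data2_gb))  # prints: 6833 7500
```

The Data1 result matches the 6833GB reported for the card. Data2 comes out 1GB above the reported 7499GB, which is plausibly rounding or a small amount of overhead on the card.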

Matt's Notes 12/28

It seems that right now the only bootable install is on the temporary drive. From what I understand, you can use Xen to create a guest OS on a partition and, once it's all set up, even point GRUB at it so it boots as the "real" OS. A possible plan of action: get RAID working on the current install, use Xen to put a rhel5_64 install (pumpkin) on one of the RAID volumes (probably System), boot that, and then Xen-install rhel5_32 (corn) on the other system volume. At that point, we could pull the temporary drive we're using now and be close to done.

I tried rebuilding the arcmsr driver, but when I ran 'make clean' in the src dir, I got lots of errors. That's a sign something's amiss. I can't access anything useful in the RAID BIOS, since it's password-protected and I haven't the slightest clue what the password is. From the initialization screen I found that the model is the ARC-1280. The pre-made driver can be found at ftp://ftp.areca.com.tw/RaidCards/AP_Drivers/Linux/DRIVER/RedHat/Redhat-EnterpriseLinux/RHEL5.0/

On further research, it's not entirely clear whether the Xen trickery I was thinking about will work. Not a problem if it doesn't. I've also downloaded the 64-bit install DVD onto feynman's data drive. Trying to boot into the original corn install results in a kernel panic due to switchroot failing. This almost isn't worth fixing; we should start over.
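
Because the arcmsr driver has to be inside the initrd for the system to boot from the RAID, one thing worth ruling out after a failed boot is an initrd that lacks the module. The sketch below is a rough check, assuming the initrd is a gzip-compressed cpio archive and that the module file is named arcmsr.ko; the function names and the example path are hypothetical. It only searches the decompressed archive for the file name, so treat a hit as a hint rather than proof.

```python
import gzip


def archive_mentions_module(raw_initrd, module="arcmsr"):
    """Return True if '<module>.ko' occurs in a gzip-compressed initrd image.

    cpio archives store member file names as plain bytes, so a simple
    substring search over the decompressed data is enough for a quick check.
    """
    payload = gzip.decompress(raw_initrd)
    return (module + ".ko").encode("ascii") in payload


def initrd_has_module(path, module="arcmsr"):
    """Read an initrd image from disk and run the check above on it."""
    with open(path, "rb") as fh:
        return archive_mentions_module(fh.read(), module)


# Example call (hypothetical path, adjust to the kernel actually installed):
#   initrd_has_module("/boot/initrd-2.6.18-8.el5xen.img")
```

If the check comes back False for the kernel being booted, the initrd needs to be rebuilt with the driver included before the RAID volumes can be used as a boot device.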

The website has a precompiled RHEL5 driver ready to be put on a floppy, as well as instructions for a first-time installation. Given that, I think we should scrap this whole temporary drive idea and install pumpkin on one of the system volumes, so it can be our dom0 without any weird setups. At least half our problems are made more difficult by weird setups, so this could be a step in the right direction.

My proposed course of action

  1. Using the ARECA driver floppy, install RHEL5 64-bit on one of the "system" RAID volumes with a normal DVD-ROM based install. Use a Xen kernel. Name it pumpkin, use standard packages. Nothing fancy.
  2. Make sure pumpkin works.
  3. Use virt-manager to install corn on the other "system" RAID volume as RHEL5 32-bit. We should use virt-manager because it's essentially a frontend to Xen, so if something breaks we can do it cli-style, and if stuff isn't breaking there's less to be confused about. Also, it's the standard Red Hat tool for this. Based on my playing around with it today, this should be a breeze.
  4. Make sure corn works.
  5. ./hooray
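
If step 3 has to be done "cli-style", the guest can be described in a plain Xen domain config file, which uses Python syntax. The following is a minimal sketch for a paravirtualized corn guest, not a tested configuration: the memory size, VCPU count, bridge name xenbr0, and the assumption that the System1 volume (SCSI:0.0.1) shows up in dom0 as /dev/sdb are all guesses that need checking on the real machine. Whether a 32-bit paravirtualized guest will run under a 64-bit dom0 also depends on the Xen version shipped with the release; a fully virtualized guest would need different settings.

```python
# /etc/xen/corn  (hypothetical file name; Xen domain configs are Python syntax)
# Assumes the guest OS has already been installed on the volume, so that
# pygrub can find a kernel inside it.

name = "corn"                    # domain name shown by the Xen tools
memory = 2048                    # ASSUMPTION: MB of RAM for the guest
vcpus = 2                        # ASSUMPTION: number of virtual CPUs

# Boot the kernel that lives inside the guest's own disk.
bootloader = "/usr/bin/pygrub"

# ASSUMPTION: System1 appears in dom0 as /dev/sdb; hand the whole
# device to the guest as its first virtual disk, writable.
disk = ["phy:/dev/sdb,xvda,w"]

# ASSUMPTION: dom0 provides a network bridge called xenbr0.
vif = ["bridge=xenbr0"]

on_reboot = "restart"
on_crash = "restart"
```

With a file like this saved as /etc/xen/corn, running "xm create corn" would start the guest from the command line, and "xm list" shows whether it is running.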

To Do

  • Move the system to the System volume and remove the current temp drive.
  • Set up mount points for the data drives.
  • Set up LDAP for users to log in. I started, but it's not working.
  • Set up exports, so other systems can see the drives.
  • Set up autofs so that it can see other drives.
  • Set up sensors so that we can monitor the system.
  • Set up smartd so we will know when a disk is going bad. This could instead be done inside the RAID card, which can send SNMP and email alerts, but either way it needs to be done.
  • Set up the other system with Xen on the System1 volume.
  • Set up SNMP for cacti monitoring.
  • Add the new systems to the lentil backup system.
  • There must be other things....

Done

  • Set up ethernet.
  • Set up RAID volumes.
  • Set up partitions and create file systems.