Difference between revisions of "Upgrading Endeavour"
From Nuclear Physics Group Documentation Pages
Revision as of 15:13, 15 April 2015
Upgrade main system
Started with a sideways migration to CentOS 5. This worked except for the InfiniBand packages, which were skipped:
- Bump Endeavour over to CENTOS 5: see http://wiki.centos.org/HowTos/MigrationGuide
- Execute: rpm -ivh http://mirror.centos.org/centos/5/os/i386/CentOS/centos-release-5-11.el5.centos.i386.rpm http://mirror.centos.org/centos/5/os/i386/CentOS/centos-release-notes-5.11-0.i386.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-5-11.el5.centos.x86_64.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-notes-5.11-0.x86_64.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/redhat-logos-4.9.99-11.el5.centos.noarch.rpm
- Execute: yum update --skip-broken
- Remove the broken packages by hand. Unfortunately, this took down the system :-)
- rpm -e --allmatches ibsim ibutils ibutils-libs infiniband-diags libibcommon libibcommon-devel libibcommon-static libibmad libibmad-devel libibmad-static libibumad libibumad-devel libibumad-static opensm opensm-devel opensm-libs opensm-static perftest srptools mvapich_gcc mvapich2_gcc
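Since removing these packages by hand took the system down, it may be worth checking which of them are actually installed and doing a dry run first. A minimal sketch, assuming a POSIX shell; installed_subset is a hypothetical helper, not part of rpm:

```shell
# installed_subset: hypothetical helper. Reads installed package names from
# stdin (one per line) and prints those that match the names given as arguments.
installed_subset() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x: match the whole line, -F: fixed string, -q: no output
        if printf '%s\n' "$installed" | grep -qxF -- "$pkg"; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

Possible usage: rpm -qa --qf '%{NAME}\n' | installed_subset ibsim ibutils opensm lists the candidates that are present, and rpm -e --test --allmatches followed by those names goes through the motions of the removal without erasing anything.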
New system @ CentOS 6.6:
- Reconfigured RAID. All the 2TB drives are now in slots 1-9 and configured as a 14TB RAID6 array.
- Slots 10,11,12 will be hot-spare, and 2x passthrough.
- The passthrough slots are for: slot 11, which can hold the home directory drive while Gourd is being upgraded; slot 12, OldSys, a 1TB drive with the old CentOS 5.5 system.
- There are 3 volumes on the RAID: "system" ~ 100GB, "system2" ~100GB, "data1"
- The remaining 12 slots will be filled with new high-density drives for another RAID6.
- Restarted web server.
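The 14TB figure follows from RAID6 setting aside the equivalent of two drives for parity: 9 drives of 2TB leave (9 - 2) x 2TB = 14TB. A small sketch of that arithmetic, useful when sizing the second array; the function name is illustrative, and the minimum of 4 drives is the usual RAID6 requirement rather than something checked against this controller:

```shell
# raid6_usable_tb: usable capacity of a RAID6 array in TB, given the
# number of drives and the size of each drive in TB.
# RAID6 uses the equivalent of two drives for parity.
raid6_usable_tb() {
    drives=$1
    size_tb=$2
    if [ "$drives" -lt 4 ]; then
        echo "RAID6 normally needs at least 4 drives" >&2
        return 1
    fi
    echo $(( (drives - 2) * size_tb ))
}

raid6_usable_tb 9 2   # prints 14
```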
Upgrading Endeavour nodes
Upgrade to CentOS 6 started March 17, 2015 with node2:
- Reboot node2 from a USB key with the CentOS 6 distribution installed. Chose "minimal install".
- Note: Should have added scp, i.e. the openssh-client packages. Added this "by hand" from endeavour with: cat openssh-client-... | ssh node2 "cat - > openssh-client.rpm", then installed that rpm.
- SSH into the system
- Copy the CentOS ISO to node2 with scp. Mount it on /mnt/centos.
- Install packages with: yum --disablerepo \* --enablerepo c6-media install
- The list of previously installed packages is in ~root/new_packages.txt, with the distribution and package version already stripped. From this list, the packages were sorted into "installed" and "available" with yum. From the resulting "available" list, only the x86_64 and noarch packages were installed.
- A number of config tweaks needed.
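The architecture filtering described above could be sketched as follows. This assumes the "yum list" output has one package per line with name.arch in the first column (yum can wrap very long names onto a second line, which this does not handle); keep_64bit_and_noarch is a hypothetical helper name:

```shell
# keep_64bit_and_noarch: reads "yum list"-style lines from stdin, where the
# first field is name.arch, and prints the names of the x86_64 and noarch
# packages only. Header lines and other architectures (e.g. i686) are dropped.
keep_64bit_and_noarch() {
    awk '$1 ~ /\.(x86_64|noarch)$/ {
        sub(/\.(x86_64|noarch)$/, "", $1)   # strip the architecture suffix
        print $1
    }'
}
```

Possible usage, with the repo name and package list from the steps above: yum --disablerepo \* --enablerepo c6-media list available $(cat ~root/new_packages.txt) | keep_64bit_and_noarch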
Done
- Initial installation of system
- Update the ethernet configuration (eth0) to 10.0.0.2 and onboot=yes.
- Install all previously installed packages
- Set up SSH for passwordless entry. See: http://itg.chem.indiana.edu/inc/wiki/software/openssh/189.html or http://en.wikibooks.org/wiki/OpenSSH/Cookbook/Host-based_Authentication
- Note: Endeavour identifies as master.farm.physics.unh.edu, and so this needs to be in the ssh_known_hosts file and in the shosts.equiv file and .shosts file.
- Configure authentication: see SSSD
- Synchronise clock with endeavour: rdate -s endeavour && hwclock --systohc
- Configure automount: Frankly, it is a mystery why it works, since the setup seems incomplete, but hey, I won't complain.
- Configure Infiniband
- See: http://pkg-ofed.alioth.debian.org/howto/infiniband-howto-4.html includes a number of tests. All passed.
- See: https://pkg-ofed.alioth.debian.org/howto/infiniband-howto-5.html -- tested OK.
- See: https://access.redhat.com/solutions/301643
- Note: ib0 does not seem to come up automatically. Needs to be checked.
- Disk cloning: Run /sbin/Node_Clone.sh 2 7 from node2 to clone from node2 to node7, when both disks are in node2. Currently node2 and node3 are set up for cloning.
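For the ethernet step, a sketch of what the eth0 configuration might look like after a clone. It assumes node N gets address 10.0.0.N (the notes only confirm 10.0.0.2 for node2), and the NETMASK and BOOTPROTO values are assumptions, not taken from the actual setup; make_ifcfg_eth0 is a hypothetical helper:

```shell
# make_ifcfg_eth0: hypothetical helper that prints an ifcfg-eth0 file for
# node number N, assuming node N uses address 10.0.0.N.
# NETMASK and BOOTPROTO are assumed values; adjust to the real network.
make_ifcfg_eth0() {
    node=$1
    cat <<EOF
DEVICE=eth0
BOOTPROTO=none
IPADDR=10.0.0.${node}
NETMASK=255.255.255.0
ONBOOT=yes
EOF
}
```

Possible usage on node7 after cloning: make_ifcfg_eth0 7 > /etc/sysconfig/network-scripts/ifcfg-eth0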
To Do
- Configure MPI
- Later. I'm not sure anyone is using this right now.
- Configure open PBS - not an RPM
- See: http://docs.adaptivecomputing.com/hpc/7-1-0/basic/Content/topics/1-installation/installingTorque.htm
- See: http://www.adaptivecomputing.com/support/download-center/torque-download/
- Note: The current version 4.0 does not play nice with older versions so we need to upgrade the ENTIRE pbs system to the new version.
- Perhaps install 2.x first? -- YES. The 4.0 will not install on the master node because we cannot get to the RedHat repos to install openssl.
- -- NO. The 2.x does not talk with the PBS on endeavour. The entire system needs to be uniform.
- Bump Endeavour over to CENTOS 5: see http://wiki.centos.org/HowTos/MigrationGuide
- Execute: rpm -ivh http://mirror.centos.org/centos/5/os/i386/CentOS/centos-release-5-11.el5.centos.i386.rpm http://mirror.centos.org/centos/5/os/i386/CentOS/centos-release-notes-5.11-0.i386.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-5-11.el5.centos.x86_64.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/centos-release-notes-5.11-0.x86_64.rpm http://mirror.centos.org/centos/5/os/x86_64/CentOS/redhat-logos-4.9.99-11.el5.centos.noarch.rpm
- Reconfigure Ganglia
- Set up Splunk