Gourd

From Nuclear Physics Group Documentation Pages
== General ==

Gourd is a dual quad-core-CPU server in a 2U rackmount chassis, put together nicely by Microway. It arrived at UNH on 11/24/2009. The system has an Areca [[RAID]] card with its own ethernet port and an IPMI card with its own ethernet port. The motherboard is from Super Micro.

Gourd is a data server, currently connected to the networks via the switch and VLAN.

3dm raid monitoring with a web interface is installed and set up, accessible at [http://gourd.unh.edu:888].

Hostnames: <code>gourd.unh.edu</code>, <code>gourd.farm.physics.unh.edu</code>

This is the page for the new Gourd hardware. The old Gourd is described [[old gourd|here]].
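The IPMI card mentioned above can be queried remotely over its own ethernet port. A minimal sketch using ipmitool, assuming the card's backend address (10.0.0.151, listed under the network settings on this page) and placeholder credentials:

```shell
# Query chassis power state through the IPMI card (ADMIN/PASSWORD are
# placeholders, not Gourd's real credentials)
ipmitool -I lanplus -H 10.0.0.151 -U ADMIN -P PASSWORD chassis status

# Read the sensor list (temperatures, fans, voltages)
ipmitool -I lanplus -H 10.0.0.151 -U ADMIN -P PASSWORD sdr list
```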
  
Gourd now hosts [[Einstein]] as a VM. The previous einstein hardware is described [[old einstein|here]].

== Network Configuration ==
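Since the guests run under KVM/libvirt (see the KVM section on this page), a quick way to confirm the Einstein VM is up is with virsh. This is a sketch; it assumes the libvirt client tools on Gourd and that the guest's domain name is "einstein":

```shell
# List all defined guests and their state on this hypervisor
virsh list --all

# Show details for the einstein guest ("einstein" is the assumed domain name)
virsh dominfo einstein
```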
[[Image:gourds.jpg|thumb|200px|Gourds]]

=== /etc/sysconfig/network-scripts/ifcfg-bohr_tun ===

<pre>
DEVICE=bohr_tun
DEVICETYPE=tunnel
TYPE=GRE
BOOTPROTO=none
ONBOOT=yes
USERCTL=no

MY_OUTER_IPADDR=132.177.88.183
MY_INNER_IPADDR=132.177.88.183
PEER_OUTER_IPADDR=132.177.88.174
PEER_INNER_IPADDR=132.177.88.174
</pre>
 
=== /etc/sysconfig/network-scripts/ifcfg-dirac_tun ===

<pre>
DEVICE=dirac_tun
DEVICETYPE=tunnel
TYPE=GRE
BOOTPROTO=none
ONBOOT=yes
USERCTL=no

MY_INNER_IPADDR=10.0.0.1
MY_OUTER_IPADDR=132.177.88.183
PEER_INNER_IPADDR=132.177.88.51
PEER_OUTER_IPADDR=132.177.88.51
</pre>
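For reference, a minimal sketch of what bringing up one of these tunnel files does, expressed as manual iproute2 commands. This assumes root and uses the addresses from ifcfg-bohr_tun above; the ttl value is an illustrative choice, not part of the config. Normally <code>ifup bohr_tun</code> handles all of this:

```shell
# Hand-create the bohr GRE tunnel (what ifup does for ifcfg-bohr_tun)
ip tunnel add bohr_tun mode gre \
    local 132.177.88.183 remote 132.177.88.174 ttl 255
ip addr add 132.177.88.183/32 dev bohr_tun
ip link set bohr_tun up

# Verify the tunnel exists and is up
ip tunnel show bohr_tun
ip -d link show bohr_tun
```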
 
=== /etc/sysconfig/network-scripts/ifcfg-eth0 ===

<pre>
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:E0:81:52:7A:79
IPADDR=10.0.0.252
NETMASK=255.255.252.0
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
</pre>
 
=== /etc/sysconfig/network-scripts/ifcfg-eth0:1 ===

<pre>
DEVICE=eth0:1
ONPARENT=yes
BOOTPROTO=none
IPADDR=10.0.0.1
NETMASK=255.255.255.255
TYPE=Ethernet
</pre>
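The eth0:1 stanza above is a legacy-style alias that pins a second host address on eth0. With iproute2 the equivalent is a labeled secondary address; a sketch, assuming root:

```shell
# Add the 10.0.0.1/32 service address as a labeled secondary on eth0
ip addr add 10.0.0.1/32 dev eth0 label eth0:1

# List all addresses on eth0, including the alias
ip addr show dev eth0
```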
 
=== /etc/sysconfig/network-scripts/ifcfg-eth0.2 ===

<pre>
# To UNH network
VLAN=yes
DEVICE=eth0.2
BOOTPROTO=none
BROADCAST=132.177.91.255
IPADDR=132.177.88.75
NETMASK=255.255.252.0
NETWORK=132.177.88.0
ONBOOT=yes
REORDER_HDR=no
GATEWAY=132.177.88.1

TYPE=Ethernet
USERCTL=no
IPV6INIT=no
PEERDNS=yes
</pre>

=== /etc/sysconfig/network-scripts/ifcfg-eth0.2:1 ===

<pre>
DEVICE=eth0.2:1
ONPARENT=yes
BOOTPROTO=none
IPADDR=132.177.88.183
NETMASK=255.255.255.255
TYPE=Ethernet
</pre>

=== /etc/sysconfig/network-scripts/ifcfg-lo ===

<pre>
DEVICE=lo
IPADDR=127.0.0.1
NETMASK=255.0.0.0
NETWORK=127.0.0.0
# If you're having problems with gated making 127.0.0.0/8 a martian,
# you can change this to something else (255.255.255.255, for example)
BROADCAST=127.255.255.255
ONBOOT=yes
NAME=loopback
</pre>

=== /etc/sysconfig/network-scripts/ifcfg-pauli_tun ===

<pre>
DEVICE=pauli_tun
DEVICETYPE=tunnel
TYPE=GRE
BOOTPROTO=none
ONBOOT=yes
USERCTL=no

MY_INNER_IPADDR=132.177.88.183
MY_OUTER_IPADDR=132.177.88.183
PEER_INNER_IPADDR=132.177.88.54
PEER_OUTER_IPADDR=132.177.88.54
</pre>

=Hardware Details=

*Microway 2U storage chassis with 8 SAS/SATA-II hot-swap drive bays and SAS backplane
*Two 500W redundant hot-swap power supplies
*Navion-S dual Opteron PCI-Express motherboard (H8DME-2)
*Two AMD "Shanghai" 2.4 GHz quad-core Opteron 2378 (Socket F) processors
*Eight 4GB DDR2 800 MHz ECC/registered modules (32GB total memory @ 533MHz)
*Dual integrated Gigabit Ethernet ports (the "Nvidia" ethernet)
*Integrated ATI ES1000 graphics
*Six integrated SATA-II ports
*Two x8 PCI Express slots
*Two 64-bit 133/100MHz PCI-X slots
*Two 64-bit 100MHz PCI-X slots
*Up to 6 USB 1.1/2.0 ports (2 on rear)
*PS/2 mouse & keyboard connectors
*Areca ARC-1220 8-port SATA-II RAID card, 256MB cache (reports as 1680), low-profile PCI Express x8, at 10.0.0.152
*SuperMicro AOC-SIM1U IPMI card
*Sony slim 24x CD/DVD drive (IDE)

= Setup =

Details of Gourd's [[Upgrading to Centos 7]].

== Hard disks ==
=== Results of testing (as of 6/28/07) ===

Disks on 3ware raid device.

Disk0:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27954        -

Disk1:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27944        -

Disk2:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    22137        -

Disk3:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27904        -

Disk4:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27804        -

Disk5:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%     5570        -

Disk6:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27739        -

Disk7:
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Short offline       Completed without error      00%    27830        -
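The LifeTime(hours) column in these logs is the disk's accumulated power-on hours at the time of the self-test. To pull that value out of a smartctl self-test log line in bulk, a small awk sketch works; the sample line below is copied from the Disk0 output above:

```shell
# Extract the LifeTime(hours) field from a smartctl self-test log line.
# On a live disk the line would come from: smartctl -l selftest /dev/sda
# The hours value is the next-to-last whitespace-separated field.
line='# 1  Short offline       Completed without error      00%    27954        -'
hours=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')
echo "$hours"   # prints 27954
```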
 
= Network Settings =

In Centos 7 there is an issue with the drivers for the Nvidia ethernet, which was previously provided through the forcedeth driver; the forcedeth driver is disabled in the CentOS 7 kernel. You can use the kmod-forcedeth driver from elrepo.org:

 http://elrepo.org/linux/elrepo/el7/x86_64/RPMS/kmod-forcedeth-0.64-1.el7.elrepo.x86_64.rpm

The new udev renaming scheme calls eth0 (the FARM network) enp0s8, and eth1 (the UNH network) enp0s9.

* IP address UNH: 132.177.88.75 (enp0s9)
* IP address Farm: 10.0.0.252 (enp0s8)
* IP address RAID: 10.0.0.152
* IP address IPMI: 10.0.0.151

== Access Configuration ==

=== /etc/security/access.conf ===

<pre>
# NPG Config:
# Allow direct root logins only from console and einstein
+ : root : LOCAL einstein.unh.edu einstein.farm.physics.unh.edu lentil.unh.edu lentil.farm.physics.unh.edu

# Allow only NPG users and administrators
- : ALL EXCEPT npg domain_admins : ALL
</pre>

== Backup Configuration ==

=== /etc/rsync-backup.conf ===

<pre>
# Backups are 'pull' only. Too bad there isn't a better way to enforce this.
read only       = yes

# Oh for the ability to retain CAP_DAC_READ_SEARCH, and no other.
#uid            = root
# XXX There seems to be an obscure bug with pam_ldap and rsync whereby
# getpwnam(3) segfaults when (and only when) archiving /etc. Using a numeric
# uid avoids this bug. Only verified on Fedora Core 2.
uid             = 0

# There's not much point in putting the superuser in a chroot jail
# use chroot    = yes

# This isn't really an effective "lock" per se, since the value is per-module,
# but there really ought never be more than one, and it would at least
# ensure serialized backups.
max connections = 1

filter  = : .rsync-filter

[usr]
        path    = /usr
        comment = unpackaged software
        filter  =               \
                : .rsync-filter \
                + /             \
                + /local        \
                - /*

[opt]
        path    = /opt
        comment = unpackaged software

[etc]
        path    = /etc
        comment = conf files

[var]
        path    = /var
        comment = user and system storage
</pre>

=Software and Services=

This section contains details about the services and software on Gourd and information about their configurations.

== Software RAID ==

Gourd has a (somewhat simpler) ARECA RAID card. The / system and /scratch are on a RAID created by the card. This was NOT chosen for /home and /mail. Instead, /home and /mail are on a software RAID, so that if these services ever need to be moved to other hardware, there are no incompatibility issues due to the RAID card.

The config can be found in /etc/mdadm.conf (least reliable!), but better from /proc/mdstat, or with "mdadm --detail --scan" and "mdadm --detail /dev/md0".
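As a quick health check of the software RAID, the state line for each md device can be pulled straight out of /proc/mdstat. A sketch; the sample below is illustrative input in /proc/mdstat format, not Gourd's live output:

```shell
# Summarize /proc/mdstat: print "<device>: <state> <raid-level>" per array.
mdstat_summary() {
  awk '/^md/ {printf "%s: %s %s\n", $1, $3, $4}' "$1"
}

# Illustrative sample in /proc/mdstat format (not live output)
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdc1[0] sdb1[1]
      488254464 blocks [2/2] [UU]
EOF

mdstat_summary /tmp/mdstat.sample   # prints: md0: active raid1
```

On Gourd itself, <code>mdstat_summary /proc/mdstat</code> gives the live view.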
 
The drives are passthrough drives on the ARECA card. The association between a passthrough drive and its device name is established by looking at the SCSI channel number (in the web interface of the ARECA card) and the Linux /sys tree:

 ls -l /sys/bus/scsi/drivers/sd/0\:0\:0\:0/
 lrwxrwxrwx 1 root root    0 May 19 21:38 block:sda -> ../../../../../../../block/sda/

for each of the 0:0:0:x SCSI channels.

*'''See more about the disks below.'''

==NFS Shares==

Gourd serves two volumes over [[NFS]].

'''Home folders''': /home on Gourd contains home directories for all npg users. The NFS share is accessible to all hosts in the servers, npg_clients and dept_clients netgroup lists, and to all 10.0.0.0/24 (server room backend) hosts.

'''Mail''': To reduce the size of the Einstein VM, the /mail directory on Gourd stores mail for all npg users. The NFS share is accessible only to Einstein and is mounted at /var/spool/mail on Einstein.

===/etc/exports===

<pre>
# Share home folders (copied from old Einstein)
/home                                         \
          @servers(rw,sync)     \
          @npg_clients(rw,sync) \
          @dept_clients(rw,sync) \
          10.0.0.0/24(rw,no_root_squash,sync)

# Share /mail with Einstein
/mail \
132.177.88.52(rw,sync) \
10.0.0.248(rw,no_root_squash,sync)
</pre>

== IPTables ==

Note that Centos 7 (i.e. RHEL 7) comes standard with "firewalld". Not wanting to bother with "yet another config system for firewalls (tm)", this was disabled in favor of good old iptables, which is the underlying system anyway. This policy may change in the future.

(To disable firewalld: "systemctl stop firewalld ; systemctl mask firewalld". To set up iptables: "yum install iptables-services ; systemctl enable iptables", and then, of course, configure the tables.)

Gourd uses the standard NPG [[iptables]] firewall (actually a copy of the endeavour version, with blacklist). Gourd allows ssh, svn, icmp, portmap and nfs.

A quick, nice tutorial on systemd and iptables is here: https://community.rackspace.com/products/f/25/t/4504

==Subversion==

Gourd hosts our subversion code repositories, stored in /home/svn/. The subversion service runs under xinetd; its configuration is located in /etc/xinetd.d/svn.

===/etc/xinetd.d/svn===

<pre>
service svn
{
        port        = 3690
        socket_type = stream
        protocol    = tcp
        wait        = no
        user        = svn
        server      = /usr/bin/svnserve
        server_args = -i -r /home/svn
        disable     = no
}
</pre>

==VMWare==

<s>Gourd is running [[VMWare]] Server version 2.0.2. It acts as our primary virtualization server. It is accessible at https://gourd.unh.edu:8333/ or from localhost:8222 if you're logged in or port forwarding from Gourd over SSH.</s>

As of 6/9/2012 Gourd no longer runs VMWare! We use KVM now.

== KVM ==

===Guest VMs on Gourd===

*[[Einstein]] - [[LDAP]] authentication and [[E-mail]]
*[[Roentgen]] - Websites and [[MySQL]]
*[[Jalapeno]] - [[DNS]] and [[Printing]]

== UPS Configuration ==

Gourd is connected to the APC SmartUPS 2200XL. It uses the apcupsd service to monitor the UPS. Use the following command (with sudo, or as root) to see a detailed list of information about the status of the UPS, including battery charge and current load.

 service apcupsd status

===Shutdown Script===

apcupsd allows us to do some other useful things; for example, it runs the script /etc/apcupsd/shutdown2, which monitors the current battery charge and shuts down certain non-critical systems at specific points to extend battery life. First, when the battery reaches 50%, it shuts down [[taro]], [[pumpkin]], [[tomato]], and [[roentgen]]. Later, when the battery is at 5%, it shuts down the remaining virtual machines, [[einstein]] and [[jalapeno]]. Shutting down einstein and jalapeno at 5% battery isn't meant to save battery; instead, it is designed so that these machines have a chance to shut down normally before the battery backup is completely exhausted. The contents of the script can be viewed [[shutdown2|here]].

====SSH Keys====

In order to issue remote shutdown commands to other machines, gourd needs to issue a command over an ssh connection without a password. It uses an rsa key for this purpose (/root/.ssh/shutdown_id_rsa), and each machine is configured to allow gourd to use this key to issue a remote shutdown command. This key can't be used for shell logins or any other commands.

The official site and manual:
*[http://apcupsd.org/ Official Site]
*[http://nuclear.unh.edu/wiki/pdfs/apcupsd-manual.pdf User Manual]

= Disks and Raid Configuration =

The disk layout is something that ''should not change'', but alas sometimes it does. Here are the investigation steps to figure out the, probably overcomplicated, disk setup.

=== Step 1: Hardware ===

'''Areca RAID Card:''' The hardware raid setup is found by visiting 10.0.0.152 (port 80) with a Chrome browser, using the admin account and the Tq....ab password.

'''Disks and Raid configuration'''
{| style="wikitable;"  border="1"
! Drive Bay !! Disk Size !! Usage !! Model
|-
| 1 || 750.2 GB || System/Scratch || WDC WD7500AAKS-00RBA0
|-
| 2 || 2 TB || Pass Through || ST2000DM001-1ER164
|-
| 3 || 1 TB || Hot Spare || ST31000340NS
|-
| 4 || 750.2 GB || Pass Through || ST3750640AS
|-
| 5 || 2 TB || Pass Through || ST2000DM001-1ER164
|-
| 6 || Empty || None || None
|-
| 7 || 500.1 GB || Pass Through || WDC WD5000AAKS-00TMA0
|-
| 8 || 750.2 GB || System/Scratch || WDC WD7500AAKS-00RBA0
|}
<br/>

'''Volume Set and Partition configuration'''
{| style="wikitable;"  border="1"
! Raid set !! Devices !! Volume set !! Volume size !! Partitions
|-
| Raid Set #000 || Slot2 || 0/0/3 || 2000.4 GB || ????
|-
| System/Scratch || Slot1 + Slot8 || 0/0/0 (System: Raid1) || 250.1 GB || ?????
|-
|  ||  || 0/0/1 (www: Raid1) || 100.0 GB || ?????
|-
|  ||  || 0/0/2 (scratch: Raid1) || 399.9 GB || ?????
|-
| Raid Set #002 || Slot 4 || 0/0/4 || 750.2 GB || ????
|-
| Raid Set #003 || Slot 5 || 0/0/5 || 2000.4 GB || ?????
|-
| Raid Set #005 || Slot 7 || 0/0/7 || 500.1 GB || ????
|}

To see how the volume set maps onto the host, you can look at "cat /proc/scsi/scsi", which will list the SCSI devices on the system. This confirms that Channel 0, Id 0, Lun 0 is indeed "System". Each LUN is mapped to a /dev/sd* device. So 0/0/0 (System, RAID1) is mapped to /dev/sda, which is divided into 4 partitions: /dev/sda1, /dev/sda2, /dev/sda3, /dev/sda4.

To confuse things further, in the case of /dev/sda, /dev/sda2 and /dev/sda4 are "lvm", so these are part of a logical volume. The advantage of logical volumes is that they can be resized and changed on a live, running system. The disadvantage is that there is quite a bit of additional complexity. More on this below.

=== Step 2: Logical Volumes and Hardware Mapping ===

In modern Linux, the hardware is often mapped, so you need to investigate which piece of hardware ends up at which device. Also, logical volumes are often used to add flexibility.
For details on device mapping see: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/device_mapper

The command "dmsetup info" will show the mapped devices. Each device shows up in /dev/mapper.

For LVM, you have several layers: the physical volumes (i.e. disks or partitions), the volume group, and the logical volumes.

Display the physical volumes in use with "pvdisplay" or "pvs". Currently Gourd has 2 physical volumes used for LVM: /dev/sda2 (30GB) and /dev/sda4 (140GB). Both belong to the same "centos" volume group.

Display the volume groups with "vgdisplay". There is one, "centos", with a VG size of 167.55 GiB.

Display the logical volumes with "lvdisplay" or "lvs". There are currently 3:

  LV   VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  root centos -wi-ao---- 50.00g
  swap centos -wi-ao---- 15.75g
  tmp  centos -wi-ao----  5.00g

Each of these logical volumes shows up in /dev/mapper as centos-root, centos-swap and centos-tmp.

See more at https://srobb.net/lvm.html and, for growing the filesystem, https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/lv_extend

= OLD DISK SETUP =

This was the post-migration Gourd disk setup (as of 03/01/10).

'''Disks and Raid configuration'''
{| style="wikitable;"  border="1"
! Drive Bay !! Disk Size !! Raid Set !! Raid level
|-
| 1 || 750 GB ||rowspan="2"| System/Scratch ||rowspan="2"| Raid1 + Raid0
|-
| 2 || 750 GB
|-
| 3 || 750 GB || Software RAID || Pass Through
|-
| 4 || 750 GB || Software RAID || Pass Through
|-
| 5 || Empty || None || None
|-
| 6 || Empty || None || None
|-
| 7 || 750 GB || Hot Swap || None
|-
| 8 || 750 GB || Hot Swap || None
|}
<br/>

'''Volume Set and Partition configuration'''
{| style="wikitable;" border="1"
! Raid set !! Volume set !! Volume size !! Partitions
|-
| Set 1 || System(Raid1) || 250 GB || System: (/, /boot, /var, /tmp, /swap, /usr, /data)
|-
| Set 2 || Scratch(Raid0) || 1000 GB || Scratch space (/scratch)
|-
| /dev/md0 &nbsp; || sdc1 & sdb1 || 500 GB || Home Dirs: /home
|-
| /dev/md1 &nbsp; || sdc2 & sdd2 || 100 GB || Mail: /mail
|-
| /dev/md2 &nbsp; || sdc3 & sdd3 || 150 GB || Virtual Machines: /vmware
|}

== SNMP Configuration ==

=== /etc/snmp/snmpd.conf ===

<pre>
###############################################################################
#
# snmpd.conf:
#   An example configuration file for configuring the ucd-snmp snmpd agent.
#
###############################################################################
#
# This file is intended to only be as a starting point.  Many more
# configuration directives exist than are mentioned in this file.  For
# full details, see the snmpd.conf(5) manual page.
#
# All lines beginning with a '#' are comments and are intended for you
# to read.  All other lines are configuration commands for the agent.

###############################################################################
# Access Control
###############################################################################

# As shipped, the snmpd demon will only respond to queries on the
# system mib group until this file is replaced or modified for
# security purposes. Examples are shown below about how to increase the
# level of access.

# By far, the most common question I get about the agent is "why won't
# it work?", when really it should be "how do I configure the agent to
# allow me to access it?"
#
# By default, the agent responds to the "public" community for read
# only access, if run out of the box without any configuration file in
# place.  The following examples show you other ways of configuring
# the agent so that you can change the community names, and give
# yourself write access to the mib tree as well.
#
# For more information, read the FAQ as well as the snmpd.conf(5)
# manual page.

####
# First, map the community name "public" into a "security name"

#       sec.name  source          community
com2sec notConfigUser  default       public

####
# Second, map the security name into a group name:

#       groupName      securityModel securityName
group   notConfigGroup v1           notConfigUser
group   notConfigGroup v2c          notConfigUser

####
# Third, create a view for us to let the group have rights to:

# Make at least  snmpwalk -v 1 localhost -c public system fast again.
#       name           incl/excl     subtree         mask(optional)
view    systemview    included   .1.3.6.1.2.1.1
view    systemview    included   .1.3.6.1.2.1.25.1.1

####
# Finally, grant the group read-only access to the systemview view.

#       group          context sec.model sec.level prefix read   write  notif
access  notConfigGroup ""      any       noauth    exact  systemview none none

# -----------------------------------------------------------------------------

# Here is a commented out example configuration that allows less
# restrictive access.

# YOU SHOULD CHANGE THE "COMMUNITY" TOKEN BELOW TO A NEW KEYWORD ONLY
# KNOWN AT YOUR SITE. YOU *MUST* CHANGE THE NETWORK TOKEN BELOW TO
# SOMETHING REFLECTING YOUR LOCAL NETWORK ADDRESS SPACE.

##       sec.name  source          community
#com2sec local     localhost       COMMUNITY
#com2sec mynetwork NETWORK/24      COMMUNITY

com2sec  local     localhost       NPG
com2sec  okra      okra.unh.edu    NPG
com2sec  farm      10.0.0.0/24     NPG

##     group.name sec.model  sec.name
#group MyRWGroup  any        local
#group MyROGroup  any        mynetwork
#
#group MyRWGroup  any        otherv3user
#...
group   MyROGroup  v1         local
group   MyROGroup  v2c        local
group   MyROGroup  v1         okra
group   MyROGroup  v2c        okra
group   MyROGroup  any        farm

##           incl/excl subtree                          mask
view   all    included  .1                               80

## -or just the mib2 tree-

#view mib2   included  .iso.org.dod.internet.mgmt.mib-2 fc

##                context sec.model sec.level prefix read   write  notif
access  MyROGroup ""      any       noauth    exact  all    none   none
#access MyRWGroup ""      any       noauth    0      all    all    all

###############################################################################
# Sample configuration to make net-snmpd RFC 1213.
# Unfortunately v1 and v2c don't allow any user based authentification, so
# opening up the default config is not an option from a security point.
#
# WARNING: If you uncomment the following lines you allow write access to your
# snmpd daemon from any source! To avoid this use different names for your
# community or split out the write access to a different community and
# restrict it to your local network.
# Also remember to comment the syslocation and syscontact parameters later as
# otherwise they are still read only (see FAQ for net-snmp).
#

# First, map the community name "public" into a "security name"
#       sec.name        source          community
#com2sec notConfigUser  default         public

# Second, map the security name into a group name:
#       groupName       securityModel   securityName
#group  notConfigGroup  v1              notConfigUser
#group  notConfigGroup  v2c             notConfigUser

# Third, create a view for us to let the group have rights to:
# Open up the whole tree for ro, make the RFC 1213 required ones rw.
#       name            incl/excl       subtree mask(optional)
#view    roview          included        .1
#view    rwview          included        system.sysContact
#view    rwview          included        system.sysName
#view    rwview          included        system.sysLocation
#view    rwview          included        interfaces.ifTable.ifEntry.ifAdminStatus
#view    rwview          included        at.atTable.atEntry.atPhysAddress
#view    rwview          included        at.atTable.atEntry.atNetAddress
#view    rwview          included        ip.ipForwarding
#view    rwview          included        ip.ipDefaultTTL
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteDest
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteIfIndex
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMetric1
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMetric2
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMetric3
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMetric4
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteType
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteAge
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMask
#view    rwview          included        ip.ipRouteTable.ipRouteEntry.ipRouteMetric5
#view    rwview          included        ip.ipNetToMediaTable.ipNetToMediaEntry.ipNetToMediaIfIndex
#view    rwview          included        ip.ipNetToMediaTable.ipNetToMediaEntry.ipNetToMediaPhysAddress
#view    rwview          included        ip.ipNetToMediaTable.ipNetToMediaEntry.ipNetToMediaNetAddress
#view    rwview          included        ip.ipNetToMediaTable.ipNetToMediaEntry.ipNetToMediaType
#view    rwview          included        tcp.tcpConnTable.tcpConnEntry.tcpConnState
#view    rwview          included        egp.egpNeighTable.egpNeighEntry.egpNeighEventTrigger
#view    rwview          included        snmp.snmpEnableAuthenTraps

# Finally, grant the group read-only access to the systemview view.
#       group           context sec.model sec.level prefix read   write notif
#access  notConfigGroup ""      any       noauth    exact  roview rwview none

###############################################################################
# System contact information
#

# It is also possible to set the sysContact and sysLocation system
# variables through the snmpd.conf file:

syslocation Durham, NH, USA, University of New Hampshire, DeMeritt Hall
syscontact NPG Admins <npg-admins@einstein.unh.edu>

# Example output of snmpwalk:
#   % snmpwalk -v 1 localhost -c public system
#   system.sysDescr.0 = "SunOS name sun4c"
#   system.sysObjectID.0 = OID: enterprises.ucdavis.ucdSnmpAgent.sunos4
#   system.sysUpTime.0 = Timeticks: (595637548) 68 days, 22:32:55
#   system.sysContact.0 = "Me <me@somewhere.org>"
#   system.sysName.0 = "name"
#   system.sysLocation.0 = "Right here, right now."
#   system.sysServices.0 = 72

# -----------------------------------------------------------------------------

###############################################################################
# Process checks.
#
# The following are examples of how to use the agent to check for
# processes running on the host. The syntax looks something like:
#
#   proc NAME [MAX=0] [MIN=0]
#
#   NAME:  the name of the process to check for. It must match
#          exactly (ie, http will not find httpd processes).
#   MAX:   the maximum number allowed to be running.  Defaults to 0.
#   MIN:   the minimum number to be running.  Defaults to 0.
#
# Examples (commented out by default):
#

#   Make sure mountd is running
#proc mountd

#   Make sure there are no more than 4 ntalkds running, but 0 is ok too.
#proc ntalkd 4

#   Make sure at least one sendmail, but less than or equal to 10 are running.
#proc sendmail 10 1

#   A snmpwalk of the process mib tree would look something like this:
#
# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.2
# enterprises.ucdavis.procTable.prEntry.prIndex.1 = 1
# enterprises.ucdavis.procTable.prEntry.prIndex.2 = 2
# enterprises.ucdavis.procTable.prEntry.prIndex.3 = 3
# enterprises.ucdavis.procTable.prEntry.prNames.1 = "mountd"
# enterprises.ucdavis.procTable.prEntry.prNames.2 = "ntalkd"
# enterprises.ucdavis.procTable.prEntry.prNames.3 = "sendmail"
# enterprises.ucdavis.procTable.prEntry.prMin.1 = 0
# enterprises.ucdavis.procTable.prEntry.prMin.2 = 0
# enterprises.ucdavis.procTable.prEntry.prMin.3 = 1
# enterprises.ucdavis.procTable.prEntry.prMax.1 = 0
# enterprises.ucdavis.procTable.prEntry.prMax.2 = 4
# enterprises.ucdavis.procTable.prEntry.prMax.3 = 10
# enterprises.ucdavis.procTable.prEntry.prCount.1 = 0
# enterprises.ucdavis.procTable.prEntry.prCount.2 = 0
# enterprises.ucdavis.procTable.prEntry.prCount.3 = 1
# enterprises.ucdavis.procTable.prEntry.prErrorFlag.1 = 1
# enterprises.ucdavis.procTable.prEntry.prErrorFlag.2 = 0
# enterprises.ucdavis.procTable.prEntry.prErrorFlag.3 = 0
# enterprises.ucdavis.procTable.prEntry.prErrMessage.1 = "No mountd process running."
# enterprises.ucdavis.procTable.prEntry.prErrMessage.2 = ""
# enterprises.ucdavis.procTable.prEntry.prErrMessage.3 = ""
# enterprises.ucdavis.procTable.prEntry.prErrFix.1 = 0
# enterprises.ucdavis.procTable.prEntry.prErrFix.2 = 0
# enterprises.ucdavis.procTable.prEntry.prErrFix.3 = 0
#
#   Note that the errorFlag for mountd is set to 1 because one is not
#   running (in this case an rpc.mountd is, but thats not good enough),
#   and the ErrMessage tells you what's wrong.  The configuration
#   imposed in the snmpd.conf file is also shown.
#
#   Special Case:  When the min and max numbers are both 0, it assumes
#   you want a max of infinity and a min of 1.
#

# -----------------------------------------------------------------------------

###############################################################################
# Executables/scripts
#

#
#   You can also have programs run by the agent that return a single
#   line of output and an exit code.  Here are two examples.
#
#   exec NAME PROGRAM [ARGS ...]
#
#   NAME:     A generic name.
#   PROGRAM:  The program to run.  Include the path!
#   ARGS:     optional arguments to be passed to the program

# a simple hello world
#exec echotest /bin/echo hello world

# Run a shell script containing:
#
# #!/bin/sh
# echo hello world
# echo hi there
# exit 35
#
# Note:  this has been specifically commented out to prevent
# accidental security holes due to someone else on your system writing
# a /tmp/shtest before you do.  Uncomment to use it.
#
#exec shelltest /bin/sh /tmp/shtest

# Then,
# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.8
# enterprises.ucdavis.extTable.extEntry.extIndex.1 = 1
# enterprises.ucdavis.extTable.extEntry.extIndex.2 = 2
# enterprises.ucdavis.extTable.extEntry.extNames.1 = "echotest"
# enterprises.ucdavis.extTable.extEntry.extNames.2 = "shelltest"
# enterprises.ucdavis.extTable.extEntry.extCommand.1 = "/bin/echo hello world"
# enterprises.ucdavis.extTable.extEntry.extCommand.2 = "/bin/sh /tmp/shtest"
# enterprises.ucdavis.extTable.extEntry.extResult.1 = 0
# enterprises.ucdavis.extTable.extEntry.extResult.2 = 35
# enterprises.ucdavis.extTable.extEntry.extOutput.1 = "hello world."
# enterprises.ucdavis.extTable.extEntry.extOutput.2 = "hello world."
# enterprises.ucdavis.extTable.extEntry.extErrFix.1 = 0
# enterprises.ucdavis.extTable.extEntry.extErrFix.2 = 0

# Note that the second line of the /tmp/shtest shell script is cut
# off.  Also note that the exit status of 35 was returned.

# -----------------------------------------------------------------------------

###############################################################################
# disk checks
#

# The agent can check the amount of available disk space, and make
# sure it is above a set limit.

# disk PATH [MIN=100000]
#
# PATH:  mount path to the disk in question.
# MIN:   Disks with space below this value will have the Mib's errorFlag set.
#        Default value = 100000.

# Check the / partition and make sure it contains at least 10 megs.

disk /
disk /data

# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.9
# enterprises.ucdavis.diskTable.dskEntry.diskIndex.1 = 0
# enterprises.ucdavis.diskTable.dskEntry.diskPath.1 = "/" Hex: 2F
# enterprises.ucdavis.diskTable.dskEntry.diskDevice.1 = "/dev/dsk/c201d6s0"
# enterprises.ucdavis.diskTable.dskEntry.diskMinimum.1 = 10000
# enterprises.ucdavis.diskTable.dskEntry.diskTotal.1 = 837130
# enterprises.ucdavis.diskTable.dskEntry.diskAvail.1 = 316325
# enterprises.ucdavis.diskTable.dskEntry.diskUsed.1 = 437092
# enterprises.ucdavis.diskTable.dskEntry.diskPercent.1 = 58
# enterprises.ucdavis.diskTable.dskEntry.diskErrorFlag.1 = 0
# enterprises.ucdavis.diskTable.dskEntry.diskErrorMsg.1 = ""

# -----------------------------------------------------------------------------

###############################################################################
# load average checks
#

# load [1MAX=12.0] [5MAX=12.0] [15MAX=12.0]
#
# 1MAX:   If the 1 minute load average is above this limit at query
#         time, the errorFlag will be set.
# 5MAX:   Similar, but for 5 min average.
# 15MAX:  Similar, but for 15 min average.

# Check for loads:
#load 12 14 14

# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.10
# enterprises.ucdavis.loadTable.laEntry.loadaveIndex.1 = 1
# enterprises.ucdavis.loadTable.laEntry.loadaveIndex.2 = 2
# enterprises.ucdavis.loadTable.laEntry.loadaveIndex.3 = 3
# enterprises.ucdavis.loadTable.laEntry.loadaveNames.1 = "Load-1"
# enterprises.ucdavis.loadTable.laEntry.loadaveNames.2 = "Load-5"
# enterprises.ucdavis.loadTable.laEntry.loadaveNames.3 = "Load-15"
 
# enterprises.ucdavis.loadTable.laEntry.loadaveLoad.1 = "0.49" Hex: 30 2E 34 39
 
# enterprises.ucdavis.loadTable.laEntry.loadaveLoad.2 = "0.31" Hex: 30 2E 33 31
 
# enterprises.ucdavis.loadTable.laEntry.loadaveLoad.3 = "0.26" Hex: 30 2E 32 36
 
# enterprises.ucdavis.loadTable.laEntry.loadaveConfig.1 = "12.00"
 
# enterprises.ucdavis.loadTable.laEntry.loadaveConfig.2 = "14.00"
 
# enterprises.ucdavis.loadTable.laEntry.loadaveConfig.3 = "14.00"
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrorFlag.1 = 0
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrorFlag.2 = 0
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrorFlag.3 = 0
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrMessage.1 = ""
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrMessage.2 = ""
 
# enterprises.ucdavis.loadTable.laEntry.loadaveErrMessage.3 = ""
 
 
 
# -----------------------------------------------------------------------------
 
 
 
 
 
###############################################################################
 
# Extensible sections.
 
#
 
 
 
# This alleviates the multiple line output problem found in the
 
# previous executable mib by placing each mib in its own mib table:
 
 
 
# Run a shell script containing:
 
#
 
# #!/bin/sh
 
# echo hello world
 
# echo hi there
 
# exit 35
 
#
 
# Note:  this has been specifically commented out to prevent
 
# accidental security holes due to someone else on your system writing
 
# a /tmp/shtest before you do.  Uncomment to use it.
 
#
 
# exec .1.3.6.1.4.1.2021.50 shelltest /bin/sh /tmp/shtest
 
 
 
# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.50
 
# enterprises.ucdavis.50.1.1 = 1
 
# enterprises.ucdavis.50.2.1 = "shelltest"
 
# enterprises.ucdavis.50.3.1 = "/bin/sh /tmp/shtest"
 
# enterprises.ucdavis.50.100.1 = 35
 
# enterprises.ucdavis.50.101.1 = "hello world."
 
# enterprises.ucdavis.50.101.2 = "hi there."
 
# enterprises.ucdavis.50.102.1 = 0
 
 
 
# Now the Output has grown to two lines, and we can see the 'hi
 
# there.' output as the second line from our shell script.
 
#
 
# Note that you must alter the mib.txt file to be correct if you want
 
# the .50.* outputs above to change to reasonable text descriptions.
 
 
 
# Other ideas:
 
#
 
# exec .1.3.6.1.4.1.2021.51 ps /bin/ps
 
# exec .1.3.6.1.4.1.2021.52 top /usr/local/bin/top
 
# exec .1.3.6.1.4.1.2021.53 mailq /usr/bin/mailq
 
 
 
# -----------------------------------------------------------------------------
 
 
 
 
 
###############################################################################
 
# Pass through control.
 
#
 
 
 
# Usage:
 
#  pass MIBOID EXEC-COMMAND
 
#
 
# This will pass total control of the mib underneath the MIBOID
 
# portion of the mib to the EXEC-COMMAND. 
 
#
 
# Note:  You'll have to change the path of the passtest script to your
 
# source directory or install it in the given location.
 
#
 
# Example:  (see the script for details)
 
#          (commented out here since it requires that you place the
 
#          script in the right location. (its not installed by default))
 
 
 
# pass .1.3.6.1.4.1.2021.255 /bin/sh /usr/local/local/passtest
 
 
 
# % snmpwalk -v 1 localhost -c public .1.3.6.1.4.1.2021.255
 
# enterprises.ucdavis.255.1 = "life the universe and everything"
 
# enterprises.ucdavis.255.2.1 = 42
 
# enterprises.ucdavis.255.2.2 = OID: 42.42.42
 
# enterprises.ucdavis.255.3 = Timeticks: (363136200) 42 days, 0:42:42
 
# enterprises.ucdavis.255.4 = IpAddress: 127.0.0.1
 
# enterprises.ucdavis.255.5 = 42
 
# enterprises.ucdavis.255.6 = Gauge: 42
 
#
 
# % snmpget -v 1 localhost public .1.3.6.1.4.1.2021.255.5
 
# enterprises.ucdavis.255.5 = 42
 
#
 
# % snmpset -v 1 localhost public .1.3.6.1.4.1.2021.255.1 s "New string"
 
# enterprises.ucdavis.255.1 = "New string"
 
#
 
 
 
# For specific usage information, see the man/snmpd.conf.5 manual page
 
# as well as the local/passtest script used in the above example.
 
 
 
# Added for support of bcm5820 cards.
 
pass .1.3.6.1.4.1.4413.4.1 /usr/bin/ucd5820stat
 
 
 
###############################################################################
 
# Further Information
 
#
 
#  See the snmpd.conf manual page, and the output of "snmpd -H".
 
</pre>
 

Latest revision as of 18:11, 9 July 2020

Gourd is a server with two quad-core CPUs in a 2U rackmount chassis, put together nicely by Microway. It arrived at UNH on 11/24/2009. The system has an Areca RAID card with an ethernet port and an IPMI card with an ethernet port. The motherboard is from Super Micro.

This is the page for the new Gourd Hardware. The old Gourd is described here.

Gourd now hosts Einstein as a VM. The previous Einstein hardware is described here.

[Image: gourds.jpg (Gourds)]

Hardware Details

  • Microway 2U Storage Chassis with 8 SAS/SATA-II Hot-Swap Drive Bays with SAS Backplane
  • 500W Redundant Hot-Swap Power Supply (2)
  • Navion-S Dual Opteron PCI-Express Motherboard (H8DME-2);
  • Two AMD "Shanghai" 2.4 GHz Quad Core Opteron 2378 (Socket F) Processors
  • (8) 4GB DDR2 800 MHz ECC/Registered Memory (32GB Total Memory @ 533MHz)
  • Dual Integrated Gigabit Ethernet Ports ("Nvidia" ethernet);
  • Integrated ATI ES1000 Graphics;
  • Six Integrated SATA-II Ports;
  • Two x8 PCI Express slots;
  • Two 64-bit 133/100MHz PCI-X slots;
  • Two 64-bit 100MHz PCI-X slots;
  • Up to 6 USB 1.1/2.0 ports (2 on rear);
  • PS/2 Mouse & Keyboard connectors;
  • Areca ARC-1220 8-Port SATA II Raid (256MB) (reports as 1680) - Low Profile PCI Express x8 at 10.0.0.152
  • SuperMicro AOC-SIM1U IPMI card.
  • Sony Slim CD/DVD 24x Black (IDE)

Setup

Details of upgrading Gourd to CentOS 7

Network Settings

In CentOS 7 there is an issue with the driver for the Nvidia ethernet, which was previously provided by the forcedeth driver; forcedeth is disabled in the CentOS 7 kernel. You can use the kmod-forcedeth driver from elrepo.org:

http://elrepo.org/linux/elrepo/el7/x86_64/RPMS/kmod-forcedeth-0.64-1.el7.elrepo.x86_64.rpm
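A sketch of installing that package (assuming the RPM URL above is still valid; yum accepts a URL directly):

```
yum install http://elrepo.org/linux/elrepo/el7/x86_64/RPMS/kmod-forcedeth-0.64-1.el7.elrepo.x86_64.rpm
modprobe forcedeth   # load the driver without a reboot
```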

Under the new udev renaming scheme, eth0 (the FARM network) becomes enp0s8, and eth1 (the UNH network) becomes enp0s9:

  • IP address UNH: 132.177.88.75 (enp0s9)
  • IP address Farm: 10.0.0.252 (enp0s8)
  • IP address RAID: 10.0.0.152
  • IP address IPMI: 10.0.0.151
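To verify that the expected interface names actually exist on the host, the kernel's view can be listed directly (a read-only check; enp0s8/enp0s9 are the names assumed above):

```shell
# Every network interface the kernel knows about appears as a
# directory here; on Gourd this should include enp0s8 and enp0s9.
ls /sys/class/net
```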

Software and Services

This section contains details about the services and software on Gourd and information about their configurations.

Software RAID

Gourd has a (somewhat simpler) ARECA RAID card. The / system and /scratch are on a RAID created by the card. This was NOT chosen for /home and /mail; instead, /home and /mail are on a software RAID, so that if these services ever need to be moved to other hardware, there are no incompatibility issues due to the RAID card.

The configuration can be found in /etc/mdadm.conf (least reliable!), but more reliably from /proc/mdstat, or from mdadm --detail --scan and mdadm --detail /dev/md0.
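The inspection commands above, collected for convenience (all read-only; run as root on Gourd):

```
cat /proc/mdstat          # live kernel view of the md arrays (most reliable)
mdadm --detail --scan     # one line per array, in mdadm.conf format
mdadm --detail /dev/md0   # member disks and state for a single array
```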

The drives are passthrough drives on the ARECA card. The association between passthrough drive and device name is established by looking at the SCSI channel number (web interface of the ARECA card) and the Linux /sys:

ls -l /sys/bus/scsi/drivers/sd/0\:0\:0\:0/
lrwxrwxrwx 1 root root    0 May 19 21:38 block:sda -> ../../../../../../../block/sda/

for each of the 0:0:0:x scsi channels.

  • See More about the disks below

NFS Shares

Gourd serves two volumes over NFS:

Home folders: /home on Gourd contains home directories for all npg users. The NFS share is accessible to all hosts in the servers, npg_clients, and dept_clients netgroups, and to all 10.0.0.0/24 (server room backend) hosts.

Mail: To reduce the size of the Einstein VM, the /mail directory on Gourd stores mail for all npg users. The NFS share is accessible only to Einstein and is mounted at /var/spool/mail on Einstein.

/etc/exports

# Share home folders (copied from old Einstein)

/home                                         \
         @servers(rw,sync)   \
         @npg_clients(rw,sync) \
         @dept_clients(rw,sync) \
         10.0.0.0/24(rw,no_root_squash,sync)

# Share /mail with Einstein 

/mail						\
	132.177.88.52(rw,sync)	\
	10.0.0.248(rw,no_root_squash,sync)
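After editing /etc/exports, the entries can be applied and checked without restarting NFS (standard nfs-utils commands, shown here as a sketch):

```
exportfs -ra              # re-export everything listed in /etc/exports
exportfs -v               # show active exports with their options
showmount -e localhost    # what clients will see when they query Gourd
```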


IPTables

Note that CentOS 7 (i.e. RHEL 7) comes standard with "firewalld". Not wanting to bother with "yet another config system for firewalls (tm)", this was disabled in favor of good old iptables, which is the underlying system anyway. This policy may change in the future. (To disable firewalld: "systemctl stop firewalld; systemctl mask firewalld". To set up iptables: "yum install iptables-services; systemctl enable iptables", and then, of course, configure the tables.) Gourd uses the standard NPG iptables firewall (actually, a copy of the endeavour version, with blacklist). Gourd allows ssh, svn, icmp, portmap and nfs.
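The steps described above, collected in order (the final "start" is implied by the text rather than stated in it):

```
systemctl stop firewalld
systemctl mask firewalld
yum install iptables-services
systemctl enable iptables
systemctl start iptables     # then configure /etc/sysconfig/iptables
```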

A quick, nice tutorial on systemd iptables is here: https://community.rackspace.com/products/f/25/t/4504

Subversion

Gourd hosts our subversion code repositories, stored in /home/svn/. The subversion service runs under xinetd. Its configuration is located in /etc/xinetd.d/svn

/etc/xinetd.d/svn

service svn
{ 
	port        = 3690
	socket_type = stream
	protocol    = tcp
	wait        = no
	user        = svn
	server      = /usr/bin/svnserve
	server_args = -i -r /home/svn
	disable     = no
}
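Since svnserve is started with -r /home/svn, repositories are addressed relative to that root. A hypothetical checkout (the repository name "myrepo" is only an example):

```
svn checkout svn://gourd.unh.edu/myrepo myrepo
```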

VMWare

Gourd is running VMWare Server version 2.0.2. It acts as our primary virtualization server. It is accessible at https://gourd.unh.edu:8333/ or from localhost:8222 if you're logged in or port forwarding from Gourd over SSH.

As of 6/9/2012 Gourd no longer runs VMWare!!!! We use KVM now.

KVM

Guest VMs on Gourd

UPS Configuration

Gourd is connected to the APC SmartUPS 2200XL. It uses the apcupsd service to monitor the UPS. Use the following command (with sudo, or as root) to see a detailed list of information about the status of the UPS, including battery charge and current load.

service apcupsd status

Shutdown Script

apcupsd allows us to do some other useful things, for example it runs the script /etc/apcupsd/shutdown2 which monitors the current battery charge and shuts down certain non-critical systems at specific points to extend battery life. First, when the battery reaches 50% it shuts down taro, pumpkin, tomato, and roentgen. Later, when battery is at 5% it shuts down the remaining virtual machines, einstein and jalapeno. Shutting down einstein and jalapeno at 5% battery isn't meant to save battery, instead this is designed so that these machines have a chance to shut down normally before the battery backup is completely exhausted. The contents of the script can be viewed here.
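The threshold logic described above can be sketched as a small shell function (a hedged sketch only; the real /etc/apcupsd/shutdown2 may differ in detail, and in production the charge would come from the BCHARGE field of `apcaccess status`):

```shell
#!/bin/sh
# Sketch of the staged-shutdown thresholds described above.
stage_for_charge() {
    charge=$1
    if [ "$charge" -le 5 ]; then
        # Last chance: shut down the remaining VMs (einstein, jalapeno)
        # while there is still power for a clean shutdown.
        echo "shutdown-vms"
    elif [ "$charge" -le 50 ]; then
        # Non-critical systems: taro, pumpkin, tomato, roentgen.
        echo "shutdown-noncritical"
    else
        echo "none"
    fi
}

stage_for_charge 75   # battery still healthy
stage_for_charge 45   # below 50%: drop non-critical hosts
stage_for_charge 3    # nearly exhausted: stop the VMs
```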

SSH Keys

In order to issue remote shutdown commands to other machines gourd needs to issue a command over an ssh connection without a password. It uses an rsa key for this purpose (/root/.ssh/shutdown_id_rsa) and each machine is configured to allow gourd to use this key to issue a remote shutdown command. This key can't be used for shell logins or any other commands.
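A key restricted to a single command is done with an OpenSSH forced-command entry in the target machine's /root/.ssh/authorized_keys. The line below is a hypothetical illustration of the pattern (the exact options and command on the NPG machines may differ; the key material is elided):

```
command="/sbin/shutdown -h now",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA... root@gourd
```

Whatever command the client tries to run, sshd executes only the forced command, which is what makes the key useless for shell logins.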

The official Site and Manual

 Official Site
 User Manual

Disks and Raid Configuration

The disk setup is something that should not change, but alas, sometimes it does. Here are the investigation steps to figure out the (probably overcomplicated) disk setup.

Step 1: Hardware

Areca RAID Card: The hardware RAID setup can be inspected by visiting 10.0.0.152 (port 80) with a Chrome browser, using the admin user and the Tq....ab password.

Disks and Raid configuration

Drive Bay   Disk Size   Usage            Model
1           750.2 GB    System/Scratch   WDC WD7500AAKS-00RBA0
2           2 TB        Pass Through     ST2000DM001-1ER164
3           1 TB        Hot Spare        ST31000340NS
4           750.2 GB    Pass Through     ST3750640AS
5           2 TB        Pass Through     ST2000DM001-1ER164
6           Empty       None             None
7           500.1 GB    Pass Through     WDC WD5000AAKS-00TMA0
8           750.2 GB    System/Scratch   WDC WD7500AAKS-00RBA0


Volume Set and Partition configuration

Raid set         Devices           Volume set               Volume size   Partitions
Raid Set #000    Slot 2            0/0/3                    2000.4 GB     ????
System/Scratch   Slot 1 + Slot 8   0/0/0 (System: Raid1)    250.1 GB      ?????
                                   0/0/1 (www: Raid1)       100.0 GB      ?????
                                   0/0/2 (scratch: Raid1)   399.9 GB      ?????
Raid Set #002    Slot 4            0/0/4                    750.2 GB      ????
Raid Set #003    Slot 5            0/0/5                    2000.4 GB     ?????
Raid Set #005    Slot 7            0/0/7                    500.1 GB      ????

To see how the volume sets map on the host, you can look at "cat /proc/scsi/scsi", which will list the SCSI devices on the system. This confirms that Channel 0, Id: 0, Lun: 0 is indeed "System". Each LUN is mapped to a /dev/sd* device. So 0/0/0 (System, RAID1) is mapped to /dev/sda, which is divided into four partitions: /dev/sda1, /dev/sda2, /dev/sda3, /dev/sda4.

To confuse things further, in the case of /dev/sda, /dev/sda2 and /dev/sda4 are "lvm", so these are part of a logical volume. The advantage of logical volumes is that they can be resized and changed on a live, running system. The disadvantage is that there is quite a bit of additional complexity. More on this below.


Step 2: Logical Volumes and Hardware Mapping

In modern Linux, the hardware is often mapped, so you need to investigate which piece of hardware ends up as which device. Logical volumes are also often used to add flexibility. For details on device mapping see: (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/device_mapper)

The command: "dmsetup info" will show the mapped devices. Each device shows up in /dev/mapper.

For LVM, you have several layers: the physical volumes (i.e. disks or partitions), the volume group, and the logical volumes.

Display the physical volumes in use with "pvdisplay" or "pvs". Currently Gourd has 2 physical volumes used for LVM: /dev/sda2 (30GB) and /dev/sda4 (140GB). Both belong to the same "centos" volume group.

Display the volume groups with "vgdisplay". There is one "centos", with a VGsize of 167.55 GiB.

Display the logical volumes with "lvdisplay" or "lvs". There are currently 3:

   LV   VG     Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
   root centos -wi-ao---- 50.00g                                                    
   swap centos -wi-ao---- 15.75g                                                    
   tmp  centos -wi-ao----  5.00g

Each of these logical volumes shows up in /dev/mapper as centos-root, centos-swap, and centos-tmp.

See more at: https://srobb.net/lvm.html and, for growing the filesystem: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/logical_volume_manager_administration/lv_extend
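A hypothetical example of the grow operation those links describe, assuming the volume group has free extents (the size and target LV here are illustrative only):

```
lvextend -L +5G /dev/centos/tmp   # grow the "tmp" logical volume by 5 GiB
xfs_growfs /tmp                   # CentOS 7 defaults to XFS; use resize2fs for ext4
```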

OLD DISK SETUP

This was the post-migration Gourd disk setup (as of 03/01/10).

Disks and Raid configuration

Drive Bay   Disk Size   Raid Set         Raid level
1           750 GB      System/Scratch   Raid1 + Raid0
2           750 GB      System/Scratch   Raid1 + Raid0
3           750 GB      Software RAID    Pass Through
4           750 GB      Software RAID    Pass Through
5           Empty       None             None
6           Empty       None             None
7           750 GB      Hot Swap         None
8           750 GB      Hot Swap         None


Volume Set and Partition configuration

Raid set   Volume set       Volume size   Partitions
Set 1      System(Raid1)    250 GB        System: (/, /boot, /var, /tmp, /swap, /usr, /data)
Set 2      Scratch(Raid0)   1000 GB       Scratch space (/scratch)
/dev/md0   sdc1 & sdb1      500 GB        Home Dirs: /home
/dev/md1   sdc2 & sdd2      100 GB        Mail: /mail
/dev/md2   sdc3 & sdd3      150 GB        Virtual Machines: /vmware