Difference between revisions of "Lentil"

From Nuclear Physics Group Documentation Pages
m (Drive letter fix.)
 
(21 intermediate revisions by 4 users not shown)
Line 1: Line 1:
 
== General Information ==
 
== General Information ==
Lentil performs [[backups]]. Its backup script needs further investigation to determine exactly how it works.
+
Lentil performs [[backups]]. Its autofs is configured to mount hard drives labeled npg-daily-XX onto /mnt/npg-daily/XX, where XX is a label number. Its backup script needs further investigation to determine exactly how it works.
== Network Configuration ==
 
Hostnames: lentil.unh.edu, lentil.farm.physics.unh.edu
 
  
Currently connected to the unh and farm networks via the switch and VLAN.
+
As of Feb. 15, 2015, Lentil has the following drives mounted, which should last us a while:
=== ifconfig ===
+
*/dev/sdd1 mounted at /mnt/npg-daily/51; A filled 2TB hard drive
<pre>
+
*/dev/sda1 mounted at /mnt/npg-daily/52; A 4TB hard drive
eth0      Link encap:Ethernet  HWaddr 00:23:54:BC:70:F1 
+
*/dev/sdc1 mounted at /mnt/npg-daily/53; A 4TB hard drive
          inet addr:10.0.0.250  Bcast:10.0.0.255  Mask:255.255.255.0
+
*/dev/sdb1 as the root file system. Don't hot swap this.
          inet6 addr: fe80::223:54ff:febc:70f1/64 Scope:Link
 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 
          RX packets:30404 errors:0 dropped:0 overruns:0 frame:0
 
          TX packets:1778 errors:0 dropped:0 overruns:0 carrier:0
 
          collisions:0 txqueuelen:1000
 
          RX bytes:3723593 (3.5 MiB)  TX bytes:227139 (221.8 KiB)
 
          Interrupt:66 Base address:0x6000
 
  
eth0.2    Link encap:Ethernet  HWaddr 00:23:54:BC:70:F1 
+
== Hardware Information ==
          inet addr:132.177.88.254  Bcast:132.177.91.255  Mask:255.255.252.0
+
  Motherboard: Asus P5QL-CM
          inet6 addr: fe80::223:54ff:febc:70f1/64 Scope:Link
+
    Specifications: [http://nuclear.unh.edu/wiki/pdfs/motherboards/16744.pdf Specifications]
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
+
    User Manual: [http://nuclear.unh.edu/wiki/pdfs/motherboards/E4411_P5QL-CM V2.pdf Users Manual]
          RX packets:28826 errors:0 dropped:0 overruns:0 frame:0
+
  Memory: 2 GB DDR2
          TX packets:1032 errors:0 dropped:0 overruns:0 carrier:0
 
          collisions:0 txqueuelen:0
 
          RX bytes:3068357 (2.9 MiB)  TX bytes:114723 (112.0 KiB)
 
  
lo        Link encap:Local Loopback 
 
          inet addr:127.0.0.1  Mask:255.0.0.0
 
          inet6 addr: ::1/128 Scope:Host
 
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
 
          RX packets:99 errors:0 dropped:0 overruns:0 frame:0
 
          TX packets:99 errors:0 dropped:0 overruns:0 carrier:0
 
          collisions:0 txqueuelen:0
 
          RX bytes:9335 (9.1 KiB)  TX bytes:9335 (9.1 KiB)
 
  
virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00 
+
== Authentication ==
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
+
Lentil authenticates against the LDAP server running on Einstein, by connecting to einstein.farm.physics.unh.edu using sssd.
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
+
Previously, Lentil went over the UNH network to einstein.unh.edu, but this is blocked (probably by iptables). The farm network is the better choice anyhow.
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
 
          TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
 
          collisions:0 txqueuelen:0
 
          RX bytes:0 (0.0 b)  TX bytes:10754 (10.5 KiB)
 
</pre>
 
=== /etc/sysconfig/network-scripts/ifcfg-farm ===
 
<pre>
 
# Intel Corporation 82541GI Gigabit Ethernet Controller
 
DEVICE=eth0
 
BOOTPROTO=none
 
#HWADDR=00:0E:0C:4C:E1:52
 
ONBOOT=yes
 
DHCP_HOSTNAME=lentil.unh.edu
 
IPADDR=10.0.0.250
 
NETMASK=255.255.255.0
 
TYPE=Ethernet
 
USERCTL=no
 
IPV6INIT=no
 
PEERDNS=yes
 
</pre>
 
  
=== /etc/sysconfig/network-scripts/ifcfg-unh ===
+
== Network Configuration ==
<pre>
+
Lentil is currently connected to the farm network via our switch and to the UNH network via a direct port.
VLAN=yes
+
Note: Previously, Lentil went through the switch and a VLAN. New network policy at UNH no longer allows this.
# Please read /usr/share/doc/initscripts-*/sysconfig.txt
 
# for the documentation of these parameters.
 
GATEWAY=132.177.88.1
 
TYPE=Ethernet
 
DEVICE=eth0.2
 
#HWADDR=00:0e:0c:4c:e1:52
 
BOOTPROTO=none
 
NETMASK=255.255.252.0
 
IPADDR=132.177.88.254
 
ONBOOT=yes
 
USERCTL=no
 
IPV6INIT=no
 
PEERDNS=yes
 
  
</pre>
 
 
=== /etc/sysconfig/network-scripts/ifcfg-lo ===
 
 
<pre>
 
<pre>
DEVICE=lo
+
eth0
IPADDR=127.0.0.1
+
  Hostname: lentil.farm.physics.unh.edu
NETMASK=255.0.0.0
+
  HWaddr 00:30:48:63:BB:40
NETWORK=127.0.0.0
+
  IP:10.0.0.250
# If you're having problems with gated making 127.0.0.0/8 a martian,
+
eth1
# you can change this to something else (255.255.255.255, for example)
+
  Hostname: lentil.unh.edu
BROADCAST=127.255.255.255
+
  HWaddr 00:30:48:63:BB:41
ONBOOT=yes
+
  IP:132.177.88.254
NAME=loopback
 
 
</pre>
 
</pre>
  
Line 99: Line 41:
 
===Location of Backups ===
 
===Location of Backups ===
 
<pre>
 
<pre>
/mnt/npg-daily/34/
+
/mnt/npg-daily-current
 +
/mnt/npg-daily/xx/
 
</pre>
 
</pre>
 
All backup related scripts are:
 
All backup related scripts are:
 
   /etc/auto.npg-daily
 
   /etc/auto.npg-daily
   /usr/local/bin/rsync_backup.pl
+
   /usr/local/bin/rsync_backup.py
 
   /etc/cron.daily/0rsync_backup
 
   /etc/cron.daily/0rsync_backup
 
   /usr/sbin/vgcfgbackup
 
   /usr/sbin/vgcfgbackup
 
   /etc/rsync-backup.conf
 
   /etc/rsync-backup.conf
 
=== /etc/rsync-backup.conf ===
 
<pre>
 
# Backups are 'pull' only.  Too bad there isn't a better way to enforce this.
 
read only      = yes
 
 
# Oh for the ability to retain CAP_DAC_READ_SEARCH, and no other. 
 
#uid            = root
 
# XXX There seems to be an obscure bug with pam_ldap and rsync whereby
 
# getpwnam(3) segfaults when (and only when) archiving /etc.  Using a numeric
 
# uid avoids this bug.  Only verified on Fedora Core 2.
 
uid            = 0
 
 
# There's not much point in putting the superuser in a chroot jail
 
# use chroot    = yes
 
 
# This isn't really an effective "lock" per se, since the value is per-module,
 
# but there really ought never be more than one, and it would at least
 
# ensure serialized backups.
 
max connections = 1
 
 
[usr_local]
 
        path    = /usr/local
 
        comment = unpackaged software
 
 
[opt]
 
        path    = /opt
 
        comment = unpackaged software
 
 
[etc]
 
        path    = /etc
 
        comment = conf files
 
 
[var]
 
        path    = /var
 
        comment = user and system storage
 
</pre>
 
  
 
== SNMP Configuration ==
 
== SNMP Configuration ==
Line 149: Line 55:
 
   Copied from [[Pepper]].
 
   Copied from [[Pepper]].
 
== Smartd Configuration ==
 
== Smartd Configuration ==
=== /etc/smartd.conf ===
+
The configuration file is at /etc/smartd.conf.  The smartd.conf does a silent check, which only emails reports if the SMART health status comes back as failed.  This smartd.conf looks different from those on many of the other computers because Lentil doesn't have a RAID card installed, so each disk is mounted separately for backups.
<pre>
 
# *SMARTD*AUTOGENERATED* /etc/smartd.conf
 
# Remove the line above if you have edited the file and you do not want
 
# it to be overwritten on the next smartd startup.
 
 
 
# Sample configuration file for smartd.  See man 5 smartd.conf.
 
# Home page is: http://smartmontools.sourceforge.net
 
 
 
# The file gives a list of devices to monitor using smartd, with one
 
# device per line. Text after a hash (#) is ignored, and you may use
 
# spaces and tabs for white space. You may use '\' to continue lines.
 
 
 
# You can usually identify which hard disks are on your system by
 
# looking in /proc/ide and in /proc/scsi.
 
 
 
# The word DEVICESCAN will cause any remaining lines in this
 
# configuration file to be ignored: it tells smartd to scan for all
 
# ATA and SCSI devices.  DEVICESCAN may be followed by any of the
 
# Directives listed below, which will be applied to all devices that
 
# are found.  Most users should comment out DEVICESCAN and explicitly
 
# list the devices that they wish to monitor.
 
# DEVICESCAN
 
 
 
# First (primary) ATA/IDE hard disk.  Monitor all attributes
 
# /dev/hda -a
 
 
 
# Monitor SMART status, ATA Error Log, Self-test log, and track
 
# changes in all attributes except for attribute 194
 
# /dev/hdb -H -l error -l selftest -t -I 194
 
 
 
# A very silent check. Only report SMART health status if it fails
 
# But send an email in this case
 
/dev/hde -H -m root
 
/dev/sda -d ata -H -m root
 
/dev/sdb -d ata -H -m root
 
/dev/sdc -d ata -H -m root
 
 
 
# First two SCSI disks.  This will monitor everything that smartd can
 
# monitor.
 
# /dev/sda -d scsi
 
# /dev/sdb -d scsi
 
  
# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE
+
[[SMARTD]] Smartd setup and configuration
#  -d TYPE Set the device type to one of: ata, scsi
 
#  -T TYPE set the tolerance to one of: normal, permissive
 
#  -o VAL  Enable/disable automatic offline tests (on/off)
 
#  -S VAL  Enable/disable attribute autosave (on/off)
 
#  -H      Monitor SMART Health Status, report if failed
 
#  -l TYPE Monitor SMART log.  Type is one of: error, selftest
 
#  -f      Monitor for failure of any 'Usage' Attributes
 
#  -m ADD  Send warning email to ADD for -H, -l error, -l selftest, and -f
 
#  -M TYPE Modify email warning behavior (see man page)
 
#  -p      Report changes in 'Prefailure' Normalized Attributes
 
#  -u      Report changes in 'Usage' Normalized Attributes
 
#  -t      Equivalent to -p and -u Directives
 
#  -r ID  Also report Raw values of Attribute ID with -p, -u or -t
 
#  -R ID  Track changes in Attribute ID Raw value with -p, -u or -t
 
#  -i ID  Ignore Attribute ID for -f Directive
 
#  -I ID  Ignore Attribute ID for -p, -u or -t Directive
 
#  -v N,ST Modifies labeling of Attribute N (see man page)
 
#  -a      Default: equivalent to -H -f -t -l error -l selftest
 
#  -F TYPE Use firmware bug workaround. Type is one of: none, samsung
 
#  -P TYPE Drive-specific presets: use, ignore, show, showall
 
#    #      Comment: text after a hash sign is ignored
 
#    \      Line continuation character
 
# Attribute ID is a decimal integer 1 <= ID <= 255
 
# All but -d, -m and -M Directives are only implemented for ATA devices
 
#
 
# If the test string DEVICESCAN is the first uncommented text
 
# then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
 
# DEVICESCAN may be followed by any desired Directives.
 
</pre>
 
 
== rc.local Configuration ==
 
== rc.local Configuration ==
=== /etc/rc.local ===
+
This script is modified to run commands when the system is done powering on.
<pre>
 
#!/bin/sh
 
#
 
# This script will be executed *after* all the other init scripts.
 
# You can put your own initialization stuff in here if you don't
 
# want to do the full Sys V style init stuff.
 
 
 
touch /var/lock/subsys/local
 
 
 
#This will send the boot.log to npg-admins everytime the pc is started.
 
mail -s "$HOSTNAME Started, Here is the boot.log" npg-admins@physics.unh.edu < /var/log/boot.log
 
</pre>
 
== Hardware Information ==
 
  Motherboard: Asus P5QL-CM
 
    Specifications: [http://nuclear.unh.edu/wiki/images/7/72/16744.pdf Specifications]
 
    User Manual: [http://nuclear.unh.edu/wiki/images/7/72/E4411_P5QL-CM V2.pdf Users Manual]
 
  Memory: 2 GB DDR2
 
  Wake On Lan Command: sudo ether-wake 00:1e:4f:9b:13:90
 
  
 +
This will send the boot.log to npg-admins every time the PC is started.
 +
  mail -s "$HOSTNAME Started, Here is the boot.log" npg-admins@physics.unh.edu < /var/log/boot.log
 +
== If Lentil isn't sending e-mails ==
 +
Sometimes after a reboot Lentil won't send its regular e-mail reports. To fix this you simply need to restart sendmail. Be aware that it has saved all of those messages it didn't send, and once sendmail is working you'll get all of them at once.
 +
== Wake On LAN ==
 +
This is used so we can shut down the server and remotely turn it back on.
 +
Wake On Lan Command:
 +
  sudo ether-wake 00:1e:4f:9b:13:90
 
== Fixes ==
 
== Fixes ==
 
*Kernel Crash Fix (2/24/2009)
 
*Kernel Crash Fix (2/24/2009)
 
**[[Hardware Issues History]]
 
**[[Hardware Issues History]]
 +
 +
*Hard Drive Enclosure Replacement (12/19/2009)
 +
**[[Hardware Issues History]]
 +
'''Important Note:'''
 +
  If this appears while booting:
 +
    request_module: runaway loop modprobe binfmt-464c
 +
  This is an indication that a drive (in the Supermicro hot swap bay) is plugged in that
 +
  can't be mounted, like a drive with a software RAID setup on it, so just pull the drive
 +
  and reboot and it should boot properly.

Latest revision as of 15:11, 12 March 2015

General Information

Lentil performs backups. Its autofs is configured to mount hard drives labeled npg-daily-XX onto /mnt/npg-daily/XX, where XX is a label number. Its backup script needs further investigation to determine exactly how it works.

As of Feb. 15, 2015, Lentil has the following drives mounted, which should last us a while:

  • /dev/sdd1 mounted at /mnt/npg-daily/51; A filled 2TB hard drive
  • /dev/sda1 mounted at /mnt/npg-daily/52; A 4TB hard drive
  • /dev/sdc1 mounted at /mnt/npg-daily/53; A 4TB hard drive
  • /dev/sdb1 as the root file system. Don't hot swap this.
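
The label-to-mount-point convention described above can be sketched in a few lines of Python. This is only an illustration of the naming rule (npg-daily-XX mounts at /mnt/npg-daily/XX); it is not taken from /etc/auto.npg-daily or from the backup script, and the function name mount_point_for_label is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed naming rule from this page: a drive labeled npg-daily-XX
# is mounted at /mnt/npg-daily/XX, where XX is the label number.
LABEL_PATTERN = r"npg-daily-(\d+)"
MOUNT_ROOT = PurePosixPath("/mnt/npg-daily")


def mount_point_for_label(label):
    """Return the expected mount point for a drive label, or None
    if the label does not follow the npg-daily-XX convention."""
    match = re.fullmatch(LABEL_PATTERN, label)
    if match is None:
        return None
    return str(MOUNT_ROOT / match.group(1))


if __name__ == "__main__":
    for label in ("npg-daily-51", "npg-daily-52", "npg-daily-53"):
        print(label, "->", mount_point_for_label(label))
```

A helper like this could be used to check that a newly labeled drive will land where the backups expect it before it goes into the hot swap bay.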

Hardware Information

 Motherboard: Asus P5QL-CM
   Specifications: http://nuclear.unh.edu/wiki/pdfs/motherboards/16744.pdf
   User Manual: http://nuclear.unh.edu/wiki/pdfs/motherboards/E4411_P5QL-CM V2.pdf
 Memory: 2 GB DDR2


Authentication

Lentil authenticates against the LDAP server running on Einstein, by connecting to einstein.farm.physics.unh.edu using sssd. Previously, Lentil went over the UNH network to einstein.unh.edu, but this is blocked (probably by iptables). The farm network is the better choice anyhow.

Network Configuration

Lentil is currently connected to the farm network via our switch and to the UNH network via a direct port. Note: Previously, Lentil went through the switch and a VLAN. New network policy at UNH no longer allows this.

eth0
  Hostname: lentil.farm.physics.unh.edu
  HWaddr 00:30:48:63:BB:40
  IP:10.0.0.250
eth1
  Hostname: lentil.unh.edu
  HWaddr 00:30:48:63:BB:41
  IP:132.177.88.254
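
When debugging which interface a connection should use, it can help to check which of the two networks an address falls in. The sketch below uses Python's ipaddress module. The netmasks are an assumption: they are the ones recorded for these IP addresses in the older ifcfg files (255.255.255.0 for the farm network, 255.255.252.0 for the UNH network) and may not match the current setup. The names INTERFACES and interface_for are hypothetical.

```python
import ipaddress

# Addresses are from this page; the netmasks are assumed from the older
# ifcfg-farm / ifcfg-unh files and should be verified on the machine.
INTERFACES = {
    "eth0": ipaddress.ip_interface("10.0.0.250/255.255.255.0"),      # farm
    "eth1": ipaddress.ip_interface("132.177.88.254/255.255.252.0"),  # UNH
}


def interface_for(address):
    """Return the name of the interface whose network contains
    the given address, or None if neither network contains it."""
    addr = ipaddress.ip_address(address)
    for name, iface in INTERFACES.items():
        if addr in iface.network:
            return name
    return None


if __name__ == "__main__":
    for name, iface in INTERFACES.items():
        print(name, iface.ip, "on", iface.network)
    print("10.0.0.1 is reached via", interface_for("10.0.0.1"))
```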

Access Configuration

 /etc/security/access.conf
 Any valid user can log into lentil from any machine on the Internet.

Backup Configuration

Location of Backups

/mnt/npg-daily-current
/mnt/npg-daily/xx/

All backup related scripts are:

 /etc/auto.npg-daily
 /usr/local/bin/rsync_backup.py
 /etc/cron.daily/0rsync_backup
 /usr/sbin/vgcfgbackup
 /etc/rsync-backup.conf

SNMP Configuration

 /etc/snmp/snmpd.conf
 Copied from Pepper.

Smartd Configuration

The configuration file is at /etc/smartd.conf. The smartd.conf does a silent check, which only emails reports if the SMART health status comes back as failed. This smartd.conf looks different from those on many of the other computers because Lentil doesn't have a RAID card installed, so each disk is mounted separately for backups.

See SMARTD for Smartd setup and configuration.

rc.local Configuration

This script is modified to run commands once the system has finished booting.

It sends the boot.log to npg-admins every time the machine is started:

 mail -s "$HOSTNAME Started, Here is the boot.log" npg-admins@physics.unh.edu < /var/log/boot.log
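
For anyone who wants to reproduce or test the boot report outside rc.local, the same message can be composed in Python. This is a sketch only, not what Lentil runs: the function name build_boot_report is hypothetical, and the code builds the message without sending it. Handing it to the local mail server (for example with smtplib against localhost) is left out, since that depends on sendmail being up.

```python
import socket
from email.message import EmailMessage

ADMIN_ADDRESS = "npg-admins@physics.unh.edu"  # recipient used in rc.local


def build_boot_report(boot_log_text, hostname=None):
    """Compose the boot report with the same subject and recipient
    as the mail command in rc.local. Nothing is sent here."""
    if hostname is None:
        hostname = socket.gethostname()
    message = EmailMessage()
    message["Subject"] = f"{hostname} Started, Here is the boot.log"
    message["To"] = ADMIN_ADDRESS
    message.set_content(boot_log_text)
    return message


if __name__ == "__main__":
    # Example with placeholder log text instead of /var/log/boot.log.
    report = build_boot_report("example boot log line\n", hostname="lentil")
    print(report["Subject"])
```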

If Lentil isn't sending e-mails

Sometimes after a reboot Lentil won't send its regular e-mail reports. To fix this you simply need to restart sendmail. Be aware that it has saved all of those messages it didn't send, and once sendmail is working you'll get all of them at once.

Wake On LAN

This lets us shut down the server and turn it back on remotely. Wake On LAN command:

 sudo ether-wake 00:1e:4f:9b:13:90
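
If ether-wake is not available on the machine you are waking Lentil from, the standard Wake On LAN magic packet can be built by hand: 6 bytes of 0xFF followed by the target MAC address repeated 16 times. The sketch below is an alternative, not the documented procedure. It sends the packet as a UDP broadcast, which is a different transport from the raw Ethernet frame that ether-wake sends; the broadcast address and port 9 are assumptions, and the function names are hypothetical.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake On LAN magic packet: 6 bytes of 0xFF followed
    by the 6-byte MAC address repeated 16 times (102 bytes total)."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 6-byte MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Send the magic packet as a UDP broadcast. The broadcast address
    and port are assumptions; adjust them for the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))


if __name__ == "__main__":
    # MAC address from the ether-wake command on this page.
    send_magic_packet("00:1e:4f:9b:13:90")
```

The broadcast has to be sent from a machine on the same network segment as Lentil, just as with ether-wake.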

Fixes

Important Note:

 If this appears while booting:
   request_module: runaway loop modprobe binfmt-464c
 This indicates that a drive that can't be mounted (for example, one with a software RAID
 setup on it) is plugged into the Supermicro hot swap bay. Pull the drive and reboot, and
 Lentil should boot properly.