Recent Config Changes

From Nuclear Physics Group Documentation Pages
Revision as of 13:33, 8 August 2017 by Maurik
Reverse Chronological Order.

2017

  • 2017/08/08 -- Upgrade Taro to CentOS 7. Review of the iptables rules.
  • 2017/08/07 -- Jalapeno authenticates users against Pepper for testing.
  • 2017/08/02 -- Upgrade Jalapeno to CentOS 7. Cleaned up version of the named.conf. Jalapeño is now root login only. See jalapeno
  • 2017/08/02 -- Issues with named on Jalapeño: it did not do forwarding correctly. The issue was the IP address for the UNH network.
  • 2017/08/01 -- Has it really been that long since anything was done? Yikes.
  • 2017/07/04 -- Install iptables-netgroups on Jalapeño and the new Einstein.
  • 2017/07/04 -- Install Fail2Ban properly on Gourd, new Einstein and Jalapeño.

2016

  • 2016/11/29 -- Extended power down over Thanksgiving break caused big "foo bar" on our main server: Gourd.
    • System would not boot and hung on "systemctl emergency" barf.
    • When logging in as root in the emergency setup, it was clear from /proc/mdstat that the RAIDs were renamed. However, after each mdadm --assemble command, the *%$#! systemctl would reboot.
    • Solution was not so obvious:
      1. Reboot with the OLD kernel and all /dev/md* commented out of /etc/fstab.
      2. Reset the /dev/md* names using "mdadm --manage /dev/md125 --stop", followed by reassembly: "mdadm --assemble /dev/md0 /dev/sdd1 /dev/sdf1" etc. (3 times)
      3. Rebuild the initramfs for the *NEW* kernel: "dracut --force /boot/initramfs-(new kernel number).img (new kernel number) -M"
      4. Separate issue was that nfs-server did not start up as expected. Fix: "systemctl enable nfs-server"
      5. Reboot, then mop up. Make sure Einstein starts up.
      6. In addition: the Einstein, Roentgen, Jalapeno, and Corn VMs are now set to restart automatically when Gourd/Pumpkin reboot.
  • 2016/06/16 -- The sshd mod from yesterday had the side effect that backups no longer worked. Reason: "lentil.farm.physics.unh.edu" in authorized_keys could not be resolved, since DNS lookups are no longer done. Problem solved by adding 10.0.0.250 to the permitted systems in authorized_keys.
  • 2016/06/15 -- On each of the nodes, point to 10.0.0.100 in ntp.conf and start ntpd. Now they all (mostly) agree on what time it is.
  • 2016/06/15 -- On RHEL 6 or 7 systems, ssh became really slow. The reason was reverse DNS lookups, which aren't really needed. I set "UseDNS no" in the sshd_config files and now ssh is fast again.
  • 2016/05/24 -- Move the Mail, Home, and KVM RAIDs from Endeavour back to Gourd. See Move Mail RAID
  • 2016/05/04 -- Turn off outside access to DNS on Jalapeno. Jalapeno will now only listen to 10.0.0.* for DNS requests.
  • 2016/05/04 -- Sideways migrate Taro to CentOS 5. See Sideways Migration from RH to Centos
  • 2016/03/20 -- Setup "epel" on RH7 Pumpkin with: "rpm -Uvh http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm"
  • 2016/03/10 -- Used the proxy to update all software on all nodes except 10 and 11, which were down. Rebooted all nodes below 10; nodes 4 and 7 did not come back.
  • 2016/03/10 -- Installed squid proxy server on endeavour
  • 2016/03/10 -- Added rafopar to the domain_admins
  • 2016/02/27 -- Lentil started running cron jobs twice (probably since the update). Issue is both cron and anacron running /etc/cron.daily. Disabled anacron from running those jobs.
  • 2016/02/25 -- Jalapeno was used in a DDoS attack. Our DNS (bind) setup was too open, allowing "recursion". Closed it way down to "peers".
  • 2016/02/20 -- Updated all our systems in response to the libc vulnerability. Note only RH6+7 are affected, but updated RH5 systems as well. Some, but not all, were also rebooted.
  • 2016/01/05 -- Rebooted Pumpkin on a new MB, Intel i7 CPU, 16 x 4 TB WD drives + 2x WD750 for system in RAID0. System installed is CentOS 7.
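
The command sequence from the 2016/11/29 Gourd recovery can be sketched as a dry-run script. This is a sketch only: it prints the commands quoted in that entry and executes nothing. The function name print_raid_recovery and its arguments are hypothetical, the kernel version is a placeholder, and only the one array (md125 to md0) that the entry names is shown; the other two arrays would need their own calls with device names that the log does not record.

```shell
#!/bin/sh
# Dry-run sketch of the 2016/11/29 Gourd RAID recovery: PRINTS commands only.
# Assumption: function name and argument order are illustrative, not from the log.
# Step 1 (boot the OLD kernel with /dev/md* commented out of /etc/fstab) is manual.

# Usage: print_raid_recovery NEW_KERNEL OLD_MD NEW_MD MEMBER...
print_raid_recovery() {
    new_kernel=$1
    old_md=$2
    new_md=$3
    shift 3
    # Step 2: stop the array under its wrong name, reassemble under the intended one.
    echo "mdadm --manage $old_md --stop"
    echo "mdadm --assemble $new_md $*"
    # Step 3: rebuild the initramfs for the new kernel.
    echo "dracut --force /boot/initramfs-$new_kernel.img $new_kernel -M"
    # Step 4: have the NFS server start at boot.
    echo "systemctl enable nfs-server"
}

# Device names are the ones quoted in the entry; the kernel version is a placeholder.
print_raid_recovery "NEW_KERNEL_VERSION" /dev/md125 /dev/md0 /dev/sdd1 /dev/sdf1
```

Printing instead of executing keeps the sketch safe to run anywhere; on the real system each printed line would be reviewed and run by hand as root.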

2015

  • 2015/08/31 -- Setup Gourd's new networking for "bridge".
  • 2015/08/28 -- Installed the Maui scheduler as the default scheduler. It runs under "maurik" at this point.
  • 2015/08/28 -- Patched the Maui scheduler (code in /data1/System/maui-3.3.1 ) to take command line options properly (i.e. -d)
  • 2015/08/28 -- Fixed LOG issue on the pbs_server. Code is in /data1/System/torque.git file src/server/node_manager.c
  • 2015/08/28 -- Added a list to Einstein fail2ban rules to exclude "lost connection after AUTH from".
  • 2015/08/26 -- Uninstalled Postfix and installed Sendmail on Endeavour. Postfix is too complicated; all we need is a mail forwarder.
  • 2015/08/26 -- Reinstall Splunk on Taro -- license overruns made it useless.
  • 2015/08/11 -- Started upgrade of Gourd to Centos 7, since it wasn't booting anymore anyway. Ran into trouble with the ethernet driver.
  • 2015/08/11 -- Roentgen aka Nuclear is running again on Endeavour. It is on OpenServer Net.
  • 2015/08/11 -- Einstein now on the OpenServer net. NOTE: They STILL have a tendency to block port 25 (incoming email), so an exception is made for this machine.
  • 2015/06/09 -- Got Torque (Portable Batch System) to run again on Endeavour - Yohoo!
  • 2015/06/09 -- Fixed mail not arriving, since port 25 was blocked again.
  • 2015/06/08 -- Fixed ldap connection on Lentil: MUST connect to einstein.farm.physics.unh.edu
  • 2015/06/08 -- Fixed mounting issues on CentOS 6 systems (corn, lentil, endeavour, nodes)
  • 2015/06/08 -- Einstein and Jalapeno moved from Gourd to Endeavour.
  • 2015/06/08 -- Properly started sssd on Corn and Jalapeno.
  • 2015/06/08 -- Moved Corn from Gourd to Endeavour.
  • 2015/06/08 -- Bridged all 3 interfaces on Endeavour: br0 = farm, br1 = server net, br2 = unh
  • 2015/05/26 -- Reset root passwords.
  • 2015/05/26 -- Einstein mounts /mail from npghome:/mail which now is hosted on Endeavour.
  • 2015/05/26 -- Moved the /home and /mail drives from Gourd to Endeavour, reconstituted as RAID1 on slot 23 (0:0:0:4)=/dev/sde# and slot 24 (0:0:0:5)=/dev/sdf#. Set the backup to back up /home and /mail on Endeavour.
  • 2015/05/19 -- Cleaned up some old user home dirs. Anyone who did not have a login now also does not have a /home or /mail drive. Some users set to /bin/false to be removed later.
  • 2015/05/19 -- Inserted a 1TB drive in Gourd and set it up to mirror /home /mail and /kvm as it is supposed to be RAID1. No clue what happened with the original mirror disk.
  • 2015/05/19 -- Tried to get backup emails to work again on Lentil. We'll see.
  • 2015/05/14 -- Virtualization stopped on Lentil. Not needed on backup system.
  • 2015/05/14 -- Splunk running on Taro, Endeavour, Gourd, Einstein, Roentgen, Lentil
  • 2015/04/15 -- Fail2ban running on Endeavour.
  • 2015/04/15 -- Endeavour web server resurrected. Ganglia still needed.
  • 2015/04/14 -- Splunk restarted on Taro, setup on Einstein (forwarding to Taro).
  • 2015/04/09 -- Roentgen and Nuclear moved to Taro and Open Server net. MySQL DB copied over in final state. (so now you are looking at the new Roentgen.)
  • 2015/04/08 -- Node cloning in progress. See How to Clone a Node
  • 2015/04/08 -- Move to Open Server Net in progress. See Move To Open Server Net
  • 2015/04/03 -- Taro is now on Server-Open network with new IP 132.177.180.86. Endeavour is on 132.177.180.225.
  • 2015/04/02 -- Endeavour upgrade documented at Upgrading Endeavour -- Node 2 is nearly done.
  • 2015/04/02 -- The backups are running but still not sending mail. It seems NO ONE IS LOOKING AT THIS! (that is a good way to piss me off.)
  • 2015/03/12 -- MailMan E-mail is not working properly. Changed backup cron job (/etc/cron.daily/0rsync_backup) to send email directly.
  • 2015/03/08 -- Gourd CentOS 6 Migration
  • 2015/03/02 -- SpamAssassin and E-mail -- misconfigured postfix did not run spam properly.
  • 2015/03/01 -- SpamAssassin -- Updated documentation on how to filter spam better.
  • 2015/02/15 -- taro Updated the Globus Toolkit for data transfers to/from JLab.
  • 2015/02/03 -- lentil Changed the network-scripts. Lentil was still trying to use the VLAN, while it was directly connected to the UNH network. (Shame on us!) Also, configured sssd to contact einstein.farm.physics.unh.edu instead of einstein.unh.edu.

2014

  • 2014/11/29 -- einstein Dovecot.conf:45 disable_plaintext_auth = yes
  • 2014/11/29 -- einstein Changed the NTP servers to ns1.unh.edu, ns2.unh.edu, nic.unh.edu, since these actually work!
  • 2014/11/04 -- einstein taro Iptables have a new, long blacklist.
  • 2014/11/04 -- einstein Changed Postfix authentication module from smtpd to dovecot. This fixes the issue with postfix claiming authentication methods which don't actually work.
  • 2014/10/05 -- corn and jalapeno Change: Fully transitioned RHEL repositories and packages to their equivalent CentOS versions. Both changes required downloading and installing CentOS's repository keys, removing all packages that start with rhn (replaced with the CentOS versions), then cleaning all cached packages and running a standard yum upgrade. Details on the commands used can be found at http://knowledgelayer.softlayer.com/procedure/convert-redhat-centos
  • 2014/09/26 -- einstein Change: Stop the avahi-daemon service and take it out of the automatically started services. Avahi-daemon implements Apple's "bonjour" protocols, which I don't think we need.
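
The 2014/11/29 NTP change on einstein, like the 2016/06/15 change on the nodes, comes down to replacing the "server" lines in ntp.conf. A minimal sketch of that edit follows, assuming the file uses plain "server HOST" lines; the helper name set_ntp_servers is hypothetical and is not how the change was originally made.

```shell
#!/bin/sh
# Sketch: replace every "server" line in an ntp.conf-style file.
# Assumption: helper name and approach are illustrative, not from the log.

# Usage: set_ntp_servers FILE SERVER...
set_ntp_servers() {
    conf=$1
    shift
    tmp=$(mktemp) || return 1
    # Keep every line that is not a "server" directive.
    grep -v '^[[:space:]]*server[[:space:]]' "$conf" > "$tmp" || true
    # Append one "server" line per requested host.
    for s in "$@"; do
        echo "server $s" >> "$tmp"
    done
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Example (as root, on a real system):
#   set_ntp_servers /etc/ntp.conf ns1.unh.edu ns2.unh.edu nic.unh.edu
```

ntpd reads its configuration at startup, so it would need a restart after the file changes.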

Older Pages with Completed tasks

Completed in Jan/Feb/Mar 2009
Completed in July/Aug/Sep/Oct/Nov 2008
Completed in March/April/May/June 2008
Completed in February 2008
Completed in January 2008
Completed in November/December 2007
Completed in October 2007
Completed in September 2007
Completed in August 2007
Completed in July 2007
Completed in June 2007