Difference between revisions of "Sysadmin Todo List"

From Nuclear Physics Group Documentation Pages

Revision as of 12:24, 3 July 2007

General Info

This is an unordered set of tasks. Detailed information on any of the tasks typically goes in related topics' pages, although usually not until the task has been filed under Completed.

Important

  • Schedule a date for password change for karpiusp.
  • Resize partitions on symanzik (and other machines as necessary) so that root has at least a gig of unused space.
  • Figure out exactly what our backups are doing, and see if we can implement some sort of NFS user access. NPG_backup_on_Lentil.
  • Look into getting computers to pull scheduled updates from rhn when they check in.
  • Find out why Steve isn't getting paid what he's supposed to be getting paid. May be getting fixed
  • Printer queue for Copier: Konica Minolta Bizhub 750. IP=pita.unh.edu. Seems like we need info from the Konica guy to get it set up on Red Hat and OS X. The installation documentation for the driver doesn't mention things like the passcode, because those are machine-specific. Katie says that if he doesn't come on Monday, she'll make an inquiry. Mac OS X is now working; the IT guy should be here the week of June 26th. Did he ever come?
  • Need to get onto the "backups" shared folder, as well as be added as members to the lists. "backups" wasn't even a mailing list, according to the Mailman interface.
  • Figure out what network devices on tomato are doing what
  • Look into monitoring RAID, disk usage, etc.
  • Learn how to use cacti on okra. Seems like a nice tool, mostly set up for us already.
  • Find out why lentil and okra (and tomato?) aren't being read by cacti. Could be related to the warnings that repeat in okra:/var/www/cacti/log/cacti.log
  • Learn how to set up evolution fully so we can support users. Need LDAP address book.
  • Matt's learning a bit of Perl so we can figure out exactly how the backup works, as well as create more programs in the future, specifically thinking of monitoring. Look into the CPAN modules under Net::, etc.
  • Nobody is currently reading the mail that is sent to "root". Einstein had 3000+ unread messages. I deleted almost all of them. Some useful messages with diagnostics in them are sent to root, so we should find a solution for this. Temporarily, both Matt and Steve have email clients set up to access root's account. The largest chunk of the e-mails concerns updating ClamAV. Maybe we should just do that?
  • I set up "splunk" on einstein (production 2.2.3 version) and taro (beta 3 v2). I like the beta's functionality better, but it has a memory leak. Look for an update to the beta that fixes this and install it. (See: www.splunk.com/base/forum:SplunkGeneral) While this sounds like it could only be indirectly related to our issue, it does sound close enough and is the only official word on splunk's memory usage that I could find:[1]
    When forwarding cooked data you may see the memory usage spike and kill the splunkd process. This should be fixed for beta 3.
    So, waiting for the next beta or later sounds like the best bet. I'm wary of running beta software on einstein, anyhow.
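
For the disk-usage monitoring item above, and the goal of keeping at least a gig of unused space on root, a check along the following lines might be a starting point. This is a minimal Python sketch, not an existing NPG script: the function name, the 1 GiB default threshold, and the list of mount points are all assumptions made for illustration.

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte


def low_space_mounts(mounts, min_free_bytes=GIB):
    """Return (mount, free_bytes) for each mount with less than min_free_bytes free.

    Hypothetical helper: the name and the default threshold are assumptions.
    """
    low = []
    for mount in mounts:
        usage = shutil.disk_usage(mount)  # named tuple: total, used, free
        if usage.free < min_free_bytes:
            low.append((mount, usage.free))
    return low


if __name__ == "__main__":
    # "/" is only an example; list whichever partitions matter on each machine.
    for mount, free in low_space_mounts(["/"]):
        print(f"{mount}: only {free / GIB:.2f} GiB free")
```

Run from cron on each machine, the printed lines would end up in root's mail, so this only helps once somebody is reading that mailbox.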

Ongoing

  • Maintain the Documentation of all systems!
    • Main function
    • Hardware
    • OS
    • Network
  • Clean up 202
    • Figure out what's worth keeping
    • Figure out what doesn't belong here
  • Take a look at spamassassin - Improve Performance of this if possible. See if our setup jibes with this. On the surface, it doesn't seem to.
  • Updated SpamAssassin and ran sa-update to get new rules. Are spamassassin or spamd/spamc even being used? ps -ef | grep spam doesn't show any of them. The SA documentation seems to indicate that having procmail send mail to one of them is the typical scenario. However, procmail isn't mentioned in the appropriate Postfix configuration file[2]. procmail and postfix are installed, though. Do we have a special mail setup?
  • Test unknown equipment:
    • UPS
  • Printer in 323 is not hooked up to a dead network port. Actually managed to ping it. One person reportedly got it to print once, but nobody else has, and that user hasn't been able to print since. Is this printer dead? We need to find out.
  • Eventually come up with a plan to deal with pauli2's kernel issue.
  • Look into making a centralized interface to monitor/maintain all the machines at once. Along the same lines: Continue homogenizing the configurations of the machines.
  • Figure out why jalapeno doesn't have 3dm software running. If we find that there's no good reason, maybe we should install it?
  • Certain settings are similar or identical for all machines. It would be beneficial to write a program to do remote configuration. This would also simplify the process of adding/upgrading machines.
  • Update Tomato to RHEL5 and check all services einstein currently provides. Then switch einstein <-> tomato, and then upgrade what was originally einstein. Look into making an einstein/tomato failsafe setup. A good preliminary step would be to find all of the custom scripts on einstein. If they don't have "npg" in their filenames already, it should be added if possible, so that they can all be easily located. Maybe use something other than just "npg", because there seems to be a lot of cruft with that label.
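
As a first pass at finding the custom scripts on einstein, something like the sketch below could list executables whose filenames lack the tag. It is only a rough Python sketch under assumptions: the function name is made up, the directories to search have to be chosen by hand, and it cannot tell a custom script from a stock executable, so the output still needs a manual look.

```python
import os


def untagged_executables(roots, tag="npg"):
    """Yield paths of executable regular files under roots whose names lack tag.

    Hypothetical helper: the name and the default tag "npg" are assumptions.
    The tag comparison ignores case.
    """
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if (tag not in name.lower()
                        and os.path.isfile(path)
                        and os.access(path, os.X_OK)):
                    yield path


if __name__ == "__main__":
    # Example directory only; where custom scripts actually live is not known here.
    for path in sorted(untagged_executables(["/usr/local/bin"])):
        print(path)
```

Passing a different tag makes it easy to try out a replacement label before renaming anything.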

Completed

  • Checked if the backups are actually happening and working - they are.
  • Renewed XML books for Amrita. They're due at the end of the month.
  • Fixed the amandabackup.sh script for consolidating amanda-style backups.
  • Investigate the change in ennui's host key. Almost certainly caused by one of the updates. Just remembered that I was using ennui for a few minutes and I saw the "updates ready!" icon in the corner and habitually clicked it. Darn ubuntu habits. Doesn't explain WHY it changed, only how. It probably wasn't an individual update, but almost certainly was the transition from Fedora 5 to 7. ennui isn't a very popular machine to SSH into, so the change probably just went unnoticed for the two-or-so weeks since the upgrade. I had earlier thought that it couldn't have been the OS change, since it had been a while, but upon further thought, it makes perfect sense.
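
One way to confirm a host key change like ennui's is to compare the fingerprint of a key saved before an upgrade with the one the host presents afterwards. The Python sketch below computes the SHA256-style fingerprint that current OpenSSH prints; the function names are hypothetical, and it assumes input in public-key-file format ("type base64 [comment]"), so a known_hosts entry would need its leading host field removed first.

```python
import base64
import hashlib


def ssh_fingerprint(pubkey_line):
    """Return the SHA256 fingerprint of an OpenSSH public key line.

    Expects "type base64-blob [comment]". Hypothetical helper name.
    """
    fields = pubkey_line.split()
    blob = base64.b64decode(fields[1])  # raw key blob
    digest = hashlib.sha256(blob).digest()
    # OpenSSH shows the digest as unpadded base64 with a "SHA256:" prefix.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def host_key_changed(old_line, new_line):
    """True if the two public key lines have different fingerprints."""
    return ssh_fingerprint(old_line) != ssh_fingerprint(new_line)
```

Keeping a copy of each machine's public host key somewhere central before an OS upgrade would make this comparison possible after the fact.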

Previous Months Completed

June 2007