Splunk

Splunk is a flexible data aggregation system. In layman's terms, Splunk is a system that combs through log files (and anything else that contains structured information you want to throw at it) and presents the results in a summarized format. It is really a pretty neat thing. See the Splunk website.

Splunk at UNH

We are running the free 3.0beta3 on our system Jalapeno. Splunk is resource hungry: it requires at least 600 MB of memory and quite a bit of CPU. Although it is possible to run a splunkd server daemon on each node and have each one pass its information to the master node, this is not how I chose to set it up. Our splunk setup is as follows:

  • Splunk runs on Jalapeno. It is installed in /data/splunk, with a link to /opt/splunk.
  • Jalapeno mounts the /var/log directories from einstein and roentgen so that they can be accessed by splunk for aggregation (see the example mount lines after this list).
  • The free version of splunk does not allow for logins, so we should restrict access to jalapeno to sysadmins.
  • This setup can be extended to do many different tasks!
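
As an illustration, assuming these are NFS mounts, the corresponding /etc/fstab lines on jalapeno could look something like the sketch below; the mount points and options are assumptions, not copied from the actual machine.

  # hypothetical /etc/fstab entries on jalapeno for the remote log directories
  einstein:/var/log    /mnt/einstein-log    nfs    ro,noatime    0 0
  roentgen:/var/log    /mnt/roentgen-log    nfs    ro,noatime    0 0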

Connecting to Splunk

jalapeno blocks all port 80 and port 8000 connections in its iptables rules, so it is not possible to access the Splunk interface by simply opening a web browser and going to the appropriate port. This is a security measure, so it is not going to change. There is a fairly simple workaround: open an ssh tunnel.

  1. Run ssh -L8001:localhost:8000 jalapeno. The local port does not have to be 8001; any available non-privileged port on your machine will do (see the example below).
  2. Open a web browser with good JavaScript support (and optionally Flash, for some fancy graphing features) and go to localhost:8001 (or whatever port you chose). On Linux and OS X only Firefox is compatible; on Windows, IE works as well.
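
For example, the whole sequence from a Linux or OS X terminal could look like this (the -N flag just holds the tunnel open without running a remote shell, -f drops it into the background after authentication; port 8001 is an arbitrary choice):

  # authenticate, then keep the tunnel open in the background without a remote shell
  ssh -f -N -L 8001:localhost:8000 jalapeno
  # then point the browser at the forwarded local port
  firefox http://localhost:8001 &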

Sophisticated stuff for Splunk

You can use the admin button on the splunk web interface to do administration: add a user account (licensed version only), add new input streams, and so on. This is pretty simple. More sophisticated use is documented at Splunk.com: go to the documentation and click on the version in use.

Splunk getting sysinfo from other nodes

To get sysinfo (cpu load, users logged in, memory usage) from other nodes, without running splunk everywhere and without creating huge log files with this info everywhere, I made a "pipe" for splunk. This is a script, located on the splunk host in $SPLUNKHOME/etc/bundles/sysinfo, that ssh's over to each monitored node and executes the command /root/splunk_ssh_info_pipe.
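
A minimal sketch of what such a bundle script could look like is shown below; the file contents and node list are assumptions for illustration, and only /root/splunk_ssh_info_pipe is the actual command from this setup.

  #!/bin/bash
  # hypothetical $SPLUNKHOME/etc/bundles/sysinfo script: splunk runs this periodically
  # and indexes whatever the remote pipe script prints on each monitored node.
  for node in pepper taro ; do                      # monitored nodes, as mentioned below
      ssh -o BatchMode=yes root@$node /root/splunk_ssh_info_pipe
  done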

To make this whole thing secure, I did the following:

  • Modify /root/.ssh/authorized_keys on each node (pepper, taro, ...) to have an entry that will only execute one command when jalapeno connects with a passwordless ssh connection. That command is our pipe script:
no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,from="jalapeno.farm.physics.unh.edu",command="/root/splunk_ssh_info_pipe" ssh-rsa verylongsshkeyishere root@jalapeno.unh.edu
  • This will only work if root is allowed to connect like this, so I modified /etc/security/access.conf to allow a root login from jalapeno (see the example line after this list).
  • When the script runs on the node, it produces output that is then parsed by splunk; a sketch of such a script also follows this list.
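
For reference, the pam_access rule could look like the line below in /etc/security/access.conf; the exact origin spelling is an assumption, not copied from the live file.

  # hypothetical access.conf entry (format: permission : users : origins) allowing root logins from jalapeno
  + : root : jalapeno.farm.physics.unh.edu

The node-side /root/splunk_ssh_info_pipe is not reproduced here, so the following is only a guess at the kind of sysinfo (cpu load, logged-in users, memory usage) it reports and in roughly what form:

  #!/bin/bash
  # hypothetical /root/splunk_ssh_info_pipe: print one snapshot of system info for splunk to parse
  echo "date=$(date '+%Y-%m-%d %H:%M:%S') host=$(hostname)"
  echo "loadavg=$(cut -d' ' -f1-3 /proc/loadavg)"    # 1/5/15 minute cpu load averages
  echo "users=$(who | wc -l)"                        # number of logged-in users
  free -m | awk '/^Mem:/ {print "mem_total_mb=" $2, "mem_used_mb=" $3}'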

This is fairly secure. I could have created a user "splunk" on all machines and set it up so that that user can only execute one command. Perhaps I'll switch to that at some point.