Difference between revisions of "Splunk"

From Nuclear Physics Group Documentation Pages

Revision as of 21:53, 14 December 2009

Splunk is a flexible data aggregation system. In layman's terms, Splunk combs through log files (and anything else containing structured information that you want to throw at it) and presents the results in a summarized format. It is really a pretty neat thing. See the splunk website.

Splunk at UNH

We are now (December 2009) running the free version 4.0.7 on our systems: Pumpkin, Taro, Gourd, Endeavour, Einstein, Tomato, Improv. If it is not running on one of these systems, it should be. Splunk is no longer as resource hungry as before. On systems where Splunk starts to use too many resources, we can reconfigure it as a lightweight forwarder. Currently Pumpkin is set up as a receiver and Endeavour as a duplicate receiver.

Our setup:

  • Splunk runs on servers, with Pumpkin the master (receiver) node.
  • On Pumpkin, it is installed in /data1/splunk, with a link to /opt/splunk. This should be fairly consistent among systems.
  • Pumpkin mounts the /var/log directories from Roentgen so that splunk can access them for aggregation, without the need to run a splunk copy on Roentgen (which is virtual).
  • Splunk runs on Endeavour as a full server. On Einstein, Taro, Pepper, Gourd, Tomato and Improv it depends on the machine; it may run as a forwarding server.
  • The free version of splunk does not support logins. We restrict access to the splunk console in iptables. Use an ssh tunnel to access the splunk web portal.
  • This can be extended to do many different tasks!
  • Versions 4 and later of Splunk come with applications. We run the *Nix application, which does a nice job of giving a sense of what is happening on Unix-like systems.
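
The iptables restriction mentioned in the list above might look something like the sketch below. This is an assumption, not a copy of the actual rules on Pumpkin: it supposes the web interface listens on port 8000 and that only loopback traffic (which is what an ssh tunnel produces) should reach it. The helper name splunk_web_rules is made up for illustration, and it only prints the rules so they can be reviewed before anyone runs them as root.

```shell
#!/bin/sh
# Print iptables rules that allow only loopback access to a port.
# Hypothetical helper; the real rules on Pumpkin are not shown on this page.
splunk_web_rules() {
    port="${1:-8000}"   # Splunk web port (8000 according to this page)
    # Accept connections arriving on the loopback interface
    # (this is where ssh tunnel traffic shows up).
    echo "iptables -A INPUT -i lo -p tcp --dport $port -j ACCEPT"
    # Drop everything else aimed at the same port.
    echo "iptables -A INPUT -p tcp --dport $port -j DROP"
}

splunk_web_rules 8000
```

The order matters: the ACCEPT rule has to come before the DROP rule, since iptables stops at the first matching rule in the chain.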

Connecting to Splunk

Pumpkin blocks all port 80 and port 8000 connections in iptables, so it's not possible to reach the Splunk interface by simply opening a web browser and going to the appropriate port. This is a security measure, so it is not going to change. There is a fairly simple workaround, which is to open an ssh tunnel:

  1. ssh -L8001:localhost:8000 pumpkin. The local port doesn't have to be 8001; any available non-privileged port on your machine will do.
  2. Open a web browser with good Javascript support and Flash 10 or later, and go to localhost:8001 (or whatever port you chose). On Linux and OS X only Firefox is compatible. On Windows, IE is compatible as well (but you won't care, right?).
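
The tunnel step above can be wrapped in a small helper, sketched here. The function name splunk_tunnel_cmd is made up for illustration. It only prints the ssh command for a chosen local port and rejects privileged ports (below 1024), which would need root to bind. The -N option is an addition to the command given above; it tells ssh to forward the port without starting a remote shell.

```shell
#!/bin/sh
# Print the ssh command for a tunnel to the Splunk web interface on pumpkin.
# Hypothetical helper; rejects ports that are not numeric or are privileged.
splunk_tunnel_cmd() {
    local_port="${1:-8001}"
    case "$local_port" in
        ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
    esac
    if [ "$local_port" -lt 1024 ] || [ "$local_port" -gt 65535 ]; then
        echo "local port must be between 1024 and 65535" >&2
        return 1
    fi
    # -N: do not run a remote command, just forward the port.
    echo "ssh -N -L ${local_port}:localhost:8000 pumpkin"
}

splunk_tunnel_cmd 8001
```

Run the printed command in one terminal, leave it open, and point the browser at localhost:8001.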

Sophisticated stuff for Splunk

You can use the admin button on the splunk web interface to do administration, add a user account (licensed version only), or add new input streams. This is pretty simple. More sophisticated use is documented at Splunk.com: go to the documentation and click on the version used.


OLD Config things from version 3

These may or may not work anymore, but are saved here for documentation history.

Filtering the input files

See Splunk File whitelist/blacklist.

We usually just let splunk loose on an entire directory (/var/log) of several machines (einstein, roentgen, pumpkin...). There are files splunk will skip automatically (mostly binaries). Others can be filtered out by editing /opt/splunk/etc/bundles/local/inputs.conf and adding a line like:

# Ignore the audit files, which you should read with aureport anyhow.
_blacklist = audit\.log|\.[12345]

You can see which input files will be splunked with:

. /opt/splunk/bin/setSplunkEnv
/opt/splunk/bin/listtails
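
To get a feel for what the blacklist pattern above excludes without touching Splunk, it can be tried against a few file names with grep. This is only an approximation: it assumes Splunk matches the pattern anywhere in the full path, and it uses grep -E as a stand-in for Splunk's own regular expression engine. The function name not_blacklisted is made up for illustration.

```shell
#!/bin/sh
# The pattern from the _blacklist line above.
blacklist='audit\.log|\.[12345]'

# Print only the paths that do NOT match the blacklist,
# i.e. the ones that would still be indexed under the assumptions above.
not_blacklisted() {
    printf '%s\n' "$@" | grep -Ev "$blacklist"
}

not_blacklisted /var/log/messages /var/log/messages.1 \
    /var/log/audit/audit.log /var/log/secure
```

With these four example paths, the rotated file messages.1 and the audit log are filtered out, leaving /var/log/messages and /var/log/secure.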

Splunk getting sysinfo from other nodes

This is discontinued. It needed too many ssh connections, which cause lots of entries in the log files. That is no good, since it obscures what is happening, and ssh is important!

To get sysinfo (cpu load, users logged in, memory usage) from other nodes, without running splunk everywhere and without creating huge log files with this info everywhere, I made a "pipe" for splunk. This is a script that lives on the splunk server in $SPLUNKHOME/etc/bundles/sysinfo; it will ssh over to each monitored node and execute the command /root/splunk_ssh_info_pipe.

To make this whole thing secure, I did the following:

  • Modify /root/.ssh/authorized_keys on each node (pepper, taro,...) to have an entry that will only execute one command when jalapeno connects with a passwordless ssh connection. This command is our pipe script:
no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,from="jalapeno.farm.physics.unh.edu",command="/root/splunk_ssh_info_pipe" ssh-rsa verylongsshkeyishere root@jalapeno.unh.edu
  • This will only work if root is allowed to connect like this, so I modified /etc/security/access.conf to allow a root login from jalapeno.
  • The script when run on the node creates output that is then parsed by splunk.

This is fairly secure. I could have created a user "splunk" on all machines and set it up so that that user can only execute one command. Perhaps I'll switch to that at some point.
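
The restricted authorized_keys entry described above can be generated rather than typed by hand. The sketch below is a hypothetical helper (restricted_key_entry is not an existing tool) that assembles the same option list for a given source host, forced command and public key. The key in the example is a placeholder, not a real key.

```shell
#!/bin/sh
# Build an authorized_keys entry that limits a key to one forced command
# from one host. Arguments: allowed host, forced command, public key text.
restricted_key_entry() {
    from_host="$1"
    forced_cmd="$2"
    pubkey="$3"
    printf 'no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,from="%s",command="%s" %s\n' \
        "$from_host" "$forced_cmd" "$pubkey"
}

# Example with a placeholder key (not a real key).
restricted_key_entry "jalapeno.farm.physics.unh.edu" \
    "/root/splunk_ssh_info_pipe" \
    "ssh-rsa PLACEHOLDERKEY root@jalapeno.unh.edu"
```

The same line would work in the authorized_keys file of a dedicated "splunk" user, if the setup were ever switched away from root.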

Getting Splunk to run on a new node

Install splunk by untarring the install tar file, currently located at /net/data/pumpkin1/splunk.
The standard location is /opt. Move the resulting /opt/splunk to /opt/splunk-<version> and make a soft link from /opt/splunk to it.
Now start up the system:

 /opt/splunk/bin/splunk start --accept-license

Next, start "firefox localhost:8000" or tunnel to the splunk web server.
Then go to the admin tab:

  1. (Optional) Turn on the SSL
  2. Set up the logs to watch: data input -> files & directories -> New Input. Then add /var/log.
  3. Set up forwarding to pumpkin.farm.physics.unh.edu port 8089. Do not store local data (usually).
  4. Run bin/splunk disable webserver (or set "startwebserver=0" in splunk/etc/system/local/web.conf) to turn off the local web server.
  5. Restart server: bin/splunk restart
  6. Make splunk start automatically: bin/splunk enable boot-start

Wow, you're done!
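
The command-line parts of the procedure above can be collected into a script, sketched below. Several things here are assumptions: the tar file name is not given on this page, so /path/to/splunk.tgz is a placeholder; the -z option supposes the tar file is gzip-compressed; and the function names are made up. The script sets DRY_RUN=1, so it only prints the commands. The web interface steps (inputs and forwarding) still have to be done by hand between the two functions.

```shell
#!/bin/sh
# Sketch of the command-line steps for a new forwarder node.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# install_splunk TARBALL VERSION
# TARBALL is the install tar file (placeholder; its real name is not given here).
install_splunk() {
    tarball="$1"
    version="$2"
    run tar -xzf "$tarball" -C /opt            # assumed gzip; unpacks to /opt/splunk
    run mv /opt/splunk "/opt/splunk-$version"  # keep the version in the directory name
    run ln -s "/opt/splunk-$version" /opt/splunk
    run /opt/splunk/bin/splunk start --accept-license
}

# Run after inputs and forwarding have been configured in the web interface.
finish_forwarder() {
    run /opt/splunk/bin/splunk disable webserver
    run /opt/splunk/bin/splunk restart
    run /opt/splunk/bin/splunk enable boot-start
}

DRY_RUN=1   # print only; change to 0 to really run the commands (as root)
install_splunk /path/to/splunk.tgz 4.0.7
finish_forwarder
```

Printing the commands first makes it easy to check the paths and version before anything under /opt is changed.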

LAYOUT

Current setup:

  • Pumpkin is the master collector.
  • Endeavour stores local and sends to Pumpkin.
  • Einstein sends to Pumpkin AND Endeavour, no local store.
  • Taro sends to Pumpkin, no local store.
  • Improv sends to Pumpkin, no local store.
  • Pepper sends to Pumpkin, no local store.
  • Roentgen is sucked directly out of /var/log through a mount on Pumpkin.