Running nagios-plugins via saltstack's peer communication system

So… my previous post was similar to this, but you most likely don't want to run the salt-master and nagios on the same server, so I had to find a way to let the nagios server execute its plugins on hosts via the salt-master. This can be done using the Python client API and saltstack's own peer communication system.

First of all, read this : http://docs.saltstack.com/ref/peer.html

Then check out my wrapper here : https://github.com/mortis1337/nagios-plugins/blob/master/check_by_saltpeer.py
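For the curious, here is a rough sketch of the idea. It is not the code from check_by_saltpeer.py. It assumes the nagios server runs a salt-minion that the master's peer config allows to publish cmd.run_all, and that salt.client.Caller with its cmd() method is available (older salt versions may only have Caller.function). The name check_via_peer and the injectable publish argument are made up for illustration.

```python
UNKNOWN = 3  # Nagios exit code for "could not determine state"


def check_via_peer(host, plugin_cmd, publish=None):
    """Run plugin_cmd on host by publishing cmd.run_all through the master."""
    if publish is None:
        # Assumption: this runs on a minion with peer publishing allowed.
        import salt.client
        caller = salt.client.Caller()

        def publish(tgt, fun, arg):
            return caller.cmd("publish.publish", tgt, fun, arg)

    returns = publish(host, "cmd.run_all", plugin_cmd)
    if not isinstance(returns, dict):
        returns = {}
    result = returns.get(host)
    # A missing or non-dict answer means the target never replied properly.
    if not isinstance(result, dict):
        return UNKNOWN, "UNKNOWN: no reply from %s via peer publish" % host
    return result.get("retcode", UNKNOWN), result.get("stdout", "")
```

Passing in a fake publish function makes it possible to try the parsing without a running salt setup.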

Yay! Now you can throw away NRPE forever and stop using ssh keys for the nagios user, if you are doing that already.

Nagios plugins over zmq? I like it :)


Running nagios-plugins via saltstack

I’m so sick of maintaining NRPE config on my servers, and I don't really want root ssh keys all over the place. Recently I discovered saltstack and started to play with it a bit. I came up with the idea of running Nagios (or Icinga) on the same server as my salt-master, so I created a little wrapper that lets me run nagios checks via saltstack.

Here’s how it works.

This is my little wrapper-script written in python: https://github.com/mortis1337/nagios-plugins/blob/master/check_by_salt.py

The wrapper takes a hostname, a plugin and a timeout value as arguments:

$ python check_by_salt.py -H examplehost -p "/path/to/existing/nagiosplugin arg1 arg2" -t 10

The wrapper imports salt, runs the command on the minion with cmd.run_all, and returns the output and the exit code.
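A minimal sketch of that flow, not the actual check_by_salt.py: it assumes salt.client.LocalClient is importable on the master, and the helper names to_nagios and run_check are made up for illustration.

```python
UNKNOWN = 3  # Nagios exit code for "could not determine state"


def to_nagios(host, returns):
    """Turn the dict returned by LocalClient.cmd() into (exitcode, output)."""
    result = returns.get(host)
    # No answer (timeout, unknown minion) or an error string instead of a dict.
    if not isinstance(result, dict):
        return UNKNOWN, "UNKNOWN: no usable reply from %s" % host
    output = result.get("stdout") or result.get("stderr") or ""
    return result.get("retcode", UNKNOWN), output


def run_check(host, plugin_cmd, timeout=10):
    """Run plugin_cmd on one minion via cmd.run_all (needs salt on the master)."""
    import salt.client  # imported here so to_nagios() works without salt
    client = salt.client.LocalClient()
    returns = client.cmd(host, "cmd.run_all", [plugin_cmd], timeout=timeout)
    return to_nagios(host, returns)
```

A real plugin would print the output and call sys.exit() with the exit code, since Nagios reads the state from the process exit status.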

For this to work as the nagios/icinga user, you will have to configure the client_acl for the user in the salt-master config, so go ahead and edit the master config file (default: /etc/salt/master).

Search for "client_acl" in the file and add this:

    client_acl:
      icinga:
        - cmd.*

Yeeaaaap, that's quite the security risk right there, but read up on how to limit what can be done with the cmd module in salt, and at least it will be safer than using ssh keys :)

check_by_salt in combination with https://github.com/mortis1337/nagios-plugins/blob/master/check_disk_generic.py will instantly give you monitoring of all your disks with no clientside-configuration.

Use it if you like it and feel free to improve it.

How a nerd monitors his wife’s weight

So I got myself a new body scale recently. Of course it had to be something of a gadget, so I went for this Withings BodyScale. Withings already has a nice webpage with graphs and stuff, and also a couple of really nice iPhone/iPad apps for it. The fact that it is integrated with other services like Runkeeper made me wonder whether they had an API I could query. They did. A quick search for “python withings api” also gave some results with examples on how to use it.

I came across this thing : https://github.com/mote/python-withings …and then it was pretty much just about writing a bit of nagios-logic around it to make it into a plugin.

The first result is here: https://github.com/mortis1337/check_wife

The script takes a userid, an apikey and a name as arguments.

$ ./check_wife.py  -u 1111111 -k xxxxxxxxxxxxx -n Your(or your wife’s;)name
WARNING: <yourname>’s overweight. Size: <yoursize> – Weight: <yourweight> BMI: <yourbmi>

The script will give a WARNING whenever the BMI value is above 25 or below 18.5.
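The threshold logic boils down to something like the sketch below. This is not the code from check_wife. The function name bmi_status and the message texts are made up for illustration, and it assumes weight in kilograms and height in metres.

```python
OK, WARNING = 0, 1  # Nagios exit codes


def bmi_status(weight_kg, height_m, name="yourname"):
    """Return (exitcode, message) using the 18.5 and 25 BMI limits."""
    bmi = float(weight_kg) / (height_m ** 2)
    if bmi > 25:
        return WARNING, "WARNING: %s's overweight. BMI: %.1f" % (name, bmi)
    if bmi < 18.5:
        return WARNING, "WARNING: %s's underweight. BMI: %.1f" % (name, bmi)
    return OK, "OK: %s's BMI is %.1f" % (name, bmi)
```

For example, 80 kg at 1.75 m is a BMI of about 26.1, which lands in the WARNING branch.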

Add this to your nagios-config and your operators can come point and laugh at you whenever a WARNING occurs :)

( yes, the “wife”-part is a joke…. go monitor your own weight;) )


gzip support in check_http

If you need gzip support in your nagios check_http plugin, here’s what you need to do.
First of all, fetch the latest version (1.4.15) of the nagios-plugins :

http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.15/

tar xzfv the downloaded file somewhere and enter the nagios-plugins-1.4.15/plugins directory…
Here you’ll find the check_http.c sourcefile which needs to be patched.
You can find the patch here :

http://sourceforge.net/tracker/index.php?func=detail&aid=3294169&group_id=29880&atid=397599

Patch the sourcefile with the patch command: patch check_http.c checkhttpgzipdeflate.patch
Go up one directory (cd ..) and run ./configure && make
You’ll have a freshly compiled check_http plugin with gzip support in the plugins-directory.
Copy it to your nagios-plugins directory or wherever you keep maintained versions.


Fun with sudo

Wanna have some fun with sudo?

A couple of neat tricks:

1. Insults when you type the wrong password:
echo "Defaults insults" >> /etc/sudoers
When your users type an incorrect password, they get insulted:
$ sudo su -
Password:
Are you on drugs?

2. Make a custom password prompt for when your users sudo
Add this line to /etc/sudoers: Defaults passprompt="YOU BREAK IT, YOU FIX IT!:"
When people log in and try to use sudo, they get a modified password prompt:
user@server:~$ sudo su -
YOU BREAK IT, YOU FIX IT!:

Any more tricks? Use comments :>


Monitor Dell servers on Debian Squeeze with Nagios

I'm just writing up this post because the dellomsa packages aren't working with the new Debian Squeeze 6.0.

I had problems with the omreport command not giving me info about e.g. memory/PSU/CPU (omreport chassis info said "No sensors found" etc.).

I spent some hours trying to get it working with a newer dellomsa, but that didn't work either.
Then I found some official Dell Ubuntu packages, which I found work excellently on Debian Squeeze as well:
dpkg -P dellomsa #Make sure dellomsa isnt installed.
echo 'deb http://linux.dell.com/repo/community/deb/latest /' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list
apt-get update
apt-get install srvadmin-base smbios-utils

You will also need the libsmbios2_2.2.13-0ubuntu4_amd64.deb from Ubuntu Lucid to get smbios stuff working.
dpkg -i libsmbios2_2.2.13-0ubuntu4_amd64.deb
/etc/init.d/dataeng start #if this starts, omreport works!

Now you have the newer Debian Squeeze Dell stuff working.

We have deployed hardware monitoring of our Dell servers with check_openmanage and Nagios.
Read more about check_openmanage on the check_openmanage site (this is a great plugin btw!)

Resources:
http://folk.uio.no/trondham/software/check_openmanage.html
http://linux.dell.com/repo/community/deb/latest/


Use screen instead of !”¤#¤”&”# minicom

I didn’t know this until the other day, but how awesome is this – You can use screen to connect to your serial console :)

screen /dev/ttyS0 9600

VOILA – you’re in


Test your jumbo frame enabled network with ping

ping -Mdo -s <size> <host>

If it works:
$ ping -Mdo -s 8001 10.0.20.26
PING 10.0.20.26 (10.0.20.26) 8001(8029) bytes of data.
8009 bytes from 10.0.20.26: icmp_req=1 ttl=64 time=0.450 ms
8009 bytes from 10.0.20.26: icmp_req=2 ttl=64 time=0.468 ms (DUP!)
8009 bytes from 10.0.20.26: icmp_req=3 ttl=64 time=0.447 ms

If it doesn't:
$ ping -Mdo -s 2001 195.10.34.51 -c3
PING 195.10.34.51 (195.10.34.51) 2001(2029) bytes of data.
From XX.XX.XX.XX icmp_seq=1 Frag needed and DF set (mtu = 1500)
From XX.XX.XX.XX icmp_seq=1 Frag needed and DF set (mtu = 1500)
From XX.XX.XX.XX icmp_seq=1 Frag needed and DF set (mtu = 1500)
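The number after -s is the ICMP payload size. The packet on the wire is 28 bytes bigger (20 bytes IPv4 header plus 8 bytes ICMP header), which is why 8001 shows up as 8029 in the output above. A small sketch for working out the biggest payload that fits a given MTU follows. The helper names are made up for illustration, and it assumes IPv4 without IP options.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu):
    """Largest -s value that still fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def ping_command(host, mtu, count=3):
    """Build the ping argument list for testing a path with the given MTU."""
    size = str(max_ping_payload(mtu))
    return ["ping", "-M", "do", "-s", size, "-c", str(count), host]
```

With a typical jumbo frame MTU of 9000 this gives -s 8972, and with a standard 1500 MTU it gives -s 1472.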


307 Temporary Redirect, to myself. Lets hope the image is done next time around

Funny FAIL-bug in the Drupal imagecache module. Last night we had some serious trouble with our sites and watched the requests on our load balancers go from 1k to 10k per second. From our graphs we found out which site was being hammered, and then checked varnishtop to see what was going on. Two missing images were causing a 307 temporary redirect. We fixed it fast by touching the missing files, and the traffic went away.

Today we did some research into what was going on, and with Firebug we found out that the page was trying to redirect to itself and fetch the same missing image over and over. Here’s the rather naughty code in the imagecache module:


if (file_exists($lockfile)) {
  watchdog('imagecache', 'ImageCache already generating: %dst, Lock file: %tmp.', array('%dst' => $dst, '%tmp' => $lockfile), WATCHDOG_NOTICE);
  // 307 Temporary Redirect, to myself. Lets hope the image is done next time around.
  header('Location: '. request_uri(), TRUE, 307);
  exit;
}

Now imagine if the image "doesn't exist the next time around".
FAIL!
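To see why this hammers the server, here is a toy model in Python. It is a simplification and not the module's real logic: it assumes the lock file stays around and the image never gets generated, and the function names and the redirect limit of 20 are made up for illustration.

```python
def imagecache_response(uri, lockfile_exists, image_exists):
    """Very rough model of the branch quoted above."""
    if image_exists:
        return 200, None
    if lockfile_exists:
        return 307, uri  # redirect back to the very same URI
    return 404, None


def follow(uri, lockfile_exists, image_exists, max_redirects=20):
    """Follow redirects like a client would; return (status, requests made)."""
    requests = 0
    while requests < max_redirects:
        requests += 1
        status, location = imagecache_response(uri, lockfile_exists, image_exists)
        if status != 307:
            return status, requests
        uri = location
    return 307, requests  # gave up, still being redirected
```

With a lock file present and no image, every single page view burns through the whole redirect budget before giving up, so the request rate multiplies.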


Nvidia and invalid checksum for EDID (Xorg-issues)

While installing a DVI splitter through an HDMI converter on our Jira dashboard, I found that the splitter fucked up the fancy automagic EDID (Extended Display Identification Data) validation and made the screen fall back to 640×480.
This made me shat brix.

After a lot of googling I found a fabulous option called IgnoreEDIDChecksum, which I put under the Screen section in xorg.conf.

Hurrayh for new fancy automagic-probe-validation-fuckups
