Monday, 19 March 2012

Keeping eth0 alive script

For some reason my servers at work are cursed with losing their network connection. You have to stay connected with some app that sends packets to keep the network alive (PuTTY, for example), or you won't be able to reach the server again without logging in directly and pinging someone. After a little googling a few years back, I thought it was related to the network card driver. I compiled a few driver versions from the vendor's site, but nothing changed. The second machine that I got later had the same problem, and so did a few of my co-workers' machines. Apparently this behaviour can also mean that the router is dropping the ARP table entries for my machines. I don't have access to the router, so I made a small hack/fix:

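Put this in the file /etc/dhclient-eth0.conf:
send dhcp-lease-time 1100;

A DHCP client normally tries to renew at roughly half the lease time, so this makes the client renew its lease every ~500 sec (as long as the server grants the requested lease time).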

Of course I wasn't happy with this approach :). It depended on the DHCP server working correctly, the server could assign a different IP, etc. After changing my eth0 configuration to a static IP address, I had to think of another hack/fix.

Let's call it /usr/local/bin/keepalive.sh

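#!/bin/bash

# Loops for as long as /root exists, i.e. effectively forever.
while [ -d /root ]
do
    # Only look at the connection table if the kernel exposes it.
    if test -e /proc/net/nf_conntrack
    then
        # Count established connections outside the 192.168.x.x virtual network.
        cnt=`grep -v "192\.168" /proc/net/nf_conntrack | grep EST | wc -l`
        # Fewer than 3 means the box is nearly idle, so send one ping.
        if [ "${cnt}" -lt 3 ]; then
            ping -c 1 10.1.1.13 > /dev/null 2>&1
        fi
    fi
    # Stop once /var/run/sshd.init.pid no longer exists.
    if [ ! -f /var/run/sshd.init.pid ]
    then
        break
    fi
    sleep 2m
done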

Some explanation:
/proc/net/nf_conntrack - file containing information about tracked network connections
grep -v "192\.168" - drops lines containing the string "192.168", because I have virtual machines running on this server, on the virtual network 192.168.1.0
grep EST - keeps only lines whose connection state is ESTABLISHED
wc -l - counts how many lines are left after filtering
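
To see what the filter chain does without touching a live server, here is a minimal sketch that runs it on a few made-up lines in roughly the nf_conntrack layout. The function name and the sample data are assumptions for illustration only, and grep -c is used as a shorthand for grep ... | wc -l.

```shell
#!/bin/bash
# Hypothetical helper: count ESTABLISHED connections read from stdin,
# ignoring every line that mentions the 192.168.x.x virtual network.
count_established() {
    grep -v "192\.168" | grep -c "EST"
}

# Made-up sample lines, not real output from any server.
sample='ipv4 2 tcp 6 431999 ESTABLISHED src=10.1.1.20 dst=10.1.1.13 sport=51234 dport=22
ipv4 2 tcp 6 431990 ESTABLISHED src=192.168.1.5 dst=192.168.1.1 sport=40000 dport=80
ipv4 2 tcp 6 110 TIME_WAIT src=10.1.1.20 dst=10.1.1.99 sport=51300 dport=443
ipv4 2 udp 17 25 src=10.1.1.20 dst=10.1.1.1 sport=123 dport=123'

# Only the first line survives both filters.
printf '%s\n' "$sample" | count_established   # prints 1
```

With a count of 1, the script above would fall below its threshold of 3 and send the ping.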

If you have no virtual network to exclude, instead of the whole cnt=... line you can just use:
cnt=`grep EST /proc/net/nf_conntrack | wc -l`

Or, if your system doesn't have /proc/net/nf_conntrack:
cnt=`ss -an | grep -v "192\.168" | grep "ESTAB" | wc -l`
or
cnt=`netstat -an | grep -v "127\.0" | grep -v "192\.168" | grep "ESTABLISHED" | wc -l`
although ss is faster.
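
The three variants can also be folded into one function that picks whichever source is available. This is only a sketch: the function name is made up, and falling back to 0 when nothing is available is my assumption (it simply means the script would ping).

```shell
#!/bin/bash
# Hypothetical helper: print the number of established connections using
# the first source that exists on this machine.
established_count() {
    if [ -r /proc/net/nf_conntrack ]; then
        grep -v "192\.168" /proc/net/nf_conntrack | grep -c "EST"
    elif command -v ss >/dev/null 2>&1; then
        ss -an 2>/dev/null | grep -v "192\.168" | grep -c "ESTAB"
    elif command -v netstat >/dev/null 2>&1; then
        netstat -an 2>/dev/null | grep -v "127\.0" | grep -v "192\.168" | grep -c "ESTABLISHED"
    else
        # Assumption: with no source at all, report 0 so the caller pings.
        echo 0
    fi
}
```

In keepalive.sh the cnt=... line would then become cnt=`established_count`, and the surrounding test for /proc/net/nf_conntrack could go away.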

But to run it from crontab at boot I had to make a middle-man script that runs my main script in the background. Scripts put in the background directly from cron seem to time out eventually and make cron nag about it through sendmail or syslog.

Let's call it /usr/local/bin/runbg.sh. Not much to see here:
#!/bin/bash
/usr/local/bin/keepalive.sh &

And for the final touch, add this line to /var/spool/cron/tabs/root (or add it with crontab -e as root):
@reboot /usr/local/bin/runbg.sh >> /tmp/keepalive.log 2>&1
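
If you want the middle-man script to be a bit more defensive, here is a rough sketch of a launcher that detaches the child's output and refuses to start a second copy. The function name and the pid file path are assumptions of mine, not something the setup above needs. The check only asks whether the recorded PID is alive, so it can be fooled if that PID gets reused by another process.

```shell
#!/bin/bash
# Hypothetical helper: start a command in the background unless the PID
# stored in the given pid file still belongs to a live process.
start_once() {
    local pidfile=$1
    shift
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "already running"
        return 0
    fi
    # Detach stdout/stderr so nothing is left holding the caller's output.
    nohup "$@" >/dev/null 2>&1 &
    echo $! > "$pidfile"
    echo "started"
}

# Example call (pid file path is an assumption):
# start_once /var/run/keepalive.pid /usr/local/bin/keepalive.sh
```

The first call prints "started" and a repeated call prints "already running" while the script is still alive.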
