Monitoring web application status with curl on Linux
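Our web application has stopped responding several times recently. Each incident was eventually resolved, but the application clearly needs monitoring.
Checking Apache with ps aux | grep httpd showed as many as 10 Apache processes. Checking JBoss directly on port 8080 with curl then showed that JBoss had stopped responding, which in turn dragged Apache down.
My shell skills are limited, so what follows is a simple curl-based monitor for the application.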
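The manual diagnosis described above can be sketched roughly as follows. The helper name count_httpd and the localhost URL are assumptions for illustration; the original only mentions ps aux | grep httpd and a curl request to port 8080.

```shell
# Count lines of `ps aux`-style input that mention httpd.
# The [h] bracket keeps grep from matching its own command line.
count_httpd() {
    grep -c '[h]ttpd'
}

# Usage on a live machine (host and path are assumed, not from the original):
#   ps aux | count_httpd
#   curl -o /dev/null -s -m 10 -w '%{http_code}\n' http://localhost:8080/
```

If the curl request to port 8080 times out while Apache processes keep piling up, the backend is the likely culprit rather than Apache itself.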
The machines to monitor are listed one per line in a file:
server.list
server1
server2
server3
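A script consuming this file might read it line by line, as in the sketch below. The function name list_servers is hypothetical, and skipping blank lines and lines starting with # is an added convenience that the original list format does not define.

```shell
# Print one host per line from a list file.
# Blank lines and lines starting with '#' are skipped (an assumed convention).
list_servers() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            ''|'#'*) continue ;;
        esac
        printf '%s\n' "$line"
    done < "$1"
}

# Usage: list_servers server.list
```

Reading with while read instead of expanding the output of cat avoids surprises if a line ever contains glob characters, and the extra [ -n "$line" ] test keeps a final line that lacks a trailing newline.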
Create the monitoring script, webstatus.sh:
#!/bin/sh
monitor_dir=/home/admin/monitor/
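# Make sure the working directory exists, then work inside it
if [ ! -d "$monitor_dir" ]; then
    mkdir -p "$monitor_dir"
fi
cd "$monitor_dir" || exit 1
# Log file that receives one line per check
web_stat_log=web.status
if [ ! -f "$web_stat_log" ]; then
    touch "$web_stat_log"
fi
# The server list must exist, otherwise log the error and stop
server_list_file=server.list
if [ ! -f "$server_list_file" ]; then
    echo "`date '+%Y-%m-%d %H:%M:%S'` ERROR: $server_list_file NOT exists!" >> "$web_stat_log"
    exit 1
fi
#total=`wc -l $server_list_file | awk '{print $1}'`
for website in `cat "$server_list_file"`
do
    url="http://$website/app.htm"
    # Discard the body and capture only the HTTP status code, giving up after 10 seconds
    server_status_code=`curl -o /dev/null -s -m 10 --connect-timeout 10 -w %{http_code} "$url"`
    if [ "$server_status_code" = "200" ]; then
        echo "`date '+%Y-%m-%d %H:%M:%S'` visit $website status code 200 OK" >> "$web_stat_log"
    else
        echo "`date '+%Y-%m-%d %H:%M:%S'` visit $website error!!! server can't connect at 10s or stop response at 10s, send alarm sms ..." >> "$web_stat_log"
        # Send the alarm text to the SMS gateway in the background (smsserver and port are placeholders)
        echo "!app alarm @136xxxxxxxx server:$website can't connect at 10s or stop response at 10s ..." | nc smsserver port &
    fi
done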
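exit 0
The core of the script is curl -o /dev/null -s -m 10 --connect-timeout 10 -w %{http_code} "$url": -o /dev/null discards the response body, -s silences progress output, -m 10 and --connect-timeout 10 cap the whole request and the connection phase at 10 seconds each, and -w %{http_code} prints only the HTTP status code. If a 200 does not come back within 10 seconds, the script sends an alarm.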
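The status check can be isolated into two small functions, sketched below. The names http_status and classify_status are hypothetical and not part of the original script; the curl options are the ones quoted above. When curl gets no HTTP response at all (timeout, connection refused), %{http_code} prints 000, so that case falls into the alert branch as well.

```shell
# Print only the HTTP status code for the URL given as $1.
# curl prints 000 when no HTTP response was received.
http_status() {
    curl -o /dev/null -s -m 10 --connect-timeout 10 -w '%{http_code}' "$1"
}

# Map a status code to the two outcomes the monitoring script distinguishes.
classify_status() {
    if [ "$1" = "200" ]; then
        echo "OK"
    else
        echo "ALERT"
    fi
}

# Usage: classify_status "$(http_status http://server1/app.htm)"
```

Treating everything other than 200 as a failure mirrors the original script; a page that legitimately answers with a redirect would need a looser test.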
Finally, have Linux run the script on a schedule:
crontab -e
*/10 * * * * /home/admin/app/bin/webstatus.sh
With this entry, the script runs once every 10 minutes.
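If the entry has to be installed without opening an editor, one possible approach is to filter the existing crontab through a small helper. The function ensure_cron_entry below is a hypothetical sketch, not something the original setup uses; it reads a crontab on stdin and appends the entry only when that exact line is missing.

```shell
# Print the crontab read from stdin, adding the entry ($1) only if that
# exact line is not already present.
ensure_cron_entry() {
    entry=$1
    current=$(cat)
    if printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$current"
    elif [ -n "$current" ]; then
        printf '%s\n%s\n' "$current" "$entry"
    else
        printf '%s\n' "$entry"
    fi
}

# Usage:
#   crontab -l 2>/dev/null | ensure_cron_entry '*/10 * * * * /home/admin/app/bin/webstatus.sh' | crontab -
```

Because the helper only transforms text, it can be run repeatedly without duplicating the line.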