Ningbo Network Company (Zhejiang Haishang Net): Nagios Monitoring Deployment

yaozhan189

Ningbo Network Company (Zhejiang Haishang Net): Nagios monitoring deployment, lab environment

Monitoring server (Nagios server, 192.168.152.133)
CentOS5.4 + nagios-3.2.0 + nagios-plugins-1.4.14 + nrpe-2.12

Monitored client (Linux client, 192.168.152.129)
CentOS5.4 + nagios-plugins-1.4.14 + nrpe-2.12

Monitored client (Linux client, 192.168.152.132)
CentOS5.4 + nagios-plugins-1.4.14 + nrpe-2.12

I. Preparing the software

apache2.2.14 // download: http://httpd.apache.org/download.cgi

php-5.1.6.tar.gz // Nagios 3 and later need PHP support

nagios3.2.0

nagios plugins1.4.14

nrpe2.12

cd /data/software
wget http://prdownloads.sourceforge.net/sourceforge/nagios/nagios-3.2.0.tar.gz
wget http://prdownloads.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.14.tar.gz
wget http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
wget http://apache.etoak.com/httpd/httpd-2.2.14.tar.gz

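II. Installing the monitoring server (192.168.152.133)

1. Install Apache

tar xvf httpd-2.2.14.tar.gz
cd httpd-2.2.14
./configure --prefix=/usr/local/apache2
make
make install
/usr/local/apache2/bin/apachectl start     # the default configuration is untouched, so Apache can be started directly
netstat -an | grep 80      # check that port 80 is now open

Alternatively, enter the server's IP address in a browser on another machine. If you see "It works!", Apache is installed correctly.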
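One caveat with `netstat -an | grep 80`: it also matches port 8080 or an outgoing connection to some remote port 80. Below is a minimal sketch of a stricter check, assuming the usual Linux `netstat -an` column layout (local address in column 4, state in the last column). The helper name `is_listening` is my own and is not part of Apache or Nagios.

```shell
#!/bin/sh
# is_listening PORT
# Reads `netstat -an`-style output on stdin and succeeds only if a TCP
# socket is in LISTEN state on exactly that local port.
# (Hypothetical helper; assumes the Linux netstat column layout.)
is_listening() {
    awk -v p="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" {
            n = split($4, parts, ":")      # column 4 is the local address
            if (parts[n] == p) found = 1   # the text after the last ":" is the port
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Usage: `netstat -an | is_listening 80 && echo "port 80 is open"`.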

2. Install Nagios

First add a nagios account:

useradd nagios -s /sbin/nologin   # some articles say the account must be enabled; that is not necessary, because this account never logs in
tar xvf nagios-3.2.0.tar.gz
cd nagios-3.2.0
./configure  --prefix=/usr/local/nagios  --with-nagios-user=nagios --with-nagios-group=nagios
make all
make install
make install-init    # installs the init script in /etc/rc.d/init.d
make install-config  # installs the sample configuration files under /usr/local/nagios/etc
make install-commandmode   # sets the permissions on the external command directory

3. Install the Nagios plugins

tar xvf  nagios-plugins-1.4.14.tar.gz
cd nagios-plugins-1.4.14
./configure  --prefix=/usr/local/nagios    # note: the plugins go under /usr/local/nagios, don't get this wrong
make
make install
chown -R nagios:nagios /usr/local/nagios

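III. Editing the configuration files

1. Edit the Apache configuration file; only the changed parts are shown

vi  /usr/local/apache2/conf/httpd.conf
# run Apache as user nagios
User nagios
# run Apache as group nagios
Group nagios
# Append the following to the end of the file: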
Scriptalias /nagios/cgi-bin /usr/local/nagios/sbin
        <directory "/usr/local/nagios/sbin">
        Authtype basic
        Options execcgi
        Allowoverride none
        Order allow,deny
        Allow from all
        Authname "nagios access"
        Authuserfile /usr/local/nagios/etc/htpasswd
        Require valid-user
        </directory>
 
   Alias /nagios /usr/local/nagios/share
        <directory "/usr/local/nagios/share">
        Authtype basic
         Options none
        Allowoverride none
        Order allow,deny
        Allow from all
        Authname "nagios access"
        Authuserfile /usr/local/nagios/etc/htpasswd
        Require valid-user
        </directory>

Don't forget to restart Apache afterwards.

2. Edit the CGI control file cgi.cfg

vi  /usr/local/nagios/etc/cgi.cfg
# enable authentication
use_authentication=1
default_user_name=test
authorized_for_system_information=nagiosadmin,test
authorized_for_configuration_information=nagiosadmin,test
authorized_for_system_commands=nagiosadmin,test
authorized_for_all_services=nagiosadmin,test
authorized_for_all_hosts=nagiosadmin,test
authorized_for_all_service_commands=nagiosadmin,test
authorized_for_all_host_commands=nagiosadmin,test
# The user "test" added here can shut down, restart and otherwise control Nagios from the browser.
# For security you can also remove the nagiosadmin user from these lines.
# Separate multiple users with commas, e.g. nagiosadmin,test

Set a password for the test account:
/usr/local/apache2/bin/htpasswd  -c /usr/local/nagios/etc/htpasswd test
New password: enter your password
Re-type new password: enter it again to confirm

To test, open http://<your server IP>/nagios in a browser. A login dialog appears.

Enter the user name and password you just set to log in to the monitoring console. Next, continue with the remaining configuration files.

3. Configure the main Nagios configuration file

This file defines where the object configuration files used in the following steps are stored. Only the modified parts are shown.

vi /usr/local/nagios/etc/nagios.cfg
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
# commented out: for easier management we write our own contacts file
#cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
# contacts and contact groups
cfg_file=/usr/local/nagios/etc/contacts.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
# commented out: we use our own time period file instead
#cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
# object templates
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
# service definitions
cfg_file=/usr/local/nagios/etc/services.cfg
# commented out
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
# hosts and host groups
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg

# allow the web interface to restart Nagios and to disable host/service checks; off by default
check_external_commands=1
# how often external commands are checked; the "s" suffix means seconds (the sample file ships with -1, i.e. as often as possible)
command_check_interval=10s
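Several of the files named above (contacts.cfg, services.cfg, hosts.cfg and so on) do not exist yet and are created in the following steps, so it is easy to forget one. Here is a minimal sketch that lists every active `cfg_file=` path that is still missing. The function name `missing_cfg_files` is an assumption of mine, not a Nagios tool.

```shell
#!/bin/sh
# missing_cfg_files MAIN_CFG
# Prints each path named by an active (uncommented) cfg_file= line in
# MAIN_CFG that does not exist on disk. Prints nothing when all exist.
# (Hypothetical helper, not shipped with Nagios.)
missing_cfg_files() {
    sed -n 's/^cfg_file=//p' "$1" | while IFS= read -r path; do
        [ -f "$path" ] || printf '%s\n' "$path"
    done
}
```

Usage: `missing_cfg_files /usr/local/nagios/etc/nagios.cfg`. An empty result means every referenced file is in place; the real validation is still `nagios -v`, shown near the end of this article.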

4. Configure the timeperiods.cfg file

This file defines the time periods during which servers are monitored. Usually that is 24 hours a day, under the name 24x7:

vi /usr/local/nagios/etc/timeperiods.cfg
define timeperiod{
        timeperiod_name         24x7
        alias                      24 hours a day,7days a week
        sunday                   00:00-24:00
        monday                  00:00-24:00
        tuesday                  00:00-24:00
        wednesday               00:00-24:00
        thursday                 00:00-24:00
        friday                    00:00-24:00
        saturday                 00:00-24:00
        }

Note that there must be no trailing spaces after the time period name.
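Trailing spaces are invisible in vi, so a quick check helps. The sketch below prints every line of a file that ends in a space or tab; the function name `trailing_ws_lines` is made up for this example.

```shell
#!/bin/sh
# trailing_ws_lines FILE
# Prints "LINENO:LINE" for every line that ends in whitespace.
# Exit status is non-zero when no such line exists (plain grep behaviour).
trailing_ws_lines() {
    grep -n '[[:space:]]$' "$1"
}
```

Usage: `trailing_ws_lines /usr/local/nagios/etc/timeperiods.cfg`.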

5. Create the contact configuration file, contacts.cfg

vi /usr/local/nagios/etc/contacts.cfg
 define contact {
     contact_name         yaozhan189
     alias                system administrator
     service_notification_period    24x7
     host_notification_period       24x7
     service_notification_options   w,u,c,r
     host_notification_options       d,u,r
     service_notification_commands  notify-service-by-email
     host_notification_commands     notify-host-by-email
     email                          yaozhan189@163.com
    # pager                         13800138000
     }

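This creates a contact named yaozhan189. The most important options are explained below.

# The time period during which service problems are notified. It must be one defined earlier in timeperiods.cfg.

service_notification_period 24x7

# The time period during which host problems are notified. It must be one defined earlier in timeperiods.cfg.

host_notification_period 24x7

# Notify the contact in four cases: the service is in w (warning), u (unknown) or c (critical) state, or r, it has recovered from a problem

service_notification_options w,u,c,r

# Notify the contact in three cases: the host is d (down), u (unreachable), or r, it has recovered from a problem

host_notification_options d,u,r

# The command used to send service notifications. notify-service-by-email is defined in commands.cfg and sends the contact an e-mail. The name may differ in Nagios 2.x, so check your own commands.cfg. SMS notification can also be configured here, provided you have a script that sends SMS messages and add the command that calls it to commands.cfg.

service_notification_commands notify-service-by-email

# As above: host problems are also notified by e-mail

host_notification_commands notify-host-by-email

# The contact's e-mail address

email yaozhan189@163.com

# The contact's mobile number. This only matters when SMS notification is supported; SMS alerts are not enabled in this setup.

pager 13800138000

If there are several contacts, copy this definition once per contact and adjust it.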

6. Create the contact group configuration file, contactgroups.cfg, which puts several contacts into one group

vi  /usr/local/nagios/etc/contactgroups.cfg
define contactgroup{
        contactgroup_name       sagroup
        alias                   system administrator group
        members                 yaozhan189
 }

Note: every contact listed under members must be defined in contacts.cfg. Separate multiple contacts with commas.
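The rule above can be checked before Nagios is ever started. The following is a rough sketch, assuming one `contact_name` per contact and a single comma-separated `members` line per group, as in the files shown here. The function name `undefined_members` is hypothetical.

```shell
#!/bin/sh
# undefined_members CONTACTS_CFG GROUPS_CFG
# Prints each name found in a "members" directive of GROUPS_CFG that has
# no matching "contact_name" directive in CONTACTS_CFG.
# (Illustrative helper; it does not understand templates or wildcards.)
undefined_members() {
    awk '
        NR == FNR { if ($1 == "contact_name") known[$2] = 1; next }
        $1 == "members" {
            list = $0
            sub(/^[ \t]*members[ \t]+/, "", list)
            n = split(list, names, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++) {
                gsub(/[ \t]+$/, "", names[i])
                if (names[i] != "" && !(names[i] in known)) print names[i]
            }
        }
    ' "$1" "$2"
}
```

Usage: `undefined_members /usr/local/nagios/etc/contacts.cfg /usr/local/nagios/etc/contactgroups.cfg`. No output means every member is defined.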

7. Create the host configuration file, hosts.cfg

vi  /usr/local/nagios/etc/hosts.cfg
define host{
        host_name               linux129
        alias                   linux-129
        address                 192.168.152.129
        contact_groups          sagroup
        check_command           check-host-alive
        max_check_attempts      5
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        }
define host{
        host_name               linux132
        alias                   linux-132
        address                 192.168.152.132
        contact_groups          sagroup
        check_command           check-host-alive
        max_check_attempts      5
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        }

Two hosts are defined here as an example. To add more hosts, copy a definition and change the relevant values.
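If there are many hosts, the copy-and-edit step can be scripted. This is only a sketch: it prints a host definition with the same settings as the two hosts above, taking the host name and address as arguments. The function name `host_definition` and the example host linux130 are assumptions for illustration.

```shell
#!/bin/sh
# host_definition NAME ADDRESS
# Prints a host object that uses the same contact group, check command
# and notification settings as the hosts defined in hosts.cfg above.
# (Hypothetical helper; the alias is simply set to the host name.)
host_definition() {
    cat <<EOF
define host{
        host_name               $1
        alias                   $1
        address                 $2
        contact_groups          sagroup
        check_command           check-host-alive
        max_check_attempts      5
        notification_interval   10
        notification_period     24x7
        notification_options    d,u,r
        }
EOF
}
```

Usage: `host_definition linux130 192.168.152.130 >> /usr/local/nagios/etc/hosts.cfg`.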

8. Create the hostgroups.cfg file

vi  /usr/local/nagios/etc/hostgroups.cfg
define hostgroup{
        hostgroup_name  sa-servers
        alias           sa servers
        members         linux129,linux132
        }

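This works much like the contact group file: separate multiple hosts with commas, and every host listed under members must be defined in hosts.cfg. This file is in fact optional.

OK, that completes the basic part. Now comes the key part: the contacts and the monitored hosts are defined, but not yet what to monitor on those hosts. This step sets up checks on the hosts. Nagios mainly monitors local resources and public services. Local resources include CPU, disk, swap and memory; public services include web, FTP, SMTP, POP3 and so on.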

9. Define the items to monitor, called services, by creating services.cfg

vi  /usr/local/nagios/etc/services.cfg
# check whether the host is alive
define service{
       #host_name               nagios-server
        hostgroup_name          sa-servers
        service_description     check-host-alive
        check_command           check-host-alive
        max_check_attempts      5
        normal_check_interval   5
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          sagroup
        }
# check the host's web service
define service{
       #host_name               nagios-server
        hostgroup_name          sa-servers
        service_description     check_tcp 80
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   3
        retry_check_interval    2
        contact_groups          sagroup
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        check_command           check_tcp!80
        }
# check the host's CPU load
define service{
       #host_name               nagios-server
        hostgroup_name          sa-servers
        service_description     cpu load
        check_command           check_nrpe!check_load
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   3
        retry_check_interval    2
        contact_groups          sagroup
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        }
# check the host's process count
define service{
       #host_name               nagios-server
        hostgroup_name          sa-servers
        service_description     total-procs
        check_command           check_nrpe!check_total_procs
        check_period            24x7
        max_check_attempts      4
        normal_check_interval   3
        retry_check_interval    2
        contact_groups          sagroup
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        }

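Notes:

host_name: must be a host defined in the host configuration file hosts.cfg.

check_command: a command defined in commands.cfg. For check_nrpe, the argument after "!" names a command defined in nrpe.cfg on the monitored host.

max_check_attempts: the maximum number of check attempts; about 4 is a typical value.

normal_check_interval and retry_check_interval: the check intervals, in minutes.

notification_interval: once a problem has been detected, how often the alert is sent again, in minutes.

notification_options: the same options as in the contact configuration file.

contact_groups: a group name defined in contactgroups.cfg.

Note: the command named after check_command must be defined in commands.cfg.

To monitor other items or hosts, copy a definition and change the relevant options.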
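To get a feel for how these values interact, here is a small worked calculation. It assumes Nagios' usual behaviour: the first failed check counts as attempt 1, each further attempt follows after retry_check_interval, and notifications start once max_check_attempts is reached. It also assumes the default interval unit of one minute and ignores scheduling delays. The function name is my own.

```shell
#!/bin/sh
# minutes_to_hard_state MAX_CHECK_ATTEMPTS RETRY_CHECK_INTERVAL
# Prints the approximate number of minutes between the first failed check
# and the point where the problem is confirmed and notifications begin.
# (Rough estimate under the assumptions stated above.)
minutes_to_hard_state() {
    echo $(( ($1 - 1) * $2 ))
}
```

For the web service above (max_check_attempts 4, retry_check_interval 2), `minutes_to_hard_state 4 2` prints 6: three retries, two minutes apart.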

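IV. Installing NRPE

Install OpenSSL (the openssl-devel package provides the headers needed to compile NRPE with SSL support):

yum install openssl openssl-devel

tar  xvf  nrpe-2.12.tar.gz
cd  nrpe-2.12
./configure  --prefix=/usr/local/nrpe
make
make install

# Copy the plugin files. After installation, /usr/local/nrpe/libexec contains only check_nrpe, and /usr/local/nagios/libexec does not have it. In addition, the default commands defined in nrpe.cfg point to plugins under /usr/local/nrpe/libexec, so those plugins have to be copied there as well. If you do not copy them, you must change the command paths in nrpe.cfg, otherwise the check_command entries in services.cfg will fail with a "command not found" error. Copy the following files: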

cp /usr/local/nrpe/libexec/check_nrpe  /usr/local/nagios/libexec
cp /usr/local/nagios/libexec/check_disk  /usr/local/nrpe/libexec
cp /usr/local/nagios/libexec/check_load  /usr/local/nrpe/libexec
cp /usr/local/nagios/libexec/check_ping  /usr/local/nrpe/libexec
cp /usr/local/nagios/libexec/check_procs  /usr/local/nrpe/libexec
cp /usr/local/nagios/libexec/check_users  /usr/local/nrpe/libexec

# Edit the NRPE configuration file; only the changed lines are shown

vi  /usr/local/nrpe/etc/nrpe.cfg
# address the NRPE daemon listens on
server_address=192.168.152.133
# hosts allowed to query this daemon (the Nagios monitoring server)
allowed_hosts=127.0.0.1,192.168.152.133
command[check_users]=/usr/local/nrpe/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nrpe/libexec/check_load -w 15,10,5 -c 30,25,20
# commented out
#command[check_hda1]=/usr/local/nrpe/libexec/check_disk -w 20 -c 10 -p /dev/hda1
# added: checks free space on all disks; bare numbers are in MB, write 20% / 10% for percentages
command[check_df]=/usr/local/nrpe/libexec/check_disk -w 20 -c 10
command[check_zombie_procs]=/usr/local/nrpe/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nrpe/libexec/check_procs -w 150 -c 200
# checks the number of IP connections
command[check_ips]=/usr/local/nrpe/libexec/ip_conn.sh 8000 10000

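Notes:

● command[check_users]=/usr/local/nrpe/libexec/check_users -w 5 -c 10: by default the check_users plugin is expected in /usr/local/nrpe/libexec/, but that directory does not contain it after installation, so copy it over from /usr/local/nagios/libexec/. Alternatively, change the line to command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10. Otherwise, calling check_users reports that the command does not exist.
PS: for convenience I simply copied those files over from /usr/local/nagios/libexec.

● In nrpe.cfg above, the part inside the square brackets "[ ]" is the command name, that is, what can follow check_nrpe -c. The part after the equals sign is the path of the plugin that is actually executed, with its arguments. From top to bottom, the commands check the number of logged-in users, the load average, disk space, zombie processes, total processes and the connection count.

● To add other checks, remember to define the corresponding command here. For example, to monitor swap usage with a warning when free space drops below 20% and a critical state below 10%, add this line to nrpe.cfg: command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%. Other checks are added the same way. The usage of a plugin can be looked up with a command such as /usr/local/nagios/libexec/check_swap -h.

● command[check_ips]=/usr/local/nrpe/libexec/ip_conn.sh 8000 10000 checks the number of IP connections.

The ip_conn.sh script has to be written by hand; its content is given below: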

vi /usr/local/nrpe/libexec/ip_conn.sh

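#!/bin/sh
# Usage: ip_conn.sh WARN CRIT
# The argument check stays disabled because nrpe.cfg always passes both values.
#if [ $# -ne 2 ]
#then
# echo "usage: $0 num1 num2"
#exit 3
#fi

# count the established TCP connections
ip_conns=`netstat -an | grep tcp | grep ESTABLISHED | wc -l`

# a count equal to a threshold is treated as having reached that level,
# so every possible value produces exactly one status line
if [ "$ip_conns" -lt "$1" ]
then
        echo "ok -connectcounts is $ip_conns"
        exit 0
elif [ "$ip_conns" -lt "$2" ]
then
        echo "warning -connectcounts is $ip_conns"
        exit 1
else
        echo "critical -connectcounts is $ip_conns"
        exit 2
fi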

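Because nrpe.cfg already passes the two arguments the script needs, the script does not have to validate them. Once the number of IP connections passes 8000 the check returns a warning; past 10000 it returns "critical". Put the script in /usr/local/nrpe/libexec and make it executable.

Note: the script comes from Tian Yi's 《开源监控利器 nagios》.

Edit /usr/local/nagios/etc/objects/commands.cfg and append the following at the end: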

########################################################################
# 'check_nrpe ' command definition
define command{
        command_name check_nrpe
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

This adds support for the check_nrpe command. Without it, a service definition such as "check_command check_nrpe!check_load" fails with an error saying that the check_nrpe command is not defined.

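V. Setting up the monitored hosts

This part is based on "nagios 全攻略 ( 四 )---- 监控 linux 上的"本地信息"" from the yahoon 小屋 blog: http://yahoon.blog.51cto.com/13184/41893

How NRPE works:

NRPE consists of two parts:

– the check_nrpe plugin, which lives on the monitoring host

– the NRPE daemon, which runs on the remote Linux host (usually the monitored machine)

The whole monitoring process is as follows.

When Nagios needs to check a service or resource on a remote Linux host:

1. Nagios runs the check_nrpe plugin and tells it what to check.

2. The check_nrpe plugin connects to the remote NRPE daemon over SSL.

3. The NRPE daemon runs the corresponding Nagios plugin to perform the check.

4. The NRPE daemon returns the result to the check_nrpe plugin, which hands it to Nagios for processing.

Note: the NRPE daemon needs the Nagios plugins installed on the remote Linux host; without them the daemon cannot monitor anything.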
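In step 4 the result travels back mainly as the plugin's exit status, which check_nrpe passes on unchanged. By the standard plugin convention, 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. The sketch below translates a status into its name; the function `state_name` is an illustrative helper, and treating every other value as UNKNOWN is a simplification of mine.

```shell
#!/bin/sh
# state_name CODE
# Translates a plugin exit status into the Nagios state name.
# (Illustrative helper; any value other than 0, 1 or 2 is reported
# as UNKNOWN in this sketch.)
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}
```

Usage, from the monitoring server once NRPE is running on the client: `/usr/local/nagios/libexec/check_nrpe -H 192.168.152.129 -c check_load; state_name $?`.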

1. Linux hosts
1) Add the user

#  useradd  nagios  -s /sbin/nologin

2) Install the Nagios plugins

tar xvf  nagios-plugins-1.4.14.tar.gz
cd nagios-plugins-1.4.14
./configure  --prefix=/usr/local/nagios   
make
make install

Fix the directory ownership:

chown -R  nagios:nagios  /usr/local/nagios
chown -R  nagios:nagios  /usr/local/nagios/libexec

Install openssl and openssl-devel:

yum install openssl openssl-devel

3) Install NRPE

tar  xvf  nrpe-2.12.tar.gz
cd  nrpe-2.12
./configure  --prefix=/usr/local/nagios  --enable-ssl --with-ssl-lib    # NRPE is also installed under the nagios directory
make  all
make  install-plugin    # installs the check_nrpe plugin
make install-daemon     # installs the daemon
make install-daemon-config    # installs the configuration file

Note: OpenSSL must be installed before NRPE, because the monitoring server and the monitored hosts communicate over SSL.

4) Edit the nrpe.cfg configuration file

# allow the monitoring server to connect; separate addresses with commas
allowed_hosts=127.0.0.1,192.168.152.133
# Adjust the NRPE check commands and add any that are needed:
# The following examples use hardcoded command arguments...
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
#command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_df]=/usr/local/nagios/libexec/check_disk -w 20 -c 10
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
command[check_tcp]=/usr/local/nagios/libexec/check_tcp -p 80

5) Start the NRPE service

# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

6) Test locally that the commands defined in nrpe.cfg work

/usr/local/nagios/libexec/check_nrpe -H localhost
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_users
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_load
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_df
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_zombie_procs
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_total_procs
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_swap
/usr/local/nagios/libexec/check_nrpe -H localhost  -c  check_tcp

7) On the monitoring server, edit /usr/local/nagios/etc/services.cfg and add the corresponding service checks.

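VI. Starting the NRPE service and checking its configuration

1. Start NRPE as a standalone daemon

      /usr/local/nrpe/bin/nrpe -c /usr/local/nrpe/etc/nrpe.cfg -d

2. Check the system log; after a normal start it shows the following: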

      [root@rhel nrpe]# tail /var/log/messages
oct 15 18:01:16 rhel nrpe[11791]: starting up daemon
oct 15 18:01:16 rhel nrpe[11791]: listening for connections on port 5666
oct 15 18:01:16 rhel nrpe[11791]: allowing connections from: 127.0.0.1,192.168.152.133

Check the port:

[root@rhel nrpe]# netstat -an |grep 5666
tcp        0      0 0.0.0.0:5666        0.0.0.0:*            listen

Check the process:

[root@rhel nrpe]# ps aux |grep nrpe |grep -v grep
nagios   11791  0.0  0.1   4868   928 ?   Ss   18:01   0:00 nrpe -c /usr/local/nrpe/etc/nrpe.cfg -d

3. Check the plugin
1) Check the NRPE version

      [root@rhel nrpe]# /usr/local/nrpe/libexec/check_nrpe -H 192.168.152.133
NRPE v2.12

2) Check that the commands defined in nrpe.cfg work, i.e. check the host's resources:

[root@rhel nrpe]# /usr/local/nrpe/libexec/check_nrpe -H 192.168.152.133 -c check_df
disk ok - free space: / 5245 mb (60% inode=95%); /home 13329 mb (80% inode=99%); /var 843 mb (9% inode=99%); /boot 82 mb (88% inode=99%); /dev/shm 235 mb (100% inode=99%);| /=3495mb;9197;9207;0;9217 /home=3215mb;17426;17436;0;17446 /var=7897mb;9197;9207;0;9217 /boot=10mb;78;88;0;98 /dev/shm=0mb;215;225;0;235

[root@rhel nrpe]# /usr/local/nrpe/libexec/check_nrpe -H 192.168.152.133 -c check_load
ok - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;

[root@rhel nrpe]# /usr/local/nrpe/libexec/check_nrpe -H 192.168.152.133 -c check_ips
ok -connectcounts is 4

The other commands defined in nrpe.cfg can be tested the same way.
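Rather than typing each command name by hand, the names can be pulled out of nrpe.cfg. This is a minimal sketch, assuming every active command is written as `command[name]=...` at the start of a line, as in the file above. The function name `nrpe_command_names` is hypothetical.

```shell
#!/bin/sh
# nrpe_command_names CFG
# Prints the name inside the brackets of each active command[...] line;
# commented-out lines are skipped because they do not start with "command[".
# (Hypothetical helper, not shipped with NRPE.)
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example loop over all defined commands (left commented out on purpose):
# for c in $(nrpe_command_names /usr/local/nrpe/etc/nrpe.cfg); do
#     /usr/local/nrpe/libexec/check_nrpe -H 192.168.152.133 -c "$c"
# done
```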

VII. Starting Nagios

First check the configuration files for errors:

[root@rhel nrpe]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
 
nagios core 3.2.0
reading configuration data...
   read main config file okay...
... (output omitted) ...
   read object config files okay...
 
checking misc settings...
 
total warnings: 0
total errors:   0
 
things look okay - no serious problems were detected during the pre-flight check

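If errors are reported, correct them all as the messages suggest. When the output shows total warnings: 0 and total errors: 0, as above, the configuration files are fine and Nagios can be started:

/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg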
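The two steps can be chained so that Nagios is only started when the pre-flight check passes. This is a sketch under the assumption that `nagios -v` exits with a non-zero status when it finds errors. The wrapper name `start_if_valid` is my own.

```shell
#!/bin/sh
# start_if_valid NAGIOS_BIN MAIN_CFG
# Runs the pre-flight check (-v) and starts the daemon (-d) only when
# the check exits with status 0.
# (Hypothetical wrapper; assumes -v returns non-zero on errors.)
start_if_valid() {
    if "$1" -v "$2" >/dev/null; then
        "$1" -d "$2"
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}
```

Usage: `start_if_valid /usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg`.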

Starting at boot

Add the following line to /etc/rc.d/rc.local to start NRPE at boot:

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

Likewise, to start Nagios at boot, add this line to /etc/rc.d/rc.local:

/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
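When this is done by script instead of by hand, it is worth making the edit repeatable so that running it twice does not add the line twice. Here is a minimal sketch; the function name `add_boot_line` is an assumption.

```shell
#!/bin/sh
# add_boot_line FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# (Hypothetical helper; matches the whole line literally with grep -xF.)
add_boot_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

Usage: `add_boot_line /etc/rc.d/rc.local "/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg"`.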

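Once Nagios has started normally, open http://192.168.152.133/nagios in a browser and log in with the user name and password set in part III, step 2.

After logging in, click Services under Current Status on the left to see the hosts and services monitored by the Ningbo Network Company (Zhejiang Haishang Net) monitoring server.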

2010-07-20 10:29