
Nightingale Single-Node Deployment

Server Deployment
1. Environment Preparation
1.1 Server planning
Host: 10.0.0.100 master

Host: 10.0.0.111 slave

1.2 Configure the time (confirm that the server time matches the current time)
# Check whether the current time matches the actual time
[root@10.0.0.100 /opt/n9e]# date
Tue Mar  1 15:40:14 CST 2022
# If it does not match, synchronize the time
yum -y install ntp ntpdate
ntpdate 0.asia.pool.ntp.org
# Write the system time to the hardware clock
hwclock --systohc
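
The time check above can also be scripted so that the master and slave are compared directly. The following is a minimal sketch, not part of the original procedure: the function name clock_drift_ok and the 5-second default tolerance are assumptions chosen for illustration.

```shell
# clock_drift_ok LOCAL_EPOCH REMOTE_EPOCH [TOLERANCE_SECONDS]
# Succeeds when the two Unix timestamps differ by at most TOLERANCE seconds.
# The 5-second default is an arbitrary illustrative choice.
clock_drift_ok() {
    local_ts=$1
    remote_ts=$2
    tolerance=${3:-5}
    drift=$((local_ts - remote_ts))
    # Take the absolute value of the difference.
    if [ "$drift" -lt 0 ]; then
        drift=$((0 - drift))
    fi
    [ "$drift" -le "$tolerance" ]
}

# Example (assumes passwordless SSH from the master to the slave):
#   clock_drift_ok "$(date +%s)" "$(ssh root@10.0.0.111 date +%s)" || echo "clocks differ, run ntpdate"
```

If the function fails, rerun the ntpdate and hwclock commands on the host that is off.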

1.3 Install services
1.3.1 Install Prometheus
mkdir -p /opt/prometheus
wget https://s3-gz01.didistatic.com/n9e-pub/prome/prometheus-2.28.0.linux-amd64.tar.gz -O prometheus-2.28.0.linux-amd64.tar.gz
tar xf prometheus-2.28.0.linux-amd64.tar.gz
cp -far prometheus-2.28.0.linux-amd64/* /opt/prometheus/
1.3.2 Write the systemd unit
cat <<EOF >/etc/systemd/system/prometheus.service
[Unit]
Description="prometheus"
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple

ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --storage.tsdb.path=/opt/prometheus/data --web.enable-lifecycle --enable-feature=remote-write-receiver --query.lookback-delta=2m

Restart=on-failure
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=prometheus

[Install]
WantedBy=multi-user.target
EOF

1.3.3 Start the service and check it
systemctl daemon-reload
systemctl enable prometheus
systemctl restart prometheus
systemctl status prometheus
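
Besides systemctl status, Prometheus exposes a /-/healthy endpoint that can be polled until the service answers. The sketch below is a generic retry helper; the name retry and its argument order are assumptions, and the example assumes Prometheus listens on its default port 9090 (the unit file above does not override it).

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between failed attempts. Returns 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example: wait up to about 20 seconds for Prometheus to report healthy.
#   retry 10 2 curl -fsS http://127.0.0.1:9090/-/healthy
```

The same helper can wrap any later check in this guide, for example a port probe after starting n9e.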

1.3.4 Install MySQL (MariaDB)
yum -y install mariadb*
systemctl enable mariadb
systemctl restart mariadb
mysql -e "SET PASSWORD FOR 'root'@'localhost' = PASSWORD('1234');"    # Sets the root password to 1234; change as needed

1.3.5 Install Redis
yum install -y redis
systemctl enable redis
systemctl restart redis
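In the example above, the MySQL root password is set to 1234. Keeping it
unchanged is recommended, because it saves you from editing the
configuration files later.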
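
The next section calls wget, tar, and the mysql client, so it can help to confirm those commands are present before continuing. This is a minimal sketch; the helper name require_cmds is an assumption for illustration.

```shell
# require_cmds CMD...
# Succeeds only if every named command is found on the PATH;
# prints each missing command to stderr.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example: stop early if a prerequisite was not installed.
#   require_cmds wget tar mysql redis-cli || exit 1
```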

2. Install the Nightingale Components
mkdir -p /opt/n9e && cd /opt/n9e

# Check https://github.com/didi/nightingale/releases for the latest package;
# the package URL in this document may no longer be the latest.
tarball=n9e-5.0.0-ga-06.tar.gz
urlpath=https://github.com/didi/nightingale/releases/download/v5.0.0-ga-06/${tarball}
wget $urlpath || exit 1

tar zxvf ${tarball}

mysql -uroot -p1234 < docker/initsql/a-n9e.sql
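
The download URL above is built from the release tag and the tarball name. The sketch below wraps that pattern in a function so the version only has to be written once. The function name n9e_release_url is hypothetical, and it is an assumption that other releases follow the same v<version>/n9e-<version>.tar.gz naming as 5.0.0-ga-06; check the releases page before relying on it.

```shell
# n9e_release_url VERSION
# Prints the GitHub release URL for the given version, following the
# naming pattern of the 5.0.0-ga-06 release used in this guide.
n9e_release_url() {
    version=$1
    printf 'https://github.com/didi/nightingale/releases/download/v%s/n9e-%s.tar.gz\n' \
        "$version" "$version"
}

# Example:
#   wget "$(n9e_release_url 5.0.0-ga-06)" || exit 1
```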

2.1 Write the systemd unit file
cat <<EOF >/etc/systemd/system/n9e.service
[Unit]
Description=Nightingale
After=syslog.target network.target

[Service]
Type=forking
ExecStart=/bin/bash /opt/n9e/n9e.sh
ExecStop=/bin/pkill n9e
ExecReload=/bin/pkill -USR2 n9e
PrivateTmp=true

[Install]
WantedBy=multi-user.target
EOF

2.2 Startup script file
cat <<EOF >/opt/n9e/n9e.sh
#!/bin/bash
cd /opt/n9e/
./n9e server &> server.log &
./n9e webapi &> webapi.log &
EOF
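2.3 Start the service and check the logs and ports
systemctl daemon-reload
systemctl enable n9e
systemctl restart n9e
cd /opt/n9e/
cat server.log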

There should be no error messages; all output should be at the info level.

cat webapi.log

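This step mainly verifies that both services started correctly. On success,
server listens on port 19000 by default, webapi listens on port 18000, and
neither log contains errors.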

Open the webapi port (18000 by default) in a browser to try out the
features. The default user is root and the default password is root.2020.
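
The port check described above can be done by filtering the output of ss -tln. The following is a minimal sketch; the function name port_listening is an assumption, and it expects the usual ss -tln column layout in which the fourth column is the local address and port.

```shell
# port_listening PORT
# Reads `ss -tln` style output on stdin and succeeds if a LISTEN socket
# is bound to PORT on any local address.
port_listening() {
    awk -v port="$1" '
        # Column 4 is "Local Address:Port"; the port is the text after the last colon.
        $1 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example: confirm that both n9e processes are listening.
#   ss -tln | port_listening 19000 || echo "n9e server is not listening"
#   ss -tln | port_listening 18000 || echo "n9e webapi is not listening"
```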

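3. Collect Monitoring Data with Telegraf
Telegraf is an open-source collector from InfluxData that gathers metrics
from operating systems and a wide range of middleware; its list of supported
targets is extensive. Telegraf follows an all-in-one design: a single binary
can collect CPU, memory, MySQL, MongoDB, Redis, SNMP, and other metrics. By
contrast, Prometheus uses one exporter per monitored target, which is
somewhat more cumbersome to manage. A single binary is also easier to
distribute.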

3.1 Install script (create a .sh script file in any location)
#!/bin/sh

version=1.20.4
tarball=telegraf-${version}_linux_amd64.tar.gz
wget https://dl.influxdata.com/telegraf/releases/$tarball
tar xzvf $tarball

mkdir -p /opt/telegraf
cp -far telegraf-${version}/usr/bin/telegraf /opt/telegraf

cat <<EOF > /opt/telegraf/telegraf.conf


[global_tags]

[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false

[[outputs.opentsdb]]
host = "http://127.0.0.1"
port = 19000
http_batch_size = 50
http_path = "/opentsdb/put"
debug = false
separator = "_"

[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = true

[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs",
"squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.system]]
fielddrop = ["uptime_format"]

[[inputs.net]]
ignore_protocol_stats = true

EOF

cat <<EOF > /etc/systemd/system/telegraf.service


[Unit]
Description="telegraf"
After=network.target

[Service]
Type=simple

ExecStart=/opt/telegraf/telegraf --config telegraf.conf


WorkingDirectory=/opt/telegraf
SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=telegraf
KillMode=process
KillSignal=SIGQUIT
TimeoutStopSec=5
Restart=always

[Install]
WantedBy=multi-user.target
EOF

3.2 Check whether it started
systemctl daemon-reload
systemctl enable telegraf
systemctl restart telegraf
systemctl status telegraf

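The /opt/telegraf/telegraf.conf above is a trimmed-down version intended only
to get the program running quickly. To collect from more targets, such as
MySQL, Redis, or Tomcat, read the configuration file extracted from the
tarball carefully; it is thoroughly commented. The README under each
official input plugin is also a useful reference.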
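
With the configuration above, Telegraf posts metrics in OpenTSDB format to /opentsdb/put on port 19000. To see what such a data point looks like, the sketch below builds one by hand. The function name opentsdb_point and the metric name demo_metric are hypothetical, and it is an assumption that the n9e server accepts a plain OpenTSDB JSON body sent with curl in the same way it accepts Telegraf's output.

```shell
# opentsdb_point METRIC VALUE HOST [TIMESTAMP]
# Prints a one-element JSON array in the OpenTSDB HTTP put format.
# TIMESTAMP is a Unix time in seconds and defaults to the current time.
opentsdb_point() {
    printf '[{"metric":"%s","timestamp":%s,"value":%s,"tags":{"host":"%s"}}]\n' \
        "$1" "${4:-$(date +%s)}" "$2" "$3"
}

# Example: push a single test value to the local n9e server.
#   opentsdb_point demo_metric 1 "$(hostname)" |
#       curl -fsS -X POST -H 'Content-Type: application/json' \
#            --data-binary @- http://127.0.0.1:19000/opentsdb/put
```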

4. Client Deployment (10.0.0.111)
The client only needs Telegraf installed.

4.1 Install script (create a .sh script file in any location)
#!/bin/sh

version=1.20.4
tarball=telegraf-${version}_linux_amd64.tar.gz
wget https://dl.influxdata.com/telegraf/releases/$tarball
tar xzvf $tarball

mkdir -p /opt/telegraf
cp -far telegraf-${version}/usr/bin/telegraf /opt/telegraf

cat <<EOF > /opt/telegraf/telegraf.conf


[global_tags]

[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false

[[outputs.opentsdb]]
host = "http://10.0.0.100"           # NOTE: set this to the server IP
port = 19000
http_batch_size = 50
http_path = "/opentsdb/put"
debug = false
separator = "_"

[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = true

[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs",
"squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.system]]
fielddrop = ["uptime_format"]

[[inputs.net]]
ignore_protocol_stats = true

EOF

cat <<EOF > /etc/systemd/system/telegraf.service


[Unit]
Description="telegraf"
After=network.target

[Service]
Type=simple
ExecStart=/opt/telegraf/telegraf --config telegraf.conf
WorkingDirectory=/opt/telegraf

SuccessExitStatus=0
LimitNOFILE=65536
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=telegraf
KillMode=process
KillSignal=SIGQUIT
TimeoutStopSec=5
Restart=always

[Install]
WantedBy=multi-user.target
EOF

4.2 Check whether it started
systemctl daemon-reload
systemctl enable telegraf
systemctl restart telegraf
systemctl status telegraf
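
If the client starts but no data appears on the server, the first thing to check is where the client's [[outputs.opentsdb]] section points. The sketch below prints the effective endpoint from a telegraf.conf; the function name opentsdb_endpoint is an assumption, and the parser only handles the simple key = value layout used in this guide, not arbitrary TOML.

```shell
# opentsdb_endpoint CONF_FILE
# Prints host:port/http_path from the [[outputs.opentsdb]] section.
opentsdb_endpoint() {
    awk -F'"' '
        # Any section header switches the flag; it is on only inside outputs.opentsdb.
        /^\[/ { in_section = ($0 ~ /^\[\[outputs\.opentsdb\]\]/) }
        in_section && /^[ \t]*host[ \t]*=/      { host = $2 }
        in_section && /^[ \t]*http_path[ \t]*=/ { path = $2 }
        in_section && /^[ \t]*port[ \t]*=/ {
            split($0, kv, "=")
            port = kv[2]
            gsub(/[^0-9]/, "", port)   # keep only the digits of the port value
        }
        END { print host ":" port path }
    ' "$1"
}

# Example: on the client, this should print the server address.
#   opentsdb_endpoint /opt/telegraf/telegraf.conf
```

On the client in this guide, the printed address should start with the master's IP, 10.0.0.100.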
