Grafana+InfluxDB+Telegraf+SequoiaDB
Grafana作为一款强大的开源的监控软件,可以进行灵活的报表定制与性能监控。用户可以通过时序数据库InfluxDB作为数据源为Grafana提供性能监控数据,同时使用Telegraf作为性能数据采集工具从SequoiaDB中定时采集性能指标,已达到准实时性能监控的目的。
整个框架的搭建非常简单,需要编写代码的仅4行,基本上可以通过合理的配置完成环境的搭建。
本文使用1台阿里云RHEL7环境,并详细标明了每一步用户需要执行的命令。
1)安装Telegraf
直接通过wget下载安装
[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/telegraf/releases/telegraf-1.5.0-1.x86_64.rpm
[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall telegraf-1.5.0-1.x86_64.rpm
2)安装InfluxDB
直接通过wget下载安装
[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/influxdb/releases/influxdb-1.4.2.x86_64.rpm
[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall influxdb-1.4.2.x86_64.rpm
3)安装Grafana
直接通过wget下载安装
[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.6.3-1.x86_64.rpm
[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall grafana-4.6.3-1.x86_64.rpm
4)安装SequoiaDB
在官网注册下载SequoiaDB后解压,所有参数使用默认即可。
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiadb-2.8.4-linux_x86_64-enterprise-installer.run –SMS true
5)连接SequoiaDB SAC进行图形化安装
参考官方文档http://doc.sequoiadb.com/cn/SequoiaDB-cat_id-1483944500-edition_id-208
5.1)admin/admin用户名密码登录
5.2)选择右下方一键部署
5.3)添加本机为主机
5.4)配置单节点单副本最小集群
5.5)确认配置点击下一步
5.6)启动成功
5.7)如果需要SQL接口可以下载并安装SequoiaSQL插件
所有安装参数均使用默认值即可。
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiasql-oltp-2.8.4-x86_64-enterprise-installer.run
5.7.1)启动SequoiaSQL服务
# 从root用户切换到sdbadmin用户
[root@iZ2ze06q07wqluc8htj4pdZ sequoiasqloltp]# su - sdbadmin
上一次登录:五 1月 5 13:44:47 CST 2018pts/1 上
# 创建一个叫做testinst的实例
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl addinst testinst -D /home/sdbadmin/sdb_data
Adding instance testinst ...
Ok
# 启动testinst实例
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl start testinst
Starting instance testinst ...
ok (PID: 24206)
# 创建一个叫做testdb的数据库
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl createdb testdb testinst
Creating database testinst ...
ok
# 开始创建SDB集合空间与集合
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"
localhost:11810
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.createCS ('testcs')"
localhost:11810.testcs
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.testcs.createCL('testcl')"
localhost:11810.testcs.testcl
# 创建PGSQL映射表
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/psql -p 5432 testdb
psql (9.3.4)
Type "help" for help.
# 创建Foreign Data Wrapper
testdb=# create extension sdb_fdw;
CREATE EXTENSION
# 创建SequoiaDB服务
testdb=# create server sdb_server foreign data wrapper sdb_fdw options(address '127.0.0.1', service '11810');
CREATE SERVER
# 创建SequoiaDB映射表,真实表存在于SequoiaDB之内
testdb=# create foreign table test ( name text, id numeric) server sdb_server options ( collectionspace 'testcs', collection 'testcl', decimal 'on');
CREATE FOREIGN TABLE
# 收集一下统计信息
testdb=# analyze test;
ANALYZE
# 确认表是空的
testdb=# select * from test ;
name | id
------+----
(0 rows)
# 写一条数据进去
testdb=# insert into test values('one', 1);
INSERT 0 1
# 试着更新一下
testdb=# update test set id=9 where name='one';
UPDATE 1
# 然后查看一下
testdb=# select * from test ;
name | id
------+----
one | 9
(1 row)
5.7.2)向SAC中添加SequoiaSQL服务
点击发现业务
选择SequoiaSQL引擎
填入内网IP地址与端口
在数据中可以看到相应的表与数据
6)验证SequoiaDB性能监控快照
# 连接数据库,验证snapshot与REST接口都是通的
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"
localhost:11810
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.snapshot(SDB_SNAP_DATABASE)"
{
"TotalNumConnects": 1,
"TotalDataRead": 16187,
"TotalIndexRead": 238,
"TotalDataWrite": 15,
"TotalIndexWrite": 16,
"TotalUpdate": 8,
"TotalDelete": 0,
"TotalInsert": 7,
"ReplUpdate": 0,
"ReplDelete": 0,
"ReplInsert": 0,
"TotalSelect": 9706,
"TotalRead": 9738,
"TotalReadTime": 0,
"TotalWriteTime": 0,
"freeLogSpace": 1644167168,
"vsize": 4315893760,
"rss": 71836,
"fault": 2,
"TotalMapped": 1607335936,
"svcNetIn": 761,
"svcNetOut": 1396,
"shardNetIn": 1564857,
"shardNetOut": 1296727,
"replNetIn": 0,
"replNetOut": 0,
"ErrNodes": []
}
Return 1 row(s).
[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ curl -d "cmd=snapshot database" "172.17.230.225:11814"
{ "errno": 0 }{ "TotalNumConnects": 1.0, "TotalDataRead": 19679.0, "TotalIndexRead": 242.0, "TotalDataWrite": 15.0, "TotalIndexWrite": 16.0, "TotalUpdate": 8.0, "TotalDelete": 0.0, "TotalInsert": 7.0, "ReplUpdate": 0.0, "ReplDelete": 0.0, "ReplInsert": 0.0, "TotalSelect": 11804.0, "TotalRead": 11836.0, "TotalReadTime": 0.0, "TotalWriteTime": 0.0, "freeLogSpace": 1644167168.0, "vsize": 4315893760.0, "rss": 71842.0, "fault": 2.0, "TotalMapped": 1607335936.0, "svcNetIn": 761.0, "svcNetOut": 1396.0, "shardNetIn": 1920028.0, "shardNetOut": 1672847.0, "replNetIn": 0.0, "replNetOut": 0.0, "ErrNodes": [] }
7)启动influxdb与telegraf
# 启动influxdb服务
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart influxdb
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep influxdb
4 S influxdb 22288 1 1 80 0 - 55908 futex_ 14:09 ? 00:00:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
0 S root 22299 18679 0 80 0 - 28169 pipe_w 14:09 pts/1 00:00:00 grep --color=auto influxdb
# 进入influxdb并创建数据库
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> create database "telegraf"
> show databases
name: databases
name
----
telegraf
_internal
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep telegraf
4 S telegraf 22195 1 1 80 0 - 23463 futex_ 14:08 ? 00:00:00 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegra/telegraf.d
0 S root 22206 18679 0 80 0 - 28169 pipe_w 14:08 pts/1 00:00:00 grep --color=auto telegraf
# telegraf启动以后确认influxdb可以被正常写入监控数据
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> use telegraf
Using database telegraf
> select * from cpu limit 10
name: cpu
time cpu host usage_guest usage_guest_nice usage_idle usage_iowait usage_irq usage_nice usage_softirq usage_steal usage_system usage_user
---- --- ---- ----------- ---------------- ---------- ------------ --------- ---------- ------------- ----------- ------------ ----------
1515132510000000000 cpu-total iZ2ze06q07wqluc8htj4pdZ 0 0 98.89834752131219 0.10015022533794657 0 0 0 0 0.35052578868309764 0.650976464697222
1515132510000000000 cpu0 iZ2ze06q07wqluc8htj4pdZ 0 0 98.69608826480209 0.10030090270805804 0 0 0 0 0.3009027081244592 0.9027081243733776
1515132510000000000 cpu1 iZ2ze06q07wqluc8htj4pdZ 0 0 99.09999999999854 0.09999999999990905 0 0 0 0 0.3999999999999915 0.40000000000006253
8)使用exec插件收集SequoiaDB性能指标
这里是本框架中唯一需要编写代码的地方,总共四行代码,分别为test.sh与test.js。
# 创建test.sh与test.js程序用于定期收集SequoiaDB性能指标
[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.sh
#!/bin/sh
/opt/sequoiadb/bin/sdb -f /tmp/test.js | sed '/Return 1/d'
[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.js
var db = new Sdb();
db.snapshot(SDB_SNAP_DATABASE);
# telegraf使用自己的用户执行,因此在root下创建的脚本需要给全局执行权限
[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.sh
[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.js
# 在telegraf配置中配置telegraf.conf,反注释inputs.exec并编辑相关内容
[root@iZ2ze06q07wqluc8htj4pdZ ~]# vi /etc/telegraf/telegraf.conf
……
[[inputs.exec]]
commands = ["/tmp/test.sh"]
timeout = "5s"
data_format = "json"
# 验证配置文件,看到inputs.exec类型的数据确实收集了
[root@iZ2ze06q07wqluc8htj4pdZ ~]# telegraf --test
2018/01/05 16:02:27 I! Using config file: /etc/telegraf/telegraf.conf
* Plugin: inputs.mem, Collection 1
> mem,host=iZ2ze06q07wqluc8htj4pdZ available=3092996096i,used=882089984i,buffered=66174976i,used_percent=22.190462451570358,available_percent=77.80953754842965,total=3975086080i,cached=3042557952i,active=2068545536i,inactive=1559457792i,slab=164098048i,free=133742592i 1515139347000000000
……
* Plugin: inputs.exec, Collection 1
> exec,host=iZ2ze06q07wqluc8htj4pdZ TotalUpdate=8,TotalRead=12064,TotalDataWrite=15,shardNetOut=1795359,svcNetOut=1396,replNetIn=0,TotalWriteTime=0,vsize=4525711360,replNetOut=0,TotalIndexRead=242,TotalSelect=12032,svcNetIn=761,TotalIndexWrite=16,ReplInsert=0,TotalNumConnects=1,fault=2,ReplUpdate=0,ReplDelete=0,TotalReadTime=0,TotalInsert=7,rss=72156,TotalDelete=0,TotalDataRead=20059,freeLogSpace=1644167168,shardNetIn=2034612,TotalMapped=1607335936 1515139348000000000
* Plugin: inputs.kernel, Collection 1
> kernel,host=iZ2ze06q07wqluc8htj4pdZ boot_time=1515116065i,processes_forked=67543i,interrupts=11615115i,context_switches=20009921i 1515139348000000000
……
# 重启telegraf服务
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf
# 确认数据每十秒钟录入influxdb
[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx
Connected to http://localhost:8086 version 1.4.2
InfluxDB shell version: 1.4.2
> use telegraf
Using database telegraf
> select * from exec
name: exec
time ReplDelete ReplInsert ReplUpdate TotalDataRead TotalDataWrite TotalDelete TotalIndexRead TotalIndexWrite TotalInsert TotalMapped TotalNumConnects TotalRead TotalReadTime TotalSelect TotalUpdate TotalWriteTime fault freeLogSpace host replNetIn replNetOut rss shardNetIn shardNetOut svcNetIn svcNetOut vsize
---- ---------- ---------- ---------- ------------- -------------- ----------- -------------- --------------- ----------- ----------- ---------------- --------- ------------- ----------- ----------- -------------- ----- ------------ ---- --------- ---------- --- ---------- ----------- -------- --------- -----
1515139522000000000 0 0 0 20064 15 0 242 16 7 1607335936 1 12067 0 12035 8 0 2 1644167168 iZ2ze06q07wqluc8htj4pdZ 0 0 72156 2036130 1796971 761 1396 4525711360
1515140020000000000 0 0 0 20069 15 0 242 16 7 1607335936 1 12070 0 12038 8 0 2 1644167168 iZ2ze06q07wqluc8htj4pdZ 0 0 72156 2037648 1798583 761 1396 4525711360
9)启动grafana
[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl start grafana-server
[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep grafana
4 S grafana 5708 1 5 80 0 - 74671 futex_ 16:57 ? 00:00:00 /usr/sbin/grafan-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins
0 S root 5725 1702 0 80 0 - 28169 pipe_w 16:57 pts/1 00:00:00 grep --color=auto grafana
连接3000端口可以看到
使用admin/admin登录后添加influxdb数据源
点击Save&Test通过测试
10)制定报表
在Dashboard中创建一个新的Panel,数据源选择刚刚添加的SequoiaDB,然后在指标中配置好4个指标TotalDataRead、TotalDataWrite、TotalIndexRead、TotalIndexWrite。
分别各做一个table和一个折线图
11)小结
至此为止,我们通过使用InfluxDB与Telegraf,仅编写4行代码就将Grafana监控软件与SequoiaDB成功对接。感兴趣的读者甚至可以编写并提交针对SequoiaDB的Telegraf inputs插件,使得其他用户不需要编写脚本即可直接获得SequoiaDB的性能指标。