博客 > 分布式数据库对接开源监控软件

分布式数据库对接开源监控软件

 2019-02-22  分布式 ,最佳实践 ,工具

 Grafana+InfluxDB+Telegraf+SequoiaDB


Grafana作为一款强大的开源的监控软件,可以进行灵活的报表定制与性能监控。用户可以通过时序数据库InfluxDB作为数据源为Grafana提供性能监控数据,同时使用Telegraf作为性能数据采集工具从SequoiaDB中定时采集性能指标,已达到准实时性能监控的目的。

 

整个框架的搭建非常简单,需要编写代码的仅4行,基本上可以通过合理的配置完成环境的搭建。


本文使用1台阿里云RHEL7环境,并详细标明了每一步用户需要执行的命令。

 

1)安装Telegraf

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/telegraf/releases/telegraf-1.5.0-1.x86_64.rpm

[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall telegraf-1.5.0-1.x86_64.rpm


2)安装InfluxDB

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://dl.influxdata.com/influxdb/releases/influxdb-1.4.2.x86_64.rpm

[root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall influxdb-1.4.2.x86_64.rpm


3)安装Grafana

直接通过wget下载安装

[root@iZ2ze06q07wqluc8htj4pdZ ~]# wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-4.6.3-1.x86_64.rpm 

 [root@iZ2ze06q07wqluc8htj4pdZ ~]# yum localinstall grafana-4.6.3-1.x86_64.rpm


4)安装SequoiaDB

在官网注册下载SequoiaDB后解压,所有参数使用默认即可。

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiadb-2.8.4-linux_x86_64-enterprise-installer.run –SMS true


5)连接SequoiaDB SAC进行图形化安装

参考官方文档http://doc.sequoiadb.com/cn/SequoiaDB-cat_id-1483944500-edition_id-208


5.1)admin/admin用户名密码登录

图片.png

5.2)选择右下方一键部署

图片.png

 

5.3)添加本机为主机

图片.png

 

5.4)配置单节点单副本最小集群 

图片.png


5.5)确认配置点击下一步

图片.png

 

5.6)启动成功

图片.png

 

5.7)如果需要SQL接口可以下载并安装SequoiaSQL插件

所有安装参数均使用默认值即可。

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ./sequoiasql-oltp-2.8.4-x86_64-enterprise-installer.run 

    

5.7.1)启动SequoiaSQL服务

# 从root用户切换到sdbadmin用户

[root@iZ2ze06q07wqluc8htj4pdZ sequoiasqloltp]# su - sdbadmin

上一次登录:五 1月  5 13:44:47 CST 2018pts/1 上

# 创建一个叫做testinst的实例

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl addinst testinst -D /home/sdbadmin/sdb_data

Adding instance testinst ...

Ok

# 启动testinst实例

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl start testinst

Starting instance testinst ...

ok (PID: 24206)

# 创建一个叫做testdb的数据库

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/sdb_sql_ctl createdb testdb testinst

Creating database testinst ...

ok

# 开始创建SDB集合空间与集合

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"

localhost:11810

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.createCS ('testcs')"

localhost:11810.testcs

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.testcs.createCL('testcl')"

localhost:11810.testcs.testcl

# 创建PGSQL映射表

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ /opt/sequoiasqloltp/bin/psql -p 5432 testdb

psql (9.3.4)

Type "help" for help.

# 创建Foreign Data Wrapper

testdb=# create extension sdb_fdw;

CREATE EXTENSION

# 创建SequoiaDB服务

testdb=# create server sdb_server foreign data wrapper sdb_fdw options(address '127.0.0.1', service '11810');

CREATE SERVER

# 创建SequoiaDB映射表,真实表存在于SequoiaDB之内

testdb=# create foreign table test ( name text, id numeric) server sdb_server options ( collectionspace 'testcs', collection 'testcl', decimal 'on');

CREATE FOREIGN TABLE

# 收集一下统计信息

testdb=# analyze test;

ANALYZE

# 确认表是空的

testdb=# select * from test ;

 name | id 

------+----

(0 rows)

# 写一条数据进去

testdb=# insert into test values('one', 1);

INSERT 0 1

# 试着更新一下

testdb=# update test set id=9 where name='one';

UPDATE 1

# 然后查看一下

testdb=# select * from test ;

 name | id 

------+----

 one  |  9

(1 row)


5.7.2)向SAC中添加SequoiaSQL服务

点击发现业务

图片.png


选择SequoiaSQL引擎 

图片.png


填入内网IP地址与端口

图片.png


在数据中可以看到相应的表与数据 

图片.png
图片.png


6)验证SequoiaDB性能监控快照

# 连接数据库,验证snapshot与REST接口都是通的

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db=new Sdb()"

localhost:11810

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ sdb "db.snapshot(SDB_SNAP_DATABASE)"

{

  "TotalNumConnects": 1,

  "TotalDataRead": 16187,

  "TotalIndexRead": 238,

  "TotalDataWrite": 15,

  "TotalIndexWrite": 16,

  "TotalUpdate": 8,

  "TotalDelete": 0,

  "TotalInsert": 7,

  "ReplUpdate": 0,

  "ReplDelete": 0,

  "ReplInsert": 0,

  "TotalSelect": 9706,

  "TotalRead": 9738,

  "TotalReadTime": 0,

  "TotalWriteTime": 0,

  "freeLogSpace": 1644167168,

  "vsize": 4315893760,

  "rss": 71836,

  "fault": 2,

  "TotalMapped": 1607335936,

  "svcNetIn": 761,

  "svcNetOut": 1396,

  "shardNetIn": 1564857,

  "shardNetOut": 1296727,

  "replNetIn": 0,

  "replNetOut": 0,

  "ErrNodes": []

}

Return 1 row(s).

[sdbadmin@iZ2ze06q07wqluc8htj4pdZ ~]$ curl -d "cmd=snapshot database" "172.17.230.225:11814" 

{ "errno": 0 }{ "TotalNumConnects": 1.0, "TotalDataRead": 19679.0, "TotalIndexRead": 242.0, "TotalDataWrite": 15.0, "TotalIndexWrite": 16.0, "TotalUpdate": 8.0, "TotalDelete": 0.0, "TotalInsert": 7.0, "ReplUpdate": 0.0, "ReplDelete": 0.0, "ReplInsert": 0.0, "TotalSelect": 11804.0, "TotalRead": 11836.0, "TotalReadTime": 0.0, "TotalWriteTime": 0.0, "freeLogSpace": 1644167168.0, "vsize": 4315893760.0, "rss": 71842.0, "fault": 2.0, "TotalMapped": 1607335936.0, "svcNetIn": 761.0, "svcNetOut": 1396.0, "shardNetIn": 1920028.0, "shardNetOut": 1672847.0, "replNetIn": 0.0, "replNetOut": 0.0, "ErrNodes": [] }


7)启动influxdb与telegraf

# 启动influxdb服务

[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart influxdb

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep influxdb

4 S influxdb 22288     1  1  80   0 - 55908 futex_ 14:09 ?        00:00:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

0 S root     22299 18679  0  80   0 - 28169 pipe_w 14:09 pts/1    00:00:00 grep --color=auto influxdb

# 进入influxdb并创建数据库

[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx

Connected to http://localhost:8086 version 1.4.2

InfluxDB shell version: 1.4.2

> create database "telegraf"

> show databases

name: databases

name

----

telegraf

_internal

[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep telegraf

4 S telegraf 22195     1  1  80   0 - 23463 futex_ 14:08 ?        00:00:00 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegra/telegraf.d

0 S root     22206 18679  0  80   0 - 28169 pipe_w 14:08 pts/1    00:00:00 grep --color=auto telegraf

# telegraf启动以后确认influxdb可以被正常写入监控数据

[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx

Connected to http://localhost:8086 version 1.4.2

InfluxDB shell version: 1.4.2

> use telegraf

Using database telegraf

> select * from cpu limit 10

name: cpu

time                cpu       host                    usage_guest usage_guest_nice usage_idle        usage_iowait        usage_irq usage_nice usage_softirq usage_steal usage_system        usage_user

----                ---       ----                    ----------- ---------------- ----------        ------------        --------- ---------- ------------- ----------- ------------        ----------

1515132510000000000 cpu-total iZ2ze06q07wqluc8htj4pdZ 0           0                98.89834752131219 0.10015022533794657 0         0          0             0           0.35052578868309764 0.650976464697222

1515132510000000000 cpu0      iZ2ze06q07wqluc8htj4pdZ 0           0                98.69608826480209 0.10030090270805804 0         0          0             0           0.3009027081244592  0.9027081243733776

1515132510000000000 cpu1      iZ2ze06q07wqluc8htj4pdZ 0           0                99.09999999999854 0.09999999999990905 0         0          0             0           0.3999999999999915  0.40000000000006253


8)使用exec插件收集SequoiaDB性能指标

这里是本框架中唯一需要编写代码的地方,总共四行代码,分别为test.sh与test.js。

# 创建test.sh与test.js程序用于定期收集SequoiaDB性能指标

[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.sh

#!/bin/sh

/opt/sequoiadb/bin/sdb -f /tmp/test.js | sed '/Return 1/d'

[root@iZ2ze06q07wqluc8htj4pdZ ~]# cat /tmp/test.js

var db = new Sdb();

db.snapshot(SDB_SNAP_DATABASE);

# telegraf使用自己的用户执行,因此在root下创建的脚本需要给全局执行权限

[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.sh

[root@iZ2ze06q07wqluc8htj4pdZ ~]# chmod 777 /tmp/test.js

 

# 在telegraf配置中配置telegraf.conf,反注释inputs.exec并编辑相关内容

[root@iZ2ze06q07wqluc8htj4pdZ ~]# vi /etc/telegraf/telegraf.conf

……

[[inputs.exec]]

commands = ["/tmp/test.sh"]

timeout = "5s"

data_format = "json"

 

# 验证配置文件,看到inputs.exec类型的数据确实收集了

[root@iZ2ze06q07wqluc8htj4pdZ ~]# telegraf --test

2018/01/05 16:02:27 I! Using config file: /etc/telegraf/telegraf.conf

* Plugin: inputs.mem, Collection 1

> mem,host=iZ2ze06q07wqluc8htj4pdZ available=3092996096i,used=882089984i,buffered=66174976i,used_percent=22.190462451570358,available_percent=77.80953754842965,total=3975086080i,cached=3042557952i,active=2068545536i,inactive=1559457792i,slab=164098048i,free=133742592i 1515139347000000000

……

* Plugin: inputs.exec, Collection 1

> exec,host=iZ2ze06q07wqluc8htj4pdZ TotalUpdate=8,TotalRead=12064,TotalDataWrite=15,shardNetOut=1795359,svcNetOut=1396,replNetIn=0,TotalWriteTime=0,vsize=4525711360,replNetOut=0,TotalIndexRead=242,TotalSelect=12032,svcNetIn=761,TotalIndexWrite=16,ReplInsert=0,TotalNumConnects=1,fault=2,ReplUpdate=0,ReplDelete=0,TotalReadTime=0,TotalInsert=7,rss=72156,TotalDelete=0,TotalDataRead=20059,freeLogSpace=1644167168,shardNetIn=2034612,TotalMapped=1607335936 1515139348000000000

* Plugin: inputs.kernel, Collection 1

> kernel,host=iZ2ze06q07wqluc8htj4pdZ boot_time=1515116065i,processes_forked=67543i,interrupts=11615115i,context_switches=20009921i 1515139348000000000

……

 

# 重启telegraf服务

[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl restart telegraf

 

# 确认数据每十秒钟录入influxdb

[root@iZ2ze06q07wqluc8htj4pdZ ~]# influx

Connected to http://localhost:8086 version 1.4.2

InfluxDB shell version: 1.4.2

> use telegraf

Using database telegraf

> select * from exec

name: exec

time                ReplDelete ReplInsert ReplUpdate TotalDataRead TotalDataWrite TotalDelete TotalIndexRead TotalIndexWrite TotalInsert TotalMapped TotalNumConnects TotalRead TotalReadTime TotalSelect TotalUpdate TotalWriteTime fault freeLogSpace host                    replNetIn replNetOut rss   shardNetIn shardNetOut svcNetIn svcNetOut vsize

----                ---------- ---------- ---------- ------------- -------------- ----------- -------------- --------------- ----------- ----------- ---------------- --------- ------------- ----------- ----------- -------------- ----- ------------ ----                    --------- ---------- ---   ---------- ----------- -------- --------- -----

1515139522000000000 0          0          0          20064         15             0           242            16              7           1607335936  1                12067     0             12035       8           0              2     1644167168   iZ2ze06q07wqluc8htj4pdZ 0         0          72156 2036130    1796971     761      1396      4525711360

1515140020000000000 0          0          0          20069         15             0           242            16              7           1607335936  1                12070     0             12038       8           0              2     1644167168   iZ2ze06q07wqluc8htj4pdZ 0         0          72156 2037648    1798583     761      1396      4525711360


9)启动grafana

[root@iZ2ze06q07wqluc8htj4pdZ ~]# systemctl start grafana-server

[root@iZ2ze06q07wqluc8htj4pdZ ~]# ps -elf | grep grafana

4 S grafana   5708     1  5  80   0 - 74671 futex_ 16:57 ?        00:00:00 /usr/sbin/grafan-server --config=/etc/grafana/grafana.ini --pidfile=/var/run/grafana/grafana-server.pid cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins

0 S root      5725  1702  0  80   0 - 28169 pipe_w 16:57 pts/1    00:00:00 grep --color=auto grafana


连接3000端口可以看到

 

图片.png

 

使用admin/admin登录后添加influxdb数据源 

图片.png

点击Save&Test通过测试

 

10)制定报表

在Dashboard中创建一个新的Panel,数据源选择刚刚添加的SequoiaDB,然后在指标中配置好4个指标TotalDataRead、TotalDataWrite、TotalIndexRead、TotalIndexWrite。

 

图片.png

 

分别各做一个table和一个折线图

图片.png

 

11)小结

至此为止,我们通过使用InfluxDB与Telegraf,仅编写4行代码就将Grafana监控软件与SequoiaDB成功对接。感兴趣的读者甚至可以编写并提交针对SequoiaDB的Telegraf inputs插件,使得其他用户不需要编写脚本即可直接获得SequoiaDB的性能指标。



准备开始体验 SequoiaDB 巨杉数据库?