SequoiaDB can be scaled out in several ways. This article walks through a few expansion practices, with sample code for each.
Environment preparation
Create the domain
db.createDomain("domainbefore",["datagroup1","datagroup2","datagroup3"], {AutoSplit:true});
Create the collection space
db.createCS("pri5_image",{Domain:"domainbefore"});
Create the main collection
db.pri5_image.createCL("atnnotate",{ShardingKey:{"dated":1},ShardingType:"range",IsMainCL:true});
Create the sub-collections
db.pri5_image.createCL("atnnotate2016",{ShardingKey:{"name":1},ShardingType:"hash"});
db.pri5_image.createCL("atnnotate2017",{ShardingKey:{"name":1},ShardingType:"hash"});
Attach the sub-collections
db.pri5_image.atnnotate.attachCL("pri5_image.atnnotate2016",{LowBound:{dated:"2016-01-01"},UpBound:{dated:"2017-01-01"}});
db.pri5_image.atnnotate.attachCL("pri5_image.atnnotate2017",{LowBound:{dated:"2017-01-01"},UpBound:{dated:"2018-01-01"}});
Check the catalog information of the main collection
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate"});
{
  …omitted
  "CataInfo": [
    { "ID": 1, "SubCLName": "pri5_image.atnnotate2016", "LowBound": {"dated": "2016-01-01"}, "UpBound": {"dated": "2017-01-01"} },
    { "ID": 2, "SubCLName": "pri5_image.atnnotate2017", "LowBound": {"dated": "2017-01-01"}, "UpBound": {"dated": "2018-01-01"} }
  ],
  "IsMainCL": true,
  "Name": "pri5_image.atnnotate",
  "ShardingKey": { "dated": 1 }
  …omitted
}
The main collection now contains two sub-collections and uses dated as its sharding key.
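To make the effect of the attach ranges concrete, here is a minimal sketch of how a dated value maps to a sub-collection. It assumes each range is left-closed and right-open (LowBound <= dated < UpBound); routeByDated is a hypothetical helper working on plain objects, not a SequoiaDB API.

```javascript
// Ranges copied from the CataInfo of the main collection shown above.
var cataInfo = [
  { SubCLName: "pri5_image.atnnotate2016", LowBound: "2016-01-01", UpBound: "2017-01-01" },
  { SubCLName: "pri5_image.atnnotate2017", LowBound: "2017-01-01", UpBound: "2018-01-01" }
];

// Hypothetical helper. Assumption: ranges are left-closed, right-open.
function routeByDated(ranges, dated) {
  for (var i = 0; i < ranges.length; i++) {
    // "YYYY-MM-DD" strings compare in chronological order.
    if (dated >= ranges[i].LowBound && dated < ranges[i].UpBound) {
      return ranges[i].SubCLName;
    }
  }
  return null; // no attached sub-collection covers this value
}

var hit2016 = routeByDated(cataInfo, "2016-12-31"); // "pri5_image.atnnotate2016"
var hit2017 = routeByDated(cataInfo, "2017-01-01"); // "pri5_image.atnnotate2017"
var miss = routeByDated(cataInfo, "2018-06-01");    // null
```

Under this assumption, a record dated in 2018 has no target until a sub-collection covering that year is attached.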
Check which data groups the sub-collections are distributed over
Sub-collections pri5_image.atnnotate2017 and pri5_image.atnnotate2016 have the same data group layout, so only one is shown; output other than the CataInfo section is also omitted.
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate2016"});
{
  …omitted
  "CataInfo": [
    { "GroupName": "datagroup1", "LowBound": {"": 0}, "UpBound": {"": 1365} },
    { "GroupName": "datagroup2", "LowBound": {"": 1365}, "UpBound": {"": 2730} },
    { "GroupName": "datagroup3", "LowBound": {"": 2730}, "UpBound": {"": 4096} }
  ],
  "MainCLName": "pri5_image.atnnotate",
  "Name": "pri5_image.atnnotate2016"
  …omitted
}
Import 10000 records with dated in 2016 and 10000 records with dated in 2017.
Check the record counts
> db.pri5_image.atnnotate.count();
20000
> db.pri5_image.atnnotate2016.count();
10000
> db.pri5_image.atnnotate2017.count();
10000
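The article does not show how the test data was produced. One possible way to generate it is sketched below. The field names follow the --fields list used with sdbexprt and sdbimprt in this article; the values and the makeRecords helper are made up for illustration.

```javascript
// Zero-pad a month or day number to two digits.
function pad2(n) {
  return n < 10 ? "0" + n : "" + n;
}

// Hypothetical helper: builds `count` records whose dated value falls in `year`.
function makeRecords(year, count) {
  var records = [];
  for (var i = 0; i < count; i++) {
    var month = (i % 12) + 1; // 1..12
    var day = (i % 28) + 1;   // 1..28, valid in every month
    records.push({
      id: i,
      name: "name_" + year + "_" + i,
      doubled: i * 0.5,
      longd: 10000000000 + i, // beyond the 32-bit integer range
      boold: i % 2 === 0,
      dated: year + "-" + pad2(month) + "-" + pad2(day)
    });
  }
  return records;
}

var records2016 = makeRecords(2016, 10000);
var records2017 = makeRecords(2017, 10000);
```

In the sdb shell, each record could then be written through the main collection, for example with db.pri5_image.atnnotate.insert(records2016[i]), so that it is routed to a sub-collection by its dated value.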
Prepare for expansion
Add the new data groups datagroup4, datagroup5 and datagroup6 to the original domain
db.getDomain("domainbefore").alter({Groups:['datagroup1', 'datagroup2', 'datagroup3', 'datagroup4', 'datagroup5', 'datagroup6']})
Create the new sub-collections
db.pri5_image.createCL("atnnotate2016_2",{ShardingKey:{"name":1},ShardingType:"hash"});
db.pri5_image.createCL("atnnotate2017_2",{ShardingKey:{"name":1},ShardingType:"hash"});
Check the catalog information of the new sub-collections
Sub-collections pri5_image.atnnotate2017_2 and pri5_image.atnnotate2016_2 have the same data group layout, so only one is shown.
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate2016_2"});
{
  …omitted
  "CataInfo": [
    { "GroupName": "datagroup1", "LowBound": {"": 0}, "UpBound": {"": 682} },
    { "GroupName": "datagroup2", "LowBound": {"": 682}, "UpBound": {"": 1364} },
    { "GroupName": "datagroup3", "LowBound": {"": 1364}, "UpBound": {"": 2046} },
    { "GroupName": "datagroup4", "LowBound": {"": 2046}, "UpBound": {"": 2728} },
    { "GroupName": "datagroup5", "LowBound": {"": 2728}, "UpBound": {"": 3410} },
    { "GroupName": "datagroup6", "LowBound": {"": 3410}, "UpBound": {"": 4096} }
  ],
  "AutoSplit": true
}
The new sub-collections are created across all six data groups, including the three newly added ones.
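The bounds in the catalog output above are consistent with dividing 4096 hash partitions by the number of groups using integer division, with the last group taking the remainder. The sketch below reproduces that arithmetic; the rule is inferred from the outputs in this article, and evenHashRanges is a hypothetical helper, not a SequoiaDB API.

```javascript
// Hypothetical helper. Assumption: equal shares by integer division,
// remainder assigned to the last group.
function evenHashRanges(groups, partitions) {
  var step = Math.floor(partitions / groups.length);
  var ranges = [];
  for (var i = 0; i < groups.length; i++) {
    var isLast = i === groups.length - 1;
    ranges.push({
      GroupName: groups[i],
      LowBound: i * step,
      UpBound: isLast ? partitions : (i + 1) * step
    });
  }
  return ranges;
}

var three = evenHashRanges(["datagroup1", "datagroup2", "datagroup3"], 4096);
// bounds: 0, 1365, 2730, 4096

var six = evenHashRanges(
  ["datagroup1", "datagroup2", "datagroup3", "datagroup4", "datagroup5", "datagroup6"],
  4096
);
// bounds: 0, 682, 1364, 2046, 2728, 3410, 4096
```

With three groups each share is 1365 partitions, with six groups it is 682, which is why each original group holds roughly half as much data for a sub-collection created after the expansion.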
Start the expansion
Create the named pipe pri5_image_atnnotate2016.json
mknod pri5_image_atnnotate2016.json p
Export the data of the original sub-collection pri5_image.atnnotate2016 through the pipe and import it into pri5_image.atnnotate2016_2
nohup sdbexprt -s localhost -p 11810 --type json --file ./pri5_image_atnnotate2016.json -c pri5_image -l atnnotate2016 --fields '_id,id,name,doubled,longd,boold,dated' >/dev/null 2>&1 &
nohup sdbimprt -s localhost -p 11810 --type json --file ./pri5_image_atnnotate2016.json -c pri5_image -l atnnotate2016_2 --fields '_id oid,id int,name string,doubled double,longd long,boold bool,dated string' >/dev/null 2>&1 &
Check the record count of pri5_image.atnnotate2016_2
> db.pri5_image.atnnotate2016_2.count();
10000
Sort both pri5_image.atnnotate2016 and pri5_image.atnnotate2016_2 by the _id field, export the first 1000 records of each, and diff the two files to confirm that the exported and imported data match
sdb 'db.pri5_image.atnnotate2016.find().sort({"_id":1}).limit(1000)' >> pri5_image_atnnotate2016.rec
sdb 'db.pri5_image.atnnotate2016_2.find().sort({"_id":1}).limit(1000)' >> pri5_image_atnnotate2016_2.rec
diff pri5_image_atnnotate2016.rec pri5_image_atnnotate2016_2.rec
Detach the original sub-collection pri5_image.atnnotate2016 and attach the new sub-collection pri5_image.atnnotate2016_2
Write a script, delta-t.js, that measures how long it takes to detach the original sub-collection and attach the new one
var db = new Sdb();
// Epoch milliseconds before the switch
var begin = new Date().getTime();
db.pri5_image.atnnotate.detachCL("pri5_image.atnnotate2016");
db.pri5_image.atnnotate.attachCL("pri5_image.atnnotate2016_2",{LowBound:{dated:"2016-01-01"},UpBound:{dated:"2017-01-01"}});
// Epoch milliseconds after the switch
var end = new Date().getTime();
println(end-begin);
Run the script
sdb -f delta-t.js
The switch took 41 milliseconds. Detaching the old sub-collection and attaching the new one is fast enough that applications barely notice it.
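When timing an operation like this in JavaScript, epoch milliseconds from Date.getTime() are safer than arithmetic on getSeconds() and getMilliseconds(), because the latter wraps around at every minute boundary. A minimal sketch with made-up timestamps that are 41 ms apart:

```javascript
// Elapsed-time basis that only looks at the seconds and milliseconds fields.
function secondsBasedMillis(d) {
  return d.getSeconds() * 1000 + d.getMilliseconds();
}

// Two instants 41 ms apart that straddle a minute boundary (illustrative values).
var begin = new Date(2017, 0, 1, 12, 0, 59, 980);
var end = new Date(2017, 0, 1, 12, 1, 0, 21);

// Wraps at the minute boundary: 21 - 59980
var naive = secondsBasedMillis(end) - secondsBasedMillis(begin); // -59959

// Difference of epoch milliseconds is correct regardless of boundaries.
var robust = end.getTime() - begin.getTime(); // 41
```

The seconds-based calculation only gives the right answer when both instants fall within the same minute.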
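Drop the original sub-collection pri5_image.atnnotate2016 to free space for migrating the remaining sub-collections
db.pri5_image.dropCL("atnnotate2016")
Expand the next sub-collection, pri5_image.atnnotate2017, by following the same steps; they are not repeated here.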
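Environment preparation
Same as the environment preparation for the first scheme. The expansion steps below differ from the first scheme.
Prepare for expansion
Add the new data groups datagroup4, datagroup5 and datagroup6 to the original domain
db.getDomain("domainbefore").alter({Groups:['datagroup1', 'datagroup2', 'datagroup3', 'datagroup4', 'datagroup5', 'datagroup6']})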
Create the new sub-collection
db.pri5_image.createCL("atnnotate2018",{ShardingKey:{"name":1},ShardingType:"hash",AutoSplit:false,Group:"datagroup4"});
Check the catalog information of the sub-collection (part of the output is omitted)
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate2018"});
{
  …omitted
  "Name": "pri5_image.atnnotate2018",
  "CataInfo": [
    { "GroupName": "datagroup4", "LowBound": {"": 0}, "UpBound": {"": 4096} }
  ],
  "AutoSplit": false
}
Start the expansion
Use split to move part of the data to data groups datagroup5 and datagroup6
db.pri5_image.atnnotate2018.split("datagroup4","datagroup6",{"id":2730},{"id":4096})
db.pri5_image.atnnotate2018.split("datagroup4","datagroup5",{"id":1365},{"id":2730})
Check the catalog information of the sub-collection after the split
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate2018"});
{
  …omitted
  "AutoSplit": false,
  "CataInfo": [
    { "GroupName": "datagroup4", "LowBound": {"": 0}, "UpBound": {"": 1365} },
    { "GroupName": "datagroup5", "LowBound": {"": 1365}, "UpBound": {"": 2730} },
    { "GroupName": "datagroup6", "LowBound": {"": 2730}, "UpBound": {"": 4096} }
  ],
  "Name": "pri5_image.atnnotate2018",
  …omitted
}
The new sub-collection is now split evenly across the newly added data groups.
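The split boundaries used above can be derived rather than typed by hand. The sketch below computes them for a source group that initially owns the whole hash range, carving ranges off the top end first so that the source keeps the lowest share. planSplits is a hypothetical helper that only produces the numbers; it does not call SequoiaDB.

```javascript
// Hypothetical helper. Assumption: the source group owns [0, partitions)
// and every group, including the source, should end up with an equal share.
function planSplits(source, targets, partitions) {
  var shares = targets.length + 1; // the source keeps one share
  var step = Math.floor(partitions / shares);
  var plan = [];
  // Walk the targets from last to first so ranges are cut from the top end.
  for (var i = targets.length - 1; i >= 0; i--) {
    var low = (i + 1) * step;
    var high = i === targets.length - 1 ? partitions : (i + 2) * step;
    plan.push({ source: source, target: targets[i], low: low, high: high });
  }
  return plan;
}

var plan = planSplits("datagroup4", ["datagroup5", "datagroup6"], 4096);
// plan[0]: datagroup4 -> datagroup6, range [2730, 4096)
// plan[1]: datagroup4 -> datagroup5, range [1365, 2730)
```

Each plan entry corresponds to one split call, with low and high used as the values of the start and end conditions.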
Attach it to the main collection
db.pri5_image.atnnotate.attachCL("pri5_image.atnnotate2018",{LowBound:{dated:"2018-01-01"},UpBound:{dated:"2019-01-01"}});
If more sub-collections need to be added, repeat the steps above.
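Environment preparation
Same as the environment preparation for the first scheme. The expansion steps below differ from the first scheme.
Prepare for expansion
Create a new domain
db.createDomain("domainafter",["datagroup4","datagroup5","datagroup6"], {AutoSplit:true});
Create the collection space
db.createCS("pri5_image_2",{Domain:"domainafter"});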
Create the sub-collection
db.pri5_image_2.createCL("atnnotate2018",{ShardingKey:{"name":1},ShardingType:"hash"});
Check the catalog information of the new sub-collection
Part of the output is omitted.
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image_2.atnnotate2018"});
{
  "CataInfo": [
    { "GroupName": "datagroup4", "LowBound": {"": 0}, "UpBound": {"": 1365} },
    { "GroupName": "datagroup5", "LowBound": {"": 1365}, "UpBound": {"": 2730} },
    { "GroupName": "datagroup6", "LowBound": {"": 2730}, "UpBound": {"": 4096} }
  ],
  "AutoSplit": true
}
Start the expansion
Attach it to the main collection
db.pri5_image.atnnotate.attachCL("pri5_image_2.atnnotate2018",{LowBound:{dated:"2018-01-01"},UpBound:{dated:"2019-01-01"}});
Check the catalog information of the main collection
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"pri5_image.atnnotate"});
{
  "CataInfo": [
    { "SubCLName": "pri5_image.atnnotate2017", "LowBound": {"dated": "2017-01-01"}, "UpBound": {"dated": "2018-01-01"} },
    { "SubCLName": "pri5_image.atnnotate2016", "LowBound": {"dated": "2016-01-01"}, "UpBound": {"dated": "2017-01-01"} },
    { "SubCLName": "pri5_image_2.atnnotate2018", "LowBound": {"dated": "2018-01-01"}, "UpBound": {"dated": "2019-01-01"} }
  ],
  "EnsureShardingIndex": true,
  "IsMainCL": true,
  "Name": "pri5_image.atnnotate",
  "ShardingKey": {"dated": 1},
  "ShardingType": "range"
}
The new sub-collection, which is split evenly across the newly added data groups, is now attached to the main collection.
If more sub-collections need to be added, repeat the steps above.
Environment preparation
Create the collection space and the collection
> db.createCS("foo").createCL("bar",{ShardingKey:{"id":1},ShardingType:"hash"});
Check the catalog information
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"foo.bar"});
{
  "CataInfo": [
    { "GroupName": "datagroup3", "LowBound": {"": 0}, "UpBound": {"": 4096} }
  ]
}
Start the expansion
Run the split
> db.foo.bar.split("datagroup3","datagroup1",{"id":0},{"id":2048});
Check the catalog information
> db.snapshot(SDB_SNAP_CATALOG,{"Name":"foo.bar"});
{
  "CataInfo": [
    { "GroupName": "datagroup1", "LowBound": {"": 0}, "UpBound": {"": 2048} },
    { "GroupName": "datagroup3", "LowBound": {"": 2048}, "UpBound": {"": 4096} }
  ]
}
The hash partitions in the range 0 to 2048 have been moved to datagroup1.
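The catalog change caused by a range split can be modelled on plain objects, which is handy for checking a planned split before running it. The sketch below mimics the before and after CataInfo of foo.bar. applySplit is a hypothetical helper, and it assumes the requested range lies entirely inside one range owned by the source group.

```javascript
// Hypothetical helper: returns a new CataInfo-like array after moving
// [low, high) from the source group to the target group.
function applySplit(cataInfo, source, target, low, high) {
  var result = [];
  for (var i = 0; i < cataInfo.length; i++) {
    var r = cataInfo[i];
    var covers = r.GroupName === source && r.LowBound <= low && high <= r.UpBound;
    if (!covers) {
      result.push(r); // untouched range
      continue;
    }
    if (r.LowBound < low) {
      result.push({ GroupName: source, LowBound: r.LowBound, UpBound: low });
    }
    result.push({ GroupName: target, LowBound: low, UpBound: high });
    if (high < r.UpBound) {
      result.push({ GroupName: source, LowBound: high, UpBound: r.UpBound });
    }
  }
  return result;
}

var before = [{ GroupName: "datagroup3", LowBound: 0, UpBound: 4096 }];
var after = applySplit(before, "datagroup3", "datagroup1", 0, 2048);
// after: datagroup1 owns [0, 2048), datagroup3 keeps [2048, 4096)
```

Chaining several applySplit calls gives a preview of the final layout when a collection is split across more than two groups.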