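Elasticsearch is an open-source, distributed, RESTful search and analytics engine that stores, queries, and analyzes large volumes of data in near real time. It can also back up snapshots to HDFS or S3. Because Alibaba Cloud OSS is compatible with the S3 API, this article shows how to use the Elasticsearch repository-s3 plugin to back up snapshots to OSS.
Deployment and Configuration
First, install the repository-s3 plugin on every node by following the official documentation: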
https://www.elastic.co/guide/en/elasticsearch/plugins/7.2/repository-s3.html
Start ES. The log shows that ES has loaded the plugin:
[2019-07-15T14:12:09,225][INFO ][o.e.p.PluginsService ] [master] loaded module [aggs-matrix-stats]
[2019-07-15T14:12:09,225][INFO ][o.e.p.PluginsService ] [master] loaded module [analysis-common]
[2019-07-15T14:12:09,225][INFO ][o.e.p.PluginsService ] [master] loaded module [ingest-common]
[2019-07-15T14:12:09,226][INFO ][o.e.p.PluginsService ] [master] loaded module [ingest-geoip]
[2019-07-15T14:12:09,226][INFO ][o.e.p.PluginsService ] [master] loaded module [ingest-user-agent]
[2019-07-15T14:12:09,226][INFO ][o.e.p.PluginsService ] [master] loaded module [lang-expression]
[2019-07-15T14:12:09,226][INFO ][o.e.p.PluginsService ] [master] loaded module [lang-mustache]
[2019-07-15T14:12:09,227][INFO ][o.e.p.PluginsService ] [master] loaded module [lang-painless]
[2019-07-15T14:12:09,227][INFO ][o.e.p.PluginsService ] [master] loaded module [mapper-extras]
[2019-07-15T14:12:09,227][INFO ][o.e.p.PluginsService ] [master] loaded module [parent-join]
[2019-07-15T14:12:09,227][INFO ][o.e.p.PluginsService ] [master] loaded module [percolator]
[2019-07-15T14:12:09,227][INFO ][o.e.p.PluginsService ] [master] loaded module [rank-eval]
[2019-07-15T14:12:09,228][INFO ][o.e.p.PluginsService ] [master] loaded module [reindex]
[2019-07-15T14:12:09,228][INFO ][o.e.p.PluginsService ] [master] loaded module [repository-url]
[2019-07-15T14:12:09,228][INFO ][o.e.p.PluginsService ] [master] loaded module [transport-netty4]
[2019-07-15T14:12:09,228][INFO ][o.e.p.PluginsService ] [master] loaded plugin [repository-s3]
[2019-07-15T14:12:12,375][INFO ][o.e.d.DiscoveryModule ] [master] using discovery type [zen] and seed hosts providers [settings]
[2019-07-15T14:12:12,801][INFO ][o.e.n.Node ] [master] initialized
[2019-07-15T14:12:12,802][INFO ][o.e.n.Node ] [master] starting ...
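Rather than scanning the startup log by eye, the check can be scripted. The following is a minimal sketch, assuming the log is available as a plain file; the function name `plugin_loaded` and the example log path are hypothetical and not part of the original walkthrough.

```shell
# Hypothetical helper: succeed if the startup log reports the given plugin.
# Usage: plugin_loaded <log-file> <plugin-name>
plugin_loaded() {
  # -F matches the bracketed text literally; -q suppresses output.
  grep -qF "loaded plugin [$2]" "$1"
}

# Example (the log path is an assumption; adjust it to your installation):
# plugin_loaded /var/log/elasticsearch/elasticsearch.log repository-s3 && echo "plugin loaded"
```

Built-in modules are logged as "loaded module", so this check only matches separately installed plugins such as repository-s3.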
Next, add the Access Key and Secret Key used for OSS to the ES keystore by running the following two commands:
bin/elasticsearch-keystore add s3.client.default.access_key
bin/elasticsearch-keystore add s3.client.default.secret_key
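Both commands prompt for the value interactively. For scripted setups, the same entries can be fed from standard input. This is a sketch under stated assumptions: the keystore already exists, and `ES_HOME`, `OSS_ACCESS_KEY`, and `OSS_SECRET_KEY` are environment variables chosen here for illustration (they do not appear in the original article).

```shell
# Hypothetical helper: add both OSS keys to the Elasticsearch keystore
# non-interactively. ES_HOME, OSS_ACCESS_KEY and OSS_SECRET_KEY are assumed
# variables and are not part of the original walkthrough.
add_oss_credentials() {
  if [ -z "${ES_HOME:-}" ] || [ -z "${OSS_ACCESS_KEY:-}" ] || [ -z "${OSS_SECRET_KEY:-}" ]; then
    echo "ES_HOME, OSS_ACCESS_KEY and OSS_SECRET_KEY must be set" >&2
    return 1
  fi
  # --stdin reads the value from standard input instead of prompting;
  # --force overwrites an existing entry without asking.
  printf '%s\n' "$OSS_ACCESS_KEY" |
    "$ES_HOME/bin/elasticsearch-keystore" add --stdin --force s3.client.default.access_key || return 1
  printf '%s\n' "$OSS_SECRET_KEY" |
    "$ES_HOME/bin/elasticsearch-keystore" add --stdin --force s3.client.default.secret_key
}
```

The keystore is local to each node, so a helper like this would need to run on every node in the cluster.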
Now register a snapshot repository named test:
[root@master ~]# curl -XPUT 'http://localhost:9200/_snapshot/test' -H 'Content-Type: application/json' -d '{ "type": "s3", "settings": { "bucket": "hadoop-oss-test", "endpoint": "oss-cn-zhangjiakou-internal.aliyuncs.com"} }'
{"acknowledged":true}
NOTE: The command above transfers data over HTTPS by default. To use HTTP instead, add `"protocol": "http", "disable_chunked_encoding": true` to `settings` (this feature will be available once the new version is released).
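To make the difference between the two variants concrete, here is a small sketch that prints the repository registration body with or without the HTTP settings from the note. The helper name `repo_body` is hypothetical; the bucket and endpoint in the example are the ones used in this article.

```shell
# Hypothetical helper: print the JSON body for registering an S3-type
# repository that points at OSS. Pass "http" as the third argument to add the
# two extra settings from the note; anything else keeps the HTTPS default.
repo_body() {
  bucket=$1
  endpoint=$2
  protocol=${3:-https}
  if [ "$protocol" = "http" ]; then
    printf '{"type":"s3","settings":{"bucket":"%s","endpoint":"%s","protocol":"http","disable_chunked_encoding":true}}\n' "$bucket" "$endpoint"
  else
    printf '{"type":"s3","settings":{"bucket":"%s","endpoint":"%s"}}\n' "$bucket" "$endpoint"
  fi
}

# Example against a local node (needs a running cluster, so left commented):
# repo_body hadoop-oss-test oss-cn-zhangjiakou-internal.aliyuncs.com http |
#   curl -XPUT 'http://localhost:9200/_snapshot/test' \
#        -H 'Content-Type: application/json' --data-binary @-
```

Whether `disable_chunked_encoding` is accepted depends on the installed plugin version, as the note says, so the HTTP variant should be treated as untested against older releases.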
Use the following command to confirm that the repository was created:
[root@master ~]# curl -XGET localhost:9200/_snapshot/test?pretty
{
  "test" : {
    "type" : "s3",
    "settings" : {
      "bucket" : "hadoop-oss-test",
      "endpoint" : "oss-cn-zhangjiakou-internal.aliyuncs.com"
    }
  }
}
Write some test data into ES, then look at the cluster's current indices:
[root@master ~]# curl -X GET "localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open sales 89ouBy6RQsuT34QRbn_jeQ 10 0 271786 0 15mb 15mb
green open customer fQCMEvXsQOu0UgMm1SAJlA 5 0 10000 0 717kb 717kb
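To record a baseline document count before taking a snapshot, the docs.count column can be pulled out of this listing. A minimal sketch, assuming the column order shown in the header above; the helper name `doc_count` is hypothetical.

```shell
# Hypothetical helper: read `_cat/indices?v` output on stdin and print the
# docs.count column for one index. Field positions follow the header shown
# in the listing (index is field 3, docs.count is field 7).
doc_count() {
  awk -v idx="$1" '$3 == idx { print $7 }'
}

# Example against a local node (needs a running cluster, so left commented):
# curl -s 'localhost:9200/_cat/indices?v' | doc_count sales
```

Running the same helper again after a restore gives a quick way to compare the count against the baseline.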
Suppose we back up only the sales index:
[root@master ~]# curl -XPUT 'http://localhost:9200/_snapshot/test/sales' -H 'Content-Type: application/json' -d '{ "indices": "sales" }'
{"accepted":true}
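The `{"accepted":true}` response only means the request was accepted; the snapshot itself runs in the background. One way to check on it is to fetch the snapshot and read its state. The sketch below extracts that field with plain text matching; the helper name `snapshot_state` is hypothetical, and a JSON tool such as jq would be more robust.

```shell
# Hypothetical helper: read a "get snapshot" JSON response on stdin and print
# the first snapshot state it finds (for example SUCCESS or IN_PROGRESS).
# This is deliberately crude text matching, not a real JSON parser.
snapshot_state() {
  # Split on commas so each key lands on its own line, then pull out the
  # value of the "state" key; "include_global_state" does not match because
  # the pattern requires a quote directly before the word state.
  tr ',' '\n' | sed -n 's/.*"state" *: *"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Example against a local node (needs a running cluster, so left commented):
# curl -s 'http://localhost:9200/_snapshot/test/sales' | snapshot_state
```

Alternatively, appending `?wait_for_completion=true` to the snapshot creation request makes the call block until the snapshot finishes.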
The backup result is then visible in the OSS console:
![_2019_07_15_2_23_28 _2019_07_15_2_23_28](https://yqfile.alicdn.com/6fe521a30883041eb96e59d82523d0b2dbef2c23.png)
Now write some more data into the sales index and list the indices again:
[root@master ~]# curl -X GET "localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open sales 89ouBy6RQsuT34QRbn_jeQ 10 0 281502 0 15.6mb 15.6mb
green open customer fQCMEvXsQOu0UgMm1SAJlA 5 0 10000 0 717kb 717kb
Restore the sales index from the snapshot we just backed up to OSS by running the following commands in order:
[root@master ~]# curl -XPOST localhost:9200/sales/_close
{"acknowledged":true,"shards_acknowledged":true,"indices":{"sales":{"closed":true}}}
[root@master ~]# curl -XPOST 'http://localhost:9200/_snapshot/test/sales/_restore?pretty'
{
  "accepted" : true
}
[root@master ~]# curl -X GET "localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open sales 89ouBy6RQsuT34QRbn_jeQ 10 0 271786 0 15mb 15mb
green open customer fQCMEvXsQOu0UgMm1SAJlA 5 0 10000 0 717kb 717kb
As we can see, the sales index is back to its earlier state (271786 documents, 15mb).
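The restore above overwrites the existing index, which is why it had to be closed first. If the goal is only to inspect the snapshot, the restore API also accepts `rename_pattern` and `rename_replacement`, so the data can be restored under a different name. The sketch below builds such a request body; the helper name `restore_body` and the `restored_` prefix are hypothetical.

```shell
# Hypothetical helper: print a restore request body that restores one index
# under a prefixed name, so the existing index does not have to be closed.
# Usage: restore_body <index> <prefix>
restore_body() {
  # The $1 inside the single-quoted format string is literal text: it is the
  # capture-group reference that Elasticsearch substitutes, not a shell variable.
  printf '{"indices":"%s","rename_pattern":"(.+)","rename_replacement":"%s$1"}\n' "$1" "$2"
}

# Example against a local node (needs a running cluster, so left commented):
# restore_body sales restored_ |
#   curl -XPOST 'http://localhost:9200/_snapshot/test/sales/_restore' \
#        -H 'Content-Type: application/json' --data-binary @-
```

With these arguments the snapshot's sales index would come back as restored_sales, leaving the live sales index untouched.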
https://www.elastic.co/guide/en/elasticsearch/plugins/7.2/repository-s3.html
https://www.elastic.co/cn/products/elasticsearch