
Chapter 8. Elasticsearch

Table of Contents

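8.1. Installing Elasticsearch
8.1.1. Single-node mode (for development environments)
8.1.2. Elasticsearch Cluster
8.1.3. Load balancing configuration
8.1.4. Installing a specific version of Elasticsearch
8.1.5. Plugin
8.1.5.1. elasticsearch-analysis-ik
8.1.5.2. elasticsearch-analysis-pinyin
8.2. Administration
8.2.1. Listing indices
8.2.2. Node health status
8.2.3. Node HTTP status
8.2.4. Viewing the master node
8.2.5. Viewing index distribution across nodes
8.2.6. Opening and closing indices
8.2.6.1. _open
8.2.6.2. _close
8.3. Document API
8.3.1. Quick start
8.3.2. Writing: PUT/POST
8.3.3. Retrieving: GET
8.3.3.1. _source
8.3.4. Checking whether a document exists
8.3.5. Delete
8.3.6. Parameters
8.3.6.1. pretty: formatted JSON
8.4. Search
8.4.1. URL search
8.4.2. Pagination
8.5. Query DSL
8.5.1. match
8.5.2. multi_match: multi-field matching
8.5.3. Query bool: boolean conditions
8.5.3.1. must
8.5.3.2. should
8.5.3.3. must_not
8.5.4. filter
8.5.5. sort
8.5.6. _source
8.5.7. highlight
8.6. Managing Chinese word-segmentation plugins
8.6.1. Installing the analysis plugin with the elasticsearch-plugin command
8.6.2. Installing a plugin manually
8.6.3. Creating an index
8.6.4. Deleting an index
8.6.5. Configuring the analysis plugin for an index
8.6.5.1. Testing the analyzer output
8.7. Mapping
8.7.1. Viewing _mapping
8.7.2. Deleting _mapping
8.7.3. Creating _mapping
8.7.4. Updating mapping
8.7.5. Modifying _mapping
8.7.6. Data types
8.7.6.1. date
8.8. Alias management
8.8.1. Viewing index aliases
8.8.2. Creating an index alias
8.8.3. Modifying an alias
8.8.4. Deleting an alias
8.9. Example
8.9.1. News application example
8.9.2. Article search example
8.9.2.1.
8.10. Migrating MySQL Data into Elasticsearch using logstash
8.10.1. Installing logstash
8.10.2. Configuring logstash
8.10.3. Starting Logstash
8.10.4. Verification
8.10.5. Configuration templates
8.10.5.1. Full import
8.10.5.2. Multi-table import
8.10.5.3. Incremental replication by ID primary-key field
8.10.5.4. Incremental replication by date field
8.10.5.5. Specifying an SQL file
8.10.5.6. Passing parameters
8.10.5.7. Controlling the amount of data returned over JDBC
8.10.5.8. Output to different Elasticsearch instances
8.10.5.9. Date format conversion
8.10.5.10. example
8.10.6. Resolving data mismatch problems
8.10.7. Modifying the Mapping
8.11. Installing Elasticsearch 2.3
8.11.1. RPM installation
8.11.2. YUM installation
8.11.3. Verifying the installation
8.11.4. Plugin management
8.11.4.1. Installing a plugin manually
8.11.4.2. The plugin command
8.11.4.3. Testing the plugin
8.12. FAQ
8.12.1. Plugin [analysis-ik] is incompatible with Elasticsearch [2.3.5]. Was designed for version [2.3.4]
8.12.2. plugin [analysis-ik] is incompatible with version [5.6.1]; was designed for version [5.5.2]
8.12.3. mapper_parsing_exception: failed to parse [ctime]
8.12.4. Configuring JAVA_HOME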

http://www.elasticsearch.org/

8.1. Installing Elasticsearch

8.1.1. Single-node mode (for development environments)

Install Elasticsearch 5.6.0 in one step with the Netkiller OSCM scripts.

# Java
curl -s https://raw.githubusercontent.com/oscm/shell/master/lang/java/openjdk/java-1.8.0-openjdk.sh | bash

# Install
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/elasticsearch-5.x.sh | bash

# Bind 0.0.0.0
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/network.bind_host.sh | bash

# Auto create index
curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/action.auto_create_index.sh | bash

# elasticsearch-analysis-ik

curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/5.5/elasticsearch-analysis-ik-5.6.0.sh | bash
			

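Releases of elasticsearch-analysis-ik usually lag one version behind elasticsearch, so use the command below to check whether the versions match. If they do not, you can edit the plugin-descriptor.properties file to make them match.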

root@netkiller /usr/share/elasticsearch/plugins/ik % grep ^version plugin-descriptor.properties
version=5.5.1
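
The check above can be scripted. The following is a minimal sketch, not part of the OSCM scripts: plugin_matches_es is a hypothetical helper name, and it assumes the descriptor declares the Elasticsearch release it was built for in an elasticsearch.version property, next to the plugin's own version property shown above.

```shell
# Hypothetical helper: succeeds when the elasticsearch.version property in the
# given plugin-descriptor.properties file equals the version passed as $2.
plugin_matches_es() {
    descriptor="$1"
    es_version="$2"
    # Take everything after the first "=" and strip stray whitespace.
    declared=$(grep '^elasticsearch\.version=' "$descriptor" | cut -d= -f2- | tr -d '[:space:]')
    [ "$declared" = "$es_version" ]
}
```

A possible call, using the plugin directory from the prompt above: plugin_matches_es /usr/share/elasticsearch/plugins/ik/plugin-descriptor.properties 5.6.0 || echo "version mismatch".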
			

After startup, use the jps command to check that the process is running normally.

root@netkiller /var/log/elasticsearch % jps | grep Elasticsearch
9706 Elasticsearch

root@netkiller /var/log/elasticsearch % ss -lnt | grep 9200
LISTEN     0      128    127.0.0.1:9200                     *:*
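
In the ss output above, port 9200 is bound to 127.0.0.1, which means the HTTP API is reachable only from the local host. As a small sketch (listen_addresses is a hypothetical helper name, not an existing tool), the bound address can be pulled out of the ss output like this:

```shell
# Hypothetical helper: read `ss -lnt` output on stdin and print the local
# address (fourth column) of every listening socket on the given port.
listen_addresses() {
    awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$") { print $4 }'
}
```

Typical use would be ss -lnt | listen_addresses 9200; an address of 0.0.0.0:9200 would indicate that the bind setting has taken effect.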
			

8.1.2. Elasticsearch Cluster

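Cluster mode requires at least two nodes, usually one master node and several data nodes.

First install elasticsearch on every node, then edit the configuration file on each node. With 5.5.1 you do not need to configure which nodes are master nodes and which are data nodes.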

curl -s https://raw.githubusercontent.com/oscm/shell/master/search/elasticsearch/elasticsearch-5.x.sh | bash			
			

Configuration file

cluster.name: elasticsearch-cluster # Cluster name; must be identical on every server

node.name: node-1 # Unique per node; this is the only value that changes between nodes: node-1, node-2, node-3 ...

network.host: 0.0.0.0

discovery.zen.ping.unicast.hosts: ["172.16.0.20", "172.16.0.21","172.16.0.22"]  # IP addresses of all nodes

discovery.zen.minimum_master_nodes: 2 # Quorum of master-eligible nodes, (N / 2) + 1; with 3 nodes this is 2

http.cors.enabled: true
http.cors.allow-origin: "*"
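
Since only node.name differs between nodes, the per-node settings can be generated rather than edited by hand. The sketch below is an illustration under assumptions: quorum and render_node_config are hypothetical helper names, the cluster name is copied from the listing above, and discovery.zen.minimum_master_nodes is set to the usual quorum of master-eligible nodes, (N / 2) + 1, assuming every listed node is master-eligible.

```shell
# Quorum of master-eligible nodes: floor(n / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Hypothetical helper: print the settings for node number $1, given the
# IP addresses of all cluster nodes as the remaining arguments.
render_node_config() {
    index="$1"; shift
    hosts=""
    for ip in "$@"; do
        # Build a comma-separated list of quoted addresses.
        hosts="${hosts:+$hosts, }\"$ip\""
    done
    cat <<EOF
cluster.name: elasticsearch-cluster
node.name: node-$index
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: [$hosts]
discovery.zen.minimum_master_nodes: $(quorum $#)
EOF
}
```

For example, render_node_config 2 172.16.0.20 172.16.0.21 172.16.0.22 prints the settings for node-2 of a three-node cluster; the output would still have to be merged into that node's elasticsearch.yml by hand.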
			

Use curl to check the node status: curl 'http://localhost:9200/_nodes/process?pretty'

root@netkiller /var/log/elasticsearch % curl 'http://localhost:9200/_nodes/process?pretty'
{
  "_nodes" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "cluster_name" : "my-application",
  "nodes" : {
    "-lnKCmBXRpiwExLns0jc9g" : {
      "name" : "node-1",
      "transport_address" : "10.104.3.2:9300",
      "host" : "10.104.3.2",
      "ip" : "10.104.3.2",
      "version" : "5.5.1",
      "build_hash" : "19c13d0",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 23669,
        "mlockall" : false
      }
    },
    "WVsgYi2HT8GWnZU1kUwFwA" : {
      "name" : "node-2",
      "transport_address" : "10.186.7.221:9300",
      "host" : "10.186.7.221",
      "ip" : "10.186.7.221",
      "version" : "5.5.1",
      "build_hash" : "19c13d0",
      "roles" : [
        "master",
        "data",
        "ingest"
      ],
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 12641,
        "mlockall" : false
      }
    }
  }
}
			

After a node starts, it creates a log file named after cluster.name.

Whichever node starts first becomes the master.

[2017-08-11T17:42:46,018][INFO ][o.e.c.s.ClusterService   ] [node-1] new_master {node-1}{-lnKCmBXRpiwExLns0jc9g}{rZcJDIynSzq2Td3yP2kN5A}{10.104.3.2}{10.104.3.2:9300}, added {{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{X13ShUpAQa2zA1Mgcsm3bQ}{10.186.7.221}{10.186.7.221:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{X13ShUpAQa2zA1Mgcsm3bQ}{10.186.7.221}{10.186.7.221:9300}]			
			

If the master fails, another node takes over.

[2017-08-11T17:44:52,797][INFO ][o.e.c.s.ClusterService   ] [node-2] master {new {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300}}, removed {{node-1}{-lnKCmBXRpiwExLns0jc9g}{rZcJDIynSzq2Td3yP2kN5A}{10.104.3.2}{10.104.3.2:9300},}, added {{node-1}{-lnKCmBXRpiwExLns0jc9g}{odnoG9kpQpeX1ltx5KYTSw}{10.104.3.2}{10.104.3.2:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node-1}{-lnKCmBXRpiwExLns0jc9g}{odnoG9kpQpeX1ltx5KYTSw}{10.104.3.2}{10.104.3.2:9300}]
[2017-08-11T17:44:53,184][INFO ][o.e.c.r.DelayedAllocationService] [node-2] scheduling reroute for delayed shards in [59.5s] (11 delayed shards)
[2017-08-11T17:44:53,929][INFO ][o.e.c.r.a.AllocationService] [node-2] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[information][0]] ...]).		
			

When the former master node comes back online, it logs the following:

[2017-08-11T17:44:52,855][INFO ][o.e.c.s.ClusterService   ] [node-1] detected_master {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300}, added {{node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300},}, reason: zen-disco-receive(from master [master {node-2}{WVsgYi2HT8GWnZU1kUwFwA}{vl8kQx8sQdGVVohrNQnZOQ}{10.186.7.221}{10.186.7.221:9300} committed version [44]])
			

8.1.3. Load balancing configuration

First install nginx; here the Netkiller OSCM one-step install script is used.

# curl -s https://raw.githubusercontent.com/oscm/shell/master/web/nginx/stable/nginx.sh | bash
			

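Because elasticsearch has no built-in user authentication mechanism, it is usually accessed only from the internal network. If it has to be exposed externally, add user authentication.

			
# printf "neo:$(openssl passwd -crypt s3cr3t)\n" > /etc/nginx/passwords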
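
Newer OpenSSL releases may no longer accept the -crypt option. As an alternative sketch, the password line can be generated with the apr1 (Apache MD5) scheme, which nginx basic authentication also understands. htpasswd_line is a hypothetical helper name, and the example assumes the openssl command is installed.

```shell
# Hypothetical helper: print one "user:hash" line for an nginx password file,
# hashing the password with the apr1 scheme via `openssl passwd -apr1`.
htpasswd_line() {
    printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}
```

It would be used as htpasswd_line neo s3cr3t > /etc/nginx/passwords. Passing the password on the command line leaves it in the shell history, so this form suits test setups better than production.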
			
			

Create the nginx configuration file /etc/nginx/conf.d/elasticsearch.conf

upstream elasticsearch {
	server 172.16.0.10:9200;
	server 172.16.0.20:9200;
	server 172.16.0.30:9200;

	keepalive 15;
}

server {
	listen 9200;
	server_name so.netkiller.cn;
	
	charset utf-8;
    access_log /var/log/nginx/so.netkiller.cn.access.log;
    error_log /var/log/nginx/so.netkiller.cn.error.log;
	
	auth_basic "Protected Elasticsearch";
	auth_basic_user_file passwords;

	location ~* ^(/_cluster|/_nodes) {
		return 403;
		break;
	}
    location ~* _(open|close) {
            return 403;
            break;
    }
	location / {
    
		if ($request_filename ~ _shutdown) {
		    return 403;
		    break;
		}

        if ($request_method !~ ^(GET|HEAD|POST)$) {
			return 403;
		}

		proxy_pass http://elasticsearch;
		proxy_http_version 1.1;
		proxy_set_header Connection "Keep-Alive";
		proxy_set_header Proxy-Connection "Keep-Alive";
	}

}
			

Send the request below repeatedly; total_opened eventually reaches the keepalive value configured in nginx (15 in this example).

$ curl 'http://test:test@localhost:9200/_nodes/stats/http?pretty' | grep total_opened
# "total_opened" : 15			
			

The example above suits most scenarios.

Example 8.1. Elasticsearch master / slave

				
upstream elasticsearch {
	server 172.16.0.10:9200;
	server 172.16.0.20:9200 backup;

	keepalive 15;
}

server {
	listen 9200;
	server_name so.netkiller.cn;
	
	auth_basic "Protected Elasticsearch";
	auth_basic_user_file passwords;

	location ~* ^(/_cluster|/_nodes) {
		return 403;
		break;
	} 

	location / {
    
		if ($request_filename ~ _shutdown) {
		    return 403;
		    break;
		}
		# Reject every method other than GET, HEAD and POST (this also covers DELETE)
		if ($request_method !~ ^(GET|HEAD|POST)$) {
			return 403;
		}

		proxy_pass http://elasticsearch;
		proxy_http_version 1.1;
		proxy_set_header Connection "Keep-Alive";
		proxy_set_header Proxy-Connection "Keep-Alive";
	}

}
				
				

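limit_except controls access by HTTP method: the rules inside the block apply to every method except the ones listed, and allowing GET also allows HEAD. For example, to permit write and delete operations such as PUT and DELETE only from 192.168.1.1 while leaving GET, HEAD and POST open, place the following inside a location block:

			
limit_except GET POST {
	allow 192.168.1.1;
	deny all;
}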
			
			

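8.1.4. Installing a specific version of Elasticsearch

yum installs the latest version by default. A common problem is that elasticsearch-analysis-ik releases lag behind Elasticsearch, so the plugin may not support the Elasticsearch version that yum installs. For some versions of elasticsearch-analysis-ik you can change the version number in the plugin's configuration file to match elasticsearch, which tricks elasticsearch into skipping the version-mismatch exception.

The better solution is to look up a compatible version on the elasticsearch-analysis-ik GitHub page and install the elasticsearch version that the chosen elasticsearch-analysis-ik release requires.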

Versions

IK version	ES version
master	5.x -> master
5.6.0	5.6.0
5.5.3	5.5.3
5.4.3	5.4.3
5.3.3	5.3.3
5.2.2	5.2.2
5.1.2	5.1.2
1.10.1	2.4.1
1.9.5	2.3.5
1.8.1	2.2.1
1.7.0	2.1.1
1.5.0	2.0.0
1.2.6	1.0.0
1.2.5	0.90.x
1.1.3	0.20.x
1.0.0	0.16.2 -> 0.19.0			
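
The table can be turned into a small lookup. This is only a sketch: ik_version_for is a hypothetical helper name, and it covers just the rows of the table above that name one exact Elasticsearch version, leaving out the ranges and the master row.

```shell
# Hypothetical helper: print the IK plugin version listed in the table above
# for an exact Elasticsearch version; return non-zero when there is no row.
ik_version_for() {
    case "$1" in
        5.6.0|5.5.3|5.4.3|5.3.3|5.2.2|5.1.2) echo "$1" ;;  # 5.x rows: same number
        2.4.1) echo 1.10.1 ;;
        2.3.5) echo 1.9.5 ;;
        2.2.1) echo 1.8.1 ;;
        2.1.1) echo 1.7.0 ;;
        2.0.0) echo 1.5.0 ;;
        1.0.0) echo 1.2.6 ;;
        *) return 1 ;;
    esac
}
```

A caller could write ik_version_for 5.6.1 || echo "no matching IK release in the table" before deciding which elasticsearch package to install.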
			

The latest release is elasticsearch 5.6.1, but the elasticsearch-analysis-ik plugin only supports elasticsearch up to 5.6.0.

root@netkiller /var/log % yum --showduplicates list elasticsearch | expand | tail
Repository epel is listed more than once in the configuration  
elasticsearch.noarch                 5.5.3-1                  elasticsearch-5.x     
elasticsearch.noarch                 5.6.0-1                  elasticsearch-5.x   
elasticsearch.noarch                 5.6.1-1                  elasticsearch-5.x 
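
The listing above already contains everything needed to build the argument for yum install. As a sketch (yum_spec_for is a hypothetical helper name, and it assumes the package column reads elasticsearch.noarch as shown), the package spec for a wanted version can be extracted like this:

```shell
# Hypothetical helper: read `yum --showduplicates list elasticsearch` output
# on stdin and print the package spec for the requested version, if listed.
yum_spec_for() {
    awk -v want="$1" '
        $1 == "elasticsearch.noarch" {
            split($2, parts, "-")   # "5.6.0-1" splits into version and release
            if (parts[1] == want) { print "elasticsearch-" $2; found = 1 }
        }
        END { exit found ? 0 : 1 }
    '
}
```

For instance, yum --showduplicates list elasticsearch | yum_spec_for 5.6.0 yields a spec that can be passed to yum install; the function returns a non-zero status when the version is not offered by the repository.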
			

Install 5.6.0

# yum install elasticsearch-5.6.0-1

Loaded plugins: fastestmirror, langpacks
Repository epel is listed more than once in the configuration
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package elasticsearch.noarch 0:5.6.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==========================================================================================================================================================================================================
 Package                                            Arch                                        Version                                      Repository                                              Size
==========================================================================================================================================================================================================
Installing:
 elasticsearch                                      noarch                                      5.6.0-1                                      elasticsearch-5.x                                       32 M

Transaction Summary
==========================================================================================================================================================================================================
Install  1 Package

Total download size: 32 M
Installed size: 36 M
Is this ok [y/d/N]: y
			

8.1.5. Plugin

Elasticsearch provides the plugin management command elasticsearch-plugin.

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin -h
A tool for managing installed elasticsearch plugins

Commands
--------
list - Lists installed elasticsearch plugins
install - Install a plugin
remove - removes a plugin from Elasticsearch

Non-option arguments:
command              

Option         Description        
------         -----------        
-h, --help     show help          
-s, --silent   show minimal output
-v, --verbose  show verbose output			
			

8.1.5.1. elasticsearch-analysis-ik

Install the plugin

root@netkiller ~ % /usr/share/elasticsearch/bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.5.1/elasticsearch-analysis-ik-5.5.1.zip
[=================================================] 100%   
-> Installed analysis-ik
				
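Then, assuming an index named index already exists, map a field to the ik_max_word analyzer:

curl -XPOST http://localhost:9200/index/fulltext/_mapping -H 'Content-Type: application/json' -d'
{
    "properties": {
        "content": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
        }
    }
}'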
				

8.1.5.2. elasticsearch-analysis-pinyin

https://github.com/medcl/elasticsearch-analysis-pinyin