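Elasticsearch Study Notes (6): Quick-Start Hands-On Case, E-commerce Product Management: Multiple Search Methods
A brief introduction to the search methods that Elasticsearch (ES) offers.
1. Query string search
Format: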
GET /{index}/_search
GET /product/_search

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [ "fangzhu" ]
        }
      },
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "zhonghua yagao",
          "desc" : "caoben zhiwu",
          "price" : 40,
          "producer" : "zhonghua producer",
          "tags" : [ "qingxin" ]
        }
      }
    ]
  }
}
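A brief explanation of each field in the result:
took: the time the search took, in milliseconds
timed_out: whether the search timed out
_shards: total is the number of shards the request was sent to (each shard is served by either its primary or one of its replicas); successful is the number of shards on which the query succeeded; skipped is the number of shards that were skipped; failed is the number of shards on which the query failed
hits.total: value is the total number of matching documents; relation says whether that count is exact ("eq" means value is the exact count, while "gte" means value is only a lower bound)
hits.max_score: the highest relevance score among the matching documents
hits.hits: the detailed data of the documents that matched the search (the top 10 by default)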
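To make the field descriptions above concrete, here is a minimal Python sketch that reads those fields out of a response. The `response` dict is an abridged copy of the output shown earlier; the helper name `summarize` and the keys of the summary it returns are made up for illustration and are not part of any Elasticsearch client.

```python
# Abridged copy of the search response shown above, as a Python dict.
response = {
    "took": 1,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_id": "2", "_score": 1.0,
             "_source": {"name": "jiajieshi yagao", "price": 25}},
            {"_id": "3", "_score": 1.0,
             "_source": {"name": "zhonghua yagao", "price": 40}},
        ],
    },
}


def summarize(resp):
    """Hypothetical helper: pull the commonly used fields out of a response."""
    total = resp["hits"]["total"]
    return {
        "took_ms": resp["took"],
        "timed_out": resp["timed_out"],
        "failed_shards": resp["_shards"]["failed"],
        "total": total["value"],
        # "eq" means the count is exact; "gte" would mean a lower bound.
        "total_is_exact": total["relation"] == "eq",
        "names": [hit["_source"]["name"] for hit in resp["hits"]["hits"]],
    }


summary = summarize(response)
print(summary)
```

In a real application the dict would come from parsing the JSON body of the HTTP response rather than from a literal.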
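It is called query string search because the search parameters are passed in the query string of the HTTP request.
For example, to search for products whose name contains yagao, sorted by price in descending order: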
GET /product/_search?q=name:yagao&sort=price:desc

Response:

{
  "took" : 36,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "name" : "zhonghua yagao",
          "desc" : "caoben zhiwu",
          "price" : 40,
          "producer" : "zhonghua producer",
          "tags" : [ "qingxin" ]
        },
        "sort" : [ 40 ]
      },
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : null,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [ "fangzhu" ]
        },
        "sort" : [ 25 ]
      }
    ]
  }
}
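Query string search suits ad hoc use from command-line tools such as curl, where you want to fire off a quick request and look something up. Once a query becomes complex, however, the search conditions are hard to express this way, so it is rarely used in production.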
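As a rough illustration of how such a request URL is put together, the Python sketch below builds the query string with the standard library. The base URL `http://localhost:9200` and the helper name `build_search_url` are assumptions made for this example; the post itself only shows the console form of the request.

```python
from urllib.parse import urlencode


def build_search_url(base_url, index, field, term, sort_field=None, order="asc"):
    """Hypothetical helper: build a query string search URL."""
    params = {"q": f"{field}:{term}"}
    if sort_field:
        params["sort"] = f"{sort_field}:{order}"
    # safe=":" keeps the colons readable instead of encoding them as %3A.
    return f"{base_url}/{index}/_search?{urlencode(params, safe=':')}"


# Assumed base URL; adjust to wherever the cluster actually listens.
url = build_search_url("http://localhost:9200", "product",
                       "name", "yagao", sort_field="price", order="desc")
print(url)
```

Every additional condition has to be squeezed into the same URL, which is the reason this style becomes unwieldy for complex queries.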
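2. Query DSL
What is a DSL?
DSL: Domain Specific Language
With the query DSL, the search parameters go in the HTTP request body and the query is written as JSON. This is convenient and can express all kinds of complex queries, which makes it far more powerful than query string search.
Format:
GET /{index}/_search
{
  JSON query body
}
Below are some practical examples.
Query all products: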
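GET /product/_search
{
  "query": {
    "match_all": {}
  }
}
Query products whose name contains yagao, sorted by price in descending order:
GET /product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "sort": [
    { "price": { "order": "desc" } }
  ]
}
Paginate the products. Suppose there are 3 products in total and each page shows 1 product; to display page 2, skip the first product ("from": 1) and return one product ("size": 1), which yields the 2nd product:
GET /product/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}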
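The relation between a page number and the from/size pair can be sketched in Python as follows. The helper names `page_to_from_size` and `paged_match_all` are invented for this example, and pages are assumed to be numbered from 1.

```python
def page_to_from_size(page, page_size):
    """Convert a 1-based page number into the from/size parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be >= 1")
    # "from" is the number of hits to skip before the requested page.
    return {"from": (page - 1) * page_size, "size": page_size}


def paged_match_all(page, page_size):
    """Hypothetical helper: a match_all request body for the given page."""
    body = {"query": {"match_all": {}}}
    body.update(page_to_from_size(page, page_size))
    return body


body = paged_match_all(page=2, page_size=1)
print(body)
```

For page 2 with one product per page this produces the same body as the request above.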
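Return only the name and price of each product, that is, customize the returned fields:
GET /product/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name", "price"]
}
The query DSL is better suited to production use because it can express complex queries.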
3. Query filter
Search for products whose name contains yagao and whose price is greater than 25 yuan:
GET /product/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "yagao" } }
      ],
      "filter": {
        "range": {
          "price": { "gt": 25 }
        }
      }
    }
  }
}
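To show what the must clause and the range filter select together, here is a small in-memory Python sketch over the two products from the earlier responses. It only mimics which documents qualify; it does not reproduce Elasticsearch scoring, and the function name `bool_search` is made up for this example.

```python
# The two products seen in the earlier responses (abridged).
products = [
    {"id": "2", "name": "jiajieshi yagao", "price": 25},
    {"id": "3", "name": "zhonghua yagao", "price": 40},
]


def bool_search(docs, must_term, price_gt):
    """Toy stand-in for a bool query with one match clause and one range filter."""
    result_ids = []
    for doc in docs:
        # must: the name has to contain the term (whitespace tokenization assumed).
        matches_must = must_term in doc["name"].lower().split()
        # filter: "gt" is a strict comparison, so a price of exactly 25 is excluded.
        passes_filter = doc["price"] > price_gt
        if matches_must and passes_filter:
            result_ids.append(doc["id"])
    return result_ids


print(bool_search(products, "yagao", 25))
```

Both products match yagao, but only the one priced at 40 passes the filter, because "gt" excludes the boundary value 25.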
4. Full-text search
GET /product/_search
{
  "query": {
    "match": {
      "producer": "yagao producer"
    }
  }
}

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 0.18232156,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [ "fangzhu" ]
        }
      },
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.18232156,
        "_source" : {
          "name" : "zhonghua yagao",
          "desc" : "caoben zhiwu",
          "price" : 40,
          "producer" : "zhonghua producer",
          "tags" : [ "qingxin" ]
        }
      }
    ]
  }
}
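Why is the zhonghua producer document returned as well? Because when the data was first indexed, the producer field was split into terms and an inverted index was built that maps each term to the IDs of the documents containing it (document IDs as in the response above):
jiajieshi    2
zhonghua     3
producer     2, 3
When searching for yagao producer, the search string is likewise split into yagao and producer. The term yagao appears in no producer value, but producer appears in both documents, so both are returned.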
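The lookup described above can be sketched with a toy inverted index in Python. This is a simplification under stated assumptions: terms are produced by lowercasing and splitting on whitespace, which only approximates what an Elasticsearch analyzer does, and the function names are invented for this example. The document IDs 2 and 3 are the ones seen in the responses.

```python
from collections import defaultdict

# producer field values of the two documents, keyed by document ID.
docs = {"2": "jiajieshi producer", "3": "zhonghua producer"}


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def match_any(index, query):
    """Return IDs of documents that contain at least one query term."""
    hits = set()
    for term in query.lower().split():
        # .get avoids creating empty entries for unknown terms such as "yagao".
        hits |= index.get(term, set())
    return sorted(hits)


index = build_inverted_index(docs)
print(match_any(index, "yagao producer"))
```

The term yagao contributes nothing, while producer alone is enough to pull in both documents.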
5. Phrase search
Phrase search is the opposite of full-text search. Full-text search splits the input search string into terms and looks each one up in the inverted index; a document is returned as long as it matches any one of the terms. Phrase search instead requires the specified field to contain the whole search string exactly, with the same terms adjacent and in the same order, before the document counts as a match and is returned.
GET /product/_search
{
  "query": {
    "match_phrase": {
      "producer": "jiajieshi producer"
    }
  }
}

Response:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 0.87546873,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [ "fangzhu" ]
        }
      }
    ]
  }
}
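The difference from term-by-term matching can be illustrated with a toy Python check that looks for the query terms as one contiguous, ordered run inside the field. It assumes whitespace tokenization and no tolerance for gaps between terms; the function names are made up for this example and the real implementation works on term positions stored in the index.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a simplification of real analysis)."""
    return text.lower().split()


def phrase_match(field_text, phrase):
    """True if the phrase terms occur adjacent and in order in the field."""
    tokens = tokenize(field_text)
    wanted = tokenize(phrase)
    n = len(wanted)
    if n == 0:
        return False
    # Slide a window of n tokens across the field and compare it to the phrase.
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))


for producer in ["jiajieshi producer", "zhonghua producer"]:
    print(producer, phrase_match(producer, "jiajieshi producer"))
```

Only the jiajieshi producer document passes, which agrees with the single hit in the response above.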
6. Highlight search
GET /product/_search
{
  "query": {
    "match_phrase": {
      "producer": "jiajieshi producer"
    }
  },
  "highlight": {
    "fields": {
      "producer": {}
    }
  }
}

Response:

{
  "took" : 23,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 1, "relation" : "eq" },
    "max_score" : 0.87546873,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.87546873,
        "_source" : {
          "name" : "jiajieshi yagao",
          "desc" : "youxiao fangzhu",
          "price" : 25,
          "producer" : "jiajieshi producer",
          "tags" : [ "fangzhu" ]
        },
        "highlight" : {
          "producer" : [ "<em>jiajieshi</em> <em>producer</em>" ]
        }
      }
    ]
  }
}
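The shape of the highlight fragment in the response can be imitated with a few lines of Python. This is only a sketch of the idea, assuming whitespace tokenization and the `<em>` tags seen in the output above; the function name `highlight` is invented here and the code says nothing about how Elasticsearch builds fragments internally.

```python
def highlight(field_text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap every word of the field that also occurs in the query in tags."""
    query_terms = set(query.lower().split())
    marked = [
        f"{pre_tag}{word}{post_tag}" if word.lower() in query_terms else word
        for word in field_text.split()
    ]
    return " ".join(marked)


fragment = highlight("jiajieshi producer", "jiajieshi producer")
print(fragment)
```

A front end can then render the tagged terms in a different color or weight.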