Multiple Ways to Search in Elasticsearch
I. Query String Search
1. Search all products
GET /shop_index/productInfo/_search
Response:
{
  "took": 8,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "2", "_score": 1,
        "_source": { "test": "test" } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 1,
        "_source": { "name": "HuaWei P20", "desc": "Expen but easy to use", "price": 5300, "producer": "HuaWei Producer", "tags": [ "Expen", "Fast" ] } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "1", "_score": 1,
        "_source": { "name": "HuaWei Mate8", "desc": "Cheap and easy to use", "price": 2500, "producer": "HuaWei Producer", "tags": [ "Cheap", "Fast" ] } }
    ]
  }
}
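Field descriptions:
took: how long the search took, in milliseconds
timed_out: whether the search timed out (here it did not)
_shards: the data is split across 5 shards; the search used all 5, all 5 returned data successfully, 0 failed, and 0 were skipped
hits.total: the number of matching documents, 3 in this case
max_score: the highest relevance score among the matching documents; the more relevant a document is to the search, the higher its score
hits.hits: the detailed data of the documents that matched the search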
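2. Search for products whose name contains HuaWei, sorted by price in descending order:
This is where the name "Query String Search" comes from: the search parameters are all carried in the query string of the HTTP request.
GET /shop_index/productInfo/_search?q=name:HuaWei&sort=price:desc
Response: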
{
  "took": 23,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": null,
        "_source": { "name": "HuaWei P20", "desc": "Expen but easy to use", "price": 5300, "producer": "HuaWei Producer", "tags": [ "Expen", "Fast" ] },
        "sort": [ 5300 ] },
      { "_index": "shop_index", "_type": "productInfo", "_id": "1", "_score": null,
        "_source": { "name": "HuaWei Mate8", "desc": "Cheap and easy to use", "price": 2500, "producer": "HuaWei Producer", "tags": [ "Cheap", "Fast" ] },
        "sort": [ 2500 ] }
    ]
  }
}
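To make the "parameters travel in the query string" idea concrete, here is a minimal Python sketch that assembles the same request path from its parts. The helper name build_query_string_url is hypothetical and only builds the string; it does not contact a cluster.

```python
from urllib.parse import urlencode


def build_query_string_url(index, doc_type, field, term, sort_field, order="desc"):
    """Return the path and query string for a query string search (hypothetical helper)."""
    params = {"q": f"{field}:{term}", "sort": f"{sort_field}:{order}"}
    # safe=":" leaves the colons unescaped so the result reads like the request above.
    return f"/{index}/{doc_type}/_search?{urlencode(params, safe=':')}"


print(build_query_string_url("shop_index", "productInfo", "name", "HuaWei", "price"))
```

Anything that is not URL-safe, such as a space in the search term, gets escaped by urlencode, which is one reason more complex conditions are easier to express in a request body.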
II. Query DSL (DSL: Domain-Specific Language)
This approach passes the search conditions as a JSON HTTP request body. It can express many kinds of complex queries and is more powerful than query string search.
1. Search all products
GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  }
}
Response omitted...
2. Search for products whose name contains HuaWei, sorted by price in descending order
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "sort": [
    { "price": { "order": "desc" } }
  ]
}
Response omitted...
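Because a Query DSL request is plain JSON, it can be assembled as an ordinary data structure in client code. The sketch below builds the body of the match-and-sort request in Python; match_sorted_body is a hypothetical helper, and sending the body to a cluster is left out.

```python
import json


def match_sorted_body(field, text, sort_field, order="desc"):
    """Build a match query sorted by one field (hypothetical helper)."""
    return {
        "query": {"match": {field: text}},
        "sort": [{sort_field: {"order": order}}],
    }


# Serialize to the JSON text that would go in the HTTP request body.
print(json.dumps(match_sorted_body("name", "HuaWei", "price"), indent=2))
```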
3. Paginated query: fetch the second page, with 1 record per page
GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}
Response:
{
  "took": 6,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 1,
        "_source": { "name": "HuaWei P20", "desc": "Expen but easy to use", "price": 5300, "producer": "HuaWei Producer", "tags": [ "Expen", "Fast" ] } }
    ]
  }
}
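Notes:
(1) In a real project, when you need to paginate the results of a conditional query, you do not have to run a separate query for the total count: ES returns the total number of matching documents (hits.total), and you can use it directly;
(2) In ES pagination, from is a zero-based offset and defaults to 0, so the first page starts at from = 0;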
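The two notes above translate into simple arithmetic. The following Python sketch, with the hypothetical helpers page_body and total_pages, converts a 1-based page number into the zero-based from offset and derives the page count from hits.total.

```python
def page_body(page, size):
    """Build a match_all body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    # from is a zero-based offset, so page 1 starts at 0.
    return {"query": {"match_all": {}}, "from": (page - 1) * size, "size": size}


def total_pages(total_hits, size):
    """Number of pages needed for total_hits documents (ceiling division)."""
    return -(-total_hits // size)


print(page_body(2, 1))
print(total_pages(3, 1))
```

With page = 2 and size = 1 this gives from = 1, which matches the request shown in this section.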
4. Return only specific fields, for example name, desc, and price, and leave out the others
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "_source": ["name", "desc", "price"]
}
Response:
{
  "took": 27,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 0.2876821,
        "_source": { "price": 5300, "name": "HuaWei P20", "desc": "Expen but easy to use" } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "1", "_score": 0.2876821,
        "_source": { "price": 2500, "name": "HuaWei Mate8", "desc": "Cheap and easy to use" } }
    ]
  }
}
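The effect of the _source list can be pictured as a projection over each document. This Python sketch is only a local illustration of that idea: filter_source is a hypothetical helper that handles top-level field names and is not how ES implements source filtering.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields of a document (toy illustration)."""
    return {key: value for key, value in source.items() if key in fields}


product = {
    "name": "HuaWei P20",
    "desc": "Expen but easy to use",
    "price": 5300,
    "producer": "HuaWei Producer",
    "tags": ["Expen", "Fast"],
}

print(filter_source(product, ["name", "desc", "price"]))
```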
III. Query Filter (filtering query results)
For example, find the products whose name contains HuaWei and whose price is greater than 4000:
GET /shop_index/productInfo/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "HuaWei" } }
      ],
      "filter": {
        "range": {
          "price": { "gt": 4000 }
        }
      }
    }
  }
}
Response:
{
  "took": 195,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 0.2876821,
        "_source": { "name": "HuaWei P20", "desc": "Expen but easy to use", "price": 5300, "producer": "HuaWei Producer", "tags": [ "Expen", "Fast" ] } }
    ]
  }
}
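The combination of a must clause and a filter clause can be mimicked locally to see which records survive. The Python sketch below is a toy model over three of the sample products; bool_search and its whitespace tokenization are assumptions for illustration, not ES behavior in full.

```python
PRODUCTS = [
    {"name": "HuaWei P20", "price": 5300},
    {"name": "HuaWei Mate8", "price": 2500},
    {"name": "HuaWei nova 4e", "price": 1999},
]


def bool_search(docs, name_term, price_gt):
    """Toy bool query: name must contain the term, price must exceed price_gt."""
    hits = []
    for doc in docs:
        name_terms = doc["name"].lower().split()  # simplistic tokenization (assumption)
        matches_name = name_term.lower() in name_terms  # the "must" clause
        passes_filter = doc["price"] > price_gt  # the "filter" clause, "gt" is exclusive
        if matches_name and passes_filter:
            hits.append(doc)
    return hits


print(bool_search(PRODUCTS, "HuaWei", 4000))
```

In the real response above, the score of the surviving record equals the score from the plain match query on name, which is consistent with the filter clause narrowing the results without contributing to the score.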
IV. Full-Text Search
Search the producer field for "HuaWei MateProducer":
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "producer": "HuaWei MateProducer"
    }
  }
}
Response:
{
  "took": 8,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 4,
    "max_score": 0.5753642,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "SiUBRWkB8mgaHjxkJHyS", "_score": 0.5753642,
        "_source": { "name": "HuaWei Mate10", "desc": "Cheap and Beauti", "price": 2300, "producer": "HuaWei MateProducer", "tags": [ "Cheap", "Beauti" ] } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "1", "_score": 0.2876821,
        "_source": { "name": "HuaWei Mate8", "desc": "Cheap and easy to use", "price": 2500, "producer": "HuaWei Producer", "tags": [ "Cheap", "Fast" ] } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 0.18232156,
        "_source": { "name": "HuaWei P20", "desc": "Expen but easy to use", "price": 5300, "producer": "HuaWei Producer", "tags": [ "Expen", "Fast" ] } },
      { "_index": "shop_index", "_type": "productInfo", "_id": "CSX8RGkB8mgaHjxkV3w1", "_score": 0.18232156,
        "_source": { "name": "HuaWei nova 4e", "desc": "cheap and look nice", "price": 1999, "producer": "HuaWei Producer", "tags": [ "Cheap", "Nice" ] } }
    ]
  }
}
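The results above show that:
the record with ID "SiUBRWkB8mgaHjxkJHyS" has the highest score, meaning it is the best match;
Reason:
After analysis, the search text for producer is split into the following terms:
(1) HuaWei:
IDs of the records matching this term: 'SiUBRWkB8mgaHjxkJHyS', '1', 'CSX8RGkB8mgaHjxkV3w1', 'zyWpRGkB8mgaHjxk0Hfo'
(2) MateProducer:
IDs of the records matching this term: 'SiUBRWkB8mgaHjxkJHyS'
Because both terms of "HuaWei MateProducer" match the record with ID 'SiUBRWkB8mgaHjxkJHyS', that record gets the highest score.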
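The reasoning above can be reproduced with a toy model in Python. It assumes a simplistic analyzer (lowercase, split on whitespace) and ranks records by how many search terms they match; this count is an illustrative stand-in and not the relevance score ES actually computes.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()


# Product name -> producer field, taken from the sample data in this article.
PRODUCERS = {
    "HuaWei Mate10": "HuaWei MateProducer",
    "HuaWei Mate8": "HuaWei Producer",
    "HuaWei P20": "HuaWei Producer",
    "HuaWei nova 4e": "HuaWei Producer",
}


def match_any(query, docs):
    """Return (name, matched_term_count) pairs, best match first."""
    query_terms = set(analyze(query))
    hits = []
    for name, producer in docs.items():
        matched = query_terms & set(analyze(producer))
        if matched:  # a single matching term is enough to be a hit
            hits.append((name, len(matched)))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


for name, count in match_any("HuaWei MateProducer", PRODUCERS):
    print(name, count)
```

All four records are hits because each contains the term HuaWei, but only HuaWei Mate10 matches both terms, so it comes first.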
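V. Phrase Search
Differences between phrase search and full-text search:
(1) Full-text search: the search text is split into terms, and each term is looked up in the inverted index; a record counts as a match as soon as any one term matches;
(2) Phrase search: the specified field must contain the whole search string, with the same terms in the same order, for the record to match and be returned;
For example, search for products whose producer contains the phrase "HuaWei MateProducer":
GET /shop_index/productInfo/_search
{
  "query": {
    "match_phrase": {
      "producer": "HuaWei MateProducer"
    }
  }
}
Response: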
{
  "took": 158,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "SiUBRWkB8mgaHjxkJHyS", "_score": 0.5753642,
        "_source": { "name": "HuaWei Mate10", "desc": "Cheap and Beauti", "price": 2300, "producer": "HuaWei MateProducer", "tags": [ "Cheap", "Beauti" ] } }
    ]
  }
}
As you can see, only the record that contains "HuaWei MateProducer" is returned.
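The difference from full-text matching can be sketched in a few lines of Python. This toy version assumes the same simplistic analyzer (lowercase, split on whitespace) and treats a phrase match as the search terms appearing consecutively and in order in the field; the function name match_phrase is borrowed for readability and is not the ES implementation.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace (an assumption)."""
    return text.lower().split()


def match_phrase(phrase, field_value):
    """True if the phrase terms occur consecutively and in order in the field."""
    phrase_terms = analyze(phrase)
    field_terms = analyze(field_value)
    if not phrase_terms:
        return False
    width = len(phrase_terms)
    # Slide a window of the phrase length across the field terms.
    return any(
        field_terms[start:start + width] == phrase_terms
        for start in range(len(field_terms) - width + 1)
    )


print(match_phrase("HuaWei MateProducer", "HuaWei MateProducer"))
print(match_phrase("HuaWei MateProducer", "HuaWei Producer"))
```

A record whose producer is "HuaWei Producer" contains one of the two terms, which is enough for a match query but not for a phrase match.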
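VI. Highlight Search
Highlighting means that terms needing emphasis in the search results are displayed with a specific style.
For example, search for products whose name contains "Xiao'Mi" and highlight the matched keyword:
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao'Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}
Response: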
{
  "took": 348,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "HiX9RGkB8mgaHjxk4nxC", "_score": 0.2876821,
        "_source": { "name": "Xiao'Mi 9", "desc": "Expen but nice and Beauti", "price": 3500, "producer": "XiaoMi Producer", "tags": [ "Expen", "Beauti" ] },
        "highlight": { "name": [ "<em>Xiao'Mi</em> 9" ] } }
    ]
  }
}
As you can see, "Xiao'Mi" is returned wrapped in <em> tags, so it renders in italics when placed directly in HTML.
To use a custom highlight style, set pre_tags and post_tags. For example, to display the match in red:
GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao'Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    },
    "pre_tags": [ "<em style='color:red;'>" ],
    "post_tags": [ "</em>" ]
  }
}
Response:
{
  "took": 10,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      { "_index": "shop_index", "_type": "productInfo", "_id": "HiX9RGkB8mgaHjxk4nxC", "_score": 0.2876821,
        "_source": { "name": "Xiao'Mi 9", "desc": "Expen but nice and Beauti", "price": 3500, "producer": "XiaoMi Producer", "tags": [ "Expen", "Beauti" ] },
        "highlight": { "name": [ "<em style='color:red;'>Xiao'Mi</em> 9" ] } }
    ]
  }
}
The search keyword in the response is now wrapped in a tag whose CSS style displays it in red.
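What the pre and post tags do can be illustrated with a toy highlighter in Python. The highlight function below is hypothetical: it splits on whitespace, wraps every word that equals a search term (ignoring case), and does no HTML escaping, so it only approximates the idea behind the ES highlighter.

```python
def highlight(field_value, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap each word of field_value that matches a query term (toy illustration)."""
    query_terms = set(query.lower().split())
    words = []
    for word in field_value.split():
        if word.lower() in query_terms:
            words.append(f"{pre_tag}{word}{post_tag}")
        else:
            words.append(word)
    return " ".join(words)


# Default tags, then custom tags that render the match in red.
print(highlight("HuaWei P20", "HuaWei"))
print(highlight("HuaWei P20", "HuaWei", pre_tag="<em style='color:red;'>"))
```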