The fielddata=true error when running aggregations in Elasticsearch
https://blog.csdn.net/baristas/article/details/78974090
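In Elasticsearch, fielddata is disabled (false) by default on text fields, because enabling it loads the field's terms into heap memory and can consume a lot of it.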
Index: megacorp
Type: employee
Suppose you run the following aggregation query:
GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
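The request above is written in console syntax. Since the fix later in this post uses curl, here is a minimal sketch of the same query as a curl call. The host and port are assumed to be the ones used by the curl commands in this post, and `run_interests_agg` is a hypothetical helper name, not part of Elasticsearch.

```shell
# Assumed host and port, taken from the curl commands in this post.
ES_HOST="${ES_HOST:-127.0.0.1:9200}"

# Request body: a terms aggregation on the interests field.
AGG_BODY='{"aggs":{"all_interests":{"terms":{"field":"interests"}}}}'

# Hypothetical helper: sends the aggregation to the megacorp index.
run_interests_agg() {
  curl -s -H "Content-Type: application/json" \
    -XGET "http://${ES_HOST}/megacorp/employee/_search?pretty" \
    -d "${AGG_BODY}"
}
```

Calling `run_interests_agg` against a running node sends the same request as the console version.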
Elasticsearch returns the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "megacorp",
        "node" : "hDWX06IlTiu7gK3ybmsb-g",
        "reason" : {
          "type" : "illegal_argument_exception",
          "reason" : "Fielddata is disabled on text fields by default. Set fielddata=true on [interests] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
        }
      }
    ]
  },
  "status" : 400
}
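In short: fielddata is not enabled on the interests field.
To fix this, enable fielddata on the field by updating its mapping. The first command below is the general form, and the second applies it to the megacorp example: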
curl -i -H "Content-Type: application/json" -XPUT "127.0.0.1:9200/your_index/_mapping/your_type/?pretty" -d '{"your_type":{"properties":{"your_field_name":{"type":"text","fielddata":true}}}}'
curl -i -H "Content-Type: application/json" -XPUT "127.0.0.1:9200/megacorp/_mapping/employee/?pretty" -d '{"employee":{"properties":{"interests":{"type":"text","fielddata":true}}}}'
Here interests is the field on which fielddata needs to be enabled.
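It can help to confirm that the mapping change was applied, and to watch how much memory fielddata ends up using. The sketch below wraps the get-mapping API and the cat fielddata API in small shell functions. The function names are hypothetical, and the host and port are assumed to be the ones used by the curl commands in this post. Fielddata is loaded lazily, so the usage figure is only meaningful after an aggregation or sort on the field has run.

```shell
# Assumed host and port, taken from the curl commands in this post.
ES_HOST="${ES_HOST:-127.0.0.1:9200}"

# Hypothetical helper: URL of the mapping for one index and type.
mapping_url() {
  printf 'http://%s/%s/_mapping/%s?pretty' "$ES_HOST" "$1" "$2"
}

# Hypothetical helper: URL of the cat fielddata API, limited to one field.
fielddata_url() {
  printf 'http://%s/_cat/fielddata?v&fields=%s' "$ES_HOST" "$1"
}

# Print the mapping; the field should now list "fielddata" : true.
show_mapping() {
  curl -s -XGET "$(mapping_url "$1" "$2")"
}

# Print the heap used by fielddata for the given field on each node.
show_fielddata_usage() {
  curl -s -XGET "$(fielddata_url "$1")"
}
```

For the example in this post, the calls would be `show_mapping megacorp employee` and `show_fielddata_usage interests`.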
Now run the aggregation query again. The response contains:
"aggregations" : {
  "all_interests" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "music",
        "doc_count" : 2
      },
      {
        "key" : "forestry",
        "doc_count" : 1
      },
      {
        "key" : "sports",
        "doc_count" : 1
      }
    ]
  }
}
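The response now shows the documents grouped by interests, and these buckets can be used for further analysis, such as finding the most common interests among employees.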
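The error message also suggests a keyword field as an alternative that avoids loading fielddata into memory. The sketch below runs the same terms aggregation on interests.keyword. This assumes that the sub-field exists, which is the case when the field was created by default dynamic mapping in Elasticsearch 5.0 or later, or when it was defined explicitly in the mapping. `run_keyword_agg` is a hypothetical helper name, and the host and port are assumed to be the ones used by the curl commands in this post.

```shell
# Assumed host and port, taken from the curl commands in this post.
ES_HOST="${ES_HOST:-127.0.0.1:9200}"

# Same terms aggregation, but on the keyword sub-field (assumed to exist).
# "size": 0 leaves the matching documents out of the response.
KEYWORD_AGG_BODY='{"size":0,"aggs":{"all_interests":{"terms":{"field":"interests.keyword"}}}}'

# Hypothetical helper: sends the aggregation to the megacorp index.
run_keyword_agg() {
  curl -s -H "Content-Type: application/json" \
    -XGET "http://${ES_HOST}/megacorp/employee/_search?pretty" \
    -d "${KEYWORD_AGG_BODY}"
}
```

A keyword field is not analyzed, so each bucket key is the whole original value rather than an individual token. For multi-word values the buckets can therefore differ from the ones returned by the fielddata approach.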