ElasticSearch中插入的数据是韩文,因此我无法提供确切的情况,但可以说我有一个单词ABBCC
被标记为["A","BBCC"]
,另一个单词AZZXXX
被标记为["A","ZZXXX"]
。
如果我搜索ABBCC,那么AZZXXX是否应该出现,因为它们具有相同的令牌?还是这不是Elasticsearch的工作方式?
这是我检查已分析单词的方式:
GET recpost_test/_analyze
{
"analyzer": "my_analyzer",
"text":"my query String!"
}
这就是我创建索引的方式:
PUT recpost
{
"settings": {
"index": {
"analysis": {
"tokenizer": {
"nori_user_dict": {
"type": "nori_tokenizer",
"decompound_mode": "mixed",
"user_dictionary": "userdict_ko.txt"
}
},
"analyzer": {
"my_analyzer": {
"type": "custom",
"tokenizer": "nori_user_dict"
}
},
"filter": {
"substring": {
"type": "edgeNGram",
"min_gram": 1,
"max_gram": 10
}
}
}
}
}
}
这是我的搜索方式:
GET recpost/_search
{
"_source": [""],
"from": 0,
"size": 2,
"query":{
"multi_match": {
"query" : "my query String!",
"type": "best_fields",
"fields" : [
"brandkor",
"content",
"itemname",
"name",
"review",
"shortreview^2",
"title^3"]
}
}
}
编辑:我尝试添加“分析器”字段进行搜索,但仍然无法正常工作
GET recpost/_search
{
"_source": [""],
"from": 0,
"size": 2,
"query":{
"multi_match": {
"query" : "깡스",
"analyzer": "my_analyzer",
"type": "best_fields",
"fields" : [
"brandkor",
"content",
"itemname",
"name",
"review",
"shortreview^2",
"title^3"]
}
}
}
EDIT2:这是我的映射:
{
"recpost_test" : {
"mappings" : {
"properties" : {
"@timestamp" : {
"type" : "date"
},
"brandkor" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"content" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"field_statistics" : {
"type" : "boolean"
},
"fields" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"itemname" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"offsets" : {
"type" : "boolean"
},
"payloads" : {
"type" : "boolean"
},
"positions" : {
"type" : "boolean"
},
"review" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"shortreview" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"term_statistics" : {
"type" : "boolean"
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"type" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
我没有看到您将字段装入索引(映射)。因此,据我所知,您是将所有字段(brandkor,content等)都索引为text
..基本上是在匹配精确值。
除非您将每个字段与其分析器相关联。