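I know Elasticsearch has a nice Compound Word Token Filter, but my problem is a bit different. I want to know how search engines like Google handle open-form compound words such as "post office" or "living room". If you type "postoffice" instead of "post office", you still get the same results. I would like to have this behavior in my Elasticsearch search engine. What is the solution to this problem? Should I tokenize "post office" as a single token? If so, how?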
You should apply an analyzer to the search query.
See the mapping and documents in my other answer.
Query with a compound word:
"something"
GET /decompounder/_search?filter_path=hits.hits
{
  "query": {
    "multi_match": {
      "query": "something",
      "analyzer": "lowercase_english_decompounder_standard_analyzer",
      "fields": ["name"]
    }
  }
}
Response:
{
  "hits" : {
    "hits" : [
      {
        "_index" : "decompounder",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.23911434,
        "_source" : {
          "name" : "something sea"
        }
      },
      {
        "_index" : "decompounder",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.23911434,
        "_source" : {
          "name" : "something tea"
        }
      },
      {
        "_index" : "decompounder",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.23911434,
        "_source" : {
          "name" : "something seaside"
        }
      }
    ]
  }
}
Query with two words:
"some thing"
GET /decompounder/_search?filter_path=hits.hits
{
  "query": {
    "multi_match": {
      "query": "some thing",
      "analyzer": "lowercase_english_decompounder_standard_analyzer",
      "fields": ["name"]
    }
  }
}
The response is the same.
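Both queries presumably return the same hits because the analyzer breaks "something" into its subwords at search time. The analyzer definition itself lives in the linked answer and is not reproduced here. As a rough sketch of what such a definition might look like, assuming a `dictionary_decompounder` token filter: the filter name `english_decompounder` and the `word_list` entries below are my own placeholders, and only the index, field, and analyzer names are taken from the queries above.

```
# Hypothetical sketch: the filter name and word_list are assumptions,
# not the actual settings from the linked answer.
PUT /decompounder
{
  "settings": {
    "analysis": {
      "filter": {
        "english_decompounder": {
          "type": "dictionary_decompounder",
          "word_list": ["some", "thing", "sea", "side"]
        }
      },
      "analyzer": {
        "lowercase_english_decompounder_standard_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "english_decompounder"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "lowercase_english_decompounder_standard_analyzer"
      }
    }
  }
}
```

With "post" and "office" in the word list, the same approach should let a query for "postoffice" match documents containing "post office". The `_analyze` API is a convenient way to check which tokens the analyzer actually emits for a given input before relying on it.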