我的类别字段中有很多带翻译的类别。我已经为索引中的字段定义了语言分析器,因此我可以搜索它们。但它没有找到我的单词的单数版本。在wasmachine
的titles.title-nl
是奇怪的wasmachines
但没找到。我错过了什么?
演示文档
"_source" : {
"google_id" : 2706,
"titles" : [
{
"title-en" : "laundry appliances",
"title-de" : "waschen & trocknen",
"title-fr" : "appareils de blanchisserie",
"title-nl" : "wasmachines"
}
]
}
我绘制它们的方式
PUT categories/_mapping/category
{
"dynamic": false,
"properties": {
"titles.title-nl": {
"type": "text",
"analyzer": "dutch"
},
"titles.title-en": {
"type": "text",
"analyzer": "english"
},
"titles.title-de": {
"type": "text",
"analyzer": "german"
},
"titles.title-fr": {
"type": "text",
"analyzer": "french"
}
}
}
我搜索它们的方式
GET categories/_search
{
"size": 4,
"query": {
"multi_match": {
"query": "wasmachines",
"fields": ["titles.title-de","titles.title-en", "titles.title-fr", "titles.title-nl"]
}
}
}
问题是默认的荷兰分析器不知道如何阻止单词wasmachines
,你需要使用stemmer_override
用自定义分析器重新创建你的索引。
看看elastic documentation你可以做以下重建dutch
分析仪并告诉wasmachines
应该被阻止到wasmachine
,只需将wasmachine => wasmachines
放入stemmer_override
的规则中
PUT categories/
{
"settings": {
"analysis": {
"filter": {
"dutch_stop": {
"type": "stop",
"stopwords": "_dutch_"
},
"dutch_keywords": {
"type": "keyword_marker",
"keywords": ["voorbeeld"]
},
"dutch_stemmer": {
"type": "stemmer",
"language": "dutch"
},
"dutch_override": {
"type": "stemmer_override",
"rules": [
"fiets=>fiets",
"bromfiets=>bromfiets",
"wasmachine=>wasmachines",
"ei=>eier",
"kind=>kinder"
]
}
},
"analyzer": {
"rebuilt_dutch": {
"tokenizer": "standard",
"filter": [
"lowercase",
"dutch_stop",
"dutch_keywords",
"dutch_override",
"dutch_stemmer"
]
}
}
}
}
}
您还需要在映射中使用新的分析器:
PUT categories/_mapping/category
{
"dynamic": false,
"properties": {
"titles.title-nl": {
"type": "text",
"analyzer": "rebuilt_dutch"
},
"titles.title-en": {
"type": "text",
"analyzer": "english"
},
"titles.title-de": {
"type": "text",
"analyzer": "german"
},
"titles.title-fr": {
"type": "text",
"analyzer": "french"
}
}
}
之后,您将能够搜索wasmachine
并获取具有wasmachines
的文档。