I need to be able to query an Elasticsearch index to see whether any documents already have a specific value for a field that is mapped like this:
"name" : {
  "type" : "text",
  "fields" : {
    "raw" : {
      "type" : "keyword"
    }
  }
}
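I originally planned to use a normalizer for this, but I would like to avoid making changes to the index itself. Then I found the match_phrase query, which is almost exactly what I need. The problem is that it also returns partial matches, as long as they start the same way. For example, if I search for the value this is a test, it returns results for the following values: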
this is a test 1
this is a test but i'm almost done now
this is a test again
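To see why these values come back, it may help to model what a phrase match checks: that the query tokens appear consecutively and in order somewhere in the field, not that they make up the whole field. The sketch below is a simplified stand-in and not how Lucene implements it. The helpers tokenize and phrase_matches are hypothetical, and lowercasing plus splitting on whitespace only approximates the standard analyzer.

```python
def tokenize(text):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def phrase_matches(query, field_value):
    """Return True if the query tokens occur consecutively, in order, in the field."""
    q = tokenize(query)
    f = tokenize(field_value)
    if not q:
        return False
    # Slide a window of len(q) tokens over the field and compare each slice.
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


values = [
    "this is a test",
    "this is a test 1",
    "this is a test but i'm almost done now",
    "this is a test again",
]
matches = [v for v in values if phrase_matches("this is a test", v)]
print(matches)  # every value contains the phrase, so all of them match
```

Under this model the phrase only has to be contained in the field, which is why longer values are returned alongside the exact one.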
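How can I make sure the match_phrase query does not return the examples I posted above?
One workaround is to keep match_phrase and add a script filter that checks the length of name.raw: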
GET definitions/_search
{
  "query": {
    "bool": {
      "must": {
        "match_phrase": {
          "name": {
            "query": "Test Name"
          }
        }
      },
      "filter": [
        {
          "script": {
            "script": {
              "source": "doc['name.raw'].value.length() == 9",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}
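The hard-coded 9 in that script is simply the length of Test Name. As a rough sketch, the same request body could be built from the search string so the two never drift apart. The helper exact_phrase_body is hypothetical, and passing the length through script params is a choice made for this sketch rather than something the query above does. It also assumes every candidate document has a name.raw value and that the text contains no characters outside the Basic Multilingual Plane, since Python and Java count those differently.

```python
import json


def exact_phrase_body(text, field="name", raw_field="name.raw"):
    """Build a match_phrase query plus a script filter on the raw value's length."""
    return {
        "query": {
            "bool": {
                "must": {"match_phrase": {field: {"query": text}}},
                "filter": [
                    {
                        "script": {
                            "script": {
                                "lang": "painless",
                                # The expected length is passed as a parameter
                                # instead of being spliced into the source.
                                "source": f"doc['{raw_field}'].value.length() == params.len",
                                "params": {"len": len(text)},
                            }
                        }
                    }
                ],
            }
        }
    }


body = exact_phrase_body("Test Name")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after GET definitions/_search in the same way as the hand-written request.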
Then I figured that if I can check the length in a script, maybe I can also do a case-insensitive comparison there:
GET definitions/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": "doc['name.raw'].value.toLowerCase() == 'test name'",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}
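So those are the options. In my case I was worried about performance, so we bit the bullet and created a normalizer that allows case-insensitive comparison, which means neither of these ended up being used. I figured I would post them here anyway, since I couldn't find these answers anywhere else.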
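A normalizer setup of that kind might look roughly like the sketch below. The actual settings are not shown in this post, so the normalizer name lowercase_normalizer and the helpers index_body and term_body are made up for illustration. The idea is that a keyword field with a lowercase normalizer lowercases values at index time and lowercases the term at search time, so a plain term query behaves case-insensitively.

```python
import json


def index_body(normalizer_name="lowercase_normalizer"):
    """Index settings and mapping with a lowercasing normalizer on name.raw."""
    return {
        "settings": {
            "analysis": {
                "normalizer": {
                    # Custom normalizer that only applies the lowercase filter.
                    normalizer_name: {"type": "custom", "filter": ["lowercase"]}
                }
            }
        },
        "mappings": {
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {
                        "raw": {"type": "keyword", "normalizer": normalizer_name}
                    },
                }
            }
        },
    }


def term_body(value, raw_field="name.raw"):
    """Exact-match term query against the normalized keyword sub-field."""
    return {"query": {"term": {raw_field: {"value": value}}}}


print(json.dumps(index_body(), indent=2))
print(json.dumps(term_body("Test Name"), indent=2))
```

This does mean changing the index, which is the cost mentioned above, but it avoids running a script against every candidate document.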
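Use a term query. Since your name text field comes with a name.raw keyword multi-field, a term query on name.raw gives you an exact match. The query is case-sensitive by default; as of version 7.10, case-insensitive matching can be enabled with the case_insensitive parameter. Example: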
http PUT ":9200/my-index" <<END
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
END
http POST ":9200/my-index/_bulk" <<END
{"index":{"_id":"test1"}}
{"name":"this is a test"}
{"index":{"_id":"test2"}}
{"name":"this is a test 1"}
{"index":{"_id":"test3"}}
{"name":"this is a test but i'm almost done now"}
{"index":{"_id":"test4"}}
{"name":"No, this is not a test"}
END
#
# "term" query finds just "test1", as desired
#
http GET :9200/my-index/_search <<END
{
  "query": {
    "term": {
      "name.raw": {
        "value": "thiS is A TesT",
        "case_insensitive": true
      }
    }
  }
}
END
#
# Compare with match phrase query, which finds ALL four test documents.
# This includes the last, as we've played around with slop a bit
#
http GET :9200/my-index/_search <<END
{
  "query": {
    "match_phrase": {
      "name": {
        "query": "this is a test",
        "slop": 1
      }
    }
  }
}
END
# Note: this was run on OpenSearch 1.3.4; YMMV with other versions