Doesn't Elasticsearch return results that share the same token?

Question (votes: 0, answers: 1)

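The data indexed into Elasticsearch is Korean, so I can't share the exact case, but say I have a word ABBCC that is tokenized as ["A","BBCC"] and another word AZZXXX that is tokenized as ["A","ZZXXX"].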

If I search for ABBCC, shouldn't AZZXXX show up as well, since the two share a token? Or is that not how Elasticsearch works?
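To make the expectation concrete, here is a toy inverted index in Python. It is only a sketch of the matching model, not Elasticsearch code: the `TOKENS` table hard-codes the tokenization described above, `build_index` and `search` are hypothetical helpers, and it assumes the field was indexed with the same analyzer that is applied to the query. The `"or"` branch mimics the default operator of a `match` query.

```python
from collections import defaultdict

# Hard-coded stand-in for the analyzer output described in the question.
TOKENS = {
    "ABBCC": ["A", "BBCC"],
    "AZZXXX": ["A", "ZZXXX"],
}


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in TOKENS[text]:
            index[token].add(doc_id)
    return index


def search(index, query, operator="or"):
    """Return ids of documents matching any ("or") or all ("and") query tokens."""
    postings = [index.get(token, set()) for token in TOKENS[query]]
    if not postings:
        return set()
    if operator == "and":
        return set.intersection(*postings)
    return set.union(*postings)


docs = {1: "ABBCC", 2: "AZZXXX"}
index = build_index(docs)

print(sorted(search(index, "ABBCC")))         # documents sharing at least one token
print(sorted(search(index, "ABBCC", "and")))  # documents containing every token
```

Under this model AZZXXX would be expected to match through the shared token "A", only with a lower score. If it does not, one possible explanation is that the terms actually stored in the index differ from what `_analyze` reports for `my_analyzer`.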

This is how I check the analyzed words:

GET recpost_test/_analyze
{
  "analyzer": "my_analyzer",
  "text":"my query String!" 
}

This is how I created the index:

PUT recpost
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "nori_user_dict": {
            "type": "nori_tokenizer",
            "decompound_mode": "mixed",
            "user_dictionary": "userdict_ko.txt"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "type": "custom",
            "tokenizer": "nori_user_dict"
          }
        },
        "filter": {
          "substring": {
            "type": "edgeNGram",
            "min_gram": 1,
            "max_gram": 10
          }
        }
      }
    }
  }
}

This is how I search:

GET recpost/_search
{
  "_source": [""],
  "from": 0,
  "size": 2,
  "query":{
    "multi_match": {
      "query" : "my query String!",
      "type": "best_fields", 
      "fields" : [
        "brandkor",
        "content",
        "itemname",
        "name",
        "review",
        "shortreview^2",
        "title^3"]
    }
  }
}

Edit: I tried adding the "analyzer" parameter to the search, but it still doesn't work:

GET recpost/_search
{
  "_source": [""],
  "from": 0,
  "size": 2,
  "query":{
    "multi_match": {
      "query" : "깡스",
      "analyzer": "my_analyzer", 
      "type": "best_fields", 
      "fields" : [
        "brandkor",
        "content",
        "itemname",
        "name",
        "review",
        "shortreview^2",
        "title^3"]
    }
  }
}

EDIT2: This is my mapping:

{
  "recpost_test" : {
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "brandkor" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "content" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "field_statistics" : {
          "type" : "boolean"
        },
        "fields" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "itemname" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "offsets" : {
          "type" : "boolean"
        },
        "payloads" : {
          "type" : "boolean"
        },
        "positions" : {
          "type" : "boolean"
        },
        "review" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "shortreview" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "term_statistics" : {
          "type" : "boolean"
        },
        "title" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "type" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
elasticsearch search token
1 Answer
0
votes

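I don't see your fields being associated with the analyzer in the index mapping. As far as I can tell, all of your fields (brandkor, content, etc.) are indexed as plain text with no analyzer set, so they go through the default standard analyzer rather than my_analyzer. The index therefore holds whole words instead of the nori tokens, and a search is basically matching whole values.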

That will stay the case unless you associate each field with its analyzer in the mapping.
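As a minimal sketch of what such a mapping could look like, the Python snippet below builds a mappings body that sets `my_analyzer` on each text field used in the question's `multi_match` query. `build_mappings` is a hypothetical helper, and the choice to apply one analyzer to every field is an assumption. Note that the analyzer of an existing field cannot be changed in place, so this would normally mean creating a new index with the mapping and reindexing the documents.

```python
import json

# Field names taken from the question's multi_match query.
TEXT_FIELDS = ["brandkor", "content", "itemname", "name",
               "review", "shortreview", "title"]


def build_mappings(fields, analyzer):
    """Build a mappings body that assigns `analyzer` to every listed text field."""
    return {
        "properties": {
            field: {"type": "text", "analyzer": analyzer}
            for field in fields
        }
    }


mappings = build_mappings(TEXT_FIELDS, "my_analyzer")
# The result would sit under "mappings" in the PUT request that creates the
# index, next to the "settings" block that defines my_analyzer.
print(json.dumps({"mappings": mappings}, indent=2))
```

Once the index has been rebuilt this way, running `_analyze` with a `field` parameter instead of `analyzer` is one way to confirm which analyzer a given field really uses.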
