post_filter applied after a top_hits aggregation does not work in Elasticsearch

Question  Votes: 0  Answers: 1

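Expected: I need all users whose last attempt was not successful.

Actual/my approach: I applied a terms aggregation on userId with a top_hits sub-aggregation of size 1, sorted by time in descending order.

I have prepared the query below. With it I can get every user together with their latest status. After that, I want to filter on that status. Can anyone help with this? I applied a post_filter after the aggregation, but the results are still not filtered. If there is another approach, please share it here.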

Input:

[
  {
    "userId": "u1",
    "status": "Failure",
    "time": 1719543600008 // This is most updated record for user - u1
  },
  {
    "userId": "u1",
    "status": "Success",
    "time": 1719543600007
  },
  {
    "userId": "u1",
    "status": "Timeout",
    "time": 1719543600006
  },
  {
    "userId": "u2",
    "status": "Timeout",
    "time": 1719543600004 // This is most updated record for user - u2
  },
  {
    "userId": "u2",
    "status": "Failure",
    "time": 1719543600003
  },
  {
    "userId": "u3",
    "status": "Success",
    "time": 1719543600002 // This is most updated record for user - u3. As its success, it needs to be discarded from output
  },
  {
    "userId": "u3",
    "status": "Failure",
    "time": 1719543600001
  }
]

Expected output:

[
  {
    "userId": "u1",
    "status": "Failure",
    "time": 1719543600008
  },
  {
    "userId": "u2",
    "status": "Timeout",
    "time": 1719543600004
  }
]

Query:

{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "data.time": {
              "gte": "1719543600000",
              "lte": "1719584179015",
              "format": "epoch_millis"
            }
          }
        },
        {
          "query_string": {
            "query": "data.type:\"user-stats\""
          }
        }
      ]
    }
  },
  "aggs": {
    "group_by_userId": {
      "terms": {
        "field": "data.userId.keyword"
      },
      "aggs": {
        "users_last_status": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "data.time": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  },
  "post_filter": { // In this query this filter is not working
    "term": {
      "data.status.keyword": "failure"
    }
  }
}

Actual output:

[
  {
    "userId": "u1",
    "status": "Failure",
    "time": 1719543600008
  },
  {
    "userId": "u2",
    "status": "Timeout",
    "time": 1719543600004
  },
  {
    "userId": "u3", // This shouldn't come in output as we are concerned about only failure records.
    "status": "Success",
    "time": 1719543600002 
  }
]

Note: Since the number of users is unbounded, we do not want to filter on the application/client side, in order to reduce load.

elasticsearch spring-data-elasticsearch elasticsearch-aggregation
1 Answer
0 votes

post_filter only affects the search hits; it does not affect the aggregations results.

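"Use the post_filter parameter of the search API. Search requests apply post filters only to search hits, not aggregations. You can use a post filter to calculate aggregations based on a broader result set, and then further narrow the results." https://www.elastic.co/guide/en/elasticsearch/reference/current/filter-search-results.html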
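
The order of operations described in the quoted passage can be sketched with a small simulation. This is a toy model in plain Python, not Elasticsearch itself: simulate_search, the in-memory DOCS list (copied from the question's input), and the lambda predicates are hypothetical names used only for illustration.

```python
# Toy model of a search request. This is NOT Elasticsearch; it only
# illustrates when the query, the aggregation and post_filter run.

# Sample documents copied from the question's input.
DOCS = [
    {"userId": "u1", "status": "Failure", "time": 1719543600008},
    {"userId": "u1", "status": "Success", "time": 1719543600007},
    {"userId": "u1", "status": "Timeout", "time": 1719543600006},
    {"userId": "u2", "status": "Timeout", "time": 1719543600004},
    {"userId": "u2", "status": "Failure", "time": 1719543600003},
    {"userId": "u3", "status": "Success", "time": 1719543600002},
    {"userId": "u3", "status": "Failure", "time": 1719543600001},
]


def simulate_search(docs, query, post_filter):
    """Hypothetical helper: apply the query, aggregate, then post-filter the hits."""
    # 1. The query decides which documents take part at all.
    matched = [d for d in docs if query(d)]

    # 2. The aggregation (terms on userId + top_hits of size 1, time desc)
    #    sees every matched document.
    latest = {}
    for d in matched:
        current = latest.get(d["userId"])
        if current is None or d["time"] > current["time"]:
            latest[d["userId"]] = d

    # 3. post_filter narrows only the hits list, after step 2 is done.
    hits = [d for d in matched if post_filter(d)]
    return {"hits": hits, "aggregations": latest}


result = simulate_search(
    DOCS,
    query=lambda d: True,
    post_filter=lambda d: d["status"] == "Failure",
)

for user_id, doc in sorted(result["aggregations"].items()):
    print(user_id, doc["status"])
print(len(result["hits"]), "hits after post_filter")
```

In this model u3 still appears in the aggregation with status Success, which mirrors the actual output reported in the question: only the hits list shrinks.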

You can use a terms query in query.bool.filter, as shown below.

{
  "query":{
    "bool":{
      "filter":[
        {"range":{"data.time":{"gte":"1719543600000","lte":"1719584179015","format":"epoch_millis"}}},
        {"query_string":{"query":"data.type:\"user-stats\""}},
        {"terms":{"data.status.keyword":["Timeout","Failure"]}}
      ]
    }
  },
  "aggs": {...}
}
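
One thing worth checking against the sample data: a filter placed in query.bool.filter removes documents before the buckets are built, so top_hits then returns each user's latest remaining record rather than the latest record overall. The sketch below is again a toy model in plain Python, not Elasticsearch; latest_per_user and NON_SUCCESS are hypothetical names introduced for illustration. It compares the two orders of filtering and grouping.

```python
# Toy comparison (not Elasticsearch): filter-then-group vs group-then-filter.

# Sample documents copied from the question's input.
DOCS = [
    {"userId": "u1", "status": "Failure", "time": 1719543600008},
    {"userId": "u1", "status": "Success", "time": 1719543600007},
    {"userId": "u1", "status": "Timeout", "time": 1719543600006},
    {"userId": "u2", "status": "Timeout", "time": 1719543600004},
    {"userId": "u2", "status": "Failure", "time": 1719543600003},
    {"userId": "u3", "status": "Success", "time": 1719543600002},
    {"userId": "u3", "status": "Failure", "time": 1719543600001},
]

NON_SUCCESS = {"Timeout", "Failure"}


def latest_per_user(docs):
    """Hypothetical helper: keep the record with the highest time per userId."""
    latest = {}
    for d in docs:
        current = latest.get(d["userId"])
        if current is None or d["time"] > current["time"]:
            latest[d["userId"]] = d
    return latest


# Filter first (what a terms filter inside the query does), then group.
filtered_first = latest_per_user(
    [d for d in DOCS if d["status"] in NON_SUCCESS]
)

# Group first, then keep only users whose latest record is not a success
# (what the question's expected output describes).
latest_first = {
    user_id: doc
    for user_id, doc in latest_per_user(DOCS).items()
    if doc["status"] in NON_SUCCESS
}

print("filter then group:", sorted(filtered_first))
print("group then filter:", sorted(latest_first))
```

On this data, filtering first still returns u3 (with its older Failure record), whereas the expected output in the question lists only u1 and u2. Whether this matters depends on the real data, so treat it as a caveat to verify rather than a definitive result.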