Merging two indices' values by a common field in Elasticsearch

Question
PUT _enrich/policy/merge
{
  "match": {
    "indices": "es.event-154.23-09-01-6-27-22",
    "match_field": "ancestor_id",
    "enrich_fields":  ["status","status_id","type","sub_type","primary_assigned","vehicle_of_interest","dms_ro_number","dms_deal_number","primary_assigned_user_name","dealership_id", "ancestor_id","event_id","deleted","service_date","update_date","secondary_assigned_user_name","bdc_assigned_user_name"]
  }
}

PUT _ingest/pipeline/enrich
{
  "processors": [
    {
      "enrich": {
        "description": "Add Events to customer",
        "policy_name": "merge",
        "field": "contact_id",
        "target_field": "events",
        "max_matches": "1"
      }
    }
  ]
}

POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "es.customer-154.24-08-09-6-27-22"
  },
  "dest": {
    "index": "es.customerevent1-154.25-08-09-6-27-22",
    "pipeline": "enrich"
  }
}

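Here I am trying to add some values to an index. If there is more than one match, which record is taken? It looks like a random record is picked for the match field. Is it possible to specify that matches be ordered by a field, ascending or descending? Please help me with this.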

elasticsearch elastic-stack elasticsearch-8
1 Answer

Great progress so far!

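One way to achieve this is to store all the matches in an array (here, up to 10 matches are stored in tmpArray) and then use a script processor to pick the match that suits you best (here, tmpArray is sorted by someField in descending order and the first element is stored in the event field), as shown below: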

PUT _ingest/pipeline/enrich
{
  "processors": [
    {
      "enrich": {
        "description": "Add Events to customer",
        "policy_name": "merge",
        "field": "contact_id",
        "target_field": "tmpArray",
        "max_matches": "10"
      }
    },
    {
      "script": {
        "if": "ctx.tmpArray != null",
        "source": """
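        // Comparator: orders matches by someField, highest first (o2 is compared to o1);
        // if either value is missing, the two matches are treated as equal.
        Integer sortByField(Map o1, Map o2) {
          return o1.someField != null && o2?.someField != null ? o2.someField.compareTo(o1.someField) : 0;
        }
        // Keep only the first match after sorting, or null if the array is empty.
        ctx.event = ctx.tmpArray.stream()
          .sorted(this::sortByField)
          .findFirst()
          .orElse(null);

        // Drop the temporary array so it does not end up in the indexed document.
        ctx.remove('tmpArray');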
        """
      }
    }
  ]
}
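
For ascending order, swapping the two operands of compareTo should be enough. The following is only a sketch for trying that out without reindexing: it runs the script processor on its own through the simulate API and feeds it a hand-written tmpArray. It assumes update_date (one of the enrich_fields in the question) is the field to sort on, and the contact_id, event_id and date values are made-up placeholders.

```
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "if": "ctx.tmpArray != null",
          "source": """
          // Ascending variant: o1 is compared to o2 (the script above compares o2 to o1).
          // update_date is an assumed sort field; replace it with your own.
          Integer sortByField(Map o1, Map o2) {
            return o1.update_date != null && o2?.update_date != null ? o1.update_date.compareTo(o2.update_date) : 0;
          }
          ctx.event = ctx.tmpArray.stream()
            .sorted(this::sortByField)
            .findFirst()
            .orElse(null);

          ctx.remove('tmpArray');
          """
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "contact_id": "c-1",
        "tmpArray": [
          { "ancestor_id": "c-1", "event_id": "e-2", "update_date": "2023-09-02" },
          { "ancestor_id": "c-1", "event_id": "e-1", "update_date": "2023-09-01" }
        ]
      }
    }
  ]
}
```

The response shows which entry ended up in event, so the ordering can be checked before running the real reindex. Whichever field you sort on has to be listed in the policy's enrich_fields, otherwise it will not be present in the matches.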