Jolt: combine two arrays, merging elements on a common field that is only sometimes present


I have the following JSON record, which contains two arrays that I want to combine into a single array:

{
  "id": "1234X",
  "rating": "B",
  "files": [
    {
      "sequenceId": 7,
      "show": "No",
      "hash": "ABC123",
      "collectTime": 1716308631000
    },
    {
      "sequenceId": 13,
      "collectTime": 1716308631000
    },
    {
      "sequenceId": 10,
      "hash": "DEF234",
      "collectTime": 1716308631000
    },
    {
      "sequenceId": 8,
      "show": "No",
      "collectTime": 1716308631000
    }
  ],
  "tags": [
    {
      "hash": "DEF234",
      "tag": "Corrupt"
    }
  ]
}

The desired result is:

{
  "id" : "1234X",
  "rating" : "B",
  "tempFiles" : [ {
    "sequenceId" : 7,
    "show" : "No",
    "hash" : "ABC123",
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 13,
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 10,
    "hash" : "DEF234",
    "tag" : "Corrupt",    
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 8,
    "show" : "No",
    "collectTime" : 1716308631000
  } ]
}

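I merge the two arrays into one and set the cardinality to ONE, but I'm confused about how to set cardinality on array elements, since the elements do not necessarily all have the same LHV.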

I tried the following shift and cardinality operations:

[
  {
    "operation": "shift",
    "spec": {
      "files|tags": {
        "*": "tempFiles[]"
      },
      "*": "&"
    }
  }, {
    "operation": "cardinality",
    "spec": {
      "tempFiles[]": {
        "*": {
          "hash": "ONE"
        }
      }
    }
  }
]

But I get:

{
  "id" : "1234X",
  "rating" : "B",
  "tempFiles" : [ {
    "sequenceId" : 7,
    "show" : "No",
    "hash" : "ABC123",
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 13,
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 10,
    "hash" : "DEF234",
    "collectTime" : 1716308631000
  }, {
    "sequenceId" : 8,
    "show" : "No",
    "collectTime" : 1716308631000
  }, {
    "hash" : "DEF234",
    "tag" : "Corrupt"
  } ]
}

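How do I get a single array record that contains the hash, the collectTime and the tag together with the sequenceId? I suspect there is a better way to do this, but I'm new to these transforms.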

transform apache-nifi jolt
1 Answer

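First, I don't think cardinality works that way. Cardinality only changes whether a value is a list: with "ONE", a list is reduced to its first element, and with "MANY", a non-list value is wrapped in a list. It does not merge array elements that share a field. See the Cardinality documentation for more information. For the input you have, I was able to do it with the following spec. I'm not sure it can be done in fewer than 4 operations, especially since hash is not a required field: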

[
  {
    // assign empty hash where missing
    "operation": "default",
    "spec": {
      "files[]": {
        "*": {
          "hash": ""
        }
      }
    }
  }
  ,
  {
    // group files and tags that belong to the same hash
    // under temp; items with no hash are just dumped into
    // the NoHash array
    "operation": "shift",
    "spec": {
      "*": "&",
      "files": {
        "*": {
          "hash": {
            //group files under temp.(hashcode)
            "*": {
              "@(2)": "temp.&1"
            },
            "": {
              "@(2)": "NoHash[]"
            }
          }
        }
      },
      // group tags under temp.(hashcode)
      "tags": {
        "*": {
          "*": "temp.@(1,hash).&",
          "hash": null
        }
      }
    }
  }
  ,
  //Transpose NoHash elements and whatever was merged under temp
  //into the tempFiles[]
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "temp": {
        "*": "tempFiles[]"
      },
      "NoHash": {
        "*": "tempFiles[]"
      }
    }
  }
  ,
  // remove the placeholder hash with an empty string value that was
  // added by the first (default) operation
  {
    "operation": "shift",
    "spec": {
      "*": "&",
      "tempFiles": {
        "*": {
          "hash": {
            "": null,
            "*": {
              "$": "tempFiles[&3].&2"
            }
          },
          "*": "tempFiles[&1].&"
        }
      }
    }
  }
  /**/
]
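
For readers who want to check the intended result outside of Jolt, the merge the question asks for can be sketched in a few lines of Python. This is only an illustration of the logic, not a translation of the spec above: the function name `merge_tags_into_files` is made up for this example, it assumes at most one tag per hash, it keeps the files in their original order, and it silently drops a tag whose hash matches no file.

```python
def merge_tags_into_files(record):
    """Return a copy of `record` with `files` and `tags` merged into `tempFiles`.

    Hypothetical helper for illustration only; not part of Jolt or NiFi.
    """
    # Index the tags by hash (assumption: at most one tag per hash).
    tags_by_hash = {t["hash"]: t for t in record.get("tags", []) if "hash" in t}

    temp_files = []
    for f in record.get("files", []):
        merged = dict(f)  # copy so the input record is not mutated
        # Files without a hash look up None, which never matches a tag.
        tag = tags_by_hash.get(f.get("hash"))
        if tag is not None:
            merged.update(tag)  # adds "tag"; "hash" is already equal
        temp_files.append(merged)

    # Pass every other top-level field (id, rating, ...) through unchanged.
    out = {k: v for k, v in record.items() if k not in ("files", "tags")}
    out["tempFiles"] = temp_files
    return out


if __name__ == "__main__":
    import json

    sample = {
        "id": "1234X",
        "rating": "B",
        "files": [
            {"sequenceId": 13, "collectTime": 1716308631000},
            {"sequenceId": 10, "hash": "DEF234", "collectTime": 1716308631000},
        ],
        "tags": [{"hash": "DEF234", "tag": "Corrupt"}],
    }
    print(json.dumps(merge_tags_into_files(sample), indent=2))
```

Comparing the output of a sketch like this with the output of the Jolt chain on the same input is a quick way to confirm that the file with hash DEF234 picks up the tag while files without a hash pass through untouched.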