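I have to map the JSON output of an API that contains multiple collections and nested arrays. I receive the JSON file without problems, then use a Copy activity to convert the JSON to a Parquet file. In the mapping settings I create the complex mapping manually and get the data, and that works. The problem is that the cross apply is not applied correctly: I only get the first element of the nested array (ChildRows).
So, in the example above, only the first RowUID value of the ChildRows array (rowuid_Child_value1) is returned, and not the second one (rowuid_Child_value2).
I have also tried the advanced editor and dynamic mapping, but neither gave the expected result. Do you have any suggestions, or has anyone seen a similar case in ADF? I cannot use DataFlow because it is not used in our project, so that is not an option. Thanks for your help.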
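You can achieve your requirement with the following steps:
In the Copy activity's Mapping tab, enable the Advanced editor and enter $['Rows'][0]['ChildRows'] as the Collection reference, as shown below:
The mapping is then set up as follows:
With the above mapping, the JSON data is flattened as follows: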
Rows[0]['RowUID'] | FormNumber | IncidentDate | IncidentSummary | RecoveryRequired | BusinessEntity | Classification | Rows[0]['EntityName'] | Rows[0]['ParentRowUID'] | Item | LeftRowUID | Rows[0]['ChildRows'][0]['RowUID'] | Ordinal | Type | Rows[0]['ChildRows'][0]['ParentRowUID'] | IsReportableEnvDamage | Rows[0]['ChildRows'][0]['EntityName'] | Consequence | Code | Reportable |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
rowuid_value | xxx | date | data | 1 | data | data | xxx | rowuid_value | xxx | rowuid_value | rowuid_Child_value1 | 0 | xxx | xxx | 0 | xxx | xxx | xxx | xxx |
rowuid_value | xxx | date | data | 1 | data | data | xxx | rowuid_value | xxx | rowuid_value | rowuid_Child_value2 | 0 | xxx | rowuid_value | 1 | xxx | xxx | xxx | xxx |
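
To make the effect of the collection reference easier to see, here is a minimal Python sketch of the same flattening idea. It is only an illustration, not how ADF implements it: the function `cross_apply`, the sample document, and the output column prefixes are all hypothetical, and only a few of the fields from the table are included.

```python
import json

# Hypothetical sample shaped like the API response discussed here;
# the values are placeholders, not real data.
sample = {
    "Rows": [
        {
            "RowUID": "rowuid_value",
            "FormNumber": "xxx",
            "ChildRows": [
                {"RowUID": "rowuid_Child_value1", "Ordinal": 0},
                {"RowUID": "rowuid_Child_value2", "Ordinal": 0},
            ],
        }
    ]
}


def cross_apply(doc, parent_fields, child_fields):
    """Emit one flat row per element of Rows[0].ChildRows.

    Parent values are repeated on every output row, which mirrors
    what pointing the collection reference at the ChildRows array
    is meant to achieve.
    """
    parent = doc["Rows"][0]
    rows = []
    for child in parent.get("ChildRows", []):
        # Prefixes are an assumption, used only to keep names unique.
        row = {f"Parent_{name}": parent.get(name) for name in parent_fields}
        row.update({f"Child_{name}": child.get(name) for name in child_fields})
        rows.append(row)
    return rows


flat = cross_apply(sample, ["RowUID", "FormNumber"], ["RowUID", "Ordinal"])
for row in flat:
    print(json.dumps(row))
```

Each element of ChildRows produces its own output row, so both rowuid_Child_value1 and rowuid_Child_value2 appear, each alongside the repeated parent values.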
Here is the pipeline JSON for your reference:
{
    "name": "pipeline2",
    "properties": {
        "activities": [
            {
                "name": "Copy data1",
                "type": "Copy",
                "dependsOn": [],
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false,
                    "secureInput": false
                },
                "userProperties": [],
                "typeProperties": {
                    "source": {
                        "type": "JsonSource",
                        "storeSettings": {
                            "type": "AzureBlobFSReadSettings",
                            "recursive": true,
                            "enablePartitionDiscovery": false
                        },
                        "formatSettings": {
                            "type": "JsonReadSettings"
                        }
                    },
                    "sink": {
                        "type": "DelimitedTextSink",
                        "storeSettings": {
                            "type": "AzureBlobFSWriteSettings"
                        },
                        "formatSettings": {
                            "type": "DelimitedTextWriteSettings",
                            "quoteAllText": true,
                            "fileExtension": ".txt"
                        }
                    },
                    "enableStaging": false,
                    "translator": {
                        "type": "TabularTranslator",
                        "mappings": [
                            {
                                "source": {
                                    "path": "$['Rows'][0]['RowUID']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['RowUID"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['FormNumber']"
                                },
                                "sink": {
                                    "name": "FormNumber"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['IncidentDate']"
                                },
                                "sink": {
                                    "name": "IncidentDate"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['IncidentSummary']"
                                },
                                "sink": {
                                    "name": "IncidentSummary"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['RecoveryRequired']"
                                },
                                "sink": {
                                    "name": "RecoveryRequired"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['BusinessEntity']"
                                },
                                "sink": {
                                    "name": "BusinessEntity"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['Classification']"
                                },
                                "sink": {
                                    "name": "Classification"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['EntityName']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['EntityName"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['ParentRowUID']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['ParentRowUID"
                                }
                            },
                            {
                                "source": {
                                    "path": "$['Rows'][0]['Item']"
                                },
                                "sink": {
                                    "name": "Item"
                                }
                            },
                            {
                                "source": {
                                    "path": "['LeftRowUID']"
                                },
                                "sink": {
                                    "name": "LeftRowUID"
                                }
                            },
                            {
                                "source": {
                                    "path": "['RowUID']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['ChildRows'][0]['RowUID"
                                }
                            },
                            {
                                "source": {
                                    "path": "['Ordinal']"
                                },
                                "sink": {
                                    "name": "Ordinal"
                                }
                            },
                            {
                                "source": {
                                    "path": "['Type']"
                                },
                                "sink": {
                                    "name": "Type"
                                }
                            },
                            {
                                "source": {
                                    "path": "['ParentRowUID']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['ChildRows'][0]['ParentRowUID"
                                }
                            },
                            {
                                "source": {
                                    "path": "['IsReportableEnvDamage']"
                                },
                                "sink": {
                                    "name": "IsReportableEnvDamage"
                                }
                            },
                            {
                                "source": {
                                    "path": "['EntityName']"
                                },
                                "sink": {
                                    "name": "Rows'][0]['ChildRows'][0]['EntityName"
                                }
                            },
                            {
                                "source": {
                                    "path": "['Consequence']"
                                },
                                "sink": {
                                    "name": "Consequence"
                                }
                            },
                            {
                                "source": {
                                    "path": "['Code']"
                                },
                                "sink": {
                                    "name": "Code"
                                }
                            },
                            {
                                "source": {
                                    "path": "['Reportable']"
                                },
                                "sink": {
                                    "name": "Reportable"
                                }
                            }
                        ],
                        "collectionReference": "$['Rows'][0]['ChildRows']",
                        "mapComplexValuesToString": false
                    }
                },
                "inputs": [
                    {
                        "referenceName": "Json2",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "DelimitedText1",
                        "type": "DatasetReference"
                    }
                ]
            }
        ],
        "annotations": []
    }
}
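
The key pattern in the translator above is that fields outside the array use absolute paths starting with `$`, while fields inside the array use paths relative to the collection reference. As a rough sketch of that pattern, the Python snippet below generates a translator block of the same shape. The helper `build_translator` and the `Parent_`/`Child_` sink-name prefixes are hypothetical choices made here to avoid duplicate column names; they are not part of ADF or of the pipeline above.

```python
import json


def build_translator(parent_path, child_array, parent_fields, child_fields):
    """Build a TabularTranslator-shaped dict (hypothetical helper).

    parent_path  : JSON path of the parent object, e.g. "$['Rows'][0]"
    child_array  : name of the nested array to iterate over
    """
    # Parent fields: absolute paths, repeated for every child element.
    mappings = [
        {
            "source": {"path": f"{parent_path}['{name}']"},
            "sink": {"name": f"Parent_{name}"},  # assumed naming scheme
        }
        for name in parent_fields
    ]
    # Child fields: paths relative to the collection reference.
    mappings += [
        {
            "source": {"path": f"['{name}']"},
            "sink": {"name": f"Child_{name}"},  # assumed naming scheme
        }
        for name in child_fields
    ]
    return {
        "type": "TabularTranslator",
        "mappings": mappings,
        "collectionReference": f"{parent_path}['{child_array}']",
        "mapComplexValuesToString": False,
    }


translator = build_translator(
    "$['Rows'][0]",
    "ChildRows",
    ["RowUID", "FormNumber"],
    ["RowUID", "Ordinal"],
)
print(json.dumps(translator, indent=4))
```

The printed JSON could be pasted into the `translator` property of a Copy activity after checking the field names against your own source file.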