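I want to convert JSON that contains an array into CSV format. The number of elements in the array varies from record to record. I tried the following flow (the flow file XML is attached to the post):
GetFile -> ConvertRecord -> UpdateAttribute -> PutFile
Is there any other alternative?
JSON format: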
{
  "LogData": {
    "Location": "APAC",
    "product": "w1"
  },
  "Outcome": [
    {
      "limit": "0",
      "pri": "3",
      "result": "pass"
    },
    {
      "limit": "1",
      "pri": "2",
      "result": "pass"
    },
    {
      "limit": "5",
      "priority": "1",
      "result": "fail"
    }
  ],
  "attr": {
    "vers": "1",
    "datetime": "2018-01-10 00:36:00"
  }
}
Expected output in CSV:
location,product,limit,pri,result,vers,datetime
APAC,w1,0,3,pass,1,2018-01-10 00:36:00
APAC,w1,1,2,pass,1,2018-01-10 00:36:00
APAC,w1,5,1,fail,1,2018-01-10 00:36:00
Output of the attached flow:
LogData,Outcome,attr
"MapRecord[{product=w1, Location=APAC}]","[MapRecord[{limit=0, result=pass, pri=3}], MapRecord[{limit=1, result=pass, pri=2}], MapRecord[{limit=5, result=fail}]]","MapRecord[{datetime=2018-01-10 00:36:00, vers=1}]"
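ConvertRecord: I am using JsonTreeReader and CSVRecordSetWriter, configured as follows:
JsonTreeReader controller service configuration:
CSVRecordSetWriter controller service configuration:
AvroSchemaRegistry controller service configuration: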
Avro schema:
{
  "name": "myschema",
  "type": "record",
  "namespace": "myschema",
  "fields": [
    {
      "name": "LogData",
      "type": {
        "name": "LogData",
        "type": "record",
        "fields": [
          {"name": "Location", "type": "string"},
          {"name": "product", "type": "string"}
        ]
      }
    },
    {
      "name": "Outcome",
      "type": {
        "type": "array",
        "items": {
          "name": "Outcome_record",
          "type": "record",
          "fields": [
            {"name": "limit", "type": "string"},
            {"name": "pri", "type": ["string", "null"]},
            {"name": "result", "type": "string"}
          ]
        }
      }
    },
    {
      "name": "attr",
      "type": {
        "name": "attr",
        "type": "record",
        "fields": [
          {"name": "vers", "type": "string"},
          {"name": "datetime", "type": "string"}
        ]
      }
    }
  ]
}
Try this spec in a JoltTransformJSON processor placed before ConvertRecord:
```
[
  {
    "operation": "shift",
    "spec": {
      "Outcome": {
        "*": {
          "@(3,LogData.Location)": "[#2].location",
          "@(3,LogData.product)": "[#2].product",
          "@(3,attr.vers)": "[#2].vers",
          "@(3,attr.datetime)": "[#2].datetime",
          "*": "[#2].&"
        }
      }
    }
  }
]
```
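It seems you need to flatten the JSON with a Jolt transform before converting it to CSV. ConvertRecord on its own writes the nested records and the array as single string fields, which is what the flow output above shows.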
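To make the intended flattening concrete outside NiFi, here is a minimal Python sketch of the same idea: one output row per element of `Outcome`, with the `LogData` and `attr` fields repeated on every row. It is not part of the NiFi flow and not a replacement for the Jolt spec. The names `flatten`, `to_csv`, `COLUMNS` and `SAMPLE` are made up for illustration, and the sample uses `pri` in all three array elements, which is an assumption. The third element in the question uses `priority` instead, and with this sketch that would leave the `pri` cell empty for that row.

```python
import csv
import io
import json

# Column order taken from the expected CSV header in the question.
COLUMNS = ["location", "product", "limit", "pri", "result", "vers", "datetime"]


def flatten(doc):
    """Return one flat dict per element of doc["Outcome"]."""
    shared = {
        "location": doc["LogData"]["Location"],
        "product": doc["LogData"]["product"],
        "vers": doc["attr"]["vers"],
        "datetime": doc["attr"]["datetime"],
    }
    # Each array element contributes its own keys plus the shared parent fields.
    return [{**shared, **item} for item in doc["Outcome"]]


def to_csv(doc):
    """Render the flattened rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=COLUMNS,
        restval="",             # keys missing from a row become empty cells
        extrasaction="ignore",  # keys outside COLUMNS (e.g. "priority") are dropped
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(flatten(doc))
    return buf.getvalue()


# Sample input modeled on the question; "pri" in the third element is an assumption.
SAMPLE = json.loads("""
{
  "LogData": {"Location": "APAC", "product": "w1"},
  "Outcome": [
    {"limit": "0", "pri": "3", "result": "pass"},
    {"limit": "1", "pri": "2", "result": "pass"},
    {"limit": "5", "pri": "1", "result": "fail"}
  ],
  "attr": {"vers": "1", "datetime": "2018-01-10 00:36:00"}
}
""")

if __name__ == "__main__":
    print(to_csv(SAMPLE), end="")
```

Because the rows are built per array element, the number of elements in `Outcome` can differ from file to file without any change to the code, which mirrors what the `"*"` wildcard does in the shift spec.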