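I have two pairs of xlsx and pdf files on SFTP that need to be uploaded to Azure Synapse using ADF. I need to upload the most recent pair dropped into the SFTP folder first, then rerun the pipeline to pick up the next most recent pair.
Employee_81.xlsx 01/02/2022 8:30:00 AM
Employee_83.xlsx 01/02/2022 8:40:00 AM
Here, Employee_83 would be uploaded first.
This is what I tried:
ADF activities:
Get Metadata1:
Argument = Last modified
Filter condition: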
@activity('Get Metadata1').output.lastModified
@and(equals(item().type,'File'), and(endswith(item().name,'.xlsx'), startsWith(item().name,'Employee_')))
The run succeeds, but no file is uploaded to EDL Synapse. What went wrong?
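To achieve this, you can follow the steps below. Create two variables: date (initialized to a minimal date value, earlier than any file's timestamp) and fileName.
Use a Get Metadata activity to list the files in the SFTP folder, with Child items in the Field list.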
This returns the files in the folder. Add a ForEach activity with Sequential enabled and the following expression as Items:
@activity('Get Metadata1').output.childItems
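To make the shape of that list concrete, here is a minimal Python sketch of what the ForEach iterates over. It assumes each childItems entry carries only a name and a type; the sample entries and the helper name is_employee_workbook are made up for illustration. The predicate mirrors the filter condition from the question.

```python
# Hypothetical sample of a Get Metadata childItems output (name and type only).
child_items = [
    {"name": "Employee_81.xlsx", "type": "File"},
    {"name": "Employee_81.pdf", "type": "File"},
    {"name": "Employee_83.xlsx", "type": "File"},
    {"name": "archive", "type": "Folder"},
]

def is_employee_workbook(item):
    """Same test as the question's filter: a file named Employee_*.xlsx."""
    return (
        item["type"] == "File"
        and item["name"].endswith(".xlsx")
        and item["name"].startswith("Employee_")
    )

workbooks = [item["name"] for item in child_items if is_employee_workbook(item)]
print(workbooks)
```

Since the entries carry no timestamp, the last modified date has to be fetched per file, which is what the next step does.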
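Create a dataset that takes a fileName string parameter as its file name. Inside the ForEach activity, add a Get Metadata activity that uses this dataset with Last modified in the Field list, and pass the following as the value of the fileName parameter:
@item().name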
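This returns the last modified date of each file in the folder. After that Get Metadata activity, add an If Condition activity with the condition:
@greater(activity('Get Metadata2').output.lastModified, variables('date'))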
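Because date is a String variable, this condition compares two strings. A small Python sketch of why that still orders timestamps correctly, assuming lastModified comes back as a UTC ISO 8601 string in the same format as the variable's default; the candidate value and the helper parse_utc are hypothetical.

```python
from datetime import datetime, timezone

def parse_utc(stamp):
    """Parse a timestamp of the form 2023-10-15T00:00:00Z (assumed format)."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

current_max = "2023-10-15T00:00:00Z"  # current value of the date variable
candidate = "2023-10-15T08:40:00Z"    # hypothetical lastModified of one file

# Plain string comparison agrees with comparing the parsed datetimes,
# because the fields run from most to least significant and are zero-padded.
string_says_newer = candidate > current_max
datetime_says_newer = parse_utc(candidate) > parse_utc(current_max)
print(string_says_newer, datetime_says_newer)
```

The starting value of date must be older than every file in the folder. For files last modified in 2022, as in the question, a starting value of 2023-10-15T00:00:00Z would never be exceeded and fileName would stay empty.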
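Inside the If Condition, add two Set Variable activities on the True branch to update the latest modified date and the name of the corresponding file:
latestModifiedDate (Set Variable 1):
date: @activity('Get Metadata2').output.lastModified
FileName (Set Variable 2):
fileName: @item().name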
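After the ForEach activity, add a Set Variable activity to capture the name of the last modified file:
latestfileName: @variables('fileName')
This gives the name of the most recently modified file.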
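Taken together, the steps amount to a running-maximum scan over the folder. The following Python sketch mirrors that control flow under stated assumptions: the (name, lastModified) pairs are invented, and the floor default is a hypothetical stand-in for the initial value of the date variable.

```python
# Hypothetical (name, lastModified) pairs, as the inner Get Metadata would supply them.
files = [
    ("Employee_81.xlsx", "2022-01-02T08:30:00Z"),
    ("Employee_83.xlsx", "2022-01-02T08:40:00Z"),
]

def latest_file(files, floor="1900-01-01T00:00:00Z"):
    """Return the name of the most recently modified file, or "" if none beats floor."""
    date = floor        # plays the role of the date variable
    file_name = ""      # plays the role of the fileName variable
    for name, last_modified in files:   # ForEach, one item at a time
        if last_modified > date:        # If Condition
            date = last_modified        # Set Variable: date
            file_name = name            # Set Variable: fileName
    return file_name

latest_file_name = latest_file(files)   # Set Variable after the loop
print(latest_file_name)
```

The loop has to run sequentially because every iteration reads and writes the same two variables; parallel iterations could overwrite each other's updates.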
Here is the pipeline JSON for reference:
{
  "name": "pipeline4",
  "properties": {
    "activities": [
      {
        "name": "Get Metadata1",
        "type": "GetMetadata",
        "dependsOn": [],
        "policy": {
          "timeout": "0.12:00:00",
          "retry": 0,
          "retryIntervalInSeconds": 30,
          "secureOutput": false,
          "secureInput": false
        },
        "userProperties": [],
        "typeProperties": {
          "dataset": {
            "referenceName": "DelimitedText2",
            "type": "DatasetReference"
          },
          "fieldList": [
            "childItems"
          ],
          "storeSettings": {
            "type": "AzureBlobFSReadSettings",
            "enablePartitionDiscovery": false
          },
          "formatSettings": {
            "type": "DelimitedTextReadSettings"
          }
        }
      },
      {
        "name": "ForEach1",
        "type": "ForEach",
        "dependsOn": [
          {
            "activity": "Get Metadata1",
            "dependencyConditions": [
              "Succeeded"
            ]
          }
        ],
        "userProperties": [],
        "typeProperties": {
          "items": {
            "value": "@activity('Get Metadata1').output.childItems",
            "type": "Expression"
          },
          "isSequential": true,
          "activities": [
            {
              "name": "Get Metadata2",
              "type": "GetMetadata",
              "dependsOn": [],
              "policy": {
                "timeout": "0.12:00:00",
                "retry": 0,
                "retryIntervalInSeconds": 30,
                "secureOutput": false,
                "secureInput": false
              },
              "userProperties": [],
              "typeProperties": {
                "dataset": {
                  "referenceName": "DelimitedText1",
                  "type": "DatasetReference",
                  "parameters": {
                    "fileName": {
                      "value": "@item().name",
                      "type": "Expression"
                    }
                  }
                },
                "fieldList": [
                  "lastModified"
                ],
                "storeSettings": {
                  "type": "AzureBlobFSReadSettings",
                  "enablePartitionDiscovery": false
                },
                "formatSettings": {
                  "type": "DelimitedTextReadSettings"
                }
              }
            },
            {
              "name": "If Condition1",
              "type": "IfCondition",
              "dependsOn": [
                {
                  "activity": "Get Metadata2",
                  "dependencyConditions": [
                    "Succeeded"
                  ]
                }
              ],
              "userProperties": [],
              "typeProperties": {
                "expression": {
                  "value": "@greater(activity('Get Metadata2').output.lastModified, variables('date'))",
                  "type": "Expression"
                },
                "ifTrueActivities": [
                  {
                    "name": "latestModifiedDate",
                    "type": "SetVariable",
                    "dependsOn": [],
                    "policy": {
                      "secureOutput": false,
                      "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                      "variableName": "date",
                      "value": {
                        "value": "@activity('Get Metadata2').output.lastModified",
                        "type": "Expression"
                      }
                    }
                  },
                  {
                    "name": "FileName",
                    "type": "SetVariable",
                    "dependsOn": [
                      {
                        "activity": "latestModifiedDate",
                        "dependencyConditions": [
                          "Succeeded"
                        ]
                      }
                    ],
                    "policy": {
                      "secureOutput": false,
                      "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                      "variableName": "fileName",
                      "value": {
                        "value": "@item().name",
                        "type": "Expression"
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      {
        "name": "latestFileName",
        "type": "SetVariable",
        "dependsOn": [
          {
            "activity": "ForEach1",
            "dependencyConditions": [
              "Succeeded"
            ]
          }
        ],
        "policy": {
          "secureOutput": false,
          "secureInput": false
        },
        "userProperties": [],
        "typeProperties": {
          "variableName": "latestfileName",
          "value": {
            "value": "@variables('fileName')",
            "type": "Expression"
          }
        }
      }
    ],
    "variables": {
      "date": {
        "type": "String",
        "defaultValue": "2023-10-15T00:00:00Z"
      },
      "fileName": {
        "type": "String"
      },
      "latestfileName": {
        "type": "String"
      }
    },
    "annotations": []
  }
}