Duplicate values after a Copy activity in a Fabric pipeline

Question (votes: 0, answers: 1)

I am building a Microsoft Fabric pipeline and want to copy some data from an API URL into a Fabric Lakehouse. The problem is that it writes the same data to the Lakehouse table 864 times, and I don't understand why.

I also send an Authentication header and an OSvC-CREST-Application-Context header, both of which are required to retrieve data from this API.

I am doing this with a Copy data activity. For now I only want to fetch the details of a single element, using an API URL similar to this one: https://mysite.example.com/services/rest/connect/v1.4/items/1
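For reference, the request described above can be sketched outside the pipeline. This is only an illustration: the header names are the ones given in the question, the item path is taken from the "self" link in the response, the header values are placeholders, and `build_item_request` is a hypothetical helper. The request is built but not sent.

```python
from urllib.request import Request

# Base path taken from the "self" link in the API response.
BASE_URL = "https://mysite.example.com/services/rest/connect/v1.4"


def build_item_request(item_id: int, auth_value: str, app_context: str) -> Request:
    """Build (but do not send) a GET request for a single item."""
    headers = {
        # Header names as given in the question; the values are placeholders.
        "Authentication": auth_value,
        "OSvC-CREST-Application-Context": app_context,
    }
    return Request(f"{BASE_URL}/items/{item_id}", headers=headers, method="GET")


req = build_item_request(1, "<credentials>", "<application context>")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Checking the response this way helps confirm that the endpoint itself returns a single JSON object.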

The result of this API call looks like this:

{
    "id": 1,
    "lookupName": "ITEM1",
    "createdTime": "2016-01-19T11:08:02.000Z",
    "updatedTime": "2024-11-06T08:57:50.000Z",
    "locations": {
        "links": [
            {
                "rel": "self",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/locations"
            },
            {
                "rel": "full",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/locations/{location_id}",
                "templated": true
            }
        ]
    },
    "banner": {
        "importanceFlag": {
            "id": 3,
            "lookupName": "High"
        },
        "text": " ",
        "updatedByperson": {
            "links": [
                {
                    "rel": "self",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/persons/2"
                },
                {
                    "rel": "canonical",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/persons/2"
                },
                {
                    "rel": "describedby",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/metadata-catalog/persons",
                    "mediaType": "application/schema+json"
                }
            ]
        },
        "updatedTime": "2016-01-22T15:00:26.000Z"
    },
    "customFields": {
        "c": {
            "maintenance": true,
            "blacklist": false,
            "org_remote": null,
            "tip_cont": {
                "id": 68,
                "lookupName": "abc"
            },
            "amef_cnt": null,
            "term_date": null,
            "term_reason": null,
            "phisical_country": "Romania"
        }
    },
    "externalReference": null,
    "attachments": {
        "links": [
            {
                "rel": "self",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/attachments"
            },
            {
                "rel": "full",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/attachments/{attachment_id}",
                "templated": true
            }
        ]
    },
    "industry": null,
    "login": "log.in",
    "name": "item1",
    "nameFurigana": null,
    "notes": {
        "links": [
            {
                "rel": "self",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/notes"
            },
            {
                "rel": "full",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/notes/{note_id}",
                "templated": true
            }
        ]
    },
    "itemHierarchy": {
        "links": [
            {
                "rel": "self",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/itemHierarchy"
            },
            {
                "rel": "full",
                "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/itemHierarchy/{itemHierarchy_id}",
                "templated": true
            }
        ]
    },
    "parent": null,
    "actionSettings": {
        "acquiredDate": null,
        "actionperson": {
            "links": [
                {
                    "rel": "self",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/persons/2"
                },
                {
                    "rel": "canonical",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/persons/2"
                },
                {
                    "rel": "describedby",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/metadata-catalog/persons",
                    "mediaType": "application/schema+json"
                }
            ]
        },
        "total": {
            "currency": {
                "id": 2,
                "lookupName": "RON"
            },
            "exchangeRate": null
        }
    },
    "serviceSettings": {
        "sLAInstances": {
            "links": [
                {
                    "rel": "self",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/serviceSettings/sLAInstances"
                },
                {
                    "rel": "full",
                    "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1/serviceSettings/sLAInstances/{sLAInstance_id}",
                    "templated": true
                }
            ]
        }
    },
    "source": {
        "id": 10,
        "lookupName": "Contact",
        "parents": [
            {
                "id": 3,
                "lookupName": "consola"
            }
        ]
    },
    "supersededBy": null,
    "links": [
        {
            "rel": "self",
            "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1"
        },
        {
            "rel": "canonical",
            "href": "https://mysite.example.com/services/rest/connect/v1.4/items/1"
        },
        {
            "rel": "describedby",
            "href": "https://mysite.example.com/services/rest/connect/v1.4/metadata-catalog/items",
            "mediaType": "application/schema+json"
        }
    ]
}

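I don't think a pagination rule is needed. I initially set the pagination rule RFC5988 = True, then changed it to MaxRequestNumber = 1 just to see whether the result would differ. It did not; the result is the same.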

The output after the pipeline run looks like this:

{
    "dataRead": 16168,
    "dataWritten": 5933,
    "filesWritten": 1,
    "sourcePeakConnections": 1,
    "sinkPeakConnections": 1,
    "rowsRead": 1,
    "rowsCopied": 864,
    "copyDuration": 17,
    "throughput": 1.47,
    "errors": [],
    "usedDataIntegrationUnits": 4,
    "usedParallelCopies": 1,
    "executionDetails": [
        {
            "source": {
                "type": "RestService"
            },
            "sink": {
                "type": "Lakehouse"
            },
            "status": "Succeeded",
            "start": "12/13/2024, 4:35:22 PM",
            "duration": 17,
            "usedDataIntegrationUnits": 4,
            "usedParallelCopies": 1,
            "profile": {
                "queue": {
                    "status": "Completed",
                    "duration": 6
                },
                "transfer": {
                    "status": "Completed",
                    "duration": 11,
                    "details": {
                        "readingFromSource": {
                            "type": "RestService",
                            "workingDuration": 0,
                            "timeToFirstByte": 0
                        },
                        "writingToSink": {
                            "type": "Lakehouse",
                            "workingDuration": 0
                        }
                    }
                }
            },
            "detailedDurations": {
                "queuingDuration": 6,
                "timeToFirstByte": 0,
                "transferDuration": 11
            }
        }
    ],
    "dataConsistencyVerification": {
        "VerificationResult": "Unsupported"
    }
}
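The mismatch is visible in the counters of this output: rowsRead is 1 while rowsCopied is 864. A minimal sketch for checking that ratio on any Copy activity output, assuming the output has been saved as JSON (`row_amplification` is a hypothetical helper, not part of Fabric):

```python
import json


def row_amplification(copy_output: dict) -> float:
    """Return rowsCopied / rowsRead for a Copy activity output dict."""
    rows_read = copy_output["rowsRead"]
    rows_copied = copy_output["rowsCopied"]
    if rows_read == 0:
        raise ValueError("rowsRead is 0; the ratio is undefined")
    return rows_copied / rows_read


# Trimmed copy of the run output shown above.
sample_output = json.loads('{"rowsRead": 1, "rowsCopied": 864, "filesWritten": 1}')
ratio = row_amplification(sample_output)
print(f"rows written per source row: {ratio:.0f}")
```

A ratio above 1 means the extra rows appear between reading the source and writing to the sink, not from additional API requests.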

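Why am I getting duplicate values, and how can I fix this? Or is there another way to get the data from the API, such as fetching it with a Web activity and then writing it to the Lakehouse with another activity?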

azure-data-factory microsoft-fabric
1 answer (0 votes)

You can follow the steps below to meet your requirement. First, create a pipeline and run a Web activity to retrieve the API details.

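Once the Web activity has run successfully and returned the API details, create a REST API linked service for the source and a Lakehouse linked service for the sink. Add a ForEach activity that runs on success of the Web activity, and use the following range expression as the ForEach activity's items: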

@range(1,activity('restAPI').output.total_pages)
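In the pipeline expression language, range takes a start index and a count, not a start and an end. The sketch below mimics that behaviour in Python to show which values the ForEach would iterate over; `pipeline_range` is a hypothetical stand-in, and the value 2 for total_pages is an assumption matching the two-page sample API mentioned in this answer.

```python
def pipeline_range(start_index: int, count: int) -> list[int]:
    """Mimic the pipeline expression range(startIndex, count)."""
    if count < 0:
        raise ValueError("count must be non-negative")
    # Python's range takes an exclusive end, so convert the count to one.
    return list(range(start_index, start_index + count))


# Assumed value of activity('restAPI').output.total_pages.
total_pages = 2
pages = pipeline_range(1, total_pages)
print(pages)
```

Because the start index is 1 and the count is total_pages, the items are exactly the page numbers 1 through total_pages. Note that this relies on the Web activity's response exposing a total_pages field, which the single-item response shown in the question does not contain.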

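Inside the ForEach activity, add a Copy activity. Using the REST linked service you created, create a REST API dataset with a dataset parameter named rurl, referenced in the dataset as:

@dataset().rurl

Add this dataset as the source of the Copy activity, set the request method to GET, and pass the following value for rurl:

?page=@{item()}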

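Using the Lakehouse linked service you created, create a Lakehouse dataset for the table with a dataset parameter named tablename. Add it as the sink, enable automatic table creation, and pass the following value for tablename:

Apipage@{item()}

Map the data as required. Then debug the pipeline; it should run successfully and copy the data.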

The sample API has two pages, which is why two tables containing the full data were created.
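To make the per-iteration behaviour concrete, here is a rough sketch of the parameter values each ForEach iteration passes to the Copy activity. It only simulates the string interpolation of ?page=@{item()} and Apipage@{item()}; `CopyIteration` and `plan_iterations` are hypothetical names, not Fabric APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyIteration:
    rurl: str       # value passed to the source dataset parameter
    tablename: str  # value passed to the sink dataset parameter


def plan_iterations(total_pages: int) -> list[CopyIteration]:
    """List the parameter values for each ForEach item (pages 1..total_pages)."""
    return [
        CopyIteration(rurl=f"?page={page}", tablename=f"Apipage{page}")
        for page in range(1, total_pages + 1)
    ]


for iteration in plan_iterations(2):
    print(iteration.rurl, "->", iteration.tablename)
```

With two pages this yields one Copy run per page, each writing to its own table, which matches the two tables described above.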
