Get Metadata activity -> Use a Binary dataset with no container or file path and select Child items in the activity, so that it lists the containers.
ForEach activity -> Pass the child items list from the Get Metadata activity to the ForEach activity and enable the Sequential checkbox.
- Dataflow 1 -> Create a JSON dataset with a dataset parameter for the container name and use it as the source of Dataflow 1. In the dataflow activity, pass the container name from the ForEach loop to that dataset parameter. In the dataflow source settings, give the wildcard file path `A/*.json`, then add your transformations after the source. In the sink, use your parquet dataset; use dataset parameters for the parquet file name if needed.
- Dataflow 2 -> Do the same for folder B: give `B/*.json` as the wildcard file path and add the folder B transformations.
- Dataflow 3 -> Do the same for folder C: give `C/*.json` as the wildcard file path and add the folder C transformations.
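The full-load design above can be pictured with a small Python sketch. It is only an illustration, not ADF code: the `storage` layout, the file names, and the `plan_full_load` helper are hypothetical, and `fnmatchcase` only approximates how the dataflow source resolves wildcard paths.

```python
from fnmatch import fnmatchcase

# Hypothetical storage layout: container name -> blob paths inside it.
storage = {
    "container1": ["A/a1.json", "A/a2.json", "B/b1.json", "C/c1.json", "C/readme.txt"],
    "container2": ["A/a3.json", "B/b2.json", "B/b3.json"],
}

# One wildcard path per dataflow, as in the design above.
WILDCARDS = {
    "Dataflow 1": "A/*.json",
    "Dataflow 2": "B/*.json",
    "Dataflow 3": "C/*.json",
}


def plan_full_load(storage):
    """Return (container, dataflow, matched files) tuples in sequential order."""
    plan = []
    # The outer loop stands in for Get Metadata child items fed to a sequential ForEach.
    for container, blobs in storage.items():
        # The inner loop stands in for the three dataflows run per container.
        for dataflow, pattern in WILDCARDS.items():
            matched = [path for path in blobs if fnmatchcase(path, pattern)]
            plan.append((container, dataflow, matched))
    return plan


for container, dataflow, files in plan_full_load(storage):
    print(container, dataflow, files)
```

Each container is visited once, and each dataflow only sees the JSON files under its own folder, which is why the container name is the only value that has to be passed from the loop.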
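Learn how to pass trigger parameters to pipeline parameters. For the incremental load, create another set of 3 dataflows. Take a parquet dataset with a dataset parameter for the container and give it one parquet file name for your folder. Use it as both source1 and sink1. Create another JSON dataset with dataset parameters for the container and the file name, and use it as source2 in the dataflow. Use a union transformation.
Then follow the pipeline design below.
Set variable -> Extract the folder name from the pipeline parameter and store it in a string variable `folder` using this expression: `split(pipeline().parameters.folder_path,'/')[1]`.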
Switch activity -> Use the `folder` variable (`@variables('folder')`) as the Switch expression.
- Case `A` -> Give the case name as `A`.
- Dataflow A -> pass the folder name and file name parameters to the dataflow dataset parameters.
- Case `B` -> Give the case name as `B`.
- Dataflow B -> pass the folder name and file name parameters to the dataflow dataset parameters.
- Case `C` -> Give the case name as `C`.
- Dataflow C -> pass the folder name and file name parameters to the dataflow dataset parameters.
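The Set variable and Switch steps can be sketched in Python as follows. This is a minimal illustration, not ADF code: it assumes the `folder_path` pipeline parameter has the form `<container>/<folder>`, and the function names, the sample values, and the `None` result for an unmatched folder are hypothetical.

```python
def extract_folder(folder_path: str) -> str:
    """Mirror split(folder_path, '/')[1]: the segment after the container."""
    return folder_path.split("/")[1]


def route(folder_path: str, file_name: str) -> dict:
    """Pick the dataflow for a triggered file, as the Switch activity would."""
    folder = extract_folder(folder_path)
    cases = {"A": "Dataflow A", "B": "Dataflow B", "C": "Dataflow C"}
    # None models the Switch default case, where no dataflow runs (assumption).
    dataflow = cases.get(folder)
    # The folder and file names are what gets passed to the dataset parameters.
    return {"dataflow": dataflow, "folder": folder, "file_name": file_name}


print(route("mycontainer/B", "new_file.json"))
# prints {'dataflow': 'Dataflow B', 'folder': 'B', 'file_name': 'new_file.json'}
```

If the real `folder_path` has more or fewer segments than assumed here, the index in the split expression needs to change accordingly.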