How to Copy Multiple Files from Amazon S3 to Azure Blob Storage by using Azure Data Factory

 Issue: How to Copy Multiple Files from Amazon S3 to Azure Blob Storage by using Azure Data Factory.


In this article, we are going to learn how to copy multiple files from Amazon S3 to Azure Blob Storage by using Azure Data Factory. Let's start our demonstration.

Open your Azure Data Factory, click on the Manage tab, click on Linked services, and then click on the + New button to create a new linked service for Amazon S3.



Select Amazon S3 and then click on continue.


Name your linked service, select the integration runtime, select the authentication type, provide the access key and the secret key, test the connection to the linked service, and then click on Create.
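Behind the scenes, ADF saves the linked service as JSON. As a rough sketch only (the name and placeholder values below are illustrative, not taken from the demo), an Amazon S3 linked service looks something like this:

{
    "name": "AmazonS3LinkedService",
    "properties": {
        "type": "AmazonS3",
        "typeProperties": {
            "accessKeyId": "<your access key>",
            "secretAccessKey": {
                "type": "SecureString",
                "value": "<your secret key>"
            }
        }
    }
}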


Once our linked service is created, go to the Author tab, click on the + button, and then click on Pipeline to create a new pipeline.


In the pipeline, find and drag the Get Metadata activity, which will retrieve the list of files. Then, under Dataset in the activity's settings, click on the + New button to create a new dataset.

Select Amazon S3 and then click on continue.


Select Binary as the format and then click on Continue.


Name your dataset, select the linked service, provide the bucket path, and then click on OK.
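For reference, a Binary dataset pointing at an S3 bucket is stored as JSON along these lines (the dataset name, bucket, and folder are placeholders, and the linked service name matches the sketch above):

{
    "name": "S3Files",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AmazonS3LinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AmazonS3Location",
                "bucketName": "<your bucket>",
                "folderPath": "<your folder>"
            }
        }
    }
}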


In the Field list, click on + New and select Child items, then click on Debug; it will return the list of the files.
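Under the hood, this step just adds childItems to the Get Metadata activity's field list:

"fieldList": [ "childItems" ]

and the Debug output then contains an array of file entries, roughly shaped like this (the file names are made up for illustration):

{
    "childItems": [
        { "name": "file1.csv", "type": "File" },
        { "name": "file2.csv", "type": "File" }
    ]
}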



Next, find and drag the ForEach activity, connect it to the Get Metadata activity, then go to its Settings tab, click in the Items box, and click on Add dynamic content.



Click on the Get Metadata activity output; it will insert the dynamic content expression. Add ".childItems" at the end and click on Finish.
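Assuming the Get Metadata activity kept its default name of Get Metadata1 (yours may differ), the finished Items expression should read:

@activity('Get Metadata1').output.childItems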


Once our dynamic content is added, click on the pencil icon on the ForEach activity to go inside it, where we will set up our Copy data activity.


Inside the ForEach activity, find and drag the Copy data activity, then go to its Source tab and click on the + New button to create a new source dataset.


Select Amazon S3 and then click on Continue.


Select Binary as the format and then click on Continue.


Name your dataset, select the linked service, provide the file path from which we will copy the data, and then click on OK.



Once our dataset is configured, click on Open next to the source dataset in the Source tab, so we can create a parameter on it.


In the Parameters tab, click on the + New button, name your parameter, and then go to the Connection tab.


Inside the Connection tab, click on Add dynamic content for the file name field and use the parameter we created.
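Put together, the parameterized source dataset ends up looking roughly like the sketch below, where the parameter name FileName is just an example and the bucket is a placeholder; the parameter feeds the file name part of the dataset's location:

{
    "name": "S3SourceFile",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AmazonS3LinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "FileName": { "type": "String" }
        },
        "typeProperties": {
            "location": {
                "type": "AmazonS3Location",
                "bucketName": "<your bucket>",
                "fileName": {
                    "value": "@dataset().FileName",
                    "type": "Expression"
                }
            }
        }
    }
}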


Now, back in the Copy data activity's Source tab, the new dataset parameter appears; click on Add dynamic content, provide the ForEach loop item, and then we are good to go.
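The value to provide for the dataset parameter is the file name of the current ForEach item:

@item().name

In the pipeline JSON, the Copy data activity then passes it to the source dataset like this (the names follow the examples above):

"inputs": [
    {
        "referenceName": "S3SourceFile",
        "type": "DatasetReference",
        "parameters": {
            "FileName": { "value": "@item().name", "type": "Expression" }
        }
    }
]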



Next, go to the Sink tab, and then click on the + New button to create a new sink dataset, to which we will copy our files.


 
Select Azure Blob Storage and then click on Continue.


Select Binary as the format and click on Continue.


Name your dataset, select the linked service, provide the path to which the data should be copied, and then click on OK.
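The sink dataset follows the same Binary pattern, just pointing at a blob container instead of an S3 bucket; again, the names, container, and folder below are placeholders:

{
    "name": "BlobSinkFolder",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "<your container>",
                "folderPath": "<your folder>"
            }
        }
    }
}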


Now go back to your main pipeline and click on Debug; it will copy the files from the Amazon S3 bucket to our Azure Blob storage.
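As a quick sanity check before the Debug run, the overall pipeline JSON should be shaped roughly like the sketch below, with Get Metadata feeding a ForEach that wraps the Copy data activity; all the activity and dataset names here are the illustrative ones used above, so yours will likely differ:

{
    "name": "CopyS3ToBlob",
    "properties": {
        "activities": [
            {
                "name": "Get Metadata1",
                "type": "GetMetadata",
                "typeProperties": {
                    "dataset": { "referenceName": "S3Files", "type": "DatasetReference" },
                    "fieldList": [ "childItems" ]
                }
            },
            {
                "name": "ForEach1",
                "type": "ForEach",
                "dependsOn": [
                    { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
                ],
                "typeProperties": {
                    "items": {
                        "value": "@activity('Get Metadata1').output.childItems",
                        "type": "Expression"
                    },
                    "activities": [
                        {
                            "name": "Copy data1",
                            "type": "Copy",
                            "inputs": [
                                {
                                    "referenceName": "S3SourceFile",
                                    "type": "DatasetReference",
                                    "parameters": {
                                        "FileName": { "value": "@item().name", "type": "Expression" }
                                    }
                                }
                            ],
                            "outputs": [
                                { "referenceName": "BlobSinkFolder", "type": "DatasetReference" }
                            ],
                            "typeProperties": {
                                "source": { "type": "BinarySource" },
                                "sink": { "type": "BinarySink" }
                            }
                        }
                    ]
                }
            }
        ]
    }
}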



Video Demo: How to Copy Multiple Files from Amazon S3 to Azure Blob Storage by using Azure Data Factory