Issue: How to Load Multiple CSV Files to an Azure SQL Table, Move Them to an Archive Folder, and Delete Them from the Source Folder in Azure Data Factory.
In this article, we are going to learn how to load multiple CSV files into an Azure SQL table, move them into an archive folder, and then delete the files from the source folder. Let's start the demonstration. First of all, we will go to the storage account, create some containers, and upload the files.
Open the Azure portal and go to your storage account, click on Containers, and then click on + Container to create a new container. Name the container and click on Create, then repeat the same steps for the next container. I have created two containers, one named Source and the other named Archive, but you can create as many containers as your requirements call for.
Once the containers are created, open the Source container and click on the Upload button at the top of the window, browse to the files you want to upload, select them, and click on Upload.
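If you prefer to script this setup rather than click through the portal, here is a minimal sketch using the azure-storage-blob Python SDK that creates the two containers and uploads the files. The connection string, container names, and sample file names (Emp1.txt and so on) are placeholders for this demo, so adjust them to your own environment.

```python
# Minimal sketch: create the Source and Archive containers and upload the demo files.
# The connection string and file names below are placeholders, not values from the article.
from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient

conn_str = "<storage-account-connection-string>"  # from the storage account's Access keys blade
service = BlobServiceClient.from_connection_string(conn_str)

# Create the two containers used in this walkthrough.
for name in ("source", "archive"):
    try:
        service.create_container(name)
    except ResourceExistsError:
        pass  # container already exists, nothing to do

# Upload the CSV files into the Source container.
source = service.get_container_client("source")
for local_file in ("Emp1.txt", "Emp2.txt", "Emp3.txt"):  # hypothetical sample files
    with open(local_file, "rb") as data:
        source.upload_blob(name=local_file, data=data, overwrite=True)
```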
Once the files are uploaded, the next step is to open Azure Data Factory Studio and create the linked services that we need to use in the pipeline.
How to Create a Linked Service in Azure Data Factory.
Open the Azure Data Factory Studio, then go to the Manage tab, then click on linked services, and then click on the + New button to create a new linked service.
Then select Azure Blob Storage, since we will be reading the data from blob storage, and click on Continue.
Name the linked service, select your Azure subscription, select the storage account name, and then click on Create.
Next, create a second linked service for the Azure SQL database: click on + New again, select Azure SQL Database, and click on Continue. Name the linked service, select your Azure subscription, select the server name, provide the database name, select the authentication type, provide the username and password, test the connection, and then click on Create.
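If you want to automate this step instead, the sketch below creates the same two linked services with the azure-mgmt-datafactory Python SDK. The subscription ID, resource group, factory name, linked-service names, and connection strings are all placeholder values, and exact model names can differ slightly between SDK versions.

```python
# Sketch: create the blob storage and Azure SQL linked services programmatically.
# All names and connection strings are placeholders for this demo.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLinkedService,
    AzureSqlDatabaseLinkedService,
    LinkedServiceResource,
    SecureString,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

# Linked service for the blob storage account we read from.
blob_ls = LinkedServiceResource(
    properties=AzureBlobStorageLinkedService(
        connection_string=SecureString(value="<storage-account-connection-string>")
    )
)
adf.linked_services.create_or_update(rg_name, df_name, "LS_BlobStorage", blob_ls)

# Linked service for the Azure SQL database we write to.
sql_ls = LinkedServiceResource(
    properties=AzureSqlDatabaseLinkedService(
        connection_string=SecureString(
            value="Server=tcp:<server>.database.windows.net;Database=<db>;"
                  "User ID=<user>;Password=<password>;"
        )
    )
)
adf.linked_services.create_or_update(rg_name, df_name, "LS_AzureSql", sql_ls)
```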
Now we have created two linked services: one for the Azure blob storage from which we will read the data, and one for the Azure SQL database to which we will write the data. The next step is to create a pipeline.
How to create a pipeline in Azure Data Factory.
Go to the Author tab, click on the + button, and then click on New pipeline.
Name the pipeline, then go to the Activities pane, find the Copy Data activity, and drag it onto the design surface. Click on the Source tab and then click on the + New button to create a new source dataset.
Then select Azure Blob Storage and click on Continue.
Then select CSV (DelimitedText) as the file format and click on Continue.
Name the dataset, select the blob storage linked service that we created earlier, browse to the source files path, check First row as header, select None for Import schema, and then click on OK.
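As a scripted alternative, here is a sketch of the same source dataset defined with the azure-mgmt-datafactory SDK. The dataset name, container name, and the LS_BlobStorage linked service are assumptions carried over from the earlier sketch.

```python
# Sketch: define the delimited-text source dataset over the Source container.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

source_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_BlobStorage"
        ),
        location=AzureBlobStorageLocation(container="source"),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf.datasets.create_or_update(rg_name, df_name, "DS_SourceFiles", source_ds)
```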
Once the source dataset is created, we have to create a sink dataset. To create it, go to the Sink tab and then click on the + New button.
Then select Azure SQL Database and click on Continue.
Name the dataset, select the Azure SQL linked service that we created before, provide the name of the table to which we will write the data, and then click on OK.
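The equivalent scripted version of the sink dataset is sketched below; the dataset name, the LS_AzureSql linked service, and the dbo.Emp table are placeholder assumptions.

```python
# Sketch: define the Azure SQL sink dataset; dbo.Emp is a hypothetical target table.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlTableDataset,
    DatasetResource,
    LinkedServiceReference,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

sql_ds = DatasetResource(
    properties=AzureSqlTableDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_AzureSql"
        ),
        table_name="dbo.Emp",  # hypothetical table that receives the CSV rows
    )
)
adf.datasets.create_or_update(rg_name, df_name, "DS_AzureSqlEmp", sql_ds)
```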
Then go back to the Source tab of the copy activity, select Wildcard file path, and enter the pattern (*.txt), adjusting it to match your files' extension.
Now drag another Copy Data activity onto the canvas and connect it to the first one. Click on the new copy activity, go to its Source tab, select the same source dataset that we created for the previous copy activity, select Wildcard file path, and enter the same pattern (*.txt).
Then go to the Sink tab; here we need a new dataset because we are writing the data to a different path, so click on the + New button.
Select Azure Blob Storage and click on Continue.
Select CSV (DelimitedText) as the file format and then click on Continue.
Name the dataset, select the same blob storage linked service that we used in the previous dataset, provide the folder path (the Archive container), check First row as header if your files have a header, select None for Import schema, and then click on OK.
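Scripted, the archive dataset is just another delimited-text dataset pointing the same blob linked service at the Archive container; the names below are placeholders.

```python
# Sketch: define the archive dataset over the Archive container.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

archive_ds = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="LS_BlobStorage"
        ),
        location=AzureBlobStorageLocation(container="archive"),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf.datasets.create_or_update(rg_name, df_name, "DS_ArchiveFiles", archive_ds)
```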
The next step is to bring in a Delete activity and place it at the end of the chain. Connect it to the second copy activity, go to its Source tab, select the source dataset, select Wildcard file path, and enter the pattern (Emp*.txt). A scripted sketch of the whole pipeline follows below.
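Here is a sketch of how the whole pipeline (copy to SQL, copy to archive, then delete the source files) could be assembled with the azure-mgmt-datafactory SDK. It assumes the linked-service and dataset names from the earlier sketches, and the wildcard patterns mirror the ones entered in the UI above.

```python
# Sketch: assemble the pipeline with two chained copy activities and a delete activity.
# Linked-service, dataset, and pipeline names are the placeholders used in the earlier sketches.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ActivityDependency,
    AzureBlobStorageReadSettings,
    AzureBlobStorageWriteSettings,
    AzureSqlSink,
    CopyActivity,
    DatasetReference,
    DeleteActivity,
    DelimitedTextSink,
    DelimitedTextSource,
    DelimitedTextWriteSettings,
    PipelineResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

src_ref = DatasetReference(type="DatasetReference", reference_name="DS_SourceFiles")
sql_ref = DatasetReference(type="DatasetReference", reference_name="DS_AzureSqlEmp")
arc_ref = DatasetReference(type="DatasetReference", reference_name="DS_ArchiveFiles")

# Wildcard read settings, matching the patterns set in the UI.
read_all = AzureBlobStorageReadSettings(recursive=True, wildcard_file_name="*.txt")
read_emp = AzureBlobStorageReadSettings(recursive=True, wildcard_file_name="Emp*.txt")

copy_to_sql = CopyActivity(
    name="CopyToAzureSql",
    inputs=[src_ref],
    outputs=[sql_ref],
    source=DelimitedTextSource(store_settings=read_all),
    sink=AzureSqlSink(),
)

copy_to_archive = CopyActivity(
    name="CopyToArchive",
    inputs=[src_ref],
    outputs=[arc_ref],
    source=DelimitedTextSource(store_settings=read_all),
    sink=DelimitedTextSink(
        store_settings=AzureBlobStorageWriteSettings(),
        format_settings=DelimitedTextWriteSettings(file_extension=".txt"),
    ),
    depends_on=[ActivityDependency(activity="CopyToAzureSql", dependency_conditions=["Succeeded"])],
)

delete_source = DeleteActivity(
    name="DeleteSourceFiles",
    dataset=src_ref,
    store_settings=read_emp,
    depends_on=[ActivityDependency(activity="CopyToArchive", dependency_conditions=["Succeeded"])],
)

pipeline = PipelineResource(activities=[copy_to_sql, copy_to_archive, delete_source])
adf.pipelines.create_or_update(rg_name, df_name, "PL_LoadAndArchiveCsv", pipeline)
```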
Now that we have created the pipeline, placed the activities, and configured them, click on Debug to run it. Once the run succeeds, the files are loaded into the Azure SQL table, copied to the Archive container, and deleted from the Source container.
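If you are scripting the run as well, the snippet below is the programmatic counterpart of clicking Debug (strictly speaking, it runs the published pipeline rather than the draft). The pipeline name is the placeholder used in the earlier sketches.

```python
# Sketch: trigger a run of the pipeline and poll until it finishes.
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
rg_name, df_name = "my-rg", "my-adf"  # hypothetical resource group and data factory

run = adf.pipelines.create_run(rg_name, df_name, "PL_LoadAndArchiveCsv", parameters={})
status = adf.pipeline_runs.get(rg_name, df_name, run.run_id).status
while status in ("Queued", "InProgress"):
    time.sleep(15)
    status = adf.pipeline_runs.get(rg_name, df_name, run.run_id).status
print(f"Pipeline run finished with status: {status}")
```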