How to Prevent Creation of Empty Files through Azure Data Factory ADF Tutorial 2021

Issue: How to prevent the creation of empty files through Azure Data Factory.


In this article, we are going to learn how to prevent the creation of empty files through Azure Data Factory. Let's start our demonstration. Go to Azure Data Factory, click on Author, then click on Pipelines and select New pipeline. Name the pipeline and drag the Copy data activity from the Activities tab. Click on the Copy data activity, click on Source, and create a new source dataset: click the + New button, select Azure SQL Database as the connection, and click Continue. This opens another window where we have to create a new linked service, so click the + New button, name the linked service, select the Azure subscription, authentication type, and user name and password, test the connection, and click Create, as shown in the picture below.


Fig-1: Linked service for Source dataset.

Once our linked service is created, select the table we need to read the data from, select None for import schema, and click OK.
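Behind the UI, the source linked service is stored as a JSON definition. A minimal sketch of what the Azure SQL Database linked service might look like is shown below; the name and connection string values are placeholders, not taken from this demo:

```json
{
  "name": "AzureSqlLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": "Server=tcp:myserver.database.windows.net,1433;Database=mydb;User ID=myuser;Password=<password>;Encrypt=true;"
    }
  }
}
```

You can view the JSON generated for your own linked service from the Manage hub, which is a quick way to confirm the connection settings you entered in the wizard.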

Now that we are done with our source dataset, let's click on the Sink tab and create a sink dataset where we will write the data. Click the + New button, select Azure Blob Storage, and click Continue. Select Delimited Text, which is a CSV file, then click Continue. The next window will ask for a linked service, so we have to create a new one that connects to the output folder where we want to write the data: click the + New button, name the linked service, select the Azure subscription and storage account name, test the connection, and click Create, as shown in the picture below.


Fig-2: Linked service for Sink dataset.

Once our linked service is created, select the input folder as the sink folder where our data will be written, check First row as header, select None for import schema, and click OK.
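The sink dataset created by these steps is also stored as JSON. A minimal sketch, assuming the blob linked service is named AzureBlobStorageLinkedService and the sink container is the "input" folder used in this demo, might look like this:

```json
{
  "name": "SinkCsvDataset",
  "properties": {
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLinkedService",
      "type": "LinkedServiceReference"
    },
    "type": "DelimitedText",
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "input"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": true
    }
  }
}
```

Note that "firstRowAsHeader": true corresponds to the First row as header checkbox selected in the wizard.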

Once our sink dataset is created, just click Debug. When the debug run completes, go to your input folder and your desired file will be there. Now go back to SQL Server Management Studio (SSMS), delete the records from our table, and click Debug again. This time the copy will overwrite the file and leave it empty. In this scenario we have to write logic with a condition: if there is no data in the source, we should be notified and no file should be created.

Now go back to Azure Data Factory, search for Lookup in the Activities tab, and drag it onto the canvas. Click the Lookup activity, go to the Settings tab, and create a new source dataset: select Azure SQL Database as the connection and click Continue. When it asks for a linked service, select the one we created earlier, select the table the data is coming from, select None for import schema, and click OK, then check the "First row only" box. Next, drag the If Condition activity from the Activities tab and place it between the Lookup activity and the Copy data activity. Click the If Condition activity, go to the Activities tab, and click Add dynamic content. Select the contains function, and inside it reference the Lookup activity's output and the firstRow property, as shown in the picture below.


Fig-3: Create dynamic content in the if condition activity.
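The expression built in Fig-3 works because when the Lookup activity returns no rows, its output contains no firstRow key at all, so contains() evaluates to false and the True branch is skipped. A sketch of the If Condition definition, assuming the Lookup activity is named Lookup1 (your activity name may differ):

```json
{
  "name": "If Condition1",
  "type": "IfCondition",
  "typeProperties": {
    "expression": {
      "value": "@contains(activity('Lookup1').output, 'firstRow')",
      "type": "Expression"
    }
  }
}
```

If the activity name in your pipeline is different, the name inside activity('...') must match it exactly, or the expression will fail at run time.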

Once our If Condition activity is configured, right-click the Copy data activity and select Cut, then click on True in the If Condition activity, open it, and paste the Copy data activity there. Go back to the main canvas and click Debug. Once the debug run completes, let's take a look at the sink output folder; as shown in the picture below, no file was created and no data was copied.



Fig-4: Debug completion report
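Putting the pieces together, the overall pipeline structure might look like the following sketch, with the Copy activity nested inside the If Condition's True branch and the If Condition depending on the Lookup. Activity names are assumptions, and the dataset and source settings inside the Lookup and Copy activities are omitted for brevity:

```json
{
  "name": "PreventEmptyFilePipeline",
  "properties": {
    "activities": [
      {
        "name": "Lookup1",
        "type": "Lookup",
        "typeProperties": { "firstRowOnly": true }
      },
      {
        "name": "If Condition1",
        "type": "IfCondition",
        "dependsOn": [
          { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
          "expression": {
            "value": "@contains(activity('Lookup1').output, 'firstRow')",
            "type": "Expression"
          },
          "ifTrueActivities": [
            { "name": "Copy data1", "type": "Copy" }
          ]
        }
      }
    ]
  }
}
```

Because the Copy activity lives in ifTrueActivities, it only runs when the Lookup finds at least one row, which is exactly the behavior demonstrated in the debug runs.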

Now let's go to SQL Server Management Studio and insert some records into our table. Once the records are inserted, go back to Azure Data Factory and click Debug. Once the debug run completes, this time it reads the data from the source and writes it to the input folder, because our table has records, as you can see in the picture below.


Fig-5: Debug completion report.


 

Video Demo: How to Prevent Creation of Empty File through Azure Data Factory

