Export Data from Synapse Dedicated SQL Pool to Single CSV File in ADLS Gen2
📘 Overview
In Azure Synapse Analytics, it’s common to export data from a Dedicated SQL Pool to Azure Data Lake Storage Gen2 in CSV format for reporting, archival, or integration with downstream tools. This tutorial demonstrates how to do that and ensure the output is saved as a single CSV file.
🛠️ Step-by-Step: Export to Single CSV File
✅ 1. Query Data Using Synapse Spark
%%pyspark
df = spark.read \
.synapsesql("yourdedicatedpool.dbo.SalesData")
df.show()
✅ 2. Repartition to Single File
df_single = df.repartition(1)
This ensures only one output file is generated when writing to storage.
✅ 3. Write to ADLS Gen2 as CSV
df_single.write \
.mode("overwrite") \
.option("header", "true") \
.csv("abfss://export@yourstorageaccount.dfs.core.windows.net/sales/csv/")
📌 Notes
- repartition(1) should only be used for small to medium datasets due to single-threaded write.
- Azure Storage Explorer can be used to rename the output file (e.g., from
part-00000.csv
tosales_report.csv
). - Make sure Synapse workspace has access permissions (RBAC or SAS).
🎯 Use Cases
- Exporting summarized data for business intelligence
- Staging datasets for Power BI or external tools
- Archiving daily/weekly sales reports
📺 Watch the Video Tutorial
📚 Credit: Content created with the help of ChatGPT and Gemini.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.