How to Read Data from Azure SQL Table and Write to JSON File in Blob Storage | PySpark Tutorial
This PySpark tutorial demonstrates how to read data from an Azure SQL Table using JDBC and write the results to a JSON file in Azure Blob Storage using SAS token access.
1️⃣ Step 1: Set Up Spark Session
from pyspark.sql import SparkSession
spark = SparkSession.builder \
    .appName("ReadSQL_WriteToBlob_JSON") \
    .getOrCreate()
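Reading over JDBC also requires the Microsoft SQL Server JDBC driver on the cluster classpath. If your environment does not preinstall it, one option is to pull it from Maven when the session starts. This is a sketch; the coordinate and version below are assumptions you should match to your JRE and cluster runtime:

```python
from pyspark.sql import SparkSession

# Sketch: fetch the SQL Server JDBC driver from Maven at session start.
# The artifact version here is an assumption -- pick one compatible with your JRE.
spark = (
    SparkSession.builder
    .appName("ReadSQL_WriteToBlob_JSON")
    .config("spark.jars.packages", "com.microsoft.sqlserver:mssql-jdbc:12.4.2.jre11")
    .getOrCreate()
)
```

On Databricks and similar managed platforms the driver is usually already present, in which case the extra config line can be dropped.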
2️⃣ Step 2: Define JDBC Connection
jdbcHostname = "yourserver.database.windows.net"
jdbcPort = 1433
jdbcDatabase = "yourdatabase"
jdbcUsername = "sqladmin"
jdbcPassword = "YourPassword123!"  # placeholder -- load credentials from a secret store (e.g., Azure Key Vault) rather than hardcoding them
jdbcUrl = f"jdbc:sqlserver://{jdbcHostname}:{jdbcPort};database={jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"
connectionProperties = {
    "user": jdbcUsername,
    "password": jdbcPassword,
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
3️⃣ Step 3: Read Data from Azure SQL Table
df = spark.read.jdbc(
    url=jdbcUrl,
    table="your_table_name",
    properties=connectionProperties
)
df.show()
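Passing a table name pulls every row across the network. When you only need a slice of the data, you can instead push the filtering down to Azure SQL by passing a parenthesized subquery with an alias as the `table` argument; Spark treats it like a table. The column and table names below are hypothetical:

```python
# Hypothetical example: push filtering down to Azure SQL by wrapping a
# subquery in parentheses and giving it an alias; Spark reads it as a table.
pushdown_query = (
    "(SELECT id, name, created_at "
    "FROM your_table_name "
    "WHERE created_at >= '2024-01-01') AS recent_rows"
)

# df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)
print(pushdown_query)
```

This way the `WHERE` clause executes on the SQL Server side, so only the matching rows travel over JDBC.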
4️⃣ Step 4: Set SAS Token and Blob Storage Config
container_name = "yourcontainer"
storage_account_name = "yourstorageaccount"
sas_token = "<your_sas_token>"
spark.conf.set(
    f"fs.azure.sas.{container_name}.{storage_account_name}.blob.core.windows.net",
    sas_token
)
output_path = f"wasbs://{container_name}@{storage_account_name}.blob.core.windows.net/sql_to_json"
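The WASB connector resolves the SAS token from that exact key pattern, `fs.azure.sas.<container>.<account>.blob.core.windows.net`, so a typo in the container or account name tends to surface only later as an authorization error on write. A quick plain-string sanity check (account and container names here are placeholders):

```python
container_name = "yourcontainer"
storage_account_name = "yourstorageaccount"

# The SAS token must be registered under this exact key for the wasbs:// path
# below to authenticate; print both and eyeball that the names line up.
sas_conf_key = (
    f"fs.azure.sas.{container_name}.{storage_account_name}.blob.core.windows.net"
)
output_path = (
    f"wasbs://{container_name}@{storage_account_name}"
    ".blob.core.windows.net/sql_to_json"
)
print(sas_conf_key)
print(output_path)
```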
5️⃣ Step 5: Write to JSON Format in Blob Storage
df.coalesce(1).write.mode("overwrite").json(output_path)  # coalesce(1) forces a single part file; drop it for large datasets to keep writes parallel
print("✅ Data successfully written to Azure Blob Storage in JSON format.")
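Note that Spark's JSON writer produces JSON Lines output inside the `sql_to_json/` folder: one JSON object per line in each part file, not a single JSON array. A small stdlib sketch of how a downstream consumer would parse such a part file (the sample rows are hypothetical):

```python
import json

# Spark emits JSON Lines: one object per line, not a JSON array.
# Sample of what a part file under sql_to_json/ might contain (values hypothetical):
sample_part_file = '{"id": 1, "name": "alice"}\n{"id": 2, "name": "bob"}\n'

# Parse line by line, the way any JSON Lines consumer would.
rows = [json.loads(line) for line in sample_part_file.splitlines()]
print(rows)
```

If a consumer expects a single JSON array instead, you would need to post-process the part file yourself.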