How to Read data from Azure SQL Table and Write to JSON File in Blob Storage | PySpark Tutorial

This PySpark tutorial demonstrates how to read data from an Azure SQL table over JDBC and write the result as JSON to Azure Blob Storage, authenticating with a SAS token.

1️⃣ Step 1: Set Up Spark Session

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("ReadSQL_WriteToBlob_JSON") \
    .getOrCreate()

2️⃣ Step 2: Define JDBC Connection

jdbcHostname = "yourserver.database.windows.net"
jdbcPort = 1433
jdbcDatabase = "yourdatabase"
jdbcUsername = "sqladmin"
jdbcPassword = "YourPassword123!"

jdbcUrl = f"jdbc:sqlserver://{jdbcHostname}:{jdbcPort};database={jdbcDatabase};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"

connectionProperties = {
  "user": jdbcUsername,
  "password": jdbcPassword,
  "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
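In practice, avoid hardcoding credentials in a notebook or script. A minimal sketch, assuming hypothetical environment variable names (AZURE_SQL_USER, AZURE_SQL_PASSWORD, etc. — adjust to your setup), that builds the same JDBC URL from environment-driven values:

```python
import os

# Hypothetical variable names -- not part of the tutorial above.
jdbcUsername = os.environ.get("AZURE_SQL_USER", "sqladmin")
jdbcPassword = os.environ.get("AZURE_SQL_PASSWORD", "")
jdbcHostname = os.environ.get("AZURE_SQL_HOST", "yourserver.database.windows.net")
jdbcDatabase = os.environ.get("AZURE_SQL_DB", "yourdatabase")

# Same connection string as above, with secrets kept out of the source file.
jdbcUrl = (
    f"jdbc:sqlserver://{jdbcHostname}:1433;"
    f"database={jdbcDatabase};"
    "encrypt=true;trustServerCertificate=false;"
    "hostNameInCertificate=*.database.windows.net;loginTimeout=30"
)
```

In Databricks, a secret scope (dbutils.secrets.get) serves the same purpose.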

3️⃣ Step 3: Read Data from Azure SQL Table

df = spark.read.jdbc(
    url=jdbcUrl,
    table="your_table_name",
    properties=connectionProperties
)

df.show()
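Passing a bare table name pulls the entire table across the network. If you only need a subset, a query can be pushed down to Azure SQL instead: wrap it in parentheses with a derived-table alias (SQL Server requires the alias) and pass it as the table argument. A sketch, assuming hypothetical column names:

```python
# Hypothetical columns (id, name) -- substitute your own schema.
# The alias "AS src" is required by SQL Server for derived tables.
pushdown_query = (
    "(SELECT id, name FROM your_table_name "
    "WHERE id > 100) AS src"
)

# Passed in place of the plain table name, the WHERE clause runs
# server-side, so only matching rows cross the network:
# df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query,
#                      properties=connectionProperties)
```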

4️⃣ Step 4: Set SAS Token and Blob Storage Config

container_name = "yourcontainer"
storage_account_name = "yourstorageaccount"
sas_token = "<your_sas_token>"

spark.conf.set(
  f"fs.azure.sas.{container_name}.{storage_account_name}.blob.core.windows.net",
  sas_token
)

output_path = f"wasbs://{container_name}@{storage_account_name}.blob.core.windows.net/sql_to_json"

5️⃣ Step 5: Write to JSON Format in Blob Storage

df.coalesce(1).write.mode("overwrite").json(output_path)
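Two things worth knowing about this write: coalesce(1) collapses the DataFrame to a single partition so Spark emits one part file (at the cost of funneling all data through one executor), and Spark's JSON writer produces JSON Lines — one JSON object per line — rather than a single top-level array. A small sketch of what one line of the output looks like, using a hypothetical row:

```python
import json

# Spark writes JSON Lines: each row becomes one standalone JSON object.
# A hypothetical line from the output file:
sample_line = '{"id": 1, "name": "Alice"}'

# Each line parses independently with a plain JSON parser.
record = json.loads(sample_line)
```

To sanity-check the write, you can read the path back with spark.read.json(output_path) and compare counts.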

print("✅ Data successfully written to Azure Blob Storage in JSON format.")
