Delta Table Operations in Lakehouse: Insert, Update, Delete, Merge
Microsoft Fabric Tutorial
Overview
This guide demonstrates how to perform Delta Table operations inside a Microsoft Fabric Lakehouse using PySpark. You’ll learn how to read data, write it as a Delta table, and perform INSERT, UPDATE, DELETE, and MERGE operations efficiently in a notebook environment.
What is a Delta Table?
A Delta table is a high-performance storage format that supports:
- ACID transactions
- Efficient updates and deletes
- Schema evolution (the table schema can change as new columns arrive)
- Versioning and time travel (see the sketch after this list)
Delta tables combine the scalability of Apache Spark with the transactional consistency of a data warehouse.
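Every committed write to a Delta table produces a new version that you can inspect and query later. As a minimal sketch of time travel (assuming the customerdelta table created in Step 2 below), the Delta SQL looks like this:

# List the table's version history: one row per committed write
display(spark.sql("DESCRIBE HISTORY customerdelta"))

# Time travel: read the table as it looked at version 0
display(spark.sql("SELECT * FROM customerdelta VERSION AS OF 0"))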
Step 1: Read the CSV File from the Lakehouse
# Replace path if needed
df = spark.read.option("header", True).option("inferSchema", True).csv("Files/sample_customers_dw.csv")
display(df)
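Before writing the data out, it is worth checking what Spark inferred from the CSV, for example:

# Inspect the inferred column names and data types
df.printSchema()
print(df.count(), "rows loaded")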
Step 2: Write to a Delta Table
df.write.format("delta").mode("overwrite").saveAsTable("customerdelta")
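At this point customerdelta is registered as a managed Delta table in the Lakehouse. One way to confirm this is to read it back and look at its metadata:

# Preview a few rows from the newly created Delta table
display(spark.table("customerdelta").limit(10))

# Show table details such as format, location, and file count
display(spark.sql("DESCRIBE DETAIL customerdelta"))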
➕ Insert Operation
from pyspark.sql import Row
new_row = Row(customerid=300, firstname="Zeeshan", lastname="Ali", city="New York")
df_new = spark.createDataFrame([new_row])
df_new.write.format("delta").mode("append").saveAsTable("customerdelta")
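The append adds one new row. Delta enforces the table schema on write, so the column names and types of the new DataFrame should line up with what was inferred from the CSV. You can confirm the row landed with a quick filter:

# Verify the inserted customer is now in the Delta table
display(spark.table("customerdelta").filter("customerid = 300"))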
✏️ Update Operation
from delta.tables import DeltaTable
target = DeltaTable.forName(spark, "customerdelta")
target.update(
    condition="customerid = 300",
    set={"city": "'Los Angeles'"}
)
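Note that the values in set are SQL expression strings, which is why 'Los Angeles' carries inner quotes. If you prefer SQL, the same update can be written directly against the Delta table:

# Equivalent UPDATE expressed in Spark SQL
spark.sql("UPDATE customerdelta SET city = 'Los Angeles' WHERE customerid = 300")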
Delete Operation
target.delete(condition="customerid = 300")
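After the delete, the row for customerid 300 should be gone; a count on that filter returns 0. The same operation is also available in SQL:

# Confirm the row was removed (expected output: 0)
print(spark.table("customerdelta").filter("customerid = 300").count())

# Equivalent DELETE expressed in Spark SQL
spark.sql("DELETE FROM customerdelta WHERE customerid = 300")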
Merge (Upsert) Operation
updatesDf = spark.createDataFrame([
    Row(customerid=101, firstname="Aamir", lastname="Shahzad", city="Charlotte")
])

target.alias("t").merge(
    updatesDf.alias("s"),
    "t.customerid = s.customerid"
).whenMatchedUpdate(set={"city": "s.city"}) \
.whenNotMatchedInsert(values={
    "customerid": "s.customerid",
    "firstname": "s.firstname",
    "lastname": "s.lastname",
    "city": "s.city"
}).execute()
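After the merge, customer 101 reflects the updated city, or is inserted as a new row if it did not exist. You can also review the table history to see each step of this tutorial recorded as its own version:

# Inspect the upserted row
display(spark.table("customerdelta").filter("customerid = 101"))

# Each operation (WRITE, UPDATE, DELETE, MERGE) appears as a separate version
display(target.history())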