Read, Clean, and Save Lakehouse Data to a Delta Table Using Data Wrangler
Microsoft Fabric Tutorial
📘 What is Data Wrangler in Microsoft Fabric?
Data Wrangler is a UI-based tool in Microsoft Fabric that lets you explore, clean, transform, and prepare your data visually before saving it to a Delta table. It is built with data engineers and analysts in mind, enabling no-code or low-code data shaping before analysis or modeling.
✅ What You'll Learn
- How to launch Data Wrangler from your Lakehouse
- Read raw files (CSV/Parquet/JSON) into a temporary table (see the read sketch after this list)
- Apply transformations such as filtering, renaming, changing data types, and cleaning nulls
- Save the cleaned dataset as a managed Delta Table in the Lakehouse
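Since the wrangling flow starts from a loaded dataset, here is a minimal sketch of reading a raw CSV from the Lakehouse Files area in a Fabric notebook. This assumes a default Lakehouse is attached to the notebook; the path Files/raw/sales.csv is a hypothetical example, and spark and display come preconfigured in a Fabric notebook session.

```python
# Minimal sketch: read a raw CSV from the attached Lakehouse's Files area
# into a Spark DataFrame. The file path is a hypothetical placeholder.
df_raw = spark.read.csv(
    "Files/raw/sales.csv",  # relative path under the attached Lakehouse
    header=True,            # first row contains column names
    inferSchema=True,       # let Spark infer column types (review them afterwards)
)

df_raw.printSchema()        # confirm the inferred types before transforming
display(df_raw.limit(10))   # quick visual preview of the first rows
```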
🛠️ Step-by-Step Instructions
- Go to your Lakehouse workspace in Microsoft Fabric
- Navigate to the Files tab and locate a CSV or Parquet file
- Right-click the file and select Open in Data Wrangler
- Apply transformations using the UI (you’ll see Spark/Python code generated; a sketch of what that generated code can look like follows these steps)
- Click Export to Lakehouse and choose to save as a Delta Table
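To make steps 4 and 5 concrete, below is a sketch approximating the kind of PySpark code Data Wrangler can generate when you rename columns, fix data types, drop nulls, and filter, followed by saving the result as a managed Delta table. The column names (Order ID, order_date, amount) and the table name cleaned_sales are hypothetical examples; it builds on the df_raw DataFrame from the earlier read sketch.

```python
from pyspark.sql import functions as F

# Sketch of UI-generated-style cleaning steps; column and table names are hypothetical.
df_clean = (
    df_raw
    .withColumnRenamed("Order ID", "order_id")           # rename awkward column names early
    .withColumn("order_date", F.to_date("order_date"))   # fix data types
    .withColumn("amount", F.col("amount").cast("double"))
    .dropna(subset=["order_id", "amount"])                # remove rows with nulls in key columns
    .filter(F.col("amount") > 0)                          # drop obviously malformed rows
)

# Save as a managed Delta table in the Lakehouse Tables area.
df_clean.write.format("delta").mode("overwrite").saveAsTable("cleaned_sales")
```

Once saved, the table appears under Tables in the Lakehouse and can be queried directly with Spark SQL or the SQL analytics endpoint.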
🎯 Best Practices
- Preview your data to detect malformed rows before loading
- Use column profiling to check distributions and nulls (a profiling sketch follows this list)
- Export frequently used clean datasets into Delta for better query performance
- Rename columns and fix data types early in the wrangling process
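For the profiling tip above, here is a minimal sketch of checking summary statistics and per-column null counts in PySpark, assuming the df_raw DataFrame from the read example:

```python
from pyspark.sql import functions as F

# Summary statistics (count, mean, stddev, min, max) for a quick look at distributions.
df_raw.describe().show()

# Per-column null counts: count() ignores nulls, and when() returns null
# unless the column value is null, so this tallies nulls for each column.
null_counts = df_raw.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in df_raw.columns]
)
null_counts.show()
```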