How to Query Tables from Lake Database Using Serverless SQL Pool in Azure Synapse
📘 Overview
In Azure Synapse Analytics, a Lake Database created with Spark pools can be queried directly from the built-in Serverless SQL Pool. Thanks to Synapse's shared metadata model, data stored in Azure Data Lake Storage can be explored with familiar T-SQL for reporting and analytics, with no data movement required.
🎯 Why Query Lake Databases with Serverless SQL?
- No need to provision or manage compute resources
- Familiar T-SQL interface for querying big data
- Cost-effective pay-per-query model
- Supports querying Delta, Parquet, and CSV formats (see the OPENROWSET sketch after this list)
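As a side note on the last bullet: even before a folder is registered as a lake database table, the serverless SQL pool can read it directly with OPENROWSET. A minimal sketch, assuming a hypothetical Delta folder path (substitute your own storage account and container):

-- Serverless SQL: read a Delta folder directly, with no table definition
-- (the storage URL is a placeholder for illustration)
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://youraccount.dfs.core.windows.net/datalake/lake/customer/',
    FORMAT = 'DELTA'
) AS customer_rows;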
🛠️ Step-by-Step: Querying Spark-Created Tables
✅ Step 1: Create Table in Lake Database via Spark (Optional)
%%sql
-- Run in a Synapse Spark notebook cell; the lakehouse database must already exist
-- (for example, via CREATE DATABASE IF NOT EXISTS lakehouse)
CREATE TABLE lakehouse.customer (
    id INT,
    name STRING,
    country STRING
)
USING DELTA
LOCATION 'abfss://datalake@youraccount.dfs.core.windows.net/lake/customer'
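Optionally, add a few rows from the same notebook so there is data to query later; a minimal sketch with purely illustrative values:

%%sql
-- Insert illustrative sample rows into the Delta table created above
INSERT INTO lakehouse.customer VALUES
  (1, 'Alice', 'United States'),
  (2, 'Bob', 'United Kingdom'),
  (3, 'Chen', 'Singapore')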
✅ Step 2: Confirm Table Appears in Lake Database
Go to Synapse Studio > Data > Lake database and confirm that the new table (customer in the example above) is visible under the specified database. Refresh the Data pane if it does not appear immediately.
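The synchronization can also be checked from the serverless SQL endpoint itself. A quick sketch, assuming the lakehouse database from Step 1:

-- Serverless SQL: the lake database should appear alongside SQL databases
SELECT name FROM sys.databases;

-- List the tables replicated from the lake database
SELECT TABLE_SCHEMA, TABLE_NAME
FROM [lakehouse].INFORMATION_SCHEMA.TABLES;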
✅ Step 3: Query the Table Using Serverless SQL
-- Run this against the built-in serverless SQL pool.
-- [YourLakeDB] is the lake database name (lakehouse in the example above).
SELECT TOP 10 *
FROM [YourLakeDB].[dbo].[customer];
✅ The table is exposed automatically through Synapse's shared metadata, so no external table definition is required.
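Lake databases are read-only from the serverless SQL side, so reusable objects such as views belong in a separate serverless SQL database that references the lake table by its three-part name. A minimal sketch; ServerlessReporting and vw_customer are hypothetical names:

-- Serverless SQL: wrap the lake table in a view inside a separate SQL database
CREATE DATABASE ServerlessReporting;
GO
USE ServerlessReporting;
GO
CREATE VIEW dbo.vw_customer AS
SELECT id, name, country
FROM [YourLakeDB].[dbo].[customer];
GO

Downstream tools such as Power BI can then connect to the serverless SQL endpoint and read from the view instead of the raw table.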
📌 Tips
- Ensure Spark and SQL pools have access to the same storage and metadata
- Use TOP or WHERE clauses to limit the data scanned (a filtered example follows this list)
- Lake DB tables should be created from Spark with valid Delta/Parquet paths
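For example, a filter pushed into the query keeps the scan small (the country value is illustrative):

-- Serverless SQL: filter early so only matching data is read
SELECT TOP 100 id, name
FROM [YourLakeDB].[dbo].[customer]
WHERE country = 'United States';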
📈 Use Cases
- Ad-hoc querying of curated data lakes
- Feeding Power BI dashboards directly from Lake DB
- Joining lake tables with Serverless SQL objects like external tables or views (see the join sketch below)
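As a sketch of the last use case, a lake database table can be joined with data read ad hoc through OPENROWSET; the orders folder and its columns are hypothetical:

-- Serverless SQL: join a lake database table with a Parquet folder read via OPENROWSET
SELECT c.name, c.country, o.order_id, o.amount
FROM [YourLakeDB].[dbo].[customer] AS c
JOIN OPENROWSET(
    BULK 'https://youraccount.dfs.core.windows.net/datalake/lake/orders/*.parquet',
    FORMAT = 'PARQUET'
) AS o
    ON o.customer_id = c.id;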
📚 Credit: Content created with the help of ChatGPT and Gemini.