🔍 What is Serverless SQL Pool?
Create a Serverless SQL Pool Database and External Table in Azure Synapse
Serverless SQL pool in Azure Synapse is a pay-per-query distributed query engine that enables users to run T-SQL queries directly on data stored in Azure Data Lake Storage (ADLS) without the need to provision or manage dedicated compute resources.
✅ Key Features:
- No infrastructure management
- Query data in CSV, Parquet, or JSON formats
- Cost-efficient: only pay for the data you query
- Supports
OPENROWSET
and external tables
📆 How to Create a Serverless SQL Pool Database
Serverless SQL Pool is available by default in each Synapse workspace under the name Built-in or ServerlessSQLPool.
Steps:
- Open Azure Synapse Studio.
- Navigate to the Data tab.
- Choose the built-in pool.
- Run the following T-SQL to create a new database:
CREATE DATABASE demo_serverless_db;
You can now query data directly using T-SQL from this database.
💰 Pricing Details
Serverless SQL pool uses a pay-per-query pricing model.
📆 Charges:
- $5 per TB of data processed (latest pricing may vary)
- No charges for:
- Metadata-only queries
- Failed queries
- Repeated queries (cached)
- Use
sys.dm_external_data_processed
to track usage.
💸 Cost Formula:
Processed Data (in TB) × $5 = Total Cost
🛠️ Creating External Tables in Serverless SQL Pool
External tables provide schema-on-read capability over data in Azure Data Lake.
✅ Prerequisites:
- Data must reside in Azure Data Lake Storage Gen2
- Create an External Data Source and a File Format
Example Setup:
-- Step 1: External Data Source
CREATE EXTERNAL DATA SOURCE MyDataLake
WITH (
LOCATION = 'https://<yourstorage>.dfs.core.windows.net/<container>'
);
-- Step 2: File Format
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS (
FIELD_TERMINATOR = ',',
STRING_DELIMITER = '"',
FIRST_ROW = 2
)
);
-- Step 3: External Table
CREATE EXTERNAL TABLE dbo.CustomerData (
customer_id INT,
fname VARCHAR(50),
lname VARCHAR(50),
salary FLOAT
)
WITH (
LOCATION = 'customer/',
DATA_SOURCE = MyDataLake,
FILE_FORMAT = CsvFormat
);
-- Query the Table
SELECT TOP 10 * FROM dbo.CustomerData;
💡 Tips and Extras
Using OPENROWSET
for Quick Queries:
SELECT * FROM OPENROWSET(
BULK 'customer/file.csv',
DATA_SOURCE = 'MyDataLake',
FORMAT = 'CSV',
PARSER_VERSION = '2.0',
FIRSTROW = 2
) AS data;
- No need to ingest data
- Always reads the most recent file content
🚀 Summary
Serverless SQL pools offer a flexible, cost-effective way to query files directly in your Data Lake without complex infrastructure. Ideal for exploratory data analysis, quick reports, and hybrid data architecture use cases.
📺 Watch the Full Tutorial
Learn how to set up and use Serverless SQL Pool in the video below:
This blog post was created with assistance from ChatGPT and Gemini AI to ensure technical accuracy and clarity.
No comments:
Post a Comment