Optimize Query Performance with Statistics in Azure Synapse Dedicated SQL Pool | Step-by-Step
📘 What are Statistics in Synapse?
In Azure Synapse Dedicated SQL Pools, statistics help the query optimizer estimate row counts and distribution, which affects the execution plan. Without accurate or up-to-date statistics, your queries may underperform or scan more data than needed.
🎯 Why Are Statistics Important?
- Improve query plan selection
- Reduce resource consumption (CPU, IO)
- Help avoid expensive broadcast joins or data movement
🛠️ Example 1: Create Statistics Manually
For a specific column (e.g., CustomerID):
CREATE STATISTICS stat_CustomerID
ON dbo.SalesFact_CSV(CustomerID);
🛠️ Example 2: Update Existing Statistics
To update outdated or stale stats:
UPDATE STATISTICS dbo.SalesFact_CSV(stat_CustomerID);
🛠️ Example 3: View All Statistics on a Table
SELECT name, auto_created, user_created, no_recompute
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.SalesFact_CSV');
🛠️ Example 4: Check Last Updated Date
SELECT name, STATS_DATE(object_id, stats_id) AS LastUpdated
FROM sys.stats
WHERE object_id = OBJECT_ID('dbo.SalesFact_CSV');
🛠️ Example 5: Create Statistics with FULLSCAN
CREATE STATISTICS stat_ProductID
ON dbo.SalesFact_CSV(ProductID)
WITH FULLSCAN;
📌 Best Practices
- Use
WITH FULLSCAN
on small-to-medium tables for high accuracy - Always update statistics after large data loads
- Do not rely solely on auto-created stats — they might not cover all key columns
- Monitor stale stats using
STATS_DATE
and refresh as needed
💡 Tip: When to Create vs. Update?
- Create: When no statistics exist on a column
- Update: When the existing statistics are outdated due to data changes
📺 Watch the Video Tutorial
📚 Credit: Content created with the help of ChatGPT and Gemini.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.