How to Check File Schema Using Serverless SQL Pool in Azure Synapse | Azure Synapse Analytics Tutorial

How to Check File Schema Using Serverless SQL Pool in Azure Synapse

๐Ÿ“˜ How to Check File Schema Using Serverless SQL Pool in Azure Synapse

๐ŸŽฏ Objective:

Learn how to inspect the schema (columns and data types) of CSV, Parquet, and JSON files stored in Azure Data Lake Gen2 using Serverless SQL Pool — without importing the data.

๐Ÿงช Option 1: Use TOP 1 for a Quick Preview

✅ CSV File

SELECT TOP 1 *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/yourfile.csv',
    FORMAT = 'CSV',
    HEADER_ROW = TRUE,
    PARSER_VERSION = '2.0'
) AS rows;

✅ Parquet File

SELECT TOP 1 *
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/yourfile.parquet',
    FORMAT = 'PARQUET'
) AS rows;

✅ JSON File (NDJSON)

To preview the schema of newline-delimited JSON (NDJSON), use the WITH clause and parse with OPENJSON:

SELECT TOP 1
    jsonData.customer_id,
    jsonData.name,
    jsonData.email
FROM OPENROWSET(
    BULK 'https://<account>.dfs.core.windows.net/<container>/yourfile.json',
    FORMAT = 'CSV',
    FIELDQUOTE = '0x0b',
    FIELDTERMINATOR = '0x0b',
    ROWTERMINATOR = '0x0a'
) WITH (
    jsonContent VARCHAR(MAX)
) AS raw
CROSS APPLY OPENJSON(jsonContent)
WITH (
    customer_id INT,
    name VARCHAR(100),
    email VARCHAR(100)
) AS jsonData;

๐Ÿงช Option 2: Use sp_describe_first_result_set (Schema Only)

This method returns only metadata without scanning actual data — great for cost-free inspection.

✅ CSV File Schema

EXEC sp_describe_first_result_set N'
SELECT *
FROM OPENROWSET(
    BULK ''https://<account>.dfs.core.windows.net/<container>/yourfile.csv'',
    FORMAT = ''CSV'',
    HEADER_ROW = TRUE,
    PARSER_VERSION = ''2.0''
) AS rows;';

✅ Parquet File Schema

EXEC sp_describe_first_result_set N'
SELECT *
FROM OPENROWSET(
    BULK ''https://<account>.dfs.core.windows.net/<container>/yourfile.parquet'',
    FORMAT = ''PARQUET''
) AS rows;';

✅ JSON File Schema (NDJSON)

EXEC sp_describe_first_result_set N'
SELECT
    jsonData.customer_id,
    jsonData.name,
    jsonData.email
FROM OPENROWSET(
    BULK ''https://<account>.dfs.core.windows.net/<container>/yourfile.json'',
    FORMAT = ''CSV'',
    FIELDQUOTE = ''0x0b'',
    FIELDTERMINATOR = ''0x0b'',
    ROWTERMINATOR = ''0x0a''
) WITH (
    jsonContent VARCHAR(MAX)
) AS raw
CROSS APPLY OPENJSON(jsonContent)
WITH (
    customer_id INT,
    name VARCHAR(100),
    email VARCHAR(100)
) AS jsonData;';

๐Ÿ“บ Watch the Full Tutorial

Learn how to inspect file schemas visually in the video below:



This blog post was created with assistance from ChatGPT and Gemini AI to ensure technical accuracy and clarity.

No comments:

Post a Comment