Fabric Warehouse Advanced: COPY INTO, CTAS, Dynamic Management Views, Query Insights, Visual Query Editor, SSMS Connectivity, and T-SQL with Notebooks
Our Warehouse Practical Guide covered the fundamentals — creating tables, MERGE, stored procedures, views, and security. This post covers the ADVANCED capabilities: bulk loading with COPY INTO, creating tables from queries with CTAS, monitoring query performance with DMVs and Query Insights, the no-code Visual Query editor, connecting from SSMS, and the powerful pattern of combining T-SQL with Spark notebooks.
Table of Contents
- COPY INTO: Bulk Data Loading
- CTAS: Create Table As Select
- Dynamic Management Views (DMVs)
- Query Insight Views
- Visual Query Editor (No-Code)
- Connecting from SSMS and Azure Data Studio
- Integrating T-SQL with Spark Notebooks
- Warehouse-Specific Optimization
- Statistics and Query Plans
- Table Distribution and Partitioning
- Result Set Caching
- Common Mistakes
- Interview Questions
- Wrapping Up
COPY INTO: Bulk Data Loading
COPY INTO is the fastest way to load data from files into Warehouse tables:
-- Load CSV from Lakehouse Files into Warehouse table
COPY INTO dbo.stg_customers
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/raw_csv/customers.csv'
WITH (
FILE_TYPE = 'CSV',
FIRSTROW = 2, -- Skip header
FIELDTERMINATOR = ',',
ROWTERMINATOR = '
',
FIELDQUOTE = '"'
);
-- Load Parquet (even simpler — no format options needed)
COPY INTO dbo.stg_orders
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/parquet/orders/'
WITH (FILE_TYPE = 'PARQUET');
-- Load with wildcard (all files in folder)
COPY INTO dbo.stg_events
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/events/*.parquet'
WITH (FILE_TYPE = 'PARQUET');
COPY INTO vs Pipeline Copy Activity: COPY INTO runs inside the Warehouse (T-SQL), no pipeline needed. Copy Activity runs in a pipeline with orchestration. Use COPY INTO for simple, fast loads inside stored procedures. Use Copy Activity when you need orchestration, error handling, and monitoring.
CTAS: Create Table As Select
Create a new table from the results of a query — useful for materialized summaries, snapshots, and transformed tables:
-- Create summary table from a query
CREATE TABLE gold.monthly_revenue
AS
SELECT
YEAR(order_date) AS year,
MONTH(order_date) AS month,
department,
SUM(amount) AS total_revenue,
COUNT(DISTINCT customer_id) AS unique_customers
FROM dbo.fact_orders f
JOIN dbo.dim_department d ON f.dept_key = d.dept_key
GROUP BY YEAR(order_date), MONTH(order_date), department;
-- Create a snapshot for auditing
CREATE TABLE audit.customers_snapshot_20260605
AS SELECT * FROM gold.dim_customer;
-- Create a filtered subset
CREATE TABLE staging.high_value_customers
AS SELECT * FROM gold.dim_customer WHERE lifetime_value > 10000;
CTAS vs SELECT INTO: In Fabric Warehouse, use CTAS (ANSI standard). SELECT INTO syntax may not be supported.
Dynamic Management Views (DMVs)
DMVs expose query performance and session information:
-- View active queries
SELECT * FROM sys.dm_exec_requests
WHERE status = 'running';
-- View query execution history
SELECT
command,
status,
start_time,
end_time,
DATEDIFF(SECOND, start_time, end_time) AS duration_seconds,
row_count
FROM sys.dm_exec_requests
ORDER BY start_time DESC;
-- View active sessions
SELECT
session_id,
login_name,
status,
last_request_start_time
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;
-- View connection info
SELECT * FROM sys.dm_exec_connections;
Query Insight Views
Fabric Warehouse provides built-in Query Insight views for analyzing query performance without DMVs:
-- Query history with performance metrics
SELECT
query_text,
start_time,
end_time,
duration_ms,
row_count,
status,
submitted_by
FROM queryinsights.exec_requests_history
ORDER BY start_time DESC;
-- Frequently run queries
SELECT
query_hash,
query_text,
execution_count,
avg_duration_ms,
total_duration_ms
FROM queryinsights.frequently_run_queries
ORDER BY execution_count DESC;
-- Long-running queries
SELECT *
FROM queryinsights.long_running_queries
ORDER BY duration_ms DESC;
These views are pre-built — no setup required. Use them to identify slow queries, frequent queries, and optimization opportunities.
Visual Query Editor (No-Code)
The Visual Query editor lets analysts build queries by dragging and dropping — no SQL required:
- Open your Warehouse
- Click New visual query
- Drag tables from the Explorer onto the canvas
- Draw lines between tables to create joins (drag key column to key column)
- Click columns to add them to the output
- Add filters, aggregations, sorting via the ribbon
- Click Show SQL to see the generated T-SQL
- Click Run to execute
The Visual Query editor generates standard T-SQL. Analysts build queries visually, and engineers can review the generated SQL for optimization.
Connecting from SSMS and Azure Data Studio
SSMS Connection
- Open SQL Server Management Studio (SSMS)
- Server type: Database Engine
- Server name: Copy from Warehouse settings → SQL connection string (format:
xxx.datawarehouse.fabric.microsoft.com) - Authentication: Azure Active Directory – Universal with MFA
- Database: Select your warehouse name
- Click Connect
Once connected, you can run T-SQL, manage security (GRANT, DENY, RLS, CLS), and use all SSMS features.
Granting Access via SSMS
-- Connected to Warehouse via SSMS:
-- Grant schema-level access
GRANT SELECT ON SCHEMA::gold TO [analyst@company.com];
DENY SELECT ON SCHEMA::staging TO [analyst@company.com];
-- Grant table-level
GRANT SELECT ON gold.dim_customer (customer_id, name, city) TO [analyst@company.com];
-- RLS
CREATE FUNCTION dbo.fn_rls(@region VARCHAR(50))
RETURNS TABLE WITH SCHEMABINDING
AS RETURN SELECT 1 AS result WHERE @region = USER_NAME();
CREATE SECURITY POLICY RegionFilter
ADD FILTER PREDICATE dbo.fn_rls(region) ON gold.fact_sales
WITH (STATE = ON);
Integrating T-SQL with Spark Notebooks
Run T-SQL queries from within a Fabric Spark notebook using cross-database queries:
# Method 1: SparkSQL cross-database query
df = spark.sql('''
SELECT c.name, c.city, SUM(f.amount) as total_spent
FROM gold_warehouse.gold.dim_customer c
JOIN gold_warehouse.gold.fact_sales f ON c.customer_key = f.customer_key
GROUP BY c.name, c.city
ORDER BY total_spent DESC
''')
df.show()
# Method 2: Read Warehouse table into Spark DataFrame
df = spark.read.table("gold_warehouse.gold.dim_customer")
# Method 3: Write Spark results to Warehouse via cross-DB insert
# (Write from Lakehouse notebook to Warehouse is limited —
# use pipelines or stored procedures for Warehouse writes)
The pattern: Read from Warehouse (T-SQL via SparkSQL) → transform in PySpark → write to Lakehouse. Or: read from Lakehouse (PySpark) → the pipeline calls a Warehouse stored procedure for Gold layer loading.
Warehouse-Specific Optimization
Statistics
-- Create statistics for better query plans
CREATE STATISTICS stat_customer_city ON gold.dim_customer (city);
CREATE STATISTICS stat_sales_date ON gold.fact_sales (date_key);
-- Update statistics after large loads
UPDATE STATISTICS gold.fact_sales;
UPDATE STATISTICS gold.dim_customer;
Result Set Caching
-- Enable result set caching (returns cached results for repeated queries)
ALTER DATABASE CURRENT SET RESULT_SET_CACHING ON;
-- Check if a query hit the cache
SELECT result_cache_hit FROM sys.dm_exec_requests WHERE session_id = @@SPID;
Common Mistakes
- Not using COPY INTO for bulk loads — INSERT INTO row-by-row is dramatically slower than COPY INTO for large datasets.
- Not checking Query Insights — slow queries hide in production. Check queryinsights views weekly.
- Forgetting to update statistics — stale statistics cause bad query plans and slow queries.
- Not using CTAS for materialized summaries — recalculating aggregations every time is wasteful. CTAS creates a pre-computed table.
Interview Questions
Q: What is COPY INTO and when do you use it? A: COPY INTO bulk-loads data from files (CSV, Parquet) into Warehouse tables. Faster than INSERT statements or pipeline Copy for large files. Use inside stored procedures for automated loading or for one-time bulk loads.
Q: What are Query Insight views? A: Built-in views in Fabric Warehouse (queryinsights schema) that show query execution history, frequently run queries, and long-running queries with performance metrics. No setup required — they help identify optimization opportunities.
Wrapping Up
The Fabric Warehouse is more than basic T-SQL. COPY INTO handles bulk loading, CTAS creates materialized results, DMVs and Query Insights reveal performance issues, the Visual Query editor enables no-code analysis, and SSMS connectivity brings familiar tooling. Combined with the T-SQL + Notebook integration pattern, the Warehouse becomes a powerful component of your Fabric data platform.
Related posts: – Warehouse Practical Guide – Lakehouse vs Warehouse – SQL Indexes and Execution Plans
Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.