Fabric Warehouse Advanced: COPY INTO, CTAS, Dynamic Management Views, Query Insights, Visual Query Editor, SSMS Connectivity, and T-SQL with Notebooks

Fabric Warehouse Advanced: COPY INTO, CTAS, Dynamic Management Views, Query Insights, Visual Query Editor, SSMS Connectivity, and T-SQL with Notebooks

Our Warehouse Practical Guide covered the fundamentals — creating tables, MERGE, stored procedures, views, and security. This post covers the ADVANCED capabilities: bulk loading with COPY INTO, creating tables from queries with CTAS, monitoring query performance with DMVs and Query Insights, the no-code Visual Query editor, connecting from SSMS, and the powerful pattern of combining T-SQL with Spark notebooks.

Table of Contents

  • COPY INTO: Bulk Data Loading
  • CTAS: Create Table As Select
  • Dynamic Management Views (DMVs)
  • Query Insight Views
  • Visual Query Editor (No-Code)
  • Connecting from SSMS and Azure Data Studio
  • Integrating T-SQL with Spark Notebooks
  • Warehouse-Specific Optimization
  • Statistics and Query Plans
  • Table Distribution and Partitioning
  • Result Set Caching
  • Common Mistakes
  • Interview Questions
  • Wrapping Up

COPY INTO: Bulk Data Loading

COPY INTO is the fastest way to load data from files into Warehouse tables:

-- Load CSV from Lakehouse Files into Warehouse table
COPY INTO dbo.stg_customers
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/raw_csv/customers.csv'
WITH (
    FILE_TYPE = 'CSV',
    FIRSTROW = 2,              -- Skip header
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '
',
    FIELDQUOTE = '"'
);

-- Load Parquet (even simpler — no format options needed)
COPY INTO dbo.stg_orders
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/parquet/orders/'
WITH (FILE_TYPE = 'PARQUET');

-- Load with wildcard (all files in folder)
COPY INTO dbo.stg_events
FROM 'https://onelake.dfs.fabric.microsoft.com/workspace/lakehouse.Lakehouse/Files/events/*.parquet'
WITH (FILE_TYPE = 'PARQUET');

COPY INTO vs Pipeline Copy Activity: COPY INTO runs inside the Warehouse (T-SQL), no pipeline needed. Copy Activity runs in a pipeline with orchestration. Use COPY INTO for simple, fast loads inside stored procedures. Use Copy Activity when you need orchestration, error handling, and monitoring.

CTAS: Create Table As Select

Create a new table from the results of a query — useful for materialized summaries, snapshots, and transformed tables:

-- Create summary table from a query
CREATE TABLE gold.monthly_revenue
AS
SELECT
    YEAR(order_date) AS year,
    MONTH(order_date) AS month,
    department,
    SUM(amount) AS total_revenue,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM dbo.fact_orders f
JOIN dbo.dim_department d ON f.dept_key = d.dept_key
GROUP BY YEAR(order_date), MONTH(order_date), department;

-- Create a snapshot for auditing
CREATE TABLE audit.customers_snapshot_20260605
AS SELECT * FROM gold.dim_customer;

-- Create a filtered subset
CREATE TABLE staging.high_value_customers
AS SELECT * FROM gold.dim_customer WHERE lifetime_value > 10000;

CTAS vs SELECT INTO: In Fabric Warehouse, use CTAS (ANSI standard). SELECT INTO syntax may not be supported.

Dynamic Management Views (DMVs)

DMVs expose query performance and session information:

-- View active queries
SELECT * FROM sys.dm_exec_requests
WHERE status = 'running';

-- View query execution history
SELECT
    command,
    status,
    start_time,
    end_time,
    DATEDIFF(SECOND, start_time, end_time) AS duration_seconds,
    row_count
FROM sys.dm_exec_requests
ORDER BY start_time DESC;

-- View active sessions
SELECT
    session_id,
    login_name,
    status,
    last_request_start_time
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;

-- View connection info
SELECT * FROM sys.dm_exec_connections;

Query Insight Views

Fabric Warehouse provides built-in Query Insight views for analyzing query performance without DMVs:

-- Query history with performance metrics
SELECT
    query_text,
    start_time,
    end_time,
    duration_ms,
    row_count,
    status,
    submitted_by
FROM queryinsights.exec_requests_history
ORDER BY start_time DESC;

-- Frequently run queries
SELECT
    query_hash,
    query_text,
    execution_count,
    avg_duration_ms,
    total_duration_ms
FROM queryinsights.frequently_run_queries
ORDER BY execution_count DESC;

-- Long-running queries
SELECT *
FROM queryinsights.long_running_queries
ORDER BY duration_ms DESC;

These views are pre-built — no setup required. Use them to identify slow queries, frequent queries, and optimization opportunities.

Visual Query Editor (No-Code)

The Visual Query editor lets analysts build queries by dragging and dropping — no SQL required:

  1. Open your Warehouse
  2. Click New visual query
  3. Drag tables from the Explorer onto the canvas
  4. Draw lines between tables to create joins (drag key column to key column)
  5. Click columns to add them to the output
  6. Add filters, aggregations, sorting via the ribbon
  7. Click Show SQL to see the generated T-SQL
  8. Click Run to execute

The Visual Query editor generates standard T-SQL. Analysts build queries visually, and engineers can review the generated SQL for optimization.

Connecting from SSMS and Azure Data Studio

SSMS Connection

  1. Open SQL Server Management Studio (SSMS)
  2. Server type: Database Engine
  3. Server name: Copy from Warehouse settings → SQL connection string (format: xxx.datawarehouse.fabric.microsoft.com)
  4. Authentication: Azure Active Directory – Universal with MFA
  5. Database: Select your warehouse name
  6. Click Connect

Once connected, you can run T-SQL, manage security (GRANT, DENY, RLS, CLS), and use all SSMS features.

Granting Access via SSMS

-- Connected to Warehouse via SSMS:

-- Grant schema-level access
GRANT SELECT ON SCHEMA::gold TO [analyst@company.com];
DENY SELECT ON SCHEMA::staging TO [analyst@company.com];

-- Grant table-level
GRANT SELECT ON gold.dim_customer (customer_id, name, city) TO [analyst@company.com];

-- RLS
CREATE FUNCTION dbo.fn_rls(@region VARCHAR(50))
RETURNS TABLE WITH SCHEMABINDING
AS RETURN SELECT 1 AS result WHERE @region = USER_NAME();

CREATE SECURITY POLICY RegionFilter
ADD FILTER PREDICATE dbo.fn_rls(region) ON gold.fact_sales
WITH (STATE = ON);

Integrating T-SQL with Spark Notebooks

Run T-SQL queries from within a Fabric Spark notebook using cross-database queries:

# Method 1: SparkSQL cross-database query
df = spark.sql('''
    SELECT c.name, c.city, SUM(f.amount) as total_spent
    FROM gold_warehouse.gold.dim_customer c
    JOIN gold_warehouse.gold.fact_sales f ON c.customer_key = f.customer_key
    GROUP BY c.name, c.city
    ORDER BY total_spent DESC
''')
df.show()

# Method 2: Read Warehouse table into Spark DataFrame
df = spark.read.table("gold_warehouse.gold.dim_customer")

# Method 3: Write Spark results to Warehouse via cross-DB insert
# (Write from Lakehouse notebook to Warehouse is limited —
#  use pipelines or stored procedures for Warehouse writes)

The pattern: Read from Warehouse (T-SQL via SparkSQL) → transform in PySpark → write to Lakehouse. Or: read from Lakehouse (PySpark) → the pipeline calls a Warehouse stored procedure for Gold layer loading.

Warehouse-Specific Optimization

Statistics

-- Create statistics for better query plans
CREATE STATISTICS stat_customer_city ON gold.dim_customer (city);
CREATE STATISTICS stat_sales_date ON gold.fact_sales (date_key);

-- Update statistics after large loads
UPDATE STATISTICS gold.fact_sales;
UPDATE STATISTICS gold.dim_customer;

Result Set Caching

-- Enable result set caching (returns cached results for repeated queries)
ALTER DATABASE CURRENT SET RESULT_SET_CACHING ON;

-- Check if a query hit the cache
SELECT result_cache_hit FROM sys.dm_exec_requests WHERE session_id = @@SPID;

Common Mistakes

  1. Not using COPY INTO for bulk loads — INSERT INTO row-by-row is dramatically slower than COPY INTO for large datasets.
  2. Not checking Query Insights — slow queries hide in production. Check queryinsights views weekly.
  3. Forgetting to update statistics — stale statistics cause bad query plans and slow queries.
  4. Not using CTAS for materialized summaries — recalculating aggregations every time is wasteful. CTAS creates a pre-computed table.

Interview Questions

Q: What is COPY INTO and when do you use it? A: COPY INTO bulk-loads data from files (CSV, Parquet) into Warehouse tables. Faster than INSERT statements or pipeline Copy for large files. Use inside stored procedures for automated loading or for one-time bulk loads.

Q: What are Query Insight views? A: Built-in views in Fabric Warehouse (queryinsights schema) that show query execution history, frequently run queries, and long-running queries with performance metrics. No setup required — they help identify optimization opportunities.

Wrapping Up

The Fabric Warehouse is more than basic T-SQL. COPY INTO handles bulk loading, CTAS creates materialized results, DMVs and Query Insights reveal performance issues, the Visual Query editor enables no-code analysis, and SSMS connectivity brings familiar tooling. Combined with the T-SQL + Notebook integration pattern, the Warehouse becomes a powerful component of your Fabric data platform.

Related posts:Warehouse Practical GuideLakehouse vs WarehouseSQL Indexes and Execution Plans


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top
Share via
Copy link