Power BI in Fabric: Direct Lake, Semantic Models, Import vs DirectQuery vs Direct Lake, and Connecting Your Data to Reports

Power BI in Fabric: Direct Lake, Semantic Models, Import vs DirectQuery vs Direct Lake, and Connecting Your Data to Reports

Every pipeline, every notebook, every transformation — all of it leads to one thing: a dashboard that business users look at. Power BI is that dashboard. And in Fabric, Power BI gets a superpower it never had before: Direct Lake mode.

Direct Lake reads data directly from OneLake Delta tables — no data import, no scheduled refresh, no DirectQuery performance penalty. It combines the speed of Import mode with the freshness of DirectQuery. This post explains how it works, when it falls back to DirectQuery, and how to connect your Lakehouse and Warehouse tables to Power BI reports.

Think of Import mode like printing a report — fast to read but stale the moment you print it. DirectQuery is like a live video feed — always current but slow because it queries the source every time you interact. Direct Lake is like a smart TV with local cache — it reads directly from storage (fast) and the data is always the latest version (fresh). Best of both worlds.

Table of Contents

  • The Three Connection Modes
  • Import Mode (Traditional)
  • DirectQuery Mode
  • Direct Lake Mode (New in Fabric)
  • How Direct Lake Works
  • When Direct Lake Falls Back to DirectQuery
  • Semantic Models in Fabric
  • What Is a Semantic Model
  • Auto-Generated vs Custom Semantic Models
  • Creating a Semantic Model from Lakehouse
  • Creating a Semantic Model from Warehouse
  • Building Relationships
  • Measures and Calculated Columns
  • Connecting Power BI Desktop to Fabric
  • Default Semantic Model vs Custom
  • Report Refresh Behavior
  • Row-Level Security with Direct Lake
  • Performance Optimization for Direct Lake
  • V-Order
  • Column Data Types
  • Partitioning
  • Real-World Scenario: End-to-End Pipeline to Dashboard
  • Common Mistakes
  • Interview Questions
  • Wrapping Up

The Three Connection Modes

Feature Import DirectQuery Direct Lake
Where data lives Copied into Power BI dataset Stays in source DB Stays in OneLake (Delta)
Freshness Stale (until refresh) Always current Always current
Speed Fastest (in-memory) Slower (queries source on every interaction) Fast (reads Delta directly)
Refresh needed? Yes (scheduled) No No
Dataset size limit 1 GB (Pro), 400 GB (Premium) No limit No limit
Works with Any data source Any data source Fabric Lakehouse/Warehouse only
Available in Always Always Fabric only

Import Mode (Traditional)

Data Source → [Scheduled Refresh] → Power BI Dataset (in-memory copy) → Report
  - Data is COPIED into Power BI
  - Fast queries (everything in RAM)
  - Stale between refreshes (8 refreshes/day on Pro)
  - Size limited (1 GB Pro, 400 GB Premium)

DirectQuery Mode

Report interaction → Power BI sends SQL query → Data Source → results → Report
  - No data copied (always queries the source)
  - Always fresh (real-time)
  - Slower (every click sends a query to the source)
  - Source must be performant

Direct Lake Mode (New in Fabric)

Report interaction → Power BI reads Delta files directly from OneLake → Report
  - No data copied (reads Delta Parquet files directly)
  - Always fresh (reads latest Delta version)
  - Fast (reads optimized columnar files, similar speed to Import)
  - No scheduled refresh needed
  - ONLY works with Fabric Lakehouse and Warehouse tables

How Direct Lake Works

Step 1: You create a semantic model on a Lakehouse or Warehouse
Step 2: Power BI identifies the Delta table files in OneLake
Step 3: When a user opens a report, Power BI reads the Parquet files directly
        (no data copy, no SQL query to the source)
Step 4: Data is loaded into the VertiPaq engine (in-memory, columnar)
Step 5: Report renders with Import-like speed

When the underlying Delta table changes (notebook writes new data):
  → Power BI detects the change on next interaction
  → Loads the updated Delta files
  → Report shows fresh data — no scheduled refresh!

The key insight: Direct Lake bypasses the SQL query layer entirely. Instead of sending SELECT queries to a database (DirectQuery), it reads the raw Parquet files that make up the Delta table. This is why it is as fast as Import but as fresh as DirectQuery.

When Direct Lake Falls Back to DirectQuery

Direct Lake has guardrails. If the data exceeds certain thresholds, it falls back to DirectQuery mode (slower but still works):

Guardrail F2 Capacity F64 Capacity F128+ Capacity
Max rows per table 300 million 1.5 billion 3 billion
Max tables per model 500 500 500
Max model size (on disk) 10 GB 40 GB 80 GB
Max columns per table 2,000 2,000 2,000

If ANY threshold is exceeded → automatic fallback to DirectQuery for that table.

How to prevent fallback: – Use OPTIMIZE on Delta tables (fewer, larger files = faster loading) – Choose appropriate data types (INT instead of BIGINT, avoid unnecessary VARCHARs) – Aggregate at the Gold layer (do not point Power BI at raw Bronze tables) – V-Order optimize your Delta tables

Semantic Models in Fabric

What Is a Semantic Model

A Semantic Model (formerly “dataset”) is the analytical layer between your data and Power BI reports. It defines which tables are available, their relationships, measures (DAX calculations), and security rules.

Lakehouse/Warehouse Tables → Semantic Model → Power BI Report(s)
  (raw data)                   (relationships,    (visuals, charts,
                                measures, RLS)     dashboards)

Auto-Generated vs Custom Semantic Models

Type Created Best For
Auto-generated Automatically when you create a Lakehouse or Warehouse Quick exploration, prototyping
Custom Manually by you Production reports (controlled relationships, measures, RLS)

Creating a Semantic Model from Lakehouse

  1. Open your Lakehouse → click the SQL analytics endpoint view
  2. Click New semantic model (or Reporting → New semantic model)
  3. Select tables to include (e.g., dim_customer, dim_product, fact_sales)
  4. Click Confirm
  5. The semantic model opens in the web-based modeling view

Creating a Semantic Model from Warehouse

  1. Open your Warehouse
  2. Click New semantic model
  3. Select tables/views from your schemas (gold.dim_customer, reports.vw_revenue)
  4. Click Confirm

Building Relationships

In the modeling view: 1. Drag a column from one table to another (e.g., fact_sales.customer_keydim_customer.customer_key) 2. Configure: Cardinality (Many-to-One), Cross filter direction (Single) 3. Repeat for all dimension-to-fact relationships

dim_customer ──── customer_key ───── fact_sales
dim_product  ──── product_key  ───── fact_sales
dim_date     ──── date_key     ───── fact_sales

Measures and Calculated Columns

// DAX Measures (calculated at query time — recommended)
Total Revenue = SUM(fact_sales[total_amount])
Order Count = COUNTROWS(fact_sales)
Avg Order Value = DIVIDE([Total Revenue], [Order Count], 0)
YoY Growth = 
    VAR CurrentYear = [Total Revenue]
    VAR PrevYear = CALCULATE([Total Revenue], SAMEPERIODLASTYEAR(dim_date[full_date]))
    RETURN DIVIDE(CurrentYear - PrevYear, PrevYear, 0)

Connecting Power BI Desktop to Fabric

  1. Open Power BI Desktop
  2. Get DataMicrosoft FabricLakehouse or Warehouse
  3. Sign in with your organizational account
  4. Select workspace → select Lakehouse/Warehouse
  5. Choose tables → Load
  6. Connection mode is automatically Direct Lake (if using Fabric)

Report Refresh Behavior

Mode When Data Updates User Experience
Import Only after scheduled refresh Stale between refreshes
DirectQuery Every report interaction Always fresh, but slower
Direct Lake When underlying Delta files change Fresh within seconds of notebook/pipeline completion — no refresh needed

The magic: Run your notebook at 6 AM → it writes to the Lakehouse → Power BI report at 6:01 AM shows the new data. No refresh button. No schedule. It just works.

Row-Level Security with Direct Lake

RLS works the same as traditional Power BI but is enforced through the SQL analytics endpoint:

-- In the Lakehouse SQL analytics endpoint or Warehouse:
-- Create a security function
CREATE FUNCTION dbo.fn_rls_filter(@region VARCHAR(50))
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS result WHERE @region = USER_NAME();

-- Create security policy
CREATE SECURITY POLICY RegionFilter
ADD FILTER PREDICATE dbo.fn_rls_filter(region) ON dbo.fact_sales
WITH (STATE = ON);

Or define RLS roles in the Semantic Model using DAX:

// Role: EastRegion
// Table: fact_sales
// Filter: [region] = "East"
[region] = USERPRINCIPALNAME()

Performance Optimization for Direct Lake

V-Order

V-Order is a write-time optimization that sorts data within Parquet files for faster Direct Lake reads:

# V-Order is enabled by default in Fabric notebooks
# Verify:
spark.conf.get("spark.sql.parquet.vorder.enabled")  # "true"

# If disabled, enable:
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")

Best Practices

  1. Run OPTIMIZE on Gold tables — compacts small files, improves Direct Lake load time
  2. Use appropriate types — INT instead of BIGINT, DATE instead of DATETIME when time is not needed
  3. Aggregate at Gold layer — Direct Lake on a 100M-row fact table is fast; on a 10B-row Bronze table, it falls back
  4. Limit columns — include only columns needed for reporting in the semantic model
  5. Use star schema — fact + dimension tables with proper relationships load faster than wide denormalized tables

Real-World Scenario: End-to-End Pipeline to Dashboard

6:00 AM — Pipeline runs:
  Copy Activity → Azure SQL to bronze_lakehouse (raw data)

6:05 AM — Notebook runs:
  NB_Bronze_to_Silver → cleans, validates, deduplicates

6:15 AM — Notebook runs:
  NB_Silver_to_Gold → Delta MERGE into dim/fact tables in gold_lakehouse
  NB runs OPTIMIZE on all Gold tables

6:20 AM — Semantic Model:
  Points to gold_lakehouse SQL analytics endpoint
  Tables: dim_customer, dim_product, dim_date, fact_sales
  Mode: Direct Lake (automatic)

6:21 AM — Business user opens Power BI report:
  Report reads Delta files from gold_lakehouse via Direct Lake
  Data is from TODAY's 6 AM load — completely fresh
  No scheduled refresh was triggered
  Report loads in 2-3 seconds (Import-like speed)

Common Mistakes

  1. Pointing Direct Lake at Bronze tables — raw, unoptimized tables with billions of rows cause fallback to DirectQuery. Always point to optimized Gold layer tables.

  2. Not running OPTIMIZE — fragmented Delta tables with thousands of small files slow down Direct Lake. OPTIMIZE weekly or after large loads.

  3. Using Import mode in Fabric — there is almost no reason to use Import when Direct Lake is available. Direct Lake gives Import speed with DirectQuery freshness.

  4. Not building a proper star schema — wide denormalized tables are harder for Direct Lake to optimize. Star schema (fact + dimensions) loads faster and creates better DAX performance.

  5. Ignoring fallback indicators — check the semantic model properties for “Direct Lake fallback” warnings. If falling back, optimize the underlying data.

Interview Questions

Q: What is Direct Lake mode in Power BI? A: Direct Lake reads Delta Parquet files directly from OneLake without copying data (Import) or querying a database (DirectQuery). It provides Import-like speed with DirectQuery freshness. Data is always current — when a notebook writes new data, the next report interaction sees it with no scheduled refresh. Only available with Fabric Lakehouse and Warehouse.

Q: When does Direct Lake fall back to DirectQuery? A: When guardrails are exceeded: rows per table (300M-3B depending on capacity), model size on disk (10-80 GB), or columns per table (2,000). Also falls back for features not supported in Direct Lake mode. Prevent fallback by running OPTIMIZE, using proper data types, and aggregating at the Gold layer.

Q: What is the difference between a Semantic Model and a Power BI Dataset? A: They are the same thing — Microsoft renamed “Dataset” to “Semantic Model.” It is the analytical layer that defines tables, relationships, measures (DAX), and security rules between the data source and Power BI reports.

Wrapping Up

Direct Lake is the reason Fabric is transformative for Power BI. No more choosing between speed (Import) and freshness (DirectQuery). Build your pipelines, write to your Gold Lakehouse or Warehouse, and Power BI reads the latest data directly — fast, fresh, and maintenance-free.

Related posts:Fabric Lakehouse Practical GuideFabric Warehouse Practical GuideLakehouse vs WarehouseNormalization and Star Schema


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top
Share via
Copy link