Mirrored Databases in Microsoft Fabric: Real-Time Replication from SQL Server, Cosmos DB, Snowflake, and PostgreSQL Without Building a Single Pipeline

Mirrored Databases in Microsoft Fabric: Real-Time Replication from SQL Server, Cosmos DB, Snowflake, and PostgreSQL Without Building a Single Pipeline

You have been building pipelines throughout this entire blog — ADF Copy Activities to move data from SQL to ADLS, Databricks notebooks to transform Delta tables, Fabric Dataflow Gen2 to clean data. Every pipeline requires scheduling, monitoring, error handling, and maintenance.

What if you could skip ALL of that for the initial replication? What if your SQL Server database automatically appeared in Fabric as Delta tables — updated in near real-time — without building a single pipeline, writing a single line of code, or scheduling a single trigger?

That is exactly what Mirrored Databases do.

Mirroring creates a near real-time, read-only replica of your operational database inside Fabric OneLake. Changes in the source database (inserts, updates, deletes) are automatically captured and replicated as Delta tables. No CDC pipelines to build. No ADF Copy Activities. No scheduling. No maintenance. The data just… appears.

Think of mirroring like a live TV broadcast of a security camera. The camera (source database) records everything happening in the store. The broadcast (mirroring) sends the feed to your monitoring room (Fabric) in near real-time. You do not need to walk to the store (build a pipeline) to see what is happening. The feed is always on, always current, and you watch it on your own screens (notebooks, SQL endpoint, Power BI).

Table of Contents

  • What Is Database Mirroring in Fabric?
  • Why Mirroring Changes Everything
  • How Mirroring Works Under the Hood
  • All Supported Sources
  • Setting Up Mirroring: Azure SQL Database
  • Setting Up Mirroring: SQL Server (On-Premises)
  • Setting Up Mirroring: Cosmos DB
  • Setting Up Mirroring: Snowflake
  • Setting Up Mirroring: PostgreSQL
  • Open Mirroring (Build Your Own)
  • What Gets Mirrored (and What Does Not)
  • The Mirrored Database Item in Your Workspace
  • Querying Mirrored Data
  • With the SQL Analytics Endpoint
  • With Spark Notebooks
  • With Power BI (Direct Lake)
  • Mirroring + Shortcuts: The Power Combination
  • Cross-Database Joins (Mirrored + Warehouse + Lakehouse)
  • Real-World Scenario 1: E-Commerce Analytics
  • Real-World Scenario 2: Financial Reporting
  • Real-World Scenario 3: IoT with Cosmos DB
  • Real-World Scenario 4: Multi-Cloud with Snowflake
  • Mirroring vs Pipelines: When to Use Which
  • Security and Networking
  • Monitoring Mirroring
  • Cost: Mirroring Is Free (Yes, Really)
  • Limitations and Considerations
  • Common Mistakes
  • Interview Questions
  • Wrapping Up

What Is Database Mirroring in Fabric?

Mirroring continuously replicates data from an operational database into Fabric OneLake as Delta tables. It captures every change (INSERT, UPDATE, DELETE) from the source and applies it to the replica — automatically, continuously, in near real-time.

Source Database                          Fabric OneLake
┌──────────────┐                        ┌──────────────────────┐
│ SQL Server   │ ──── Mirroring ──────► │ Delta Tables         │
│              │     (automatic,        │ (read-only replica)  │
│ INSERT INTO  │      continuous,       │                      │
│ UPDATE       │      near real-time)   │ SQL Analytics Endpoint│
│ DELETE FROM  │                        │ Spark Notebooks       │
│              │                        │ Power BI Direct Lake  │
└──────────────┘                        └──────────────────────┘
  Source keeps running                    No pipelines needed
  No performance impact                   Always current

Why Mirroring Changes Everything

Before Mirroring (The Pipeline Approach)

Step 1: Create ADF/Fabric Pipeline
Step 2: Create Connection to source database
Step 3: Configure Copy Activity (source → destination)
Step 4: Handle incremental logic (watermark, CDC)
Step 5: Schedule trigger (every hour? every 15 minutes?)
Step 6: Build error handling (retry, alerts)
Step 7: Monitor runs daily
Step 8: Fix failures when they occur (schema changes, connectivity)

Time to set up: 2-5 days
Ongoing maintenance: Weekly monitoring
Data freshness: 15 minutes to 24 hours (depends on schedule)

After Mirroring

Step 1: Click "New mirrored database" → select source → authenticate → done

Time to set up: 5-15 minutes
Ongoing maintenance: Near zero
Data freshness: Seconds to minutes (near real-time)

Real-life analogy: Building a pipeline is like hiring a delivery service to bring inventory from a warehouse to your store on a schedule — you manage the schedule, handle delays, and deal with failed deliveries. Mirroring is like installing a conveyor belt between the warehouse and your store — inventory flows automatically, continuously, with zero manual effort.

How Mirroring Works Under the Hood

1. INITIAL SNAPSHOT
   → Fabric reads ALL existing data from the source database
   → Converts to Delta tables in OneLake
   → This can take minutes to hours depending on data size

2. CONTINUOUS CHANGE CAPTURE
   → Source database has a transaction log (every INSERT/UPDATE/DELETE is logged)
   → Mirroring reads the transaction log (similar to CDC)
   → Changes are applied to the Delta tables in OneLake
   → Latency: typically seconds to a few minutes

3. DELTA TABLE FORMAT
   → All mirrored data is stored as Delta Lake tables in OneLake
   → V-Order optimized for fast reads
   → Queryable via SQL analytics endpoint, Spark notebooks, Power BI

The key insight: mirroring reads from the transaction log, not by querying the tables directly. This means near-zero performance impact on the source database.

All Supported Sources

Source Status Connectivity
Azure SQL Database GA Direct (public or private endpoint)
Azure SQL Managed Instance GA Direct (public or private endpoint)
SQL Server 2016-2025 GA On-premises data gateway or Azure VM
Azure Cosmos DB GA Direct (analytical store)
Snowflake GA Direct or via gateway for private instances
Azure Database for PostgreSQL GA Direct (flexible server)
Azure Databricks GA Unity Catalog tables
Oracle GA Via gateway
SAP Datasphere GA Via connector
SharePoint Lists Preview Direct
Dremio Preview Via connector
Open Mirroring GA Any source (you build the connector)

Setting Up Mirroring: Azure SQL Database

Prerequisites

  • Azure SQL Database (any tier — Basic, Standard, Premium, Hyperscale)
  • Fabric workspace with capacity (F2 or higher)
  • SQL admin credentials or Azure AD authentication

Step by Step

  1. Open your Fabric workspace
  2. Click + New itemMirrored Azure SQL Database
  3. Connection: Enter server name, database name, authentication
  4. Select tables: Choose which tables to mirror (or select all)
  5. Click Mirror database
  6. Wait for initial sync (minutes for small databases, hours for large ones)

What Happens

Azure SQL Database: AdventureWorksLT
  ├── SalesLT.Customer (20,000 rows)
  ├── SalesLT.Product (300 rows)
  └── SalesLT.SalesOrderHeader (31,465 rows)

After mirroring, in your Fabric workspace:
  Mirrored Database: AdventureWorksLT_mirrored
    ├── SQL analytics endpoint (auto-created)
    │     ├── SalesLT.Customer (Delta, 20,000 rows)
    │     ├── SalesLT.Product (Delta, 300 rows)
    │     └── SalesLT.SalesOrderHeader (Delta, 31,465 rows)
    └── Default dataset (for Power BI)

Every INSERT, UPDATE, DELETE in Azure SQL is replicated within seconds.

Setting Up Mirroring: SQL Server (On-Premises)

For SQL Server running on-premises or in a non-Azure VM:

Prerequisites

  • SQL Server 2016 or later
  • On-premises data gateway installed OR Azure VM with connectivity
  • CDC enabled on the source database (for change tracking)

Step by Step

  1. + New itemMirrored SQL Server Database
  2. Connection: Enter server address, database, authentication
  3. Gateway: Select your on-premises data gateway
  4. Select tables → Mirror database

Real-world use case: A bank running core banking on SQL Server 2019 in their data center. Mirroring replicates transaction tables to Fabric for analytics — no pipeline, no ETL, near real-time.

Setting Up Mirroring: Cosmos DB

What Makes Cosmos DB Mirroring Special

Cosmos DB is a NoSQL database — documents, not tables. Mirroring converts JSON documents into tabular Delta tables automatically.

Cosmos DB Container: orders
  Document: {
    "id": "ord-001",
    "customer": {"name": "Naveen", "city": "Toronto"},
    "items": [{"product": "Laptop", "qty": 1}],
    "total": 999.99
  }

After mirroring → Delta table with flattened columns:
  | id      | customer_name | customer_city | total  |
  |---------|-------------|-------------|--------|
  | ord-001 | Naveen      | Toronto     | 999.99 |

Nested JSON objects are flattened. Arrays may need additional processing in notebooks.

Step by Step

  1. + New itemMirrored Azure Cosmos DB
  2. Connection: Enter Cosmos DB account URL, database name
  3. Authentication: Account key or Azure AD
  4. Select containers → Mirror database

Zero impact on Cosmos DB RU consumption — mirroring uses the change feed, not query RUs.

Setting Up Mirroring: Snowflake

Step by Step

  1. + New itemMirrored Snowflake
  2. Connection: Enter Snowflake account URL, warehouse, database
  3. Authentication: Username/password or key pair
  4. Select schemas and tables → Mirror database

Supports managed tables AND Apache Iceberg tables — Snowflake’s open table format is mirrored natively.

Real-world use case: Company has analytics in Snowflake (AWS) but reporting in Power BI (Fabric). Mirror Snowflake into Fabric → Power BI Direct Lake on mirrored data → near real-time dashboards without moving off Snowflake.

Setting Up Mirroring: PostgreSQL

  1. + New itemMirrored Azure Database for PostgreSQL
  2. Connection: Enter flexible server URL, database, credentials
  3. Select tables → Mirror database

Supports schema evolution — when source tables add columns, the mirrored Delta tables update automatically.

Open Mirroring (Build Your Own)

For sources not natively supported, Open Mirroring lets you build a custom connector:

Your custom application writes changes to OneLake using the mirroring API:
  1. Call the Fabric REST API to create a mirrored database item
  2. Write Delta files directly to the mirrored database's OneLake folder
  3. Fabric treats them as a mirrored database (SQL endpoint, Power BI)

Use case: Mirror from MongoDB, MySQL, DynamoDB, or any custom source

What Gets Mirrored (and What Does Not)

Mirrored ✅ NOT Mirrored ❌
Table data (rows and columns) Stored procedures
Schema changes (new columns) Views
INSERTs, UPDATEs, DELETEs Triggers
Primary keys / indexes (metadata) Functions
Multiple schemas Users / permissions
Data types (mapped to Delta) Database-level settings

Security must be re-configured in Fabric. Source database permissions do not carry over — you set up RLS, CLS, and workspace roles in Fabric separately.

The Mirrored Database Item in Your Workspace

When you create a mirrored database, TWO items appear in your workspace:

Workspace: Analytics_Prod
  ├── AdventureWorksLT_mirrored     ← Mirrored database item
  │     (shows replication status, table list, monitor)
  │
  └── AdventureWorksLT_mirrored     ← SQL analytics endpoint
        (auto-generated, query with T-SQL)

The mirrored database item shows: – Replication status: Running, Stopped, Initial sync in progress – Table list: All mirrored tables with row counts – Last sync time: When the most recent change was replicated – Controls: Stop replication, restart replication

Querying Mirrored Data

With the SQL Analytics Endpoint

-- Query mirrored tables with T-SQL (read-only)
SELECT TOP 100 * FROM SalesLT.Customer ORDER BY CustomerID;

SELECT p.Name AS product, SUM(sod.OrderQty) AS total_qty
FROM SalesLT.Product p
JOIN SalesLT.SalesOrderDetail sod ON p.ProductID = sod.ProductID
GROUP BY p.Name
ORDER BY total_qty DESC;

With Spark Notebooks

# Read mirrored table in a notebook
df = spark.sql("SELECT * FROM AdventureWorksLT_mirrored.SalesLT.Customer")
df.show(5)

# Transform mirrored data and write to a lakehouse
df_clean = df.select("CustomerID", "FirstName", "LastName", "EmailAddress")              .withColumn("FullName", concat(col("FirstName"), lit(" "), col("LastName")))

df_clean.write.format("delta").mode("overwrite")     .saveAsTable("silver_lakehouse.customers_clean")

With Power BI (Direct Lake)

  1. Create a new Semantic Model in the workspace
  2. Select the mirrored database’s SQL analytics endpoint
  3. Choose tables → build your model
  4. Create reports with Direct Lake mode — always fresh, near-instant

Mirroring + Shortcuts: The Power Combination

Scenario: Three source databases, one analytics layer

Mirrored Database 1: SQL Server (on-prem) → mirrored to Fabric
Mirrored Database 2: Cosmos DB → mirrored to Fabric
Mirrored Database 3: Snowflake → mirrored to Fabric

Lakehouse: analytics_lakehouse
  Tables/
    sql_customers ← SHORTCUT to Mirrored DB 1 → Customer table
    cosmos_orders ← SHORTCUT to Mirrored DB 2 → orders container
    snow_products ← SHORTCUT to Mirrored DB 3 → products table

One notebook can now join ALL three sources:
  SELECT c.name, o.total, p.product_name
  FROM sql_customers c
  JOIN cosmos_orders o ON c.customer_id = o.customer_id
  JOIN snow_products p ON o.product_id = p.product_id

Three databases. Three clouds. One query. Zero pipelines.

Cross-Database Joins (Mirrored + Warehouse + Lakehouse)

-- From the Warehouse, join mirrored data + lakehouse data + warehouse data
SELECT
    w.customer_key,
    w.name,
    m.LatestOrderDate,     -- From mirrored SQL Server (near real-time)
    l.churn_score,         -- From lakehouse ML predictions
    w.lifetime_revenue     -- From warehouse Gold layer
FROM gold.dim_customer w
JOIN SQLServer_mirrored.dbo.CustomerLatestOrder m ON w.customer_id = m.CustomerID
JOIN ml_lakehouse.dbo.churn_predictions l ON w.customer_id = l.customer_id
WHERE l.churn_score > 0.7;

Real-World Scenario 1: E-Commerce Analytics

Source: Azure SQL Database (production e-commerce app)
  Tables: Customers, Orders, Products, Inventory, Reviews

Setup: Mirror entire database → Fabric

Result:
  Analysts query live order data with T-SQL (no pipeline delay)
  Power BI shows real-time revenue dashboards
  Data scientists join mirrored data with ML feature tables
  Inventory alerts trigger when stock drops below threshold

Benefit: Orders placed at 2:15 PM appear in the dashboard by 2:16 PM
  (vs 2:00 AM next day with a nightly pipeline)

Real-World Scenario 2: Financial Reporting

Source: SQL Server 2019 (on-premises core banking)
  Tables: Accounts, Transactions, Balances, Customers

Setup: Mirror via on-premises gateway → Fabric

Result:
  Finance team sees near real-time account balances
  Regulatory reports use always-current data
  Audit queries run on the replica (zero impact on production banking app)
  Historical trend analysis using Delta time travel

Benefit: End-of-day reporting is now end-of-MINUTE reporting

Real-World Scenario 3: IoT with Cosmos DB

Source: Cosmos DB (IoT sensor readings — 1M documents/day)
  Containers: sensor_readings, device_metadata, alerts

Setup: Mirror Cosmos DB → Fabric

Result:
  Sensor data automatically lands as Delta tables
  Spark notebooks run anomaly detection on mirrored data
  Power BI shows real-time sensor dashboards
  No RU consumption impact on the Cosmos DB production workload

Benefit: IoT analytics without a single pipeline or ETL job

Real-World Scenario 4: Multi-Cloud with Snowflake

Source: Snowflake (analytics warehouse in AWS)
  Tables: dim_customer, fact_sales, agg_monthly (from their data team)

Setup: Mirror Snowflake → Fabric

Result:
  Power BI Direct Lake on mirrored Snowflake data (fast!)
  Join Snowflake data with Azure SQL mirrored data in one query
  No data movement between clouds (Fabric reads the mirrored replica)
  Snowflake compute costs are not affected

Benefit: Multi-cloud analytics without cross-cloud pipeline complexity

Mirroring vs Pipelines: When to Use Which

Factor Mirroring Pipelines
Setup time 5-15 minutes 2-5 days
Maintenance Near zero Weekly monitoring
Data freshness Seconds to minutes Minutes to hours
Transformations None (raw replica) Yes (clean, transform, enrich)
Read/Write Read-only replica Read/Write to destination
Source impact Near zero (reads transaction log) Queries source tables
Custom logic No (data as-is) Yes (any transformation)
Cost Free compute (storage only) Compute costs (CUs)

Use Mirroring When

  • You need near real-time data without transformation
  • The source data is already clean enough for analytics
  • You want to eliminate pipeline maintenance
  • You need a read-only replica for analytics without impacting production

Use Pipelines When

  • Data needs transformation before analytics (Bronze → Silver → Gold)
  • You need to combine data from multiple sources into one table
  • You need custom business logic applied during loading
  • The source is not supported by mirroring

Use BOTH (Most Common)

Mirroring: Source DB → Mirrored Delta tables (raw, near real-time)
Pipeline:  Mirrored tables → Notebook transforms → Gold layer (enriched)

The mirroring replaces the INGESTION pipeline.
You still need TRANSFORMATION pipelines (Bronze→Silver→Gold).

Cost: Mirroring Is Free (Yes, Really)

Fabric compute used to replicate your data into Fabric OneLake is free. You only pay for:

  • OneLake storage: ~$0.023/GB/month for the mirrored Delta tables
  • Compute for queries: When you query the mirrored data (SQL endpoint, notebooks), that uses your Fabric capacity CUs
  • No replication compute cost: The mirroring process itself is free

Limitations and Considerations

Limitation Detail
Read-only Mirrored tables cannot be written to — they are replicas
No transformations Data arrives as-is from the source — clean it downstream
Stopping replication Stop = complete halt. Restart = full re-sync from scratch (not resume)
Security Source permissions do not carry over — configure RLS/CLS in Fabric
Schema changes Most supported (new columns added automatically). Dropping columns may require restart
Large initial sync First sync of a 1TB database takes hours — plan accordingly
Supported tables only System tables, temporal tables, and some special types may not mirror

Common Mistakes

  1. Expecting mirroring to replace ALL pipelines — mirroring replaces INGESTION but not TRANSFORMATION. You still need notebooks or Dataflow Gen2 to clean and enrich data (Bronze→Silver→Gold).

  2. Not planning for the initial sync — the first sync copies all existing data. A 500GB database takes hours. Schedule the initial setup during off-peak hours.

  3. Forgetting to set up security in Fabric — source database permissions do not carry over. If your SQL Server has RLS, you must re-implement it on the Fabric SQL analytics endpoint.

  4. Stopping replication and expecting to resume — stopping is permanent. Restarting performs a FULL re-sync from scratch, not an incremental resume. Only stop if you truly want to end replication.

  5. Using mirroring for small, static lookup tables — a 100-row country lookup table does not need real-time mirroring. Just copy it once with a pipeline. Mirroring shines for large, frequently changing transactional tables.

  6. Not combining mirroring with transformation pipelines — mirrored data is raw. For production analytics, mirror into Fabric (Bronze), then transform with notebooks (Silver/Gold). Do not serve raw mirrored data directly to Power BI.

Interview Questions

Q: What is database mirroring in Microsoft Fabric? A: Mirroring creates a near real-time, read-only replica of an operational database inside Fabric OneLake as Delta tables. It continuously captures changes (inserts, updates, deletes) from the source database’s transaction log and applies them to the replica. No pipelines, no scheduling, no maintenance required. Supported sources include Azure SQL, SQL Server, Cosmos DB, Snowflake, and PostgreSQL.

Q: How does mirroring differ from building a Copy pipeline? A: Mirroring is automatic, continuous, near real-time, and free for compute — but the data arrives as-is (no transformations). Pipelines require manual setup, scheduling, and monitoring but allow transformations during loading. Most production implementations use both: mirroring for ingestion (replaces the Copy pipeline) and pipelines for transformation (Bronze→Silver→Gold).

Q: What happens when you stop mirroring? A: Stopping mirroring completely halts replication but does not delete data already in OneLake. Restarting does NOT resume from where it stopped — it performs a full re-sync from scratch. There is no pause/resume capability.

Q: Is mirroring free? A: The replication compute is free. You pay only for OneLake storage (Delta tables) and Fabric capacity CUs when you query the mirrored data. This makes mirroring extremely cost-effective compared to running Copy pipelines on a schedule.

Wrapping Up

Mirrored Databases are Fabric’s most transformative feature for data engineers. They eliminate the entire ingestion pipeline layer — no more Copy Activities, no more schedules, no more monitoring failed runs at 3 AM. Source data flows into OneLake automatically, continuously, in near real-time, as Delta tables.

The pattern is clear: Mirror for ingestion (free, automatic, real-time). Transform with notebooks (Bronze→Silver→Gold). Serve with Warehouse + Power BI. The pipeline is not dead — but the ingestion pipeline just became optional.

Related posts:How Real Companies Receive DataFabric Lakehouse Practical GuideOneLake ShortcutsFabric Data Factory & PipelinesAzure Connections & Authentication


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top
Share via
Copy link