Mirrored Databases in Microsoft Fabric: Real-Time Replication from SQL Server, Cosmos DB, Snowflake, and PostgreSQL Without Building a Single Pipeline
You have been building pipelines throughout this entire blog — ADF Copy Activities to move data from SQL to ADLS, Databricks notebooks to transform Delta tables, Fabric Dataflow Gen2 to clean data. Every pipeline requires scheduling, monitoring, error handling, and maintenance.
What if you could skip ALL of that for the initial replication? What if your SQL Server database automatically appeared in Fabric as Delta tables — updated in near real-time — without building a single pipeline, writing a single line of code, or scheduling a single trigger?
That is exactly what Mirrored Databases do.
Mirroring creates a near real-time, read-only replica of your operational database inside Fabric OneLake. Changes in the source database (inserts, updates, deletes) are automatically captured and replicated as Delta tables. No CDC pipelines to build. No ADF Copy Activities. No scheduling. No maintenance. The data just… appears.
Think of mirroring like a live TV broadcast of a security camera. The camera (source database) records everything happening in the store. The broadcast (mirroring) sends the feed to your monitoring room (Fabric) in near real-time. You do not need to walk to the store (build a pipeline) to see what is happening. The feed is always on, always current, and you watch it on your own screens (notebooks, SQL endpoint, Power BI).
Table of Contents
- What Is Database Mirroring in Fabric?
- Why Mirroring Changes Everything
- How Mirroring Works Under the Hood
- All Supported Sources
- Setting Up Mirroring: Azure SQL Database
- Setting Up Mirroring: SQL Server (On-Premises)
- Setting Up Mirroring: Cosmos DB
- Setting Up Mirroring: Snowflake
- Setting Up Mirroring: PostgreSQL
- Open Mirroring (Build Your Own)
- What Gets Mirrored (and What Does Not)
- The Mirrored Database Item in Your Workspace
- Querying Mirrored Data
- With the SQL Analytics Endpoint
- With Spark Notebooks
- With Power BI (Direct Lake)
- Mirroring + Shortcuts: The Power Combination
- Cross-Database Joins (Mirrored + Warehouse + Lakehouse)
- Real-World Scenario 1: E-Commerce Analytics
- Real-World Scenario 2: Financial Reporting
- Real-World Scenario 3: IoT with Cosmos DB
- Real-World Scenario 4: Multi-Cloud with Snowflake
- Mirroring vs Pipelines: When to Use Which
- Security and Networking
- Monitoring Mirroring
- Cost: Mirroring Is Free (Yes, Really)
- Limitations and Considerations
- Common Mistakes
- Interview Questions
- Wrapping Up
What Is Database Mirroring in Fabric?
Mirroring continuously replicates data from an operational database into Fabric OneLake as Delta tables. It captures every change (INSERT, UPDATE, DELETE) from the source and applies it to the replica — automatically, continuously, in near real-time.
Source Database Fabric OneLake
┌──────────────┐ ┌──────────────────────┐
│ SQL Server │ ──── Mirroring ──────► │ Delta Tables │
│ │ (automatic, │ (read-only replica) │
│ INSERT INTO │ continuous, │ │
│ UPDATE │ near real-time) │ SQL Analytics Endpoint│
│ DELETE FROM │ │ Spark Notebooks │
│ │ │ Power BI Direct Lake │
└──────────────┘ └──────────────────────┘
Source keeps running No pipelines needed
No performance impact Always current
Why Mirroring Changes Everything
Before Mirroring (The Pipeline Approach)
Step 1: Create ADF/Fabric Pipeline
Step 2: Create Connection to source database
Step 3: Configure Copy Activity (source → destination)
Step 4: Handle incremental logic (watermark, CDC)
Step 5: Schedule trigger (every hour? every 15 minutes?)
Step 6: Build error handling (retry, alerts)
Step 7: Monitor runs daily
Step 8: Fix failures when they occur (schema changes, connectivity)
Time to set up: 2-5 days
Ongoing maintenance: Weekly monitoring
Data freshness: 15 minutes to 24 hours (depends on schedule)
After Mirroring
Step 1: Click "New mirrored database" → select source → authenticate → done
Time to set up: 5-15 minutes
Ongoing maintenance: Near zero
Data freshness: Seconds to minutes (near real-time)
Real-life analogy: Building a pipeline is like hiring a delivery service to bring inventory from a warehouse to your store on a schedule — you manage the schedule, handle delays, and deal with failed deliveries. Mirroring is like installing a conveyor belt between the warehouse and your store — inventory flows automatically, continuously, with zero manual effort.
How Mirroring Works Under the Hood
1. INITIAL SNAPSHOT
→ Fabric reads ALL existing data from the source database
→ Converts to Delta tables in OneLake
→ This can take minutes to hours depending on data size
2. CONTINUOUS CHANGE CAPTURE
→ Source database has a transaction log (every INSERT/UPDATE/DELETE is logged)
→ Mirroring reads the transaction log (similar to CDC)
→ Changes are applied to the Delta tables in OneLake
→ Latency: typically seconds to a few minutes
3. DELTA TABLE FORMAT
→ All mirrored data is stored as Delta Lake tables in OneLake
→ V-Order optimized for fast reads
→ Queryable via SQL analytics endpoint, Spark notebooks, Power BI
The key insight: mirroring reads from the transaction log, not by querying the tables directly. This means near-zero performance impact on the source database.
All Supported Sources
| Source | Status | Connectivity |
|---|---|---|
| Azure SQL Database | GA | Direct (public or private endpoint) |
| Azure SQL Managed Instance | GA | Direct (public or private endpoint) |
| SQL Server 2016-2025 | GA | On-premises data gateway or Azure VM |
| Azure Cosmos DB | GA | Direct (analytical store) |
| Snowflake | GA | Direct or via gateway for private instances |
| Azure Database for PostgreSQL | GA | Direct (flexible server) |
| Azure Databricks | GA | Unity Catalog tables |
| Oracle | GA | Via gateway |
| SAP Datasphere | GA | Via connector |
| SharePoint Lists | Preview | Direct |
| Dremio | Preview | Via connector |
| Open Mirroring | GA | Any source (you build the connector) |
Setting Up Mirroring: Azure SQL Database
Prerequisites
- Azure SQL Database (any tier — Basic, Standard, Premium, Hyperscale)
- Fabric workspace with capacity (F2 or higher)
- SQL admin credentials or Azure AD authentication
Step by Step
- Open your Fabric workspace
- Click + New item → Mirrored Azure SQL Database
- Connection: Enter server name, database name, authentication
- Select tables: Choose which tables to mirror (or select all)
- Click Mirror database
- Wait for initial sync (minutes for small databases, hours for large ones)
What Happens
Azure SQL Database: AdventureWorksLT
├── SalesLT.Customer (20,000 rows)
├── SalesLT.Product (300 rows)
└── SalesLT.SalesOrderHeader (31,465 rows)
After mirroring, in your Fabric workspace:
Mirrored Database: AdventureWorksLT_mirrored
├── SQL analytics endpoint (auto-created)
│ ├── SalesLT.Customer (Delta, 20,000 rows)
│ ├── SalesLT.Product (Delta, 300 rows)
│ └── SalesLT.SalesOrderHeader (Delta, 31,465 rows)
└── Default dataset (for Power BI)
Every INSERT, UPDATE, DELETE in Azure SQL is replicated within seconds.
Setting Up Mirroring: SQL Server (On-Premises)
For SQL Server running on-premises or in a non-Azure VM:
Prerequisites
- SQL Server 2016 or later
- On-premises data gateway installed OR Azure VM with connectivity
- CDC enabled on the source database (for change tracking)
Step by Step
- + New item → Mirrored SQL Server Database
- Connection: Enter server address, database, authentication
- Gateway: Select your on-premises data gateway
- Select tables → Mirror database
Real-world use case: A bank running core banking on SQL Server 2019 in their data center. Mirroring replicates transaction tables to Fabric for analytics — no pipeline, no ETL, near real-time.
Setting Up Mirroring: Cosmos DB
What Makes Cosmos DB Mirroring Special
Cosmos DB is a NoSQL database — documents, not tables. Mirroring converts JSON documents into tabular Delta tables automatically.
Cosmos DB Container: orders
Document: {
"id": "ord-001",
"customer": {"name": "Naveen", "city": "Toronto"},
"items": [{"product": "Laptop", "qty": 1}],
"total": 999.99
}
After mirroring → Delta table with flattened columns:
| id | customer_name | customer_city | total |
|---------|-------------|-------------|--------|
| ord-001 | Naveen | Toronto | 999.99 |
Nested JSON objects are flattened. Arrays may need additional processing in notebooks.
Step by Step
- + New item → Mirrored Azure Cosmos DB
- Connection: Enter Cosmos DB account URL, database name
- Authentication: Account key or Azure AD
- Select containers → Mirror database
Zero impact on Cosmos DB RU consumption — mirroring uses the change feed, not query RUs.
Setting Up Mirroring: Snowflake
Step by Step
- + New item → Mirrored Snowflake
- Connection: Enter Snowflake account URL, warehouse, database
- Authentication: Username/password or key pair
- Select schemas and tables → Mirror database
Supports managed tables AND Apache Iceberg tables — Snowflake’s open table format is mirrored natively.
Real-world use case: Company has analytics in Snowflake (AWS) but reporting in Power BI (Fabric). Mirror Snowflake into Fabric → Power BI Direct Lake on mirrored data → near real-time dashboards without moving off Snowflake.
Setting Up Mirroring: PostgreSQL
- + New item → Mirrored Azure Database for PostgreSQL
- Connection: Enter flexible server URL, database, credentials
- Select tables → Mirror database
Supports schema evolution — when source tables add columns, the mirrored Delta tables update automatically.
Open Mirroring (Build Your Own)
For sources not natively supported, Open Mirroring lets you build a custom connector:
Your custom application writes changes to OneLake using the mirroring API:
1. Call the Fabric REST API to create a mirrored database item
2. Write Delta files directly to the mirrored database's OneLake folder
3. Fabric treats them as a mirrored database (SQL endpoint, Power BI)
Use case: Mirror from MongoDB, MySQL, DynamoDB, or any custom source
What Gets Mirrored (and What Does Not)
| Mirrored ✅ | NOT Mirrored ❌ |
|---|---|
| Table data (rows and columns) | Stored procedures |
| Schema changes (new columns) | Views |
| INSERTs, UPDATEs, DELETEs | Triggers |
| Primary keys / indexes (metadata) | Functions |
| Multiple schemas | Users / permissions |
| Data types (mapped to Delta) | Database-level settings |
Security must be re-configured in Fabric. Source database permissions do not carry over — you set up RLS, CLS, and workspace roles in Fabric separately.
The Mirrored Database Item in Your Workspace
When you create a mirrored database, TWO items appear in your workspace:
Workspace: Analytics_Prod
├── AdventureWorksLT_mirrored ← Mirrored database item
│ (shows replication status, table list, monitor)
│
└── AdventureWorksLT_mirrored ← SQL analytics endpoint
(auto-generated, query with T-SQL)
The mirrored database item shows: – Replication status: Running, Stopped, Initial sync in progress – Table list: All mirrored tables with row counts – Last sync time: When the most recent change was replicated – Controls: Stop replication, restart replication
Querying Mirrored Data
With the SQL Analytics Endpoint
-- Query mirrored tables with T-SQL (read-only)
SELECT TOP 100 * FROM SalesLT.Customer ORDER BY CustomerID;
SELECT p.Name AS product, SUM(sod.OrderQty) AS total_qty
FROM SalesLT.Product p
JOIN SalesLT.SalesOrderDetail sod ON p.ProductID = sod.ProductID
GROUP BY p.Name
ORDER BY total_qty DESC;
With Spark Notebooks
# Read mirrored table in a notebook
df = spark.sql("SELECT * FROM AdventureWorksLT_mirrored.SalesLT.Customer")
df.show(5)
# Transform mirrored data and write to a lakehouse
df_clean = df.select("CustomerID", "FirstName", "LastName", "EmailAddress") .withColumn("FullName", concat(col("FirstName"), lit(" "), col("LastName")))
df_clean.write.format("delta").mode("overwrite") .saveAsTable("silver_lakehouse.customers_clean")
With Power BI (Direct Lake)
- Create a new Semantic Model in the workspace
- Select the mirrored database’s SQL analytics endpoint
- Choose tables → build your model
- Create reports with Direct Lake mode — always fresh, near-instant
Mirroring + Shortcuts: The Power Combination
Scenario: Three source databases, one analytics layer
Mirrored Database 1: SQL Server (on-prem) → mirrored to Fabric
Mirrored Database 2: Cosmos DB → mirrored to Fabric
Mirrored Database 3: Snowflake → mirrored to Fabric
Lakehouse: analytics_lakehouse
Tables/
sql_customers ← SHORTCUT to Mirrored DB 1 → Customer table
cosmos_orders ← SHORTCUT to Mirrored DB 2 → orders container
snow_products ← SHORTCUT to Mirrored DB 3 → products table
One notebook can now join ALL three sources:
SELECT c.name, o.total, p.product_name
FROM sql_customers c
JOIN cosmos_orders o ON c.customer_id = o.customer_id
JOIN snow_products p ON o.product_id = p.product_id
Three databases. Three clouds. One query. Zero pipelines.
Cross-Database Joins (Mirrored + Warehouse + Lakehouse)
-- From the Warehouse, join mirrored data + lakehouse data + warehouse data
SELECT
w.customer_key,
w.name,
m.LatestOrderDate, -- From mirrored SQL Server (near real-time)
l.churn_score, -- From lakehouse ML predictions
w.lifetime_revenue -- From warehouse Gold layer
FROM gold.dim_customer w
JOIN SQLServer_mirrored.dbo.CustomerLatestOrder m ON w.customer_id = m.CustomerID
JOIN ml_lakehouse.dbo.churn_predictions l ON w.customer_id = l.customer_id
WHERE l.churn_score > 0.7;
Real-World Scenario 1: E-Commerce Analytics
Source: Azure SQL Database (production e-commerce app)
Tables: Customers, Orders, Products, Inventory, Reviews
Setup: Mirror entire database → Fabric
Result:
Analysts query live order data with T-SQL (no pipeline delay)
Power BI shows real-time revenue dashboards
Data scientists join mirrored data with ML feature tables
Inventory alerts trigger when stock drops below threshold
Benefit: Orders placed at 2:15 PM appear in the dashboard by 2:16 PM
(vs 2:00 AM next day with a nightly pipeline)
Real-World Scenario 2: Financial Reporting
Source: SQL Server 2019 (on-premises core banking)
Tables: Accounts, Transactions, Balances, Customers
Setup: Mirror via on-premises gateway → Fabric
Result:
Finance team sees near real-time account balances
Regulatory reports use always-current data
Audit queries run on the replica (zero impact on production banking app)
Historical trend analysis using Delta time travel
Benefit: End-of-day reporting is now end-of-MINUTE reporting
Real-World Scenario 3: IoT with Cosmos DB
Source: Cosmos DB (IoT sensor readings — 1M documents/day)
Containers: sensor_readings, device_metadata, alerts
Setup: Mirror Cosmos DB → Fabric
Result:
Sensor data automatically lands as Delta tables
Spark notebooks run anomaly detection on mirrored data
Power BI shows real-time sensor dashboards
No RU consumption impact on the Cosmos DB production workload
Benefit: IoT analytics without a single pipeline or ETL job
Real-World Scenario 4: Multi-Cloud with Snowflake
Source: Snowflake (analytics warehouse in AWS)
Tables: dim_customer, fact_sales, agg_monthly (from their data team)
Setup: Mirror Snowflake → Fabric
Result:
Power BI Direct Lake on mirrored Snowflake data (fast!)
Join Snowflake data with Azure SQL mirrored data in one query
No data movement between clouds (Fabric reads the mirrored replica)
Snowflake compute costs are not affected
Benefit: Multi-cloud analytics without cross-cloud pipeline complexity
Mirroring vs Pipelines: When to Use Which
| Factor | Mirroring | Pipelines |
|---|---|---|
| Setup time | 5-15 minutes | 2-5 days |
| Maintenance | Near zero | Weekly monitoring |
| Data freshness | Seconds to minutes | Minutes to hours |
| Transformations | None (raw replica) | Yes (clean, transform, enrich) |
| Read/Write | Read-only replica | Read/Write to destination |
| Source impact | Near zero (reads transaction log) | Queries source tables |
| Custom logic | No (data as-is) | Yes (any transformation) |
| Cost | Free compute (storage only) | Compute costs (CUs) |
Use Mirroring When
- You need near real-time data without transformation
- The source data is already clean enough for analytics
- You want to eliminate pipeline maintenance
- You need a read-only replica for analytics without impacting production
Use Pipelines When
- Data needs transformation before analytics (Bronze → Silver → Gold)
- You need to combine data from multiple sources into one table
- You need custom business logic applied during loading
- The source is not supported by mirroring
Use BOTH (Most Common)
Mirroring: Source DB → Mirrored Delta tables (raw, near real-time)
Pipeline: Mirrored tables → Notebook transforms → Gold layer (enriched)
The mirroring replaces the INGESTION pipeline.
You still need TRANSFORMATION pipelines (Bronze→Silver→Gold).
Cost: Mirroring Is Free (Yes, Really)
Fabric compute used to replicate your data into Fabric OneLake is free. You only pay for:
- OneLake storage: ~$0.023/GB/month for the mirrored Delta tables
- Compute for queries: When you query the mirrored data (SQL endpoint, notebooks), that uses your Fabric capacity CUs
- No replication compute cost: The mirroring process itself is free
Limitations and Considerations
| Limitation | Detail |
|---|---|
| Read-only | Mirrored tables cannot be written to — they are replicas |
| No transformations | Data arrives as-is from the source — clean it downstream |
| Stopping replication | Stop = complete halt. Restart = full re-sync from scratch (not resume) |
| Security | Source permissions do not carry over — configure RLS/CLS in Fabric |
| Schema changes | Most supported (new columns added automatically). Dropping columns may require restart |
| Large initial sync | First sync of a 1TB database takes hours — plan accordingly |
| Supported tables only | System tables, temporal tables, and some special types may not mirror |
Common Mistakes
-
Expecting mirroring to replace ALL pipelines — mirroring replaces INGESTION but not TRANSFORMATION. You still need notebooks or Dataflow Gen2 to clean and enrich data (Bronze→Silver→Gold).
-
Not planning for the initial sync — the first sync copies all existing data. A 500GB database takes hours. Schedule the initial setup during off-peak hours.
-
Forgetting to set up security in Fabric — source database permissions do not carry over. If your SQL Server has RLS, you must re-implement it on the Fabric SQL analytics endpoint.
-
Stopping replication and expecting to resume — stopping is permanent. Restarting performs a FULL re-sync from scratch, not an incremental resume. Only stop if you truly want to end replication.
-
Using mirroring for small, static lookup tables — a 100-row country lookup table does not need real-time mirroring. Just copy it once with a pipeline. Mirroring shines for large, frequently changing transactional tables.
-
Not combining mirroring with transformation pipelines — mirrored data is raw. For production analytics, mirror into Fabric (Bronze), then transform with notebooks (Silver/Gold). Do not serve raw mirrored data directly to Power BI.
Interview Questions
Q: What is database mirroring in Microsoft Fabric? A: Mirroring creates a near real-time, read-only replica of an operational database inside Fabric OneLake as Delta tables. It continuously captures changes (inserts, updates, deletes) from the source database’s transaction log and applies them to the replica. No pipelines, no scheduling, no maintenance required. Supported sources include Azure SQL, SQL Server, Cosmos DB, Snowflake, and PostgreSQL.
Q: How does mirroring differ from building a Copy pipeline? A: Mirroring is automatic, continuous, near real-time, and free for compute — but the data arrives as-is (no transformations). Pipelines require manual setup, scheduling, and monitoring but allow transformations during loading. Most production implementations use both: mirroring for ingestion (replaces the Copy pipeline) and pipelines for transformation (Bronze→Silver→Gold).
Q: What happens when you stop mirroring? A: Stopping mirroring completely halts replication but does not delete data already in OneLake. Restarting does NOT resume from where it stopped — it performs a full re-sync from scratch. There is no pause/resume capability.
Q: Is mirroring free? A: The replication compute is free. You pay only for OneLake storage (Delta tables) and Fabric capacity CUs when you query the mirrored data. This makes mirroring extremely cost-effective compared to running Copy pipelines on a schedule.
Wrapping Up
Mirrored Databases are Fabric’s most transformative feature for data engineers. They eliminate the entire ingestion pipeline layer — no more Copy Activities, no more schedules, no more monitoring failed runs at 3 AM. Source data flows into OneLake automatically, continuously, in near real-time, as Delta tables.
The pattern is clear: Mirror for ingestion (free, automatic, real-time). Transform with notebooks (Bronze→Silver→Gold). Serve with Warehouse + Power BI. The pipeline is not dead — but the ingestion pipeline just became optional.
Related posts: – How Real Companies Receive Data – Fabric Lakehouse Practical Guide – OneLake Shortcuts – Fabric Data Factory & Pipelines – Azure Connections & Authentication
Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.