Microsoft Fabric for Data Engineers: What It Is, What It Replaces, How It Competes, and Why It Matters
You have spent months learning Azure Data Factory, Synapse Analytics, Databricks, ADLS Gen2, Key Vault, and Power BI. You built pipelines, configured linked services, set up secret scopes, created storage credentials, and connected everything with access keys and JDBC URLs. It works. But it took a LOT of plumbing.
Now Microsoft says: “What if we combined ALL of those tools into ONE platform where everything shares the same storage, the same security, and the same governance — with zero configuration between them?”
That platform is Microsoft Fabric.
Fabric is not a new tool. It is the unification of the Azure data tools you already know. ADF pipelines, Synapse SQL, Spark notebooks like the ones you wrote in Databricks, ADLS Gen2 storage, and Power BI dashboards are all brought into a single SaaS platform with one storage layer, one security model, and one billing system.
Think of it like this: you have been furnishing your apartment by buying individual pieces from different stores — a couch from one store, a dining table from another, a bed from a third. Each piece is great, but nothing matches, delivery dates are different, and assembly instructions are in different languages. Fabric is like buying a complete furnished apartment from one showroom — everything matches, everything works together, and one team handles the entire setup.
Table of Contents
- What Is Microsoft Fabric?
- The History: How We Got Here
- The Seven Workloads of Fabric
- OneLake: The Foundation That Changes Everything
- What Fabric Replaces (The Mapping Table)
- How Our Blog Pipelines Would Look in Fabric
- Fabric vs Databricks: Compete or Coexist?
- Fabric vs Snowflake
- Fabric vs AWS (Glue, Redshift, EMR)
- Fabric vs the Traditional Azure Stack
- The Licensing Model: Capacity Units (CUs)
- Direct Lake Mode: Why Power BI Teams Love Fabric
- Fabric IQ and AI Integration (2026)
- The DP-700 Certification
- Who Should Use Fabric and When
- What Fabric Does NOT Replace
- Migration Path: From Azure Stack to Fabric
- Common Misconceptions
- Interview Questions
- Wrapping Up
What Is Microsoft Fabric?
Microsoft Fabric is a unified SaaS analytics platform that combines data ingestion, engineering, warehousing, science, real-time analytics, and business intelligence into a single product. It was announced at Microsoft Build in May 2023 and became generally available in November 2023.
The key word is unified. Every Fabric workload shares:
- OneLake — one storage layer for everything (no more separate storage accounts)
- One security model — one set of permissions, not separate RBAC for each service
- One billing — one capacity subscription, not separate bills for ADF, Synapse, Databricks, ADLS
- One governance — Purview built in, not a separate service
- Delta Lake everywhere — all tables are Delta format by default
BEFORE (Separate Tools):
ADF → needs linked service → ADLS Gen2 → needs access key → Synapse → needs JDBC → SQL DB
Databricks → needs secret scope → Key Vault → needs storage credential → ADLS Gen2
Power BI → needs import/DirectQuery → Synapse/SQL → needs connection string
Result: 6 tools, 10+ connections, 15+ credentials, 3 billing systems
AFTER (Fabric):
Fabric Pipeline → OneLake → Fabric Notebook → OneLake → Fabric Warehouse → Power BI
Result: 1 platform, 0 connection configs, 1 billing system
The History: How We Got Here
| Year | What Happened |
|---|---|
| 2018 | Azure Data Factory v2 launched (pipeline orchestration) |
| 2019 | Azure Synapse Analytics announced (unified analytics workspace) |
| 2020 | Synapse GA — combined SQL pools + Spark pools + pipelines |
| 2021 | Databricks partnership deepened — Delta Lake becomes standard |
| 2022 | Microsoft realizes Synapse adoption is slower than expected — too complex |
| May 2023 | Microsoft Fabric announced at Build |
| Nov 2023 | Fabric GA — available to all customers |
| Mar 2025 | DP-203 (Azure Data Engineer) retired, replaced by DP-700 (Fabric Data Engineer) |
| 2025-2026 | Fabric surpasses $2B annual revenue, 31,000+ customers, 60% YoY growth |
| 2026 | Fabric IQ workload previewed — AI-powered natural language analytics |
The key insight: Microsoft tried to unify data services with Synapse (2019), but Synapse still required managing separate SQL pools, Spark pools, and storage accounts. Fabric takes it further — truly unified, truly SaaS, zero infrastructure management.
The Seven Workloads of Fabric
Each workload is designed for a specific persona — but they all share OneLake:
1. Data Factory (Data Engineers)
The evolution of Azure Data Factory. Build pipelines with Copy activities, ForEach loops, and If Conditions, exactly what we built throughout this blog. It also includes Dataflow Gen2: visual transformations built on Power Query (the counterpart to ADF Data Flows, which run on Spark).
Replaces: Azure Data Factory, Synapse Pipelines, ADF Data Flows
2. Data Engineering (Data Engineers)
Spark-based notebooks for PySpark transformations. Create lakehouses (Delta tables organized in Bronze/Silver/Gold). Write the same PySpark code we practiced in Databricks — withColumn, filter, groupBy, window functions, Delta MERGE.
Replaces: Azure Synapse Spark Pools. Also covers much of Azure Databricks functionality.
3. Data Warehouse (Data Analysts, DBAs)
Fully managed, serverless SQL warehouse. Write T-SQL to create tables, views, stored procedures. No provisioning compute — Fabric manages it. Supports the MERGE statement (as of 2025) for SCD patterns.
Replaces: Azure Synapse Dedicated SQL Pool (no more DWU provisioning)
4. Data Science (Data Scientists)
Notebooks with built-in MLflow integration. Train models with scikit-learn, TensorFlow, PyTorch. Register models, track experiments, deploy endpoints.
Competes with: Azure Databricks ML, Azure Machine Learning
5. Real-Time Intelligence (Streaming Engineers)
Process streaming data from Event Hubs, Kafka, and IoT sources. Build real-time dashboards and alerts. Includes KQL (Kusto Query Language) databases for time-series analytics.
Replaces: Azure Stream Analytics, Azure Data Explorer
6. Power BI (Business Analysts)
Fully integrated business intelligence. The game-changer is Direct Lake mode: Power BI reads Delta tables directly from OneLake without importing data, so dashboards stay current without scheduled import refreshes.
Replaces: Power BI Premium capacity (now included in Fabric)
7. OneLake (Everyone)
The single storage layer underneath everything. Built on ADLS Gen2 with Delta Lake as the native format. Every Fabric workspace automatically gets OneLake storage — no setup required.
Replaces: Azure Data Lake Storage Gen2 (as a separately managed service), Azure Blob Storage for analytics
Real-life analogy: OneLake is like OneDrive for data. Just as every Microsoft 365 user gets OneDrive automatically (no storage account to create, no access keys to manage), every Fabric workspace gets OneLake automatically. You just start storing data. The platform handles the rest.
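To make the analogy concrete, here is a minimal sketch of how the same table is addressed in each setup. The URI shapes follow the ABFS conventions for ADLS Gen2 and OneLake as I understand them; the account, container, workspace, and lakehouse names are hypothetical placeholders, not values from our pipelines.

```python
def adls_table_uri(account: str, container: str, path: str) -> str:
    """ABFS URI for a folder in a self-managed ADLS Gen2 storage account."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.strip('/')}"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """ABFS URI for a managed Delta table inside a Fabric lakehouse."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )


# Hypothetical names, for illustration only.
print(adls_table_uri("mystorageacct", "silver", "/sales/orders/"))
print(onelake_table_uri("SalesWorkspace", "SalesLakehouse", "orders"))
```

Notice that the OneLake URI has no storage account name. The workspace takes the place of the container, and the endpoint is the same for every workspace.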
What Fabric Replaces (The Mapping Table)
| Azure Service We Used | What It Does | Fabric Replacement |
|---|---|---|
| Azure Data Factory | Pipeline orchestration, data movement | Fabric Data Factory (Pipelines + Dataflow Gen2) |
| ADF Data Flows | Visual transformations | Dataflow Gen2 (Power Query based) |
| Azure Synapse Analytics | SQL pools + Spark pools + pipelines | Fabric Data Engineering + Data Warehouse |
| Synapse Dedicated SQL Pool | Data warehousing (provisioned DWUs) | Fabric Data Warehouse (serverless, auto-managed) |
| Synapse Spark Pools | PySpark notebooks | Fabric Data Engineering (Spark notebooks) |
| Azure Data Lake Gen2 | Storage for raw/processed data | OneLake (automatic, zero config) |
| Azure Databricks (partial) | Notebooks, Delta Lake, transformations | Fabric Data Engineering (notebooks + lakehouse) |
| Power BI Premium | Dashboards and reporting | Power BI in Fabric (with Direct Lake) |
| Azure Purview | Data governance, lineage, catalog | Purview built into Fabric |
| Azure Stream Analytics | Real-time processing | Fabric Real-Time Intelligence |
| Azure Machine Learning | ML model training | Fabric Data Science |
| Key Vault + Secret Scopes | Credential management | Simplified (OneLake handles auth natively) |
| Linked Services / Connection Strings | Connecting tools together | Not needed (everything shares OneLake) |
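If you want to run this mapping against your own service inventory, a small lookup is enough. The sketch below copies a subset of the rows from the table above into a dictionary; the function name `plan_migration` and the sample inventory are made up for illustration.

```python
# Subset of the mapping table above, keyed by the Azure service name.
FABRIC_EQUIVALENT = {
    "Azure Data Factory": "Fabric Data Factory (Pipelines + Dataflow Gen2)",
    "ADF Data Flows": "Dataflow Gen2",
    "Synapse Dedicated SQL Pool": "Fabric Data Warehouse",
    "Synapse Spark Pools": "Fabric Data Engineering",
    "Azure Data Lake Gen2": "OneLake",
    "Power BI Premium": "Power BI in Fabric",
    "Azure Stream Analytics": "Fabric Real-Time Intelligence",
    "Azure Machine Learning": "Fabric Data Science",
}


def plan_migration(services_in_use):
    """Split an inventory into services with a Fabric equivalent and the rest."""
    mapped = {s: FABRIC_EQUIVALENT[s] for s in services_in_use if s in FABRIC_EQUIVALENT}
    unmapped = [s for s in services_in_use if s not in FABRIC_EQUIVALENT]
    return mapped, unmapped


# Hypothetical inventory: Apache Kafka has no row in the table, so it stays unmapped.
mapped, unmapped = plan_migration(["Azure Data Factory", "Azure Data Lake Gen2", "Apache Kafka"])
print(mapped)
print(unmapped)
```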
How Our Blog Pipelines Would Look in Fabric
Metadata-Driven Pipeline
Azure (what we built):
ADF → Linked Service → Azure SQL → Copy Activity → ADLS Gen2 → Parquet files
6 configuration steps, 2 linked services, parameterized datasets
Fabric equivalent:
Fabric Pipeline → Copy Activity → OneLake Lakehouse → Delta tables
2 configuration steps, no linked services (OneLake is automatic)
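The metadata-driven idea itself does not change between the two platforms. Here is a rough sketch, in plain Python rather than pipeline JSON, of how a control table can drive one Copy task per source table. The field names (`load_type`, `write_mode`) and the `bronze_` naming rule are assumptions for illustration, not Fabric settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTable:
    schema: str
    name: str
    load_type: str  # "full" or "incremental" (hypothetical control-table column)


def build_copy_plan(tables):
    """Turn metadata rows into one copy task per table, as a ForEach loop would."""
    plan = []
    for t in tables:
        plan.append({
            "source_query": f"SELECT * FROM [{t.schema}].[{t.name}]",
            "destination_table": f"bronze_{t.schema}_{t.name}".lower(),
            "write_mode": "overwrite" if t.load_type == "full" else "append",
        })
    return plan


# Hypothetical control-table rows.
metadata = [
    SourceTable("dbo", "Customers", "full"),
    SourceTable("Sales", "Orders", "incremental"),
]
for task in build_copy_plan(metadata):
    print(task)
```

Adding a new source table means adding one metadata row. The loop and the Copy activity stay untouched, which is the whole point of the pattern.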
SCD Type 2 Pipeline
Azure:
ADF pipeline → ADLS Gen2 Parquet → Synapse Data Flow (Lookup + Conditional Split +
Alter Row + 3 Sinks) → Azure SQL dimension table
Fabric:
Fabric Pipeline → Fabric Notebook (PySpark Delta MERGE) → OneLake Delta table
Same PySpark MERGE code we wrote, just running in Fabric instead of Databricks
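As a reminder of what that MERGE has to accomplish, here is the SCD Type 2 logic sketched in plain Python on lists of dictionaries. This is not the Delta Lake API, only the row-level behavior the MERGE implements. The column names (`is_current`, `start_date`, `end_date`) and the sample customers are assumptions for illustration.

```python
from datetime import date


def scd2_merge(dimension, incoming, key, tracked, today):
    """Return a new dimension with SCD Type 2 history applied."""
    result = [dict(row) for row in dimension]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["is_current"]}

    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[col] == new[col] for col in tracked):
            continue  # no change in tracked columns: keep the current version
        if old is not None:
            # Changed: close the old version instead of overwriting it.
            old["is_current"] = False
            old["end_date"] = today
        version = {**new, "start_date": today, "end_date": None, "is_current": True}
        result.append(version)
        current[new[key]] = version
    return result


# Hypothetical data: customer 1 moves city, customer 2 is new.
dim = [{"customer_id": 1, "city": "Toronto", "start_date": date(2024, 1, 1),
        "end_date": None, "is_current": True}]
updates = [{"customer_id": 1, "city": "Ottawa"}, {"customer_id": 2, "city": "Calgary"}]

for row in scd2_merge(dim, updates, "customer_id", ["city"], date(2025, 6, 1)):
    print(row)
```

The Toronto row is expired rather than overwritten, and both Ottawa and Calgary arrive as new current rows. That expire-then-insert behavior is what makes the pattern Type 2.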
Medallion Architecture
Azure:
ADF → ADLS Gen2/bronze/ → Databricks notebooks → ADLS Gen2/silver/ →
Databricks notebooks → ADLS Gen2/gold/ → Power BI (import/DirectQuery)
Fabric:
Fabric Pipeline → OneLake/bronze/ → Fabric Notebook → OneLake/silver/ →
Fabric Notebook → OneLake/gold/ → Power BI (Direct Lake — instant, no import)
The concepts are identical. The code is identical. The plumbing disappears.
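To show how platform-independent the medallion logic is, here is a minimal sketch of the Bronze → Silver → Gold steps in plain Python. In practice these would be PySpark transformations over Delta tables; the column names and sample rows are hypothetical.

```python
from collections import defaultdict


def to_silver(bronze_rows):
    """Clean and deduplicate: drop rows without an id, normalise types, keep the last row per order."""
    latest = {}
    for row in bronze_rows:
        if not row.get("order_id"):
            continue  # reject rows that cannot be identified
        latest[row["order_id"]] = {
            "order_id": row["order_id"],
            "country": row["country"].strip().upper(),
            "amount": float(row["amount"]),
        }
    return list(latest.values())


def to_gold(silver_rows):
    """Aggregate to a business-level metric: total amount per country."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


# Hypothetical raw (bronze) records, exactly as they might land from a source system.
bronze = [
    {"order_id": "A1", "country": " ca ", "amount": "10.5"},
    {"order_id": "A1", "country": "CA", "amount": "12.0"},  # later version of A1
    {"order_id": None, "country": "US", "amount": "5"},     # bad row
    {"order_id": "B2", "country": "us", "amount": "3"},
]
print(to_gold(to_silver(bronze)))
```

Only the read and write locations differ between the Azure stack and Fabric. The cleaning and aggregation steps are the same.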
Fabric vs Databricks: Compete or Coexist?
This is the most common question. The answer is: they compete in some areas and coexist in others.
| Capability | Fabric | Databricks |
|---|---|---|
| Spark notebooks | Yes (built-in) | Yes (core product) |
| Delta Lake | Native (default format) | Native (they invented it) |
| SQL Warehouse | Built-in (serverless) | Databricks SQL Warehouse |
| ML/AI | Basic (Data Science workload) | Advanced (MLflow, Feature Store, Model Serving) |
| Pipeline orchestration | Built-in (Data Factory) | Databricks Workflows + external (ADF) |
| Streaming | Built-in (Real-Time Intelligence) | Spark Structured Streaming |
| Multi-cloud | Azure only | Azure, AWS, GCP |
| Unity Catalog equivalent | OneLake catalog + Purview | Unity Catalog |
| Power BI integration | Native (Direct Lake) | Via connector (import/DirectQuery) |
| Pricing model | Capacity Units (CU) — pay for platform | DBUs — pay per compute |
| Infrastructure control | None (fully managed SaaS) | Full control (cluster sizes, spot instances, auto-scaling) |
| Open source ecosystem | Microsoft-managed | Deep open source (MLflow, Delta, Spark) |
| Best for | BI-heavy, Microsoft-ecosystem companies | ML-heavy, multi-cloud, advanced engineering |
Where Fabric Wins
- All-in-one simplicity — no infrastructure to manage, no services to connect
- Power BI integration — Direct Lake mode is transformative for BI teams
- Microsoft ecosystem — natural fit for companies already on Office 365, Teams, SharePoint
- Governance — Purview built in, not an add-on
- Cost predictability — one capacity covers everything
Where Databricks Wins
- Multi-cloud — runs identically on Azure, AWS, and GCP (Fabric is Azure-only)
- Advanced ML — MLflow, Feature Store, Model Serving, AutoML are more mature
- Infrastructure control — tune cluster sizes, spot instances, Photon engine
- Open source — deeper integration with the Spark/Delta/MLflow ecosystem
- Large-scale engineering — more performant for petabyte-scale complex transformations
- Flexibility — not locked into Microsoft ecosystem
The Reality: Many Companies Use Both
Fabric: Data ingestion → Bronze → Silver → Gold → Power BI dashboards
Databricks: Advanced ML models → Feature engineering → Model serving → Real-time scoring
Fabric handles the “data platform” workload. Databricks handles the “data science / ML” workload. They share data through OneLake shortcuts or ADLS Gen2.
Fabric vs Snowflake
| Capability | Fabric | Snowflake |
|---|---|---|
| Primary strength | Unified platform (ETL + warehouse + BI) | Cloud data warehouse + data sharing |
| SQL | T-SQL in Data Warehouse | Snowflake SQL (ANSI-compliant) |
| Spark | Built-in notebooks | Via Snowpark, a Spark-like DataFrame API (newer, less mature) |
| Pipelines | Built-in Data Factory | External (dbt, Airflow, Fivetran) |
| BI | Power BI (Direct Lake) | External (Tableau, Looker, Power BI) |
| Multi-cloud | Azure only | AWS, Azure, GCP |
| Data sharing | OneLake shortcuts | Native data sharing (industry leader) |
| Pricing | Capacity-based (CU) | Compute + storage (credits) |
| Best for | Microsoft-ecosystem end-to-end | SQL-heavy, cross-cloud data sharing |
Fabric vs AWS (Glue, Redshift, EMR)
| Capability | Fabric | AWS Equivalent |
|---|---|---|
| Pipeline orchestration | Fabric Data Factory | AWS Glue + Step Functions |
| Data warehouse | Fabric Data Warehouse | Amazon Redshift |
| Spark processing | Fabric Data Engineering | AWS EMR / Glue Spark |
| Storage | OneLake | Amazon S3 + Lake Formation |
| Streaming | Real-Time Intelligence | Amazon Kinesis + MSK |
| BI | Power BI (built-in) | Amazon QuickSight (separate) |
| Governance | Purview (built-in) | AWS Lake Formation + Glue Catalog |
| ML | Fabric Data Science | Amazon SageMaker |
The key difference: AWS requires assembling 6-8 separate services. Fabric provides everything in one platform.
Real-life analogy: AWS data engineering is like building a custom PC from individual components — motherboard from one vendor, GPU from another, RAM from a third. Maximum flexibility, maximum configuration effort. Fabric is like buying an iMac — everything integrated, everything works together out of the box, less customization but dramatically simpler.
The Licensing Model: Capacity Units (CUs)
Fabric uses a capacity-based pricing model. You buy a Fabric capacity (measured in Capacity Units), and all workloads share that capacity:
| Capacity | CUs | Approximate Monthly Cost (USD) | Best For |
|---|---|---|---|
| F2 | 2 | ~$260 | Individual learning/dev |
| F4 | 4 | ~$520 | Small team |
| F8 | 8 | ~$1,040 | Department |
| F16 | 16 | ~$2,080 | Multiple teams |
| F64 | 64 | ~$8,320 | Enterprise |
| F128+ | 128+ | ~$16,640+ | Large enterprise |
Key advantage: One bill covers pipelines, notebooks, SQL queries, streaming, and Power BI. In the traditional Azure stack, each service has its own billing meter.
Key risk: If one workload (for example, a heavy Spark job) consumes all the capacity, other workloads (such as Power BI queries) slow down. Capacity management is a real skill in Fabric.
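The prices in the table are linear: every row works out to roughly $130 per CU per month. A small sketch of that arithmetic follows, using only the sizes and approximate prices listed above. Actual pricing varies by region and purchase model, so treat the constant as an assumption taken from this table.

```python
USD_PER_CU_MONTH = 130  # implied by the table: 260 / 2 == 8320 / 64 == 130
SKU_SIZES = [2, 4, 8, 16, 64, 128]  # only the sizes listed in the table above


def monthly_cost(cus: int) -> int:
    """Approximate monthly cost in USD for a capacity of the given size."""
    return cus * USD_PER_CU_MONTH


def smallest_sku(required_cus: int) -> str:
    """Pick the smallest listed capacity that covers the required CUs."""
    for cus in SKU_SIZES:
        if cus >= required_cus:
            return f"F{cus}"
    raise ValueError("requirement exceeds the largest capacity in the table")


# Hypothetical requirement: a team estimates it needs about 10 CUs at peak.
sku = smallest_sku(10)
print(sku, monthly_cost(int(sku[1:])))
```

Because all workloads draw from the same pool, the sizing input should be peak combined demand, not the demand of any single workload.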
Direct Lake Mode: Why Power BI Teams Love Fabric
This is the feature that makes BI teams push for Fabric adoption:
Traditional Power BI:
Data Lake → Import into Power BI dataset (scheduled refresh every 30 min)
Problem: Data can be up to 30 minutes old. Import consumes memory, and large datasets can exceed capacity limits.
Power BI with DirectQuery:
Data Lake → Power BI queries the source directly on every dashboard click
Problem: Slow. Every click = a query to the warehouse.
Power BI with Direct Lake (Fabric):
OneLake Delta table → Power BI reads Parquet files directly from OneLake
Result: Always up-to-date. No import. No slow queries. Best of both worlds.
Direct Lake combines the speed of import mode with the freshness of DirectQuery. It reads Delta/Parquet files directly from OneLake using the in-memory VertiPaq engine. Dashboards are always fresh, and performance is near-instant.
Fabric IQ and AI Integration (2026)
The April 2026 update brings capabilities across the platform including deeper VS Code integration, enhanced notebook resiliency, expanded machine learning and governance features, and new real-time data processing capabilities.
At FabCon 2026, Microsoft made it clear that the future of data platforms is AI-powered decision systems. With over 30,000 customers and rapid adoption, Fabric is becoming a central piece in modern data strategy. The big shift is that Fabric is evolving from a data platform into a business intelligence engine that thinks, learns, and acts.
Fabric IQ is a new AI-powered workload that allows natural language interaction with your data — ask questions in plain English and get answers from your OneLake data. This signals where Microsoft is heading: making data accessible to non-technical users through AI.
The DP-700 Certification
Microsoft retired the DP-203 (Azure Data Engineer Associate) exam on March 31, 2025, and replaced it with DP-700 (Fabric Data Engineer Associate). This is a clear signal that Microsoft considers Fabric the future of data engineering on Azure.
What DP-700 tests:
- OneLake architecture and lakehouses
- Fabric Data Factory (pipelines and dataflows)
- Fabric notebooks (PySpark, Delta Lake)
- Data warehouse operations (T-SQL)
- Medallion architecture in Fabric
- Security and governance in Fabric
The good news: Every concept we learned — ADF pipelines, Spark transformations, Delta Lake, SCD patterns, medallion architecture, CI/CD — is exactly what DP-700 tests. The concepts transfer directly. Only the platform wrapper changes.
Who Should Use Fabric and When
Use Fabric When
- Your company is already in the Microsoft ecosystem (Office 365, Azure AD, Power BI)
- Your primary consumers are Power BI dashboards and Excel reports
- You want a single platform without managing multiple services
- Your data team includes analysts who prefer SQL and visual tools over code
- You need governance (Purview) built in from day one
- Budget predictability is important (one capacity bill)
Stay with Databricks / Traditional Azure When
- You need multi-cloud (AWS + Azure + GCP)
- You have advanced ML workloads (model serving, feature stores)
- You need full control over compute (cluster sizing, spot instances)
- Your team is deeply invested in the open-source Spark ecosystem
- You need petabyte-scale performance tuning
Use Both When
- Fabric for data platform (ingestion, warehousing, BI)
- Databricks for advanced ML and data science
- They share data through OneLake shortcuts
What Fabric Does NOT Replace
| Tool | Why Fabric Cannot Replace It |
|---|---|
| Databricks (fully) | Multi-cloud, advanced ML, infrastructure control, Photon engine |
| Snowflake | Cross-cloud data sharing, Snowflake SQL ecosystem, multi-cloud deployment |
| Apache Kafka | Dedicated streaming platform with broader ecosystem |
| dbt | SQL-based transformation framework with version control (works WITH Fabric) |
| Airflow | Complex workflow orchestration beyond what Fabric pipelines offer |
| Terraform/Bicep | Infrastructure as Code for Azure resource management |
| Third-party ETL (Fivetran, Airbyte) | 300+ pre-built source connectors |
Migration Path: From Azure Stack to Fabric
If you are currently using the Azure tools we learned in this blog, here is the migration path:
Phase 1: Enable Fabric trial (free for 60 days)
Phase 2: Create a Fabric workspace and OneLake lakehouse
Phase 3: Migrate ADF pipelines → Fabric Pipelines (near-identical UI)
Phase 4: Migrate Synapse notebooks → Fabric Notebooks (same PySpark code)
Phase 5: Migrate Power BI datasets → Direct Lake mode
Phase 6: Retire Synapse workspace and separate ADLS accounts
The migration is NOT a rewrite. ADF pipelines look almost identical in Fabric. PySpark notebooks run the same code. Delta tables are the same format. The hardest part is restructuring storage from separate ADLS accounts to OneLake lakehouses.
Common Misconceptions
- “Fabric replaces Databricks” — it competes in some areas (notebooks, Delta) but Databricks remains stronger for multi-cloud, advanced ML, and large-scale engineering. Many companies use both.
- “I need to relearn everything for Fabric” — if you know ADF pipelines, PySpark, Delta Lake, and SQL, you already know 90% of Fabric. The concepts are identical.
- “Fabric is free with my Microsoft license” — Fabric requires a separate Fabric capacity purchase. Power BI Pro is included in some Microsoft 365 plans, but Fabric capacity is additional.
- “OneLake replaces ADLS Gen2” — OneLake is BUILT ON ADLS Gen2. It is a governance and simplification layer on top, not a replacement of the underlying technology.
- “Fabric is only for small companies” — with 31,000+ customers and $2B+ revenue, Fabric is used by enterprises globally. The capacity model scales from F2 (individual) to F128+ (large enterprise).
Interview Questions
Q: What is Microsoft Fabric?
A: A unified SaaS analytics platform that combines data ingestion (Data Factory), engineering (Spark notebooks), warehousing (SQL), data science (ML), real-time analytics, and business intelligence (Power BI) into a single product. All workloads share OneLake as a single storage layer, eliminating the need for separate storage accounts, linked services, and credential management between tools.
Q: What is OneLake and why is it important?
A: OneLake is Fabric’s unified storage layer, built on ADLS Gen2 with Delta Lake as the native format. Every Fabric workspace automatically gets OneLake storage — no setup required. All workloads read and write to OneLake natively, eliminating connection strings, linked services, and access keys between tools. It is to data what OneDrive is to files.
Q: How does Fabric compare to Databricks?
A: Fabric is a unified SaaS platform focused on simplicity and Microsoft ecosystem integration. Databricks provides more infrastructure control, multi-cloud portability (Azure, AWS, GCP), and advanced ML capabilities. Fabric excels at BI-heavy workloads with Direct Lake mode. Databricks excels at large-scale engineering and ML. Many enterprises use both.
Q: What is Direct Lake mode in Fabric?
A: A Power BI feature that reads Delta/Parquet files directly from OneLake without importing data. It combines the speed of import mode (in-memory VertiPaq engine) with the freshness of DirectQuery (always current data). Dashboards are always up-to-date with near-instant performance.
Q: What happened to the DP-203 certification?
A: Microsoft retired DP-203 (Azure Data Engineer Associate) on March 31, 2025, and replaced it with DP-700 (Fabric Data Engineer Associate). This signals that Microsoft considers Fabric the future of Azure data engineering. The concepts tested are the same — pipelines, Spark, Delta Lake, SCD, medallion architecture — just within the Fabric platform.
Q: Should I learn Fabric if I already know Azure Data Factory and Databricks?
A: Absolutely. Everything you know transfers directly — ADF pipelines become Fabric pipelines, PySpark code runs the same in Fabric notebooks, Delta Lake is the default format. The learning curve is minimal because the concepts are identical. The DP-700 certification validates Fabric skills and is now the primary Azure data engineering certification.
Wrapping Up
Microsoft Fabric is not a revolution — it is an evolution. It takes the tools we have been learning throughout this entire blog — ADF, Synapse, Databricks, ADLS Gen2, Delta Lake, Power BI — and unifies them into a single platform that eliminates the plumbing between services.
The concepts you have learned are exactly what Fabric uses: pipeline orchestration, PySpark transformations, Delta Lake operations, SCD patterns, medallion architecture, and data quality. The only difference is that Fabric wraps it all in one platform, removes the integration overhead, and adds OneLake as the universal storage layer.
Learn the fundamentals (which you have). Then learn Fabric (which is those same fundamentals in a unified wrapper). That is the career path Microsoft is building for data engineers.
Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.