Fabric Data Factory: Activities, Pipelines, Dataflow Gen2, Notebooks, and Building Production ETL in Microsoft Fabric

You know Azure Data Factory. You have built metadata-driven pipelines, incremental loads, SCD pipelines, and CI/CD deployments. Now you open Fabric Data Factory and think: “This looks familiar… but different.”

It IS familiar — about 90% of what you know transfers directly. But Fabric Data Factory removes some complexity (no more datasets, no more linked services), adds new capabilities (Teams notifications, semantic model refresh, Dataflow Gen2 as a pipeline activity), and integrates with OneLake so tightly that connecting to storage is no longer a configuration exercise.

This post covers everything: every activity available, how to build and schedule pipelines, what is new compared to ADF/Synapse, and three complete pipeline examples including combining Dataflow Gen2 and notebooks inside pipelines. Think of it as the bridge between your ADF knowledge and Fabric.

Think of Fabric Data Factory like moving from a manual-transmission car (ADF) to an automatic (Fabric). The driving fundamentals are identical — steering, braking, accelerating. But the gear shifting (dataset configuration, linked service management, integration runtime setup) is now automatic. You focus on WHERE to drive (pipeline logic), not HOW the transmission works (infrastructure plumbing).

Table of Contents

  • What Is Fabric Data Factory?
  • What Changed from ADF to Fabric Data Factory
  • No More Datasets (The Biggest Change)
  • No More Linked Services (Connections Instead)
  • Creating Your First Pipeline
  • All Pipeline Activities in Fabric Data Factory
  • Data Movement Activities
  • Transformation Activities
  • Control Flow Activities
  • Notification Activities (NEW in Fabric)
  • Fabric-Specific Activities (NEW)
  • Pipeline Parameters and Variables
  • Expressions and Dynamic Content
  • Scheduling Pipelines
  • Pipeline Example 1: Copy from SQL to Lakehouse
  • Pipeline Example 2: Metadata-Driven Multi-Table Load
  • Pipeline Example 3: Full ETL with Dataflow Gen2 + Notebook
  • Dataflow Gen2: What It Is and When to Use It
  • Dataflow Gen2 vs ADF Mapping Data Flows
  • Using Dataflow Gen2 Inside a Pipeline
  • Using Notebooks Inside a Pipeline
  • Combining Dataflow Gen2 + Notebook in One Pipeline
  • Monitoring Pipelines
  • Error Handling Patterns
  • Fabric Data Factory vs ADF Feature Mapping
  • When to Use Pipeline vs Dataflow vs Notebook
  • Common Mistakes
  • Interview Questions
  • Wrapping Up

What Is Fabric Data Factory?

Fabric Data Factory is the pipeline orchestration and data integration service inside Microsoft Fabric. It handles data movement (copying data from sources to destinations) and data orchestration (running activities in sequence, parallel, or conditionally).

Fabric Data Factory
  │
  ├── Pipelines — Orchestrate activities (Copy, ForEach, If, Notebook, Dataflow)
  │
  ├── Dataflow Gen2 — Visual no-code transformations (Power Query based)
  │
  └── Both write to OneLake natively (no linked service needed)

What Changed from ADF to Fabric Data Factory

Feature | ADF / Synapse | Fabric Data Factory
Datasets | Required (define table schema + connection) | Removed; defined inline in the Copy activity
Linked Services | Required (connection strings, auth) | Replaced by Connections (simpler, reusable)
Mapping Data Flows | Visual Spark-based transformations | Replaced by Dataflow Gen2 (Power Query based)
Integration Runtime | Azure IR, SHIR, SSIS IR | Simplified; managed by Fabric capacity
Storage connection | Manual (access key, SAS, MI on ADLS) | Automatic for OneLake (zero config)
Notifications | External (Logic Apps, Azure Functions) | Built-in (Teams, Outlook activities)
Power BI refresh | External (REST API call) | Built-in (Semantic Model Refresh activity)
Monitoring | ADF Monitor hub (per factory) | Fabric Monitoring Hub (cross-workspace)
Billing | Per activity run + DIU hours | Included in Fabric capacity (CU based)
Git/CI/CD | ADF Git integration → ARM templates | Fabric deployment pipelines + Git
SSIS support | Azure-SSIS IR | Not available yet

No More Datasets (The Biggest Change)

In ADF, you needed to create a Dataset for every source and sink — defining the table, schema, format, and linked service. For 20 tables, that meant 40 datasets (20 source + 20 sink).

ADF (old way):
  Step 1: Create Linked Service → Azure SQL Database (connection string)
  Step 2: Create Dataset → DS_SQL_Customer (table=Customer, linked service=above)
  Step 3: Create Linked Service → ADLS Gen2 (access key)
  Step 4: Create Dataset → DS_ADLS_Customer (path=/bronze/customer/, format=parquet)
  Step 5: Create Copy Activity → source=DS_SQL_Customer, sink=DS_ADLS_Customer

  For 20 tables: 2 linked services + 40 datasets + 20 copy activities = 62 objects

Fabric (new way):
  Step 1: Create Connection → Azure SQL (once, reusable)
  Step 2: Create Copy Activity → source=SQL table (inline), sink=Lakehouse table (inline)

  For 20 tables: 1 connection + 20 copy activities = 21 objects
  No datasets at all. Table, schema, format defined INSIDE the Copy activity.
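
The object counts above are simple arithmetic, and it can help to see how they scale with the number of tables. Here is a minimal Python sketch, assuming one source dataset and one sink dataset per table in ADF (no parameterized datasets) and a single shared connection in Fabric. The function names are hypothetical and exist only for this illustration:

```python
def adf_object_count(tables: int, linked_services: int = 2) -> int:
    """Objects to maintain in ADF for a simple per-table copy design."""
    datasets = 2 * tables        # one source dataset + one sink dataset per table
    copy_activities = tables     # one Copy activity per table
    return linked_services + datasets + copy_activities


def fabric_object_count(tables: int, connections: int = 1) -> int:
    """Objects to maintain in Fabric: connections plus one Copy activity per table."""
    return connections + tables  # no datasets at all


if __name__ == "__main__":
    for n in (1, 20, 100):
        print(f"{n} tables: ADF={adf_object_count(n)}, Fabric={fabric_object_count(n)}")
```

The gap widens linearly: every extra table costs three objects in this ADF design and one in Fabric.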

Real-life analogy: In ADF, ordering food required filling out a form for each dish (dataset): “Form #1: Dish=Pizza, Size=Large, Kitchen=Italian, Delivery=Table 5.” In Fabric, you just tell the waiter directly: “Large pizza to table 5.” Same outcome, less paperwork.

No More Linked Services (Connections Instead)

Linked Services are replaced by Connections — simpler, workspace-level, and reusable across all items:

ADF Linked Service (old):
  Name: LS_AzureSqlDatabase_Dev
  Type: Azure SQL Database
  Connection string: Server=tcp:server.database.windows.net,1433;Database=AdventureWorksLT;...
  Authentication: SQL Auth / Managed Identity
  Integration Runtime: AutoResolveIR

Fabric Connection (new):
  Name: SQL_AdventureWorks
  Type: Azure SQL Database
  Server: server.database.windows.net
  Database: AdventureWorksLT
  Auth: Organizational account / Service Principal
  (No integration runtime selection — managed by Fabric)

Connections are managed under Settings > Manage connections and gateways, or created inline when you configure a Copy activity.

Creating Your First Pipeline

Step by Step

  1. Open your Fabric workspace
  2. Click + New item → Data pipeline
  3. Name it: PL_Copy_Customers
  4. The pipeline canvas opens (looks very similar to ADF)

The Canvas

┌─────────────────────────────────────────────────────────┐
│  Pipeline: PL_Copy_Customers                             │
│                                                          │
│  ┌──────────┐    ┌──────────┐    ┌──────────────────┐   │
│  │ Copy     │───>│ Notebook │───>│ Semantic Model   │   │
│  │ Activity │    │ Activity │    │ Refresh          │   │
│  └──────────┘    └──────────┘    └──────────────────┘   │
│                                                          │
│  Activities panel (left): Copy, ForEach, If, etc.        │
│  Properties panel (bottom): Source, Sink, Mapping        │
└─────────────────────────────────────────────────────────┘

All Pipeline Activities in Fabric Data Factory

Data Movement Activities

Activity | What It Does | ADF Equivalent
Copy Data | Move data from source to destination | Copy Activity (identical concept)

The Copy activity is the workhorse — same as ADF. Configure source (SQL, ADLS, REST API, files) and sink (Lakehouse, Warehouse, ADLS, SQL).

Transformation Activities

Activity | What It Does | ADF Equivalent
Dataflow Gen2 | Visual Power Query transformations inside pipeline | Mapping Data Flow
Notebook | Run a Fabric Spark notebook | Databricks Notebook activity
Stored Procedure | Execute SQL stored procedure | Stored Procedure activity
Script | Run inline SQL script | Script activity
Spark Job Definition | Run a Spark job definition (batch Spark job) | N/A (new)

Control Flow Activities

Activity | What It Does | ADF Equivalent
ForEach | Loop over a collection | ForEach (identical)
If Condition | Branch based on true/false expression | If Condition (identical)
Switch | Branch based on multiple values | Switch (identical)
Until | Loop until condition is true | Until (identical)
Wait | Pause for specified duration | Wait (identical)
Set Variable | Set a pipeline variable value | Set Variable (identical)
Append Variable | Add value to an array variable | Append Variable (identical)
Filter | Filter items in an array | Filter (identical)
Lookup | Query a data source and return results | Lookup (identical)
Get Metadata | Get file/folder metadata (size, count, exists) | Get Metadata (identical)
Fail | Intentionally fail the pipeline with a message | Fail (identical)
Execute Pipeline | Call another pipeline | Execute Pipeline (identical)
Web | Make HTTP REST API calls | Web activity (identical)
Webhook | Call a webhook and wait for callback | Webhook (identical)

Notification Activities (NEW in Fabric)

Activity | What It Does | ADF Equivalent
Office 365 Outlook | Send email from your Outlook account | Not in ADF (required Logic Apps)
Teams | Post message to a Teams channel | Not in ADF (required webhooks)

These are game-changers. In ADF, sending a pipeline failure email required a Logic App, a webhook, or a custom Azure Function. In Fabric, it is a drag-and-drop activity.

Fabric-Specific Activities (NEW)

Activity | What It Does | ADF Equivalent
Semantic Model Refresh | Trigger a Power BI semantic model refresh | Not in ADF (required REST API)
KQL | Run a KQL query against an Eventhouse | N/A

Pipeline Parameters and Variables

Parameters (Input values — set when pipeline runs)

Pipeline Parameters:
  Name: source_table     Type: String    Default: SalesLT.Customer
  Name: target_folder    Type: String    Default: bronze/customers
  Name: load_type        Type: String    Default: FULL

Access in expressions: @pipeline().parameters.source_table

Variables (Internal values — change during execution)

Pipeline Variables:
  Name: row_count         Type: String    Default: 0
  Name: error_message     Type: String    Default: 
  Name: table_list        Type: Array     Default: []

Set with Set Variable activity: @string(activity('Lookup_Config').output.count)
(row_count is a String variable, so the integer count is converted with string() first)

Expressions and Dynamic Content

Fabric uses the same expression language as ADF:

# Pipeline parameter
@pipeline().parameters.source_table

# Activity output
@activity('Lookup_Config').output.value
@activity('Copy_Data').output.rowsCopied

# Current item in ForEach
@item().TableName

# System variables
@pipeline().RunId
@pipeline().Pipeline
@utcNow()

# String functions
@concat('bronze/', pipeline().parameters.source_table, '/')
@replace(item().TableName, ' ', '_')
@toLower(item().SchemaName)

# Date functions
@formatDateTime(utcNow(), 'yyyy/MM/dd')
@adddays(utcNow(), -7)

# Conditional
@if(equals(item().LoadType, 'FULL'), 'Full Load', 'Incremental')

If you know ADF expressions, you know Fabric expressions — they are identical.
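
If you think in Python, it can help to read the expressions above next to their rough Python equivalents. The sketch below is only an analogy for what each expression evaluates to. It is not how Fabric evaluates expressions, and the function names are hypothetical:

```python
from datetime import datetime, timedelta, timezone


def bronze_path(source_table: str) -> str:
    # @concat('bronze/', pipeline().parameters.source_table, '/')
    return "bronze/" + source_table + "/"


def safe_table_name(table_name: str) -> str:
    # @replace(item().TableName, ' ', '_')
    return table_name.replace(" ", "_")


def date_folder(now: datetime) -> str:
    # @formatDateTime(utcNow(), 'yyyy/MM/dd')
    return now.strftime("%Y/%m/%d")


def week_ago(now: datetime) -> datetime:
    # @adddays(utcNow(), -7)
    return now + timedelta(days=-7)


def load_label(load_type: str) -> str:
    # @if(equals(item().LoadType, 'FULL'), 'Full Load', 'Incremental')
    return "Full Load" if load_type == "FULL" else "Incremental"


if __name__ == "__main__":
    now = datetime(2026, 5, 25, 2, 0, tzinfo=timezone.utc)  # stand-in for utcNow()
    print(bronze_path("SalesLT.Customer"))
    print(safe_table_name("Order Detail"))
    print(date_folder(now))
    print(week_ago(now).isoformat())
    print(load_label("FULL"))
```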

Scheduling Pipelines

Schedule Trigger

  1. Open your pipeline
  2. Click Schedule in the toolbar
  3. Configure:
     • Start date and time: 2026-05-20 02:00 AM
     • Repeat: By the minute / Hourly / Daily / Weekly
     • Time zone: Eastern Standard Time
     • End date: Optional
  4. Click Apply
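
To sanity-check a schedule before applying it, you can list the run times it should produce. This is a minimal Python sketch of a fixed-interval schedule with an optional end date. It ignores time zones and daylight saving time, which the Fabric scheduler handles for you, and planned_runs is a hypothetical helper, not a Fabric API:

```python
from datetime import datetime, timedelta
from typing import List, Optional


def planned_runs(start: datetime, every: timedelta, count: int,
                 end: Optional[datetime] = None) -> List[datetime]:
    """Return up to `count` run times, starting at `start`, spaced by `every`."""
    runs: List[datetime] = []
    current = start
    # Stop when enough runs are collected or the optional end date has passed
    while len(runs) < count and (end is None or current <= end):
        runs.append(current)
        current += every
    return runs


if __name__ == "__main__":
    first = datetime(2026, 5, 20, 2, 0)
    for run in planned_runs(first, timedelta(days=1), count=3):
        print(run.isoformat())
```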

Event-Based Trigger

Fabric supports file arrival triggers natively:

  1. Pipeline settings → Add trigger
  2. Type: File event
  3. Configure: OneLake path, file pattern, debounce time

When a new file lands in the specified OneLake path, the pipeline runs automatically.

Pipeline Example 1: Copy from SQL to Lakehouse

The simplest pipeline — copy one table from Azure SQL to a Fabric Lakehouse:

Pipeline: PL_Copy_Customers
  │
  Copy Activity: Copy_Customers
    Source: Azure SQL Database → SalesLT.Customer (inline, no dataset)
    Sink: Lakehouse → Tables → customers (Delta format, auto)
    Mapping: Auto-map columns

Step by Step

  1. Drag Copy Data activity onto the canvas
  2. Source tab:
     • Connection: Select or create Azure SQL connection
     • Table: SalesLT.Customer (browse or type)
  3. Destination tab:
     • Data store: Lakehouse (select your lakehouse)
     • Table: customers
     • Table action: Overwrite or Append
  4. Mapping tab: Click Import schemas → auto-maps all columns
  5. Click Run to test

That is it. No dataset to create. No linked service to configure. No integration runtime to select. The Copy activity defines everything inline.

Pipeline Example 2: Metadata-Driven Multi-Table Load

Our classic pattern — load multiple tables from a config table:

Pipeline: PL_Metadata_Load
  │
  Lookup: Lookup_Config
    Query: SELECT * FROM CONFIGTABLE_V2
    │
  ForEach: ForEach_Table
    Items: @activity('Lookup_Config').output.value
    │
    ├── Copy Activity: Copy_Table
    │     Source: Azure SQL → @{item().SchemaName}.@{item().TableName}
    │     Sink: Lakehouse → Tables → @item().FolderName
    │
    └── Notebook Activity: Log_Activity (optional)
          Notebook: /Notebooks/Log_Pipeline_Run
          Parameters: {"table": "@item().TableName", "rows": "@activity('Copy_Table').output.rowsCopied"}

The Key Difference from ADF

In ADF, you needed parameterized datasets: DS_SourceTable_Dynamic with @dataset().SchemaName and @dataset().TableName parameters. In Fabric, you configure the table dynamically INSIDE the Copy activity using expressions — no datasets needed.

Copy Activity Source Configuration (Dynamic)

Source:
  Connection: SQL_AdventureWorks
  Use query: Table
  Schema: @item().SchemaName          ← Dynamic from ForEach
  Table: @item().TableName            ← Dynamic from ForEach

Copy Activity Sink Configuration

Destination:
  Data store: Lakehouse
  Lakehouse: bronze_lakehouse
  Table: @item().TableName            ← Dynamic table name
  Table action: Overwrite
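
To make the Lookup + ForEach pattern concrete, here is a minimal Python sketch that simulates what the pipeline does with the config rows. The rows are made-up sample data shaped like the Lookup output, and choosing Overwrite or Append from a LoadType column is an assumption added for illustration (the configuration above always uses Overwrite):

```python
from typing import Dict, List

# Hypothetical rows, shaped like @activity('Lookup_Config').output.value
CONFIG_ROWS: List[Dict[str, str]] = [
    {"SchemaName": "SalesLT", "TableName": "Customer", "LoadType": "FULL"},
    {"SchemaName": "SalesLT", "TableName": "Product", "LoadType": "FULL"},
    {"SchemaName": "SalesLT", "TableName": "SalesOrderHeader", "LoadType": "INCREMENTAL"},
]


def plan_copies(config_rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Build one copy definition per config row, like one ForEach iteration each."""
    plan = []
    for item in config_rows:  # `item` plays the role of @item()
        plan.append({
            "source": f"{item['SchemaName']}.{item['TableName']}",
            "sink_table": item["TableName"],
            # Assumption for this sketch: full loads overwrite, others append
            "table_action": "Overwrite" if item["LoadType"] == "FULL" else "Append",
        })
    return plan


if __name__ == "__main__":
    for step in plan_copies(CONFIG_ROWS):
        print(step)
```

Adding a table to the load then means adding a row to the config table, not editing the pipeline.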

Pipeline Example 3: Full ETL with Dataflow Gen2 + Notebook

This is the production pattern — a complete Medallion pipeline:

Pipeline: PL_Daily_ETL
  │
  ├── Stage 1: INGEST (Copy Activities)
  │     Copy_Customers: SQL → Lakehouse bronze/customers
  │     Copy_Products: SQL → Lakehouse bronze/products
  │     Copy_Orders: SQL → Lakehouse bronze/orders
  │     (all 3 run in PARALLEL using ForEach with sequential=false)
  │
  ├── Stage 2: TRANSFORM (Dataflow Gen2)
  │     Dataflow_Bronze_to_Silver:
  │       Read bronze/customers → trim, initcap, dedup → write silver/customers
  │       Read bronze/products → filter, cast types → write silver/products
  │       Read bronze/orders → validate, fill nulls → write silver/orders
  │
  ├── Stage 3: ENRICH (Notebook)
  │     Notebook_Build_Gold:
  │       Read silver tables → SCD Type 2 MERGE → gold/dim_customer
  │       Read silver tables → build fact table → gold/fact_orders
  │       Read silver tables → aggregate → gold/agg_daily_revenue
  │
  ├── Stage 4: REFRESH (Semantic Model)
  │     Refresh_PowerBI_Model:
  │       Trigger semantic model refresh → Power BI Direct Lake updates
  │
  └── Stage 5: NOTIFY
        ├── (Success) → Teams: "Daily ETL completed. X rows processed."
        └── (Failure) → Outlook: "ETL FAILED. Check pipeline run ID: @pipeline().RunId"

The DAG

Copy_Customers ──┐
Copy_Products  ──┼──► Dataflow_Bronze_to_Silver ──► Notebook_Build_Gold ──► Refresh_PowerBI
Copy_Orders   ───┘                                                              │
                                                                          ┌─────┴─────┐
                                                                     (Success)    (Failure)
                                                                     Teams msg    Outlook email
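
One way to reason about this DAG is to compute which activities can start together. The sketch below uses Python's standard-library graphlib to group the activities into waves. It models only the success path, and it illustrates dependency ordering rather than how Fabric schedules activities internally:

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each activity maps to the set of activities it waits for
DEPENDENCIES: Dict[str, Set[str]] = {
    "Copy_Customers": set(),
    "Copy_Products": set(),
    "Copy_Orders": set(),
    "Dataflow_Bronze_to_Silver": {"Copy_Customers", "Copy_Products", "Copy_Orders"},
    "Notebook_Build_Gold": {"Dataflow_Bronze_to_Silver"},
    "Refresh_PowerBI": {"Notebook_Build_Gold"},
}


def execution_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group activities into waves; everything in one wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all activities whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)                 # mark the whole wave as finished
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(number, wave)
```

The three Copy activities land in the first wave, which is why they can run in parallel before the Dataflow starts.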

Dataflow Gen2: What It Is and When to Use It

Dataflow Gen2 is the no-code visual transformation tool in Fabric, built on Power Query (the same engine used in Power BI and Excel). You connect to data, apply transformations visually (click, not code), and write results to a Fabric destination.

Dataflow Gen2 Canvas:
  ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────┐
  │  Source   │───>│ Clean    │───>│ Merge    │───>│ Destination  │
  │ (SQL,CSV) │    │ (trim,   │    │ (join    │    │ (Lakehouse,  │
  │           │    │  filter) │    │  tables) │    │  Warehouse)  │
  └──────────┘    └──────────┘    └──────────┘    └──────────────┘

Dataflow Gen2 Supported Destinations

  • Fabric Lakehouse (Delta tables)
  • Fabric Warehouse
  • Azure SQL Database
  • Azure Data Explorer (KQL)
  • Fabric KQL Database

When to Use Dataflow Gen2

  • Simple to medium transformations (filter, rename, merge, pivot)
  • Business users or analysts building their own ETL
  • Quick data cleaning without writing PySpark code
  • Power Query-familiar teams

When NOT to Use Dataflow Gen2

  • Complex transformations (window functions, UDFs, MERGE/SCD)
  • Large-scale processing (billions of rows — use Spark notebook)
  • Custom Python libraries needed
  • Advanced Delta Lake operations (OPTIMIZE, VACUUM, schema evolution)

Dataflow Gen2 vs ADF Mapping Data Flows

Feature | ADF Mapping Data Flows | Fabric Dataflow Gen2
Engine | Spark (behind the scenes) | Power Query (M language)
UI | ADF Studio (visual Spark) | Power Query Editor (Excel-like)
Learning curve | Moderate (Spark concepts) | Low (Excel/Power BI users know it)
Debug | Debug cluster required (slow startup) | Instant preview (no cluster)
Performance | High (Spark) | Medium (optimized for medium data)
Destinations | ADLS, SQL, Cosmos DB | Lakehouse, Warehouse, SQL
CI/CD | ARM templates | Fabric deployment pipelines + Git
Cost | Separate billing (Data Flow hours) | Included in Fabric capacity (CU)
Availability | ADF only | Fabric only

Using Dataflow Gen2 Inside a Pipeline

  1. In your pipeline canvas, drag Dataflow activity
  2. Select an existing Dataflow Gen2 or create a new one
  3. Configure Parameters (optional — pass pipeline values to the dataflow)
  4. Connect with arrows for sequencing (runs after previous activity succeeds)

Pipeline: PL_Daily_ETL
  │
  Copy Activity: Copy_Raw_Data
    │
  Dataflow Activity: DF_Clean_and_Transform
    Dataflow: Clean_Customer_Data (your Dataflow Gen2 item)
    Parameters: {"load_date": "@formatDateTime(utcNow(), 'yyyy-MM-dd')"}
    │
  Notebook Activity: Build_Gold_Tables

Using Notebooks Inside a Pipeline

  1. Drag Notebook activity onto the canvas
  2. Select the notebook from your workspace
  3. Configure Base parameters (each one overrides the variable of the same name in the notebook's parameters cell):
{
    "source_table": "customers",
    "target_path": "gold/dim_customer",
    "run_date": "@formatDateTime(utcNow(), 'yyyy-MM-dd')"
}

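The notebook receives them through its parameters cell. Fabric notebooks do not have Databricks' dbutils.widgets. Instead, mark one cell as the parameters cell (Toggle parameter cell) and declare defaults there. The pipeline's base parameters override them by name at run time:

# Parameters cell: these defaults are overridden by the pipeline's base parameters
source_table = "customers"
target_path = "gold/dim_customer"
run_date = "2026-05-25"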

Notebook Output

Notebooks can return values to the pipeline using the notebook exit value:

# At the end of the notebook
import json
result = {"rows_processed": df.count(), "status": "SUCCESS"}
# notebookutils is Fabric's built-in utility package (formerly mssparkutils)
notebookutils.notebook.exit(json.dumps(result))

Pipeline reads it with: @activity('Notebook_Activity').output.result.exitValue
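
Note that the exit value reaches the pipeline as a string, so a JSON payload has to be parsed before individual fields can be read. In a pipeline expression the json() function can do this, for example @json(activity('Notebook_Activity').output.result.exitValue).rows_processed (a sketch, worth verifying against your own activity output). The round trip looks like this in plain Python, with hypothetical helper names:

```python
import json


def build_exit_value(rows_processed: int, status: str) -> str:
    """Serialize the notebook result; the exit value is handed over as a string."""
    return json.dumps({"rows_processed": rows_processed, "status": status})


def read_exit_value(exit_value: str) -> dict:
    """Parse the string back into fields, as a downstream consumer would."""
    return json.loads(exit_value)


if __name__ == "__main__":
    exit_value = build_exit_value(1200, "SUCCESS")
    print(exit_value)                                      # a JSON string, not a dict
    print(read_exit_value(exit_value)["rows_processed"])   # field access after parsing
```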

Combining Dataflow Gen2 + Notebook in One Pipeline

This is the recommended production pattern — Dataflow for simple cleaning, Notebook for complex logic:

Pipeline: PL_Complete_Medallion

  Stage 1 — INGEST (Copy Activities):
    ForEach → Copy from SQL sources to bronze lakehouse
    (parallel, fast, raw data)

  Stage 2 — CLEAN (Dataflow Gen2):
    DF_Clean_Customers:
      Source: bronze/customers
      Steps: trim names, lowercase emails, remove nulls, cast dates
      Destination: silver/customers (lakehouse table)

    DF_Clean_Products:
      Source: bronze/products
      Steps: standardize categories, fill missing prices, dedup
      Destination: silver/products

  Stage 3 — ENRICH (Notebook):
    NB_Build_Dimensions:
      Read silver/customers → SCD Type 2 MERGE → gold/dim_customer
      Read silver/products → SCD Type 1 MERGE → gold/dim_product
      (Complex Delta MERGE logic that Dataflow cannot do)

    NB_Build_Facts:
      Read silver/orders + dim tables → build fact_orders
      Aggregate → agg_daily_revenue
      (Window functions, complex joins)

  Stage 4 — SERVE (Semantic Model Refresh):
    Refresh Power BI semantic model (Direct Lake)

  Stage 5 — NOTIFY:
    Success → Teams channel message
    Failure → Email via Outlook activity
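
If SCD Type 2 is new to you, here is the logic Stage 3 relies on, written as a minimal plain-Python sketch over lists of dicts. In a Fabric notebook this would typically be a Delta MERGE in PySpark. The column names (valid_from, valid_to, is_current) and the 9999-12-31 open end date are common conventions assumed for this illustration, not something Fabric prescribes:

```python
from datetime import date
from typing import Any, Dict, List

OPEN_END = date(9999, 12, 31)  # assumed marker for "still current"


def scd2_merge(dim_rows: List[Dict[str, Any]], incoming: List[Dict[str, Any]],
               key: str, tracked: List[str], today: date) -> List[Dict[str, Any]]:
    """Return a new dimension after applying SCD Type 2 to the incoming rows."""
    result = [dict(row) for row in dim_rows]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[col] == new[col] for col in tracked):
            continue  # unchanged: keep the current version as it is
        if old is not None:
            # changed: close out the old version instead of overwriting it
            old["is_current"] = False
            old["valid_to"] = today
        # new key or changed row: insert a fresh current version
        version = {key: new[key], **{col: new[col] for col in tracked},
                   "valid_from": today, "valid_to": OPEN_END, "is_current": True}
        result.append(version)
        current[new[key]] = version
    return result


if __name__ == "__main__":
    dim = [{"customer_id": 1, "city": "Toronto", "valid_from": date(2025, 1, 1),
            "valid_to": OPEN_END, "is_current": True}]
    updates = [{"customer_id": 1, "city": "Ottawa"}, {"customer_id": 2, "city": "Halifax"}]
    for row in scd2_merge(dim, updates, "customer_id", ["city"], date(2026, 5, 25)):
        print(row)
```

History is preserved because a changed row is expired and a new version is appended, which is exactly the kind of logic that is awkward to express in Power Query.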

Why This Pattern Works

Layer | Tool | Why
Ingest | Copy Activity | Fastest way to move data. Zero transformation.
Clean | Dataflow Gen2 | Simple transforms (trim, filter, dedup). Visual. Business users can maintain.
Enrich | Notebook | Complex logic (SCD MERGE, window functions, custom Python). Full PySpark power.
Serve | Semantic Model Refresh | One activity triggers Power BI Direct Lake update.
Notify | Teams / Outlook | Built-in. No Logic Apps needed.

The rule: Use Dataflow Gen2 for what Power Query does well (simple cleaning). Use Notebooks for what PySpark does well (complex transformations). Never force complex logic into Dataflow. Never over-engineer simple cleaning with Spark.

Monitoring Pipelines

Fabric Monitoring Hub

  1. Click Monitor in the left sidebar
  2. See ALL pipeline runs across ALL workspaces (unlike ADF which is per factory)
  3. Filter by: status, date, pipeline name, workspace

Pipeline Run Details

Click a specific run to see:

  • Each activity’s status (green/red)
  • Duration per activity
  • Rows read and written (for Copy activities)
  • Error messages (for failed activities)
  • Input and output for each activity

Alerting

Configure alerts in workspace settings or use the Teams/Outlook activities within the pipeline itself for real-time notifications.

Error Handling Patterns

Pattern 1: Activity-Level Retry

Copy Activity Settings:
  Retry: 3
  Retry interval: 30 seconds

Pattern 2: Red Arrow (On Failure)

Copy_Data ──(green)──► Log_Success
    │
    └──(red)──► Log_Failure ──► Send_Alert_Email

Pattern 3: Try-Catch with Set Variable

Copy_Data ──(green)──► Set_Variable: status = "SUCCESS"
    │
    └──(red)──► Set_Variable: status = "FAILED"
                    │
                    └──► Set_Variable: error = @activity('Copy_Data').error.message

If Condition: @equals(variables('status'), 'FAILED')
  True → Send failure alert
  False → Continue pipeline
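
Patterns 1 and 3 combine naturally: retry first, then record a status and error message if every attempt fails. Here is a minimal Python sketch of that control flow. It assumes "Retry: 3" means one initial attempt plus up to three retries, and run_with_retry is a hypothetical helper, not a Fabric API:

```python
import time
from typing import Callable, Tuple


def run_with_retry(action: Callable[[], None], retries: int = 3,
                   interval_seconds: float = 30.0,
                   sleep: Callable[[float], None] = time.sleep) -> Tuple[str, str]:
    """Run `action`, retrying on failure. Returns (status, error_message)."""
    last_error = ""
    for attempt in range(retries + 1):  # first attempt + `retries` retries
        try:
            action()
            return "SUCCESS", ""        # green path: status = "SUCCESS"
        except Exception as exc:
            last_error = str(exc)       # remember the message for the alert
            if attempt < retries:
                sleep(interval_seconds)  # wait before the next attempt
    return "FAILED", last_error         # red path: status = "FAILED"


if __name__ == "__main__":
    def always_fails() -> None:
        raise RuntimeError("connection timed out")

    status, error = run_with_retry(always_fails, retries=2, interval_seconds=0)
    if status == "FAILED":              # mirrors the If Condition above
        print(f"Send failure alert: {error}")
```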

When to Use Pipeline vs Dataflow vs Notebook

Scenario | Best Tool | Why
Move data from SQL to Lakehouse | Pipeline (Copy) | Fastest data movement, zero transformation
Simple cleaning (trim, filter, dedup) | Dataflow Gen2 | Visual, no code, business users can maintain
Complex transforms (SCD, MERGE, windows) | Notebook | Full PySpark/SQL power
Orchestrate multiple steps | Pipeline | ForEach, If Condition, sequencing
Schedule everything | Pipeline | Built-in scheduler
Ad-hoc data exploration | Notebook | Interactive, cell-by-cell
Power BI refresh after ETL | Pipeline | Semantic Model Refresh activity
Alert on failure | Pipeline | Teams/Outlook activity
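
The transformation rows of this table boil down to a small decision rule. Here it is as a rough Python sketch. The one-billion-row cutoff is an assumption standing in for "too big for Dataflow Gen2", not an official limit, and choose_tool is a hypothetical helper:

```python
LARGE_ROW_THRESHOLD = 1_000_000_000  # assumed cutoff for "billions of rows"


def choose_tool(needs_transform: bool, complex_logic: bool = False,
                approx_rows: int = 0) -> str:
    """Pick a tool for one step, following the table's rules of thumb."""
    if not needs_transform:
        return "Pipeline (Copy)"      # pure data movement
    if complex_logic or approx_rows >= LARGE_ROW_THRESHOLD:
        return "Notebook"             # SCD, MERGE, window functions, or very large data
    return "Dataflow Gen2"            # simple cleaning on small to medium data


if __name__ == "__main__":
    print(choose_tool(needs_transform=False))
    print(choose_tool(needs_transform=True, approx_rows=5_000_000))
    print(choose_tool(needs_transform=True, complex_logic=True))
```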

Common Mistakes

  1. Trying to do complex transforms in Dataflow Gen2 — SCD MERGE, window functions, and Delta operations belong in Notebooks. Dataflow Gen2 is for simple cleaning.

  2. Creating ADF-style datasets in Fabric — datasets do not exist in Fabric. Define source/sink inline in the Copy activity.

  3. Not using the Teams/Outlook activities — in ADF, email notifications required Logic Apps. In Fabric, drag-and-drop. Use them.

  4. Running Dataflow Gen2 on huge datasets — Dataflow Gen2 is optimized for medium data (millions of rows). For billions, use a Spark notebook.

  5. Forgetting to add error handling — every Copy activity should have a red arrow path to a failure handler. Silent failures are production nightmares.

  6. Not parameterizing pipelines — hardcoded table names, paths, and dates make pipelines single-use. Parameterize everything.

Interview Questions

Q: What is the difference between Fabric Data Factory and Azure Data Factory?
A: Fabric Data Factory is the SaaS version inside Microsoft Fabric. Key differences: no datasets (inline configuration), connections instead of linked services, Dataflow Gen2 instead of Mapping Data Flows, built-in Teams and Outlook notification activities, semantic model refresh activity, and billing included in Fabric capacity. About 90% of ADF activities are available, with the notable exception of SSIS support.

Q: What is Dataflow Gen2 and when should you use it?
A: Dataflow Gen2 is the visual, no-code transformation tool built on Power Query. Use it for simple to medium transformations like trimming, filtering, deduplication, and type casting. Do not use it for complex operations like SCD MERGE, window functions, or Delta Lake operations; use Spark notebooks for those.

Q: How do you combine Dataflow Gen2 and Notebooks in a pipeline?
A: Use Dataflow Gen2 for Bronze-to-Silver cleaning (simple transforms), then Notebook for Silver-to-Gold enrichment (complex MERGE, aggregations). The pipeline sequences them: Copy → Dataflow Gen2 → Notebook → Semantic Model Refresh → Teams notification. Each tool handles what it does best.

Q: What notification options are available in Fabric Data Factory?
A: Fabric has built-in Teams and Outlook activities that you drag and drop into your pipeline. In ADF, notifications required external services like Logic Apps or Azure Functions. This is one of Fabric’s key improvements over ADF.

Q: Why did Fabric remove Datasets?
A: Datasets in ADF were an extra layer of configuration that added complexity without much value. In Fabric, source and sink properties (table name, schema, format) are defined inline within the Copy activity itself. This reduces the number of objects to manage and simplifies pipeline design.

Wrapping Up

Fabric Data Factory is ADF evolved — same concepts, less plumbing, more built-in capabilities. The pipeline canvas looks familiar. The expressions are identical. The control flow activities are the same. What changed is the removal of unnecessary complexity (datasets, linked services) and the addition of Fabric-native capabilities (Teams notifications, semantic model refresh, Dataflow Gen2 as a pipeline activity).

The production pattern is clear: Copy for ingestion, Dataflow Gen2 for simple cleaning, Notebooks for complex transformations, Semantic Model Refresh for Power BI, and Teams/Outlook for notifications. One pipeline, five stages, zero external services.

Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.
