Unified Full Load and Incremental Load Pipeline in Azure Synapse: One Pipeline, One Config Table, Two Load Strategies
In earlier posts, we built two separate pipelines — a full load pipeline that copies entire tables every run, and an incremental pipeline that copies only new or changed rows using a watermark. Both work well for their respective use cases.
But in production, you rarely have ALL tables needing full load or ALL tables needing incremental load. You have a MIX — some tables are small reference data that you dump entirely (full load), and others are large transaction tables where you only need today’s new rows (incremental). Running two separate pipelines means two schedules, two monitoring dashboards, two sets of error handling.
This post builds a single unified pipeline that handles BOTH strategies. A config table tells the pipeline which tables to process and HOW to process each one — full or incremental. Adding a new table means inserting ONE row. Switching a table from full to incremental means updating ONE column. The pipeline never changes.
Think of it like a restaurant that serves both dine-in and takeout from the same kitchen. The order slip (config table) has a column that says “DINE-IN” or “TAKEOUT.” The kitchen manager (ForEach) reads each slip and routes it to the right station — plating for dine-in (full load) or boxing for takeout (incremental). Same kitchen, same chef, same ingredients — different handling based on one flag.
Table of Contents
- Why One Pipeline Instead of Two
- The Architecture
- The Unified Config Table (CONFIGTABLE_V2)
- The Audit Log Table (PIPELINE_LOG)
- The Stored Procedures
- Building the Pipeline Step by Step
- Datasets
- Lookup: Read the Config Table
- ForEach: Loop Through All Tables
- If Condition: Route by LoadType
- TRUE Branch: Full Load Path
- FALSE Branch: Incremental Load Path
- The Complete Pipeline Diagram
- First Run Walkthrough
- Second Run (Incremental Proof)
- Adding a New Table (Zero Pipeline Changes)
- Switching a Table from Full to Incremental
- Date-Partitioned Output Folders
- Common Errors and Fixes
- Interview Questions
- Wrapping Up
Why One Pipeline Instead of Two
| Two Pipelines | One Unified Pipeline |
|---|---|
| Two schedules to manage | One schedule |
| Two monitoring dashboards | One dashboard |
| Two sets of error handling | One audit log |
| Adding a table = edit the right pipeline | Adding a table = one INSERT into config |
| Switching load type = move table between pipelines | Switching = UPDATE one column |
| 10+ activities duplicated across pipelines | All logic in one place |
The key insight: The only difference between full and incremental is the SOURCE QUERY. Full load reads everything (SELECT *). Incremental reads only rows above the watermark (WHERE ProductID > 295). Everything else — the ForEach, the sink, the audit logging — is identical.
The Architecture
Azure SQL Database ADLS Gen2
┌──────────────────────┐ ┌──────────────────────┐
│ │ │ synapse-workspace/ │
│ CONFIGTABLE_V2 │─── Lookup_Config ──> │ sqldb/ │
│ Customer FULL │ │ Customer/ │
│ Address FULL │ │ 2026/05/10/... │
│ Product INCR │ │ Address/ │
│ ProdCat INCR │ │ Product/ │
│ CustAddr INCR │ │ ProductCategory/ │
│ │ ForEach │ CustomerAddress/ │
│ SalesLT.Customer │──┐ │ │
│ SalesLT.Address │ │ If Condition │ │
│ SalesLT.Product │ │ FULL? ──> Copy_FullLoad │ │
│ SalesLT.ProductCat │ │ INCR? ──> Lookup_Max │ │
│ SalesLT.CustAddr │ │ Copy_DeltaData │ │
│ │ │ Update_Watermark│ │
│ PIPELINE_LOG │<─┘ │ │
│ (audit trail) │ Log_Success / Log_Failure │ │
└──────────────────────┘ └──────────────────────┘
The Unified Config Table (CONFIGTABLE_V2)
CREATE TABLE CONFIGTABLE_V2 (
ID INT IDENTITY(1,1),
TableName VARCHAR(100),
SchemaName VARCHAR(50),
FolderName VARCHAR(100),
LoadType VARCHAR(20), -- 'FULL' or 'INCREMENTAL'
LastLoadedValue VARCHAR(100), -- NULL for full load tables
DeltaColumnName VARCHAR(100) -- NULL for full load tables
);
-- 2 Full Load tables (small reference data)
INSERT INTO CONFIGTABLE_V2 (TableName, SchemaName, FolderName, LoadType, LastLoadedValue, DeltaColumnName) VALUES
('Customer', 'SalesLT', 'sqldb/Customer', 'FULL', NULL, NULL),
('Address', 'SalesLT', 'sqldb/Address', 'FULL', NULL, NULL);
-- 3 Incremental Load tables (large/growing data)
INSERT INTO CONFIGTABLE_V2 (TableName, SchemaName, FolderName, LoadType, LastLoadedValue, DeltaColumnName) VALUES
('Product', 'SalesLT', 'sqldb/Product', 'INCREMENTAL', '0', 'ProductID'),
('ProductCategory', 'SalesLT', 'sqldb/ProductCategory', 'INCREMENTAL', '0', 'ProductCategoryID'),
('CustomerAddress', 'SalesLT', 'sqldb/CustomerAddress', 'INCREMENTAL', '0', 'CustomerID');
Why These Columns
| Column | Purpose | Full Load | Incremental |
|---|---|---|---|
| TableName | Which table to copy | Customer | Product |
| SchemaName | Source schema | SalesLT | SalesLT |
| FolderName | ADLS destination folder | sqldb/Customer | sqldb/Product |
| LoadType | The routing flag | FULL | INCREMENTAL |
| LastLoadedValue | Watermark — last value copied | NULL (not used) | ‘0’ → ‘295’ → ‘300’ |
| DeltaColumnName | Which column is the watermark | NULL (not used) | ‘ProductID’ |
Real-life analogy: The config table is like a shipping manifest. Each row is a package. The LoadType column is like the shipping method — “STANDARD” (full load, ship everything) vs “EXPRESS” (incremental, ship only what is new since last shipment). The LastLoadedValue is the tracking number of the last EXPRESS shipment.
The Audit Log Table (PIPELINE_LOG)
CREATE TABLE PIPELINE_LOG (
ID INT IDENTITY(1,1),
TableName VARCHAR(100),
SchemaName VARCHAR(50),
LoadType VARCHAR(20),
RowsRead INT,
RowsCopied INT,
CopyDuration INT,
Status VARCHAR(20),
ErrorMessage VARCHAR(500),
LoadDate DATETIME
);
Every table copy — success or failure, full or incremental — gets logged here. Morning health check:
SELECT TableName, LoadType, RowsCopied, Status, ErrorMessage, LoadDate
FROM PIPELINE_LOG
WHERE CAST(LoadDate AS DATE) = CAST(GETDATE() AS DATE)
ORDER BY LoadDate;
The Stored Procedures
Logging Procedure
CREATE PROC usp_insert_pipeline_log
@TableName VARCHAR(100),
@SchemaName VARCHAR(50),
@LoadType VARCHAR(20),
@RowsRead INT,
@RowsCopied INT,
@CopyDuration INT,
@Status VARCHAR(20),
@ErrorMessage VARCHAR(500)
AS
BEGIN
INSERT INTO PIPELINE_LOG
(TableName, SchemaName, LoadType, RowsRead, RowsCopied,
CopyDuration, Status, ErrorMessage, LoadDate)
VALUES
(@TableName, @SchemaName, @LoadType, @RowsRead, @RowsCopied,
@CopyDuration, @Status, @ErrorMessage, GETDATE());
END;
Watermark Update Procedure (Incremental Only)
CREATE PROC usp_update_watermark_v2
@TableName VARCHAR(100),
@LastLoadedValue VARCHAR(100)
AS
BEGIN
UPDATE CONFIGTABLE_V2
SET LastLoadedValue = @LastLoadedValue
WHERE TableName = @TableName;
END;
Why a separate watermark procedure? Full load tables do not have watermarks. This procedure is only called in the incremental branch. Keeping it separate ensures full load tables are never accidentally updated.
Building the Pipeline Step by Step
Datasets
DS_ConfigTable_V2 — Points to dbo.CONFIGTABLE_V2. Non-parameterized.
DS_SourceTable_Dynamic — Parameterized source dataset.
– Parameters: SchemaName (String), TableName (String)
– Connection: Schema = @dataset().SchemaName, Table = @dataset().TableName
DS_ADLS_Sink_Parquet_V2 — Parameterized sink dataset.
– Parameters: FolderName (String)
– Connection: Container = synapse-workspace, Directory = @dataset().FolderName
Lookup: Read the Config Table
- Drag Lookup activity
- Name: Lookup_Config
- Settings:
  - Dataset: DS_ConfigTable_V2
  - First row only: UNCHECKED
  - Query: SELECT * FROM CONFIGTABLE_V2
Returns all 5 rows — the pipeline processes each one.
ForEach: Loop Through All Tables
- Drag ForEach activity
- Name: ForEach_Table
- Green arrow from Lookup_Config
- Settings:
  - Sequential: checked (safer for audit logging order)
  - Items: @activity('Lookup_Config').output.value
If Condition: Route by LoadType
Inside ForEach (click pencil icon):
- Drag If Condition activity
- Name: Check_LoadType
- Expression: @equals(item().LoadType, 'FULL')
This evaluates to TRUE for Customer and Address (full load), and FALSE for Product, ProductCategory, and CustomerAddress (incremental).
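The routing can be simulated in a few lines of Python (a sketch of the expression's behavior, not ADF code; the config rows are trimmed to the two fields that matter here):

```python
# Simulate @equals(item().LoadType, 'FULL') across the five config rows.
config_rows = [
    {"TableName": "Customer",        "LoadType": "FULL"},
    {"TableName": "Address",         "LoadType": "FULL"},
    {"TableName": "Product",         "LoadType": "INCREMENTAL"},
    {"TableName": "ProductCategory", "LoadType": "INCREMENTAL"},
    {"TableName": "CustomerAddress", "LoadType": "INCREMENTAL"},
]

# @equals is an exact, case-sensitive comparison, just like == on strings.
full_tables = [r["TableName"] for r in config_rows if r["LoadType"] == "FULL"]
incr_tables = [r["TableName"] for r in config_rows if r["LoadType"] != "FULL"]

print(full_tables)  # ['Customer', 'Address']
print(incr_tables)  # ['Product', 'ProductCategory', 'CustomerAddress']
```

Because the comparison is case-sensitive, a config row with LoadType 'Full' would silently fall into the incremental branch — store the flag in one consistent casing.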
Real-life analogy: The If Condition is like the fork in a warehouse conveyor belt. Packages tagged “STANDARD” go left to the bulk shipping dock (full load). Packages tagged “EXPRESS” go right to the priority dispatch (incremental). Same conveyor, same scanner, different destinations.
TRUE Branch: Full Load Path
Click the pencil icon on the True box.
Copy_FullLoad:
– Source: DS_SourceTable_Dynamic
– SchemaName: @item().SchemaName
– TableName: @item().TableName
– Sink: DS_ADLS_Sink_Parquet_V2
– FolderName: @concat(item().FolderName, '/', formatDateTime(utcnow(), 'yyyy'), '/', formatDateTime(utcnow(), 'MM'), '/', formatDateTime(utcnow(), 'dd'), '/', formatDateTime(utcnow(), 'HH'), '/', formatDateTime(utcnow(), 'mm'))
Log_Success_Full (green arrow from Copy_FullLoad):
– Stored Procedure: usp_insert_pipeline_log
– Parameters: TableName=@item().TableName, SchemaName=@item().SchemaName, LoadType=FULL, RowsRead=@activity('Copy_FullLoad').output.rowsRead, RowsCopied=@activity('Copy_FullLoad').output.rowsCopied, CopyDuration=@activity('Copy_FullLoad').output.copyDuration, Status=SUCCESS, ErrorMessage=NA
Log_Failure_Full (red arrow from Copy_FullLoad):
– Same stored procedure with RowsRead=0, RowsCopied=0, CopyDuration=0, Status=FAILED, ErrorMessage=@concat('Full load failed for ', item().SchemaName, '.', item().TableName)
FALSE Branch: Incremental Load Path
Click the pencil icon on the False box.
Lookup_MaxValue:
– Drag Lookup activity
– Name: Lookup_MaxValue
– Use query: Query
– Query: @concat('SELECT CAST(MAX(', item().DeltaColumnName, ') AS VARCHAR(100)) AS MaxValue FROM ', item().SchemaName, '.', item().TableName)
– First row only: checked
This gets the MAX value of the watermark column (e.g., MAX(ProductID) = 295).
Copy_DeltaData (green arrow from Lookup_MaxValue):
– Source: DS_SourceTable_Dynamic
– SchemaName: @item().SchemaName
– TableName: @item().TableName
– Use query: Query
– Query: @concat('SELECT * FROM ', item().SchemaName, '.', item().TableName, ' WHERE ', item().DeltaColumnName, ' > ', item().LastLoadedValue)
– Sink: DS_ADLS_Sink_Parquet_V2
– FolderName: same date-partitioned expression as full load
Update_Watermark (green arrow from Copy_DeltaData):
– Stored Procedure: usp_update_watermark_v2
– Parameters: TableName=@item().TableName, LastLoadedValue=@activity('Lookup_MaxValue').output.firstRow.MaxValue
Log_Success_Incremental (green arrow from Update_Watermark):
– Same logging SP with LoadType=INCREMENTAL, RowsRead/Copied/Duration from Copy_DeltaData
Log_Failure_Incremental (red arrow from Copy_DeltaData):
– Same pattern as Log_Failure_Full with LoadType=INCREMENTAL
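Both dynamic queries in this branch are plain string concatenation. The following Python sketch shows what the two @concat expressions evaluate to for the Product row on a first run (illustrative only; the real strings are built inside the Lookup and Copy activities):

```python
def lookup_max_query(row):
    # Mirrors the Lookup_MaxValue @concat expression.
    return (f"SELECT CAST(MAX({row['DeltaColumnName']}) AS VARCHAR(100)) "
            f"AS MaxValue FROM {row['SchemaName']}.{row['TableName']}")

def delta_query(row):
    # Mirrors the Copy_DeltaData @concat expression.
    return (f"SELECT * FROM {row['SchemaName']}.{row['TableName']} "
            f"WHERE {row['DeltaColumnName']} > {row['LastLoadedValue']}")

product = {"SchemaName": "SalesLT", "TableName": "Product",
           "DeltaColumnName": "ProductID", "LastLoadedValue": "0"}

print(lookup_max_query(product))
# SELECT CAST(MAX(ProductID) AS VARCHAR(100)) AS MaxValue FROM SalesLT.Product
print(delta_query(product))
# SELECT * FROM SalesLT.Product WHERE ProductID > 0
```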
The Complete Pipeline Diagram
Lookup_Config (read CONFIGTABLE_V2 — 5 rows)
|
v
ForEach_Table (iterate all 5 tables)
|
+-- Check_LoadType: @equals(item().LoadType, 'FULL')
|
+-- TRUE (Full Load: Customer, Address):
| |
| Copy_FullLoad (SELECT * FROM schema.table)
| | |
| +-- (Success) +-- (Failed)
| | Log_Success_Full | Log_Failure_Full
|
+-- FALSE (Incremental: Product, ProductCategory, CustomerAddress):
|
Lookup_MaxValue (MAX(DeltaColumn))
|
Copy_DeltaData (WHERE DeltaCol > LastLoadedValue)
| |
+-- (Success) +-- (Failed)
| Update_Watermark | Log_Failure_Incremental
| Log_Success_Incr
First Run Walkthrough
Before Run
SELECT TableName, LoadType, LastLoadedValue FROM CONFIGTABLE_V2;
| TableName | LoadType | LastLoadedValue |
|---|---|---|
| Customer | FULL | NULL |
| Address | FULL | NULL |
| Product | INCREMENTAL | 0 |
| ProductCategory | INCREMENTAL | 0 |
| CustomerAddress | INCREMENTAL | 0 |
Run the Pipeline (Debug)
Iteration 1: Customer (FULL)
– Check_LoadType: ‘FULL’ == ‘FULL’ → TRUE
– Copy_FullLoad: SELECT * FROM SalesLT.Customer → 847 rows
– Log: SUCCESS, FULL, 847 rows
Iteration 2: Address (FULL) – TRUE → Copy_FullLoad → 450 rows → Log SUCCESS
Iteration 3: Product (INCREMENTAL)
– Check_LoadType: ‘INCREMENTAL’ == ‘FULL’ → FALSE
– Lookup_MaxValue: SELECT MAX(ProductID) FROM SalesLT.Product → 680
– Copy_DeltaData: SELECT * FROM SalesLT.Product WHERE ProductID > 0 → 295 rows (first run copies all)
– Update_Watermark: LastLoadedValue = ‘680’
– Log: SUCCESS, INCREMENTAL, 295 rows
Iteration 4: ProductCategory (INCREMENTAL) – FALSE → Lookup_MaxValue → Copy_DeltaData → Update_Watermark → Log
Iteration 5: CustomerAddress (INCREMENTAL) – FALSE → Same flow → Log
After Run
-- Config table — watermarks updated
SELECT TableName, LoadType, LastLoadedValue FROM CONFIGTABLE_V2;
| TableName | LoadType | LastLoadedValue |
|---|---|---|
| Customer | FULL | NULL |
| Address | FULL | NULL |
| Product | INCREMENTAL | 680 |
| ProductCategory | INCREMENTAL | 41 |
| CustomerAddress | INCREMENTAL | 29485 |
-- Audit log — all 5 tables logged
SELECT TableName, LoadType, RowsCopied, Status FROM PIPELINE_LOG ORDER BY LoadDate;
| TableName | LoadType | RowsCopied | Status |
|---|---|---|---|
| Customer | FULL | 847 | SUCCESS |
| Address | FULL | 450 | SUCCESS |
| Product | INCREMENTAL | 295 | SUCCESS |
| ProductCategory | INCREMENTAL | 41 | SUCCESS |
| CustomerAddress | INCREMENTAL | 417 | SUCCESS |
Second Run (Incremental Proof)
Run the pipeline again WITHOUT changing any source data:
Full load tables (Customer, Address): Copy ALL rows again (847, 450). That is full load — always copies everything.
Incremental tables (Product, ProductCategory, CustomerAddress): Copy ZERO rows. The watermark has not changed. WHERE ProductID > 680 returns no rows. This proves incremental is working — only new data above the watermark gets copied.
Now add new data:
-- StandardCost is NOT NULL in SalesLT.Product, so it must be supplied
INSERT INTO SalesLT.Product (Name, ProductNumber, Color, StandardCost, ListPrice, ProductCategoryID, SellStartDate)
VALUES ('Test Laptop', 'TP-001', 'Black', 750.00, 999.99, 5, GETDATE());
Run again:
– Product: 1 row copied (only the new laptop). Watermark updated to 681.
– All other tables: same as before.
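The three runs above can be reproduced with a small in-memory simulation of the watermark pattern (a Python sketch, not pipeline code; the row counts are chosen to match the walkthrough):

```python
def run_incremental(source_rows, config):
    # One incremental iteration: copy rows above the watermark,
    # then advance the watermark to the new MAX (Update_Watermark).
    col = config["DeltaColumnName"]
    wm = int(config["LastLoadedValue"])
    copied = [r for r in source_rows if r[col] > wm]
    if copied:
        config["LastLoadedValue"] = str(max(r[col] for r in copied))
    return copied

# 295 products with IDs up to 680, matching the first-run walkthrough.
products = [{"ProductID": pid} for pid in range(386, 681)]
cfg = {"DeltaColumnName": "ProductID", "LastLoadedValue": "0"}

run1 = run_incremental(products, cfg)   # 295 rows, watermark -> '680'
run2 = run_incremental(products, cfg)   # 0 rows, watermark unchanged
products.append({"ProductID": 681})     # the 'Test Laptop' insert
run3 = run_incremental(products, cfg)   # 1 row, watermark -> '681'
print(len(run1), len(run2), len(run3))  # 295 0 1
```

Each row is copied exactly once across the three runs, which is the guarantee the watermark pattern exists to provide.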
Adding a New Table (Zero Pipeline Changes)
Your team asks: “Can you also load the SalesOrderHeader table?”
INSERT INTO CONFIGTABLE_V2 (TableName, SchemaName, FolderName, LoadType, LastLoadedValue, DeltaColumnName) VALUES
('SalesOrderHeader', 'SalesLT', 'sqldb/SalesOrderHeader', 'INCREMENTAL', '0', 'SalesOrderID');
Done. Run the pipeline. It now processes 6 tables. You did not touch the pipeline, datasets, or activities. One INSERT, zero code changes.
Switching a Table from Full to Incremental
The Address table grew to 10 million rows. Full load is too slow.
UPDATE CONFIGTABLE_V2
SET LoadType = 'INCREMENTAL',
LastLoadedValue = '0',
DeltaColumnName = 'AddressID'
WHERE TableName = 'Address';
Done. Next run, Address uses incremental logic. One UPDATE, zero pipeline changes.
Real-life analogy: Switching a table’s load type is like changing a delivery address on an order. The warehouse process does not change. The shipping label changes. The package goes to the new address.
Date-Partitioned Output Folders
Both branches use this expression for the sink FolderName:
@concat(
item().FolderName, '/',
formatDateTime(utcnow(), 'yyyy'), '/',
formatDateTime(utcnow(), 'MM'), '/',
formatDateTime(utcnow(), 'dd'), '/',
formatDateTime(utcnow(), 'HH'), '/',
formatDateTime(utcnow(), 'mm')
)
This produces: sqldb/Customer/2026/05/10/14/30/
Each run creates a new folder. Previous runs are preserved.
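The same expression can be sketched in Python to show exactly what string the sink receives (utcnow() in ADF returns UTC, so a real run uses the UTC clock; the fixed timestamp below is only for illustration):

```python
from datetime import datetime

def sink_folder(folder_name, now):
    # Mirrors the @concat/formatDateTime sink expression:
    # FolderName/yyyy/MM/dd/HH/mm
    return "/".join([folder_name, now.strftime("%Y"), now.strftime("%m"),
                     now.strftime("%d"), now.strftime("%H"), now.strftime("%M")])

print(sink_folder("sqldb/Customer", datetime(2026, 5, 10, 14, 30)))
# sqldb/Customer/2026/05/10/14/30
```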
Common Errors and Fixes
| Error | Cause | Fix |
|---|---|---|
| “ORDER BY invalid in subquery” | Using ORDER BY in Lookup_MaxValue query | Do not use ORDER BY. MAX() does not need it |
| “Column DeltaColumnName not found” | DeltaColumnName is NULL for full load table, but pipeline tries to use it | This only happens if the If Condition is misconfigured — NULL tables should go to TRUE branch |
| “Cannot CAST NULL to VARCHAR” | MAX() returns NULL when table is empty | Add ISNULL(MAX(...), '0') to the Lookup query |
| Copy succeeds but 0 rows | Watermark already at max value | Expected for incremental when no new data exists |
| “dataset() in Copy sink” | Using @dataset().FolderName in the Copy activity instead of @item() | Use @item().FolderName in the Copy activity. @dataset() belongs only inside the dataset Connection tab |
| “Activity not found” error in expressions | Typo in activity name (case-sensitive) | Match exact name: Copy_FullLoad not copy_fullload |
| Watermark not updating | Update_Watermark SP using wrong column | Verify SP uses WHERE TableName = @TableName |
Interview Questions
Q: Why use one pipeline for both full and incremental instead of two separate pipelines?
A: One pipeline means one schedule, one monitoring dashboard, one audit log, and one set of error handling. Adding tables requires only a config table INSERT. Switching load types requires only a config UPDATE. Two pipelines double the management overhead without any technical benefit.
Q: How does the If Condition route tables to the right load strategy?
A: The expression @equals(item().LoadType, 'FULL') evaluates each config row. TRUE routes to the full load branch (Copy entire table). FALSE routes to the incremental branch (Lookup max watermark, Copy only new rows above watermark, Update watermark). The config table drives the routing — not the pipeline logic.
Q: What happens when an incremental table has no new data?
A: The Copy_DeltaData activity runs with a WHERE clause that returns zero rows (watermark unchanged). Zero rows are copied, zero bytes transferred, and the watermark stays the same. The audit log records SUCCESS with RowsCopied=0. This is expected behavior — not an error.
Q: How do you switch a table from full load to incremental?
A: One UPDATE statement on the config table: set LoadType to INCREMENTAL, set LastLoadedValue to the starting watermark value (e.g., ‘0’), and set DeltaColumnName to the watermark column name. No pipeline changes needed. Next run, the If Condition routes it to the incremental branch.
Q: What is the watermark pattern and how does it work?
A: A watermark is the last value copied from a monotonically increasing column (ID or timestamp). The pipeline stores this in the config table. On the next run, it queries WHERE column > watermark to get only new rows. After copying, it updates the watermark to the new MAX value. This ensures each row is copied exactly once.
Wrapping Up
This unified pipeline is the production pattern you will use in every data platform. One pipeline, one config table, two load strategies, complete audit logging. Adding a table is an INSERT. Switching strategies is an UPDATE. The pipeline is finished — you never touch it again.
The beauty of this design is separation of concerns: the PIPELINE handles orchestration and error handling. The CONFIG TABLE handles business rules (which tables, which strategy). Change the business rules without changing the pipeline. Change the pipeline without changing the business rules.
That is metadata-driven architecture. That is production-grade data engineering.
Related posts: – SCD Type 1 Full Load Pipeline – Incremental Data Loading with Watermark – Metadata-Driven Pipeline in ADF – ADF Expressions Guide – Parameterized Datasets – Audit Logging Concepts
Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.