Slowly Changing Dimensions (SCD) in Azure: Types 0, 1, 2, 3, and 6 with Detailed Examples

Imagine you have a customer named Alice who lives in Toronto. She has been your customer since 2022. In January 2026, she moves to Mumbai. Simple question: what city does Alice live in?

If you say “Mumbai,” you are correct TODAY. But what about the orders she placed in 2023 while living in Toronto? Should those orders show Toronto or Mumbai?

THIS is the Slowly Changing Dimension (SCD) problem. And depending on how you handle it, you get completely different answers to business questions. Get it wrong, and your sales reports, customer analytics, and compliance data are all incorrect.

This post covers every SCD type with detailed table examples, SQL implementation in Azure, and the real-world scenarios where each type is the right choice.

What Is a Slowly Changing Dimension?
The Star Schema Context
Our Running Example: Alice Moves to Mumbai
SCD Type 0: Fixed (Never Changes)
SCD Type 1: Overwrite (Lose History)
SCD Type 2: Add New Row (Full History)
SCD Type 3: Add New Column (Limited History)
SCD Type 6: Hybrid (1 + 2 + 3 Combined)
Side-by-Side Comparison
Implementing SCD Type 2 in Azure SQL Database
Implementing SCD Type 2 in Synapse Data Flows
Implementing SCD Type 2 with Delta Lake MERGE
SCD in the Medallion Architecture
Handling Multiple Changes
Edge Cases and Gotchas
Real-World Scenarios
Interview Questions
Wrapping Up

What Is a Slowly Changing Dimension?

A dimension in a data warehouse is a table that describes the WHO, WHAT, WHERE of your business: customers, products, stores, employees. These dimensions change slowly over time — a customer moves cities, a product changes categories, an employee gets promoted.

The question is: how do you handle these changes in your data warehouse?

The answer depends on your SCD type.

Real-life analogy: Think of your phone contacts. When a friend changes their phone number, do you: – Overwrite the old number (SCD Type 1) — you can call them now but lost their old number – Add a new contact entry with the new number and mark the old one as “previous” (SCD Type 2) — full history – Add a “previous phone” field to the same contact (SCD Type 3) — one level of history

Each approach serves different needs.

The Star Schema Context

SCDs live in dimensional models (star schemas):

dim_customer (SCD applies here)
               |
dim_date --- fact_orders --- dim_product
               |
          dim_store

Fact table (fact_orders): records events — order_id, amount, date_key, customer_key. Millions of rows. Append-only.

Dimension table (dim_customer): describes entities — customer_key, name, city, segment. Thousands of rows. Changes over time.

The customer_key in the fact table points to a row in the dimension table. SCD determines what happens to that row when the customer’s data changes.

Our Running Example: Alice Moves to Mumbai

Starting state (dimension table):

customer_key	customer_id	name	city	segment
1001	C100	Alice	Toronto	Premium
1002	C200	Bob	London	Standard
1003	C300	Carol	Mumbai	Premium

The change: On January 15, 2026, Alice moves from Toronto to Mumbai.

Let us see how each SCD type handles this.

SCD Type 0: Fixed (Never Changes)

The Rule

Once a dimension record is created, it NEVER changes. The original value is preserved forever. Any updates from the source are ignored.

After Alice Moves

customer_key	customer_id	name	city	segment
1001	C100	Alice	Toronto	Premium
1002	C200	Bob	London	Standard
1003	C300	Carol	Mumbai	Premium

City stays Toronto. The change is ignored completely.

When to Use Type 0

Date of birth — never changes (even if the source system “corrects” it)
Original signup date — the date the customer first registered
Social Security Number — once assigned, never modified
Regulatory records — where the original value must be preserved for legal reasons

Real-life analogy: Your birth certificate. Even if your name changes legally, the birth certificate still shows your original birth name. It is a permanent historical record.

SCD Type 1: Overwrite (Lose History)

The Rule

The old value is overwritten with the new value. No history is kept. The dimension always reflects the CURRENT state.

After Alice Moves

customer_key	customer_id	name	city	segment
1001	C100	Alice	Mumbai	Premium
1002	C200	Bob	London	Standard
1003	C300	Carol	Mumbai	Premium

Toronto is gone. Replaced with Mumbai. If you query Alice’s orders from 2023, they now show Mumbai (even though she was in Toronto when she placed them).

SQL Implementation

UPDATE dim_customer
SET city = 'Mumbai',
    last_updated = GETDATE()
WHERE customer_id = 'C100';

When to Use Type 1

Corrections — fixing a typo (city was “Tronto” and corrected to “Toronto”)
Non-critical attributes — phone number, email, where history does not matter
Real-time dashboards — you want the current state only
Simple reporting — “how many customers are in each city RIGHT NOW?”

The Problem with Type 1

Fact: Alice ordered $500 on 2023-06-15 (she was in Toronto)
Dim:  Alice's city is now Mumbai

Report: "Revenue by City, 2023"
  Mumbai: $500 (WRONG -- she was in Toronto when she ordered)
  Toronto: $0 (WRONG -- this is where the order actually happened)

History is rewritten. Past reports change retroactively. This is why Type 1 is only appropriate when history does not matter.

Real-life analogy: Type 1 is like updating your GPS address. You change your home address and the GPS forgets you ever lived anywhere else. Good for navigation, bad for proving where you lived in 2023.

SCD Type 2: Add New Row (Full History)

The Rule

When a change occurs, the old record is “closed” (end-dated) and a new record is inserted with the new values. Both versions coexist in the table. Full history is preserved.

After Alice Moves

customer_key	customer_id	name	city	effective_from	effective_to	is_current
1001	C100	Alice	Toronto	2022-03-15	2026-01-14	0
1004	C100	Alice	Mumbai	2026-01-15	9999-12-31	1
1002	C200	Bob	London	2023-01-01	9999-12-31	1
1003	C300	Carol	Mumbai	2024-06-01	9999-12-31	1

Key observations: – Alice now has TWO rows (customer_key 1001 and 1004) – Same customer_id (C100) but different customer_key (surrogate key) – Row 1001 has effective_to = 2026-01-14 and is_current = 0 (historical) – Row 1004 has effective_to = 9999-12-31 and is_current = 1 (current) – 9999-12-31 is a convention meaning “still active”

Why Surrogate Keys Matter

Notice customer_key is different from customer_id:

customer_id = business key (C100). Stays the same across versions. Identifies the PERSON.
customer_key = surrogate key (1001, 1004). Auto-generated. Identifies the VERSION.

Fact table references customer_key, not customer_id. This is how you link an order to the CORRECT version of the customer:

fact_orders:
  order_id=501, customer_key=1001, amount=500, date=2023-06-15
  (Links to Alice in Toronto -- CORRECT for 2023)

  order_id=720, customer_key=1004, amount=300, date=2026-02-10
  (Links to Alice in Mumbai -- CORRECT for 2026)

Querying SCD Type 2

Current state (like Type 1):

SELECT * FROM dim_customer
WHERE is_current = 1;

Historical state (point-in-time):

-- Where was Alice on 2023-06-15?
SELECT * FROM dim_customer
WHERE customer_id = 'C100'
  AND '2023-06-15' >= effective_from
  AND '2023-06-15' < effective_to;

-- Result: Toronto (customer_key=1001)

All versions of a customer:

SELECT * FROM dim_customer
WHERE customer_id = 'C100'
ORDER BY effective_from;

-- Result: Toronto (2022-2026), Mumbai (2026-present)

When to Use Type 2

Regulatory compliance — must track where customers lived when they transacted (GDPR, financial audits)
Accurate historical reporting — revenue by city should reflect the city AT THE TIME of the order
Customer journey analysis — track how customer segments change over time
Any scenario where “what was the state at that point in time?” matters

Real-life analogy: Type 2 is like your passport history. When you renew your passport, the old one is cancelled (end-dated) and a new one is issued. Immigration can see where you traveled with EACH passport. Complete history preserved.

SCD Type 3: Add New Column (Limited History)

The Rule

Add a column for the previous value. You keep ONE level of history but not the full timeline.

After Alice Moves

customer_key	customer_id	name	current_city	previous_city	city_changed_date
1001	C100	Alice	Mumbai	Toronto	2026-01-15
1002	C200	Bob	London	NULL	NULL
1003	C300	Carol	Mumbai	NULL	NULL

SQL Implementation

UPDATE dim_customer
SET previous_city = current_city,     -- Save old value
    current_city = 'Mumbai',          -- Update to new value
    city_changed_date = '2026-01-15'
WHERE customer_id = 'C100';

The Limitation

What if Alice moves AGAIN to Paris in 2027?

customer_key	customer_id	current_city	previous_city
1001	C100	Paris	Mumbai

Toronto is gone. Type 3 only remembers one previous value. You lose the Toronto history.

When to Use Type 3

You need to compare current vs previous values (e.g., “did the customer’s segment change?”)
You only care about one level of change (current and immediately prior)
The attribute rarely changes (maybe once in the customer’s lifetime)

Real-life analogy: Type 3 is like your phone’s “Recent Calls” showing only the last missed call. You know who called most recently, but not who called before that.

SCD Type 6: Hybrid (1 + 2 + 3 Combined)

The Rule

Type 6 combines Types 1, 2, AND 3 in one table. Each row has: – The current value (Type 1 — always up to date) – The historical value for that version (Type 2 — full history) – The previous value (Type 3 — one-level comparison)

It is called Type 6 because 1 + 2 + 3 = 6.

After Alice Moves

customer_key	customer_id	name	historical_city	current_city	previous_city	effective_from	effective_to	is_current
1001	C100	Alice	Toronto	Mumbai	NULL	2022-03-15	2026-01-14	0
1004	C100	Alice	Mumbai	Mumbai	Toronto	2026-01-15	9999-12-31	1

Notice: – historical_city = the city for THAT specific version (does not change) – current_city = Alice’s current city (updated across ALL rows — Type 1) – previous_city = her previous city (Type 3) – Multiple rows with dates = full history (Type 2)

When to Use Type 6

Enterprise data warehouses that need maximum flexibility
When different reports need different perspectives:
“Revenue by city at time of order” → use historical_city
“Revenue by CURRENT city” → use current_city
“Customers who recently changed cities” → compare current_city vs previous_city

Real-life analogy: Type 6 is like a comprehensive employee record that shows: your current title (Type 1), all past titles with dates (Type 2), and your immediately previous title (Type 3). HR can answer any question about your career history.

Side-by-Side Comparison

Feature	Type 0	Type 1	Type 2	Type 3	Type 6
History	None (original preserved)	None (overwritten)	Full	One level	Full + current + previous
Row count	Fixed	Fixed	Grows	Fixed	Grows
Complexity	Simplest	Simple	Medium	Simple	Complex
Storage	Least	Least	More	Slightly more	Most
Query performance	Fast	Fast	Slower (more rows)	Fast	Slower
Point-in-time query	No	No	Yes	Limited	Yes
Current state query	N/A	Yes	Yes (WHERE is_current=1)	Yes	Yes
Use case	Static attributes	Non-critical updates	Regulatory, analytics	Before/after comparison	Enterprise DW

Implementing SCD Type 2 in Azure SQL Database

This is the most common SCD type. Here is a complete implementation:

Create the Dimension Table

CREATE TABLE dim_customer (
    customer_key    INT IDENTITY(1,1) PRIMARY KEY,
    customer_id     VARCHAR(20) NOT NULL,      -- Business key
    name            VARCHAR(100),
    city            VARCHAR(50),
    segment         VARCHAR(50),
    effective_from  DATE NOT NULL,
    effective_to    DATE NOT NULL DEFAULT '9999-12-31',
    is_current      BIT NOT NULL DEFAULT 1,
    created_date    DATETIME2 DEFAULT GETDATE()
);

CREATE INDEX IX_dim_customer_bk ON dim_customer (customer_id, is_current);
CREATE INDEX IX_dim_customer_dates ON dim_customer (customer_id, effective_from, effective_to);

The MERGE Statement (Upsert)

-- Staging table has the latest source data
-- dim_customer is the target SCD Type 2 table

-- Step 1: Close existing current records where source data changed
UPDATE d
SET d.effective_to = DATEADD(DAY, -1, CAST(GETDATE() AS DATE)),
    d.is_current = 0
FROM dim_customer d
INNER JOIN staging_customer s ON d.customer_id = s.customer_id
WHERE d.is_current = 1
  AND (d.city != s.city OR d.segment != s.segment);

-- Step 2: Insert new version for changed records
INSERT INTO dim_customer (customer_id, name, city, segment, effective_from, effective_to, is_current)
SELECT s.customer_id, s.name, s.city, s.segment,
       CAST(GETDATE() AS DATE), '9999-12-31', 1
FROM staging_customer s
INNER JOIN dim_customer d ON s.customer_id = d.customer_id
WHERE d.is_current = 0
  AND d.effective_to = DATEADD(DAY, -1, CAST(GETDATE() AS DATE));

-- Step 3: Insert brand new customers
INSERT INTO dim_customer (customer_id, name, city, segment, effective_from, effective_to, is_current)
SELECT s.customer_id, s.name, s.city, s.segment,
       CAST(GETDATE() AS DATE), '9999-12-31', 1
FROM staging_customer s
WHERE NOT EXISTS (
    SELECT 1 FROM dim_customer d WHERE d.customer_id = s.customer_id
);

Why Three Steps?

Close old records — the existing current row for changed customers gets an end date
Insert new versions — a new row with the updated values and is_current=1
Insert new customers — first-time customers get their first row

Real-life analogy: It is like a hotel ledger. When a guest checks out of one room (close old record) and checks into another room (insert new record), both entries stay in the ledger. New guests who have never stayed before get their first entry.

Implementing SCD Type 2 in Synapse Data Flows

Visual Approach

Source: staging_customer
Source: dim_customer (WHERE is_current = 1)
  |
  JOIN: staging LEFT OUTER JOIN dim ON customer_id
  |
  Conditional Split:
    New Customer:    isNull(dim@customer_key) --> Insert new row
    Changed:         staging@city != dim@city OR staging@segment != dim@segment
                     --> Two paths:
                        1. Update old row (set is_current=0, effective_to=today)
                        2. Insert new row (is_current=1, effective_from=today)
    Unchanged:       Default --> discard (no action needed)

The Alter Row Transformation

Use Alter Row to mark rows for INSERT or UPDATE:

Insert if:  isNull(dim@customer_key)                    -- New customer
Upsert if:  staging@city != dim@city                    -- Changed customer (close + insert)

Implementing SCD Type 2 with Delta Lake MERGE

In Databricks or Synapse Spark, Delta Lake makes SCD Type 2 elegant:

from delta.tables import DeltaTable

dim_customer = DeltaTable.forPath(spark, "/delta/dim_customer")

# MERGE: handle new, changed, and unchanged in one statement
(dim_customer.alias("target")
    .merge(
        staging_df.alias("source"),
        "target.customer_id = source.customer_id AND target.is_current = true"
    )
    # When matched AND values changed: close the old record
    .whenMatchedUpdate(
        condition="source.city != target.city OR source.segment != target.segment",
        set={
            "is_current": "false",
            "effective_to": "current_date()"
        }
    )
    # When not matched: insert new customer
    .whenNotMatchedInsert(values={
        "customer_id": "source.customer_id",
        "name": "source.name",
        "city": "source.city",
        "segment": "source.segment",
        "effective_from": "current_date()",
        "effective_to": "lit('9999-12-31')",
        "is_current": "true"
    })
    .execute()
)

# Second pass: insert new version rows for changed records
changed_records = staging_df.join(
    dim_customer.toDF().filter("is_current = false AND effective_to = current_date()"),
    "customer_id"
)
# Insert new current rows for changed customers

SCD in the Medallion Architecture

Bronze: Raw data from source (staging_customer)
  |
  v
Silver: Apply SCD Type 2 logic
  |-- dim_customer with full history
  |-- effective_from, effective_to, is_current
  |
  v
Gold: Business views
  |-- v_current_customers (WHERE is_current = 1)
  |-- v_customer_history (all versions, for audit)

Handling Multiple Changes

What if Alice moves to Mumbai in January AND her segment changes to “Enterprise” in March?

Two Separate Changes = Two New Rows

customer_key	customer_id	city	segment	effective_from	effective_to	is_current
1001	C100	Toronto	Premium	2022-03-15	2026-01-14	0
1004	C100	Mumbai	Premium	2026-01-15	2026-03-09	0
1005	C100	Mumbai	Enterprise	2026-03-10	9999-12-31	1

Three versions of Alice. The fact table links to the correct version based on when each order was placed.

Edge Cases and Gotchas

1. Same-Day Changes

Alice moves to Mumbai AND changes segment on the same day. Do you create one row or two?

Best practice: One row with both changes. If they happen on the same effective date, combine them into a single version.

2. Backdated Changes

Source system sends a correction: “Alice actually moved on December 15, 2025, not January 15, 2026.”

Challenge: Orders between Dec 15 and Jan 15 were linked to the wrong dimension version.

Fix: Reprocess the affected date range. This is complex and is why most teams process SCD changes nightly (not retroactively).

3. Deleted Customers

Source system deletes a customer. Do you delete from the dimension?

Best practice: NEVER delete from dimensions. Set is_current = 0 and effective_to = today. This preserves history for past fact records.

4. Reactivated Customers

Alice was marked inactive in 2025, then reactivated in 2026.

Solution: Insert a new row with is_current = 1. The inactive period is captured in the previous version’s date range.

Real-World Scenarios

Scenario 1: Financial Services (Regulatory)

Requirement: Auditors need to know which branch a customer belonged to when each transaction occurred.

SCD Type: Type 2 (mandatory for financial compliance).

Why: If Alice was in the Toronto branch when she made a $10,000 deposit, the audit must show Toronto — even if she later transferred to Mumbai.

Scenario 2: E-Commerce Product Catalog

Requirement: Product prices change weekly. Reports show revenue by product.

SCD Type: Type 2 for price, Type 1 for product name corrections.

Why: Revenue should reflect the price AT THE TIME OF SALE. But if a product name had a typo, fix it everywhere (Type 1).

Scenario 3: HR Employee Tracking

Requirement: Track employee promotions and department transfers.

SCD Type: Type 2 for department and title. Type 0 for hire date.

Why: Headcount by department for Q1 2025 should use the department employees were in DURING Q1, not their current department.

Scenario 4: Marketing Customer Segmentation

Requirement: Marketing wants to see if customers who changed segments had higher spend.

SCD Type: Type 6 (Type 2 + Type 3 combined).

Why: Type 2 gives full history. The previous_segment column (Type 3) makes it easy to query “customers who upgraded from Standard to Premium.”

Interview Questions

Q: What is a Slowly Changing Dimension? A: A dimension table where attribute values change over time — customer address changes, product price changes, employee gets promoted. SCD refers to the strategy for handling these changes in the data warehouse.

Q: What is the difference between SCD Type 1 and Type 2? A: Type 1 overwrites the old value with the new one, losing all history. Type 2 keeps the old row (end-dated) and inserts a new row with the updated value, preserving full history. Type 1 is simpler but cannot answer “what was the state at a past point in time?”

Q: Why do SCD Type 2 tables use surrogate keys? A: Because one business entity (customer_id = C100) can have multiple rows representing different time periods. The surrogate key (customer_key) uniquely identifies each VERSION. Fact tables reference the surrogate key to link to the correct historical version.

Q: How do you query the current state in an SCD Type 2 table? A: SELECT * FROM dim_customer WHERE is_current = 1. This filters to only the latest version of each record, giving you the same result as a Type 1 table.

Q: How do you implement SCD Type 2 in Azure? A: Three approaches: (1) SQL MERGE/UPDATE+INSERT in Azure SQL Database or Synapse Dedicated SQL Pool, (2) Data Flow with Conditional Split, Alter Row, and two sinks (update old + insert new), (3) Delta Lake MERGE in Databricks or Synapse Spark.

Q: When would you NOT use SCD Type 2? A: When history does not matter (use Type 1 for corrections). When storage and query performance are critical and the dimension has millions of rows (Type 2 grows the table). When the attribute changes very frequently (hourly changes would create enormous table growth).

Wrapping Up

SCD is one of those concepts that separates junior data engineers from senior ones. Everyone can build a pipeline that copies data. But handling dimension changes correctly — knowing when to overwrite, when to version, and when to preserve history — requires understanding the business requirements behind the data.

The most important question to ask stakeholders: “Do you need to know what the value WAS at a past point in time, or do you only care about what it is NOW?” The answer determines your SCD type.

← Previous: Audit Logging Concepts Azure (22/37) Next: SCD Type 1 Full Load →

Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Table of Contents

What Is a Slowly Changing Dimension?

The Star Schema Context

Our Running Example: Alice Moves to Mumbai

SCD Type 0: Fixed (Never Changes)

The Rule

After Alice Moves

When to Use Type 0

SCD Type 1: Overwrite (Lose History)

The Rule

After Alice Moves

SQL Implementation

When to Use Type 1

The Problem with Type 1

SCD Type 2: Add New Row (Full History)

The Rule

After Alice Moves

Why Surrogate Keys Matter

Querying SCD Type 2

When to Use Type 2

SCD Type 3: Add New Column (Limited History)

The Rule

After Alice Moves

SQL Implementation

The Limitation

When to Use Type 3

SCD Type 6: Hybrid (1 + 2 + 3 Combined)

The Rule

After Alice Moves

When to Use Type 6

Side-by-Side Comparison

Implementing SCD Type 2 in Azure SQL Database

Create the Dimension Table

The MERGE Statement (Upsert)

Why Three Steps?

Implementing SCD Type 2 in Synapse Data Flows

Visual Approach

The Alter Row Transformation

Implementing SCD Type 2 with Delta Lake MERGE

SCD in the Medallion Architecture

Handling Multiple Changes

Two Separate Changes = Two New Rows

Edge Cases and Gotchas

1. Same-Day Changes

2. Backdated Changes

3. Deleted Customers

4. Reactivated Customers

Real-World Scenarios

Scenario 1: Financial Services (Regulatory)

Scenario 2: E-Commerce Product Catalog

Scenario 3: HR Employee Tracking

Scenario 4: Marketing Customer Segmentation

Interview Questions

Wrapping Up

Related Posts

Leave a Comment Cancel Reply