SQL Transactions: BEGIN, COMMIT, ROLLBACK, ACID Properties, Isolation Levels, and Real-World Scenarios Every Data Engineer Must Understand

SQL Transactions: BEGIN, COMMIT, ROLLBACK, ACID Properties, Isolation Levels, and Real-World Scenarios Every Data Engineer Must Understand

You build a pipeline that inserts 10,000 rows into a staging table, then updates a dimension table, then deletes processed records from the source. The INSERT succeeds. The UPDATE succeeds. The DELETE fails halfway. Now you have a staging table that is half-deleted, a dimension table that is fully updated, and no way to tell which source records were already processed.

This is exactly why transactions exist.

A transaction groups multiple SQL statements into ONE atomic unit — either ALL of them succeed, or NONE of them take effect. If the DELETE fails, the INSERT and UPDATE are automatically undone. Your database goes back to exactly where it was before the transaction started. No half-done state. No data corruption.

Think of a transaction like a bank transfer. When you transfer $500 from savings to checking, the bank does not debit savings first and then hope the credit to checking works. It wraps BOTH operations in a transaction: debit savings AND credit checking happen as ONE unit. If the credit fails, the debit is reversed. Your money does not disappear into thin air. That guarantee — all or nothing — is what transactions provide.

Table of Contents

  • What Is a Transaction?
  • The ACID Properties (The Four Guarantees)
  • BEGIN TRANSACTION, COMMIT, ROLLBACK
  • Implicit vs Explicit Transactions
  • SAVEPOINT — Partial Rollback
  • Real-World Scenario 1: Bank Transfer
  • Real-World Scenario 2: E-Commerce Order Placement
  • Real-World Scenario 3: SCD Type 2 Dimension Load
  • Real-World Scenario 4: ETL Pipeline with Staging
  • Real-World Scenario 5: Inventory Reservation
  • Real-World Scenario 6: Payroll Processing
  • Error Handling with TRY/CATCH + Transactions
  • Transaction Scope and Nesting
  • Isolation Levels — Controlling What Other Users See
  • READ UNCOMMITTED (Dirty Reads)
  • READ COMMITTED (Default)
  • REPEATABLE READ
  • SERIALIZABLE
  • SNAPSHOT Isolation
  • Isolation Level Comparison
  • Deadlocks — When Two Transactions Block Each Other
  • Transactions in Stored Procedures
  • Transactions in Data Engineering Pipelines
  • Long-Running Transactions and Locks
  • Common Mistakes
  • Interview Questions
  • Wrapping Up

What Is a Transaction?

A transaction is a logical unit of work — a group of SQL statements that either ALL succeed (COMMIT) or ALL fail (ROLLBACK). There is no in-between.

-- Without transaction (DANGEROUS):
UPDATE accounts SET balance = balance - 500 WHERE account_id = 1001;  -- Debit ✅
-- Power failure here! Debit happened, credit did NOT
UPDATE accounts SET balance = balance + 500 WHERE account_id = 1002;  -- Credit ❌ NEVER RUNS
-- Result: $500 disappeared from the system

-- With transaction (SAFE):
BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - 500 WHERE account_id = 1001;  -- Debit
    UPDATE accounts SET balance = balance + 500 WHERE account_id = 1002;  -- Credit
COMMIT;
-- Either BOTH happen or NEITHER happens. Money cannot disappear.

The ACID Properties (The Four Guarantees)

Every transaction provides four guarantees, known as ACID:

Atomicity — All or Nothing

"Either ALL statements in the transaction succeed, or NONE of them take effect."

BEGIN TRANSACTION;
    INSERT INTO orders VALUES (5001, 1001, 999.99);     -- ✅ Succeeds
    INSERT INTO order_items VALUES (5001, 'Laptop', 1);  -- ✅ Succeeds
    UPDATE inventory SET stock = stock - 1 WHERE product = 'Laptop';  -- ❌ FAILS (stock = 0)
ROLLBACK;
-- ALL three statements are undone. The order and order_item never existed.

Real-life analogy: Atomicity is like an ATM withdrawal. The machine does not dispense cash unless it can ALSO debit your account. If the debit fails, no cash comes out. If the cash jams, the debit is reversed. Both actions are one atomic unit.

Consistency — From One Valid State to Another

"The database moves from one valid state to another. Constraints are never violated."

BEGIN TRANSACTION;
    UPDATE accounts SET balance = balance - 500 WHERE account_id = 1001;
    -- Constraint: CHECK (balance >= 0)
    -- If balance would go negative, the constraint violation causes ROLLBACK
COMMIT;
-- Database was consistent before (all balances >= 0)
-- Database is consistent after (all balances >= 0)
-- The transaction cannot leave the database in an invalid state

Real-life analogy: Consistency is like a chess game. Every move must follow the rules. You cannot move a rook diagonally. After every move, the board is in a valid state. A transaction is a “move” — it cannot violate the rules (constraints).

Isolation — Transactions Do Not Interfere

"Concurrent transactions behave as if they are running one at a time."

Transaction A (User 1):                Transaction B (User 2):
BEGIN TRANSACTION;
UPDATE accounts SET balance = 1000      
WHERE account_id = 1001;                BEGIN TRANSACTION;
                                        SELECT balance FROM accounts
                                        WHERE account_id = 1001;
                                        -- What does User 2 see?
                                        -- Depends on the ISOLATION LEVEL
COMMIT;

Real-life analogy: Isolation is like individual fitting rooms in a clothing store. Multiple customers try on clothes simultaneously, but each room is private. Customer A’s half-dressed state is not visible to Customer B. Each customer sees a complete, consistent state.

Durability — Committed Data Survives Crashes

"Once a transaction is committed, the data is permanently saved —
even if the server crashes one millisecond after COMMIT."

BEGIN TRANSACTION;
    INSERT INTO critical_data VALUES ('important record');
COMMIT;   -- ← At this moment, data is written to disk/transaction log
-- Server crashes 1ms later
-- When server restarts: 'important record' is still there. Guaranteed.

Real-life analogy: Durability is like a notarized contract. Once signed (COMMIT), it is legally binding even if the office burns down — because copies exist in the registry (transaction log). The record survives any disaster.

BEGIN TRANSACTION, COMMIT, ROLLBACK

The Three Commands

-- BEGIN: Start a transaction
BEGIN TRANSACTION;       -- SQL Server
BEGIN;                   -- PostgreSQL, MySQL
START TRANSACTION;       -- MySQL alternative

-- COMMIT: Save all changes permanently
COMMIT;
COMMIT TRANSACTION;      -- SQL Server alternative

-- ROLLBACK: Undo all changes since BEGIN
ROLLBACK;
ROLLBACK TRANSACTION;    -- SQL Server alternative

Basic Flow

-- SUCCESS path:
BEGIN TRANSACTION;
    INSERT INTO employees VALUES (9999, 'Test', 'Engineering', 90000);
    UPDATE department_stats SET headcount = headcount + 1 WHERE dept = 'Engineering';
COMMIT;
-- Both INSERT and UPDATE are saved permanently

-- FAILURE path:
BEGIN TRANSACTION;
    INSERT INTO employees VALUES (9999, 'Test', 'Engineering', 90000);
    UPDATE department_stats SET headcount = headcount + 1 WHERE dept = 'Engineering';
ROLLBACK;
-- Both INSERT and UPDATE are undone. Database unchanged.

Implicit vs Explicit Transactions

Implicit (Auto-Commit)

By default, every SQL statement is its own transaction:

-- Each statement auto-commits immediately
INSERT INTO employees VALUES (1, 'Naveen', 'Eng', 100000);  -- Auto-committed
UPDATE employees SET salary = 110000 WHERE emp_id = 1;        -- Auto-committed
-- Each statement is independent. Cannot undo the INSERT after the UPDATE.

Explicit (Manual Control)

You explicitly define the transaction boundaries:

BEGIN TRANSACTION;
    INSERT INTO employees VALUES (1, 'Naveen', 'Eng', 100000);
    UPDATE employees SET salary = 110000 WHERE emp_id = 1;
COMMIT;  -- Or ROLLBACK to undo both

Rule: Always use explicit transactions when multiple related statements must succeed or fail together.

SAVEPOINT — Partial Rollback

A savepoint lets you ROLLBACK to a specific point within a transaction without undoing everything:

BEGIN TRANSACTION;
    INSERT INTO orders VALUES (5001, 1001, 'Laptop', 999.99);
    SAVE TRANSACTION sp_after_order;     -- Mark this point

    INSERT INTO order_items VALUES (5001, 'Warranty', 99.99);
    -- Oops, customer does not want the warranty

    ROLLBACK TRANSACTION sp_after_order; -- Undo only the warranty insert
    -- The order (5001) still exists!

COMMIT;
-- Result: Order 5001 is saved. Warranty item is NOT saved.

Real-life analogy: Savepoint is like a bookmark in a book. If you read ahead and decide you don’t like the chapter, you go back to the bookmark — not to the beginning of the book. You keep everything before the bookmark.

Real-World Scenario 1: Bank Transfer

CREATE PROCEDURE usp_transfer_funds
    @from_account INT,
    @to_account INT,
    @amount DECIMAL(10,2)
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Step 1: Verify source account has sufficient funds
            DECLARE @current_balance DECIMAL(10,2);
            SELECT @current_balance = balance FROM accounts WHERE account_id = @from_account;

            IF @current_balance < @amount
            BEGIN
                RAISERROR('Insufficient funds. Balance: %s, Requested: %s', 16, 1,
                    @current_balance, @amount);
                RETURN;
            END;

            -- Step 2: Debit source account
            UPDATE accounts
            SET balance = balance - @amount, last_modified = GETDATE()
            WHERE account_id = @from_account;

            -- Step 3: Credit destination account
            UPDATE accounts
            SET balance = balance + @amount, last_modified = GETDATE()
            WHERE account_id = @to_account;

            -- Step 4: Log the transaction
            INSERT INTO transaction_log (from_account, to_account, amount, transaction_date, status)
            VALUES (@from_account, @to_account, @amount, GETDATE(), 'SUCCESS');

        COMMIT;
        PRINT 'Transfer successful: $' + CAST(@amount AS VARCHAR);

    END TRY
    BEGIN CATCH
        ROLLBACK;
        INSERT INTO transaction_log (from_account, to_account, amount, transaction_date, status, error_message)
        VALUES (@from_account, @to_account, @amount, GETDATE(), 'FAILED', ERROR_MESSAGE());
        THROW;
    END CATCH;
END;

What happens without a transaction: Debit succeeds, credit fails → money disappears from source, never arrives at destination. Customer loses $500.

What happens with a transaction: Debit and credit are atomic. If credit fails, debit is rolled back. Money stays in the source account. Nothing is lost.

Real-World Scenario 2: E-Commerce Order Placement

CREATE PROCEDURE usp_place_order
    @customer_id INT,
    @product_id INT,
    @quantity INT
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Step 1: Check inventory
            DECLARE @current_stock INT, @price DECIMAL(10,2);
            SELECT @current_stock = stock, @price = price
            FROM products WHERE product_id = @product_id;

            IF @current_stock < @quantity
            BEGIN
                RAISERROR('Insufficient stock', 16, 1);
                RETURN;
            END;

            -- Step 2: Create the order
            DECLARE @order_id INT;
            INSERT INTO orders (customer_id, order_date, status, total_amount)
            VALUES (@customer_id, GETDATE(), 'Confirmed', @price * @quantity);
            SET @order_id = SCOPE_IDENTITY();

            -- Step 3: Create order items
            INSERT INTO order_items (order_id, product_id, quantity, unit_price)
            VALUES (@order_id, @product_id, @quantity, @price);

            -- Step 4: Reduce inventory
            UPDATE products
            SET stock = stock - @quantity
            WHERE product_id = @product_id;

            -- Step 5: Create payment record
            INSERT INTO payments (order_id, amount, payment_date, status)
            VALUES (@order_id, @price * @quantity, GETDATE(), 'Pending');

        COMMIT;
        PRINT 'Order ' + CAST(@order_id AS VARCHAR) + ' placed successfully';

    END TRY
    BEGIN CATCH
        ROLLBACK;
        -- Order, order_items, inventory update, payment — ALL undone
        PRINT 'Order failed: ' + ERROR_MESSAGE();
        THROW;
    END CATCH;
END;

Without transaction: Order created → items added → inventory reduced → payment fails → customer has an order with no payment, and inventory is already reduced. Manual cleanup nightmare.

With transaction: If ANY step fails, everything rolls back. No orphan orders, no phantom inventory reductions.

Real-World Scenario 3: SCD Type 2 Dimension Load

CREATE PROCEDURE usp_load_dim_customer
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Step 1: Expire changed records (set end_date and is_active=0)
            UPDATE dim_customer
            SET end_date = CAST(GETDATE() AS DATE),
                is_active = 0
            WHERE customer_id IN (
                SELECT s.customer_id FROM staging_customers s
                JOIN dim_customer d ON s.customer_id = d.customer_id AND d.is_active = 1
                WHERE s.city != d.city OR s.email != d.email
            );

            -- Step 2: Insert new versions of changed records
            INSERT INTO dim_customer (customer_id, name, email, city, start_date, end_date, is_active)
            SELECT s.customer_id, s.name, s.email, s.city,
                   CAST(GETDATE() AS DATE), '9999-12-31', 1
            FROM staging_customers s
            WHERE s.customer_id IN (
                SELECT customer_id FROM dim_customer
                WHERE is_active = 0 AND end_date = CAST(GETDATE() AS DATE)
            );

            -- Step 3: Insert brand new customers
            INSERT INTO dim_customer (customer_id, name, email, city, start_date, end_date, is_active)
            SELECT s.customer_id, s.name, s.email, s.city,
                   CAST(GETDATE() AS DATE), '9999-12-31', 1
            FROM staging_customers s
            WHERE s.customer_id NOT IN (SELECT customer_id FROM dim_customer);

            -- Step 4: Log the load
            INSERT INTO etl_log (procedure_name, rows_expired, rows_inserted, load_date, status)
            VALUES ('usp_load_dim_customer', @@ROWCOUNT, @@ROWCOUNT, GETDATE(), 'SUCCESS');

        COMMIT;

    END TRY
    BEGIN CATCH
        ROLLBACK;
        INSERT INTO etl_log (procedure_name, load_date, status, error_message)
        VALUES ('usp_load_dim_customer', GETDATE(), 'FAILED', ERROR_MESSAGE());
        THROW;
    END CATCH;
END;

Without transaction: Old records expired (Step 1) → new versions INSERT fails (Step 2) → customers now have NO active record. SCD is broken. Fact table joins return nothing.

With transaction: If Step 2 fails, Step 1’s expiration is rolled back. Old records remain active. SCD stays consistent.

Real-World Scenario 4: ETL Pipeline with Staging

CREATE PROCEDURE usp_etl_daily_load
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Step 1: Truncate staging
            TRUNCATE TABLE stg_orders;

            -- Step 2: Load new data into staging
            INSERT INTO stg_orders
            SELECT * FROM source_db.dbo.orders
            WHERE order_date >= DATEADD(DAY, -1, CAST(GETDATE() AS DATE));

            -- Step 3: Merge into production
            MERGE INTO fact_orders AS target
            USING stg_orders AS source
            ON target.order_id = source.order_id
            WHEN MATCHED THEN
                UPDATE SET target.amount = source.amount, target.status = source.status
            WHEN NOT MATCHED THEN
                INSERT (order_id, customer_id, amount, order_date, status)
                VALUES (source.order_id, source.customer_id, source.amount,
                        source.order_date, source.status);

            -- Step 4: Update watermark
            UPDATE etl_config
            SET last_loaded_date = CAST(GETDATE() AS DATE)
            WHERE pipeline_name = 'daily_orders';

        COMMIT;

    END TRY
    BEGIN CATCH
        ROLLBACK;
        THROW;
    END CATCH;
END;

Real-World Scenario 5: Inventory Reservation

CREATE PROCEDURE usp_reserve_inventory
    @product_id INT,
    @quantity INT,
    @customer_id INT
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Lock the row to prevent concurrent reservations
            DECLARE @available INT;
            SELECT @available = stock - reserved
            FROM products WITH (UPDLOCK, ROWLOCK)
            WHERE product_id = @product_id;

            IF @available < @quantity
            BEGIN
                RAISERROR('Only %d units available', 16, 1, @available);
                RETURN;
            END;

            -- Reserve the quantity
            UPDATE products
            SET reserved = reserved + @quantity
            WHERE product_id = @product_id;

            -- Create reservation record
            INSERT INTO reservations (product_id, customer_id, quantity, reserved_at, expires_at)
            VALUES (@product_id, @customer_id, @quantity, GETDATE(), DATEADD(MINUTE, 15, GETDATE()));

        COMMIT;

    END TRY
    BEGIN CATCH
        ROLLBACK;
        THROW;
    END CATCH;
END;

Why the lock hint (UPDLOCK, ROWLOCK)? Without it, two customers could both read “10 available,” both reserve 8, and the system oversells (16 reserved from 10 available). The lock prevents this race condition.

Real-World Scenario 6: Payroll Processing

CREATE PROCEDURE usp_process_payroll
    @pay_period_end DATE
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;

            -- Step 1: Calculate pay for each employee
            INSERT INTO payroll (emp_id, pay_period, gross_pay, tax, net_pay, status)
            SELECT emp_id, @pay_period_end,
                   salary / 24 AS gross_pay,
                   salary / 24 * 0.25 AS tax,
                   salary / 24 * 0.75 AS net_pay,
                   'Processed'
            FROM employees
            WHERE is_active = 1;

            -- Step 2: Create bank transfer records
            INSERT INTO bank_transfers (emp_id, bank_account, amount, transfer_date, status)
            SELECT e.emp_id, e.bank_account, p.net_pay, GETDATE(), 'Pending'
            FROM payroll p
            JOIN employees e ON p.emp_id = e.emp_id
            WHERE p.pay_period = @pay_period_end;

            -- Step 3: Mark period as processed
            UPDATE payroll_periods
            SET status = 'Completed', processed_date = GETDATE()
            WHERE period_end = @pay_period_end;

        COMMIT;
        PRINT 'Payroll processed for ' + CAST(@@ROWCOUNT AS VARCHAR) + ' employees';

    END TRY
    BEGIN CATCH
        ROLLBACK;
        -- NO employee gets partial pay. Either all get paid or none.
        PRINT 'Payroll FAILED — all changes rolled back: ' + ERROR_MESSAGE();
        THROW;
    END CATCH;
END;

Without transaction: 500 employees get payroll records, 300 get bank transfers, then it fails. 200 employees have payroll but no transfer. Accounting nightmare.

With transaction: All 500 succeed together, or nothing happens. Run again after fixing the issue.

Error Handling with TRY/CATCH + Transactions

-- The standard pattern for production transactions:
BEGIN TRY
    BEGIN TRANSACTION;

        -- Your SQL statements here
        INSERT INTO ...;
        UPDATE ...;
        DELETE ...;

    COMMIT;
END TRY
BEGIN CATCH
    -- Only rollback if a transaction is active
    IF @@TRANCOUNT > 0
        ROLLBACK;

    -- Log the error
    DECLARE @error_message NVARCHAR(4000) = ERROR_MESSAGE();
    DECLARE @error_severity INT = ERROR_SEVERITY();
    DECLARE @error_line INT = ERROR_LINE();

    INSERT INTO error_log (error_message, error_severity, error_line, error_date)
    VALUES (@error_message, @error_severity, @error_line, GETDATE());

    -- Re-throw the error to the caller
    THROW;
END CATCH;

Why check @@TRANCOUNT? In nested procedures, the transaction might already be rolled back by an inner CATCH. Calling ROLLBACK on a non-existent transaction causes another error.

Isolation Levels — Controlling What Other Users See

When multiple users query the same data simultaneously, isolation levels control what each user sees:

READ UNCOMMITTED (Dirty Reads Allowed)

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- Can read data that another transaction has modified but NOT yet committed
-- FASTEST but LEAST safe — might read data that gets rolled back

-- Shorthand in SQL Server:
SELECT * FROM employees WITH (NOLOCK);

Real-life analogy: Reading someone’s essay while they are still writing it. They might delete the last paragraph. You read “data” that never actually existed in the final version.

READ COMMITTED (Default)

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- Can only read data that has been COMMITTED
-- DEFAULT in SQL Server and PostgreSQL
-- Cannot see uncommitted changes from other transactions

REPEATABLE READ

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Once you read a row, no other transaction can MODIFY it until you commit
-- Running the same SELECT twice in your transaction returns the SAME results
-- Prevents "non-repeatable reads" but allows "phantom rows" (new inserts)

SERIALIZABLE (Strictest)

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- Transactions behave as if running ONE AT A TIME (serial)
-- No dirty reads, no non-repeatable reads, no phantom rows
-- SLOWEST — maximum locking

SNAPSHOT Isolation

-- SQL Server: Database-level setting
ALTER DATABASE MyDB SET ALLOW_SNAPSHOT_ISOLATION ON;

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
-- Each transaction sees a SNAPSHOT of the database as of when it started
-- No locking — readers do not block writers, writers do not block readers
-- Uses row versioning in tempdb

Isolation Level Comparison

Isolation Level Dirty Read Non-Repeatable Read Phantom Rows Locking Performance
READ UNCOMMITTED ✅ Possible ✅ Possible ✅ Possible None Fastest
READ COMMITTED ❌ Prevented ✅ Possible ✅ Possible Shared locks (released after read) Fast (default)
REPEATABLE READ ❌ Prevented ❌ Prevented ✅ Possible Shared locks (held until commit) Medium
SERIALIZABLE ❌ Prevented ❌ Prevented ❌ Prevented Range locks Slowest
SNAPSHOT ❌ Prevented ❌ Prevented ❌ Prevented None (row versioning) Fast (uses tempdb)

For data engineering pipelines: READ COMMITTED (default) is almost always sufficient. Use SNAPSHOT for heavy read workloads that should not block writes.

Deadlocks — When Two Transactions Block Each Other

Transaction A:                          Transaction B:
BEGIN TRAN;                             BEGIN TRAN;
UPDATE accounts SET ... WHERE id=1;     UPDATE accounts SET ... WHERE id=2;
-- Holds lock on account 1              -- Holds lock on account 2

UPDATE accounts SET ... WHERE id=2;     UPDATE accounts SET ... WHERE id=1;
-- WAITING for B to release account 2   -- WAITING for A to release account 1

-- DEADLOCK! Both waiting forever.
-- SQL Server detects this and KILLS one transaction (the "victim")
-- The victim gets error 1205 and must retry

How to Prevent Deadlocks

  1. Access tables in the same order — if all transactions update account 1 before account 2, no circular wait
  2. Keep transactions short — less time holding locks = less chance of conflict
  3. Use the lowest isolation level needed — fewer locks = fewer deadlocks
  4. Add retry logic — deadlock victims should retry the transaction

Transactions in Data Engineering Pipelines

In ADF/Synapse Stored Procedure Activity

-- This stored procedure is called by ADF as a pipeline activity
-- The transaction ensures the entire load is atomic
CREATE PROCEDURE usp_load_silver_customers
AS
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
            TRUNCATE TABLE silver.customers;
            INSERT INTO silver.customers SELECT ... FROM bronze.customers WHERE ...;
            UPDATE etl_log SET last_run = GETDATE() WHERE table_name = 'customers';
        COMMIT;
    END TRY
    BEGIN CATCH
        ROLLBACK;
        THROW;  -- ADF catches this as a failed activity
    END CATCH;
END;

In Fabric Warehouse

-- Fabric Warehouse supports full T-SQL transactions
BEGIN TRANSACTION;
    MERGE INTO dim_customer AS target
    USING staging_customers AS source ON target.customer_id = source.customer_id
    WHEN MATCHED THEN UPDATE SET ...
    WHEN NOT MATCHED THEN INSERT ...;
COMMIT;

In Databricks (Delta Lake)

# Delta Lake provides ACID transactions natively
# Each write operation is automatically atomic
df.write.format("delta").mode("overwrite").saveAsTable("silver.customers")
# This is a single atomic transaction — all rows or nothing

# Delta MERGE is also atomic
deltaTable.alias("t").merge(source.alias("s"), "t.id = s.id")     .whenMatchedUpdate(...)     .whenNotMatchedInsert(...)     .execute()
# The entire MERGE is one transaction

Common Mistakes

  1. No transaction around related statements — INSERT + UPDATE + DELETE that depend on each other MUST be in a transaction. Without it, partial failures leave data inconsistent.

  2. Long-running transactions — a transaction that runs for 30 minutes holds locks for 30 minutes, blocking all other users. Keep transactions as short as possible.

  3. Forgetting ROLLBACK in CATCH — if TRY fails and CATCH does not ROLLBACK, the transaction stays open, holding locks indefinitely.

  4. Not checking @@TRANCOUNT before ROLLBACK — in nested procedures, the transaction might already be rolled back. IF @@TRANCOUNT > 0 ROLLBACK prevents errors.

  5. Using READ UNCOMMITTED to avoid lockingWITH (NOLOCK) reads uncommitted data that might be rolled back. For reports, use SNAPSHOT isolation instead — consistent reads without locking.

  6. Not handling deadlocks — deadlocks WILL happen in concurrent systems. Build retry logic into your applications and stored procedures.

  7. TRUNCATE inside a transaction without understanding — TRUNCATE is minimally logged and CAN be rolled back in SQL Server (contrary to popular belief), but it acquires a schema lock on the table.

Interview Questions

Q: What are the ACID properties? A: Atomicity (all or nothing — entire transaction succeeds or is rolled back), Consistency (database moves from one valid state to another, constraints are never violated), Isolation (concurrent transactions do not interfere with each other), Durability (committed data survives crashes — permanently written to disk).

Q: What is the difference between COMMIT and ROLLBACK? A: COMMIT saves all changes made within the transaction permanently to the database. ROLLBACK undoes all changes, returning the database to its state before BEGIN TRANSACTION. Once committed, changes cannot be rolled back.

Q: What is a deadlock and how do you prevent it? A: A deadlock occurs when two transactions each hold a lock the other needs, creating a circular wait. SQL Server detects deadlocks and kills one transaction (the “victim”). Prevention: access tables in consistent order across all transactions, keep transactions short, use appropriate isolation levels, and implement retry logic for deadlock victims.

Q: What is the difference between READ COMMITTED and SNAPSHOT isolation? A: READ COMMITTED (default) allows reading only committed data but uses shared locks that can block writers. SNAPSHOT isolation uses row versioning to give each transaction a consistent point-in-time view without any locking — readers never block writers and vice versa. SNAPSHOT uses more tempdb space for row versions.

Q: Why are transactions important in ETL pipelines? A: ETL pipelines often have multiple dependent steps: truncate staging, load new data, merge into production, update watermark. Without a transaction, a failure in step 3 leaves staging truncated (step 1) and production partially updated. With a transaction, the entire pipeline is atomic — partial failures roll back everything, and the pipeline can be safely rerun.

Wrapping Up

Transactions are the safety net of your database. They guarantee that related operations succeed or fail together, that concurrent users do not corrupt each other’s work, and that committed data survives any failure.

The pattern is simple: BEGIN TRANSACTION, do your work, COMMIT if everything succeeds, ROLLBACK if anything fails. Wrap it in TRY/CATCH for production code. Choose the right isolation level for your workload. And keep transactions as short as possible to minimize locking.

Every real-world scenario — bank transfers, order placement, SCD loads, ETL pipelines, inventory reservations, payroll processing — follows this same pattern. Master transactions, and your data is always consistent.

Related posts:SQL DDL, DML & ConstraintsStored Procedures & TriggersSCD TypesDelta Lake Deep Dive (ACID in Delta)


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top
Share via
Copy link