Common Azure Data Factory and Synapse Pipeline Errors: A Troubleshooting Guide

If you’ve spent any time building pipelines in Azure Data Factory or Synapse Analytics, you’ve run into errors that make you want to close your laptop. The error messages are often vague, the documentation is scattered, and the same issue can have five different causes depending on your configuration.

This post is a collection of every real error I’ve encountered while building metadata-driven pipelines, parameterized datasets, and incremental load patterns. For each error, I’ll explain what causes it, how to fix it, and how to prevent it from happening again.

Bookmark this page — you’ll come back to it.

Table of Contents

  • Error 1: BadRequest with Null Message
  • Error 2: “The template function ‘dataset’ is not defined or not valid”
  • Error 3: “First row only” Causing ForEach to Fail
  • Error 4: Case-Sensitivity in @item() Expressions
  • Error 5: ADLS Container Does Not Exist
  • Error 6: “Enter manually” vs “Edit” — Where’s the Checkbox?
  • Error 7: Single Quote Escaping in Dynamic SQL
  • Error 8: Stored Procedure Parameter Type Mismatch
  • Error 9: Nested @ in Expressions
  • Error 10: Permission Errors on ADLS Gen2
  • Error 11: Pipeline Validation Passes but Runtime Fails
  • Error 12: Copy Activity Returns 0 Rows (But Table Has Data)
  • Error 13: Parquet Sink Generates Empty Files
  • Error 14: ForEach Stops on First Failure
  • Error 15: Watermark Not Updating After Copy
  • The Universal Debugging Strategy
  • Prevention Checklist
  • Wrapping Up

Error 1: BadRequest with Null Message

The error:

{
    "code": "BadRequest",
    "message": null,
    "target": "pipeline//runid/03227a01-dd3f-4945-8ff8-410dcc9b3769",
    "details": null,
    "error": null
}

Why it’s frustrating: The message is literally null. ADF gives you zero information about what went wrong.

The top 5 causes (in order of likelihood):

1. Stale dataset parameter default values

You had a default value in a dataset parameter that referenced a pipeline variable you later deleted. Even though you’re not using the default, ADF still tries to resolve it at runtime.

Fix: Open every dataset → Parameters tab → clear ALL default values.
     Leave them completely empty.

2. Trailing whitespace on activity or dataset names

An activity named "Copy_TableData " (with a trailing space) passes validation but fails at runtime. This is incredibly hard to spot visually.

Fix: Click on each activity name → press End key → check if the
     cursor is past the visible text. Retype the name if in doubt.

3. Incorrect expression references

An expression like @activity('LookupMetadata').output.value when the activity is actually named Lookup_Metadata (with underscore). Even a single character difference causes this error.

Fix: Copy-paste activity names into expressions.
     Never type them manually.

4. Key Vault linked service UI bug

Editing a Key Vault-based linked service through the ADF UI can inject stray characters or curly braces around parameter values.

Fix: Click the {} code button on the linked service and edit
     the JSON directly instead of using the UI.

5. Nested ForEach loops

ADF does not support ForEach inside another ForEach. The UI doesn’t always catch this during validation.

Fix: Use an Execute Pipeline activity to call a child pipeline
     for the inner loop.

Debugging strategy:

Use the binary chop method since the error message gives you nothing:

  1. Add a breakpoint after the Lookup activity → Debug → does it succeed?
  2. If yes, enable ForEach but replace the Copy with a Set Variable → does iteration work?
  3. If yes, add the Copy back → the problem is in the dataset parameter mapping
  4. Check each dataset’s JSON (click {}) for stale references

Error 2: “The template function ‘dataset’ is not defined or not valid”

The error:

{
    "errorCode": "InvalidTemplate",
    "message": "The template function 'dataset' is not defined or not valid",
    "failureType": "UserError",
    "target": "Copy_DeltaData"
}

Cause: You used @dataset() in the pipeline’s Copy activity settings instead of in the dataset’s Connection tab.

The rule:

| Where | Correct function | Example |
| --- | --- | --- |
| Dataset Connection tab | @dataset() | @dataset().FolderName |
| Pipeline activity settings | @item() | @item().FolderName |
| Pipeline referencing another activity | @activity() | @activity('Lookup').output.value |

What happened: You put @concat(dataset().FolderName, '/', formatDateTime(...)) in the Copy activity’s Sink tab (which is in the pipeline). The pipeline doesn’t know what dataset() means — only datasets do.

Fix: Move the @dataset() expression to the dataset’s Connection tab. In the pipeline’s Copy activity, use @item() or @concat() with item() to build the value and pass it to the dataset parameter.

Dataset Connection tab → Directory: @dataset().FolderName
Pipeline Copy Sink tab → FolderName parameter: @concat(item().FolderName, '/', ...)

The dataset defines the template. The pipeline fills in the values.
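To see why the error fires, here is a minimal Python sketch of scope-based resolution — a hypothetical illustration, not ADF's actual template engine; the scope sets below are simplified assumptions:

```python
import re

# Hypothetical sketch (not ADF's real engine): each execution scope only knows
# its own template functions, which is why 'dataset' is undefined in a pipeline.
DATASET_SCOPE = {"dataset", "concat", "formatDateTime"}
PIPELINE_SCOPE = {"item", "activity", "concat", "formatDateTime", "utcnow"}

def unresolved_functions(expr, scope):
    """Return the function names used in expr that the scope cannot resolve."""
    used = set(re.findall(r"([a-zA-Z_][a-zA-Z_0-9]*)\(", expr))
    return sorted(used - scope)

# '@dataset()' written inside a pipeline activity -> unresolved, like the error
print(unresolved_functions("concat(dataset().FolderName, '/')", PIPELINE_SCOPE))
# The same template function is fine inside a dataset
print(unresolved_functions("dataset().FolderName", DATASET_SCOPE))
```

The first call flags `dataset` as unknown — the Python analogue of `InvalidTemplate` — while the second resolves cleanly.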

Error 3: “First row only” Causing ForEach to Fail

The symptom: ForEach runs only once (or fails) even though your metadata table has 5 rows.

Cause: The Lookup activity has “First row only” checked. When checked, the output structure is:

{
    "firstRow": {
        "TableName": "Customer",
        "SchemaName": "SalesLT"
    }
}

When unchecked, the output is:

{
    "count": 5,
    "value": [
        {"TableName": "Customer", "SchemaName": "SalesLT"},
        {"TableName": "Address", "SchemaName": "SalesLT"},
        ...
    ]
}

The problem: ForEach expects an array (output.value). When “First row only” is checked, there is no .value array — only .firstRow, which is a single object.

Fix: Click on your Lookup activity → Settings tab → UNCHECK “First row only”. Then make sure your ForEach Items expression is:

@activity('Lookup_Metadata').output.value

Exception: For the Lookup_MaxValue activity inside an incremental load pipeline, you DO want “First row only” checked, because you only need one value (the MAX).
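The two output shapes above can be mirrored as plain Python dictionaries to see exactly why ForEach breaks — a sketch with sample data, not real Lookup output:

```python
# Sketch of the two Lookup output shapes shown above (sample data).
first_row_only = {"firstRow": {"TableName": "Customer", "SchemaName": "SalesLT"}}
full_output = {
    "count": 2,
    "value": [
        {"TableName": "Customer", "SchemaName": "SalesLT"},
        {"TableName": "Address", "SchemaName": "SalesLT"},
    ],
}

# ForEach effectively iterates output["value"], so .value must be a list
items = full_output.get("value")
assert isinstance(items, list) and len(items) == 2

# With "First row only" checked there is no .value array to iterate
assert first_row_only.get("value") is None
```

With "First row only" checked, `.value` simply doesn't exist — which is why `@activity('Lookup_Metadata').output.value` resolves to nothing.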

Error 4: Case-Sensitivity in @item() Expressions

The error: Runtime failure or empty values when copying data.

Cause: Your SQL column is named TableName but your expression uses @item().tablename (lowercase). ADF expressions are case-sensitive for property names.

Examples:

WRONG:  @item().tablename      → returns null/error
WRONG:  @item().TABLENAME      → returns null/error
RIGHT:  @item().TableName      → works (matches SQL column)

Fix: Check your SQL table column names (run SELECT * FROM metadata) and make sure your @item() expressions match the exact case.

Prevention: After creating your metadata table, copy-paste the column names directly into your ADF expressions. Never type them from memory.
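The behavior is the same as a case-sensitive dictionary lookup — a quick Python sketch with an illustrative metadata row:

```python
# @item() resolves property names case-sensitively, like a Python dict lookup.
item = {"TableName": "Customer", "SchemaName": "SalesLT"}  # sample metadata row

print(item.get("TableName"))   # exact case matches -> Customer
print(item.get("tablename"))   # wrong case -> None (like a null in ADF)
```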

Error 5: ADLS Container Does Not Exist

The error:

The specified container does not exist.

Cause: ADF can auto-create folders inside a container, but it cannot create the container itself. If your metadata has ContainerName = 'database' but no container named database exists in your storage account, the Copy activity fails.

Fix:

  1. Go to Azure Portal → your Storage Account → Containers
  2. Click + Container → name it to match your metadata (e.g., database)
  3. Re-run the pipeline

Prevention: Before running any pipeline for the first time, verify that all containers referenced in your metadata table exist in the ADLS Gen2 storage account.

Error 6: “Enter manually” vs “Edit” — Where’s the Checkbox?

The symptom: You’re trying to parameterize a dataset’s table name, but you can’t find the “Edit” checkbox to enable dynamic content.

Cause: ADF and Synapse have different UI labels for the same feature:

| Platform | What to look for | Location |
| --- | --- | --- |
| Azure Data Factory | “Edit” checkbox | Dataset → Connection tab → next to the Table dropdown |
| Azure Synapse Studio | “Enter manually” checkbox | Dataset → Connection tab → below the Table dropdown |

Fix: Look for “Enter manually” if you’re in Synapse. Once checked, two text boxes appear (Schema and Table) where you can add dynamic content like @dataset().SchemaName and @dataset().TableName.

Error 7: Single Quote Escaping in Dynamic SQL

The error: SQL syntax error when the Copy activity tries to execute a dynamic query.

Cause: SQL strings need single quotes around values, but ADF’s expression language also uses single quotes for strings. You need to escape them.

The escaping rule: In ADF expressions, a single quote is escaped by doubling it: ''

Example — building a WHERE clause:

WRONG (missing quotes around the value):
@concat('SELECT * FROM T1.EMPLOYEE WHERE EMPID > ', item().LastLoadedValue)
Generates: SELECT * FROM T1.EMPLOYEE WHERE EMPID > 103
This works for numbers but fails for dates.

RIGHT (with proper quoting):
@concat('SELECT * FROM T1.EMPLOYEE WHERE EMPID > ''', item().LastLoadedValue, '''')
Generates: SELECT * FROM T1.EMPLOYEE WHERE EMPID > '103'
Works for both numbers and dates.

Breaking down the quotes:

'''    = '' (an escaped literal single quote) followed by ' (which closes the string)
item().LastLoadedValue    = the value (e.g., '103')
''''   = ' (opening a new string) + '' (an escaped literal quote) + ' (closing it)

Tip: If the quoting is confusing you, build the SQL string in a text editor first, then figure out the escaping:

-- Desired SQL:
SELECT * FROM T1.EMPLOYEE WHERE EMPID > '103' AND EMPID <= '105'

-- In ADF concat:
@concat('SELECT * FROM T1.EMPLOYEE WHERE EMPID > ''', item().LastLoadedValue, ''' AND EMPID <= ''', activity('Lookup_MaxValue').output.firstRow.MaxValue, '''')
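If it helps, here is the same WHERE clause built in Python — a sketch that mirrors the `@concat()` above; the `quoted` helper and variable names are illustrative, not ADF functions:

```python
# Sketch mirroring the ADF @concat() above; quoted() is an illustrative helper.
def quoted(value):
    """Wrap a value in single quotes, doubling embedded quotes (SQL-style escaping)."""
    return "'" + str(value).replace("'", "''") + "'"

last_loaded = 103
max_value = 105
sql = ("SELECT * FROM T1.EMPLOYEE WHERE EMPID > " + quoted(last_loaded)
       + " AND EMPID <= " + quoted(max_value))
print(sql)
# SELECT * FROM T1.EMPLOYEE WHERE EMPID > '103' AND EMPID <= '105'
```

Seeing the string concatenation laid out this way makes it obvious where each quote belongs before you translate it back into doubled quotes for ADF.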

Error 8: Stored Procedure Parameter Type Mismatch

The error:

Error converting data type varchar to int.

Cause: Your stored procedure expects an INT parameter, but you’re passing a string value from a dynamic expression. Or vice versa.

Common scenario: The audit stored procedure expects @rows_read INT, but @activity('Copy_TableData').output.rowsRead returns a value that ADF treats as a string.

Fix: Make sure the parameter types in the Stored Procedure activity match the stored procedure definition:

| SP Parameter | SP Type | ADF Parameter Type | Value |
| --- | --- | --- | --- |
| @rows_read | INT | Int32 | @activity('Copy').output.rowsRead |
| @rows_copied | INT | Int32 | @activity('Copy').output.rowsCopied |
| @copy_duration | INT | Int32 | @activity('Copy').output.copyDuration |
| @table_name | VARCHAR | String | @item().TableName |
| @error_message | VARCHAR | String | NA |

Important: Use Int32 (not Int64 or String) for integer parameters. ADF’s Copy output properties return values that map to Int32.

Error 9: Nested @ in Expressions

The error:

Position 71 Unrecognized expression: @activity('Copy_TableData').output.errors[0].Message

Cause: You’re using @ inside another @ expression.

WRONG:
@concat('Copy failed for ', @item().SchemaName, '.', @item().TableName)

RIGHT:
@concat('Copy failed for ', item().SchemaName, '.', item().TableName)

The rule: The @ symbol is only used once at the very beginning of the expression. Everything inside is already in expression context. Nested @ symbols cause parsing errors.

Another common mistake:

WRONG:
@concat(@activity('Copy').output.rowsRead, ' rows copied')

RIGHT:
@concat(activity('Copy').output.rowsRead, ' rows copied')

Error 10: Permission Errors on ADLS Gen2

The error:

This request is not authorized to perform this operation using this permission.

Cause: The identity running the pipeline (ADF’s managed identity or Synapse’s managed identity) doesn’t have write permissions on the ADLS Gen2 storage account.

Fix:

  1. Go to your Storage Account → Access Control (IAM)
  2. Click + Add role assignment
  3. Role: Storage Blob Data Contributor
  4. Members → + Select members → search for your ADF or Synapse workspace name
  5. Select it → Review + assign

Common confusion: The Storage Account Contributor role is NOT enough. That role lets you manage the storage account settings, but NOT read/write blob data. You need Storage Blob Data Contributor specifically.

For Synapse: The managed identity has the same name as your Synapse workspace (e.g., naveen-synapse-ws).

Error 11: Pipeline Validation Passes but Runtime Fails

The symptom: You click Validate, see green checkmarks, click Debug, and get errors.

Why this happens: Validation checks structural correctness — are all required fields filled? Do activity names exist? Are expressions syntactically valid? It does NOT check:

  • Whether the SQL server is reachable
  • Whether table names are correct
  • Whether ADLS containers exist
  • Whether credentials are valid
  • Whether column names in @item() match the actual data
  • Whether dataset parameter defaults have stale references

This is by design. Validation would need to connect to every data source and execute queries to verify everything, which would be slow and potentially dangerous (imagine validating a DELETE pipeline).

Prevention: Always do a Debug run before scheduling. Treat validation as a syntax check, not a full test.

Error 12: Copy Activity Returns 0 Rows (But Table Has Data)

Possible causes:

1. Parameterized query is filtering everything out

Your WHERE clause might be too restrictive. For incremental loads, check if LastLoadedValue in the config table is ahead of the MAX value in the source.

-- Debug: check if there's actually data to copy
SELECT MAX(EMPID) FROM T1.EMPLOYEE;        -- e.g., returns 103
SELECT LastLoadedValue FROM configtable     -- if this is '103' or higher,
WHERE TableName = 'EMPLOYEE';              -- there's nothing to copy

2. Schema/table name mismatch

@item().SchemaName returns SalesLT but the actual schema is saleslt (different case). SQL Server identifiers are usually case-insensitive (it depends on the database collation), but the dataset parameter resolution might still cause issues.

3. “First row only” checked on the wrong Lookup

If the metadata Lookup returns only one row, ForEach iterates once, copying only one table.

Fix: Check the pipeline’s Monitor tab → click on the Copy activity → view input/output to see the actual query that was executed.
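The watermark check in cause 1 boils down to a single comparison — here as a hedged Python sketch (the function and parameter names are illustrative):

```python
# Sketch of the incremental-load check behind cause 1: if the stored watermark
# is already at or past the source MAX, the WHERE clause filters out every row.
def rows_to_copy(source_max, last_loaded):
    """True if 'WHERE id > last_loaded' can still return rows from the source."""
    return source_max is not None and source_max > last_loaded

print(rows_to_copy(source_max=103, last_loaded=100))  # True  -> data to copy
print(rows_to_copy(source_max=103, last_loaded=103))  # False -> 0 rows is expected
```

If this comparison is False, the Copy activity returning 0 rows is correct behavior, not a bug.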

Error 13: Parquet Sink Generates Empty Files

The symptom: Parquet files exist in ADLS but have 0 bytes or contain no data.

Possible causes:

1. Source query returned 0 rows — the Copy activity creates the file structure but writes no data. Check if the source query actually returns rows.

2. File name collision — if you run the pipeline multiple times without date partitioning, new files might not overwrite old ones cleanly. Use date-partitioned folders:

@concat(item().FolderName, '/', formatDateTime(utcnow(), 'yyyy'),
        '/', formatDateTime(utcnow(), 'MM'),
        '/', formatDateTime(utcnow(), 'dd'))

3. Compression issue — rarely, Snappy compression can produce very small files that appear empty. Download and open the file with a Parquet viewer to confirm.
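The date-partitioned folder expression in cause 2 is easy to verify outside ADF — this Python sketch mirrors the `@concat`/`formatDateTime` logic using UTC, just like `utcnow()` (the function name is illustrative):

```python
# Sketch mirroring the yyyy/MM/dd folder expression above, in UTC like utcnow().
from datetime import datetime, timezone

def partition_path(folder, now):
    """Build folder/yyyy/MM/dd, matching the @concat/formatDateTime expression."""
    return f"{folder}/{now:%Y}/{now:%m}/{now:%d}"

print(partition_path("employee", datetime(2024, 3, 7, tzinfo=timezone.utc)))
# employee/2024/03/07
```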

Error 14: ForEach Stops on First Failure

The symptom: Table 2 out of 5 fails, and tables 3-5 are skipped.

Cause: When the ForEach runs sequentially, it stops the remaining iterations as soon as one fails.

Fix: Go to ForEach activity → Settings tab → uncheck Sequential (in the JSON, "isSequential": false) so iterations run in parallel, and add error handling inside the loop so one bad table doesn’t mark the whole run as failed.

With this enabled, if the Customer table copy fails, Address, Product, and other tables still get copied. Combine this with audit logging to track which tables succeeded and which failed.

Error 15: Watermark Not Updating After Copy

The symptom: The pipeline succeeds, data is copied, but LastLoadedValue in the config table hasn’t changed.

Possible causes:

1. Update_Watermark is connected with the wrong dependency arrow

A blue “Completed” arrow runs the stored procedure whether the Copy succeeded or failed — that actually causes the opposite bug (the watermark advances even after a failed copy). For the watermark to update only when the copy works, the arrow must be the green “Succeeded” dependency.

2. The dependency arrow is missing entirely

If there’s no arrow from Copy to Update_Watermark, the stored procedure runs before the copy finishes (or doesn’t run at all).

Fix: Make sure there’s a green arrow (Succeeded dependency) from Copy_DeltaData → Update_Watermark.

3. Wrong parameter value in the stored procedure

Verify the parameters:

@tablename      → @item().TableName (must match case)
@LASTLOADVALUE  → @activity('Lookup_MaxValue').output.firstRow.MaxValue

If MaxValue is null (empty table), the watermark gets set to null, which looks like it didn’t update.

The Universal Debugging Strategy

When any pipeline fails and you’re stuck, follow this workflow:

Step 1: Check the Monitor tab

Go to Monitor → Pipeline runs → click on the failed run. Look at each activity’s status and click the error icon (or glasses icon) on the failed activity.

Step 2: Check the Input/Output

On the failed activity, click Input to see what values were passed. This shows you the actual resolved expressions — if a parameter is empty or wrong, you’ll see it here.

Step 3: Check the Error tab

Click Error on the failed activity. The error message here is usually more detailed than what the pipeline-level error shows.

Step 4: Test the query manually

Copy the resolved SQL query from the Input tab and run it directly in the SQL Query Editor. Does it return results? Does it have a syntax error?

Step 5: Simplify

Replace dynamic expressions with hardcoded values. If @item().TableName might be wrong, temporarily hardcode "Customer" and run again. If it works, the problem is in the expression resolution.

Prevention Checklist

Before running any new or modified pipeline, check these:

  • All dataset parameter default values are empty (not null, not placeholder text — empty)
  • Activity names in expressions match exactly (copy-paste, don’t type)
  • Lookup activities have “First row only” unchecked (for metadata lookups) or checked (for single-value lookups)
  • @item() property names match the SQL column names exactly (case-sensitive)
  • ADLS containers referenced in metadata exist in the storage account
  • The pipeline identity has Storage Blob Data Contributor role on the ADLS account
  • All changes are Published before relying on trigger runs (Debug uses the current draft; triggers use the last published version)
  • @dataset() is used only inside datasets; @item() is used only inside pipelines

Wrapping Up

Azure Data Factory and Synapse pipelines are powerful, but their error messages leave a lot to be desired. The good news is that 90% of pipeline errors fall into the same handful of categories — and once you’ve seen them once, you can fix them in minutes.

Keep this guide bookmarked. The next time you see a cryptic error, search this page before spending an hour on Stack Overflow.

Related posts:

  • Building a Metadata-Driven Pipeline in Azure Data Factory
  • Synapse Pipeline with Audit Logging
  • Incremental Data Loading with Delta Copy
  • Building a REST API with FastAPI on AWS Lambda

If this guide saved you from a debugging rabbit hole, share it with your team. Have an error that’s not listed here? Drop a comment and I’ll add it.


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.
