Understanding Parameterized Datasets in Azure Data Factory: A Deep Dive
Parameterized datasets are what separate a beginner ADF developer from a production-ready one. Without them, you create one dataset per table — 50 tables means 50 datasets. With them, you create one dataset that handles all 50 tables dynamically.
Why Parameterize? The Multiplication Problem
Without parameterization:

- 100 datasets (50 source + 50 sink), one per table
- Adding a new table = create 2 datasets + update the pipeline

With parameterization:

- 2 datasets total: one parameterized source, one parameterized sink
- Adding a new table = insert 1 row in a metadata table (sketched below)
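A metadata row simply carries every value the parameterized datasets will need. When a Lookup activity reads the table, each row comes back as JSON along these lines (the Product row is an illustrative second entry; the column names match the expressions used later in this post):

```json
[
  {"SchemaName": "SalesLT", "TableName": "Customer", "ContainerName": "database", "FolderName": "sqldb/Customer"},
  {"SchemaName": "SalesLT", "TableName": "Product", "ContainerName": "database", "FolderName": "sqldb/Product"}
]
```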
The Three Expression Scopes
1. Dataset Scope: @dataset()
Used inside a dataset’s Connection tab. References the dataset’s own parameters.
Where: Dataset Connection tab
Expression: @dataset().SchemaName
2. Pipeline/ForEach Scope: @item()
Used inside a pipeline activity within a ForEach loop.
Where: Pipeline Copy Activity Source/Sink tab
Expression: @item().SchemaName
3. Activity Scope: @activity()
Used inside a pipeline activity to reference another activity’s output.
Where: Pipeline — any activity referencing another
Expression: @activity('Lookup_Metadata').output.value
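To see the second and third scopes working together, here is a minimal sketch of a ForEach activity fed by a Lookup, trimmed to the relevant properties. `Lookup_Metadata` is the activity from the expression above; the ForEach name is illustrative:

```json
{
  "name": "ForEach_Tables",
  "type": "ForEach",
  "dependsOn": [
    { "activity": "Lookup_Metadata", "dependencyConditions": ["Succeeded"] }
  ],
  "typeProperties": {
    "items": {
      "value": "@activity('Lookup_Metadata').output.value",
      "type": "Expression"
    },
    "activities": []
  }
}
```

Every activity placed inside the `activities` array can then use @item() to reference the current metadata row.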
The Critical Rule
@dataset() is ONLY valid inside datasets. @item() and @activity() are ONLY valid inside pipelines. Cross them and you get "function not defined" errors.
Think of it as a function call:
```python
# (query and metadata_rows are stand-ins for the SQL connector and the Lookup output)

# Dataset = function definition
def read_from_sql(SchemaName, TableName):  # @dataset() parameters
    return query(f"SELECT * FROM {SchemaName}.{TableName}")

# Pipeline = function call
for row in metadata_rows:  # ForEach iteration
    data = read_from_sql(
        SchemaName=row.SchemaName,  # @item().SchemaName
        TableName=row.TableName,    # @item().TableName
    )
```
Creating a Parameterized Source Dataset
Step 1: Create the Dataset
- Author tab > + > Dataset > Azure SQL Database
- Name: DS_SqlDB_SourceTable
- Linked service: LS_AzureSqlDB
- Leave the table name blank
Step 2: Add Parameters
- Parameters tab > add:
  - SchemaName (String, empty default)
  - TableName (String, empty default)
CRITICAL: Leave default values empty. This prevents BadRequest null errors.
Step 3: Configure Connection
- Connection tab > check “Edit” (ADF) or “Enter manually” (Synapse)
- Schema > dynamic content > @dataset().SchemaName
- Table > dynamic content > @dataset().TableName
JSON View
```json
{
  "name": "DS_SqlDB_SourceTable",
  "properties": {
    "parameters": {
      "SchemaName": {"type": "string"},
      "TableName": {"type": "string"}
    },
    "type": "AzureSqlTable",
    "typeProperties": {
      "schema": {
        "value": "@dataset().SchemaName",
        "type": "Expression"
      },
      "table": {
        "value": "@dataset().TableName",
        "type": "Expression"
      }
    }
  }
}
```
Creating a Parameterized Sink Dataset
- + > Dataset > ADLS Gen2 > Parquet
- Name: DS_ADLS_Sink
- Parameters: ContainerName (String, empty default), FolderName (String, empty default)
- Connection tab:
  - Container > @dataset().ContainerName
  - Directory > @dataset().FolderName
  - File name > leave blank
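For symmetry with the source dataset, here is a sketch of the sink dataset JSON. The structure below assumes the standard ADLS Gen2 Parquet dataset; exact location property names can vary by connector version:

```json
{
  "name": "DS_ADLS_Sink",
  "properties": {
    "parameters": {
      "ContainerName": {"type": "string"},
      "FolderName": {"type": "string"}
    },
    "type": "Parquet",
    "typeProperties": {
      "location": {
        "type": "AzureBlobFSLocation",
        "fileSystem": {
          "value": "@dataset().ContainerName",
          "type": "Expression"
        },
        "folderPath": {
          "value": "@dataset().FolderName",
          "type": "Expression"
        }
      }
    }
  }
}
```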
How Values Flow at Runtime
Step 1: Lookup reads metadata table
Output: [{SchemaName: "SalesLT", TableName: "Customer", ...}]
Step 2: ForEach iterates — @item() = current row
Step 3: Copy Source passes to dataset:
DS_SqlDB_SourceTable.SchemaName <- @item().SchemaName = "SalesLT"
DS_SqlDB_SourceTable.TableName <- @item().TableName = "Customer"
Step 4: Dataset resolves:
@dataset().SchemaName -> "SalesLT"
@dataset().TableName -> "Customer"
Result: SELECT * FROM SalesLT.Customer
Step 5: Copy Sink passes:
DS_ADLS_Sink.ContainerName <- @item().ContainerName = "database"
DS_ADLS_Sink.FolderName <- @item().FolderName = "sqldb/Customer"
Step 6: Write to database/sqldb/Customer/part-00000.parquet
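Steps 3 and 5 are simply the Copy activity filling in the dataset parameters. A trimmed sketch of that Copy activity's JSON inside the ForEach (the activity name is illustrative):

```json
{
  "name": "Copy_Table",
  "type": "Copy",
  "inputs": [{
    "referenceName": "DS_SqlDB_SourceTable",
    "type": "DatasetReference",
    "parameters": {
      "SchemaName": "@item().SchemaName",
      "TableName": "@item().TableName"
    }
  }],
  "outputs": [{
    "referenceName": "DS_ADLS_Sink",
    "type": "DatasetReference",
    "parameters": {
      "ContainerName": "@item().ContainerName",
      "FolderName": "@item().FolderName"
    }
  }],
  "typeProperties": {
    "source": {"type": "AzureSqlSource"},
    "sink": {"type": "ParquetSink"}
  }
}
```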
The Golden Rules
- Never put expressions in dataset parameter defaults. Leave empty. Always.
- Never use @dataset() in the pipeline. Use @item() or @pipeline().parameters.
- Never use @item() in datasets. Use @dataset() in Connection tabs.
- Parameter names are case-sensitive: TableName != tablename.
- No trailing whitespace in parameter names.
- Always publish before debugging.
Pipeline Parameters vs Dataset Parameters
| Type | Defined On | Passed By | Expression |
|---|---|---|---|
| Dataset parameter | Dataset Parameters tab | Copy activity Source/Sink | @dataset().ParamName |
| Pipeline parameter | Pipeline Parameters tab | Trigger or manual run | @pipeline().parameters.ParamName |
You can chain them:
Trigger > Pipeline parameter > Copy activity > Dataset parameter
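A sketch of that chain at the Copy sink, assuming the pipeline defines a ContainerName parameter:

```json
"outputs": [{
  "referenceName": "DS_ADLS_Sink",
  "type": "DatasetReference",
  "parameters": {
    "ContainerName": "@pipeline().parameters.ContainerName",
    "FolderName": "@item().FolderName"
  }
}]
```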
ADF vs Synapse UI Differences
| Action | ADF Studio | Synapse Studio |
|---|---|---|
| Enable dynamic table | Check “Edit” | Check “Enter manually” |
| Dataset location | Author > Datasets | Data > Integration datasets |
| Dynamic content | Click field > “Add dynamic content” | Same, or click fx icon |
Advanced: Dynamic File Names and Date Partitions
Dynamic File Names
```
@concat(item().TableName, '_', formatDateTime(utcnow(), 'yyyyMMdd'), '.parquet')
```
Result: Customer_20260405.parquet
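To use this, give the sink dataset a third parameter (say FileName, String, empty default; the name is hypothetical), wire it to the dataset's File name field as @dataset().FileName, and pass the expression from the Copy sink. A sketch of the sink parameter values:

```json
"parameters": {
  "ContainerName": "@item().ContainerName",
  "FolderName": "@item().FolderName",
  "FileName": "@concat(item().TableName, '_', formatDateTime(utcnow(), 'yyyyMMdd'), '.parquet')"
}
```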
Date-Partitioned Folders
```
@concat(item().FolderName, '/',
    formatDateTime(utcnow(), 'yyyy'), '/',
    formatDateTime(utcnow(), 'MM'), '/',
    formatDateTime(utcnow(), 'dd'))
```
Result: sqldb/Customer/2026/04/05/
Important: Put this in the pipeline Copy Sink tab, NOT in the dataset. Dataset Directory stays as @dataset().FolderName.
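Concretely, the expression lives in the Copy activity's sink parameter value while the dataset stays generic (a sketch):

```json
"parameters": {
  "ContainerName": "@item().ContainerName",
  "FolderName": "@concat(item().FolderName, '/', formatDateTime(utcnow(), 'yyyy'), '/', formatDateTime(utcnow(), 'MM'), '/', formatDateTime(utcnow(), 'dd'))"
}
```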
Common Errors
| Error | Cause | Fix |
|---|---|---|
| BadRequest null | Stale defaults in dataset parameters | Clear all defaults |
| dataset function not defined | Used @dataset() in pipeline | Change to @item() |
| item function not defined | Used @item() in dataset | Change to @dataset() |
| Wrong table copied | Case mismatch | Match exact column name case |
| Copy succeeds, 0 rows | Empty parameter value | Verify Lookup returns data |
Interview Questions
Q: What is a parameterized dataset?
A: A dataset where properties like table name or file path come from parameters instead of being hardcoded. One dataset handles multiple tables dynamically.

Q: What's the difference between @dataset() and @item()?
A: @dataset() references dataset parameters inside the dataset. @item() references the current ForEach element inside the pipeline. The pipeline passes values to the dataset.

Q: How many datasets for a 50-table metadata pipeline?
A: Just three: one for the metadata lookup, one parameterized source, one parameterized sink.
Wrapping Up
Parameterized datasets turn a 100-dataset nightmare into a 2-dataset solution. Master @dataset() vs @item(), follow the golden rules, and metadata-driven pipelines become second nature.
Related posts:

- Metadata-Driven Pipeline in ADF
- Common ADF/Synapse Errors
- ADF vs Synapse Comparison
- Top 15 ADF Interview Questions
Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.