Azure RBAC Roles Demystified: Every Role, Every Identity, and When to Assign What to Whom

You just created an Azure SQL Database, a Synapse workspace, and an ADLS Gen2 storage account. Everything is set up. You run the pipeline and get: “403 Forbidden.” You check permissions, see 200+ roles in the dropdown, and freeze.

Storage Blob Data Reader? Storage Blob Data Contributor? Storage Account Contributor? Reader? Contributor? Owner? What is the difference? Which one does your pipeline need? Which one does your colleague need? Which one would be dangerous to assign?

This confusion is universal. Even experienced engineers Google “which Azure role for…” multiple times a week. The problem is not that the roles are complicated — it is that there are TOO MANY and they sound similar.

This post organizes every role you will encounter as a data engineer, explains what each one actually allows, and gives you a simple decision framework so you never have to guess again.

Think of Azure RBAC like a hotel key card system. The hotel has hundreds of doors — rooms, gym, pool, restaurant, parking, staff-only areas, maintenance closets. Each key card (role) opens specific doors. The front desk manager (Owner) can open every door. A guest (Reader) can only open their room. A housekeeper (Contributor) can open rooms and supply closets but not the vault. The challenge is knowing which card to give to which person.

Table of Contents

  • What Is RBAC (Role-Based Access Control)?
  • The Three Pillars: WHO + WHAT + WHERE
  • The Three Fundamental Roles (Owner, Contributor, Reader)
  • Why Built-In Roles Are Not Enough
  • Identity Types: Who Can Receive Roles
  • Storage Roles (The Most Confusing Ones)
  • SQL and Database Roles
  • Synapse Analytics Roles
  • Databricks-Related Roles
  • Data Factory Roles
  • Key Vault Roles
  • Networking Roles
  • Compute Roles (VMs, AKS)
  • Monitoring and Logging Roles
  • Microsoft Fabric Roles
  • The Decision Framework
  • Real-World Scenarios: Which Role for Which Task
      • Scenario 1: Data Engineer Building Pipelines
      • Scenario 2: Data Analyst Reading Reports
      • Scenario 3: Synapse Managed Identity
      • Scenario 4: Databricks with Unity Catalog
      • Scenario 5: DevOps Engineer Deploying Pipelines
  • The Principle of Least Privilege
  • Custom Roles
  • How to Assign Roles (Step by Step)
  • How to Check What Roles Someone Has
  • Common Permission Errors and Fixes
  • Interview Questions
  • Wrapping Up

What Is RBAC (Role-Based Access Control)?

RBAC is Azure’s authorization system. It answers the question: “Can this person/service do this action on this resource?”

Every RBAC assignment has three parts:

WHO (Security Principal)  +  WHAT (Role)  +  WHERE (Scope)
     |                          |                 |
     v                          v                 v
  "Naveen"              "Storage Blob        "Storage account
   (user)               Data Contributor"     naveensynapsedl"

Translation: “Naveen can read, write, and delete blobs in the storage account naveensynapsedl.”

Real-life analogy: RBAC is like a hospital access system. WHO = Dr. Smith. WHAT = her key card allows access to the surgical ward. WHERE = Toronto General Hospital, 3rd floor. She cannot access the pharmacy (different role needed) or Vancouver General (different scope).
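The three parts can be modelled in a few lines of Python. This is only an illustrative sketch, not an Azure SDK call: the `RoleAssignment` class and the `storage_account_scope` helper are hypothetical names, and the subscription ID and resource group are placeholders. The one real detail is the shape of the scope string, which is the resource ID of the storage account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleAssignment:
    """One RBAC assignment: WHO + WHAT + WHERE."""
    principal: str  # WHO: user, group, service principal, or managed identity
    role: str       # WHAT: role definition name
    scope: str      # WHERE: Azure resource ID the role applies to

    def describe(self) -> str:
        return f"{self.principal} has '{self.role}' on {self.scope}"


def storage_account_scope(subscription_id: str, resource_group: str, account: str) -> str:
    """Build the resource ID that serves as the scope for a storage account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )


# Placeholder subscription ID and resource group; substitute your own.
scope = storage_account_scope(
    "00000000-0000-0000-0000-000000000000", "rg-dataplatform-dev", "naveensynapsedl"
)
assignment = RoleAssignment("Naveen", "Storage Blob Data Contributor", scope)
print(assignment.describe())
```

Change any one of the three fields and you have a different assignment: same person, same role, different storage account means a separate grant.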

The Three Pillars: WHO + WHAT + WHERE

WHO (Security Principal — The Identity)

| Type | What It Is | Example |
|---|---|---|
| User | A person with an Azure AD account | naveen@company.com |
| Group | A collection of users | “Data Engineering Team” group |
| Service Principal | An application identity (like a robot account) | databricks-storage-sp |
| Managed Identity | Azure-managed identity for a service (no password) | naveen-synapse-ws (Synapse workspace) |

WHAT (Role Definition — The Permissions)

The role defines what actions are allowed: read, write, delete, manage, etc.

WHERE (Scope — The Boundary)

| Scope Level | What It Covers | Example |
|---|---|---|
| Management Group | Multiple subscriptions | Company-wide |
| Subscription | All resources in a subscription | Pay-As-You-Go subscription |
| Resource Group | All resources in a group | rg-dataplatform-dev |
| Resource | Single resource | Storage account naveensynapsedl |

Roles assigned at a higher scope are inherited by everything below it. A Contributor on the subscription is Contributor on EVERY resource group and resource within it.

Inheritance example:

Management Group: Contoso Corp
  └── Subscription: Pay-As-You-Go
        │   Naveen = Contributor here
        │   (inherits to EVERY resource group and resource below)
        │
        ├── Resource Group: rg-dataplatform-dev
        │     │   Naveen is Contributor (inherited from subscription)
        │     │
        │     ├── Storage Account: devdatalake
        │     │     Naveen is Contributor (inherited) — can manage account
        │     │     BUT still cannot read blobs! (need data plane role)
        │     │
        │     ├── Azure SQL: dev-sql-server
        │     │     Naveen is Contributor (inherited) — can manage server
        │     │     BUT still cannot query tables! (need db_datareader inside SQL)
        │     │
        │     └── Key Vault: dev-keyvault
        │           Naveen is Contributor (inherited) — can manage vault
        │           BUT still cannot read secrets! (need Key Vault Secrets User)
        │
        └── Resource Group: rg-dataplatform-prod
              Naveen is Contributor (inherited) — ⚠️ can modify prod too!
              This is why subscription-level Contributor is DANGEROUS

Better approach:
  Assign Contributor on rg-dataplatform-dev ONLY (not subscription)
  Then Naveen CANNOT touch anything in rg-dataplatform-prod

Key rules of inheritance:

  • Roles assigned at a higher scope automatically apply to all lower scopes
  • You CANNOT block or override an inherited role at a lower scope — if someone is Owner on the subscription, they are Owner on every resource in it
  • You CAN add additional roles at lower scopes — Contributor on resource group PLUS Storage Blob Data Reader on a specific storage account
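  • Deny assignments are the only way to restrict inherited permissions, and you cannot create them directly: Azure creates them on your behalf (for example, through Azure Blueprints or managed applications)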
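A rough way to reason about inheritance is prefix matching on resource IDs: an assignment applies to a resource when its scope is a leading path of that resource's ID. The sketch below is a simplified model under that assumption, not how Azure evaluates access internally. The function names are hypothetical, the subscription ID is a placeholder, and management group scopes (which use a different ID format) are not modelled.

```python
def scope_covers(assignment_scope: str, resource_id: str) -> bool:
    """True if a role assigned at assignment_scope is inherited by resource_id."""
    parent = assignment_scope.rstrip("/").lower()
    child = resource_id.rstrip("/").lower()
    # Match on whole path segments so "rg-dev" does not cover "rg-dev2".
    return child == parent or child.startswith(parent + "/")


def effective_roles(assignments, principal, resource_id):
    """Union of roles a principal holds on a resource, direct or inherited."""
    return {
        role
        for who, role, scope in assignments
        if who == principal and scope_covers(scope, resource_id)
    }


SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"  # placeholder ID
DEV_RG = f"{SUB}/resourceGroups/rg-dataplatform-dev"
PROD_RG = f"{SUB}/resourceGroups/rg-dataplatform-prod"
DEV_LAKE = f"{DEV_RG}/providers/Microsoft.Storage/storageAccounts/devdatalake"

assignments = [
    ("Naveen", "Contributor", DEV_RG),                 # assigned on the resource group
    ("Naveen", "Storage Blob Data Reader", DEV_LAKE),  # added at a lower scope
]

print(effective_roles(assignments, "Naveen", DEV_LAKE))  # both roles apply
print(effective_roles(assignments, "Naveen", PROD_RG))   # nothing reaches prod
```

Because the result is a union, the model also shows why roles can only be added at lower scopes, never subtracted.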

The Three Fundamental Roles (Owner, Contributor, Reader)

These exist on EVERY Azure resource:

| Role | What It Can Do | Cannot Do | When to Use |
|---|---|---|---|
| Owner | Everything: read, write, delete, AND assign roles to others | Nothing restricted | Subscription admins, resource group owners only |
| Contributor | Read, write, create, delete resources | Cannot assign roles to others | DevOps, senior engineers who deploy infrastructure |
| Reader | View resources and their properties | Cannot modify anything | Auditors, analysts, junior team members |

The critical difference: Owner can change WHO has access. Contributor cannot. This is why you NEVER give Owner to someone who just needs to deploy resources — Contributor is sufficient and safer.

Real-life analogy: Owner = building landlord (can change locks, give keys to anyone). Contributor = building manager (can maintain and modify, but cannot give keys). Reader = visitor (can look around but cannot touch anything).

Why Built-In Roles Are Not Enough

The three fundamental roles (Owner, Contributor, Reader) operate at the management plane — they control whether you can create, modify, or delete the RESOURCE ITSELF.

But most data engineering work happens at the data plane — reading and writing DATA inside the resource. This is where the specialized roles come in.

Management Plane (Owner/Contributor/Reader):
  "Can Naveen create or delete the storage account?"

Data Plane (Storage Blob Data Contributor):
  "Can Naveen read and write blobs INSIDE the storage account?"

A common mistake: Assigning Contributor on a storage account and expecting to read blobs. Contributor lets you manage the account (change settings, view keys, delete the account), but it does NOT authorize reading or writing blobs through Azure AD. For identity-based access (a signed-in user, a service principal, a managed identity), you need Storage Blob Data Contributor.

MANAGEMENT PLANE                              DATA PLANE
"Control the resource itself"                 "Access the data inside"
─────────────────────────────                 ─────────────────────────────
Storage Account Contributor                   Storage Blob Data Contributor
  → Create/delete storage account               → Read/write/delete blobs
  → Configure firewall rules                    → List containers
  → Regenerate access keys                      → Upload Parquet files
  → Change replication settings                 → Download CSV files
  → ❌ CANNOT read a single blob                → ❌ CANNOT change account settings

SQL Server Contributor                        db_datareader / db_datawriter
  → Create/delete SQL server                    → SELECT from tables
  → Configure firewall rules                    → INSERT/UPDATE/DELETE rows
  → Manage auditing settings                    → (EXECUTE on procedures needs a separate GRANT)
  → ❌ CANNOT query a single table              → ❌ CANNOT change server settings

Key Vault Contributor                         Key Vault Secrets User
  → Create/delete Key Vault                     → Read secret values
  → Configure access policies                   → List secrets
  → ❌ CANNOT read a single secret              → ❌ CANNOT delete the vault

The pattern: every Azure service has BOTH planes.
Management plane roles never give data access. Data plane roles never give management access.

Real-life analogy: Having the key to the office building (Contributor) does not mean you have the key to the filing cabinet inside (Data Contributor). They are separate access levels.
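Under the hood, a role definition keeps management operations and data operations in separate lists (Actions and DataActions), which is why a wildcard on one side grants nothing on the other. The sketch below models that split with abbreviated, illustrative copies of two built-in roles. The `ROLES` dictionary and the `allows` function are hypothetical, and the real definitions contain more entries than shown.

```python
from fnmatch import fnmatchcase

# Abbreviated, illustrative copies of two built-in role definitions.
ROLES = {
    "Contributor": {
        "actions": ["*"],
        "not_actions": ["Microsoft.Authorization/*/Write", "Microsoft.Authorization/*/Delete"],
        "data_actions": [],
    },
    "Storage Blob Data Contributor": {
        "actions": ["Microsoft.Storage/storageAccounts/blobServices/containers/read"],
        "not_actions": [],
        "data_actions": [
            "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
            "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write",
            "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete",
        ],
    },
}


def _matches(patterns, operation):
    """Case-insensitive wildcard match of an operation against a pattern list."""
    op = operation.lower()
    return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)


def allows(role_name, operation, data_plane):
    """Check an operation against the list that belongs to its plane."""
    role = ROLES[role_name]
    if data_plane:
        return _matches(role["data_actions"], operation)
    return _matches(role["actions"], operation) and not _matches(role["not_actions"], operation)


BLOB_READ = "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read"
ACCOUNT_DELETE = "Microsoft.Storage/storageAccounts/delete"

print(allows("Contributor", ACCOUNT_DELETE, data_plane=False))               # management: allowed
print(allows("Contributor", BLOB_READ, data_plane=True))                     # data: not allowed
print(allows("Storage Blob Data Contributor", BLOB_READ, data_plane=True))   # data: allowed
```

The `"*"` in Contributor's actions never reaches the data plane check, which is the whole trap in one line.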

Identity Types: Who Can Receive Roles

Users (People)

naveen@company.com → Storage Blob Data Contributor on storage account

Used for: direct user access during development, Azure Portal browsing.

Groups (Collections of People)

"Data Engineering Team" → Storage Blob Data Contributor on storage account

Used for: managing permissions at scale. Add 10 engineers to the group instead of assigning 10 individual role assignments.

Service Principals (Application Identities)

databricks-storage-sp → Storage Blob Data Contributor on storage account

Used for: applications that need to authenticate (Databricks Service Principal, CI/CD pipelines, external tools). Created in Azure AD App Registrations. Has a client ID + client secret.

Managed Identities (Azure-Managed)

naveen-synapse-ws → Storage Blob Data Contributor on storage account

Used for: Azure services authenticating to other Azure services. NO passwords to manage. Azure handles the credentials automatically.

Two types:

  • System-assigned: Tied to a specific resource. Deleted when the resource is deleted. (Synapse workspace → its managed identity)
  • User-assigned: Independent resource. Can be shared across multiple services. (One identity used by 5 Azure Functions)

Rule: Always prefer Managed Identity over Service Principal. Managed Identity = no secrets to rotate or leak.

How Managed Identity Works (Under the Hood)

Without Managed Identity (Service Principal):
  1. You create an App Registration in Azure AD
  2. You generate a client secret (password)
  3. You store the secret somewhere safe (Key Vault, env variable)
  4. Your code reads the secret at runtime
  5. Your code exchanges the secret for an OAuth token
  6. The token is used to access ADLS/SQL/Key Vault
  7. Every 1-2 years: secret expires → you must rotate it → update everywhere
  ⚠️ If the secret leaks, anyone can impersonate your service

With Managed Identity:
  1. Azure automatically creates an identity for your service (ADF, Synapse, Databricks)
  2. Azure manages the credentials internally (you never see them)
  3. Your service requests a token from Azure AD automatically
  4. The token is used to access ADLS/SQL/Key Vault
  5. No rotation needed — Azure handles it
  ✅ No secrets to leak, no expiration to track, no Key Vault needed for this credential

Common managed identities you will encounter:
  naveen-synapse-ws           ← Synapse workspace
  naveen-adf                  ← Azure Data Factory
  databricks-access-connector ← Databricks Access Connector for Unity Catalog
  naveen-function-app         ← Azure Function App

Real-life analogy: A Service Principal is like giving an employee a physical key to the office. You must track who has the key, replace it if lost, and change the lock periodically. A Managed Identity is like a biometric scanner — the employee’s fingerprint IS the key. No physical key to lose, no lock to change, no expiration date.
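To make "no secrets" concrete, here is roughly what the token request looks like from code running on an Azure VM with a managed identity: a plain HTTP GET to the local instance metadata endpoint. This is a minimal sketch that only builds the request and does not send it. The function name is hypothetical, and other hosts (App Service, Functions) expose a different local endpoint, while services like Synapse and ADF do this step for you.

```python
import urllib.parse
import urllib.request

# Instance Metadata Service token endpoint, reachable only from inside an Azure VM.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource: str) -> urllib.request.Request:
    """Build (but do not send) a managed identity token request for a VM."""
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    # The Metadata header is required; note that no secret appears anywhere.
    return urllib.request.Request(f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"})


request = build_token_request("https://storage.azure.com/")
print(request.full_url)
# On an Azure VM with a managed identity, sending this request returns JSON
# that contains an "access_token" field:
#   with urllib.request.urlopen(request) as response: ...
```

Compare this with the service principal flow: there is no client ID or client secret in the request, so there is nothing to store, rotate, or leak.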

Storage Roles (The Most Confusing Ones)

These are the roles that trip up EVERYONE. The names sound similar but the permissions are very different:

Management Plane Roles (Manage the Storage Account)

| Role | What It Does | Does It Read/Write Data? |
|---|---|---|
| Storage Account Contributor | Create, delete, manage storage accounts. Configure settings, regenerate keys, manage network rules. | NO: cannot read or write blobs through Azure AD (it can, however, retrieve the account access keys) |
| Reader and Data Access | View storage account properties AND read access keys | Indirectly: can use access keys to read data |
| Reader | View storage account in Azure Portal | NO |

Data Plane Roles (Read/Write Data INSIDE the Account)

| Role | What It Does | Read | Write | Delete | Manage |
|---|---|---|---|---|---|
| Storage Blob Data Reader | Read blobs and list containers | ✅ | ❌ | ❌ | ❌ |
| Storage Blob Data Contributor | Read, write, delete blobs | ✅ | ✅ | ✅ | ❌ |
| Storage Blob Data Owner | Full access + set POSIX ACLs | ✅ | ✅ | ✅ | ✅ |
| Storage Blob Delegator | Generate user delegation SAS tokens | ❌ | ❌ | ❌ | SAS only |

The Decision

Need to READ data from ADLS/Blob?
  → Storage Blob Data Reader

Need to READ + WRITE data (pipeline sinks, Databricks writes)?
  → Storage Blob Data Contributor

Need to MANAGE ACLs (set directory-level permissions)?
  → Storage Blob Data Owner

Need to MANAGE the storage account itself (create, delete, configure)?
  → Storage Account Contributor (but this does NOT give data access!)
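The same decision list can be written as a small function, which is handy if you generate role assignments from a config file. This is only a sketch of the list above: `choose_storage_role` is a hypothetical helper, and it picks the lowest data role that covers the request.

```python
def choose_storage_role(read=False, write=False, manage_acls=False, manage_account=False):
    """Return the set of roles matching the decision list above."""
    roles = set()
    if manage_acls:
        roles.add("Storage Blob Data Owner")
    elif write:
        roles.add("Storage Blob Data Contributor")
    elif read:
        roles.add("Storage Blob Data Reader")
    if manage_account:
        # Management plane only: adds no data access by itself.
        roles.add("Storage Account Contributor")
    return roles


print(choose_storage_role(read=True))                        # analyst
print(choose_storage_role(read=True, write=True))            # pipeline sink
print(choose_storage_role(manage_account=True, write=True))  # needs two roles
```

Note the last call: managing the account and writing data are independent needs, so the answer is two roles, not one bigger role.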

The Trap Everyone Falls Into

Scenario: Synapse pipeline writes Parquet to ADLS Gen2
Mistake:  Assign "Contributor" to the Synapse managed identity on the storage account
Result:   403 Forbidden on write

Why:      "Contributor" manages the ACCOUNT, not the DATA
Fix:      Assign "Storage Blob Data Contributor" instead

Real-life analogy: “Storage Account Contributor” is like having the keys to the warehouse building — you can lock/unlock doors, turn on lights, set the alarm. But you do NOT have access to the inventory inside. “Storage Blob Data Contributor” gives you access to the actual inventory — you can add boxes, remove boxes, and read labels.

SQL and Database Roles

Azure SQL (Management Plane)

| Role | What It Does |
|---|---|
| SQL Server Contributor | Manage SQL servers and databases (create, delete, configure) but NOT access data |
| SQL DB Contributor | Manage SQL databases (but NOT the server and NOT data access) |
| SQL Security Manager | Manage security policies, auditing, threat detection |

Azure SQL (Data Plane — Inside the Database)

These are SQL-level roles, NOT Azure RBAC. Managed inside the database:

-- Database-level roles (run inside Azure SQL)
ALTER ROLE db_datareader ADD MEMBER [naveen-synapse-ws];   -- Read all tables
ALTER ROLE db_datawriter ADD MEMBER [naveen-synapse-ws];   -- Write all tables
ALTER ROLE db_owner ADD MEMBER [admin_user];               -- Full control

| SQL Role | What It Does |
|---|---|
| db_datareader | SELECT on all tables and views |
| db_datawriter | INSERT, UPDATE, DELETE on all tables |
| db_owner | Full control including schema changes |
| db_ddladmin | CREATE, ALTER, DROP tables and schemas |

The two-step process for Managed Identity access to SQL:

-- Step 1: Create a user for the managed identity
CREATE USER [naveen-synapse-ws] FROM EXTERNAL PROVIDER;

-- Step 2: Assign SQL database roles
ALTER ROLE db_datareader ADD MEMBER [naveen-synapse-ws];
ALTER ROLE db_datawriter ADD MEMBER [naveen-synapse-ws];
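If you onboard several managed identities, generating these statements is less error-prone than copy-pasting them. The sketch below produces the same two-step T-SQL for any identity name. The helper names and the allowed-role list are illustrative assumptions, and you still run the output yourself inside the database while connected as an Azure AD admin.

```python
ALLOWED_ROLES = {"db_datareader", "db_datawriter", "db_ddladmin", "db_owner"}


def quote_identifier(name):
    """Bracket-quote a T-SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def grant_statements(identity, roles=("db_datareader", "db_datawriter")):
    """Return the CREATE USER statement followed by one ALTER ROLE per role."""
    unknown = set(roles) - ALLOWED_ROLES
    if unknown:
        raise ValueError(f"unexpected role(s): {sorted(unknown)}")
    user = quote_identifier(identity)
    statements = [f"CREATE USER {user} FROM EXTERNAL PROVIDER;"]
    statements += [f"ALTER ROLE {role} ADD MEMBER {user};" for role in roles]
    return statements


for statement in grant_statements("naveen-synapse-ws"):
    print(statement)
```

For a read-only identity, pass `roles=("db_datareader",)` so the write role is never granted in the first place.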

Synapse Analytics Roles

Synapse has its OWN role system in addition to Azure RBAC:

Azure RBAC Roles (on the Synapse Resource)

| Role | What It Does |
|---|---|
| Contributor | Manage the Synapse workspace (create, configure, delete) |
| Reader | View the workspace in Azure Portal |

Synapse-Specific Roles (Inside Synapse Studio)

| Role | Who Gets It | What They Can Do |
|---|---|---|
| Synapse Administrator | Platform admins | Full control of everything |
| Synapse SQL Administrator | DBA | Manage SQL pools and run queries |
| Synapse Apache Spark Administrator | Spark leads | Manage Spark pools and notebooks |
| Synapse Contributor | Data engineers | Create and edit pipelines, notebooks, scripts |
| Synapse Artifact User | Analysts | Run pipelines and read data (no edit) |
| Synapse Credential User | Pipeline users | Use credentials in linked services |

The combo needed for a data engineer:

  • Azure RBAC: Contributor on the Synapse resource (or resource group)
  • Synapse Role: Synapse Contributor (create pipelines, notebooks)
  • Storage Role: Storage Blob Data Contributor on ADLS Gen2 (read/write data)

Databricks-Related Roles

Azure RBAC (on the Databricks Workspace)

| Role | What It Does |
|---|---|
| Contributor | Manage the Databricks workspace |
| Reader | View workspace in Azure Portal |

Key Vault Access for Databricks

| Role | Who Gets It | Why |
|---|---|---|
| Key Vault Secrets User | AzureDatabricks app (ID: 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d) | So Databricks can read secrets via Secret Scopes |

Storage for Databricks (Unity Catalog)

| Role | Who Gets It | Why |
|---|---|---|
| Storage Blob Data Contributor | Access Connector managed identity | So Unity Catalog can read/write to ADLS Gen2 |

Storage for Databricks (Direct Access)

| Role | Who Gets It | Why |
|---|---|---|
| Storage Blob Data Contributor | Service Principal or user | So notebooks can read/write to ADLS Gen2 via spark.conf.set |
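For the direct-access path, the role assignment is only half of the setup; the notebook also has to tell Spark how to authenticate as the service principal. The sketch below builds the ABFS OAuth settings for one storage account. The function name and the placeholder values are illustrative, and in a real notebook the secret would normally come from a secret scope (for example via `dbutils.secrets.get`) rather than a literal string.

```python
def abfss_oauth_conf(storage_account, tenant_id, client_id, client_secret):
    """Spark settings for service principal (OAuth) access to ADLS Gen2."""
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{host}": client_id,
        f"fs.azure.account.oauth2.client.secret.{host}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{host}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# Placeholder values; never hard-code a real client secret.
conf = abfss_oauth_conf("naveensynapsedl", "<tenant-id>", "<client-id>", "<client-secret>")
for key, value in conf.items():
    # In a Databricks notebook this line would be: spark.conf.set(key, value)
    print(key)
```

If these settings are correct but the service principal lacks Storage Blob Data Contributor on the account, the read or write still fails with 403.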

Data Factory Roles

Azure RBAC (on the ADF Resource)

| Role | What It Does |
|---|---|
| Data Factory Contributor | Create, edit, deploy, and manage ADF resources (pipelines, datasets, etc.) |
| Contributor | Same as above plus manage the ADF resource itself |
| Reader | View pipelines and runs (no editing) |

Storage Access for ADF Managed Identity

| Role | Where to Assign | Why |
|---|---|---|
| Storage Blob Data Contributor | On ADLS Gen2 storage account | So ADF pipelines can write Parquet/Delta to the data lake |

SQL Access for ADF Managed Identity

| Role | Where to Assign | Why |
|---|---|---|
| db_datareader + db_datawriter | Inside Azure SQL Database | So ADF pipelines can read source tables and write audit logs |

Key Vault Roles

| Role | What It Does | Who Gets It |
|---|---|---|
| Key Vault Administrator | Full control of Key Vault and all secrets | Platform admins only |
| Key Vault Secrets Officer | Create, read, update, delete secrets | DevOps engineers who manage secrets |
| Key Vault Secrets User | Read secrets only (cannot create or modify) | Applications, managed identities, Databricks |
| Key Vault Reader | View Key Vault metadata (NOT secret values) | Auditors |

Most common assignment:

AzureDatabricks (2ff814a6...) → Key Vault Secrets User → on Key Vault
Synapse Managed Identity → Key Vault Secrets User → on Key Vault
ADF Managed Identity → Key Vault Secrets User → on Key Vault

Networking Roles

| Role | What It Does |
|---|---|
| Network Contributor | Manage VNets, subnets, NSGs, route tables, VPN gateways |
| DNS Zone Contributor | Manage DNS zones and records |
| Private DNS Zone Contributor | Manage private DNS zones (needed for private endpoints) |

Compute Roles (VMs, AKS)

| Role | What It Does |
|---|---|
| Virtual Machine Contributor | Create, manage, delete VMs (NOT access to the OS or data inside) |
| Virtual Machine Administrator Login | Login to VMs as administrator (RDP/SSH) |
| Virtual Machine User Login | Login to VMs as standard user |

Monitoring and Logging Roles

| Role | What It Does |
|---|---|
| Monitoring Contributor | Read and write monitoring data (metrics, alerts) |
| Monitoring Reader | Read monitoring data (dashboards, alerts) |
| Log Analytics Contributor | Manage Log Analytics workspaces and queries |
| Log Analytics Reader | Read logs and run queries |

Microsoft Fabric Roles

Microsoft Fabric uses workspace-level roles instead of Azure RBAC for most operations. The permission model is simpler than Synapse but different from the rest of Azure:

Fabric Workspace Roles

| Role | What They Can Do | Who Gets It |
|---|---|---|
| Admin | Everything: manage settings, add/remove members, delete workspace | Workspace owner, team lead |
| Member | Create, edit, delete all items + share reports + publish apps | Data engineers, senior analysts |
| Contributor | Create, edit, delete items (cannot share reports or manage access) | Junior engineers, data scientists |
| Viewer | View reports and dashboards only (cannot edit or create anything) | Business analysts, managers, executives |

Fabric vs Synapse vs Databricks Permission Models

| Aspect | Azure RBAC (ADF, Storage) | Synapse | Databricks (Unity Catalog) | Fabric |
|---|---|---|---|---|
| Where roles are assigned | Azure Portal → IAM | Azure Portal + Synapse Studio | Databricks workspace + Unity Catalog GRANT/REVOKE | Fabric workspace settings |
| Identity source | Azure AD (Entra ID) | Azure AD | Azure AD (SCIM sync) | Azure AD (Microsoft 365) |
| Data-level security | RBAC + POSIX ACLs on ADLS | SQL roles + Synapse roles | GRANT/REVOKE on catalog.schema.table | Workspace roles + SQL endpoint RLS |
| Row-level security | Not at storage level | SQL RLS on Dedicated Pool | Row filters on Unity Catalog tables | RLS via SQL analytics endpoint or Warehouse |
| Column masking | Not at storage level | Dynamic Data Masking on SQL | Column masks on Unity Catalog tables | Dynamic Data Masking on Warehouse |

The key difference: In the traditional Azure stack (ADF + ADLS + SQL), you manage permissions across multiple services using Azure RBAC. In Fabric, most permissions are consolidated at the workspace level — one place to manage who can do what. In Databricks with Unity Catalog, data-level permissions are managed through SQL GRANT/REVOKE statements.

The Decision Framework

When you need to assign a role, ask three questions:

QUESTION 1: Does the person/service need to MANAGE the resource
            or ACCESS THE DATA inside it?

  Manage the resource → Owner / Contributor / Reader
  Access the data    → Data-specific role (Storage Blob Data..., db_datareader, etc.)

QUESTION 2: What level of data access is needed?

  Read only        → ...Reader (Storage Blob Data Reader, db_datareader)
  Read + Write     → ...Contributor (Storage Blob Data Contributor, db_datareader + db_datawriter)
  Full control     → ...Owner (Storage Blob Data Owner, db_owner)

QUESTION 3: What is the minimum scope needed?

  One resource     → Assign on that specific resource
  All resources in a group → Assign on the resource group
  Everything       → Assign on the subscription (be very careful!)
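Question 3 can be answered mechanically: given the resources an identity must reach, find the narrowest single scope that contains all of them. The sketch below does this by comparing resource ID path segments. It is a simplified illustration with hypothetical function names and a placeholder subscription ID, and it ignores management groups. In practice, separate assignments on each resource are often tighter than one shared scope.

```python
def _valid_scope_length(n):
    """Largest segment count <= n that forms a valid scope."""
    if n >= 8:
        return n - (n % 2)  # a resource or nested child resource
    if n >= 4:
        return 4            # a resource group
    if n >= 2:
        return 2            # a subscription
    return 0


def narrowest_common_scope(resource_ids):
    """Narrowest single scope that contains every given resource ID."""
    split_ids = [rid.strip("/").split("/") for rid in resource_ids]
    common = []
    for segments in zip(*split_ids):
        if len({s.lower() for s in segments}) != 1:
            break
        common.append(segments[0])
    keep = _valid_scope_length(len(common))
    if keep == 0:
        raise ValueError("resources do not share a subscription")
    return "/" + "/".join(common[:keep])


SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"  # placeholder ID
DEV_RG = f"{SUB}/resourceGroups/rg-dataplatform-dev"
DEV_LAKE = f"{DEV_RG}/providers/Microsoft.Storage/storageAccounts/devdatalake"
DEV_VAULT = f"{DEV_RG}/providers/Microsoft.KeyVault/vaults/dev-keyvault"
PROD_LAKE = f"{SUB}/resourceGroups/rg-dataplatform-prod/providers/Microsoft.Storage/storageAccounts/proddatalake"

print(narrowest_common_scope([DEV_LAKE]))             # the resource itself
print(narrowest_common_scope([DEV_LAKE, DEV_VAULT]))  # the dev resource group
print(narrowest_common_scope([DEV_LAKE, PROD_LAKE]))  # the subscription: be careful
```

If the answer comes back as the subscription, treat that as a prompt to split the work into two assignments instead.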

Rule of thumb:

Pipeline writes to ADLS?     → Storage Blob Data Contributor (on storage account)
Pipeline reads from SQL?      → db_datareader (inside SQL database)
Databricks reads Key Vault?   → Key Vault Secrets User (on Key Vault)
User browses Azure Portal?    → Reader (on resource group)
Engineer deploys pipelines?   → Contributor (on resource group) + Synapse Contributor

Real-World Scenarios: Which Role for Which Task

Scenario 1: Data Engineer Building Pipelines

| Resource | Role | Why |
|---|---|---|
| Resource Group | Contributor | Deploy/manage resources |
| Synapse Workspace | Synapse Contributor | Create pipelines, notebooks |
| ADLS Gen2 (source + sink) | Storage Blob Data Contributor | Read/write Parquet files |
| Azure SQL Database | db_datareader + db_datawriter | Read source, write audit logs |
| Key Vault | Key Vault Secrets User | Read connection strings |

Scenario 2: Data Analyst Reading Reports

| Resource | Role | Why |
|---|---|---|
| Resource Group | Reader | View resources in portal |
| Synapse Workspace | Synapse Artifact User | Run queries, view data |
| ADLS Gen2 | Storage Blob Data Reader | Read data lake files |
| Azure SQL | db_datareader | Read tables |

Scenario 3: Synapse Managed Identity

| Resource | Role | Identity |
|---|---|---|
| ADLS Gen2 | Storage Blob Data Contributor | naveen-synapse-ws (managed identity) |
| Azure SQL | db_datareader + db_datawriter | naveen-synapse-ws |
| Key Vault | Key Vault Secrets User | naveen-synapse-ws |

Scenario 4: Databricks with Unity Catalog

| Resource | Role | Identity |
|---|---|---|
| ADLS Gen2 | Storage Blob Data Contributor | Access Connector managed identity |
| Key Vault | Key Vault Secrets User | AzureDatabricks (2ff814a6…) |

Scenario 5: DevOps Engineer Deploying Pipelines

| Resource | Role | Why |
|---|---|---|
| Resource Group | Contributor | Create/modify all resources |
| Key Vault | Key Vault Secrets Officer | Create and manage secrets |
| Subscription (if deploying new RGs) | Contributor | Create resource groups |

The Principle of Least Privilege

Always assign the MINIMUM role needed. If someone only needs to read data, give them Storage Blob Data Reader, not Storage Blob Data Contributor. If a pipeline only writes to one storage account, assign the role on THAT storage account, not the entire resource group.

❌ Bad:  Owner on the subscription (can do anything, including delete everything)
❌ Bad:  Contributor on the subscription (can deploy anything anywhere)
✅ Good: Storage Blob Data Contributor on the specific storage account
✅ Good: db_datareader on the specific database

Real-life analogy: You do not give the pizza delivery driver the master key to your building. You give them access to the lobby intercom. Minimum access for the specific task.

Custom Roles

When built-in roles do not fit, create a custom role:

{
    "Name": "Data Lake Writer",
    "IsCustom": true,
    "Description": "Can read and write blobs but not delete them or manage the account",
    "Actions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/read"
    ],
    "NotActions": [],
    "DataActions": [
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read",
        "Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write"
    ],
    "NotDataActions": [],
    "AssignableScopes": ["/subscriptions/<sub-id>"]
}

When to use custom roles: Only when no built-in role fits AND the principle of least privilege demands more granularity. Most teams never need custom roles.

How to Assign Roles (Step by Step)

  1. Go to the resource (storage account, SQL server, Key Vault, etc.)
  2. Click Access Control (IAM) in the left menu
  3. Click + Add → Add role assignment
  4. Role tab: Search and select the role (e.g., “Storage Blob Data Contributor”)
  5. Members tab: Select the identity type:
       • User, group, or service principal: for people, groups, or app registrations
       • Managed identity: for Synapse, ADF, Databricks managed identities
  6. Click + Select members → search by name
  7. Click Review + assign

Propagation time: Role assignments can take up to 10 minutes to take effect. If you get 403 immediately after assigning, wait and try again.
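In automation (a deployment script that assigns a role and then immediately uses it), "wait and try again" is usually written as a bounded retry. The sketch below shows one way to do that. The `Forbidden` exception is a hypothetical stand-in for whatever 403 error your client library raises, and the attempt count and delay are arbitrary defaults.

```python
import time


class Forbidden(Exception):
    """Hypothetical stand-in for the 403 error raised by your client library."""


def retry_while_forbidden(operation, attempts=5, delay_seconds=30, sleep=time.sleep):
    """Call operation(), retrying on Forbidden to ride out propagation delay."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Forbidden:
            if attempt == attempts:
                raise  # still failing: probably a wrong role, identity, or scope
            sleep(delay_seconds)


# Demo with a fake operation that fails twice and then succeeds.
calls = {"count": 0}


def fake_write():
    calls["count"] += 1
    if calls["count"] < 3:
        raise Forbidden("403 Forbidden")
    return "written"


print(retry_while_forbidden(fake_write, sleep=lambda seconds: None))
```

Keep the retry bounded: if the call still returns 403 after the propagation window, the cause is a missing or mis-scoped assignment, and more waiting will not fix it.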

How to Check What Roles Someone Has

Resource → Access Control (IAM) → Role assignments tab
→ Search by user/identity name
→ See all roles assigned at this scope and inherited from above

Or check at the subscription level to see ALL assignments across all resources.

Common Permission Errors and Fixes

| Error | Likely Cause | Fix |
|---|---|---|
| “403 Forbidden” writing to ADLS | Missing Storage Blob Data Contributor | Assign on the storage account to the correct identity |
| “403 Forbidden” reading Key Vault | Missing Key Vault Secrets User | Assign to AzureDatabricks (2ff814a6…) or the managed identity |
| “Cannot connect to SQL” | Managed Identity not added as SQL user | Run CREATE USER [identity-name] FROM EXTERNAL PROVIDER inside SQL |
| “Cannot create pipeline” in Synapse | Missing Synapse Contributor role | Assign in Synapse Studio → Manage → Access Control |
| “Contributor” assigned but cannot read blobs | Contributor = management plane, not data plane | Add Storage Blob Data Contributor separately |
| Role assigned but still getting 403 | Propagation delay (up to 10 minutes) | Wait 10 minutes, restart cluster/session, try again |
| “Authorization failed” on resource group | Role assigned on specific resource, not resource group | Assign on the correct scope (resource vs resource group) |
| Cannot assign roles to others | Assigning roles requires Owner or User Access Administrator; Contributor cannot do it | Ask an Owner or User Access Administrator to assign or elevate temporarily |

Interview Questions

Q: What is the difference between Contributor and Storage Blob Data Contributor? A: Contributor operates at the management plane — it lets you create, modify, and delete the storage ACCOUNT itself. Storage Blob Data Contributor operates at the data plane — it lets you read, write, and delete BLOBS inside the account. Having Contributor does NOT give you data access. You need both for full access.

Q: What is the principle of least privilege and how do you apply it in Azure? A: Assign the minimum role needed at the narrowest scope possible. If a pipeline needs to write to one storage account, assign Storage Blob Data Contributor on THAT specific account — not Contributor on the resource group. This limits the blast radius if credentials are compromised.

Q: What identity should a Synapse pipeline use to access ADLS Gen2? A: The Synapse workspace’s system-assigned managed identity. Assign Storage Blob Data Contributor to the managed identity on the storage account. No passwords to manage, automatic credential rotation, and full audit trail.

Q: What is the difference between a Service Principal and a Managed Identity? A: Both are application identities. A Service Principal requires you to create and manage client secrets (passwords). A Managed Identity is managed by Azure — no secrets to create, rotate, or store. Always prefer Managed Identity for Azure-to-Azure communication.

Q: What role does the AzureDatabricks App ID need on Key Vault? A: Key Vault Secrets User. The App ID 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d is Databricks’ global service principal. It needs this role to read secrets from Key Vault via Secret Scopes.

Q: How does role inheritance work in Azure? A: Roles assigned at a higher scope (subscription, resource group) automatically inherit to all resources below. A Contributor on a resource group is Contributor on every resource in that group. You cannot block inherited roles at a lower scope. This is why you should assign roles at the narrowest scope possible — assigning Contributor at the subscription level gives access to every resource in the subscription.

Q: What roles does a Synapse managed identity need to run a pipeline that reads from SQL and writes to ADLS? A: Three separate assignments: Storage Blob Data Contributor on the ADLS Gen2 storage account (data plane — write Parquet), db_datareader inside the Azure SQL database (SQL role — read source tables), and Key Vault Secrets User on the Key Vault if the pipeline reads connection strings from secrets. The first is Azure RBAC. The second is a SQL database role. The third is Azure RBAC again.

Q: Why would you use a group for role assignments instead of individual users? A: Scale and management. If 10 data engineers need Storage Blob Data Contributor on a storage account, you can make 10 individual assignments or create one “Data Engineering Team” group and make one assignment. When a new engineer joins, add them to the group — they automatically get all the group’s roles. When someone leaves, remove them from the group. No need to audit and remove individual assignments across dozens of resources.

Q: What happens if a role assignment conflicts, with one rule allowing and another denying? A: Azure RBAC is additive. If a user has Reader from one assignment and Contributor from another (via group membership or different scope), they effectively have Contributor (the union of both). The only way to restrict is with deny assignments, and you cannot create those directly: Azure creates them on your behalf, for example through Azure Blueprints or managed applications. This is why the principle of least privilege is critical: you cannot easily take away permissions once granted through multiple paths.

Wrapping Up

Azure RBAC is not about memorizing 200 roles. It is about understanding the pattern: management plane vs data plane, four identity types, and the principle of least privilege.

The roles you will use 90% of the time as a data engineer:

| Role | When |
|---|---|
| Storage Blob Data Contributor | Pipelines writing to ADLS |
| Storage Blob Data Reader | Analysts reading from ADLS |
| Key Vault Secrets User | Databricks/Synapse/ADF reading secrets |
| Contributor | Engineers deploying infrastructure |
| Synapse Contributor | Engineers building pipelines in Synapse |
| db_datareader / db_datawriter | Accessing data inside Azure SQL |

Learn these six. The rest are variations of the same pattern.


Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.