OneLake Deep Dive: Architecture, ADLS Gen2 Compatibility, OneLake File Explorer, Multi-Cloud Shortcuts, Storage Billing, and the Foundation of Microsoft Fabric

Every Fabric item — Lakehouse, Warehouse, KQL Database, Semantic Model — stores its data in ONE place: OneLake. It is the unified storage layer underneath all of Fabric. Understanding OneLake is like understanding the foundation of a building — everything above it depends on it being solid, organized, and accessible.

This post goes beyond the basics. We cover OneLake’s architecture, its ADLS Gen2 API compatibility (meaning you can access OneLake with existing ADLS tools), the OneLake File Explorer (browse from Windows), multi-cloud shortcuts (access S3 and GCS without copying), storage billing, and the patterns that make OneLake the most important Fabric component.

Think of OneLake as a single massive filing cabinet shared across the entire company. Every department (workspace) gets drawers (lakehouses, warehouses). Every drawer has folders (tables, files). The filing cabinet has ONE address — not 50 different storage accounts scattered across Azure. Anyone with the right key (permissions) can access any drawer from anywhere (including non-Microsoft tools via ADLS Gen2 APIs).

What Is OneLake?
OneLake vs ADLS Gen2 vs S3
The OneLake Hierarchy
OneLake Architecture
One Tenant = One OneLake
Namespaces and Paths
ADLS Gen2 API Compatibility
Accessing OneLake from External Tools
Azure Storage Explorer
AzCopy
ADLS Gen2 SDK (Python)
Databricks (External)
OneLake File Explorer (Windows Desktop)
Installing and Using
Sync Local Files with OneLake
Multi-Cloud Shortcuts
Shortcut Caching
How Shortcuts Access S3 and GCS
Shortcut Caching (Reduce Egress)
OneLake Data Hub
Discovering Data Across Workspaces
Storage Billing and Optimization
What Counts as Storage
Soft Delete and Recovery
BCDR (Disaster Recovery) Replication
Storage Optimization Tips
Real-World OneLake Patterns
Pattern 1: Centralized Data Lake
Pattern 2: Hub-and-Spoke
Pattern 3: Multi-Cloud Unified Lake
OneLake Security
Workspace Roles + Data Access Roles
Firewall and Private Endpoints
Common Mistakes
Interview Questions
Wrapping Up

What Is OneLake?

OneLake is Fabric’s built-in, unified storage layer — a single data lake for your entire organization. Every Fabric workspace, every lakehouse, every warehouse writes its data to OneLake. There is nothing to provision, no storage accounts to create, no access keys to manage.

Traditional approach:
  Team A: ADLS Gen2 account → storageA.dfs.core.windows.net
  Team B: ADLS Gen2 account → storageB.dfs.core.windows.net
  Team C: ADLS Gen2 account → storageC.dfs.core.windows.net
  → 3 storage accounts, 3 sets of credentials, 3 different access controls

OneLake approach:
  Team A: OneLake → workspace_A/lakehouse_A/Tables/...
  Team B: OneLake → workspace_B/lakehouse_B/Tables/...
  Team C: OneLake → workspace_C/lakehouse_C/Tables/...
  → 1 storage layer, 1 set of credentials (Microsoft Entra ID, formerly Azure AD), 1 governance model

OneLake vs ADLS Gen2 vs S3

Feature	OneLake	ADLS Gen2	Amazon S3
Provisioning	Automatic (built into Fabric)	Manual (create storage account)	Manual (create bucket)
Authentication	Microsoft Entra ID, formerly Azure AD (automatic in Fabric)	Access key, SAS, managed identity, service principal	IAM, access key
Organization	Tenant → Workspace → Item → Tables/Files	Account → Container → Folders	Bucket → Prefix
Format	Delta Lake (default for tables)	Any	Any
Multi-workload	All Fabric workloads read/write natively	Needs connectors	Needs connectors
Governance	Purview integrated, sensitivity labels	Purview integration available	Amazon Macie
Shortcuts	Internal + external (ADLS, S3, GCS)	N/A	N/A
Billing	~$0.023/GB/month	~$0.020-0.046/GB/month	~$0.023/GB/month

The OneLake Hierarchy

OneLake (one per tenant)
  └── Workspace: DataEng_Prod
        ├── Lakehouse: bronze_lakehouse
        │     ├── Tables/
        │     │     ├── raw_customers/ (Delta files)
        │     │     └── raw_orders/ (Delta files)
        │     └── Files/
        │           └── uploads/ (raw CSV, JSON)
        │
        ├── Lakehouse: silver_lakehouse
        │     └── Tables/
        │           ├── customers_clean/
        │           └── orders_validated/
        │
        ├── Warehouse: gold_warehouse
        │     └── gold/
        │           ├── dim_customer/ (Parquet, managed)
        │           └── fact_sales/ (Parquet, managed)
        │
        └── KQL Database: iot_analytics
              └── sensor_readings/ (columnar store)

Every item writes to OneLake. Each location can be addressed by workspace and item GUIDs:

onelake.dfs.fabric.microsoft.com
  /{workspace_id}/{item_id}/Tables/{table_name}/
  /{workspace_id}/{item_id}/Files/{folder_name}/

OneLake Architecture

OneLake is built on top of Azure Data Lake Storage Gen2 technology, but abstracted into a managed service. You never see or manage the underlying storage accounts — Fabric handles provisioning, scaling, replication, and security automatically.

One Tenant = One OneLake

Unlike ADLS Gen2 where you can create unlimited storage accounts, OneLake is one per tenant. Your entire organization shares a single OneLake instance. This is intentional — it prevents data silos. Instead of 50 storage accounts scattered across teams, all data lives under one roof with unified governance.

Namespaces and Paths

Every item in OneLake can also be reached through a predictable, name-based path:

https://onelake.dfs.fabric.microsoft.com/{workspace_name}/{item_name}.{item_type}/{section}/{path}

Examples:
  /DataEng_Prod/bronze_lakehouse.Lakehouse/Tables/customers/     → Delta table
  /DataEng_Prod/bronze_lakehouse.Lakehouse/Files/raw_csv/data.csv → Raw file
  /DataEng_Prod/gold_warehouse.Warehouse/dbo/fact_sales/          → Warehouse table

This structure matches ADLS Gen2 conventions, which is why existing ADLS tools work with OneLake by simply changing the endpoint URL.
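
The path pattern above can be assembled mechanically. The following is a minimal sketch using only the Python standard library; the function name `onelake_url` and its parameters are illustrative, not part of any Fabric SDK, and the percent-encoding of names that contain spaces is an assumption about how such names should be written in a URL.

```python
from urllib.parse import quote

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_url(workspace: str, item: str, item_type: str,
                section: str, path: str = "") -> str:
    """Build a name-based OneLake URL:
    https://<host>/{workspace}/{item}.{item_type}/{section}/{path}
    """
    segments = [workspace, f"{item}.{item_type}", section]
    # Split the optional sub-path so each segment is encoded on its own
    segments.extend(part for part in path.split("/") if part)
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return f"https://{ONELAKE_HOST}/{encoded}"


if __name__ == "__main__":
    # Mirrors the "Raw file" example from the list above
    print(onelake_url("DataEng_Prod", "bronze_lakehouse", "Lakehouse",
                      "Files", "raw_csv/data.csv"))
```

Keeping the item type as a separate argument makes it harder to forget the `.Lakehouse` or `.Warehouse` suffix, which is the part most often left out of hand-written paths.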

ADLS Gen2 API Compatibility

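OneLake implements the ADLS Gen2 REST API, so most tools that work with ADLS Gen2 also work with OneLake with little more than a URL change. The main differences: authentication goes through Microsoft Entra ID rather than storage account keys, and workspaces and items are created through Fabric, not through the storage API:

ADLS Gen2 endpoint: https://storageaccount.dfs.core.windows.net/container/path
OneLake endpoint:   https://onelake.dfs.fabric.microsoft.com/workspace/item.itemtype/path

Same API, different endpoint. The workspace takes the place of the container, and the item (for example bronze_lakehouse.Lakehouse) is the first folder inside it.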
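
To see how the two endpoint shapes line up, it can help to split a OneLake URL into the pieces an ADLS Gen2 client expects. This is a small standard-library sketch; `OneLakeLocation` and `parse_onelake_url` are hypothetical helper names introduced here for illustration.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlparse

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


class OneLakeLocation(NamedTuple):
    workspace: str  # plays the role of the ADLS Gen2 container (file system)
    item: str       # e.g. "bronze_lakehouse.Lakehouse"
    path: str       # everything below the item, e.g. "Files/raw_csv/data.csv"


def parse_onelake_url(url: str) -> OneLakeLocation:
    """Split a OneLake URL into workspace, item, and the remaining path."""
    parsed = urlparse(url)
    if parsed.hostname != ONELAKE_HOST:
        raise ValueError(f"not a OneLake URL: {url}")
    segments = [unquote(s) for s in parsed.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected at least /workspace/item in the URL path")
    return OneLakeLocation(segments[0], segments[1], "/".join(segments[2:]))


if __name__ == "__main__":
    loc = parse_onelake_url(
        "https://onelake.dfs.fabric.microsoft.com/"
        "DataEng_Prod/bronze_lakehouse.Lakehouse/Files/raw_csv/data.csv"
    )
    print(loc.workspace, loc.item, loc.path)
```

With an ADLS Gen2 client, `loc.workspace` would be passed where a container name normally goes, and `loc.item + "/" + loc.path` where the file path goes.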

Accessing OneLake from External Tools

Azure Storage Explorer

Open Azure Storage Explorer
Click Connect → ADLS Gen2 or OneLake
URL: https://onelake.dfs.fabric.microsoft.com/
Sign in with Azure AD
Browse workspaces → items → tables/files

AzCopy

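# Sign in first: azcopy login
# OneLake is not on AzCopy's default list of trusted Microsoft domains, so pass
# --trusted-microsoft-suffixes when authenticating with Microsoft Entra ID

# Copy a file TO OneLake
azcopy copy "local_file.csv" "https://onelake.dfs.fabric.microsoft.com/workspace_name/lakehouse_name.Lakehouse/Files/uploads/local_file.csv" --trusted-microsoft-suffixes "fabric.microsoft.com"

# Copy FROM OneLake
azcopy copy "https://onelake.dfs.fabric.microsoft.com/workspace_name/lakehouse_name.Lakehouse/Files/data.csv" "local_copy.csv" --trusted-microsoft-suffixes "fabric.microsoft.com"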

ADLS Gen2 SDK (Python)

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Connect to OneLake using the SAME ADLS Gen2 SDK
credential = DefaultAzureCredential()
service_client = DataLakeServiceClient(
    account_url="https://onelake.dfs.fabric.microsoft.com",
    credential=credential
)

# List files in a lakehouse
file_system_client = service_client.get_file_system_client("workspace_name")
paths = file_system_client.get_paths(path="lakehouse_name.Lakehouse/Files/")
for path in paths:
    print(path.name)
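
Building on the listing above, here is a hedged sketch of a helper that returns just the table names under an item's Tables folder. `list_table_names` is a hypothetical function; it relies only on `get_paths(path=..., recursive=False)` and on the `name` and `is_directory` attributes of each returned path, and it takes the last path segment so it does not depend on how much of the path the service echoes back. The `FakeFileSystemClient` is a stand-in so the sketch runs without credentials; in practice you would pass the `file_system_client` created above.

```python
from types import SimpleNamespace
from typing import List


def list_table_names(file_system_client, item: str) -> List[str]:
    """Return the folder names directly under {item}/Tables, sorted."""
    paths = file_system_client.get_paths(path=f"{item}/Tables", recursive=False)
    return sorted(
        p.name.rstrip("/").rsplit("/", 1)[-1]  # keep only the last segment
        for p in paths
        if p.is_directory                       # each Delta table is a folder
    )


class FakeFileSystemClient:
    """Offline stand-in for a real file system client (illustration only)."""

    def __init__(self, entries):
        self._entries = entries  # list of (path, is_directory) tuples

    def get_paths(self, path=None, recursive=True):
        prefix = f"{path}/"
        return [
            SimpleNamespace(name=name, is_directory=is_dir)
            for name, is_dir in self._entries
            if name.startswith(prefix)
        ]


fake = FakeFileSystemClient([
    ("lakehouse_name.Lakehouse/Tables/raw_orders", True),
    ("lakehouse_name.Lakehouse/Tables/raw_customers", True),
    ("lakehouse_name.Lakehouse/Files/data.csv", False),
])

if __name__ == "__main__":
    print(list_table_names(fake, "lakehouse_name.Lakehouse"))
```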

Databricks (External)

# Access OneLake from a Databricks notebook (outside Fabric)
df = spark.read.format("delta").load(
    "abfss://workspace_name@onelake.dfs.fabric.microsoft.com/lakehouse_name.Lakehouse/Tables/customers"
)
# Uses ADLS Gen2 protocol (abfss://) — Databricks treats OneLake like any ADLS account

OneLake File Explorer (Windows Desktop)

A Windows app that syncs OneLake data to your local file system:

Installing and Using

Download from Microsoft Store → search “OneLake File Explorer”
Install → sign in with Azure AD
OneLake appears as a drive in File Explorer (like OneDrive)
Browse: OneLake → workspace → lakehouse → Files/Tables
Copy files to/from OneLake by dragging and dropping

File Explorer:
  OneLake - Contoso
    └── DataEng_Prod
          ├── bronze_lakehouse
          │     ├── Files
          │     │     └── uploads (drag CSV here to upload!)
          │     └── Tables
          │           ├── raw_customers
          │           └── raw_orders
          └── silver_lakehouse
                └── Tables
                      └── customers_clean

Use for: Quick file uploads, browsing table structures, downloading small files for local analysis.

Sync Local Files with OneLake

OneLake File Explorer works like OneDrive — files sync between your local machine and OneLake. Drag a CSV into the Files folder on your desktop, and it appears in your Lakehouse within seconds. Download a Delta table’s Parquet files locally for offline analysis. The sync is bidirectional for the Files section. Tables (Delta) are read-only in File Explorer — you can browse the Parquet files but should not modify them locally.

Multi-Cloud Shortcuts

OneLake shortcuts make external data (S3, GCS) appear as if it is local:

OneLake Lakehouse:
  Tables/
    local_customers/ (actual Delta files in OneLake)
    aws_events/      ← SHORTCUT to s3://company-events/processed/
    gcp_analytics/   ← SHORTCUT to gs://analytics-bucket/reports/

One notebook query:
  SELECT * FROM local_customers c
  JOIN aws_events e ON c.id = e.customer_id
  JOIN gcp_analytics g ON c.id = g.customer_id

Three clouds. One query. Zero data movement.
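
Shortcuts like aws_events can be created in the Fabric UI, and they can also be scripted. The sketch below only builds the request for an S3 shortcut and does not send it. It reflects my understanding of the Fabric REST API's Create Shortcut operation (the URL shape and the `amazonS3` target fields are assumptions to verify against the current API reference), and the IDs, bucket region, and connection ID are placeholders.

```python
import json
from typing import Dict, Tuple

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def s3_shortcut_request(workspace_id: str, item_id: str, name: str,
                        bucket_url: str, subpath: str, connection_id: str,
                        parent_path: str = "Tables") -> Tuple[str, Dict]:
    """Return (url, json_body) for creating an S3 shortcut in a lakehouse.

    Assumed request shape; check the Fabric REST API reference before use.
    """
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{item_id}/shortcuts"
    body = {
        "path": parent_path,        # where the shortcut appears in the item
        "name": name,               # shortcut name, e.g. "aws_events"
        "target": {
            "amazonS3": {
                "location": bucket_url,         # bucket endpoint URL
                "subpath": subpath,             # folder inside the bucket
                "connectionId": connection_id,  # Fabric connection holding the S3 credentials
            }
        },
    }
    return url, body


if __name__ == "__main__":
    url, body = s3_shortcut_request(
        workspace_id="<workspace-guid>",        # placeholder
        item_id="<lakehouse-guid>",             # placeholder
        name="aws_events",
        bucket_url="https://company-events.s3.us-east-1.amazonaws.com",  # region is hypothetical
        subpath="/processed",
        connection_id="<connection-guid>",      # placeholder
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

The request would be sent as a POST with a Microsoft Entra ID bearer token; that step is left out here because it depends on how your environment obtains tokens.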

Shortcut Caching

For cross-cloud shortcuts, enable caching to avoid repeated egress fees:

First read:  OneLake → S3 (egress fee) → data + cached locally
Second read: OneLake → local cache (no egress fee) → instant

Enable: Workspace settings → OneLake → Cache setting → On

How Shortcuts Access S3 and GCS

When you create a shortcut to Amazon S3 or Google Cloud Storage, Fabric reads the data at query time via the cloud provider’s API (S3 API for AWS, GCS API for Google). Authentication uses the credentials you provided when creating the shortcut (Access Key for S3, Service Account Key for GCS). The data is NOT copied — every read goes directly to the source. This means you pay egress fees on the source cloud when Fabric reads the data.

Shortcut Caching (Reduce Egress)

To reduce cross-cloud egress costs, enable shortcut caching. When enabled, Fabric caches the data locally in OneLake after the first read. Subsequent reads use the cached copy instead of fetching from S3/GCS again. The cache refreshes at a configurable interval. This trades OneLake storage cost (~$0.023/GB/month) for reduced egress fees — usually a significant net savings for frequently queried data.
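
The storage-versus-egress trade-off is easy to check with rough arithmetic. The sketch below is a simplified model, not a billing calculator: it assumes every read scans the full dataset, that the cache stays valid for the whole month after one initial fetch, and an illustrative egress price of $0.09/GB (egress rates vary by provider and region). Only the ~$0.023/GB/month storage figure comes from this post.

```python
ONELAKE_STORAGE_PER_GB_MONTH = 0.023  # figure quoted in this post
EXAMPLE_EGRESS_PER_GB = 0.09          # illustrative assumption, not a quoted rate


def monthly_cost_without_cache(size_gb: float, reads_per_month: int,
                               egress_per_gb: float) -> float:
    """Every read pulls the data out of the source cloud again."""
    return size_gb * reads_per_month * egress_per_gb


def monthly_cost_with_cache(size_gb: float, egress_per_gb: float,
                            storage_per_gb_month: float = ONELAKE_STORAGE_PER_GB_MONTH) -> float:
    """One egress charge to fill the cache, plus a month of OneLake storage."""
    return size_gb * egress_per_gb + size_gb * storage_per_gb_month


if __name__ == "__main__":
    size_gb, reads = 500, 20
    without = monthly_cost_without_cache(size_gb, reads, EXAMPLE_EGRESS_PER_GB)
    cached = monthly_cost_with_cache(size_gb, EXAMPLE_EGRESS_PER_GB)
    print(f"without cache: ${without:.2f}/month")
    print(f"with cache:    ${cached:.2f}/month")
```

Under these assumptions the cache pays for itself from the second read onward, because one month of storage costs less per GB than a single extra egress. For data that is read once and never again, caching only adds the storage charge.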

OneLake Data Hub

The Data Hub (renamed the OneLake catalog in newer Fabric releases) is a searchable catalog of the data items in every workspace you can access:

Click OneLake data hub in the left sidebar
Browse or search for items: “customers,” “sales,” “dim_product”
See: item name, workspace, type, owner, endorsement status
Click to explore or create a shortcut to the item

Use for: Discovering data created by other teams without asking “where is the customer table?”

Discovering Data Across Workspaces

The Data Hub shows items from ALL workspaces you have access to — not just your current workspace. Search for “customers” and see every lakehouse, warehouse, and semantic model that contains customer data across the entire organization. Click any item to view its details, create a shortcut to it, or open it directly. This eliminates the “who has the customer data?” question that plagues organizations with scattered storage accounts.

Storage Billing and Optimization

What Counts as Storage

Stored Item	Billed?	Location
Lakehouse Tables (Delta)	✅ Yes	OneLake
Lakehouse Files (CSV, JSON)	✅ Yes	OneLake
Warehouse tables	✅ Yes	OneLake
KQL Database data	✅ Yes	OneLake
Shortcut target data	❌ No (billed at source)	S3, GCS, ADLS
Shortcut cache	✅ Yes	OneLake
Delta log files	✅ Yes	OneLake
Old Delta versions (before VACUUM)	✅ Yes	OneLake
Mirrored database replicas	✅ Yes	OneLake
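
The table above translates directly into a rough cost estimate. This is a minimal sketch: the category names are made up for the example, the billed/not-billed flags copy the table, and the price is the approximate ~$0.023/GB/month figure used elsewhere in this post rather than a quoted rate for any region.

```python
from typing import Dict

# True = billed as OneLake storage, per the table above
BILLED_IN_ONELAKE = {
    "lakehouse_tables": True,
    "lakehouse_files": True,
    "warehouse_tables": True,
    "kql_database": True,
    "shortcut_target": False,   # billed by the source (S3, GCS, ADLS)
    "shortcut_cache": True,
    "delta_logs": True,
    "old_delta_versions": True,
    "mirrored_replicas": True,
}


def estimate_monthly_storage_cost(usage_gb: Dict[str, float],
                                  price_per_gb_month: float = 0.023) -> float:
    """Sum the billable categories and apply a flat per-GB price."""
    billable_gb = sum(
        gb for category, gb in usage_gb.items() if BILLED_IN_ONELAKE[category]
    )
    return billable_gb * price_per_gb_month


if __name__ == "__main__":
    usage = {
        "lakehouse_tables": 100,
        "old_delta_versions": 900,   # history nobody has vacuumed yet
        "shortcut_target": 5000,     # lives in S3, not billed by OneLake
    }
    print(f"estimated: ${estimate_monthly_storage_cost(usage):.2f}/month")
```

The example usage is deliberately lopsided: the un-vacuumed history costs nine times as much as the live table, while the much larger shortcut target costs nothing on the OneLake side.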

Soft Delete and Recovery

OneLake retains deleted data for a recovery period:

Delete a table → data soft-deleted → recoverable for retention period
After retention → permanently deleted → storage freed

Configure retention: Workspace settings → OneLake → Soft delete retention

BCDR (Disaster Recovery) Replication

OneLake data is replicated within the Azure region for high availability (zone-redundant where the region supports availability zones, locally redundant otherwise). For cross-region disaster recovery, Fabric supports BCDR replication to a paired Azure region. If your primary region (e.g., Canada Central) has an outage, your data is available in the paired region (e.g., Canada East). BCDR is enabled per capacity, through the Disaster Recovery setting under Capacity settings in the Fabric Admin Portal. Note that BCDR replication adds storage cost for the secondary copy.

Storage Optimization Tips

Run VACUUM on Delta tables — old versions consume storage. VACUUM table RETAIN 168 HOURS removes versions older than 7 days.
Use shortcuts instead of copies — if data exists in ADLS, create a shortcut instead of copying to OneLake.
Delete staging data after processing — staging tables (bronze) that have been transformed to silver do not need to persist forever.
Compress before uploading — Parquet/Delta are already compressed. CSVs are not — convert to Delta after landing.
Monitor storage — Fabric Admin Portal shows storage usage per workspace.
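
The VACUUM tip is the easiest one to automate. Below is a small sketch that generates the statements for a list of tables; `vacuum_statements` is a hypothetical helper, and the 168-hour floor mirrors Delta Lake's default safety check, which rejects shorter retention unless that check is explicitly disabled.

```python
from typing import Iterable, List

MIN_RETAIN_HOURS = 168  # 7 days, the threshold Delta enforces by default


def vacuum_statements(tables: Iterable[str], retain_hours: int = 168) -> List[str]:
    """Build one VACUUM statement per table, refusing unsafe retention values."""
    if retain_hours < MIN_RETAIN_HOURS:
        raise ValueError(
            f"retain_hours={retain_hours} is below the {MIN_RETAIN_HOURS}-hour default minimum"
        )
    return [f"VACUUM {table} RETAIN {retain_hours} HOURS" for table in tables]


if __name__ == "__main__":
    for statement in vacuum_statements(["raw_customers", "raw_orders"]):
        # In a Fabric notebook this would be: spark.sql(statement)
        print(statement)
```

Scheduling a notebook that runs these statements weekly keeps old versions from piling up, at the cost of limiting time travel to the retention window.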

Real-World OneLake Patterns

Pattern 1: Centralized Data Lake

OneLake
  └── Workspace: Central_DataLake
        ├── bronze_lakehouse (all raw data lands here)
        ├── silver_lakehouse (all cleaned data)
        ├── gold_warehouse (star schema for all teams)
        └── All teams access via Viewer role or shortcuts

Pattern 2: Hub-and-Spoke

OneLake
  ├── Workspace: DataEng_Hub (shared data)
  │     ├── silver_lakehouse (master clean data)
  │     └── gold_warehouse (shared star schema)
  │
  ├── Workspace: Sales_Spoke (sales team)
  │     └── sales_lakehouse
  │           Tables/customers ← SHORTCUT to Hub silver
  │
  ├── Workspace: Marketing_Spoke (marketing team)
  │     └── marketing_lakehouse
  │           Tables/customers ← SHORTCUT to Hub silver
  │
  └── Each spoke reads from Hub via shortcuts — no data duplication

Pattern 3: Multi-Cloud Unified Lake

OneLake
  └── Workspace: Unified_Analytics
        ├── lakehouse
        │     Tables/
        │       azure_crm   ← SHORTCUT to ADLS Gen2
        │       aws_events  ← SHORTCUT to Amazon S3
        │       gcp_logs    ← SHORTCUT to Google Cloud Storage
        │       local_dims  ← Local Delta tables
        │
        └── All queryable together in one notebook or SQL endpoint

OneLake Security

Workspace Roles + Data Access Roles

OneLake security is layered. Workspace roles (Admin, Member, Contributor, Viewer) control who can enter the workspace. OneLake data access roles control which specific tables and folders within a Lakehouse a user can read. A Viewer in the workspace might have access to the customers table but not the salaries table. Configure through the Lakehouse’s “Manage OneLake data access” settings — create custom roles, assign tables, add members. See our Security & Governance post for the complete 7-layer security model.
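
The layering can be pictured as two checks in sequence. The sketch below is a toy model for reasoning about access, not how Fabric evaluates permissions internally. It assumes that Admin, Member, and Contributor can read everything in the workspace and that data access roles only narrow what Viewers see; the user names and role names are invented.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set

# Assumption: these workspace roles are not restricted by data access roles
FULL_ACCESS_ROLES = {"Admin", "Member", "Contributor"}


@dataclass
class DataAccessRole:
    name: str
    tables: Set[str] = field(default_factory=set)   # tables this role may read
    members: Set[str] = field(default_factory=set)  # users assigned to the role


def can_read_table(user: str, workspace_role: Optional[str], table: str,
                   roles: Iterable[DataAccessRole]) -> bool:
    """Layer 1: workspace role. Layer 2: data access roles for Viewers."""
    if workspace_role is None:
        return False  # no workspace access at all
    if workspace_role in FULL_ACCESS_ROLES:
        return True
    return any(user in role.members and table in role.tables for role in roles)


if __name__ == "__main__":
    roles = [DataAccessRole("sales_readers", {"customers"}, {"ana@contoso.com"})]
    for table in ("customers", "salaries"):
        allowed = can_read_table("ana@contoso.com", "Viewer", table, roles)
        print(f"Viewer ana@contoso.com -> {table}: {allowed}")
```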

Firewall and Private Endpoints

OneLake supports restricting access via tenant-level firewall rules. You can limit OneLake access to specific IP ranges or Azure Virtual Networks. For private connectivity, configure private endpoints — OneLake traffic stays on Microsoft’s backbone network without traversing the public internet. This is essential for regulated industries (banking, healthcare) where data must not be exposed to public networks. Configure through the Fabric Admin Portal under tenant settings.

Common Mistakes

Creating separate storage accounts alongside OneLake — OneLake IS your storage. Do not create ADLS Gen2 accounts for Fabric data. Use OneLake directly.
Not using shortcuts for shared data — copying data between workspaces wastes storage and creates staleness. Use internal shortcuts.
Not running VACUUM — old Delta versions accumulate. A table with 1 GB of current data can have 10 GB of old versions. VACUUM regularly.
Not exploring the Data Hub — teams duplicate work because they do not know other teams’ data exists. The Data Hub makes all data discoverable.
Ignoring ADLS Gen2 compatibility — existing ADLS tools (Storage Explorer, AzCopy, Python SDK) work with OneLake. Do not build custom connectors when standard tools work.

Interview Questions

Q: What is OneLake and how does it differ from ADLS Gen2? A: OneLake is Fabric’s built-in, unified storage layer: one per tenant, provisioned automatically, authenticated through Microsoft Entra ID (formerly Azure AD), and integrated with all Fabric workloads. ADLS Gen2 is a standalone Azure service requiring manual provisioning and configuration. OneLake is built on ADLS Gen2 technology and implements the same REST API, so existing ADLS tools work with OneLake.

Q: How can you access OneLake from outside Fabric? A: OneLake implements the ADLS Gen2 API at onelake.dfs.fabric.microsoft.com. Access via Azure Storage Explorer, AzCopy, ADLS Gen2 Python SDK, Databricks (abfss:// protocol), or OneLake File Explorer (Windows desktop app). Most tools that support ADLS Gen2 work with OneLake once the endpoint URL is changed and they authenticate with Microsoft Entra ID instead of storage account keys.

Q: What are OneLake shortcuts and why are they important? A: Shortcuts are pointers to data in other locations (other workspaces, ADLS Gen2, S3, GCS) that appear as local tables in your Lakehouse. They eliminate data duplication — multiple teams access the same data through shortcuts without copying. Cross-cloud shortcuts enable multi-cloud analytics from a single query.

Wrapping Up

OneLake is the invisible foundation that makes Fabric work. Every Lakehouse, Warehouse, KQL Database, and Semantic Model writes to OneLake. ADLS Gen2 API compatibility means existing tools work seamlessly. Shortcuts unify data across workspaces and clouds. And the Data Hub makes everything discoverable.

Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.
