Real-Time Analytics Deep Dive: Window Types, Accelerated Shortcuts, KQL Functions, Materialized Views, and Eventhouse Optimization
Our Real-Time Intelligence post covered the architecture — Eventstream, Eventhouse, KQL basics, and dashboards. This post goes deeper: the five window types for time-based analysis, accelerated vs standard shortcuts in KQL databases, advanced KQL functions for dates, strings, and aggregations, materialized views for pre-computed aggregations, and Eventhouse optimization.
Table of Contents
- The Five Window Types
- Tumbling Windows
- Hopping (Sliding) Windows
- Sliding Windows
- Session Windows
- Snapshot Windows
- Accelerated vs Standard Shortcuts in KQL DB
- Creating Accelerated Shortcuts
- Advanced KQL Functions
- Date and Time Functions
- String Functions
- Aggregation Functions
- Dynamic and JSON Parsing
- Materialized Views in KQL
- Creating and Managing Materialized Views
- Eventhouse Optimization
- Retention Policies
- Caching Policies
- Partitioning
- Error Resolution in RTI
- Eventstream Errors
- Eventhouse Errors
- Common Mistakes
- Interview Questions
- Wrapping Up
The Five Window Types
Tumbling Windows
Non-overlapping, fixed-size windows. Each event belongs to exactly ONE window.
// Tumbling window: 5-minute buckets
sensor_readings
| summarize avg_temp = avg(temperature), count = count()
by bin(timestamp, 5m), device_id
|--5min--|--5min--|--5min--|
No overlap. Each event in one window.
Hopping (Sliding) Windows
Fixed-size windows that OVERLAP by sliding at regular intervals.
// 10-minute window, hopping every 5 minutes
sensor_readings
| summarize avg_temp = avg(temperature)
by bin(timestamp, 5m), device_id
| extend window_end = timestamp + 10m
// An event at 10:07 appears in both the 10:00-10:10 and 10:05-10:15 windows
Sliding Windows
Windows triggered by events — window starts when an event arrives, has a fixed duration.
// Sliding: events within 5 minutes of each other
sensor_readings
| summarize count() by device_id, bin(timestamp, 1m)
| where count_ > 10 // More than 10 events per minute = burst
Session Windows
Dynamic windows based on activity gaps — window opens when activity starts, closes after inactivity.
// Session: group events with < 5min gap between them
sensor_readings
| sort by device_id, timestamp asc
| extend gap = datetime_diff('minute', timestamp, prev(timestamp))
| extend new_session = iif(gap > 5 or isnull(gap), 1, 0)
| extend session_id = row_cumsum(new_session)
| summarize session_start = min(timestamp), session_end = max(timestamp),
event_count = count() by device_id, session_id
Snapshot Windows
Captures the state at a specific point in time — not based on event time but on system time.
// Snapshot: state of all devices right now
sensor_readings
| summarize arg_max(timestamp, *) by device_id
// Returns the LATEST reading per device (current snapshot)
Accelerated vs Standard Shortcuts in KQL DB
| Feature | Standard Shortcut | Accelerated Shortcut |
|---|---|---|
| Data location | Reads from OneLake at query time | Copies data into Eventhouse cache |
| Query speed | Slower (reads from OneLake) | Faster (data pre-loaded) |
| Freshness | Always current | Near real-time (minutes) |
| Use case | Infrequent queries on large historical data | Frequent queries needing fast response |
| Storage | No duplication | Duplicates data in Eventhouse |
Creating Accelerated Shortcuts
- Open your KQL Database in the Eventhouse
- Click New → OneLake shortcut
- Select the source (Lakehouse table, another KQL DB)
- Toggle Acceleration to On
- Configure refresh interval (how often data syncs)
// Query accelerated shortcut (same syntax as regular table)
accelerated_sales_data
| where order_date > ago(7d)
| summarize revenue = sum(amount) by product
| top 10 by revenue
// Fast because data is pre-loaded into Eventhouse cache
DP-700 exam tip: Know when to choose accelerated (frequent queries, speed matters) vs standard (infrequent, storage matters).
Advanced KQL Functions
Date and Time Functions
// Current time
print now()
// Extract parts
sensor_readings
| extend year = datetime_part("year", timestamp),
month = datetime_part("month", timestamp),
hour = datetime_part("hour", timestamp),
day_of_week = dayofweek(timestamp)
// Date arithmetic
| extend yesterday = timestamp - 1d,
next_week = timestamp + 7d,
age_hours = datetime_diff("hour", now(), timestamp)
// Start/end of periods
| extend start_of_month = startofmonth(timestamp),
end_of_month = endofmonth(timestamp),
start_of_day = startofday(timestamp)
// Format
| extend formatted = format_datetime(timestamp, "yyyy-MM-dd HH:mm")
// Between
| where timestamp between (datetime(2026-01-01) .. datetime(2026-06-01))
String Functions
// Common string operations
events
| extend lower_name = tolower(name),
upper_name = toupper(name),
trimmed = trim(" ", raw_name),
contains_error = name contains "error",
starts_with = name startswith "ERR",
extracted = extract("device-(\d+)", 1, device_name), // Regex
split_parts = split(csv_field, ","),
replaced = replace_string(phone, "-", ""),
length = strlen(name),
substring = substring(name, 0, 5)
Aggregation Functions
// Beyond basic count/sum/avg
sensor_readings
| summarize
count = count(),
distinct_devices = dcount(device_id), // Distinct count
percentile_95 = percentile(temperature, 95), // 95th percentile
median_temp = percentile(temperature, 50),
variance = variance(temperature),
stdev = stdev(temperature),
min_temp = min(temperature),
max_temp = max(temperature),
make_list = make_list(device_id), // Array of all values
make_set = make_set(device_id) // Unique array
by bin(timestamp, 1h)
Dynamic and JSON Parsing
// Parse JSON from string column
events
| extend parsed = parse_json(json_payload)
| extend device = tostring(parsed.device_id),
temp = todouble(parsed.readings.temperature),
tags = parsed.tags // Array access
| mv-expand tag = tags // Expand array to rows
Materialized Views in KQL
Pre-computed aggregations that auto-refresh when source data changes:
// Create materialized view: hourly averages (auto-updated!)
.create materialized-view HourlyAvgTemp on table sensor_readings
{
sensor_readings
| summarize avg_temp = avg(temperature), max_temp = max(temperature), count = count()
by bin(timestamp, 1h), device_id
}
// Query the materialized view (instant — pre-computed!)
HourlyAvgTemp
| where timestamp > ago(7d)
| order by timestamp desc
// Check materialized view health
.show materialized-view HourlyAvgTemp extents
Materialized views vs regular queries: A regular aggregation query scans ALL raw data every time. A materialized view pre-computes and maintains the result incrementally — 100x faster for repeated queries.
Eventhouse Optimization
Retention Policies
// Set data retention: auto-delete after 90 days
.alter table sensor_readings policy retention "{'SoftDeletePeriod': '90.00:00:00'}"
// Hot cache: keep last 30 days in fast SSD cache
.alter table sensor_readings policy caching hot = 30d
Partitioning
// Partition by date for faster time-range queries
.alter table sensor_readings policy partitioning
@'{"PartitionKeys": [{"ColumnName": "timestamp", "Kind": "UniformRange", "Properties": {"Reference": "2020-01-01", "RangeSize": "1.00:00:00"}}]}'
Error Resolution in RTI
| Error | Cause | Fix |
|---|---|---|
| Eventstream ingestion lag | Source producing faster than destination consumes | Scale up Eventhouse, add partitions to Event Hub |
| KQL query timeout | Scanning too much data | Add time filters, use materialized views |
| Eventhouse disk full | No retention policy | Set retention (90d), purge old data |
| Shortcut stale data | Accelerated shortcut refresh failed | Check source connectivity, restart acceleration |
| Schema mismatch | Source schema changed | Update table schema, restart ingestion |
Common Mistakes
- Not setting retention policies — Eventhouse grows forever without them. Set 30-90 day retention.
- Using standard shortcuts when accelerated is needed — frequent queries on standard shortcuts are slow. Accelerate them.
- Not using materialized views — recomputing hourly aggregations from billions of rows is wasteful. Materialized views pre-compute once.
- Wrong window type — tumbling for non-overlapping counts, hopping for moving averages, session for user activity tracking. Match the window to the business question.
Interview Questions
Q: What are the five window types in Fabric Real-Time Analytics? A: Tumbling (non-overlapping fixed-size), Hopping/Sliding (overlapping fixed-size), Sliding (event-triggered fixed-duration), Session (dynamic based on activity gaps), Snapshot (point-in-time state). Each serves different analytical needs — tumbling for counts, hopping for moving averages, session for user activity.
Q: What is the difference between accelerated and standard shortcuts in KQL? A: Standard shortcuts read from OneLake at query time (slower, always current). Accelerated shortcuts copy data into Eventhouse cache (faster, near real-time refresh). Use accelerated for frequently queried data, standard for infrequent historical queries.
Wrapping Up
This post completes your Real-Time Intelligence knowledge. Combined with the RTI overview post, you now cover all DP-700 streaming objectives.
Related posts: – Real-Time Intelligence Overview – Spark Structured Streaming
Naveen Vuppula is a Senior Data Engineering Consultant based in Ontario, Canada. He writes about data engineering at DriveDataScience.com.