Clustering Algorithms: K-Means, DBSCAN, and Hierarchical Clustering — Unsupervised Learning, Segmentation, and When to Use Each

Every ML model we have covered so far — linear regression, decision trees, XGBoost — learned from labeled data. You gave the model inputs AND the correct answers (supervised learning). But what if you do not have labels? What if you have 10 million customers and the business asks: “find natural groups in our customer base”? Nobody has pre-labeled customers as “high value” or “at risk.” You need the algorithm to discover the groups on its own.

That is unsupervised learning, and clustering is its most widely used technique. Clustering finds natural groupings in data without being told what the groups are.

Think of it like sorting a bag of mixed coins from different countries. Nobody tells you which country each coin belongs to. But you naturally group them by size, color, weight, and markings — and you end up with neat piles of US quarters, Canadian loonies, British pounds, and Indian rupees. The algorithm does the same thing with data points: groups them by similarity without being told the categories in advance.

This post covers the three most important clustering algorithms — K-Means (fast, simple, spherical clusters), DBSCAN (finds irregular shapes, handles noise), and Hierarchical Clustering (no need to pre-specify K, produces a visual dendrogram) — with Python code, comparison tables, real-world use cases, and when to use each.

Supervised vs Unsupervised Learning
What Is Clustering?
K-Means Clustering
How K-Means Works (Step by Step)
K-Means in Python
Choosing K: The Elbow Method
K-Means Limitations
DBSCAN (Density-Based Clustering)
How DBSCAN Works
DBSCAN in Python
Choosing eps and min_samples
DBSCAN Strengths and Limitations
Hierarchical Clustering
How Hierarchical Clustering Works
Hierarchical Clustering in Python
Reading a Dendrogram
Comparison: K-Means vs DBSCAN vs Hierarchical
Evaluation Metrics for Clustering
Silhouette Score
Inertia and Davies-Bouldin Index
Data Preprocessing for Clustering
Real-World Use Cases
Common Mistakes
Interview Questions
Wrapping Up

Supervised vs Unsupervised Learning

Aspect	Supervised Learning	Unsupervised Learning
Data has labels?	Yes — you provide correct answers	No — algorithm discovers patterns
Goal	Predict an outcome (y = f(X))	Find structure in data (groups, patterns)
Examples	Predict house price, classify spam	Customer segmentation, anomaly detection
Algorithms	Linear Regression, Decision Trees, XGBoost	K-Means, DBSCAN, Hierarchical, PCA
Evaluation	Accuracy, RMSE, F1 (against known labels)	Silhouette score, inertia (no “correct” answer)

Real-life analogy: Supervised learning is like a teacher grading exams with an answer key. Unsupervised learning is like an archaeologist sorting artifacts from a new dig site — no labels, no answer key, just patterns in the objects themselves.

What Is Clustering?

Clustering divides data points into groups (clusters) where points in the same cluster are more similar to each other than to points in other clusters.

Before clustering:                After clustering:
  •   •                             Cluster A: • • •
    •     •  •                      Cluster B: • • • •
  •  •  •                           Cluster C: • •
       •  •   •
    •        •
  •    •   •

No structure visible.              Three natural groups discovered.

The algorithm does not know what the groups mean — it just finds them. YOU interpret them: “Cluster A is high-spending young customers,” “Cluster B is budget-conscious families,” etc.

K-Means Clustering

K-Means is the most popular clustering algorithm. It is fast, simple, and works well for spherical (round) clusters of similar size.

Real-life analogy: Imagine you are a pizza delivery manager with 3 delivery drivers. You need to assign each delivery address to the closest driver. K-Means does exactly this: it places K “centroids” (drivers) and assigns each data point (delivery address) to the nearest centroid. Then it moves the centroids to the center of their assigned points and repeats until nobody switches groups.

How K-Means Works (Step by Step)

Step 1: Choose K (number of clusters) — you decide this upfront
Step 2: Randomly place K centroids in the data space
Step 3: ASSIGN — each data point goes to its nearest centroid
Step 4: UPDATE — move each centroid to the mean of its assigned points
Step 5: REPEAT Steps 3-4 until centroids stop moving (convergence)

Example with K=3 and 9 data points:

  Iteration 1:                    Iteration 2 (centroids moved):
  C1*                             Points reassigned based on new
   • •                            centroid positions.
     •  C2*                       
        • •                       C1 and C2 shift to cluster centers.
   •        C3*                   
         • •                      After 5-10 iterations: stable.

  * = centroid position           Final: 3 tight clusters.

K-Means in Python

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Sample customer data
data = pd.DataFrame({
    'annual_income': [15, 16, 17, 18, 19, 39, 40, 41, 42, 43, 70, 71, 72, 73, 74],
    'spending_score': [39, 81, 6, 77, 40, 76, 94, 3, 72, 14, 99, 15, 73, 10, 88]
})

# Step 1: Scale the data (CRITICAL for clustering!)
scaler = StandardScaler()
scaled_data = scaler.fit_transform(data)

# Step 2: Fit K-Means with K=3
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
data['cluster'] = kmeans.fit_predict(scaled_data)

# Step 3: Examine results
print(data.groupby('cluster').mean())
# cluster  annual_income  spending_score
# 0        17.0           48.6           ← Low income group
# 1        41.0           51.8           ← Mid income group
# 2        72.0           57.0           ← High income group

# Step 4: Visualize
plt.scatter(data['annual_income'], data['spending_score'], 
            c=data['cluster'], cmap='viridis', s=100)
plt.xlabel('Annual Income (K$)')
plt.ylabel('Spending Score')
plt.title('Customer Segments (K-Means, K=3)')
plt.colorbar(label='Cluster')
plt.savefig('kmeans_clusters.png')
plt.show()

# Access centroids (cluster centers)
print("Centroids (scaled):", kmeans.cluster_centers_)
print("Inertia (lower = tighter clusters):", kmeans.inertia_)

Choosing K: The Elbow Method

The hardest part of K-Means is choosing K — how many clusters? The Elbow Method runs K-Means for K=1 through K=10 and plots the inertia (sum of squared distances to centroids). The “elbow” in the plot is where adding more clusters stops significantly improving the fit.

inertias = []
K_range = range(1, 11)

for k in K_range:
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(scaled_data)
    inertias.append(km.inertia_)

# Plot the elbow
plt.figure(figsize=(8, 4))
plt.plot(K_range, inertias, 'bo-')
plt.xlabel('Number of Clusters (K)')
plt.ylabel('Inertia')
plt.title('Elbow Method — Choosing K')
plt.xticks(K_range)
plt.grid(True)
plt.savefig('elbow_method.png')
plt.show()

# The "elbow" is typically at K=3 or K=4
# After the elbow, each additional cluster adds minimal improvement

Real-life analogy: The Elbow Method is like adding staff to a restaurant. Going from 1 to 2 chefs doubles output. 2 to 3 adds significant capacity. 3 to 4 helps a bit. 4 to 5 barely matters — the kitchen is already efficient. The “elbow” is where adding more stops making a meaningful difference.

K-Means Limitations

Must specify K in advance — if you guess wrong, clusters are meaningless
Assumes spherical clusters — struggles with elongated, crescent, or irregular shapes
Sensitive to outliers — one extreme point pulls the centroid away from the true center
Sensitive to initialization — random starting centroids can produce different results (use n_init=10 to mitigate)
Equal-size assumption — tends to create clusters of similar size even if natural groups differ

DBSCAN (Density-Based Clustering)

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) finds clusters based on density — regions where points are packed closely together. Unlike K-Means, it does not require you to specify K, it can find clusters of any shape, and it identifies outliers as noise.

Real-life analogy: Imagine you are at a party in a large hall. DBSCAN is like looking at the room from above. You see groups of people standing close together (clusters) and a few people standing alone near the walls (noise/outliers). The groups can be any shape — a circle around the buffet, a line at the bar, an irregular blob on the dance floor. DBSCAN finds them all without you telling it how many groups to look for.

How DBSCAN Works

DBSCAN uses two parameters:

eps (epsilon) — the radius of a neighborhood around each point
min_samples — minimum number of points required within eps to form a dense region

DBSCAN classifies every point into one of three types:

1. CORE POINT: Has >= min_samples points within eps radius
   (In the thick of the crowd)

2. BORDER POINT: Within eps of a core point but has < min_samples neighbors
   (Standing at the edge of a group)

3. NOISE POINT: Not within eps of any core point
   (Standing alone — outlier)

Algorithm:
  Step 1: Pick any unvisited point
  Step 2: Find all points within eps radius
  Step 3: If count >= min_samples → start a cluster
          Expand: check each neighbor's neighbors (recursive)
  Step 4: If count < min_samples → mark as noise (for now)
  Step 5: Repeat until all points visited

  Noise points near a later-discovered cluster become border points.

DBSCAN in Python

from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
import numpy as np

# Generate sample data with two dense clusters and noise
from sklearn.datasets import make_moons
X, _ = make_moons(n_samples=300, noise=0.1, random_state=42)

# Scale the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Fit DBSCAN
dbscan = DBSCAN(eps=0.3, min_samples=5)
labels = dbscan.fit_predict(X_scaled)

# Results
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = list(labels).count(-1)
print(f"Clusters found: {n_clusters}")      # 2
print(f"Noise points: {n_noise}")            # ~5 outliers
print(f"Core samples: {len(dbscan.core_sample_indices_)}")

# Visualize
import matplotlib.pyplot as plt
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis', s=30)
plt.title(f'DBSCAN: {n_clusters} clusters, {n_noise} noise points')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.colorbar(label='Cluster (-1 = noise)')
plt.savefig('dbscan_clusters.png')
plt.show()

Notice: DBSCAN found the two crescent-shaped clusters that K-Means would split incorrectly. And it automatically identified the scattered noise points — no manual outlier removal needed.

Choosing eps and min_samples

The k-distance plot helps you choose eps. For each point, compute the distance to its k-th nearest neighbor (where k = min_samples). Sort these distances and plot them. The “elbow” in the curve is a good eps value.

from sklearn.neighbors import NearestNeighbors

# k-distance plot to find optimal eps
k = 5  # same as min_samples
nn = NearestNeighbors(n_neighbors=k)
nn.fit(X_scaled)
distances, _ = nn.kneighbors(X_scaled)

# Sort the k-th neighbor distances
k_distances = np.sort(distances[:, k-1])

plt.figure(figsize=(8, 4))
plt.plot(k_distances)
plt.xlabel('Points (sorted)')
plt.ylabel(f'{k}-th Nearest Neighbor Distance')
plt.title('k-Distance Plot — Choose eps at the Elbow')
plt.grid(True)
plt.savefig('k_distance_plot.png')
plt.show()
# Look for the "knee" — the sharp increase indicates the boundary
# between dense regions (low distance) and sparse regions (high distance)

Rules of thumb:

min_samples: Start with 2 × number_of_features. For 2D data, min_samples=5 is common. Higher values = stricter (fewer clusters, more noise)
eps: Use the k-distance plot elbow. Too small = everything is noise. Too large = everything is one cluster

DBSCAN Strengths and Limitations

Strengths	Limitations
Does not require K in advance	Sensitive to eps and min_samples choice
Finds clusters of any shape	Struggles with clusters of varying density
Automatically identifies outliers as noise	Does not work well in high-dimensional data (curse of dimensionality)
Robust to outliers	Non-deterministic for border points
No equal-size assumption	Slower than K-Means on very large datasets

Hierarchical Clustering

Hierarchical clustering builds a tree of clusters (dendrogram) by either merging small clusters into larger ones (agglomerative — bottom-up) or splitting large clusters into smaller ones (divisive — top-down). You choose the number of clusters AFTER seeing the tree, not before.

Real-life analogy: Think of a company organizational chart built from the bottom up. Start with individual employees (each is their own cluster). The two most similar employees merge into a team. Then the two most similar teams merge into a department. Departments merge into divisions. Eventually everyone is in one company. The dendrogram shows this entire merging history, and you can “cut” the tree at any level to get the number of groups you want.

How Hierarchical Clustering Works

Agglomerative (bottom-up) — the most common approach:

Step 1: Start with N clusters (each point is its own cluster)
Step 2: Find the two closest clusters
Step 3: Merge them into one cluster
Step 4: Repeat Steps 2-3 until only one cluster remains
Step 5: Cut the dendrogram at the desired level

Example with 5 points: A, B, C, D, E

  Start: {A} {B} {C} {D} {E}         — 5 clusters
  Merge: {A,B} {C} {D} {E}           — 4 clusters (A and B were closest)
  Merge: {A,B} {C} {D,E}             — 3 clusters (D and E merged)
  Merge: {A,B} {C,D,E}               — 2 clusters
  Merge: {A,B,C,D,E}                 — 1 cluster

  Dendrogram captures every merge with the distance at which it happened.

Linkage methods determine how “distance between clusters” is measured:

Linkage	Distance Between Clusters	Best For
Single	Minimum distance between any two points	Elongated/chain-like clusters
Complete	Maximum distance between any two points	Compact, spherical clusters
Average	Average distance between all point pairs	Balanced approach
Ward	Minimizes total within-cluster variance	Equal-sized, compact clusters (most common)

Hierarchical Clustering in Python

from sklearn.cluster import AgglomerativeClustering
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.preprocessing import StandardScaler

# Sample data
data = pd.DataFrame({
    'income': [15, 16, 45, 46, 75, 76, 77],
    'spending': [39, 41, 55, 58, 90, 88, 92]
})
scaled = StandardScaler().fit_transform(data)

# Method 1: Fit with a specific number of clusters
agg = AgglomerativeClustering(n_clusters=3, linkage='ward')
data['cluster'] = agg.fit_predict(scaled)
print(data.groupby('cluster').mean())

# Method 2: Build dendrogram FIRST, then decide number of clusters
linked = linkage(scaled, method='ward')

plt.figure(figsize=(10, 5))
dendrogram(linked, labels=data.index.tolist(), leaf_rotation=0)
plt.title('Dendrogram — Ward Linkage')
plt.xlabel('Data Point Index')
plt.ylabel('Distance')
plt.axhline(y=4, color='r', linestyle='--', label='Cut at 3 clusters')
plt.legend()
plt.savefig('dendrogram.png')
plt.show()
# The red line cuts the dendrogram into 3 groups

Reading a Dendrogram

              ┌───────┐
         ┌────┤       ├────┐         ← Cut here = 2 clusters
    ┌────┤    └───────┘    ├────┐
    │    │                 │    │    ← Cut here = 3 clusters
  ┌─┤  ┌─┤              ┌─┤  ┌─┤
  A B  C D              E F  G H   ← Cut here = 8 clusters (every point)

Reading the dendrogram:
  - Height of each merge = distance at which clusters joined
  - Tall vertical lines = clusters are far apart (natural separation)
  - Short vertical lines = clusters are close (maybe should not be separated)
  - Cut horizontally at a height = choose number of clusters
  - The longest vertical line without a horizontal crossing = best number of clusters

Comparison: K-Means vs DBSCAN vs Hierarchical

Feature	K-Means	DBSCAN	Hierarchical
Specify K?	Yes (required)	No (finds automatically)	Optional (cut dendrogram)
Cluster shapes	Spherical only	Any shape	Depends on linkage
Handles noise?	No (assigns all points)	Yes (labels noise as -1)	No (assigns all points)
Speed	Fast (O(n·K·i))	Medium (O(n²) or O(n log n))	Slow (O(n³) or O(n² log n))
Large datasets?	Excellent (millions of rows)	Good (up to ~500K rows)	Poor (struggles above ~10K rows)
Outlier handling	Poor (outliers pull centroids)	Excellent (outliers = noise)	Moderate
Interpretability	Easy (centroids = cluster profiles)	Moderate (density-based)	Visual (dendrogram)
Parameters	K	eps, min_samples	linkage, n_clusters or distance
Best for	Customer segmentation, quantization	Anomaly detection, geographic data	Gene expression, small datasets

Decision guide:

Do you know how many clusters?
  ├── YES → K-Means (fast, simple)
  └── NO
       ├── Do you have noise/outliers?
       │    ├── YES → DBSCAN (finds clusters + marks noise)
       │    └── NO → Hierarchical (build dendrogram, decide K visually)
       └── Is the dataset large (>50K rows)?
            ├── YES → K-Means or DBSCAN (Hierarchical is too slow)
            └── NO → Hierarchical (dendrogram gives the best visual insight)

Evaluation Metrics for Clustering

Unlike supervised learning (where you compare predictions to known labels), clustering evaluation is harder because there are no “correct” answers. These metrics measure cluster quality internally.

Silhouette Score

The most widely used metric. For each point, it measures how similar the point is to its own cluster versus the nearest other cluster.

from sklearn.metrics import silhouette_score, silhouette_samples

# Compute silhouette score for different K values
for k in range(2, 8):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    labels = km.fit_predict(scaled_data)
    score = silhouette_score(scaled_data, labels)
    print(f"K={k}: Silhouette Score = {score:.3f}")

# Output:
# K=2: Silhouette Score = 0.581
# K=3: Silhouette Score = 0.672  ← Best (highest)
# K=4: Silhouette Score = 0.534
# K=5: Silhouette Score = 0.489

# Interpretation:
#  1.0 = perfect separation (each point far from other clusters)
#  0.0 = on the boundary between clusters
# -1.0 = wrong cluster (closer to a different cluster)
# > 0.5 = good clustering
# > 0.7 = strong clustering

Inertia and Davies-Bouldin Index

# Inertia (K-Means only): sum of squared distances to centroids
# Lower = tighter clusters, but always decreases with more K
print(f"Inertia: {kmeans.inertia_:.2f}")

# Davies-Bouldin Index: lower = better cluster separation
from sklearn.metrics import davies_bouldin_score
db_score = davies_bouldin_score(scaled_data, labels)
print(f"Davies-Bouldin: {db_score:.3f}")  # < 1.0 is good

Data Preprocessing for Clustering

Clustering is distance-based. If one feature is measured in thousands (income: 50,000) and another in single digits (age: 35), income dominates the distance calculation and age is ignored. Scaling is mandatory.

from sklearn.preprocessing import StandardScaler, MinMaxScaler

# StandardScaler: zero mean, unit variance (most common)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# MinMaxScaler: scales to [0, 1] range
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Additional preprocessing:
# 1. Handle missing values BEFORE clustering (impute or drop)
# 2. Remove or cap extreme outliers (especially for K-Means)
# 3. Consider dimensionality reduction (PCA) if >10 features
# 4. One-hot encode categorical features
# 5. Use log transform for heavily skewed features

Real-life analogy: Scaling is like converting all measurements to the same unit before comparing. You would not compare a building’s height in meters with its width in millimeters — the height would seem enormously more important. Scaling puts every feature on equal footing.

Real-World Use Cases

Use Case	Algorithm	What It Does
Customer segmentation	K-Means	Group customers by spending behavior for targeted marketing
Fraud detection	DBSCAN	Normal transactions form dense clusters; fraud = noise points
Geographic analysis	DBSCAN	Find hotspots (crime clusters, earthquake zones, delivery areas)
Gene expression analysis	Hierarchical	Group similar genes; dendrogram shows evolutionary relationships
Image compression	K-Means	Reduce millions of colors to K representative colors
Document grouping	K-Means/DBSCAN	Cluster similar news articles or support tickets by topic
Anomaly detection in logs	DBSCAN	Normal log patterns = clusters; unusual patterns = noise
Market basket analysis	Hierarchical	Group products frequently purchased together

Common Mistakes

Not scaling features before clustering — the most common mistake. Features with larger ranges dominate distance calculations. Always use StandardScaler or MinMaxScaler before fitting any clustering algorithm.
Choosing K randomly for K-Means — use the Elbow Method or Silhouette Score to find the optimal K. “The business wants 5 segments” is fine, but validate that 5 produces meaningful clusters.
Using K-Means on non-spherical data — K-Means assumes round clusters. For crescent shapes, rings, or irregular blobs, use DBSCAN instead.
Ignoring DBSCAN noise points — noise points (label = -1) are not errors. They are outliers that do not belong to any cluster. Investigate them — they might be fraud, data quality issues, or genuinely unique records.
Using Hierarchical Clustering on large datasets — it has O(n³) time complexity. Above 10K-20K rows, it becomes impractically slow. Use K-Means or DBSCAN instead and reserve Hierarchical for small datasets where the dendrogram adds value.
Forgetting to interpret clusters — clustering finds groups, but YOU must interpret what they mean. After clustering, profile each group: “Cluster 0 has high income + low spending = potential upsell target.” Without interpretation, clusters are just numbers.

Interview Questions

Q: What is the difference between K-Means and DBSCAN? A: K-Means requires K (number of clusters) upfront, assumes spherical clusters, and assigns every point to a cluster. DBSCAN finds clusters automatically based on density, handles any cluster shape, and identifies outliers as noise (label = -1). Use K-Means for well-separated round clusters and DBSCAN for irregular shapes with noise.

Q: How do you choose the number of clusters in K-Means? A: Use the Elbow Method (plot inertia vs K, look for the elbow) and the Silhouette Score (higher is better, try K=2 to K=10). Combine both: the elbow suggests a range, the silhouette score confirms the best K within that range. Also consider domain knowledge — does the number of clusters make business sense?

Q: Why is feature scaling important for clustering? A: Clustering algorithms use distance (Euclidean) to measure similarity. If one feature ranges 0-100,000 (income) and another 0-100 (age), income dominates the distance calculation and age is effectively ignored. Scaling (StandardScaler or MinMaxScaler) puts all features on equal footing so each contributes equally to the distance.

Q: What is the Silhouette Score and what values indicate good clustering? A: The Silhouette Score measures how similar each point is to its own cluster versus the nearest other cluster. Ranges from -1 to 1. Above 0.5 = good clustering. Above 0.7 = strong clustering. Near 0 = points are on cluster boundaries. Negative = points are likely in the wrong cluster.

Q: When would you use Hierarchical Clustering over K-Means? A: When you do not know K and want to visualize the cluster hierarchy via a dendrogram. When the dataset is small (under 10K rows). When you need to explore different numbers of clusters without re-running the algorithm — just cut the dendrogram at different heights. K-Means is better for large datasets where speed matters.

Q: How does DBSCAN handle outliers? A: DBSCAN labels points that are not within eps distance of any core point as noise (cluster label = -1). These are points that do not belong to any dense region. This makes DBSCAN inherently robust to outliers — they are identified automatically rather than distorting cluster boundaries like in K-Means.

Wrapping Up

Clustering is your go-to tool when the data has no labels. K-Means is the workhorse — fast, simple, and effective for round clusters. DBSCAN excels at finding irregular shapes and identifying noise. Hierarchical Clustering gives you the best visual tool (dendrogram) for understanding cluster structure.

The key takeaway: there is no single best clustering algorithm. The right choice depends on your data shape, dataset size, whether you know K, and whether outliers matter. Start with K-Means (fastest to try), validate with silhouette scores, and switch to DBSCAN or Hierarchical when K-Means assumptions do not hold.

And never forget: clustering finds the groups, but YOU give them meaning. The algorithm says “these 500 customers are similar.” You say “they are our high-value, low-engagement segment — let us send them a re-engagement campaign.”

← Previous: Fine-Tuning LLMs

AI/ML (8/9)

Next: Hyperparameter Tuning →

Naveen Vuppula is a Senior Data Engineering Consultant and app developer based in Ontario, Canada. He writes about Python, SQL, AWS, Azure, and everything data engineering at DriveDataScience.com.

Clustering Algorithms: K-Means, DBSCAN, and Hierarchical Clustering — Unsupervised Learning, Segmentation, and When to Use Each

Clustering Algorithms: K-Means, DBSCAN, and Hierarchical Clustering — Unsupervised Learning, Segmentation, and When to Use Each

Table of Contents

Supervised vs Unsupervised Learning

What Is Clustering?

K-Means Clustering

How K-Means Works (Step by Step)

K-Means in Python

Choosing K: The Elbow Method

K-Means Limitations

DBSCAN (Density-Based Clustering)

How DBSCAN Works

DBSCAN in Python

Choosing eps and min_samples

DBSCAN Strengths and Limitations

Hierarchical Clustering

How Hierarchical Clustering Works

Hierarchical Clustering in Python

Reading a Dendrogram

Comparison: K-Means vs DBSCAN vs Hierarchical

Evaluation Metrics for Clustering

Silhouette Score

Inertia and Davies-Bouldin Index

Data Preprocessing for Clustering

Real-World Use Cases

Common Mistakes

Interview Questions

Wrapping Up

Leave a Comment Cancel Reply

Clustering Algorithms: K-Means, DBSCAN, and Hierarchical Clustering — Unsupervised Learning, Segmentation, and When to Use Each

Table of Contents

Supervised vs Unsupervised Learning

What Is Clustering?

K-Means Clustering

How K-Means Works (Step by Step)

K-Means in Python

Choosing K: The Elbow Method

K-Means Limitations

DBSCAN (Density-Based Clustering)

How DBSCAN Works

DBSCAN in Python

Choosing eps and min_samples

DBSCAN Strengths and Limitations

Hierarchical Clustering

How Hierarchical Clustering Works

Hierarchical Clustering in Python

Reading a Dendrogram

Comparison: K-Means vs DBSCAN vs Hierarchical

Evaluation Metrics for Clustering

Silhouette Score

Inertia and Davies-Bouldin Index

Data Preprocessing for Clustering

Real-World Use Cases

Common Mistakes

Interview Questions

Wrapping Up

Related Posts

Leave a Comment Cancel Reply