Leiden Algorithm

The Leiden algorithm, developed at Leiden University, is an advanced community detection method that addresses key limitations of the Louvain Algorithm. Its primary innovation is guaranteeing that all detected communities are well-connected, eliminating the disconnected community problem that can occur with Louvain.¹

Motivation: Louvain’s Limitations

The Louvain algorithm, while fast and effective, can produce poorly connected or even disconnected communities. This occurs because:

Nodes may be “trapped” in communities after their connecting nodes have moved
The greedy local moves don’t consider global community structure
No mechanism ensures internal community connectivity

Example of disconnected community in Louvain:

Before:                      After Louvain:
    A───B───C───D           ┌─────────────────┐
    │       │   │           │ Community 1     │
    E───F───G───H           │ A B   C D       │  ← A,B disconnected
        │                   │   │   │ │       │    from C,D!
        I                   │   E F G H I     │
                            └─────────────────┘

Algorithm Overview

Leiden introduces a three-phase iterative process:

Phase 1: Local Moving

Similar to Louvain’s first phase:

Each node starts in its own community
Nodes move to neighboring communities that maximize modularity gain
Continue until no improvement is possible

After initial assignment, refine community assignments:

For each community, create a refined partition
Nodes can only move to communities reachable via edges
Merge only well-connected node subsets
Ensures all resulting communities are internally connected

Phase 3: Aggregation

Create the aggregate network:

Each refined community becomes a super-node
Aggregate edges between communities
Maintain the refined partition structure

Iteration

Repeat all three phases on the coarsened network until convergence.

Visual Comparison

LOUVAIN (2 phases):
┌──────────┐    ┌──────────┐
│  Local   │───▶│Aggregate │───▶ Repeat
│  Moving  │    │          │
└──────────┘    └──────────┘

LEIDEN (3 phases):
┌──────────┐    ┌──────────┐    ┌──────────┐
│  Local   │───▶│Refinement│───▶│Aggregate │───▶ Repeat
│  Moving  │    │          │    │          │
└──────────┘    └──────────┘    └──────────┘
                     ▲
              Guarantees connectivity

The refinement phase is what makes Leiden superior. It works as follows:

Consider each community from Phase 1 as a subgraph
Initialize each node in its own sub-community within the community
Merge nodes only if they are:
- Directly connected by an edge
- The merge improves the partition quality
Build connected components within each original community

function REFINE(G, partition):
    refined_partition = {}
    
    for community in partition.communities:
        subgraph = G.subgraph(community.nodes)
        
        // Initialize each node in its own sub-community
        sub_partition = {node: {node} for node in community.nodes}
        
        // Only merge connected nodes
        for node in random_order(community.nodes):
            for neighbor in subgraph.neighbors(node):
                if merge_improves_quality(node, neighbor, sub_partition):
                    merge(node, neighbor, sub_partition)
        
        // Add refined sub-communities to result
        refined_partition.add(sub_partition)
    
    return refined_partition

Quality Function: CPM vs Modularity

While Leiden can optimize modularity, it also supports the Constant Potts Model (CPM), which avoids the resolution limit:

Modularity

$$ Q = \sum_c \left[ \frac{e_c}{m} - \gamma \left( \frac{n_c}{2m} \right)^2 \right] $$

CPM (Constant Potts Model)

$$ H = \sum_c \left[ e_c - \gamma \binom{n_c}{2} \right] $$

Where:

$e_c$ = number of edges inside community $c$
$n_c$ = number of nodes in community $c$
$\gamma$ = resolution parameter

CPM advantage: The resolution parameter has a direct interpretation as the expected edge density within communities.

Guaranteed Properties

The Leiden algorithm guarantees:

Property	Description
$\gamma$-connected	All communities are locally optimal w.r.t. resolution $\gamma$
Well-connected	No community can be disconnected by removing a single node
Subset optimality	Communities cannot be improved by moving subsets of nodes
Convergence	Algorithm converges to a stable partition

Complexity Analysis

Aspect	Complexity
Time (per iteration)	$O(m)$
Total time (typical)	$O(m \cdot \text{iterations})$
Space	$O(n + m)$
Iterations	Typically 5-10

Leiden is slightly slower per iteration than Louvain due to the refinement phase, but often converges faster and produces better results.

Comparison: Louvain vs Leiden

Aspect	Louvain	Leiden
Phases	2	3
Connected communities	Not guaranteed	Guaranteed
Partition quality	Good	Better
Speed	Faster per iteration	Faster convergence
Determinism	Non-deterministic	Non-deterministic
Implementation complexity	Simpler	More complex
Research applications	Established	Recommended

Implementation

Python with leidenalg

import igraph as ig
import leidenalg

# Create graph
G = ig.Graph.Famous("Zachary")

# Run Leiden with modularity optimization
partition = leidenalg.find_partition(
    G, 
    leidenalg.ModularityVertexPartition
)

print(f"Number of communities: {len(partition)}")
print(f"Modularity: {partition.modularity:.4f}")

# With CPM and custom resolution
partition_cpm = leidenalg.find_partition(
    G,
    leidenalg.CPMVertexPartition,
    resolution_parameter=0.1
)

Python with cdlib

from cdlib import algorithms
import networkx as nx

# Create graph
G = nx.karate_club_graph()

# Run Leiden
communities = algorithms.leiden(G)

# Get community membership
for i, com in enumerate(communities.communities):
    print(f"Community {i}: {com}")

Resolution Parameter Exploration

import leidenalg
import igraph as ig
import matplotlib.pyplot as plt

G = ig.Graph.Famous("Zachary")
resolutions = [0.01, 0.05, 0.1, 0.5, 1.0, 2.0]

for res in resolutions:
    partition = leidenalg.find_partition(
        G,
        leidenalg.CPMVertexPartition,
        resolution_parameter=res
    )
    print(f"Resolution {res}: {len(partition)} communities")

Practical Guidelines

When to Use Leiden Over Louvain

Quality is critical: Research publications, final analyses
Connectivity matters: Applications where disconnected communities are problematic
Large networks: Leiden often finds better partitions faster
Resolution exploration: CPM provides interpretable resolution parameter

Parameter Tuning

Parameter	Recommended Values	Notes
Resolution (Modularity)	0.5 - 2.0	Higher = smaller communities
Resolution (CPM)	0.001 - 0.1	Depends on network density
Number of iterations	Until convergence	Usually 5-10
Number of runs	10-100	For stability analysis

Validation

Check connectivity: Verify all communities are connected
Size distribution: Look for reasonable community sizes
Modularity: Compare with Louvain baseline
Stability: Multiple runs should give similar results

Applications

Domain	Application
Single-cell analysis	Cell type clustering in scRNA-seq
Social networks	Community structure analysis
Citation networks	Research field identification
Protein networks	Functional module detection
Financial networks	Market segment analysis

Community Detection - Overview of community detection methods
Louvain Algorithm - Predecessor algorithm
Modularity - Quality function
Graph Neural Networks - Deep learning on graphs

References

Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1), 5233. https://www.nature.com/articles/s41598-019-41695-z ↩︎

Leiden Algorithm

Leiden Algorithm

Motivation: Louvain’s Limitations

Algorithm Overview

Phase 1: Local Moving

Phase 2: Refinement (Leiden’s Innovation)

Phase 3: Aggregation

Iteration

Visual Comparison

The Refinement Phase in Detail

Refinement Pseudocode

Quality Function: CPM vs Modularity

Modularity

CPM (Constant Potts Model)

Guaranteed Properties

Complexity Analysis

Comparison: Louvain vs Leiden

Implementation

Python with leidenalg

Python with cdlib

Resolution Parameter Exploration

Practical Guidelines

When to Use Leiden Over Louvain

Parameter Tuning

Validation

Applications

References

🔗 Referenced By

Leiden Algorithm

Motivation: Louvain’s Limitations

Algorithm Overview

Phase 1: Local Moving

Phase 2: Refinement (Leiden’s Innovation)

Phase 3: Aggregation

Iteration

Visual Comparison

The Refinement Phase in Detail

Refinement Pseudocode

Quality Function: CPM vs Modularity

Modularity

CPM (Constant Potts Model)

Guaranteed Properties

Complexity Analysis

Comparison: Louvain vs Leiden

Implementation

Python with leidenalg

Python with cdlib

Resolution Parameter Exploration

Practical Guidelines

When to Use Leiden Over Louvain

Parameter Tuning

Validation

Applications

Related Topics

References

🔗 Referenced By