Louvain Algorithm
The Louvain algorithm, developed at the University of Louvain in Belgium, is one of the most widely used methods for community detection in large networks. It employs a greedy optimization approach to maximize modularity, discovering hierarchical community structure efficiently.1
Algorithm Overview
The Louvain method is an agglomerative algorithm that operates in two alternating phases, repeated iteratively until no further improvement in modularity is possible.
Phase 1: Local Optimization (Modularity Optimization)
- Start with each node in its own community
- For each node $i$, calculate the modularity gain from moving it to each neighbor’s community
- Move $i$ to the community yielding the maximum positive gain
- Repeat until no move improves modularity
Phase 2: Network Aggregation
- Create a new network where each community becomes a single node
- Edge weights between new nodes = sum of edges between communities
- Self-loops = sum of internal edges within the original community
Iteration
Repeat Phase 1 and Phase 2 on the coarsened network until modularity converges.
Visual Representation
Initial State: After Phase 1: After Phase 2:
Each node = community Nodes grouped Communities = nodes
○ ○ ○ ○ ○ ○ ┌───┐ ┌───┐ ●───────●
\│╱ \│╱ │○○○│─────│○○○│ \ /
○───○ │ ○ │ │ ○ │ \ /
╱│╲ ╱│╲ └───┘ └───┘ ●
○ ○ ○ ○ ○ ○ (coarsened)
Modularity Gain Calculation
When moving node $i$ from community $C$ to community $D$, the modularity change is:
$$ \Delta Q = \frac{k_{i,D} - k_{i,C}}{m} - \gamma \frac{k_i(\Sigma_D - \Sigma_C + k_i)}{2m^2} $$
Where:
- $k_{i,D}$ = sum of edge weights from $i$ to nodes in community $D$
- $k_{i,C}$ = sum of edge weights from $i$ to nodes in community $C$ (excluding $i$)
- $\Sigma_D$ = sum of degrees of nodes in $D$
- $\Sigma_C$ = sum of degrees of nodes in $C$ (excluding $i$)
- $k_i$ = degree of node $i$
- $m$ = total edge weight
- $\gamma$ = resolution parameter (default = 1)
Simplified Formula
For practical implementation, the gain from moving $i$ into community $C$ can be computed as:
$$ \Delta Q = \left[ \frac{\Sigma_{in} + 2k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2 \right] - \left[ \frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2 \right] $$
Where:
- $\Sigma_{in}$ = sum of weights of edges inside $C$
- $\Sigma_{tot}$ = sum of degrees of nodes in $C$
- $k_{i,in}$ = sum of weights of edges from $i$ to nodes in $C$
Algorithm Pseudocode
function LOUVAIN(G):
// Initialize: each node in its own community
communities = {v: v for v in G.nodes}
improvement = True
while improvement:
improvement = False
// PHASE 1: Local modularity optimization
for node in random_order(G.nodes):
best_community = communities[node]
best_gain = 0
for neighbor in G.neighbors(node):
gain = calculate_modularity_gain(node, communities[neighbor])
if gain > best_gain:
best_gain = gain
best_community = communities[neighbor]
if best_community != communities[node]:
communities[node] = best_community
improvement = True
// PHASE 2: Network aggregation
if improvement:
G = aggregate_communities(G, communities)
communities = {v: v for v in G.nodes}
return communities
Resolution Parameter
The resolution parameter $\gamma$ controls the granularity of detected communities:
| $\gamma$ Value | Effect |
|---|---|
| $\gamma < 1$ | Fewer, larger communities |
| $\gamma = 1$ | Standard modularity |
| $\gamma > 1$ | More, smaller communities |
This helps address the resolution limit problem inherent in modularity optimization.
Complexity Analysis
| Aspect | Complexity |
|---|---|
| Time (sparse networks) | $O(n \log n)$ |
| Time (worst case) | $O(n^2)$ |
| Space | $O(n + m)$ |
| Typical iterations | 2-10 passes |
The algorithm is highly efficient due to:
- Local updates (only considering neighbors)
- Greedy optimization (no backtracking)
- Hierarchical coarsening (network shrinks each iteration)
Advantages
- Speed: Near-linear time complexity for sparse networks
- Scalability: Handles networks with millions of nodes
- Quality: Produces high-modularity partitions
- Hierarchy: Naturally discovers multi-scale structure
- No prior knowledge: Does not require specifying number of communities
- Weighted networks: Works with edge weights
Limitations
- Non-deterministic: Results vary based on node processing order
- Resolution limit: May merge small communities in large networks
- Disconnected communities: Can produce poorly connected communities2
- Local optima: Greedy approach may miss global optimum
- Sensitive to initialization: Different runs may yield different results
Comparison with Leiden
| Aspect | Louvain | Leiden |
|---|---|---|
| Community quality | Good | Better (guaranteed connected) |
| Speed | Fast | Slightly slower |
| Refinement step | No | Yes |
| Guarantees | None | Well-connected communities |
| Implementation complexity | Simpler | More complex |
The Leiden Algorithm was developed specifically to address Louvain’s disconnected communities issue.
Implementation
Python with NetworkX
import networkx as nx
from networkx.algorithms.community import louvain_communities
# Create or load a graph
G = nx.karate_club_graph()
# Run Louvain algorithm
communities = louvain_communities(G, seed=42, resolution=1.0)
# Convert to list for easier handling
communities_list = [list(c) for c in communities]
# Print results
for i, community in enumerate(communities_list):
print(f"Community {i}: {community}")
# Calculate modularity
from networkx.algorithms.community import modularity
Q = modularity(G, communities)
print(f"Modularity: {Q:.4f}")
Python with python-louvain
import community as community_louvain
import networkx as nx
G = nx.karate_club_graph()
# Get partition dictionary
partition = community_louvain.best_partition(G, resolution=1.0)
# Calculate modularity
modularity = community_louvain.modularity(partition, G)
print(f"Modularity: {modularity:.4f}")
# Count communities
n_communities = len(set(partition.values()))
print(f"Number of communities: {n_communities}")
Practical Tips
- Run multiple times: Due to non-determinism, run 10-50 times and select best result
- Try different resolutions: Explore $\gamma$ from 0.5 to 2.0
- Set random seed: For reproducibility in testing
- Validate results: Check community connectivity and size distribution
- Consider Leiden: For applications requiring guaranteed quality
Applications
| Domain | Use Case |
|---|---|
| Social Networks | Finding friend groups, influencer communities |
| Biology | Protein complex detection, metabolic pathways |
| Citation Networks | Research topic clustering |
| E-commerce | Customer segmentation |
| Telecommunications | Call pattern analysis |
Related Topics
- Community Detection - Overview of community detection methods
- Leiden Algorithm - Improved version of Louvain
- Modularity - Quality function being optimized
- Graph Neural Networks - Deep learning approaches
References
Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008. https://arxiv.org/abs/0803.0476 ↩︎
Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1), 5233. https://www.nature.com/articles/s41598-019-41695-z ↩︎