Louvain Algorithm

The Louvain algorithm, developed at the University of Louvain in Belgium, is one of the most widely used methods for community detection in large networks. It employs a greedy optimization approach to maximize modularity, discovering hierarchical community structure efficiently.1


Algorithm Overview

The Louvain method is an agglomerative algorithm that operates in two alternating phases, repeated iteratively until no further improvement in modularity is possible.

Phase 1: Local Optimization (Modularity Optimization)

  1. Start with each node in its own community
  2. For each node $i$, calculate the modularity gain from moving it to each neighbor’s community
  3. Move $i$ to the community yielding the maximum positive gain
  4. Repeat until no move improves modularity

Phase 2: Network Aggregation

  1. Create a new network where each community becomes a single node
  2. Edge weights between new nodes = sum of edges between communities
  3. Self-loops = sum of internal edges within the original community

Iteration

Repeat Phase 1 and Phase 2 on the coarsened network until modularity converges.


Visual Representation

Initial State:          After Phase 1:           After Phase 2:
Each node = community   Nodes grouped            Communities = nodes

○ ○ ○ ○ ○ ○            ┌───┐     ┌───┐          ●───────●
 \│╱ \│╱               │○○○│─────│○○○│            \   /
  ○───○                │ ○ │     │ ○ │             \ /
 ╱│╲ ╱│╲               └───┘     └───┘              ●
○ ○ ○ ○ ○ ○                                    (coarsened)

Modularity Gain Calculation

When moving node $i$ from community $C$ to community $D$, the modularity change is:

$$ \Delta Q = \frac{k_{i,D} - k_{i,C}}{m} - \gamma \frac{k_i(\Sigma_D - \Sigma_C + k_i)}{2m^2} $$

Where:

  • $k_{i,D}$ = sum of edge weights from $i$ to nodes in community $D$
  • $k_{i,C}$ = sum of edge weights from $i$ to nodes in community $C$ (excluding $i$)
  • $\Sigma_D$ = sum of degrees of nodes in $D$
  • $\Sigma_C$ = sum of degrees of nodes in $C$ (excluding $i$)
  • $k_i$ = degree of node $i$
  • $m$ = total edge weight
  • $\gamma$ = resolution parameter (default = 1)

Simplified Formula

For practical implementation, the gain from moving $i$ into community $C$ can be computed as:

$$ \Delta Q = \left[ \frac{\Sigma_{in} + 2k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2 \right] - \left[ \frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2 \right] $$

Where:

  • $\Sigma_{in}$ = sum of weights of edges inside $C$
  • $\Sigma_{tot}$ = sum of degrees of nodes in $C$
  • $k_{i,in}$ = sum of weights of edges from $i$ to nodes in $C$

Algorithm Pseudocode

function LOUVAIN(G):
    // Initialize: each node in its own community
    communities = {v: v for v in G.nodes}
    
    improvement = True
    while improvement:
        improvement = False
        
        // PHASE 1: Local modularity optimization
        for node in random_order(G.nodes):
            best_community = communities[node]
            best_gain = 0
            
            for neighbor in G.neighbors(node):
                gain = calculate_modularity_gain(node, communities[neighbor])
                if gain > best_gain:
                    best_gain = gain
                    best_community = communities[neighbor]
            
            if best_community != communities[node]:
                communities[node] = best_community
                improvement = True
        
        // PHASE 2: Network aggregation
        if improvement:
            G = aggregate_communities(G, communities)
            communities = {v: v for v in G.nodes}
    
    return communities

Resolution Parameter

The resolution parameter $\gamma$ controls the granularity of detected communities:

$\gamma$ ValueEffect
$\gamma < 1$Fewer, larger communities
$\gamma = 1$Standard modularity
$\gamma > 1$More, smaller communities

This helps address the resolution limit problem inherent in modularity optimization.


Complexity Analysis

AspectComplexity
Time (sparse networks)$O(n \log n)$
Time (worst case)$O(n^2)$
Space$O(n + m)$
Typical iterations2-10 passes

The algorithm is highly efficient due to:

  • Local updates (only considering neighbors)
  • Greedy optimization (no backtracking)
  • Hierarchical coarsening (network shrinks each iteration)

Advantages

  1. Speed: Near-linear time complexity for sparse networks
  2. Scalability: Handles networks with millions of nodes
  3. Quality: Produces high-modularity partitions
  4. Hierarchy: Naturally discovers multi-scale structure
  5. No prior knowledge: Does not require specifying number of communities
  6. Weighted networks: Works with edge weights

Limitations

  1. Non-deterministic: Results vary based on node processing order
  2. Resolution limit: May merge small communities in large networks
  3. Disconnected communities: Can produce poorly connected communities2
  4. Local optima: Greedy approach may miss global optimum
  5. Sensitive to initialization: Different runs may yield different results

Comparison with Leiden

AspectLouvainLeiden
Community qualityGoodBetter (guaranteed connected)
SpeedFastSlightly slower
Refinement stepNoYes
GuaranteesNoneWell-connected communities
Implementation complexitySimplerMore complex

The Leiden Algorithm was developed specifically to address Louvain’s disconnected communities issue.


Implementation

Python with NetworkX

import networkx as nx
from networkx.algorithms.community import louvain_communities

# Create or load a graph
G = nx.karate_club_graph()

# Run Louvain algorithm
communities = louvain_communities(G, seed=42, resolution=1.0)

# Convert to list for easier handling
communities_list = [list(c) for c in communities]

# Print results
for i, community in enumerate(communities_list):
    print(f"Community {i}: {community}")

# Calculate modularity
from networkx.algorithms.community import modularity
Q = modularity(G, communities)
print(f"Modularity: {Q:.4f}")

Python with python-louvain

import community as community_louvain
import networkx as nx

G = nx.karate_club_graph()

# Get partition dictionary
partition = community_louvain.best_partition(G, resolution=1.0)

# Calculate modularity
modularity = community_louvain.modularity(partition, G)
print(f"Modularity: {modularity:.4f}")

# Count communities
n_communities = len(set(partition.values()))
print(f"Number of communities: {n_communities}")

Practical Tips

  1. Run multiple times: Due to non-determinism, run 10-50 times and select best result
  2. Try different resolutions: Explore $\gamma$ from 0.5 to 2.0
  3. Set random seed: For reproducibility in testing
  4. Validate results: Check community connectivity and size distribution
  5. Consider Leiden: For applications requiring guaranteed quality

Applications

DomainUse Case
Social NetworksFinding friend groups, influencer communities
BiologyProtein complex detection, metabolic pathways
Citation NetworksResearch topic clustering
E-commerceCustomer segmentation
TelecommunicationsCall pattern analysis

  • Community Detection - Overview of community detection methods
  • Leiden Algorithm - Improved version of Louvain
  • Modularity - Quality function being optimized
  • Graph Neural Networks - Deep learning approaches

References


  1. Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008. https://arxiv.org/abs/0803.0476 ↩︎

  2. Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9(1), 5233. https://www.nature.com/articles/s41598-019-41695-z ↩︎