Luke a Pro

Luke Sun

Developer & Marketer

šŸ‡ŗšŸ‡¦

Union-Find (Disjoint Set)

| , 4 minutes reading.

Union-Find: The Digital Bureaucrat

The Story: The Party Planner’s Dilemma

Imagine you are hosting a massive party with 1,000 guests. You want to know if two random people, Alice and Bob, know each other.

  • Alice knows Carol.
  • Carol knows Dave.
  • Dave knows Bob.

Tracing this chain every time is exhausting (O(N)O(N)).

In 1964, Bernard Galler and Michael Fisher proposed a solution that mirrors a Political Party: Everyone belongs to a group, and every group has a ā€œRepresentativeā€ (The Boss).

If you ask Alice ā€œWho is your boss?ā€, she says ā€œCarolā€. You ask Carol, she says ā€œDaveā€. You ask Dave, he says ā€œMeā€. If you ask Bob ā€œWho is your boss?ā€, he also says ā€œDaveā€. Since they have the same boss, they are in the same group. Simple.

But they added a twist called Path Compression: The next time you ask Alice, she skips the middleman and points directly to Dave. This simple trick flattens the hierarchy, making the algorithm nearly instantaneous.

Why do we need it?

Union-Find is designed for one thing: Dynamic Connectivity.

  • Social Networks: ā€œYou and X have 5 mutual friends.ā€
  • Image Processing: ā€œMagic Wandā€ tool in Photoshop uses this to find all connected pixels of the same color.
  • Kruskal’s Algorithm: Building a Minimum Spanning Tree (connecting cities with the cheapest cables) requires constantly checking ā€œAre these two cities already connected?ā€

How the Algorithm ā€œThinksā€

The algorithm is a bureaucrat with two simple jobs:

  1. Find (The Inquiry): It asks a node: ā€œWho is your leader?ā€ If the node points to someone else, it follows the finger up the chain until it finds the person pointing to themselves. Crucial Optimization: As it walks back down, it whispers to everyone in the chain: ā€œBy the way, the big boss is Dave. Next time, just point directly to him.ā€ (Path Compression)

  2. Union (The Merger): It connects two groups. It takes the boss of Group A and tells them: ā€œFrom now on, you report to the boss of Group B.ā€ Optimization: It always makes the smaller group report to the larger group to keep the tree flat. (Union by Rank)

The ā€œInverse Ackermannā€ Miracle

Union-Find is famous for its speed. Its time complexity is O(α(N))O(\alpha(N)), where α\alpha is the Inverse Ackermann function. What does that mean? For all practical purposes (even if NN is the number of atoms in the universe), α(N)≤4\alpha(N) \le 4. It is, effectively, Constant Time O(1)O(1).

Implementation (Python)

class UnionFind:
    def __init__(self, size):
        # Initially, everyone is their own boss
        self.parent = list(range(size))
        # Rank helps us merge smaller trees into larger ones
        self.rank = [1] * size

    def find(self, p):
        # 1. Path Compression: The Bureaucratic Shortcut
        if self.parent[p] != p:
            # Recursively find the boss and update my parent directly to the boss
            self.parent[p] = self.find(self.parent[p])
        return self.parent[p]

    def union(self, p, q):
        # 2. Union by Rank: The Merger
        rootP = self.find(p)
        rootQ = self.find(q)

        if rootP != rootQ:
            # Attach the shorter tree to the taller tree
            if self.rank[rootP] > self.rank[rootQ]:
                self.parent[rootQ] = rootP
            elif self.rank[rootP] < self.rank[rootQ]:
                self.parent[rootP] = rootQ
            else:
                self.parent[rootQ] = rootP
                self.rank[rootP] += 1
            return True # Successfully merged
        return False # Already connected

# Example Usage
# uf = UnionFind(10)
# uf.union(1, 2)
# uf.union(2, 5)
# print(uf.find(1) == uf.find(5)) # True

Summary

Union-Find teaches us that the best way to manage a complex organization is to flatten the hierarchy. By empowering every member to know the ā€œBig Bossā€ directly, it turns a tangled web of relationships into a lightning-fast lookup system.