Bron to Clique
- 8 minutes read - 1522 wordsDiscovery of Bron-Kerbosch in AoC24-23
For part 2 of this challenge, I am actually ashamed of showing here what I initially tried to program without knowing what a clique was, nor that an algorithm existed to find the maximal cliques in a graph… Maybe one day when I add a Premium Pass to this blog, a few privileged users could see the pΓ©pite.

Part 1 - the piece of π°
As The Historians wander around a secure area at Easter Bunny HQ, you come across posters for a LAN party scheduled for today!
Maybe you can find it; you connect to a nearby datalink port and download a map of the local network (your puzzle input).
The network map provides a list of every connection between two computers. For example:
kh-tc
qp-kh
de-cg
ka-co
Each line of text in the network map represents a single connection; the line kh-tc represents a connection between the
computer named kh and the computer named tc. Connections aren't directional; tc-kh would mean exactly the same thing.
LAN parties typically involve multiplayer games, so maybe you can locate it by finding groups of connected computers.
Start by looking for sets of three computers where each computer in the set is connected to the other two computers.
If the Chief Historian is here, and he's at the LAN party, it would be best to know that right away. You're pretty
sure his computer's name starts with t, so consider only sets of three computers where at least one computer's name
starts with t. That narrows the list down to 7 sets of three inter-connected computers:
co,de,ta
co,ka,ta
de,ka,ta
qp,td,wh
tb,vc,wq
tc,td,wh
td,wh,yn
Find all the sets of three inter-connected computers. How many contain at least one computer with a name that starts
with t?
Initial Thoughts
Setting the “starts with t” requirement aside, the list of computers given is a list of edges connecting two computers (nodes).
def is_name_valid(a: str, b: str, c: str) -> bool:
return a.startswith('t') or b.startswith('t') or c.startswith('t')
What we are looking for is pretty straightforward: 3 computers connected, each connected to the other two. For A, B and C, we are really looking for the edges A-B, A-C and B-C in any orientation given the graph is undirected.
I will learn
later this was called a clique. A clique is a set of nodes in which every
pair of nodes is connected by an edge.
First thing to do is to build the adjacency list for the graph.
For each pair of nodes (origin, related_computer_i), the common neighbours are extracted. There are as many trios as there are common
neighbours for the defined pair.
There is no need to look for the origin node within the adjacency list of the related_computer. Because the graph is
undirected, this relationship is guaranteed to be there given how the adjacency list is built (both sided).
Using this method, there is a risk of performing redundant trio search given edges are both sided - namely A-B and B-A.
A simple if condition related_computer > origin divides the iterations by two, by arbitrarily choosing
one side to process.
def find_trios(pairs: dict[str, set]) -> set:
relationships = defaultdict(set)
for pair in pairs:
relationships[pair[0]].add(pair[1])
relationships[pair[1]].add(pair[0])
trios = set()
for origin in relationships.keys():
for related_computer in relationships[origin]:
if related_computer > origin: # removes redundant checks A-B / B-A
common = relationships[related_computer] & relationships[origin]
for computer in common:
if is_name_valid(origin, related_computer, computer):
trio = frozenset([origin, related_computer, computer])
trios.add(trio)
return trios
Part 2 - the tough πͺ
There are still way too many results to go through them all. You'll have to find the LAN party another way and go there yourself.
Since it doesn't seem like any employees are around, you figure they must all be at the LAN party. If that's true, the
LAN party will be the largest set of computers that are all connected to each other. That is, for each computer at the LAN party,
that computer will have a connection to every other computer at the LAN party.
In the above example, the largest set of computers that are all connected to each other is made up of co, de, ka, and ta.
Each computer in this set has a connection to every other computer in the set:
ka-co
ta-co
de-co
ta-ka
de-ta
ka-de
The LAN party posters say that the password to get into the LAN party is the name of every computer at the LAN party,
sorted alphabetically, then joined together with commas. (The people running the LAN party are clearly a bunch of nerds.)
In this example, the password would be co,de,ka,ta.
What is the password to get into the LAN party?
What we’re looking for here is the maximum clique in this graph. Once this clique is obtained, it needs to be sorted and joined to serve as a password. Putting the second requirement aside using the below.
def get_lan_password(names: list[str]) -> str:
return ",".join(sorted(names))
This is when the Bron-Kerbosch algorithm comes to the rescue.
The algorithm is based on 3 sets: the clique we are trying to expand R, the nodes we are yet to consider to extend
the current clique P and the nodes we have already encountered X, to avoid duplicate work.
In its simplest form the algorithm is as below:
BronKerbosch(R, P, X):
if P is empty and X is empty:
report R as a maximal clique
return
for each node in P:
BronKerbosch(R βͺ {node}, P β© N(node), X β© N(node))
P = P \ {node}
X = X βͺ {node}
Let’s walk through an example. Given the edges aa-bb, bb-cc, cc-aa, bb-dd, the algorithm would find 2 maximal cliques: [aa-bb-cc] and [bb-dd].
aa
/ \
bb---cc
|
dd
By adding nodes to the set X, we prevent the algorithm from ‘marking’ maximal cliques which have already been found.
In a similar way, by intersecting P with the neighbours of the current node, we ensure the nodes explored in the recursion level
are all neighbours of nodes already in R - R being unioned with the current node at each search level.
It also ensures all nodes added to R extending the current clique, meet the requirement to belong to that clique.
When backtracking, removing the current node from P and adding it to X makes the algorithm stop iterating early when
said node has already been explored.
bron_kerbosch(R={}, P={aa, bb, cc, dd}, X={})
βββ node = aa
β βββ bron_kerbosch(R={aa}, P={bb, cc}, X={})
β βββ node = bb
β β βββ bron_kerbosch(R={aa, bb}, P={cc}, X={})
β β βββ node = cc
β β βββ bron_kerbosch(R={aa, bb, cc}, P={}, X={}) β
Maximal clique!
β βββ node = cc
β βββ bron_kerbosch(R={aa, cc}, P={}, X={bb}) β Not maximal
βββ node = bb
β βββ bron_kerbosch(R={bb}, P={cc, dd}, X={aa})
β βββ node = cc
β β βββ bron_kerbosch(R={bb, cc}, P={}, X={aa}) β Not maximal
β βββ node = dd
β βββ bron_kerbosch(R={bb, dd}, P={}, X={}) β
Maximal clique!
βββ node = cc
β βββ bron_kerbosch(R={cc}, P={}, X={aa, bb}) β Not maximal
βββ node = dd
βββ bron_kerbosch(R={dd}, P={}, X={bb}) β Not maximal
The below implementation solved the challenge.
def bron_kerbosch(r: set[str], p: set[str], x: set[str], graph: dict[str, set], cliques: list[set]):
if not p and not x:
cliques.append(r)
return
for vertex in list(p):
bron_kerbosch( r | {vertex}, p & graph[vertex], x & graph[vertex], graph, cliques)
p.remove(vertex)
x.add(vertex)
def find_largest_clique(graph: dict[str, set]) -> str:
cliques_found = []
bron_kerbosch(set(), set(graph.keys()), set(), graph, cliques_found)
largest_clique = max(cliques_found, key=len)
return get_lan_password(list(largest_clique))
Digging Depper…
Reading further on Bron-Kerbosch unveiled an optimised version using pivoting. The idea behind it is to prune the search space by removing redundant checks.
A pivot node is selected from the union set of P and X. Recall P and X are
all nodes not belonging to the current clique, either eliminated or pending exploration.
All maximum clique in the union of R, P and X include a non-neighbor of the pivot (either the pivot itself,
or another node that prevents the pivot from being added).
If the pivot isn’t part of the maximal clique then it must include some nodes not connected to it.
Based on this reasoning, the pivot’s neighbors
are removed from the search iteration to minimize the number of recursive calls.
To reduce the search space even further, we chose the pivot having the maximum number of neighbours.
def bron_kerbosch(r: set[str], p: set[str], x: set[str], graph: dict[str, set], cliques: list[set]):
if not p and not x:
cliques.append(r)
return
# pivot
pivot_candidates = p | x
# choose the pivot from the candidates with the highest number of connexions
u = max(pivot_candidates, key=lambda vertex: len(graph[vertex] & p))
candidate_neighbours = graph[u]
# reduce search space by avoiding direct neighbours of the pivot
for vertex in p - (candidate_neighbours):
bron_kerbosch( r | {vertex}, p & graph[vertex], x & graph[vertex], graph, cliques)
p.remove(vertex)
x.add(vertex)
Using the base example given previously, the pivoting method cuts the number of iterations by 50%: from 10 to only 5. However, using the data of the AoC challenge, the iterations reduce from 248068 to 1337; representing a 99,5% drop!