[HL-9] Minimal set cover problem
This is a minimal set cover problem. I met such a problem during one of my coding interviews.
Problem: Given a group of sets S, find a minimal set covering that includes all elements in U.
Input:
S = {s1, s2, s3, s4}
s1 = {1,2,3}
s2 = {2,4}
s3 = {3,4}
s4 = {4,5}
U = {1,2,3,4,5}
Output:
ans = {s1, s4}
Solution: Greedy Algorithm (polynomial time approximation)
Idea: at each stage, choose the set that contains the largest number of uncovered elements.
"""
Minimal set cover problem.
"""
# At each stage, find a set that includes largest number of uncovered
# elements.
def set_cover(S, U):
ans = []
while len(U) != 0:
# find a set that maximumly covers
intersects = [U.intersection(x) for x in S]
lengths = map(len, intersects)
max_index = lengths.index(max(lengths))
# remove it from the entire set and uncovered set
ans.append(S.pop(max_index))
U.difference_update(intersects[max_index])
return ans
s1 = set([1,2,3])
s2 = set([2,4])
s3 = set([3,4])
s4 = set([4,5])
S = [s1,s2,s3,s4]
U = set([1,2,3,4,5])
print set_cover(S, U)
Here, I give a simple implementation. You can find more on Wikipedia. A more difficult problem is that every set has a weight, so it requires to cover all items in U with a minimal total weight (refer to [3]).
Reference
[0] Set cover problem on Wikipedia
[1] Polynomials on MATH is FUN
[2] What is P, NP, and NPC problems (Chinese), great explanation of those problems.