Question #3

Each person is a vertex and there exists an edge between two vertices if there is a conflict between those two people. The weight of each edge is 1. We want to split all the vertices into two clusters. The cost function is the number of edges that go from one cluster to another.

The local search algorithm is to pick a random solution and then create two values for each node. A random solution splits the nodes into two different clusters. The first value is the number of edges that go to the other node in its own cluster. The second value is the number of edges that go a node in the other cluster. If the first value \(>\) the second value, that means switching this node to the other table would increase the number of edges that cross the clusters and that is what we want to do.

So basically with the local search, we look for a vertex that would increase the number of edges between the partitions if we swapped them.

The algorithm runs in polynomial time because when we check if we can move a vertex, we basically have to look at every single edge. For each vertex, we need to look at the first and second value which are derived by the number of edges. So for each iteration we have to look at all the edges and if this is a dense graph that will be \(n^2\) operations where \(n\) is the number of vertices.

Now we know that all the edge weights are 1, and since we only iterate through this program if we are going to increase the cost, that means every iteration must increase the cost by 1.

So that means if there is a total of \(n^2\) edges, the total cost this problem could have is \(n^2\). So in the worst case our random split starts with 0 cost and every single iteration only increases the cost by 1. Then this means the maximum number of iterations we can have is \(n^2\).

So total runtime is \((n^2*n^2)=O(n^4)\) which is polynomial.

To show that it provides a 2-approximation solution:

First we know that the sum of all the edges weights >= optimal value. Since the cost is derived from the number of edges, it can’t exceed the sum of the edge weights. Also we know that if we reached a locally optimal solution, that means if you swap two vertices, the total cost must decrease.

Let’s call our two sets be call \(A\) and \(B\). If we are a locally optimum solution and we move the node u from A to B, we get:

\(\sum\nolimits_{u \in A, (u,v) \in E}w_{u,v} >= \sum\nolimits_{u \in B, (u,v) \in E}w_{u,v}\)

\(\sum\nolimits_{u \in A, (u,v) \in E}w_{u,v} >= \sum\nolimits_{u \in B, (u,v) \in E}w_{u,v}+\sum\nolimits_{u \in A, (u,v) \in E}w_{u,v}\)

\(\sum\nolimits_{u \in A, (u,v) \in E}w_{u,v} >= \frac{1}{2}\sum\nolimits_{u:(u,v) \in E}w_{u,v}\)

Similarly, for any \(v' \in A\):

\(\sum\nolimits_{u \in B, (u,v') \in E}w_{u,v'} >= \frac{1}{2}\sum\nolimits_{u:(u,v') \in E}w_{u,v}\)

Summing over all vertices, we obtain the desired result:

\(2c(B) = 2\sum\nolimits_{u \in B, v \in A, (u,v) \in E}w_{u,v} >=\sum\nolimits_{e \in E}w_e >= OPT \)

For the greedy algorithm, start off with both sets A and B empty. Then pick a random vertex and place it in set A. Then pick the next vertex and put it in B so that it will maximize the number of edges between A and B. After this, choose and vertex that will maximize the number of edges between the two sets and place it in the set that maximizes the number of edges. Continue this operation until you run out of vertices.

This is a polynomial time algorithm. There are a total number of \(n\) iterations, one for each time you pick a vertex. Then to pick a vertex you must look at every single edge and worst case there will be \(n^2\) number of edges. So that means in total there should be \(n^3\) operations, which is polynomial.

This is a 2-approximation algorithm. Let the total number of edges in the graph be \(E\). Let \(n\) be the number of nodes. For every single edge in the graph, it has two nodes. Let’s label the first node as the \(Start\) node and the other node at the \(End\) node.

When the \(End\) node gets added to the graph problem, the edge is created. Now let \(e_i\) be the number of edges that the node \(i\), is considered the \(End\) node. Since every single edge has one \(End\) node, that means \(\sum\limits_{i=1}^n e_i = E\), which is the total number of edges.

Now since we are looking at a greedy algorithm whenever node \(i\) is added to the sets, the number of edges that are added to this cost will be \(\frac{e_i}{2}\). This happens because this is a greedy choice. The greedy choice will always choose the set that will create the highest cost. In the worst case, both sets have split \(e_i\) evenly, so no matter which set you choose you only increase the cost by \(\frac{e_i}{2}\). But that is the smallest because it will be impossible for both sets to contain less than the average number of nodes that correspond to \(e_i\) per set.

So the number of edges that cross A and B = sum of the number of edges added by the \(End\) nodes...

Which is \(\sum\limits_{i=1}^n \frac{e_i}{2} = E\) (since this is how much the greedy algorithm adds each time)...

Which is \(\frac{E}{2}\) since the sum of the \(e_i\)’s equal the total number of edges.

So the number of edges that cross A and B from the greedy algo is \(\frac{E}{2}\). Since the optimal value can only be \(E\) at best that means this greedy algorithm is at most 2 times worse than the optimal solution.

Randomized algorithm: Just randomly place vertices in the two sets. This is obvious polynomial because there are no computations necessary.

This is a 2-time approximation algorithm. There are a total of E edges which is the max value the optimal solution can take. We need to show that this algo will on average give us \(\frac{E}{2}\) cost. So let us number the edges from 1 to E. Let \(e_i=1\) if that edge crosses from A to B and let it equal \(0\) if it doesn’t.

Now, \(P(e_i=1)=\) the probability that the two nodes get placed into two different sets.

We take two random nodes and enumerate all their possibilities.

Node1 - A, Node2 - A

Node1 - A, Node2 - B

Node1 - B, Node2 - A

Node1 - B, Node2 - B

This shows us that for half the time, the edge will exist and for the other half it wont.

So the expected value of \(e_i= P(e_i=1)*1=\frac{1}{2}\)

So the expected value of all the \(e_i\)’s = \(\frac{E}{2}\) \(*E\)

So using this algorithm, the expected number of edges that should cross from A to B is \(\frac{E}{2}\).

So using this algo the expected number of edges that should cross from A to B is \(\frac{E}{2}\).

So this is a 2-time approximation algorithm.