$29.99
points. Unless otherwise specified, complete and reasoned arguments will be expected for all answers.
Question Points Score
Trade-offs in sampling 12
Satisfying ordering constraints 10
Writing constraints 6
LP Basics 10
Identifying Corners 6
Checking feasibility vs optimization 6
Total: 50
1
Suppose we have a population of size 10 million, and suppose 52% of them vote for A and 48% of them vote for B.
(a) [4] Randomly pick samples of size (a) 20, (b) 100, (c) 400, and evaluate the probability that A is majority even in your sample (by running the experiment say 100 times and taking the count of times A is the majority in your sample). Write down the values you observe for these probabilities in the cases (a-c).
Code link: https://gist.github.com/mozartfish/b5e8a2dba17ec1ba9c8fb19a51b4fe5c a) sample size 20 - Percent A: 43%
b) sample size 100 - Percent A: 63%
c) sample size 400 - Percent A: 81%
(b) [4] What is the size of the sample you need for the probability above to become 0.9?
(Your answer can be within 20% of the ‘best’ value.)
Sample size needed to achieve a probability of 0.9 with 100 trials: 900 elements
(c) [4] Suppose the population is more biased —55% of them vote for A and 45% of them vote for B— and re-solve part (b).
Sample size needed to achieve a probability of 0.9 with 100 trials: 175 elements
Note: This code was written with python and uses python’s random number generator with randint(). According to the docs https://docs.python.org/3/library/random.html Python uses the Mersenne twister to generate random numbers so if a particular experiment is run enough times a specified target probability can be achieved.
Question 2: Satisfying ordering constraints .................................................. [10] This problem is meant to illustrate a rather cool paradigm for solving problems, which is picking a random solution, and understanding how well it does. Indeed, there are many problems (”3SAT”, a version of boolean satisfiability, being the chief of them) where doing better than a random solution is actually NP-hard!
Suppose we have n elements, labeled 1,2,...,n. Our goal is to find an ordering of these elements that satisfies as many “constraints” as possible, of the kind described below. We are given m constraints, where each constraint is given by a triple (a,b,c), where a,b,c ∈ {1,2,...,n}. The constraint is said to be satisfied if in the ordering we output, a does not lie “between” b and c (but the order between b and c doesn’t matter). For example, if n = 4 and we consider the ordering 2431, then the constraint (1,4,3) is satisfied, but (3,1,2) is not.
Given the constraints, the goal is to find an ordering that satisfies as many constraints as possible (for simplicity, assume in what follows that m is a multiple of 3). For large m,n, this problem becomes very difficult.
(a) [6] As a baseline, let us consider a uniformly random ordering. What is the expected number of constraints that are satisfied by this ordering? [Hint: define appropriate random variables whose sum is the quantity of interest, and apply the linearity of expectation.]
Let us consider two values for elements n where n = 3 and n = 4. These represent the set of elements to be chosen for a triplet. For a triplet, we want a unique ordering of the digits so to calculate this we consider the permutation nP3. This yields the total number of constraints m which for n = 3 is 6 and for n = 4 is 24. Through trial and error for writing out the pattern for of the constraints are valid and are invalid. For the ordering of (a,b,c) there are 6 different ways of arranging (a,b,c). Since we are only selecting 3 values for a triplet, we will always get of the constraints are valid regardless of what value of n is chosen.
(b) [4] Let X be the random variable which is the number of constraints satisfied by a random ordering, and let E denote its expectation (which we computed in part (a)). Now, Markov’s inequality tells us, for example, that Pr[X ≥ 2E] ≤ 1/2. But it does not directly imply that Pr[X ≥ E] is “large enough” (which we need if we want to say that generating a few random orderings and picking the best one leads to many constraints being satisfied with high probability). Use the definition of X above to conclude that Pr[X ≥ E] ≥ 1/m. [Hint: X is a non-negative integer!]
Let m represent the number of total constraints
Question 3: Writing constraints...............................................................[6] We will see how writing down constraints can be tricky when formulating problems as optimization.
Recall the traveling salesman problem (TSP): we are given a directed graph G = (V,E) with edge lengths w(e). The goal is to find a directed cycle that (a) visits every vertex exactly once, and (b) minimizes the total length (sum of lengths of the edges of the cycle).
Consider the following optimization formulation. We have variables xuv ∈ {0,1}, one for each edge (u,v) (remember that the graph is directed). The objective is to minimize P(u,v)∈E w(uv)xuv. The constraints are as follows: let N+(u) and N−(u) denote the out-neighbors and in-neighbors of u; then we consider the constraints:
∀ u, X xuv = 1,
v∈N+(u)
∀ u, X xwu = 1.
w∈N−(u)
(Intuitively, the constraints say that exactly one incoming edge and one outgoing edge must be chosen — as in a directed cycle.) Does an optimal solution to this optimization problem always yield an optimal solution to TSP? Provide a proof or a counterexample with a full explanation. [Hint: consider a feasible solution to the optimization problem and focus on the edges with xuv = 1. Do they have to form a single cycle?]
The constraints given in the problem allow for multiple cycles that can be disconnected because there has to be an incoming edge and outgoing edge for a node in the graph. Thus this Linear Program does not yield an optimal solution to the Traveling Salesman Problem because the TSP requires a single cycle.
Question 4: LP Basics.......................................................................[10] Consider the following two-variable linear program. The variables are x,y ∈ R. And the constraints are:
2x + y ≤ 10, x ≥ 0, y ≥ −1, x + 3y ≤ 8.
(a) [4] Plot the feasible region for the linear program. (Diagram can be drawn approximately to scale + scanned).
See attached photo with assignment
(b) [2] Suppose the objective function is to maximize x + y. Find the maximum value and the point at which this is attained.
The maximum point that satisfies the constraint is (4.4, 1.2)
(c) [4] Say the maximum value found in part (b) is C. Then could you have concluded that x+y ≤ C by just “looking at” the equations? [Hint: does adding equations, possibly with constants, yield a bound?]
We can define 4 equations as the following:
1) 2x + y ≤ 10
2) x ≥ 0
3) y ≥ −1
4) x + 3y ≤ 8
We can ignore equations 2) 3) because they will not be affecting the bound.
Operation 1: Multiply Equation 1 by 2
1) → 4x + 2y ≤ 20 2)x + 3y ≤ 8
Operation 2: Add Equations 1 and equation 2
In the above operations, we have shown a bound according to the format in the problem x + y ≤ C. Thus to achieve the maximum from part b, we can use the following equation
Question 5: Identifying Corners...............................................................[6] Consider the following linear program, with variables x1,x2,...,xn ∈ R. Suppose that the constraints are:
0 ≤ xi ≤ 1 for all i, x1 + x2 + ··· + xn ≤ k,
where k is some integer between 1 and n. Consider any point (a1,a2,...,an) where ai ∈ {0,1}, and exactly k of the {ai} are 1. Prove that any such point is (a) a feasible point for the LP, and (b) a corner point of the polytope defining the feasible set.
Part A
In order for a be a feasible point, both constraints 1) and 2) have to be satisfied. According to the definition in the problem, there are only k components of a that are 1 and the rest of the components of a are 0. Let the first k components of a be 1 and the rest of the components ai be 0. Then . This satisfies constraint 2) from the problem statement. By proving constraint 2) we have satisfied constraint 1) in the process, thus satisfying both constraints 1) and 2), showing that any point a is a feasible point for the LP.
Part B
Consider a point a where each ai is within the domain of ∈ 0,1. According to the definition of a, ai can take on either a value of 0 or 1. Consider a non-zero vector z where each element in the vector zi can take on three different values: negative, zero or positive.
Now consider the following cases where we attempt to add values of ai and zi according to the rules of vector addition and subtraction.
Case 1: ai = 0 zi = −1
If we add two vectors a and z, a+z where some element ai and zi are 0 and −1 respectively the resultant vector violates the first constraint thus showing that the resultant vector is a corner point because the result −1 is outside the domain of ∈ 0,1.
Case 2: ai = 0 zi = 1
If we subtract vectors a and z, a−z where some element ai and zi are 0 and 1 respectively the resultant vector violates the first constraint thus showing that the resultant vector is a corner point because the result −1 is outside the domain of ∈ 0,1.
Case 3: ai = 1 zi = −1
If we subtract vectors a and z, z−a where some element ai and zi are 1 and −1 respectively the resultant vector violates the first constraint thus showing that the resultant vector is a corner point because the result −2 is outside the domain of ∈ 0,1.
Case 4: ai = 1 zi = 1
If we subtract vectors a and z, a+z where some element ai and zi are 1 and 1 respectively the resultant vector violates the first constraint thus showing that the resultant vector is a corner point because the result 2 is outside the domain of ∈ 0,1.
Case 5: ai = 0 zi = −1
If we subtract vectors a and z, a − z where some element ai and zi are 0 and −1 respectively, then the resultant vector violates the second constraint because we have added a new 1 component to the vector result which in turn produces a sum > k where there are exactly k ai elements that are 1.
Case 6: ai = 0 zi = 1
If we subtract vectors a and z, a+z where some element ai and zi are 0 and 1 respectively, then the resultant vector violates the second constraint because we have added a new 1 component to the vector result which in turn produces a sum > k where there are exactly k ai elements that are 1.
Thus we have shown in the above cases that for any nonzero vector z a is a corner point.
[Hint: To prove that a point is a corner point, we use the definition we mentioned in class: a feasible point x is a corner point if and only if for all nonzero vectors z, either x + z or x − z is infeasible.]
[Remark: It turns out that these are the only corner points of the polytope, and so it is an example of a polytope defined by 2n + 1 constraints and having corner points.]
Question 6: Checking feasibility vs optimization .............................................. [6] Some of the algorithms for linear programming (e.g. simplex) start off with one of the corner points of the feasible set. This turns out to be tricky in general. In this problem, we will see that in general, finding one feasible point is as difficult as actually performing the optimization!
Consider the following linear program (in n variables x1,...,xn, represented by the vector x):
minimize cT x subject to aT1 x ≥ b1 aT2 x ≥ b2
···
aTmx ≥ bm.
Suppose you know that the optimum value (i.e. the minimum of cT x over the feasible set) lies in the interval [−M,M] for some real number M (this is typically possible in practice). Suppose also that you have an oracle that can take any linear program and say whether it is feasible or not. Prove that using O(log(M/ϵ)) calls to the oracle, one can determine the optimum value of the LP above up to an error of ±ϵ, for any given accuracy ϵ > 0. [Hint: can you write a new LP that is feasible only if the LP above has optimum value ≤ z, for some z?] Since ±ϵ is the error tolerance, we can add the total ϵ for the bound to be 2ϵ. Then we can use a variation of binary search to find a linear program that satisfies the requirements of the constraints.
Proof of Correctness
This algorithm is a variation of the binary search algorithm. We define the lower interval bound as −M and the upper interval bound as M. The stopping condition for this algorithm is when the difference between hi and lo is greater than 2ϵ since ϵ represents the tolerance range between for a z value beween the hi and the lo bounds. We use the oracle as stated in the Algorithm 1 Linear Programming Binary Search
Input: Constraints C, optimum value cT x,ϵ, interval [−M,M]
Output: Some minmum optimum value that Z that satisfies the constraints C
1: lo = -M, hi = M
2: while hi − lo > 2ϵ do
3: z = (hi - lo) / 2
4: Use oracle to check if z is feasible. If all constraints satisfied - LP solution, otherwise not an LP solution
5: if z is a LP Solution then hi = z
6: else
7: lo = z
8: end if 9: end while
10: return z = (lo + hi) / 2 - z is calculated using the midpoint formula for the upper and lower bounds
problem to determine whether some linear program z satisfies the constraints C defined in the problem. When we call the oracle over the sample space we are looking over a boundary of small slices of ϵ which yields in the overall runtime when we have to check over the interval range with binary search. Runtime
The runtime of the algorithm as given in the problem is