Monday, February 23, 2015

Partitioning with Binary Variables

Armed with the contents of my last two posts ("The Geometry of a Linear Program", "Branching Partitions the Feasible Region"), I think I'm ready to get to the question that motivated all this. Let me quickly enumerate a few key take-aways from those posts:
  • The branch and bound method for solving an integer linear program or mixed integer linear program uses a search tree.
  • At each node of the search tree, a linear program (LP) that relaxes some or all of the integrality restrictions (while adding some new restrictions based on earlier branching) is solved. (As an aside, this hold for mixed integer quadratic programs, with the change that the node relaxation will likely be a quadratic program rather than a linear program.)
  • LPs have closed and convex feasible regions.
  • The act of branching (separating a node into two child nodes) partitions the feasible region for the problem at the parent node into two smaller feasible regions.
Let's move on now. A common use for binary decision variables -- I suspect the most common use -- is to partition the feasible region of a problem into two or more disjoint subregions, corresponding to mutually exclusive scenarios. Consider the example in Figure 1, a mixed integer linear program (MILP) in which $x$ is integer-valued and $y$ is continuous. The green polytope is the LP relaxation; the vertical bars are the actual feasible region of the MILP.

Figure 1: Feasible region
Now suppose that there is something going on in the problem that depends on whether $x$ is 3 or less, 4-6, or 7 or more. For instance, if $x$ represents the number of workers deployed on some project, we might need a supervisor for every three workers. We need to partition into those three cases.

A general approach for partitioning a feasible region is to introduce one binary variable for each subregion/scenario, and constrain those variables to add up to 1 (so that exactly one scenario is selected). In our example, we add $z_i \in \{0,1\}$ for $i \in \{1,2,3\}$ along with the constraints\begin{eqnarray*} z_{1}+z_{2}+z_{3} & = & 1\\ x & \le & 3z_{1}+9(1-z_{1})\\ x & \le & 6z_{2}+9(1-z_{2})\\ x & \ge & 4z_{2}\\ x & \ge & 7z_{3}. \end{eqnarray*}Left to the reader as an exercise: confirm that if $z_1=1$ then $0\le x \le 3$, if $z_2=1$ then $4\le x\le 6$, and if $z_3=1$ then $7\le x\le 9$.

As a side note, before proceeding further, suppose that what we really want is to ensure that either $x\le 3$ or $x\ge 7$. (Don't ask why; just play along.) We do that the same way I just described, but then also constrain $z_2 = 0$. So the technique for partitioning a feasible region also works for omitting a chunk of a feasible region.

One limitation of this approach is that the subregions must be definable using linear constraints, assuming that you want your problem to remain a MILP. Another (and this is really what motivated me to do this post) is that the partitioning really takes place during the branching process. The algebra splits our feasible region into three disjoint chunks so long as $z_1,z_2,z_3$ are integer, but when we are solving node relaxations some or all of the $z$ variables will be relaxed to continuous.

Let's stick with our three-way split, while also fixing $z_2=0$, so that we force either $x\le 3$ or $x\ge 7$. The conceptual feasible region looks like Figure 2. Again, the actual feasible region for the MILP consists of two clusters of vertical line segments; the shaded polytopes are the relaxations of each scenario.

Figure 2: Two scenarios


At some node in the search tree, $z_1$ will be fixed at 1 (and $z_2$ and $z_3$ at 0), and the feasible region of the node relaxation will be the polytope on the left side of Figure 2. At some other node, $z_3$ will be fixed at 1 ($z_1$ and $z_2$ at 0), and the feasible region of the node relaxation will be the polytope on the right side of Figure 2.

What about before any of the $z$ variables are fixed (for instance, at the root node)? Since $z_1$ and $z_3$ can take fractional values, the relaxation at the root node (shown in Figure 3) is the same as what it was before we introduced the $z$ variables.

Figure 3: Root node relaxation
To see this, consider what happens if we let $(z_1, z_2, z_3) = (0.5, 0, 0.5)$. The constraints on $x$ become\begin{eqnarray*} x & \le & 3(0.5)+9(1-0.5) = 6\\ x & \le & 6(0)+9(1-0) = 9\\ x & \ge & 4(0) = 0\\ x & \ge & 7(0.5) = 3.5. \end{eqnarray*}Together with the original constraints that defined the sides of the polytope in Figure 1, this gives us the portion of the excluded region where $3.5 \le x \le 6$ (which is most of the pink region). For $x$ between 3 and 3.5 or between 6 and 7, we just need to adjust the $z_1, z_3$ split.

The key takeaway here is that the split we see in Figure 2 actually takes effect during branching, not at the root node (and not at nodes below the root but above the point where the solver branches on one of the $z$ variables). The approach of using binary variables to partition feasible regions is valid because the final solution will obey the integrality restrictions (and thus will belong to one of the disjoint scenarios). Along the road to the final solution, though, we cannot count on the partitioning holding true (and, in the case of excluded regions, we cannot count on the unwanted portion actually be excluded yet).