Friday, September 1, 2017

Minimizing a Median

\( \def\xorder#1{x_{\left(#1\right)}} \def\xset{\mathbb{X}} \def\xvec{\mathbf{x}} \)A somewhat odd (to me) question was asked on a forum recently. Assume that you have continuous variables $x_{1},\dots,x_{N}$ that are subject to some constraints. For simplicity, I'll just write $\xvec=(x_{1},\dots,x_{N})\in\xset$. I'm going to assume that $\xset$ is compact, and so in particular the $x_{i}$ are bounded. The questioner wanted to know how to model the problem of minimizing the median of the values $x_{1},\dots,x_{N}$. I don't know why that was the goal, but the formulation is mildly interesting and fairly straightforward, with one wrinkle.

The wrinkle has to do with whether $N$ is odd or even. Suppose that we sort the components of some solution $\xvec$, resulting in what is sometimes called the "order statistic": $\xorder 1\le\xorder 2\le\dots\xorder N$. For odd $N$, the median is $$\xorder{\frac{N+1}{2}}.$$For even $N$, it is usually defined as $$\left(\xorder{\frac{N}{2}}+\xorder{\frac{N}{2}+1}\right)/2.$$

The odd case is easier, so we'll start there. Introduce $N$ new binary variables $z_{1},\dots,z_{N}$ and a new continuous variable $y$, which represents the median $x$ value. The objective will be to minimize $y$. In addition to the constraint $\xvec\in\xset$, we use "big-M" constraints to force $y$ to be at least as large as half the sample (rounding "half" up). Those constraints are: \begin{align*} y & \ge x_{i}-M_{i}z_{i},\ i=1,\dots,N\\ \sum_{i=1}^{N}z_{i} & =\frac{N-1}{2} \end{align*} with the $M_{i}$ sufficiently large positive constants. The last constraint forces $z_{i}=0$ for exactly $\frac{N+1}{2}$ of the indices $i$, which in turn forces $y\ge x_{i}$ for $\frac{N+1}{2}$ of the $x_{i}$. Since the objective minimizes $y$, $z_{i}$ will be 0 for the $\frac{N+1}{2}$ smallest of the $x_{i}$ and $y$ will be no larger than the smallest of them. In other words, we are guaranteed that in the optimal solution $y=\xorder{\frac{N+1}{2}}$, i.e., it is the median of the optimal $\xvec$.

If $N$ is even and if we are going to use the standard definition of median, we need twice as many added variables (or at least that's the formulation that comes to mind). In addition to the $z_{i}$, let $w_{1},\dots,w_{N}$ also be binary variables, and replace $y$ with a pair of continuous variables $y_{1}$, $y_{2}$. The objective becomes minimization of their average $(y_{1}+y_{2})/2$ subject to the constraints \begin{align*} y_{1} & \ge x_{i}-M_{i}z_{i}\ \forall i\\ y_{2} & \ge x_{i}-M_{i}w_{i}\ \forall i\\ \sum_{i=1}^{N}z_{i} & =\frac{N}{2}-1\\ \sum_{i=1}^{N}w_{i} & =\frac{N}{2} \end{align*} where the $M_{i}$ are as before. The constraints force $y_{1}$ to be at least as large as $\frac{N}{2}+1$ of the $x_{i}$ and $y_{2}$ to be at least as large as $\frac{N}{2}$ of them. The minimum objective value will occur when $y_{1}=\xorder{\frac{N}{2}+1}$ and $y_{2}=\xorder{\frac{N}{2}}$.

2 comments:

  1. In the big leagues, median is not unique with even number of data points. If the problem is to minimize "a" median, i.e., find the smallest median, not "the" (what you call standard definition of) median, I think for N even you could minimize y_2, and forget about y_1 ... or something to that effect.

    ReplyDelete
    Replies
    1. I agree that, with a "liberal" (nonpolitical sense) interpretation of median you could just minimize $y_2$, and with a "conservative" interpretation you might want to minimize $y_1$. I'm pretty sure that statisticians view the definition of the sample median of an even number of observations to be the mean of the two middle ones, and I suspect they consider themselves the "big leagues" when it comes to stats. ("Data scientists" might disagree. Statisticians would view their disagreement as an outlier.)

      Delete

Due to intermittent spamming, comments are being moderated. If this is your first time commenting on the blog, please read the Ground Rules for Comments. In particular, if you want to ask an operations research-related question not relevant to this post, consider asking it on Operations Research Stack Exchange.