next up previous
Next: Dynamic Load Balancing and Up: Reliable High Performance Computing Previous: Branch-and-Bound

Interval Analysis

Of particular interest here are BP and BB schemes based on interval analysis. A real interval $Z$ is defined as the set of real numbers lying between (and including) given upper and lower bounds; i.e., $Z=[z^{\mathrm{L}},z^{\mathrm{U}}]=\{z \in \Re
\mid z^{\mathrm{L}} \le z \le z^{\mathrm{U}} \}$. A real interval vector $\mathbf{Z}=(Z_1,Z_2,\ldots,Z_n)^{\mathrm{T}}$ has $n$ real interval components and can be interpreted geometrically as an $n$-dimensional rectangle (box). Note that in this section lower case quantities are real numbers and upper case quantities are intervals. Several good introductions to interval analysis are available (e.g., [57,58,59]). In this section, interval analysis is described in the context of solving nonlinear parameter estimation problems, since that is the primary example used in the tests discussed later. However, it should be emphasized that the interval methods discussed here are general-purpose and can be used in connection with other objective functions in a global optimization problem and other equation systems in an equation solving problem.

BP and BB techniques can be constructed using the interval-Newton technique. Given a nonlinear equation system with a finite number of real roots in some initial interval, this technique provides the capability to find (or, more precisely, narrowly enclose) all the roots of the system within the given initial interval. For the unconstrained minimization of an objective function (or estimator) $\phi(\mbox{\boldmath${\theta}$})$ in parameter estimation, a common approach is to use the gradient of $\phi(\mbox{\boldmath${\theta}$})$ and seek a solution of $\mathbf{g}(\mbox{\boldmath${\theta}$}) \equiv \nabla \phi (\mbox{\boldmath${\theta}$}) = \mathbf{0} $ in order to determine the optimal parameter values $\mbox{\boldmath${\theta}$}$. The global minimum will be a root of this nonlinear equation system, but there may be many other roots as well, representing local minima and maxima and saddle points. Thus, for this approach to be reliable, the capability to find all the roots of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ is needed, and this is provided by the interval-Newton technique. In practice, by using an objective range test, as discussed below, the interval-Newton procedure can also be implemented as a BB technique, so that roots of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ that cannot be the global minimum need not be found. The solution algorithm is applied to a sequence of intervals, beginning with some initial interval $\mathbf{\Theta}^{(0)}$ specified by the user. This initial interval can be chosen to be sufficiently large to enclose all physically feasible values. It is assumed here that the global optimum will occur at an interior stationary minimum of $\phi(\mbox{\boldmath${\theta}$})$ and not at the boundaries of $\mathbf{\Theta}^{(0)}$. Since the estimator $\phi(\mbox{\boldmath${\theta}$})$ is derived based on a product of Gaussian distribution functions corresponding to each data point, only a stationary global minimum is reasonable for statistical regression problems such as considered here.

For an interval $\mathbf{\Theta}^{(k)}$ in the sequence, the first step in the solution algorithm is the function range test. Here an interval extension $\mathbf{G}(\mathbf{\Theta}^{(k)})$ of the function $\mathbf{g}(\mbox{\boldmath${\theta}$})$ is calculated. An interval extension provides upper and lower bounds on the range of values that a function may have in a given interval. It is often computed by substituting the given interval into the function and then evaluating the function using interval arithmetic. The interval extension so determined is often wider than the actual range of function values, but it always includes the actual range. If there is any component of the interval extension $\mathbf{G}(\mathbf{\Theta}^{(k)})$ that does not contain zero, then we may discard (prune) the current interval (node) $\mathbf{\Theta}^{(k)}$, since the range of the function does not include zero anywhere in this interval, and thus no solution of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ exists in this interval. We may then proceed to consider the next interval in the sequence, since the current interval cannot contain a stationary point of $\phi(\mbox{\boldmath${\theta}$})$. Otherwise, if $\mathbf{0} \in \mathbf{G}(\mathbf{\Theta}^{(k)})$, then testing of $\mathbf{\Theta}^{(k)}$ continues.

The next step is the objective range test. The interval extension $\Phi(\mathbf{\Theta}^{(k)})$, which contains the range of $\phi(\mbox{\boldmath${\theta}$})$ over $\mathbf{\Theta}^{(k)}$, is computed. If the lower bound of $\Phi(\mathbf{\Theta}^{(k)})$ is greater than a known upper bound on the global minimum of $\phi(\mbox{\boldmath${\theta}$})$, then $\mathbf{\Theta}^{(k)}$ cannot contain the global minimum and need not be further tested. Otherwise, testing of $\mathbf{\Theta}^{(k)}$ continues. The upper bound on the objective function used for comparison in this step can be determined and updated in a number of different ways. Here we use point evaluations of $\phi(\mbox{\boldmath${\theta}$})$ done at the midpoint of previously tested $\mathbf{\Theta}$ intervals that may contain stationary points. Using the objective range test yields a BB procedure for the global minimization of $\phi(\mbox{\boldmath${\theta}$})$, while if this step is skipped, we will have a BP technique for finding all solutions of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$, i.e., all stationary points of $\phi(\mbox{\boldmath${\theta}$})$.

The next step is the interval-Newton test. Here the linear interval equation system

\begin{displaymath}
G'(\mathbf{\Theta}^{(k)})(\mathbf{N}^{(k)}-\mbox{\boldmath${\theta}$}^{(k)})=-\mathbf{g}(\mbox{\boldmath${\theta}$}^{(k)})
\end{displaymath}

is set up and solved for a new interval $\mathbf{N}^{(k)}$, where $ G'(\mathbf{\Theta}^{(k)}) $ is an interval extension of the Jacobian of $\mathbf{g}(\mbox{\boldmath${\theta}$})$, i.e., the Hessian of $\phi(\mbox{\boldmath${\theta}$})$, over the current interval $\mathbf{\Theta}^{(k)}$, and $\mbox{\boldmath${\theta}$}^{(k)}$ is a point in the interior of $\mathbf{\Theta}^{(k)}$, usually taken to be the midpoint. It has been shown (e.g., [57,58,59]) that any root $\mbox{\boldmath${\theta}$}^* \in \mathbf{\Theta}^{(k)}$ of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ is also contained in the image $\mathbf{N}^{(k)}$, implying that if there is no intersection between $\mathbf{\Theta}^{(k)}$ and $\mathbf{N}^{(k)}$ then no root exists in $\mathbf{\Theta}^{(k)}$, and suggesting the iteration scheme $\mathbf{\Theta}^{(k+1)} = \mathbf{\Theta}^{(k)} \cap \mathbf{N}^{(k)}$. In addition to this iteration step, which can be used to tightly enclose a solution, it has been proven (e.g., [57,58,59]) that if $\mathbf{N}^{(k)}$ is contained completely within $\mathbf{\Theta}^{(k)}$, then there is one and only one root contained within the current interval $\mathbf{\Theta}^{(k)}$. This property is quite powerful, as it provides a mathematical guarantee of the existence and uniqueness of a root within an interval when it is satisfied.

There are thus three possible outcomes to the interval-Newton test, as shown schematically for a two variable problem in Figs. 1-3. The first possible outcome  (Fig. 1) is that $\mathbf{N}^{(k)} \subset \mathbf{\Theta}^{(k)}$. This represents mathematical proof that there exists a unique solution to $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ within the current interval $\mathbf{\Theta}^{(k)}$, and that that solution also lies within the image $\mathbf{N}^{(k)}$. This solution can be rigorously enclosed, with quadratic convergence, by applying the interval-Newton step to the image and repeating a small number of times. Alternatively, convergence to a point approximation of the solution can be guaranteed using a routine point-Newton method starting from anywhere inside of the current interval. Since a unique solution has been identified for this subproblem, it can be pruned, and the next interval in the sequence can now be tested, beginning with the function range test.

Figure 1: The computed image $\mathbf{N}^{(k)}$ is a subset of the current interval $\mathbf{\Theta}^{(k)}$. This is mathematical proof that there is a unique solution of the equation system in the current interval, and furthermore that this unique solution is also in the image.
\begin{figure}
\centerline{\psfig{figure=figure1.eps,height=6.2cm}}
\end{figure}

The second possible outcome  (Fig. 2) is that $\mathbf{N}^{(k)} \cap \mathbf{\Theta}^{(k)} = \emptyset$. This provides mathematical proof that no solutions of $\mathbf{g}(\mbox{\boldmath${\theta}$}) = \mathbf{0}$ exist within the current interval. Thus, the current interval can be pruned and testing of next interval can begin.

Figure 2: The computed image $\mathbf{N}^{(k)}$ has a null intersection with the current interval $\mathbf{\Theta}^{(k)}$. This is mathematical proof that there is no solution of the equation system in the current interval.
\begin{figure}
\centerline{\psfig{figure=figure2.eps,height=6.2cm}}
\end{figure}

The final possible outcome  (Fig. 3) is that the image $\mathbf{N}^{(k)}$ lies partially within the current interval $\mathbf{\Theta}^{(k)}$. In this case, no conclusions can be made about the number of solutions in the current interval. However, it is known that any solutions that do exist must lie in the intersection $\mathbf{\Theta}^{(k)} \cap\mathbf{N}^{(k)}$. If the intersection is sufficiently smaller than the current interval, one can proceed by reapplying the interval Newton test to the intersection. Otherwise, the intersection is bisected, and the resulting two intervals added to the sequence of intervals to be tested. This approach is referred to as an interval-Newton/generalized-bisection (IN/GB) method, and depending on whether or not the objective range test is employed, can be interpreted as either a BB or BP procedure.

Figure 3: The computed image $\mathbf{N}^{(k)}$ has a nonnull intersection with the current interval $\mathbf{\Theta}^{(k)}$. Any solutions of the equation system must lie in the intersection of the image and the current interval.
\begin{figure}
\centerline{\psfig{figure=figure3.eps,height=6.2cm}}
\end{figure}

It should be emphasized that, when machine computations with interval arithmetic operations are done, the endpoints of an interval are computed with a directed outward rounding. That is, the lower endpoint is rounded down to the next machine-representable number and the upper endpoint is rounded up to the next machine-representable number. In this way, through the use of interval, as opposed to floating point arithmetic, any potential rounding error problems are eliminated, yielding an approach that can provide a computational, not just mathematical, guarantee of reliability. Overall, when properly implemented, the IN/GB method described above provides a procedure that is mathematically and computationally guaranteed to find the global minimum of $\phi(\mbox{\boldmath${\theta}$})$, or, if desired, to enclose all of its stationary points (within, of course, the specified initial parameter interval $\mathbf{\Theta}^{(0)}$).

In implementing an IN/GB algorithm, there are opportunities for the use of parallel computing at multiple levels. On a fine-grained level, the basic interval arithmetic operations can be parallelized (e.g., [60]). On a larger-grained level, the solution of the linear interval equation system for the image can be parallelized (e.g., [61,62,63,64]). Of course, on a coarse-grained level, each independent subproblem generated in the bisection process can be tested in parallel (e.g., [1,6,8,9]). It is only this coarsest level of parallelism that will be considered here.


next up previous
Next: Dynamic Load Balancing and Up: Reliable High Performance Computing Previous: Branch-and-Bound
ChaoYang Gau 2001-03-12