Simple Optimization in 3D

R
Calculus
Author

Peter Amerkhanian

Published

March 9, 2024

I’ve previously blogged about optimization in two dimensional space:

In this post, I’m going to expand into three dimensions, with an overview of the analytical solution to a distance minimization problem.

Code
library(plotly)
library(dplyr)

The Minima Problem

Consider the following question:

Find the point closest to \(P(-3, -5, 0)\) on the surface defined as follows: \[ \begin{align} z^2 &= x^2 + y^2 \\ z = f(x, y) &= \pm \sqrt{x^2 + y^2} \end{align} \tag{1}\](Strang and Herman 2016, chap. 4)

Emphasis is put on closest to denote that this is a keyword suggesting we are looking at an optimization problem. To start approaching this, we will set the equation up in R so that we can inspect a plot and better understand the problem.

f <- function(x, y) sqrt(x^2 + y^2)
n <- 100
x <- seq(-10, 10, length.out = n)
y <- seq(-10, 10, length.out = n)
z <- outer(x, y, f)
Code
scene <- list(title="f(x, y) & P1", 
              camera = list(eye = list(x = -2.5, y = 2, z = .7)),
              xaxis = list(title = "X"),
              yaxis = list(title = "Y"),
              zaxis = list(title = "Z"))
plot_ly(x = x, y = y, z = z, type = "surface") %>%
  style(cmin=-max(z), cmax=max(z)) %>% 
  hide_colorbar() %>% 
  add_trace(x = x, y = y, z = -z, type = "surface", cmin=-max(z), cmax=max(z)) %>% 
  add_trace(x = -3, y = -5, z = 0,
            type = "scatter3d",
            mode = "markers",
            marker = list(size = 5, color = "red"),
            name = '(-3, -5, 0)') %>% 
  layout(scene = scene)

We see that the red dot is our target, and we want to find the point on the surface closest to that point. Given the shape of the surface, it’s clear that there will be two solutions (this is also apparent from the \(\pm\) in Equation 1).

Defining the Objective Function

Returning to the text of the problem, we want to find the point closest to \(P\), on some surface. “Closest to” implies that we will be minimizing a distance. Recall the equation for distance between a point, \(P(x, y, z)\) and some other point \(P_1(x_1, y_1, z_1)\) in \(\mathbb{R}^3\) (Strang and Herman 2016, chap. 2):

\[ \begin{equation} d(P, P_1) = \sqrt{(x-x_1)^2 + (y-y_1)^2 + (z-z_1)^2} \end{equation} \tag{2}\]

We have the point, \(P = (-3, -5, 0)\), which we can plug into Equation 2 to yield the equation for the distance from this specific point to some other location in \(\mathbb{R}^3\):
\[ \begin{align} d((-3, -5, 0), P_1) &= \sqrt{(x+3)^2 + (y+5)^2 + z^2} \\ &= \sqrt{x^2 + 6x + 9 + y^2 + 10y + 25 + z^2} \\ &= \sqrt{x^2 + 6x + y^2 + 10y + z^2 + 34} \end{align} \] Now, given that this function describes the distance between \(P = (-3, -5, 0)\) and any other point in \(\mathbb{R}^3\), it follows that, without constraints, the closest point would just be \(P\)1:

\[ \begin{align} d((-3, -5, 0),(-3, -5, 0)) &= \sqrt{x^2 + 6x + y^2 + 10y + z^2 + 34} \\ &= \sqrt{(-3)^2 + 6(-3) + (-5)^2 + 10(-5) + 0^2 + 34} \\ &= \sqrt{-34 + 34} \\ &= \sqrt{0} = 0 \end{align} \]

However, we were explicitly tasked with finding the point closest to \(P\) on a surface \(z\). We have the equation of our surface, Equation 1, which we can square and then substitute in for \(z^2\). This substitution is in effect the application of a constraint – we optimize for the closest point \(P_1\) possible to our target, \(P\), subject to the constraint that the point must lie on the surface, \(z^2 = x^2 + y^2\).
\[ \begin{align} f^* &= \sqrt{x^2 + 6x + y^2 + 10y + z^2 + 34} \\ &= \sqrt{x^2 + 6x + y^2 + 10y + (x^2 + y^2) + 34} \\ &= \sqrt{2x^2 + 6x + 2y^2 + 10y + 34} \end{align} \] Note that minimizing \(f^*\) is going to be equivalent to minimizing \((f^*)^2\),2 so let’s focus on the more simple equation,

\[ (f^*)^2 = (2x^2 + 6x + 2y^2 + 10y + 34) \]

parameter_space <- function(x, y) {2*x^2 + 6*x + 2*y^2 + 10*y + 34}
z_p <- outer(x, y, parameter_space)
Code
scene <- list(xaxis = list(title = "X"),
              yaxis = list(title = "Y"),
              zaxis = list(title = "(f*)^2,(X, Y)"))
plot_ly(x = x, y = y, z = z_p, type = "surface", colorscale="coolwarm") %>% 
  layout(title="Objective Function, (f*)^2, to Minimize", scene=scene)

Minimizing the Objective Function

To find the point \(x^*, y^*\) that minimizes this parameter surface, we’ll first find the critical points. In the bivariate setting, this implies finding the derivative then setting that equal to 0. In the multivariate setting, we find the gradient vector: \[ \nabla (f^*)^2 = \left< 2 x + 3, 2 y + 5 \right> \] and set that equal to zero: \[ \begin{align*} \nabla (f^*)^2 &= 0 \\ &\begin{cases} 2 x + 3 = 0 \\ 2 y + 5 = 0 \end{cases} \\ &\begin{cases} x = -\frac{3}{2} \\ y = -\frac{5}{2} \end{cases} \end{align*} \]

Thus we have the critical value. We can visually inspect this point and its output to confirm that we have indeed found the minimum of the surface:

Code
scene <- list(title="f(x, y) & P1", 
              camera = list(eye = list(x = -1.25, y = 1.25, z = -1)),
              xaxis = list(title = "X"),
              yaxis = list(title = "Y"),
              zaxis = list(title = "Z"))

plot_ly(x = x, y = y, z = z_p, type = "surface") %>% 
  style(colorscale="coolwarm") %>%
  add_trace(x = -3/2, y = -5/2, z = parameter_space(-3/2, -5/2),
            type = "scatter3d",
            mode = "markers",
            marker = list(size = 5, color = "red"),
            name = '(-3/2, -5/2, 17)') %>% 
  layout(scene = scene)

Answering the Original Minima Question

Now we’ll return to the original surface and substitute this \(x^*,y^*\) into Equation 1 to find the \(z^*\) of this minimizing point:
\[ \begin{align*} z^* = f(x^*, y^*) &= \pm \sqrt{(x^*)^2 + (y^*)^2} \\ &= \pm \sqrt{(-\frac{3}{2})^2 + (-\frac{5}{2})^2} \\ &= \pm \frac{\sqrt{34}}{2} \end{align*} \]

Code
scene <- list(title="f(x, y) & P1", 
              camera = list(eye = list(x = -2, y = -.5, z = .5)),
              xaxis = list(title = "X"),
              yaxis = list(title = "Y"),
              zaxis = list(title = "Z"))
plot_ly(x = x, y = y, z = z, type = "surface") %>%
  style(cmin=-max(z), cmax=max(z)) %>%
  hide_colorbar() %>% 
  add_trace(x = x, y = y, z = -1*z, cmin=-max(z), cmax=max(z)) %>%
  add_trace(x = -3, y = -5, z = 0,
            type = "scatter3d",
            mode = "markers",
            marker = list(size = 5, color = "red"),
            name = '(-3, -5, 0') %>% 
  add_trace(x = -3/2, y = -5/2, z = sqrt(34)/2,
            type = "scatter3d",
            mode = "markers",
            marker = list(size = 5, color = "orange"),
            name = '(-3/2, -5/2, sqrt(34)/2)') %>% 
  add_trace(x = -3/2, y = -5/2, z = -1*sqrt(34)/2,
            type = "scatter3d",
            mode = "markers",
            marker = list(size = 5, color = "orange"),
            name = '(-3/2, -5/2, -sqrt(34)/2)') %>% 
  layout(scene = scene)

Thus on the surface \(z^2 = x^2 + y^2\), the two closest points to \(P=(-2, -5, 0)\) are:
\[ (-\frac{3}{2},-\frac{5}{2}, \frac{\sqrt{34}}{2}) , (-\frac{3}{2},-\frac{5}{2}, -\frac{\sqrt{34}}{2}) \]

References

Strang, Gilbert, and Edwin Herman. 2016. Calculus Volume 3. OpenStax.

Footnotes

  1. 0 is the smallest possible real number output of the square root function.↩︎

  2. This is beyond the scope of this post, but based on the inequality \(0 \leq x \leq y \rightarrow x^2 \leq y^2\).↩︎