Inverse Problem (Part 1)

The process of calculating the causal factors from an observation is called inverse problem. An inverse problem is much harder to solve than the corresponding forward counterpart, which is calculating the observation from the causal factors.

Many problems in science and math are inverse problems. They can be found in optics, radar, acoustics, communication theory, signal processing, medical imaging, computer vision, geophysics, oceanography, astronomy, remote sensing, natural language processing, machine learning, nondestructive testing, and many other fields.
Two concrete examples are image deblurring and computer vision. Computer vision is the inverse problem of computer graphics and image deblurring is the inverse problem of image convolution. I have already covert the topic of image deblurring in previous posts (part 1 and part 2).
Inverse problems are hard, because

  • they are often underdetermined. That means they have no unique solution.
  • There exist no direct inversion of the forward problem,
  • and they are Ill-posed, so small observation errors cause high inversion errors.

Inverse problems are often solved iteratively. For example for a given y, we could try out different x and evaluate f(x). If y is equal to f(x), we know x is a valid solution. This technique might take very long (infinitely long if x is continuous) to find the correct result. A better procedure is to describe this problem as an optimization problem and solve it with gradient descent (or other gradient based algorithms). For that we have to define a loss function:


Minimizing this loss function gives us the right result.


If f(x) is differentiable and the loss function is convex, this problem can be solved with gradient descent, by iteratively calculating the next best solution with:

    \[x_{n+1}=x_n-\alpha \frac{dL}{dx}\]

Lets define a small inverse problem, with the function


We search for f^{-1}(0). This example is not particularly hard, since you could solve it analytically. However it is a good example that shows the optimization procedure.

In [1]:
%matplotlib inline

import numpy as np
from sympy import symbols, lambdify
import matplotlib.pylab as plt

L, f, x, y = symbols('L f x y')

f = 2*x - 4
L = (y - f)**2

print('L=' + str(L))

L_func = lambdify(x, L.subs({y: 0.0}), 'numpy')
gt_x = np.arange(-4, 8, 0.01)
gt_loss = L_func(gt_x)
plt.plot(gt_x, gt_loss)
L=(-2*x + y + 4)**2

Solving this optimization problem with gradient descent requires differentiating the loss function. It would be possible to solve many optimization problems without a gradient, using evolutionary algorithms or estimating the gradient in a different way. If gradients are available you should use them, because they significantly simplify and accelerate the problem.

In [2]:
from sympy import diff
dL = diff(L, x)
print('dL: ' + str(dL))
dL: 8*x - 4*y - 16

Solving our exemplary optimization problem with gradient descent and properly selected \alpha, leads to the correct result 1.97\approx2=f^{-1}(0). For this example we can verify the result analytically:

    \[f^{-1}(y)=\frac{y+4}{2}\\ f^{-1}(0)=2\]

In [3]:
alpha = 0.05
current_x_opt = -3.0
current_loss = L.evalf(subs={y: 0.0, x: current_x_opt})
x_opt = [current_x_opt,]
loss = [current_loss,]

for i in range(10):
    current_x_opt = current_x_opt - alpha * dL.evalf(subs={y: 0.0, x: current_x_opt})
    current_loss = L.evalf(subs={y: 0.0, x: current_x_opt})

<matplotlib.text.Text at 0x1cc39b9cb38>
In [4]:
plt.plot(gt_x, gt_loss)
plt.plot(x_opt, loss, 'o-', color='red')
One Comment