How can I distinguish a genuine solution of polynomial equations from a numerical near miss?
How can I distinguish a genuine solution of polynomial equations from a numerical near miss?
Cross-posted from MSE, where this question was asked over a year ago with no answers.
Suppose I have a large system of polynomial equations in a large number of real-valued variables.
beginalign
f_1(x_1, x_2, &dots, x_m) = 0 \
f_2(x_1, x_2, &dots, x_m) = 0 \
&vdots \
f_n(x_1, x_2, &dots, x_m) = 0 \
endalign
(In my particular case, I have about $n approx 1000$ equations of degree $10$ in about $m approx 200$ variables.) By numerical means, I've found an approximate solution vector $(tildex_1, tildex_2, dots, tildex_m)$ at which the value of each $f_j$ is very small:
$$lvert f_j(tildex_1, tildex_2, dots, tildex_m) rvert < 10^-16 quad forall j = 1, dots, n.$$
This leads me to believe that a genuine solution of my system exists somewhere in a small neighborhood of $(tildex_1, tildex_2, dots, tildex_m)$, and that the small residuals I see are due to round-off error in finite (IEEE double) precision arithmetic. However, it could conceivably be the case that the zero loci of my polynomials $f_j$ come very close to each other (within $10^-16$) but do not mutually intersect. How can I rigorously tell which is the case?
I could, of course, further refine my solution using quadruple- or extended-precision arithmetic to push the residuals even closer to zero, but this would only provide supporting empirical evidence.
If it helps, all of my polynomials have integer coefficients and can be evaluated exactly on integer and rational inputs. However, my approximate solution $(tildex_1, tildex_2, dots, tildex_m)$ is probably irrational.
In principle, there are methods of computational algebraic geometry (Groebner bases, cylindrical decomposition, etc.) that can algorithmically decide the existence of a true mathematical solution to a polynomial system, but my system is completely out of reach of all such algorithms I know. Buchberger's algorithm, for example, has doubly-exponential time complexity in the number of input variables.
Note that interval/ball arithmetic won't help, because even if I can show that each $f_j$ exactly assumes the value $0$ in a small neighborhood of $(tildex_1, tildex_2, dots, tildex_m)$, it could be the case that a different point zeroes out each $f_j$, and no single point simultaneously zeroes out all of them.
Is there an answer for linear systems (i.e., the $f_j$ are affine functions) that doesn't reduce to solving the system?
– RBega2
Sep 2 at 18:23
You could possibly apply numerical methods that some well-chosen 100 of the 1000 equations have a unique solution in some box. Then you could reduce the problem to exactly verifying that 101 equations have a joint solution, and that it lies in the box, 900 times. If the double-exponential in the number of variables is the main problem then this won't help much.
– Will Sawin
Sep 8 at 23:04
There are also all-solution homotopy methods for polynomial equations. Given that you have a very overdetermined system of equations, it might be worth trying an exact method such as homotopy, or Grobner bases, and see if you get lucky.
– arsmath
Sep 8 at 23:46
Does Smale's alpha-theory (see @Kirill's comment above) have anything to do with logicians' $alpha$-theories, or is it just a coincidence?
– Qfwfq
Sep 9 at 11:30
1 Answer
1
Interval/ball arithmetic may help, actually.
It can be used to prove existence of solutions to multivariate systems like this one. The main idea is: reformulate your system as a fixed-point system $x = g(x)$, $g: mathbbR^n to mathbbR^n$. If you manage to find a certain product of intervals (hypercube) $X = [a_1,b_1] times [a_2,b_2] times dots times [a_n, b_n]$ such that $g(X) subseteq X$ (which you can check using interval arithmetic: interval arithmetic computes an enclosure $g(X) subseteq tildeg(X)$, so if you have $tildeg(X) subseteq X$ you have won), then by the Brouwer fixed-point theorem your system has an exact solution inside $X$.
There are tricks to get a 'good' map $g$ such that the inclusion can be verified; for instance, centering on an approximate computed solution and multiplying by an approximated inverse of the Jacobian. Look for the interval Newton method, or the Krawczyk method. Rump's paper on Acta Numerica is a good introduction to these topics.
(EDIT: As the comments below mention, OP's problem is more general than this one though, since the system is overdetermined.)
Interval arithmetic does not mean "replace double x
with interval<double> x
and run your algorithm again"; this is a common misconception. Most of the times it works best as an a posteriori verification method, starting from an approximate solution that has been computed with a completely different (numerical, unverified) method.
double x
interval<double> x
There are tricky and ill-conditioned cases where many digits of precision will be needed in your arithmetic (think $(x-1)^2+varepsilon=0$), but the main idea works well, especially for simple solutions. (Rump's review paper contains methods to find double solutions as well, essentially augmenting the system with its Jacobian.)
With an overdetermined system like the one OP asks about, one could convert multiple equations to a single one with $f(x)=g(x)=0 iff f(x)^2+g(x)^2=0$ (since the question is about real solutions), but this gives a problem with multiple solutions, so it's not simply a matter of applying that fixed-point method.
The problem is, sometimes you can't reformulate the problem this way. If there are 1000 equations with 200 variables and this is it, then interval arithmetic is useless. (As well as anything else, I guess.)
– Alex Gavrilov
Sep 2 at 7:39
That's a really clever trick! But how do I cast a severely overdetermined system of equations as a fixed-point equation? I only know how to do it for exactly determined systems of equations, where you send $f(x) = 0$ to (an appropriately transformed version of) $g(x) := f(x) + x$.
– David Zhang
Sep 2 at 7:43
You raise a good point. My first idea would be to convert two (or more) equations to one with $f(x)=0,g(x)=0 iff f(x)^2+g(x)^2=0$. This should work at least on paper, I guess. But then of course we are close to a double solution and that's exactly where the method may find trouble.
– Federico Poloni
Sep 2 at 7:53
(Note that in Rump's paper that I mentioned in the answer there are methods tailored to find double and multiple solutions as well).
– Federico Poloni
Sep 2 at 8:09
@FedericoPoloni That's not going to work, as a small deformation of the system will have no solutions, so interval arithmetic / topological considerations won't detect the solutions.
– Will Sawin
Sep 8 at 22:34
Thanks for contributing an answer to MathOverflow!
But avoid …
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
But avoid …
To learn more, see our tips on writing great answers.
Required, but never shown
Required, but never shown
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Do you know about Smale's alpha-theory? (Like this: arxiv.org/abs/1011.1091)
– Kirill
Sep 2 at 10:53