
Hello there,
On Fri, 18 Jan 2002, Roberto Bagnara wrote:
Bob McElrath wrote:
In ginsh:
evalf((-1)^(1/3));
0.5+0.86602540378443864673*I
While this answer is technically correct (0.5+0.86602540378443864673*I == exp(I*pi/3) is indeed a cube root of -1, since (exp(I*pi/3))^3 == -1), it's not what I was expecting, and it was the cause of some headache today. I was trying to write a little RootOf-like routine to solve a cubic polynomial, and was getting three complex answers. (At least one of them must be real if you start with a real cubic.)
How should I go about enforcing that (-1)^(1/3) = -1?
I think it would be unreasonable to require that (-1)^(1/3) = -1, since the other solutions are as good as -1. Thus I believe GiNaC does the right thing by leaving the symbolic expression (-1)^(1/3) untouched. On the other hand, a little experiment with ginsh shows that
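The point that the three roots are on an equal footing is easy to see numerically. Here is a short Python sketch (using the standard cmath module, not GiNaC) that enumerates all cube roots of -1:

```python
import cmath

# The three cube roots of -1 are exp(I*(pi + 2*pi*k)/3) for k = 0, 1, 2.
roots = [cmath.exp(1j * (cmath.pi + 2 * cmath.pi * k) / 3) for k in range(3)]

for k, r in enumerate(roots):
    print(k, r)

# Each one cubes back to -1.  k = 0 is the principal value
# 0.5 + 0.866...*I that ginsh prints, and k = 1 is the real root -1.
```

Picking -1 over the other two is a convention, not mathematics; any of the three is an equally valid value for (-1)^(1/3).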
1^(1/2);
1
so here GiNaC does choose one root. I believe it does so on the grounds that even roots of non-negative real numbers are usually defined that way. However, still in ginsh, we also get
(-1)^(1/2);
I
so it seems this tradition is pushed a little too far. I think these issues should be clarified with the GiNaC team.
Perhaps you are criticizing the behavior of evalf() when applied to (-1)^(1/3), and I agree with you. Since odd roots of real numbers always have a real solution, why not define evalf() so that it does "the Right Thing"? The question is perhaps: how much would it cost? GiNaC people?
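As a sketch of what is being proposed here (illustrative Python, not GiNaC's actual evalf()), a numeric evaluator could special-case odd roots of negative reals:

```python
import cmath

def evalf_root(x: float, n: int) -> complex:
    """Hypothetical numeric n-th root: the real branch for odd roots of
    negative reals ("the Right Thing"), the principal branch otherwise.
    This is NOT GiNaC's API, just an illustration of the proposal."""
    if x < 0 and n % 2 == 1:
        return complex(-(abs(x) ** (1.0 / n)))  # real root: (-1)^(1/3) -> -1
    return cmath.exp(cmath.log(x) / n)          # principal value via the log

print(evalf_root(-1.0, 3))   # (-1+0j)
print(evalf_root(-1.0, 2))   # approximately 1j
```

The cost is exactly the inconsistency discussed below: the result would no longer agree with exp(log(x)/n) on the principal branch.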
There are obviously a lot of choices that can be made in this field and they were the cause of much agony. The question is, what should be the guidelines for these choices? Instead of choosing `visual simplicity', which could be interpreted as a vote for (-1)^(1/3) => -1, we chose to accept `branch cut consistency', which is more rigorously definable.
The roots (even or odd) of negative numbers are always the first ones encountered counting counterclockwise from the positive real axis.
Why is this? There are several guidelines. The usual convention is to define the n-th root of x == r*exp(I*p), with r and p real, as follows:
x^(1/n) == exp(log(x)/n) == exp((log(r) + I*p)/n) == exp(log(r)/n) * exp(I*p/n)
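This chain of identities can be checked numerically. The following Python sketch (with cmath standing in for GiNaC's numeric evaluator) verifies it for x = -1, n = 3:

```python
import cmath

x, n = complex(-1.0), 3
r, p = abs(x), cmath.phase(x)              # x == r*exp(I*p), r and p real

via_log   = cmath.exp(cmath.log(x) / n)    # exp(log(x)/n)
via_polar = cmath.exp(cmath.log(r) / n) * cmath.exp(1j * p / n)

print(via_log)     # 0.5 + 0.866...*I, the value ginsh printed
print(via_polar)   # the same number, via the factored form
```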
The definition of the principal branch of the log function is such that the branch cut runs along the negative real axis, continuous with quadrant II, i.e. log(-1) == +I*Pi. Since the exp function does not feature a branch cut, this defines the branch cut of all the roots as well.
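Python's cmath module follows the same convention, so it can serve as a quick illustration of the cut (a spot-check, not GiNaC itself):

```python
import cmath

# The cut of the principal log runs along the negative real axis and is
# continuous with quadrant II, so log(-1) == +I*pi, not -I*pi.
print(cmath.log(-1))                 # 3.141592653589793j

# exp has no cut, so the roots inherit the choice made for log:
print(cmath.exp(cmath.log(-1) / 2))  # approximately 1j, i.e. (-1)^(1/2)
```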
This definition is arbitrary from a mathematical point of view. However, for practical purposes we need to fix the branch cuts somehow, and we did not come up with our own definitions. For instance, the Common Lisp standard mandates them. I can warmly recommend Guy L. Steele's book "Common Lisp, the Language, 2nd Edition", which we may think of as the de-facto standard for Lisp. Section 12.5 specifies these things and is an excellent treatment. Just search Google for it; there are numerous copies on the web. (But skip the parts dealing with signed zeros, because we don't have those.)
Why, one could ask next, should we bother about Lisp? Because they did dare to specify these things and people have been following their recommendation. I checked a lot of critical cases and found that both MapleV and Mathematica follow the same convention with respect to branch cuts. Also, we studied the ISO C++ standard, and while that standard unfortunately leaves some room for interpretation, there is hope in sight: the revised C standard (C99) specifies complex numbers, and its choice of all branch cuts is exactly identical to the one in Common Lisp. (One just has to read carefully and find the amusing section 7.3.3.2 saying "implementations shall map a cut so the function is continuous as the cut is approached coming around the finite endpoint of the cut in a counter clockwise direction.") The above consistency between Lisp, C, Maple and Mathematica (and GiNaC) also holds for all inverse hyperbolic and inverse trigonometric functions like atan, asinh, atanh, etc.
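For what it's worth, Python's cmath library also implements the C99 choices, so the consistency is easy to spot-check for the inverse functions (again just an illustration, not GiNaC):

```python
import cmath, math

# atanh has branch cuts along the real axis for |x| > 1.  At 2 + 0*I the
# principal value has real part 0.5*log(3); the imaginary part sits at
# +/- pi/2 depending on which side of the cut the (signed) zero puts us.
z = cmath.atanh(2 + 0j)
print(z)

# sqrt's cut lies along the negative real axis, like log's:
print(cmath.sqrt(-1))   # 1j
```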
Unfortunately, other Computer Algebra Systems (notably MuPAD and Reduce) chose their own, mutually incompatible, definitions for branch cuts.
If somebody finds a discrepancy, then we should fix it -- I am not guaranteeing that we got every case correct (but we've tried quite hard). I am only saying that there are some external constraints and thinking that the choice is entirely ours would be foolish.
Well, I hope this explains your observation regarding evalf().
Oh, and (-1)^(1/2) gets eval'ed while (-1)^(1/3) does not because the first one has an exact solution (I) while the latter does not. That is all there is behind it.
Hope this helps -richy.