Python Automatic vs Symbolic differentiation

  • Thread starter m4r35n357
Summary
Deathmatch
I thought I would give you guys some justification for why I keep pestering you with this odd numerical technique. I've mentioned its advantages over finite differences quite a few times, so now I am putting it up against symbolic differentiation.

Here is a plot of the function ##x^2 / \ln(\cosh(x) + 1)##, together with its first six derivatives, computed using automatic differentiation (1001 ##x## data points in total). The computations use 236-bit floating point precision, courtesy of gmpy2 (MPFR) arbitrary-precision arithmetic.
[Attached plot: the function and its first six derivatives]

Here is an illustration of the CPU time involved in generating the data:
Code:
$ time -p ./models.py 0 -8 8 1001 7 1e-12 1e-12 > /tmp/data 2>/dev/null
real 0.30
user 0.29
sys 0.00
Here is an interactive session evaluating the same function at a single point, ##x = 2##, together with its first twelve derivatives:
Code:
$ ipython3
Python 3.7.1 (default, Oct 22 2018, 11:21:55)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from ad import *                                                                                                    
ad module loaded

In [2]: x = Series.get(13, 2).var                                                                                            

In [3]: print(~(x * x / (x.cosh + 1).ln))                                                                                    
+2.562938002e+00 +1.312276410e+00 -3.440924889e-01 +2.366685634e-01 +2.260914151e-01 -1.617592557e+00 +5.097273067e+00 -1.157769456e+01 +1.290979888e+01 +5.759642561e+01 -5.363560778e+02 +2.667773442e+03 -8.844486444e+03
Finally, for comparison, here is something to cut & paste into Wolfram Alpha (symbolic computation):
Code:
d^6/dx^6 x^2 / ln(cosh(x) + 1) where x = 2
I couldn't get it to do the twelfth derivative, and even for the sixth it will not attempt to evaluate the value (not unless I register anyway!).
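
For readers without the `ad` module, the same kind of calculation can be sketched in plain Python. This is a minimal illustration of Taylor-coefficient arithmetic in double precision, not the module's actual implementation; all function names below (`variable`, `constant`, `add`, `mul`, `div`, `ln`, `cosh`, `derivatives`) are hypothetical helpers written for this example.

```python
import math

def variable(a, n):
    """Taylor coefficients of x about x = a, truncated to n terms (assumed helper)."""
    coeffs = [0.0] * n
    coeffs[0] = float(a)
    if n > 1:
        coeffs[1] = 1.0  # dx/dx = 1, all higher coefficients are zero
    return coeffs

def constant(c, n):
    """Taylor coefficients of a constant."""
    return [float(c)] + [0.0] * (n - 1)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def mul(u, v):
    """Cauchy product of two truncated series."""
    return [sum(u[j] * v[k - j] for j in range(k + 1)) for k in range(len(u))]

def div(u, v):
    """Quotient q = u / v, obtained by solving q * v = u coefficient by coefficient."""
    q = []
    for k in range(len(u)):
        q.append((u[k] - sum(q[j] * v[k - j] for j in range(k))) / v[0])
    return q

def ln(u):
    """Logarithm l = ln(u), from the identity u * l' = u'."""
    l = [math.log(u[0])]
    for k in range(1, len(u)):
        l.append((u[k] - sum(j * l[j] * u[k - j] for j in range(1, k)) / k) / u[0])
    return l

def cosh(u):
    """cosh(u), computed jointly with sinh(u) since each is the other's derivative."""
    c, s = [math.cosh(u[0])], [math.sinh(u[0])]
    for k in range(1, len(u)):
        c.append(sum(j * u[j] * s[k - j] for j in range(1, k + 1)) / k)
        s.append(sum(j * u[j] * c[k - j] for j in range(1, k + 1)) / k)
    return c

def derivatives(coeffs):
    """Convert Taylor coefficients f^(k)(a) / k! into derivatives f^(k)(a)."""
    return [math.factorial(k) * c for k, c in enumerate(coeffs)]

n = 13  # function value plus twelve derivatives
x = variable(2.0, n)
f = div(mul(x, x), ln(add(cosh(x), constant(1.0, n))))
d = derivatives(f)
print(" ".join(f"{value:+.9e}" for value in d))
```

The low-order values should agree with the session output above. Plain doubles lose digits at the highest orders, which is one reason to prefer MPFR precision as the original does.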

Enjoy!
 

phyzguy

Science Advisor
Is your point that numerical differentiation is faster than symbolic differentiation? This is probably true if you only want the answer at one point or a small number of points. But the symbolic answer gives you the full function at all possible points. It contains information that is not contained in the numerical values.
 
2,011
479
Evidently I am arriving late to the party. What is the algorithm that you use for what you call "automatic" differentiation? (I know nothing about Python, so please describe the algorithm in mathematical terms; thanks).
 
m4r35n357
But the symbolic answer gives you the full function at all possible points. It contains information that is not contained in the numerical values.
So does AD; all the derivatives are derived from the function itself, at any point where the function is defined.
 
m4r35n357
Evidently I am arriving late to the party. What is the algorithm that you use for what you call "automatic" differentiation? (I know nothing about Python, so please describe the algorithm in mathematical terms; thanks).
Well, the Wikipedia page is not a bad place to start . . .
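
For a mathematical outline of the kind asked for, one common formulation of forward-mode AD for higher derivatives is sketched below. It is the general Taylor-arithmetic idea, not necessarily the exact scheme inside the `ad` module. Each quantity ##u## is stored as its normalized Taylor coefficients ##u_k = u^{(k)}(a)/k!## at the evaluation point ##a## (this notation is an assumption made here), and the coefficients are propagated through each arithmetic operation:

```latex
(u + v)_k = u_k + v_k, \qquad
(u v)_k = \sum_{j=0}^{k} u_j \, v_{k-j}, \qquad
\left(\frac{u}{v}\right)_k = \frac{1}{v_0} \left( u_k - \sum_{j=0}^{k-1} \left(\frac{u}{v}\right)_j v_{k-j} \right)
```

The independent variable is seeded with ##x_0 = a##, ##x_1 = 1## and ##x_k = 0## for ##k > 1##, and the ##k##-th derivative is recovered as ##k!## times the ##k##-th coefficient. Elementary functions get similar recurrences from the differential equations they satisfy.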
 

FactChecker

Science Advisor
Gold Member
2018 Award
A lot of the benefit of symbolic differentiation is in seeing the structure of the derivative. The simplest example I can think of is that the derivative of the exponential is itself. It doesn't look to me like AD would ever tell you that except point-by-point within a certain accuracy. AD is still a numerical technique.

I think that AD would be of more interest as a different numerical approach than as a substitute for symbolic differentiation.
 
m4r35n357
A lot of the benefit of symbolic differentiation is in seeing the structure of the derivative. The simplest example I can think of is that the derivative of the exponential is itself.
Not a major point to me. In all the practical cases where I've used symbolic differentiation (like calculating the Einstein tensor from the metric in GR), the symbolic representations are a complete mess, even if the package is any good at simplification. But yes, I suppose it is a benefit. Basically, AD can effortlessly evaluate derivatives to orders that can easily choke any of the major computer algebra systems. Way beyond anyone's ability to see structure ;)
It doesn't look to me like AD would ever tell you that except point-by-point within a certain accuracy. AD is still a numerical technique.
Yes, reverse-mode AD is the numerical technique behind TensorFlow. I am talking about the benefits of forward-mode AD, which of course is also a numerical technique, just not a very well known one (as I have learned from past responses to my posts).
I think that AD would be of more interest as a different numerical approach than as a substitute for symbolic differentiation.
I am not putting it forward as a replacement for symbolic differentiation. It is an alternative to finite differences (in many circumstances) and even to RK4 (in most circumstances), but it essentially performs the same calculations as symbolic differentiation (in a combinatorially much simpler way) and is fundamentally of the same accuracy. Hence the comparison.
 
m4r35n357
The simplest example I can think of is that the derivative of the exponential is itself. It doesn't look to me like AD would ever tell you that except point-by-point within a certain accuracy.
Thought I would address this one separately. Here is ##\exp(x)## at ##x = 2.0##, together with its first nineteen derivatives:
Code:
$ ipython3
Python 3.7.1 (default, Oct 22 2018, 11:21:55)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.2.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from ad import *                                                                                                           
ad module loaded

In [2]: x = Series.get(20, 2).var                                                                                                  

In [3]: print(~(x.exp))                                                                                                            
+7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00
I think I can see the structure in that ;)
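
The same pattern can be reproduced without the `ad` module. The sketch below is an illustration only, assuming the standard Taylor recurrence for the exponential; `exp_series` is a hypothetical helper, not part of the module used above.

```python
import math

def exp_series(u):
    """Taylor coefficients of exp(u), from the identity e' = e * u'."""
    e = [math.exp(u[0])]
    for k in range(1, len(u)):
        e.append(sum(j * u[j] * e[k - j] for j in range(1, k + 1)) / k)
    return e

n = 20  # function value plus nineteen derivatives
x = [2.0, 1.0] + [0.0] * (n - 2)  # the variable x, expanded about x = 2
# Multiply each coefficient by k! to turn it into the k-th derivative
derivs = [math.factorial(k) * c for k, c in enumerate(exp_series(x))]
print(" ".join(f"{value:+.9e}" for value in derivs))
```

Because only ##u_1 = 1## is non-zero beyond the leading term, the recurrence collapses to ##e_k = e_{k-1}/k##, so every derivative comes out as ##\exp(2)##.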
 

FactChecker

Science Advisor
Gold Member
2018 Award
Thought I would address this one separately. Here is ##\exp(x)## at ##x = 2.0##, together with its first nineteen derivatives:
Code:
+7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00 +7.389056099e+00
I think I can see the structure in that ;)
Yes. Because it is trivial. It might be more difficult to recognize even slightly less trivial examples like ##x^n+x^{n-1}## or ##x*\sin(x)##.
 
m4r35n357
Yes. Because it is trivial. It might be more difficult to recognize even slightly less trivial examples like ##x^n+x^{n-1}## or ##x*\sin(x)##.
Agreed, but as I mentioned above the "window" of intelligible output is finite.

There is a subset of use cases for a CAS, where the symbolic output is translated to a more "familiar" language for execution. These are the use cases where my comparison is valid. The cost that matters there is not the time spent generating the derivatives (that is already paid for), but the time needed merely to evaluate the more complex expressions. Aside from this, it is not trivial to implement complicated expressions such as these without making errors (testing is essential!).

Unless I am much mistaken, the complexity of a CAS doing symbolic differentiation and evaluation is subject to Francesco Faà di Bruno's formula, whereas the Taylor Series Method is based on the Cauchy product. It certainly feels that way in Wolfram Alpha as the order of differentiation increases ;)
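
For reference, the two formulas named above can be set side by side. These are standard textbook forms. The partial Bell polynomials ##B_{n,k}## and the normalized coefficients ##u_k = u^{(k)}(a)/k!## are notation assumed here, not taken from the thread. Faà di Bruno's formula for the ##n##-th derivative of a composition is

```latex
\frac{d^n}{dx^n} f(g(x)) = \sum_{k=1}^{n} f^{(k)}(g(x)) \, B_{n,k}\!\left( g'(x), g''(x), \dots, g^{(n-k+1)}(x) \right)
```

and its fully expanded form has one term for every integer partition of ##n##. By contrast, for ##h = \exp(u)## the identity ##h' = h u'## gives, on matching Taylor coefficients, a recurrence of Cauchy-product type:

```latex
h_0 = \exp(u_0), \qquad h_k = \frac{1}{k} \sum_{j=1}^{k} j \, u_j \, h_{k-j} \quad (k \ge 1)
```

which needs only ##k## multiply-adds for the ##k##-th coefficient.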
 

FactChecker

Science Advisor
Gold Member
2018 Award
I have never personally needed a higher order of derivative than third. In everything I have been involved in, I needed velocity and acceleration often, but I only needed jerk once or twice, and never (that I can remember) needed higher derivatives (snap, crackle, pop).

EDIT: I should add that I could never use very sophisticated numerical techniques because there were always messy complications (random components, needing to find global minimums rather than local, etc.) that prevented it.
 
m4r35n357
I have never personally needed a higher order of derivative than third. In everything I have been involved in, I needed velocity and acceleration often, but I only needed jerk once or twice, and never (that I can remember) needed higher derivatives (snap, crackle, pop).
You bring up an interesting point, so make yourself comfortable ;)

I am assuming that you use or have used RK4 for solving ODEs, which is by definition a finite difference approximation to a fourth order Taylor Series solver (because finding higher order derivatives is supposedly "difficult" or "expensive").

In fact, as I have demonstrated here, finding higher order derivatives is not difficult or expensive at all. The Taylor Series Method (TSM), which is built on the iterative AD functions and operators, is trivial at fourth order (and much higher), and is immune to the compromises and inaccuracies of finite differences.

That is a pretty big use case. Essentially, I would contend that the TSM is superior to RK4 except for functions not covered by a given AD arithmetic, or for tabular functions.
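
To make the comparison concrete, here is a minimal sketch of a Taylor series step next to a classical RK4 step. It is not the author's solver: the logistic equation ##x' = x(1 - x)##, the step size, the order, and the function names are all illustrative choices made for this example.

```python
import math

def logistic_taylor_step(x0, h, order):
    """Advance x' = x * (1 - x) by one step h with a Taylor series of the given order."""
    c = [x0]  # c[k] is the k-th Taylor coefficient x^(k)(t0) / k!
    for k in range(order):
        # Cauchy product: k-th coefficient of x * x
        xx_k = sum(c[j] * c[k - j] for j in range(k + 1))
        # x' = x - x * x, so (k + 1) * c[k + 1] = c[k] - (x * x)_k
        c.append((c[k] - xx_k) / (k + 1))
    # Horner evaluation of the truncated series at t0 + h
    result = 0.0
    for coeff in reversed(c):
        result = result * h + coeff
    return result

def logistic_rk4_step(x0, h):
    """Advance the same equation by one classical fourth order Runge-Kutta step."""
    def f(x):
        return x * (1.0 - x)
    k1 = f(x0)
    k2 = f(x0 + 0.5 * h * k1)
    k3 = f(x0 + 0.5 * h * k2)
    k4 = f(x0 + h * k3)
    return x0 + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

steps, h = 10, 0.1
x_tsm = x_rk4 = 0.5
for _ in range(steps):
    x_tsm = logistic_taylor_step(x_tsm, h, order=10)
    x_rk4 = logistic_rk4_step(x_rk4, h)

exact = 1.0 / (1.0 + math.exp(-1.0))  # closed-form solution at t = 1 for x(0) = 0.5
tsm_error = abs(x_tsm - exact)
rk4_error = abs(x_rk4 - exact)
print(f"TSM error: {tsm_error:.3e}  RK4 error: {rk4_error:.3e}")
```

Raising `order` costs only a few more terms in the Cauchy product per step, whereas RK4 is fixed at fourth order.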
 

FactChecker

Science Advisor
Gold Member
2018 Award
Good point. ODEs are a big use case that I did not encounter. There may have been people where I worked who did a lot of it that I was not aware of. For instance, I don't know what is involved in the aerodynamic CFD calculations.
 
m4r35n357
Good point. ODEs are a big use case that I did not encounter. There may have been people who did a lot of it that I was not aware of. For instance, I don't know what is involved in the aerodynamic CFD calculations.
Thanks for the feedback!

ODEs are what got me involved with this method in the first place, and are the main reason I reverse-engineered the procedure for my own use. It turns out that the best way to verify the low level functions that I needed was to wrap them in Series objects and do those function/derivative plots like in the OP. Once this was done, the interactivity was almost an obvious thing to tidy up. But neither of those is as important to me as the ODE solver!

I wonder if there is any sensible application of this to PDEs, but I don't have experience solving them. I haven't seen anything in the literature.
 

anorlunda

Mentor
Insights Author
Gold Member
That is a pretty big use case. Essentially, I would contend that the TSM is superior to RK4 except for functions not covered by a given AD arithmetic, or for tabular functions.
There are many problems with strong nonlinearities. Just think of a diode for example. All higher order integrations are disadvantaged in cases where the equations and/or the coefficients change dramatically from one time step to the next.
 
m4r35n357
There are many problems with strong nonlinearities. Just think of a diode for example. All higher order integrations are disadvantaged in cases where the equations and/or the coefficients change dramatically from one time step to the next.
I'm sure there are degenerate/edge cases, but here is ##|x + 1|##:
[Attached plot: ##|x + 1|## and its derivatives]

Yes, there really are 12 derivatives here! Piecewise functions are fine as long as f() is defined at the jump (and derivatives set to zero - see next sentence!). However, this has nothing whatsoever to do with the order of integration, as the problem begins with the first derivative, as I mentioned (Euler's method and RK4 would suffer the same fate).
 
m4r35n357
Really?
[Attached plot]
 
1,084
188
Thanks, that's clearer.
 
