Insights: Recursion in Programming and When to Use or Not to Use It

AI Thread Summary
Recursion is a programming technique in which a function calls itself, often simplifying otherwise complex tasks. However, it requires careful handling of base cases to avoid problems such as stack overflow errors, especially in team environments. While languages like FORTRAN traditionally struggle with recursion, others like Pascal and modern functional languages handle it effectively. Tail call optimization in functional languages can mitigate stack overflow risks, but not every recursive function is written in a form suitable for it; the naive factorial is the standard example. Understanding recursion can be deepened by studying languages like LISP, which are built on recursive principles.
Recursion is quite simple: it's a subroutine calling itself. Surprisingly, some problems that look quite hard can be trivial using recursion – but be wary – as I hope to explain, there are traps you need to consider, especially when working in a team environment.
When I studied Computer Science, the first language I learned was FORTRAN. You can't easily do recursion in FORTRAN (my lecturer went so far as to say it cannot be done, but a Google search will show that's not quite true – however, for this article I will assume you can't). At the end of the FORTRAN part of the course, they devoted a couple of weeks to Pascal to explain its advantages and disadvantages compared to FORTRAN. Pascal does allow recursion, so that's where I first cut my teeth. The other major language in those far-off days was COBOL. Some versions allow it, but the general advice is, like FORTRAN – forget it. However, FORTRAN...

Continue reading...
 
Last edited by a moderator:
  • Like
Likes WWGD, Drakkith, Klystron and 6 others
That was a good article. It is true that many of us, especially beginners, tend to use recursion without even understanding its complexity for larger inputs. While teaching Python, our professor stressed the same point. Often, when we need to find the sum of a series correct to a certain number of decimal places, we tend to do it in a straightforward fashion. But our professor stressed that we should instead work out how the nth term depends on the (n-1)th term and use that in the sum. In this way, factorials can often be avoided completely.
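As an illustration, here is a minimal Python sketch of that advice (the exponential series is a stand-in example of mine, not the one from the course): each term is computed from the previous one, so no factorial is ever evaluated.

Python:
def exp_series(x, eps=1e-12):
    # e^x = sum of x^n / n!; each term is the previous term times x/n,
    # so the factorial never needs to be computed explicitly.
    total, term, n = 1.0, 1.0, 0
    while abs(term) > eps:
        n += 1
        term *= x / n       # t(n) = t(n-1) * x / n
        total += term
    return total

print(exp_series(1.0))  # ~2.718281828459045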

I do remember that when I used to write programs on data structures in Java using an SLL, DLL or circular LL, I would often face a StackOverflowError, presumably because the recursion didn't terminate as the base case was ill-defined.
 
Wrichik Basu said:
I would often face a StackOverflowError, presumably because the recursion didn't terminate as the base case was ill-defined.
This is a good lesson about using recursion -- namely, that you need a clear-cut way for recursion to terminate. Not mentioned in the article was how recursion relies on the stack for parameter passing.

One problem with recursive functions is that even if you have a proper terminating case, it's still possible to overflow the stack if the routine calls itself too many times. This is less of a problem on modern computers, which can afford much larger stack allocations.
 
  • Like
Likes WWGD, bhobba, Klystron and 1 other person
Stack isn't a problem with many functional languages because of tail call elimination.
 
DavidSnider said:
Stack isn't a problem with many functional languages because of tail call elimination.

This only works for a certain type of recursive function – one that, once it finds the answer, bubbles it unchanged to the top. In contrast, the factorial as usually written is not amenable to tail recursion, because it modifies the result as it bubbles back:

Python:
def factorial(arg):
    if arg==0:
        return 1
    else:
        return arg * factorial(arg-1)

I think it could be modified to do that though:

Python:
def factorial(arg, result):
    if arg == 0:
        return result
    else:
        return factorial(arg - 1, result * arg)

to run:
Python:
factorial(arg,1)
 
  • Like
  • Informative
Likes WWGD, Klystron and pbuk
  • Like
Likes Klystron and anorlunda
Said one mirror to her mother
as they gazed at one another:
I see an eternity unfolding
Endlessly onward like no other.
 
  • Like
Likes harborsparrow, Klystron and Wrichik Basu
Originally I didn't understand how to use recursion (even for simple problems). The reason – which is probably true for many other people too – was that while I understood on a mechanical level what it did (or at least had a good idea), I didn't see how its use was justified in solving common textbook-like problems.

Later on, four or five years after finishing undergraduate study, when I picked it up again I managed to understand how to apply it to very simple (textbook-like) problems. The assumption that the call on the smaller input would solve the smaller sub-problem now made sense to me. One reason was that I felt that, if needed, I could get down to it and justify a (near) formal proof using induction [in an idealised setting, possibly after removing some extra burden/rules of a given language].

Two or three years later again (just around the time I joined this forum), I decided to formulate a (finite) computational model for myself [not for execution on anything real]. There were three or four reasons I felt this was a good idea, which I won't go into here. I am not educated in the formal study of programming languages, so I wrote down the rules [somewhat informally] to a reasonable level of detail, mostly just for myself.

One thing I realized was that, at least in the (very) simple procedural-style language I set up, a language using recursion can simply be viewed as one where, when you write down in detail the "dependence chart" of the way the functions call/use other functions, there is a cyclical path. In other words, the "call dependence hierarchy" (or whatever we want to call it) can contain a cycle.

Anyway, just a basic observation I thought I might share. I don't have a particularly good idea of how this definition would generalise to the more complex and extra rules of a real programming language.
 
Last edited:
  • #10
Yes, a good example of this is when you make an expression parser. You can use recursion by using the parser itself to evaluate parenthesized expressions to any depth, limited only by your memory. As an example:

The expression ((2+3)*4 + 5*(6 -7))/(5+7)
ParseExpr sees ((2+3)*4+5*(6-7)) and (5+7)

Called again for the left expression (2+3)*4 + 5*(6-7)

Called again for 2+3 returns a 5
Then evaluates 5*4 to get 20

Called again on 6-7 returns a -1
Then evaluates 5 times -1 to get -5
Then evaluates 20 - 5 to return 15

Called again for 5+7 returns 12
...
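A minimal recursive-descent sketch of this idea in Python (the grammar, tokenizer and function names are my own illustration, not production code); parse_factor calls back into parse_expr for parenthesized sub-expressions, which is where the recursion happens:

Python:
import re

def tokenize(expr):
    # Numbers, parentheses and the four operators; whitespace is skipped.
    return re.findall(r'\d+\.?\d*|[()+\-*/]', expr)

def parse_expr(tokens):
    # expr := term (('+'|'-') term)*
    value = parse_term(tokens)
    while tokens and tokens[0] in '+-':
        op = tokens.pop(0)
        rhs = parse_term(tokens)
        value = value + rhs if op == '+' else value - rhs
    return value

def parse_term(tokens):
    # term := factor (('*'|'/') factor)*
    value = parse_factor(tokens)
    while tokens and tokens[0] in '*/':
        op = tokens.pop(0)
        rhs = parse_factor(tokens)
        value = value * rhs if op == '*' else value / rhs
    return value

def parse_factor(tokens):
    # factor := NUMBER | '(' expr ')' -- the recursive call is here.
    token = tokens.pop(0)
    if token == '(':
        value = parse_expr(tokens)   # recurse back into the top level
        tokens.pop(0)                # discard ')'
        return value
    return float(token)

print(parse_expr(tokenize("((2+3)*4 + 5*(6-7))/(5+7)")))  # 1.25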
 
  • #11
jedishrfu said:
Said one mirror to her mother
as they gazed at one another:
I see an eternity unfolding
Endlessly onward like no other.

If you want to understand recursion better I suggest learning the programming language LISP. It was the second language developed, just after FORTRAN, but it is influenced by different principles, namely the Lambda Calculus. For some info about LISP (including writing a Lisp interpreter) and the Lambda Calculus, plus some info on an interesting language called HASKELL, see:
https://crypto.stanford.edu/~blynn/lambda/
For the seminal paper on LISP see (note only for advanced computer science students):
https://aiplaybook.a16z.com/reference-material/mccarthy-1960.pdf
For the average Hacker - yes I progressed from programmer to Hacker :cool::cool::cool::cool::cool::cool::cool: - the following is better:
http://languagelog.ldc.upenn.edu/myl/ldc/llog/jmc.pdf
I read somewhere of an interesting experiment where they asked programmers to solve problems in various programming languages to see which language got a working program fastest. Surprise, surprise – the winner was the second-oldest language, Lisp. Amazing. BTW, I do not use LISP – I use PYTHON, then tweak it for performance with LUAJIT if I have to:
http://alexeyvishnevsky.com/2015/05/lua-wraped-python/
Another good exercise is what my teacher made us do - write Quick-Sort in assembler. Actually, these days you would use C, which allows the embedding of assembler. You learn a lot, but I do not have fond memories of doing it - I avoid assembler whenever possible - I find LUAJIT plenty fast enough and much easier to understand and maintain.

Actually, these days I do very little programming; 30 years of it was enough for me. I remember that when I started I loved programming, but slowly but surely it lost its 'charm'. That was another reason to try to break into 'management', but that has its own issues that would be way off topic for this thread.

Thanks
Bill
 
Last edited by a moderator:
  • #12
jedishrfu said:
I think it could be modified to do that though

With a simple refinement using Python's ability to define a default argument you can eliminate the need to include the "result" variable in the initial call:

Python:
def factorial(n, result=1):
    if n == 0:
        return result
    return factorial(n - 1, result * n)
 
  • #13
jedishrfu said:
Yes, a good example of this is when you make an expression parser. You can use recursion by using the parser itself to evaluate parenthesized expressions to any depth, limited only by your memory.
Just to clarify a bit, there are two different senses in which one can interpret "call-dependence hierarchy". One is the aspect of evaluation. The other is a dependence that can be tracked just by looking at the program.

============

The evaluation aspect is how a function, say with a signature like:
int f(int x)
is evaluated when we make a specific call such as f(100) – that is, how many further function calls are required (and with what structure and rules) to evaluate the value f(100).

As I understand it, if the language is very simple, then giving a precise delineation of such rules (for recursion) isn't that difficult (though it would take some work to state them precisely).

============

However, I was using the term in the following sense (just giving one specific example):
Suppose that the function "f" (in its body) calls "g"
The function "g" (in its body) calls "h"
The function "h" (in its body) calls "f"

So there is a cyclical path in the sense of:
f→g→h→f
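A toy Python illustration of such a cycle, with two functions instead of three (entirely my own example, not from any textbook):

Python:
def is_even(n):
    # is_even calls is_odd, which calls is_even: the call-dependence
    # graph contains the cycle is_even -> is_odd -> is_even.
    return True if n == 0 else is_odd(n - 1)

def is_odd(n):
    return False if n == 0 else is_even(n - 1)

print(is_even(10), is_odd(10))  # True False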

============

Anyway, I am not particularly knowledgeable about the topic (apart from just knowing the basics) so I will give it a rest here.
 
  • #14
bhobba said:
If you want to understand recursion better I suggest learning the programming language LISP.
Which is something I am attempting to do at the moment. My first experience with Lisp was in about 1982, with an implementation for the Apple II computer. I couldn't see the point of it at the time. Since then, I've realized that there are a lot of good reasons to become acquainted with this language.

Since this is a thread about recursion, I should mention that virtually every discussion of recursion in any programming language that supports this feature includes the factorial function as an example. It's almost as if there were no other possible uses. Because I teach classes in C++ and MIPS assembly, I try to come up with examples other than those that typically appear in textbooks.

Here's an example in Lisp that I wrote that converts a nonnegative decimal integer to its hex form.

Code:
(defun to-hex(num hexDigitsList)    
    (defvar quot -1)
    (defvar remainder -1)

    (setq quot (floor (/ num 16)))
    (setq remainder (mod num 16))
    (push remainder hexDigitsList)    
    (if (> quot 0)
        (to-hex quot hexDigitsList)
        (progn    
            (format t "Hex:    0x")
            (loop for hexDigit in hexDigitsList
                do (format t "~X" hexDigit)))
            ))
The underlying algorithm is this:
Given a nonnegative decimal number,
Divide it by 16 and save the (integer) quotient and remainder.
If the quotient is positive, call the routine again with the quotient as the first argument,
Otherwise, display the list of remainders.

For example, if the function above is called with decimal 100 as the argument, the first call produces a quotient of 6 and a remainder of 4.
In the second call, the new quotient is 0 (6 / 16 == 0 in integer division), and the remainder is 6.
Since the new quotient is zero, we display the remainders in reverse order, resulting in the hex value of 0x64.

A routine to convert decimal numbers to their binary form is nearly identical, with the main difference being that you divide by 2 each time rather than by 16.

Any process that needs to produce results in the opposite order from the one in which they were generated is a natural fit for recursion. Other examples include displaying a string in reverse order and checking for palindromes.
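Minimal Python sketches of those two (illustrative only):

Python:
def reverse(s):
    # The first character is emitted after everything behind it,
    # i.e. results come out in the opposite order they were generated.
    if s == "":
        return ""
    return reverse(s[1:]) + s[0]

def is_palindrome(s):
    # The outer characters must match, and the inside must itself
    # be a palindrome.
    if len(s) <= 1:
        return True
    return s[0] == s[-1] and is_palindrome(s[1:-1])

print(reverse("recursion"), is_palindrome("racecar"))  # noisrucer True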
 
  • Informative
  • Like
Likes Klystron, DrClaude and bhobba
  • #15
If you want to get into LISP, the "Structure and Interpretation of Computer Programs" book and lectures are amazing and free. I so wish I had seen these when I first started coding.

https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/
 
  • #17
Fun article @bhobba. Thanks for sharing.

Count me among the programmers who favored KISS over recursion. I never did find a case where I thought it was actually needed - that is, until the day I first read about MapReduce.

https://en.wikipedia.org/wiki/MapReduce
When you introduce the complexity of parallel processing and distributed clusters of computers, a simple loop won't work, and the elegance of MapReduce greatly simplifies things.
 
Last edited:
  • Like
Likes Klystron, Wrichik Basu and jedishrfu
  • #18
The article doesn't do a good job of explaining when to use and when not to use recursion, even though that is the title of the article. A few examples for each case would help.

With recursion, making a problem smaller doesn't make it simpler until some simple base case is reached. When developing recursive algorithms, it's common to start with the base case and work backwards to larger cases in order to write the program. The Tower of Hanoi algorithm would be difficult to figure out how to program without starting with the base cases first.
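The Hanoi solution practically writes itself once the base case is in hand; a standard Python sketch:

Python:
def hanoi(n, source, target, spare):
    # Base case: a single disk moves directly.
    if n == 1:
        print(f"move disk 1: {source} -> {target}")
        return
    hanoi(n - 1, source, spare, target)   # clear the n-1 pile out of the way
    print(f"move disk {n}: {source} -> {target}")
    hanoi(n - 1, spare, target, source)   # put the n-1 pile back on top

hanoi(3, "A", "C", "B")   # prints the 7 moves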

If the data for a process tends to be a nested structure, then the code to work with that data would also tend to be nested (recursive). Examples of "nested data" are tree structures, nested directories on a system, ... .

In cases where the data is not nested, such as an array or linked list, algorithms such as sorts may be recursive or iterative, depending on the algorithm. In the case of quick sort, each level of recursion brings the data closer to a sorted state.

In the case of merge sort using an extra working array and indexing, a top-down recursive approach accomplishes nothing during recursion except to generate and push index pairs onto the stack until an index pair corresponding to a single element is produced; only then does any actual merging take place. A bottom-up iterative merge sort skips the recursion, treating an array of n elements as n runs of size 1, and starts merging immediately (see the sketch below). Most library merge sorts are some variation of bottom-up merge sort (like timsort), but most learning exercises involve top-down merge sort, and I'm not sure why that happened. For linked lists, an iterative bottom-up merge sort is faster than a recursive merge sort (which has to scan lists to split them).
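For concreteness, here is a toy Python version of the bottom-up idea described above (a sketch, not library code): the array is treated as n runs of size 1 and merging starts immediately, with no recursion and no index pairs pushed onto a stack.

Python:
def merge(left, right):
    # Standard two-way merge of two sorted lists.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def bottom_up_mergesort(a):
    # Start from n runs of size 1; each pass merges adjacent runs,
    # doubling the run width.
    runs = [[x] for x in a]
    while len(runs) > 1:
        merged = [merge(runs[k], runs[k + 1])
                  for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:            # odd run left over: carry it up
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []

print(bottom_up_mergesort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]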

Fibonacci is another case where recursion is relatively expensive: the naive recursive version makes a number of calls that grows like Fib(n) itself (exponential growth), while an iterative version can be implemented with two or three variables holding the running sums.
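Side by side in Python (a minimal sketch):

Python:
def fib_recursive(n):
    # The call tree grows exponentially -- roughly Fib(n) calls.
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # Two variables holding the running sums.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib_recursive(20), fib_iterative(20))  # 6765 6765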

APL was the first recursive programming language I used, but the main difference in APL is that its native operators (most of them Greek symbols) work on scalars, vectors, matrices, or "tensors" (arrays with more than two dimensions).

Cobol and Fortran may not be "modern languages" but they are still in use today.
 
Last edited:
  • #19
jedishrfu said:
This only works for a certain type of recursive function – one that, once it finds the answer, bubbles it unchanged to the top.

It also only works in certain programming languages, or with compilers/interpreters that eliminate tail calls. CPython, the reference Python implementation, doesn't do tail call elimination and Python's inventor has said they have no intention of changing this, so there's little point in writing tail-recursive code in Python.
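If you really want the tail-recursive shape in Python anyway, a hand-rolled trampoline is one known workaround; this is a sketch of the pattern, not a CPython feature. The tail call is replaced by returning a thunk, and a driver loop keeps the stack flat:

Python:
def factorial_step(n, acc=1):
    # Instead of calling itself, return a zero-argument function
    # describing the next step.
    if n == 0:
        return acc
    return lambda: factorial_step(n - 1, acc * n)

def trampoline(value):
    # Keep invoking thunks until a non-callable result appears;
    # the call stack never gets more than one frame deeper.
    while callable(value):
        value = value()
    return value

result = trampoline(factorial_step(5000))   # plain recursion would overflow
print(len(str(result)))                     # number of digits in 5000!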
jedishrfu said:
There's also a modern variant to Lisp: Clojure

Clojure is frequently marketed this way. It's misleading, since it suggests Lisp is somehow "unmodern" and obsolete compared with Clojure. The truth is that Clojure is a largely different language from Lisp that does some significant things differently (e.g., lazy vs. eager evaluation, functional vs. multiparadigm design), choices which have advantages as well as disadvantages compared with how Lisp does things.

Running on the JVM is also not unique to Clojure; there are implementations of both Common Lisp and Scheme that do that. There's a difference though: Lisp and Scheme merely can run on the JVM, and can let you access Java libraries if you want, but are otherwise self contained languages. You could write Lisp or Scheme and not care if your code happens to be running on the JVM or somewhere else. Clojure by contrast actively exposes the Java ecosystem. If you use Clojure you very soon find you need to access Java classes and methods to do basic things, and if there are errors in your code you'll be confronted with Java exceptions and Java stack traces.
 
  • #20
Mark44 said:
Since this is a thread about recursion, I should mention that virtually every discussion of recursion in any programming language that supports this feature includes the factorial function as an example. It's almost as if there were no other possible uses.

The thing with recursive functions is that they save a copy of all the function arguments and local variables on the stack for each recursive call. (Unless your code is tail recursive and your language or compiler recognises and eliminates tail calls, as mentioned earlier.)

The factorial is a good example to illustrate what recursion is but it is usually a bad example to illustrate where recursion is useful, since it has no real advantage over the iterative version and it has a serious disadvantage: it uses more memory and it needlessly risks overflowing the stack.

A better example of where recursion is useful is walking a tree. This is because while doing this you need to save, at all times, the path you followed down the tree to get to where you currently are so that you can backtrack. In an iterative version you would need to manually update a stack with this information. The recursive way is much easier and cleaner since this information is already saved on the call stack for you automatically and you can just use that.

A simple tree walker, which applies a given function to every item in a tree, can be done like this in Lisp:
Code:
(defun maptree (function tree)
  "Apply FUNCTION to every item in TREE."
  (typecase tree
    (cons (maptree function (car tree))
          (maptree function (cdr tree)))
    (t (funcall function tree))))
For example, running
Code:
(maptree #'princ '(a (b . c) ((d . e) . f) . g))
prints "ABCDEFG" to standard output.
Here's an example in Lisp that I wrote that converts a nonnegative decimal integer to its hex form.

Code:
(defun to-hex(num hexDigitsList)    
    (defvar quot -1)
    (defvar remainder -1)

    (setq quot (floor (/ num 16)))
    (setq remainder (mod num 16))
    (push remainder hexDigitsList)    
    (if (> quot 0)
        (to-hex quot hexDigitsList)
        (progn    
            (format t "Hex:    0x")
            (loop for hexDigit in hexDigitsList
                do (format t "~X" hexDigit)))
            ))
There are some problems with this which you'll probably want to know about if you're learning Lisp. The serious one is the use of defvar inside the function body:
Code:
    (defvar quot -1)
    (defvar remainder -1)
defvar defines a global variable* so you would rarely if ever want to use it inside a function. If you want to declare local variables then the basic way to do that is using 'let', although in your case it would probably be better to use 'multiple-value-bind' since Lisp's floor function computes both the floor (result of integer division) and mod at once and returns them as two separate return values:
Code:
  (multiple-value-bind (quot remainder) (floor num 16)
    ...)

Other than this, some style issues are:
  • Your code is only recursive in the technical sense that your function calls itself. The algorithm you are using is really an iterative one and you don't gain anything from writing it in a recursive way.
  • Part of what your function does is to construct a list of digits (in base 16) of the number. This is interesting and potentially useful enough that it should really be a separate function.
  • Since the procedure is the same in bases other than 16, the base should really be a variable (possibly an optional one with a default value).

Putting it all together I'd probably do something more like this:
Code:
(defun digits (number &optional (base 10))
  "Return a list of digits in NUMBER."
  (loop with digits = ()
        with mod
        while (plusp number)
        do (setf (values number mod) (floor number base))
           (push mod digits)
        finally (return digits)))

(defparameter *base-alphabet* "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")

(defun digit->symbol (digit)
  (aref *base-alphabet* digit))

(defun print-number (number &optional (base 10) (stream *standard-output*))
  (if (zerop number)
      (princ (digit->symbol 0) stream)
      (dolist (n (digits number base))
        (princ (digit->symbol n) stream))))
In "real" code you might also want to add some type checking, e.g. make sure 'number' is an integer and the base is between 2 and 36.

Lastly, in case it isn't obvious to someone, this only makes sense as an exercise since Lisp's built-in format function already supports printing integers in hexadecimal (actually, any base between 2 and 36) as well as Roman numerals and written English:
Code:
> (let ((x 42))
    (format t "~d = 0x~x = ~@r = ~r.~%" x x x x))
42 = 0x2A = XLII = forty-two.
*defvar also does something else: it globally declares that the variable will be dynamically rather than lexically bound. The first answer to this Stack Overflow question explains what this means. The important thing is that this globally affects how code accessing the variable (or any variable with the same name) works, to the point that Lisp programmers use a special naming convention for global variables to avoid this happening by accident.
Any process that needs to produce results in the opposite order from the one in which they were generated is a natural fit for recursion.

This doesn't apply in this case. By iterating you go through the digits in reverse order. But then push adds the digits to the head of the list where you're storing them. This is why they end up in the correct order.
 
  • Like
  • Informative
Likes Klystron, bhobba and Greg Bernhardt
  • #21
wle said:
CPython, the reference Python implementation, doesn't do tail call elimination and Python's inventor has said they have no intention of changing this, so there's little point in writing tail-recursive code in Python.

I think there have been some attempts to implement tail call elimination in CPython by bytecode manipulation. I don't know that I would recommend that in a production system, since bytecode manipulation is usually considered non-standard, but it's possible.
 
  • #22
wle said:
A better example of where recursion is useful is walking a tree. This is because while doing this you need to save, at all times, the path you followed down the tree to get to where you currently are so that you can backtrack. In an iterative version you would need to manually update a stack with this information.
In the case of a binary tree, an iterative version can temporarily modify the links between nodes to perform an in-order traversal – a Morris traversal – which requires neither recursion nor a stack. I don't know how well known this is.
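A Python sketch of a Morris in-order traversal (my illustration of the technique): each node's in-order predecessor is temporarily re-linked to the node, then the link is removed on the second visit, so neither a stack nor recursion is needed.

Python:
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def morris_inorder(root):
    out, current = [], root
    while current is not None:
        if current.left is None:
            out.append(current.value)
            current = current.right
        else:
            # Find the rightmost node of the left subtree (predecessor).
            pred = current.left
            while pred.right is not None and pred.right is not current:
                pred = pred.right
            if pred.right is None:
                pred.right = current      # first visit: thread back
                current = current.left
            else:
                pred.right = None         # second visit: remove thread
                out.append(current.value)
                current = current.right
    return out

tree = Node(2, Node(1), Node(3))
print(morris_inorder(tree))  # [1, 2, 3]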
 
  • #23
A couple of additional points:

There are a couple of situations where recursion is prohibited. The first is related to safety standards, as in the transportation industry. For example, MISRA (automotive) prohibits recursion because of the difficulties in verifying that the code is operating as designed.
The other situation relates to how the stack is handled on many microprocessors. In those cases the build tools actually position local variables on a "stack" at link time – so the stack is not really dynamic.
So long as there is a known limit on the depth of recursion, it is possible to write the algorithm without using recursion. Each level of recursion gives you another copy of your local variables; in the worst case, an array of structures can be used to hold those local-variable sets.
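For example, a minimal Python sketch of that transformation (a fixed-size array of records would do the same job if the maximum depth were known at build time):

Python:
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def sum_tree_recursive(node):
    if node is None:
        return 0
    return (node.value + sum_tree_recursive(node.left)
            + sum_tree_recursive(node.right))

def sum_tree_iterative(root):
    # Each stack entry stands in for one copy of the recursive
    # version's local variables.
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node is not None:
            total += node.value
            stack.append(node.left)
            stack.append(node.right)
    return total

tree = Node(1, Node(2), Node(3))
print(sum_tree_recursive(tree), sum_tree_iterative(tree))  # 6 6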

Recursion is often the result of more complicated situations than a function calling itself. For example, an executable may open a dialog box that provides GUI options - some of which invoke dialog boxes. In such cases, the second invocation of the dialog box function is not just many function levels down from its parent dialog, it's actually invoked from a message loop that may be several levels down from the message loop supporting the parent box.
 
  • Informative
  • Like
Likes DrClaude, Wrichik Basu and anorlunda
  • #24
.Scott said:
Recursion is often the result of more complicated situations than a function calling itself. For example ...
That's an excellent point, although I always called that re-entrant rather than recursive. You can get the same thing with real time programs and with parallel processing.

It would be clearer if the title emphasized recursive algorithms rather than recursive subroutine calls.
.Scott said:
There are a couple of situations where recursion is prohibited. The first is related to safety standards ...
That is also an excellent point. There are some analogous prohibitions on use of virtual memory in time-critical safety-related programming.
 
  • #25
.Scott said:
For example, MISRA (automotive) prohibits recursion because of the difficulties in verifying that the code is operating as designed.

That's interesting. Do they also limit which languages/compilers you're allowed to use?
 
  • #26
DavidSnider said:
That's interesting. Do they also limit which languages/compilers you're allowed to use?
MISRA is specific to embedded automotive systems. I believe it specifies C or C++ standards exclusively - or rather, subsets of several releases of those language standards. As a matter of practicality, it requires automated support for MISRA compliance checking. So if those tools are not available for a development environment, that environment would not be practical to use.
Beyond the compiler, there is a requirement for controlling the impact of any single point of failure. So not just any microprocessor or any operating system can be used.

Wiki for MISRA C
 
  • #27
.Scott said:
MISRA is specific to embedded automotive systems. I believe it specifies C or C++ standards exclusively - or rather, subsets of several releases of those language standards. As a matter of practicality, it requires automated support for MISRA compliance checking. So if those tools are not available for a development environment, that environment would not be practical to use.
Beyond the compiler, there is a requirement for controlling the impact of any single point of failure. So not just any microprocessor or any operating system can be used.

Wiki for MISRA C

Wow. This has led me down a rabbit hole of legal cases involving auto code not being MISRA compliant. Fascinating.
 
  • #28
As described in other posts above I also programmed factorial and Fibonacci functions in Pascal, first applying iteration then recursion at a time when stack / memory was quite limited. Lessons learned from Pascal progressed to practical recursion in Common Lisp. Applied sparingly, recursion made sense implementing tree traversals in game theory IIRC*, some counting algorithms and certain string manipulations such as recognizing palindromes and improving a codec.

Surprisingly, I once implemented recursive code in C/C++ to push and pull bits within specialized flags on Sun Solaris servers. With space at a premium, I was able to encode error signals from application function calls into a few 16- or 32-bit words. Recursion turned out to be the most efficient and fastest method of determining which functions had returned error conditions indicating incomplete intermediate computations, without causing frame delays for other applications.

I wish I had code samples as it may have been machine specific but it worked well enough that NASA certified that flight simulator for full-motion hydraulics, a major milestone. The original systems code would freeze or otherwise invalidate the entire frame instead of marking an error condition to be corrected during later frames. IIRC* the 32-bit words were sectioned into 8-bit conditional variables that operated across multiple threads to ensure that downstream apps always used reliable data.

Before programming, I had studied recursion in relation to self-referential functions in linguistics and set theory. Offered without proof, self reference is a requirement for a recursive function but not all self-referential functions can be written recursively. These ideas were useful implementing reflective memory and also for self-adaptive run-time code such as the flags mentioned above.

*"If I remember correctly". I may lack the brain power now to remember the code I designed. These Insight articles are an excellent mental stimulus. Thanks.
 
  • #29
One can easily experiment with recursive routines on Excel.

Example 1: the Fibonacci sequence:

Cell A1 = 1
Cell A2 = 1
Cell A3 = A1 + A2

Then copy/paste the formula in Cell A3 to (n-3) cells below for n terms in the Fibonacci sequence.

Example 2: generate sine and cosine (actually we'll use versin(x), which is 1 - cos(x)). Let x = pi/6

Cell A1 = pi()/6*2^(-10), Cell B1 = (pi()/6*2^(-10))^2/2
- these are small-angle approximations for sine and versin; the smaller the angle, the better the approximation.
Cell A2 = 2*A1*(1-B1), Cell B2 = 2*(A1)^2
- the doubling algorithm: sin(2x) = 2 sin(x) (1 - versin(x)) and versin(2x) = 2 sin(x)^2.

Copy/paste the doubling formulae in cells A2 and B2 to the next 9 rows to obtain sin(pi/6) and versin(pi/6). Obviously cos(pi/6) is easily obtained as 1 - versin(pi/6).

Example 3. Home loan balance given fixed interest rate and monthly payment.

Cell A1 = 1e6
Cell A2 = A1*(1 + 0.01) - 11010.86

Copy/paste the formula in A2 to the next 239 rows, where n = 240 is the total number of months (the period of the loan). Graph your output to see the characteristic curve for the loan balance.
 
Last edited:
  • #30
Historical trivia - Tony Hoare's original implementation of quicksort was done on a machine without a stack:

https://en.wikipedia.org/wiki/Elliott_803

The program was iterative, using a program-created stack that he called a "nest". Although he implemented actual code on that machine, it seems that the code was not archived; instead there's a paper describing the general concept and the "nest" (which later was named "stack").

This "nest" aka "stack" was used to allow recursion on the 803, and was used with a version of Algol that pre-dated Algol-60.
 
Last edited:
  • Like
Likes QuantumQuest, bhobba and Klystron
  • #31
I'm sure programmers before him used variations of the stack. He was likely the first to describe it clearly in the literature though.

I remember implementing a data stack in programs I wrote in BASIC for a CS class on data structures. My computer could handle recursive calls in BASIC, but not with data, so I implemented an index that I incremented and decremented as I traversed the recursion. My data values were stored in arrays referenced by that index.

FORTRAN on mainframes in the 1970s didn't have recursive capabilities. If you tried it, you got caught in a loop.
The reason was that the return address was stored in a single location: sub A could call sub B, and sub B could call sub C, but if sub B or C called A, the return address would point back to them and not to the original caller.
 
  • Like
  • Informative
Likes DrClaude, QuantumQuest, bhobba and 1 other person
  • #32
rcgldr said:
[snip...]
"nest" (which later was named "stack").

This "nest" aka "stack" was used to allow recursion [...]
Apropos the terminology, one of my old data structures textbooks used both terms but concluded that 'stack' was less ambiguous than 'nest', retaining the latter to describe 'nested FOR loops', 'nested functions', etc. Also, 'stack' was a popular term in electronics at that time. Interesting post.
 
  • Like
  • Informative
Likes DrClaude, QuantumQuest and jedishrfu
  • #33
The programming language I mostly used, called Natural, allowed recursion, but few used it, and those called upon to maintain the few programs that did tried to avoid the job like the plague, passing the buck whenever possible. Interestingly, many programs that were put in the too-hard basket were solved using an array as a stack. One that did the rounds I originally wrote, but it ran too long. What it did was a heap of database searches to update things. It was possible to store a direct pointer to a database record rather than search for it, and doing that retrieved records much faster. So I figured maybe it was retrieving many of the same records a lot and, on a punt, kept a copy of each record's location in a large array I used as a stack, searching that first before searching the database. Amazingly, it cut the run time from all night to about an hour.

Thanks
Bill
 
  • Like
Likes Klystron and QuantumQuest
  • #34
jedishrfu said:
FORTRAN on mainframes in the 1970s didn't have recursive capabilities.
The IBM 360 family to this day doesn't have a native stack. Instead there is a caller-allocated save area, similar to the X64 calling convention, except that the IBM save area is at least large enough to save all registers. At callee entry, R13 points to the save area, R14 is the return address, and R15 is the base == entry address of the function being called.

A linkage stack of save areas can be implemented by allocating save areas from the heap, then using those allocated save areas to allow recursion. Assuming the programming convention saves all registers, including R13 and R14, into the caller-provided save area, the current R13 points to the caller's save area, and that save area's stored copy of R13 points to the prior save area, providing a link chain (linkage stack) of save areas.

When control is returned to the caller, the caller frees the allocated save area if it is not making another call. If a function does not make any calls, there's no need for that function to allocate a new save area; it will just use the save area provided by the caller.
 
Last edited:
  • #35
Yes, they had an area to save the registers prior to the call. Some registers were then loaded with argument pointers, and upon return the registers were reloaded with their old values.
 
  • #36
If you're looking for a worst-case scenario for "no stack support", I submit the HP 2100 series (circa 1967).
A call to a routine at location 300 would result in the return address being stored at location 300 and execution beginning at 302. And I do remember writing code to support reentrancy on that machine.

I've been programming professionally for about 50 years. I believe I've used recursion a handful of times - but only two of any significance:

1) The ZigZag sort:
I actually got this sort from an ACM newsletter cover article (I recall the name was AACM at the time). It was originally intended to be implemented on a hypothetical parallel-processing machine, but I adapted it for single-processor use. I haven't used it in a couple of decades because excellent sorts are now part of regular libraries. But at the time, a sort that was sparing with memory, had good characteristics when run in a memory-paging environment, and was efficient with processor resources made for a handy tool in any toolbox.

As I said, it is an "in place" sort - meaning that if you have "N" entries, you only need N+C memory locations to store the data, where "C" is a small constant. In particular, that C is only needed when two elements are exchanged. So C=1.

The sort works most readily when N is a power of two, and it was originally implemented only for that case (and as originally implemented, it ran on "N" hypothetical processors in a 2^N configuration).

But here it is as pseudo-code:
Code:
   Sort(int direction, array a) {
      n=a.length()
      if(n>3) {
         n2 = truncate(n/2)
         Sort(direction,a[1..n2])
         Sort(-direction,a[n2+1..n])
      }
      Merge(direction,a)
   }

   Merge(int direction,array a) {
      n=a.length()
      if(n>3) {
         n2 = truncate(n/2)
         n3 = truncate((n+1)/2)
         for(i=1 to n2) {
           CompareAndSwap(direction,a[i],a[i+n3])
         }
         if(n3>n2) {
            // When the number of elements is odd...
            if(Compare(direction,a[n3],a[1]) && Compare(direction,a[n3],a[n2])) n2 = n2+1
         }
         Merge(direction,a[1..n2])
         Merge(direction,a[n2+1..n])
      } else if(n>1) {
         CompareAndSwap(direction,a[1],a[2])
         if(n==3) {
            CompareAndSwap(direction,a[2],a[3])
            CompareAndSwap(direction,a[1],a[2])
         }
      }
   }

   Compare(int direction, element A, element B) {
      return ((direction>0) and (A>B)) or ((direction<0) and (B>A))
   }

   CompareAndSwap(int direction, element reference A, element reference B) {
      if(Compare(direction,A,B)) {
         element temp
         temp = A
         A = B
         B = temp
      }
   }

2) I ran into the other recursive application in the late 70's. The Defense Mapping Agency Aerospace Center produced charts for military pilots and wanted to put those publications on a computer - the Data General MV/8000.

In many cases, those "charts" were books. The books were printed as "signatures" (large paper sheets) with as many signatures per book as required. Each signature was 16 or 32 pages. Each page was 1 to 3 columns. Often, within each column there would be lists with several columns within the main column. So, as you went from book to signature to column to list column, you had 3 or 4 levels of recursion - depending on exactly how you counted them.

This recursion came into critical use just before publication, when they wanted to "balance" the pages. The goal was to avoid leaving blank pages - better to have a little extra space at the bottom of each page than whole blank sheets at the end of the book. Similarly, if there were three columns on a page, it was better to push that content toward the top of the page than to leave a column blank or unbalanced. And when a multi-column list was inserted, it should be balanced so as not to require more column space than the minimum. So the software would first run through all of the content to generate a table of small paragraph descriptions - how long each paragraph was, what level it was at (page, column, list), and how and whether it could be broken across columns/pages.
Then the processor would run through that table, placing the content without attempting to balance it. This would generate the minimum number of signatures required. A binary search would then ensue: adjust the bottom margin, repopulate the pages, and determine whether the signature count had increased. If it had, the margin was reduced; if not, it was increased - ultimately leading to the perfect margin. This was repeated at the page and column levels to produce a completely balanced document. That binary search needed to occur within the recursion - specifically whenever the content specified a page break, a change in the page columns, or the end of a list.

If that sounds complicated, there was one more wrinkle. The MV/8000 could not hold all of the table data at once - it had to be maintained in a disk file. To do this efficiently, the recursive algorithm needed to be adjusted so that it could make good use of processor resources when working with an oversize table. The final results were paragraph-placing parameters that were saved in the table and later used to plot the originals for each signature.
 
  • Like
Likes bhobba, jedishrfu and Greg Bernhardt
  • #37
Why did I call this a "ZigZag" sort? ... in case anyone is interested.

After reading that original AACM article, I studied the algorithm and realized that it made for a good in-place sort for the reasons I described above. But before I adopted it, I wanted to prove to myself that it would really work under all conditions.

Of pivotal importance in this algorithm is the "Merge" operation. I'm using the term "Merge" loosely: what is actually done (as shown in the pseudo-code above) is that corresponding elements from the two lists are compared and, if needed, swapped - so that the "smaller" (by collating sequence) is left in the first list and the larger in the second.
But the sort will only work if the result of this "Merge" is that every element of the first list is smaller than any element of the second.
ZigZag.png
So I drew a chart for myself that showed the elements to be sorted along the X axis and their relative positions in the collating sequence on the Y axis. Before applying this Merge, the elements need to form a particular pattern in this chart. There will be a maximum and a minimum element - and the elements need to be sequenced so that the collating sequence of all the elements between the min and max rises monotonically in both the +X and -X directions, with the chart wrapping around from the first element in the list to the last. If you think of the minimum element as the "Zig" in the chart and the maximum as the "Zag", it has to be a simple ZigZag, with no extra Zigs or Zags (local minima or maxima).

If you start with this ZigZag shape, fold it in half, and then perform the "merge", you get all the elements of low collating sequence in the first half, all those of higher collating sequence in the second half, and both of these sequences will have this same "ZigZag" property - allowing you to Merge them as well.
 
  • #38
Zig-zag bah! I shall name it the Mark of Zorro!
 
  • #39
Out of the night, when the full moon is bright,
Comes some code that is known as Zorro.

This bold engineer codes a 'Z' with his gear,
a 'Z' that stands for Zorro!

... I'll stick with ZigZag ... and next time I'll make the collating plots look less like sword play.
 
  • #40
Here are a couple more examples of recursion - the DFT and iDFT.
C Code for FFT.

I have coded the Discrete FT before, but never used explicit recursion to do it.
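The recursion enters through the radix-2 Cooley-Tukey split; a minimal Python sketch of that idea (power-of-two lengths only, purely illustrative):

Python:
import cmath

def fft(x):
    # The DFT of n points is rebuilt from the DFTs of the even- and
    # odd-indexed halves; len(x) must be a power of 2.
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

print([round(abs(v), 3) for v in fft([1, 1, 1, 1, 0, 0, 0, 0])])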
 
  • #41
jedishrfu said:
Yes, a good example of this is when you make an expression parser.

That's a very good example because most programming languages that I have seen use recursion in defining what constitutes a valid statement in the language.
 
  • #42
.Scott said:
If you're looking for a worst-case scenario for "no stack support", I submit the HP 2100 series (circa 1967).
A call to a routine at location 300 would result in the return address being stored at location 300 and execution beginning at 302. And I do remember writing code to support reentrancy on that machine.
Predating that, the CDC 3000 series (circa 1963) did the same thing: the first word of a routine was used to store the return address. There were other computers that also used this calling convention.

https://en.wikipedia.org/wiki/CDC_3000_series

The calling convention for IBM mainframes going back to the IBM 360 doesn't involve a stack; instead the caller provides a save area, and for a call, R13 points to the save area, R14 holds the return address, and R15 the base address of the function called. For recursion, each instance of a caller allocates a save area from the heap, and since R13 is saved in the save area, you end up with a linked-list stack of save areas.
 
  • Informative
Likes Keith_McClary and DrClaude
  • #43
@rcgldr:
You're right. I had forgotten about that CDC "RTJ" (Return Jump) instruction. I worked on a CDC 3300 at Lowell Tech - mostly in COMPASS (the macro assembler) and FORTRAN.

The RTJ instruction is documented on page 5-47 of the CDC 3300 Hardware Manual.
 
Last edited:
  • #44
Recursion can be a tricky concept for a beginner because its definition is very simple but the real-world uses are complex. Recursion is generally used when we want a function to call itself again and again until it hits a base condition.
The factorial of a number is the simplest case of using recursion: we keep evaluating n * factorial(n - 1) until the argument reaches the base case.
But using recursion is not that simple: calling a function again and again, n times, takes a lot of resources, and in such cases loops can be the handier tool.
So why and when should we use recursion?
Although recursion can act like a loop, it should not be used as a drop-in alternative for one; the main reason is that recursion can make a simple loop program more complex and increase its time and space complexity.
Recursion really shines when used in complex algorithms and data structures like searching, sorting, trees, and graph traversal. These algorithms generally work by divide and conquer, where we keep dividing the structure or data into halves or sub-parts; this is where recursion is powerful, whereas loops are less natural in such scenarios (a sketch of the shape follows below).
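A minimal Python sketch of that divide-and-conquer shape (binary search, my own example):

Python:
def binary_search(a, target, lo=0, hi=None):
    # Halve the search range on every call until it is empty.
    if hi is None:
        hi = len(a) - 1
    if lo > hi:
        return -1                       # base case: not found
    mid = (lo + hi) // 2
    if a[mid] == target:
        return mid
    if a[mid] < target:
        return binary_search(a, target, mid + 1, hi)
    return binary_search(a, target, lo, mid - 1)

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3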
 
  • Like
Likes anorlunda and bhobba
  • #45
vinaykhatri said:
Although recursion can act like a loop, it should not be used as a drop-in alternative for one; the main reason is that recursion can make a simple loop program more complex and increase its time and space complexity.
This is not always true. Some languages (for example, Lisp) implement loop constructs using recursion, and have optimizations for common cases of recursion that eliminate the increase in time and space complexity (for example, tail call optimization).

A better general argument would be that loops are much easier for programmers to reason about than recursion, so that, even if your language implements loop constructs using recursion, it still benefits from having the loop construct because it is easier for programmers. In most cases these days, it is much more important to optimize programmer time than to optimize computation time, since the former is far more expensive than the latter.
 
  • Like
Likes Mark44, anorlunda and bhobba
  • #46
I've taught a lot of introductory programming classes over the years, mostly in C or C++, but also in other languages, such as Modula-2 (one time) and MIPS assembly (a half-dozen times). In all of these classes I've had students write programs that feature functions and in most I have assigned problems in which they write recursive functions.

A few years back I got interested in Lisp and wrote several programs in that language. More recently I have developed an interest in functional programming, and have been learning Erlang, a language that was developed by Ericsson AB, a Swedish telecom company.

In my classes, whenever I presented material on recursive functions, I tried to come up with examples other than the ubiquitous factorial function or the Fibonacci function, examples of which seem to appear in every programming textbook that covers recursion.

Two examples that I think are interesting feature recursive functions that 1) convert a decimal integer into its hexadecimal form, and 2) calculate the binomial coefficient ##\binom {n}{k}##, also written as ##~_nC_k##. The binomial coefficient gives the number of combinations of n items taken k at a time.

Here's the first of these, in Erlang. It converts a decimal integer to its hexadecimal form.
dechex.png

For some reason, the forum software wouldn't let me insert the code as text inside code tags, so I've attached the code as images. I've done a fair amount of testing, and this code seems to work well. Most of the work of recursion is done in the second toHexAsst() function, in which the last line calls the function recursively, making it tail-call recursive. In Erlang, a recursive function with tail-call recursion gets converted by the compiler into a loop, so there's not the problem of blowing up the stack if recursion goes too deep.

The other example calculates the binomial coefficient, ##~_nC_k##. This is something I've never come across in my reading. I've omitted the comments I have in the code, but I'll provide a brief explanation here.

If f(n, k) = ##~_nC_k##, a recurrence relation is f(n, k) = n*(n - 1) * f(n - 2, k - 1) / [ (n - k) * k].
So, ##~_nC_k## = n * (n - 1) / [(n - k)*k] * ##~_{n - 2}C_{k - 1}##. The division here is integer division, which is guaranteed to be exact by virtue of the numbers involved.

By symmetry ##~_nC_k## = ##~_nC_{n -k}##. For example, ##~_8C_5 =~_8C_{8 - 5} = ~_8C_3##. This is easy to confirm by writing a few rows of the Pascal Triangle, which contains the binomial coefficients.
combi.png

This one isn't tail-call recursive, and I can't for the life of me figure out how to make it so. Previous attempts produced incorrect values due to the integer division being performed too early.
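For readers who can't view the attached images, here is a Python rendering of this recurrence (a sketch; the attached Erlang is the original). Note that the multiplications come before the integer division, so the division stays exact:

Python:
def combi(n, k):
    k = min(k, n - k)          # symmetry: C(n, k) == C(n, n - k)
    if k == 0:
        return 1
    if k == 1:
        return n
    # C(n, k) = n(n - 1) / [(n - k) k] * C(n - 2, k - 1);
    # multiply first, divide last, so // is exact.
    return n * (n - 1) * combi(n - 2, k - 1) // ((n - k) * k)

print(combi(5, 2), combi(8, 5))  # 10 56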
 
Last edited:
  • #47
One function we wrote recursively was Ackermann's function. It's a do-nothing function, but it can really drag your machine down while doing a computation.

https://en.wikipedia.org/wiki/Ackermann_function

Here's the rosettacode.org Erlang version of Pascal's triangle:

https://rosettacode.org/wiki/Pascal's_triangle#Erlang

[CODE lang="erlang" title="Pascal triangle in Erlang"]-import(lists).
-export([pascal/1]).

pascal(1)-> [[1]];
pascal(N) ->
L = pascal(N-1),
[H|_] = L,
[lists:zipwith(fun(X,Y)->X+Y end,[0]++H,H++[0])|L].[/CODE]

Note that this isn't a tail-recursive design, though: the result of pascal(N-1) is used after the recursive call returns.
 
  • #48
Mark44 said:
For some reason, the forum software wouldn't let me insert the code as text inside code tags,
I think there is something broken at the moment, there are other symptoms.
Mark44 said:
This one isn't tail-call recursive, and I can't for the life of me figure out how to make it so. Previous attempts produced incorrect values due to the integer division being performed too early.
Interesting, I would have thought the compiler could have sorted it out (perhaps there is a higher level of optimization you can select) but I think you need
Code:
  true ->
    (N * (N - 1)) div ((N - K) * K) * combi(N - 2, K - 1)
 
  • #50
pbuk said:
Interesting, I would have thought the compiler could have sorted it out (perhaps there is a higher level of optimization you can select) but I think you need
Code:
  true ->
    (N * (N - 1)) div ((N - K) * K) * combi(N - 2, K - 1)
I'm not familiar enough with the compiler and its settings to know if I can adjust this.

Regarding your suggestion, I actually tried that, but got incorrect results, due to integer division being performed at the wrong time. For example, for ##~_5C_2## the factor to the left of the call to combi() is
$$\frac{5 \cdot 4}{3 \cdot 2}= \frac {20} 6$$
With integer division, this works out to 3. The other factor, from the call to combi(), is ##~_3C_1## or 3, so the result would be 3 * 3 = 9.

If the factors are placed in the proper order, you get ##20 \cdot 3 \text{ div } 6##, or 10, the correct value. I haven't run across any information about operator precedence and associativity, but it's almost certainly the case that multiplication and division are left-associative.
 