Could someone translate this Donald Knuth algorithm to C/C++?

  • #1

Main Question or Discussion Point

[Can't embed image. Keep getting 500 Internal Server Error ...]

https://i.stack.imgur.com/rmDuR.jpg

Image added by mentor
rmDuR.jpg
 
Last edited by a moderator:

Answers and Replies

  • #2
Tom.G
Science Advisor
3,248
1,995
That program is from "THE ART OF COMPUTER PROGRAMMING", by Donald Knuth, Vol. 3 "SEARCHING AND SORTING".

The assembly language used is MIXAL, as described in Vol.1 "FUNDAMENTAL ALGORITHIMS", pg. 120 in the 2nd edition.

Have Fun and Learn Much!
Tom

p.s. That book series is well worth having in your personal library. It is sometimes referred to as The Bible of programming fundamentals.
 
  • Like
Likes jim mcnamara, anorlunda and jedishrfu
  • #3
That program is from "THE ART OF COMPUTER PROGRAMMING", by Donald Knuth, Vol. 3 "SEARCHING AND SORTING".

The assembly language used is MIXAL, as described in Vol.1 "FUNDAMENTAL ALGORITHIMS", pg. 120 in the 2nd edition.

Have Fun and Learn Much!
Tom

p.s. That book series is well worth having in your personal library. It is sometimes referred to as The Bible of programming fundamentals.
I do own the book, but I've mostly been able to get through it without learning MIXAL by just reading his paragraph descriptions or his pseudo-code. In this case I'm not quite getting it and I'm wondering if someone could either translate it to C code or direct me to a link online where it is showed (I don't know what to look for since the algorithm doesn't appear to go by a name).
 
  • #4
11,827
5,450
Try looking for Knuth Quicker Sequential Search algorithm.

I found this article on wikipedia:

https://en.wikipedia.org/wiki/Linear_search

Its hard to tease out an answer here without the other parts of the algorithm. It seems that this was replacing the inner loop of a prior algorithm. I would suggest learning MIXAL to really understand it.

https://esolangs.org/wiki/MIX_(Knuth)

You could imagine you have an arithmetic register called A and an index register called X1
C:
A = K    // LDA K
mem[KEY+N+1] = A    // STA KEY+N+1
X1 = -1 - N     //  ENT1 -1-N

if (A == mem[KEY+N+X1]) { goto four_f } else { goto three_b }

...
The CMP does a compare of the A register to some memory location which is computed from KEY+N and adding in the index register X1 ie like the "i" in a for loop. You'll have to tease out the rest and how it fits into the larger algorithm though.

KEY is likely the name of the table to be searched and N is likely the number of bytes in the table and the index register X1 I mentioned acts like the "i" in a C for loop

As an aside, I found this ACM article on searching that mentions Knuth and job interview questions which may be good to know for long range prospects ie not for this immediate thread.

https://queue.acm.org/detail.cfm?id=2984631
 
  • #5
Filip Larsen
Gold Member
1,257
184
I belive, as jedishrfu also points out, that Knuth here is giving an example of the fairly common optimization technique in low-level programming of duplicating the body of a (small) loop to reduce the relative high cost of conditional testing and jumping when iterating. The general term is (full or partial) loop unrolling.

(Edit: added (relative expensive) conditional testing as reason for unrolling).
 
Last edited:
  • Like
Likes anorlunda and jedishrfu
  • #6
11,827
5,450
I once had a great boss at GE who was a master of GMAP the macro-assembler langauge of the GE-635 mainframe. The language is very similar to Knuth's MIXAL language.

One time, I saw in his code a redundant list of LDA / STA statements that was copying a table from one location to another. There were like 50 repetitions of LDA / STA over and over.

Being a boy genius at the time ie a newly minted programmer from the academic halls of GE's programming school. I suggested an obvious loop solution to this block code which I felt was plain ugly and wasted memory. He kindly showed me the error of my ways by pointing out his code took 100 machine cycles whereas my looping code took roughly 250 cycles to complete.

The wasted memory was in fact not wasted since this code was in an initializing subroutine that ran once at startup and then was no longer needed. He would reuse the memory area for follow-on data storage. In his code, there was no notion of keeping program code separate from data as is commonly done today with our programming systems. He had use of all memory allocated to his program and used it in the most effective and efficient manner. Consequently, his program took up far less space than a higher level program would. On mainframes of the time, that was a key thing to keep in mind.

In summary, I would suggest you learn MIXAL to fully understand and appreciate Knuth's writing. As of now, its like you're trying to learn a subject based on Calculus with only an understanding of algebra.
 
  • #7
anorlunda
Staff Emeritus
Insights Author
8,603
5,495
He kindly showed me the error of my ways by pointing out his code took 100 machine cycles whereas my looping code took roughly 250 cycles to complete.
In those days memory was also severely limited. That made it hard to decide whether to optimize for speed or memory.

A decade or two later, we were still struggling with the limitations of address space. Virtual memory was the magic sauce that finally motivated wider address fields in the instruction set.

But then in the 90s, communications overwhelmed both speed&space and bingo we had the Internet. But IBM mainframe fans could brag that IBM had the fastest input/output throughput all along.
 
  • #8
11,827
5,450
My recollection was that IBM had the best hardware but that GE/Honeywell had the best OS known as GCOS that was highly configurable. Unix was yet to be conceived.

Unix came out of the partnership of MIT, GE, Honeywell and AT&T on Multics the successor to GCOS. AT&T Bell Labs became dissatisfied and broke away later making Unix and using the commands for the Multics/GCOS Access system as their ubiquitous ls, rm, ... shell commands.

and now back to our regularly hosted thread.
 
Last edited:
  • Like
Likes Klystron
  • #9
Klystron
Gold Member
690
939
As an aid to understanding when studying Knuth and other pioneers, consider creating and commenting intermediate variables into your pseudo-code. @jedishrfu's code fragment in post #4 is a good example.

Extract the embedded variables, assign them data types and trace their values through critical sections of the process. The OP's request to translate the algorithms into C makes sense but only after you understand what the original code intends to accomplish.

For instance a memory address might be held in and treated as an integer, even used as an implied index to an array, but the sophisticated student understands they are manipulating computer memory even if the language does not explicitly include pointers. Expanding embeds and declaring accurate data types aids understanding.

Deconstruct the terse optimized algorithm using descriptive intermediate variables that you choose. Once you understand the steps, you can re-embed and streamline the (pseudo)-code into tight C/C++ as you wish.
 
  • #10
11,827
5,450
Many times it’s often unnecessary to actually implement Knuth algorithms unless of course you are writing a new library of functions for a new language with new features then Knuth can help you understand what to optimize.

In nearly all other cases, your system library or third party libraries will have done this work for you making it an unnecessary exercise.

However, it doesn’t hurt to try your skills and implement it.

Nowadays, most good programmers seek off the shelf solutions and support to minimize their code base while maximizing their time to develop their code using the building blocks of the open source world. Minimizing your code reduces possible coding defects and leverages the open source community to fix their defects in their updates to you.
 
  • Like
Likes QuantumQuest
  • #11
33,646
5,315
but I've mostly been able to get through it without learning MIXAL
In summary, I would suggest you learn MIXAL to fully understand and appreciate Knuth's writing.
I second that comment. Here's a link to a short summary of Knuth's MIX language - https://en.wikipedia.org/wiki/MIX. From the page in Knuth that you're looking at, and its comments, plus the wiki article, you have a better chance of understanding what Knuth is doing in his algorithm.

Keep in mind that in 1962, when Knuth started his "Art of Computer Programming" books, Fortran had been available on for about 5 years, and COBOL for only about 3 years. A lot of programming was done in whatever assembly language was available for the mainframe one was working on. Knuth's MIX language seems to me to be his effort to write pseudocode that could be adapted to one of the few high-level languages or to another assembly language.

Regarding the link in post #1. You can't attach a hyper link as a file. What I did was to save the imgur image as a file on my computer, and then attach that image.
 
  • Like
Likes jim mcnamara, QuantumQuest and jedishrfu

Related Threads on Could someone translate this Donald Knuth algorithm to C/C++?

  • Last Post
Replies
3
Views
2K
  • Last Post
Replies
0
Views
1K
Replies
25
Views
1K
Replies
8
Views
868
  • Last Post
Replies
10
Views
4K
Replies
3
Views
2K
Replies
7
Views
828
Replies
2
Views
881
  • Last Post
Replies
16
Views
6K
Replies
4
Views
2K
Top