Could someone translate this Donald Knuth algorithm to C/C++?

SlurrerOfSpeech · May 3, 2020

[Can't embed image. Keep getting 500 Internal Server Error ...]

https://i.stack.imgur.com/rmDuR.jpg

Image added by mentor

Tom.G · May 3, 2020

That program is from "THE ART OF COMPUTER PROGRAMMING", by Donald Knuth, Vol. 3 "SEARCHING AND SORTING".

The assembly language used is MIXAL, as described in Vol.1 "FUNDAMENTAL ALGORITHIMS", pg. 120 in the 2nd edition.

Have Fun and Learn Much!
Tom

p.s. That book series is well worth having in your personal library. It is sometimes referred to as The Bible of programming fundamentals.

SlurrerOfSpeech · May 3, 2020

Tom.G said:

That program is from "THE ART OF COMPUTER PROGRAMMING", by Donald Knuth, Vol. 3 "SEARCHING AND SORTING".

The assembly language used is MIXAL, as described in Vol.1 "FUNDAMENTAL ALGORITHIMS", pg. 120 in the 2nd edition.

Have Fun and Learn Much!
Tom

p.s. That book series is well worth having in your personal library. It is sometimes referred to as The Bible of programming fundamentals.

I do own the book, but I've mostly been able to get through it without learning MIXAL by just reading his paragraph descriptions or his pseudo-code. In this case I'm not quite getting it and I'm wondering if someone could either translate it to C code or direct me to a link online where it is showed (I don't know what to look for since the algorithm doesn't appear to go by a name).

jedishrfu · May 4, 2020

Try looking for Knuth Quicker Sequential Search algorithm.

I found this article on wikipedia:

https://en.wikipedia.org/wiki/Linear_search

Its hard to tease out an answer here without the other parts of the algorithm. It seems that this was replacing the inner loop of a prior algorithm. I would suggest learning MIXAL to really understand it.

https://esolangs.org/wiki/MIX_(Knuth)

You could imagine you have an arithmetic register called A and an index register called X1

C:

A = K    // LDA K
mem[KEY+N+1] = A    // STA KEY+N+1
X1 = -1 - N     //  ENT1 -1-N

if (A == mem[KEY+N+X1]) { goto four_f } else { goto three_b }

...

The CMP does a compare of the A register to some memory location which is computed from KEY+N and adding in the index register X1 ie like the "i" in a for loop. You'll have to tease out the rest and how it fits into the larger algorithm though.

KEY is likely the name of the table to be searched and N is likely the number of bytes in the table and the index register X1 I mentioned acts like the "i" in a C for loop

As an aside, I found this ACM article on searching that mentions Knuth and job interview questions which may be good to know for long range prospects ie not for this immediate thread.

https://queue.acm.org/detail.cfm?id=2984631

Filip Larsen · May 4, 2020

I belive, as jedishrfu also points out, that Knuth here is giving an example of the fairly common optimization technique in low-level programming of duplicating the body of a (small) loop to reduce the relative high cost of conditional testing and jumping when iterating. The general term is (full or partial) loop unrolling.

(Edit: added (relative expensive) conditional testing as reason for unrolling).

jedishrfu · May 4, 2020

I once had a great boss at GE who was a master of GMAP the macro-assembler language of the GE-635 mainframe. The language is very similar to Knuth's MIXAL language.

One time, I saw in his code a redundant list of LDA / STA statements that was copying a table from one location to another. There were like 50 repetitions of LDA / STA over and over.

Being a boy genius at the time ie a newly minted programmer from the academic halls of GE's programming school. I suggested an obvious loop solution to this block code which I felt was plain ugly and wasted memory. He kindly showed me the error of my ways by pointing out his code took 100 machine cycles whereas my looping code took roughly 250 cycles to complete.

The wasted memory was in fact not wasted since this code was in an initializing subroutine that ran once at startup and then was no longer needed. He would reuse the memory area for follow-on data storage. In his code, there was no notion of keeping program code separate from data as is commonly done today with our programming systems. He had use of all memory allocated to his program and used it in the most effective and efficient manner. Consequently, his program took up far less space than a higher level program would. On mainframes of the time, that was a key thing to keep in mind.

In summary, I would suggest you learn MIXAL to fully understand and appreciate Knuth's writing. As of now, its like you're trying to learn a subject based on Calculus with only an understanding of algebra.

anorlunda · May 4, 2020

jedishrfu said:

He kindly showed me the error of my ways by pointing out his code took 100 machine cycles whereas my looping code took roughly 250 cycles to complete.

In those days memory was also severely limited. That made it hard to decide whether to optimize for speed or memory.

A decade or two later, we were still struggling with the limitations of address space. Virtual memory was the magic sauce that finally motivated wider address fields in the instruction set.

But then in the 90s, communications overwhelmed both speed&space and bingo we had the Internet. But IBM mainframe fans could brag that IBM had the fastest input/output throughput all along.

jedishrfu · May 4, 2020

My recollection was that IBM had the best hardware but that GE/Honeywell had the best OS known as GCOS that was highly configurable. Unix was yet to be conceived.

Unix came out of the partnership of MIT, GE, Honeywell and AT&T on Multics the successor to GCOS. AT&T Bell Labs became dissatisfied and broke away later making Unix and using the commands for the Multics/GCOS Access system as their ubiquitous ls, rm, ... shell commands.

and now back to our regularly hosted thread.

Klystron · May 4, 2020

As an aid to understanding when studying Knuth and other pioneers, consider creating and commenting intermediate variables into your pseudo-code. @jedishrfu's code fragment in post #4 is a good example.

Extract the embedded variables, assign them data types and trace their values through critical sections of the process. The OP's request to translate the algorithms into C makes sense but only after you understand what the original code intends to accomplish.

For instance a memory address might be held in and treated as an integer, even used as an implied index to an array, but the sophisticated student understands they are manipulating computer memory even if the language does not explicitly include pointers. Expanding embeds and declaring accurate data types aids understanding.

Deconstruct the terse optimized algorithm using descriptive intermediate variables that you choose. Once you understand the steps, you can re-embed and streamline the (pseudo)-code into tight C/C++ as you wish.

jedishrfu · May 4, 2020

Many times it’s often unnecessary to actually implement Knuth algorithms unless of course you are writing a new library of functions for a new language with new features then Knuth can help you understand what to optimize.

In nearly all other cases, your system library or third party libraries will have done this work for you making it an unnecessary exercise.

However, it doesn’t hurt to try your skills and implement it.

Nowadays, most good programmers seek off the shelf solutions and support to minimize their code base while maximizing their time to develop their code using the building blocks of the open source world. Minimizing your code reduces possible coding defects and leverages the open source community to fix their defects in their updates to you.

Mark44 · May 4, 2020

SlurrerOfSpeech said:

but I've mostly been able to get through it without learning MIXAL

jedishrfu said:

In summary, I would suggest you learn MIXAL to fully understand and appreciate Knuth's writing.

I second that comment. Here's a link to a short summary of Knuth's MIX language - https://en.wikipedia.org/wiki/MIX. From the page in Knuth that you're looking at, and its comments, plus the wiki article, you have a better chance of understanding what Knuth is doing in his algorithm.

Keep in mind that in 1962, when Knuth started his "Art of Computer Programming" books, Fortran had been available on for about 5 years, and COBOL for only about 3 years. A lot of programming was done in whatever assembly language was available for the mainframe one was working on. Knuth's MIX language seems to me to be his effort to write pseudocode that could be adapted to one of the few high-level languages or to another assembly language.

Regarding the link in post #1. You can't attach a hyper link as a file. What I did was to save the imgur image as a file on my computer, and then attach that image.

Could someone translate this Donald Knuth algorithm to C/C++?

Is A.I. more than the sum of its parts?

AI vs. Humans as Processors in an Environment

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Could someone translate this Donald Knuth algorithm to C/C++?

Similar threads