Is it easier to search for a label than a trailing right brace?

rcgldr · Apr 20, 2009

Rather than hijack the help with C thread, discussion can continue here.

Jeff Reid said:

I don't understand why most programmers are "anti-goto". It's far easier to search for a label than a trailing right brace. Regardless of how you write your C code, eventually the assembly equivalent of a "goto", a branch or jump instruction is going to be used (except machines that have conditional execution of instructions such as the Motorola 68000 series, and it only helps with 2 to 4 instruction sequences). If there are multiple entry or exit points in a program's loops, then goto's are OK.

mXSCNT said:

The main problem with goto is that it can do almost anything. When reading code, it's harder to figure out what a goto or label is for than what a loop is for, because the goto could do so many things (jumping control to anywhere in the program). A loop can only apply to the loop body, and two loops can't "overlap" the way gotos can. That means it is easier to reason about loops than gotos, so loops are preferable when possible.

AUMathTutor said:

"I don't understand why most programmers are "anti-goto"."
For high-level programming, they are generally considered too unstructured. They support ad-hoc programming and decrease readability.

"It's far easier to search for a label than a trailing right brace."
I disagree. With proper use of white space, C-style brace notation makes it very easy to see the block-structure.

"Regardless of how you write your C code, eventually the assembly equivalent of a "goto", a branch or jump instruction is going to be used (except machines that have conditional execution of instructions such as the Motorola 68000 series, and it only helps with 2 to 4 instruction sequences)."
What do you mean here? Yes, whenever code is translated into assembly language, loops and conditionals become the equivalent of "goto".
But any and all programs require only a single while loop and the if-then-else structure. This is provably true.

"If there are multiple entry or exit points in a program's loops, then goto's are OK."
I respectfully disagree. If your code starts using several goto's, you probably need to rework your design.

AUMathTutor said:

"I don't understand why most programmers are "anti-goto"."
For high-level programming, they are generally considered too unstructured. They support ad-hoc programming and decrease readability.

Just like anything else, goto's can be used appropriately or inappropriately. Inappropriate use doesn't invalidate using goto's.

Jeff Reid said:
"It's far easier to search for a label than a trailing right brace."
"I disagree. With proper use of white space, C-style brace notation makes it very easy to see the block-structure."

Take this sequence
Code:
    //  step 1
    if(status_from_step_1 == OK){
        // step 2
        if(status_from_step_2 == OK){
            // step 3
            if(status_from_step_3 == OK){
                // step 4
            }
        }
    }
versus:
Code:
    //  step 1
    if(status_from_step_1 != OK)
        goto exit0;
    // step 2
    if(status_from_step_2 != OK)
        goto exit0;
    // step 3
    if(status_from_step_3 != OK)
        goto exit0;
    // step 4

exit0: // common exit point
The first sequence involves unneeded indentation and implies that step 3 is a sub-step of step 2, which is a sub-step of step 1, when they are really at the same level. IMO, it's a lot easier to find exit0 than to find that right brace, especially if the indentation is messed up.

But any and all programs require only a single while loop and the if-then-else structure. This is provably true.

I assume you mean one main loop per function, not per program. Take the case of a file copy or backup program that processes all the files in a directory before working on any subdirectories. There are two main loops, one to process all files in the current directory, and one to process all the sub-directories, recursively calling the process files loop, which in turn recursively calls the process sub-directories loop, ...

On a somewhat related issue, I prefer to use pointers to functions as opposed to state variables (switch case) for deferred actions, as it allows a series of small functions to be sequentially located in a source file rather than tied into the switch case statement for some common message handler. If I need to add a step, I don't have to edit the common message handler, as it just calls through a pointer to function. C++ implements the equivalent in its classes; in C, structures can have pointers to functions.

AUMathTutor said:

"Just like anything else, goto's can be used appropriately or inappropriately. Inappropriate use doesn't invalidate using goto's."
Fair enough. I like to avoid them, however, because I feel it's all too easy to misuse them. I guess when it comes to the GOTO I prefer to err on the side of not using them.

"The first sequence involves unneeded indentation and implies that step 3 is a sub-step of step 2, which is a sub-step of step 1, when they are really at the same level. IMO, it's a lot easier to find exit0 than to find that right brace, especially if the indentation is messed up."
I disagree. Fundamentally, the subsequent if statements do depend on each other. This dependence is explicit. I think the indentation should be preferred because it illustrates this fact. It's only if you think in terms of the GOTO that it seems that those steps are on the same level... they're not.

"I assume you mean one main loop per function, not per program."
False. I suggest you read the proof of Boehm and Jacopini. Any algorithm which can be described using a flowgraph can be written using a single while loop and if-then-else structures. Any recursive algorithm can be written as an iterative algorithm... so recursive functions aren't exempt from Boehm-Jacopini.

"On a somewhat related issue, I prefer to use pointers to functions as opposed to state variables (switch case) for deferred actions, as it allows a series of small functions to be sequentially located in a source file rather than tied into the switch case statement for some common message handler. If I need to add a step, I don't have to edit the common message handler, as it just calls through a pointer to function. C++ implements the equivalent in its classes; in C, structures can have pointers to functions."
...

rcgldr · Apr 20, 2009

Jeff Reid said:

The first sequence involves unneeded indentation and implies that step 3 is a sub-step of step 2, which is a sub-step of step 1, when they are really at the same level. IMO, it's a lot easier to find exit0 than to find that right brace, especially if the indentation is messed up.

AUMathTutor said:

I disagree. Fundamentally, the subsequent if statements do depend on each other. This dependence is explicit. I think the indentation should be preferred because it illustrates this fact. It's only if you think in terms of the GOTO that it seems that those steps are on the same level... they're not.

Depends on the nature of the failure in the previous steps. I could prefix each step with an if "no hardware failure detected", and that would not imply dependency among the steps. With any sequential process, each step depends on the output of a previous step, and in my coding style I don't indent the steps of a sequential process.

Jeff Reid said:

I assume you mean one main loop per function, not per program.

AUMathTutor said:

False. I suggest you read the proof of Boehm and Jacopini. Any algorithm which can be described using a flowgraph can be written using a single while loop and if-then-else structures. Any recursive algorithm can be written as an iterative algorithm... so recursive functions aren't exempt from Boehm-Jacopini.

Except for truly parallel sequences (multiple parallel flow charts) I understand what you're getting at, but it can get to a point that using state variables or the equivalent to avoid sequential loops is more unreadable than simply coding several sequential loops. Likewise, sometimes recursion is more readable than creating a virtual stack of state variables to accomplish the equivalent.
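To make the readability trade-off concrete, here is a rough sketch in C. It is not the construction from the Boehm-Jacopini paper, only an illustration in its spirit; the function names and the "count elements above the mean" task are made up for the example. The first function uses two sequential loops; the second forces the same work into one while loop driven by a state variable.

```c
#include <stddef.h>

/* Straightforward version: two sequential loops. */
int count_above_mean_two_loops(const int *a, size_t n)
{
    long sum = 0;
    int count = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += a[i];
    for (i = 0; i < n; i++)
        if ((long)a[i] * (long)n > sum)   /* a[i] > mean, without dividing */
            count++;
    return count;
}

/* Same computation squeezed into a single while loop. The state
   variable stands in for "which loop am I currently in". */
int count_above_mean_one_loop(const int *a, size_t n)
{
    enum { SUMMING, COUNTING, DONE } state = SUMMING;
    long sum = 0;
    int count = 0;
    size_t i = 0;

    while (state != DONE) {
        if (state == SUMMING) {
            if (i < n) {
                sum += a[i++];
            } else {
                i = 0;                /* restart the index for phase two */
                state = COUNTING;
            }
        } else {                      /* state == COUNTING */
            if (i < n) {
                if ((long)a[i] * (long)n > sum)
                    count++;
                i++;
            } else {
                state = DONE;
            }
        }
    }
    return count;
}
```

Both functions return the same result; the single-loop form simply trades two obvious loops for a state variable that the reader has to track.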

I try to avoid making the same decision twice. If the action on a decision needs to be deferred, then I use some type of programmed jump or call instead of retesting a state variable. In the case of Fortran, this can be accomplished via a computed "goto" (I haven't written a Fortran program since the 1970's). In the case of C, it can be accomplished by calling through a pointer to a function. In C++, a class member function can be overridden. There are times when state variables are better, but I prefer using pointers to functions. Using if statements or switch case statements simply to avoid the few cases where a goto is probably the best choice isn't a good idea.

Jeff Reid said:

On a somewhat related issue, I prefer to use pointers to functions as opposed to state variables (switch case) for deferred actions, as it allows a series of small functions to be sequentially located in a source file rather than tied into the switch case statement for some common message handler. If I need to add a step, I don't have to edit the common message handler, as it just calls through a pointer to function. C++ implements the equivalent in its classes; in C, structures can have pointers to functions.

Example of this below. Note that adding a step doesn't require any change to the common handler source:

Code:

//  common handler

void CommonHandler(void)
{
    switch (message){
        ...
        case eNextStep:
            (*pfNextStep)();    // call whichever step is pending
            break;
        ...
    }
}

void Step1(void)
{
    pfNextStep = Step2;
    // initiate step1 sequence
}

void Step2(void)
{
    pfNextStep = Step3;
    // initiate step2 sequence
}

void Step3(void)
{
    pfNextStep = SequenceDone;
    // initiate step3 sequence
}

void SequenceDone(void)
{
    pfNextStep = NextStepUnexpected;
}

void NextStepUnexpected(void)
{
    // handle unexpected next step message
}

AUMathTutor · Apr 21, 2009

"Depends on the nature of the failure in the previous steps. I could prefix each step with an if "no hardware failure detected", and that would not imply dependency among the steps. With any sequential process, each step depends on the output of a previous step, and in my coding style I don't indent the steps of a sequential process."

Again, I don't think it's right to think of the if statements as being on a single level. The results may be independent of previous steps, but the preconditions and postconditions are intimately connected. In words, you would say:

do step 1.
if step 1 worked, do step 2.
if step 2 worked, do step 3.
if step 3 worked, do step 4.

These can be logically expanded to the following:

do step 1.
if step 1 worked, do step 2.
if step 1 and step 2 worked, do step 3.
if step 1 and step 2 and step 3 worked, do step 4.

Therefore the execution of the steps is inherently based on some condition concerning previous steps. This semantic information is hidden by using the GOTO.

"Except for truly parallel sequences (multiple parallel flow charts) I understand what you're getting at, but it can get to a point that using state variables or the equivalent to avoid sequential loops is more unreadable than simply coding several sequential loops. Likewise, sometimes recursion is more readable than creating a virtual stack of state variables to accomplish the equivalent."
True enough, but it remains true that the GOTO is not technically required in any programming language - whether it makes some parallel programming more straightforward or not is another matter entirely.

I think that's a very dangerous game you're playing with pointers. It seems completely unnecessary to me. I would simply do it using state variables:

Code:

void func(void)
{
   switch(message)
   {
      // cases...
      case doNextStep:
         switch(nextState)
         {
            // cases...
            case firstState:
               stateFunction1();
               break;
            // cases...
         }

         break;
      // cases...
   }
}

void stateFunction1(void)
{
   // do something...
   nextState = secondState;
}

// other state functions...

Another option in C is to use a state table. This is closer to what you are doing, but better IMHO. In C++ I think state tables are a great idea, and naturally lend themselves to a class:

Code:

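class StateTableADT
{
public:
   // Parameter and return types below are assumed for illustration.
   typedef void (*StateFunction)();

   // add an entry to the state table
   void addState(int stateNumber, StateFunction stateFunction, int nextState);

   // remove an entry from the state table
   void removeState(int stateNumber);

   // get the current state
   int getState() const;

   // set the current state
   void setState(int stateNumber);

   // go to the next state and execute
   void next();
};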

This is at least a little better than what you're doing with raw function pointers, though, granted, it's not possible in straight C.
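For comparison, a table-driven version can be approximated in plain C with an array of structs, without the class wrapper. This is only a sketch under assumptions: the state names, the `trace` array used to record what ran, and the `state_next` function are all hypothetical and not from anyone's post.

```c
#include <stddef.h>

typedef void (*state_fn)(void);

struct state_entry {
    state_fn action;   /* what to do in this state */
    int next;          /* index of the state that follows */
};

/* Illustrative bookkeeping so the order of execution can be inspected. */
int trace[8];
int trace_len = 0;

static void record(int id)
{
    if (trace_len < 8)
        trace[trace_len++] = id;
}

static void step1(void)         { record(1); }
static void step2(void)         { record(2); }
static void step3(void)         { record(3); }
static void sequence_done(void) { record(0); }

enum { S_STEP1, S_STEP2, S_STEP3, S_DONE, S_COUNT };

/* Adding a step means adding one row here and one function above;
   the dispatcher below does not change. */
static const struct state_entry state_table[S_COUNT] = {
    [S_STEP1] = { step1,         S_STEP2 },
    [S_STEP2] = { step2,         S_STEP3 },
    [S_STEP3] = { step3,         S_DONE  },
    [S_DONE]  = { sequence_done, S_DONE  },
};

int current_state = S_STEP1;

/* Run the current state's action, then advance to its successor. */
void state_next(void)
{
    const struct state_entry *e = &state_table[current_state];
    e->action();
    current_state = e->next;
}
```

Calling `state_next()` four times runs the three steps and then the done handler, in that order. The table keeps the transitions in one visible place, which is arguably the main thing the class version offers.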

mXSCNT · Apr 21, 2009

I'm generally opposed to goto, but Jeff, you raise a point with your status_from_step_1, status_from_step_2, status_from_step_3 nested conditional example. The excess indentation for the nested conditional is unsightly. If you had 15 steps, it might be a legibility problem (although if you have 15 steps, perhaps you're going about it the wrong way).

One way to solve the problem in structured programming is to have a function which does steps 1,2,3, and instead of goto exit0, simply return from the function.
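A minimal sketch of that idea in C, assuming hypothetical step functions. Here `do_step`, `fail_at`, and `steps_run` are stand-ins invented so the control flow can be observed; real steps would do actual work.

```c
#define OK 0

int fail_at = 0;     /* hypothetical knob: which step should fail (0 = none) */
int steps_run = 0;   /* how many steps actually executed */

/* Stand-in for a real step: returns OK, or its own number on failure. */
static int do_step(int n)
{
    steps_run++;
    return (n == fail_at) ? n : OK;
}

/* Runs the steps in order and returns at the first failure,
   so the steps stay at one indentation level without a goto. */
int run_steps(void)
{
    int status;

    steps_run = 0;

    status = do_step(1);
    if (status != OK)
        return status;

    status = do_step(2);
    if (status != OK)
        return status;

    status = do_step(3);
    if (status != OK)
        return status;

    return do_step(4);
}
```

With `fail_at = 2`, `run_steps()` returns 2 and steps 3 and 4 never run. The cost is that any cleanup shared by all exits has to live in the caller or be repeated before each return.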

rcgldr · Apr 21, 2009

AUMathTutor said:

"Depends on the nature of the failure in the previous steps. I could prefix each step with an if "no hardware failure detected", and that would not imply dependency among the steps. With any sequential process, each step depends on the output of a previous step, and in my coding style I don't indent the steps of a sequential process."

Again, I don't think it's right to think of the if statements as being on a single level. The results may be independent of previous steps, but the preconditions and postconditions are intimately connected. In words, you would say:

do step 1.
if step 1 worked, do step 2.
if step 2 worked, do step 3.
if step 3 worked, do step 4.

These can be logically expanded to the following:

do step 1.
if step 1 worked, do step 2.
if step 1 and step 2 worked, do step 3.
if step 1 and step 2 and step 3 worked, do step 4.

I was thinking more along the lines of a recipe sequence for cooking. Obviously, if step 2 requires butter and there is no butter, then the recipe stops at step 2 due to failure, yet the steps in a recipe sequence aren't indented based on the outcome of previous steps. Normally all the requirements, ingredients and utensils, are dealt with at the start; this might require the equivalent of 10 conditional statements, but all are independent requirements and should not be written to imply some inter-dependency. For example, if the requirements are allocated memory, an opened input file, and an opened output file, I wouldn't want the conditionals to be nested or the main line of code to have 3 levels of indentation.
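As a sketch of that memory / input file / output file case, here is one way the flat style might look in C. The function name `copy_file`, the buffer size, and the 0 / -1 return convention are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>

enum { COPY_BUF_SIZE = 4096 };   /* illustrative buffer size */

/* Hypothetical helper: copies src to dst. Returns 0 on success, -1 on
   failure. Each requirement is checked at the same indentation level,
   and any failure jumps to the single cleanup point. */
int copy_file(const char *src, const char *dst)
{
    int status = -1;             /* assume failure until every step succeeds */
    char *buf = NULL;
    FILE *in = NULL;
    FILE *out = NULL;
    size_t n;

    buf = malloc(COPY_BUF_SIZE);
    if (buf == NULL)
        goto exit0;
    in = fopen(src, "rb");
    if (in == NULL)
        goto exit0;
    out = fopen(dst, "wb");
    if (out == NULL)
        goto exit0;

    while ((n = fread(buf, 1, COPY_BUF_SIZE, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            goto exit0;
    }
    if (ferror(in))
        goto exit0;
    status = 0;

exit0:  /* common exit point: release whatever was acquired */
    if (out != NULL && fclose(out) != 0)
        status = -1;
    if (in != NULL)
        fclose(in);
    free(buf);                   /* free(NULL) is a no-op */
    return status;
}
```

Because every resource starts out as NULL, the cleanup code at `exit0` can run no matter which requirement failed.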

Therefore the execution of the steps is inherently based on some condition concerning previous steps. This semantic information is hidden by using the GOTO.

The condition is successful completion of the previous step; this is inherent in any sequential process. A reasonable exception is when a failure simply means to use an alternative method. For example, if there is no ASPI library on an XP system, then the alternative is to use the native API on XP for device operations.

True enough, but it remains true that the GOTO is not technically required in any programming language

Depends on the language; some languages only allow one or more branch labels to be specified as one or more of the results of a conditional.

I think that's a very dangerous game you're playing with pointers. It seems completely unnecessary to me. I would simply do it using state variables:

Code:

void func(void)
{
   switch(message)
   {
      // cases...
      case doNextStep:
         switch(nextState)
         {
            // cases...
            case firstState:
               stateFunction1();
               break;
            // cases...
         }

         break;
      // cases...
   }
}

void stateFunction1(void)
{
   // do something...
   nextState = secondState;
}

// other state functions...

The problem with this is that adding a step requires updating the message handler as well as updating the step handlers. This creates unneeded overhead. If there are a large number of steps, or sequences, the switch case statements get huge. In the case of an old driver I rewrote, the interrupt handler had over 200 cases. I converted this to use a small set of pointers to functions, with separate end and error action handlers, and the concept of nested end and error action handlers, so a high level multi-step sequence could initiate a series of mid level multi-step sequences, and each of those could initiate a series of low level multi-step sequences, each sequence with its own end and error action pointer to function.

In the case of Windows programming in C++ via the predefined classes, the programmer overrides default message handlers by overriding class functions (the equivalent of function pointers), not by adding cases to a switch statement as is done in C.

Hurkyl · Apr 21, 2009

do step 1.
if step 1 worked, do step 2.
if step 1 and step 2 worked, do step 3.
if step 1 and step 2 and step 3 worked, do step 4.

Yuck! I would assert that implementing this literally would be a Bad Thing. Upon failure, this algorithm is supposed to cease continuing -- but this approach has the program continuing through the rest of the algorithm, forcing each individual statement to be guarded against accidental execution after failure.

There are at least three serious problems:
1. Each time a new statement is added to the algorithm, you would have to craft a new conditional to guard it against execution after failure.

2. You have to remember to do so. (And you have to make sure the next person to touch your code is aware of this important fact)

3. If a new failure mode is introduced by new statements, every subsequent conditional has to be modified to catch the new failure mode.

The goto version, the nested if-block version, and the function-with-return version all do the Right Thing -- upon failure, flow immediately proceeds directly to where it's supposed to go.

Of course, my statements aren't absolute. I'm sure there are some algorithms that are most naturally expressed in the way you did. But I assert that such algorithms are few and far between, and most algorithms are better expressed in the form

If any of the following steps fails, skip forward to handle failure:
* do step 1
* do step 2
* do step 3
* do step 4

with the specific manner of skipping left up to the implementation. (goto, return, nested if's, exceptions, state machines (via various methods))

I would like to assert an axiom:

If it's acceptable in some situation to emulate a goto through other flow control mechanisms, then it's even better to just use a goto.

D H · Apr 21, 2009

Jeff Reid said:

Rather than hijack the help with C thread, discussion can continue here.

From the other thread, to keep in line with Jeff's hijack avoidance:

AUMathTutor said:

False. I suggest you read the proof of Boehm and Jacopini. Any algorithm which can be described using a flowgraph can be written using a single while loop and if-then-else structures.

False. Boehm and Jacopini's paper says any non-structured program can be built up using sequences, while loops, and alternation. The construction starts by wrapping a while loop around the whole program. The construction then adds sequences, loops, and alternations. There is nothing to stop the construction from nesting a loop within a loop; it is trivial to prove that such nested loops are needed to emulate the non-structured code.

Besides, the theorem has been falsified. Another construct besides sequence, loop, and alternation is needed. Loops with multi-level breaks work, for example.

Kozen, Dexter and Tseng, Wei-Lung Dustin, "The Böhm–Jacopini Theorem Is False, Propositionally", Lecture Notes in Computer Science, vol. 5133, pp. 177-192, Springer-Verlag, 2008
http://ecommons.cornell.edu/handle/1813/9478

Jeff Reid said:

Just like anything else, goto's can be used appropriately or inappropriately. Inappropriate use doesn't invalidate using goto's.

Goto-less programming can make for some extremely ugly code at times, particularly in projects where some structured programming fanatic project manager has ruled out break statements and has mandated the "single point of entry / single point of return" rule.

OTOH, when asked to review some butt ugly code, whether goto-less or not, my first question to the author is always "why did you make this code so ugly?"

Hurkyl said:

I would like to assert an axiom:

If it's acceptable in some situation to emulate a goto through other flow control mechanisms, then it's even better to just use a goto.

Surely you jest! While strict adherence to structured programming can be considered harmful, this axiom advocates complete abandonment of structured programming guidelines. They are excellent guidelines. The problems arise when these guidelines are made into rigid, no-exception-granted rules.

Hurkyl · Apr 21, 2009

D H said:

Surely you jest! While strict adherence to structured programming can be considered harmful, this axiom advocates complete abandonment of structured programming guidelines. They are excellent guidelines. The problems arise when these guidelines are made into rigid, no-exception-granted rules.

I'm not trying to advocate using goto when flow is naturally described by other mechanisms -- what I'm trying to advocate is to reject using the other mechanisms to implement flow that is most naturally expressed via goto.

The flagship example is the break out of double loop:

Code:

  for(int i = 0; i < 100; ++i) {
    for(int j = 0; j < 100; ++j) {
      if(table[i][j] == 0) { goto end_of_loop; }
    }
  }
end_of_loop:

(aside: one of the few things I give Java high praise for is having multi-level break statements to deal with this)

Yes, you could have turned this loop into a function that exits via return. If it should be a function, then make it a function. But I reject the idea that "eliminating goto" is just cause for making it a function.
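For reference, the function-with-return alternative mentioned here might look like the following sketch in C. The name `find_zero`, the `ROWS`/`COLS` constants, and the output parameters are assumptions made for the example.

```c
#include <stdbool.h>

enum { ROWS = 100, COLS = 100 };

/* Hypothetical helper: searches the table for a zero.
   On success, stores the coordinates and returns true;
   returning from the inner loop leaves both loops at once. */
bool find_zero(int table[ROWS][COLS], int *row, int *col)
{
    for (int i = 0; i < ROWS; ++i) {
        for (int j = 0; j < COLS; ++j) {
            if (table[i][j] == 0) {
                *row = i;
                *col = j;
                return true;
            }
        }
    }
    return false;   /* both loops ran to completion */
}
```

This works well when the search is a natural unit on its own. When the loop body needs many of the caller's local variables, they all have to be passed in, which is part of the objection to extracting a function only to avoid the goto.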

Worse is this style of "fix":

Code:

  int flag = 0;
  for(int i = 0; i < 100; ++i) {
    for(int j = 0; j < 100; ++j) {
      if(table[i][j] == 0) { flag = 1; break; }
    }
    if(flag) { break; }
  }

which breaks the jump into smaller pieces, which is considerably more complicated and error-prone, becoming very awkward if the loops are doing more complex work and if there are other skips to be made.

Standard Python style, I believe, implements this as

Code:

class end_of_loop_exception(Exception):
  pass
try:
  for i in range(100):
    for j in range(100):
      if table[i][j] == 0:
        raise end_of_loop_exception
except end_of_loop_exception:
  pass
# more code here

I assert this code snippet would be better implemented as a goto statement. (Of course, Python doesn't have them, so it uses exceptions to emulate them.) I still prefer this to the flag variable approach.

(Since being introduced to Python's "for...else" construct, I find myself using that flow control in C++ from time to time -- implemented with goto, though. It has the benefit of requiring neither flag variables nor a duplicate copy of the loop condition, both of which I find undesirable)
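A small sketch of what a goto-based "for...else" could look like, written in C rather than C++ to match the rest of the thread. The function name `first_zero` and the `searched_all` output flag are illustrative assumptions, added only so the "else" branch has an observable effect.

```c
#include <stddef.h>

/* Returns the index of the first zero in a[0..n-1], or -1 if none.
   *searched_all is set to 1 only when the loop ran to completion,
   which is the part Python's "else" clause on a for loop expresses. */
int first_zero(const int *a, size_t n, int *searched_all)
{
    size_t i;
    int result = -1;

    *searched_all = 0;
    for (i = 0; i < n; i++) {
        if (a[i] == 0) {
            result = (int)i;
            goto found;          /* plays the role of Python's "break" */
        }
    }
    /* "else" clause: reached only if the loop finished without a break */
    *searched_all = 1;

found:
    return result;
}
```

No flag is tested after the loop and the condition `i < n` is written only once; the goto skips the "else" part directly.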

rcgldr · Apr 21, 2009

Hurkyl said:

Yes, you could have turned this loop into a function that exits via "return".

I've never understood why many feel that returns scattered throughout a function are acceptable, but not goto's to a common exit point, or as in your case to break out of a nested loop. One disadvantage of the return is the loss of the current context (local variables).

Worse is this style of "fix":

Code:

  int flag = 0;
  for(int i = 0; i < 100; ++i) {
    for(int j = 0; j < 100; ++j) {
      if(table[i][j] == 0) { flag = 1; break; }
    }
    if(flag) { break; }
  }

I agree; it violates one of my rules about making the same decision twice. It creates a second unneeded conditional, and the more conditionals in a program the more paths to check and the more that can go wrong.

Grateful to find that I'm not the only remaining programmer in the world that is not "anti-goto".

D H · Apr 21, 2009

Hurkyl said:

I'm not trying to advocate using goto when flow is naturally described by other mechanisms -- what I'm trying to advocate is to reject using the other mechanisms to implement flow that is most naturally expressed via goto.

I think we are in violent agreement then. While goto can be considered harmful, so can strict adherence to goto-less programming. I've always viewed handling of exceptional conditions as an acceptable exception to the "no goto" rule. Bottom line: If the use of goto makes sense and the workarounds around it are just plain ugly (e.g., getting rid of the gotos bumps code's extended cyclomatic complexity from 8 to 18), use the goto.

AUMathTutor · Apr 21, 2009

"Yuck! I would assert that implementing this literally would be a Bad Thing. Upon failure, this algorithm is supposed to cease continuing -- but this approach has the program continuing through the rest of the algorithm, forcing each individual statement to be guarded against accidental execution after failure."
I wasn't suggesting you actually code it like that. I was just saying that those are the semantics of nested if-then-else statements. Even though you don't write it all out, it can be implied that those are the preconditions required for execution.

"If it's acceptable in some situation to emulate a goto through other flow control mechanisms, then it's even better to just use a goto."
I'm not sure how much sense that makes since, in reality, all control flow constructs emulate the goto.

"False. Boehm and Jacopini's paper says any non-structured program can be built up using sequences, while loops, and alternation. The construction starts by wrapping a while loop around the whole program. The construction then adds sequences, loops, and alternations. There is nothing to stop the construction from nesting a loop within a loop; it is trivial to prove that such nested loops are needed to emulate the non-structured code. Besides, the theorem has been falsified. Another construct besides sequence, loop, and alternation is needed. Loops with multi-level breaks work, for example."
False. Their proof showed that all deterministic programs described by flowgraphs can be represented as a single while loop and if-then-else blocks. The only reason to introduce multi-level breaks and nested loops is to eliminate the need for auxiliary variables... or at least that was my understanding. Perhaps you can furnish a counterexample to their proof, and we'll see if I can't use their method to get a program with one outer while loop and if-then-else blocks.

And Hurkyl, the way I solve that problem of yours is to put the semantic information where it belongs, namely, in the test condition:

Code:

bool running = true;
for(int i = 0; i < 100 && running; i++)
{
   for(int j = 0; j < 100 && running; j++)
   {
      if (table[i][j] == 0)
      {
         running = false;
      }
      else
      {
         // rest of code...
      }
   }    
}

The benefit of this is that it's blindingly obvious whether or not the nested loops actually covered all 100 x 100 cases. In practice, I would probably reverse the order of the if block... do the case where it's not 0 first, and then handle the = 0 with the else. I did it the way I did above for illustrative purposes. Generally, it's better practice to put your normal use scenarios before your exception ones.

AUMathTutor · Apr 21, 2009

"I've never understood why many feel that returns scattered throughout a function are acceptable,"
I know, right? That's even worse than using gotos. For my money, I don't think that returns should be allowed except as the last line of a function.

"Grateful to find that I'm not the only remaining programmer in the world that is not "anti-goto"."
I wouldn't say I'm "anti-goto"... I certainly think the only thing to do is to use other constructs when it is reasonable to do so. I can't think of a time when the GOTO is the only reasonable construct, but that doesn't mean there aren't any.

"I think we are in violent agreement then. While goto can be considered harmful, so can strict adherence to goto-less programming. I've always viewed handling of exceptional conditions as an acceptable exception to the "no goto" rule. Bottom line: If the use of goto makes sense and the workarounds around it are just plain ugly (e.g., getting rid of the gotos bumps code's extended cyclomatic complexity from 8 to 18), use the goto."
I think that's a well-reasoned opinion. The only difference between that and my opinion is that mine would be reversed, that is, ...

"I think we are in violent agreement then. While goto can be considered harmful, so can strict adherence to goto-less programming. I've always viewed handling of exceptional conditions as an acceptable exception to the "no goto" rule. Bottom line: If the use of structured programming makes sense and gotos are just plain ugly (e.g., using gotos hides semantic information and substantially decreases readability), use structured programming."
We're really saying the same thing, I think, just with different spin. Everybody likes their own brand, you know.

rcgldr · Apr 21, 2009

While we're on the subject of avoiding goto's, this is one of my pet peeves:

Code:

    step1;
    if(step1_status == OK)
    {
        step2;
        if(step2_status == OK)
        {
            step3;
            ...
        }
        else
        {
            handle step 2 failure
        }
    }
    else
    {
        handle step 1 failure
    }

This is because the error handling code for step 1 ends up the furthest away from step 1, with the error handling for step 2 the next furthest away from step 2, ... I prefer this:

Code:

    step1;
    if(step1_status != OK)
    {
        handle step1 error;
        goto exit0;
    }
    step2;
    if(step2_status != OK)
    {
        handle step2 error;
        goto exit0;
    }
    step3;
exit0:

or this equivalent to throw+catch, since at least I have a label I can search for rather than trying to find the 2nd level of indentation trailing brace and else statement for step2_status == OK. Also this separates the error handling code from the main sequence of code where errors are not expected.

Code:

    step1;
    if(step1_status != OK)
        goto step1error;
    step2;
    if(step2_status != OK)
        goto step2error;
    step3;
    goto exit0;

step1error:
    handle step1 error;
    goto exit0;

step2error:
    handle step2 error;
    goto exit0;

exit0:
    return(...)

D H · Apr 21, 2009

AUMathTutor,

Please, please, please learn how to use the quote feature.
You appear to be in the camp of rabid structured programming fanatics. We are not really saying the same thing.

Hurkyl · Apr 21, 2009

AUMathTutor said:
And Hurkyl, the way I solve that problem of yours is to put the semantic information where it belongs, namely, in the test condition:
Code:
bool running = true;
for(int i = 0; i < 100 && running; i++)
{
   for(int j = 0; j < 100 && running; j++)
   {
      if (table[i][j] == 0)
      {
         running = false;
      }
      else
      {
         // rest of code...
      }
   }    
}
The benefit of this is that it's blindingly obvious whether or not the nested loops actually covered all 100 x 100 cases. ... I would probably reverse the order of the if block... do the case where it's not 0 first, and then handle the = 0 with the else

One drawback is that the implementation of one idea (break out of the loop if we find a zero) has now been spread over five lines of code (not counting the else and the extra {}), which could be widely separated depending upon what else is in the loop. (especially if you reverse the order of the if block)

A more minor drawback, relevant only for code that needs to be fast, is that you've made the loop condition more complicated, which may make it more difficult for the compiler to optimize. (e.g. I expect the compilers on a Cray vector machine to have a significantly easier time dealing with the early exit than with the flag variable / more complicated loop condition)

The more serious drawback is it's an error-prone flag variable solution. The obvious trap you've laid is:

Code:

bool running = true;
for(int i = 0; i < 100 && running; i++)
{
   for(int j = 0; j < 100 && running; j++)
   {
      if (table[i][j] == 0)
      {
         running = false;
      }
      else
      {
         // rest of code...
      }
   }    
   cout << "Didn't find a zero in row " << i << endl;
}
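
To make the trap concrete, here is a minimal sketch of the same flag-variable loop. The function name rows_reported_clean, the use of std::vector, and recording row numbers instead of printing them are my own assumptions for illustration; they are not part of the code above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: runs the flag-variable loop over a table and returns
// the rows for which the "Didn't find a zero" line would have been printed.
std::vector<int> rows_reported_clean(const std::vector<std::vector<int>>& table)
{
    std::vector<int> reported;
    bool running = true;
    for (std::size_t i = 0; i < table.size() && running; i++)
    {
        for (std::size_t j = 0; j < table[i].size() && running; j++)
        {
            if (table[i][j] == 0)
            {
                running = false;
            }
        }
        // This line still runs when the inner loop stopped because of a zero.
        reported.push_back(static_cast<int>(i));
    }
    return reported;
}
```

For a table whose only zero is in row 1, the function reports rows 0 and 1 as zero-free: the flag stops both loops, but not the statement placed after the inner loop.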

D H · Apr 21, 2009

AUMathTutor said:

For my money, I don't think that returns should be allowed except as the last line of a function.

All I can say to this rule is YUCK. All caps yuck, and I probably should have used bold and put it in a humongous font to boot. The "single point of entry / single point of return" rule is responsible for some incredibly hideous code.

Jeff Reid said:

While we're on the subject of avoiding goto's this is one of my pet peeves: ... This is because the error handling code for step 1 ends up the furthest away from step 1, with the error handling for step 2 the next furthest away from step 2, ...

I can see arguments for both ways of implementing things (handle exceptions first versus last), and I use both schemes. If the exception handling is exceptionally short (e.g., print an error message and call exit()), I tend to deal with the exception first to get it out of the way. If the major point of the function is to ensure that exceptional conditions are addressed, I try to deal with exceptional cases as soon as they arise. If, on the other hand, the exception handling is peripheral and gets in the way of understanding the main purpose of the code, the exception handling comes last.

I tend to be a defensive programmer and add tests for conditions I know will never occur. Amazingly, some other programmer or user inevitably finds a way to invoke those "this-can't-happen-but-just-in-case" tests (at which point I get a phone call or e-mail because I wrote an insipid error message along the lines of "This can't happen. FIXME").

rcgldr · Apr 21, 2009

Regarding the pointer to function usage, here is an example snippet from an I/O based program written in C (this could be implemented by overriding member functions in C++):

Code:

    status = loadlibrary(ASPI);
    if(status == OK)
    {
        pfRead = ReadAspi;
        pfWrite = WriteAspi;
    }
    else
    {
        pfRead = ReadNative;
        pfWrite = WriteNative;
    }

    ...
    (*pfRead)(...);
    ...
    (*pfWrite)(...);

This usage of pointers to functions eliminates having conditionals on every I/O statement in the program. Plus, if a third set of I/O functions becomes available, only the initialization code has to be changed.
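
As a compilable sketch of the same idea: the stub functions, the ReadFn alias, InitIo, and the aspi_loaded parameter are all hypothetical stand-ins (the real program's signatures are elided above), so treat this as an illustration under those assumptions rather than the actual code.

```cpp
#include <string>

// Hypothetical stand-ins for the ASPI and native read routines.
static std::string ReadAspi()   { return "aspi"; }
static std::string ReadNative() { return "native"; }

// Pointer-to-function type shared by both implementations.
using ReadFn = std::string (*)();

// Selected once at start-up; callers never test which backend is active.
static ReadFn pfRead = ReadNative;

// Hypothetical initialization: aspi_loaded stands in for the loadlibrary() status.
void InitIo(bool aspi_loaded)
{
    pfRead = aspi_loaded ? ReadAspi : ReadNative;
}

// Every read in the program goes through the same call site.
std::string ReadBlock()
{
    return (*pfRead)();
}
```

Adding a third backend in this sketch would only touch InitIo; ReadBlock and its callers stay unchanged.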

rcgldr · Apr 21, 2009

D H said:

I can see arguments for both ways of implementing things (handle exceptions first versus last)

True, but for handling exceptions last, I prefer using gotos and human readable labels, instead of searching through 4 levels of nested if else statements (which I've seen).

I tend to be a defensive programmer and add tests for conditions I know will never occur. Amazingly, some other programmer or user inevitably finds a way to invoke those "this-can't-happen-but-just-in-case" tests (at which point I get a phone call or e-mail because I wrote an insipid error message along the lines of "This can't happen. FIXME").

In my apparently heavy usage of pointers to functions, I always initialize them to a "this shouldn't happen" function, and they've been triggered a few times.

D H · Apr 21, 2009

Jeff Reid said:

True, but for handling exceptions last, I prefer using gotos and human readable labels, instead of searching through 4 levels of nested if else statements (which I've seen).

What's all this stuff about searching for those close braces? What kind of primitive editor are you using?

rcgldr · Apr 21, 2009

D H said:

What's all this stuff about searching for those closed braces? What kind of primitive editor are you using?

Some of my jobs involve some relatively antiquated editors and toolsets, especially embedded firmware on some arcane CPU. Plus I've seen some really bad examples of nested if else statements that spanned several screens of code. Split screen editors that let you look at two or more places in the same file via multiple windows help out here. In the case of limited capability debuggers, sometimes it's better to write the code closer to how it will be implemented in assembly code, so when working with the debugger, the correspondence between source and binary code is close.

jim mcnamara · Apr 21, 2009

Jeff Reid said:

Some of my jobs involve some relatively antiquated editors and toolsets, <snip>

One of my tasks was maintaining several very oversized modules (> 17000 lines of C) with cyclomatic complexity > 50. In other words it was not maintainable.

A large part of this was due to mandates from a manager:
1. no goto's
2. one return
3. leave code alone no matter how "bad" it may appear.

After the manager left us, I refactored one module and got the CC down to less than 15, LOC < 10000.

We can now make changes to it. It does have goto's, and it does have multiple returns from functions - which really increase CC - if you use McCabe's algorithm as it came from the box.

AUMathTutor · Apr 21, 2009

D H said:

AUMathTutorial,

Please, please, please learn how to use the quote feature.

You appear to be in the camp of rabid structured programming fanatics. We are not really saying the same thing.

A little touchy, eh? Well, I suppose it's only natural, what with you going against the almost universally accepted notion that structured programming is almost universally better than the GOTO and my reminding you of that nagging little fact. Oh well. I guess if I can learn to use the quote feature like civilized people, my opinion is worth listening to.

Obviously, you don't need structured programming, but people make a big deal out of using it because it promotes good programming practice. That doesn't mean it's always better to use it than a GOTO, but using a GOTO is like being in love. You just know when you need to use a GOTO. If you're using a GOTO every other line, you don't know what you want.

AUMathTutor · Apr 21, 2009

Hurkyl said:
I'm not trying to advocate using goto when flow is naturally described by other mechanisms -- what I'm trying to advocate is to reject using the other mechanisms to implement flow that is most naturally expressed via goto.

The flagship example is the break out of double loop:
Code:
  for(int i = 0; i < 100; ++i) {
    for(int j = 0; j < 100; ++j) {
      if(table[i][j] == 0) { goto end_of_loop; }
    }
  }
end_of_loop:
(aside: one of the few things I give java high praise for is having multi-level break statements to deal with this)

Yes, you could have turned this loop into a function that exits via return. If it should be a function, then make it a function. But I reject the idea that "eliminating goto" is just cause for making it a function.

Worse is this style of "fix":
Code:
  int flag = 0;
  for(int i = 0; i < 100; ++i) {
    for(int j = 0; j < 100; ++j) {
      if(table[i][j] == 0) { flag = 1; break; }
    }
    if(flag) { break; }
  }
which breaks the jump into smaller pieces, which is considerably more complicated and error-prone, becoming very awkward if the loops are doing more complex work and if there are other skips to be made.

Standard python style, I believe, implements this as
Code:
class end_of_loop_exception(Exception):
  pass
try:
  for i in range(100):
    for j in range(100):
      if table[i][j] == 0:
        raise end_of_loop_exception
except end_of_loop_exception:
  pass
# more code here
I assert this code snippet would be better implemented as a goto statement. (Of course, python doesn't have them, so it uses exceptions to emulate them) I still prefer this to the flag variable approach.

(Since being introduced to python's "for...else" construct, I find myself using that flow control in C++ from time to time -- implemented with goto, though. It has the benefits of requiring neither flag variables nor making a duplicate copy of the loop condition, both of which I find undesirable)
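
A minimal sketch of what a "for...else" emulated with goto might look like in C++. The function find_or_append_zero and its behaviour are invented for illustration; only the control-flow shape is the point.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: returns the position of the first zero in values,
// appending a zero at the end if the vector does not contain one.
std::size_t find_or_append_zero(std::vector<int>& values)
{
    std::size_t pos = 0;
    for (; pos < values.size(); ++pos)
    {
        if (values[pos] == 0) { goto found; }   // plays the role of "break"
    }
    // The "else" clause: reached only when the loop ran to completion.
    values.push_back(0);
found:
    return pos;
}
```

No flag variable is needed, and the loop condition is written only once; the label marks where the "else" clause is skipped.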

There are tradeoffs involved, to be sure. My method involves spreading the checking out over multiple loops. This is less than ideal, to be sure. However, it's really not so difficult to add such a structure to a loop you know will be broken out of. It becomes almost mechanical. Also, it is usually very easy to read. People expect to see the standard loop header, and when they see the extra bits, they wonder about it. It becomes very explicit. And the issue you mentioned about the value at the end of the loop, after exiting, is an easy one to remedy, and surely you know it. It's really sort of a trivial (I avoid saying petty) argument. The other is valid enough, though.

The problem with your method is that with anything but toy functions, it becomes unfeasible to determine the control flow. Say the software is being maintained by somebody else who's never seen your code before. He sees a pair of nested loops, and he can only assume that they both run to completion. How is he to know that there's a couple of gotos buried in 100 lines of code? He could check the whole body of code, sure. But the semantics of a loop are really such that it should only allow exiting at the top or bottom (except for the do-while, it should be the case that a loop exit condition is checked at the top of the loop).

And the efficiency issue... well, I almost feel like that's a smokescreen of sorts. Yes, doing an extra boolean check 10,100 times will take a little time... but compared to the rest of the loop, this time is likely negligible. Consider the following code:

Code:

bool running = true;
for(int i = 0; i < 100 && running; i++)
{
   for(int j = 0; j < 100 && running; j++)
   {
      if (table[i][j] == 0)
      {
         running = false; // say this takes 1 time unit.
      }
      else
      {
         // the code...
      }
   }
}

If we say that assigning variables, reading variables, basic comparisons, and increments take 1 time unit... then just the given framework code takes around 81,000 units of time to finish. Your code - with the goto - takes around 60,000 units of time. So my method takes 35% longer than your method in the limit of no code actually being run other than the check. Now say that a paltry 10 units of time are taken by "// the code...". Then my way takes 181,000 units of time and yours takes 160,000 units of time. Now my code takes only ~13% longer than yours. Now say that you have a relatively large function - where my method would give the greatest relative increase in readability. Let's say the code takes 100 units of time (say you are doing some sort of basic manipulation involving arithmetic or something). Now you have 1,081,000 versus 1,060,000 units of time... you get the idea.
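
The arithmetic in that estimate can be checked with a few lines. This is only a sketch: the 81,000 and 60,000 framework costs are taken as given from the paragraph above, and the function and constant names are made up for the illustration.

```cpp
// Framework costs in abstract time units, as stated in the estimate above.
constexpr double kFlagFramework = 81000.0;
constexpr double kGotoFramework = 60000.0;
constexpr double kIterations = 100.0 * 100.0;   // 100 x 100 inner iterations

// Percentage by which the flag version is slower than the goto version
// when the loop body itself costs body_cost units per iteration.
double flag_slowdown_percent(double body_cost)
{
    double flag_total = kFlagFramework + kIterations * body_cost;
    double goto_total = kGotoFramework + kIterations * body_cost;
    return 100.0 * (flag_total / goto_total - 1.0);
}
```

With a body cost of 0, 10, and 100 units this gives roughly 35%, 13%, and 2%, matching the 81,000 / 60,000, 181,000 / 160,000, and 1,081,000 / 1,060,000 figures.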

Plus, most of the time and money spent on software development is spent on maintenance, not on development, per se. Increasing readability pays off more in the end than making a few quick and dirty optimizations here and there. Memory reads and boolean comparisons are really not the bottleneck of modern computing systems.

D H · Apr 21, 2009

AUMathTutor said:

A little touchy, eh? Well, I suppose it's only natural, what with you going against the almost universally accepted notion that structured programming is almost universally better than the GOTO and my reminding you of that nagging little fact. Oh well. I guess if I can learn to use the quote feature like civilized people, my opinion is worth listening to.

First things first: Thanks for taking some time to learn how to use the quote feature.

Now for the rest of your post. Two words: Oh, please.

A few more words: Strict adherence to programming rules, no exceptions granted, inevitably leads to horror stories like this:

jim mcnamara said:

One of my tasks was maintaining several very oversized modules (> 17000 lines of C) with cyclomatic complexity > 50. In other words it was not maintainable.

A large part of this was due to mandates from a manager:
1. no goto's
2. one return
3. leave code alone no matter how "bad" it may appear.

After the manager left us, I refactored one module and got the CC down to less than 15, LOC < 10000.

I manage multiple software development and software verification & validation efforts for highly critical systems that must have very high reliability. Thanks to some negative training by managers of the sort Jim faced, I mandate very few strict, no exceptions granted, rules on my teams. Rigid rules lead to low quality code. I do have tools that look for quality issues, gotos among them. People who use them had better have a dang good reason for doing so. "Because I'm handling a lot of hairy error conditions and my code would be truly ugly otherwise" is a very good start at getting a waiver to the "no gotos" guideline.

Given a choice between (a) a function written using strict structured programming concepts and an extended cyclomatic complexity of 50 or more versus (b) an equivalent function that uses an oft-forbidden feature such as gotos, breaks, and in-line returns but is understandable, maintainable, and has an extended cyclomatic complexity of 15 or less, I'll take choice (b) any day.

That is often the choice: Make your rules hard and steadfast, and you will get lousy code that is in the end unmaintainable, incomprehensible, and unverifiable. But hey! It has no gotos! Make your rules soft and you can get high quality code that is maintainable, easily comprehensible, and verifiable. Give the programmers an out (allow breaks, continues, in-line returns) and you will have those ilities -- and you will have very few, if any, goto statements to boot.

rcgldr · Apr 21, 2009

AUMathTutor said:

He sees a pair of nested loops, and he can only assume that they both run to completion. How is he to know that there's a couple of gotos buried in 100 lines of code?

True, but now that he sees the && running, he has to scan for every instance of running. The same issue about goto's would also apply to any breaks, continues, or returns in those loops, yet most programmers don't have a problem with those.

The fact that branch labels existed in a program would prompt me to go search for references to the branches that access those labels.

Donald Knuth accepted the principle that programs must be written with provability in mind, but he disagreed (and still disagrees) with abolishing the GOTO statement. In his 1974 paper, "Structured Programming with Goto Statements", he gave examples where he believed that a direct jump leads to clearer and more efficient code without sacrificing provability. Knuth proposed a looser structural constraint: It should be possible to draw a program's flow chart with all forward branches on the left, all backward branches on the right, and no branches crossing each other. Many of those knowledgeable in compilers and graph theory have advocated allowing only reducible flow graphs.

http://en.wikipedia.org/wiki/Structured_programming

AUMathTutor · Apr 21, 2009

Well, DH, I don't think any of us ever said that the GOTO should never be used. All I've ever said is that it isn't necessary (then again, neither are loops or if-then-else if you have the GOTO) and that it is generally accepted that structured programming is to be preferred over non-structured jump instructions.

I have said that I, personally, avoid the GOTO when it is not too inconvenient to do so. I'm sure there are instances where it is the case that GOTOs are much better than the alternative. I'm not sure I've seen an overwhelmingly convincing case for that here... all of the provided examples seem iffy to me.

And I agree, it seems very inconsistent to whine about the GOTO and then use mid-function returns, breaks, continues, etc. Like I said, for my money, all of these should be thrown out. They probably won't be thrown out, but that doesn't mean that it's a good idea to keep them (or that it would be a good idea to throw them out, either). I just believe that removing them would greatly simplify the semantics of programs, and I think the benefits of this outweigh the issues of execution efficiency and coding convenience.

I also think that there have been several issues raised that could use additional constructs to facilitate programming. All I'm saying is that the GOTO might not be the best way to deal with them. The GOTO seems like a step back in programming... if, for no other reason, that it is among the lowest-level instructions available to the programmer.

There's a lot of room for subjectivity in this, and I don't think anyone would claim there is a single right or wrong answer. I wonder, does anybody know of any good studies involving the GOTO?

AUMathTutor · Apr 21, 2009

Jeff Reid said:

True, but now that he sees the && running, he has to scan for every instance of running. The same issue about goto's would also apply to any breaks, continues, or returns in those loops, yet most programmers don't have a problem with those.

The fact that branch labels existed in a program would prompt me to go search for references to the branches that access those labels.

http://en.wikipedia.org/wiki/Structured_programming

I think I would feel better about the GOTO if it was supported by language mechanisms that added structure to it. For instance, a set of rules such as...

1. the GOTO can only be used in a block of code if the code is tagged as containing a GOTO. This would at least let readers know that GOTOs are involved... so they can look for them.

2. the GOTO can only jump to labels appearing below the GOTO. All of the examples you guys have given obey this rule; it seems like any GOTO breaking this rule is essentially duplicating the work of a regular loop.

3. The GOTO used instead of if-then-else for breaking a sequence should be replaced with a brand-new construct, if such a construct is desired. For instance:

Code:

condexec
{
   statement 1;
   statement 2;
   ...
   statement n;
}

The statements would continue executing until one evaluates to false, in which case the control breaks.
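
For comparison, the closest thing C and C++ already offer to this is short-circuit evaluation of &&. The sketch below is an approximation of the proposed construct, not an implementation of it; step, run_steps, and the log string are hypothetical names used only to make the order of execution visible.

```cpp
#include <string>

// Hypothetical step: appends its name to the log and reports success or failure.
bool step(std::string& log, const std::string& name, bool ok)
{
    log += name;
    return ok;
}

// Runs the steps in order and stops at the first one that returns false,
// relying on the left-to-right, short-circuit evaluation of &&.
bool run_steps(std::string& log, bool ok1, bool ok2, bool ok3)
{
    return step(log, "1", ok1)
        && step(log, "2", ok2)
        && step(log, "3", ok3);
}
```

If the second step fails, the log ends up as "12" and the third step never runs. Unlike the proposed block, this only works when every statement can be written as an expression that yields a truth value.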

So basically you guys see an opportunity to use GOTOs, and I look for new structured constructs. I guess it's just a matter of taste.

D H · Apr 21, 2009

AUMathTutor said:

I have said that I, personally, avoid the GOTO when it is not too inconvenient to do so. I'm sure there are instances where it is the case that GOTOs are much better than the alternative. I'm not sure I've seen an overwhelmingly convincing case for that here... all of the provided examples seem iffy to me.

Of course the examples presented here are iffy. A real example that shows where gotos are preferable would take up far too much space. Open a file, parse the header, check for consistency. There are many reasons open might fail, there are many reasons the parse might fail, and there are many reasons the consistency check might fail. Encapsulate those in a single function and you will get something with a very nasty McCabe complexity at best. Add the constraints of no gotos, no in-line returns, no breaks, and you will get something with an astronomical complexity (20 or more is astronomical to me). Simply allowing the use of those verboten features can halve the complexity.

I'm a big fan of the complexity metric. Sort function/methods by complexity and bingo! you have just found the vast majority of your project's maintainability/verifiability trouble spots. I'm also a big fan of extended cyclomatic complexity. There is little if any difference between if (a) {if (b) {do_something();}} and if (a && b) {do_something();}. The first has a cyclomatic complexity of 3 while for the second, 2. Both have an extended cyclomatic complexity of 3. Strict adherence to structured programming, plus the natural extension to "no break statements" and the very nasty "single point of entry / single point of return" rule, tends to increase the cyclomatic complexity by quite a bit. They increase the extended cyclomatic complexity by even more.
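
Spelled out as two complete functions, the pair being compared looks like this. The function names and the counter are assumptions added so the snippet compiles; the complexity figures in the comments are the ones quoted in the paragraph above.

```cpp
// Nested form: two if statements.
// Cyclomatic complexity 3, extended cyclomatic complexity 3.
int nested_form(bool a, bool b)
{
    int calls = 0;
    if (a) { if (b) { ++calls; } }
    return calls;
}

// Combined form: one if statement with a compound condition.
// Cyclomatic complexity 2, extended cyclomatic complexity 3,
// because the extended metric also counts the && as a decision.
int combined_form(bool a, bool b)
{
    int calls = 0;
    if (a && b) { ++calls; }
    return calls;
}
```

Both functions behave identically for every combination of a and b, which is why a metric that scores them differently is of limited use.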

And I agree, it seems very inconsistent to whine about the GOTO and then use mid-function returns, breaks, continues, etc. Like I said, for my money, all of these should be thrown out.

Why? The ultimate objective is software that is usable, reliable, comprehensible, verifiable, maintainable, and a bunch of other -ilities. Structured programming is a means to that end. It should not be an end in and of itself. Making it so can be considered harmful. See post #21 for an excellent example.

rcgldr · Apr 21, 2009

AUMathTutor said:

I think I would feel better about the GOTO if it was supported by language mechanisms that added structure to it.

Fortran had computed goto's. C could have implemented the equivalent with pointers to code labels, but it only supports pointers to functions.

The GOTO seems like a step back in programming... if, for no other reason, that it is among the lowest-level instructions available to the programmer.

Aren't most operations low level, such as basic math, and if statements? I consider C to be a mid-level language. C++ bumps it up a notch. Cobol's "move corresponding" is an example of a high level operation. APL's ability to work with dynamic multi-dimension variables is high level. Each language has its purpose.

Continuing on: although it may not be the best approach, I typically use goto's to handle resource allocation failures (allocate memory, open files, ...) in this manner, so that the main line code doesn't have to be indented.

Code:

{
    ...
    if(allocate(&resource1) != OK)
        goto error1;
    if(allocate(&resource2) != OK)
        goto error2;
    if(allocate(&resource3) != OK)
        goto error3;
    ...
    MainProcess();
    ...
exit0:
    free(resource3);
error3:
    free(resource2);
error2:
    free(resource1);
error1:
    return;
}

or like this if it's more generalized or might be used by other programmers.

Code:

{
    ...
    resource1 = invalid;
    resource2 = invalid;
    resource3 = invalid;
    if(allocate(&resource1) != OK)
        goto exit0;
    if(allocate(&resource2) != OK)
        goto exit0;
    if(allocate(&resource3) != OK)
        goto exit0;
    ...
    MainProcess();    
    ...
exit0:
    if(resource3 != invalid)
        free(resource3);
    if(resource2 != invalid)
        free(resource2);
    if(resource1 != invalid)
        free(resource1);
    return;
}

In most environments, resources don't have to be freed in reverse allocation order, in which case I release the resources in programmer friendly order instead of reverse order.

When creating multi-tasking applications, there are a large number of handles allocated, for threads, semaphores, mutexes, message pools, ... to allow intertask messaging, in addition to the normal stuff like memory and files. In this case using the goto's reduces the clutter of handling all those allocation failures via indentations.

The point here is that the allocated resources are normally independent of each other. The main dependency is with the application's usage of the resources. This is why I don't consider the allocation of resource n to be dependent on the allocation of resource n-1 and I don't code it that way.
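
Here is a compilable sketch of the second (single exit label) pattern. The function name process, the fail_at parameter used to simulate a failed allocation, the 16-byte buffers, and the returned count are all assumptions added for illustration; the original allocate() and free() calls are placeholders.

```cpp
#include <cstdlib>

// Hypothetical worker: fail_at (1..3) forces that allocation to "fail" so the
// cleanup path can be exercised; 0 means no failure is simulated.
// Returns the number of buffers released on the way out.
int process(int fail_at)
{
    // All variables are declared before the first goto, since C++ does not
    // allow a jump to bypass a declaration that has an initializer.
    void* resource1 = nullptr;
    void* resource2 = nullptr;
    void* resource3 = nullptr;
    int released = 0;

    resource1 = (fail_at == 1) ? nullptr : std::malloc(16);
    if (resource1 == nullptr) goto exit0;
    resource2 = (fail_at == 2) ? nullptr : std::malloc(16);
    if (resource2 == nullptr) goto exit0;
    resource3 = (fail_at == 3) ? nullptr : std::malloc(16);
    if (resource3 == nullptr) goto exit0;

    // Main processing would go here, at the outermost indentation level.

exit0:
    if (resource3 != nullptr) { std::free(resource3); ++released; }
    if (resource2 != nullptr) { std::free(resource2); ++released; }
    if (resource1 != nullptr) { std::free(resource1); ++released; }
    return released;
}
```

Whichever allocation fails, exactly the buffers that were obtained get released, and the success path never gains a level of indentation.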

Hurkyl · Apr 21, 2009

AUMathTutor said:

Plus, most of the time and money spent on software development is spent on maintenance, not on development, per se.

Exactly...

And the issue you mentioned about the value at the end of the loop, after exiting, is an easy one to remedy, and surely you know it.

(You're referring to the death trap I mentioned, right?) Which is why you seem to be missing the point here. It isn't about how easy it is to remedy, it's about how easy it is for someone to make the mistake, and the depth of knowledge the person debugging must have in order to track it down.

The remedy, of course, (given that you reject break) is to include an extra if block to jump past the rest of the outer loop

Code:

bool running = true;
for(int i = 0; i < 100 && running; i++)
{
   for(int j = 0; j < 100 && running; j++)
   {
      if (table[i][j] == 0)
      {
         running = false; // say this takes 1 time unit.
      }
      else
      {
         // the code...
      }
   }
   if(running)
   {
      // more code...
   }
}

and this increases the drawbacks I mentioned -- the "break out of loop" is now spread out over even more lines, flow control has become even more complicated, and the next person to touch your code has to realize that all of his changes to the tail of the outer loop have to go inside that if-block.

Using conditionals to turn large blocks of code on and off is fairly difficult to follow, especially when hundreds of lines of code are involved.

You're doing the thing I mentioned -- you're using flags, if-blocks, and for-loops to emulate a goto. Just use the goto instead, to make the entire thing simpler.

The primary drawback of goto in these situations is that you can't use it when the loops have cleanup code at the end of each iteration. If the loop originally had goto, and you need to introduce cleanup code, then it would require rewriting the loop to use state variables. Fortunately, I program primarily in C++, and the RAII paradigm means pretty much all cleanup is handled automatically via destructors. (python would let me do the same with the with statement, but I haven't needed to do such things in python yet)
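
As a small illustration of the RAII point: in C++, leaving a scope with goto still runs the destructors of the automatic objects in that scope. The IterationGuard type, the scan function, and the cleanup counter below are hypothetical, invented to make that visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical guard: its destructor stands in for per-iteration cleanup
// and counts how many times that cleanup has run.
struct IterationGuard
{
    int& cleanups;
    explicit IterationGuard(int& counter) : cleanups(counter) {}
    ~IterationGuard() { ++cleanups; }
};

// Scans the table, leaving both loops at the first zero.
// Returns the number of cleanups that ran.
int scan(const std::vector<std::vector<int>>& table)
{
    int cleanups = 0;
    for (std::size_t i = 0; i < table.size(); ++i)
    {
        for (std::size_t j = 0; j < table[i].size(); ++j)
        {
            IterationGuard guard(cleanups);   // per-iteration resource
            // The goto leaves the scope, so guard is destroyed here as well.
            if (table[i][j] == 0) { goto end_of_loop; }
        }
    }
end_of_loop:
    return cleanups;
}
```

Every guard that was constructed is destroyed, including the one alive when the goto fires, so the number of cleanups always equals the number of cells visited.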

He sees a pair of nested loops, and he can only assume that they both run to completion. How is he to know that there's a couple of gotos buried in 100 lines of code?

And you'd prefer him to follow the flow 100s of lines through nested if and for loops?

Actually, there is a fix that requires a single line of code and has absolutely no drawbacks that I can find:

Code:

   // This loop exits immediately upon finding a zero
   for(int i = 0; i < 100; ++i)
   {
      for(int j = 0; j < 100; ++j)
      {
         if (table[i][j] == 0) { goto end_of_loop; }
      }
   }
end_of_loop:

And the efficiency issue... well, I almost feel like that's a smokescreen of sorts. Yes, doing an extra boolean check 10,100 times will take a little time... but compared to the rest of the loop, this time is likely negligible.

On the Cray X1, confusing the optimizer means that, rather than having your function executed in the vector unit which can essentially do 64 operations at a time, your function gets implemented in the (relatively poor) scalar unit, one operation at a time. Or, if it can still figure out how to use the vector unit, it might need a lot of setup/cleanup code around the inner loop that takes more time to execute than the inner loop itself!

AUMathTutor said:

I think I would feel better about the GOTO if it was supported by language mechanisms that added structure to it. For instance, a set of rules such as...

1. the GOTO can only be used in a block of code if the code is tagged as containing a GOTO. This would at least let readers know that GOTOs are involved... so they can look for them.

This sounds like it should really be a job for the editor, not the source code. I bet you can program xemacs to do it.

2. the GOTO can only jump to labels appearing below the GOTO. All of the examples you guys have given obey this rule; it seems like any GOTO breaking this rule is essentially duplicating the work of a regular loop.

Why? What benefit would there be to restricting goto in such a way? I can certainly think of two uses of it off the top of my head:

(1) State machines.
(2) The handling of an exceptional case requires you jump back and retry something. (And for whatever reason, it would be very awkward / obscure the intent of the algorithm to set it up as a loop)
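
A minimal sketch of case (1), with each label acting as one state and backward gotos as the transitions. The recognizer itself (digits, optionally followed by a point and more digits) and the names is_decimal and digit_at are invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if there is a decimal digit at position pos.
static bool digit_at(const std::string& text, std::size_t pos)
{
    return pos < text.size()
        && std::isdigit(static_cast<unsigned char>(text[pos])) != 0;
}

// Returns true for one or more digits, optionally followed by '.' and one or
// more digits (for example "12" or "3.14").
bool is_decimal(const std::string& text)
{
    std::size_t pos = 0;

    // State: expect the first digit of the integer part.
    if (!digit_at(text, pos)) return false;
    ++pos;

integer_digits:    // State: inside the integer part.
    if (pos == text.size()) return true;
    if (digit_at(text, pos)) { ++pos; goto integer_digits; }
    if (text[pos] != '.') return false;
    ++pos;

    // State: expect the first digit of the fraction.
    if (!digit_at(text, pos)) return false;
    ++pos;

fraction_digits:   // State: inside the fraction.
    if (pos == text.size()) return true;
    if (digit_at(text, pos)) { ++pos; goto fraction_digits; }
    return false;
}
```

The same machine can of course be written with a state variable and a switch inside a loop; the goto form simply keeps each state's code and its transitions in one place.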

So basically you guys see an opportunity to use GOTOs, and I look for new structured constructs. I guess it's just a matter of taste.

Looking for new structured constructs is only useful if you're designing a new language. I can't use your "condexec" statement in my C++ programs.

"Opportunity to use GOTOs" sounds misleading -- I don't go looking for "opportunities" to use goto; I use it because it's the right tool for the job.
