Insights Your Software is Never Perfect - Comments

Svein · Sep 8, 2016

StatGuy2000 said:

Software plays a critical role in systems where safety and/or reliability is of critical importance

Yes, and that is a specialist area all by itself. I am not going to list all the standards that must be met, but you can get an overview here: https://en.wikipedia.org/wiki/Functional_safety .

anorlunda · Sep 9, 2016

Often neglected is the fact that many (perhaps most?) bugs are benign. They may never get triggered, or their negative effects may not be noticeable.

An interesting example came up during the Y2K remediation. Some software in nuclear power plants had operated successfully for 35 years or more. Presumably, software standards were much better in 1999 than in 1965, so new software is expected to have many fewer bugs. On the other hand, the old software had amply demonstrated that any bugs remaining must be benign. New software may have few bugs (never say zero), but they may not be benign.

So which is safer, the old or new software? That question should not be answered flippantly.

That same question arose in many critical applications in many industries during the Y2K years. It is the age-old debate between new and better versus proven.

QuantumQuest · Sep 9, 2016

anorlunda said:

Often neglected is the fact that many (perhaps most?) bugs are benign. They may never get triggered, or their negative effects may not be noticeable.

An interesting example came up during the Y2K remediation. Some software in nuclear power plants had operated successfully for 35 years or more. Presumably, software standards were much better in 1999 than in 1965, so new software is expected to have many fewer bugs. On the other hand, the old software had amply demonstrated that any bugs remaining must be benign. New software may have few bugs (never say zero), but they may not be benign.

So which is safer, the old or new software? That question should not be answered flippantly.

That same question arose in many critical applications in many industries during the Y2K years. It is the age-old debate between new and better versus proven.

In my opinion, each piece of software, old or new, must be judged according to its domain and the problems it solved (or solves) in its time.

In the old days there were fewer programming languages and dialects and only a few programming paradigms; testing was anything but trivial, and maintenance was difficult at best, at least for big software. On the positive side, developers had the time and opportunity to focus on what they were creating, so in that era the bugs were mostly benign.

In the present day, on the other hand, there are huge demands for software to solve far more difficult and demanding problems, operate in many newer domains, and operate across domains. Although there is now a multitude of programming languages (including many descendants of the older ones), along with many programming and testing tools, libraries and frameworks, software complexity is increasing very steeply. Even with the best-designed tool chains for full-cycle development, bugs are inevitable, and I think this complexity also explains why they are not as benign as in the old days.

I don't think that a direct comparison can be made between old and new software. There are many independent and interdependent factors, both between old and new and within each of them, that make a direct comparison very difficult.

anorlunda · Sep 9, 2016

QuantumQuest said:

In the present days on the other hand, there are huge demands for software to solve way more difficult/demanding problems, operate on a lot of newer domains, ...

When it comes to embedded real time control software, I dispute that so much has really changed. In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.

QuantumQuest · Sep 9, 2016

anorlunda said:

When it comes to embedded real time control software, I dispute that so much has really changed. In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.

I agree. I was talking about software at large and not particularly for embedded software.

FactChecker · Sep 10, 2016

anorlunda said:

When it comes to embedded real time control software, I dispute that so much has really changed. In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.

In military aerospace applications, the control law software of 30 years ago cannot be compared with current software. A control law diagram from 30 years ago would fit on a single sheet of paper. Modern control law software takes up hundreds of pages of diagrams. The requirements have exploded 1000-fold, along with the associated SW complexity.

D H · Sep 10, 2016

anorlunda said:

When it comes to embedded real time control software, I dispute that so much has really changed. In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.

The Space Shuttle flight software was written at a rate of about two or three lines of production code per person per month. The people behind that code weren't twiddling their thumbs for 159 working hours and then writing two or three lines of code. They were instead attending meetings (NASA loves meetings), writing requirements and specifications, critiquing the production code, writing and critiquing test code, evaluating test results, and tracing requirements to sub-requirements to specifications to code to tests, backwards and forwards. This was largely done by hand (the automated tools didn't exist), and of course was done without modern software development techniques such as Agile. While still very expensive, development of critical software has come a long way since then.

FactChecker · Sep 10, 2016

D H said:

The Space Shuttle flight software was written at a rate of about two or three lines of production code per person per month. The people behind that code weren't twiddling their thumbs for 159 working hours and then writing two or three lines of code. They were instead attending meetings (NASA loves meetings), writing requirements and specifications, critiquing the production code, writing and critiquing test code, evaluating test results, and tracing requirements to sub-requirements to specifications to code to tests, backwards and forwards. This was largely done by hand (the automated tools didn't exist), and of course was done without modern software development techniques such as Agile. While still very expensive, development of critical software has come a long way since then.

The things they were able to do with such tiny (capability-wise) computers back then are amazing to me. And they are still doing great things communicating with vehicles that were launched decades ago.

anorlunda · Sep 11, 2016

The closest I ever came to military software was an association with the Saturn V Project, so I can't comment on things military. But the point of post #34 was old versus new, so let's compare apples with apples. Compare the same mundane application then and now.

Consider a controller for a motor-operated valve (MOV). The valve can be asked to open, close, or to maintain an intermediate position. The controller may monitor and protect the MOV from malfunctions. In the old days, the logic for this controller would be expressed in perhaps 100-150 bytes of instructions, plus 50 bytes of data. That is so little that not even an assembler would be needed. Just program it in machine language and type the 200 bytes by hand, as hex digits, into the ROM burner. A 6502, or 8008, or 6809 CPU variant with on-chip ROM would do the work. The software would have been the work product of a single person working less than one work-day, perhaps checked by a second person. Instantiations would cost about $1 each. (In the really old days, it would have been done with discrete logic.)
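To give a feel for how little logic such a controller needs, here is a minimal sketch in Python rather than the hand-typed machine code described above. It assumes one control cycle at a time, a position expressed as a fraction of travel, and a single overcurrent protection. The names `Command`, `Drive`, `mov_step`, `DEADBAND` and `MAX_MOTOR_CURRENT`, and the limit values, are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Command(Enum):
    OPEN = "open"
    CLOSE = "close"
    HOLD = "hold"


class Drive(Enum):
    OPENING = "opening"
    CLOSING = "closing"
    STOPPED = "stopped"
    TRIPPED = "tripped"


# Hypothetical limits, chosen only for illustration.
DEADBAND = 0.02          # fraction of full travel
MAX_MOTOR_CURRENT = 5.0  # amperes


def mov_step(command, target, position, motor_current):
    """Return the drive action for one control cycle.

    position and target are fractions of travel (0.0 closed, 1.0 open).
    target is only used when the command is HOLD.
    """
    # Protection first: an overcurrent (e.g. a jammed stem) stops the motor.
    if motor_current > MAX_MOTOR_CURRENT:
        return Drive.TRIPPED
    if command is Command.OPEN:
        target = 1.0
    elif command is Command.CLOSE:
        target = 0.0
    error = target - position
    # Inside the deadband the valve is considered to be on target.
    if abs(error) <= DEADBAND:
        return Drive.STOPPED
    return Drive.OPENING if error > 0 else Drive.CLOSING
```

For example, `mov_step(Command.HOLD, 0.5, 0.2, 1.0)` asks the valve to move toward the half-open position and returns `Drive.OPENING`. The whole decision fits in a dozen lines, which is the scale of logic being argued about here.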

In the modern approach, the logic would be programmed in a high level language. That needs libraries, and those need an OS (probably a Linux variant), and that brings in more libraries. With all those libraries come bewildering dependencies and risks (for example https://www.physicsforums.com/threads/science-vulnerability-to-bugs.878975/#post-5521131). All that software needs periodic patches, so we need to add an Internet connection (HORRORS!) and add a user interface. With that comes all the cybersecurity and auditing overhead. All in all, the "modern" implementation includes ##10^4## to ##10^6## times more software than the "old" 200 byte version, to perform the same invariant MOV controller function.

Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures, and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version. Worse, the modern standards probably prohibit the "old" version because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls. I argue that we are abandoning the KISS principle.

Now, the reason that this is more than a pedantic point is the IoT (Internet of Things). We are about to become surrounded by billions of ubiquitous micro devices implemented the "modern" way rather than the "old" way. It is highly germane to stop and consider if that is wise.

Svein · Sep 11, 2016

anorlunda said:

Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures, and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version. Worse, the modern standards probably prohibit the "old" version because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls. I argue that we are abandoning the KISS principle.

Hear, hear!

FactChecker · Sep 11, 2016

anorlunda said:

Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures, and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version. Worse, the modern standards probably prohibit the "old" version because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls. I argue that we are abandoning the KISS principle.

Svein said:

Hear, hear!

Yes. When hardware is the subject, people adhere to the KISS principle, but for software they abandon it.
But I have one argument. I think that the old software would be very easy to apply new software processes to. The problem is that hundreds of new requirements are piled on. Many of the new requirements are good ideas, but there is a tendency to go overboard. They are trying to anticipate the needs of the next 30 years, which is very difficult if not impossible.

anorlunda · Sep 11, 2016

FactChecker said:

But I have one argument. I think that the old software would be very easy to apply new software processes to. The problem is that hundreds of new requirements are piled on. Many of the new requirements are good ideas, but there is a tendency to go overboard. They are trying to anticipate the needs of the next 30 years, which is very difficult if not impossible.

I'm sure that you're right. But let me ask you two questions.

1. How do you draw the line between necessary requirements and bloat? (Focusing first on a simple app like the MOV controller helps clarify.)
2. Who has the responsibility and authority to draw that line?

FactChecker · Sep 11, 2016

anorlunda said:

I'm sure that you're right. But let me ask you two questions.
1. How do you draw the line between necessary requirements and bloat? (Focusing first on a simple app like the MOV controller helps clarify.)

That is the million dollar question that I can't answer. Even trying to isolate a part like a MOV controller gets complicated when issues like redundancy, failure modes, communication standards, customer preferences, etc. come into play. It is often hard for technical software people to communicate (or even anticipate) the risk / cost / schedule consequences of decisions.

2. Who has the responsibility and authority to draw that line?

It is somewhat mysterious to me, but here is what I think. In military contracts top level requirements are initially set very optimistically by the military so that they can see how much different contractors can propose. As the contract winner is selected and development proceeds, they see that some requirements were unrealistic. They try to find some "low hanging fruit" that they did not think of before and can be done in place of the reduced / modified initial requirements. It's a negotiation.

Svein · Sep 13, 2016

anorlunda said:

I'm sure that you're right. But let me ask you two questions.

1. How do you draw the line between necessary requirements and bloat? (Focusing first on a simple app like the MOV controller helps clarify.)

2. Who has the responsibility and authority to draw that line?

That is where the paperwork comes in. Done correctly, it is of vital importance.

1. The Requirement Specification is the responsibility of the customer. The first revision is usually both "over the top" and imprecise, but it is a start.
2. The Functional Specification is the responsibility of the developer. It should not be a carbon copy of the Req. Spec., but a description of what the developer thinks it is possible to deliver in a realistic time frame.
3. Ideally: Revised Req. Spec., revised Func. Spec., ...
4. Hopefully: A Req. Spec. that is reasonably clear and realistic and a Func. Spec. that is realistic.

I have used that model several times. Being stubborn, I have flatly refused to start developing something until the project has arrived at point 4 above. Usually, I have made some notes to myself along the way about how to develop what I thought the customer wanted. Usually, those notes have to be discarded, since the final Req. Spec. bears little resemblance to the first informal inquiry.

anorlunda · Sep 13, 2016

Svein said:

That is where the paperwork comes in. Done correctly, it is of vital importance.

1. The Requirement Specification is the responsibility of the customer. ...

2. The Functional Specification is the responsibility of the developer. ...

3. Ideally: Revised Req. Spec., revised Func. Spec., ...

4. Hopefully: A Req. Spec. that is reasonably clear and realistic and a Func. Spec. that is realistic.

I have used that model several times.

I too have used that model many times. It promotes bloat rather than combating growth.

Neither can that process scale down to keep a simple function simple (KISS). Please reconsider the MOV application from #39. Use an order of magnitude budget estimate of $10000 (in 2016 $) and 48 hours elapsed time, including all specs, negotiations, lawyers, implementation, testing and documentation. Can your process fit in that budget? Could the product meet the same reliability performance as #39?

My argument in #39 is that we are delivering lower quality software today than in the past because we don't follow the KISS principle. Refute that if you will. To prevent muddling the point being argued, please stick to the MOV app as the benchmark (because to apply KISS, we must start with something simple).

Svein · Sep 13, 2016

anorlunda said:

My argument in #39 is that we are delivering lower quality software today than in the past because we don't follow the KISS principle. Refute that if you will. To prevent muddling the point being argued, please stick to the MOV app as the benchmark (because to apply KISS, we must start with something simple).

Yes - but not all things are that simple.

Anecdote: A customer came to me and announced: We want to put the horse betting status on the Internet! Here we have a simple statement that is nowhere near a specification. What is missing is (among several other details):

What should the status screen look like?
How often should it be updated?
How should we handle the fact that there are a varying number of horses in each race?
How should a scratched (withdrawn) horse be handled - should we stop displaying it, should we display it in a contrasting font or should we do something else?

Being stubborn, I insisted on having all the details in the requirement spec. I then presented the customer with my functional spec - and since the deadline was now just two weeks away, they accepted it promptly. The software was delivered on time and it worked.

And - before you mention PHP - this was in 1995!

anorlunda · Sep 13, 2016

Svein said:

Yes - but not all things are that simple.

The question is not if you can make everything simply, but rather if you can make anything simply.

Aufbauwerk 2045 · Dec 28, 2016

Humans are unreliable. Also our brains have obvious limitations in handling complex systems. This is why people are working on formal specifications and automatic programming. I think that eventually we will program only at a high specification level and the computer will implement the code.

It is possible now for a robot brain to write its own software to solve a problem, using first order predicate logic. This code is correct in a logical sense, since it is deduced from the original set of axioms in the robot's knowledge base. I have done some work on this myself. The result is an algorithm which can be implemented in an appropriate language.

Pascal was mentioned. Wirth has the right idea. He builds his software the Swiss way, which is to say intolerant of error. He wrote a book years ago called Systematic Programming. I was happy to find a used copy on eBay. It's worth reading even today.

Sherwood Botsford · Jan 1, 2017

Hmm. Pascal was my first language. Years ago when I was using Turbo Pascal, you had the option for range checking. By default it was on. The compiler inserted checks on every array and pointer operation so that the program couldn't access data not originally assigned to that variable. Wonder how much that slows down the software. My estimate is only a few percent.
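What a compiler-inserted range check amounts to can be sketched as follows. This is only an illustration of the idea in Python, not what Turbo Pascal actually emitted; the names `RangeCheckError` and `checked_get` and the `low..high` bounds convention are assumptions made for the example.

```python
class RangeCheckError(Exception):
    """Raised when an index falls outside the declared bounds."""


def checked_get(values, low, high, index):
    """Read element `index` of an array declared as array[low..high].

    The comparison below is the extra work a range-checking compiler
    inserts before each indexed access.
    """
    if index < low or index > high:
        raise RangeCheckError(f"index {index} not in {low}..{high}")
    return values[index - low]
```

With `temps = [20, 21, 19]` standing in for an `array[1..3]`, `checked_get(temps, 1, 3, 2)` returns `21`, while `checked_get(temps, 1, 3, 0)` raises `RangeCheckError`. Without the check, the second call would compute `temps[-1]` and silently return the wrong element, which is the kind of quiet error the option exists to catch. The cost per access is two comparisons and a branch.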

scottdave · Apr 7, 2018

Great article. Thanks for sharing. Here it is nearly 2 years old when I come across it and read it. Everything is still relevant. I'm taking a data science class; much of what goes into that is what decisions are made about how to handle certain data -- like flagging it or filtering it out.

donpacino · Apr 10, 2018

This is an amazing article that anyone aspiring to be an electrical or software engineer should read.

I also recently got bitten by a nimbus style situation, although much much smaller scale.

eachus · Jul 11, 2018

Sherwood Botsford said:

Hmm. Pascal was my first language. Years ago when I was using Turbo Pascal, you had the option for range checking. By default it was on. The compiler inserted checks on every array and pointer operation so that the program couldn't access data not originally assigned to that variable. Wonder how much that slows down the software. My estimate is only a few percent.

Negative. Not on the idea, but on the impact of having global checking enabled. One of the first things I learned when we were developing an Ada compiler at Honeywell was how to find every place the compiler generated code to raise errors. For some of them, I was looking to fix the compiler, because it had missed an inference. Some? Oops! Blush, my error. And a few really belonged there.

Today you can program in SPARK, which is technically a strict subset of Ada, plus a verifier and other tools. It is extra work, but worth it for software you need to trust. Sometimes you need that quality, or better. I remember one program where the "Red Team" I was on was dinged because ten years after the software was fielded, no decision had been made about whether to hire the contractor to maintain the code, or use an organic (AF) facility. I just shook my head. There were still no reported bugs, and I don't think there will ever be. Why? If you remember the movie WarGames, the project was writing the code for the WOPR. Technically the computer did not have firing authority. It just decoded launch control messages presented to the officers in the silos (well, in between a group of five silos). The software could also change targeting by reprogramming the missiles directly. We very much wanted the software to be perfect when fielded, and not changed for any reason without months (or longer) of revalidation.

Let me also bring up two disasters that show the limit of what can be accomplished. In a financial maneuver, Arianespace let a contract to upgrade the flight control software for Ariane 4. I'm hazy on why this was needed, but Arianespace, to save money on Ariane 5, decided they were going to reuse all of the flight control and engine management hardware and software from Ariane 4 on Ariane 5. The Ariane 4 FCS was subcontracted by the German firm doing the hardware part of the upgrade to an English company. The software developers, aware that the software would also be used on the Ariane 5, asked to see the high-level design documents for Ariane 5. Fearing that this would give the English an advantage in bidding for Ariane 5 work, the (French) management at Arianespace refused.

Since the guidance hardware and software were already developed, the test plan was a "full-up" test where the engines and gyroscopes would be mounted, as in the Ariane 5, on a two-degree-of-freedom platform which would allow for a flight of Ariane 5's first stage from launch to burnout. The test ran way over budget and behind schedule, and it became the "long pole in the tent," the last box on a PERT or Gantt chart. Rather than wait another year on a project already behind schedule, they went ahead with the launch.

If you have any experience with software for systems like this, you know that there are about a dozen "constants" that are only constant for a given version of that rocket, jet engine, airplane, etc. Things like gross takeoff weight, moments of inertia, not-to-exceed engine deflections, and so on. Since the software hadn't been touched, the Ariane 5 launched with Ariane 4 parameters. One difference was that Ariane 5 heads East a lot earlier in the mission. And Ariane 4 had a program to update the inertial guidance system, which Ariane 4 needed to run for 40 seconds after launch. (On Ariane 5 it could be shut off at t=0.) Ariane 4 could be aborted and rapidly "turned around" until t=6. That capability was used, but couldn't happen on Ariane 5: immediately after t=0, it was airborne. On the first Ariane 5 launch the clock reached t=39, the Ariane 5 was "impossibly" far away from the launch site, and the unnecessary software (which was only needed, remember, on the pad) aborted both flight guidance computers. The engines deflected too far, the stack broke up, and a lot of VIPs got showered by hardware. (Fortunately no one was killed.)

What does this story tell you? That software you write, perfect for its intended function, can cause a half a billion Euros of damage, if it is used for an unintended purpose. We used to say that software sharing needs to be bottom up, not top down, because an M2 tank is not an M1 tank. Now we point to Ariane instead.
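The post does not spell out the mechanism, but the published inquiry report on the first Ariane 5 flight attributes the shutdown to an unprotected conversion of a 64-bit floating-point value into a 16-bit signed integer, with a value that could not occur on the Ariane 4 trajectory. A minimal sketch of that class of failure follows. The function names, the scale factor and the sample velocities are hypothetical, picked only so that one input fits in 16 bits and the other does not.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(value):
    """Convert a float to a 16-bit signed integer, refusing values that do not fit."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return result


def horizontal_reading(velocity, scale):
    """Scale a velocity and store it in a 16-bit field (hypothetical formula)."""
    return to_int16(velocity * scale)
```

With a scale of `100.0`, `horizontal_reading(300.0, 100.0)` returns `30000`, which fits. A faster vehicle giving `horizontal_reading(400.0, 100.0)` produces `40000` and raises `OverflowError`. Neither function changed between the two calls; only the operating envelope did, which is the point of the story.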

The second case is even more of a horror, and not just because it killed several hundred people. The A320 was programmed by the French in a language that they developed to allow formal proofs of correctness. The prototype crashed at an airshow; fortunately, most of the people on board survived. The pilot had been operating way outside the intended operating envelope. He made a low, slow pass, then went to pull up to clear some trees. The plane wanted to land, but there wasn't enough runway, and when the plane and pilot agreed on aborting the landing and going around, it was too late. The engines couldn't spool up fast enough. Unfortunately the (French) investigators didn't look at why the engines hadn't been spooled up already.

There were several more A320 crashes during landings and very near the runway. Lots of guesses, no clear answers. Then there was a flight into Strasbourg. The pilots had flown the route many times before, but this time they were approaching from the north due to winds. The plane flew through a pass, then dived into the ground. The French "probable cause" claims pilot error in setting the descent rate into the autopilot. The real problem seems to have been that the pilots set a waypoint in the autopilot at a beacon in the pass. The glide path for their intended runway, if extended to the beacon, was underground. The autopilot would try to put the aircraft into the middle of the glide path as soon as possible after the last waypoint. Sounds perfectly sensible in English, French, and the special programming language. But of course, that requirement was deadly.

The other crashes were not as clear. Yes, a waypoint above a rise in the ground, and an altitude for clearing the waypoint that put the plane above the center of the glide path. Pilots argue that the A320 could be recovered after flipping its tail up. The software has been fixed; the French say the problem was a dual use control (degrees of descent or hundreds of feet per minute) and that pilots could misread or ignore the setting. But the real lesson is that it doesn't matter how fancy your tools are, if they can conceal fatal flaws in the programmer's thinking. (And conceal them from those who review the software as well.)

rootone · Jul 11, 2018

Most compilers have options to insert code which can help with detection of anomalies.
This does result in longer run times, but on modern CPUs the difference is rarely noticeable.
If you have got your source code polished down to being near 100% reliable, there might be a small advantage in turning off those debugging options.
