How can scientists trust closed source programs?

In summary: There are various ways to perform these checks. One is to use a mathematical model of the system to verify that the calculations produce the correct results. This can be done using a variety of methods, including automated theorem provers and mathematical programming languages. In some cases, it may be possible to compare the results of the mathematical model with experimental data. If the model and data agree, this provides some level of confidence that the system is correct. Another approach is to use a version of the system that is not in use in the real world. This can be done by simulating the system on a computer, or by testing it in a virtual environment. If the system behaves the same in the simulation as it does in the real world, that again increases confidence that it is correct.
  • #1
fluidistic
Gold Member
I wonder how scientists can trust closed-source programs/software.
How can they be sure there aren't bugs that return a wrong output every now and then? Assuming they use some kind of extensive testing to figure out whether the program behaves as it should, how can they be sure that the program isn't going to suffer from bugs and the like (malicious code included) in future releases? Are there any extensive tests performed on software that is generally used in branches of physics or other sciences involving data analysis? Blindly trusting a closed-source program seems to go against the scientific mindset to me.
 
  • Like
Likes M Saad, nrqed and Greg Bernhardt
  • #2
You can't, but then open source software is not immune to bugs either.
 
  • #3
Unfortunately, I think there is a lot of blind trust in closed source programs (and open ones, for that matter).

That said, proper operation of a code is verified through (i) benchmarking, (ii) routine quality assurance testing, and (iii) independent checking. In my field, Medical Physics, for example, we often use commercial software for planning radiation therapy treatments in the clinic. These programs determine where the radiation dose goes in the patient and what parameters to set on the linear accelerator to deliver the intended treatment. It's very important that these codes get these calculations correct every time.

So before implementing clinically, we first have to run through a set of basic tests to confirm that the code accurately reproduces measurements under given conditions. Of course, even before this, we go through the literature, where these tests have been performed by others. This is how we establish how reliable the given algorithm is and the conditions under which its assumptions break down. This also lets us know what a reasonable tolerance is - how close to measured values we can expect to get. Then we run through a set of our own tests confirming that our version performs as advertised. Of course, you can't test everything, but you can try to cover both commonly encountered situations and extreme situations where the code may not perform so well.

Once you've effectively benchmarked your code, it's also important to put it through routine quality assurance testing. So, for example, you may want to repeat a subset of your benchmarking calculations once a month, or after a software version upgrade, or after a patch installation, to assure yourself that your code is still performing as you expect.
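
To make that concrete, here is a minimal Python sketch of what such a routine benchmark/QA check could look like. The reference cases, tolerances and the `calculate_dose` wrapper are all made up for illustration; a real clinic would use its own commissioned beam data.

```python
# Hypothetical sketch of a routine QA/benchmark check: re-run a set of
# reference cases through the (closed-source) dose engine and compare
# against previously validated measurements within a stated tolerance.
# The function `calculate_dose` and all numbers below are placeholders.

REFERENCE_CASES = [
    # (field size in cm, depth in cm, measured dose in Gy, tolerance as a fraction)
    ((10, 10), 10.0, 0.667, 0.02),
    ((5, 5), 5.0, 0.762, 0.02),
    ((20, 20), 15.0, 0.541, 0.03),
]

def run_qa_suite(calculate_dose):
    """Return a list of (case, passed, relative_error) tuples."""
    results = []
    for field, depth, measured, tol in REFERENCE_CASES:
        computed = calculate_dose(field_size=field, depth=depth)
        rel_err = abs(computed - measured) / measured
        results.append(((field, depth), rel_err <= tol, rel_err))
    return results

if __name__ == "__main__":
    # `calculate_dose` would wrap the vendor's planning system; here we fake it.
    fake_engine = lambda field_size, depth: 0.665 if depth == 10.0 else 0.75
    for case, passed, err in run_qa_suite(fake_engine):
        print(case, "PASS" if passed else "FAIL", f"rel. error = {err:.3%}")
```

Re-running a script like this after every upgrade or patch gives an auditable record that the black box still reproduces the commissioning results.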

Finally, when it comes to something critical like clinical calculations, we confirm the results through redundant, and independent checks or measurements. This can be as simple as performing a hand calculation or using a completely different planning system to redo the calculation. When independent systems arrive at the same answer, you have some increased confidence that the answer is correct. It's still possible they can both arrive at the wrong answer - GIGO and all that - but this serves to increase confidence that at least your black box is working as expected.

On a research front, it's important to be doing the same things.
 
  • Like
Likes Jozape, Buzz Bloom, M Saad and 6 others
  • #4
fluidistic said:
I wonder how scientists can trust closed-source programs/software.
How can they be sure there aren't bugs that return a wrong output every now and then? Assuming they use some kind of extensive testing to figure out whether the program behaves as it should, how can they be sure that the program isn't going to suffer from bugs and the like (malicious code included) in future releases? Are there any extensive tests performed on software that is generally used in branches of physics or other sciences involving data analysis? Blindly trusting a closed-source program seems to go against the scientific mindset to me.

Why trust anything? How do you know the brakes on your car won't fail after 1000 miles? And, when you fly, do you do a personal check of the aircraft's engine, guidance systems and everything?

Or, if you buy copper sulphate from someone, how do you know it really is copper sulphate? Or, if you buy a box of drugs, how do you know some of them aren't a placebo? Or, a different drug altogether?

In order to fully check the source code for a system, you would need to be relatively expert in the software technologies. Many systems may be an integration of several technologies, so no one person (even in the software development company) would be able to check it: you would need a team of people. Even then, source code may be inscrutable without all the software development facilities that were used to create it. In fact, from a software engineering perspective, starting with the source code would be a very inefficient way to verify a system.
 
  • Like
Likes RJLiberator, Samy_A, Redbelly98 and 2 others
  • #5
For use in applications where safety is involved, there is usually a certification process that must be completed before the software can be used. @Choppy's example of radiation therapy is a good example of that. But then the computer operating system and compilers must also be certified. They are not just trusted blindly.

In the case of scientific research that does not have safety consequences, there is no formal certification process. You should not be reckless about the software you pick to use. Don't use experimental versions unless you want to do a lot of testing that has nothing to do with your research. Unless you are doing something really unusual, there are probably well tested versions of software that you can use.
 
  • Like
Likes QuantumQuest and fluidistic
  • #6
There is no "magic bullet" to guarantee valid code. Making bug-free software is an example of "defense-in-depth". There are software development process that should be followed, unit testing, integrated testing, code standards, code review processes, etc., etc., etc. There is a set of code standards, MISRA-C that tells you what you should do or not do in your code. There are several code analysis tools that examine code for risky practices, test coverage, etc. Even when all the processes and rules are followed, some bugs escape to the released software. Then it is up to the public and the developer to spot and fix the mistakes.
 
  • #7
PeroK said:
Why trust anything? How do you know the brakes on your car won't fail after 1000 miles? And, when you fly, do you do a personal check of the aircraft's engine, guidance systems and everything?

Or, if you buy copper sulphate from someone, how do you know it really is copper sulphate? Or, if you buy a box of drugs, how do you know some of them aren't a placebo? Or, a different drug altogether?
When the plane lands safely, you know that if there was a problem, it did not matter for you. With the output of a closed-source program, on the other hand, you have no intuition about whether the results are fine or whether they are, say, 0.8% too low or too high. You don't necessarily get the feedback you'd get with a drug, a plane, or copper sulphate.

In order to fully check the source code for a system, you would need to be relatively expert in the software technologies. Many systems may be an integration of several technologies, so no one person (even in the software development company) would be able to check it: you would need a team of people. Even then, source code may be inscrutable without all the software development facilities that were used to create it. In fact, from a software engineering perspective, starting with the source code would be a very inefficient way to verify a system.
Of course checking the full source code is most of the time impossible. But one does not generally use all the functionality of a complicated piece of software either. Say I use a program that fits X-ray diffractograms, the program claims to be using "name_of_algorithm"'s algorithm, and I want to check how exactly it's implemented. Or say the program claims to use the Scherrer equation to give the crystallite size but does not specify the value it uses for "K", the shape factor. In both cases it gets complicated to figure out what the program is really doing.
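
To put a number on the Scherrer example: the equation is D = Kλ/(β cos θ), and a couple of lines show how much an undocumented shape factor K changes the reported crystallite size (the wavelength, peak width and angle below are just plausible example values).

```python
# Illustration of the point above: the Scherrer equation
#     D = K * lambda / (beta * cos(theta))
# gives different crystallite sizes depending on the shape factor K,
# which a closed program may not document. Values below are examples only.
import math

wavelength_nm = 0.15406          # Cu K-alpha
fwhm_rad = math.radians(0.30)    # peak broadening (FWHM), instrumental part removed
theta_rad = math.radians(19.0)   # Bragg angle (half of 2-theta = 38 degrees)

for K in (0.89, 0.94, 1.0):
    D = K * wavelength_nm / (fwhm_rad * math.cos(theta_rad))
    print(f"K = {K:4.2f}  ->  crystallite size ~ {D:.1f} nm")
# The spread between K = 0.89 and K = 1.0 is already more than 10%, so not
# knowing which K the program uses directly limits how much you can trust the number.
```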

I realize that when publishing a paper, a scientist could specify that the values were calculated via "name_of_software_name_of_version", so that if one day someone realizes something was faulty with that software, the published results can either be corrected or discarded.
 
  • #8
fluidistic said:
When the plane lands safely, you know that if there was a problem, it did not matter for you.
And if you crash, you know there was a problem. Which is certainly a much worse outcome than a value in a publication that is off a bit (with a few exceptions, like studies related to the safety of systems like aircraft...).

But unlike the aircraft you use, you can check the software. Run it on test cases where you know the expected outcome. Sure, they don't cover everything, but if the software passes all test cases you can be quite confident that it works with your actual data as well.
This is routinely done for basically all software packages.
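
A common way to build such test cases is to generate synthetic data from known parameters and check that the black box recovers them. A minimal sketch, with scipy's curve_fit standing in for the program under test and illustrative tolerances:

```python
# Feed a black-box fitting routine synthetic data generated from known
# parameters and check that it recovers them within a stated tolerance.
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(-b * x)

rng = np.random.default_rng(0)
true_a, true_b = 2.5, 0.7
x = np.linspace(0, 5, 200)
y = model(x, true_a, true_b) + rng.normal(scale=0.02, size=x.size)

(fit_a, fit_b), _ = curve_fit(model, x, y, p0=(1.0, 1.0))

# If the recovered parameters fall outside the expected tolerance,
# either the data, the model, or the black box deserves a closer look.
assert abs(fit_a - true_a) / true_a < 0.05, fit_a
assert abs(fit_b - true_b) / true_b < 0.05, fit_b
print(f"recovered a = {fit_a:.3f} (true {true_a}), b = {fit_b:.3f} (true {true_b})")
```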
 
  • Like
Likes fluidistic
  • #9
mfb said:
Sure, they don't cover everything, but if the software passes all test cases you can be quite confident that it works with your actual data as well.
This is routinely done for basically all software packages.
That's what I wanted to know and that's reassuring. If the information of which tests have been done for which program and which version were available to the public, that would be great.
 
  • #10
One way to test it is to develop a test suite. However, even then it's possible that a bug would slip through.

If you recall there was the famous Pentium bug that occurred under specific circumstances:

https://en.wikipedia.org/wiki/Pentium_FDIV_bug

It would appear as a software bug but was in fact hardware related.
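
For reference, the widely cited consistency check for that flaw takes only a couple of lines; on a correct FPU the expression below is zero (up to rounding), while the defective chips famously returned 256.

```python
# The widely cited consistency check for the Pentium FDIV bug: on a correct
# FPU this expression comes out as 0.0 (or at most a rounding-level difference);
# on the flawed chips it came out as 256 because 4195835/3145727 was computed wrong.
x, y = 4195835.0, 3145727.0
print(x - (x / y) * y)   # expect (essentially) 0.0 on non-defective hardware
```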
 
  • #11
fluidistic said:
That's what I wanted to know and that's reassuring. If the information of which tests have been done for which program and which version were available to the public, that would be great.
Publications often have a limited size, you cannot write up every little detail.
 
  • #12
Popular software products often have web sites where bugs are reported and discussed.
 
  • Like
Likes fluidistic
  • #13
jedishrfu said:
If you recall there was the famous Pentium bug that occurred under specific circumstances

Ah yes. They call it "floating" point for a reason. :)
 
  • Like
Likes Stephanus
  • #14
I don't want to be paranoid, but there is a huge difference between testing for accidental programming errors and testing for intentional malicious code. Intentionally malicious code can be programmed to only show up under certain circumstances that may not ever occur during testing. To me, that's a big difference between open source and closed source. In the case of open source, you can actually study the code to see if there is peculiar logic that would only show up in certain circumstances.
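
A toy illustration of that point (invented for this post, not taken from any real product): logic that behaves perfectly on every test run except under one narrow condition that a finite black-box test suite is very unlikely to hit.

```python
# Toy example only: code that is correct except under a narrow, unlikely
# condition, which ordinary benchmarking will almost never trigger.
import datetime

def average(values, _now=datetime.date.today):
    result = sum(values) / len(values)
    # Hidden trigger: nudge the answer only on one particular date, so any
    # test run on a different day sees perfect behaviour.
    if _now() == datetime.date(2016, 11, 8):
        result *= 1.008
    return result

# Every test run on a normal day passes; only reading the source (or testing
# on exactly the trigger date) reveals the manipulation.
print(average([1.0, 2.0, 3.0]))
```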
 
  • Like
Likes Buzz Bloom and FactChecker
  • #15
stevendaryl said:
I don't want to be paranoid, but there is a huge difference between testing for accidental programming errors and testing for intentional malicious code. Intentionally malicious code can be programmed to only show up under certain circumstances that may not ever occur during testing. To me, that's a big difference between open source and closed source. In the case of open source, you can actually study the code to see if there is peculiar logic that would only show up in certain circumstances.

Paranoia may be justified in the case of software where there might be a motivation to sometimes give the wrong answer (for example, the code that counts electronic votes in an election).
 
  • Like
Likes Buzz Bloom
  • #16
In the end software development is just another engineering discipline.
As with any other, the first version of something usually does have unexpected bugs, even though there may have been a lot of time dedicated to testing and QA.
As the product matures, later versions become more reliable, until there is no longer a significant number of bug reports, and those that remain frequently are not actually bugs but operator or input-data errors.
Even those get ironed out eventually by 'defensive' programming adjustments which detect and report improper input and so on before the program proceeds.
 
  • #17
This is a worthy topic. I think perhaps the focus on errors is too narrow. It is even too narrow to focus on closed software, or to focus on software at all. There are many ways for things to go wrong or right. Machines add some new risks and reduce other risks.

May I recommend the book "Computer Related Risks" by Peter G. Neumann. Using numerous case histories, the book illustrates the nature and number of risks involved when humans and machines interact. It was written way back in 1994, but it is not at all dated. Many of the mistakes committed in decades past will be repeated in decades future. If you do read it, I expect that you will see that a much broader view of risks is appropriate.
 
  • #18
The VW emissions scandal is another example of software maliciously giving incorrect answers during pollution tests.
 
  • #19
stevendaryl said:
Paranoia may be justified in the case of software where there might be a motivation to sometimes give the wrong answer (for example, the code that counts electronic votes in an election).
How would you know that an election system was actually running the open source code that someone claimed it was?
 
  • Like
Likes harborsparrow
  • #20
PeroK said:
How would you know that an election system was actually running the open source code that someone claimed it was?
An individual voter may not know that, but if it happened it would be fraud, and it would have to be perpetrated on a massive scale to be effective.
Such a conspiracy theory falls down at the first hurdle, like the 'moon landing hoax' theory.
The conspiracy would need to involve many people, hundreds at least, keeping silent about something they knew about.
That's realistically not possible.
 
  • Like
Likes 1oldman2
  • #21
PeroK said:
How would you know that an election system was actually running the open source code that someone claimed it was?

I didn't claim that open source would solve everything.
 
  • #22
stevendaryl said:
I don't want to be paranoid, but there is a huge difference between testing for accidental programming errors and testing for intentional malicious code. Intentionally malicious code can be programmed to only show up under certain circumstances that may not ever occur during testing. To me, that's a big difference between open source and closed source. In the case of open source, you can actually study the code to see if there is peculiar logic that would only show up in certain circumstances.
Good point. Of course, the original code might simply contain vulnerabilities to malicious attack. That is yet another problem, since the code that was tested may not yet contain the malicious code. Some software analysis products are available to scan code for vulnerabilities. One is Coverity Static Code Analysis. I don't have much experience with it, so I don't know how well it works. I can say that it can find a lot of vulnerabilities and bad practices in code. But I don't know how much is left that it doesn't find.
 
  • #23
stevendaryl said:
I didn't claim that open source would solve everything.
Yes, I know. I didn't intend it like that. Election software is a good example because it's not clear who needs to trust whom.
 
  • Like
Likes stevendaryl
  • #24
jedishrfu said:
One way to test it is to develop a test suite. However, even then it's possible that a bug would slip through.
Yes, and to complicate matters, bugs can also be found in the test suites themselves.
 
  • #25
rootone said:
An individual voter may not know that, but if it happened it would be fraud and it would have to be perpetrated on a massive scale to be effective.
There was a surprisingly small number of votes involved in Florida in 2000.

Is there a lot of closed-source software specifically produced for science? I have worked with both open-source and closed-source software, but the latter only as standard programs. I can't imagine the programmers of e.g. Matlab introducing malicious code to mess around with some specific particle physics publications. How would you do that (without even knowing if and where Matlab would be used), and what would be the point?
Jaeusm said:
Yes, and to complicate matters, bugs can also be found in the test suites themselves.
Unless two bugs cancel each other, you still see that something needs more attention.
 
  • Like
Likes M Saad
  • #26
rootone said:
An individual voter may not know that, but if it happened it would be fraud, and it would have to be perpetrated on a massive scale to be effective.
Such a conspiracy theory falls down at the first hurdle, like the 'moon landing hoax' theory.
The conspiracy would need to involve many people, hundreds at least, keeping silent about something they knew about.
That's realistically not possible.

It wouldn't take a major conspiracy to introduce malicious code at an appropriate point in the release cycle. It's approximately the same as a software vendor selling malicious code in the first place. It just depends on who you trust.

On a less dramatic note, many system support teams, in my experience, insist on getting the source code and recompiling it in every environment, thereby introducing a major uncertainty about whether the system in the live environment is the same as the one that was tested.

From my experience, more problems in a live system are caused by environmental and configuration issues than by traditional software bugs.
 
  • #27
stevendaryl said:
I don't want to be paranoid, but there is a huge difference between testing for accidental programming errors and testing for intentional malicious code. Intentionally malicious code can be programmed to only show up under certain circumstances that may not ever occur during testing. To me, that's a big difference between open source and closed source. In the case of open source, you can actually study the code to see if there is peculiar logic that would only show up in certain circumstances.
For mission-critical and life-safety systems, part of the design includes defending against malicious attacks. Moreover, you need to have traceability from the source code and other source components to the final system images. And you need to have controls in place that guarantee that what is being used in the manufacturing process is exactly what was tested during the release process.

As far as source code is concerned, yes, every line of code is peer reviewed. And, as was mentioned earlier, there are often standards such as MISRA that need to be followed. Those standards do two things: they minimize the influence of a single bug, and they make the code easier to review. Also, there are tools that scan the source code (static software tests) to report violations of MISRA standards.

There are also several layers of testing. First, there is modular black-box testing where individual software modules are tested by software programs that are developed based on the documented design for the target module. Then there is white box testing, where every line of code is checked - and there are "code coverage tools" for measuring how thorough this white box testing is. Any line of code that is not covered by a test needs to be examined so that it is understood why it cannot be directly tested.
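
A minimal sketch of how the coverage idea plays out in practice (the function and test are invented; the comments name coverage.py, one common tool for this):

```python
# Sketch of the white-box/coverage idea: the error branch below is never
# exercised by the single test, and a coverage tool (e.g. coverage.py,
# run as `coverage run -m pytest` followed by `coverage report -m`) will
# flag the untested lines so the gap can be justified or a test added.

def safe_ratio(num, den):
    if den == 0:                      # branch a naive test suite may never take
        raise ZeroDivisionError("denominator is zero")
    return num / den

def test_typical_case():
    assert safe_ratio(6.0, 3.0) == 2.0
```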

Then, there is integration, system testing, and field testing. In each case, the test procedure is developed and checked to determine whether the tests are comprehensive.

Finally, to the extent possible, the system is designed to be fault tolerant. Commonly, there are two separate software systems - each designed by separate software teams - so if one fails, there is a backup. And the system itself is commonly designed so that the mechanics limit the damage that can be done and afford a manual override.
 
  • Like
Likes M Saad, mfb and FactChecker
  • #28
PeroK said:
Yes, I know. I didn't intend it like that. Election software is a good example because it's not clear who needs to trust whom.
With any system there are "stakeholders" who determine the requirements and audit the testing - and in many cases, fund the development process. In the case of an election system, the corporate entity that is funding the project is the first stakeholder. They not only want the system to work, but should want the testing to be auditable. Their customers, mostly states and municipalities, are also stakeholders. For their money, they are going to want some reason to be confident that the systems work and are hardened against fraud.
 
  • #29
fluidistic said:
When the plane lands safely, you know that if there was a problem, it did not matter for you. With the output of a closed-source program, on the other hand, you have no intuition about whether the results are fine or whether they are, say, 0.8% too low or too high.
When the plane lands safely, you actually don't know that much. You know the software worked on that flight, but not how it will behave on another flight under different circumstances. For example, Pitot tube freezing (which can happen or not, depending on a large number of other variables) has sometimes resulted in dangerous software behavior, and at least once it was instrumental (together with human error and disorientation) in a major air disaster, with an aircraft that had otherwise flown over 2,600 times without any relevant trouble. (To make things worse, the problem was known and studied, and there were even procedures in place to handle it, but it hadn't been thoroughly corrected, for a variety of reasons that would take too long to explain here.)

Advanced aviation software is as closed and proprietary as it gets; Boeing, Airbus and the like are not going to show you their own or their suppliers' industrial secrets. But I don't think an open-source approach would improve things very much. First, there are not many people able to properly evaluate advanced, model-specific avionics code under realistic conditions... maybe a few major airlines could if they were allowed to and decided to spend their money doing it, but that's about it (the real method is notifying the manufacturer about any perceived glitch). I'd say this applies to other highly specialized industries like nuclear power plants, refineries, etc. Heavy testing, redundant systems and certification (and re-certification as needed) are the way to go. And we don't tend to have many nuclear disasters, burning refineries or even software-caused air disasters. They can happen, yes. But it's highly improbable, even in "analog" real-life situations with huge numbers of not-so-predictable interacting variables.

I'm not sure how this applies to purely scientific fields, but I wouldn't be surprised if we found quite a few analogies.
 
  • Like
Likes PeroK
  • #30
A few satellites and space probes got lost due to software issues. Examples:
Mars Climate Orbiter was lost because of a missing conversion between imperial and SI units: thruster data was supplied in pound-force seconds where newton-seconds were expected (a sketch of this class of mistake follows at the end of this post).
The four "Cluster" spacecraft (designed to measure the magnetosphere of Earth) got lost due to an integer overflow in the rocket.
CryoSat (designed to monitor polar ice) got lost due to an unspecified software bug in the rocket.
Galaxy X (whatever that was supposed to do) got lost due to a software bug in the rocket's oscillation control.
STEREO-B's problem (sun observation) is still unclear.
Various others failed for unknown reasons, and you cannot just go there and have a look...

Wikipedia has a list
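
As a toy sketch of the Mars Climate Orbiter class of mistake (the numbers are illustrative, not mission data): one component reports thruster impulse in pound-force seconds while another silently assumes newton-seconds.

```python
# Sketch of a silent unit mix-up: one component reports thruster impulse in
# pound-force seconds, another assumes newton-seconds. Numbers are examples only.
LBF_TO_N = 4.4482216152605   # definition of the pound-force in newtons

impulse_reported = 150.0     # what one piece of software sent (lbf*s)

# Wrong: treat the number as if it were already in SI.
impulse_assumed_si = impulse_reported            # "150 N*s"

# Right: convert explicitly.
impulse_converted = impulse_reported * LBF_TO_N  # ~667 N*s

error = impulse_converted / impulse_assumed_si
print(f"silent unit mix-up understates the impulse by a factor of {error:.2f}")
# Carrying units explicitly (or using a units library such as pint) turns this
# silent factor-of-4.45 error into a loud dimension error instead.
```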
 
  • #33
There was also the famous lottery scam perpetrated by some officials of the lottery, allowing some relatives to "win" big, but not too big.

https://en.wikipedia.org/wiki/Hot_Lotto_fraud_scandal

http://www.nydailynews.com/news/national/lottery-fixing-scandal-spreads-nationwide-article-1.2470819

While not explicitly mentioned, there had to have been some sort of malicious software involved:

http://www.engadget.com/2015/12/19/lotto-hack/

http://arstechnica.com/tech-policy/...ed-lottery-computers-to-score-winning-ticket/

Another story of when it pays to be a software "tester":

http://www.wired.com/2013/05/game-king/

Lastly, Numb3rs had a great episode (season 05, episode 15) about how some hacks could influence jury selection in favor of the defendant. While it was only a story, it's a very plausible one, especially as we rely on software for all sorts of tasks; we can never know how it will be hacked until it is.
 
  • Like
Likes mfb
  • #34
What is the point of posting lists of software failures? Obviously, there are also long lists of failures with non-software causes. What does that have to do with the OP?

I picture the case of software involved with delivery of the orders to begin global thermonuclear war. Surely that must have the most severe consequences of any possible failure. I'm sure there are both humans and machines in that loop, but nothing can ever be perfect.

Would you open source or close source it?
Should we trust it? If not, then what?
Should we distrust it? If not, then what?
At what point does adding more resources to perfecting software (or anything) become counterproductive?
 
  • #35
.Scott said:
With any system there are "stakeholders" who determine the requirements and audit the testing - and in many cases, fund the development process.
I don't think this is necessarily true. Some software can have a very informal development history. You would need to be very loose with the terminology to make that statement about all software.
 
  • Like
Likes RaulTheUCSCSlug
