Security Versus Programming Language

In summary, it seems that trying to write secure operating systems in C does not work. Very smart people have tried for 50 years, and the solution to the problem has not been reduced to practice.
  • #36
C and C++ are entirely workable languages for creating a secure operating system and secure applications.

Security vulnerabilities are created by bad programming and bad software design. In the majority of cases, they are based on programming mistakes - buffer lengths that are not checked, resource arbitration that is flawed, memory that is used after it is free()'d, password security or encryption that is not well thought out, functions that are reentered but not coded for reentrancy, systems that depend on obscurity for their security. You will never have a secure system so long as you have sloppy design and coding.
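Two of those mistakes are easy to show in a few lines of C. This is only an illustrative sketch (the buffer sizes and strings are invented for the example); the commented-out lines are the defect, and the lines after them are one conventional repair:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* Unchecked buffer length: 8 bytes cannot hold the input,
       so a plain strcpy() would write past the end of buf. */
    char buf[8];
    const char *input = "much longer than eight bytes";
    /* strcpy(buf, input);                <-- classic overrun */
    strncpy(buf, input, sizeof buf - 1);  /* bounded copy instead */
    buf[sizeof buf - 1] = '\0';

    /* Use after free: p still holds the old address after free(),
       so dereferencing it is undefined behavior. */
    char *p = malloc(16);
    if (p == NULL)
        return 1;
    strcpy(p, "short");
    free(p);
    /* printf("%s\n", p);                 <-- use after free */
    p = NULL;                             /* neutralize the dangling pointer */

    printf("%s\n", buf);
    return 0;
}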

Certainly operating systems contribute their share of vulnerabilities. But ultimately the OS needs to allow Adobe, Chrome, your favorite game, etc to operate. Those apps then use third party libraries that can include defects. And once those apps are running, there is nothing the OS can do if the app turns over all of its resources to malicious code.

There is no substitute for consistently good design and programming practices.

I would not recommend coding in assembler (or machine language) when it can be avoided. It does nothing to promote a secure system, because that level of detail obscures the basic algorithm and implementation. The point of a programming language is to provide syntax that can be written and reviewed by people (programmers) and readily parsed by a compiler. On the other hand, it certainly pays to know exactly what code the compiler is generating, since that is part of your debug environment.
 
  • #37
There's a humongous thread in alt.folklore.computers circa 2005, concerning buffer overruns, that started as a crosspost from sci.crypt. 6k-ish posts. Unfortunately, trying to view it in Google Groups is giving my laptop a headache.

Long story short:

"OMG, how can we avoid 'buffer overruns'?"
"Hire better programmers."
"But we can't afford that. Why don't we just require that C automatically checks every single array reference for out-of-bounds access?"
"Are you insane?"

Doesn't matter, really: make something 'idiotproof' and the universe sees it as a challenge.
 
  • #38
It's a serious weakness of C. If a language does not have a built-in, protected, size indicator as part of its standard structures, then protecting code from space overruns is a problem.
 
  • #39
FactChecker said:
It's a serious weakness of C. If a language does not have a built-in, protected, size indicator as part of its standard structures, then protecting code from space overruns is a problem.
And C# does. But I would not suggest dropping C and C++ in favor of C#. They all have their place.
And, of course, you can borrow or create a memory management library. It won't be built in, but it'll still look good in the code.
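A minimal sketch of what such a library type might look like - buf_t, buf_new() and buf_set() are invented names, not any standard API - where the size travels with the data so every copy can be checked:

Code:
#include <stdlib.h>
#include <string.h>

/* Hypothetical bounded buffer: nothing like this is built into C. */
typedef struct {
    size_t size;   /* capacity of data[] in bytes */
    size_t len;    /* bytes currently in use      */
    char   data[]; /* flexible array member (C99) */
} buf_t;

buf_t *buf_new(size_t size)
{
    buf_t *b = malloc(sizeof *b + size);
    if (b != NULL) {
        b->size = size;
        b->len = 0;
    }
    return b;
}

/* Returns 0 on success, -1 if src would not fit. */
int buf_set(buf_t *b, const void *src, size_t n)
{
    if (b == NULL || n > b->size)
        return -1;
    memcpy(b->data, src, n);
    b->len = n;
    return 0;
}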
 
  • #40
FactChecker said:
It's a serious weakness of C. If a language does not have a built-in, protected, size indicator as part of its standard structures, then protecting code from space overruns is a problem.
Hammers have no built-in protection against hitting your fingers - that's what you expect from a hammer.
From a different angle: it is not a weakness, it is a feature...
 
  • #41
I once wrote a document about security problems. The main document is the property of the firm I was working for at the time, but this excerpt is general (and mostly my own):

3.1.1 The “C” Language
The “C” language is terse, flexible – and quite insecure. Other languages usually refuse to compile when they meet ambiguous constructions or parameters of the wrong type. “C” compiles everything, at most writing a warning to the console, and even that warning can be suppressed.
Therefore it should be mandatory to run all “C” code that is concerned with security through the lint code checker, or even better, the splint (Secure Programming Lint) code checker. While not a substitute for good programming practices, it catches most spurious errors and ambiguities.

3.1.2 The “Buffer Overflow” Vulnerability
A frequent root cause in Microsoft security bulletins, the “Buffer Overflow” vulnerability is usually caused by uncritical use of the standard C string copy function strcpy().
What this function does is copy a string into a buffer. What it does not do is check whether the string fits inside the buffer. Therefore, strcpy() will happily keep copying the string data on top of whatever data are adjacent to the buffer. This behavior causes all kinds of problems – from the obscure to the catastrophic.
There are several ways around this vulnerability (a short sketch follows the list). You can:
• Check the length of the input string (using strlen()) before copying. If the input string is longer than expected, you can raise an error or allocate a larger buffer.
• Use the safer version of strcpy(), namely strncpy(). Remember that the length parameter in strncpy() should be one less than the size of the buffer, and that the last character in the buffer should be set to ‘\0’ after the copy.
• Use strdup() instead. This function creates a duplicate of the input string and returns a pointer to the copy. Remember to get rid of the copy (free()) when you are finished with it!
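A short sketch of all three approaches in one place (copy_examples() and BUF_SIZE are invented for the illustration; error handling is reduced to a message):

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 32

void copy_examples(const char *input)
{
    char buf[BUF_SIZE];

    /* 1. Check the length before copying. */
    if (strlen(input) >= BUF_SIZE) {
        fprintf(stderr, "input too long\n");
        return;
    }
    strcpy(buf, input);            /* safe only because of the check above */

    /* 2. strncpy(): bound the copy, then terminate explicitly. */
    strncpy(buf, input, BUF_SIZE - 1);
    buf[BUF_SIZE - 1] = '\0';

    /* 3. strdup() (POSIX, now also C23): let the library allocate
       a buffer of the right size. */
    char *copy = strdup(input);
    if (copy != NULL) {
        printf("%s\n", copy);
        free(copy);                /* the caller owns the duplicate */
    }
}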

3.1.3 The “Null Pointer” Vulnerability
Several “C” library functions (e.g. malloc()) return a NULL pointer to indicate an error. Sloppy coding skips testing the returned pointer for NULL and uses it as if it were a valid pointer. Writing something into location 0x0000 (=NULL) - or close by – usually introduces a catastrophic fault at an unrelated part of the software.
Security experts see unchecked null pointers as the next great source of vulnerability exploits.
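The defensive pattern is a one-line check on every pointer-returning call; a minimal sketch:

Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t n = 1024;

    /* malloc() returns NULL on failure; using the result without
       checking is exactly the vulnerability described above. */
    char *p = malloc(n);
    if (p == NULL) {
        fprintf(stderr, "out of memory\n");
        return EXIT_FAILURE;
    }

    memset(p, 0, n);   /* safe: p is known to be valid here */
    free(p);
    return EXIT_SUCCESS;
}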
 
  • #42
I have coded thousands of strcpy()'s and they all work. Using that routine without considering buffer lengths or the possibility of invalid pointers is barely good enough for one-off code that will never see the light of day. It is not good enough for even the lowest impact application that is going to be made public.

Svein said:
“C” compiles everything, at most writing a warning to the console, and even that warning can be suppressed.
My last couple of jobs have involved mission-critical and life-critical systems where you buy additional tools to look for additional warnings - which you then resolve. But even with less critical systems (such as video processing) there were software standards.
Allowing any warning to go unaddressed allows new warning messages to get lost in the clutter.

I am astounded that major corporations need to resort to things like C# to hold to coding standards. It suggests that there are armies of programmers who aren't eager to do their full jobs.
 
  • #43
Svein said:
3.1.2 The “Buffer Overflow” Vulnerability
A frequent root cause in Microsoft security bulletins, the “Buffer Overflow” vulnerability is usually caused by uncritical use of the standard C string copy function strcpy().
Or scanf()...
Svein said:
Use the safer version of strcpy(), namely strncpy().
Several versions of Visual Studio ago, Microsoft introduced scanf_s() and several other secure input functions in the C library, which take an additional size parameter when character arrays are to be read. Current VS versions have deprecated the older, insecure versions.
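For reference, a rough sketch of the bounded forms - with the caveat that the exact size-argument type differs between MSVC's scanf_s() and the C11 Annex K specification, and that the #ifdef branch only compiles on libraries that actually provide Annex K:

Code:
/* Ask for the Annex K "_s" functions where the library provides them. */
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdio.h>

int main(void)
{
    char name[32];

    /* Classic scanf("%s", name) places no limit on the input and can
       overrun 'name'. Two bounded alternatives: */

    /* Portable C: put the limit in the format string itself. */
    if (scanf("%31s", name) == 1)
        printf("hello, %s\n", name);

#ifdef __STDC_LIB_EXT1__
    /* Annex K form: the buffer size follows the pointer argument
       (MSVC's scanf_s is similar but takes an unsigned size). */
    if (scanf_s("%s", name, (rsize_t)sizeof name) == 1)
        printf("hello again, %s\n", name);
#endif

    return 0;
}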
 
  • #44
PeterDonis said:
My point is that none of this has anything to do with whether the executable code itself is compiled. You are focusing on compilation and linking as though that's the problem, but it isn't. The problem is executable code not having protection against being changed to something other than what the user wanted it to be.
Compilation and linking is indeed a part of the problem, although it is better described as end-to-end validation of the entire toolchain. A good summary of the issue is Ken Thompson’s 1984 Turing Award lecture: https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf (although it does contain some anachronisms: written before any significant uptake of multi-byte character encodings; no discussion of hashing and signing; ...).

Protecting executable code is also only part of the problem, as altering the code is only one way of persuading code at a privilege level other than your own to do something you are not allowed to do. SQL injection attacks are a trivial example, but there are many more.

And to return to the subject in the thread title... The idea that one programming language might be more secure than another is a snare and a delusion. One set of development practices may lead to more secure designs and implementations (confusing these two is another fertile source of security vulnerabilities) than another, and as part of defining these practices we may decide that one programming language will integrate with these practices better than another, but that tells us more about our approach than about whether one language is more secure than another.

The most secure system I ever worked with was written completely in C, including the ROM-resident boot loader. The very small microkernel that managed mapping from virtual to physical addresses (aggressive use of the x86 architecture’s segmentation machinery here) was formally validated, and although it is not feasible to formally validate a toolchain that includes a C compiler, the toolchain was carefully audited. The design goal was to be able to run arbitrary binaries outside of the microkernel with confidence that compartmentalization (yes, this was a national security application) could not be violated in either direction.
 
  • #45
Nugatory said:
A good summary of the issue is Ken Thompson’s 1984 Turing Award lecture: https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf

That lecture supports the idea that secure software is an oxymoron, because obviously, code is used by users other than the author.
Ken Thompson’s 1984 Turing Award lecture said:
The moral is obvious. You can't trust code that you did not totally create yourself.
Nugatory said:
The most secure system I ever worked with was written completely in C, including the ROM-resident boot loader.
Ken Thompson would not have trusted your secure system because he didn't write it. Details of why you trust it are irrelevant to Mr. Thompson.

But that level of philosophy is far above the properties of any SW, any language, any hardware.

A good discussion of trust is given in Bruce Schneier's book Liars and Outliers: Enabling the Trust that Society Needs to Thrive.

If we trust nothing, then we become paralyzed and unable to function. We suffer from self-inflicted pain. If we trust too much, then we are vulnerable to externally-inflicted pain.

But I am optimistic that someday we may learn how to manage trust. Therefore continuing interest in the topic is warranted.
 
  • #46
.Scott said:
I have coded thousands of strcpy()'s and they all work.
Have they all been attacked?
Using that routine without considering buffer lengths or the possibility of invalid pointers is barely good enough for one-off code that will never see the light of day. It is not good enough for even the lowest impact application that is going to be made public.
Perfect people make perfect code. Imperfect people make imperfect code. It is good to have languages and tools that can help the imperfect people.
 
  • #47
.Scott said:
I am astounded that major corporations need to resort to things like C# to hold to coding standards. It suggests that there are armies of programmers who aren't eager to do their full jobs.
Well, more like: there are armies of coders (appointed as programmers) who do not even know what their job would be.
I think that's the sad reality of 'security'.
 
  • #48
FactChecker said:
Have they all been attacked? Perfect people make perfect code. Imperfect people make imperfect code. It is good to have languages and tools that can help the imperfect people.
This isn't an issue of perfect vs. imperfect. You shouldn't use strcpy() unless you know your destination is large enough. The normal mistake is not for someone to miscalculate, but to make assumptions that should never be made - like assuming that input from the outside world is going to be correct.

And yes they are attacked. I attacked them myself.
 
  • #49
Nugatory said:
And to return to the subject in the thread title... The idea that one programming language might be more secure than another is a snare and a delusion.
I agree. Security does not come as a score out of 100. A system is secure or it is not.

To be secure, the hardware must:
1. Differentiate between executable code and data.
2. Have hardware bounds checking on virtual memory pages used for code and for data.
3. Restrict privileged instructions to the kernel of the OS.

When you compile a C source and execute it on that system, the task is restricted to its allocated resources and cannot detect or interfere with other users or resources. Any bounds transgression, attempt to execute data, or attempt to modify code will result in an exception seen by the OS and not by the task.
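On commodity hardware those requirements are approximated by the MMU page tables plus the no-execute bit, all managed by the OS. A minimal POSIX sketch of the mechanism (mmap()/mprotect()) - not anything specific to a particular secure system - showing that once a page is protected, a stray write traps to the kernel instead of silently corrupting memory:

Code:
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* A writable, non-executable data page (the usual default). */
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return EXIT_FAILURE;
    }
    strcpy(p, "data");

    /* Re-protect it the way code pages are protected. */
    if (mprotect(p, page, PROT_READ) != 0) {
        perror("mprotect");
        return EXIT_FAILURE;
    }

    /* p[0] = 'X';  <-- would now fault: the hardware raises a trap
       that the OS sees, not the task itself. */
    printf("page is now read-only: %s\n", p);

    munmap(p, page);
    return EXIT_SUCCESS;
}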

Security comes from the hardware.
You cannot build a castle, or a Tower of Babel, on a foundation of sand.
 
  • #50
I'm going to be a bit contrary and blame C.

The 8088 was a 16 bit computer, and 16 bit addresses go to 64k. To address more space (the original IBM PC could address 640k of memory), addresses were in two parts: a segment and an offset. The physical address was (segment * 16) + offset, so 1 MB was theoretically addressable. There were four segment registers, CS, DS, SS and ES, intended for code, data, stack and "extra", but there were no consequences (other than what was caused by bad programming) for using them any way you wanted.

Because the physical address was easily calculable, even across segments, C pointer arithmetic worked, even across segments.
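For anyone who wasn't there, the calculation itself is trivial - which is exactly why flat pointer arithmetic kept working. A quick illustration with arbitrary example values:

Code:
#include <stdint.h>
#include <stdio.h>

/* Real-mode 8086/8088 address formation: physical = segment * 16 + offset. */
static uint32_t phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

int main(void)
{
    /* Two different segment:offset pairs can name the same physical byte. */
    printf("1234:0010 -> %05X\n", (unsigned)phys(0x1234, 0x0010)); /* 12350 */
    printf("1235:0000 -> %05X\n", (unsigned)phys(0x1235, 0x0000)); /* 12350 */
    return 0;
}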

Enter the 80286. It was to address 16 MB, and it was to enforce this data model. CS, DS and SS could not overlap (not sure about ES, but probably the same), and depending on a process's privilege level or ring, code had varying degrees of access to these areas of memory. It couldn't (without a hack) deal with the old segment+offset addressing, so it replaced "segment" with "selector", a 16-bit number which "uniquely identifies one of 16K possible segments in a task's virtual address space" but "the selector value does not specify the segment's location in physical memory."

C programmers hated it. C was the dominant language at the time, and C programmers really hated it. This change killed the single 20-bit address space and it killed pointer arithmetic for pointers more than 64k apart. It's hard to express the degree of hate programmers had for it if you weren't there. It was a revolt - with few exceptions, programmers simply refused to program this way, instead using just the lowest 1MB of memory where they could pretend they had an 8086.

So the customers chose convenience over security, and the vendors saw this and provided exactly that. And that lesson has not been forgotten.
 
  • #51
Baluncore said:
You cannot build a castle, or a tower of babel, on a foundation of sand.

You can, however, build one on a swamp:

 
  • #52
Baluncore said:
Security comes from the hardware.
Well, no. The hardware (with a few exceptions) does what it is designed to do. Of course you can use it wrongly ("it is not the gun that kills, but the man using it").

An anecdote: When people started using the Ethernet to transport IP packets, they needed a way to map IP addresses (with their own rules and peculiarities) to Ethernet addresses (with their rules etc.). To handle this, a protocol was devised, the Address Resolution Protocol (ARP). In short, when you wanted the Ethernet address matching a given IP address, you sent an Ethernet broadcast asking "Who has got this IP address?" and you would (hopefully) get an answer - also using the Ethernet broadcast.

But then - in the name of efficiency - somebody thought of caching those mappings for a while (in case you needed them again soon). All well and useful.

The next optimization came when somebody thought it useful to cache all ARP answers (even ones you had not asked for) in case you would need that address mapping. And the address mapping everybody needed was the Ethernet address of the router to the outside world. So - if somebody got access to the network, they could use ARP to get the Ethernet address of the router, store it, and then send out an ARP reply mapping the IP address of the router to themselves. Thus every packet to the outside would pass through this rogue machine...
 
  • #53
Baluncore said:
Security comes from the hardware.
That's like one leg of a three-legged stool claiming all the glory of being THE support all by itself. Just - no.
There are some basic requirements for the hardware but right now most hardware is able to provide that much.

Has hardware anything to do with script worms, for example?
 
  • #54
Vanadium 50 said:
It couldn't (without a hack) deal with the old segment+offset addressing, so it replaced "segment" with "selector", a 16-bit number which "uniquely identifies one of 16K possible segments in a task's virtual address space" but "the selector value does not specify the segment's location in physical memory."

C programmers hated it. C was the dominant language at the time, and C programmers really hated it. This change killed the single 20-bit address space and it killed pointer arithmetic for pointers more than 64k apart. It's hard to express the degree of hate programmers had for it if you weren't there. It was a revolt - with few exceptions, programmers simply refused to program this way, instead using just the lowest 1MB of memory where they could pretend they had an 8086.

So the customers chose convenience over security, and the vendors saw this and provided exactly that. And that lesson has not been forgotten.
I never hated that memory addressing scheme - nor do I remember any coworkers ever commenting on it one way or another.
Looking back on it, it was kludgy. But it was a reasonable way for Intel to extend physical memory while maintaining backward compatibility.

Separating code and data is good programming - and good programming provides fewer opportunities for security attacks.

Hardware can certainly help, but it is not the end-all. In the 1980s, Data General introduced its Eclipse MV series featuring a ring system designed specifically to protect the operating system from attack - intentional or otherwise. A kernel of the OS sat in ring 0, the center, most protected ring - with its own memory. Most OS services ran out of rings 1 and 2. An outer ring could only access services from an inner ring through a gate. I think the theory was that as long as you got those gates right, an inner ring could not be broken into. Within two months of working on that machine, I discovered a "System Available Memory" management bug that would allow application code to overwrite a kernel stack segment. If I had wanted to intentionally hack the kernel, I could have replaced kernel stack data and return addresses to redirect the kernel to whatever I wanted.

The moral of the story is that programming bugs will always present potential attack lanes - regardless of what hardware services are available. Every time you encountered a Windows blue screen, you were looking at a potential attack lane. The program was not doing what the programmer wanted it to do.

There are hardware features that can help - both directly, as with the DG ring structure or indirectly with features that make programming less complicated (like separating code and data). There have also been hardware features that have hurt - like memory caching mechanisms or device driver firmware that could not be protected by the operating system. But when the hardware holds up its end, security rests with the software.
 
  • #55
.Scott said:
The moral of the story is that programming bugs will always present potential attack lanes - regardless of what hardware services are available.
OS bugs yes, user code bugs no.

I hand punched my first programs to run on a Burroughs 6700. At the time I was naively unaware of the security provided by the Burroughs hardware, probably because no one was hacking that system's architecture, and they still don't. Later I did discover why hardware was so important, but only after I scrapped a B6800. Coding for, and servicing DG Nova and DG Eclipse RISC processors was also enlightening. Funny how you don't know what you've got till it's gone when you find yourself programming Intel based PCs.

It is interesting to see how Intel processors have gradually developed the absolute minimum hardware support needed to implement secure systems. But those features are largely ignored by system developers, except by a very few, like the Unisys ClearPath servers that are now based on Intel processors.

Many PC programmers today lack understanding of the elegant security available from a 3 bit tag field on each word, or the concept of a cactus stack, with hardware memory management.
That goes to explain why so many users today are immersed in an ocean of insecurity.
 
  • #56
Baluncore said:
OS bugs yes, user code bugs no.
It depends on what "user code" you are talking about.
One of the biggest security issues was created about 7 years ago by a zero-day Adobe bug. But if you Google this, you may have trouble because there was another zero-day Adobe bug just last year.
How can the OS defend against these? The Adobe application is allowed to read and write files in the application area and connect to the internet. For many exploits, that's all that's needed.
In both cases, all the user had to do was open a PDF - which would then appear to open exactly as expected.
 
  • #57
.Scott said:
It depends on what "user code" you are talking about.
Adobe is not part of the OS, it is a user application. A secure OS should prevent an application run by one user from writing to another user's files.
 
  • #58
Baluncore said:
It is interesting to see how Intel processors have gradually developed the absolute minimum hardware support needed to implement secure systems. But those features are largely ignored by system developers, except by a very few, like the Unisys ClearPath servers that are now based on Intel processors.
I think you are somewhat off track with this. The migration of that line of thinking to x86 is just a small corner of the story - the main part is the boom in virtualization: if you are looking for security at that level, then you have to look at the supervisor software and the continuous development of x86-based high-reliability 'big' servers in general.

For any personal-use OS, that kind of security was never even considered - most users would just switch to something lighter with fewer limitations right off the bat.

You are asking for the reliability of a big railroad diesel - in a small, pink-colored shopping cart.

P.S.: a personal-use OS, in relation to the mainframe world, would not be considered much more serious than a user shell...
 
  • #59
Baluncore said:
To be secure, the hardware must:
1. Differentiate between executable code and data.
2. Have hardware bounds checking on virtual memory pages used for code and for data.
3. Restrict privileged instructions to the kernel of the OS.

When you compile a C source and execute it on that system
That's contradictory, because the source code is data to the compiler and the object code is data to the loader; a moment later it becomes executable code. When an OS patch is transmitted it is data; the patch file or .EXE first resides in user space, then it changes the OS executable code.

I agree with your point 1. But my definition of differentiating between executable code and data means that the executables must be changed with a soldering iron. There must be no software means to penetrate the data/code barrier. I say that because the threats I fear are all remote attacks, and because all successful malware somehow manages to get evil code executed. Remote attacks can include social engineering where the authorized user is tricked into doing some compromising act.

The title perhaps should have said secure against remote attacks. I would think it is self-evident that there is no defense against a local attack that specifies, designs, replaces or destroys local hardware and software.
Baluncore said:
A secure OS should prevent an application run by one user from writing to another user's files.
A web cam is hijacked remotely to become part of a botnet. Who are the multiple users in that case? Is there necessarily such a thing as a user account in an IoT device? I am trying to think of security in terms broader than just PCs or servers.
 
  • #60
anorlunda said:
That's contradictory, because the source code is data to the compiler and the object code is data to the loader; a moment later it becomes executable code. When an OS patch is transmitted it is data; the patch file or .EXE first resides in user space, then it changes the OS executable code.
I agree that all source code is data. The insecurity comes when the product of that data is carelessly promoted to privileged executable code, without appropriate permission.

Execution of privileged instructions that can break system security should only be permitted by a certified OS kernel of manageable size and complexity. Rogue software must be summarily terminated on any attempted breach of its allocated resources. To be secure, a system requires a secure resource management kernel protected by hardware traps.

The single user PC today is insecure because it is too flexible, in too many ways.
 