# Is it possible to deduce the software from the hardware activity?

Aidyan
Suppose one knows well the laws of physics but knows nothing about computers and IT. Would one be able to deduce how a computer works only by studying its hardware? Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?


## Answers and Replies

Mentor
Suppose one knows well the laws of physics but knows nothing about computers and IT. Would one be able to deduce how a computer works only by studying its hardware?
I don't think so. If one knows nothing about computers, it seems safe to say that one would also know nothing about the CPU, RAM and ROM modules, and all the other pieces that make up a computer, let alone how they work and what they are doing when a program is running.
Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
Seems very doubtful to me.

Staff Emeritus
Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
Yes.

A number of resources on Youtube can help.

This is the channel:

"How semiconductors work."
"How a transistor works."
"Inverting the signal with a transistor."

And work your way up to building a working computer on a breadboard, to programming it, to advanced programming.
"8 bit computer"

In total, there are 76 videos, most about 10 minutes long, but ranging 5 to 38 minutes.

It is a good way to study, but it is still study. You have to put in the work to do it.

p.s. I see my answer is the opposite of @Mark44's. So be it. You can choose "no" or "yes".

Mentor
p.s. I see my answer is the opposite of @Mark44's. So be it. You can choose "no" or "yes".
My reply was conditioned on the hypothesis that one knew nothing about computers. Going through Ben Eater's videos would counter this hypothesis.

Aidyan
My reply was conditioned on the hypothesis that one knew nothing about computers. Going through Ben Eater's videos would counter this hypothesis.
Indeed, Ben Eater does not deduce the workings of a PC, CPU, memory, etc. from scratch. He only explains to the public how they work on the basis of what he already knows.

Staff Emeritus
Indeed, Ben Eater does not deduce the workings of a PC, CPU, memory, etc. from scratch. He only explains to the public how they work on the basis of what he already knows.
I'm not sure what you mean by scratch. But "How a semiconductor works" is pretty low level. Is that not scratch enough for you? If not, how deep do you want to go? Be warned, that at the bottom is quantum mechanics.

Aidyan
I'm not sure what you mean by scratch. But "How a semiconductor works" is pretty low level. Is that not scratch enough for you? If not, how deep do you want to go? Be warned, that at the bottom is quantum mechanics.
So, once you have a semiconductor in your hand and analyze it, and even know about quantum physics but have no idea how an IT chip works and especially what it does, how would you proceed in deducing its function?

Jarvis323
So, once you have a semiconductor in your hand and analyze it, and even know about quantum physics but have no idea how an IT chip works and especially what it does, how would you proceed in deducing its function?
First they would look at it.

Mentor
but have no idea how an IT chip works
What's an IT chip?

Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
There are many, many ways to write software to do the same task.

Klystron
Homework Helper
Gold Member
Suppose one knows well the laws of physics but knows nothing about computers and IT. Would one be able to deduce how a computer works only by studying its hardware? Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
Of course, modern computing technology using tiny microprocessors with billions of components has evolved from a handful of simple logic gates. So, it's fair to say that by looking at modern equipment hardware it is too difficult to unravel the computing code that runs on it. (In addition, some aspects of the code may never run or will rarely run... how will that be uncovered?)

Like many things, one needs to study a simple system first.

(This might be interesting: an adder using Conway's game of life https://nicholas.carlini.com/writing/2020/digital-logic-game-of-life.html
Could one determine what this machine is doing by studying this [with its given set of inputs]?)

Aidyan
First they would look at it.

Precisely. So, once one has "looked at it", what follows next? Say an extraterrestrial that knows well the laws of nature has to decipher how a human-made computer works (let us also assume it is still working and can analyze many of them). If someone knows all about electromagnetism, quantum physics, solid-state physics, etc., but nothing about the working principles that led humans to the construction of such complicated circuitry, can ET deduce what it is useful for, how it works, and especially how that circuitry achieves the specific tasks it performs?

Aidyan
Of course, modern computing technology using tiny microprocessors with billions of components has evolved from a handful of simple logic gates. So, it's fair to say that by looking at modern equipment hardware it is too difficult to unravel the computing code that runs on it. (In addition, some aspects of the code may never run or will rarely run... how will that be uncovered?)

Like many things, one needs to study a simple system first.
Ok, then let's assume that one discovers that there are gazillions of transistors, and begins with a bottom-up approach from there. Next would be to see how these transistors connect and interact with each other. Could one then discover that these form logic gates with some logical function? I guess yes... but am not sure. From there the next step would be to understand how these logic gates come together to form some CPU instructions, and from there one might speculate "maybe these instructions correspond to the executions of a low-level programming language" (say assembler, sort of...), etc. But I suspect that this latter step is already too difficult if one does not know what the meaning and purpose of what we call a "programming language" is. Maybe it is not even possible in principle.
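To make the first of these bottom-up steps concrete, here is a minimal sketch in Python. It assumes the observer has already traced a small wiring pattern of four NAND gates (the `mystery` circuit below is a hypothetical example, not taken from any real chip) and simply tabulates its output for every input. The table turns out to be XOR, and two such blocks plus one more NAND pair give a one-bit full adder.

```python
from itertools import product

def nand(a: int, b: int) -> int:
    """The only primitive assumed here: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1

def mystery(a: int, b: int) -> int:
    """A hypothetical four-NAND wiring pattern, as traced from the circuit."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    return nand(n2, n3)

def truth_table(circuit):
    """Tabulate the circuit output for every combination of two input bits."""
    return {(a, b): circuit(a, b) for a, b in product((0, 1), repeat=2)}

def full_adder(a: int, b: int, carry_in: int):
    """Add three bits using only the recovered block and NAND gates."""
    partial = mystery(a, b)
    total = mystery(partial, carry_in)
    # NAND of two NANDs computes (a AND b) OR (partial AND carry_in).
    carry_out = nand(nand(a, b), nand(partial, carry_in))
    return total, carry_out

# The table matches XOR: the output is 1 exactly when the inputs differ.
print(truth_table(mystery))
```

Recognising the logical function is mechanical once the wiring is known. What the sketch does not show is the harder step raised above: deciding that a collection of such adders is meant to serve as "an instruction".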

Overall I'm trying to understand if that is impossible (if it is) only because it is too difficult or also because it isn't possible, not even in principle? I'm wondering if that might connect to some sort of logical impossibility that tells us that something is not just too hard to figure out but even logically impossible (sort of Goedel's or Chaitin's incompleteness theorem)?

MikeeMiracle
CPUs have their own very low level programming language. You can connect to CPUs and run basic commands on them. I did this at college, where we got to program the CPU directly using hex to perform basic functions like turning on a set of lights in sequence. This low level stuff you could find out by studying / reverse engineering the CPU.
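As a rough illustration of what "programming the CPU directly in hex" can look like, here is a toy interpreter in Python. The two-byte instruction format and the opcodes (`0x01` load, `0x02` output, `0x03` shift left, `0x04` jump if non-zero, `0xFF` halt) are invented for this sketch and do not correspond to any real processor or to the college system mentioned above.

```python
def run(program: bytes, max_steps: int = 1000) -> list[int]:
    """Execute a toy two-byte-per-instruction program; return every value sent to the lights."""
    acc = 0          # single 8-bit accumulator
    pc = 0           # program counter (byte address)
    lights = []      # history of values written to the hypothetical light port
    for _ in range(max_steps):
        opcode, operand = program[pc], program[pc + 1]
        pc += 2
        if opcode == 0x01:      # LOAD: acc = operand
            acc = operand
        elif opcode == 0x02:    # OUT: copy acc to the lights
            lights.append(acc)
        elif opcode == 0x03:    # SHL: shift acc left by one bit, keep 8 bits
            acc = (acc << 1) & 0xFF
        elif opcode == 0x04:    # JNZ: jump to address `operand` if acc is non-zero
            if acc != 0:
                pc = operand
        elif opcode == 0xFF:    # HALT
            break
        else:
            raise ValueError(f"unknown opcode 0x{opcode:02x}")
    return lights

# LOAD 1; OUT; SHL; JNZ back to OUT; HALT
PROGRAM = bytes.fromhex("0101 0200 0300 0402 FF00")

for pattern in run(PROGRAM):
    print(f"{pattern:08b}")  # one light at a time, from 00000001 up to 10000000
```

Someone who watched the lights and the bytes on the bus could plausibly recover these five instructions. Nothing in them says why anyone wanted the lights to chase each other.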

Having said that it's all at a very basic level, just because you know what the CPU does at a basic level does not imply anything to the software running on top of that.

I.e., we could give someone not familiar with an Intel x86 type CPU the CPU and they could figure out the basic chip functions. That does not then imply that they could create MS Windows or Linux just from studying the CPU. Sure, if they had the concept of a GUI using WIMP (Windows, Icons, Menus, Pointer) they could try and create their own, but the secrets of an OS or software running on them cannot be reverse engineered just from the hardware.

Gold Member
Given sufficient knowledge of electronics one could determine the operating characteristics of a standard digital computer such as limits on input and output levels, power supply, possibly clock speeds, etc. The instruction sets / programs that run on digital computers are strongly limited by the electronic hardware but, by design, are not deducible from the hardware as these computers are designed to run a large array of programmable tasks, usually interleaved for efficiency.

Digital computers dedicated to a distinct task could carry this function in their electronic design in such a form as could be deduced from meticulous analysis of the electronics. Even so, this analysis would not produce the data sets -- the input/output -- of a dedicated digital computer run without a similar input stream. Complexity issues confounding analysis also come into play beyond simple examples.

Analyzing a dedicated analog computer design should yield more information concerning the computer function compared to a general purpose digital system. Given that form follows function in analog devices, that form should provide some functional information.

Take a simple analog device such as a ruler or 'wheel on a stick' used to measure length and perhaps a picture or video of an engineer employing the device. One could deduce the device measures distance from examination. This deduction does not preclude the device being used for purposes unrelated to measurement.

Aidyan
Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
I would say the possibility exists, but to really do it would work only for some really specific problems/applications.
For a general application like a PC - well, there is so much extra stuff there that sometimes even possession of the source code won't give you a hint about its actual purpose.
Just think about all the user interface/GUI things, for example. To understand them backward starting from a debug would be a real major headache/PITA.

Aidyan
The instruction sets / programs that run on digital computers are strongly limited by the electronic hardware but, by design, are not deducible from the hardware as these computers are designed to run a large array of programmable tasks, usually interleaved for efficiency.
But can someone not deduce from a series of instructions executed by the CPU a specific high-level instruction? Say, for instance, a set of assembler instructions representing a sum = var1+var2? (Just imagine an ET dealing with a human-built PC.) If not, is that due to the fact that it is much too hard (but in principle possible for some superhuman intelligence) or because there is a logical reason that forbids it even in principle?

Complexity issues confounding analysis also come into play beyond simple examples.
I think of another complexity issue, namely that usually a high-level language is compiled. Is it possible to reverse-engineer from compiled data the program without knowing the compiler? I tend to believe that one would see only meaningless white noise that would be impossible to relate with anything if the compilation rules are unknown. But I have no strong opinion on this... maybe there is a way to do that?

MikeeMiracle
Seeing what the CPU is doing in real time is of no use if the commands it's running are out of context: you don't know what the high level software is doing with that command, why it's running it, what it intends to do with the data, whether it is just setting up commands, whether it is the processing of actual "data", etc.

Software can be de-compiled but you really need to know the language used to create it or just try them all until you get something legible. Even then it may not make any sense.

Why this line of questions if I may ask? It seems to me that your example of Superhuman intelligence / ET would be far more advanced than us and have no interest in our computing.

Aidyan
Seeing what the CPU is doing in real time is of no use if the commands it's running are out of context: you don't know what the high level software is doing with that command, why it's running it, what it intends to do with the data, whether it is just setting up commands, whether it is the processing of actual "data", etc.
Ok... wait a moment. The conversation moved on to the CPU, but originally it referred to the entire PC. The clever mind that knows everything about physics and also logic and math but nothing about human computing (ET or whoever) nevertheless has access to the hard drive, the SSD, or whatever memory in which the program is stored. It has all the program lines and also all the hardware. The question is whether it may be able to relate them to each other and make any sense out of it?

Why this line of questions if I may ask? It seems to me that your example of Superhuman intelligence / ET would be far more advanced than us and have no interest in our computing.
Firstly, a purely logical curiosity.

Secondly, if and in what sense today's efforts to decipher the brain's activity are comparable to this "retro-engineering"? There are actually several brain mapping projects aiming at mapping the brain neuron by neuron and simulating its neural networks in the belief that this will lead us to understand how our brain works and why it does what it does. One might object that brains do not run on a software-hardware basis, but are only hardware. But that could eventually make things even harder. If retro-engineering of a simple PC is a mission impossible, then it seems to me that these projects are doomed to failure as well.

Thirdly, the question of whether an extraterrestrial life form that would find a human space probe would be able to decipher what it was doing, the program running, etc.?

MikeeMiracle
I focused on the CPU as it is the brains, everything else is useless without it.

I would think there are far too many neurons to map the human brain properly. I see this as an impossible task for today's technology.

I would think an advanced civilisation would be able to determine what the instruments do on one of our space probes; physics is the same everywhere after all. Let's take a digital camera as an example. I would not think it would be that out of this world to determine that the photo sensor turns photon inputs into electrical signals. What happens with those electrical signals after conversion they may not figure out, but they would still know it's an image capture device of some sort, I imagine.

Jarvis323
Precisely. So, once one has "looked at it", what follows next? Say an extraterrestrial that knows well the laws of nature has to decipher how a human-made computer works (let us also assume it is still working and can analyze many of them). If someone knows all about electromagnetism, quantum physics, solid-state physics, etc., but nothing about the working principles that led humans to the construction of such complicated circuitry, can ET deduce what it is useful for, how it works, and especially how that circuitry achieves the specific tasks it performs?

It's actually not very complex circuitry. It's just that there is a small set of types of somewhat simple circuits, and there are a lot of instances of those circuits. I think that after figuring out what the transistors are/do, figuring out what the circuits do is straightforward.

Once they figured out how the computer works, they would be able to write low level machine instructions. But they probably wouldn't be able to use the computer at hand, because it requires an operating system, and drivers, and input/output devices. So they would be without the necessary software, which is quite complex.

They would be able to build more primitive computers at first (like calculators) that replicate the internal digital logic, and it would take years to incrementally make the advancements in terms of operating systems, programming languages, compilers, algorithms etc. to get to a level like we are at now.

Edit: I was assuming they have the computer, but no software. I'm not sure what the task is exactly. Do they have a working computer with an operating system installed? They can run it and dump the binary of the compiled programs? They can open up Visual Studio and write code on it themselves? What do they have and what is the goal?

Gold Member
Let's assume that you can identify everything about the structure and state of the machine. You are then left with the problem of how it was created. There may be multiple ways to create the same result.

So, for example, suppose the program is multiplying a vector times a matrix. In a high level language that could be done with a "for loop", a "while loop", or an "if then goto" construct, but those may be indistinguishable after it is compiled. There would be no way to know why the code cycles through rows before columns, or vice versa. Comments in the source code that describe why the code was written this way are not compiled.
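A small Python sketch of this point, with invented function names: three differently written routines for the same matrix-times-vector product. Python has no `goto`, so the third variant changes the loop order instead (columns before rows). From the outside, all three are the same function.

```python
def matvec_for(matrix, vector):
    """Row-by-row product written with for loops."""
    result = []
    for row in matrix:
        total = 0
        for j in range(len(vector)):
            total += row[j] * vector[j]
        result.append(total)
    return result

def matvec_while(matrix, vector):
    """The same row-by-row product written with while loops."""
    result = []
    i = 0
    while i < len(matrix):
        total = 0
        j = 0
        while j < len(vector):
            total += matrix[i][j] * vector[j]
            j += 1
        result.append(total)
        i += 1
    return result

def matvec_columns_first(matrix, vector):
    """The same product, but cycling through columns before rows."""
    result = [0] * len(matrix)
    for j, v in enumerate(vector):
        for i in range(len(matrix)):
            result[i] += matrix[i][j] * v
    return result

M = [[1, 2, 3], [4, 5, 6]]
v = [7, 8, 9]
print(matvec_for(M, v), matvec_while(M, v), matvec_columns_first(M, v))
```

An observer who only sees inputs and outputs cannot tell which of these the author wrote. An observer who sees every memory access could distinguish the column-first version, but still would not learn why that order was chosen.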

So, there is no chance of recovering information that can have multiple versions for the same result, or for recovering information that the processing deems irrelevant to the execution of the code. Furthermore, you have no way of reconstructing what that matrix and vector represent to the person writing the code. Is it a QM problem, control system, optical ray tracing, network analysis, material failure analysis, math homework problem?

This whole idea reminds me of Laplace's Daemon and questions about determinism and free will. In the real world systems are much too complex to work backwards this way. But theoretically? Perhaps. Personally I find those sorts of questions pointless, which is probably why I studied engineering instead of philosophy.

valenumr
My reply was conditioned on the hypothesis that one knew nothing about computers. Going through Ben Eater's videos would counter this hypothesis.
My two cents would be "maybe". It depends on the complexity of the actual software (think multi-threading), and the complexity of the CPU. Floating point unit? That depends on understanding or somehow decoding the representation. Deducing two's complement for integer arithmetic? More likely. But fundamentally a basic CPU doesn't really have that many fundamental operations. Something like an 8088 would probably be easy to analyze, and you could reverse engineer the assembly code if you could look at the machine state on every clock. Modern CPUs and software? It seems unlikely, but perhaps remotely possible.

There are a lot of modern constructs built into modern CPUs, like prefetch, branch prediction, TLBs, page tables (virtual memory), caching, etc., that would probably make reverse engineering very difficult.
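On the two's complement remark, a hedged sketch of how an observer might test candidate number encodings against an observed 8-bit adder. The adder is modelled here as plain addition modulo 256, which is an assumption about the hardware, and the decoder functions are illustrative names.

```python
def add8(x: int, y: int) -> int:
    """Model of the observed hardware: an 8-bit adder that drops the carry out."""
    return (x + y) & 0xFF

def as_unsigned(bits: int) -> int:
    return bits

def as_twos_complement(bits: int) -> int:
    return bits - 256 if bits & 0x80 else bits

def as_sign_magnitude(bits: int) -> int:
    magnitude = bits & 0x7F
    return -magnitude if bits & 0x80 else magnitude

def consistent(decode, lo: int, hi: int) -> bool:
    """Check whether the adder agrees with ordinary addition under this decoding,
    for every pair of bit patterns whose true sum fits in the range [lo, hi]."""
    for x in range(256):
        for y in range(256):
            true_sum = decode(x) + decode(y)
            if lo <= true_sum <= hi and decode(add8(x, y)) != true_sum:
                return False
    return True

print("unsigned:        ", consistent(as_unsigned, 0, 255))
print("two's complement:", consistent(as_twos_complement, -128, 127))
print("sign-magnitude:  ", consistent(as_sign_magnitude, -127, 127))
```

The adder alone rules out sign-magnitude but fits both the unsigned and the two's complement reading. Telling those two apart would need more evidence, such as how comparisons or overflow are handled.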

Staff Emeritus
Homework Helper
I don't see why not, given enough time. Isn't this essentially what scientists do—study a system to determine how it works? Researchers would discover Boolean algebra, gates, transistors, basic structures like flip flops, counters, etc., along the way. If anything, a computer should be simpler to understand since it's fairly simple conceptually.

valenumr
I don't see why not, given enough time. Isn't this essentially what scientists do—study a system to determine how it works? Researchers would discover Boolean algebra, gates, transistors, basic structures like flip flops, counters, etc., along the way. If anything, a computer should be simpler to understand since it's fairly simple conceptually.
I remember doing a project that was essentially taking a program and describing the machine state on an out-of-order execution processor, something like a Pentium 4. I do wonder if that is "invertible". I'm not prepared to give a proof one way or another, but it does feel like a "hard" problem with the way modern processors work.

happyhacker
Suppose one knows well the laws of physics but knows nothing about computers and IT. Would one be able to deduce how a computer works only by studying its hardware? Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?
Possibly. You would need to observe and record output in response to inputs. Then you need to create flow charts for operations (black box approach). Then you would need to get the instruction set of the chips inside. Then you might be able to create the low level operations from the flow charts. You might then be able to deduce the compiler operations. That might close the loop. Wish you a long and happy retirement (you won't earn money doing it)! Aliens? Well, they would get at the original design data (it's there somewhere, stored in the Earth's digital repositories). But then they probably have other ideas for visiting us.

Homework Helper
Overall I'm trying to understand if that is impossible (if it is) only because it is too difficult or also because it isn't possible, not even in principle? I'm wondering if that might connect to some sort of logical impossibility that tells us that something is not just too hard to figure out but even logically impossible (sort of Goedel's or Chaitin's incompleteness theorem)?

This is an in-principle question. You are not asking if you could succeed in this investigation, only that there is nothing in principle that would stop the ideal investigator.

In principle, computer hardware is self-documenting, although you might want to provide some starting hints to your naive investigator like "start by applying a 120VAC 60Hz voltage source to the cord and press that 'on' button".

Everything that computers do is based on their physical form - so if one knows precisely (down to the atomic level) what that form is and "one knows well the laws of physics", then an appropriately smart one could deduce what it would do. I will leave it up to the ideal naive investigator how she arranges the information.

Could one rebuild the software code and understand its meaning only by looking at the internal flow of bits in its CPU?

The software is bits. Certainly by looking at the "flow of bits" you will catch most of that software - in binary ("object") form. If the investigator is allowed to look at bits that are not flowing (which was implied by the first part of your question), then he will have all object code loaded onto the computer.

As for understanding its meaning - it may stretch the ability of the investigator you have described to deduce the meaning of a pong game.

Homework Helper
A lot of what the ideal naive investigator would need to "fully" understand the PC landed in his lab would require knowledge of the creatures that used it. It would be easy to deduce that there is a large rectangular surface capable of emitting assorted combinations of EM radiation. But it might not immediately see the significance of focusing on three frequency ranges in the "visible" spectrum until it considered the possibility of a biological form with eyesight specifically adapted to that kind of reception.

Similarly, it would have to deduce that same biological form was especially adept at selecting and pressing buttons from that keyboard. And hopefully we included that mouse pad with the PC.

Ultimately, it would be able to identify photos on the PC as photos and to recognize photos of people as the likely biological forms that used the computer.

Homework Helper
Software can be de-compiled but you really need to know the language used to create it or just try them all until you get something legible. Even then it may not make any sense.
I have read machine language and understood what it was doing. I have disassembled it to make the task easier. In a few cases, I have "decompiled" it, generating some sort of higher level semantics to further assist in deciphering the purpose of the binary software. It is not necessary to restore the original source code for a "decompile" to be useful.

Of course, I am far from the "naive ideal investigator" (NII), but the only information that I rely on for these investigations that the NII would not have is the overall purpose of the PC. But I believe the NII could deduce that as well.

Homework Helper
Secondly, if and in what sense today's efforts to decipher the brain's activity are comparable to this "retro-engineering"? There are actually several brain mapping projects aiming at mapping the brain neuron by neuron and simulating its neural networks in the belief that this will lead us to understand how our brain works and why it does what it does.
Here is a link reporting on one such investigation: Science Daily: Brain Connections
They estimate 200 billion neurons in the brain. They estimate that for the cerebral cortex alone, there are over 125 trillion connections (synapses). If I (generously) allot 16 bytes per synapse connection, I would need 4 petabytes of storage - about four 1-petabyte storage units (https://listings.emergentsx.com/products/emc-vnx8000-1-petabyte-1000tb-3-rack-plug-n-play-hybrid-storage?currency=USD&variant=39436141461581&utm_medium=cpc&utm_source=google&utm_campaign=Google%20Shopping) at $35,000 each ($140K) would handle the storage - and another $140K for the offsite backup.
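The arithmetic behind this estimate can be checked in a few lines of Python. The sketch uses the figures quoted above and assumes decimal petabytes (10^15 bytes). On those assumptions the raw product is 2 PB, so the 4 PB quoted leaves roughly a factor of two of headroom over the "over 125 trillion" lower bound.

```python
SYNAPSES = 125 * 10**12        # lower bound quoted for the cerebral cortex
BYTES_PER_SYNAPSE = 16         # the generous allotment from the post
PETABYTE = 10**15              # decimal petabyte (assumption)

raw_bytes = SYNAPSES * BYTES_PER_SYNAPSE
raw_petabytes = raw_bytes / PETABYTE

UNITS = 4                      # 1 PB units, as in the post
PRICE_PER_UNIT = 35_000        # US dollars, as in the post
cost = UNITS * PRICE_PER_UNIT

print(f"raw storage: {raw_petabytes:.1f} PB")
print(f"{UNITS} units: ${cost:,} (and the same again for an offsite backup)")
```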

The problem is that we don't know exactly what each neuron is doing. Nor can we assume that if you know one neuron, you know them all. Do some of them save up information before responding? Over what periods of time?

Certainly, investigations of that sort are interesting and useful. Even if we can't nail down exactly what is happening everywhere in the brain, any insights will contribute to the development of medical treatments.

Mentor
I have read machine language and understood what it was doing. I have disassembled it to make the task easier.
But disassembling the code is no guarantee that you're going to understand what the code is doing. Back in the late 80's/early 90's I played around with an x86 disassembler called MD86. A big problem for a disassembler is determining whether a sequence of bytes is machine instructions or data. One can make an educated guess, but that is, after all, only a guess.
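The code-versus-data ambiguity can be shown with a very small Python sketch. It decodes only a tiny subset of 32-bit x86 (the one-byte `inc`/`dec` register encodings `0x40` to `0x4F`, plus `nop` and `ret`) and prints each byte as hex, ASCII, and a mnemonic side by side. The function names are invented for the illustration, and this is nowhere near a real disassembler.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_one(byte: int) -> str:
    """Decode a single byte using a tiny subset of 32-bit x86 one-byte opcodes."""
    if 0x40 <= byte <= 0x47:
        return f"inc {REGS[byte - 0x40]}"
    if 0x48 <= byte <= 0x4F:
        return f"dec {REGS[byte - 0x48]}"
    if byte == 0x90:
        return "nop"
    if byte == 0xC3:
        return "ret"
    return f"db 0x{byte:02x}"   # outside the subset: emit as raw data

def side_by_side(blob: bytes):
    """Return (hex, ascii, mnemonic) for every byte of the blob."""
    rows = []
    for b in blob:
        ascii_char = chr(b) if 0x20 <= b < 0x7F else "."
        rows.append((f"{b:02x}", ascii_char, decode_one(b)))
    return rows

for hex_byte, char, mnemonic in side_by_side(b"HELLO\xc3"):
    print(f"{hex_byte}  {char}  {mnemonic}")
```

The bytes of the text "HELLO" decode cleanly as a run of register increments and decrements, and the trailing `0xC3` even looks like a function return. Both readings are valid. Only context, such as whether execution ever reaches those bytes, decides between them.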

Klystron
A related RESEARCH ARTICLE:

### Could a Neuroscientist Understand a Microprocessor?

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005268

## Abstract

There is a popular belief in neuroscience that we are primarily data limited, and that producing large, multimodal, and complex datasets will, with the help of advanced data analysis algorithms, lead to fundamental insights into the way the brain processes information. These datasets do not yet exist, and if they did we would have no way of evaluating whether or not the algorithmically-generated insights were sufficient or even correct. To address this, here we take a classical microprocessor as a model organism, and use our ability to perform arbitrary experiments on it to see if popular data analysis methods from neuroscience can elucidate the way it processes information. Microprocessors are among those artificial information processing systems that are both complex and that we understand at all levels, from the overall logical flow, via logical gates, to the dynamics of transistors. We show that the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor. This suggests current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems, regardless of the amount of data. Additionally, we argue for scientists using complex non-linear dynamical systems with known ground truth, such as the microprocessor as a validation platform for time-series and structure discovery methods.

Homework Helper
But disassembling the code is no guarantee that you're going to understand what the code is doing. Back in the late 80's/early 90's I played around with an x86 disassembler called MD86. A big problem for a disassembler is determining whether a sequence of bytes is machine instructions or data. One can make an educated guess, but that is, after all, only a guess.
I have never seen a problem. In 1973, I wrote a disassembler for the CDC-3300. It showed interpretations of memory as binary, EBCDIC (not ASCII), and assembly side-by-side on the same page of the printout. There was never any doubt which was the correct interpretation.

In some cases (including with the x86) you will see short pieces of code mixed with data structures - but even in that case, you will see function or interrupt returns at the end of the code segments.

Our Ideal Naive Investigator would be able to instrument the running PC and capture what is executed and what is not (a code coverage exercise).
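A code coverage exercise of this kind can be sketched in Python with the standard `sys.settrace` hook, which reports each source line as it executes. The `classify` function and the chosen inputs are made up for the example. Line positions are given as offsets from the function's `def` line.

```python
import sys

def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"

def executed_lines(func, inputs):
    """Run func on each input; return the line offsets (from the def line) that executed."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under study.
        if frame.f_code is code and event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for value in inputs:
            func(value)
    finally:
        sys.settrace(previous)   # restore whatever tracing was active before
    return seen

covered = executed_lines(classify, [0, 5])
never_run = set(range(1, 6)) - covered
print("executed offsets:", sorted(covered))
print("never executed:  ", sorted(never_run))
```

With these inputs the `return "negative"` line is never reached. Instrumentation shows what did run, but code that never runs under the observed inputs stays invisible to this method unless the stored program is also read.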

Mentor
In 1973, I wrote a disassembler for the CDC-2100.
I was not able to find any reference to a CDC-2100 computer. This wiki page, https://en.wikipedia.org/wiki/Control_Data_Corporation, lists all of the models produced by Control Data Corp., but doesn't show a model CDC-2100. HP had a model 2100.

In any case, computers changed a fair amount from 1973 to 1993, when I was disassembling x86 code, and they have changed even more drastically from the 90s to the present day. If you're working with a system like ARM or MIPS, with a relatively small number of fixed-length instructions, that's a whole different ball game from other current architectures that are built using complex instruction sets. For example, 64-bit Intel and AMD instructions can range from a single byte up to 15 bytes in length. Besides that, modern processors can execute instructions out of order, and even execute multiple instructions in parallel in one machine cycle.

Staff Emeritus
The OP phrased this question very loosely. But I would compare the complexity of reverse engineering computer hardware/software to the complexity of deciphering Egyptian and/or Maya hieroglyphics. Both of those took a lot of time and effort to solve.

Remember that advanced aliens would certainly understand digital logic, and logic building blocks, and programmable devices with hardware and software, and would no doubt use them in their own devices. Some things are more alien (like spoken language) or less alien (like mathematics). Circuits IMO are somewhere in the middle.