Is it possible to deduce the software from the hardware activity?

Summary:
Deducing software functionality solely from hardware activity is highly challenging, even for someone well-versed in physics but unfamiliar with computing. Understanding a computer's operation requires knowledge of its components, such as CPUs and memory, which are complex and interdependent. While resources like Ben Eater's educational videos can provide foundational knowledge, they do not enable one to reverse-engineer software from hardware alone. The intricacies of modern computing, including the vast array of possible programming tasks and the abstraction of high-level languages, complicate any attempts to derive software from hardware. Ultimately, while some insights may be gained through analysis, fully reconstructing software from hardware is generally impractical and often impossible.
  • #31
A related RESEARCH ARTICLE:

Could a Neuroscientist Understand a Microprocessor?

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005268

Abstract

There is a popular belief in neuroscience that we are primarily data limited, and that producing large, multimodal, and complex datasets will, with the help of advanced data analysis algorithms, lead to fundamental insights into the way the brain processes information. These datasets do not yet exist, and if they did we would have no way of evaluating whether or not the algorithmically-generated insights were sufficient or even correct. To address this, here we take a classical microprocessor as a model organism, and use our ability to perform arbitrary experiments on it to see if popular data analysis methods from neuroscience can elucidate the way it processes information. Microprocessors are among those artificial information processing systems that are both complex and that we understand at all levels, from the overall logical flow, via logical gates, to the dynamics of transistors. We show that the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor. This suggests current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems, regardless of the amount of data. Additionally, we argue for scientists using complex non-linear dynamical systems with known ground truth, such as the microprocessor as a validation platform for time-series and structure discovery methods.
 
  • #32
Mark44 said:
But disassembling the code is no guarantee that you're going to understand what the code is doing. Back in the late '80s/early '90s I played around with an x86 disassembler called MD86. A big problem for a disassembler is determining whether a sequence of bytes is machine instructions or data. One can make an educated guess, but that is, after all, only a guess.
I have never found that to be a problem. In 1973, I wrote a disassembler for the CDC-3300. It showed interpretations of memory as binary, as EBCDIC characters (not ASCII), and as assembly, side by side on the same page of the printout. There was never any doubt which was the correct interpretation.

In some cases (including with the x86) you will see short pieces of code mixed with data structures - but even in that case, you will see function or interrupt returns at the end of the code segments.
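To make the side-by-side idea concrete, here is a rough Python sketch of that kind of listing. The toy opcode table is invented purely for illustration (the real tool used the CDC's actual instruction set and character code); the point is just that when the bytes are really text, the character column reads cleanly and the "disassembly" column turns into nonsense, and vice versa.

Code:
# Toy side-by-side memory listing: address, hex, printable character, mnemonic.
# The opcode table is invented for illustration; a real tool would use the
# target machine's actual instruction set and character encoding.

TOY_OPCODES = {0x01: "LDA", 0x02: "STA", 0x03: "ADD", 0xC3: "RET"}

def listing(memory: bytes, base: int = 0) -> None:
    # One line per byte: address, hex value, printable character, toy mnemonic.
    for offset, byte in enumerate(memory):
        char = chr(byte) if 32 <= byte < 127 else "."
        mnem = TOY_OPCODES.get(byte, "???")
        print(f"{base + offset:06X}  {byte:02X}  {char}  {mnem}")

if __name__ == "__main__":
    # The leading bytes decode as toy opcodes; the trailing "Hello" only
    # makes sense in the character column.
    listing(b"\x01\x02\x03\xC3Hello")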

Our Ideal Naive Investigator would be able to instrument the running PC and capture what is executed and what is not (a code coverage exercise).
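A minimal sketch of that coverage exercise, assuming the investigator can trap every instruction fetch: run the program in a toy fetch-execute loop and record which addresses are ever fetched as code; whatever is never fetched is data or dead code. The three-instruction toy machine below is invented for illustration, not any real architecture.

Code:
# Minimal sketch of the "code coverage" idea: run a toy program and record
# which memory addresses are ever fetched as instructions. Addresses that
# never appear in the returned set are data (or unreachable code).

def run_and_trace(memory: list[int], start: int = 0, max_steps: int = 1000) -> set[int]:
    executed: set[int] = set()
    pc, acc = start, 0
    for _ in range(max_steps):
        executed.add(pc)
        op = memory[pc]
        if op == 0x01:            # LOAD addr -> acc
            acc = memory[memory[pc + 1]]
            executed.add(pc + 1)
            pc += 2
        elif op == 0x03:          # ADD addr -> acc
            acc += memory[memory[pc + 1]]
            executed.add(pc + 1)
            pc += 2
        elif op == 0xC3:          # HALT
            break
        else:
            raise ValueError(f"unknown opcode {op:#x} at {pc:#x}")
    return executed

program = [0x01, 6, 0x03, 7, 0xC3, 0, 40, 2]   # code at addresses 0-4, data at 6-7
covered = run_and_trace(program)
print("fetched as code:", sorted(covered))
print("never executed :", sorted(set(range(len(program))) - covered))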
 
  • #33
.Scott said:
In 1973, I wrote a disassembler for the CDC-2100.
I was not able to find any reference to a CDC-2100 computer. This wiki page, https://en.wikipedia.org/wiki/Control_Data_Corporation, lists all of the models produced by Control Data Corp., but doesn't show a model CDC-2100. HP had a model 2100.

In any case, computers changed a fair amount from 1973 to 1993, when I was disassembling x86 code, and they have changed even more drastically from the '90s to the present day. If you're working with a system like ARM or MIPS, with a relatively small number of fixed-length instructions, that's a whole different ball game from current architectures built on complex instruction sets. For example, 64-bit Intel and AMD instructions can range from a single byte up to 15 bytes in length. Besides that, modern processors can execute instructions out of order, and even execute multiple instructions in parallel in one machine cycle.
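A toy sketch of why variable-length encodings make disassembly harder: with an invented length table (not a real x86 decoder), the same byte string parses into completely different "instruction" streams depending on the offset where decoding starts, whereas a fixed-length ISA admits only a handful of possible alignments.

Code:
# Toy illustration of the alignment ambiguity in variable-length encodings.
# The opcode-to-length table is invented; real x86-64 needs a full decoder.

LENGTHS = {0x90: 1, 0xB8: 5, 0x05: 5, 0xC3: 1}   # toy: opcode -> total length

def decode(blob: bytes, start: int) -> list[str]:
    out, i = [], start
    while i < len(blob):
        op = blob[i]
        n = LENGTHS.get(op, 1)                    # unknown opcodes: assume 1 byte
        out.append(f"{i:04X}: {blob[i:i + n].hex(' ')}")
        i += n
    return out

blob = bytes.fromhex("b8 05 00 00 00 05 01 00 00 00 c3")
print("from offset 0:", decode(blob, 0))          # three "instructions"
print("from offset 1:", decode(blob, 1))          # a completely different parse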
 
  • #34
The OP phrased this question very loosely. But I would compare the complexity of reverse engineering computer hardware/software to the complexity of deciphering Egyptian and/or Maya hieroglyphics. Both of those took a lot of time and effort to solve.

Remember that advanced aliens would certainly understand digital logic, and logic building blocks, and programmable devices with hardware and software, and would no doubt use them in their own devices. Some things are more alien (like spoken language) or less alien (like mathematics). Circuits IMO are somewhere in the middle.
 
  • #35
Mark44 said:
I was not able to find any reference to a CDC-2100 computer. This wiki page, https://en.wikipedia.org/wiki/Control_Data_Corporation, lists all of the models produced by Control Data Corp., but doesn't show a model CDC-2100. HP had a model 2100.

In any case, computers changed a fair amount from 1973 to 1993, when I was disassembling x86 code, and they have changed even more drastically from the '90s to the present day. If you're working with a system like ARM or MIPS, with a relatively small number of fixed-length instructions, that's a whole different ball game from current architectures built on complex instruction sets. For example, 64-bit Intel and AMD instructions can range from a single byte up to 15 bytes in length. Besides that, modern processors can execute instructions out of order, and even execute multiple instructions in parallel in one machine cycle.
You're right, it was a CDC-3300 (and I fixed that post). It was in 1973 while I was a student at Lowell Tech (now UMass Lowell).
They also had an IBM-1620 and an HP-2100 series machine (an HP-2136, I think), plus another computer in the Physics department that I never worked with.

I programmed one system before those (a Honeywell Series 200) and scores more since then. I am currently working with a TI MPS µP; last year it was an Infineon Aurix TC397. In all of these I have run across the machine language. If you just do a simple histogram on the bytes, you will get vastly different patterns depending on whether it is executable code, ASCII data, numeric data, or the kind of pointer/numeric mish-mash you find on the stack. As for executing instructions out of order, I first worked on a pipeline processor in the early 1980s.
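That histogram heuristic is easy to try. Here is a rough Python sketch with a few invented features and no calibrated thresholds; a real tool would want many more features, but even this crude profile separates ASCII text from code-like bytes clearly.

Code:
# Rough sketch of the byte-histogram heuristic: code, ASCII text, and other
# data have visibly different byte-value distributions. The features below
# are invented for illustration, not a calibrated classifier.

from collections import Counter

def profile(blob: bytes) -> dict[str, float]:
    n = len(blob)
    hist = Counter(blob)
    return {
        "printable_ascii": sum(hist[b] for b in range(0x20, 0x7F)) / n,
        "zero_bytes": hist[0x00] / n,
        "distinct_values": len(hist) / 256,
    }

text = b"Deducing software from hardware activity is hard. " * 20
code_like = bytes.fromhex("4889e5b8050000000501000000c3") * 50

for name, blob in [("text", text), ("code-like", code_like)]:
    print(name, profile(blob))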

But before our Ideal Naive Investigator looks at the code, it should nail down the hardware. Even if it miscategorizes some data, it would discover that soon enough simply by trying to execute it. Executing data generally produces pointless results - like repeatedly moving the same content into the same register, or overwriting a register's contents before they could have been used.
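That "overwritten before it could have been used" tell can also be checked mechanically. A minimal sketch over a made-up trace format (instruction name, destination register, source registers): flag any register write that is followed by another write to the same register with no intervening read.

Code:
# Sketch of the "dead write" tell: in a trace of executed instructions, flag
# registers that are written and then written again before ever being read.
# The (op, dest, sources) trace format is made up for this illustration.

def dead_writes(trace: list[tuple[str, str, tuple[str, ...]]]) -> list[int]:
    pending: dict[str, int] = {}     # register -> index of last unread write
    flagged: list[int] = []
    for i, (_op, dest, srcs) in enumerate(trace):
        for r in srcs:               # a read retires any pending write
            pending.pop(r, None)
        if dest in pending:          # written again before being read
            flagged.append(pending[dest])
        if dest:
            pending[dest] = i
    return flagged

# Executing data tends to look like this: r1 written three times, never read.
trace = [("mov", "r1", ()), ("mov", "r1", ()), ("mov", "r1", ()),
         ("add", "r2", ("r2", "r3"))]
print("dead writes at trace indices:", dead_writes(trace))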
 
