Cryptographically secure code obfuscation

mXSCNT · Sep 26, 2011

I got to thinking about code obfuscation. Current code obfuscators use ad hoc techniques like symbol renaming. But is it possible to have a cryptographically secure code obfuscator that outputs programs that work the same way as the originals, but provably no one can understand?

So here's the key idea around what I think that would mean. If you understand a slow program, you should be able to improve the slow parts and make it faster. If a program is slow but you have no idea how to make it faster, you don't understand the program.

So a cryptographically secure code obfuscator is a function F that takes as input a program P, and outputs another program Q (in other words F(P) = Q), where the following conditions apply:

P and Q both compute the same function, but Q is slower than P
it should be computationally hard to improve the performance of Q, given only Q and not P. This can be expressed mathematically by saying, there should not exist an easy-to-compute optimization function W such that if F(P) = Q, W(Q) is faster than Q.
Calculating F should be possible in polynomial time.

Anyone heard of something like this?

rcgldr · Sep 26, 2011

The issue in the case of windows is that it supports debuggers for just about any program. It also supports remote (hosted on a second computer) debuggers than end up getting enabled during system boot time (often used to debug device drivers), where trying to prevent the debugger from attaching to the process used to run you program is probably not possible.

Once the program is actaully running, some debuggers can disassemble the running parts of a program, even if the program is kept encrypted in sections until it's needed to execute.

In the end, at best all you can do is make it difficult, but not impossible, to reverse engineer a program running under Windows.

Versions of Linux also include debug kernels (again mostly used for device driver development), so I'm not sure if Linux is any better, other than access to debugging kernels may be limited to approved developers.

AlephZero · Sep 26, 2011

Let's try stating the problem a different way.

Suppose you have a Turing machine. Clearly the machine can be defined by a sequence of integers in some way, so its definition could be encrypted by standard "unbreakable" methods.

You can then pose the qiestion: can you construct a Turing machine that processes the "encrypted" definition, without trivally decoding it back to the original definition?

I don't know how to start answering that version of the question, but at least it seems to be a more precisely defined problem than the original version.

mXSCNT · Sep 26, 2011

AlephZero said:

Let's try stating the problem a different way.

Suppose you have a Turing machine. Clearly the machine can be defined by a sequence of integers in some way, so its definition could be encrypted by standard "unbreakable" methods.

You can then pose the qiestion: can you construct a Turing machine that processes the "encrypted" definition, without trivally decoding it back to the original definition?

I don't know how to start answering that version of the question, but at least it seems to be a more precisely defined problem than the original version.

That's less precise than my version because you don't say what it means to "trivially" decode it back to the original definition. My version is reducible to math; yours is not.

AlephZero · Sep 26, 2011

mXSCNT said:

My version is reducible to math; yours is not.

Hm... I studied Turing machines as part of a math degree. But please yourself.

I don't see why you require an obfuscated program to be slower than the original, though. Your assertion

If you understand a slow program, you should be able to improve the slow parts and make it faster. If a program is slow but you have no idea how to make it faster, you don't understand the program.

is false in general. There are "slow" algorithms whch are very well understood (e.g. enumeration of all possible cases), but no faster algorithms are known.

rcgldr · Sep 27, 2011

Assuming that P represents some algorithm, then doesn't the end result of any computer that can run Q, end up being the same algorithm as defined by P? If so, then if there is some tool that let's a person run Q with a debugger with trace feature, wouldn't that effectively allow the recreation of P?

Cryptographically secure code obfuscation

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Sweetspot of data compression

Other than just FizzBuzz to test programmer candidates

How to show RS(U+TRS)* is equivalent to (R+SUT)SU?

HTML/CSS Problems with DNS records

PHP My website presents the visitor with the choice of opting out of using cookies....

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect