Cryptographically secure code obfuscation

AI Thread Summary
The discussion centers on cryptographically secure code obfuscation: a function that transforms a program into a slower but functionally equivalent version that is computationally hard to optimize. The underlying intuition is that if a program is slow and you cannot see how to make it faster, you do not understand it. Challenges include the debuggers built into operating systems such as Windows and Linux, which can be used to reverse engineer obfuscated code. The conversation also touches on a reformulation in terms of Turing machines processing encrypted definitions without trivially decoding them. Ultimately, the feasibility of a truly secure obfuscator remains uncertain, with differing views on the requirements and limitations of such a system.
mXSCNT
I got to thinking about code obfuscation. Current code obfuscators use ad hoc techniques like symbol renaming. But is it possible to have a cryptographically secure code obfuscator: one that outputs programs that work the same way as the originals, but that provably no one can understand?

Here's the key idea behind what I think that would mean. If you understand a slow program, you should be able to improve the slow parts and make it faster. Conversely, if a program is slow but you have no idea how to make it faster, you don't understand the program.

So a cryptographically secure code obfuscator is a function F that takes as input a program P, and outputs another program Q (in other words F(P) = Q), where the following conditions apply:
  1. P and Q both compute the same function, but Q is slower than P.
  2. It should be computationally hard to improve the performance of Q, given only Q and not P. Mathematically: there should not exist an easy-to-compute optimization function W such that, whenever F(P) = Q, the program W(Q) computes the same function as Q but runs faster.
  3. Computing F should take only polynomial time.
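
To make conditions 1 and 3 concrete, and to show why all the difficulty lives in condition 2, here is a minimal Python sketch (the names slowdown and delay are purely illustrative, not part of any real scheme). It slows a program down in an obviously reversible way, so it fails condition 2 completely:

Code:
import time

def slowdown(p, delay=0.001):
    # Toy "obfuscator": wrap program p so every call wastes time.
    # Conditions 1 and 3 hold trivially: the wrapper computes the same
    # function as p and is constructed in constant time. Condition 2
    # fails badly: the optimizer W "delete the sleep" is easy to find.
    def q(*args, **kwargs):
        time.sleep(delay)  # useless work, trivially strippable
        return p(*args, **kwargs)
    return q

square = lambda x: x * x
obf_square = slowdown(square)
assert obf_square(7) == square(7)  # same function, just slower

Any real candidate for F would have to entangle the wasted work with the computation itself, so that it cannot simply be stripped out; that is exactly what condition 2 demands.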

Anyone heard of something like this?
 
The issue in the case of Windows is that it supports debuggers for just about any program. It also supports remote debuggers (hosted on a second computer) that end up getting enabled at system boot time (often used to debug device drivers), so preventing a debugger from attaching to the process running your program is probably not possible.

Once the program is actually running, some debuggers can disassemble the running parts of a program, even if the program is kept encrypted in sections until each is needed for execution.

In the end, at best all you can do is make it difficult, but not impossible, to reverse engineer a program running under Windows.

Versions of Linux also include debug kernels (again, mostly used for device driver development), so I'm not sure Linux is any better, other than that access to debug kernels may be limited to approved developers.
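
For what it's worth, one classic "difficult but not impossible" trick on Linux is for a process to detect a tracer at startup. This is just a sketch, assuming glibc and the Linux value PTRACE_TRACEME = 0; it does nothing against a determined attacker, who can simply patch the check out:

Code:
import ctypes

PTRACE_TRACEME = 0  # request value from <sys/ptrace.h> on Linux

libc = ctypes.CDLL("libc.so.6", use_errno=True)
libc.ptrace.restype = ctypes.c_long

# Only one tracer may attach to a process at a time, so if this call
# fails, some debugger (gdb, strace, ...) is already attached.
if libc.ptrace(PTRACE_TRACEME, 0, 0, 0) == -1:
    print("tracer detected, refusing to run")
else:
    print("no tracer at startup (one can still attach later)")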
 
Let's try stating the problem a different way.

Suppose you have a Turing machine. Clearly the machine can be defined by a sequence of integers in some way, so its definition could be encrypted by standard "unbreakable" methods.

You can then pose the question: can you construct a Turing machine that processes the "encrypted" definition, without trivially decoding it back to the original definition?

I don't know how to start answering that version of the question, but at least it seems to be a more precisely defined problem than the original version.
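
To give that question some shape, here is a toy sketch in Python, with keyed hashing of state labels standing in for real encryption (an assumption for illustration only, not a secure scheme). The interpreter runs the scrambled transition table directly and never recovers the original labels:

Code:
import hashlib

def scramble(label, key):
    # Keyed hashing as a stand-in for "encrypting" a state label.
    return hashlib.sha256((key + label).encode()).hexdigest()[:8]

# A tiny Turing machine that flips bits until it reads a blank ("_").
# delta[(state, symbol)] = (next_state, symbol_to_write, head_move)
delta = {
    ("flip", "0"): ("flip", "1", +1),
    ("flip", "1"): ("flip", "0", +1),
    ("flip", "_"): ("halt", "_", 0),
}

KEY = "secret"
enc_delta = {(scramble(s, KEY), sym): (scramble(s2, KEY), w, mv)
             for (s, sym), (s2, w, mv) in delta.items()}

def run(table, state, tape):
    # The interpreter executes the scrambled table directly; it halts
    # when no transition applies to the current (state, symbol) pair.
    pos = 0
    while (state, tape[pos]) in table:
        state, tape[pos], mv = table[(state, tape[pos])]
        pos += mv
    return "".join(tape)

print(run(enc_delta, scramble("flip", KEY), list("0110_")))  # 1001_

Of course, scrambling labels is just the symbol renaming mXSCNT dismissed as ad hoc, and anyone watching the interpreter run can rebuild an equivalent table. That is exactly why "without trivially decoding" needs a precise definition.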
 
AlephZero said:
You can then pose the question: can you construct a Turing machine that processes the "encrypted" definition, without trivially decoding it back to the original definition? [...] at least it seems to be a more precisely defined problem than the original version.
That's less precise than my version because you don't say what it means to "trivially" decode it back to the original definition. My version is reducible to math; yours is not.
 
mXSCNT said:
My version is reducible to math; yours is not.

Hm... I studied Turing machines as part of a math degree. But please yourself.

I don't see why you require an obfuscated program to be slower than the original, though. Your assertion
If you understand a slow program, you should be able to improve the slow parts and make it faster. If a program is slow but you have no idea how to make it faster, you don't understand the program.
is false in general. There are "slow" algorithms which are very well understood (e.g. enumeration of all possible cases), but for which no faster algorithms are known.
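
A standard textbook illustration of that point (my example, not specific to any obfuscator): subset-sum solved by exhaustive enumeration. The code is completely transparent, yet it runs in exponential time, and no polynomial-time algorithm for the general problem is known:

Code:
from itertools import combinations

def subset_sum(nums, target):
    # Exhaustive enumeration: fully understood, nothing obfuscated,
    # yet exponential in len(nums).
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return combo
    return None

print(subset_sum([3, 34, 4, 12, 5, 2], 9))  # (4, 5)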
 
Assuming that P represents some algorithm, doesn't any computer that runs Q end up carrying out the same algorithm defined by P? If so, and if there is some tool that lets a person run Q under a debugger with a trace feature, wouldn't that effectively allow the recreation of P?
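
As a toy illustration of what a trace feature exposes (using Python's sys.settrace in place of a machine-level debugger; the names q and tracer are hypothetical):

Code:
import sys

def tracer(frame, event, arg):
    # A debugger's trace feature in miniature: log every line executed.
    if event == "line":
        print(f"executed {frame.f_code.co_name} line {frame.f_lineno}")
    return tracer

def q(x):  # stand-in for the obfuscated program Q
    y = x * x
    return y + 1

sys.settrace(tracer)
q(3)
sys.settrace(None)

Every line Q executes is visible, so an observer can reconstruct at least the sequence of operations Q performs on a given input, even if recovering P itself (or a readable equivalent) takes further work.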
 