Cryptographically secure code obfuscation

  • Thread starter Thread starter mXSCNT
  • Start date Start date
  • Tags Tags
    Code
Click For Summary

Discussion Overview

The discussion centers on the concept of cryptographically secure code obfuscation, exploring whether it is feasible to create an obfuscator that produces programs which maintain the same functionality as the originals but are resistant to understanding and optimization. The scope includes theoretical aspects of computation, algorithm performance, and the implications of debugging environments.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant proposes a definition of a cryptographically secure code obfuscator as a function that outputs a slower program which is computationally hard to optimize without access to the original program.
  • Another participant discusses the challenges of obfuscation in environments like Windows, where debuggers can attach to running programs, making complete obfuscation difficult.
  • A different perspective is introduced by posing a question about constructing a Turing machine that can process an encrypted definition without trivially decoding it back to the original, suggesting a more precise framing of the problem.
  • One participant challenges the necessity of the obfuscated program being slower, arguing that some slow algorithms are well understood and do not have faster alternatives.
  • Another participant questions whether running an obfuscated program with debugging tools could lead to the recreation of the original algorithm, implying a potential flaw in the obfuscation approach.

Areas of Agreement / Disagreement

Participants express differing views on the requirements and implications of cryptographic code obfuscation, with no consensus reached on the necessity of performance degradation or the effectiveness of obfuscation in the presence of debugging tools.

Contextual Notes

Some assumptions about the definitions of "trivially" decoding and the nature of algorithm performance are not fully explored, leaving room for interpretation and further discussion.

mXSCNT
Messages
310
Reaction score
1
I got to thinking about code obfuscation. Current code obfuscators use ad hoc techniques like symbol renaming. But is it possible to have a cryptographically secure code obfuscator that outputs programs that work the same way as the originals, but provably no one can understand?

So here's the key idea around what I think that would mean. If you understand a slow program, you should be able to improve the slow parts and make it faster. If a program is slow but you have no idea how to make it faster, you don't understand the program.

So a cryptographically secure code obfuscator is a function F that takes as input a program P, and outputs another program Q (in other words F(P) = Q), where the following conditions apply:
  1. P and Q both compute the same function, but Q is slower than P
  2. it should be computationally hard to improve the performance of Q, given only Q and not P. This can be expressed mathematically by saying, there should not exist an easy-to-compute optimization function W such that if F(P) = Q, W(Q) is faster than Q.
  3. Calculating F should be possible in polynomial time.

Anyone heard of something like this?
 
Technology news on Phys.org
The issue in the case of windows is that it supports debuggers for just about any program. It also supports remote (hosted on a second computer) debuggers than end up getting enabled during system boot time (often used to debug device drivers), where trying to prevent the debugger from attaching to the process used to run you program is probably not possible.

Once the program is actaully running, some debuggers can disassemble the running parts of a program, even if the program is kept encrypted in sections until it's needed to execute.

In the end, at best all you can do is make it difficult, but not impossible, to reverse engineer a program running under Windows.

Versions of Linux also include debug kernels (again mostly used for device driver development), so I'm not sure if Linux is any better, other than access to debugging kernels may be limited to approved developers.
 
Let's try stating the problem a different way.

Suppose you have a Turing machine. Clearly the machine can be defined by a sequence of integers in some way, so its definition could be encrypted by standard "unbreakable" methods.

You can then pose the qiestion: can you construct a Turing machine that processes the "encrypted" definition, without trivally decoding it back to the original definition?

I don't know how to start answering that version of the question, but at least it seems to be a more precisely defined problem than the original version.
 
AlephZero said:
Let's try stating the problem a different way.

Suppose you have a Turing machine. Clearly the machine can be defined by a sequence of integers in some way, so its definition could be encrypted by standard "unbreakable" methods.

You can then pose the qiestion: can you construct a Turing machine that processes the "encrypted" definition, without trivally decoding it back to the original definition?

I don't know how to start answering that version of the question, but at least it seems to be a more precisely defined problem than the original version.
That's less precise than my version because you don't say what it means to "trivially" decode it back to the original definition. My version is reducible to math; yours is not.
 
mXSCNT said:
My version is reducible to math; yours is not.

Hm... I studied Turing machines as part of a math degree. But please yourself.

I don't see why you require an obfuscated program to be slower than the original, though. Your assertion
If you understand a slow program, you should be able to improve the slow parts and make it faster. If a program is slow but you have no idea how to make it faster, you don't understand the program.
is false in general. There are "slow" algorithms whch are very well understood (e.g. enumeration of all possible cases), but no faster algorithms are known.
 
Assuming that P represents some algorithm, then doesn't the end result of any computer that can run Q, end up being the same algorithm as defined by P? If so, then if there is some tool that let's a person run Q with a debugger with trace feature, wouldn't that effectively allow the recreation of P?
 

Similar threads

  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 17 ·
Replies
17
Views
3K
  • · Replies 30 ·
2
Replies
30
Views
7K
  • · Replies 1 ·
Replies
1
Views
2K
Replies
1
Views
2K
Replies
1
Views
2K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 12 ·
Replies
12
Views
3K
  • · Replies 1 ·
Replies
1
Views
1K