Well, but let's say I emulate 10, 20, or even 100 instances of this software CPU emulator: will that dramatically boost the number of calculations I can do per second? I'm not sure how much data I could process; someone has to enlighten me.

It's a total waste to build a faster HOST computer and then spend multiple cycles simulating a TARGET doing something. Just do the work on the HOST as efficiently as possible.
If the TARGET has a better architecture, then build a faster computer using the TARGET architecture.
A common way to get peak performance from a machine is to hand-code in assembly language. In many cases a human can do a better job of optimizing than the compiler (though that isn't always true). A prime example where humans do well is tightly coded, pipelined DSP arithmetic loops: a human can sniff out tricks and shuffle code to cut cycles from the loops in ways an optimizing compiler could never find.
The next level of optimization is to build dedicated hardware coprocessors that do certain operations more efficiently, like a GPU for graphics, or a custom coprocessor for a square root or an FFT (or even just use multiple CPUs to execute parallel threads).
The only way to efficiently emulate a TARGET computer is to build dedicated hardware. (That's a joke, BTW, since that dedicated hardware would be the TARGET computer itself.)