Simulate computer inside a computer

  • Thread starter fredreload
  • Start date
  • Tags
    Computer
In summary: It's possible, but it's not easy. You would need to find an emulator that is accurate enough and recompile your code.
  • #1
fredreload
If I simulate a virtual computer inside a real computer graphically, would it take more resources, or would it work the same as before? It is different from cloud computing; I'm simulating a computer graphically.
 
  • #2
Look up emulators and "virtual machine" on Wikipedia. After reading those articles, if you still have questions, post them here.

Edit: I don't understand what you mean by graphically.
 
  • #3
More resources than what? Before what?
 
  • #4
In general, yes. It will use more resources. There's no free lunch.

It might be possible to trade resources though. For example certain unused portions of the CPU like branch prediction could be skipped by lowering the execution speed since only one branch will be taken anyway.

It also might be possible to limit resources by limiting the faithfulness of the simulation. A trivial example: A brick can simulate a computer that's been turned off.
 
  • #5
Your modern Intel x86 instruction set CPU doesn't run native x86 instructions on the silicon. It runs RISC-like micro-instructions internally, with very complex micro-coded decoders for the CISC instructions. It's both faster and more efficient to emulate/simulate the CISC instruction set on modern processors than to run the native instructions in silicon.

http://www.hardwaresecrets.com/inside-pentium-m-architecture/4/
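To illustrate the idea (a purely made-up decomposition, not Intel's actual µop encoding), a single memory-to-register CISC-style instruction might get cracked by the decoder into a few RISC-like internal steps, something like:

```c
/* Hypothetical illustration only: a CISC-style "ADD [addr], reg"
 * cracked by the decoder into three RISC-like micro-ops. */
typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;

typedef struct {
    uop_kind kind;
    int dst, src;   /* internal temporary / architectural register numbers */
} uop;

/* ADD [addr], reg  =>  tmp0 = mem[addr]; tmp0 += reg; mem[addr] = tmp0; */
static const uop add_mem_reg[] = {
    { UOP_LOAD,   0, -1 },   /* tmp0 <- load from memory      */
    { UOP_ADD,    0,  1 },   /* tmp0 <- tmp0 + reg            */
    { UOP_STORE, -1,  0 },   /* memory <- tmp0 (store result) */
};
```

The RISC-like core then only ever sees the simple micro-ops.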
 
  • Like
Likes dlgoff and fredreload
  • #6
I thought simulating a computer would create a faster one, but I feel that exascale computing is still going in the right direction; we'll wait for that and go from there.
 
  • #7
Well hmm, is it possible to simulate a computer with a particle simulation of the transistors? After all, they are just on/off switches. Each particle would represent an on or off switch, a different kind of computing. Eventually it would simulate how a CPU works; that way you don't need to build those transistors, you just simulate how they work. But then again, simulating how these particles behave might take more computing power than a CPU can handle. What do you guys think? It's only a thought.
 
  • #8
Typically they are not just on/off switches. Timing is critically important and most systems use the transition states to control timing. For example a state of one bit might be latched by the transition of another.
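In a simulation, that edge-triggered behaviour would look something like this toy sketch (a made-up D flip-flop model, not any real device):

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy model of an edge-triggered D flip-flop: the stored bit 'q'
 * only changes on a 0 -> 1 transition of the clock input. */
typedef struct {
    bool q;          /* latched output                        */
    bool prev_clk;   /* clock level seen on the previous step */
} dff_t;

static void dff_step(dff_t *ff, bool clk, bool d)
{
    if (clk && !ff->prev_clk)   /* rising edge: capture the data bit */
        ff->q = d;
    ff->prev_clk = clk;         /* remember the level for edge detection */
}

int main(void)
{
    dff_t ff = { .q = false, .prev_clk = false };
    bool clk[] = {0, 1, 1, 0, 1, 0};   /* clock waveform */
    bool d[]   = {1, 1, 0, 0, 0, 1};   /* data line      */

    for (int i = 0; i < 6; i++) {
        dff_step(&ff, clk[i], d[i]);
        printf("t=%d clk=%d d=%d q=%d\n", i, clk[i], d[i], ff.q);
    }
    return 0;
}
```

Note how the simulator has to track the previous clock level just to know when the bit changes - timing, not just on/off states.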

These are technical issues for optical and quantum computing. Solving them is a high priority.
 
  • #9
You are right; all you need is timing and RAM. We would need a lot of RAM, and more RAM for the clock, assuming 1 byte each.
 
  • #10
Back when I was in school, we did a graphical simulator of the micro-coding of an 8x86 for teaching. It wasn't that hard. (Getting the artwork right was the hardest part). But it was a simulator, not any sort of emulator intended to run code.
 
  • #11
Hmm, a quantum computer is for exponential calculations; couldn't we optimize arithmetic or create a supercomputer with a graphical simulator and use that for our calculations? The actual computer hardware is limited to binary calculations, but a simulation program can do just about anything.
 
  • #12
The simulation program is running on a computer that is limited to binary calculations. Where is the win?
 
  • #13
Jeff Rosenbury said:
Back when I was in school, we did a graphical simulator of the micro-coding of an 8x86 for teaching. It wasn't that hard. (Getting the artwork right was the hardest part). But it was a simulator, not any sort of emulator intended to run code.

Enlighten me please Jeff. Graphical? I don't know what you mean.
  • You used an interactive circuit builder in Spice?
  • You sketched artwork for a chip etching mask or a printed circuit mask?
 
  • #14
anorlunda said:
Enlighten me please Jeff. Graphical? I don't know what you mean.
  • You used an interactive circuit builder in Spice?
  • You sketched artwork for a chip etching mask or a printed circuit mask?

SPICE is far, far too slow to use as an interactive microcode simulator.

I'm sure Jeff meant he had a program that allowed the various registers to be printed to the screen as the simulation progressed. I've done similar things but I've always just printed them to stdout or redirected to a log file.
 
  • #15
analogdesign said:
SPICE is far, far too slow to use as an interactive microcode simulator.

I'm sure Jeff meant he had a program that allowed the various registers to be printed to the screen as the simulation progressed. I've done similar things but I've always just printed them to stdout or redirected to a log file.
Yes. The program displayed the CPU in block diagram form. It would execute commands by loading the registers, running the ALU, setting the flags, etc. Basically it displayed the machine code and how the µcode drove its execution.

It was intended for teaching CPU architecture classes.
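A rough, text-only sketch of the same idea (an invented three-instruction machine, nothing like the real microcode): step a tiny CPU model and dump the registers and flags after every instruction.

```c
#include <stdio.h>
#include <stdint.h>

/* A made-up 3-instruction machine, purely for illustration:
 *   opcode 0: LOAD  reg, imm
 *   opcode 1: ADD   reg, reg
 *   opcode 2: HALT
 */
typedef struct { uint8_t op, a, b; } insn_t;

int main(void)
{
    uint8_t reg[4] = {0};
    int zero_flag = 0;

    insn_t program[] = {
        {0, 0, 7},   /* LOAD r0, 7  */
        {0, 1, 5},   /* LOAD r1, 5  */
        {1, 0, 1},   /* ADD  r0, r1 */
        {2, 0, 0},   /* HALT        */
    };

    for (int pc = 0; ; pc++) {
        insn_t in = program[pc];
        if (in.op == 2) break;
        if (in.op == 0) reg[in.a] = in.b;        /* load immediate */
        if (in.op == 1) reg[in.a] += reg[in.b];  /* register add   */
        zero_flag = (reg[in.a] == 0);            /* update one flag */

        /* "display" the machine state after every step */
        printf("pc=%d  r0=%d r1=%d r2=%d r3=%d  Z=%d\n",
               pc, reg[0], reg[1], reg[2], reg[3], zero_flag);
    }
    return 0;
}
```

A graphical version just draws the block diagram and updates it with these values instead of printing lines.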
 
  • #16
Can I emulate more CPU power, so the number of calculations is greater than what the CPU I use can do itself?
 
  • #17
That's kind of like asking if you can run faster by carrying yourself.
 
  • Like
Likes Averagesupernova
  • #18
But Wikipedia says it is possible to emulate an IBM PC with a Commodore 64 here. I'm interested in the performance I would get from that
 
  • #19
fredreload said:
But Wikipedia says it is possible to emulate an IBM PC with a Commodore 64 here.

A. "But Wikipedia says" is not a very good argument.

B. You didn't - and should have - included the second half of that quote. "Yes, it's possible for a 64 to emulate an IBM PC, in the same sense that it's possible to bail out Lake Michigan with a teaspoon." It was disingenuous not to do that.

fredreload said:
Can I emulate more CPU power, so the number of calculations is greater than what the CPU I use can do itself?

Of course not. If you could run an emulator faster on the same hardware, we'd run emulators on top of emulators until we had infinitely fast computers and we could just sit back and await the robot apocalypse.
 
  • Like
Likes Borg
  • #20
Darn it, I had my hopes up. It's not that it would be faster, but how would you maximize calculations per second? Is increasing the number of CPUs the only way?
 
  • #21
Note: The last few posts were merged into this thread, they were a separate thread before.

fredreload said:
I thought simulating a computer would create a faster one
It cannot - you cannot simulate more than one operation with one operation. Actual simulations of different processor architectures are way slower than those processors, as they need much more complex high-level operations to decide what each individual component in the simulated system would do. Simulating systems is a massive performance loss. This can be acceptable if the simulated system itself does not exist yet (you want to design something new), or does not exist any more (you want to see how some C64 software worked but you don't have a C64), otherwise it is a huge waste of resources.
 
Last edited:
  • #22
A CPU is essentially a clock that ticks billions of times per second. Having a multi-core CPU just for a clock sounds rather inefficient, and that means having a lot of clocks at that. Just saying, I think there should be a better design, maybe in simulating these clocks?
 
  • #23
A CPU is not a clock. It has a clock, and each elementary operation in the CPU needs one clock cycle, after which the next operations can be performed in the following cycle. Modern CPUs can do many operations in parallel, and having more CPUs makes it possible to perform even more calculations at the same time. A faster clock speeds up calculations as well, but every component in the CPU has to be fast enough to keep up.

I suggest you look up how a CPU works before you continue this thread. It will help a lot to understand what limits computation speeds.
 
  • #24
mfb said:
I suggest you look up how a CPU works before you continue this thread. It will help a lot to understand what limits computation speeds.
Yes it seems that someone who is asking the questions is pretty bent on giving the answers.
 
  • #25
Precisely, I am referring to how a CPU works here. I don't think you can make the clock go any faster. The thing that bothers me is the number of transistors you need. Since you can't make the clock go faster, you can only increase the amount of data being calculated. How does the number of transistors factor into this? Does anyone know?
 
  • #26
fredreload said:
I don't think you can make the clock go any faster.
CPU clock speeds increased by orders of magnitude in the last decades.
fredreload said:
How does the number of transistors factor into this?
More transistors make it possible to do more things in parallel - if the software allows it. If every calculation step depends on the result of the previous step, a computer is very slow; a running sum, where each addition needs the previous result, is a simple example (sketched below).
Modern CPUs spend a large fraction of their transistors on logic to check which steps can be done in parallel.
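A made-up C example of the difference; as written, the first loop is a dependency chain, while the second has independent iterations that could in principle be spread across many execution units or cores:

```c
#include <stddef.h>

/* Serial: each step needs the result of the previous one,
 * so the additions cannot be done in parallel (as written). */
double running_sum(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];          /* depends on the previous value of s */
    return s;
}

/* Parallelizable: every element is independent of the others,
 * so the multiplications could all happen at the same time. */
void scale_all(double *x, size_t n, double k)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= k;          /* no dependency between iterations */
}
```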
 
  • #27
mfb said:
Note: The last few posts were merged into this thread, they were a separate thread before.

It cannot - you cannot simulate more than one operation with one operation. Actual simulations of different processor architectures are way slower than those processors, as they need much more complex high-level operations to decide what each individual component in the simulated system would do. Simulating systems is a massive performance loss. This can be acceptable if the simulated system itself does not exist yet (you want to design something new), or does not exist any more (you want to see how some C64 software worked but you don't have a C64), otherwise it is a huge waste of resources.
1. Let's see, first we need a clock, then we link multiple CPU designs onto this one clock, then we create a CPU emulator software from this. The CPU designs repeat infinitely, like a skyscraper, and each floor receives 4 bits of input 0000 to 1111. Now assume I have 16 bits information 1011 0100 1100 0011, I need to have 4 floors repeat of this skyscraper in parallel, but is there a way for me to truncate it down to just 1 floor and pass all 16 bits in one go? Like stack them or something.
2. Can't I make a CPU that takes 100 bits of information and output 100 bits? This sounds like a silly question.
 
  • #28
fredreload, the type of questions you ask tell me that you need to start with more basics.

A young teenage boy took a look at my bench one day not long ago. I was showing him how I prototyped some stuff and a motion control project with a graphic LCD. He asked how he could do that kind of stuff. I told him he needed to go to school for electronics and start with the basics. He claimed he already knew the basics. His experience with 'the basics' involved a gifted program at school hooking up some LEDs. On a residential wiring project he was with me one day and I was explaining to him some basics and he told me he knew the two types of circuits, which were series and parallel. Well, I found that he knew OF those types of circuits but really didn't know them at all. He recognized the names.

So, it is not uncommon to not realize how much you don't know. fredreload, I suspect you fall into this category. Do you know what a register is? A shift register? Gates? Latches? Counters? If you don't have a really good grasp on those things, and others I have missed, you will fail miserably with microcontrollers. Take it from folks on this forum who have gone through it. You will appreciate and understand microcontrollers much more when you have a good grasp on the basics. It is my opinion that I had a very poor microcontroller program back when I went to school, and it affected my understanding of microcontroller systems for a long time.
 
Last edited:
  • Like
Likes rbelli1
  • #29
fredreload said:
1. Let's see, first we need a clock, then we link multiple CPU designs onto this one clock, then we create a CPU emulator software from this. The CPU designs repeat infinitely, like a skyscraper, and each floor receives 4 bits of input 0000 to 1111. Now assume I have 16 bits information 1011 0100 1100 0011, I need to have 4 floors repeat of this skyscraper in parallel, but is there a way for me to truncate it down to just 1 floor and pass all 16 bits in one go? Like stack them or something.
That does not make sense at all.

Imagine simulating a computer adding two numbers with pen and paper: look up how your computer would do the addition internally, write down the corresponding state of all ~1 billion transistors, then calculate how they change during a clock cycle. Repeat for 10 clock cycles, or whatever your computer needs. Total time you need? If you manage to simulate one transistor in a second (very optimistic, as you would be going through a library of a thousand books just to check all the previous states for each step), you are done in roughly 300 years.
How long do you need if you just add the numbers? A few seconds.
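Spelling out the arithmetic behind that estimate, with the figures above (about a billion transistors, one second per transistor, roughly 10 clock cycles):

$$10^9 \times 10 \times 1\,\mathrm{s} = 10^{10}\,\mathrm{s}, \qquad \frac{10^{10}\,\mathrm{s}}{3.15\times 10^{7}\,\mathrm{s/year}} \approx 317\ \text{years}.$$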
 
  • #30
Simulation will always be slower than "the real thing" unless you make approximations in some way.

In order to emulate a TARGET computer the HOST computer must parse every instruction and perform the operations as contained in the instruction. Also, it must determine the program flow, that is, emulate the hardware portions of the TARGET computer regarding the program counter, memory access, cache operation and so on.

If you could take each TARGET instruction apart and execute it in fewer than 100 HOST instructions, you would be doing well.

Some languages are emulated (they call it interpreted) on the HOST computer, such as Java, Python, Tcl, etc. These are generally slower than executing fully compiled and optimized code in the computer's native language.

Parsing and emulating the TARGET instruction set on the HOST is a big task. But don't underestimate the hardware activity that is happening every TARGET clock cycle which must be emulated also by the HOST.
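Roughly, the inner loop of such an emulator looks like this (a made-up one-register TARGET machine, not any real instruction set). Every single TARGET instruction costs the HOST a fetch, a decode/dispatch, the operation itself, and the program counter bookkeeping:

```c
#include <stdint.h>

/* Hypothetical TARGET machine: one accumulator, byte-coded instructions.
 * Each TARGET instruction below costs the HOST dozens of its own
 * instructions: fetch, decode, dispatch, execute, update the PC. */
enum { OP_LOAD, OP_ADD, OP_STORE, OP_HALT };

void run_target(const uint8_t *code, uint8_t *mem)
{
    uint32_t pc = 0;
    uint8_t acc = 0;

    for (;;) {
        uint8_t op  = code[pc];          /* fetch opcode          */
        uint8_t arg = code[pc + 1];      /* fetch operand         */
        pc += 2;                         /* emulate the PC update */

        switch (op) {                    /* decode + dispatch     */
        case OP_LOAD:  acc = mem[arg];       break;
        case OP_ADD:   acc += mem[arg];      break;
        case OP_STORE: mem[arg] = acc;       break;
        case OP_HALT:  return;
        }
    }
}
```

And this doesn't even model caches, interrupts, or I/O timing, which a faithful emulator would also have to track every TARGET cycle.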
 
  • #31
I mentioned that you can speed things up by making approximations. But, that is something you can/must do in any program to get the best performance. Adding the overhead of emulating a different machine is just wasteful. It adds nothing. The only reason anyone ever does it is because they want machine code compatibility so they can (for example) execute the game machine ROMs exactly as the old machine did. It is horribly inefficient.

If you want better performance, you need to optimize your program to incorporate the algorithms, approximations, and hierarchical reductions that you are willing to trade off for speed. For example, you can add 100 to a number by writing a loop that adds one 100 times, or you can execute a single ALU ADD instruction with the original number and 100 as operands. That is your choice when you write the program.
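In (hypothetical) C, the two versions of that add-100 example look like this - an optimizing compiler would of course collapse the loop, but the point is that the choice of algorithm is made by the programmer, not the machine:

```c
/* Naive: 100 trips around a loop, each with an increment,
 * a compare and a branch. */
int add_hundred_slow(int x)
{
    for (int i = 0; i < 100; i++)
        x += 1;
    return x;
}

/* Direct: a single add does the same job. */
int add_hundred_fast(int x)
{
    return x + 100;
}
```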
 
  • #32
fredreload said:
Precisely I am referring to how CPU works here, I don't think you can make the clock goes any faster. The thing that bothers me is the amount of transistors you need. Since you can't make the clock goes faster, you can only increase the amount of data being calculated. How does the number of transistors factor in into this? Does anyone know?
The performance per transistor is actually very low in modern CPUs. For example, imagine you recreated a 6502 processor (similar to the one used in the C64) with a modern 22 nm manufacturing process and ran it at 4 GHz. That thing only has about 3,500 transistors, and for most types of software it would be about 50 to 100 times slower than a quad-core i7 with more than 1 billion transistors. So you put 285,000 times as many transistors in but only get about 100 times the processing power.
In other words the performance per transistor and clock cycle is about 3000 times worse.
The main reason for this is the fact that single thread performance is extremely important. Putting 10 times as many transistors in each core to increase their performance by just a few percent can be more valuable than putting 10 times as many cores in there. Most software isn't even able to use more than one core.
If we managed to find a way to automatically rewrite a piece of software that can only run on one core to one that can make use of hundreds of cores efficiently we could fundamentally change the way CPUs are designed and use huge amounts of small cores instead of a few big ones. That would then give us a lot more power per transistor.
 
  • Like
Likes Silicon Waffle
  • #33
People have even built and used working computers inside Minecraft.

 
  • #34
I was thinking about it too, but doesn't the CPU power it takes to simulate such a computer outweigh the computer itself? Or is it possible to build one with even better performance?
 
  • #35
It's a total waste to build a faster HOST computer to use multiple cycles to simulate a TARGET doing something. Just do it with the HOST as efficiently as possible.
If the TARGET has a better architecture, then build a faster computer using the TARGET architecture.

A common way to get peak performance from a machine is to hand code in assembly language. In many cases a human can do a better job of optimizing than the compiler (but that isn't always true). A prime example where humans do well is in tightly coded pipelined DSP arithmetic loops. A human can sniff out tricks and shuffle code to cut cycles from the loops that an optimizing compiler could never find.

The next level of optimization is to build dedicated hardware coprocessors to do certain operations more efficiently, like a GPU for graphics, or a custom coprocessor to do a square root or an FFT (or even just use multiple CPUs to execute parallel threads).

The only way to efficiently emulate a TARGET computer is to build dedicated hardware (that's a joke, BTW, since that would be the TARGET computer itself)
 
