Interpreter vs Compiler: What Are the Differences and Benefits?

  • Thread starter: opus
  • Tags: Compiler
In summary, a compiler takes high-level language code and turns it into machine code that the computer can understand. This extra step is what gives the compiler an advantage over an interpreter: the compiler can optimize the code and catch errors before the program runs.
  • #36
harborsparrow said:
I would expect languages such as Python DO create, or at least cache, machine code for parts of a program that are called repeatedly, such as in a loop.

The standard C implementation of Python, CPython, does not do this. PyPy does some just-in-time (JIT) compiling during program execution.

harborsparrow said:
Python may or may not be fast enough for scientific applications

The numpy and scipy packages are specifically designed using C extensions for the computation-intensive parts of the code, to avoid this problem.

harborsparrow said:
One need not declare the types of variables, and Python makes assumptions as to what the type will be.

This is not correct. "Variables" in Python are not objects; they are namespace bindings, which is why no types need to be declared for them. (Recent versions of Python allow type hints, but those are ignored at run time by the interpreter; they are there to help with code clarity for programmers and to help with certain code analysis and review tools.) All objects in Python have well-defined types, which are determined when the objects are constructed; there is no ambiguity whatever about the type of any object in Python, and the interpreter does not have to "make assumptions" about what any object's type will be; the language syntax determines that unambiguously for all built-in types, and the programmer does it for user-defined types by choosing what type's constructor to call in the code.
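
For instance, here is a minimal sketch (assuming any recent CPython) showing both points: type hints are ignored at run time, while every object carries a definite type:

Python:
# Type hints are not enforced at run time: CPython runs this without complaint.
x: int = "not an int"

# Every object has a definite type, fixed when the object is constructed.
print(type(x))       # <class 'str'>
print(type(42))      # <class 'int'>
print(type([1, 2]))  # <class 'list'>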
 
  • Like
Likes harborsparrow
  • #37
PeterDonis said:
The standard C implementation of Python, CPython, does not do this. PyPy does some just-in-time (JIT) compiling during program execution.

The numpy and scipy packages are specifically designed using C extensions for the computation-intensive parts of the code, to avoid this problem.

This is not correct...

Aha, Mentor, you have caused me to learn something today. Clearly Python is not my area of expertise. However, I fear the waters are murkier than you might want to believe. As evidence, I offer this thread: https://news.ycombinator.com/item?id=17049499
which makes for some pretty fascinating reading, back and forth! But I think you are closer to the truth about Python and typing than I originally was, so I'll agree to stand corrected.
 
  • #38
harborsparrow said:
As evidence, I offer this thread

The debate in that thread is about terminology--what different people want the terms "strongly typed" and "static typing" to mean. I purposely avoided any such buzzwords in my post because I wanted to focus on how Python actually works, not on how different people want to use particular buzzwords.

In particular, the discussion in that thread entirely ignores a key point I made in my post: in Python, variables are namespace bindings, not data storage declarations. So in Python, variables don't have types; objects have types.

For example, consider this Python code:

Python:
my_string = "string"
my_int = 1

In this code, "string" and 1 are objects; the first is an object of type str and the second is an object of type int. my_string and my_int are namespace bindings: they bind the names "my_string" and "my_int" to the objects "string" and 1 in the global namespace dictionary of the module in which this code appears. (If these lines of code appeared inside a function, they would bind those names to those objects in the local namespace dictionary of the function.) Note that all this happens at runtime: the interpreter sees these lines and, internally, creates the appropriate objects and the appropriate namespace dictionary entries. (Even in a version of Python that has JIT compilation, like PyPy, the compiler generates machine code that performs these same operations at run time.)
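
You can watch these bindings appear in the namespace dictionary itself; here is a minimal sketch (run at module level):

Python:
my_string = "string"
my_int = 1

# The names are just keys in the module's global namespace dictionary.
print(globals()["my_string"])  # string
print(globals()["my_int"])     # 1

# Rebinding a name to an object of a different type is fine:
# names have no types, only objects do.
my_int = "now a str"
print(type(my_int))  # <class 'str'>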

Now compare with this C code:

C:
char my_string[] = "string";
int my_int = 1;

In this code, a block of memory is declared and filled with the bytes for the ASCII characters "string" (plus a terminating null), and the compiler is told that this block of memory is an array of characters. A second block of memory is declared and filled with the bytes for the integer 1, and the compiler is told that this block of memory is a (signed) integer. No namespace bindings are made at all. For variables declared at file scope, all of this happens at compile time; at run time these two lines of code do nothing, since their only effect is to tell the compiler to make sure that the executable that will be run has the appropriate memory locations allocated with the appropriate contents. (Initializers for local variables inside a function, by contrast, do execute at run time.) The variable names also have no effect at run time; they are only used at compile time to tell the compiler what other pieces of code reference the given memory locations.
 
  • Like
Likes phinds
  • #39
I started with assembler, but now I work a lot in both interpreted and compiled languages. Assembly code is brutal to port because it's very architecture-dependent. Compiled code (e.g., C) is easier, but can still be a pain to port from one OS to another. Interpreted languages are easiest, as a lot of the work is done for you. Often no modification is required. With today's processing speeds, most applications and utilities run fine interpreted.
 
  • Like
Likes FactChecker
  • #40
Chris Miller said:
can still be a pain to port from one OS to another. Interpreted languages are easiest, as a lot of the work is done for you. Often no modification is required.
Good point. Interpreted languages (like Perl) may give you the option of directly calling an OS command versus using a call provided by the language. If you use OS commands directly, expect to have to convert all of them when porting. If you use the language-provided calls, that conversion has probably already been done for you by the language implementation for each platform. Established interpreted languages like Perl and Python have taken care of a lot of the conversion work.
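
For example, here is a minimal Python sketch of the same idea (the specific commands are illustrative):

Python:
import os
import subprocess

# Portable: the language-provided call works on every OS Python supports.
files = os.listdir(".")

# Not portable: shelling out ties the script to one OS's commands.
# "ls" would need to be converted to "dir" when porting to Windows.
result = subprocess.run(["ls", "-l"], capture_output=True, text=True)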
 
  • #41
Interpreters do not execute source code; the machine executes the instructions of the interpreter program, which reads code written in the interpreted language and performs the actions that code directs. Source code itself is never executed. It is called source code to distinguish it as the input from which an assembler or compiler generates object code. Assemblers and compilers convert source code into directly executable machine instructions; interpreters do not. The distinction is clear, meaningful, and useful.
 
  • #42
Rive said:
I wonder why I have the feeling that the main reason I'm buying better and better computers is to satisfy the hunger of the newest layer between me and the resources...

Because you are. And because economics says you should.

The cost per computation has fallen by more than three orders of magnitude since the PC was introduced. When computing is expensive, it makes economic sense to put more programmers (also expensive) on it so you use less of it. When computing is cheap, it doesn't.

In the early 1980s, if you wanted a word processor, spreadsheet, and database, it would cost $1500, or about $4000 today. Today you can get all of that for free. Of course, it's not going to have the same degree of hand-tuning.
 
  • Like
Likes pbuk
  • #43
With an interpreter, you can effectively write self-modifying code: a program can build up code as a string based on the data it reads and then evaluate it. I have occasionally found that very useful. One obvious use is a calculator that accepts complicated equations/code as user input and evaluates them.
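
Here is a minimal Python sketch of such a calculator (eval on untrusted input is dangerous, so real code should restrict the namespace, as done here):

Python:
import math

# The program evaluates code that did not exist when it was written.
expression = input("Enter an expression: ")  # e.g. "math.sin(math.pi / 4) ** 2"
result = eval(expression, {"__builtins__": {}, "math": math})
print(result)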
 
  • #44
This is quite an old thread and much of what is said about interpreters is not true in 2020.

In fact I would go as far as to say that the distinction is no longer relevant to the majority of users of a programming language: if you don't already know the difference, you probably don't need to.
 
  • Like
Likes harborsparrow
  • #45
pbuk said:
... I would go as far as to say that the distinction is no longer relevant to the majority of users of a programming language:
I disagree. Although you are certainly right that the distinction is not what it used to be, the CORE distinction (you can make on-the-fly changes to interpretive code) is still sometimes very relevant. For example, I do all my programming in VB.NET 2017 64-bit and it does NOT allow changes on the fly. It does compile very quickly, so that's not as important as it used to be, but there ARE times in debugging when I wish it were interpretive.
 
  • Like
Likes fresh_42
  • #46
I don't think the majority of users of programming languages need self-modifying code.
 
  • #47
I wonder how that old Transmeta CPU software layer would be categorized.
 
  • #48
phinds said:
Although you are certainly right that the distinction is not what it used to be, the CORE distinction (you can make on-the-fly changes to interpretive code) is still sometimes very relevant.
Indeed. Especially when the numbers get large, e.g. the number and kind of clients, the frequency of changes, or the amount of data. If a system slows down, then you need to know which component does what in order to optimize it.
 
  • Like
Likes phinds
  • #49
pbuk said:
I don't think the majority of users of programming languages need self-modifying code.
Yeah, you are right about that. I was focusing on my own interests (as usual :smile:)
 
  • #50
pbuk said:
I don't think the majority of users of programming languages need self-modifying code.
Using a language like R in an interactive environment like RStudio is quite a different interaction from the normal use of a compiler. Some of these languages have quite a large user community.
 
  • #51
fresh_42 said:
Indeed. Especially when the numbers get large, e.g. the number and kind of clients, the frequency of changes, or the amount of data. If a system slows down, then you need to know which component does what in order to optimize it.
I also find it annoying when I find myself (as I did just last night) debugging several sections of new code, realizing that I now understand what I need to change in one section, and wanting to move on to the next (independent) section, but having to stop and recompile to get rid of the error before the debugger will LET me go on.
 
  • #52
FactChecker said:
Using a language like R in an interactive environment like RStudio is quite a different interaction from the normal use of a compiler. Some of these languages have quite a large user community.
Yes, that is true: a Read-Evaluate-Print Loop (REPL) only really makes sense in the context of an interpretive language.
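
As an illustration, Python can even embed its REPL inside a running program via the standard-library code module (a minimal sketch):

Python:
import code

greeting = "hello from the host program"

# Drop into an interactive Read-Evaluate-Print Loop that shares this
# program's local variables; exit() or Ctrl-D returns control here.
code.interact(banner="Embedded REPL; try: print(greeting)", local=locals())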
 
  • #53
fresh_42 said:
Indeed. Especially when the numbers get large, e.g. the number and kind of clients, the frequency of changes, or the amount of data. If a system slows down, then you need to know which component does what in order to optimize it.
I don't see any difference between compilers and interpreters in any of these situations; for example, performance profiling exists for both Python and C++. Or do I misunderstand what you mean?
 
  • #54
phinds said:
I also find it annoying when I find myself (as I did just last night) debugging several sections of new code, realizing that I now understand what I need to change in one section, and wanting to move on to the next (independent) section, but having to stop and recompile to get rid of the error before the debugger will LET me go on.
Hmmm, perhaps the lesson there is not to work on two independent changes at once! Of course I would not practice what I have just preached...
 
  • #55
pbuk said:
Hmmm, perhaps the lesson there is not to work on two independent changes at once! Of course I would not practice what I have just preached...
I agree and I would not normally do that, but this was new code that I knew needed several kinks worked out, and I wasn't even totally clear on how a couple of issues should be handled. The issues were independent of each other. It would have been nice to have been working in an interpretive environment.
 
  • #56
pbuk said:
I don't see any difference between compilers and interpreters in any of these situations; for example, performance profiling exists for both Python and C++. Or do I misunderstand what you mean?
The number and kind of clients affects the availability of the necessary software, especially for interpreted code.
The frequency of changes affects the need for compiled distributions.
The amount of data is directly related to response times, and hence to where the needed data are provided, which depends on the system.
 
  • #57
The "machine language" to which many languages compile (or are translated/interpreted) is now a virtual machine. The virtual machine is a program. For both user applications written in a given programming language, and the virtual machine that "executes" the instructions, there may be what is called "just in time" compilation, which is arguably a hydrid between compilation and interpretation. This topic gets very involved; the more you look, the more complexities there are. And none of it means anything without also studying the loading and linking mechanisms for programs and their libraries.

So, define yourselves blue in the face about interpreters vs. compilers; it won't be adequate to reflect what is really happening with most programming languages. So as @pbuk said, it's no longer a contemporary subject of much debate in comp. sci. circles.
 
  • #58
fresh_42 said:
The number and kind of clients affects the availability of the necessary software, especially for interpreted code.
Oh you mean 'clients' in the sense of 'customers', I was thinking of devices connected to a server. Yes of course you need to distribute packages that your customers can install and use, and if you are targeting Windows or iPhones or Android this means executable code compiled for the platform - but the most widely installed customer platform today is web browsers running the interpreted language JavaScript! In the mid 80s consumer market in the UK it meant Spectrum BASIC (also interpreted). So I don't think there is a difference of principle here, just transient market factors.

fresh_42 said:
The frequency of changes affects the need for compiled distributions.
How is this different whether you are distributing a new C# runtime or a bundle of Python scripts? They still need to be loaded onto the target hardware.

fresh_42 said:
The amount of data is directly related to response times, and hence to where the needed data are provided, which depends on the system.
Sorry, I still don't understand this point.
 
  • #59
pbuk said:
Oh you mean 'clients' in the sense of 'customers'
I did not.

pbuk said:
I was thinking of devices connected to a server.
So did I.

In organically grown network structures, or on remote offline clients, you can run an executable but not necessarily code that must be interpreted, or vice versa.

pbuk said:
How is this different whether you are distributing a new C# runtime or a bundle of Python scripts? They still need to be loaded onto the target hardware.
Have you ever updated server software and changed an index.html? If so, then you know the difference. And if databases are involved, it's even worse.

pbuk said:
Sorry, I still don't understand this point.
A compiled program gives you control over the data; interpreted code does not necessarily.
 
  • #60
opus said:
Interpreters take the source code, translate it to machine code, and execute each statement immediately, one statement at a time.

This is not necessarily true. In Python, for example, the interpreter first "compiles" the source code to byte code, and then executes the byte code; the byte code can be thought of as "machine code" for a virtual machine implemented by the interpreter, but this type of "machine" is many levels removed from the actual hardware machine. Also, if the interpreter is executing a source code file, it does not execute the file one statement at a time; it compiles the entire file to byte code and then executes the byte code.

If you are using the Python interpreter interactively, in the REPL, then each statement or expression you enter into the REPL is compiled to byte code and executed immediately, yes. But even then the byte code is not the same as hardware machine code, as above.
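
You can inspect that byte code directly with the standard-library dis module (assuming CPython):

Python:
import dis

def add_one(x):
    return x + 1

# Show the byte code CPython compiled the function to; these are
# instructions for CPython's virtual machine, not for the CPU.
dis.dis(add_one)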
 
