How are programming languages created?


by pairofstrings
Tags: languages, programming
pairofstrings
#1
Nov12-11, 03:26 PM
P: 192
If we write a better compiler, does it mean that the programming language is getting better?
Ivan92
#2
Nov12-11, 03:32 PM
P: 181
All a compiler does is check the code for syntax. If there are errors, it should alert you about the errors. If syntax is correct, it does not necessarily make the program run the way it should. In other words, building a better compiler doesn't mean the programming language is better.
MisterX
#3
Nov12-11, 05:35 PM
P: 541
Quote by Ivan92:
All a compiler does is check the code for syntax.

No, a compiler translates from a source language to a target language. The latter is often machine code or an intermediate language such as Java bytecode.

pairofstrings, programming languages are often documented first, and then a compiler is implemented. If the compiler is better because it is closer to what is specified by the document(s), then we would not say the language has changed, but rather that the compiler is more compliant with the language specification. The compiler might also be better because it produces code that runs faster or is more compact.

We would probably only claim the language has been revised if there were new specifying documents.

TylerH
#4
Nov12-11, 05:37 PM
P: 737

"better" is too ambiguous to mean anything. A compiler can be "better" in many ways: faster, more memory efficient, faster object code, size-optimized object code, etc. A programming language can be more expressive, have more features, etc.
phinds
#5
Nov12-11, 06:02 PM
PF Gold
P: 5,720
To basically just restate what's already been said (we ALL think our own way of saying things is superior ... sorry about that), a compiler is just a tool to implement a language and does NOTHING to make the language better or worse (it might make how the language WORKS better or worse, but that is in the implementation, not in the language).

To make a language work better can be a VERY good thing, so good compilers are important, but it's even more important to have languages that are helpful in allowing humans to construct algorithms that get done whatever it is they need to have done.

The progression of languages from machine language to assemblers, and then to interpreters and compilers, has been an evolution of languages, not of implementation (which necessarily followed along to implement the improved languages).

So, based on what I just said, I'd say no, better compilers don't have anything to do with making better languages, but they can have a lot to do with better implementations of languages, which I see as a different thing.
chiro
#6
Nov12-11, 11:41 PM
P: 4,570
Quote by pairofstrings:
If we write a better compiler, does it mean that the programming language is getting better?
While new compilers do have a habit of adding new "language" features (of which some are non-standard features specific to the compiler, like say compiler directives), many updates are often optimizations that are used to overcome existing compiler limits, or to better optimize the compiled results in terms of generating better output.

Also, you should be aware that different languages are created for different purposes; C, BASIC, Java, and FORTRAN are examples. BASIC is good for running toy programs or models where speed is not an issue. C, on the other hand, closely resembles assembler, although it is a lot easier to read (and you can even embed assembler code and use it from your C program under some environments).

Consequently there are languages that solve optimization problems, list processing languages (like LISP), and many others.

Again the different languages are built for (a) specific purpose(s) and when you understand those purposes it becomes a lot easier to understand the language and appreciate it. If someone has a purpose and needs a new language (or a substantial extension to an existing one), then one is usually created.
rcgldr
#7
Nov13-11, 12:48 AM
HW Helper
P: 6,931
Your title asks one question, then your post asks a different question.

Quote by pairofstrings:
How are programming languages created?
As mentioned above, one or more people create some type of document describing the language. A document describing a programming language and a compiler or interpreter for that programming language could be developed somewhat concurrently, but I don't know if this actually occurred for any specific programming language.

Quote by pairofstrings:
If we write a better compiler, does it mean that the programming language is getting better?
As mentioned above, it depends on what you mean by "better". The language isn't changed by the compiler, but part of the choice of which programming language is best for a particular computer and program may depend on the quality of the compilers and the corresponding size or speed of the code generated by the compilers that are available for that particular computer.
pairofstrings
#8
Nov13-11, 12:49 AM
P: 192
Question 1:

In language 1 if the instruction goes like this : add x y
In language 2 if the instruction goes like this : x add y

My understanding is that we need two different compilers that can understand the instructions of language 1 and language 2. Am I right? I mean, we cannot have the same compiler compiling instructions of both Java and C, or instructions for different operating systems. Right?

Question 2:

If we have 'C' language code like this:

#include <stdio.h>
int main(void) {
    int a = 1, b = 2;
    int c = a + b;
    printf("%d\n", c);
    return 0;
}
When the above code is compiled, it might look like this (I'm not familiar with machine language):
11111000 stdio.h
110011()
10101010 01 10
001100(01)
Is it object code or machine code? I think of it as object code, because machine code has to be only 0's and 1's. There shouldn't be any stdio.h in the code. True?

Now.
Question 3:

If you look at the main() statement above, when it was compiled it got converted into 11111000. Similarly, the other statements also got converted into 0's and 1's.
And every time I compile the same statements in the 'C' program above, I will get exactly the same object code (in other languages, exactly the same intermediate code). Right?
OK, that means the compiler has to know that each statement should be represented by one particular binary number. In this case, the main() statement is represented as 11111000. That means the compiler has to remember which statement has to be converted to which binary number.
That means if we are able to make/program a compiler, then we are able to create a new programming language, or make changes (adding new features, as chiro said) to existing ones. Is that right? That is my understanding. Help please.

Question 4:
How do you program/make a compiler? I know it has evolved, as phinds said, but can you tell me whether we use a language or something else in its creation?
Thanks.
rcgldr
#9
Nov13-11, 01:21 AM
HW Helper
P: 6,931
Quote by pairofstrings:
Question 1:
My understanding is that we need two different compilers that can understand the instructions of language 1 and language 2. Am I right? I mean, we cannot have the same compiler compiling instructions of both Java and C, or instructions for different operating systems.
A single program may be able to compile two similar programming languages, but this would be unusual. Some compilers are able to produce more than one type of machine code; for example, Visual Studio can produce 32-bit or 64-bit code. Going back to the 1960s, COBOL has an "environment division" which includes a source computer (the computer the program is compiled on) and an object computer (the computer the program is to be run on), but I'm not sure how many actual implementations supported multiple computers.
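
To make the Question 1 point concrete, here is a minimal sketch, assuming two made-up toy languages whose only statement is "add x y" (language 1) or "x add y" (language 2). Each syntax needs its own parser, but both parsers can feed the same code generator. The names parse_prefix, parse_infix, emit_add, and the LOAD/ADD instructions are all invented for illustration and do not come from any real compiler.

```c
#include <stdio.h>
#include <string.h>

/* Shared internal form: one addition of two named operands. */
typedef struct {
    char lhs[16];
    char rhs[16];
} AddNode;

/* Front end for toy language 1: "add x y". Returns 1 on success. */
int parse_prefix(const char *src, AddNode *node) {
    char op[16];
    if (sscanf(src, "%15s %15s %15s", op, node->lhs, node->rhs) != 3)
        return 0;
    return strcmp(op, "add") == 0;
}

/* Front end for toy language 2: "x add y". Returns 1 on success. */
int parse_infix(const char *src, AddNode *node) {
    char op[16];
    if (sscanf(src, "%15s %15s %15s", node->lhs, op, node->rhs) != 3)
        return 0;
    return strcmp(op, "add") == 0;
}

/* Shared back end: writes made-up stack-machine instructions. */
void emit_add(const AddNode *node, char *out, size_t out_size) {
    snprintf(out, out_size, "LOAD %s\nLOAD %s\nADD\n", node->lhs, node->rhs);
}
```

Calling parse_prefix on "add x y" and parse_infix on "x add y" fills in the same AddNode, so emit_add produces identical output for both. Whether you call that one compiler with two front ends or two compilers sharing a back end is mostly a matter of packaging.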


Quote by pairofstrings:
Question 2, 3:
If we have 'C' language code like this:

#include <stdio.h>
int main(void) {
    int a = 1, b = 2;
    int c = a + b;
    printf("%d\n", c);
    return 0;
}
The issue here is that the C compiler will include machine code to call main(), and also all of the code required for printf(). It will end up being much larger than the minimal machine code required to read two numbers from memory, add them, and store the result in memory.

Quote by pairofstrings:
Question 3, 4:
How to program/make a compiler?
This can get tricky. An initial version of a compiler has to be written in some other language, perhaps a machine-level language like assembly, or perhaps some other high-level programming language. Alternatively, a working compiler for one machine is modified to produce code for another machine, or an emulator is created on the other machine and used to emulate the machine the compiler currently works on. Once the initial version of a compiler is working, the compiler may be rewritten and updated in its own language.
MisterX
#10
Nov13-11, 01:09 PM
P: 541
Quote by pairofstrings:

Question 2:

If we have 'C' language code like this:

#include <stdio.h>
int main(void) {
    int a = 1, b = 2;
    int c = a + b;
    printf("%d\n", c);
    return 0;
}
When the above code is compiled, it might look like this (I'm not familiar with machine language):
11111000 stdio.h
110011()
10101010 01 10
001100(01)
Is it object code or machine code? I think of it as object code, because machine code has to be only 0's and 1's. There shouldn't be any stdio.h in the code. True?
You do not understand what the #include directive does. Essentially, #include <stdio.h> results in the contents of the file stdio.h being pasted in place of the #include line, before compilation. As such, we would not expect to see "stdio.h" in the object file.
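
As a rough illustration of that "pasting", here is a minimal sketch of textual inclusion. It is not how a real C preprocessor is implemented (a real one also handles macros, conditionals, and search paths); the header lives in an in-memory table instead of a file, and the names expand_includes, find_header, and toy_io.h are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for header files on disk: name -> contents. */
typedef struct {
    const char *name;
    const char *text;
} Header;

static const Header headers[] = {
    { "toy_io.h", "int toy_print(int value);\n" },
};

static const char *find_header(const char *name) {
    for (size_t i = 0; i < sizeof headers / sizeof headers[0]; i++)
        if (strcmp(headers[i].name, name) == 0)
            return headers[i].text;
    return NULL;
}

/* Copies src to out, replacing each line that starts with
   #include <name> by the text of that header.
   Returns 0 on success, -1 on an unknown header or overflow. */
int expand_includes(const char *src, char *out, size_t out_size) {
    size_t used = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    while (*src) {
        const char *eol = strchr(src, '\n');
        size_t len = eol ? (size_t)(eol - src) + 1 : strlen(src);
        const char *piece = src;   /* default: copy the line unchanged */
        size_t piece_len = len;
        char name[64];
        if (sscanf(src, "# include <%63[^>]>", name) == 1) {
            piece = find_header(name);
            if (!piece) return -1;
            piece_len = strlen(piece);
        }
        if (used + piece_len + 1 > out_size) return -1;
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        out[used] = '\0';
        src += len;
    }
    return 0;
}
```

After expansion, the text handed to the compiler proper contains the declaration of toy_print but no trace of the name toy_io.h, which is why the header's file name would not be expected to show up in the compiled output.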

The object files will typically be the output of the compiler. These object files may contain machine code, as well as other things such as object file headers and data sections.


Quote by pairofstrings:
Question 3:

If you look at the main() statement above, when it was compiled it got converted into 11111000. Similarly, the other statements also got converted into 0's and 1's.
And every time I compile the same statements in the 'C' program above, I will get exactly the same object code (in other languages, exactly the same intermediate code). Right?
OK, that means the compiler has to know that each statement should be represented by one particular binary number. In this case, the main() statement is represented as 11111000. That means the compiler has to remember which statement has to be converted to which binary number.
That means if we are able to make/program a compiler, then we are able to create a new programming language, or make changes (adding new features, as chiro said) to existing ones. Is that right? That is my understanding. Help please.
It's not as if statements translate into binary numbers with typical implementations of a high level language such as C. It's considerably more complicated than that.
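
One small reason a fixed "statement to bit pattern" table cannot work: the output for the same kind of statement depends on its operands and context. The sketch below is only an illustration under invented assumptions. The function compile_add and the PUSH/LOAD/ADD/STORE instructions are made up; it "compiles" c = lhs + rhs, and when both operands are integer literals it does the addition at compile time (constant folding) instead of emitting an ADD.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if s is a non-empty string of decimal digits. */
static int is_literal(const char *s) {
    if (*s == '\0') return 0;
    for (; *s; s++)
        if (!isdigit((unsigned char)*s)) return 0;
    return 1;
}

/* Writes made-up instructions for the statement  c = lhs + rhs;  */
void compile_add(const char *lhs, const char *rhs,
                 char *out, size_t out_size) {
    if (is_literal(lhs) && is_literal(rhs)) {
        /* Both values are known at compile time: add them now.
           (Overflow of very large literals is ignored in this sketch.) */
        long sum = strtol(lhs, NULL, 10) + strtol(rhs, NULL, 10);
        snprintf(out, out_size, "PUSH %ld\nSTORE c\n", sum);
    } else {
        /* At least one value is only known at run time. */
        snprintf(out, out_size, "%s %s\n%s %s\nADD\nSTORE c\n",
                 is_literal(lhs) ? "PUSH" : "LOAD", lhs,
                 is_literal(rhs) ? "PUSH" : "LOAD", rhs);
    }
}
```

So c = a + b and c = 2 + 3 are the same kind of statement in the source, yet they produce different instruction sequences of different lengths. Real compilers make many decisions of this sort (register allocation, instruction selection, optimization level), which is part of why the translation is considerably more complicated than a lookup.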
pairofstrings
#11
Nov13-11, 01:18 PM
P: 192
Quote by MisterX:
It's not as if statements translate into binary numbers with typical implementations of a high level language such as C. It's considerably more complicated than that.
I agree that translation could be more complicated. But by your first sentence do you mean that the compilers are made using 'C' language?
MisterX
#12
Nov13-11, 01:18 PM
P: 541
Quote by rcgldr:
The issue here is that the C compiler will include machine code to call main(), and also all of the code required for printf().
This is implementation specific, but in many cases, the entire code for printf is not included by the compiler. Instead, printf is part of a linked library. For example, with Linux this may be "libc" and with Windows this may be one of the "C runtime libraries."
MisterX
#13
Nov13-11, 01:20 PM
P: 541
Quote by pairofstrings:
I agree that translation could be more complicated. But by your first sentence do you mean that the compilers are made using 'C' language?
No, that's not what I meant. But I'm sure compilers have been made using the C language, such as gcc, which was "written primarily in C".
rcgldr
#14
Nov13-11, 01:42 PM
HW Helper
P: 6,931
Quote by rcgldr:
The issue here is that the C compiler will include machine code to call main(), and also all of the code required for printf().
Quote by MisterX:
This is implementation specific, but in many cases, the entire code for printf is not included by the compiler.
It wasn't clear to me if the original poster was asking about object modules which include external links to be resolved by the linker, or was asking about executables, which would include library code (or overlay handlers). The compiler would at least need to generate the code required to call printf().
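
The difference between an object module and a linked executable can be sketched roughly as follows. This is a toy model, not any real object file format: the "code" is an array of ints, the call to printf holds a placeholder where the address should go, and a relocation entry records which slot to patch. The types Symbol and Relocation, the function link_module, and all opcodes and addresses are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* A library symbol and the (made-up) address where its code lives. */
typedef struct {
    const char *name;
    int address;
} Symbol;

/* Means: patch code[offset] with the address of symbol `name`. */
typedef struct {
    const char *name;
    size_t offset;
} Relocation;

/* Fills in every external reference it can find in the library.
   Returns the number of symbols that could not be resolved. */
int link_module(int *code, const Relocation *relocs, size_t n_relocs,
                const Symbol *lib, size_t n_lib) {
    int unresolved = 0;
    for (size_t r = 0; r < n_relocs; r++) {
        int found = 0;
        for (size_t s = 0; s < n_lib; s++) {
            if (strcmp(relocs[r].name, lib[s].name) == 0) {
                code[relocs[r].offset] = lib[s].address;
                found = 1;
                break;
            }
        }
        if (!found) unresolved++;
    }
    return unresolved;
}
```

In this picture the compiler's job ends with producing the code array plus the relocation list, which is the "code required to call printf()" with the address left open. The linker's job is the patching step, and a nonzero return value corresponds to the familiar "undefined reference" kind of link error.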

Quote by MisterX:
No, that's not what I meant. But I'm sure compilers have been made using the C language, such as gcc, which was "written primarily in C".
How was the initial gcc compiler created? You'd need an existing C compiler in order to compile C code. I mentioned this above, that this is either done by cross compiling from another machine, or by creating the initial version of a compiler in a language already supported by the target machine.

As an early example, Altair Basic's roots go back to an 8008/8080 emulator that ran on a PDP-10. The paper tape loader mentioned in the wiki article had to be toggled into memory using the Altair's front panel:

http://en.wikipedia.org/wiki/Altair_BASIC
pairofstrings
#15
Nov14-11, 05:20 AM
P: 192
Quote by rcgldr:
How was the initial gcc compiler created? You'd need an existing C compiler in order to compile C code. I mentioned this above, that this is either done by cross compiling from another machine, or by creating the initial version of a compiler in a language already supported by the target machine.
Can you give me a rough idea of the difference between cross compilers and bootstrapping?
I am new to computer science.
rcgldr
#16
Nov14-11, 06:28 AM
HW Helper
P: 6,931
Quote by pairofstrings:
cross compilers
As a modern example of a cross compiler, note that ARM processors are often embedded into the chips that go into consumer devices and computer peripherals:

http://en.wikipedia.org/wiki/ARM_architecture

Programmers can get an ARM toolset that runs on Wintel systems (Windows running on Intel processors). This includes a compiler, linker, emulator (which includes its own debugger), and a debugger interface for the actual ARM processor using its JTAG interface. Since the compiler and linker run on an Intel processor but produce code for the ARM processor, that would be an example of a cross compiler.
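
The "runs here, produces code for there" idea can be sketched in a few lines. This is only an illustration, not taken from any real toolset: the function below executes on whatever machine it was built for, but the text it writes is assembly for the selected target. The Target type and emit_add_for are hypothetical names, and the register choices are arbitrary.

```c
#include <stdio.h>

typedef enum { TARGET_X86, TARGET_ARM } Target;

/* Writes one "add two registers" instruction in the target's
   assembly syntax. Returns 0 on success, -1 for an unknown target. */
int emit_add_for(Target target, char *out, size_t out_size) {
    switch (target) {
    case TARGET_X86:
        /* x86, Intel syntax: eax = eax + ebx */
        snprintf(out, out_size, "add eax, ebx\n");
        return 0;
    case TARGET_ARM:
        /* 32-bit ARM syntax: r0 = r0 + r1 */
        snprintf(out, out_size, "add r0, r0, r1\n");
        return 0;
    }
    return -1;
}
```

If this function is compiled and run on an Intel PC and called with TARGET_ARM, the program doing the emitting is Intel machine code while its output is meant for an ARM processor, which is the cross-compiler situation described above.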

Quote by pairofstrings:
bootstrapping
Boot: for a PC, this is done by the BIOS, which is stored in some type of PROM. For some early mini and micro computers, the bootstrap program had to be manually entered via toggle switches on the front panel. One clever idea used in the ancient Monrobot mini computer was like a music box, except the pins on the drum were used to toggle switches to enter the bootstrap program (in this case to read and then run a program from paper tape). Wiki article:

http://en.wikipedia.org/wiki/Booting

There's also a more general usage of the term bootstrapping used in the computer industry of developing a more complex environment from a simpler one. The wiki article mentions this here:

http://en.wikipedia.org/wiki/Bootstr..._bootstrapping
D H
#17
Nov14-11, 07:06 AM
Mentor
P: 14,479
Mentor comment:
The questions raised in this thread fall into the category of questions that are beyond the scope of an internet forum.


pairofstrings: While you do not realize this, what you are doing is asking us to write multiple books just for you, teach multiple classes in computer science just to you.

Language theory and compiler design are upper level undergraduate / lower level graduate classes in college. It takes a long time, many classes, many books, to get from the "I am new to computer science" stage to the stage where a fair answer can be given to your questions on language theory and compiler design.

What you'll get by reading wikipedia is an apparent hodgepodge of nearly incomprehensible stuff. Note: I am not disparaging wikipedia here. It is an encyclopedia; this is a generic problem with encyclopedias. Multiple books are needed to answer these questions fairly and comprehensively. An encyclopedia cannot do full justice to such questions. Nor can an internet forum.


To those helping pairofstrings: Your work so far is commendable. Continue helping if you wish. However, don't be afraid to provide the short and sweet "Don't ask me to write a book" answer if it looks like you aren't helping or if answering the question at hand would indeed require you to write a book.
pairofstrings
#18
Nov19-11, 12:44 PM
P: 192
Can anyone suggest some books that could help me grasp this subject? I already have one book on compilers: Compilers: Principles, Techniques, and Tools.
I want a book that explains how the machine code produced by a compiler interacts with the hardware in a computer, such as a microprocessor.

