How is compiler software compiled?

Bararontok · Jul 17, 2012

If a compiler using a particular programming language is used to compile software and convert it from text files to executable application files, then how is the source code written for the compiler itself compiled so that it will have its own executable files?

phinds · Jul 17, 2012

One way is a process called "bootstrapping". A modest-sized compiler is written in assembly language, and that modest compiler is then used to compile the next, more capable version.

That works well for relatively low-level high-level languages such as C, which are themselves relatively close to the machine. For really abstract languages such as APL, the compiler would be written in a low-level high-level language, generally C (or it could be written in assembly, but what a pain that would be).

jtbell · Jul 17, 2012

There's also cross-compilation: you compile the source code on machine A, but you make the compiler produce machine code for machine B. Then copy the compiled program from machine A to machine B and execute it there. If this program is itself a compiler, then you don't need machine A any more.

Bararontok · Jul 17, 2012

So if the compilers are made in assembly language which the thread originator has actually used to compile programming languages and assembly programs, how is the assembly language compiler itself compiled?

phinds · Jul 17, 2012

Bararontok said:

So if the compilers are made in assembly language which the thread originator has actually used to compile programming languages and assembly programs, how is the assembly language compiler itself compiled?

You seriously lack understanding of computers. Assemblers are NOT compilers. Computer languages go like this:

machine language (JUST 1's and 0's, no letters)
assembly (one-to-one correspondence w/ machine language but uses letters, such as "add" to represent machine language statements)
compilers ("high-level" languages. A single statement can produce dozens of machine language statements)

and there are interpreters, such as the original BASIC, which are a whole 'nother animal.

EDIT: so when you are at the level of an assembler, there is no such thing as a compiler. Assemblers are written in small form in machine language, then bootstrapped up to a full version.
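To make the one-to-one correspondence concrete, here is a minimal sketch in Python of what an assembler does. The instruction set is invented for illustration (the mnemonics LOAD, ADD, STORE, HALT and their opcode values are assumptions, not any real CPU), and each instruction is assumed to be one opcode byte followed by one operand byte.

```python
# Hypothetical machine: every instruction is an opcode byte plus an operand byte.
# The opcode values below are made up for this sketch.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

def assemble(source):
    """Translate assembly text into machine-code bytes, one instruction per line."""
    machine_code = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()      # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        mnemonic = parts[0].upper()
        # Operand is optional; base 0 lets int() accept 0x.. and decimal forms.
        operand = int(parts[1], 0) if len(parts) > 1 else 0
        machine_code += bytes([OPCODES[mnemonic], operand & 0xFF])
    return bytes(machine_code)

program = """
    LOAD 0x10    ; accumulator = memory[0x10]
    ADD 0x11     ; accumulator += memory[0x11]
    STORE 0x12   ; memory[0x12] = accumulator
    HALT
"""
print(" ".join(f"{b:02x}" for b in assemble(program)))
```

Each line of assembly becomes exactly one machine instruction, which is why a table lookup is enough here. A compiler, by contrast, has to decide which sequence of instructions a statement should turn into.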

Bararontok · Jul 17, 2012

phinds said:

You seriously lack understanding of computers. Assemblers are NOT compilers.

The thread originator has a Turbo Assembler program that they use to make programs and programming languages. There is confusion with the terminology, but perhaps the term Turbo Assembler is self-explanatory and is an assembler.

phinds said:

Computer languages go like this:

machine language (JUST 1's and 0's, no letters)
assembly (one-to-one correspondence w/ machine language but uses letters, such as "add" to represent machine language statements)
compilers ("high-level" languages. A single statement can produce dozens of machine language statements)

and there are interpreters, such as the original BASIC, which are a whole 'nother animal.

EDIT: so when you are at the level of an assembler, there is no such thing as a compiler. Assemblers are written in small form in machine language, then bootstrapped up to a full version.

So an assembler is used to assemble a compiler while the compiler is used to compile a program. Since the assembler is written in machine language, then the person writing the assembler must know how to translate machine language into human readable language. The thread originator has never made an assembler before and does not know the detailed process.

phinds · Jul 17, 2012

Bararontok said:

The thread originator has a Turbo Assembler program that they use to make programs and programming languages. There is confusion with the terminology, but perhaps the term Turbo Assembler is self-explanatory and is an assembler.
So an assembler is used to assemble a compiler while the compiler is used to compile a program. Since the assembler is written in machine language, then the person writing the assembler must know how to translate machine language into human readable language. The thread originator has never made an assembler before and does not know the detailed process.

Uh ... just out of curiosity, why do you refer to yourself as "the thread originator" rather than "I" ?

Turbo Assembler is, as you might guess from the name, an assembler, not a compiler.

Assemblers are not JUST used to write compilers. Many programs are written in assembler. In fact, in the early days, some computers only HAD an assembler, not a compiler. These days about the only thing written in assembly language is device drivers, but even these are now often written in C.

Yes, the person writing an assembler in machine language must understand machine language.

Bararontok · Jul 17, 2012

phinds said:

Assemblers are not JUST used to write compilers. Many programs are written in assembler. In fact, in the early days, some computers only HAD an assembler, not a compiler. These days about the only thing written in assembly language is device drivers, but even these are now often written in C.

Yes, the person writing an assembler in machine language must understand machine language.

Yes, the thread originator wrote plenty of programs and some programming languages in Turbo Assembler but never made an assembler. It is actually easy to write source codes in assemblers and compilers because the commands are all human readable commands but writing an assembler in machine language must be a very difficult task.

phinds · Jul 17, 2012

Bararontok said:

Yes, the thread originator wrote plenty of programs and some programming languages in Turbo Assembler but never made an assembler. It is actually easy to write source codes in assemblers and compilers because the commands are all human readable commands but writing an assembler in machine language must be a very difficult task.

Uh ... just out of curiosity, why do you refer to yourself as "the thread originator" rather than "I" ?

Yes, writing ANYTHING in machine language is a real pain in the butt.

If you have in fact written, as you say, "programming languages" in Turbo Assembler, and they are not assembler languages, then they MUST be either compilers or interpreters and if you understand computers well enough to write either of those, I cannot understand how it is possible that you do not already understand everything discussed in this thread.

phinds · Jul 17, 2012

Bararontok said:

... It is actually easy to write source codes in assemblers and compilers because ...

I disagree completely. Writing complex math programs in assembler is not even CLOSE to being as easy as it is in compiler languages.

Bararontok · Jul 17, 2012

phinds said:

Uh ... just out of curiosity, why do you refer to yourself as "the thread originator" rather than "I" ?

Yes, writing ANYTHING in machine language is a real pain in the butt.

If you have in fact written, as you say, "programming languages" in Turbo Assembler, and they are not assembler languages, then they MUST be either compilers or interpreters and if you understand computers well enough to write either of those, I cannot understand how it is possible that you do not already understand everything discussed in this thread.

The problem was merely a confusion of terminologies but the thread originator knows how to write the codes using the Turbo Assembler program.

Additionally, the thread originator uses the term thread originator because scientific documents must be impersonal. The term "the researcher" is used by the thread originator when writing scientific research documents.

phinds said:

I disagree completely. Writing complex math programs in assembler is not even CLOSE to being as easy as it is in compiler languages.

The comparison was being made between assemblers and writing in machine language. Of course a very complex program is more difficult to write in an assembler than in a compiler. Some compilers like Turbo C++ even have GUI compatibility so a selector icon can be used to do various operations more quickly than if only typing is used to make the source code.

phinds · Jul 17, 2012

Bararontok said:

The problem was merely a confusion of terminologies but the thread originator knows how to write the codes using the Turbo Assembler program.

Additionally, the thread originator refers to the thread originator because scientific documents must be impersonal. The term "the researcher" is used by the thread originator when writing scientific research documents.

Yeah, but you are not WRITING a scientific document, you are on a forum. Do you see anyone else here referring to themselves that way? This is not a big deal, I was just wondering why you like to sound so ridiculously stilted.

I take it from the first sentence above that you have NOT, as you originally stated, written any "programming languages" ?

phinds · Jul 17, 2012

Bararontok said:

The comparison was being made between assemblers and writing in machine language. Of course a very complex program is more difficult to write in assembler than in a compiler. Some compilers like Turbo C++ even have a GUI so a selector icon can be used to do various operations more quickly than if only typing is used to make the source code.

Yes, I agree w/ you on this. Modern assemblers are MUCH more friendly than the ones in the early days of computing and in any case, assembly is much more friendly than machine language.

Bararontok · Jul 17, 2012

phinds said:

I take it from the first sentence above that you have NOT, as you originally stated, written any "programming languages"?

The programming languages were just exercises and they possessed only very basic functionality. They did not possess the large amount of functions and data that programming languages in widespread use possess.

And making the programming language is not difficult because of the logic or the commands needed to make the programming language but the massive amount of code that has to be written for the many functions that the programming language has to serve.

But to save time in making the programming languages, open source code was copied from various online sources and incorporated into modules in the source code to cut the programming work by a significant percentage.

Bararontok · Jul 17, 2012

In the earlier stages of computing technology, there were computers that used mechanical, electromechanical relays, and vacuum tubes to do computing. For a software to be written into these machines, the switches had to all be adjusted manually and one-by-one to generate the pattern of 1's and 0's that would cause the machine to contain information. In the case of the punch card computers, holes needed to be punched manually into the cards to make it contain data.

But with the advent of transistor-controlled electronic computers, software would be written on the keyboard of a computer that is turned on and saved, initially to a tape drive, when the work is done. But since these computers must initially already have some type of assembler program to make this possible, how is the assembler written into the device memory? Was there some type of simple keyboard that contained only a 0 and a 1, and then these 2 buttons would be pressed and signals would be sent directly to the storage device without passing through more sophisticated hardware? Perhaps the electrical signal generated when either 1 or 0 is pressed would be sent through wires, one going into the write head of a magnetic tape or later a magnetic disk, and the other wire would send the signal to the motor to physically move the tape or disk so that the next bit of data can be written into another sector.

Since there is no editor software to assist in writing the assembler code, it must have been impossible to do the kind of text editing that could be done on assemblers, so that if an error occurred, the entire segment of code needed to be erased so that the process can be repeated again.

chiro · Jul 17, 2012

What happens is that the computer comes with a BIOS and a ROM which have some basic functionality for doing really simple I/O tasks.

On top of this you have an OS which supplies even more functionality which uses the functionality supplied by the BIOS and the hardware ROM.

Once you have a representation of a simple program allowing you to do I/O on all necessary devices like in RAM, the hard-drives, the video devices, and so on, then you can create the necessary memory representation corresponding to a program and one instance of a program is a compiler.

Now you can create a representation of a program on a hardware device like a ROM chip just like it would be represented on a hard-drive and this is what happens when you want to implement some basic BIOS frameworks.

From this point on, as mentioned above it's pretty much boot-strapping and other techniques that are easier to create and maintain.

One thing to be aware of though, is that it's not just converting stuff to machine language: every executable and system library has its own operating system headers and structures that are executed in an OS specific way. The instructions themselves will be in a machine-language, but the OS actually has to prepare the environment for execution and this needs OS specific stuff defined in the executable itself.

The linker (and sometimes the compiler to some degree) adds all this OS stuff to the final executable, which gives an EXE file (it also adds code to check whether Windows is running and other code to prepare the system environment).

jtbell · Jul 17, 2012

When I was an undergraduate in the early 1970s, one of the computers that I worked on was a Digital Equipment PDP-5 which my college had acquired via army surplus. Its main input was via punched paper tape. In order to read a machine-language program from a paper tape, a paper-tape program loader had to be already in the computer's memory. There was no disk drive (hard disk or floppy disk), or any other kind of permanent memory.

When I turned the machine on, the first thing I did was to load the paper-tape program loader into the computer, via switches on the front panel. As I recall (this was forty years ago so some details may not be accurate), there were:

16 switches labeled DATA, for the binary digits of a 16-bit word
16 switches labeled ADDRESS, for the binary digits of a 16-bit address in memory
A button labeled LOAD
A button labeled RUN

The paper-tape loader consisted of 10-15 instructions in machine language, which I had written on a piece of paper and taped on the machine's front panel. To load the program, I had to:

1. Set the DATA switches according to the bits of the first instruction
2. Set the ADDRESS switches to the address of the first instruction
3. Press the LOAD button
4. Repeat steps 1 to 3 for the other instructions

After that, whenever I wanted to read a program from paper tape, I had to do the following:

1. Position the paper tape in the reader
2. Set the ADDRESS switches to the address of the first instruction of the paper-tape loader program
3. Press the RUN button

And then to run the program itself:

1. Set the ADDRESS switches to the beginning address of the program (given in the program's documentation)
2. Press the RUN button

Of course, this depended on the program being loaded not overwriting the paper-tape loader program, either while being loaded, or while being executed. By convention, all programs were written to use memory locations above the presumed location of the paper-tape reader.
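The toggle-in procedure above can be sketched as a small Python simulation. This is only an illustration under stated assumptions: the class name FrontPanel, the memory size, the load address, and the loader words are all placeholders invented here, not the real contents of any loader.

```python
class FrontPanel:
    """Hypothetical model of a front panel with DATA/ADDRESS switches and a LOAD button."""

    def __init__(self, memory_words=4096):
        self.memory = [0] * memory_words
        self.data_switches = 0
        self.address_switches = 0

    def set_data(self, bits):
        self.data_switches = bits

    def set_address(self, bits):
        self.address_switches = bits

    def press_load(self):
        # LOAD deposits the DATA switch value into the word selected by ADDRESS.
        self.memory[self.address_switches] = self.data_switches


def toggle_in(panel, start_address, words):
    """Repeat 'set DATA, set ADDRESS, press LOAD' once per instruction word."""
    for offset, word in enumerate(words):
        panel.set_data(word)
        panel.set_address(start_address + offset)
        panel.press_load()


# Placeholder loader: these octal words are invented, not real machine code.
loader = [0o7300, 0o1234, 0o3456, 0o5000]
panel = FrontPanel()
toggle_in(panel, 0o0020, loader)
print([oct(word) for word in panel.memory[0o0020:0o0024]])
```

The point of the sketch is that no software is involved in the deposit step: the switches and the LOAD button write straight into memory, one word at a time.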

Bararontok · Jul 17, 2012

chiro said:

What happens is that the computer comes with a BIOS and a ROM which have some basic functionality for doing really simple I/O tasks.

So if, in electronic computing, the BIOS and ROM provide the basic I/O tasks to allow the assembler to be written, then how is the BIOS and ROM programmed? Since, as mentioned in the previous post, there was no editor software when the first firmware and assemblers were being made, these codes needed to be written in machine language, and because there was no editor, the whole memory device would have to be erased if an error occurred.

When was the monitor invented? Because before that they needed to use indicator lights and some printing equipment to know the output generated by their inputs and this must have been difficult to do.

jtbell said:

Its main input was via punched paper tape. In order to read a machine-language program from a paper tape, a paper-tape program loader had to be already in the computer's memory.

What kind of memory did the computer use if it was not permanent storage? Was there already a solid state micro-controller chip or ROM chip?

jtbell · Jul 17, 2012

Bararontok said:

What kind of memory did the computer use if it was not permanent storage?

It used magnetic core memory (http://en.wikipedia.org/wiki/Magnetic-core_memory), which retained data only while the computer's power supply was on. For persistent storage we had paper tape and a magnetic tape drive. My big project was writing a program to store and load programs using the magnetic tape drive. In other words, a very crude operating system!

First I had to load my "magtape program" in the usual way from paper tape. When I ran it, it more or less asked me, "which program do you want to run?" If I entered "5", it would skip to the fifth program stored on the magtape, load it, and run it.

Bararontok · Jul 17, 2012

So the tape was operated directly by hardware circuits that were controlled by pressing the buttons. So the circuitry basically contained the functions that the buttons operated.

Mark44 · Jul 18, 2012

Bararontok said:

Additionally, the thread originator uses the term thread originator because scientific documents must be impersonal. The term "the researcher" is used by the thread originator when writing scientific research documents.

phinds said:

Yeah, but you are not WRITING a scientific document, you are on a forum. Do you see anyone else here referring to themselves that way? This is not a big deal, I was just wondering why you like to sound so ridiculously stilted.

I agree with phinds - posting on a forum is NOT writing a scientific document. In addition, it is confusing to refer to yourself as the "thread originator," as it seems that you are referring to some other person.

rcgldr · Jul 18, 2012

jtbell said:

It used magnetic core memory ... which retained data only while the computer's power supply was on.

Magnetic core memory retains data without power. Once a bootstrap program was keyed into core memory on a typical minicomputer of the early 1970's, it remained there unless it was wiped out. Some minicomputers like the HP 2100 series intended for the bootstrap loader to reside in the upper 64 words of memory and had a write protect enable / disable feature to keep the loader from getting overwritten by a bad program. The loader initially and occasionally had to be manually entered, but generally once it was entered, the computer could be powered off for several days without losing the loader code.

The first mini I recall was an IBM 1130, first made in 1965. I think it had a hardwired bootstrap loader that would read one punched card that would normally read one sector from the disk drive in order to boot up.

I also recall an early machine called a Monrobot, with a drum memory. The bootstrap loader on that machine used a music-box-like drum with fixed pins to mechanically toggle the switches that loaded in the bootstrap code.

Bill Gates and Paul Allen developed a demo version of a BASIC interpreter for the Altair 8800 (Intel 8080 CPU) using an emulator on a PDP-10. They made a paper tape binary for the demo, but had to create and enter the bootstrap loader for the demo. Wiki article:

http://en.wikipedia.org/wiki/Altair_BASIC

Bararontok said:

So if, in electronic computing, the BIOS and ROM provide the basic I/O tasks to allow the assembler to be written, then how is the BIOS and ROM programmed?

The PC used an Intel 8088, which was predated by other systems that also used the 8088, so the assemblers already existed. The binary output of the assembler was programmed into PROM's for the initial testing, then later the binary output would be used to generate the ROM used on a PC. In addition to the boot code and some device I/O code, there was also a ROM-based BASIC in some early PCs.

Bararontok said:

When was the monitor invented?

CRT's were invented back around 1900, with the first "conventional" type CRT being invented around 1922. In the early days of computers, hard copy terminals like teletypes were used as monitors on typical computers. Some CRT's were used for vector graphics, and some high-end computers in the 1960's had CRT type monitors (monochrome) with text (and graphics).

Prior to PC's, monitors were often standalone devices with their own keyboards that communicated with a computer via RS 232. Some early versions of these emulated the hard copy teletypes that they were meant to replace. There were also monitors like the IBM 3270, usually with multiple monitors per location, each of which used coaxial cable to connect to a multiplexor that communicated to a mainframe computer via a modem.

Bararontok said:

What kind of memory did the computer use if it was not permanent storage?

Non-permanent storage on early computers included vacuum tubes and a special type of CRT (although never used much). Core memory was permanent. Transistor-based RAM is not permanent, but ROM, PROM, EEPROM, flash, and similar types of memory are considered permanent.

Bararontok · Jul 18, 2012

rcgldr said:

The PC used an Intel 8088, which was predated by other systems that also used the 8088, so the assemblers already existed. The binary output of the assembler was programmed into PROM's for the initial testing, then later the binary output would be used to generate the ROM used on a PC. In addition to the boot code and some device I/O code, there was also a ROM-based BASIC in some early PCs.

How is the binary code entered into the PROM's? Before the advent of text editor software, were the signals generated by the key presses just directly sent to the PROM's without any output being shown on a monitor or passing through additional hardware such as the RAM cards and processors?

rcgldr · Jul 18, 2012

Bararontok said:

How is the binary code entered into the PROM's?

Depends on the PROM burner, but usually these will use some type of common interface, such as RS 232, found on just about any type of computer. The binary data is sent in some common format, such as Intel hex format. Wiki article:

http://en.wikipedia.org/wiki/Intel_HEX
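As a rough illustration of what gets sent to the burner, here is a minimal Python sketch that turns a binary image into Intel HEX text. The function names are made up for this example, and it assumes the image fits in a 16-bit address space, so it only emits data records (type 00) and the end-of-file record (type 01), not the extended address records.

```python
def intel_hex_record(address, record_type, data=b""):
    """Build one record: ':' + byte count + 16-bit address + type + data + checksum."""
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF   # two's complement of the sum of all record bytes
    return ":" + (body + bytes([checksum])).hex().upper()


def to_intel_hex(image, start_address=0, bytes_per_record=16):
    """Split a binary image into data records and finish with an end-of-file record."""
    lines = []
    for offset in range(0, len(image), bytes_per_record):
        chunk = image[offset:offset + bytes_per_record]
        lines.append(intel_hex_record(start_address + offset, 0x00, chunk))
    lines.append(intel_hex_record(0, 0x01))
    return "\n".join(lines)


# Example image: eight arbitrary bytes standing in for assembler output.
print(to_intel_hex(bytes([0x01, 0x10, 0x02, 0x11, 0x03, 0x12, 0xFF, 0x00])))
```

Because every record is plain ASCII with its own checksum, it can be sent over a serial line and the receiving end can detect a corrupted record before programming anything.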

Bararontok said:

Before the advent of text editor software ...

Text editors, computers, assemblers, and compilers, all predate prom burners, so this wasn't an issue. Some prom burner makers would supply software to convert a pure binary image into Intel hex format (or SREC format or whatever the prom burner used), for common computers, such as the CP/M systems that predate PC's.

Bararontok · Jul 18, 2012

rcgldr said:

Depends on the PROM burner, but usually these will use some type of common interface, such as RS 232, found on just about any type of computer. The binary data is sent in some common format, such as Intel hex format. Wiki article:

http://en.wikipedia.org/wiki/Intel_HEX

How does the burner work? Is there a cable connecting the interface of the PROM directly to a keyboard?

And interestingly, in the modern times, since there are already many copies of assemblers, the memory devices that store them are probably just programmed automatically by data copied from a storage database so that the manual effort is eliminated.

rcgldr · Jul 18, 2012

Bararontok said:

How does the burner work? Is there a cable connecting the interface of the PROM directly to a keyboard?

Not normally. The companies that make typical prom burners expect the users to have computers that can send data via RS 232 cable in order to transfer data to the prom burner. The prom burner itself may only have one or two buttons used to start the programming or verification of a prom. A minimal prom burner would just have a RS 232 interface and the prom socket, relying on commands sent via RS 232 to start programming or verifying a prom. It would be possible to use an old ascii terminal with an RS 232 interface to manually send data to a prom burner, but it wouldn't be practical.

Bararontok said:

in the modern times ... so that the manual effort is eliminated.

I'm not sure what type of devices you're mentioning here, but other than a hobby, there's no point in going back to toggling switches to enter machine code on some crude computer system.

Going back to an earlier post:

Bararontok said:

In the earlier stages of computing technology, there were computers that used mechanical, electromechanical relays, and vacuum tubes to do computing. For a software to be written into these machines, the switches had to all be adjusted manually and one-by-one to generate the pattern of 1's and 0's that would cause the machine to contain information.

The ENIAC initially had to be manually programmed for specific tasks, but during development of the ENIAC, the idea of using storage for both data and program was already considered, and it was included in the EDVAC, EDSAC, and eventually an improved version of the ENIAC. Note that punched card readers and writers already existed before the ENIAC, and they were used for input and output. The punched cards that were output could be read and then printed on line printers on other early data processing systems. Wiki articles:

http://en.wikipedia.org/wiki/ENIAC

http://en.wikipedia.org/wiki/IBM_405

The last remnants of manual programming would be plugboard programming used on early data processing machines, and for portions of the programming of early computers like the ENIAC. Wiki article:

http://en.wikipedia.org/wiki/Plugboard

Once computers were being programmed in assembly or higher level languages, the utility program FARGO, and the programming language RPG were used to help with the transition from plugboards to compiled language. Wiki articles:

http://en.wikipedia.org/wiki/FARGO_(programming_language)

http://en.wikipedia.org/wiki/RPG_programming_language

jackmell · Jul 18, 2012

Not too hard to hand-compile code (using paper and pencil). Take for example a Do-loop to add a set of numbers. First code it in say C, then convert it to assembly, then convert the assembly to machine code. If you're a bit-head that's fun, and when you go through an assembly-language course, you'll get used to converting assembly code to machine code and actually building a program by hand by stuffing hex numbers in a stack so that, when read by an instruction counter, they execute as code. So if I can do that for a Do-loop, I can then do that for more code, and more code, and then it's not a stretch to see how the creation of code which creates code can be done first by hand, then gradually converted to code which creates code by the execution of the code itself.
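A small sketch of that idea in Python: a loop that adds 4 + 3 + 2 + 1 is hand-assembled into hex numbers and then executed by a fetch-decode-execute loop driven by an instruction counter. The machine is invented for this example (the opcode values, the two-byte instruction layout, and the memory addresses are all assumptions), so treat it as an illustration rather than any real instruction set.

```python
def run(memory):
    """Execute hand-assembled code on a made-up accumulator machine.

    Invented opcodes: 01 LOAD, 02 ADD, 03 STORE, 04 SUB, 05 JNZ, FF HALT.
    Every instruction is an opcode byte followed by an address byte.
    """
    acc = 0
    pc = 0                                   # the instruction counter
    while True:
        opcode, operand = memory[pc], memory[pc + 1]
        pc += 2
        if opcode == 0x01:                   # LOAD: acc = memory[operand]
            acc = memory[operand]
        elif opcode == 0x02:                 # ADD: acc += memory[operand]
            acc = (acc + memory[operand]) & 0xFF
        elif opcode == 0x03:                 # STORE: memory[operand] = acc
            memory[operand] = acc
        elif opcode == 0x04:                 # SUB: acc -= memory[operand]
            acc = (acc - memory[operand]) & 0xFF
        elif opcode == 0x05:                 # JNZ: jump to operand if acc != 0
            if acc != 0:
                pc = operand
        elif opcode == 0xFF:                 # HALT
            return memory
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")


# Hand-assembled loop: total += counter; counter -= 1; repeat until counter == 0.
machine_code = [
    0x01, 0x21,   # 00: LOAD  total
    0x02, 0x20,   # 02: ADD   counter
    0x03, 0x21,   # 04: STORE total
    0x01, 0x20,   # 06: LOAD  counter
    0x04, 0x22,   # 08: SUB   one
    0x03, 0x20,   # 0A: STORE counter
    0x05, 0x00,   # 0C: JNZ   00
    0xFF, 0x00,   # 0E: HALT
]
# Data lives at 0x20 (counter = 4), 0x21 (total = 0), 0x22 (constant 1).
memory = machine_code + [0] * (0x20 - len(machine_code)) + [4, 0, 1]
print(run(memory)[0x21])
```

The list of hex numbers is exactly what would be written on paper after hand-assembling; the `run` function only plays the role of the hardware that reads them back.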

Bararontok · Jul 18, 2012

So basically computer technology has reached the point where programs are used to make programs in order to make programming easier and more automated. In fact Visual Basic and robotics programming applications even have pre-programmed modules stored in a library and all that needs to be done is to drag the icons that serve as command buttons into the screen and click the compile button to enable the use of the button activated codes.

jtbell · Jul 18, 2012

rcgldr said:

Magnetic core memory retains data without power.

You're right. I guess I didn't have to re-enter the paper tape loader as often as I thought I remembered. I did have to do it often enough to have that piece of paper with machine language code (written out in octal for ease of reading) taped next to the switches. Probably my programs ran amuck sometimes and wiped out the loader. :tongue:

Dickfore · Jul 18, 2012

ITT, the OP attempts to reproduce the "chicken-egg" philosophical problem in Computer Science. He pretends he does not realize that there are, and were different computer architectures and software throughout the history, and newer ones were first written on older ones.

uart · Jul 18, 2012

jtbell said:

You're right. I guess I didn't have to re-enter the paper tape loader as often as I thought I remembered. I did have to do it often enough to have that piece of paper with machine language code (written out in octal for ease of reading) taped next to the switches. Probably my programs ran amuck sometimes and wiped out the loader. :tongue:

A friend of mine who used to manage the computers in the EE dept of my old university recounted almost exactly the same experience of manually entering the tape boot-loader code (almost word for word actually). So I tend to believe that your memory of this is quite sound. :)

Personally I have no first hand knowledge of this, but here is my hunch: If say the typical up-time of the computer was many days (or even weeks) then it might have been considered thoroughly worthwhile to trade-off 10-15 minutes of your time each re-boot (in manually entering the boot loader code) for the advantage of having that small extra handful of bytes available for user programs. Considering the extremely limited available memory of these early computers, then this does actually make sense.
