How is compiler software compiled?

  1. Jul 17, 2012 #1
    If a compiler using a particular programming language is used to compile software and convert it from text files to executable application files, then how is the source code written for the compiler itself compiled so that it will have its own executable files?
     
  2. jcsd
  3. Jul 17, 2012 #2

    phinds

    Gold Member
    2016 Award

    One way is a process called "bootstrapping". A modest-sized compiler is written in assembly language, and then that modest compiler is used to compile the next, more capable version, which can be written in the high-level language itself.

    That works well for relatively low-level high-level languages such as C, which are themselves relatively close to the machine. For really abstract languages such as APL, the compiler would be written in a low-level high-level language, generally C (or it could be written in assembly, but what a pain that would be).
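
    The bootstrapping chain described above can be sketched as a small model. This is only an illustration in Python, not how any real compiler is built: the stage names and feature sets are made up, and "compiling" is reduced to checking that the current compiler supports every feature the next compiler's source code uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    name: str
    supports: frozenset  # language features this compiler can translate


@dataclass(frozen=True)
class Source:
    name: str
    uses: frozenset        # features the compiler's own source code relies on
    implements: frozenset  # features the resulting compiler will support


def build(compiler, source):
    """Compile `source` with `compiler`, yielding the next compiler in the chain."""
    missing = source.uses - compiler.supports
    if missing:
        raise ValueError(
            f"{compiler.name} cannot compile {source.name}: missing {sorted(missing)}"
        )
    return Compiler(source.name, source.implements)


def bootstrap(seed, sources):
    """Build each stage with the compiler produced by the previous stage."""
    chain = [seed]
    for source in sources:
        chain.append(build(chain[-1], source))
    return chain


# Hypothetical stage 0: the modest compiler hand-written in assembly.
stage0 = Compiler("stage0", frozenset({"ints", "if", "while"}))

# Hypothetical later stages, each written in the language itself.
sources = [
    Source("stage1",
           uses=frozenset({"ints", "if", "while"}),
           implements=frozenset({"ints", "if", "while", "functions"})),
    Source("stage2",
           uses=frozenset({"ints", "if", "functions"}),
           implements=frozenset({"ints", "if", "while", "functions", "structs"})),
]

chain = bootstrap(stage0, sources)
for compiler in chain:
    print(compiler.name, sorted(compiler.supports))
```

    In this model, stage2 cannot be built directly by stage0, because its source uses functions, which only stage1 understands. That is the reason the stages are built in order.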
     
  4. Jul 17, 2012 #3

    jtbell

    Staff: Mentor

    There's also cross-compilation: you compile the source code on machine A, but you make the compiler produce machine code for machine B. Then copy the compiled program from machine A to machine B and execute it there. If this program is itself a compiler, then you don't need machine A any more.
     
    Last edited: Jul 17, 2012
  5. Jul 17, 2012 #4
    So if the compilers are made in assembly language which the thread originator has actually used to compile programming languages and assembly programs, how is the assembly language compiler itself compiled?
     
  6. Jul 17, 2012 #5

    phinds

    Gold Member
    2016 Award

    You seriously lack understanding of computers. Assemblers are NOT compilers. Computer languages go like this:

    machine language (JUST 1's and 0's, no letters)
    assembly (one-to-one correspondence w/ machine language but uses letters, such as "add", to represent machine language statements)
    compilers ("high-level" languages. A single statement can produce dozens of machine language statements)

    and there are interpreters, such as the original BASIC, which are a whole 'nother animal.

    EDIT: so when you are at the level of an assembler, there is no such thing as a compiler. Assemblers are written in small form in machine language, then bootstrapped up to a full version.
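
    The difference between the assembly and compiler levels can be shown with a toy example. This is a minimal sketch in Python for an imaginary machine: the opcodes, the 16-bit word layout, and the function names `assemble` and `compile_sum` are all assumptions made for illustration. The assembler turns each line into exactly one machine word, while the "compiler" turns one high-level statement into several assembly lines.

```python
# Hypothetical opcodes for an imaginary accumulator machine (not a real instruction set).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x04}


def assemble(lines):
    """Assembler: one source line in, exactly one machine word out."""
    words = []
    for line in lines:
        mnemonic, operand = line.split()
        # Assumed layout: opcode in the high byte, memory address in the low byte.
        words.append((OPCODES[mnemonic] << 8) | int(operand))
    return words


def compile_sum(statement, addresses):
    """Tiny 'compiler' for statements of the form 'x = a + b + c'."""
    target, expression = (part.strip() for part in statement.split("="))
    terms = [term.strip() for term in expression.split("+")]
    asm = [f"LOAD {addresses[terms[0]]}"]
    asm += [f"ADD {addresses[term]}" for term in terms[1:]]
    asm.append(f"STORE {addresses[target]}")
    return asm


# Assumed memory addresses for the variables.
addresses = {"a": 10, "b": 11, "c": 12, "total": 13}

asm = compile_sum("total = a + b + c", addresses)  # one statement, four instructions
machine_code = assemble(asm)                       # four lines, four machine words

for text, word in zip(asm, machine_code):
    print(f"{text:<10} -> {word:016b}")
```

    A real compiler handles far more than sums, but the shape is the same: the compiler expands, the assembler only translates.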
     
  7. Jul 17, 2012 #6
    The thread originator has a Turbo Assembler program that they use to make programs and programming languages. There is confusion with the terminology, but perhaps the term Turbo Assembler is self-explanatory and is an assembler.

    So an assembler is used to assemble a compiler while the compiler is used to compile a program. Since the assembler is written in machine language, then the person writing the assembler must know how to translate machine language into human readable language. The thread originator has never made an assembler before and does not know the detailed process.
     
  8. Jul 17, 2012 #7

    phinds

    Gold Member
    2016 Award

    Uh ... just out of curiosity, why do you refer to yourself as "the thread originator" rather than "I" ?

    Turbo Assembler is, as you might guess from the name, an assembler, not a compiler.

    Assemblers are not JUST used to write compilers. Many programs are written in assembler. In fact, in the early days, some computers only HAD an assembler, not a compiler. These days about the only thing written in assembly language is device drivers, but even these are now often written in C.

    Yes, the person writing an assembler in machine language must understand machine language.
     
  9. Jul 17, 2012 #8
    Yes, the thread originator wrote plenty of programs and some programming languages in Turbo Assembler but never made an assembler. It is actually easy to write source code in assemblers and compilers because the commands are all human-readable, but writing an assembler in machine language must be a very difficult task.
     
  10. Jul 17, 2012 #9

    phinds

    Gold Member
    2016 Award

    Yes, writing ANYTHING in machine language is a real pain in the butt.

    If you have in fact written, as you say, "programming languages" in Turbo Assembler, and they are not assembler languages, then they MUST be either compilers or interpreters, and if you understand computers well enough to write either of those, I cannot understand how it is possible that you do not already understand everything discussed in this thread.
     
  11. Jul 17, 2012 #10

    phinds

    Gold Member
    2016 Award

    I disagree completely. Writing complex math programs in assembler is not even CLOSE to being as easy as it is in compiler languages.
     
  12. Jul 17, 2012 #11
    The problem was merely a confusion of terminology, but the thread originator knows how to write code using the Turbo Assembler program.

    Additionally, the thread originator uses the term thread originator because scientific documents must be impersonal. The term "the researcher" is used by the thread originator when writing scientific research documents.

    The comparison was being made between assemblers and writing in machine language. Of course a very complex program is more difficult to write in an assembler than in a compiler. Some compilers like Turbo C++ even have GUI compatibility so a selector icon can be used to do various operations more quickly than if only typing is used to make the source code.
     
    Last edited: Jul 17, 2012
  13. Jul 17, 2012 #12

    phinds

    Gold Member
    2016 Award

    Yeah, but you are not WRITING a scientific document, you are on a forum. Do you see anyone else here referring to themselves that way? This is not a big deal, I was just wondering why you like to sound so ridiculously stilted.

    I take it from the first sentence above that you have NOT, as you originally stated, written any "programming languages" ?
     
  14. Jul 17, 2012 #13

    phinds

    Gold Member
    2016 Award

    Yes, I agree w/ you on this. Modern assemblers are MUCH more friendly than the ones in the early days of computing and in any case, assembly is much more friendly than machine language.
     
  15. Jul 17, 2012 #14
    The programming languages were just exercises and they possessed only very basic functionality. They did not possess the large amount of functions and data that programming languages in widespread use possess.

    And making a programming language is difficult not because of the logic or the commands needed, but because of the massive amount of code that has to be written for the many functions that the programming language has to serve.

    But to save time in making the programming languages, open source code was copied from various online sources and incorporated into modules in the source code to cut the programming work by a significant percentage.
     
    Last edited: Jul 17, 2012
  16. Jul 17, 2012 #15
    In the earlier stages of computing technology, there were computers that used mechanical parts, electromechanical relays, and vacuum tubes to do computing. For software to be written into these machines, the switches all had to be set manually, one by one, to generate the pattern of 1's and 0's that would cause the machine to contain information. In the case of the punch card computers, holes needed to be punched manually into the cards to make them contain data.

    But with the advent of transistor-based electronic computers, software would be written at the keyboard of a running computer and saved when the work was done, initially to a tape drive. Since these computers must already have had some type of assembler program to make this possible, how was the assembler written into the device's memory? Was there some type of simple keyboard with only a 0 key and a 1 key, whose signals were sent directly to the storage device without passing through more sophisticated hardware? Perhaps the electrical signal generated when either 1 or 0 was pressed would be sent through two wires: one going to the write head of a magnetic tape (or, later, a magnetic disk), and the other going to the motor to physically move the tape or disk so that the next bit of data could be written to the next position.

    Since there was no editor software to assist in writing the assembler code, it must have been impossible to do the kind of text editing that can be done in assemblers, so that if an error occurred, the entire segment of code needed to be erased and the process repeated.
     
    Last edited: Jul 17, 2012
  17. Jul 17, 2012 #16

    chiro

    Science Advisor

    What happens is that the computer comes with a BIOS and a ROM which have some basic functionality for doing really simple I/O tasks.

    On top of this you have an OS which supplies even more functionality which uses the functionality supplied by the BIOS and the hardware ROM.

    Once you have a simple program that can do I/O on all the necessary devices, such as RAM, the hard drives, the video devices, and so on, you can create the in-memory representation of any program, and one instance of a program is a compiler.

    Now you can create a representation of a program on a hardware device like a ROM chip just like it would be represented on a hard-drive and this is what happens when you want to implement some basic BIOS frameworks.

    From this point on, as mentioned above it's pretty much boot-strapping and other techniques that are easier to create and maintain.

    One thing to be aware of though, is that it's not just converting stuff to machine language: every executable and system library has its own operating system headers and structures that are executed in an OS specific way. The instructions themselves will be in a machine-language, but the OS actually has to prepare the environment for execution and this needs OS specific stuff defined in the executable itself.

    The linker (and sometimes the compiler to some degree) adds all this OS stuff to the final executable, which gives an EXE file (it also adds code to check whether Windows is running and other code to prepare the system environment).
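
    The OS-specific structure at the start of an executable is easy to see by looking at the first few bytes of a file. The following is a rough sketch in Python; the function names are made up for illustration, and a real OS loader checks far more than this. It relies on two well-known facts: ELF files begin with the bytes 0x7F 'E' 'L' 'F', and DOS/Windows EXE files begin with 'MZ' and store the file offset of the PE header in the four bytes at offset 0x3C.

```python
import struct


def identify_executable(header: bytes) -> str:
    """Guess the executable format from the first bytes of a file."""
    if header[:4] == b"\x7fELF":
        return "ELF"
    if header[:2] == b"MZ":
        return "MZ/PE"
    return "unknown"


def pe_header_offset(header: bytes) -> int:
    """For an MZ file, read the little-endian 32-bit PE header offset at 0x3C."""
    return struct.unpack_from("<I", header, 0x3C)[0]


def read_header(path: str, size: int = 64) -> bytes:
    """Read the first `size` bytes of a file on disk."""
    with open(path, "rb") as handle:
        return handle.read(size)


# A fabricated 64-byte MZ header, built here only so the example is self-contained.
fake_exe = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x80)

print(identify_executable(fake_exe))
print(hex(pe_header_offset(fake_exe)))
print(identify_executable(b"\x7fELF" + bytes(60)))
```

    To try it on a real file, pass the result of `read_header(path)` to `identify_executable`.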
     
  18. Jul 17, 2012 #17

    jtbell

    Staff: Mentor

    When I was an undergraduate in the early 1970s, one of the computers that I worked on was a Digital Equipment PDP-5 which my college had acquired via army surplus. Its main input was via punched paper tape. In order to read a machine-language program from a paper tape, a paper-tape program loader had to be already in the computer's memory. There was no disk drive (hard disk or floppy disk), or any other kind of permanent memory.

    When I turned the machine on, the first thing I did was to load the paper-tape program loader into the computer, via switches on the front panel. As I recall (this was forty years ago so some details may not be accurate), there were:

    • 16 switches labeled DATA, for the binary digits of a 16-bit word
    • 16 switches labeled ADDRESS, for the binary digits of a 16-bit address in memory
    • A button labeled LOAD
    • A button labeled RUN

    The paper-tape loader consisted of 10-15 instructions in machine language, which I had written on a piece of paper and taped on the machine's front panel. To load the program, I had to:

    1. Set the DATA switches according to the bits of the first instruction
    2. Set the ADDRESS switches to the address of the first instruction
    3. Press the LOAD button
    4. Repeat steps 1 to 3 for the other instructions

    After that, whenever I wanted to read a program from paper tape, I had to do the following:

    1. Position the paper tape in the reader
    2. Set the ADDRESS switches to the address of the first instruction of the paper-tape loader program
    3. Press the RUN button

    And then to run the program itself:

    1. Set the ADDRESS switches to the beginning address of the program (given in the program's documentation)
    2. Press the RUN button

    Of course, this depended on the program being loaded not overwriting the paper-tape loader program, either while being loaded, or while being executed. By convention, all programs were written to use memory locations above the presumed location of the paper-tape loader.
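
    The front-panel procedure above can be simulated in a few lines. This is a sketch in Python of an imaginary machine, not a model of the real PDP-5: the four opcodes, the word layout, and the five-word loader are all invented for illustration. The loader is toggled in one word at a time with the LOAD button, and RUN then executes it so that it copies the tape into memory above itself.

```python
# Hypothetical opcodes for an imaginary machine (not the PDP-5 instruction set).
READ, STOREI, INC, JMP = 1, 2, 3, 4


def word(opcode, operand=0):
    """Assumed layout: opcode in the high byte, operand address in the low byte."""
    return (opcode << 8) | operand


class Machine:
    def __init__(self, size=32):
        self.memory = [0] * size
        self.data = 0      # DATA switches
        self.address = 0   # ADDRESS switches
        self.tape = []     # words waiting in the paper-tape reader

    def press_load(self):
        """LOAD button: deposit the DATA switches at the ADDRESS switches."""
        self.memory[self.address] = self.data

    def press_run(self):
        """RUN button: execute from the ADDRESS switches until the tape runs out."""
        pc, acc = self.address, 0
        while True:
            opcode, operand = self.memory[pc] >> 8, self.memory[pc] & 0xFF
            pc += 1
            if opcode == READ:        # next tape word into the accumulator
                if not self.tape:
                    return            # end of tape: stop
                acc = self.tape.pop(0)
            elif opcode == STOREI:    # store accumulator where memory[operand] points
                self.memory[self.memory[operand]] = acc
            elif opcode == INC:       # advance the pointer
                self.memory[operand] += 1
            elif opcode == JMP:
                pc = operand
            else:
                raise RuntimeError(f"illegal instruction at address {pc - 1}")


# The loader: four instructions plus a pointer (address 4) to the load destination.
LOADER = [word(READ), word(STOREI, 4), word(INC, 4), word(JMP, 0), 16]

machine = Machine()
for address, instruction in enumerate(LOADER):  # toggle in the loader by hand
    machine.data = instruction
    machine.address = address
    machine.press_load()

machine.tape = [0x1234, 0x0042, 0x7FFF]  # the "program" on paper tape
machine.address = 0                      # start address of the loader
machine.press_run()
print([hex(w) for w in machine.memory[16:19]])
```

    The tape contents land at address 16 and up, above the loader at addresses 0 to 4, which mirrors the convention mentioned above.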
     
    Last edited: Jul 17, 2012
  19. Jul 17, 2012 #18
    So if, in electronic computing, the BIOS and ROM provide the basic I/O tasks that allow the assembler to be written, then how are the BIOS and ROM programmed? Since, as mentioned in the previous post, there was no editor software when the first firmware and assemblers were being made, this code needed to be written in machine language, and because there was no editor, the whole memory device would have to be erased if an error occurred.

    When was the monitor invented? Because before that they needed to use indicator lights and some printing equipment to know the output generated by their inputs and this must have been difficult to do.

    What kind of memory did the computer use if it was not permanent storage? Was there already a solid state micro-controller chip or ROM chip?
     
    Last edited: Jul 17, 2012
  20. Jul 17, 2012 #19

    jtbell

    Staff: Mentor

    It used magnetic core memory (http://en.wikipedia.org/wiki/Magnetic-core_memory), which retained data only while the computer's power supply was on. For persistent storage we had paper tape and a magnetic tape drive. My big project was writing a program to store and load programs using the magnetic tape drive. In other words, a very crude operating system!

    First I had to load my "magtape program" in the usual way from paper tape. When I ran it, it more or less asked me, "which program do you want to run?" If I entered "5", it would skip to the fifth program stored on the magtape, load it, and run it.
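
    The "skip to the fifth program" behaviour can be sketched as a scan over a sequential tape. This is only a guess at the general idea in Python, not a reconstruction of the original program: the tape is modelled as a list of words, and the `END_OF_PROGRAM` marker between programs is an assumption.

```python
END_OF_PROGRAM = None  # hypothetical marker separating programs on the tape


def load_program(tape, number):
    """Return the words of the `number`-th program (counting from 1) on the tape."""
    current = 1
    program = []
    for item in tape:
        if item is END_OF_PROGRAM:
            if current == number:
                return program   # reached the end of the requested program
            current += 1         # skip past a program we do not want
        elif current == number:
            program.append(item)
    raise LookupError(f"no complete program number {number} on this tape")


# Three made-up programs stored one after another.
tape = [1, 2, END_OF_PROGRAM, 3, END_OF_PROGRAM, 4, 5, 6, END_OF_PROGRAM]
print(load_program(tape, 3))
```

    Because tape is sequential, reaching a later program means reading past every earlier one, which is why the program has to "skip" rather than jump.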
     
  21. Jul 17, 2012 #20
    So the tape was operated directly by hardware circuits that were controlled by pressing the buttons. So the circuitry basically contained the functions that the buttons operated.
     