
Introduction to programming

  1. Jun 20, 2012 #1
    Okay, forgive me because I am completely new to programming and I am going to ask some possibly stupid questions:
    How is a computer language like Python or Java created? Who creates them? Are Word and Excel created using Python, Java, etc.?
  3. Jun 20, 2012 #2


    Science Advisor

    Hey jd12345.

    These are not as stupid as you think, so don't think of yourself that way.

    Typically what happens is that you first have to create your original compiler, which is usually something like an assembler that takes mnemonic op-codes and translates them into native machine code that your processor can execute.
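
    To make the assembler idea concrete, here is a minimal sketch in Python rather than in a real assembler. The opcode numbers and the `assemble` function are made up for illustration and do not correspond to any real processor; the point is only that each mnemonic is looked up and replaced by a number.

```python
# Hypothetical one-byte opcodes for an imaginary machine (not a real instruction set).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(source):
    """Translate lines such as 'LOAD 10' into a flat list of byte values."""
    machine_code = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue                       # skip blank lines
        parts = line.split()
        mnemonic, operands = parts[0].upper(), parts[1:]
        machine_code.append(OPCODES[mnemonic])            # mnemonic -> opcode number
        machine_code.extend(int(op) for op in operands)   # operands are plain numbers
    return machine_code


program = """
LOAD 10    ; put the value at address 10 into the accumulator
ADD 11     ; add the value at address 11
STORE 12   ; write the result to address 12
HALT
"""
print(assemble(program))  # [1, 10, 2, 11, 3, 12, 255]
```

    A real assembler also handles labels, several operand formats and output file formats, but the core job is this kind of table lookup.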

    You have an operating system that implements a basic interface for working with the hardware of your computer. It also includes a lot of tools for doing things like managing memory, file management and so on. Today's operating systems are very complex, are very big in terms of code and functionality, and implement a lot more higher-level routines than in the days of DOS.

    So with your operating system and your first compiler, which runs in the OS environment, you then write code for that compiler to implement something more complex. For example, you write a C compiler in assembly language, and that becomes your compiler for C.

    Then you can take that compiler and create even more things like Java, C++, Python and so on. This concept is known as bootstrapping in programming and is very common.

    Typically, a lot of things that need speed are written in something like C or C++ because the compilers are highly optimized nowadays and can create really good code.

    Also you might be interested to know that sometimes, people create new compilers in their own language! So let's say you have a C compiler that was written in assembler. You can now create a new C compiler with C code! This kind of thing is also done more than you think.

    Things like Word/Excel are mostly written in C and C++ with a lot of developed libraries, while many other desktop applications are done in high-level environments like .NET. Most software done nowadays uses many libraries that the developers don't write themselves, and what that translates into is getting a lot of complex stuff done quickly based on the work of the people that created the libraries.

    Each application has its own features, and different applications are written in different languages for a variety of reasons.

    Java is good because you can run it on many OS platforms and you don't have to change the code at all. C++ is good because of its features and because the compiler creates really fast code.

    Usually developers want to create applications in common operating systems (which is why lots of programs are written for MS Windows) and also in environments that allow them to develop the software as quickly as possible. This translates into things like a lot of libraries (including in-house and 3rd party) and things like .NET platforms. There is no one answer, but these are good guidelines.

    If you want to know specifics, go to the creator's website and see what language something is developed in.
  4. Jun 20, 2012 #3
    A computer looks like a magic box to me at the moment - I have no idea how it works fundamentally. Do they teach at undergraduate level how a computer actually works at the deepest level?

    And what approach should I take while learning programming for the first time? Should I just learn the computer language without knowing how it works?
  5. Jun 20, 2012 #4


    Science Advisor

    Well you have hardware and software levels and each level has its own sub-levels and so on.

    The lowest level for software is machine-language and the native instruction set for the platform. Each platform has its own instruction sets, architecture design, memory model, and so on.

    Everything in terms of software is built on top of this, and with modern programs, and especially operating systems, we are talking many millions of lines of code in something like C or C++.

    The main way a computer works is that code is executed top-down: instructions are processed in sequential order, one at a time. There are architecture issues to take into account but this is the basic model. Basically you have an instruction pointer that is updated by each instruction (normally it just advances to the next one), and the processor executes whatever instruction it points to.

    In terms of what you deal with you have registers and memory. The CPU deals with registers in the quickest possible manner and you load things from memory into registers to do stuff and then write the contents of the register back to memory. This is basically how things get done on the hardware level, but it's not exactly like this due to some architecture issues and design/feature issues.
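
    As a rough sketch of that model, here is a toy machine in Python with one register, a small memory and an instruction pointer. The instruction names and the `run` function are invented for illustration; real processors differ in many details.

```python
def run(program, memory):
    """Execute a list of (operation, operand) pairs on a toy one-register machine."""
    registers = {"A": 0}   # a single register, the accumulator
    ip = 0                 # instruction pointer: index of the next instruction
    while ip < len(program):
        op, arg = program[ip]
        ip += 1                          # by default, move on to the next instruction
        if op == "LOAD":                 # memory -> register
            registers["A"] = memory[arg]
        elif op == "ADD":                # register = register + memory
            registers["A"] += memory[arg]
        elif op == "STORE":              # register -> memory
            memory[arg] = registers["A"]
        elif op == "JUMP_IF_ZERO":       # a branch overwrites the instruction pointer
            if registers["A"] == 0:
                ip = arg
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


memory = [2, 3, 0]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, memory))  # [2, 3, 5]
```

    The load, operate, store pattern in `program` is the same one described above: bring values from memory into a register, work on them there, and write the result back.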

    Now you've got your memory, which is basically your RAM and also a few other types. In terms of accessing your video card, hard drive and other hardware, you use things like interrupts, I/O ports, and things built on top of this like DMA (Direct Memory Access) controllers. I/O port access is done through the instructions IN and OUT on normal x86 architectures (standard PCs).

    This is the basic idea of software although it's not detailed or complete, but it should help you understand what is going on.

    In terms of hardware this is again different: you basically have lots of components built on logic gates and you have to deal with topics like synchronization and the use of clocks to co-ordinate everything, as well as making sure the hardware is reliable enough to do what it's meant to do.
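
    To give a feel for "components built on logic gates", here is a small Python simulation that builds an adder out of nothing but a NAND gate. It is only a software model with names I chose for illustration; real hardware also has to deal with the timing and clock issues mentioned above.

```python
def nand(a, b):
    """The only primitive gate used here; every other gate is built from it."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    return and_(or_(a, b), nand(a, b))


def half_adder(a, b):
    """Add two bits; returns (sum_bit, carry_bit)."""
    return xor(a, b), and_(a, b)


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry; returns (sum_bit, carry_out)."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, or_(c1, c2)


def add_4bit(x, y):
    """Chain four full adders to add two 4-bit numbers; returns (result, carry_out)."""
    carry, result = 0, 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


print(add_4bit(5, 6))   # (11, 0)
print(add_4bit(15, 1))  # (0, 1): the result overflows 4 bits, so the carry is set
```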

    Then you've got on top of this physics and material science issues for that scope as well as all the design and engineering issues for the actual circuitry specifics.

    Again this is an oversimplification, but it should give you an idea of how it works at least at one level.

    If you do computer engineering, you will learn how this all works at the lower levels including the hardware and the low-level software. You probably won't learn much about material science issues and some physics issues (but you will learn about other physics issues) and you won't learn about more of the high level computer science theory unless you take specific electives in this.

    If you want to learn computer languages, start with something like C or C++ and remember the top-down rule. Remember that you keep going down unless you are branching (if statements), looping (for, while statements), or calling functions (for example, foo()). The rules are: for branching, you execute something only if the condition is met; for looping, you stay in the loop until the exit condition is met; and for functions, when the function finishes you return to the code that called it and continue on from there, top to bottom.
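
    Here is a short sketch of those three rules. It is written in Python rather than C or C++ only to keep it brief; the same top-down behaviour applies there. The function names are made up for the example.

```python
def describe(n):
    # Branching: only one of these two return statements runs.
    if n % 2 == 0:
        return "even"
    return "odd"


def trace():
    log = []
    log.append("start")           # statements run top to bottom
    for n in range(3):            # looping: the body repeats for n = 0, 1, 2
        # Function call: execution jumps into describe(), then comes back here.
        log.append(describe(n))
    log.append("end")             # runs only after the loop has finished
    return log


print(trace())  # ['start', 'even', 'odd', 'even', 'end']
```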

    For multi-threaded, parallel coding it's a little different but don't worry about learning that just yet!
  6. Jun 22, 2012 #5
    There are different levels of "programming languages." On the simplest level there are scripting languages which combine a series of commands together and run them in sequence. In this area you have shell scripts (*.sh) of the kind run from Unix/Linux terminal windows and batch scripts (*.bat) of the type that are run for Windows/DOS command prompts.

    Above this level (in control of details) you have interpreted languages like Basic and Python. From the computer's point of view these are very sophisticated scripts -- the computer "interprets" and runs the instructions as they are fed to it but the data structures and flow controls are generally more advanced than what are found in simple scripts.

    Finally, there are the compiled programming languages like C. Here, the entire program must be verified to not have any errors that violate the syntax rules of the language, then reprocessed into a computer optimized form (binary executable) that can be run by the computer. These are the program files you generally find in Windows/DOS that have the ".exe" ending.
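
    The "check the whole program first, then run it" step can be illustrated with Python's built-in `compile()` function. This is only an analogy: `compile()` produces Python bytecode, not a native `.exe`, and the helper name `try_compile` is my own. What it does show is that a syntax error anywhere in the source stops anything from running.

```python
source_ok = "x = 2\nprint(x * 21)\n"
source_bad = "print('this line is fine')\nx = = 2\n"  # syntax error on line 2


def try_compile(source):
    """Return a code object if the whole source is syntactically valid, else None."""
    try:
        return compile(source, "<example>", "exec")
    except SyntaxError as err:
        print(f"rejected before running anything: syntax error on line {err.lineno}")
        return None


code = try_compile(source_ok)
exec(code)                       # prints 42

# The first line of source_bad is valid, but it never runs,
# because the check covers the whole source up front.
assert try_compile(source_bad) is None
```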

    That was a very general grouping of what are called programming languages. There are many overlaps and cross-overs. For instance, some interpreted languages can also be compiled and Java introduced a class of programming language in the space between interpreted and compiled.

    I will try to be more helpful in getting to what I think you are asking for. When PCs first came out in the late 70s, most programmers started to learn programming with Basic since it came built in on most PCs. If you took a beginner's class in the classroom the language might have been Pascal, since that was widely available for Apple computers then. Both were good learning languages because they are high-level languages, which means the statements are more human-readable and there is a lot more underlying (computer level) functionality per statement than in low-level languages. Basic (DOS Basic) was an interpreted language and Pascal was a compiled language for the most part, but there were some interpreted versions. Neither of these is very practical or useful for today's modern programming, but there are modern versions of them available that are good learning tools.

    One of the most useful things about learning to program from Basic was that, as an interpreted language, it was very forgiving. You could run/test each statement in turn and immediately see what the results are. While there is almost no interest in interpreted Basic anymore, you can do much of the same in today's popular languages like Python.

    Programming compiled languages generally required more training because there were generally more parts and steps involved. The compiler programs have many command line arguments. Then, you might have to link in external functions libraries. And then, when you run through the process the result might not be your program, but a listing of dozens of errors you made.

    A lot of that pain has been reduced by the packaging of most of the program development tools you might need into Integrated Development Environments (IDEs). These will generally syntax check your code as you write it. They may offer code-completion suggestions to help you complete your statements so you don't have to refer to reference manuals so often. And they generally have built-in debuggers that allow you to step through the statements of your program and inspect what it's actually doing, which brings some of the convenience of interpreted languages to compiled program development.

    Now, for a practical response to your question. There are a lot of free tools out there that will allow you to dabble in programming and you can start using them to accomplish simple tasks. If you want to do data processing, such as taking text data from a file, reprocessing it or re-computing numbers, and printing the results at the terminal or saving them to another file, then simple shell scripting or a more advanced interpreted language like Python or Ruby may be a good place to start.
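
    As an example of that kind of small data-processing task, here is a minimal Python sketch that reads numbers from one text file, doubles them, and writes them to another. The file names and the `double_numbers` function are just placeholders for illustration.

```python
import os
import tempfile


def double_numbers(in_path, out_path):
    """Read one number per line, write each doubled to out_path, return the total."""
    total = 0.0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue              # ignore blank lines
            value = float(line) * 2   # the "re-compute" step
            total += value
            dst.write(f"{value}\n")
    return total


# Demonstration using a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as folder:
    in_path = os.path.join(folder, "input.txt")
    out_path = os.path.join(folder, "output.txt")
    with open(in_path, "w") as f:
        f.write("1.5\n2\n\n4\n")
    print(double_numbers(in_path, out_path))  # 15.0
    with open(out_path) as f:
        print(f.read())                       # 3.0, 4.0 and 8.0 on separate lines
```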

    If you want to start further out in deep water and build a Windows type program you should get a full blown IDE for that. My recommendation here is to download Microsoft's Visual Studio Express Edition. The current stable version is 2010. The Express Editions are completely free and have all of the features you would need as a beginner. You will probably want to start with the Visual Basic edition, which is a very modern version of the old Basic language, so it is still fairly high-level.

    Having said that, these tools won't teach you what you really need to know to be a programmer. There are a number of fundamental concepts of all computer programming you will have to learn. They may be hard to grasp at first and many of the reference manuals and tutorials may assume a certain amount of knowledge you don't have yet. These are concepts like variables, data types, data structures, operators, conditional statements, loops and flow control, subroutines and functions. They may have slightly different meanings from one language system to the next. If you encounter references to object-oriented programming or event handlers too early on, don't struggle with that until you have a real sense of what the other stuff is about.
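
    To put names to a few of those concepts, here is a tiny Python sketch. The values and names are invented; other languages spell these things differently, but the ideas carry over.

```python
# Variables with different data types
name = "Ada"            # string (text)
age = 36                # integer
height_m = 1.65         # floating-point number

# Data structures group values together
scores = [70, 85, 90]                   # list: an ordered sequence
person = {"name": name, "age": age}     # dictionary: values looked up by key


def average(values):
    """A function: a named, reusable piece of code."""
    return sum(values) / len(values)    # operators: / is division


average_score = average(scores)

# Conditional statement: choose a path based on a comparison operator
if average_score >= 80:
    grade = "pass"
else:
    grade = "retry"

print(type(name).__name__, type(age).__name__, type(height_m).__name__)  # str int float
print(person["name"], grade)                                            # Ada pass
```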

    Start simple. Find tutorials with sample programs. Step through them on the command line for interpreted languages or in the debugger for compiled programs to see what is really going on. Make modifications to different aspects of the program to see what effect it has. It will require some persistence to get a hold of what they really mean to the computer because in the end, what you really want to know is what those things are making the computer do.
    Last edited: Jun 22, 2012