You don't need an operating system for the simplest of programs. Intel used to include a "traffic signal" program at the back of the 8051 CPU manual, where there was no memory, just the registers. The program would loop a fixed number of times to delay for so many seconds, and it could "read" sensors via pin inputs on the 8051, switching the traffic signal based on the sensor state and elapsed time.
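To make that concrete, here's a minimal sketch of that kind of loop in C. The port addresses and names (SENSOR_PORT, LIGHT_PORT, CAR_WAITING, and so on) are hypothetical stand-ins for the 8051's actual I/O registers, and the delay count would have to be tuned to the clock rate:

    #include <stdint.h>

    /* SENSOR_PORT, LIGHT_PORT, and the pin masks below are hypothetical
       stand-ins for the 8051's memory-mapped I/O ports. */
    #define SENSOR_PORT (*(volatile uint8_t *)0x90) /* input pins  */
    #define LIGHT_PORT  (*(volatile uint8_t *)0xA0) /* output pins */

    #define CAR_WAITING 0x01  /* a car is waiting on the side street */
    #define MAIN_GREEN  0x01  /* lamp-driver output bits */
    #define SIDE_GREEN  0x02

    static void delay(volatile uint32_t count)
    {
        while (count--)   /* burn a fixed number of iterations; the count */
            ;             /* is tuned by hand to the CPU's clock rate     */
    }

    int main(void)
    {
        for (;;) {
            LIGHT_PORT = MAIN_GREEN;   /* main street gets the green */
            delay(1000000UL);          /* minimum green time */
            while (!(SENSOR_PORT & CAR_WAITING))
                ;                      /* hold until a car trips the sensor */
            LIGHT_PORT = SIDE_GREEN;   /* give the side street its turn */
            delay(1000000UL);
        }
    }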
What makes a computer (versus a calculator) is the ability to do conditional jumps: test for something, then conditionally jump to a new set of instructions. This gives a computer the ability to "make decisions" and "act" upon them.
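For example, a compiler turns an ordinary C test into exactly that compare-and-branch pair (the x86 instructions in the comment are roughly what a typical compiler emits; details vary):

    /* The test below compiles to a compare followed by a conditional jump;
       on x86 a typical compiler emits something roughly like:
           cmp  edi, 100     ; test for something
           jl   .other_path  ; conditionally jump to other instructions */
    int at_boiling(int temperature)
    {
        if (temperature >= 100)   /* the test ("decision") */
            return 1;             /* one path of instructions */
        else
            return 0;             /* the other path ("act" on the result) */
    }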
Here's a link that describes the kinds of basic circuits used in computers, the "latch" being one of the important ones, since it functions as a type of "memory". If you go to the home page, you'll find more basic computer stuff.
http://www.play-hookey.com/digital
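As a rough software analogy of the latch idea, here's a toy C model of a cross-coupled NOR latch; the feedback between the two "gates" is what lets it hold a bit after the inputs go away (this is just an illustration, not how that page implements it):

    #include <stdio.h>

    struct latch { int q, qn; };   /* the two cross-coupled outputs */

    static void latch_update(struct latch *l, int set, int reset)
    {
        for (int i = 0; i < 2; i++) {      /* let the feedback settle */
            l->q  = !(reset || l->qn);     /* NOR gate 1 */
            l->qn = !(set   || l->q);      /* NOR gate 2 */
        }
    }

    int main(void)
    {
        struct latch l = { 0, 1 };
        latch_update(&l, 1, 0); printf("after set:    q=%d\n", l.q); /* q=1 */
        latch_update(&l, 0, 0); printf("inputs idle:  q=%d\n", l.q); /* still 1 - memory */
        latch_update(&l, 0, 1); printf("after reset:  q=%d\n", l.q); /* q=0 */
        return 0;
    }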
Regarding the generations of programming languages, here's a Wiki link:
http://en.wikipedia.org/wiki/First-generation_programming_language
What the Wiki shows as an alternate definition of "5th generation language" (GUI input, source code output) used to be called a "4th generation language". If I remember correctly, Think C was one of the first of these that was widely available, made for the Macintosh, which otherwise (using MPW - Macintosh Programmer's Workshop) was a very programmer-"unfriendly" environment (until OS X).
update - Depending on the computer, there's a layer below "machine language". It used to be called microcode, before "microprocessor" became a popular term. On a CPU with a limited instruction set, such as the AMD 2901 bit-slice CPUs that older minicomputers were based on, a typical implementation used an instruction's opcode to index into an array of 80-bit-wide control words. In such a word, each bit triggered an operation. There were only about 80 possible CPU operations (copy A to B, copy B to A, add A to B, xor B to A, ...), so this scheme was reasonably efficient. On other machines, the microcode more closely resembles yet another instruction set.
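Here's a toy sketch of that wide-control-word idea in C. The bit assignments and operation names are made up for the example; the point is that each set bit in the word directly fires one operation, with no further decoding:

    #include <stdint.h>
    #include <stdio.h>

    /* Made-up control-word bit assignments; a real 2901-based machine
       had on the order of 80 such bits, one per hardware operation. */
    #define UOP_COPY_A_TO_B (1u << 0)
    #define UOP_COPY_B_TO_A (1u << 1)
    #define UOP_ADD_A_TO_B  (1u << 2)
    #define UOP_XOR_B_TO_A  (1u << 3)

    struct cpu { uint32_t a, b; };

    static void micro_step(struct cpu *c, uint32_t control_word)
    {
        /* every set bit fires its operation directly - no decoding */
        if (control_word & UOP_COPY_A_TO_B) c->b = c->a;
        if (control_word & UOP_COPY_B_TO_A) c->a = c->b;
        if (control_word & UOP_ADD_A_TO_B)  c->b += c->a;
        if (control_word & UOP_XOR_B_TO_A)  c->a ^= c->b;
    }

    int main(void)
    {
        struct cpu c = { .a = 5, .b = 0 };
        /* a macro-instruction's opcode would select a sequence of these words */
        micro_step(&c, UOP_COPY_A_TO_B);   /* b = a  -> b = 5  */
        micro_step(&c, UOP_ADD_A_TO_B);    /* b += a -> b = 10 */
        printf("a=%u b=%u\n", c.a, c.b);
        return 0;
    }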
On mainframes, there may be a lot of machine-language instructions in the instruction set, as on the IBM 360, which the 390 is a derivative of. These include instructions that can do math on variable-length BCD (binary coded decimal, 4 bits per digit) fields, and even copy / format the BCD fields into a byte-oriented field. Few of the older IBM mainframes truly implemented all of these machine-language instructions in hardware; instead, a "trap" would occur and the instruction would be emulated.
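To show what "4 bits per digit" means, here's a toy C routine that unpacks such a field. The real 360 packed-decimal format also carries a sign nibble and variable field lengths, which this sketch ignores:

    #include <inttypes.h>
    #include <stdio.h>

    /* Unpack a packed-BCD value, one decimal digit per 4-bit nibble. */
    static uint32_t bcd_to_binary(uint32_t bcd)
    {
        uint32_t value = 0;
        for (int shift = 28; shift >= 0; shift -= 4)
            value = value * 10 + ((bcd >> shift) & 0xF); /* next digit */
        return value;
    }

    int main(void)
    {
        /* 0x00012345 packs the decimal number 12345, 4 bits per digit */
        printf("%" PRIu32 "\n", bcd_to_binary(0x00012345u)); /* prints 12345 */
        return 0;
    }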
Intel CPUs have a lot of instructions as well, such as the ability to add an immediate value to a memory location without using any registers.
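For example, this C statement can compile on x86 to a single add-to-memory instruction with an immediate operand (the instruction shown in the comment is typical compiler output, not guaranteed):

    #include <stdio.h>

    int counter;

    /* On x86 this can compile to a single read-modify-write instruction
       with an immediate operand and no register holding the value, e.g.:
           add dword ptr [counter], 5 */
    void bump(void)
    {
        counter += 5;
    }

    int main(void)
    {
        bump();
        printf("%d\n", counter); /* prints 5 */
        return 0;
    }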