OP, I'll try to give a not-so-comprehensive, quasi-bottom to top description of the process.
We start with transistors. In CMOS technology, the most basic logic building block, an inverter, can be made with two FETs. All of your AND, OR, NOR, and more complex logic building blocks are made of these transistors. They are usually run with a clock signal, which is created by an oscillator circuit. Together these operate like a machine that moves a signal or a bus of signals from an input to an output, and how the output depends on the input is what gives each block its function. So at all stages, you have inputs constantly being moved to outputs as time progresses.
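To make the "gates are just transistors" point concrete, here is a rough switch-level sketch in Python. It assumes idealized FETs that act as perfect on/off switches (real transistors are analog devices), and the function names are mine, not from any library. The inverter uses one pMOS and one nMOS, and the NAND uses the usual parallel pull-up and series pull-down arrangement; the other gates are then built from those.

```python
# Toy switch-level model: each FET is treated as an ideal switch.
# An nMOS conducts when its gate is 1; a pMOS conducts when its gate is 0.

def nmos(gate):
    return gate == 1

def pmos(gate):
    return gate == 0

def inverter(a):
    # One pMOS pulls the output up to VDD, one nMOS pulls it down to ground.
    pull_up = pmos(a)
    pull_down = nmos(a)
    assert pull_up != pull_down  # exactly one of the two paths conducts
    return 1 if pull_up else 0

def nand(a, b):
    # Two pMOS in parallel to VDD, two nMOS in series to ground.
    pull_up = pmos(a) or pmos(b)
    pull_down = nmos(a) and nmos(b)
    assert pull_up != pull_down
    return 1 if pull_up else 0

def and_gate(a, b):
    # AND is a NAND followed by an inverter.
    return inverter(nand(a, b))

def or_gate(a, b):
    # De Morgan: a OR b == NAND(NOT a, NOT b)
    return nand(inverter(a), inverter(b))

def nor_gate(a, b):
    return inverter(or_gate(a, b))

# Print the truth table: a, b, NAND, AND, OR, NOR
for a in (0, 1):
    for b in (0, 1):
        print(a, b, nand(a, b), and_gate(a, b), or_gate(a, b), nor_gate(a, b))
```

In a real CMOS library a NOR would normally get its own four-transistor cell rather than being composed from other gates; composing it here just keeps the sketch short.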
The initial values of these inputs are "hard-wired" into the logic blocks, and later the inputs can be switched to other sources, such as values read from non-volatile memory. One class of these signals is the instructions that start the chip running whatever program it is meant to run. The logic blocks are hardwired to "boot", that is, to start executing from non-volatile memory as soon as power is applied to the hardware. This is the point where the microcontroller starts running from code rather than from the innate starting state of the machine. The microcontroller can also switch from its start-up state to a state where it takes its instructions from sources other than non-volatile memory, but these are usually more advanced special cases like JTAG or other debug modes.
The interface to the non-volatile memory varies, but it's always some physical mechanism that is controlled by the logic blocks of the microcontroller. It can be a flash memory block that is programmed using high-voltage components, or it can be an interface to an electro-mechanical device like a hard drive, but remember that the process is always being controlled by logic blocks that are constantly giving an output based on their inputs over time. All the data in the non-volatile memory can then be transferred to volatile memory, and the distinction between the two is not so important at this point, other than that volatile memory is usually much faster and has more direct access to the logic blocks of the microcontroller.
At this point, the logic machines of the microcontroller are running off the code you have written as their input, and many instructions will have temporary information that needs to wait for other information, such as when you take two values from memory and wish to add them together. If the logic is not designed to take both values at the same time, one value must be taken first and wait for the other. This is why some instructions take multiple clock cycles. The intermediate logic stages between the memory and its end location can hold these bytes of information while waiting by using flip-flops.
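As a rough illustration of why an add can take several clock cycles, here is a sketch of a made-up machine whose memory can deliver only one value per cycle. The addresses, the 8-bit width, and the three-cycle split are assumptions for the example, not a description of any particular microcontroller.

```python
# Hypothetical machine: memory hands over one value per clock cycle,
# so "add two memory values" has to be spread over several cycles.

memory = {0x10: 7, 0x11: 5, 0x12: 0}  # made-up addresses and contents

def add_from_memory(mem, addr_a, addr_b, addr_out):
    cycles = 0

    # Cycle 1: fetch the first operand and park it in a temporary register
    # (in hardware, a row of flip-flops).
    temp_register = mem[addr_a]
    cycles += 1

    # Cycle 2: fetch the second operand while the first one waits.
    second_operand = mem[addr_b]
    cycles += 1

    # Cycle 3: both values are now available, so add and write back.
    # The mask keeps the result to 8 bits, as an 8-bit adder would.
    mem[addr_out] = (temp_register + second_operand) & 0xFF
    cycles += 1

    return cycles

cycles_used = add_from_memory(memory, 0x10, 0x11, 0x12)
print("result:", memory[0x12], "cycles:", cycles_used)
```

A design with two memory ports could fetch both operands in the same cycle, which is the "designed to take both values at the same time" case.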
You may or may not have heard of flip-flops; these are digital circuits that have a memory ability. They stay in a high or low state depending on whether a high or low voltage was applied to their input, even after that input goes away, because they also have an enable/disable input that controls when they accept a new state. A flip-flop is built from the same kind of transistors, arranged as a circuit called a bistable multivibrator, and these circuits implement the temporary memory in the logic blocks of the electronics.
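Here is a small sketch of that memory behaviour, modelled as a gated D latch made from four NAND gates. This is one common textbook arrangement and I'm assuming it purely for illustration; strictly speaking it is a level-sensitive latch, and the edge-triggered flip-flops inside real chips are typically built from two such stages. The class and method names are hypothetical.

```python
def nand(a, b):
    return 0 if (a and b) else 1

class DLatch:
    """Gated D latch: a cross-coupled NAND pair plus two steering NANDs."""

    def __init__(self):
        self.q = 0
        self.q_bar = 1

    def update(self, d, enable):
        # Steering gates: when enable is 0, both outputs are 1 and the
        # cross-coupled pair below simply holds its previous state.
        set_n = nand(d, enable)
        reset_n = nand(nand(d, d), enable)  # nand(d, d) is NOT d

        # Let the feedback loop settle (a few passes are enough here).
        for _ in range(4):
            q = nand(set_n, self.q_bar)
            q_bar = nand(reset_n, q)
            if (q, q_bar) == (self.q, self.q_bar):
                break
            self.q, self.q_bar = q, q_bar
        return self.q

latch = DLatch()
history = [
    latch.update(d=1, enable=1),  # enabled: stores 1
    latch.update(d=0, enable=0),  # disabled: input changes, output holds 1
    latch.update(d=0, enable=1),  # enabled: stores 0
    latch.update(d=1, enable=0),  # disabled: output holds 0
]
print(history)  # [1, 1, 0, 0]
```

The second and fourth calls are the interesting ones: the data input changes, but the stored bit does not, because the enable input is low.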
Finally, there is the instruction set. This is the interface to the decoding logic: that logic takes values from instruction memory, decodes them, and uses the decoded values to set up what all the other logic does to perform its functions. So all the logic of the chip is designed to operate based on the instruction set. You simply give the microcontroller the instructions, which include all the information relevant to each instruction (such as the memory location to take the data from), and its hardware takes care of the details without you having to think about them.
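To give a feel for what "decoding" means, here is a sketch of a fetch-decode-execute loop for an invented instruction set. The 16-bit format (4-bit opcode, 12-bit address), the four opcodes, and the single accumulator are all assumptions made up for this example; real instruction sets are considerably richer.

```python
# Invented 16-bit instruction format: [4-bit opcode][12-bit address]
LOAD, ADD, STORE, HALT = 0x1, 0x2, 0x3, 0xF

def decode(word):
    # Split the instruction word into its fields, as decode logic would.
    opcode = (word >> 12) & 0xF
    address = word & 0xFFF
    return opcode, address

def run(program, data):
    accumulator = 0
    pc = 0  # program counter: index of the next instruction to fetch
    while True:
        opcode, address = decode(program[pc])
        pc += 1
        if opcode == LOAD:
            accumulator = data[address]
        elif opcode == ADD:
            accumulator = (accumulator + data[address]) & 0xFFFF
        elif opcode == STORE:
            data[address] = accumulator
        elif opcode == HALT:
            return data
        else:
            raise ValueError(f"unknown opcode {opcode:#x}")

# Machine code for: load data[0x010], add data[0x011], store to data[0x012]
program = [0x1010, 0x2011, 0x3012, 0xF000]
data = {0x010: 7, 0x011: 5, 0x012: 0}
run(program, data)
print(data[0x012])  # 12
```

In hardware there is no `if` chain, of course; the opcode bits feed combinational logic that switches the right data paths on and off, but the mapping from bit pattern to behaviour is the same idea.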
You could write these instructions as 1s and 0s directly into the memory that the decoding logic reads them from, or you can let your PC write these 1s and 0s after its assembler software has translated your assembly instructions into machine instructions. So assembly and programming languages are just an abstraction layer between what your human brain wants and the machine code the microcontroller wants. The assembler plays no bigger role in storing the data than what your assembly instructions tell it to do.
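A bare-bones sketch of that translation step might look like the following. The mnemonics and the 16-bit encoding (4-bit opcode, 12-bit address) are invented for the example and do not correspond to any real assembler or chip.

```python
# Invented encoding: [4-bit opcode][12-bit address]
OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "HALT": 0xF}

def assemble_line(line):
    parts = line.split()
    mnemonic = parts[0].upper()
    # int(text, 0) accepts both decimal and 0x-prefixed hex operands.
    operand = int(parts[1], 0) if len(parts) > 1 else 0
    return (OPCODES[mnemonic] << 12) | (operand & 0xFFF)

source = """
LOAD  0x010
ADD   0x011
STORE 0x012
HALT
"""

machine_code = [assemble_line(line) for line in source.splitlines() if line.strip()]

# These bit patterns are what actually gets written to program memory.
for word in machine_code:
    print(f"{word:016b}")
```

Each line of assembly maps to exactly one word here, which is the sense in which the assembler adds nothing beyond what you told it.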
However, there is additional abstraction in the assembler/compiler software, because it is very impractical for a human to write a program where you have to manually specify every memory location you wish to read or write. This is where the assembler and compiler start to automate things for you. You can intentionally write more abstract code and let the assembler/compiler decide what addresses to use for memory locations based on a set of rules you give it. The assembler/compiler software usually has a model of the device you're programming, and additionally, you define a memory map that tells it what addresses it can use for the different kinds of memory. Now when you define a variable, you are giving that variable a memory location indirectly, because you have automated your assembler/compiler to pick a memory location for you. As long as it follows the rules and memory map you gave it, you don't really care what memory location that variable is given. The assembler/compiler keeps track of all these locations automatically for you, but in the end you could have picked all of these memory locations yourself, because they are just going into the instruction set interface either way. It would just take you a lifetime to write modern programs doing this manually.
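As a sketch of that automation, here is a toy assembler that hands out addresses for named variables from a memory map. The map, the one-word-per-variable rule, the mnemonics, and the 16-bit encoding (4-bit opcode, 12-bit address) are all assumptions for illustration; real toolchains use linker scripts and far more elaborate rules.

```python
# Hypothetical memory map: the RAM address range the assembler may use.
MEMORY_MAP = {"ram": (0x100, 0x1FF)}

OPCODES = {"LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "HALT": 0xF}

class Allocator:
    """Gives each new variable name the next free RAM address."""

    def __init__(self, start, end):
        self.next_free = start
        self.end = end
        self.symbols = {}  # variable name -> assigned address

    def address_of(self, name):
        if name not in self.symbols:
            if self.next_free > self.end:
                raise MemoryError("memory map has no RAM left")
            self.symbols[name] = self.next_free
            self.next_free += 1
        return self.symbols[name]

def assemble(source, allocator):
    words = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue
        opcode = OPCODES[parts[0].upper()]
        # The operand is a variable name; the allocator picks its address.
        address = allocator.address_of(parts[1]) if len(parts) > 1 else 0
        words.append((opcode << 12) | address)
    return words

source = """
LOAD  a
ADD   b
STORE total
HALT
"""

allocator = Allocator(*MEMORY_MAP["ram"])
machine_code = assemble(source, allocator)
print({name: hex(addr) for name, addr in allocator.symbols.items()})
print([hex(word) for word in machine_code])
```

You never wrote `0x100` anywhere in the source; the tool chose it, and the resulting machine code is the same as if you had typed the address yourself.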