Confusions about Position Independent Code

In summary, the offset between the code section and the data section is known at compile time, so a variable reference can be expressed as the address of the currently executing instruction plus a known offset into the data section. The question is why variables are nevertheless addressed indirectly through the Global Offset Table, whose entries are filled in at runtime, instead of by adding that known offset directly.
  • #1
Marmoteer
Hello! I was reading this excellent article about position independent code and its implementation for shared libraries. I'm still confused about one part, though. My current understanding is that the offset between the code section and the data section is known at compile time. Since this offset never changes, a variable reference can be expressed as the address of the currently executing instruction plus the known offset into the data section. This is where I get confused. The author states that the variable is addressed indirectly via the Global Offset Table (GOT), which resides at the beginning of the data section. The addresses in the GOT are assigned at runtime. What I'm wondering is: if the data section's address is known relative to the current instruction, why not just add the variable's offset directly instead of going through the Global Offset Table?

To summarize: if the GOT's address is known relative to the code section and its offset is encoded into each variable/function reference, why not just encode the relative variable address instead?

Is it that the data section is scrambled for some reason and the GOT is the only thing with a consistent address? (0x0 in the data section, I believe.)
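To make the two addressing schemes in the question concrete, here is a toy model in Python. It is only a sketch: the address space is a plain dictionary, the "program counter" is taken to be the library's load base, and every name, offset and address is made up for illustration. It shows that a fixed PC-relative offset is enough for a variable that lives inside the library, while a GOT slot lets the loader point the code at a variable whose address is not a fixed distance from the code (for example, one defined in another module).

```python
# Toy model of a process address space: a dict from address to value.
# All names, offsets and addresses below are hypothetical.

CODE_TO_GOT = 0x1000    # fixed at link time: distance from the code to the GOT slot
CODE_TO_LOCAL = 0x1010  # fixed at link time: distance from the code to a library-private variable


def load_library(memory, base, address_of_shared_var):
    """Pretend to be the dynamic loader: map the library at `base`."""
    memory[base + CODE_TO_LOCAL] = 7                      # library-private variable
    memory[base + CODE_TO_GOT] = address_of_shared_var    # GOT slot, filled in at load time


def read_local(memory, pc):
    """Direct PC-relative access: one load, offset fixed at link time."""
    return memory[pc + CODE_TO_LOCAL]


def read_via_got(memory, pc):
    """GOT access: load the variable's address from the GOT, then load the variable."""
    address = memory[pc + CODE_TO_GOT]
    return memory[address]


def demo():
    memory = {0x500000: 42}  # a variable defined somewhere outside the library
    results = []
    for base in (0x10000, 0x70000):  # the same library mapped at two different bases
        load_library(memory, base, 0x500000)
        results.append((read_local(memory, base), read_via_got(memory, base)))
    return results
```

In this model `read_local` never needs the GOT, which matches the intuition in the question. `read_via_got` is only needed because the distance from the code to address `0x500000` differs for each load base, so it cannot be baked into the instruction.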

Anyway, I hope my question isn't too confusing, and thanks for the help.
Here's the article referred to in my question: http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/

Some relevant information:
Relocations - http://en.wikipedia.org/wiki/Relocation_(computing)
Data Segment - http://en.wikipedia.org/wiki/Data_segment
Position Independent Code - http://www.gentoo.org/proj/en/hardened/pic-guide.xml
 
  • #2
This depends on the addressing modes supported by the processor. On a Motorola 68000 series processor, all memory reference instructions can be PC (program counter) relative, so position independent code just needs to use those PC-relative addressing modes. On a 32-bit Intel x86 processor, only the branch and call instructions are PC relative; the memory reference instructions use other registers as base and/or index registers, so some other scheme is needed to make x86 code (and data) position independent.

On Windows, shared libraries are implemented as dynamic-link libraries (DLLs) instead of using position independent code methods. The code is shared between running processes, but usually each process has its own copy of the DLL's data. A DLL can also have shared data. I'm not sure of the details of how the private and shared data virtual address spaces are set up. Wiki article:

http://en.wikipedia.org/wiki/Dynamic-link_library

MSDN article:

http://msdn.microsoft.com/en-us/library/ms682594
 
  • #3
Right, on Windows relocations are performed instead of making the code position independent. I'm pretty sure I understand the mechanism of position independent code on x86 (using instruction-relative addressing). I'm just confused about why, on Linux, the GOT is used to address the data indirectly when it is in the same section as the data itself. I hope that makes sense.
 
  • #4
The loader gets involved when shared libraries are concerned. When a process is first created for an executable file, all calls to shared libraries are replaced with stubs which actually call "the linking loader". The first time this code executes, the linking loader finds where the shared library ACTUALLY is at that moment and patches in the address of the shared library (and also records, by some means, that this code is using that library so the library can't be "unloaded" too early). For subsequent calls, the code jumps directly to the library without the loader being involved.
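The "resolve on first call, then jump directly" behaviour described above can be sketched as a small Python class. This is only an analogy, not how a real linking loader is written: `LazyBinder`, its `libraries` mapping and its `resolutions` counter are hypothetical names, and a dictionary lookup stands in for finding the library in memory and patching the call site.

```python
class LazyBinder:
    """Toy stand-in for lazy binding by a linking loader (all names hypothetical)."""

    def __init__(self, libraries):
        self.libraries = libraries  # symbol name -> function; stands in for loaded shared libraries
        self.table = {}             # resolved entries; stands in for the patched call slots
        self.resolutions = 0        # how many times the slow resolution path ran

    def call(self, name, *args):
        target = self.table.get(name)
        if target is None:
            # First call through this slot: look the symbol up and patch the slot.
            target = self.libraries[name]
            self.table[name] = target
            self.resolutions += 1
        # Later calls skip the lookup and go straight to the resolved target.
        return target(*args)
```

For example, after `binder = LazyBinder({"square": lambda x: x * x})`, two calls to `binder.call("square", ...)` leave `binder.resolutions` at 1, because only the first call takes the resolution path.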
 
  • #5
Sorry, the above is for "dynamically linked libraries" (DLLs on Windows, or dynamic libraries on Linux or UNIX).
 
  • #6
Static libraries on Linux actually copy the library routines right into the executable file, which is bad in lots of ways; hence the invention of dynamic linking.
 

1. What is Position Independent Code (PIC)?

Position Independent Code (PIC) is a type of code that can be executed from any memory address without any modifications. This means that the code can be loaded at any memory location and still run correctly.

2. Why is PIC important in computer programming?

PIC is important in computer programming because it allows the code to be easily relocated in memory without having to make any changes to the code. This is particularly useful for shared libraries and dynamically linked programs, which can be loaded at different memory addresses in different processes.

3. How is PIC different from Position Dependent Code (PDC)?

PIC is different from Position Dependent Code (PDC) in that PDC must be loaded at a specific memory address in order to run correctly. If the code is moved to a different location, it may not function properly. PIC, on the other hand, can be loaded at any memory address and still run correctly.
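The difference can be illustrated with a toy Python model, offered as a sketch under made-up assumptions: `LINK_BASE` and `DATA_OFFSET` are hypothetical constants, memory is a dictionary, and "running" the code just means reading one data word. The position dependent version bakes in an absolute address chosen at link time, while the position independent version computes the address from wherever it was actually loaded.

```python
LINK_BASE = 0x10000    # hypothetical address the position dependent code was linked for
DATA_OFFSET = 0x2000   # hypothetical distance from the start of the code to its data word


def load(base):
    """Map the module at `base`; its single data word holds 99."""
    return {base + DATA_OFFSET: 99}


def run_position_dependent(memory, base):
    """Ignores the real load base on purpose: the absolute address was fixed at link time."""
    return memory.get(LINK_BASE + DATA_OFFSET)


def run_position_independent(memory, base):
    """Computes the data address relative to where the code actually is."""
    return memory.get(base + DATA_OFFSET)
```

Loaded at `LINK_BASE`, both versions find the data. Loaded anywhere else, `run_position_dependent` looks at the wrong address and gets `None` back, while `run_position_independent` still finds the value.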

4. What is the advantage of using PIC in software development?

The main advantage of using PIC in software development is the flexibility it provides. With PIC, the code can be loaded at any memory address, making it easier to share and reuse code in different programs and processes. This can also improve security, as it makes it harder for attackers to exploit vulnerabilities by predicting the location of the code in memory.

5. Are there any drawbacks to using PIC?

One potential drawback of using PIC is that it may slightly increase the size and runtime of the code because of the extra instructions needed to make it position independent. However, this is usually a small trade-off for the benefits it provides. Additionally, processors without PC-relative data addressing (32-bit x86, for example) pay a higher cost for PIC, so it is worth checking the target architecture before relying on it.
