Can Malloc Use Mmapped Shared Memory for Efficient Parent-Child Data Sharing?

  • Thread starter: TylerH

Discussion Overview

The discussion revolves around the use of malloc and mmap'ed shared memory for efficient data sharing between parent and child processes in a C/C++ programming context. Participants explore various methods and considerations for implementing shared memory to avoid page faults, particularly in scenarios involving multiple child processes and memory allocation.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant proposes using mmap'ed shared memory to set the area from which malloc allocates memory, arguing it would facilitate data sharing without page faults.
  • Another participant mentions that in GNU Linux, malloc hooks can be installed, but notes that this approach is not portable across different operating systems.
  • A participant suggests that in Windows, shared memory files can be utilized, but raises concerns about the complexity of using Windows-specific debugging functions and synchronization mechanisms.
  • One participant expresses reluctance to implement a full heap allocator using glibc malloc hooks, citing a lack of expertise and previous unsuccessful attempts to redefine memory allocation functions.
  • Another participant critiques the original proposal, suggesting that if child processes do not require dynamic memory allocation, using threads might be a more efficient alternative.
  • A participant elaborates on the prime number checking program, explaining the inefficiencies caused by page copying when the parent writes back results, and reiterates the desire to use mmap for shared memory to mitigate this issue.
  • Another participant questions the viability of using threads instead of processes, highlighting the benefits of a common virtual address space and the potential for reduced overhead.

Areas of Agreement / Disagreement

Participants express differing views on the best approach to achieve efficient data sharing between parent and child processes. There is no consensus on whether to use malloc with mmap, shared memory files, or to switch to a threading model.

Contextual Notes

Some participants note limitations in their proposed solutions, such as the non-portability of certain methods and the complexity of implementing synchronization in Windows. Additionally, there are unresolved questions about the implications of using threads versus processes in the context of the specific application discussed.

TylerH
I want to set the area from which malloc gets its memory to an mmap'ed shared memory file, so that parents and children can share data without causing page faults. It is safe because the data used by the children is guaranteed not to be touched by the parent, and the children are guaranteed not to malloc at all. Is there a standard way to set the memory area that malloc manages?
 
In GNU Linux you can install malloc hooks. See here.
This is not portable to other operating systems.

In C++ you can:
- overload the global new operator,
- declare a custom new operator within a class
- specify an allocator for the various C++ containers
- use placement new syntax
All these methods are portable.
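
As a rough illustration of the container-allocator option, here is a minimal sketch that points a std::vector at a shared mapping. SharedArena and ArenaAllocator are hypothetical names invented for this example, and the arena is a simple bump allocator that never reuses freed blocks. The allocator interface itself is standard C++, but the mmap call underneath is POSIX, so this particular combination is not portable to Windows.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bump arena over one MAP_SHARED anonymous region.
// It only hands out memory; individual blocks are never returned to it.
struct SharedArena {
    char* base = nullptr;
    std::size_t size = 0;
    std::size_t used = 0;

    explicit SharedArena(std::size_t bytes) : size(bytes) {
        void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) throw std::bad_alloc();
        base = static_cast<char*>(p);
    }
    ~SharedArena() { munmap(base, size); }
    SharedArena(const SharedArena&) = delete;
    SharedArena& operator=(const SharedArena&) = delete;

    void* allocate(std::size_t bytes, std::size_t align) {
        // Round the current offset up to the requested alignment
        // (base is page-aligned, so aligning the offset is enough).
        std::size_t start = (used + align - 1) / align * align;
        if (start + bytes > size) throw std::bad_alloc();
        used = start + bytes;
        return base + start;
    }
};

// Minimal standard-conforming allocator that forwards to the arena.
template <class T>
struct ArenaAllocator {
    using value_type = T;
    SharedArena* arena;

    explicit ArenaAllocator(SharedArena* a) : arena(a) { assert(a != nullptr); }
    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) : arena(other.arena) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
    }
    void deallocate(T*, std::size_t) noexcept {}  // bump arena: nothing to release
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return a.arena == b.arena;
}
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) {
    return !(a == b);
}
```

Usage would look like `SharedArena arena(1 << 16); ArenaAllocator<int> alloc(&arena); std::vector<int, ArenaAllocator<int>> v(alloc);`. Only the element storage lands in the shared region; the vector object itself stays wherever it is declared. Because the arena never reclaims memory, calling reserve() up front avoids wasting space on reallocation.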
 
In Windows you can use a shared memory file. MSDN article:

msdn_named_shared_memory.aspx

I'm not sure whether the Windows debugger functions could be used unless Windows allows processes to attach to each other. If so, then you could use DebugActiveProcess(), ReadProcessMemory(), WriteProcessMemory(), ... . You'd also probably need to use DuplicateHandle() for any mutexes or semaphores that you'd want to use for synchronization. One way to use DuplicateHandle() is to include the hex values of the main process id and any handles to be shared on the "command line" passed to CreateProcess().

msdn_debugger_functions.aspx

msdn_create_process.aspx
 
I looked at the glibc malloc hook functions, but I'd really like to avoid writing a full heap allocator. It's outside the scope of my knowledge to do well. I also tried redefining sbrk and brk, but that didn't work either. (They weren't called by malloc while mallocing memory.)

As for the Windows stuff, rcgldr, my application would be totally unfit for Windows because Win processes are so heavy and my program forks a lot. Windows, IIRC, doesn't do COW address space copies. I didn't give enough context for you to know this, though.
 
Without knowing more about your problem, this sounds like a terrible idea to me. And if your child processes are so simple that they don't need to allocate, why not use threads? [Note that even printf does a malloc].
 
Threading in C++ is still pretty awful. The program makes a list of primes. The parent maintains the list and forks off a child for each number to be checked and the exit status of the child indicates the primality of the number.

The problem I'm having is that when the parent writes primes back to the list, it causes pages to be copied for no reason, so I was going to mmap an anonymous shared region of memory to use as the heap of the parent to prevent the pages from being copied.
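
The idea described above can be sketched without touching malloc at all, by putting just the prime list in a MAP_SHARED anonymous mapping. This is a minimal sketch under several assumptions, not the poster's actual program: the names map_shared_slots, is_prime and collect_primes are hypothetical, exit status 0 is assumed to mean "prime", the child is assumed to test by trial division against the list, and the parent waits for each child before forking the next to keep the example short.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Map `n` 64-bit slots that the parent and all later children share.
// Writes to this region are never copy-on-write. Returns nullptr on failure.
std::uint64_t* map_shared_slots(std::size_t n) {
    void* p = mmap(nullptr, n * sizeof(std::uint64_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : static_cast<std::uint64_t*>(p);
}

// Child-side check (assumed): trial division by the primes found so far.
static bool is_prime(std::uint64_t n, const std::uint64_t* primes, std::size_t count) {
    for (std::size_t i = 0; i < count && primes[i] * primes[i] <= n; ++i)
        if (n % primes[i] == 0) return false;
    return n >= 2;
}

// Forks one child per candidate in [2, limit]. The child only reads the list
// and reports through its exit status; the parent appends to the shared list.
// Returns the number of primes stored, or -1 on error or if `cap` is too small.
long collect_primes(std::uint64_t limit, std::uint64_t* primes, std::size_t cap) {
    assert(primes != nullptr);
    std::size_t count = 0;  // lives in ordinary (private) parent memory
    for (std::uint64_t n = 2; n <= limit; ++n) {
        pid_t pid = fork();
        if (pid < 0) return -1;
        if (pid == 0) _exit(is_prime(n, primes, count) ? 0 : 1);  // child: no malloc
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            if (count == cap) return -1;
            primes[count++] = n;  // write goes to the shared mapping
        }
    }
    return static_cast<long>(count);
}
```

With a 16-slot mapping from map_shared_slots(16), collect_primes(30, list, 16) fills in the ten primes up to 29. The design choice here is to allocate the list explicitly from the mapping rather than redirect malloc, which avoids writing a heap allocator; the cost is that the list needs a fixed capacity chosen up front.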
 
TylerH said:
The program makes a list of primes. The parent maintains the list and forks off a child for each number to be checked and the exit status of the child indicates the primality of the number. The problem I'm having is that when the parent writes primes back to the list, it causes pages to be copied for no reason, so I was going to mmap an anonymous shared region of memory to use as the heap of the parent to prevent the pages from being copied.
If this was done using threads, there would only be a single and common virtual address space (just one actual instance of the list of primes) for the parent and all child threads. Each child thread would only need to be spawned one time, and each child thread could pend on a mutex and/or semaphore for each number to be checked, and then post a status also using a mutex and/or semaphore. Why isn't this a viable solution for this program?
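
For comparison, here is a minimal sketch of the threaded layout using C++11 std::thread. It simplifies the scheme described above: instead of each worker pending on a semaphore per number, the workers pull the next candidate from a shared counter guarded by one mutex and append results under the same mutex. The names primes_with_threads and is_prime are hypothetical, and plain trial division stands in for whatever test the real program uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in primality test (assumed): plain trial division.
static bool is_prime(std::uint64_t n) {
    if (n < 2) return false;
    for (std::uint64_t d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

// Spawns `workers` threads once; each repeatedly claims the next candidate
// and records it if prime. There is a single list in one address space,
// so nothing is ever copied.
std::vector<std::uint64_t> primes_with_threads(std::uint64_t limit, unsigned workers) {
    assert(workers > 0);
    std::mutex m;
    std::uint64_t next = 2;
    std::vector<std::uint64_t> primes;

    auto work = [&] {
        for (;;) {
            std::uint64_t n;
            {
                std::lock_guard<std::mutex> lock(m);  // claim a candidate
                if (next > limit) return;
                n = next++;
            }
            if (is_prime(n)) {
                std::lock_guard<std::mutex> lock(m);  // publish the result
                primes.push_back(n);
            }
        }
    };

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) pool.emplace_back(work);
    for (auto& t : pool) t.join();

    // Workers finish in arbitrary order, so sort before returning.
    std::sort(primes.begin(), primes.end());
    return primes;
}
```

Calling primes_with_threads(30, 4) returns the ten primes from 2 to 29. Unlike the forking version, the workers here cannot rely on the list being complete up to the number they are testing, which is why this sketch tests each number independently.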
 
