Pointers, RAM, Hard Drives, and Databases

AI Thread Summary
The discussion highlights concerns about the efficiency of databases in programming, noting that while they organize data well, they may slow down retrieval and writing processes. It emphasizes the importance of RAM for fast access but acknowledges its limitations in size and the overhead of transferring data between RAM and hard drives. The conversation explores the use of pointers and memory mapping to treat non-volatile memory similarly to RAM, suggesting that modern operating systems facilitate shared memory for inter-process communication. Additionally, it discusses client-server models for accessing binary data across different programming languages and the trade-offs between binary and text serialization methods. Overall, the thread underscores the need for efficient data management strategies in programming to optimize performance.
John Creighto
A long time ago, I was once told that too much reliance on databases in programming can create slow programs. Databases provide a nice way to store and organize information, but they may not retrieve and write information quickly. Clearly RAM is the fastest memory to access, but its size is limited and there is overhead associated with transferring information between RAM and the hard drive.

One way to store information is to serialize it. For instance, you can serialize objects. You often have the option to select between several formats, and I presume that one format represents the way the information is stored in RAM. So my first question is about pointers. For programming languages that can use pointers, does the pointer care whether it points to RAM or to non-volatile memory, or is it necessary to deal with virtual memory in order to treat non-volatile memory as RAM? This could be handy if you have a large object that would take considerable time to load (deserialize) into memory.

I'm thinking that if you have a data source that is a binary file and you want to access that information through several consumers (processes, threads, separate programming languages), a client-server model might be good (though I'm not sure whether this would be slow). The server would decide whether to keep the object on the hard disk or load it into RAM depending on the size of the object, the amount of free memory available, and the demand for that object.

I've seen some bridge programs between languages that use client-server models (COM connect, for instance), and they seem to pass strings around to tell the object what method to use. This strikes me as not the most efficient way to do things. I'm also wondering about dynamic link libraries: are these accessible from multiple sources? I need to do more research, but I'd be interested in any comments people might have on managing large amounts of data between programs.

Perhaps a good idea is to use a model like the one used in COM connect for programming languages that don't support pointers, but give the option to get a pointer for programming languages that do. So, for instance, on the Java side of COM connect you could use the client-server model, but on the COM side (Visual Basic, C, etc.) you could get a pointer and treat it as a native object.
 
John Creighto said:
Databases provide a nice way to store and organize information, but they may not retrieve and write information quickly.

To a user, a database is just an interface for storing and retrieving data. That data could be stored on a disk, or it could be stored in RAM.

John Creighto said:
For programming languages that can use pointers, does the pointer care whether it points to RAM or to non-volatile memory, or is it necessary to deal with virtual memory in order to treat non-volatile memory as RAM?

Pointers are (literally!) just addresses. They're just numbers. What the addresses actually mean is arbitrary. Typically, the addresses are in some very large virtual memory space. Pages of memory that are not often used will eventually be sent to the disk, while pages that are used frequently will remain in RAM. The program has no direct knowledge of where exactly each page of memory currently exists.
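As a tiny illustration (my own sketch, not from the thread), a C program can print a pointer's value like any other number; nothing in the program says whether the page behind that address is currently in RAM or paged out to disk:

```c
#include <stdio.h>

int main(void) {
    int x = 42;
    int *p = &x;  /* p holds the virtual address of x */

    printf("value:   %d\n", *p);         /* dereference: read through the address */
    printf("address: %p\n", (void *)p);  /* the pointer itself is just a number */
    return 0;
}
```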

John Creighto said:
I'm thinking that if you have a data source that is a binary file and you want to access that information through several consumers (processes, threads, separate programming languages), a client-server model might be good (though I'm not sure whether this would be slow). The server would decide whether to keep the object on the hard disk or load it into RAM depending on the size of the object, the amount of free memory available, and the demand for that object.

All modern operating systems include a facility called "memory mapping," which maps a range of addresses in the program's virtual address space to a file. If you read from those addresses, you'll get data from the file. It is up to the operating system to determine whether to load the data into RAM all at once, or to read it from the disk in chunks as necessary.
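To make that concrete, here is a minimal POSIX sketch (my own illustration; the file name "data.bin" is just a placeholder). Reading through the returned pointer pulls pages in from the file on demand:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file into this process's virtual address space. */
    const unsigned char *data = mmap(NULL, st.st_size, PROT_READ,
                                     MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* This read may fault a page in from disk; the program can't tell. */
    printf("first byte: %u\n", data[0]);

    munmap((void *)data, st.st_size);
    close(fd);
    return 0;
}
```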

John Creighto said:
I need to do more research, but I'd be interested in any comments people might have on managing large amounts of data between programs.

If you're trying to share large amounts of memory between two programs running on the same computer, you should note that all modern operating systems provide mechanisms for shared memory. These shared memory segments can be mapped into the virtual address space of multiple programs simultaneously. Two or more programs can read or write to the shared memory exactly as if it were normal, private memory. (But you should include some thread-safety mechanisms, like mutexes, to make sure your programs won't step on each other's toes.)
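A minimal POSIX sketch of the idea (my own illustration; the segment name "/demo_shm" is a placeholder, and a real program would add the mutex or semaphore mentioned above):

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Create (or open) a named shared-memory segment.
       On some systems you must link with -lrt. */
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }

    if (ftruncate(fd, 4096) < 0) { perror("ftruncate"); return 1; }

    char *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { perror("mmap"); return 1; }

    /* Any other process that maps "/demo_shm" sees this write. */
    strcpy(mem, "hello from process A");

    munmap(mem, 4096);
    close(fd);
    shm_unlink("/demo_shm");  /* remove the name when finished */
    return 0;
}
```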

If you're trying to share large amounts of memory between programs running on separate computers, use MPI or some other message-passing library.
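For example, a bare-bones MPI sketch (illustrative only): rank 0 sends an integer to rank 1. Compile with mpicc and run with mpirun -np 2:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int payload = 42;
    if (rank == 0) {
        /* Send one int to rank 1 with message tag 0. */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", payload);
    }

    MPI_Finalize();
    return 0;
}
```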

- Warren
 
Since you mention COM, I assume that by client-server communication you mean the ability to call a subroutine across a network. There are two basic approaches to this: binary serialization and text serialization.

COM (Microsoft) and CORBA (UNIX/Linux) are binary serialization technologies. Each is also platform-specific, i.e., both client and server must have exactly compatible operating systems and compilers.

So-called "web services" are an OS-independent way of calling across a network, where the serialization of the call and return information is text. This approach has worse performance, but it can be platform- and version-independent, which can sometimes be very useful.

Modern databases such as SQL Server, MySQL, and Oracle are very good at caching information in memory and moving it efficiently across a network, though you do have to be careful about what kind of pre-processing you ask the database to do (i.e., what kind of query you send it).

Hope this helps.
 