How can we efficiently manage memory and disk usage in a BitTorrent client?

  • Thread starter: Yoss
  • Tags: File
AI Thread Summary
Efficient memory and disk management in a BitTorrent client is crucial, especially when handling large files. The proposed solution involves using a 2D byte array to store downloaded pieces, but this can lead to excessive memory usage. A partial solution has been developed where completed pieces are dumped to an output file, and a temporary .dat file is created for incomplete downloads. Suggestions include pre-allocating disk space and using the seek() function to write pieces directly to their respective locations on the disk. This approach minimizes memory overload and simplifies the retrieval of pieces when uploading to peers.
Yoss
My friend and I are developing a BitTorrent Client for our CS class, but we've run into a dilemma. If anyone can point us in the right direction we'll be thankful.

A little background info if you're unfamiliar with BT. The file that is to be downloaded from multiple peers is conventionally broken up into file_length/32K (+ 1) pieces. BT clients typically do not download the pieces in order (they use the rarest-first algorithm). That ought to be enough!
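
For readers following along, the piece arithmetic can be sketched as below. This is only an illustration: the function names are hypothetical and the 32 KB piece length is the figure quoted above. The "(+ 1)" applies only when the file length is not an exact multiple of the piece length, which a ceiling division handles.

```python
PIECE_LENGTH = 32 * 1024  # 32 KB, the piece size mentioned in the post


def num_pieces(file_length: int, piece_length: int = PIECE_LENGTH) -> int:
    """Number of pieces needed to cover file_length bytes (ceiling division)."""
    return (file_length + piece_length - 1) // piece_length


def piece_size(index: int, file_length: int, piece_length: int = PIECE_LENGTH) -> int:
    """Size in bytes of piece `index`; only the last piece can be shorter."""
    if not 0 <= index < num_pieces(file_length, piece_length):
        raise IndexError("piece index out of range")
    return min(piece_length, file_length - index * piece_length)
```

A file of exactly 64 KB therefore has 2 pieces, while one byte more gives 3 pieces with a final piece of 1 byte.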

Anyway, the scheme we developed was to create a 2D byte array of size [num_pieces][piece_length]. Once we've successfully downloaded a piece from a peer and checked the piece hashes and such, we dump the piece into its position in the byte array. We realized this seriously overloads memory, because the whole array sits in the data part of memory. Obviously this is a terrible solution: imagine downloading a 2 GB file and having it all stored in system memory (or VM) until it is finished.

We developed a partial solution. Once the file is completed, we dump all the pieces to the output file. If the user quits before the file is complete, we create a temporary .dat file in which we save meta info (e.g. amount downloaded, uploaded, etc.) and the 2D byte array with the downloaded pieces. When the program is started again, it loads the data through a hashmap, etc., and the byte array is loaded into memory ONLY if the file is not complete. This is what we still have to fix. When the file is complete, that condition is detected when the user quits, and the byte array of pieces is dumped to the output file rather than to the .dat file. What we came up with for that case: when the program is loaded again and a peer requests a piece, we just scan the saved file and extract the bytes for that piece from there. This works like a charm because we don't have to load all the pieces into memory; at most we hold 32 KB * (number of peers we are uploading to). Much better solution.
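
Serving a piece out of the finished file, as described above, might look roughly like this. It is a sketch under assumptions: `read_piece` is a hypothetical name, and the file is assumed to hold the pieces back to back in index order.

```python
def read_piece(path: str, index: int, piece_length: int = 32 * 1024) -> bytes:
    """Read one piece from a completed file without loading the whole file."""
    with open(path, "rb") as f:
        # Piece `index` starts at byte offset index * piece_length.
        f.seek(index * piece_length)
        # read() returns fewer bytes at end of file, so the (possibly
        # shorter) last piece needs no special case here.
        return f.read(piece_length)
```

Only the requested piece is held in memory, which matches the bound of one piece per peer being served.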

When the file is not done and we are uploading the pieces we have to peers, we need a way to get those pieces off the disk. But we don't necessarily download the pieces in order, and we might not get all of the pieces from the current peers. Can anyone suggest how to dump a piece to disk once we get it and extract it again upon request, given that the pieces do not arrive in order? And how do we assemble the pieces in order when we write the final file?

Sorry for the longwindedness! Thanks
 
What all the BitTorrent clients I've seen do is pre-allocate the file on disk and then write the data to the file when a chunk is completed. You can use fseek to move the file pointer to the right offset and write the chunk where it belongs.
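
A minimal sketch of that idea, written in Python rather than C for brevity (so `f.seek` stands in for fseek). The names `preallocate` and `write_piece` are hypothetical, and on many filesystems `truncate` produces a sparse file rather than physically reserving the blocks, so treat the allocation step as an assumption about the platform.

```python
def preallocate(path: str, file_length: int) -> None:
    """Create the output file at its final size before any piece arrives."""
    with open(path, "wb") as f:
        # Extends the empty file to file_length bytes. On most systems the
        # new bytes read back as zeros; the file may be sparse on disk.
        f.truncate(file_length)


def write_piece(path: str, index: int, data: bytes,
                piece_length: int = 32 * 1024) -> None:
    """Write a verified piece straight to its final position in the file."""
    # "r+b" opens for update without truncating what is already there.
    with open(path, "r+b") as f:
        f.seek(index * piece_length)
        f.write(data)
```

Because every piece lands at its final offset, pieces can arrive in any order and nothing has to be rearranged when the download finishes.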

Also, I don't know if you know but there is a bittorrent library that you could have used:

http://libtorrent.rakshasa.no/
 
Do you not know how to use the seek() function?

- Warren
 
dduardo said:
What all the BitTorrent clients I've seen do is pre-allocate the file on disk and then write the data to the file when a chunk is completed. You can use fseek to move the file pointer to the right offset and write the chunk where it belongs.

Also, I don't know if you know but there is a bittorrent library that you could have used:

http://libtorrent.rakshasa.no/


Thanks dduardo. I bet that is the solution we would have come to. We are using the seek() function to extract the pieces from the 100% file; I just didn't think of pre-allocation, duh! I was thinking of keeping track of the order of downloaded pieces, writing the pieces in the order downloaded, and rearranging them at the end. Your solution is definitely more natural (and easier).
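
With pre-allocation nothing needs rearranging, but the client still has to remember which slots in the file hold verified data, both to answer peer requests on an incomplete file and to resume after a restart. One possible way to track that is sketched below; `PieceMap` is a hypothetical helper, not something from this thread or from any library, and its bytes could be stored alongside the other meta info.

```python
class PieceMap:
    """Tracks which pieces have been verified and written to disk."""

    def __init__(self, num_pieces: int) -> None:
        # One byte per piece: 0 = missing, 1 = verified and on disk.
        self.have = bytearray(num_pieces)

    def mark(self, index: int) -> None:
        self.have[index] = 1

    def has(self, index: int) -> bool:
        return self.have[index] == 1

    def complete(self) -> bool:
        return all(self.have)

    def to_bytes(self) -> bytes:
        """Serialized form, suitable for saving when the user quits."""
        return bytes(self.have)

    @classmethod
    def from_bytes(cls, raw: bytes) -> "PieceMap":
        """Rebuild the map from a previously saved to_bytes() result."""
        pm = cls(len(raw))
        pm.have[:] = raw
        return pm
```

A request for piece i would then be served from the file only when `has(i)` is true.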

chroot, dare I say that your reply is approaching scorn? Funny feeling, is all.
 