How can we efficiently manage memory and disk usage in a BitTorrent client?

  • Thread starter: Yoss
  • Tags: File
AI Thread Summary
Efficient memory and disk management in a BitTorrent client is crucial, especially when handling large files. The proposed solution involves using a 2D byte array to store downloaded pieces, but this can lead to excessive memory usage. A partial solution has been developed where completed pieces are dumped to an output file, and a temporary .dat file is created for incomplete downloads. Suggestions include pre-allocating disk space and using the seek() function to write pieces directly to their respective locations on the disk. This approach minimizes memory overload and simplifies the retrieval of pieces when uploading to peers.
Yoss
My friend and I are developing a BitTorrent Client for our CS class, but we've run into a dilemma. If anyone can point us in the right direction we'll be thankful.

A little background in case you're unfamiliar with BT: the file to be downloaded from multiple peers is broken up into fixed-size pieces, conventionally 32 KB each, so there are file_length/32K pieces (plus one if the division leaves a remainder). BT clients typically do not download the pieces in order (they use the rarest-first algorithm). That ought to be enough!
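To make the piece arithmetic above concrete, here is a minimal sketch. The function names `piece_count` and `piece_size` are made up for illustration, and the 32 KB piece length is just the figure from the post; a real client would read the piece length from the torrent's metainfo.

```python
def piece_count(file_length: int, piece_length: int) -> int:
    """Number of pieces needed to cover file_length bytes (ceiling division)."""
    return (file_length + piece_length - 1) // piece_length


def piece_size(index: int, file_length: int, piece_length: int) -> int:
    """Size in bytes of piece `index`; only the last piece can be shorter."""
    start = index * piece_length
    return min(piece_length, file_length - start)
```

Ceiling division avoids counting an extra, empty piece when the file length is an exact multiple of the piece length, which a plain "divide and add one" would do.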

Anyway, the scheme we developed was to create a 2D byte array of size [num_pieces][piece_length]. Once we've successfully downloaded a piece from a peer and checked its hash against the one in the torrent's metainfo, we dump the piece into its position in the byte array. We then realized this seriously overloads memory, because the whole array sits in the program's memory. Obviously this is a terrible solution: imagine downloading a 2 GB file and having all of it stored in system memory (or VM) until it is finished.
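For the hash check mentioned above, a rough sketch: in the BitTorrent metainfo format, the info dictionary's `pieces` value is a byte string made of one 20-byte SHA-1 digest per piece, concatenated in piece order. The helper name `piece_is_valid` and its parameters are hypothetical, not part of any library.

```python
import hashlib

SHA1_LEN = 20  # each piece hash in the metainfo is a 20-byte SHA-1 digest


def piece_is_valid(index: int, data: bytes, pieces_field: bytes) -> bool:
    """Compare the SHA-1 of a downloaded piece with its expected digest.

    `pieces_field` is assumed to be the raw `pieces` byte string taken from
    the torrent's info dictionary.
    """
    expected = pieces_field[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(data).digest() == expected
```

Only a piece that passes this check should be stored; a failed piece is simply requested again.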

We developed a partial solution. Once the file is complete, we dump all the pieces to the output file. If the user quits before the file is complete, we create a temporary .dat file and save the meta info (e.g. amount downloaded, uploaded, etc.) along with the 2D byte array of downloaded pieces. When the program is started again, it loads that data through a hashmap, etc., and the byte array is loaded into memory ONLY if the file is not complete. This is the part we still have to fix.

When the file is complete, that condition is checked when the user quits, and the byte array of pieces is dumped to the output file rather than to the .dat file. When the program is loaded again and we are uploading to peers, we handle a peer's request for a piece by seeking into the saved file and extracting just the bytes for that piece. This works like a charm because we don't have to load all the pieces into memory: at most 32 KB * (number of peers we are uploading to). Much better solution.
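Serving a piece straight from the completed file, as described above, might look roughly like this. It is only a sketch: `read_piece` and its parameters are illustrative names, and it assumes the pieces are laid out back to back in the output file.

```python
def read_piece(path: str, index: int, piece_length: int) -> bytes:
    """Read one piece from a completed file without loading the whole file."""
    with open(path, "rb") as f:
        # Piece `index` starts at byte offset index * piece_length.
        f.seek(index * piece_length)
        # The last piece may be shorter; read() simply stops at end of file.
        return f.read(piece_length)
```

Memory use is then bounded by one piece per request being served, which matches the 32 KB per peer estimate in the post.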

When the file is not done and we are uploading the pieces we have to peers, we need a way to get those pieces off the disk. But we don't necessarily download the pieces in order, and we might not get all of the pieces from the current peers. Can anyone suggest how to dump a piece to disk as soon as we get it and still be able to extract it on request, given that the pieces do not arrive in order? And how do we assemble the pieces in order when we write the final file?

Sorry for the longwindedness! Thanks
 
Every BitTorrent client I've seen pre-allocates the file on disk and then writes each chunk to the file as soon as the chunk is complete. You can use fseek to move the file pointer to the chunk's offset and write the chunk where it belongs.

Also, I don't know if you know but there is a bittorrent library that you could have used:

http://libtorrent.rakshasa.no/
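The pre-allocate-and-seek idea from this reply can be sketched as follows. This is a sketch under stated assumptions, not how any particular client does it: the class name `PieceStore`, its methods, and the in-memory `have` set are all invented for illustration, and Python's `seek()` stands in for C's `fseek`.

```python
class PieceStore:
    """Keeps pieces in one pre-allocated file, each at its final offset."""

    def __init__(self, path: str, file_length: int, piece_length: int):
        self.path = path
        self.piece_length = piece_length
        self.have = set()  # indexes of pieces already written to disk
        # Pre-allocate: create the file at its full size up front.
        # Note: "wb" discards an existing file, so resuming a download
        # would need different handling (not shown here).
        with open(path, "wb") as f:
            f.truncate(file_length)

    def write_piece(self, index: int, data: bytes) -> None:
        """Write a verified piece at its final position, in any order."""
        with open(self.path, "r+b") as f:
            f.seek(index * self.piece_length)
            f.write(data)
        self.have.add(index)

    def read_piece(self, index: int) -> bytes:
        """Read back a piece we already have, e.g. to upload it to a peer."""
        if index not in self.have:
            raise KeyError(f"piece {index} has not been downloaded yet")
        with open(self.path, "rb") as f:
            f.seek(index * self.piece_length)
            return f.read(self.piece_length)
```

Because every piece lands at its final offset, there is no separate assembly step: once the last piece has been written, the file is already in order.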
 
Do you not know how to use the seek() function?

- Warren
 
dduardo said:
Every BitTorrent client I've seen pre-allocates the file on disk and then writes each chunk to the file as soon as the chunk is complete. You can use fseek to move the file pointer to the chunk's offset and write the chunk where it belongs.

Also, I don't know if you know but there is a bittorrent library that you could have used:

http://libtorrent.rakshasa.no/


Thanks dduardo. I bet that is the solution we would have come to eventually. We are already using the seek() function to extract pieces from the completed file; I just didn't think of pre-allocation, duh! I was thinking of keeping track of the order in which pieces were downloaded, writing the pieces in that order, and rearranging them at the end. Your solution is definitely more natural (and easier).

chroot, dare I say that your reply is approaching scorn? Funny feeling, is all.
 