BitTorrent file saving problem

  1. Apr 10, 2006 #1
    My friend and I are developing a BitTorrent Client for our CS class, but we've run into a dilemma. If anyone can point us in the right direction we'll be thankful.

    A little background info in case you're unfamiliar with BT: the file to be downloaded from multiple peers is conventionally broken up into file_length/32K (+ 1) pieces. BT clients typically do not download the pieces in order (they use the rarest-first algorithm). That ought to be enough!
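
    The piece count above can be sketched as a ceiling division. This is only an illustration: the 32 KB default is the figure quoted in the post, and the function names are made up for the example. Note that the "+ 1" applies only when the file length is not an exact multiple of the piece length.

```python
def num_pieces(file_length: int, piece_length: int = 32 * 1024) -> int:
    """Number of pieces needed to cover file_length bytes (ceiling division)."""
    return (file_length + piece_length - 1) // piece_length


def piece_size(index: int, file_length: int, piece_length: int = 32 * 1024) -> int:
    """Size in bytes of piece `index`; only the last piece may be shorter."""
    start = index * piece_length
    return min(piece_length, file_length - start)
```

    For a 65537-byte file with 32 KB pieces this gives three pieces, the last of which holds a single byte.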

    Anyway, the scheme we developed was to create a 2D byte array of size [num_pieces][piece_length]. Once we've successfully downloaded a piece from a peer and checked the infohashes and such, we dump the piece into its position in the byte array. We then realized that this seriously overloads memory, because the whole array sits in the data part of memory. Obviously this is a terrible solution: imagine downloading a 2 GB file and having it all stored in system memory (or VM) until it is finished.
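
    For the "checked the hashes" step, a minimal sketch might look like the following. It assumes the usual metainfo layout, where the `pieces` field is the concatenation of one 20-byte SHA-1 digest per piece; `piece_is_valid` and `pieces_field` are names invented for this example, not part of the poster's client.

```python
import hashlib

SHA1_LEN = 20  # a SHA-1 digest is 20 bytes long


def piece_is_valid(index: int, data: bytes, pieces_field: bytes) -> bool:
    """Compare the SHA-1 of a downloaded piece with its expected digest."""
    expected = pieces_field[index * SHA1_LEN:(index + 1) * SHA1_LEN]
    return hashlib.sha1(data).digest() == expected
```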

    We developed a partial solution. Once the file is complete, we dump all the pieces to the output file. If the user quits before the file is complete, we create a temporary .dat file in which we save meta info (e.g. amount downloaded, uploaded, etc.) and the 2D byte array with the downloaded pieces. When the program is loaded again, it loads this data through a hashmap, etc., and the byte array is loaded into memory ONLY if the file is not complete. This is the part we still have to fix. If the file is complete when the user quits, the byte array of pieces is dumped to the output file rather than to the .dat file. Then, when the program is loaded again and we are uploading to peers, each time a peer requests a piece we just scan the saved file and extract the bytes for that piece. This works like a charm because we don't have to load all the pieces into memory: at most 32 KB * (number of peers we are uploading to). Much better solution.
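
    Serving a piece from the completed file, as described above, could be sketched like this. It assumes the pieces sit in the output file in index order, so piece i starts at byte i * piece_length; `read_piece` is a hypothetical helper name.

```python
def read_piece(path: str, index: int, piece_length: int = 32 * 1024) -> bytes:
    """Read one piece from a fully assembled file without loading the rest."""
    with open(path, "rb") as f:
        f.seek(index * piece_length)   # jump straight to the piece's offset
        return f.read(piece_length)    # the last piece may come back shorter
```

    Only the requested piece is ever in memory, which matches the "at most 32 KB per peer" bound in the post.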

    When the file is not done and we are uploading the pieces we have to peers, we need a way to get the pieces off the disk. But we don't necessarily download the pieces in order, and we might not get all of the pieces from the current peers. Can anyone suggest how to dump a piece to disk once we get it and extract it again upon request, given that the pieces do not arrive in order? And how do we assemble the pieces in order when we write the final file?

    Sorry for the longwindedness! Thanks
  3. Apr 10, 2006 #2


    Staff Emeritus

    What all the BitTorrent clients I've seen do is pre-allocate the file on disk and then write the data to the file when a chunk is completed. You can use fseek to move the file pointer and write the chunk where it belongs.
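
    A rough sketch of the pre-allocate-and-seek idea, written in Python rather than with C's fseek. The `PieceStore` class and its method names are invented for this example. It assumes a single output file, a fixed piece length, and that each piece has already been verified before it is written.

```python
import os


class PieceStore:
    """Keeps pieces in a pre-allocated file, each at index * piece_length."""

    def __init__(self, path: str, file_length: int, piece_length: int = 32 * 1024):
        self.path = path
        self.piece_length = piece_length
        self.have = set()  # indices of pieces already written to disk
        # Create the file if needed, then size it to the final length up front.
        mode = "r+b" if os.path.exists(path) else "w+b"
        with open(path, mode) as f:
            f.truncate(file_length)

    def write_piece(self, index: int, data: bytes) -> None:
        """Write a completed piece at its final position, in any arrival order."""
        with open(self.path, "r+b") as f:
            f.seek(index * self.piece_length)
            f.write(data)
        self.have.add(index)

    def read_piece(self, index: int) -> bytes:
        """Read a piece back for upload; refuses pieces not yet downloaded."""
        if index not in self.have:
            raise KeyError(f"piece {index} has not been downloaded yet")
        with open(self.path, "rb") as f:
            f.seek(index * self.piece_length)
            return f.read(self.piece_length)
```

    Because every piece lands at its final offset, no reassembly step is needed once the last piece arrives: the pre-allocated file already is the finished download. To resume after a restart, the `have` set would need to be saved alongside the other meta info or rebuilt by re-hashing the pieces on disk.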

    Also, I don't know if you know, but there is a BitTorrent library that you could have used:

    http://libtorrent.rakshasa.no/ [Broken]
    Last edited by a moderator: May 2, 2017
  4. Apr 10, 2006 #3


    Staff Emeritus
    Science Advisor
    Gold Member

    Do you not know how to use the seek() function?

    - Warren
  5. Apr 10, 2006 #4

    Thanks dduardo. I bet that is the solution we would have come to eventually. We are using the seek() function to extract the pieces from the 100% file; I just didn't think of pre-allocation, duh! I was thinking of keeping track of the order of downloaded pieces, writing the pieces in the order downloaded, and rearranging them at the end. Your solution is definitely more natural (and easier).
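
    For comparison, the append-then-rearrange idea mentioned above might look roughly like this. It is only a sketch: `append_piece`, `assemble`, and the `offsets` dictionary are hypothetical names, and both files are assumed to be opened in binary mode.

```python
import io


def append_piece(scratch, offsets: dict, index: int, data: bytes) -> None:
    """Append a piece to the scratch file and remember where it landed."""
    scratch.seek(0, io.SEEK_END)
    offsets[index] = (scratch.tell(), len(data))
    scratch.write(data)


def assemble(scratch, offsets: dict, out) -> None:
    """Copy pieces from the scratch file to `out` in piece-index order."""
    for index in sorted(offsets):
        offset, length = offsets[index]
        scratch.seek(offset)
        out.write(scratch.read(length))
```

    This needs an extra index structure and a full copy pass at the end, which is the overhead that pre-allocation avoids.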

    chroot, dare I say that your reply is approaching scorn? Funny feeling, is all.
    Last edited by a moderator: May 2, 2017