Best way of copying same data to several drives?

  • Thread starter ORF
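In summary, one reply suggests that multicasting (supported by Clonezilla), which sends the data over the network once to all waiting clients, is the fastest way to copy one drive to many without special hardware, although it requires a separate PC for each copy. If all drives can be attached to the same PC, a single read with parallel writes using tar and tee was also tested, with every write running at the speed of the slowest drive. 20 TB is a lot of data, so the copy will take a long time no matter what.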
  • #1
ORF
Hello

I would like to copy a huge amount of data (10-20 TB) to 5-10 HDDs (internal/external).

I thought about using the usual "cp/rsync" utilities, but it will take a long time.

After a quick search, I found this unix utility
http://ask.metafilter.com/260329/best-way-to-clone-my-hard-drive-to-multiple-external-drives-at-once
https://en.wikipedia.org/wiki/Mdadm#Non-RAID_configurations

Is there any limitation or anything to worry about, or can it be used like "cp"?

Thank you for your time :)

Regards,
ORF.
 
  • #2
I think the fastest way to copy data from one drive to many without special hardware is to use multicasting (supported by Clonezilla): the data is sent over the network once and received by all waiting clients. This would require a separate PC for each copy.
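To make the "sent once, received by all" idea concrete, here is a minimal Python sketch of UDP multicast. It is not how Clonezilla implements its multicast mode; the group address, port, chunk size and the 8-byte sequence header are all assumptions made for illustration. Plain UDP can drop or reorder packets and this sketch has no retransmission or end-of-stream marker, so a real tool needs more than this.

```python
import socket
import struct

GROUP = "239.255.0.1"          # hypothetical multicast group address
PORT = 50000                   # hypothetical port
CHUNK = 1024                   # payload bytes per datagram, below a typical MTU
HEADER = struct.Struct("!Q")   # assumed framing: 8-byte sequence number


def make_packet(seq, payload):
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


def parse_packet(packet):
    """Split a datagram back into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(packet)
    return seq, packet[HEADER.size:]


def send_file(path, group=GROUP, port=PORT):
    """Read the file once and send each chunk once to the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # A TTL of 1 keeps the datagrams on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    with sock, open(path, "rb") as src:
        seq = 0
        while chunk := src.read(CHUNK):
            sock.sendto(make_packet(seq, chunk), (group, port))
            seq += 1


def open_receiver(group=GROUP, port=PORT):
    """Return a socket that has joined the group; each client PC runs this."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    membership = struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0")
    )
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock
```

Each receiving PC would call `open_receiver()`, read datagrams with `recvfrom`, and use `parse_packet` to detect gaps in the sequence numbers. The sender's network load stays the same however many clients are listening.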
 
  • #3
Have you found a good solution yet? You could always write a program that will do this. Maybe a command line tool that takes as its first argument the folder to copy over and then all the destination drives. Shouldn't take more than a couple dozen lines of code.
By the way, are all HDDs connected to the same PC? Are they all about equally fast, including the external ones?
Also, where did you get 20 TB HDDs from?
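A minimal sketch of such a tool in Python, assuming the first argument is the source folder and the rest are destination folders. The function names and the 4 MiB chunk size are illustrative choices, not anything specified in the thread. It reads each file once and writes every chunk to all destinations in turn. It does not preserve permissions or timestamps, it follows symlinks to files, and it writes to the destinations one after another rather than in parallel.

```python
import os
import sys

CHUNK = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary illustrative size


def copy_file_to_many(src_path, dest_paths):
    """Read src_path once and write each chunk to every destination file."""
    outputs = [open(path, "wb") for path in dest_paths]
    try:
        with open(src_path, "rb") as src:
            while chunk := src.read(CHUNK):
                for out in outputs:
                    out.write(chunk)
    finally:
        for out in outputs:
            out.close()


def copy_tree_to_many(src_root, dest_roots):
    """Recreate the directory tree under every destination root."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        for root in dest_roots:
            os.makedirs(os.path.normpath(os.path.join(root, rel)), exist_ok=True)
        for name in filenames:
            copy_file_to_many(
                os.path.join(dirpath, name),
                [os.path.join(root, rel, name) for root in dest_roots],
            )


def main(argv):
    """argv[0] is the source folder, argv[1:] are the destination folders."""
    if len(argv) < 2:
        print("usage: SOURCE_DIR DEST_DIR [DEST_DIR ...]", file=sys.stderr)
        return 2
    copy_tree_to_many(argv[0], argv[1:])
    return 0
```

Calling `main(sys.argv[1:])` from a script turns this into the command line tool described above, for example with one mount point per drive as the destination arguments.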
 
  • #4
Hello

@stoomart: thank you. I will try it :)
@DrZoidberg: No, I haven't found a good solution yet... The answers to your questions are: yes, all disks can be connected to the same PC. No, the internal ones are much (3-4 times) faster than the external ones.

Thank you very much for your time :)

Regards,
ORF
 
  • #5
ORF said:
No, the internal ones are much (3-4 times) faster than external ones.

In that case, you can run 3-4 copies at once. Your throughput is maximized when every drive is working at 100% capacity.

20 TB is a lot of data. A drive can run at about 1 Gb/s sustained, so it will take 2 days or so just to read every bit on the source drive, divided by however much parallelism you are able to achieve. If the output drives are 3-4 times slower, it will take a week. It's going to take a long time no matter what.
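The arithmetic behind these estimates can be checked with a few lines of Python. This assumes "Gb/s" means gigabits per second and that 20 TB means 20 × 10^12 bytes; the function name is made up for illustration.

```python
def transfer_days(size_tb, rate_gbit_per_s):
    """Days needed to move size_tb terabytes at rate_gbit_per_s gigabits/s."""
    bits = size_tb * 1e12 * 8                  # decimal terabytes to bits
    seconds = bits / (rate_gbit_per_s * 1e9)   # gigabits/s to bits/s
    return seconds / 86400                     # seconds per day


print(round(transfer_days(20, 1.0), 1))      # 1.9 days at the full 1 Gb/s
print(round(transfer_days(20, 1.0 / 3), 1))  # 5.6 days if the drive is 3x slower
print(round(transfer_days(20, 1.0 / 4), 1))  # 7.4 days if the drive is 4x slower
```

So "2 days or so" for the read and roughly a week for the slower drives both follow from the 1 Gb/s figure.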
 
  • #6
ORF said:
yes, all disks can be connected to the same PC. No, the internal ones are much (3-4 times) faster than external ones.
I did some quick testing with the commands 'tar' and 'tee' for single read/parallel write, which seems ideal if all drives can be connected simultaneously. One thing I noticed was the writes only transferred as fast as the slowest drive, but as @Vanadium 50 mentioned, the best case scenario will still take at least a week with a single read from the source disk.
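The exact command used in that test is not given, so the following is only a Python analogue of the single read/parallel write idea, using the standard `tarfile` module in place of the `tar` and `tee` commands. The `Tee` class and `tar_to_many` function are hypothetical names. Note that this writes one tar archive to each output; unpacking it on each drive would be a separate step.

```python
import tarfile


class Tee:
    """Minimal write-only file object that copies every write to all outputs."""

    def __init__(self, *outputs):
        self.outputs = outputs

    def write(self, data):
        for out in self.outputs:
            out.write(data)
        return len(data)


def tar_to_many(source_dir, outputs):
    """Read source_dir once and stream the same tar archive into every output."""
    # "w|" is tarfile's non-seekable stream mode, so only write() is required.
    with tarfile.open(fileobj=Tee(*outputs), mode="w|") as archive:
        archive.add(source_dir, arcname=".")
```

The `outputs` would be binary files opened on each destination drive, for example `open("/mnt/drive1/data.tar", "wb")` (a made-up path).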
 
  • #7
I don't know what will happen with tee if one drive falls behind. I suspect it will not be pretty.
 
  • #8
Vanadium 50 said:
I don't know what will happen with tee if one drive falls behind. I suspect it will not be pretty.
Using the command 'iotop' during my test, I saw the other drives slow down, so the same transfer rate was used for all drives.
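One plausible explanation for this, sketched below in Python, is backpressure: when the data in flight to each destination is bounded, the reader has to wait as soon as the slowest destination's buffer is full, and every other drive then waits with it. This is a model with assumed names and an assumed buffer size, not a description of how `tee` is implemented. Error handling is left out, so a writer that fails would leave the reader blocked.

```python
import queue
import threading


def fan_out(chunks, writers, max_buffered=4):
    """Send every chunk to every writer callable through its own bounded queue."""
    queues = [queue.Queue(maxsize=max_buffered) for _ in writers]

    def drain(q, write):
        # None is the end-of-stream marker put on the queue by the reader.
        while (chunk := q.get()) is not None:
            write(chunk)

    threads = [
        threading.Thread(target=drain, args=(q, write))
        for q, write in zip(queues, writers)
    ]
    for thread in threads:
        thread.start()
    for chunk in chunks:
        for q in queues:
            q.put(chunk)  # blocks while this destination's queue is full
    for q in queues:
        q.put(None)
    for thread in threads:
        thread.join()
```

Each entry in `writers` could be the `write` method of a file opened on one drive. A fast drive can run ahead of a slow one by at most `max_buffered` chunks, after which all of them proceed at the slowest drive's rate.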
 
  • #9
Yes, the tar/tee method sounds like a good solution. But maybe you want to do all the internal drives first and afterwards copy to the external ones.
 

1. What is the best way to copy the same data to multiple drives?

The best way to copy the same data to multiple drives is to use disk cloning software or a file synchronization tool. This helps ensure that all the drives end up with identical copies of the data.

2. Can I use the traditional copy and paste method to copy data to multiple drives?

While you can use the traditional copy and paste method, it is not recommended as it can be time-consuming and prone to errors. Using specialized software will save time and ensure accuracy.

3. What are the benefits of using disk cloning software for copying data to multiple drives?

Disk cloning software can copy data to multiple drives faster and more efficiently. It also helps ensure that all the drives have identical copies, reducing the chances of errors or missing files.

4. Is it possible to copy data to multiple drives simultaneously?

Yes, it is possible to copy data to multiple drives simultaneously using specialized software. This can save time and effort, especially when dealing with large amounts of data.

5. Are there any precautions I should take when copying data to multiple drives?

It is important to ensure that all the drives are properly formatted and have enough storage space to accommodate the data. It is also recommended to make backups of the original data and to double-check the copies for accuracy.
