
Best way to copy millions of files between 2 servers

I have roughly 5 million small files (5-30 KB each) in a single directory that I would like to copy to another machine on the same gigabit network. I tried rsync, but it slowed to a crawl after a few hours of running, presumably because rsync has to check the source and destination copy of every file.
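For reference, a plain recursive rsync of the whole directory looks something like this (paths are placeholders):

# archive mode with progress output; rsync stats every file on both ends,
# which is exactly what becomes slow with millions of small files
rsync -a --progress /source/dir/ user@target:/dest/dir/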

Guest [Entry]

"To copy millions of files over a gigabit switch (in a trusted environment) you may also use a combination of netcat (or nc) and tar, as already suggested by user55286. This will stream all the files as one large file (see Fast File Copy - Linux! (39 GBs)).

# requires netcat on both servers
nc -l -p 2342 | tar -C /target/dir -xzf - # destination box
tar -cz /source/dir | nc Target_Box 2342 # source box
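
Start the listening side (the destination box) first so the sender has something to connect to. If you want to keep an eye on throughput, pv can be dropped into the sending pipeline, assuming pv is installed (host and port are the same placeholders as above):

# pv prints the current transfer rate of the stream passing through it
tar -cz /source/dir | pv | nc Target_Box 2342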
Guest [Entry]

"I prefer using lz4 as fastest compression tool at the moment. SSH option -c arcfour128 uses faster encryption algorithm than default. [1]

So directory transfer looks something like:

tar -c folder | lz4 -c | ssh -c arcfour128 somehost 'lz4 -d | tar -x'

Note that on Debian the lz4 command is lz4c, while on CentOS it's lz4.
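
A quick way to check which binary name your distribution actually ships, as a small shell sketch:

# print the first lz4 binary found on PATH, whichever name it uses
for name in lz4 lz4c; do
    command -v "$name" && break
done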
Guest [Entry]

"Robocopy is great for things like this. It will try again after network timeouts and it also allows you set an inter-packet gap delay to now swamp the pipe.

[Edit]

Note that this is a Windows-only application.
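
As a rough illustration of the flags involved (the paths below are placeholders): /R and /W control the number of retries and the wait between them after a failure, and /IPG inserts an inter-packet gap in milliseconds to throttle the transfer.

:: /E copies subdirectories (including empty ones), /Z uses restartable mode
robocopy \\sourcebox\share\files D:\files /E /Z /R:5 /W:10 /IPG:64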
Guest [Entry]

I know this may sound stupid - but have you thought of just copying the files onto an external disk and carrying it over to the other server? It may actually be the simplest and most efficient solution.