
Windows file copy dialog: Why is the estimation so... BAD?

Asked by: Guest | Views: 376
Total answers/comments: 5
Guest [Entry]

"In short: the poor algorithm and the jumpy estimation are really an implementation weakness.

Other tools like TeraCopy do a better job. I think it is not worth explaining why their implementation is not good. They will have noticed it and will improve.

What is difficult:

You have to take into account resource fluctuations (mainly CPU, network bandwidth, and HDD speed).
You need to extrapolate the time it will take by predicting the behavior (which Windows file copy definitely does badly right now).
You need to adjust your original estimate over time (small adjustments, I mean, not like in the funny picture above!).

For this, not only the number of bytes but also the number of files to create plays a role. If you have a million 1 KB files or a thousand 1 MB files, the situation will be quite different, because the former has the overhead of creating a great many files. Depending on the filesystem used, this could take more time than actually transferring the data.
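To make the point about per-file overhead concrete, here is a minimal sketch of a cost model that charges a fixed cost per file plus the time to move the bytes. The 5 ms per-file overhead and the 100 MB/s throughput are made-up figures for illustration, not measurements of any real filesystem, and `estimate_seconds` is a hypothetical helper.

```python
KB = 1024
MB = 1024 * KB

def estimate_seconds(file_count, total_bytes, per_file_overhead_s, bytes_per_second):
    """Estimated copy time = fixed cost per file + time to transfer the bytes."""
    return file_count * per_file_overhead_s + total_bytes / bytes_per_second

# Assumed figures, chosen only for illustration.
OVERHEAD_S = 0.005        # 5 ms to create each file
THROUGHPUT = 100 * MB     # 100 MB/s sustained transfer rate

# Roughly the same amount of data, split very differently.
many_small = estimate_seconds(1_000_000, 1_000_000 * KB, OVERHEAD_S, THROUGHPUT)
few_large = estimate_seconds(1_000, 1_000 * MB, OVERHEAD_S, THROUGHPUT)

print(f"a million 1 KB files:  {many_small:.1f} s")
print(f"a thousand 1 MB files: {few_large:.1f} s")
```

Under these assumptions the million small files are dominated almost entirely by file creation, while the thousand large files are dominated by the data transfer, so an estimate based on bytes alone would be far off for the first case.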

This dialog has also driven me mad quite a few times:

On an older WinNT system, if you had a lot of small files to copy, it displayed the name and a nice animation for each file, slowing the whole process down so much that it was practically unusable.

The modern Windows copy stuff is not much better:

To compute the amount of data to transfer, it seems to scan everything first (that is what I suppose it does), so if you select many directories it takes ages before it actually starts doing the job.
Some built-in timeout prevents big files from being copied (more than about 60 GB on my system). The pain is that it only tells you this after having already copied more than 30 GB over the network, and that is lost bandwidth and time because you have to restart from scratch!
Copying files from one computer to another is damn slow for some reason. (I mean compared with the available network bandwidth; other tools are faster, so it's not a computational limitation.)"
Guest [Entry]

"I am going to count to ten, 1....2....3....4 how many dots is it going to take to get to 10?

5.6.7 What about now? Do you take into account all past dots between numbers and average them, do you take only the last 4 intervals and use that average, or do you look only at the last interval?

You have the same problem with file transfers. The transfer speed is not constant; it speeds up and slows down based on a lot of factors. The reason the number jumps around so much is that Microsoft leaned toward the "only count the last interval" side of the spectrum.

There is nothing wrong with that side of the spectrum; it gives you more accurate "seconds per second" (one second in real time makes the counter go down by one second), but it causes the total ETA of the timer to jump around a lot.

A good example of the opposite side is 7-Zip when it is compressing. If the speed of the compression drops as it processes you can see that the ETA does not jump dramatically like a file transfer ETA, but it may take 2 to 3 real seconds before the timer ticks down one second (or it even may start counting up) until it stabilizes at the new speed."
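The trade-off described in this answer can be sketched in a few lines. This is only an illustration, not how Windows or 7-Zip actually compute their estimates: `eta_last_n` is a hypothetical helper, the speed samples are invented, and the remaining amount is held fixed so that only the effect of the averaging window shows.

```python
def eta_last_n(speeds, remaining, n):
    """ETA in seconds using the mean of the last n speed samples.

    n=1 looks only at the last interval; n=len(speeds) averages all history.
    """
    recent = speeds[-n:]
    avg = sum(recent) / len(recent)
    return remaining / avg if avg > 0 else float("inf")

# Hypothetical per-second throughput in MB/s: steady, then a sudden dip.
samples = [100, 100, 100, 100, 100, 100, 100, 10]
remaining_mb = 1000  # held constant here to isolate the effect of the window

eta_jumpy = eta_last_n(samples, remaining_mb, 1)              # last interval only
eta_smooth = eta_last_n(samples, remaining_mb, 4)             # last 4 intervals
eta_all = eta_last_n(samples, remaining_mb, len(samples))     # whole history

print(f"last interval:    {eta_jumpy:.1f} s")
print(f"last 4 intervals: {eta_smooth:.1f} s")
print(f"all history:      {eta_all:.1f} s")
```

Before the dip all three windows agree on 10 seconds. After a single slow sample, the last-interval estimate jumps to 100 seconds, while the wider windows barely move. If the slowdown turns out to be permanent, the wider windows are the ones that are wrong for longer, which is the lag the answer describes for 7-Zip.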
Guest [Entry]

"The obvious reason is that the speed of the transfer varies over time, and so does the average, and so does the prediction. To explain this to a non-tech friend, I've used an analogy involving travel by air. You're going to fly over the Atlantic. When you arrive by taxi at the departure airport, your ETA, based on your average speed so far, is about two months. When you disembark at the arrival airport, based on your average speed so far, you will reach your friend's house in 5 seconds.

But you need to appreciate how much the speed can actually vary, even with what seems like a predictable scenario, like copying files within the same disk, or between two local disks. One of the new features I like in Windows 8 is the ability to graph the speed over time if you click "more details". If you don't have access to a Windows 8 machine, search images for Windows 8 copy dialog for a lot of examples. Many of them are fairly flat, but many of them are also disturbingly bumpy, to the point that you wonder whether the hard drive is actually healthy, when it dips to zero.

Some of these bumps are likely due to variations in file size (smaller files yield more accesses, which slows things down, especially on a mechanical hard drive, which has to seek by moving its read head), but some of it might just be a cheap drive which stalls on the slightest touch to prevent damage to the platters.

There are better and worse ETA prediction algorithms, but for an accurate prediction, the computer would have to be all-knowing. The risk of trying to make the algorithm "smart" is that it might create new, unforeseen cases where it's even more hilariously wrong."
Guest [Entry]

"The only way to know how long it'll take to compress a set of files is to compress them. Sometimes Windows' best guess is close, sometimes it's wildly wrong. The same is true of copying large numbers of files, as I'm sure you've noticed.

It's not so much a bug as a useless display of seldom-accurate information. The best way to fix it is to close your eyes. Ignore it. ;-)

Perhaps there's a program out there that can copy/compress files and make an alarm sound when it finishes. That would be truly useful. We could have a little nap while we wait for Windows to finish the housecleaning."
Guest [Entry]

"I think the reason was nicely explained in one of the comments of the blog post linked by Roald's answer:

It has a horrible estimate algorithm. There are no excuses. If it has to copy 1000 1 KB files and 10 1 MB files, it thinks it will be as busy with the 1 MB files as with the 1 KB files.

The reason it gives such horrible estimates is that it's not well done. Obviously it can never be 100% precise but it could be much, much better."
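Taking the quoted comment's claim at face value (it is the commenter's description, not a confirmed account of how Windows works), the effect of treating every file as equal work can be shown with a little arithmetic. The helper names below are hypothetical.

```python
KB = 1024
MB = 1024 * KB

small_files = [1 * KB] * 1000   # 1000 files of 1 KB each
large_files = [1 * MB] * 10     # 10 files of 1 MB each
all_files = small_files + large_files

def progress_by_count(done, total):
    """Fraction complete if every file counts as the same amount of work."""
    return len(done) / len(total)

def progress_by_bytes(done, total):
    """Fraction complete if work is proportional to bytes copied."""
    return sum(done) / sum(total)

# State after all the small files have been copied and none of the large ones.
by_count = progress_by_count(small_files, all_files)
by_bytes = progress_by_bytes(small_files, all_files)

print(f"by file count: {by_count:.1%}, by bytes: {by_bytes:.1%}")
```

The two measures disagree by a factor of about eleven at the same moment. Neither is right on its own: as the first answer points out, creating files has a cost too, so a better estimate would have to weigh both the number of files and the number of bytes.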