
What is the difference between “Size” and “Size on disk?”

Looking at the properties for a Windows file, I see two attributes, “Size” and “Size on disk,” and “Size on disk” is always larger.

Asked by: Guest | Views: 377
Total answers/comments: 2
Guest [Entry]

"Size is the actual size of the file in bytes.

Size on disk is the amount of space the file actually occupies on the disk. The two differ because the file system does not allocate space byte by byte; it hands out space in blocks of a fixed size, so a file always occupies a whole number of blocks.

For a more detailed explanation, see this text which I copied from another site:

We know that a disk is made up of tracks and sectors. In Windows, that means the OS allocates space for files in "clusters" or "allocation units".

The size of a cluster can vary, but typical sizes range from 512 bytes to 32 KB or more. For example, on my C:\ drive, the allocation unit is 4096 bytes. This means that Windows will allocate 4096 bytes for any file, or any final portion of a file, that is from 1 to 4096 bytes in length.

If I have a file that is 17 KB (17,408 bytes), then the Size on disk would be 20 KB (20,480 bytes). The calculation is 17,408 / 4096 = 4.25, which rounds up to 5 allocation units, and 5 x 4096 = 20,480 bytes. It takes 5 allocation units to hold a 17 KB file.

Another example would be a file that is 2000 bytes in size. The file size on disk would be 4096 bytes. The reason is that even though the entire file fits inside one allocation unit, it still takes up 4096 bytes of space (one allocation unit) on disk, because an allocation unit can be used by only one file and cannot be shared with other files.
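The two examples above can be checked with a few lines of Python. This is only a sketch of the rounding rule described here: the function name `size_on_disk` and the default 4096-byte cluster are my own choices for illustration, and it ignores anything file-system specific (compression, sparse files, and so on).

```python
def size_on_disk(file_size: int, cluster_size: int = 4096) -> int:
    """Round file_size up to a whole number of clusters (illustrative only)."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters = -(-file_size // cluster_size)  # ceiling division with integers
    return clusters * cluster_size


# The two examples from the text, plus a file that fills a cluster exactly.
for size in (2000, 17 * 1024, 4096):
    print(size, "bytes ->", size_on_disk(size), "bytes on disk")
```

A file whose size is an exact multiple of the cluster size (the last case) has no rounding at all, so Size and Size on disk come out equal.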

So the size on disk is the total space of all the allocation units in which the file is saved. That means the size on disk is usually greater than the actual size; the two are equal only when the file size is an exact multiple of the allocation unit size.

So the actual size of a file or folder should always be taken from the Size value when viewing the properties window.

Source: What's The Difference Between Size And Size On Disk In Windows Folder Properties."
Guest [Entry]

"Cluster Slack Space

A file system does not keep track of each individual byte on a storage medium separately. To do so would be terribly inefficient: the system needs some way of keeping track of which parts are used and which are free (i.e., a list), and doing so for each byte separately would create too much overhead (with one entry per byte, i.e. 1-to-1, the list would be as big as the medium itself!).

Instead, the medium is broken up into chunks, blocks, units, groups, whatever you want to call them (the technical term is clusters), each of which contains the same fixed number of bytes (you can usually choose the cluster size when formatting, since different uses call for different sizes to reduce waste).

When a file is saved to disk, the size of the file is divided by the cluster size and rounded up to a whole number of clusters. This means that unless the file size is exactly divisible by the cluster size, part of the last cluster ends up being unused and thus wasted.

When you view the properties for a file, you see the true size of the file as well as the size it takes up on disk, which includes any “slack”, that is, the “cluster tips” that are unused. This is usually not much per file, and the size on disk will usually be almost equal to the actual size, but when you add up the wasted space from all the thousands of files on a drive, it can become substantial. Therefore, when you view the size of a large folder, especially one with many tiny files that are smaller than a cluster, the size on disk (i.e., the amount of disk space marked as used) can end up being significantly larger than the actual size (i.e., the amount of space the files actually require).

In a case like the above, what you can try is to reduce the cluster size so that each file wastes less space. Generally, a drive with mostly lots of little files should use the smallest cluster size possible (to reduce waste), and a drive with mostly large files should use the largest cluster size possible (this way the bookkeeping structures end up being smaller).
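To put rough numbers on that trade-off, here is a small Python sketch. The helper `slack_and_clusters` and both workloads (10,000 files of 1,000 bytes each, and a single 1 GiB file) are made up purely for illustration; the cluster count stands in for how many units the file system has to keep track of.

```python
def slack_and_clusters(file_sizes, cluster_size):
    """Return (total slack in bytes, total clusters used) for the given sizes."""
    total_slack = 0
    total_clusters = 0
    for size in file_sizes:
        clusters = -(-size // cluster_size)  # round up to whole clusters
        total_clusters += clusters
        total_slack += clusters * cluster_size - size
    return total_slack, total_clusters


# Hypothetical workloads, chosen only to show the effect.
small_files = [1000] * 10_000   # many tiny files
large_file = [1024 ** 3]        # one 1 GiB file

for cluster_size in (512, 4096, 32768):
    slack, _ = slack_and_clusters(small_files, cluster_size)
    _, clusters = slack_and_clusters(large_file, cluster_size)
    print(f"cluster {cluster_size:>5}: small files waste {slack:,} bytes; "
          f"the large file needs {clusters:,} clusters")
```

With these numbers, the tiny files waste about 240 KB in total at 512-byte clusters but over 300 MB at 32 KB clusters, while the large file drops from about 2 million clusters to 32,768.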

Even at a lower level, if each cluster is only a single sector, unless a file is an exact multiple of the size of the sectors on the drive (traditionally 512 bytes, now often 4,096 with Advanced Format disks), then there will still be unused space between the end of the file and the end of the sector.

Compression

Another scenario where you might see a difference between the actual file size and size on disk is with compression. When a drive is compressed (e.g., using DriveSpace, NTFS compression, etc.), there will be a difference between the size of the actual file (which still needs to be known) and the space that the file actually occupies (i.e., uses or “takes up”) on the disk. In this case the size on disk is typically the smaller of the two.
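If you want to compare the two numbers from a script, something like the following Python sketch may help. The function name `reported_disk_size` is my own. On Windows it calls the Win32 function `GetCompressedFileSizeW` through `ctypes`; as far as I know, that call reports the bytes actually used for compressed or sparse files and simply the normal file size for other files, so it will not necessarily match Explorer's cluster-rounded figure. On other systems it falls back to `st_blocks`, which counts 512-byte units.

```python
import ctypes
import os
import sys


def reported_disk_size(path: str) -> int:
    """Return the on-disk size the OS reports for path (see caveats above)."""
    if sys.platform == "win32":
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        get_size = kernel32.GetCompressedFileSizeW
        get_size.argtypes = [ctypes.c_wchar_p, ctypes.POINTER(ctypes.c_ulong)]
        get_size.restype = ctypes.c_ulong
        high = ctypes.c_ulong(0)
        low = get_size(path, ctypes.byref(high))
        # 0xFFFFFFFF is INVALID_FILE_SIZE; it is an error only if
        # GetLastError is also non-zero.
        if low == 0xFFFFFFFF:
            error = ctypes.get_last_error()
            if error != 0:
                raise ctypes.WinError(error)
        return (high.value << 32) | low
    # POSIX fallback: st_blocks is the number of 512-byte blocks allocated.
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else __file__
    print("Size:        ", os.path.getsize(target))
    print("Size on disk:", reported_disk_size(target))
```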

Shortcuts and Hardlinks

Yet another scenario that could result in a difference is with hardlinks. With file systems that support hardlinks, creating a hardlink to a file does not make a whole new copy that takes up space for itself; instead, the file system adds another name (directory entry) that points to the same data, so that both (or all three, etc.) names refer to the same physical file on disk. Therefore, when there are two names pointing to the same data, they each report the same size, but together they take up only slightly more than the space needed to store a single copy."
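The hardlink behaviour is easy to see for yourself. Below is a minimal Python sketch; the function name `demo_hard_link` and the file names are arbitrary, and it assumes the temporary folder lives on a file system that supports hardlinks (NTFS and most Unix file systems do, FAT does not).

```python
import os
import tempfile


def demo_hard_link():
    """Create a file plus a hardlink to it; return (same_file, sizes, link_count)."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "original.bin")
        link = os.path.join(folder, "link.bin")

        with open(original, "wb") as f:
            f.write(b"x" * 10_000)

        os.link(original, link)  # second name for the same data, not a copy

        same_file = os.path.samefile(original, link)
        sizes = (os.path.getsize(original), os.path.getsize(link))
        link_count = os.stat(original).st_nlink
        return same_file, sizes, link_count


same_file, sizes, link_count = demo_hard_link()
print("same underlying file:", same_file)
print("reported sizes:", sizes)
print("names pointing at the data:", link_count)
```

Both names report 10,000 bytes, yet the data is stored only once, which is why adding up the Size of every name in a folder can overstate the space actually used.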