Why are there two types of GB?
Why is a KB 1024 and not 1000 bytes?
SI Decimal Units:
1 KB = 1,000 bytes
1 MB = 1,000 KB = 1,000,000 bytes
1 GB = 1,000 MB = 1,000,000,000 bytes
1 TB = 1,000 GB = 1,000,000,000,000 bytes
These are what we call base 10 measurements. Some refer to them as 10^x measurements.
IEC Binary Units:
1 KiB = 2^10 bytes = 1024 bytes
1 MiB = 2^20 bytes = 1024 KiBs = 1,048,576 bytes
1 GiB = 2^30 bytes = 1024 MiBs = 1,073,741,824 bytes
1 TiB = 2^40 bytes = 1024 GiBs = 1,099,511,627,776 bytes
These are referred to as base 2 measurements. Some refer to them as 2^x measurements or base-1024 measurements.
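Both tables above follow directly from the two bases. A minimal Python sketch (the variable names are my own) that generates the SI and IEC values side by side:

```python
# SI units grow by factors of 1000 (10^3); IEC units grow by factors of 1024 (2^10).
SI_PREFIXES = ["KB", "MB", "GB", "TB"]
IEC_PREFIXES = ["KiB", "MiB", "GiB", "TiB"]

for i, (si, iec) in enumerate(zip(SI_PREFIXES, IEC_PREFIXES), start=1):
    si_bytes = 1000 ** i   # e.g. 1 GB  = 1000^3 = 1,000,000,000 bytes
    iec_bytes = 1024 ** i  # e.g. 1 GiB = 1024^3 = 1,073,741,824 bytes
    print(f"1 {si:>3} = {si_bytes:>16,} bytes    1 {iec:>3} = {iec_bytes:>16,} bytes")
```

Running this reproduces the two tables, which makes the growing gap between the systems easy to see at a glance.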
So here is where it gets tricky. A person might ask, "If a disk is listed as being 1 kilobyte, how many bytes does that really represent?" Most people will rely on the historical use of the SI prefixes (kilo-, mega-, giga-, tera-, peta-, etc.), so their logic may go something like this: "1 kilogram is 1,000 grams and 1 kilometer is 1,000 meters, so I deduce that 1 kilobyte is 1,000 bytes." But that logic isn't necessarily correct, because it is very common for storage capacity to be listed in GB when it is actually measured in GiB. Whether you're working in a data center or buying a hard drive at Best Buy, this ambiguity plagues the storage industry with confusion. When you see a capacity listed in MB, GB, TB, PB, and on up, you don't know for sure whether you are really looking at MiB, GiB, TiB, PiB, etc. For any capacity measurement other than raw bytes, you must always ask yourself, "Which system of measurement is being used to count the bytes listed?" For instance, can you say whether your operating system uses GB or GiB? When you use Windows, Linux, or OS X to look at your hard drive's size, it is listed in GB. Is this the SI standard of measurement? The answer is that your operating system uses IEC units but labels them as SI units. It should say GiB, not GB. To understand the reason for this we have to go back into history.
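The mislabeling described above has a practical consequence you can compute. A short Python sketch (the drive size is just an illustration) of what happens when a drive sold as "1 TB" in SI units is displayed by an OS that divides by 2^30 but still prints "GB":

```python
marketed_bytes = 1 * 1000 ** 4           # "1 TB" as the manufacturer counts it (SI)
os_display = marketed_bytes / 1024 ** 3  # the OS divides by 2^30 (i.e. GiB)...
print(f'OS shows: {os_display:.2f} "GB"')  # ...but labels the result "GB"
# → OS shows: 931.32 "GB"
```

This is why a newly purchased "1 TB" drive appears to be missing roughly 69 "GB" the moment you plug it in: nothing is missing, the two sides are simply counting in different bases.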
So here is where the mystery starts. If the IEC units were created in 1999, why were KBs 1024 bytes long before the IEC standard existed? Why weren't they 1,000 bytes from the beginning? Since all they had back then were SI units, why not use them correctly? The answer lies in how computers are physically built. When it comes to actual bytes of memory, there is a reason the total quantity is never a power of ten: addressable space and the nature of binary numbers. Memory addresses in computers are a certain number of bits wide, and storage has historically been confined more by the limit of addressable space than by available physical capacity. For example, if you have a computer with 3-bit addresses, then you have a total of 8 (2^3) addresses available; no matter how much storage you attach, you can only ever address 8 bytes. Computers count in binary, and therefore the maximum number of addresses will never be a power of ten. With 10 bits, for instance, a computer can address 2^10 bytes, which equals 1024 memory locations. With 16 bits, like the old 6502 microprocessor's 16-bit-wide memory addresses, a computer can address 2^16 memory locations.
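The relationship between address width and addressable space can be checked directly. A minimal Python sketch (the function name is my own):

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an N-bit address can form."""
    return 2 ** address_bits

print(addressable_bytes(3))   # 8 -- the 3-bit example above
print(addressable_bytes(10))  # 1024, i.e. the "1 KB" of early machines
print(addressable_bytes(16))  # 65536 -- the 6502's 16-bit address space
```

Note that no choice of `address_bits` ever yields a power of ten, which is exactly why memory sizes gravitated to 1024 rather than 1000.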
Let us use the metaphor of addresses on a street of vacant lots. If the city grants you 1024 addresses and you have the space, then you're going to build on every lot. The way a computer generates addresses means that the total number of addresses will never equal 10^x. Designers could have squandered those extra 24 addresses and built only 1000 houses, but that would be a waste. In the old days memory was at a premium, so every addressable byte was used; you weren't going to squander bytes just to conform to SI measurements. If your first disk holds 1024 bytes and the IEC standard doesn't exist, you are going to call it a kilobyte (KB) because it is close enough. This is why the KB was 1024 bytes in length before the KiB was created, and why the SI prefix was used incorrectly from the beginning.
If this was a requirement of the computer, then why weren't the IEC units created earlier? They could equally have called a 1 KB disk back then a 1.024 KB disk, but I suppose that would have sounded overly precise and nerdy. Historically we were already in the habit of using the SI prefix kilo. It made sense to approximate, and so this became the standard in operating systems and all the software built on them. When drives became bigger and we moved to the megabyte, we should have invented the mebibyte, because the round-off error was no longer acceptable; but because of the strength of our historical habit of using kilo, mega, etc., these prefixes stuck.
We continue this round-off with each order of magnitude today. Though the error is negligible at small sizes, it gets more and more significant as storage device capacities increase. Here is a quick reference showing how much the 2^x sizes your OS reports differ from the corresponding SI sizes (see the section titled Size Converted):
· 1 Megabyte in the OS (MiB) has 48,576 bytes more than the SI MB of 1,000,000 bytes
· 1 Gigabyte in the OS (GiB) has 73,741,824 bytes more than the SI GB of 1,000,000,000 bytes
· 1 Terabyte in the OS (TiB) has 99,511,627,776 bytes more than the SI TB of 1,000,000,000,000 bytes
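The figures in the list above follow directly from the unit definitions, and a quick Python check confirms them:

```python
# Difference between each IEC (base-2) unit and its SI (base-10) counterpart.
for power, (iec, si) in enumerate([("MiB", "MB"), ("GiB", "GB"), ("TiB", "TB")], start=2):
    diff = 1024 ** power - 1000 ** power
    print(f"1 {iec} - 1 {si} = {diff:,} bytes")
# → 1 MiB - 1 MB = 48,576 bytes
# → 1 GiB - 1 GB = 73,741,824 bytes
# → 1 TiB - 1 TB = 99,511,627,776 bytes
```

Note how the gap grows from about 4.9% at the mega level to about 10% at the tera level, which is why the round-off only became a serious problem as drives grew.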
This is perhaps why it took until 1999 for the IEC to invent the new standard: it wasn't until then that the round-off error had become so huge. A clean, accurate unit was needed to describe these quantities in round numbers, and thus the KiB (and its larger siblings) was born, around the time the gigabyte was becoming widespread. But it was too late. Too much software had been written using the incorrect SI units. What was the world to do, go back and add lowercase "i"s between all the KBs and GBs across billions of lines of code? The computer scientists of those days didn't know we would achieve such incredible mass storage systems; some didn't even think it was possible. So it is understandable that, lacking the IEC measurements, they thought an approximation of the SI units would be acceptable. Today, the SI units are used correctly mostly in the hardware industry, where they make hard drives seem bigger than they really are; measured correctly in SI units, what your software calls 1 TB (really 1 TiB) is 1.0995 TB. As for the IEC standard, some memory manufacturers list capacity in both GB and GiB for clarification. In the software industry the incorrect use of the SI units is still very entrenched. As for storage arrays (servers that specialize in mass storage), some report GB as 10^x, some as 2^x, and some as both. Beware of the GB, for it is not always what it seems.