Understanding The PDF File Format – Indexed Colorspaces

Indexed colorspaces are a very useful way of reducing the amount of memory and disk space needed when an image only uses a limited number of colours. They are best explained by an example.

Let’s imagine you have an image in CMYK colour. That means you need 4 bytes to describe the colour of a pixel (one byte each for cyan, magenta, yellow and black). If the image had 4,000 pixels, that would be 16,000 bytes.

Now, let’s imagine that the image only used 256 colours. Instead of storing the full colour value for each pixel, we could have a table with 256 entries (that is 1,024 bytes, storing the 4 bytes for each CMYK colour used) and then just store 1 byte per pixel to say which entry in the table to use. That would be a total of 4,000 bytes for the pixel indices and 1,024 bytes for the look-up table. That is 5,024 bytes (quite a reduction from the 16,000 bytes).
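The arithmetic above can be checked with a small sketch. The function names `direct_size` and `indexed_size` are made up for this illustration, and the sketch assumes 8 bits per colour component and ignores any padding a real file might add at the end of each row of pixels.

```python
def direct_size(pixels, components=4):
    """Bytes needed when every pixel stores its full colour (1 byte per component)."""
    return pixels * components


def indexed_size(pixels, table_entries, components=4, bits_per_index=8):
    """Bytes needed for a look-up table plus one index per pixel."""
    table_bytes = table_entries * components
    # Round the index data up to a whole number of bytes.
    index_bytes = (pixels * bits_per_index + 7) // 8
    return table_bytes + index_bytes


print(direct_size(4000))        # full CMYK value for every pixel
print(indexed_size(4000, 256))  # 256-entry table plus 1 byte per pixel
```

With the figures from the example, the first call gives 16,000 bytes and the second gives 5,024 bytes.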

And if we do not need 256 colours, we could do even better by squashing the bits together. If we only had 2 colours, we would still need 8 bytes for the look-up table but only a single bit (one eighth of a byte) for each pixel to say which colour to use. That is 500 bytes for the pixels plus 8 bytes for the table, a total of 508 bytes for everything!
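Here is a minimal sketch of what "squashing the bits together" could look like for a 2-colour image. The helpers `pack_bits` and `unpack_bit` are hypothetical names, the bit order (most significant bit first) is an assumption of this sketch, and the two table entries are just illustrative CMYK values for black and white.

```python
def pack_bits(indices):
    """Pack a sequence of 0/1 table indices into bytes, most significant bit first."""
    packed = bytearray((len(indices) + 7) // 8)
    for i, index in enumerate(indices):
        if index:
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def unpack_bit(packed, i):
    """Read back the index of pixel i from the packed data."""
    return (packed[i // 8] >> (7 - i % 8)) & 1


# 4,000 pixels alternating between table entry 0 and table entry 1.
pixels = [i % 2 for i in range(4000)]
packed = pack_bits(pixels)

# Two CMYK entries of 4 bytes each (illustrative values: black, then white).
table = bytes([0, 0, 0, 255, 0, 0, 0, 0])

print(len(packed), len(table), len(packed) + len(table))
```

Eight pixels now share each byte, so the 4,000 pixels fit in 500 bytes, and the table adds the remaining 8.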

This is what an indexed colorspace does. It allows us to store the colours in a look-up table and then store the index for each pixel, not the colour itself. This substantially reduces the size of the data needed. It works best with fewer colours, so images with a huge number of colours do not gain as much and could actually take up more memory. But for appropriate images, going from 16,000 bytes to 508 bytes is a substantial size reduction.
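Going the other way, a viewer has to turn each stored index back into a real colour. The following is only a rough sketch of that look-up step, not how any particular PDF library does it; `expand_indexed` is a hypothetical name and the table values are illustrative.

```python
def expand_indexed(indices, lookup, components=4):
    """Replace each index with the colour components stored in the look-up table."""
    out = bytearray()
    for index in indices:
        start = index * components
        out += lookup[start:start + components]
    return bytes(out)


lookup = bytes([0, 0, 0, 255,   # entry 0: black (full K)
                0, 0, 0, 0])    # entry 1: white (no ink)

indices = [0, 1, 1, 0]          # four pixels, each naming a table entry
cmyk = expand_indexed(indices, lookup)
print(list(cmyk))
```

In a PDF file the colorspace itself is written as an array of the form `[/Indexed /DeviceCMYK 255 <look-up data>]`, where the number is the highest valid index and the look-up data holds the table entries one after another.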

This article is part of a series of articles all about the PDF file format. You can read all the previous articles here.


By Mark Stephens

Mark Stephens runs IDRsolutions, developing the JPedal PDF library in Java, and shares his thoughts on Java, PDF, the Business of Software and Mediaeval History at http://www.jpedal.org/PDFblog .
