The Tickle Trunk
Guide to JPEG-89 Compression
This page contains an applet which demonstrates JPEG-89 compression, including the rarely used Hierarchical variant. The applet will make the most sense if you are already familiar with the technical details of JPEG-89 compression. Those details are discussed briefly below, but you may find it useful to refer to the Wikipedia article on JPEG for more.
The applet requires that your browser support at least the Java 2 runtime. If the applet above doesn't work properly, this is almost certainly the problem. You can download the latest Java runtime from Sun to correct this.
JPEG-89 compression is lossy, meaning that the reconstructed image has less information than the original, but hopefully the missing information is stuff you wouldn't have noticed anyway. The compressor takes advantage of two aspects of the human visual system: the fact that colour information is less important than brightness, and the fact that we tend not to notice the details of the parts of an image that are "busy."
To take advantage of the first aspect, the compressor first separates the image into three "channels". The first channel (Y) contains the brightness information (it is like a black and white version of the original picture). The other two channels (Cb and Cr) contain the colour information. When we get to the actual compression stage, the compression will be applied more heavily to the colour channels (they will lose more information). When the image is decompressed, the reconstructed Cb and Cr channels will be rather poor imitations of what they originally looked like, but we won't notice because the Y channel is still a close match, and that's what our brains really pay attention to.
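To make the channel split a little more concrete, here is a minimal sketch in Python of how one pixel might be converted from RGB to Y, Cb and Cr. The coefficients are the ones commonly used in JFIF files; that choice is my assumption, as is the function name `rgb_to_ycbcr`, and the applet itself may do things differently.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumes the full-range JFIF convention, where Cb and Cr are
    centred on 128 so that a colourless pixel has Cb == Cr == 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue-ish colour difference
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red-ish colour difference
    # Round to whole numbers and clamp into the 0..255 range of a byte.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))
```

Any grey pixel, such as (128, 128, 128), should come out with Cb and Cr both equal to 128, which is the sense in which Y alone is a black and white version of the picture.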
The second aspect takes a little more work, and sounds a lot like magic in the space I have to talk about it here. First, what do we mean by "busy"? We mean high frequency: the image changes a lot over a short distance. Plaid is a high-frequency fabric. Hospital scrubs are not. So the details of your plaid shirt tend to meld together, and if a stripe is missing here or there, you won't notice from a distance.
In order to find the high-frequency bits in our picture, we first break it up into little squares (each square is arbitrarily 8x8 pixels). We then use some legerdemain called the 2D Discrete Cosine Transform (DCT) to transform the information in each square from the spatial domain to the frequency domain. The spatial domain is the one you know and love. In this domain, the information in the upper-left corner of the square is the colour or brightness of the upper-left corner of that part of the image, while the information in the lower-right corner of the square is the colour or brightness of the lower-right corner of that part of the image. But in the frequency domain, the information that ends up in the upper-left corner of the square is the low-frequency stuff, and the information in the lower-right corner is the high-frequency stuff. If you're not used to such transformations, that will sound like crazy talk. Don't worry, the applet will make it a bit easier to see what that means.
The point is, we can lose a lot of the information in the lower-right corner of the frequency domain (and a little of the information in the upper-left). When we decompress the image, we will reconstruct that information (imperfectly), then apply the inverse DCT to change back from the frequency domain to the spatial domain, and hey presto, you've filtered out the high-frequency information.
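If the frequency domain still sounds like crazy talk, a small sketch in Python may help. It transforms one 8x8 square with the 2D DCT, throws away everything outside the upper-left corner, and transforms back. The helper names are hypothetical, and `drop_high_frequencies` is a deliberately crude stand-in: a real JPEG compressor divides each coefficient by an entry in a quantization table rather than zeroing it outright.

```python
import math

N = 8  # JPEG works on 8x8 squares

# BASIS[k][n] is sample n of the k-th orthonormal 1D DCT basis function.
BASIS = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
          * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]

def dct_2d(block):
    """Spatial domain -> frequency domain; result[0][0] is the lowest frequency."""
    return [[sum(block[x][y] * BASIS[u][x] * BASIS[v][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct_2d(coeffs):
    """Frequency domain -> spatial domain (the inverse of dct_2d)."""
    return [[sum(coeffs[u][v] * BASIS[u][x] * BASIS[v][y]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def drop_high_frequencies(coeffs, keep):
    """Zero every coefficient with u + v >= keep (simplified, not real quantization)."""
    return [[c if u + v < keep else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]

# A one-pixel checkerboard: about as "busy" as an 8x8 square can get.
checker = [[255 * ((x + y) % 2) for y in range(N)] for x in range(N)]
smoothed = idct_2d(drop_high_frequencies(dct_2d(checker), keep=2))
```

With `keep=2` only the three coefficients nearest the upper-left corner survive, and the checkerboard should come back as a flat mid-grey square: the busy detail has melded together, much like the plaid shirt seen from a distance. Skipping `drop_high_frequencies` entirely should return the original square, up to floating-point rounding.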
With that bit of background in mind, we can see it in action in the applet and get a better understanding. Following the steps in the applet:
June 23, 2003 — Updated January 01, 2011