Guide to JPEG-89 Compression
This page contains an applet which demonstrates JPEG-89 compression, including the rarely used Hierarchical variant. The applet will make the most sense if you are already familiar with the technical details of JPEG-89 compression. These are discussed briefly below, but you may find it useful to refer to the Wikipedia article on JPEG for more detail.
The applet requires that your browser support at least the Java 2 runtime. If the applet above doesn't work properly, this is almost certainly the problem. You can download the latest Java runtime from Sun to correct this.
JPEG-89 compression is lossy, meaning that the reconstructed image has less information than the original, but hopefully the missing information is stuff you wouldn't have noticed anyway. The compressor takes advantage of two aspects of the human visual system: the fact that colour information is less important to us than brightness, and the fact that we tend not to notice the details of the parts of an image that are "busy."
To take advantage of the first aspect, the compressor first
separates the image
into three "channels". The first channel (Y) contains the brightness information
(it is like a black and white version of the original picture). The other two
channels (Cb and Cr) contain the colour information. When we get to the actual
compression stage, the compression will be applied more heavily to the colour
channels (they will lose more information). When the image is decompressed, the
reconstructed Cb and Cr planes will be rather poor imitations of what they
originally looked like, but we won't
notice because the Y channel is still a close match, and that's what our
brains really pay attention to.
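
The channel split can be sketched in a few lines of Python. This is only an illustration, not the applet's code: the function names are made up for this example, and the weights are the ones the JFIF file format uses for 8-bit pixels, which I am assuming is the flavour of YCbCr meant here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumes the JFIF weights; Cb and Cr are centred on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b               # brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b    # blue vs. brightness
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b    # red vs. brightness
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr (same JFIF assumption)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b
```

A grey pixel has no colour information at all, so it comes out with Cb and Cr both sitting at the neutral value of 128 and everything interesting in Y.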
The second aspect takes a little more work, and sounds a lot like magic in the
space I have to talk about it here. First, what do we mean by "busy"? What
we mean is high frequency, which means the image changes a lot over a
short distance. Plaid is a high frequency fabric. Hospital scrubs are not.
So the details of your plaid shirt tend to meld together, and if a stripe is
missing here or there, you won't notice from a distance. In order to find
the high frequency bits in our picture, we first break it up into little squares
(each square is arbitrarily 8x8 pixels). We then use some legerdemain called the
2D Discrete Cosine Transform
(DCT) to transform the information in each square
from the spatial domain to the frequency domain. The spatial domain is
the one you know and love. In this domain, the information in the upper left
corner of the square is the colour or brightness of the upper-left corner of
that part of the image, while the information in the lower-right corner of the
square is the colour or brightness of the lower-right corner of that part of the
image. But in the frequency domain, the information that ends up in the
upper-left corner of the square is the low frequency stuff, and the information
in the lower-right corner is the high frequency stuff. If you're not used
to such transformations, that will sound like crazy talk. Don't worry, the
applet will make it a bit easier to see what that means. The point is, we
can lose a lot of the information in the lower-right corner of the frequency
domain (and a little of the information in the upper-left). When we decompress
the image, we will reconstruct that information (imperfectly), then apply the
inverse DCT to change back from the frequency domain to the spatial
domain, and hey presto, we've filtered out the high-frequency
information.
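
For the curious, here is a rough Python sketch of that round trip on a single 8x8 square. It is deliberately slow and simple, and the names (`dct2`, `idct2`, `drop_high_frequencies`) are invented for this example. One big simplification to be aware of: a real JPEG compressor rounds the coefficients off with a quantization table rather than throwing them away outright, so the "drop" step below is only a crude stand-in for that.

```python
import math

N = 8  # squares are 8x8 pixels


def _scale(k):
    # Normalising factor that makes the transform exactly invertible.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def _basis(i, k):
    # Cosine wave number k, sampled at pixel position i.
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Spatial domain -> frequency domain for one 8x8 square."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += block[x][y] * _basis(x, u) * _basis(y, v)
            out[u][v] = _scale(u) * _scale(v) * total
    return out


def idct2(coeffs):
    """Frequency domain -> spatial domain (inverse of dct2)."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (_scale(u) * _scale(v) * coeffs[u][v]
                              * _basis(x, u) * _basis(y, v))
            out[x][y] = total
    return out


def drop_high_frequencies(coeffs, keep):
    """Zero every coefficient whose row + column index is >= keep.

    Small `keep` keeps only the upper-left (low frequency) corner.
    """
    return [[c if u + v < keep else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]
```

Calling `idct2(drop_high_frequencies(dct2(square), keep))` gives back a smoothed version of `square`. For a perfectly flat square (every pixel the same value) all the information lands in the upper-left coefficient, so dropping the rest loses nothing.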
With that bit of background in mind, we can see it in action in the applet and get a better understanding. Following the steps in the applet:
June 23, 2003 — Updated January 10, 2010