What Is a “Color Space”?
A “color space” is a way of specifying a color numerically,
usually as a triplet of numbers representing positions in a
three-dimensional “space” of color. Color spaces are
three-dimensional because our eyes have three different kinds
of color-sensitive cells (called “cone cells” or “cones”), and
thus every color space in one way or another must encode three
different color intensities. Most people are at least a bit
familiar with the way images are formed on a computer monitor
or television by combining red, green, and blue dots of
varying brightness to form a wide range of colors. That method
uses the most common kind of color space, the “RGB”
space, named for the colors Red, Green, and Blue.
As it turns out,
there is not just one RGB color space. There are an infinite
number of RGB color spaces, created by varying several
parameters, including the specific hue of red, green, and/or
blue to be used for the colored dots in the display, the hue
of white used, and the specific way the brightness of the dots
in the display varies as the numbers fed into the display
vary.
(It’s worth
noting at this point that the RGB spaces we use all the time
in video and computer displays are generally labeled “R’G’B’”
(pronounced “R-prime, G-prime, B-prime”) by color scientists
because they are “gamma-corrected” spaces. Color scientists
reserve “RGB” to refer to non-gamma-corrected (or “linear”)
spaces that use red, green, and blue primaries. This
distinction is not as commonly used in the video or computer
world, and is beyond the purview of this article. Just know
that in this article, when we talk about “RGB” we mean R’G’B’
in the language of color science. We’ll talk about this
further in a future article about gamma.)
You might well
ask, “If the way you specify colors is via the percentages of
R, G, and B, and different color spaces use different primary
colors of Red, Green, and/or Blue, how do you specify the
specific primary colors of Red, Green, or Blue used to define
the space?” The answer is that you use a fundamental color
space, called
XYZ, or more formally CIE
XYZ. XYZ is a color space that is derived from basic studies
of how the eye and brain sense color. It is notable for being
an “absolute” color space (meaning that colors are specified
directly, not by reference to other colors), and for being
able to represent any possible real visible color that a human
being can sense. RGB, by contrast, is a “relative” color
space, where the colors are specified relative to three
“primaries,” which are the colors of red, green, and blue used
in that particular space.
There are
additional color spaces that can represent any color, almost
all derived from XYZ, including “xyY,” “CIELUV,” and “CIELAB.”
But displays always use some form of RGB as their fundamental
color space, for the simple reason that real-world displays
can’t show all colors. They can only show colors that can be
mixed from their specific RGB primaries, so it’s not useful to
send them colors they can’t display.
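To make the idea of an undisplayable color concrete, here is a
minimal sketch. It assumes the commonly published matrix for
converting XYZ to linear (non-gamma-corrected) RGB with the
high-definition primaries and D65 white described in the next
paragraph, rounded to four decimals. The function names are
hypothetical, and the sample color (the green primary of the
wider-gamut BT.2020 standard, which this article does not
otherwise cover) is only an illustration. A color that would need
a negative amount of some primary cannot be mixed from those
primaries.

```python
# XYZ -> linear RGB for BT.709 primaries with a D65 white point.
# Assumption: the commonly published matrix, rounded to four decimals.
XYZ_TO_RGB709 = [
    [3.2406, -1.5372, -0.4986],
    [-0.9689, 1.8758, 0.0415],
    [0.0557, -0.2040, 1.0570],
]


def xy_to_xyz(x, y, big_y=1.0):
    """Chromaticity (x, y) plus luminance Y -> XYZ tristimulus values."""
    return (x * big_y / y, big_y, (1.0 - x - y) * big_y / y)


def xyz_to_linear_rgb(xyz):
    """Multiply an XYZ triplet by the 3x3 matrix above."""
    return tuple(sum(m * c for m, c in zip(row, xyz)) for row in XYZ_TO_RGB709)


def in_gamut(rgb, eps=1e-3):
    """True when every component lies in [0, 1], within rounding tolerance."""
    return all(-eps <= c <= 1.0 + eps for c in rgb)


d65_white = xy_to_xyz(0.3127, 0.3290)
# Illustrative saturated green: the BT.2020 green primary at half luminance.
saturated_green = xy_to_xyz(0.170, 0.797, big_y=0.5)

print(in_gamut(xyz_to_linear_rgb(d65_white)))        # white is displayable
print(in_gamut(xyz_to_linear_rgb(saturated_green)))  # would need negative red
```

The second color fails the check because its red component comes
out negative, which is the numerical form of “this display cannot
show it.”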
In the world of
high definition
video, there is one very common RGB space specified in an ITU
standard called
BT.709, or sometimes Rec. 709
(for “Recommendation number 709”). It specifies (in an
absolute space) the specific colors of red, green, and blue
that must be used in a conforming display, and what color of
white the display needs to produce when all three primaries
are at full brightness. There is no current standard for
display gamma, which is how the brightness of each pixel
varies as the input voltages or digital values vary, but there
is a common understanding based on using the gamma of the CRTs
used in video mastering.
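As a rough illustration of what “specifying the primaries and
white in an absolute space” buys you, the sketch below derives an
RGB-to-XYZ matrix from chromaticity coordinates. The xy values
are the BT.709 figures as commonly quoted and should be checked
against the standard itself; the function names are hypothetical.
The matrix applies to linear-light RGB, not the gamma-corrected
R’G’B’ discussed in the rest of this article.

```python
# BT.709 chromaticities (x, y) as commonly quoted; verify against the standard.
PRIMARIES_709 = [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)]  # R, G, B
WHITE_D65 = (0.3127, 0.3290)


def xy_to_xyz(x, y):
    """Chromaticity (x, y) -> XYZ with Y normalized to 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, v):
    """Solve m * s = v for s using Cramer's rule."""
    d = det3(m)
    out = []
    for col in range(3):
        replaced = [[v[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        out.append(det3(replaced) / d)
    return out


def rgb_to_xyz_matrix(primaries, white):
    """Scale each primary so that R = G = B = 1 lands exactly on the white point."""
    cols = [xy_to_xyz(x, y) for x, y in primaries]
    unscaled = [[cols[c][r] for c in range(3)] for r in range(3)]
    scale = solve3(unscaled, xy_to_xyz(*white))
    return [[unscaled[r][c] * scale[c] for c in range(3)] for r in range(3)]


M = rgb_to_xyz_matrix(PRIMARIES_709, WHITE_D65)
for row in M:
    print([round(v, 4) for v in row])
```

The middle row of the result gives each primary’s contribution to
luminance (Y), and comes out close to 0.2126, 0.7152, and 0.0722
for red, green, and blue.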
Video and Y’CbCr
Given that
video displays are fundamentally RGB devices and all share a
common RGB color space, specified in BT.709, you’d expect that
the primary color space used to transmit and store video would
be BT.709 RGB. But in fact, even though video cameras
physically measure RGB values, and displays are made using RGB
primaries, video is stored, transmitted, and processed in a
color space called Y’CbCr, or sometimes informally “YUV.”
Y’CbCr is the latest
version of a set of color spaces that were developed in the
early days of color television. The broadcasters and the FCC
wanted to make color television backward compatible with
black-and-white television, so all the people who owned
black-and-white televisions wouldn’t find them obsolete when
color broadcasting started. Unfortunately, there wasn’t enough
room to broadcast both a full-color signal and a
black-and-white signal within the frequency band owned by a
single television station. It was necessary to find a way to
send both a compatible black-and-white signal and a color
add-on signal that could be combined with the black-and-white
signal to produce a full color signal.
Since there was very
little room in the frequency band for even a color add-on
signal, it was necessary to make the color add-on very low
resolution. This worked out OK because your eyes are much less
sensitive to color resolution than to brightness resolution.
Another way of looking at it is that the viewer’s perception
of how sharp the picture is depends mostly on the main
black-and-white signal, with the extra color signal adding
almost no additional sharpness. Thus the color signal can be,
in effect, a somewhat rough and blurry overlay.
The main
black-and-white signal is carried in a single channel called
Y’ (pronounced Y-prime), and the low-resolution color signal
is carried in two channels, labeled Cb and Cr, also called
“color difference” signals, because they are derived from B-Y’
and R-Y’. Y’ itself is a weighted combination of R, G, and B,
using
specific weights that are
designed to make Y’ approximate perceived brightness.
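The weighted sum and the two color-difference signals can be
sketched in a few lines. This assumes the BT.709 weights (0.2126
for red, 0.0722 for blue, with green taking the remainder) and
works on normalized values between 0 and 1, with Cb and Cr scaled
to run from -0.5 to +0.5. Real video uses integer codes with
offsets, which are left out here, and the function name is
hypothetical.

```python
KR, KB = 0.2126, 0.0722   # BT.709 luma weights for red and blue
KG = 1.0 - KR - KB        # green gets the remainder (0.7152)


def rgb_to_ycbcr(r, g, b):
    """Gamma-corrected R'G'B' in [0, 1] -> (Y', Cb, Cr), Cb and Cr in [-0.5, 0.5]."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))   # scaled B - Y'
    cr = (r - y) / (2.0 * (1.0 - KR))   # scaled R - Y'
    return y, cb, cr


print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # white: full luma, no color difference
print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # pure red: Cr reaches its maximum
```

Any gray (equal R, G, and B) produces Cb and Cr of zero, which is
why a black-and-white set could simply ignore the color channels.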
Y’CbCr is a handy
color space for storing and broadcasting video, because the Y’
signal can be stored or sent at very high resolution, and Cb
and Cr can be stored or sent at low resolution without causing
the final image to look significantly worse. In effect it’s a
very simple
lossy compression scheme, throwing away portions of the
image that are less important for perception (the detailed
color information) in order to devote more resources to the
important stuff (the black-and-white details).
As with RGB, there are a potentially
infinite number of possible Y’CbCr color spaces, varying
primarily in the weights of R, G, and B
that are combined
to form the Y’ signal. Luckily, there is a standard for
high-definition video, the aforementioned BT.709, which gives
specific mathematical functions for converting RGB to and from
Y’CbCr. (Standard definition video uses a different standard,
BT.601, but it's becoming less and less relevant as more
content is being produced in HD or upconverted to HD.)
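To see why the choice of weights matters, here is a small sketch
that encodes a color with the BT.709 weights and then decodes it
with the BT.601 weights (0.299 for red, 0.114 for blue). It uses
normalized values and hypothetical function names, and it ignores
integer quantization; it is only meant to show that mismatched
weights shift colors.

```python
BT709 = (0.2126, 0.0722)   # (Kr, Kb)
BT601 = (0.299, 0.114)


def encode(rgb, coeffs):
    """R'G'B' -> (Y', Cb, Cr) using the given (Kr, Kb) weights."""
    kr, kb = coeffs
    r, g, b = rgb
    y = kr * r + (1 - kr - kb) * g + kb * b
    return y, (b - y) / (2 * (1 - kb)), (r - y) / (2 * (1 - kr))


def decode(ycc, coeffs):
    """(Y', Cb, Cr) -> R'G'B' using the given (Kr, Kb) weights."""
    kr, kb = coeffs
    y, cb, cr = ycc
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / (1 - kr - kb)
    return r, g, b


green = (0.0, 1.0, 0.0)
right = decode(encode(green, BT709), BT709)   # matching weights
wrong = decode(encode(green, BT709), BT601)   # mismatched weights

print([round(v, 3) for v in right])
print([round(v, 3) for v in wrong])
```

With matching weights the original color comes back unchanged.
With mismatched weights the decoded green picks up some red and
blue and overshoots the legal range in the green channel.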
Subsampling
In the old
days of color TV, the Cb and Cr channels (which just to be
precise weren’t called “Cb” and “Cr” at the time, but that’s
not important to this discussion) were reduced in resolution
via an analog lowpass filter, which stripped out detail and
allowed the color signal to fit in the tiny amount of
broadcast bandwidth available for the extra color information.
But in the digital era, the Cb and Cr signals are reduced in
resolution via the simple expedient of scaling them down to a
smaller number of pixels.
The process of
scaling the color portions of an image to a lower resolution
is called “subsampling,” and scaling the color back to the
original resolution is called “upsampling.” Either one can
also be called “resampling.” All of these operations are
identical in practice to scaling the color channels, just like
scaling an image from one pixel size to another. There are a
variety of different resolutions that can be stored or sent,
and we often think of these various color resolution options
as a different color space. This isn’t strictly true, as
technically speaking the color space remains the same no
matter how the color channels are scaled, but it’s still
relatively common to speak of changing color spaces when one
is actually changing color subsampling modes.
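Since subsampling is just scaling, it can be sketched as a plain
image shrink applied to one chroma channel. The example below
averages small blocks of samples, which is only one of many
possible scaling filters; the function name and the sample
numbers are hypothetical.

```python
def subsample(plane, h_factor, v_factor):
    """Shrink a 2-D plane (list of rows) by averaging h_factor x v_factor blocks.

    Assumes the plane dimensions are exact multiples of the factors.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for top in range(0, height, v_factor):
        row = []
        for left in range(0, width, h_factor):
            block = [plane[top + dy][left + dx]
                     for dy in range(v_factor) for dx in range(h_factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


cb = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
print(subsample(cb, 2, 2))  # half width, half height
print(subsample(cb, 2, 1))  # half width, full height
```

The Y’ channel is left alone; only the Cb and Cr planes go
through a step like this.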
[Figure: 4:2:0 - Progressive*]
[Figure: 4:2:0 - Interlaced*]
The subsampling
format that is used on modern consumer video delivery media
like Blu-ray Disc and DVD is called 4:2:0. This confusing
designation means that for every 4 Y’ pixels on the even scan
lines, there will be 2 Cb and 2 Cr pixels and for every 4 Y’
pixels on the odd scan lines there will be 0 Cb and Cr pixels.
If you work all that out, it really means that the Cb and Cr
portions of the image are scaled by ½ in both dimensions. So
if the resolution of the overall image is 1920x1080, for
example, the Cb and Cr portions of the image will be at
960x540 resolution. In order to display an HD image that is
stored in this format, the Cb and Cr channels need to be
scaled back up to 1920x1080 by interpolating values for the
missing pixels, and then each pixel can be converted to RGB
using the Y’CbCr->RGB algorithm specified in BT.709.
[Figure: 4:2:2*]
For professional
video, the most common format is
4:2:2, which means for every
4 Y’ pixels, there are 2 Cb and Cr pixels on the even lines
and 2 Cb and Cr pixels on the odd lines. Again, it really
works out to scaling the Cb and Cr by ½ in just the horizontal
direction, but leaving the vertical unchanged. So each
1920x1080 image in 4:2:2 has the Cb and Cr stored at 960x1080.
To display an image stored in this format, the Cb and Cr
channels only need to be scaled horizontally.
[Figure: 4:4:4*]
Finally there is
4:4:4. This just means that
there are an equal number of Y’, Cb, and Cr pixels, with no
subsampling at all. This format is expensive to store, so it is
only used for storage of very high-end professional master
video. But it is often available as an output format from a
player or video processor.
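The arithmetic behind the three formats can be checked with a
short sketch. It only covers the three labels discussed here, and
the function names are hypothetical.

```python
def chroma_size(width, height, scheme):
    """Return (chroma_width, chroma_height) for one of the three formats above."""
    # (horizontal divisor, vertical divisor) for the Cb and Cr planes
    factors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    h, v = factors[scheme]
    return width // h, height // v


def samples_per_frame(width, height, scheme):
    """Total stored samples: one Y' plane plus two chroma planes."""
    cw, ch = chroma_size(width, height, scheme)
    return width * height + 2 * cw * ch


for scheme in ("4:2:0", "4:2:2", "4:4:4"):
    print(scheme, chroma_size(1920, 1080, scheme),
          samples_per_frame(1920, 1080, scheme))
```

For a 1920x1080 frame, 4:2:0 stores half as many samples as
4:4:4, and 4:2:2 stores two thirds as many.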
Given that all
the various shiny-disc and broadcast video formats use 4:2:0
natively, one might assume that players would just send the
video to the display in that format. But as it turns out,
video players are basically required to at minimum convert the
video to 4:2:2 in order to send it to the display, because
there are standards for storing 4:2:0, but no standards for
sending it to a display. While only 4:2:2 is required, many
players now also offer the ability to go further and convert
the video to 4:4:4, or even RGB.
And now we get to the meat
of this guide. What format should you set your player to
output? If you have a video processor, what format should you
feed it, and what format should you have it produce? Or does
it even matter?
The answer, as
with so many other things in life, is, “It depends.”
The Conversion Chain
Let’s
consider the process necessary to get video off a shiny disc
(or from a digital broadcast or cable channel). First the
video needs to be converted from 4:2:0 to 4:2:2, then to
4:4:4, then to RGB, and finally it can be fed to the display
controller. This is the same process no matter what display
technology is being used, whether LCD, DLP, plasma, or CRT.
It’s possible to shortcut the process slightly by going
directly from 4:2:0 to 4:4:4, but in practice this isn’t used
as often as you’d think.
If you choose to
output 4:2:2 from your player to the display, then the display
will need to do the scaling to 4:4:4 and then to RGB. If you
output 4:4:4 to the display, the display will not need to do
any scaling at all, but will need to do the conversion to RGB.
If you output RGB to the display, then the display can avoid
all conversion steps and send the signal right to the
controller. No matter which you choose, the same conversion
steps are still happening; all you are choosing is which
device is performing the conversion.
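The chain can be sketched end to end as three small steps. This
is a simplified model, not how any particular player or display
works: it uses nearest-neighbor repetition for both upsampling
steps, normalized values with Cb and Cr centered on zero, and the
BT.709 weights. All function names are hypothetical.

```python
def upsample_vertical(plane):
    """4:2:0 -> 4:2:2: repeat each chroma row (nearest neighbor)."""
    return [list(row) for row in plane for _ in range(2)]


def upsample_horizontal(plane):
    """4:2:2 -> 4:4:4: repeat each chroma sample (nearest neighbor)."""
    return [[v for v in row for _ in range(2)] for row in plane]


def ycbcr_to_rgb(y, cb, cr, kr=0.2126, kb=0.0722):
    """4:4:4 -> RGB for one pixel, using BT.709 weights by default."""
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / (1 - kr - kb)
    return r, g, b


def decode_420(y_plane, cb_plane, cr_plane):
    """Run the whole chain: 4:2:0 -> 4:2:2 -> 4:4:4 -> RGB."""
    cb_422, cr_422 = upsample_vertical(cb_plane), upsample_vertical(cr_plane)
    cb_444, cr_444 = upsample_horizontal(cb_422), upsample_horizontal(cr_422)
    return [[ycbcr_to_rgb(y, cb, cr) for y, cb, cr in zip(yr, cbr, crr)]
            for yr, cbr, crr in zip(y_plane, cb_444, cr_444)]


y_plane = [[0.5, 0.5, 0.5, 0.5] for _ in range(2)]  # 4x2 luma samples
cb_plane = [[0.0, 0.1]]                             # 2x1 chroma samples
cr_plane = [[0.0, -0.1]]
rgb = decode_420(y_plane, cb_plane, cr_plane)
print(len(rgb), len(rgb[0]))  # rows and columns of the RGB result
```

Choosing an output format on the player amounts to deciding how
many of these functions run in the player and how many are left
for the display.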
There’s no
specific reason that a display or a player would be the
optimal place to do these conversion steps. In theory doing
the conversion in the display minimizes the amount of data
that has to flow across the HDMI link, but in practice HDMI is
more than adequate to handle any format all the way up to
4:4:4 or RGB.
So the key to
choosing the right color space to output is finding out which
device does a better job of converting color spaces. This is
not always easy to evaluate, and it’s quite possible for one
device to do a better job in one area, like 4:2:2 to 4:4:4,
but do worse in another area, like 4:4:4 to RGB.
You’d think that
if a display handles a 4:2:2 input signal well, then feeding
it an RGB signal would be no worse, but in fact some displays
do extra work when they are fed RGB, because they convert the
signal back to 4:4:4 or even 4:2:2! This happens because one
or more of their internal processing chips is designed only
for one color format. So for these displays, sending in any
format other than the one it will use for internal processing
will only add extra processing and potentially degrade the
image.
The same logic
applies to video processors. If the processor does all its
work in 4:2:2, there’s no advantage to sending it RGB or
4:4:4, and in fact there may be a disadvantage.
Unfortunately
device makers tend not to reveal the exact processing steps
they use internally, or the algorithms they use to convert
various color spaces to RGB. Some use different algorithms
depending on which color space is fed in. The bottom line is
to assume nothing, and test every combination.
Performing The Evaluation
Here’s how you can decide which color space mode works best
with your display and player combination, using the Spears &
Munsil High Definition Benchmark. If you change any component
in your system, either player, display, or processor, you’ll
want to run the evaluation again. There’s no easy way to
predict what will produce the best results with a particular
combination of components.
Scoring Form
We’ve helpfully provided a
PDF file you can download and print
out, so you can try all the output modes from your player and
evaluate which one works best.
Before you start, you’ll want to check that the settings for
brightness, contrast, color, tint, and sharpness are all
calibrated properly for each of the color space modes. Some
displays have separate memories for every input mode, so you
might find that even if the display is adjusted properly when
it is being fed 4:2:2, the settings change when it gets a
4:4:4 or RGB input signal.
Start by setting the output on the player to 4:2:2. Run
through the basic calibration steps for brightness, contrast,
color, and tint (using the articles on the Spears & Munsil web
site as a guide). Then switch the output on the player to
4:4:4 and run through the calibration again. You may not need
to adjust anything. If your player has RGB mode, do the
calibration again for that mode. If you have even more modes,
you may need to print out more forms and write in the names of
the other modes you want to compare.
For the color temperature and/or gamma adjustments, unless you
have special test equipment you won’t be able to calibrate
these settings. Just make sure that these settings are set the
same for all the color space modes. If your player only has
one set of settings that is correct for all of the color space
modes (which is the most common case), you don’t need to fill
in this section. Just do the calibration once and then make
sure it continues to work in the other picture modes.
Once you are sure that you have the correct settings for each
input picture mode, run through the various tests putting a
check in the box for pass, and leaving the box
unchecked for fail. When you’re done, hopefully one
mode will have the most boxes checked, and most of the time
that will be the preferred mode to use. In some cases, you may
find that one specific issue is more distracting for you than
the others, and in that case you’ll want to choose among the
modes that don’t have that particular problem.
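If you would rather tally the form on a computer, the selection
rule above can be sketched as follows. The results shown are made
up purely for illustration, the row names are taken from the test
descriptions later in this article, and the function name is
hypothetical.

```python
# Hypothetical results: each mode maps to the set of rows that were checked.
results = {
    "4:2:2": {"Alignment correct", "High-frequency detail", "White not clipped"},
    "4:4:4": {"Alignment correct", "Diagonals smooth"},
    "RGB": {"High-frequency detail"},
}


def best_modes(results, must_pass=()):
    """Modes with the most checks, limited to those passing every must_pass row."""
    eligible = {m: rows for m, rows in results.items() if set(must_pass) <= rows}
    if not eligible:
        return []
    top = max(len(rows) for rows in eligible.values())
    return sorted(m for m, rows in eligible.items() if len(rows) == top)


print(best_modes(results))                                  # most boxes checked
print(best_modes(results, must_pass=["Diagonals smooth"]))  # one issue is a deal-breaker
```

The `must_pass` argument models the case where one specific
problem bothers you more than the others.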
If you can select modes in both your player and your video
processor, our recommendation is to start by trying the
various modes in your processor, leaving the player in factory
default mode. Choose the output mode that scores best and set
the processor in that mode, then move to the player and
evaluate all the various modes the player can produce. If you
end up changing the player’s output mode, you may want to
return to the processor to re-evaluate in case the input mode
affects the processor’s output. If you want to be completely
comprehensive, you may want to try every possible combination
of player and processor mode separately.
[Figure: Player -> Display]
[Figure: Player -> Receiver -> Display]
Important note: If you are running the HDMI
signal through a receiver or switcher and find problems,
especially with clipping, you should try taking the receiver
or switcher out of the chain and connecting the player
directly to the display to see if that fixes the problem.
There are several receivers, switchers, and video processors
that will clip the signal passing through them, even if they
aren’t doing any processing of the image. Also check the
website of the manufacturer of your receiver or switcher to see
if there is new firmware, as this might correct some or all
of the errors.
Let’s take a look at the various test patterns you’ll want to
look at and what to look for in each:
Chroma Alignment
This pattern contains shapes in various color combinations
that are designed to show any misalignment between the chroma
channels and the luma channel. These misalignments can be
caused by mistakes or shortcuts in the chroma upsampling, and
it’s not uncommon to find that changing the format sent from
the player to the display changes the amount of chroma
misalignment.
The primary things to look at are the long thin diamond shapes
on the left, right, top, and bottom of the screen. Each of
them has a single straight line of chroma pixels laid on top
of a long skinny diamond in the luma channel. When the
alignment is correct, the chroma should be centered on the
diamond, and the diamond should look completely symmetrical.
Most people find it easiest to see the alignment clearly
against the gray background. The difference can be quite
subtle, on the order of a half-pixel shift.
You can also often see the difference (if any) in the chroma
upsampling algorithm. Nearest neighbor will have very sharp
chroma transitions, but will have a half-pixel shift to the
right in the chroma channel. Bilinear and bicubic will produce
a softer, but more accurate, chroma channel, with smoothly
rolled-off edges. Don’t be fooled by the sharp look of nearest
neighbor; on this pattern it often looks sharper, but it will
make the finished image look jagged. See the example image to
see what the various upsampling approaches tend to look like.
Put a check in the row labeled “Alignment correct” for any
mode where the chroma lines are centered in their diamonds. If
multiple modes have properly aligned chroma, put a check for
all of them. If none of them are properly aligned, put a check
for the mode that is the closest to correct, or for none of
them if none of them are close to correct.
Also compare the image on screen to our example images for the
various chroma upsampling approaches. If you're not sure what
kind of upsampling is being used, you may want to look at the
bursts and zone plate patterns and compare them to our samples
as well. If you don't see clear stairstepping in any of those
patterns, it's reasonable to assume that the upsampling is
using bilinear or better.
[Figure: Chroma Alignment]
Chroma Multiburst
This pattern has ten horizontal bursts in two rows of five on
top, and ten vertical bursts in two rows of five on the
bottom. The horizontal bursts show how well the video playback
chain is reproducing horizontal chroma resolution, and the
vertical bursts show how well the video playback chain is
reproducing vertical chroma resolution.
For this pattern, look at the highest-frequency bursts, which
are on the lower right of both the horizontal and vertical
sections. They should have clear, bright colors that look
identical to the colors in the other bursts. If the colors are
muted, or the burst looks solid gray or any other color, it
shows that chroma resolution is being lost during one of the
upsampling conversions. If the horizontal burst is muted, that
shows a problem in the 4:2:2->4:4:4 conversion. If the
vertical burst is muted, that shows a problem in the
4:2:0->4:2:2 conversion.
Another thing that’s fairly easy to tell from this pattern is
the quality of the chroma upsampling being done. If the chroma
upsampling is being done using an algorithm called “nearest
neighbor” then each chroma pixel is just being copied four
times to make the new upscaled chroma image. This is fast and
easy, but produces blocky, jagged color contours in the final
image. Bilinear upsampling uses a linear interpolation
algorithm to create the replacement pixels when it scales up
the chroma channel, and looks much better. Bicubic upsampling
uses two cubic interpolation curves to produce a very smooth
and clean chroma channel, and is generally considered the best
commonly used algorithm. Take a look at our sample image to
get an idea of how this pattern will vary when upsampled with
different algorithms.
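The difference between the first two algorithms is easy to see
in one dimension. The sketch below doubles the number of samples
in a single row of chroma values. It assumes each original sample
lines up with an even output position and simply repeats the last
value at the right edge; real devices differ in how they position
samples and handle edges. Bicubic is not shown, and the function
names are hypothetical.

```python
def upsample_nearest(samples):
    """Double the sample count by repeating each value."""
    return [v for v in samples for _ in range(2)]


def upsample_linear(samples):
    """Double the sample count; each new sample averages its two neighbors.

    Assumes original samples sit on even output positions and repeats the
    last value at the right edge.
    """
    out = []
    for i, v in enumerate(samples):
        nxt = samples[i + 1] if i + 1 < len(samples) else v
        out.extend([v, (v + nxt) / 2])
    return out


edge = [0.0, 0.0, 100.0, 100.0]   # a hard color transition
print(upsample_nearest(edge))     # abrupt step, hence blocky contours
print(upsample_linear(edge))      # an in-between value softens the step
```

Nearest neighbor keeps the step abrupt, while linear
interpolation inserts a value of 50.0 halfway across the
transition.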
Put a check in the row labeled “High-frequency detail” for all
modes that have clean, bright, colorful high-resolution chroma
bursts. If no modes have good bursts, put a check for the mode
that has the best-looking ones.
Put a check in the row labeled “Upsampling bilinear or Better”
if the upsampling is clearly something better than Nearest
Neighbor.
[Figure: Chroma Multiburst - Low Frequency Burst]
[Figure: Chroma Multiburst - High Frequency Burst]
[Figure: Chroma Multiburst - Multiple Conversions]
Chroma Zone Plate
This pattern is a good “at a glance” pattern to see problems
and issues with the chroma channels. It has diagonals and
high-frequency details that make it possible to see quickly if
there are problems with the chroma, once you know what the
pattern should look like.
This pattern should have clean, clear colors all the way to
the edge of the image, with no obvious loss of color intensity
in the corners. If the corners are not as colorful as the
center, or look like a solid color, that shows loss of the
highest chroma resolution.
The other thing to look at is the amount and strength of the
moiré in the pattern. Moiré is a visual impression of false
curves caused by aliasing of the true curves in the image. It
looks like an optical illusion of sorts, showing concentric
circles on the left and right sides of the screen, similar to
the actual concentric circles radiating out from the center.
There is always a bit of moiré in this pattern even when
everything is perfect.
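Aliasing can be demonstrated numerically. In the sketch below, a
pattern that is too fine for the sample grid (0.75 cycles per
sample, above the limit of 0.5) produces exactly the same samples
as a much coarser pattern (0.25 cycles per sample). The
frequencies are arbitrary illustrations and the function name is
hypothetical.

```python
import math


def sample_cosine(cycles_per_sample, count):
    """Sample cos(2*pi*f*n) at integer positions n = 0 .. count-1."""
    return [math.cos(2 * math.pi * cycles_per_sample * n) for n in range(count)]


fine_detail = sample_cosine(0.75, 8)   # too fine for the sample grid
alias = sample_cosine(0.25, 8)         # the coarser pattern it masquerades as

# Once sampled, the two patterns are indistinguishable.
print(all(abs(a - b) < 1e-9 for a, b in zip(fine_detail, alias)))
```

On a zone plate, where the detail gets steadily finer toward the
edges, this substitution shows up as the false sets of circles.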
After viewing this pattern with all of the different output
modes selected sequentially, put a check in the row labeled
“Minimal moiré” for the mode that has the smallest amount of
moiré. If all of them have the same amount of moiré, put a
check in all of them.
Put a check in the row labeled “Corner detail” for all of the
modes that show the corners of the image at full intensity
with no falloff in brightness or colorfulness.
[Figure: Chroma Zone Plate - Minimal Moiré]
[Figure: Chroma Zone Plate - Corner Detail]
[Figure: Chroma Zone Plate]
Chroma Upsampling Error
This pattern was designed to test for the Chroma Upsampling
Error in MPEG decoders, but it’s also useful for checking the
smoothness of chroma upsampling. It has diagonal chroma bursts
in the bottom two rows that make it easy to see any jaggedness
or stairstepping in the chroma channel. As you choose
different output modes from the player, if there is a big
difference between the quality of the upsampling algorithms,
you’ll see the diagonals vary between smooth and jagged. The
best quality upsampling will generally produce the smoothest
diagonals on these lines.
After viewing this pattern with all of the different output
modes selected sequentially, put a check in the row labeled
“Diagonals smooth” for the mode that has the smoothest-looking
diagonal lines. If they all look the same, put a check in all
the boxes. You might also want to look at the diagonals and
curves in the Chroma Zone Plate pattern as well. Sometimes
it’s easier to see the differences on one or the other
depending on the specific display.
[Figure: Chroma Upsampling Error]
Clipping
This
pattern tells you if any of the primary color channels is
being clipped above the reference white level at any point in
the chain. There are some popular HDMI transmitter chips that
clip the Y’ channel when converting 4:2:2 to 4:4:4 or RGB, so
when using a player with one of these chips, setting the
player to output anything other than 4:2:2 produces a hard
clip in the Y’ channel. A telltale sign of this is that the Y’
(white) channel is clipped, but the red, green, and blue
channels are not clipped, or at least not completely clipped.
Put a check in the row labeled, “White not clipped” if the
white portion of the pattern shows concentric squares rather
than a large solid square.
Put a check in the row labeled, “Red, green, and blue not
clipped” if the red, green, and blue portions of the pattern
show concentric squares rather than a large solid square.
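Here is a small sketch of why clipping turns concentric squares
into one solid square. It assumes 8-bit video, where reference
white in the Y’ channel is code 235 and codes up to 254 carry
brighter-than-white detail. The specific levels chosen for the
squares are hypothetical and are not the values used on the disc.

```python
REFERENCE_WHITE = 235   # 8-bit code for reference white in the Y' channel
PEAK = 254              # highest usable 8-bit video code


def clip_at_reference(code):
    """Model a device that discards everything above reference white."""
    return min(code, REFERENCE_WHITE)


# Hypothetical concentric squares, each a little brighter than the last.
squares = [235, 240, 245, 250, PEAK]
clipped = [clip_at_reference(c) for c in squares]

print(len(set(squares)))   # 5 distinct levels: the squares are visible
print(len(set(clipped)))   # 1 level: everything merges into one solid square
```

After clipping, every square has the same value, so there is
nothing left to distinguish one from the next.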
[Figure: Clipping*]
*Images marked with an asterisk are © Secrets of Home Theater and
High Fidelity and are used with permission.