|
|
|
Shi's Re-processing of Gehler's Raw Dataset
The Gehler dataset contains 568 images covering a variety of indoor and
outdoor scenes, taken with two high-quality DSLR cameras (Canon 5D and
Canon 1D) with all settings in auto mode. Each image contains a Macbeth
colorchecker for reference. All images were saved in Canon RAW format.
Gehler also provides TIFF versions created from the RAW images using the
automatic mode of the Canon Digital Photo Professional program. The image
coordinates (measured by hand) of each colorchecker's squares are provided
with the dataset.
Because the TIFF images in the Gehler dataset were produced automatically,
they contain clipped pixels, are non-linear (i.e., have gamma or tone curve
correction applied), are demosaiced, and include the effect of the camera's
white balancing. To avoid these problems we reprocessed the raw data,
creating almost-raw 12-bit Portable Network Graphics (PNG) images (lossless
compression) from the Canon RAW files by decoding them with dcraw (using
the Windows executable dcrawMS.exe). To preserve the original digital
counts for each of the RGB channels, demosaicing was not enabled. Both
cameras output 12-bit data per channel, so the range of possible digital
counts is 0 to 4095. The raw images contain 4082 x 2718 (Canon 1D) and
4386 x 2920 (Canon 5D) 12-bit values in an RGGB pattern. To create a color
image, the two G values in each 2 x 2 block were averaged, but no further
demosaicing was done. This results in a 2041 x 1359 (Canon 1D) or
2193 x 1460 (Canon 5D) linear image (gamma=1) in camera RGB space. The
processing takes into account that the two camera models have slightly
different sensor mosaics (same pattern but different starting offset). The
result is the least processed data possible. Note that for most
applications (e.g., testing colour constancy methods) the black level
offset still needs to be subtracted from the images. A Matlab template for
loading the images and removing the offset is available here.
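The collapse from an RGGB mosaic to a half-resolution RGB image described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used to build the dataset: the function name `rggb_to_rgb` and the `row_offset`/`col_offset` parameters are hypothetical, and the actual starting offsets of the two cameras are not given in the text, so none are filled in here.

```python
def rggb_to_rgb(mosaic, row_offset=0, col_offset=0):
    """Collapse an RGGB mosaic into a half-resolution RGB image.

    mosaic: list of rows of raw digital counts.
    row_offset, col_offset: position of the first R sample. These are
    hypothetical parameters; the per-camera offsets are not stated
    in the dataset description.
    """
    rows = (len(mosaic) - row_offset) // 2
    cols = (len(mosaic[0]) - col_offset) // 2
    image = []
    for i in range(rows):
        top = mosaic[row_offset + 2 * i]         # R G R G ...
        bottom = mosaic[row_offset + 2 * i + 1]  # G B G B ...
        out_row = []
        for j in range(cols):
            c = col_offset + 2 * j
            r = top[c]
            g = (top[c + 1] + bottom[c]) / 2     # average the two G samples
            b = bottom[c + 1]
            out_row.append((r, g, b))
        image.append(out_row)
    return image


# A toy 2 x 4 mosaic yields a 1 x 2 RGB image.
mosaic = [
    [100, 200, 110, 210],
    [220, 50, 230, 60],
]
rgb = rggb_to_rgb(mosaic)
print(rgb)  # [[(100, 210.0, 50), (110, 220.0, 60)]]
```

Each 2 x 2 block contributes exactly one output pixel, which is why a 4082 x 2718 mosaic becomes a 2041 x 1359 image without any interpolation between blocks.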
The black level of each camera was estimated by finding the minimum pixel
value across the whole dataset. For the Canon 5D the black level is 129,
and for the Canon 1D it is zero.
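As a rough illustration of the offset removal, the sketch below subtracts the per-camera black level quoted above from an image held as rows of (R, G, B) tuples. The names `BLACK_LEVEL` and `subtract_black_level` are hypothetical, and clamping at zero is an added safeguard rather than something the description specifies. It is not a substitute for the Matlab template mentioned earlier.

```python
# Black levels as stated in the dataset description.
BLACK_LEVEL = {"Canon 5D": 129, "Canon 1D": 0}


def subtract_black_level(image, camera):
    """Return a copy of image with the camera's black level removed.

    image: list of rows, each row a list of (R, G, B) digital counts.
    Results are clamped at zero (an assumption made for this sketch).
    """
    offset = BLACK_LEVEL[camera]
    return [
        [tuple(max(value - offset, 0) for value in pixel) for pixel in row]
        for row in image
    ]


sample = [[(129, 1129, 629), (4095, 200, 129)]]
print(subtract_black_level(sample, "Canon 5D"))
# [[(0, 1000, 500), (3966, 71, 0)]]
print(subtract_black_level(sample, "Canon 1D") == sample)  # True
```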
Measuring the Scene Illumination
The colorchecker in each image has six achromatic squares. As the ground
truth measure of the illumination's RGB color, we used the median of the
RGB digital counts (i.e., median R, median G, median B) from the brightest
achromatic square (ranked by the average of each square) containing no RGB
digital count > 3300. The threshold eliminates any clipping and the effects
of any possible non-linearity in sensor response that might occur as
intensities approach the maximum of 4095. The median is used instead of the
mean because it automatically excludes any of the black pixels surrounding
each square that might have been incorrectly included in the square due to
inexact hand labeling of the colorchecker's position.
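The selection rule above can be sketched as follows. This is a minimal illustration, not the authors' code: `estimate_illuminant` is a hypothetical name, each square is assumed to be supplied as a list of (R, G, B) pixels already cut out using the hand-labeled coordinates, and "average of each square" is assumed to mean the mean over all pixels and all three channels.

```python
from statistics import mean, median

THRESHOLD = 3300  # squares containing any digital count above this are rejected


def estimate_illuminant(squares):
    """Return (median R, median G, median B) of the brightest usable square.

    squares: list of achromatic squares, each a list of (R, G, B) pixels.
    Returns None if every square contains a count above THRESHOLD.
    """
    # Rank squares from brightest to darkest by their average digital count.
    ranked = sorted(
        squares,
        key=lambda pixels: mean(v for p in pixels for v in p),
        reverse=True,
    )
    for pixels in ranked:
        if all(v <= THRESHOLD for p in pixels for v in p):
            # zip(*pixels) regroups the pixels into R, G and B channels.
            return tuple(median(channel) for channel in zip(*pixels))
    return None


squares = [
    [(3500, 3400, 3000), (3450, 3390, 2990), (3480, 3410, 3010)],  # too bright
    [(2000, 2100, 1500), (2010, 2090, 1490), (120, 130, 110)],     # one stray dark pixel
    [(900, 950, 700), (910, 940, 690), (905, 945, 695)],
]
print(estimate_illuminant(squares))  # (2000, 2090, 1490)
```

In the toy data the second square includes one dark border pixel, yet the per-channel median is unaffected by it, which is the behavior the description relies on.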
|
PNG Images (converted from raw images; divided into 4 parts, 1 to 2 GB each):
Canon 1D, Canon 5D (1), Canon 5D (2), Canon 5D (3)
Measured Illumination: (based on linear images) download
If you use this version of the data set please cite it as:
Lilong Shi and Brian Funt, "Re-processed Version of the Gehler Color
Constancy Dataset of 568 Images," accessed from
http://www.cs.sfu.ca/~colour/data/
Please also include a citation to the original source:
Peter Gehler, Carsten Rother, Andrew Blake, Tom Minka, and Toby Sharp,
"Bayesian Color Constancy Revisited," Proceedings of the IEEE Computer
Society Conference on Computer Vision and Pattern Recognition, 2008.
and http://www.kyb.mpg.de/bs/people/pgehler/colour/index.html.
|