The relative performance of color constancy algorithms is evaluated. We highlight some problems with previous algorithm evaluation and define more appropriate testing procedures. We discuss how best to measure algorithm accuracy on a single image as well as suitable methods for summarizing errors over a set of images. We also discuss how the relative performance of two or more algorithms should best be compared, and we define an experimental framework for testing algorithms. We reevaluate the performance of six color constancy algorithms using the procedures that we set out and show that this leads to a significant change in the conclusions that we draw about relative algorithm performance as compared with those from previous work.
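As an illustration of the two measurement questions raised above (accuracy on a single image, and summarizing errors over a set of images), the following is a minimal sketch. It assumes that per-image accuracy is measured as the angle between the estimated and the true illuminant RGB vectors, an error measure widely used in color constancy work but not named in this abstract. The function names `angular_error_deg` and `summarize` are hypothetical, and the numbers in the usage note are made up for illustration only.

```python
import math
from statistics import mean, median


def angular_error_deg(estimate, truth):
    """Angle in degrees between an estimated and a true illuminant RGB vector.

    Assumption: angular error is the per-image measure. It is invariant to
    the overall intensity (scale) of either vector.
    """
    dot = sum(e * t for e, t in zip(estimate, truth))
    norm_e = math.sqrt(sum(e * e for e in estimate))
    norm_t = math.sqrt(sum(t * t for t in truth))
    if norm_e == 0.0 or norm_t == 0.0:
        raise ValueError("illuminant vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_e * norm_t)))
    return math.degrees(math.acos(cos_angle))


def summarize(errors):
    """Summary statistics over the per-image errors of one algorithm."""
    return {
        "mean": mean(errors),
        "median": median(errors),
        "max": max(errors),
    }
```

For example, `summarize([1.0, 2.0, 30.0])` gives a mean of 11.0 but a median of 2.0. A single badly estimated image dominates the mean, so the choice of summary statistic can change which algorithm appears to perform better.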