Depth-integrated primary productivity (PP) estimates obtained from satellite ocean color-based models (SatPPMs) and those generated from biogeochemical ocean general circulation models (BOGCMs) represent a key resource for biogeochemical and ecological studies at both global and regional scales. Calibration and validation of these PP models are not straightforward, however, and comparative studies show large differences between model estimates. The goal of this paper is to compare PP estimates obtained from 30 different models (21 SatPPMs and 9 BOGCMs) to a tropical Pacific PP database consisting of ~1000 ¹⁴C measurements spanning more than a decade (1983-1996). Primary findings include: (1) skill varied significantly between models, but performance was not a function of model complexity or type (i.e., SatPPM vs. BOGCM); (2) nearly all models underestimated the observed variance of PP, specifically yielding too few low PP (< 0.2 g C m⁻² d⁻¹) values; (3) more than half of the total root-mean-squared model-data differences associated with the satellite-based PP models might be accounted for by uncertainties in the input variables and/or the PP data; and (4) the tropical Pacific database captures a broad-scale shift from low biomass-normalized productivity in the 1980s to higher biomass-normalized productivity in the 1990s, which was not successfully captured by any of the models. This last result suggests that interdecadal and global changes will be a significant challenge for both SatPPMs and BOGCMs. Finally, average root-mean-squared differences between in situ PP data on the equator at 140°W and PP estimates from the satellite-based productivity models were 58% lower than analogous values computed in a previous PP model comparison 6 years ago. The success of these types of comparison exercises is illustrated by the continual modification and improvement of the participating models and the resulting increase in model skill.
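The skill measures mentioned above (root-mean-squared model-data difference, underestimated variance, and too few low PP values) can be illustrated with a minimal Python sketch. The abstract does not state how the paper computes these statistics, so everything here is an assumption: the log10 transform, the function names, and the sample values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def log10_rmsd(modeled, observed):
    """Root-mean-squared difference of log10-transformed PP values.

    Assumption: differences are taken in log10 space; the abstract
    does not say whether the paper uses log or linear units.
    """
    if len(modeled) != len(observed) or not modeled:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [math.log10(m) - math.log10(o) for m, o in zip(modeled, observed)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def variance_ratio(modeled, observed):
    """Modeled variance divided by observed variance.

    A value below 1 means the model underestimates the observed variance.
    """
    return statistics.pvariance(modeled) / statistics.pvariance(observed)


def fraction_below(values, threshold=0.2):
    """Fraction of PP values below a threshold (g C m^-2 d^-1)."""
    return sum(v < threshold for v in values) / len(values)


# Hypothetical PP values in g C m^-2 d^-1, not data from the paper.
observed = [0.1, 0.15, 0.5, 1.0, 1.2]
modeled = [0.3, 0.35, 0.5, 0.8, 0.9]

print("log10 RMSD:", log10_rmsd(modeled, observed))
print("variance ratio:", variance_ratio(modeled, observed))
print("fraction of low PP (observed):", fraction_below(observed))
print("fraction of low PP (modeled):", fraction_below(modeled))
```

In this made-up example the modeled series has a variance ratio below 1 and no values under 0.2 g C m⁻² d⁻¹, which mirrors the kind of behavior described in the second finding.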