Four simple conceptual daily rainfall-runoff models are applied to a data set of 25 drainage basins. The basins are all in the UK and cover a range of sizes, topographies, soils and climates. The quality with which a model simulates the observed response is classically quantified by a minimized objective function. In this study, however, model performance is judged by a range of quantitative and qualitative measures of fit, applied to both the calibration and validation periods. These include efficiency, mean annual runoff, baseflow index, the synthetic monthly and daily flow regimes, and the flow duration curve. The main conclusion is that quantitative criteria alone are rarely sufficient to determine the quality of model performance. It is usually necessary to include some qualitative indication of goodness-of-fit, such as the quality of the synthetic daily flow hydrograph. However, assessment of the quality of daily flow regimes can be highly subjective.
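To make some of the quantitative criteria concrete, the following is a minimal Python sketch of how three of them might be computed from paired observed and simulated daily flows. It rests on assumptions not stated above: "efficiency" is taken to be the Nash-Sutcliffe efficiency, the mean annual runoff comparison is reduced to a relative volume error, and the flow duration curve uses the Weibull plotting position. The function names and the sample flows are hypothetical and purely illustrative; they are not results from the 25-basin data set. Baseflow index is omitted because it depends on a baseflow separation procedure that is not described here.

```python
from statistics import fmean


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency (assumed meaning of 'efficiency').

    1.0 is a perfect fit; 0.0 is no better than the observed mean.
    """
    if len(observed) != len(simulated):
        raise ValueError("series must have equal length")
    mean_obs = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed flows are constant; efficiency is undefined")
    return 1.0 - residual / variance


def relative_volume_error(observed, simulated):
    """Relative error in total (hence mean) runoff; positive = overestimate."""
    total_obs = sum(observed)
    return (sum(simulated) - total_obs) / total_obs


def flow_duration_curve(flows):
    """Return (exceedance probability, flow) pairs, highest flow first.

    Uses the Weibull plotting position rank / (n + 1), an assumed choice.
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


# Made-up daily flows, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.5, 2.0, 2.5, 4.0, 5.5]

print("efficiency:", nash_sutcliffe(observed, simulated))
print("volume error:", relative_volume_error(observed, simulated))
print("observed FDC:", flow_duration_curve(observed))
```

A high efficiency and a small volume error can coexist with a poorly reproduced flow duration curve or daily hydrograph, which is why comparing the observed and simulated curves side by side, rather than relying on a single score, fits the argument made above.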