Ensemble forecasting can be seen as serving two purposes: (1) by comparison of the control and ensemble members to the observed precipitation field, one can assess the forecast performance probabilistically; and (2) by comparison of ensemble members to the control forecast, one can assess the "diversity" of an ensemble and quantify the uncertainty of the forecast. Both problems are grounded in the basic requirement of being able to compare spatially nonhomogeneous, intermittent fields and to produce low-dimensional metrics that summarize this comparison. Several standard metrics exist (e.g., root mean square error (RMSE), Brier score, and equitable threat score (EqTh)) and are adopted in many operational studies. We studied (1) a fine-scale ensemble precipitation forecast produced from the Advanced Regional Prediction System (ARPS) and (2) forecasts from multiple models (e.g., the 1998 Storm and Mesoscale Ensemble Experiment (SAMEX '98)) to explore how the selection of the performance metric can affect inferences about the quality and uncertainty of a forecast. We propose a new measure, called the forecast quality index, which combines image analysis and nonlinear shape comparison features, and we show that it is a more robust and informative metric than traditional metrics such as RMSE and EqTh.
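As a point of reference for the traditional metrics mentioned above, the following is a minimal sketch (not code from this study) of how RMSE and the equitable threat score are conventionally computed for a forecast field against an observed field; the function names and the example threshold are illustrative assumptions.

```python
import numpy as np

def rmse(forecast, observed):
    """Root mean square error between two gridded precipitation fields."""
    return float(np.sqrt(np.mean((forecast - observed) ** 2)))

def equitable_threat_score(forecast, observed, threshold=1.0):
    """Equitable threat score (Gilbert skill score) for exceedances of `threshold`.

    Converts both fields to binary exceedance maps, builds the standard
    contingency counts (hits, misses, false alarms), and removes the
    number of hits expected by random chance.
    """
    f = forecast >= threshold
    o = observed >= threshold
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    n = forecast.size
    # Hits expected by chance for a random forecast with the same event frequency.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    return float((hits - hits_random) / denom) if denom != 0 else float("nan")

# Illustrative 2x2 fields (mm of precipitation); values are made up.
observed = np.array([[0.0, 2.0], [3.0, 0.0]])
forecast = np.array([[0.0, 2.0], [0.0, 1.0]])
print(rmse(forecast, observed))
print(equitable_threat_score(forecast, observed, threshold=1.0))
```

Note that both metrics compare the fields point by point and therefore carry no explicit information about displacement or shape errors, which is the limitation the forecast quality index is intended to address.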