Probabilistic slope stability analyses were compared with slope failures observed between the late 1970s and the mid-1990s along Wisconsin's Lake Michigan shoreline bluffs. The study also emphasizes the importance of understanding the slope failure mechanism and incorporating it into models of slope behavior. Profiles along approximately 300 miles of shoreline bluffs were measured, analyzed, and evaluated using deterministic methods in the late 1970s. Profiles along approximately 175 miles of shoreline were remeasured in 1995 and 1996. The bluffs studied are up to 40 meters (130 feet) tall and are composed of glacial and lacustrine sediments. Spring climate conditions and seasonal groundwater changes appear to be related to rotational failures along this shoreline. Our probabilistic analyses modeled changes in the internal groundwater elevation and variability in strength parameters, rather than external erosion impacts on the bluff. Probabilistic results were expressed as a reliability index, which can be converted into a probability of failure. Analyses were run on the 1975 to 1980 field measurements, and the results were compared with the failures observed between the initial measurements and 1996. This provided an empirical measure of the prediction accuracy of probabilistic slope stability models in the study area. The comparison of model results with observed slope failures demonstrated the need to adjust the failure criteria so that failing slopes could be predicted more accurately. An empirical calibration value for the probabilistic model was then developed to give the engineer a realistic probability of slope failure.
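
The conversion from a reliability index to a probability of failure can be illustrated with a minimal sketch. It assumes the factor of safety is normally distributed, which the abstract does not state, so this is one common convention rather than necessarily the one used in the study. The function names and the example values are hypothetical, and the empirical calibration described above is not reproduced here.

```python
import math


def reliability_index(mean_fs: float, std_fs: float) -> float:
    """Reliability index: how many standard deviations the mean factor
    of safety lies above the failure threshold FS = 1.

    Assumption: the factor of safety is normally distributed.
    """
    if std_fs <= 0:
        raise ValueError("std_fs must be positive")
    return (mean_fs - 1.0) / std_fs


def probability_of_failure(beta: float) -> float:
    """Probability that FS < 1, i.e. Phi(-beta), where Phi is the
    standard normal cumulative distribution function.

    Uses the identity Phi(-beta) = 0.5 * erfc(beta / sqrt(2)).
    """
    return 0.5 * math.erfc(beta / math.sqrt(2.0))


# Hypothetical slope: mean factor of safety 1.3, standard deviation 0.15.
beta = reliability_index(1.3, 0.15)
p_fail = probability_of_failure(beta)
print(f"reliability index = {beta:.2f}, probability of failure = {p_fail:.4f}")
```

Under this assumption a reliability index of 0 corresponds to a 50% probability of failure, and a reliability index of 2 corresponds to a probability of failure of about 2.3%.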