Least trimmed squares (LTS) is a criterion for analyzing multiple regression data sets in which there may be outliers. The method consists of finding that subset of cases whose deletion from the data set would lead to the regression with the smallest residual sum of squares. It is used as a general-purpose high breakdown method, and also has some inferential motivation in that it gives the maximum likelihood estimator of the regression under a widely-used outlier model. It is also theoretically attractive in that it has standard O(n^{-1/2}) asymptotics. Its practical usefulness has however been limited by the absence of a computationally manageable exact algorithm for performing LTS fits. This paper capitalizes on a necessary condition characterizing the LTS fit to develop a probabilistic 'feasible solution' algorithm. This algorithm takes random starting trial solutions and refines each to the local optimum satisfying this necessary condition. Repeating this using different starting sets provides the global optimum with arbitrarily high probability for sufficiently many random starts. Exhaustive enumeration of several standard data sets from the literature verifies the method's good performance. An example of a very large data set shows that the usefulness of the method is not confined to the small samples often used in the robustness and outlier literature.
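The random-start-and-refine idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state the necessary condition, so the sketch assumes one plausible form: a retained subset is locally optimal when no single exchange of a retained case for a trimmed case lowers the residual sum of squares. The function names (`rss`, `refine`, `lts_random_starts`), the parameters `h`, `n_starts`, and `tol`, and the first-improvement swap order are all hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def rss(X, y, idx):
    """Residual sum of squares of the least-squares fit on the rows in idx."""
    beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    r = y[idx] - X[idx] @ beta
    return float(r @ r)

def refine(X, y, start, tol=1e-12):
    """Exchange one retained case for one trimmed case while that lowers the RSS.

    Assumed stopping rule (not taken from the paper): stop when no single
    exchange reduces the RSS by more than tol.
    """
    n = len(y)
    keep = {int(i) for i in start}
    best = rss(X, y, sorted(keep))
    improved = True
    while improved:
        improved = False
        for i in sorted(keep):
            for j in sorted(set(range(n)) - keep):
                trial = (keep - {i}) | {j}
                val = rss(X, y, sorted(trial))
                if val < best - tol:
                    # Accept the first improving exchange and rescan.
                    keep, best, improved = trial, val, True
                    break
            if improved:
                break
    return sorted(keep), best

def lts_random_starts(X, y, h, n_starts=20, seed=0):
    """Refine n_starts random subsets of size h and keep the best result."""
    rng = np.random.default_rng(seed)
    n = len(y)
    best_keep, best_val = None, np.inf
    for _ in range(n_starts):
        start = rng.choice(n, size=h, replace=False)
        keep, val = refine(X, y, start)
        if val < best_val:
            best_keep, best_val = keep, val
    return best_keep, best_val

# Illustrative data: eleven points on a line, the last one shifted off it.
x = np.arange(11.0)
X = np.column_stack([np.ones(11), x])  # intercept column plus one predictor
y = 1 + 2 * x
y[10] += 50
keep, val = lts_random_starts(X, y, h=10, n_starts=5, seed=1)
print(keep, val)
```

Each refinement refits the regression from scratch for every trial exchange, which keeps the sketch short but is far too slow for large data sets; a practical version would update the fit incrementally.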
Bibliographical note
Funding Information:
The author is grateful to the two referees for a number of helpful suggestions for improvement of the paper. The work reported here was supported by the National Science Foundation under grants DMS 9010983 and 9208819.
- Linear model
- high breakdown regression