As a subject's true disease status is seldom known with certainty, it is necessary to compare the performance of a new diagnostic test with that of a currently accepted but imperfect 'gold standard'. Errors made by the gold standard mean that the sensitivity and specificity calculated for the new test are biased estimates of its true sensitivity and specificity. The traditional approach to this problem was 'discrepant resolution', in which the subjects for whom the two methods disagreed were subjected to a third 'resolver' test. Recent work has pointed out that this does not automatically solve the problem. A sounder approach goes beyond the discordant test results and also applies the resolver to at least some of the subjects with concordant results. Even so, some issues remain unresolved. One is the basic question of the direction of the biases in various estimators. We point out that this question does not have a simple universal answer. Another issue, if one is to test a sample of the subjects with concordant results rather than all of them, is how to compute estimates and standard errors of the measures of test performance, notably the sensitivity and specificity of the test method relative to the resolver. Expressions for these standard errors are given and illustrated with a numeric example. It is shown that using just a sample of the subjects with concordant results may lead to large savings in the number of assays. The design issue of how many subjects in the concordant cells to test depends on the numbers of concordants and discordants. The formulae given show how to evaluate the impact of different choices for these numbers and hence settle on a design that gives the required precision of the estimates.
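
To make the sampling idea concrete, the following is a minimal sketch of one way such estimates could be computed. It is not the expressions referred to above, which are not reproduced here. It assumes that the resolver is applied to a simple random subsample within each cell of the 2x2 table of new test against gold standard, that each cell's resolver-positive fraction is scaled up to the full cell count, and that a delta-method standard error is computed conditional on the observed cell counts, with no finite population correction. The function name, the data layout and the counts are all hypothetical.

```python
from math import sqrt

def relative_accuracy(cells):
    """Sensitivity and specificity of the new test relative to the resolver.

    cells maps (test_result, gold_result), each 1 or 0, to a tuple
    (n_in_cell, n_resolved, n_resolver_positive).
    Assumption: within each cell the resolved subjects are a simple random
    sample, and the cell counts are treated as fixed.
    """
    res_pos = {1: 0.0, 0: 0.0}  # estimated resolver-positive count, by test result
    res_neg = {1: 0.0, 0: 0.0}  # estimated resolver-negative count, by test result
    var = {1: 0.0, 0: 0.0}      # variance of those estimated counts, by test result
    for (test, _gold), (n, m, r) in cells.items():
        p = r / m                         # resolver-positive fraction in the sample
        res_pos[test] += n * p            # scale up to the whole cell
        res_neg[test] += n * (1 - p)
        var[test] += n**2 * p * (1 - p) / m   # binomial variance, scaled up

    sens = res_pos[1] / (res_pos[1] + res_pos[0])
    spec = res_neg[0] / (res_neg[0] + res_neg[1])

    # Delta method for a ratio A / (A + B) with independent A and B.
    se_sens = sqrt(res_pos[0]**2 * var[1] + res_pos[1]**2 * var[0]) \
        / (res_pos[1] + res_pos[0])**2
    se_spec = sqrt(res_neg[1]**2 * var[0] + res_neg[0]**2 * var[1]) \
        / (res_neg[0] + res_neg[1])**2
    return sens, se_sens, spec, se_spec

# Hypothetical counts: all discordants resolved, only a sample of concordants.
example_cells = {
    (1, 1): (100, 50, 48),   # concordant positives: 50 of 100 resolved
    (1, 0): (10, 10, 6),     # discordant: all resolved
    (0, 1): (20, 20, 8),     # discordant: all resolved
    (0, 0): (870, 100, 1),   # concordant negatives: 100 of 870 resolved
}
sens, se_sens, spec, se_spec = relative_accuracy(example_cells)
print(f"sensitivity {sens:.3f} (SE {se_sens:.3f})")
print(f"specificity {spec:.3f} (SE {se_spec:.3f})")
```

Re-running the sketch with different numbers resolved in the two concordant cells shows how the standard errors respond to the design, which is the kind of comparison the design question calls for.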