For Viterbi decoders, high throughput is achieved by applying look-ahead techniques to the add-compare-select (ACS) unit, which is the speed bottleneck of the system. Look-ahead techniques combine multiple consecutive binary trellis steps into one equivalent complex trellis step, a task carried out by the branch metrics precomputation (BMP) unit. The complexity of the BMP unit grows exponentially, and its latency linearly, with the number of look-ahead levels. For a Viterbi decoder with constraint length K and M-step look-ahead, $2^{M+K-1}$ branch metrics need to be computed and compared. In this paper, we first identify the computational redundancy in existing branch metric computation approaches and build a general mathematical model that describes the space of possible approaches. Based on this model, we propose a new approach with minimal complexity and latency, and we prove its optimality. This highly efficient approach leads to a novel, overall optimal architecture for any M that is a multiple of K. The results show that the proposed approaches can reduce the complexity by up to 45.65% and the latency by up to 72.50%. In addition, the proposed architecture can be applied for arbitrary M while still achieving the minimal complexity.
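
The branch metric count quoted above can be checked by brute-force enumeration. The following is a minimal sketch, not the paper's method: it assumes a rate-1/n code whose encoder is a binary shift register with K-1 memory bits (hence $2^{K-1}$ states), and the function name `lookahead_paths` and the state-update convention are illustrative choices. It enumerates every M-step path through the trellis and groups the paths by their (start state, end state) pair.

```python
from collections import Counter
from itertools import product


def lookahead_paths(K, M):
    """Count M-step trellis paths for each (start, end) state pair.

    Assumption: the state is the last K-1 input bits of a binary
    shift-register encoder, so there are 2**(K-1) states.
    """
    num_states = 1 << (K - 1)
    mask = num_states - 1
    pair_counts = Counter()
    for start in range(num_states):
        # Each length-M input sequence defines exactly one path from `start`.
        for bits in product((0, 1), repeat=M):
            state = start
            for b in bits:
                # Shift the new input bit in and drop the oldest bit.
                state = ((state << 1) | b) & mask
            pair_counts[(start, state)] += 1
    return pair_counts


counts = lookahead_paths(K=3, M=3)
print(sum(counts.values()))  # 32 paths in total, i.e. 2**(M + K - 1)
print(len(counts))           # 16 connected (start, end) state pairs
```

Under these assumptions, each path corresponds to one branch metric of the M-step trellis, which gives the $2^{M+K-1}$ total. For K = 3 and M = 3, every (start, end) pair is joined by two parallel paths, and these parallel paths are the ones whose metrics must be compared so that only one survives per pair.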