Byte Substitution (S-Box), which is essentially a combination of an inversion and an affine operation over the finite field GF(2^8), limits the throughput of the Advanced Encryption Standard (AES) algorithm. Among existing S-Box architectures, the composite field S-Box is very attractive for its extremely low area cost, which is only 12%-20% of that of other implementation approaches [1]. However, the composite field S-Box suffers from an extremely low throughput rate. In this paper, we propose a novel fast composite field S-Box architecture. By applying pre-computation techniques, some of the computation on the critical data path is eliminated, which reduces the critical path delay. The complexity of the pre-computation units is minimized by sharing common structures. The proposed design is implemented using a 0.18-μm CMOS technology library. The results show that the throughput rate is increased by 28.22% at the expense of a fairly modest increase in area. Based on the proposed design, we then present an approach that further reduces the critical path delay. Gate-level analysis shows that this second approach can increase the throughput rate by 56.25%. In addition, the proposed designs can reduce the pipelining latency by 40%-60% compared with the conventional design while maintaining the same throughput rate.
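For readers who want to see the "inversion plus affine operation" concretely, the following is a minimal behavioral sketch of the standard AES S-Box in Python. It is a plain bit-level reference model, not the composite field architecture and not the design proposed in this paper. The reduction polynomial 0x11B and the affine constant 0x63 are taken from the AES specification (FIPS 197); the function names (`gf_mul`, `gf_inv`, `affine`, `sbox`) are illustrative assumptions and do not appear in the original text.

```python
# Behavioral reference model of the AES S-Box (illustrative only).
# Assumption: AES field polynomial x^8 + x^4 + x^3 + x + 1 (0x11B), per FIPS 197.
AES_POLY = 0x11B


def gf_mul(a, b):
    """Multiply two elements of GF(2^8), reducing modulo AES_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add (XOR) the current multiple of a
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= AES_POLY        # reduce when the degree reaches 8
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0 by AES convention."""
    result, base, exp = 1, a, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exp >>= 1
    return result


def rotl8(x, k):
    """Rotate an 8-bit value left by k bits."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def affine(x):
    """AES affine transformation over GF(2), with constant 0x63."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


def sbox(x):
    """Byte substitution: field inversion followed by the affine transformation."""
    return affine(gf_inv(x))


if __name__ == "__main__":
    print(hex(sbox(0x53)))  # FIPS 197 worked example: {53} -> {ed}
```

In this model the inversion dominates the cost, which is why hardware designs move it into a composite field where the 8-bit inversion decomposes into smaller subfield operations. A model like this is mainly useful as a golden reference when checking a gate-level S-Box against all 256 inputs.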