The choice of architecture and implementation style can strongly influence the performance of dedicated VLSI DSP circuits. The impact of architecture choices is illustrated using several representative signal processing examples and a few general architecture transformation techniques. It is shown that algorithm transformation techniques such as look-ahead computation can create concurrency in otherwise nonconcurrent recursive signal processing algorithms, and that new signal processing algorithms can be designed to be inherently concurrent. This leads to pipelined algorithm topologies without any hardware overhead. To illustrate the impact of implementation style, it is shown that an internal redundant number system can lead to more efficient realizations of multipliers and adders. Using systematic folding and unfolding techniques, digit-serial architectures with no restriction on the digit size can be described. All of these architectural techniques can increase the performance of DSP circuits.
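As a rough illustration of look-ahead computation, the sketch below assumes a first-order recursion y[n] = a·y[n-1] + x[n] with zero initial state; this filter, the function names, and the two-step look-ahead depth are illustrative assumptions, not examples taken from the paper. Substituting the recursion into itself gives y[n] = a²·y[n-2] + a·x[n-1] + x[n], so the feedback loop spans two samples instead of one and could hold an extra pipeline register. Note that this particular form adds a feed-forward term.

```python
from fractions import Fraction


def iir_direct(x, a):
    """Original recursion: y[n] = a*y[n-1] + x[n], zero initial state."""
    y = []
    prev = 0
    for sample in x:
        prev = a * prev + sample
        y.append(prev)
    return y


def iir_lookahead2(x, a):
    """Two-step look-ahead form (assumed example):
    y[n] = a^2*y[n-2] + a*x[n-1] + x[n].

    The feedback term now depends on y[n-2], not y[n-1].
    """
    a2 = a * a  # precomputed loop coefficient
    y = []
    for n, sample in enumerate(x):
        y_nm2 = y[n - 2] if n >= 2 else 0  # zero initial state
        x_nm1 = x[n - 1] if n >= 1 else 0
        y.append(a2 * y_nm2 + a * x_nm1 + sample)
    return y


# Exact rational arithmetic so the two forms can be compared with ==.
a = Fraction(1, 2)
x = [Fraction(v) for v in (1, 2, 3, 4, 5)]
assert iir_direct(x, a) == iir_lookahead2(x, a)
```

Both functions produce the same output sequence; only the dependency structure of the recursion differs, which is what a hardware realization would exploit.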