Scalable delivery of network services to client applications is a difficult problem. We present a service architecture designed for the scalable delivery of heterogeneous services to clients with differing quality-of-service (QoS) requirements. We built a prototype on the Legion wide-area computing infrastructure to explore the design of application- and service-based replica selection and creation policies in a wide-area network. Preliminary results show that application- and service-based policies outperform generic policies such as random and round-robin. The results also show that controlling overhead is a key issue for latency-sensitive requests.