In heterogeneous multicore systems, implementing a programmer-friendly memory consistency model while maximizing memory-level parallelism is challenging. Ideally, memory accesses can be performed out of order as long as program order is not violated. However, enforcing memory access order at the endpoint (e.g., a core) prohibits a number of architectural optimizations and limits memory-level parallelism. In this work, we explore the opportunity of preserving memory access order inside the on-chip interconnection network. We propose a hybrid-switching network-on-chip (NoC) augmented with a lightweight token ring network to guarantee global memory access order. The hybrid-switching NoC, which supports both packet and circuit switching, serves as the underlying communication infrastructure, while the token ring network preserves memory order among multiple ordering points. Our proposed design enables strong memory consistency models and deterministic program execution, with negligible performance overhead compared to an unordered packet-switched network.
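To make the idea of ordering among multiple ordering points concrete, the following is a minimal toy sketch, not the design proposed in this work. It assumes, purely for illustration, that the token carries a global sequence counter, that each ordering point stamps all of its pending requests when the token arrives, and that receivers commit requests in stamp order regardless of network arrival order. The names `assign_global_order` and `commit_in_order` are hypothetical.

```python
from collections import deque


def assign_global_order(pending, rounds=1):
    """Circulate a token around the ring and stamp pending requests.

    pending: list of deques, one per ordering point (index = ring position).
    Returns a list of (sequence_number, ordering_point, request) tuples.
    Assumption: the token carries the next free global sequence number.
    """
    token_seq = 0
    stamped = []
    n = len(pending)
    for hop in range(rounds * n):
        point = hop % n  # the token visits ordering points in ring order
        queue = pending[point]
        while queue:  # the token holder stamps everything it has queued
            stamped.append((token_seq, point, queue.popleft()))
            token_seq += 1
    return stamped


def commit_in_order(arrivals):
    """Restore the global order of requests that arrived out of order."""
    return sorted(arrivals, key=lambda request: request[0])


# Three ordering points; the second has nothing pending.
pending = [deque(["A0", "A1"]), deque(), deque(["C0"])]
order = assign_global_order(pending)

# Simulate an unordered network by reversing the delivery order.
delivered = list(reversed(order))
committed = commit_in_order(delivered)
```

In this sketch the network is free to deliver requests in any order, because the stamp assigned under the token alone determines the commit order; that separation is what lets ordering be enforced away from the endpoint.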