Dynamic program execution monitors allow programmers to observe and verify an application while it is running. Instrumentation-based dynamic program monitors often incur significant performance overhead due to instrumentation. Special hardware supports have been proposed to reduce this overhead. However, these supports mostly target specific monitoring requirements and thus have limited applicability. Recently, with multi-core processors becoming mainstream, executing the monitored program and the monitor simultaneously on separate cores has emerged as an attractive option. However, communication between the two often becomes the new performance bottleneck due to large amounts of information forwarded to the monitor. In this paper, we present compiler techniques that aim to minimize the communication overhead. Our proposal is based on the observations that a monitor only requires specific information from the monitored programs and some information can be easily computed by the monitor from data that have already been communicated. We developed a code generator and optimization techniques to decide the set of data items to forward and the set to compute, so that the total execution time of the monitor is minimized. Our compiler can optimize a variety of monitors with diverse monitoring requirements, taking as input the control flow graph of the monitored program and the set of data that needs verification. Using a static binary rewriter, we evaluate the performance impact of the proposed compiler techniques on the SPEC2006 integer benchmarks for two intensive monitoring tasks: taint-propagation and memory bug detection. Comparing to instrumentation-based monitors, the proposed techniques can bring down the performance overhead of the two monitors from 10.6x and 9.0x to 2.36x and 2.17x, respectively.