We present an algorithm based on dynamic programming to perform the HW/SW partitioning and scheduling of a given task graph for minimum latency subject to resource constraint. The major contribution of this paper is to consider the edge communication delays in the dynamic programming solution of the problem. The algorithm has a polynomial run time complexity on trees. We also introduce a pruning technique to reduce the runtime of the worst-case scenario of directed acyclic graphs (DAGs). The algorithm has been implemented and the results are reported. A very fast quality heuristic is also proposed and implemented to provide good solutions in negligible run time.