This paper presents an efficient method for blind source separation of convolutively mixed speech signals. The method follows the popular frequency-domain approach, which poses two main problems: estimating the mixing system at each frequency, and aligning the permutation of source components across all frequencies. We adopt a novel concept that exploits the local sparsity of speech sources in a transformed domain, together with their non-stationarity, to address both problems. This exploitation leads to a closed-form solution for per-frequency mixing system estimation and a numerically simple method for permutation alignment, both of which are efficient to implement. Simulations show that the proposed method yields source recovery performance comparable to that of a state-of-the-art method while requiring much less computation time.