We consider the scenario of an unknown overdetermined instantaneous mixture of quasi-stationary sources. Blind source separation (BSS) under this scenario has drawn much attention, motivated by applications such as speech and audio separation. Existing BSS works often focus on exploiting the time-varying statistical characteristics of quasi-stationary sources, through various formulations and optimization methods. In this paper, we further assume that the sources exhibit some form of local sparsity, an assumption that is generally satisfied by speech. By exploiting this additional assumption, we show that the BSS problem admits a simple closed-form solution. Simulation results based on real speech show that the proposed closed-form algorithm has a much lower computational cost than some existing BSS algorithms, while delivering promising mean-square-error performance.
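
To give some intuition for why local sparsity can help, the following is a minimal sketch, not the algorithm proposed in this paper. It assumes a real-valued noiseless model x = A s with more sensors than sources, sources whose power is constant within each frame, and one frame in which only a single source is active. Under these assumptions, the local covariance of that frame is rank one and its principal eigenvector is proportional to one column of the mixing matrix. All names (`local_covariances`, `A`, `powers`, the frame length) are illustrative choices and do not come from the paper.

```python
import numpy as np

def local_covariances(x, frame_len):
    """Split x (sensors x samples) into non-overlapping frames and
    return the sample covariance of each frame, shape (frames, M, M)."""
    n_frames = x.shape[1] // frame_len
    frames = x[:, :n_frames * frame_len].reshape(x.shape[0], n_frames, frame_len)
    return np.einsum('mft,nft->fmn', frames, frames) / frame_len

rng = np.random.default_rng(0)
M, N, L, F = 4, 3, 200, 6            # sensors, sources, frame length, frames (assumed)
A = rng.standard_normal((M, N))      # unknown mixing matrix (overdetermined: M > N)

# Quasi-stationary sources: the power of each source changes from frame to frame.
powers = rng.uniform(0.5, 2.0, size=(N, F))
powers[:, 0] = [0.0, 1.5, 0.0]       # assumed local sparsity: only source 1 is active in frame 0
s = np.repeat(np.sqrt(powers), L, axis=1) * rng.standard_normal((N, F * L))

x = A @ s                            # instantaneous mixture
R = local_covariances(x, L)

# In the single-source frame, R[0] is rank one, so its principal
# eigenvector matches the active column of A up to scale and sign.
eigvals, eigvecs = np.linalg.eigh(R[0])   # eigenvalues in ascending order
a_hat = eigvecs[:, -1]
cos = abs(a_hat @ A[:, 1]) / np.linalg.norm(A[:, 1])
print("alignment with true mixing column:", cos)
```

In practice the single-source frames are not known in advance and noise is present, so detecting such frames is part of the problem. The sketch only illustrates the structural property that the local sparsity assumption makes available.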