The idea of learning overcomplete dictionaries, grounded in the paradigm of compressive sensing, has found numerous applications, among which image denoising is considered one of the most successful. However, many state-of-the-art denoising techniques inherently assume that the signal noise is Gaussian. We instead propose to learn overcomplete dictionaries where the signal is allowed to carry both Gaussian and (sparse) Laplacian noise. Dictionary learning in this setting leads to a difficult non-convex optimization problem, which is further exacerbated by large input datasets. We tackle these difficulties by developing an efficient online algorithm that scales with the size of the data. To assess the efficacy of our model, we apply it to dictionary learning for data that naturally satisfy our noise model, namely, Scale Invariant Feature Transform (SIFT) descriptors. On these data, we measure the performance of the learned dictionary on the task of nearest-neighbor retrieval: compared to methods that do not explicitly model sparse noise, our method exhibits superior performance.
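To make the noise model concrete, the following is a minimal sketch (not the paper's algorithm) of sparse coding under a Gaussian-plus-sparse noise model, y ≈ Dx + e, where x is the sparse code and e absorbs sparse (Laplacian) corruption. It alternates a proximal-gradient (ISTA) step on x with a closed-form soft-thresholding update on e; the dictionary D, step size, and penalty weights `lam` and `kappa` are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def robust_sparse_code(y, D, lam=0.1, kappa=0.2, n_iter=200):
    """Minimize 0.5*||y - D x - e||^2 + lam*||x||_1 + kappa*||e||_1
    by alternating an ISTA step on x with an exact prox update on e."""
    m, k = D.shape
    x = np.zeros(k)
    e = np.zeros(m)
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the x-gradient
    for _ in range(n_iter):
        r = y - D @ x - e                 # current residual
        x = soft_threshold(x + (D.T @ r) / L, lam / L)  # ISTA step on x
        e = soft_threshold(y - D @ x, kappa)            # exact minimizer in e
    return x, e

# Synthetic example: a sparse signal hit by both small Gaussian noise
# and a few large (sparse) corruptions.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)           # unit-norm dictionary atoms
x_true = np.zeros(128)
x_true[rng.choice(128, 5, replace=False)] = 1.0
e_true = np.zeros(64)
e_true[rng.choice(64, 3, replace=False)] = 5.0   # sparse corruption
y = D @ x_true + e_true + 0.01 * rng.standard_normal(64)
x_hat, e_hat = robust_sparse_code(y, D)
```

Because the e-subproblem is solved exactly and the ISTA step on x is a descent step, the joint objective decreases monotonically; the sparse noise ends up in e rather than distorting the code x, which is the behavior the proposed model exploits.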