Despite their well-documented capability for modeling nonlinear functions, kernel methods fall short in large-scale learning tasks because of their excessive memory and computational requirements. The present work introduces a novel kernel approximation approach from a dimensionality reduction point of view on virtual lifted data. The proposed framework accommodates feature extraction under limited storage and computational resources, and subsequently approximates the kernel by a linear inner product over the extracted features. Probabilistic guarantees on the generalization of the proposed task are provided, and efficient solvers with provable convergence guarantees are developed. By introducing a sampling step that precedes the dimensionality reduction task, the framework is further broadened to accommodate learning over large datasets. The connection between the novel method and the Nyström kernel approximation algorithm and its modifications is also presented. Empirical tests validate the effectiveness of the proposed approach.
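
To make the idea of approximating a kernel by a linear inner product over extracted features concrete, the following is a minimal sketch of the standard Nyström approximation, which the abstract names as a related baseline. It is not the method proposed in this work. The RBF kernel, the value of `gamma`, the uniform landmark sampling, and the function names `rbf_kernel` and `nystrom_features` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def nystrom_features(X, m, gamma=0.5, seed=0):
    """Standard Nystrom feature map: Phi such that Phi @ Phi.T approximates K.

    Hypothetical helper for illustration; landmarks are sampled uniformly.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    Z = X[idx]                          # m sampled landmark points
    K_mm = rbf_kernel(Z, Z, gamma)      # kernel among landmarks
    K_nm = rbf_kernel(X, Z, gamma)      # kernel between all points and landmarks
    w, V = np.linalg.eigh(K_mm)
    keep = w > 1e-10 * w.max()          # drop numerically null directions
    # Phi = K_nm @ K_mm^{-1/2}, restricted to the retained eigen-directions
    return K_nm @ (V[:, keep] / np.sqrt(w[keep]))


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))           # synthetic data, not from the paper
K = rbf_kernel(X, X)                    # full 200 x 200 kernel matrix
Phi = nystrom_features(X, m=50)         # at most 50 features per point
rel_err = np.linalg.norm(K - Phi @ Phi.T) / np.linalg.norm(K)
```

Here `Phi` stores at most 50 numbers per data point instead of a full row of the 200 by 200 kernel matrix, and `rel_err` measures how much of the kernel is lost by working with linear inner products of those features.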