Abstract
Bandable covariance matrices are often used to model the dependence structure of variables that follow a natural ordering. It has been shown that the tapering covariance estimator attains the optimal minimax rate of convergence for estimating large bandable covariance matrices. The estimation risk depends critically on the choice of the tapering parameter. We develop a Stein's Unbiased Risk Estimation (SURE) theory for estimating the Frobenius risk of the tapering estimator. SURE tuning selects the minimizer of the SURE curve as the chosen tapering parameter. An extensive Monte Carlo study shows that SURE tuning is often comparable to the oracle tuning and outperforms cross-validation. We further illustrate SURE tuning using rock sonar spectrum data. The real data analysis results are consistent with the simulation findings.
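As a hedged illustration only (this code is not from the paper), the tapering estimator that the abstract refers to can be sketched as below. We assume the standard linear taper weights of Cai, Zhang and Zhou (2010), applied elementwise to the unbiased sample covariance; the function names and parameterization are our own.

```python
import numpy as np

def tapering_weights(p, k):
    """Linear taper weights w_ij: equal to 1 for |i - j| <= k/2,
    decaying linearly to 0 at |i - j| >= k (Cai-Zhang-Zhou form)."""
    idx = np.arange(p)
    dist = np.abs(idx[:, None] - idx[None, :])
    # clip(2 - dist/(k/2), 0, 1) gives 1 inside the half-bandwidth,
    # a linear ramp between k/2 and k, and 0 beyond k
    return np.clip(2.0 - dist / (k / 2.0), 0.0, 1.0)

def tapering_estimator(X, k):
    """Taper the unbiased sample covariance of the n-by-p data matrix X."""
    S = np.cov(X, rowvar=False)  # divisor n - 1, matching the appendix
    return tapering_weights(S.shape[0], k) * S
```

The tapering parameter k (the bandwidth) is exactly the quantity that SURE tuning chooses by minimizing the estimated Frobenius risk over a grid of k values.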
Original language | English (US)
---|---
Pages (from-to) | 339-351
Number of pages | 13
Journal | Computational Statistics and Data Analysis
Volume | 58
Issue number | 1
DOIs |
State | Published - Feb 2013
Bibliographical note
Funding Information: This work was supported in part by NSF grant DMS-0846068. The authors thank the editor, AE and referees for their helpful comments.

Appendix

Proof of Lemma 1. We start with Stein's identity (Efron, 2004):

$$(\hat\sigma_{ij}-\sigma_{ij})^2=(\hat\sigma_{ij}-\tilde\sigma^s_{ij})^2-(\tilde\sigma^s_{ij}-\sigma_{ij})^2+2(\hat\sigma_{ij}-\sigma_{ij})(\tilde\sigma^s_{ij}-\sigma_{ij}).\tag{A.1}$$

Taking expectations on both sides of (A.1) and summing over $i,j=1,\dots,p$ yields

$$E\|\hat\Sigma-\Sigma\|_F^2=E\|\hat\Sigma-\tilde\Sigma^s\|_F^2-\sum_{i=1}^p\sum_{j=1}^p\operatorname{var}(\tilde\sigma^s_{ij})+2\sum_{i=1}^p\sum_{j=1}^p\operatorname{cov}(\hat\sigma_{ij},\tilde\sigma^s_{ij}).$$

Note that $E[(\tilde\sigma^s_{ij}-\sigma_{ij})^2]=\operatorname{var}(\tilde\sigma^s_{ij})$ and $E[(\hat\sigma_{ij}-\sigma_{ij})(\tilde\sigma^s_{ij}-\sigma_{ij})]=\operatorname{cov}(\hat\sigma_{ij},\tilde\sigma^s_{ij})$ because $E\tilde\sigma^s_{ij}=\sigma_{ij}$.

Proof of Lemma 2. The estimators under consideration are translation invariant, so without loss of generality we let $\mu=E(x)=0$. A straightforward calculation based on the bivariate normal distribution gives

$$E(x_i^2x_j^2)=\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2,\tag{A.2}$$

which holds for both $i=j$ and $i\ne j$. Next,

$$E\big((\tilde\sigma^s_{ij})^2\big)=E\Big((n-1)^{-2}\Big(\sum_{k=1}^n x_{k,i}x_{k,j}-n\bar x_i\bar x_j\Big)^2\Big)=(n-1)^{-2}\Big\{E\Big(\Big(\sum_{k=1}^n x_{k,i}x_{k,j}\Big)^2\Big)-2n^{-1}\sum_{k=1}^n E(n\bar x_i\,n\bar x_j\,x_{k,i}x_{k,j})+n^2E(\bar x_i^2\bar x_j^2)\Big\}.\tag{A.3}$$

We also have

$$E\Big(\Big(n^{-1}\sum_{k=1}^n x_{k,i}x_{k,j}\Big)^2\Big)=\frac{1}{n}\operatorname{var}(x_ix_j)+\big(E(x_ix_j)\big)^2=\frac{1}{n}(\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2-\sigma_{ij}^2)+\sigma_{ij}^2=\frac{1}{n}\sigma_{ii}\sigma_{jj}+\frac{n+1}{n}\sigma_{ij}^2.\tag{A.4}$$

Note that $\bar X\sim N(0,\Sigma/n)$. Using (A.2) we have

$$n^2E(\bar x_i^2\bar x_j^2)=\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2.\tag{A.5}$$

Since every cross term containing an isolated factor has mean zero,

$$E(n\bar x_i\,n\bar x_j\,x_{k,i}x_{k,j})=\sum_{1\le l,l'\le n}\big\{I(l=l'\ne k)E(x_{l,i}x_{l,j}x_{k,i}x_{k,j})+I(l=l'=k)E(x_{k,i}^2x_{k,j}^2)\big\}=(n-1)\sigma_{ij}^2+(\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2).\tag{A.6}$$

Substituting (A.4)–(A.6) into (A.3) gives

$$E\big((\tilde\sigma^s_{ij})^2\big)=\frac{n\sigma_{ij}^2+\sigma_{ii}\sigma_{jj}}{n-1}.\tag{A.7}$$

Thus $\operatorname{var}(\tilde\sigma^s_{ij})=E\big((\tilde\sigma^s_{ij})^2\big)-\sigma_{ij}^2=\dfrac{\sigma_{ij}^2+\sigma_{ii}\sigma_{jj}}{n-1}$.
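The moment identities above are easy to check numerically. The following sketch (our own code, not the authors') verifies (A.2) and the variance formula $\operatorname{var}(\tilde\sigma^s_{ij})=(\sigma_{ij}^2+\sigma_{ii}\sigma_{jj})/(n-1)$ by Monte Carlo simulation on a bivariate normal with $\sigma_{ii}=1$, $\sigma_{jj}=2$, $\sigma_{ij}=0.8$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10, 200_000
Sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((reps, n, 2)) @ L.T  # reps independent samples of size n

# (A.2): E(x_i^2 x_j^2) = sigma_ii sigma_jj + 2 sigma_ij^2 = 1*2 + 2*0.64 = 3.28
m22 = np.mean(X[:, 0, 0] ** 2 * X[:, 0, 1] ** 2)

# variance of the unbiased sample covariance entry; theory: (0.64 + 2)/(n-1)
Xc = X - X.mean(axis=1, keepdims=True)
s12 = (Xc[:, :, 0] * Xc[:, :, 1]).sum(axis=1) / (n - 1)
var_hat = np.var(s12)

print(m22, var_hat)
```

Both printed estimates should match their theoretical values (3.28 and 2.64/9) up to Monte Carlo error.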
We now show (2.4) by deriving an expression for $E(\tilde\sigma^s_{ii}\tilde\sigma^s_{jj})$. Expanding $(n-1)^2\tilde\sigma^s_{ii}\tilde\sigma^s_{jj}=\big(\sum_{k=1}^n x_{k,i}^2-n\bar x_i^2\big)\big(\sum_{k'=1}^n x_{k',j}^2-n\bar x_j^2\big)$ and taking expectations gives

$$(n-1)^2E(\tilde\sigma^s_{ii}\tilde\sigma^s_{jj})=\sum_{1\le k,k'\le n}E(x_{k,i}^2x_{k',j}^2)-n\sum_{1\le k'\le n}E(\bar x_i^2x_{k',j}^2)-n\sum_{1\le k\le n}E(\bar x_j^2x_{k,i}^2)+n^2E(\bar x_i^2\bar x_j^2).\tag{A.8}$$

Repeatedly using (A.2) we have

$$\sum_{1\le k,k'\le n}E(x_{k,i}^2x_{k',j}^2)=n^2\sigma_{ii}\sigma_{jj}+2n\sigma_{ij}^2,\tag{A.9}$$

$$n^2E(\bar x_i^2x_{k',j}^2)=\sum_{1\le l,l'\le n}\big\{I(l=l'\ne k')E(x_{l,i}^2x_{k',j}^2)+I(l=l'=k')E(x_{k',i}^2x_{k',j}^2)\big\}=n\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2,\tag{A.10}$$

$$n^2E(\bar x_j^2x_{k,i}^2)=n\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2.\tag{A.11}$$

Substituting (A.5) and (A.9)–(A.11) into (A.8) gives

$$E(\tilde\sigma^s_{ii}\tilde\sigma^s_{jj})=\sigma_{ii}\sigma_{jj}+\frac{2}{n-1}\sigma_{ij}^2.\tag{A.12}$$

Combining (A.7) and (A.12) gives (2.4).
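The diagonal-product moment can likewise be checked by simulation against the classical Wishart identity $\operatorname{cov}(\tilde\sigma^s_{ii},\tilde\sigma^s_{jj})=2\sigma_{ij}^2/(n-1)$, i.e. $E(\tilde\sigma^s_{ii}\tilde\sigma^s_{jj})=\sigma_{ii}\sigma_{jj}+2\sigma_{ij}^2/(n-1)$. The sketch below (our own code) uses the same bivariate normal setup as before.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 10, 200_000
Sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
L = np.linalg.cholesky(Sigma)
X = rng.standard_normal((reps, n, 2)) @ L.T  # reps independent samples of size n

# unbiased sample variances of each coordinate, one pair per replication
Xc = X - X.mean(axis=1, keepdims=True)
s11 = (Xc[:, :, 0] ** 2).sum(axis=1) / (n - 1)
s22 = (Xc[:, :, 1] ** 2).sum(axis=1) / (n - 1)

est = np.mean(s11 * s22)
# Wishart identity: sigma_ii sigma_jj + 2 sigma_ij^2 / (n - 1) = 2 + 1.28/9
ref = Sigma[0, 0] * Sigma[1, 1] + 2 * Sigma[0, 1] ** 2 / (n - 1)
print(est, ref)
```

The two printed numbers should agree up to Monte Carlo error.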
Keywords
- Covariance matrix
- Cross-validation
- Frobenius norm
- Operator norms
- SURE
- Tapering estimator