Optimally estimating the sample standard deviation from the five-number summary

Jiandong Shi, Dehui Luo, Hong Weng, Xian Tao Zeng, Lu Lin, Haitao Chu, Tiejun Tong

Research output: Contribution to journal › Article › peer-review

215 Scopus citations

Abstract

When reporting the results of clinical studies, some researchers may choose the five-number summary (including the sample median, the first and third quartiles, and the minimum and maximum values) rather than the sample mean and standard deviation (SD), particularly for skewed data. When such studies are included in a meta-analysis, it is often desirable to convert the five-number summary back to the sample mean and SD. Several methods have been proposed for this purpose in the recent literature, and they are increasingly used. In this article, we propose to further advance the literature by developing a smoothly weighted estimator for the sample SD that fully utilizes the sample size information. For ease of implementation, we also derive an approximation formula for the optimal weight, as well as a shortcut formula for the sample SD. Numerical results show that our new estimator provides a more accurate estimate for normal data and also performs favorably for non-normal data. Together with the optimal sample mean estimator in Luo et al., our new methods have dramatically improved the existing methods for data transformation, and they are capable of serving as “rules of thumb” in meta-analysis for studies reported with the five-number summary. Finally, for practical use, an Excel spreadsheet and an online calculator are also provided for implementing our optimal estimators.
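
As a rough illustration of what a smoothly weighted SD estimator of this kind could look like, the Python sketch below blends a range-based estimate and an interquartile-range-based estimate with a weight that depends on the sample size. The abstract does not state the formulas, so everything specific here is an assumption to be checked against the article itself: the function name `estimate_sd_five_number`, the normalizing constants computed from normal quantiles at (n - 0.375)/(n + 0.25) and (0.75n - 0.125)/(n + 0.25), and the approximate weight 1/(1 + 0.07 n^0.6).

```python
from statistics import NormalDist


def estimate_sd_five_number(a, q1, q3, b, n):
    """Sketch of a weighted SD estimate from min (a), quartiles (q1, q3), max (b).

    All constants below are assumptions for illustration and are not given
    in the abstract; verify them against the published article before use.
    The sample median is not needed for the SD estimate.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not (a <= q1 <= q3 <= b):
        raise ValueError("expected a <= q1 <= q3 <= b")

    z = NormalDist().inv_cdf  # standard normal quantile function

    # Assumed normalizers: expected range and IQR of n standard normal draws.
    xi = 2.0 * z((n - 0.375) / (n + 0.25))
    eta = 2.0 * z((0.75 * n - 0.125) / (n + 0.25))

    # Assumed approximate weight on the range-based term; it shrinks as n
    # grows, so the IQR-based term dominates for large samples.
    w = 1.0 / (1.0 + 0.07 * n ** 0.6)

    return w * (b - a) / xi + (1.0 - w) * (q3 - q1) / eta


if __name__ == "__main__":
    # Hypothetical study reporting min=10, Q1=18, Q3=30, max=52 with n=80.
    print(estimate_sd_five_number(10, 18, 30, 52, 80))
```

The weight varies continuously with n rather than switching between estimators at a fixed sample-size cutoff, which is one way to read "smoothly weighted" in the abstract.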

Original language: English (US)
Pages (from-to): 641-654
Number of pages: 14
Journal: Research Synthesis Methods
Volume: 11
Issue number: 5
State: Published - Sep 1 2020

Bibliographical note

Funding Information:
The authors thank the editor, the associate editor, and the reviewer for their constructive comments that have led to a significant improvement of the paper. Lu Lin's research was supported in part by the National Natural Science Foundation of China (No. 11971265). Haitao Chu's research was supported in part by the NIH National Library of Medicine (No. R01LM012982). Tiejun Tong's research was supported in part by the Initiation Grant for Faculty Niche Research Areas (No. RC‐IG‐FNRA/17‐18/13) and the Century Club Sponsorship Scheme of Hong Kong Baptist University, the National Natural Science Foundation of China (No. 11671338), and the General Research Fund (No. HKBU12303918).

Funding Information:
General Research Fund, Grant/Award Number: HKBU12303918; The General Program, Grant/Award Numbers: 11671338, 11971265; The Century Club Sponsorship Scheme; Initiation Grant for Faculty Niche Research Areas, Grant/Award Number: RC‐IG‐FNRA/17‐18/13; National Library of Medicine, Grant/Award Number: R01LM012982

Publisher Copyright:
© 2020 John Wiley & Sons, Ltd

Keywords

  • SD
  • five-number summary
  • interquartile range
  • range
  • sample mean
  • sample size
