The role of publicly available data in MICCAI papers from 2014 to 2018

Nicholas Heller; Jack Rickman; Christopher Weight; Nikolaos Papanikolopoulos

doi:10.1007/978-3-030-33642-4_8

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Widely-used public benchmarks are of huge importance to computer vision and machine learning research, especially with the computational resources required to reproduce state of the art results quickly becoming untenable. In medical image computing, the wide variety of image modalities and problem formulations yields a huge task-space for benchmarks to cover, and thus the widespread adoption of standard benchmarks has been slow, and barriers to releasing medical data exacerbate this issue. In this paper, we examine the role that publicly available data has played in MICCAI papers from the past five years. We find that more than half of these papers are based on private data alone, although this proportion seems to be decreasing over time. Additionally, we observed that after controlling for open access publication and the release of code, papers based on public data were cited over 60% more per year than their private-data counterparts. Further, we found that more than 20% of papers using public data did not provide a citation to the dataset or associated manuscript, highlighting the “second-rate” status that data contributions often take compared to theoretical ones. We conclude by making recommendations for MICCAI policies which could help to better incentivise data sharing and move the field toward more efficient and reproducible science.

Original language	English (US)
Title of host publication	Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings
Editors	Luping Zhou, Nicholas Heller, Yiyu Shi, Danny Chen, X. Sharon Hu, Yiming Xiao, Raphael Sznitman, Veronika Cheplygina, Diana Mateus, Emanuele Trucco, Matthieu Chabanas, Hassan Rivaz, Ingerid Reinertsen
Publisher	Springer
Pages	70-77
Number of pages	8
ISBN (Print)	9783030336417
DOIs	https://doi.org/10.1007/978-3-030-33642-4_8
State	Published - 2019
Event	4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the 1st International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the 2nd International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019 - Shenzhen, China
Duration	Oct 17 2019 → Oct 17 2019

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11851 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the 1st International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the 2nd International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019
Country/Territory	China
City	Shenzhen
Period	10/17/19 → 10/17/19

Bibliographical note

Funding Information:
Acknowledgements. Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA225435. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Publisher Copyright:
© Springer Nature Switzerland AG 2019.

Cite this

Heller, N., Rickman, J., Weight, C., & Papanikolopoulos, N. (2019). The role of publicly available data in MICCAI papers from 2014 to 2018. In L. Zhou, N. Heller, Y. Shi, D. Chen, X. S. Hu, Y. Xiao, R. Sznitman, V. Cheplygina, D. Mateus, E. Trucco, M. Chabanas, H. Rivaz, & I. Reinertsen (Eds.), Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings (pp. 70-77). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11851 LNCS). Springer. https://doi.org/10.1007/978-3-030-33642-4_8

The role of publicly available data in MICCAI papers from 2014 to 2018. / Heller, Nicholas; Rickman, Jack; Weight, Christopher et al.
Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings. ed. / Luping Zhou; Nicholas Heller; Yiyu Shi; Danny Chen; X. Sharon Hu; Yiming Xiao; Raphael Sznitman; Veronika Cheplygina; Diana Mateus; Emanuele Trucco; Matthieu Chabanas; Hassan Rivaz; Ingerid Reinertsen. Springer, 2019. p. 70-77 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11851 LNCS).

Heller, N, Rickman, J, Weight, C & Papanikolopoulos, N 2019, The role of publicly available data in MICCAI papers from 2014 to 2018. in L Zhou, N Heller, Y Shi, D Chen, XS Hu, Y Xiao, R Sznitman, V Cheplygina, D Mateus, E Trucco, M Chabanas, H Rivaz & I Reinertsen (eds), Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11851 LNCS, Springer, pp. 70-77, 4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the 1st International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the 2nd International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019, Shenzhen, China, 10/17/19. https://doi.org/10.1007/978-3-030-33642-4_8

Heller N, Rickman J, Weight C, Papanikolopoulos N. The role of publicly available data in MICCAI papers from 2014 to 2018. In Zhou L, Heller N, Shi Y, Chen D, Hu XS, Xiao Y, Sznitman R, Cheplygina V, Mateus D, Trucco E, Chabanas M, Rivaz H, Reinertsen I, editors, Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings. Springer. 2019. p. 70-77. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-33642-4_8

Heller, Nicholas ; Rickman, Jack ; Weight, Christopher et al. / The role of publicly available data in MICCAI papers from 2014 to 2018. Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings. editor / Luping Zhou ; Nicholas Heller ; Yiyu Shi ; Danny Chen ; X. Sharon Hu ; Yiming Xiao ; Raphael Sznitman ; Veronika Cheplygina ; Diana Mateus ; Emanuele Trucco ; Matthieu Chabanas ; Hassan Rivaz ; Ingerid Reinertsen. Springer, 2019. pp. 70-77 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{2b0ba37b81dd499f8d7e99157adad15c,

title = "The role of publicly available data in MICCAI papers from 2014 to 2018",

abstract = "Widely-used public benchmarks are of huge importance to computer vision and machine learning research, especially with the computational resources required to reproduce state of the art results quickly becoming untenable. In medical image computing, the wide variety of image modalities and problem formulations yields a huge task-space for benchmarks to cover, and thus the widespread adoption of standard benchmarks has been slow, and barriers to releasing medical data exacerbate this issue. In this paper, we examine the role that publicly available data has played in MICCAI papers from the past five years. We find that more than half of these papers are based on private data alone, although this proportion seems to be decreasing over time. Additionally, we observed that after controlling for open access publication and the release of code, papers based on public data were cited over 60% more per year than their private-data counterparts. Further, we found that more than 20% of papers using public data did not provide a citation to the dataset or associated manuscript, highlighting the “second-rate” status that data contributions often take compared to theoretical ones. We conclude by making recommendations for MICCAI policies which could help to better incentivise data sharing and move the field toward more efficient and reproducible science.",

author = "Nicholas Heller and Jack Rickman and Christopher Weight and Nikolaos Papanikolopoulos",

note = "Funding Information: Acknowledgements. Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA225435. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Publisher Copyright: {\textcopyright} Springer Nature Switzerland AG 2019.; 4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the 1st International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the 2nd International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019 ; Conference date: 17-10-2019 Through 17-10-2019",

year = "2019",

doi = "10.1007/978-3-030-33642-4_8",

language = "English (US)",

isbn = "9783030336417",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer",

pages = "70--77",

editor = "Luping Zhou and Nicholas Heller and Yiyu Shi and Danny Chen and Hu, {X. Sharon} and Yiming Xiao and Raphael Sznitman and Veronika Cheplygina and Diana Mateus and Emanuele Trucco and Matthieu Chabanas and Hassan Rivaz and Ingerid Reinertsen",

booktitle = "Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings",

}

TY - GEN

T1 - The role of publicly available data in MICCAI papers from 2014 to 2018

AU - Heller, Nicholas

AU - Rickman, Jack

AU - Weight, Christopher

AU - Papanikolopoulos, Nikolaos

N1 - Funding Information: Acknowledgements. Research reported in this publication was supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA225435. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Publisher Copyright: © Springer Nature Switzerland AG 2019.

PY - 2019

Y1 - 2019

AB - Widely-used public benchmarks are of huge importance to computer vision and machine learning research, especially with the computational resources required to reproduce state of the art results quickly becoming untenable. In medical image computing, the wide variety of image modalities and problem formulations yields a huge task-space for benchmarks to cover, and thus the widespread adoption of standard benchmarks has been slow, and barriers to releasing medical data exacerbate this issue. In this paper, we examine the role that publicly available data has played in MICCAI papers from the past five years. We find that more than half of these papers are based on private data alone, although this proportion seems to be decreasing over time. Additionally, we observed that after controlling for open access publication and the release of code, papers based on public data were cited over 60% more per year than their private-data counterparts. Further, we found that more than 20% of papers using public data did not provide a citation to the dataset or associated manuscript, highlighting the “second-rate” status that data contributions often take compared to theoretical ones. We conclude by making recommendations for MICCAI policies which could help to better incentivise data sharing and move the field toward more efficient and reproducible science.

UR - http://www.scopus.com/inward/record.url?scp=85076700777&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85076700777&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-33642-4_8

DO - 10.1007/978-3-030-33642-4_8

M3 - Conference contribution

AN - SCOPUS:85076700777

SN - 9783030336417

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 70

EP - 77

BT - Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention - International Workshops, LABELS 2019, HAL-MICCAI 2019, and CuRIOUS 2019, held in Conjunction with MICCAI 2019, Proceedings

A2 - Zhou, Luping

A2 - Heller, Nicholas

A2 - Shi, Yiyu

A2 - Chen, Danny

A2 - Hu, X. Sharon

A2 - Xiao, Yiming

A2 - Sznitman, Raphael

A2 - Cheplygina, Veronika

A2 - Mateus, Diana

A2 - Trucco, Emanuele

A2 - Chabanas, Matthieu

A2 - Rivaz, Hassan

A2 - Reinertsen, Ingerid

PB - Springer

T2 - 4th International Workshop on Large-Scale Annotation of Biomedical Data and Expert Label Synthesis, LABELS 2019, the 1st International Workshop on Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention, HAL-MICCAI 2019, and the 2nd International Workshop on Correction of Brainshift with Intra-Operative Ultrasound, CuRIOUS 2019, held in conjunction with the 22nd International Conference on Medical Imaging and Computer-Assisted Intervention, MICCAI 2019

Y2 - 17 October 2019 through 17 October 2019

ER -
