Purpose: Summary scores, whether from crowds of non-experts, faculty surgeons, or automated performance metrics, have been trusted as the prevailing method of reporting surgeon technical skill. The aim of this paper is to learn whether the technical skill assessments of a surgeon fluctuate significantly over long durations of surgical footage.
Methods: A set of 12 videos of common robotic surgeries on human patients was used to evaluate the perceived technical skill at each individual minute of the surgical videos, which were originally 12–15 min in length. A linear mixed-effects model for each video was used to compare the ratings of each minute with those of every other minute in order to learn whether a change in scores over time can be detected and reliably measured apart from inter- and intrarater variation.
Results: Modeling the change over time of the Global Evaluative Assessment of Robotic Skills (GEARS) scores significantly contributed to the prediction models for 11 of the 12 surgeons. This demonstrates that measurable changes in technical skill occur over time during robotic surgery.
Conclusion: These findings raise questions about the optimal duration of footage that must be evaluated to arrive at an accurate rating of surgical technical skill for longer procedures. Such variation may imply non-negligible label noise for supervised machine learning approaches. In the future, it may be necessary to report a surgeon’s skill variability in addition to their mean score to properly characterize a surgeon’s overall skill level.
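One way to write down the per-video model described in the Methods is sketched below. The abstract does not give the exact specification, so the symbols and the random-effects structure (a single random intercept per rater, with the first minute as the reference level) are assumptions made for illustration, not the authors' stated model.

```latex
% Illustrative sketch only: the random-intercept structure and the
% reference-level coding (\tau_1 = 0) are assumptions, not taken from the paper.
% y_{rm}: GEARS score given by rater r to minute m of one video (m = 1, ..., M)
\begin{align}
  y_{rm} &= \mu + \tau_m + b_r + \varepsilon_{rm}, \qquad \tau_1 = 0, \\
  b_r &\sim \mathcal{N}(0, \sigma_b^2), \qquad
  \varepsilon_{rm} \sim \mathcal{N}(0, \sigma^2), \\
  H_0 &: \tau_2 = \tau_3 = \dots = \tau_M = 0 .
\end{align}
```

Under these assumptions, \tau_m is the fixed effect of minute m relative to the first minute, b_r absorbs inter-rater differences, and \varepsilon_{rm} captures residual variation, including intrarater variation. Rejecting H_0, for example with a likelihood-ratio test against the model without the \tau_m terms, would correspond to the time effect significantly contributing to the prediction model for that video.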
|Original language||English (US)|
|Number of pages||7|
|Journal||International Journal of Computer Assisted Radiology and Surgery|
|State||Published - Dec 2020|
Bibliographical note
Funding Information:
This work was supported, in part, by the Office of the Assistant Secretary of Defense for Health Affairs under Award No. W81XWH-15-2-0030 and by the National Science Foundation CAREER Grant under Award No. 1847610. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the Department of Defense, the National Science Foundation or the National Institutes of Health’s National Center for Advancing Translational Sciences.
Keywords
- Crowd sourcing
- Surgical technical skill
- Video segmentation
PubMed: MeSH publication types
- Journal Article