Crowd-sourced assessment of technical skills: An adjunct to urology resident surgical simulation training

Daniel Holst; Timothy M. Kowalewski; Lee W. White; Timothy C. Brand; Jonathan D. Harper; Mathew D. Sorenson; Sarah Kirsch; Thomas S. Lendvay

doi:10.1089/end.2014.0616

Mechanical Engineering

Research output: Contribution to journal › Article › peer-review

68 Scopus citations

Abstract

Crowdsourcing is the practice of obtaining services from a large group of people, typically an online community. Validated methods of evaluating surgical video are time-intensive, expensive, and involve participation of multiple expert surgeons. We sought to obtain valid performance scores of urologic trainees and faculty on a dry-laboratory robotic surgery task module by using crowdsourcing through a web-based grading tool called Crowd Sourced Assessment of Technical Skill (CSATS).

Methods: IRB approval was granted to test the technical skills grading accuracy of Amazon.com Mechanical Turk™ crowd-workers compared to three expert faculty surgeon graders. The two groups assessed dry-laboratory robotic surgical suturing performances of three urology residents (PGY-2,-4,-5) and two faculty using three performance domains from the validated Global Evaluative Assessment of Robotic Skills assessment tool.

Results: After an average of 2 hours 50 minutes, each of the five videos received 50 crowd-worker assessments. The inter-rater reliability (IRR) between the surgeons and crowd was 0.91 using Cronbach's alpha statistic (confidence intervals=0.20-0.92), indicating an agreement level between the two groups of "excellent." The crowds were able to discriminate the surgical level, and both the crowds and the expert faculty surgeon graders scored one senior trainee's performance above a faculty's performance.

Conclusion: Surgery-naive crowd-workers can rapidly assess varying levels of surgical skill accurately relative to a panel of faculty raters. The crowds provided rapid feedback and were inexpensive. CSATS may be a valuable adjunct to surgical simulation training as requirements for more granular and iterative performance tracking of trainees become mandated and commonplace.
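For readers unfamiliar with the statistic named in the Results, the following is a minimal sketch of how Cronbach's alpha can be computed when each rater group is treated as one "item" and each video as one subject. The abstract does not state how the scores were aggregated before the statistic was calculated, so the per-video group means, the variable names, and every number below are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rater columns.

    ratings: list of lists; each inner list holds one rater's (or one
    rater group's) scores for the same subjects, in the same order.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two raters")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("each rater needs the same number (>= 2) of scores")

    # Sum of the sample variances of each rater's scores.
    item_var_sum = sum(variance(r) for r in ratings)
    # Sample variance of the per-subject totals across raters.
    totals = [sum(scores) for scores in zip(*ratings)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical mean scores for five videos (NOT the study's data).
expert_mean = [9.0, 11.5, 13.0, 14.0, 12.5]
crowd_mean = [9.5, 11.0, 12.5, 13.5, 13.0]

alpha = cronbach_alpha([expert_mean, crowd_mean])
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values close to 1 indicate that the two groups rank and score the performances consistently. With only five videos the estimate is imprecise, which is consistent with the wide confidence interval reported in the abstract.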

Original language	English (US)
Pages (from-to)	604-609
Number of pages	6
Journal	Journal of Endourology
Volume	29
Issue number	5
DOIs	https://doi.org/10.1089/end.2014.0616
State	Published - May 1 2015

Bibliographical note

Publisher Copyright:
© 2015 Mary Ann Liebert, Inc.

Cite this

@article{b136cf457b2846099d703474c1a9bcc7,

title = "Crowd-sourced assessment of technical skills: An adjunct to urology resident surgical simulation training",

abstract = "Crowdsourcing is the practice of obtaining services from a large group of people, typically an online community. Validated methods of evaluating surgical video are time-intensive, expensive, and involve participation of multiple expert surgeons. We sought to obtain valid performance scores of urologic trainees and faculty on a dry-laboratory robotic surgery task module by using crowdsourcing through a web-based grading tool called Crowd Sourced Assessment of Technical Skill (CSATS). Methods: IRB approval was granted to test the technical skills grading accuracy of Amazon.com Mechanical Turk{\texttrademark} crowd-workers compared to three expert faculty surgeon graders. The two groups assessed dry-laboratory robotic surgical suturing performances of three urology residents (PGY-2,-4,-5) and two faculty using three performance domains from the validated Global Evaluative Assessment of Robotic Skills assessment tool. Results: After an average of 2 hours 50 minutes, each of the five videos received 50 crowd-worker assessments. The inter-rater reliability (IRR) between the surgeons and crowd was 0.91 using Cronbach's alpha statistic (confidence intervals=0.20-0.92), indicating an agreement level between the two groups of {"}excellent.{"} The crowds were able to discriminate the surgical level, and both the crowds and the expert faculty surgeon graders scored one senior trainee's performance above a faculty's performance. Conclusion: Surgery-naive crowd-workers can rapidly assess varying levels of surgical skill accurately relative to a panel of faculty raters. The crowds provided rapid feedback and were inexpensive. CSATS may be a valuable adjunct to surgical simulation training as requirements for more granular and iterative performance tracking of trainees become mandated and commonplace.",

author = "Daniel Holst and Kowalewski, {Timothy M.} and White, {Lee W.} and Brand, {Timothy C.} and Harper, {Jonathan D.} and Sorenson, {Mathew D.} and Sarah Kirsch and Lendvay, {Thomas S.}",

note = "Publisher Copyright: {\textcopyright} 2015 Mary Ann Liebert, Inc.",

year = "2015",

month = may,

day = "1",

doi = "10.1089/end.2014.0616",

language = "English (US)",

volume = "29",

pages = "604--609",

journal = "Journal of Endourology",

issn = "0892-7790",

publisher = "Mary Ann Liebert Inc.",

number = "5",

}

TY - JOUR

T1 - Crowd-sourced assessment of technical skills

T2 - An adjunct to urology resident surgical simulation training

AU - Holst, Daniel

AU - Kowalewski, Timothy M.

AU - White, Lee W.

AU - Brand, Timothy C.

AU - Harper, Jonathan D.

AU - Sorenson, Mathew D.

AU - Kirsch, Sarah

AU - Lendvay, Thomas S.

PY - 2015/5/1

Y1 - 2015/5/1

AB - Crowdsourcing is the practice of obtaining services from a large group of people, typically an online community. Validated methods of evaluating surgical video are time-intensive, expensive, and involve participation of multiple expert surgeons. We sought to obtain valid performance scores of urologic trainees and faculty on a dry-laboratory robotic surgery task module by using crowdsourcing through a web-based grading tool called Crowd Sourced Assessment of Technical Skill (CSATS). Methods: IRB approval was granted to test the technical skills grading accuracy of Amazon.com Mechanical Turk™ crowd-workers compared to three expert faculty surgeon graders. The two groups assessed dry-laboratory robotic surgical suturing performances of three urology residents (PGY-2,-4,-5) and two faculty using three performance domains from the validated Global Evaluative Assessment of Robotic Skills assessment tool. Results: After an average of 2 hours 50 minutes, each of the five videos received 50 crowd-worker assessments. The inter-rater reliability (IRR) between the surgeons and crowd was 0.91 using Cronbach's alpha statistic (confidence intervals=0.20-0.92), indicating an agreement level between the two groups of "excellent." The crowds were able to discriminate the surgical level, and both the crowds and the expert faculty surgeon graders scored one senior trainee's performance above a faculty's performance. Conclusion: Surgery-naive crowd-workers can rapidly assess varying levels of surgical skill accurately relative to a panel of faculty raters. The crowds provided rapid feedback and were inexpensive. CSATS may be a valuable adjunct to surgical simulation training as requirements for more granular and iterative performance tracking of trainees become mandated and commonplace.

UR - http://www.scopus.com/inward/record.url?scp=84928944763&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84928944763&partnerID=8YFLogxK

U2 - 10.1089/end.2014.0616

DO - 10.1089/end.2014.0616

M3 - Article

C2 - 25356517

AN - SCOPUS:84928944763

SN - 0892-7790

VL - 29

SP - 604

EP - 609

JO - Journal of Endourology

JF - Journal of Endourology

IS - 5

ER -
