Validation of automated scoring of science assessments

Ou Lydia Liu; Joseph A. Rios; Michael Heilman; Libby Gerard; Marcia C. Linn

doi:10.1002/tea.21299

Research output: Contribution to journal › Article › peer-review

94 Scopus citations

Abstract

Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment.

Original language	English (US)
Pages (from-to)	215-233
Number of pages	19
Journal	Journal of Research in Science Teaching
Volume	53
Issue number	2
DOIs	https://doi.org/10.1002/tea.21299
State	Published - Feb 1 2016
Externally published	Yes

Bibliographical note

Funding Information:
National Science Foundation; Contract grant number: 1119670. This material is based upon work supported by the National Science Foundation under Grant No. 1119670. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Publisher Copyright:
© 2016 Wiley Periodicals, Inc.

Keywords

automated scoring
c-rater-ML
science assessment

Cite this

@article{0d3f5c9267784edc9abf0ed36b393a30,
  title = "Validation of automated scoring of science assessments",
  abstract = "Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment.",
  keywords = "automated scoring, c-rater-ML, science assessment",
  author = "Liu, {Ou Lydia} and Rios, {Joseph A.} and Heilman, Michael and Gerard, Libby and Linn, {Marcia C.}",
  note = "Funding Information: National Science Foundation; Contract grant number: 1119670. This material is based upon work supported by the National Science Foundation under Grant No. 1119670. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Publisher Copyright: {\textcopyright} 2016 Wiley Periodicals, Inc.",
  year = "2016",
  month = feb,
  day = "1",
  doi = "10.1002/tea.21299",
  language = "English (US)",
  volume = "53",
  pages = "215--233",
  journal = "Journal of Research in Science Teaching",
  issn = "0022-4308",
  publisher = "John Wiley and Sons Inc.",
  number = "2",
}

TY  - JOUR
T1  - Validation of automated scoring of science assessments
AU  - Liu, Ou Lydia
AU  - Rios, Joseph A.
AU  - Heilman, Michael
AU  - Gerard, Libby
AU  - Linn, Marcia C.
N1  - Funding Information: National Science Foundation; Contract grant number: 1119670. This material is based upon work supported by the National Science Foundation under Grant No. 1119670. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. Publisher Copyright: © 2016 Wiley Periodicals, Inc.
PY  - 2016/2/1
Y1  - 2016/2/1
N2  - Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment.
AB  - Constructed response items can both measure the coherence of student ideas and serve as reflective experiences to strengthen instruction. We report on new automated scoring technologies that can reduce the cost and complexity of scoring constructed-response items. This study explored the accuracy of c-rater-ML, an automated scoring engine developed by Educational Testing Service, for scoring eight science inquiry items that require students to use evidence to explain complex phenomena. Automated scoring showed satisfactory agreement with human scoring for all test takers as well as specific subgroups. These findings suggest that c-rater-ML offers a promising solution to scoring constructed-response science items and has the potential to increase the use of these items in both instruction and assessment.
KW  - automated scoring
KW  - c-rater-ML
KW  - science assessment
UR  - http://www.scopus.com/inward/record.url?scp=84954103223&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84954103223&partnerID=8YFLogxK
U2  - 10.1002/tea.21299
DO  - 10.1002/tea.21299
M3  - Article
AN  - SCOPUS:84954103223
SN  - 0022-4308
VL  - 53
SP  - 215
EP  - 233
JO  - Journal of Research in Science Teaching
JF  - Journal of Research in Science Teaching
IS  - 2
ER  - 
