Decisions that make a difference in detecting differential item functioning

Stephen G. Sireci; Joseph Rios

doi:10.1080/13803611.2013.767621

Research output: Contribution to journal › Article › peer-review

42 Scopus citations

Abstract

There are numerous statistical procedures for detecting items that function differently across subgroups of examinees that take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions that affect investigations of differential item functioning (DIF) such as choice of method, sample size, effect size criteria, conditioning variable, purification, DIF amplification, DIF cancellation, and research designs for evaluating DIF. Our review highlights the necessity of matching the DIF procedure to the nature of the data analysed, the need to include effect size criteria, the need to consider the direction and balance of items flagged for DIF, and the need to use replication to reduce Type I errors whenever possible. Directions for future research and practice in using DIF to enhance the validity of test scores are provided.

Original language: English (US)
Pages (from-to): 170-187
Number of pages: 18
Journal: Educational Research and Evaluation
Volume: 19
Issue number: 2-3
DOI: https://doi.org/10.1080/13803611.2013.767621
State: Published - Apr 1 2013
Externally published: Yes

Keywords

differential item functioning
item bias
validity

Cite this

@article{a074b1413dcb4198b09bf972631fe74d,

title = "Decisions that make a difference in detecting differential item functioning",

abstract = "There are numerous statistical procedures for detecting items that function differently across subgroups of examinees that take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions that affect investigations of differential item functioning (DIF) such as choice of method, sample size, effect size criteria, conditioning variable, purification, DIF amplification, DIF cancellation, and research designs for evaluating DIF. Our review highlights the necessity of matching the DIF procedure to the nature of the data analysed, the need to include effect size criteria, the need to consider the direction and balance of items flagged for DIF, and the need to use replication to reduce Type I errors whenever possible. Directions for future research and practice in using DIF to enhance the validity of test scores are provided.",

keywords = "differential item functioning, item bias, validity",

author = "Sireci, Stephen G. and Rios, Joseph",

year = "2013",

month = apr,

day = "1",

doi = "10.1080/13803611.2013.767621",

language = "English (US)",

volume = "19",

pages = "170--187",

journal = "Educational Research and Evaluation",

issn = "1380-3611",

publisher = "Taylor and Francis Ltd.",

number = "2-3",

}

TY - JOUR

T1 - Decisions that make a difference in detecting differential item functioning

AU - Sireci, Stephen G.

AU - Rios, Joseph

PY - 2013/4/1

Y1 - 2013/4/1

N2 - There are numerous statistical procedures for detecting items that function differently across subgroups of examinees that take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions that affect investigations of differential item functioning (DIF) such as choice of method, sample size, effect size criteria, conditioning variable, purification, DIF amplification, DIF cancellation, and research designs for evaluating DIF. Our review highlights the necessity of matching the DIF procedure to the nature of the data analysed, the need to include effect size criteria, the need to consider the direction and balance of items flagged for DIF, and the need to use replication to reduce Type I errors whenever possible. Directions for future research and practice in using DIF to enhance the validity of test scores are provided.

AB - There are numerous statistical procedures for detecting items that function differently across subgroups of examinees that take a test or survey. However, in endeavouring to detect items that may function differentially, selection of the statistical method is only one of many important decisions. In this article, we discuss the important decisions that affect investigations of differential item functioning (DIF) such as choice of method, sample size, effect size criteria, conditioning variable, purification, DIF amplification, DIF cancellation, and research designs for evaluating DIF. Our review highlights the necessity of matching the DIF procedure to the nature of the data analysed, the need to include effect size criteria, the need to consider the direction and balance of items flagged for DIF, and the need to use replication to reduce Type I errors whenever possible. Directions for future research and practice in using DIF to enhance the validity of test scores are provided.

KW - differential item functioning

KW - item bias

KW - validity

UR - http://www.scopus.com/inward/record.url?scp=84875871533&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84875871533&partnerID=8YFLogxK

U2 - 10.1080/13803611.2013.767621

DO - 10.1080/13803611.2013.767621

M3 - Article

AN - SCOPUS:84875871533

SN - 1380-3611

VL - 19

SP - 170

EP - 187

JO - Educational Research and Evaluation

JF - Educational Research and Evaluation

IS - 2-3

ER -
