Evaluating semantic relatedness and similarity measures with Standardized MedDRA Queries.

Robert W. Bill; Ying Liu; Bridget T. McInnes; Genevieve B Melton-Meaux; Ted Pedersen; Serguei V Pakhomov

Research output: Contribution to journal › Article › peer-review

Abstract

A potential use of automated concept similarity and relatedness measures is to improve automatic detection of clinical text that relates to a condition indicative of an adverse drug reaction. This is also one of the purposes of the Medical Dictionary for Regulatory Activities (MedDRA) Standardized Queries (SMQ). An expert panel evaluates SMQs for their ability to detect a condition of interest and thus qualifies them as a reference standard for evaluating automated approaches. We compare similarity and relatedness measurement methods on rates of correctly identifying intra-category and inter-category concept pairs from SMQ data to create ROC curves of each method's sensitivity and specificity. Results indicate an information content measure, specifically the Resnik method, achieved the highest results as measured by area under the curve, but using two different measures as predictors, Resnik and Lin, obtained the highest score. Overall, using SMQ data resulted in a productive method of evaluating automated semantic relatedness and similarity scores.
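The evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the information content (IC) values and concept pairs below are hypothetical toy data, and the function names are chosen for illustration only. It scores each pair with the Resnik and Lin measures, then computes area under the ROC curve as the probability that an intra-category pair outscores an inter-category pair. The combined Resnik and Lin predictor mentioned in the abstract is not covered here.

```python
from itertools import product


def resnik(ic_lcs: float) -> float:
    """Resnik similarity: the IC of the least common subsumer (LCS)."""
    return ic_lcs


def lin(ic_a: float, ic_b: float, ic_lcs: float) -> float:
    """Lin similarity: 2 * IC(LCS) / (IC(a) + IC(b)), or 0.0 if the sum is 0."""
    denom = ic_a + ic_b
    return 2.0 * ic_lcs / denom if denom > 0 else 0.0


def auc(intra_scores, inter_scores) -> float:
    """Area under the ROC curve in its rank (Mann-Whitney) form.

    Counts how often an intra-category pair scores above an
    inter-category pair; ties count as half a win.
    """
    wins = 0.0
    for pos, neg in product(intra_scores, inter_scores):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(intra_scores) * len(inter_scores))


# Hypothetical toy data: (IC(a), IC(b), IC(LCS)) for each concept pair.
# Intra-category pairs come from the same SMQ, inter-category pairs do not.
intra_pairs = [(5.0, 6.0, 4.0), (4.0, 4.0, 3.0), (6.0, 5.0, 1.0)]
inter_pairs = [(5.0, 5.0, 1.0), (6.0, 4.0, 2.0), (4.0, 6.0, 0.5)]

resnik_auc = auc(
    [resnik(lcs) for _, _, lcs in intra_pairs],
    [resnik(lcs) for _, _, lcs in inter_pairs],
)
lin_auc = auc(
    [lin(a, b, lcs) for a, b, lcs in intra_pairs],
    [lin(a, b, lcs) for a, b, lcs in inter_pairs],
)

print(f"Resnik AUC: {resnik_auc:.3f}")
print(f"Lin AUC:    {lin_auc:.3f}")
```

In a real evaluation the IC values would be estimated from a corpus or from the terminology's structure, and the pairs would be drawn from the SMQ groupings rather than typed in by hand.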

Original language	English (US)
Pages (from-to)	43-50
Number of pages	8
Journal	Unknown Journal
Volume	2012
State	Published - 2012

Cite this

@article{f28eb01f9cd743198a8abbfae7876e2d,
  title = "Evaluating semantic relatedness and similarity measures with Standardized MedDRA Queries.",
  abstract = "A potential use of automated concept similarity and relatedness measures is to improve automatic detection of clinical text that relates to a condition indicative of an adverse drug reaction. This is also one of the purposes of the Medical Dictionary for Regulatory Activities (MedDRA) Standardized Queries (SMQ). An expert panel evaluates SMQs for their ability to detect a condition of interest and thus qualifies them as a reference standard for evaluating automated approaches. We compare similarity and relatedness measurement methods on rates of correctly identifying intra-category and inter-category concept pairs from SMQ data to create ROC curves of each method's sensitivity and specificity. Results indicate an information content measure, specifically the Resnik method, achieved the highest results as measured by area under the curve, but using two different measures as predictors, Resnik and Lin, obtained the highest score. Overall, using SMQ data resulted in a productive method of evaluating automated semantic relatedness and similarity scores.",
  author = "Bill, {Robert W.} and Ying Liu and McInnes, {Bridget T.} and Melton-Meaux, {Genevieve B} and Ted Pedersen and Pakhomov, {Serguei V}",
  year = "2012",
  language = "English (US)",
  volume = "2012",
  pages = "43--50",
  journal = "Unknown Journal",
  issn = "0022-1120",
  publisher = "Cambridge University Press",
}

TY - JOUR

T1 - Evaluating semantic relatedness and similarity measures with Standardized MedDRA Queries.

AU - Bill, Robert W.

AU - Liu, Ying

AU - McInnes, Bridget T.

AU - Melton-Meaux, Genevieve B

AU - Pedersen, Ted

AU - Pakhomov, Serguei V

PY - 2012

Y1 - 2012

AB - A potential use of automated concept similarity and relatedness measures is to improve automatic detection of clinical text that relates to a condition indicative of an adverse drug reaction. This is also one of the purposes of the Medical Dictionary for Regulatory Activities (MedDRA) Standardized Queries (SMQ). An expert panel evaluates SMQs for their ability to detect a condition of interest and thus qualifies them as a reference standard for evaluating automated approaches. We compare similarity and relatedness measurement methods on rates of correctly identifying intra-category and inter-category concept pairs from SMQ data to create ROC curves of each method's sensitivity and specificity. Results indicate an information content measure, specifically the Resnik method, achieved the highest results as measured by area under the curve, but using two different measures as predictors, Resnik and Lin, obtained the highest score. Overall, using SMQ data resulted in a productive method of evaluating automated semantic relatedness and similarity scores.

UR - http://www.scopus.com/inward/record.url?scp=84880828010&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84880828010&partnerID=8YFLogxK

M3 - Article

C2 - 23304271

AN - SCOPUS:84880828010

SN - 0022-1120

VL - 2012

SP - 43

EP - 50

JO - Unknown Journal

JF - Unknown Journal

ER -
