"A Primer on the Use of Equivalence Testing for Evaluating Measurement Agreement" by Philip M. Dixon

Article

Medicine & Science in Sports & Exercise

Philip M. Dixon, Iowa State University
Pedro F. Saint-Maurice, Iowa State University
Youngwon Kim, University of Cambridge School of Clinical Medicine
Paul Hibbing, University of Tennessee, Knoxville
Yang Bai, University of Vermont
Gregory Welk, Iowa State University

Document Type

Article

Publication Version

Published Version

Publication Date

4-1-2018

DOI

10.1249/MSS.0000000000001481

Abstract

Purpose: Statistical equivalence testing is more appropriate than conventional tests of difference to assess the validity of physical activity (PA) measures. This article presents the underlying principles of equivalence testing and gives three examples from PA and fitness assessment research.

Methods: The three examples illustrate different uses of equivalence tests. Example 1 uses PA data to evaluate an activity monitor’s equivalence to a known criterion. Example 2 illustrates the equivalence of two field-based measures of physical fitness with no known reference method. Example 3 uses regression to evaluate an activity monitor’s equivalence across a suite of 23 activities.

Results: The examples illustrate the appropriate reporting and interpretation of results from equivalence tests. In the first example, the mean criterion measure is significantly within ±15% of the mean PA monitor. The mean difference is 0.18 METs and the 90% confidence interval of −0.15 to 0.52 is inside the equivalence region of −0.65 to 0.65. In the second example, we chose to define equivalence for these two measures as a ratio of mean values between 0.98 and 1.02. The estimated ratio of mean V̇O2 values is 0.99, which is significantly (P = 0.007) inside the equivalence region. In the third example, the PA monitor is not equivalent to the criterion across the suite of activities. The estimated regression intercept and slope are −1.23 and 1.06. Neither confidence interval is within the suggested regression equivalence regions.
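The decision rule behind the first example can be sketched in a few lines of Python. This is only an illustration of the confidence-interval inclusion rule, under the usual assumption that a 90% confidence interval corresponds to two one-sided tests at the 5% level; it is not the authors' code. The function name `ci_within_region` is hypothetical, the first set of numbers is taken from the abstract, and the second interval is invented purely to show a non-equivalent case.

```python
def ci_within_region(ci_low, ci_high, region_low, region_high):
    """Return True when the confidence interval lies entirely inside
    the equivalence region (hypothetical helper, not from the article)."""
    if ci_low > ci_high or region_low > region_high:
        raise ValueError("bounds must be ordered low <= high")
    return region_low < ci_low and ci_high < region_high


# Example 1 as reported in the abstract (all values in METs):
# 90% CI of the mean difference is -0.15 to 0.52,
# equivalence region is -0.65 to 0.65.
example1_equivalent = ci_within_region(-0.15, 0.52, -0.65, 0.65)

# Invented interval (NOT from the article) whose upper bound
# crosses the region limit, so equivalence is not demonstrated.
invented_equivalent = ci_within_region(-0.15, 0.80, -0.65, 0.65)

print(example1_equivalent)   # True
print(invented_equivalent)   # False
```

The check says nothing about how the interval itself is computed; it only encodes the reporting logic that equivalence is claimed when the whole interval sits inside the pre-specified region.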

Conclusions: When the study goal is to show similarity between methods, equivalence testing is more appropriate than traditional statistical tests of differences (e.g., ANOVA and t-tests).

Comments

This article is published as Dixon, P.M., Saint-Maurice, P.F., Kim, Y., Hibbing, P., Bai, Y. and Welk, G.J. 2018. A primer on the use of equivalence testing for evaluating measurement agreement. Medicine & Science in Sports & Exercise 50:837-845. doi: 10.1249/MSS.0000000000001481.

Creative Commons License

Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International

The Authors

2017

File Format

application/pdf

Citation Information

Philip M. Dixon, Pedro F. Saint-Maurice, Youngwon Kim, Paul Hibbing, et al. "A Primer on the Use of Equivalence Testing for Evaluating Measurement Agreement" Medicine & Science in Sports & Exercise Vol. 50 Iss. 4 (2018) pp. 837-845
Available at: http://works.bepress.com/philip-dixon/65/