ORIGINAL RESEARCH| Volume 103, ISSUE 5, SUPPLEMENT , S84-S107.e38, May 2022

Examination of the Measurement Equivalence of the Functional Assessment in Acute Care MCAT (FAMCAT) Mobility Item Bank Using Differential Item Functioning Analyses


      Objective

      To assess differential item functioning (DIF) in an item pool measuring the mobility of hospitalized patients across educational, age, and sex groups.

      Design

      Measurement evaluation cohort study. Content experts generated DIF hypotheses to guide the interpretation. The graded response item response theory (IRT) model was used. Primary DIF tests were Wald statistics; sensitivity analyses were conducted using the IRT ordinal logistic regression procedure. Magnitude and impact were evaluated by examining group differences in expected item and scale score functions.
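
      A Wald test for DIF compares an item's parameter estimates between a reference and a focal group, scaled by their sampling uncertainty. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the two groups were estimated independently (so their covariance matrices can be summed), and every name and number in it is hypothetical.

```python
import numpy as np

def wald_dif(ref_est, ref_cov, foc_est, foc_cov):
    """Wald chi-square for the difference in one item's parameters.

    ref_est, foc_est: parameter estimates (e.g., slope and thresholds).
    ref_cov, foc_cov: covariance matrices of those estimates.
    Assumes independent group estimates, so the covariance of the
    difference is the sum of the two covariance matrices.
    """
    d = np.asarray(ref_est, dtype=float) - np.asarray(foc_est, dtype=float)
    pooled = np.asarray(ref_cov, dtype=float) + np.asarray(foc_cov, dtype=float)
    stat = float(d @ np.linalg.solve(pooled, d))  # d' * pooled^-1 * d
    return stat, d.size  # statistic and its degrees of freedom

# Hypothetical item: equal slopes, thresholds 0.4 units higher in the focal group.
ref_est = [1.8, -0.5, 0.6]
foc_est = [1.8, -0.1, 1.0]
cov = np.diag([0.01, 0.02, 0.02])  # made-up sampling variances

stat, df = wald_dif(ref_est, cov, foc_est, cov)
print(f"Wald chi-square = {stat:.2f} on {df} df")
```

      The statistic is referred to a chi-square distribution with degrees of freedom equal to the number of parameters compared. Statistical significance alone does not indicate how large the DIF is, which is why magnitude and impact are examined separately.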

      Setting

      Hospital-based rehabilitation.

      Participants

      Hospitalized patients (N=2216).

      Interventions

      Not applicable.

      Main Outcome Measures

      A total of 111 self-reported mobility items.

      Results

      Two linking items among those used to set the metric across forms evidenced DIF for sex and age: “difficulty climbing stairs step-over-step without a handrail (alternating feet)” and “difficulty climbing 3-5 steps without a handrail.” Conditional on the mobility state, the items were more difficult for women and older people (aged ≥65y). An additional 18 items were identified with DIF. Items with both high DIF magnitude and hypotheses related to age were difficulty “crossing road at a 4-lane traffic light with curbs,” “jumping/landing on one leg,” “strenuous activities,” and “descending 3-5 steps with no handrail.” Although DIF of higher magnitude was observed for several items, the scale-level effect was relatively small and the exposure rate for the most problematic items was low (0.35, 0.27, and 0.20).
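
      The contrast between item-level magnitude and scale-level impact can be illustrated with expected score functions under the graded response model. This is a minimal sketch under stated assumptions: the 10-item form, the parameter values, and the 0.5-unit threshold shift are all hypothetical and are not estimates from the study.

```python
import math

def expected_item_score(theta, a, thresholds):
    """Expected item score under the graded response model.

    With categories scored 0..K-1, the expected score equals the sum of
    the cumulative probabilities P(X >= k | theta) for k = 1..K-1.
    """
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds)

def expected_scale_score(theta, items):
    """Sum of expected item scores over (slope, thresholds) pairs."""
    return sum(expected_item_score(theta, a, bs) for a, bs in items)

# Hypothetical 10-item form, 4 response categories each (scores 0-3).
reference = [(1.5, [-1.0, 0.0, 1.0]) for _ in range(10)]
focal = list(reference)
# One item is harder for the focal group: thresholds shifted up by 0.5.
focal[0] = (1.5, [-0.5, 0.5, 1.5])

grid = [x / 10.0 for x in range(-30, 31)]  # trait values from -3 to 3
item_gap = max(
    abs(expected_item_score(t, *reference[0]) - expected_item_score(t, *focal[0]))
    for t in grid
)
scale_gap = max(
    abs(expected_scale_score(t, reference) - expected_scale_score(t, focal))
    for t in grid
)

# Express each gap relative to the maximum attainable score.
item_gap_rel = item_gap / 3
scale_gap_rel = scale_gap / 30
print(f"item-level gap:  {item_gap_rel:.1%} of the item score range")
print(f"scale-level gap: {scale_gap_rel:.1%} of the scale score range")
```

      In this toy setup, a gap that is sizable for a single item is diluted once it is summed with items that function equivalently across groups. In an adaptive test the effect also depends on how often the affected item is administered.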

      Conclusions

      This was the first study to evaluate the measurement equivalence of the hospital-based rehabilitation mobility item bank. Although 20 items evidenced high-magnitude DIF, 5 of which were related to stairs, the scale-level effect was minimal; however, it is recommended that such items be avoided in the development of short-form measures. No items with salient DIF were removed from calibrations, supporting the use of the item bank across groups differing in education, age, and sex. The bank may thus be useful in assisting clinical assessment and decision-making regarding risk for specific mobility restrictions at discharge, as well as in identifying mobility-related functions to target with postdischarge interventions. Additionally, with the goal of avoiding long and burdensome assessments for patients and clinical staff, these results could be informative for those using the item bank to construct short forms.


      List of abbreviations:

      CAT (computerized adaptive testing), DIF (differential item functioning), FKGL (Flesch-Kincaid grade level), FREI (Flesch-Kincaid Reading Ease Index), IRT (item response theory), LD (local dependency), NCDIF (noncompensatory DIF), OLR (ordinal logistic regression)

