Multidimensional Computerized Adaptive Testing: A Potential Path Toward the Efficient and Precise Assessment of Applied Cognition, Daily Activity, and Mobility for Hospitalized Patients

Published: January 24, 2022



      To develop and evaluate an efficient and precise variable-length functional assessment of applied cognition, daily activity, and mobility to inform mobility preservation and rehabilitation service delivery among hospitalized patients.


      A multidimensional item bank tapping into these dimensions was developed, with all items calibrated using a multidimensional graded response model. Items were adaptively selected from the item bank to maximize test information, and the test ended when a joint stopping rule was satisfied. A simulation study based on the completed instrument, the Functional Assessment in Acute Care Multidimensional Computerized Adaptive Test (FAMCAT), compared its measurement precision and efficiency with those of conventional unidimensional computerized adaptive testing. Precision was measured by the bias and root mean squared error between the estimated and true (ie, simulated) θ values, whereas efficiency was measured by average test length. Data were collected by an interviewer reading questions from a tablet computer and entering patients' responses.
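
      The precision metrics and the joint stopping rule described above can be illustrated with a minimal sketch. It is not the FAMCAT implementation: the function names, the standard error cutoff (0.3), and the change of θ cutoff (0.01) are hypothetical values chosen for illustration, and only the 60-item maximum test length comes from the abstract. The order in which the rules are checked is also an assumption.

```python
import math

def bias_and_rmse(theta_hat, theta_true):
    """Per-dimension bias and RMSE between estimated and simulated thetas.

    Both arguments are lists of equal-length tuples, one tuple per simulee.
    """
    n = len(theta_true)
    dims = len(theta_true[0])
    bias, rmse = [], []
    for d in range(dims):
        errors = [theta_hat[i][d] - theta_true[i][d] for i in range(n)]
        bias.append(sum(errors) / n)                            # mean signed error
        rmse.append(math.sqrt(sum(e * e for e in errors) / n))  # root mean squared error
    return bias, rmse

def stop_reason(se, theta_prev, theta_curr, n_items,
                se_cut=0.3, ct_cut=0.01, max_items=60):
    """Return which rule ends the test, or None to keep administering items.

    se_cut and ct_cut are hypothetical thresholds (assumptions);
    max_items=60 matches the maximum test length reported in the abstract.
    """
    # SE-rule: every dimension is measured precisely enough.
    if all(s <= se_cut for s in se):
        return "SE-rule"
    # CT-rule: the theta estimate has stopped moving between successive items.
    if theta_prev is not None:
        largest_change = max(abs(c - p) for c, p in zip(theta_curr, theta_prev))
        if largest_change < ct_cut:
            return "CT-rule"
    # Fallback: maximum test length reached.
    if n_items >= max_items:
        return "max-length"
    return None
```

      In a simulation, `stop_reason` would be called after each administered item with the current standard errors and θ estimates for the three dimensions, and `bias_and_rmse` would be computed once all simulated tests have ended.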


      A large Midwestern hospital.


      A total of 4143 patients hospitalized with medical diagnosis and/or surgical complications, with 2060 in the calibration sample and 2083 in the validation cohort.


      Not applicable.


      Among the 2083 patients in the validation sample, FAMCAT administration required an average of 6 (SD=3.11) minutes. Ninety-six percent had their tests terminated by the standard error rule after responding to an average of 22.05 (SD=7.98) items, whereas 15 patients had their tests terminated by the change of θ rule, with an average test length of 45.27 (SD=11.49) items. The remaining 76 patients responded until reaching the maximum test length of 60 items.
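
      As a consistency check on the figures above, the short calculation below derives the number of patients stopped by the standard error rule and the implied overall mean test length. Both derived quantities are not stated in the abstract; they assume the three termination groups partition the 2083 validation patients exactly.

```python
# Counts and means reported in the abstract.
n_validation = 2083
n_ct_rule, mean_len_ct = 15, 45.27
n_max_length, mean_len_max = 76, 60
mean_len_se = 22.05

# Derived (assumption: the three groups partition the validation sample).
n_se_rule = n_validation - n_ct_rule - n_max_length
pct_se_rule = 100 * n_se_rule / n_validation

# Implied overall mean test length, weighting each group by its size.
overall_mean_len = (
    n_se_rule * mean_len_se
    + n_ct_rule * mean_len_ct
    + n_max_length * mean_len_max
) / n_validation

print(n_se_rule, round(pct_se_rule, 1), round(overall_mean_len, 1))
```

      The derived share of about 95.6% rounds to the reported ninety-six percent, and the implied overall average is roughly 23.6 items per patient.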


      The FAMCAT has the potential to satisfy the need for structured, frequent, and precise assessment of functional domains among hospitalized patients with medical diagnosis and/or surgical complications. The results are promising and may be informative for others who wish to develop similar instruments when concurrent assessment of correlated domains is required.

      List of abbreviations:

      CAT (computerized adaptive testing), CT-rule (change of θ rule), FAMCAT (Functional Assessment in Acute Care Multidimensional Computerized Adaptive Test), IRT (item response theory), LD (local dependence), MAP (maximum a posteriori), MCAT (multidimensional computerized adaptive testing), MGRM (multidimensional graded response model), MIRT (multidimensional item response theory), PRO (patient-reported outcome), PROM (patient-reported outcome measure), PROMIS (Patient-Reported Outcome Measurement Information System), RMSE (root mean squared error), RMSEA (root mean square error of approximation), RT (response time), SE-rule (standard error rule), UCAT (unidimensional computerized adaptive testing)

