Original research | Volume 101, Issue 2, P234-241, February 2020



Video-Based Pairwise Comparison: Enabling the Development of Automated Rating of Motor Dysfunction in Multiple Sclerosis

Published: August 29, 2019 DOI:



      Objective

      To examine the feasibility, reliability, granularity, and convergent validity of a video-based pairwise comparison technique that uses algorithmic support to enable automated rating of motor dysfunction in patients with multiple sclerosis (MS).


      Design

      Feasibility and larger cross-sectional cohort study.


      Setting

      The outpatient clinics of 2 specialist university medical centers.


      Participants

      Selected sample from a cohort of patients with MS participating in the Assess MS study (N=42). Videos were randomly drawn from each stratum of ataxia severity as defined in the Expanded Disability Status Scale (EDSS). In Basel: 19 videos of 17 patients (mean age, 43.4±11.6y; 10 women). In Amsterdam: 50 videos of 25 patients (mean age, 50.0±10.0y; 15 women).


      Interventions

      Not applicable.

      Main Outcome Measures

      In each center, neurologists (n=13; n=10) viewed pairs of videos of patients performing standardized movements (eg, finger-to-nose test) to assess relative performance. A comparative assessment score was calculated for each video using the TrueSkill algorithm and analyzed for intrarater (test-retest; ratio of agreement) and interrater reliability (intraclass correlation coefficient [ICC] for absolute agreement) and convergent validity (Spearman ρ). Granularity was estimated from the average difference in comparative assessment scores at which 80% of neurologists considered performance to be different.
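To make the scoring step concrete, the following is a minimal sketch of how a comparative assessment score could be built up from pairwise judgments using the two-item, no-tie TrueSkill update. It is not the study's implementation. The parameter values are the published TrueSkill defaults and are an assumption here, as are the function names, the video labels, the example judgments, and the convention that a higher score means better performance.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Published TrueSkill defaults (assumed; the settings used in the study are not stated).
MU0 = 25.0
SIGMA0 = MU0 / 3
BETA = SIGMA0 / 2


def update_pair(better, worse, beta=BETA):
    """Update (mu, sigma) of two videos after one 'better than' judgment (no ties)."""
    mu_b, sigma_b = better
    mu_w, sigma_w = worse
    c = math.sqrt(2 * beta**2 + sigma_b**2 + sigma_w**2)
    t = (mu_b - mu_w) / c
    v = _STD_NORMAL.pdf(t) / _STD_NORMAL.cdf(t)  # shift applied to the means
    w = v * (v + t)                              # shrinkage applied to the variances
    new_better = (mu_b + sigma_b**2 / c * v,
                  sigma_b * math.sqrt(1 - sigma_b**2 / c**2 * w))
    new_worse = (mu_w - sigma_w**2 / c * v,
                 sigma_w * math.sqrt(1 - sigma_w**2 / c**2 * w))
    return new_better, new_worse


def score_videos(judgments, video_ids):
    """Process (better_video, worse_video) judgments in order and return the ratings."""
    ratings = {vid: (MU0, SIGMA0) for vid in video_ids}
    for better, worse in judgments:
        ratings[better], ratings[worse] = update_pair(ratings[better], ratings[worse])
    return ratings


# Hypothetical judgments: A rated better than B, B better than C, A better than C.
ratings = score_videos([("A", "B"), ("B", "C"), ("A", "C")], ["A", "B", "C"])
for vid, (mu, sigma) in sorted(ratings.items()):
    print(f"{vid}: mu={mu:.2f} sigma={sigma:.2f}")
```

Each judgment moves the two means apart and narrows both uncertainties, so videos that are compared often end up with scores on a continuous scale rather than in a few ordinal categories.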


      Results

      Intrarater reliability was excellent (median ratio of agreement ≥0.87). The comparative assessment scores calculated from individual neurologists demonstrated good to excellent ICCs for interrater reliability (0.89; 0.71). The comparative assessment scores correlated highly to very highly with their Neurostatus-EDSS equivalent (ρ=0.78, P<.001; ρ=0.91, P<.05), suggesting a more fine-grained rating.
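Two of the statistics reported here can be illustrated with a short sketch: the ratio of agreement between a rater's first and repeated judgments of the same video pairs, and Spearman ρ between comparative assessment scores and an ordinal clinical score. The function names and all data below are invented for illustration and are not taken from the study.

```python
import math


def ratio_of_agreement(first_pass, second_pass):
    """Fraction of repeated pairs on which a rater gave the same judgment twice."""
    same = sum(a == b for a, b in zip(first_pass, second_pass))
    return same / len(first_pass)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho computed as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Invented example: which video of each pair was judged worse, on two occasions.
agreement = ratio_of_agreement(["A", "B", "A", "A"], ["A", "B", "B", "A"])

# Invented example: comparative scores against an ordinal severity grade with a tie.
rho = spearman_rho([10, 20, 30, 40], [0, 1, 1, 3])
print(agreement, round(rho, 3))
```

The tie in the ordinal grades is what keeps ρ below 1 in this example even though the two orderings never contradict each other, which is the pattern expected when a continuous score refines a coarser scale.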


      Conclusions

      Video-based pairwise comparison of motor dysfunction allows for reliable and fine-grained capturing of clinical judgment about neurologic performance, which can contribute to the development of a consistent quantified metric of motor ability in MS.


      List of abbreviations:

      ADL (activities of daily living), AMSQ (Arm Function in Multiple Sclerosis Questionnaire), EDSS (Expanded Disability Status Scale), FNT (finger-to-nose test), ICC (intraclass correlation coefficient), IQR (interquartile range), ML (machine learning), MS (multiple sclerosis), 9-HPT (9-Hole Peg Test), UEF (upper extremity function)

