Review article | Volume 98, Issue 1, P151-164.e6, January 2017

Inter- and Intrarater Reliability of Clinical Tests Associated With Functional Lumbar Segmental Instability and Motor Control Impairment in Patients With Low Back Pain: A Systematic Review

Published: August 25, 2016



      Objective

      To provide a comprehensive overview of clinical tests associated with functional lumbar segmental instability and motor control impairment in patients with low back pain (LBP), and to investigate their intrarater reliability, interrater reliability, or both.

      Data Sources

      A systematic computerized search was conducted on December 1, 2015, in 4 databases (the starting search year is given in parentheses; articles from that year through December 1, 2015, were included): PubMed (1972–), Web of Science (1955–), Embase (1947–), and MEDLINE (1946–).

      Study Selection

      Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were followed during design, search, and reporting stages of this review. The included population comprised patients with primary LBP.

      Data Extraction

      Data were extracted as follows: (1) description and scoring of the clinical tests; (2) population characteristics; (3) inclusion and exclusion criteria; (4) description of the procedures used; (5) results for both intra- and interrater reliability; and (6) the statistical method used. The risk of bias of the included articles was assessed with the COnsensus-based Standards for the selection of health Measurement INstruments checklist.

      Data Synthesis

      A total of 16 records were eligible, and 30 clinical tests were identified. All included studies investigated interrater reliability, and 3 studies investigated intrarater reliability. The identified interrater reliability scores ranged from poor to very good (κ=−.09 to .89; intraclass correlation coefficient, .72–.96), and the intrarater reliability scores ranged from fair to very good (κ=.51–.86).
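
      For readers less familiar with the κ statistic reported above, the following is a minimal sketch of how Cohen's κ can be computed for 2 raters who score the same patients on a dichotomous test. It is an illustration only, not the method of any included study: the function name `cohen_kappa` and the ratings are hypothetical, and the included studies may have used other variants (eg, weighted or prevalence-adjusted κ).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same subjects on a nominal scale."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of subjects.")
    n = len(rater_a)
    # Observed agreement: proportion of subjects given the same score by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings (1 = positive test, 0 = negative test) for 10 patients.
rater_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 0.6 for this invented data set
```

      In this invented example the raters agree on 8 of 10 patients (observed agreement .80), chance agreement is .50, and κ is therefore (.80−.50)/(1−.50)=.60. A negative value, such as the lower bound of −.09 reported above, indicates agreement below the level expected by chance.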


      Conclusions

      Three clinical tests (aberrant movement pattern, prone instability test, Beighton Scale) were identified as having adequate interrater reliability. No conclusions could be drawn for intrarater reliability. Further research should use better study designs, establish consensus on the uniform performance and interpretation of clinical tests, and investigate their validity.


      List of abbreviations:

      COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments), GRADE (Grading of Recommendations Assessment, Development and Evaluation), ICC (intraclass correlation coefficient), LBP (low back pain), LSI (lumbar segmental instability), MCI (motor control impairment), ROB (risk of bias)

