Volume 89, Issue 1, Pages 69-74, January 2008
Psychometric Properties of the Neck Disability Index and Numeric Pain Rating Scale in Patients With Mechanical Neck Pain
Abstract
Cleland JA, Childs JD, Whitman JM. Psychometric properties of the Neck Disability Index and numeric pain rating scale in patients with mechanical neck pain.
Objective
To examine the psychometric properties including test-retest reliability, construct validity, and minimum levels of detectable and clinically important change for the Neck Disability Index (NDI) and the numeric rating scale (NRS) for pain in a cohort of patients with neck pain.
Design
Single-group repeated-measures design.
Setting
Outpatient physical therapy (PT) clinics.
Participants
Patients (N=137) presenting to PT with a primary report of neck pain.
Interventions
Not applicable.
Main Outcome Measures
All patients completed the NDI and the NRS at the baseline examination and at a follow-up. At the time of the follow-up, all patients also completed the global rating of change, which was used to dichotomize patients as improved or stable. Baseline and follow-up scores were used to determine the test-retest reliability, construct validity, and minimal levels of detectable and clinically important change for both the NDI and NRS.
Results
Test-retest reliability was calculated using an intraclass correlation coefficient (ICC) (NDI ICC=.50; 95% confidence interval [CI], .25–.67; NRS ICC=.76; 95% CI, .51–.87). The area under the curve for distinguishing between stable and improved patients was .83 (95% CI, .75–.90) for the NDI score and .85 (95% CI, .78–.93) for the NRS score. Thresholds for the minimum clinically important difference (MCID) were 19 percentage points for the NDI and 1.3 points for the NRS.
Conclusions
Both the NDI and NRS exhibit fair to moderate test-retest reliability in patients with mechanical neck pain. Both instruments also showed adequate responsiveness in this patient population. However, the minimum detectable change required to be certain that the change in scores has surpassed a level that could be attributed to measurement error for the NDI was twice that which has previously been reported. Therefore, ongoing analyses of the properties of the NDI in a patient population with neck pain are warranted.
Key Words: Neck, Neck pain, Rehabilitation
HEALTH OUTCOME MEASURES are commonly used in both the clinical and research environment to determine if treatment has impacted the patient’s health status.1 Before self-report measures are used to guide clinical decision-making regarding individual patients, the psychometric properties (minimum detectable change [MDC], minimal clinically important difference [MCID]) of the particular instrument must be identified so that the clinician can judge, with a degree of confidence, whether a particular patient has experienced a clinically important change. MDC is the amount of change that must be observed before the change can be considered to exceed the measurement error,2 whereas MCID is the smallest difference that patients perceive as beneficial.3
The Neck Disability Index (NDI) is a commonly used health outcome measure to capture perceived disability in patients with neck pain.1 Riddle and Stratford4 identified a significant correlation between the NDI and both the physical and mental health components of the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36). The authors also confirmed that the NDI possesses adequate sensitivity to detect the magnitude of change associated with whether patients reached their functional goals, their work status, and whether they were currently in litigation.4 Jette and Jette5 further substantiated the sensitivity to change by calculating the effect sizes for change scores of both the NDI and SF-36. However, these data do not provide useful information to assist clinicians in determining the minimal amount of change necessary to represent a clinically important difference from the patient’s perspective.6 It has been suggested that indices of responsiveness that indicate a cutoff point for identifying the MCID are of greatest value to clinicians when they are determining if a meaningful change has occurred.7
Two studies8, 9 with small sample sizes have identified the MDC for the NDI. Westaway et al8 identified the MDC as 5 points in a group of 31 patients with neck pain. Additionally, Stratford et al9 identified the MDC to be 5 points in a group of 48 patients with neck pain. Although these studies reported that a change of 5 points (10%) must be observed to be certain that the change in scores is greater than measurement error, no values for the MCID have been reported in the literature for patients with neck pain.10
No consensus exists on the ideal reference standard for measuring functional change.4 Westaway et al8 and Stratford et al9 used a clinician’s prognostic rating as the reference standard to determine the responsiveness of the NDI. In these studies, experienced clinicians were asked a priori to identify those patients likely to experience little change in status or full recovery. This assessment was based on the patient’s initial presentation, degree of identified impairments, and duration of symptoms.9 However, recent evidence suggests considerable variation exists among expert clinicians’ estimates of the probability of disease,11 which raises some concerns regarding the validity of a prognostic rating scale as a reference standard.
In addition to the NDI, the numeric rating scale (NRS) for pain is also a commonly used outcome measure for patients with neck pain.12, 13, 14 The responsiveness of the NRS in a broad population of patients with various musculoskeletal conditions has been investigated and the MCID has been identified to be 2 points.15 Additionally, in a patient population with low back pain, the scale also showed an MCID of 2 points.16 However, the responsiveness of the NRS in a patient population with neck pain has yet to be determined. Based on the limited scope of previous work, the purpose of this study was to identify the psychometric properties of the NDI and the NRS in a large cohort of patients with neck pain.
Methods
We analyzed data from 2 clinical trials17, 18 that enrolled consecutive patients presenting with a primary report of neck pain to 1 of 5 outpatient orthopedic physical therapy (PT) clinics (Rehabilitation Services of Concord Hospital, Concord, NH; Newton-Wellesley Hospital, Boston, MA; Centennial Physical Therapy, Colorado Springs, CO; Groves Physical Therapy, St Paul, MN; Sharp HealthCare, San Diego, CA) between July 2004 and July 2006. Both studies used identical eligibility criteria. Inclusion criteria were patient age between 18 and 60 years, an NDI score greater than 10%, and a primary complaint of neck pain with or without referral of symptoms to the upper extremity or extremities. Exclusion criteria were any signs or symptoms consistent with a nonmusculoskeletal etiology for the patient’s symptoms, a history of a whiplash injury within the past 6 weeks, evidence of central nervous system involvement, 2 or more signs consistent with nerve root compression (myotomal weakness, sensory deficits in a dermatomal pattern, decreased or absent muscle stretch reflexes), prior surgery to the cervical or thoracic spine, or pending legal action regarding their neck pain. All subjects who agreed to participate in either study signed an informed consent form approved by the institutional review board at the respective clinical site.
Twelve physical therapists participated in the examination and treatment of all patients in this study. All therapists underwent a standardized training regimen, which included studying a manual of standard procedures with the operational definitions of each examination and treatment procedure used in this study. All participating therapists underwent training provided by a current fellow in the Regis University Manual Therapy Fellowship Program, Denver, CO. During this training session, all participating therapists were required to demonstrate the examination and treatment techniques to ensure that all study procedures were performed in a standardized fashion. Participating therapists had a mean ± standard deviation (SD) of 9.7±6.8 years (range, 1–19y) of clinical experience.
All patients provided demographic information and completed a number of self-report measures, followed by a standardized history and physical examination at baseline. Self-report measures included a body diagram,19 NRS,20 the NDI,21 and the Fear-Avoidance Beliefs Questionnaire (FABQ).22
Neck Disability Index
The NDI contains 10 items—7 related to activities of daily living, 2 related to pain, and 1 related to concentration.23 Each item is scored from 0 to 5, and the total score is expressed as a percentage (total possible score, 100%), with higher scores corresponding to greater disability.
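The scoring rule just described can be sketched in a few lines. This is a minimal illustration that assumes all 10 items were answered; the function name `ndi_percent` and the example responses are hypothetical, and the handling of missing items is not addressed in the text.

```python
def ndi_percent(item_scores):
    """Return the NDI total as a percentage (0 to 100).

    Assumes all 10 items were answered, each scored from 0 to 5.
    """
    if len(item_scores) != 10:
        raise ValueError("the NDI has 10 items")
    if any(not 0 <= score <= 5 for score in item_scores):
        raise ValueError("each item is scored from 0 to 5")
    raw_total = sum(item_scores)      # raw total, 0 to 50 points
    return raw_total * 100 / 50       # expressed as a percentage


# Made-up responses for one patient (raw total of 16 out of 50).
example_items = [2, 1, 3, 2, 1, 2, 1, 2, 1, 1]
print(ndi_percent(example_items))
```

Because the raw total ranges from 0 to 50, each raw point corresponds to 2 percentage points, which is why a 5-point change is reported elsewhere as 10%.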
NRS for Pain
We used the NRS to capture the patient’s level of pain. Patients were asked to indicate the intensity of current, best, and worst levels of pain over the past 24 hours using an 11-point scale, ranging from 0 (no pain) to 10 (worst pain imaginable).24 The average of the 3 ratings was used to represent the patient’s level of pain over the previous 24 hours. This procedure has been shown to have adequate reliability, validity, and responsiveness in patients with low back pain,16, 25 but has not been specifically examined in patients with neck pain.
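As a small illustration of the averaging procedure described above, the following sketch computes the composite pain score from the 3 ratings. The function name `nrs_average` is hypothetical, and the range check simply reflects the 0 to 10 scale.

```python
def nrs_average(current, best, worst):
    """Average the current, best, and worst pain ratings (each 0 to 10)."""
    for rating in (current, best, worst):
        if not 0 <= rating <= 10:
            raise ValueError("NRS ratings range from 0 to 10")
    return (current + best + worst) / 3


# Made-up ratings: current 4, best 2, worst 6 over the past 24 hours.
print(nrs_average(4, 2, 6))
```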
Standardized History, Physical Examination, and Interventions
The standardized history included questions regarding the mode of onset, nature and location of symptoms, aggravating and relieving factors, and prior history of neck pain. The physical examination included a neurologic screen,26 postural assessment,27 cervical range of motion measurements and symptom response,28 assessment of the length26 and strength27 of the muscles of the upper quarter, and endurance of the deep neck flexor muscles.29 The amount of mobility and symptom response was recorded for spring testing30 of the cervical and thoracic spine (C2-T9). The physical examination culminated with a number of special tests typically performed in the examination of patients with neck pain, including the Spurling test A,31 cervical distraction test,23 and the upper-limb neurodynamic test.32 All patients who completed the baseline evaluation underwent 1 PT treatment session consisting of manual therapy techniques directed at the thoracic spine. They returned within 2 to 4 days for a re-evaluation and again completed the self-report measures as well as the global rating of change scale (GRCS). After 1 session of PT, we expected to see a dramatic change in patient status for a subgroup of patients with mechanical neck pain, but no change in another subgroup that did not respond positively to the intervention provided.17
Global Rating of Change
At the follow-up evaluation, each patient completed a GRCS as described by Jaeschke et al.3 Patients were asked to rate their overall perception of improvement since beginning treatment on a scale ranging from −7 (a very great deal worse) to zero (about the same) to 7 (a very great deal better). It has been recommended3 that scores on the GRCS between ±1 and ±3 represent small changes, scores between ±4 and ±5 represent moderate changes, and scores of ±6 or ±7 represent large changes. A GRCS has also been used to identify responsiveness and the MCID for health outcome measures.33, 34, 35 Use of the GRCS to calculate responsiveness has been criticized because the patients must recall their initial status weeks or months after the initial examination.36 Schmitt and Di Fabio37 recently reported that a retrospective GRCS does not accurately reflect change over time. However, the follow-up time frame in the study by Schmitt and Di Fabio37 was 3 months, and shorter follow-up periods (<1wk) could potentially reduce the biases associated with patient recall. Hence we elected to collect the measures again at the time of the patient’s first follow-up, which was scheduled within 2 to 4 days of the initial examination. Intuitively, it also makes sense that a patient’s perception of improvement gives a more accurate assessment of whether a true change has occurred than a prognostic rating does.25
Data Analysis
We dichotomized patients into stable and improved groups based on GRCS scores: those scoring between −3 and +3 were considered stable (minimal to no change) and those scoring greater than +3 were considered to have exhibited clinically important improvement; patients scoring less than −3 were considered to have experienced a worsening in status. Baseline variables were compared between groups using independent t tests for continuous data, and chi-square tests of independence for categoric data. Test-retest reliability of the NDI and NRS was investigated in the group of stable patients using an intraclass correlation coefficient (ICC2,1) with the 95% confidence interval (CI), calculated according to procedures described by Shrout and Fleiss.38 Assessment of reliability was performed using criteria described by Shrout39 with values less than .10 indicating no agreement, values between .11 and .40 indicating slight agreement, values between .41 and .60 indicating fair agreement, values between .61 and .80 indicating moderate agreement, and values greater than .81 indicating substantial agreement. Additionally, a Pearson product-moment correlation coefficient (r) was calculated for pre- and post-test measurements on the NDI to allow direct comparison with the results of other studies.21
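The grouping rule and the reliability coefficient described above can be sketched as follows. This is an illustrative implementation, not the authors' analysis code: the function names are hypothetical, the ICC follows the standard 2-way random-effects, single-measure formula of Shrout and Fleiss, the CI is omitted, and the handling of values that fall exactly on a boundary of the agreement categories is an assumption.

```python
def grcs_group(score):
    """Classify a patient from the GRCS score (-7 to +7)."""
    if score > 3:
        return "improved"
    if score < -3:
        return "worsened"
    return "stable"


def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per patient, one column per occasion."""
    n = len(scores)                 # number of patients
    k = len(scores[0])              # number of occasions (2 here)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from a 2-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                  # between-patients mean square
    jms = ss_cols / (k - 1)                  # between-occasions mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


def agreement_label(icc):
    """Map an ICC to the agreement categories listed in the text.

    Boundary handling (for example, exactly .10) is an assumption.
    """
    if icc <= 0.10:
        return "none"
    if icc <= 0.40:
        return "slight"
    if icc <= 0.60:
        return "fair"
    if icc <= 0.80:
        return "moderate"
    return "substantial"


# Made-up baseline and follow-up scores for 3 stable patients.
# Every patient shifts by the same amount, which lowers ICC(2,1)
# because it measures absolute agreement, not just consistency.
shifted = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(shifted), agreement_label(icc_2_1(shifted)))
```

Because ICC(2,1) penalizes a systematic shift between occasions, it can be lower than a Pearson r computed on the same data.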
Construct validity of the NDI and NRS was examined by comparing the change in scores for the stable and improved groups using separate 2-way analyses of variance for the repeated measures at baseline and re-evaluation. We hypothesized that stable patients would have NDI and NRS scores that did not change, whereas improved patients would show a significant change in disability. This would be represented by a significant group by time interaction.
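With only 2 measurement occasions, the group by time interaction is equivalent to comparing change scores between the 2 groups: the interaction F statistic equals the square of the pooled-variance t statistic on the change scores. The sketch below illustrates that comparison. It is not the authors' analysis; the function name is hypothetical, the data are made up, and change is taken here as baseline minus follow-up so that positive values indicate improvement (an assumed sign convention).

```python
from math import sqrt
from statistics import mean, variance


def change_score_comparison(stable_changes, improved_changes):
    """Return (difference in mean change, pooled-variance t statistic)."""
    n1, n2 = len(stable_changes), len(improved_changes)
    diff = mean(improved_changes) - mean(stable_changes)
    # Pooled variance across the 2 groups (sample variances, n - 1 denominators).
    pooled = ((n1 - 1) * variance(stable_changes)
              + (n2 - 1) * variance(improved_changes)) / (n1 + n2 - 2)
    se = sqrt(pooled * (1 / n1 + 1 / n2))
    return diff, diff / se


# Made-up change scores for illustration only.
diff, t = change_score_comparison([0, 1, 2], [4, 5, 6])
print(diff, t, t ** 2)   # t squared corresponds to the interaction F
```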
We analyzed responsiveness, or the ability of a test to recognize change,40 of the NDI and NRS with 2 methods. The first method used receiver operating characteristic (ROC) curves constructed by plotting sensitivity values (true positive rate) on the y axis and 1 − specificity values (false positive rate) on the x axis for each level of change score for distinguishing improved from stable patients. Separate ROC curves were constructed for the NDI and NRS. The area under the curve (AUC) and the 95% CI were obtained as a method for describing the ability of each measure to distinguish improved patients from stable patients. An AUC of .50 indicates the measure has no diagnostic accuracy beyond chance, whereas a value of 1 would indicate perfect accuracy.41 Responsiveness was also analyzed by correlating the change scores of the NDI and NRS to the GRCS scores in all patients. Change scores were calculated by subtracting each patient’s baseline score from the score obtained at the re-evaluation.
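The ROC construction described above can be sketched as follows. This is an illustrative version under stated assumptions: the function names are hypothetical, change scores are expressed so that larger values mean more improvement, a patient counts as "test positive" when the change score is at or above the cutoff, and the AUC is computed through its equivalence with the Mann-Whitney statistic (the proportion of improved-stable pairs in which the improved patient changed more, with ties counted as one half). The CI is omitted.

```python
def roc_points(stable_changes, improved_changes):
    """Return (cutoff, sensitivity, specificity) for every observed change score."""
    cutoffs = sorted(set(stable_changes) | set(improved_changes))
    points = []
    for cutoff in cutoffs:
        # Sensitivity: improved patients correctly at or above the cutoff.
        sens = sum(x >= cutoff for x in improved_changes) / len(improved_changes)
        # Specificity: stable patients correctly below the cutoff.
        spec = sum(x < cutoff for x in stable_changes) / len(stable_changes)
        points.append((cutoff, sens, spec))
    return points


def auc(stable_changes, improved_changes):
    """Area under the ROC curve via pairwise comparisons."""
    wins = sum((i > s) + 0.5 * (i == s)
               for i in improved_changes
               for s in stable_changes)
    return wins / (len(improved_changes) * len(stable_changes))


# Made-up change scores for illustration only.
stable = [0, 2, 4, 6, 10]
improved = [6, 12, 14, 20]
for cutoff, sens, spec in roc_points(stable, improved):
    print(cutoff, round(sens, 2), round(spec, 2))
print(auc(stable, improved))
```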
Minimum detectable change, or the amount of change that must be observed before the change can be considered to exceed the measurement error, was calculated by determining the standard error (SE) of measurement for the NDI and NRS for the stable group.2 The SE of measurement was calculated using the formula SD × √(1 − r), where r is the test-retest reliability coefficient and the SD is the square root of the total variance. The SE of measurement was multiplied by 1.65 to determine the 90% CI.34 This value was multiplied by the square root of 2 to account for the errors associated with repeated measurements.42 Minimal clinically important change, the smallest difference that patients perceive as beneficial,3 was calculated by identifying the point on the ROC curve nearest the upper left-hand corner, which is considered to be the best cutoff score for distinguishing improved and stable patients.34 Sensitivity and specificity values for the selected cutoff score were calculated.
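The 2 calculations in this paragraph can be sketched as follows. The function names and the example numbers are hypothetical; the formulas are the ones stated above (SE of measurement = SD × √(1 − r), then multiplied by 1.65 and by √2), and the cutoff rule picks the ROC point with the smallest distance to the upper left-hand corner, where sensitivity and specificity both equal 1.

```python
from math import sqrt


def se_of_measurement(sd, r):
    """SE of measurement from the SD and the test-retest reliability r."""
    return sd * sqrt(1 - r)


def mdc90(sem):
    """Minimum detectable change at the 90% confidence level."""
    return sem * 1.65 * sqrt(2)


def nearest_corner_cutoff(points):
    """Pick the (cutoff, sensitivity, specificity) tuple closest to the corner.

    In ROC space the corner is x = 1 - specificity = 0 and y = sensitivity = 1,
    so the squared distance is (1 - specificity)^2 + (1 - sensitivity)^2.
    """
    return min(points, key=lambda p: (1 - p[2]) ** 2 + (1 - p[1]) ** 2)


# Made-up inputs for illustration only.
sem = se_of_measurement(sd=12.0, r=0.5)
print(round(sem, 2), round(mdc90(sem), 2))

candidates = [(5, 0.95, 0.40), (10, 0.85, 0.70), (15, 0.60, 0.90)]
print(nearest_corner_cutoff(candidates))
```

Other cutoff rules exist (for example, maximizing sensitivity plus specificity), and they do not always select the same point as the nearest-corner rule.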
Results
Of the 209 consecutive patients with neck pain screened for eligibility, 138 patients satisfied inclusion and exclusion criteria and agreed to participate (mean age, 42.5±11.9y). Of the 71 patients who were not included, 21 had a recent history of whiplash, 13 had signs of nerve root compression, 12 presented with contraindications to the interventions, 8 had prior surgery to the cervical or thoracic spine, 2 had signs of central nervous system involvement, 1 had insufficient English skills to complete the questionnaires, and 14 patients declined to participate. Only 1 patient exhibited a worsening of status (GRCS score, −4) and was not included in further analyses. The mean GRCS score for the remaining 137 patients was 2.1±2.3. Eighty-nine patients were considered to have remained stable (GRCS score range, −3 to 3) and 48 were considered to have improved (GRCS score, ≥4). Baseline characteristics for both groups can be found in table 1. The mean time between the first and second measurements was 2.50±0.95 days. A significant difference existed in the duration of symptoms and initial FABQ physical activity subscale scores between the group that remained stable and the group that improved. The ICC values calculated from the stable patients were .50 (95% CI, .25–.67) for the NDI and .76 (95% CI, .51–.87) for the NRS (table 2). The Pearson r value calculated between pre- and post-test measurements in the group of patients considered stable for the NDI was .56.
Table 1. Baseline Statistics for Stable and Improved Patients
| Characteristics | Stable Patients (n=89) | Improved Patients (n=48) | P |
|---|---|---|---|
| Age (y) | 42.1±12.0 | 43.2±12.0 | .620† |
| Symptom duration (d) | 82.4±74.8 | 46.4±40.3 | .002† |
| Sex, female | 60 (69) | 25 (52) | .160⁎ |
| Symptoms distal to shoulder, yes | 27 (31) | 15 (31) | .330⁎ |
| Initial NDI score | 32.2±11.6 | 35.7±9.8 | .080† |
| Initial NRS score | 4.6±1.9 | 5.1±1.6 | .110† |
| FABQ physical activity subscale | 12.3±4.4 | 10.9±4.4 | .030† |
| FABQ work subscale | 13.5±11.0 | 11.5±8.8 | .280† |
| Currently taking medications, yes | 51 (59) | 25 (52) | .590⁎ |
| Currently on workers compensation, yes | 8 (9) | 4 (8) | .760⁎ |
| Currently seeking litigation, yes | 6 (7) | 4 (8) | .730⁎ |
| Days between evaluations | 2.4±.87 | 2.5±1.1 | .540† |

NOTE. Values are mean ± SD or n (%).
⁎Chi-square test.
†Independent-samples t test.
Table 2. Mean Change Scores and 95% CIs for Stable Patients (n=89) for the NDI and NRS
| Scale Type | Baseline | Re-Evaluation | Change Score | ICC2,1 (95% CI) |
|---|---|---|---|---|
| NDI | 32.2±11.6 | 26.2±12.2 | 6.9±10.4 | .50 (.25–.67) |
| NRS | 4.6±1.9 | 3.9±1.8 | .71±1.1 | .76 (.51–.87) |
Figure 1 shows the mean initial and follow-up scores for the NDI for the stable and improved groups. There was a significant group by time interaction (P<.001) for the pre- and post-test scores, indicating that the change in NDI with time differed between stable and improved patients (mean, 12.9; 95% CI, 9.3–16.5). Figure 2 shows the initial and follow-up scores for the NRS for the stable and improved groups. A significant group by time interaction (P<.001) was also found for the NRS (mean, 2.1; 95% CI, 1.6–2.6), indicating that the change in NRS scores differed between patients determined to be stable or improved based on the GRCS. The AUC was .83 (95% CI, .75–.90) for the NDI and .85 (95% CI, .78–.93) for the NRS. Figures 3 and 4 show the ROC curves for the NDI and NRS, respectively. The Pearson r values between the GRCS and the change scores of the NDI and NRS for the entire group were .58 (P=.01) and .57 (P=.01), respectively (table 3).

Fig 1.
NDI scores for the groups of subjects defined as stable and improved based on global rating of change. The interaction between time and group was significant (P<.001).

Fig 2.
NRS scores for the groups of subjects defined as stable and improved based on global rating of change. The interaction between time and group was significant (P<.001).

Fig 3.
ROC curve for the NDI at the follow-up. The circled value is the point nearest the upper left-hand corner of the graph and represents the MCID for the NDI.

Fig 4.
ROC curve for the NRS score at the follow-up. The circled value is the point nearest the upper left-hand corner of the graph and represents the MCID for the NRS score.
Table 3. Pearson r Values for Change Scores (N=137)
| Scale | NDI | NRS |
|---|---|---|
| GRCS score | .58 | .57 |
The SE of measurement values calculated from the stable patients were 8.4 for the NDI and .91 for the NRS. These values corresponded with MDC values of 19.6 percentage points for the NDI and 2.1 points for the NRS. Thresholds for the MCID were 19 percentage points for the NDI (sensitivity, .83; specificity, .72) and 1.3 points for the NRS (sensitivity, .88; specificity, .71).
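As a check on the arithmetic, the reported MDC values follow from the reported SE of measurement values and the multipliers given in the Data Analysis section (1.65 for the 90% confidence level and the square root of 2 for repeated measurements):

```latex
\begin{align*}
\mathrm{MDC}_{\mathrm{NDI}} &= 8.4 \times 1.65 \times \sqrt{2} \approx 19.6 \text{ percentage points} \\
\mathrm{MDC}_{\mathrm{NRS}} &= 0.91 \times 1.65 \times \sqrt{2} \approx 2.1 \text{ points}
\end{align*}
```

Because the NDI raw total ranges from 0 to 50, 19.6 percentage points corresponds to roughly 10 raw points.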
Discussion
It is essential for clinicians to have an understanding of the psychometric properties of measures, including reliability and responsiveness. Instruments should exhibit acceptable reliability and validity prior to being used to guide clinical decision-making. To determine the reliability and validity of self-report measures, it is useful to compare them with a construct that indicates when a true change has occurred.34 Frequently this construct of true change is a patient global rating of change.43, 44 We examined the test-retest reliability of the NDI and NRS for a subgroup of patients with mechanical neck pain. The results of this study suggest that the NDI exhibits only fair test-retest reliability (ICC=.50), which is considerably lower than the values reported by Vos et al45 for the Dutch version of the NDI. Other studies have identified the NDI to exhibit high test-retest reliability when using correlation coefficients as the method of data analysis.21, 46 However, our study exhibited a much lower correlation coefficient than that previously reported for the test-retest reliability of the NDI. The NRS exhibited moderate test-retest reliability, which is similar to the test-retest reliability identified in a patient population with cervical radiculopathy.35
Construct validity for both outcome measures was examined by comparing the baseline and follow-up scores for both the stable and improved groups. Both the NDI and NRS exhibited significantly greater reductions in disability among patients rating themselves as improved versus stable.4, 5, 9, 43, 44 These data further substantiate the findings of other studies that have investigated the validity of the NDI as compared with other measures.4, 5, 8, 9, 21
We used 2 methods to investigate responsiveness. In the first, we used ROC curves to calculate the AUC for both the NDI and NRS. The AUC for the NDI and NRS were .83 and .85, respectively. The AUC of .83 (95% CI, .75–.90) for the NDI is slightly lower than that reported by Stratford et al,9 who identified an AUC of .9. However, the AUC of .85 (95% CI, .78–.93) for the NRS is greater than that identified by Childs et al16 in patients with low back pain, suggesting that the responsiveness of the NRS may differ depending on the patient population to which it is applied. Second, we investigated responsiveness by calculating correlation coefficients between the GRCS scores and the change scores of the NDI and NRS for the entire group of patients. Results showed a moderate but significant correlation for both the NDI and the NRS, further substantiating other reports supporting the validity of both measures.1, 6, 15
The NRS exhibited an MDC of 2 points and an MCID of 1.3 points, which is consistent with the findings in heterogeneous groups of patients with musculoskeletal conditions15 and patients with low back pain.16 However, Bolton6 investigated the psychometric properties of the 7 questions on the Bournemouth Questionnaire using the NRS. The results showed that for the questions on the Bournemouth the MCID was 3 points.6 Although pain is 1 dimension measured by the Bournemouth, it appears that the NRS possesses different psychometric properties when used to measure other dimensions (disability, affective, and cognitive-behavioral).
Although the NDI showed responsiveness in the ability to detect change, the MDC required to be certain that the patient has exhibited a change that exceeds that of measurement error (19 percentage points) was double that which was previously reported.9 Perhaps the difference results from the reference standard used: previous studies8, 9 used a prognostic rating to identify which patients exhibited a true change, whereas we used a transitional scale, the GRCS. Vos et al45 used a different GRCS and identified a much smaller MDC (1.66 points) for the Dutch version of the NDI. It is also possible that the dichotomous scoring system we used for the GRCS (status quo or improved) may fail to detect small but meaningful improvements in patient status.25 For example, if a GRCS cutoff between −2 and 2 were considered stable, the SE of measurement for the NDI would have been 8.26 with an MDC of 13.6%. Additionally, the MCID would have been 14% (or 7 points). A GRCS cutoff between −3 and 3 is most often used in the literature to examine the responsiveness of patient-reported outcome measures34, 35; however, the use of a GRCS between −2 and 2 has also been reported.16 This suggests that the values reported using a GRCS score between −2 and 2 might be a more accurate value for the MCID. Although arguments can be made for and against both methods,37, 47 the results indicate that further investigation into the MDC and MCID in patients with neck pain is warranted.
Study Limitations
We have examined the responsiveness of the NDI and NRS in a population with mechanical neck pain. Hence, the findings of this study can only be generalized to a similar population and not to patients who present with neck pain related to other underlying pathologies. Although we have proposed that the short-term follow-up could potentially reduce the likelihood of recall bias associated with the GRCS, the short-term follow-up could have also been the reason why the responsiveness of the NDI was lower than that reported in other studies.8, 9 It has been reported that instruments are usually less responsive in short-term follow-ups.16 Future studies should first identify the most appropriate reference standard for examining the responsiveness of self-report outcome measures.37, 48 Additionally, examining the psychometric properties of instruments should be considered an ongoing process and we suggest that future studies should continue to examine the responsiveness of the NDI in patient populations with neck pain.
Conclusions
The results of our study indicate that both the NDI and NRS exhibit fair to moderate test-retest reliability. Both instruments also showed adequate responsiveness in this patient population. However, the MDC required to be certain that the change in scores has surpassed a level that could be attributed to measurement error for the NDI was twice that which has previously been reported in the literature.9, 16 This warrants further investigation. The responsiveness, as well as the MDC and MCID, for the NRS were consistent with studies performed on different patient populations.
Acknowledgment
None of the funding organizations played any role in the design, conduct, or reporting of the study or in the decision to submit the study for publication.
References
- Standard scales for measurement of functional outcome for cervical pain or dysfunction. Spine. 2002;27:515–522
- Looking for important change/differences in studies of responsiveness. J Rheumatol. 2001;28:400–405
- Measurement of health status: ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10:407–415
- Use of generic versus region-specific functional status measures on patients with cervical spine disorders. Phys Ther. 1998;78:951–963
- Physical therapy and health outcomes in patients with spinal impairments. Phys Ther. 1996;76:930–941
- Sensitivity and specificity of outcome measures in patients with neck pain: detecting clinically significant improvement. Spine. 2004;29:2410–2417
- Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol. 2000;53:459–468
- The Patient-Specific Functional Scale: validation of its use in persons with neck dysfunction. J Orthop Sports Phys Ther. 1998;27:331–338
- Using the Neck Disability Index to make decisions concerning individual patients. Physiother Can. 1999;51:107–112
- Manual therapy, physical therapy, or continued care by the general practitioner for patients with neck pain: long-term results from a pragmatic randomized clinical trial. Clin J Pain. 2006;22:370–377
- Clinical experience did not reduce the variance in physicians’ estimates of pretest probability in a cross-sectional survey. J Clin Epidemiol. 2005;58:1211–1216
- A randomized trial of chiropractic manipulation and mobilization for patients with neck pain: clinical outcomes from the UCLA neck-pain study. Am J Public Health. 2002;92:1634–1641
- Manual therapy, physical therapy, or continued care by a general practitioner for patients with neck pain (A randomized, controlled trial). Ann Intern Med. 2002;136:713–722
- Predicting short-term response and non-response to neck strengthening exercise for chronic neck pain. J Whiplash Relat Disord. 2005;4:43–55
- Clinical importance of changes in chronic pain intensity measured on an 11-point numerical pain rating scale. Pain. 2001;94:149–158
- Responsiveness of the numeric pain rating scale in patients with low back pain. Spine. 2005;30:1331–1334
- Development of a clinical prediction rule for guiding treatment of a subgroup of patients with neck pain: use of thoracic spine manipulation, exercise and patient education. Phys Ther. 2007;87:9–23
- Short-term response of thoracic spine thrust versus non-thrust manipulation in patients with mechanical neck pain: a randomized clinical trial. Phys Ther. 2007;87:431–440
- A descriptive study of the centralization phenomenon (A prospective analysis). Spine. 1999;24:676–683
- What is the maximum number of levels needed in pain intensity measurement? Pain. 1994;58:387–392
- The Neck Disability Index: a study of reliability and validity. J Manipulative Physiol Ther. 1991;14:409–415
- A Fear-Avoidance Beliefs Questionnaire (FABQ) and the role of fear-avoidance beliefs in chronic low back pain and disability. Pain. 1993;52:157–168
- Reliability and diagnostic accuracy of the clinical examination and patient self-report measures for cervical radiculopathy. Spine. 2003;28:52–62
- The measurement of clinical pain intensity: a comparison of six methods. Pain. 1986;27:117–126
- Responsiveness of pain, disability, and physical impairment outcomes in patients with low back pain. Spine. 2004;29:879–883
- Orthopaedic manual physical therapy management of the cervical-thoracic spine and rib cage. San Antonio: Manipulations Inc; 2000
- Muscles: testing and function. 4th ed. Baltimore: Williams & Wilkins; 1993
- Cervical and thoracic spine: mechanical diagnosis and therapy. Minneapolis: Orthopaedic Physical Therapy Products; 1990
- Reliability of a measurement of neck flexor muscle endurance. Phys Ther. 2005;85:1349–1355
- Maitland’s vertebral manipulation. 6th ed. Oxford: Butterworth-Heinemann; 2000
- Lateral rupture of the cervical intervertebral discs: a common cause of shoulder and arm pain. Surg Gynecol Obstet. 1944;78:350–358
- The investigation of arm pain: signs of adverse responses to the physical examination of the brachial plexus and related tissues. In: Boyling JD, Palastanga N, editors. Grieve’s modern manual therapy. 2nd ed. New York: Churchill Livingstone; 1994. p. 577–585
- Assessing change over time in patients with low-back pain. Phys Ther. 1994;74:528–533
- A comparison of a modified Oswestry Low Back Disability Questionnaire and the Quebec Back Pain Disability Scale. Phys Ther. 2001;81:776–788
- The reliability and construct validity of the Neck Disability Index and patient specific functional scale in patients with cervical radiculopathy. Spine. 2005;31:598–602
- Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J Clin Epidemiol. 1997;50:869–879
- The validity of prospective and retrospective global change criterion measures. Arch Phys Med Rehabil. 2005;86:2270–2276
- Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428
- Measurement reliability and agreement in psychiatry. Stat Methods Med Res. 1998;7:301–317
- Foundations of clinical research: applications to practice. 2nd ed. Upper Saddle River: Prentice Hall Health; 2000
- The meaning and use of the area under receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36
- Further evidence supporting a SEM-based criterion for identifying meaningful intra-individual changes in health related quality of life. J Clin Epidemiol. 1999;52:861–873
- Sensitivity to change of the Roland Morris Back Pain Questionnaire: part 1. Phys Ther. 1998;78:1186–1196
- Assessing disability and change on individual patients: a report of a patient-specific measure. Physiother Can. 1995;47:258–263
- Reliability and responsiveness of the Dutch version of the Neck Disability Index in patients with acute neck pain in general practice. Eur Spine J. 2006;15:1729–1736
- Validity and reliability of a modified version of the neck disability index. J Rehabil Med. 2002;34:284–287
- Measuring change over time: assessing the usefulness of evaluative instruments. J Chronic Dis. 1987;40:171–178
- A critical look at transition ratings. J Clin Epidemiol. 2002;55:900–908
Supported by the Orthopaedic Section of the American Physical Therapy Association, the American Academy of Orthopaedic Manual Physical Therapists, and Steens Physical USA.
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated.
PII: S0003-9993(07)01604-8
doi:10.1016/j.apmr.2007.08.126
© 2008 American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
