
A comparison of 4 questionnaires to measure fatigue in postpoliomyelitis syndrome

      Abstract

      Horemans HL, Nollet F, Beelen A, Lankhorst GJ. A comparison of 4 questionnaires to measure fatigue in postpoliomyelitis syndrome. Arch Phys Med Rehabil 2004;85:392–8.

      Objective

      To assess the comparability and reproducibility of 4 questionnaires used to measure fatigue in postpoliomyelitis syndrome (PPS).

      Design

      Repeated-measures at a 3-week interval.

      Setting

      University hospital.

      Participants

      Convenience sample of 65 patients with PPS.

      Interventions

      Not applicable.

      Main outcome measures

      The Fatigue Severity Scale (FSS), the Nottingham Health Profile (NHP) energy category, the Polio Problem List (PPL) fatigue item, and the Dutch Short Fatigue Questionnaire (SFQ).

      Results

      Correlations of scores between questionnaires were all significant (P<.01) and ranged from .43 (between the NHP energy category and the PPL fatigue item) to .68 (between the PPL fatigue item and the SFQ). Scores on the second visit, normalized to a 0 to 100 scale, were: FSS, 78±15; NHP energy category, 47±35; PPL fatigue item, 81±17; and SFQ, 65±22. Except for the difference between the FSS and the PPL fatigue item, the differences in scores between the questionnaires were significant (P<.01). Scale analysis indicated that all questionnaires measured the same unidimensional construct. The reproducibility of the FSS, the PPL fatigue item, and the SFQ was moderate. The smallest detectable change was 1.5 points for the FSS, 2.0 points for the PPL fatigue item, and 1.9 points for the SFQ.

      Conclusions

      Although the questionnaires measure the same fatigue construct in PPS, the results are not interchangeable because the ranges of measurement differ. The NHP energy category, in particular, appeared to have a high detection threshold. The moderate reproducibility of the questionnaires indicates a lack of precision, especially when applied at the individual patient level.


      One of the major problems in postpoliomyelitis syndrome (PPS) is fatigue.
      • Berlly M.H.
      • Strauser W.W.
      • Hall K.M.
      Fatigue in postpolio syndrome.
      ,
      • Nollet F.
      • Beelen A.
      • Prins M.H.
      • et al.
      Disability and functional assessment in former polio patients with and without postpolio syndrome.
      Approximately 66% to 89% of patients with PPS perceive symptoms of increased fatigue,
      • Agre J.C.
      • Rodriquez A.A.
      • Sperling K.B.
      Symptoms and clinical impressions of patients seen in a postpolio clinic.
      ,
      • Ramlow J.
      • Alexander M.
      • LaPorte R.
      • Kaufmann C.
      • Kuller L.
      Epidemiology of the post-polio syndrome.
      ,
      • Halstead L.S.
      • Rossi C.D.
      Post-polio syndrome clinical experience with 132 consecutive outpatients.
      which may lead to a decline in their physical activities
      • Packer T.L.
      • Martins I.
      • Krefting L.
      • Brouwer B.
      Activity and post-polio fatigue.
      ,
      • Packer T.L.
      • Sauriol A.
      • Brouwer B.
      Fatigue secondary to chronic illness postpolio syndrome, chronic fatigue syndrome, and multiple sclerosis.
      and social functioning.
      • Berlly M.H.
      • Strauser W.W.
      • Hall K.M.
      Fatigue in postpolio syndrome.
      ,
      • Schanke A.K.
      • Stanghelle J.K.
      Fatigue in polio survivors.
      In studies in which the severity of fatigue in PPS was measured, different group scores have been reported.
      • Nollet F.
      • Beelen A.
      • Prins M.H.
      • et al.
      Disability and functional assessment in former polio patients with and without postpolio syndrome.
      ,
      • Packer T.L.
      • Martins I.
      • Krefting L.
      • Brouwer B.
      Activity and post-polio fatigue.
      ,
      • Packer T.L.
      • Sauriol A.
      • Brouwer B.
      Fatigue secondary to chronic illness postpolio syndrome, chronic fatigue syndrome, and multiple sclerosis.
      ,
      • Schanke A.K.
      Psychological distress, social support and coping behaviour among polio survivors a 5-year perspective on 63 polio patients.
      ,
      • Trojan D.A.
      • Collet J.
      • Pollak M.N.
      • et al.
      Serum insulin-like growth factor-I (IGF-I) does not correlate positively with isometric strength, fatigue, and quality of life in post-polio syndrome.
      ,
      • Willen C.
      • Grimby G.
      Pain, physical activity, and disability in individuals with late effects of polio.
      ,
      • Willen C.
      • Sunnerhagen K.S.
      • Grimby G.
      Dynamic water exercise in individuals with late poliomyelitis.
      ,
      • Grimby G.
      • Jonsson A.L.
      Disability in poliomyelitis sequelae.
      However, fatigue was assessed with different questionnaires. Comparing results obtained with different questionnaires in the various studies is problematic, because it is not known whether the differences in scores reflect differences in the severity of fatigue between the study populations or differences in the response characteristics of the questionnaires. Items may differ in the range for which they measure the severity of fatigue. Furthermore, items may assess different aspects of fatigue—for example, some items measure fatigue associated with exertion and other items measure the perception of fatigue.
      • Tiesinga L.J.
      • Dassen T.W.
      • Halfens R.J.
      DUFS and DEFS development, reliability and validity of the Dutch Fatigue Scale and the Dutch Exertion Fatigue Scale.
      Therefore, differences in the construct of fatigue may be present both within and between the questionnaires.
      Because fatigue has been used as an outcome measure, both in the treatment
      • Trojan D.A.
      • Cashman N.R.
      An open trial of pyridostigmine in post-poliomyelitis syndrome.
      ,
      • Trojan D.A.
      • Collet J.P.
      • Shapiro S.
      • et al.
      A multicenter, randomized, double-blinded trial of pyridostigmine in postpolio syndrome.
      ,
      • Bruno R.L.
      • Zimmerman J.R.
      • Creange S.J.
      • Lewis T.
      • Molzen T.
      • Frick N.M.
      Bromocriptine in the treatment of post-polio fatigue a pilot study with implications for the pathophysiology of fatigue.
      ,
      • Stein D.P.
      • Dambrosia J.M.
      • Dalakas M.C.
      A double-blind, placebo-controlled trial of amantadine for the treatment of fatigue in patients with the post-polio syndrome.
      and prospective follow-up studies of PPS,
      • Windebank A.J.
      • Litchy W.J.
      • Daube J.R.
      • Iverson R.A.
      Lack of progression of neurologic deficit in survivors of paralytic polio a 5-year prospective population-based study.
      another important aspect that needs to be addressed is whether questionnaires that measure fatigue are reliable and sensitive enough to detect change.
      • Fitzpatrick R.
      • Ziebland S.
      • Jenkinson C.
      • Mowat A.
      • Mowat A.
      Importance of sensitivity to change as a criterion for selecting health status measures.
      Our study was undertaken to investigate the comparability and reproducibility of 4 questionnaires that have been used to assess the severity of fatigue in PPS: the Fatigue Severity Scale (FSS), the energy category of the Nottingham Health Profile (NHP), the fatigue item on the Polio Problem List (PPL), and the Short Fatigue Questionnaire (SFQ). The comparability of the 4 questionnaires was investigated by determining their concurrent and construct validity. The reproducibility of the FSS, the PPL fatigue item, and the SFQ was investigated by determining the test-retest reliability and the smallest detectable change.

      Methods

      Study population

      Sixty-five patients with PPS were recruited from the Dutch Neuromuscular Diseases Association (Vereniging Spierziekten Nederland), (academic) hospitals, and rehabilitation centers. All patients met the following inclusion criteria: (1) PPS, according to the Halstead criteria,
      • Halstead L.S.
      Post-polio syndrome definition of an elusive concept.
      that is, a history of paralytic polio, a period of neurologic recovery followed by an interval of functional stability of at least 15 years, and the onset of weakness in previously affected and/or unaffected muscles not due to inactivity, possibly accompanied by excessive fatigue, muscle pain, decreased endurance, and atrophy; (2) symptoms of increased fatigue, that is, a minimum score of 10 on the SFQ, which is above the normal values in a healthy Dutch population; (3) age 18 to 70 years; and (4) no other diseases that could cause the symptoms. The patients underwent a medical examination to check the criteria, and all participants gave written informed consent.

      Questionnaires

      Fatigue severity scale

      The FSS consists of 9 statements that are scored on a 7-point Likert scale, ranging from 1 (strongly disagree) to 7 (strongly agree). For each subject, a total score is calculated as the mean score of the 9 statements. A lower total score indicates less effect of fatigue on everyday life. The FSS has shown good internal consistency (Cronbach α range, .81–.95)
      • Krupp L.B.
      • LaRocca N.G.
      • Muir-Nash J.
      • Steinberg A.D.
      The fatigue severity scale Application to patients with multiple sclerosis and systemic lupus erythematosus.
      ,
      • Merkies I.S.
      • Schmitz P.I.
      • Samijn J.P.
      • van der Meche F.G.
      • van Doorn P.A.
      Fatigue in immune-mediated polyneuropathies. European Inflammatory Neuropathy Cause and Treatment (INCAT) Group.
      ,
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
      and test-retest reliability in patients with multiple sclerosis or systemic lupus erythematosus (Pearson r=.84),
      • Krupp L.B.
      • LaRocca N.G.
      • Muir-Nash J.
      • Steinberg A.D.
      The fatigue severity scale Application to patients with multiple sclerosis and systemic lupus erythematosus.
      immune-mediated polyneuropathies (intraclass correlation coefficient [ICC]=.86),
      • Merkies I.S.
      • Schmitz P.I.
      • Samijn J.P.
      • van der Meche F.G.
      • van Doorn P.A.
      Fatigue in immune-mediated polyneuropathies. European Inflammatory Neuropathy Cause and Treatment (INCAT) Group.
      and chronic hepatitis C (ICC=.82).
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.

      Nottingham health profile

      The NHP energy category is 1 of the 6 categories of the NHP and consists of 3 yes-no questions. The category score is calculated by dividing the number of questions answered with yes by the total number of questions and multiplying by 100, which results in a score ranging from 0 (no complaints) to 100 (answered yes to all questions). The Dutch version of the NHP energy category has shown satisfactory internal consistency (Cronbach α=.77)
      • Erdman R.A.
      • Passchier J.
      • Kooijman M.
      • Stronks D.L.
      The Dutch version of the Nottingham Health Profile investigations of psychometric aspects.
      and test-retest reliability (Spearman ρ range, .77–.86) in patients with chronic heart failure and myocardial infarction or stroke.
      • Erdman R.A.
      • Passchier J.
      • Kooijman M.
      • Stronks D.L.
      The Dutch version of the Nottingham Health Profile investigations of psychometric aspects.
      ,
      • Visser M.C.
      • Koudstaal P.J.
      • Erdman R.A.
      • et al.
      Measuring quality of life in patients with myocardial infarction or stroke a feasibility study of four questionnaires in the Netherlands.
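The scoring rule described for the NHP energy category can be illustrated with a short sketch. This is a minimal example of the unweighted score only; the function name `nhp_energy_score` and the boolean encoding of the yes-no answers are assumptions made for illustration and are not part of the NHP itself.

```python
def nhp_energy_score(answers):
    """Unweighted category score: percentage of questions answered 'yes'.

    `answers` is a sequence of booleans (True = yes); this encoding is an
    assumption made for illustration.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    return 100.0 * sum(bool(a) for a in answers) / len(answers)


# One 'yes' out of the 3 energy questions
print(round(nhp_energy_score([True, False, False])))  # 33
```

With 3 questions, a subject can only score 0, 33, 67, or 100, which is worth keeping in mind when this category is compared with the Likert-type questionnaires.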

      Polio problem list

      The PPL fatigue item is 1 of the 16 items on the PPL.
      • Nollet F.
      • Beelen A.
      • Prins M.H.
      • et al.
      Disability and functional assessment in former polio patients with and without postpolio syndrome.
      The PPL fatigue item assesses the extent to which fatigue is perceived as a problem, and it is scored on an 8-point Likert scale, ranging from 0 (no problem) to 7 (severe problem).

      Short fatigue questionnaire

      The SFQ consists of 4 statements that are scored on a 7-point Likert scale, similar to the scale of the FSS. For each subject, a total score is calculated as the mean score of the 4 statements. The SFQ has shown good internal consistency (Cronbach α=.88) and was found able to discriminate between patients and healthy subjects.
      • Alberts M.
      • Smets E.M.
      • Vercoulen J.H.
      • Garssen B.
      • Bleijenberg G.
      [Abbreviated fatigue questionnaire a practical tool in the classification of fatigue] [Dutch].

      Assessment protocol

      Two study visits to the hospital were scheduled on the same day of the week and at the same time, with a 3-week interval. The questionnaires were administered once on each study visit, except for the NHP, which was administered only on the second occasion. Before each visit, the patients received brief instructions on how to complete each questionnaire. They were asked to score only for the previous 2 weeks. The subjects were seated in a quiet room and were allowed to take all the time they needed to fill in the questionnaires and to rest in between, if necessary. In general, the time needed to fill in the questionnaires was less than 15 minutes. On both visits, the questionnaires were administered in the same order.

      Data analysis

      Validity was assessed on the basis of data obtained during the second study visit.

      Concurrent validity

      Correlations between the scores of the different questionnaires were determined by calculating the Spearman rank correlation coefficients. To determine whether there were any systematic differences between the scores of the questionnaires, the normalized scores of the questionnaires (on a 0–100 scale) were compared, using the Friedman analysis of variance (ANOVA). Post hoc (pairwise) comparisons were made, using Wilcoxon signed-rank tests corrected for multiple testing (α=.05/6, with 6 pairs tested).
      • Bland J.M.
      • Altman D.G.
      Multiple significance tests the Bonferroni method.
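The text does not spell out how the scores were normalized. The sketch below assumes a linear rescaling of each questionnaire's range onto 0 to 100, and also shows the Bonferroni-corrected significance level for the 6 pairwise comparisons. The function names `normalize` and `bonferroni_alpha` are hypothetical, and the input values are example mean scores on the original scales.

```python
from itertools import combinations


def normalize(score, low, high):
    """Linearly rescale a score from [low, high] to [0, 100] (assumed method)."""
    return 100.0 * (score - low) / (high - low)


def bonferroni_alpha(alpha, n_tests):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_tests


questionnaires = ["FSS", "NHP energy", "PPL fatigue", "SFQ"]
pairs = list(combinations(questionnaires, 2))  # all pairs of 4 questionnaires

print(len(pairs))                                     # 6
print(round(normalize(5.7, 1, 7)))                    # 78 (a mean of 5.7 on a 1-7 scale)
print(round(normalize(5.7, 0, 7)))                    # 81 (a mean of 5.7 on a 0-7 scale)
print(round(bonferroni_alpha(0.05, len(pairs)), 4))   # 0.0083
```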

      Construct validity

      To investigate the construct validity of the questionnaires, a Mokken scale analysis for polytomous items (MSP 5.0 for Windows; ProGAMMA BV, A weg 43, PO Box 841, 9700AV Groningen, The Netherlands) was performed to determine the single scale homogeneity of the 17 items of the 4 questionnaires.
      • Molenaar I.W.
      • Sijtsma K.
      User’s manual MSP5 for Windows. A program for Mokken scale analysis. Version 5.0.
      Mokken scale analysis is a nonparametric approach to item response theory.
      • Mokken R.J.
      A theory and procedure of scale analysis.
      The concept of homogeneity refers to the Mokken model of monotone homogeneity, which assumes that the items measure the same construct, that item and item-step scores are locally independent, and that the item and the item-step response functions are monotonely nondecreasing functions of the latent trait.
      • Molenaar I.W.
      • Sijtsma K.
      User’s manual MSP5 for Windows. A program for Mokken scale analysis. Version 5.0.
      The Loevinger H scalability coefficient gives an indication of the extent to which the set of items form a homogeneous scale. An H smaller than .30 indicates a nonhomogeneous scale; between .30 and .40, weak homogeneity; between .40 and .50, moderate homogeneity; and greater than .50, strong homogeneity.
      • Molenaar I.W.
      • Sijtsma K.
      User’s manual MSP5 for Windows. A program for Mokken scale analysis. Version 5.0.
      The stronger the homogeneity, the more all items measure the same construct and the better the items discriminate between individuals at different positions on the construct. Homogeneous scales are considered to be unidimensional. Unlike dichotomous items, polytomous items cannot be placed in hierarchic order with respect to the latent trait, because the order depends on both the item response functions and the item-step response functions.
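The interpretation bands for the Loevinger H coefficient quoted above can be written as a small lookup. This sketch only classifies a given H value and does not compute H itself. The function name `classify_homogeneity` is hypothetical, and the handling of values that fall exactly on a boundary (.30, .40, .50) is an assumption, because the text gives only the intervals.

```python
def classify_homogeneity(h):
    """Map a Loevinger H coefficient to the verbal labels quoted in the text.

    Boundary handling is an assumption: .30 counts as weak, .40 as moderate,
    and .50 as moderate (only values greater than .50 count as strong).
    """
    if h < 0.30:
        return "nonhomogeneous"
    if h < 0.40:
        return "weak"
    if h <= 0.50:
        return "moderate"
    return "strong"


# Arbitrary example values
for h in (0.10, 0.35, 0.45, 0.60):
    print(h, classify_homogeneity(h))
```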

      Reproducibility

      Systematic differences between the 2 study visits were investigated for the total scores of the FSS and the SFQ using t tests and for the PPL fatigue item and item scores of the FSS and the SFQ using Wilcoxon signed-rank tests. The distribution of the FSS and SFQ scores was considered normal. The test-retest reliability of the FSS and the SFQ was assessed with ICCs and the 95% confidence intervals (CIs) of the ICCs, using a random-effects 1-way ANOVA.
      • Rankin G.
      • Stokes M.
      Reliability of assessment tools in rehabilitation an illustration of appropriate statistical analyses.
      Test-retest reliability of the PPL fatigue item was analyzed by calculating the Spearman correlation coefficient. The internal consistency of the FSS, the NHP energy category, and the SFQ on the 2 study visits was determined with the Cronbach α coefficient. An α coefficient of .70 is considered sufficient, and an α coefficient of more than .80 is considered good for the purpose of group comparisons.
      • Nunnally J.
      • Bernstein I.
      Psychometric theory.
      Test-retest reliability for the 3 questionnaires was also assessed by means of Bland-Altman plots, in which for each subject the difference between the scores of the 2 visits was plotted against the mean of these 2 scores.
      • Bland J.M.
      • Altman D.G.
      Statistical methods for assessing agreement between two methods of clinical measurement.
      The 95% limits of agreement (mean difference ± 2 standard deviations [SDs]) were calculated for each questionnaire to assess the smallest detectable change, which gives an indication of the change that is needed to detect a real change, taking chance variation or measurement error into account.
      • Beckerman H.
      • Roebroeck M.E.
      • Lankhorst G.J.
      • Becher J.G.
      • Bezemer P.D.
      • Verbeek A.L.
      Smallest real difference, a link between reproducibility and responsiveness.
      A questionnaire is able to detect an individual change in score if the change lies outside the limits of agreement.
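The limits-of-agreement calculation can be sketched as follows. This is a minimal illustration, assuming the sample SD (n−1 denominator) of the between-visit differences, which the text does not specify. The function names are hypothetical and the scores are made up, not study data.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, multiplier=2.0):
    """Return (mean difference, lower limit, upper limit).

    Differences are second visit minus first visit; the limits are the mean
    difference +/- `multiplier` sample SDs of the differences.
    """
    diffs = [b - a for a, b in zip(first, second)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample SD (n - 1 denominator), an assumption
    return d_mean, d_mean - multiplier * d_sd, d_mean + multiplier * d_sd


def is_detectable_change(change, lower, upper):
    """An individual change counts as real if it falls outside the limits."""
    return change < lower or change > upper


# Made-up scores for 5 hypothetical subjects (not study data)
visit1 = [5.0, 6.0, 4.5, 5.5, 6.5]
visit2 = [5.5, 6.0, 4.0, 6.0, 6.5]
d_mean, lower, upper = limits_of_agreement(visit1, visit2)
print(round(d_mean, 2), round(lower, 2), round(upper, 2))
print(is_detectable_change(1.5, lower, upper))
```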
      For comparison of a group score in a paired situation, the smallest detectable change depends on the sample size. The effect of sample size on the smallest detectable change was estimated from $n > k \cdot \sigma^2 / \Delta^2$, with n being the number of subjects, k the constant based on tables of standard normal curve (k=10.51 for α=.05 and β=.10), σ² the variance of differences, and Δ the smallest detectable change.
      • Pratt R.K.
      • Fairbank J.C.
      • Virr A.
      The reliability of the Shuttle Walking Test, the Swiss Spinal Stenosis Questionnaire, the Oxford Spinal Stenosis Score, and the Oswestry Disability Index in the assessment of patients with lumbar spinal stenosis.
      Statistical analysis was performed with the SPSS, version 10.0.5, statistical software package (SPSS Inc, 233 S Wacker Dr, 11th Fl, Chicago, IL 60606). A significance level of P less than .05 was used for all tests of significance.
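The sample-size relation for group comparisons can be rearranged to give the smallest detectable change for a given number of subjects. The sketch below reads the relation as n > k·σ²/Δ², so that Δ = √(k·σ²/n). The function names are hypothetical, and the SD of 0.9 scale points is an illustrative input.

```python
import math

K = 10.51  # constant quoted in the text for alpha = .05 and beta = .10


def smallest_detectable_change(sd_diff, n, k=K):
    """Smallest group-level change detectable with n paired subjects.

    Rearranges n > k * sigma^2 / delta^2 into delta = sqrt(k * sigma^2 / n),
    where sigma is the SD of the between-visit differences.
    """
    return math.sqrt(k * sd_diff ** 2 / n)


def required_sample_size(sd_diff, delta, k=K):
    """Smallest whole number of subjects satisfying n > k * sigma^2 / delta^2."""
    return math.floor(k * sd_diff ** 2 / delta ** 2) + 1


# Illustrative SD of differences of 0.9 scale points
print(round(smallest_detectable_change(0.9, 25), 2))  # 0.58
print(required_sample_size(0.9, 0.5))                 # 35
```

Because Δ shrinks with the square root of n, doubling the sample from 25 to 50 reduces the smallest detectable change by a factor of about 1.4 rather than 2.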

      Results

      Sixty-five patients (42 women, 23 men) with a mean age of 52±8 years completed the questionnaires on both visits. The mean time since the onset of polio was 49±8 years, and the mean time since new neuromuscular symptoms were perceived was 10±6 years. The total score and the item scores of the questionnaires on the first and second study visits are presented in table 1. The highest item scores on the FSS were found on the second visit for items 6 (“My fatigue prevents sustained physical functioning”; score, 6.3±1.0) and 8 (“Fatigue is among my three most disabling symptoms”; score, 6.2±1.3). The highest scores on the NHP energy category were found for items 1 (“I’m tired all the time”; score, 58±50) and 3 (“I soon run out of energy”; score, 62±49). The highest score on the SFQ was found on the first visit for item 2 (“I tire easily”; score, 5.9±1.3).
      Table 1. Total and Item Scores on the First and Second Study Visits

      | Questionnaire and item | First Visit | Second Visit | P Value |
      |---|---|---|---|
      | FSS (range, 1–7) | 5.4±1.1 | 5.7±0.9 | .03 |
      | 1. My motivation is lower when I am fatigued | 5.0±1.9 | 5.3±1.8 | .22 |
      | 2. Exercise brings on my fatigue | 5.8±1.5 | 5.8±1.3 | .93 |
      | 3. I am easily fatigued | 5.6±1.6 | 5.9±1.3 | .02 |
      | 4. Fatigue interferes with my physical functioning | 5.7±1.6 | 5.9±1.1 | .15 |
      | 5. Fatigue causes frequent problems for me | 4.4±1.7 | 4.9±1.7 | .03 |
      | 6. My fatigue prevents sustained physical functioning | 5.9±1.7 | 6.3±1.0 | .10 |
      | 7. Fatigue interferes with carrying out certain duties and responsibilities | 5.3±1.7 | 5.5±1.5 | .40 |
      | 8. Fatigue is among my three most disabling symptoms | 6.1±1.4 | 6.2±1.3 | .56 |
      | 9. Fatigue interferes with my work, family, or social life | 5.3±1.7 | 5.2±1.8 | .43 |
      | NHP energy category (range, 0–100) |  | 47±35 |  |
      | 1. I’m tired all the time |  | 58±50 |  |
      | 2. Everything is an effort |  | 22±41 |  |
      | 3. I soon run out of energy |  | 62±49 |  |
      | PPL fatigue item (range, 0–7) | 5.5±1.4 | 5.7±1.2 | .05 |
      | SFQ (range, 1–7) | 5.0±1.2 | 4.9±1.3 | .72 |
      | 1. I feel tired | 4.9±1.5 | 4.9±1.7 | .90 |
      | 2. I tire easily | 5.9±1.3 | 5.8±1.5 | .12 |
      | 3. I feel fit | 5.0±1.6 | 5.1±1.6 | .43 |
      | 4. I feel physically exhausted | 4.0±1.9 | 3.9±1.9 | .57 |

      NOTE. Values are mean ± SD. The NHP was administered only on the second study visit. Differences between study visits were investigated for the total scores of the FSS and the SFQ using t tests, and for the PPL fatigue item and item scores of the FSS and the SFQ using Wilcoxon signed-rank tests.

      Concurrent validity

      Pairs of total scores of questionnaires correlated significantly (P<.01), with the highest correlation coefficient of .68 between the PPL fatigue item and the SFQ and the lowest correlation coefficient of .43 between the NHP energy category and the PPL fatigue item (table 2).
      Table 2. Spearman Correlation Coefficients

      |  | FSS | NHP Energy Category | PPL Fatigue Item | SFQ |
      |---|---|---|---|---|
      | FSS | 1.00 |  |  |  |
      | NHP energy category | 0.50* | 1.00 |  |  |
      | PPL fatigue item | 0.60* | 0.43* | 1.00 |  |
      | SFQ | 0.47* | 0.67* | 0.68* | 1.00 |

      * Significant correlation (P<.01).
      The normalized total and item scores of the questionnaires on the second study visit are given in table 3, and the normalized total scores are also presented in boxplots (fig 1). The mean normalized total scores were highest for the PPL fatigue item (81±17) and lowest for the NHP energy category (47±35). The mean normalized total scores for the FSS were 78±15, and for the SFQ they were 65±22. The normalized total scores of the questionnaires differed significantly (Friedman test, P<.01). Pairwise comparisons of the total scores of the questionnaires showed differences for all pairs of questionnaires (P<.05), except for the FSS-PPL fatigue item pair.
      Table 3. Normalized Total and Item Scores (range, 0–100) on the Second Study Visit

      | Questionnaire and item | Mean ± SD | 25th percentile | 50th percentile | 75th percentile |
      |---|---|---|---|---|
      | FSS | 78±15 | 70 | 80 | 88 |
      | 1. My motivation is lower when I am fatigued | 71±30 | 67 | 83 | 100 |
      | 2. Exercise brings on my fatigue | 79±22 | 75 | 83 | 100 |
      | 3. I am easily fatigued | 82±22 | 75 | 83 | 100 |
      | 4. Fatigue interferes with my physical functioning | 82±18 | 67 | 83 | 100 |
      | 5. Fatigue causes frequent problems for me | 64±28 | 50 | 67 | 83 |
      | 6. My fatigue prevents sustained physical functioning | 88±17 | 83 | 100 | 100 |
      | 7. Fatigue interferes with carrying out certain duties and responsibilities | 74±25 | 67 | 83 | 100 |
      | 8. Fatigue is among my three most disabling symptoms | 87±21 | 83 | 100 | 100 |
      | 9. Fatigue interferes with my work, family, or social life | 69±30 | 50 | 83 | 100 |
      | NHP energy category | 47±35 | 33 | 33 | 67 |
      | 1. I’m tired all the time | 58±50 | 0 | 100 | 100 |
      | 2. Everything is an effort | 22±41 | 0 | 0 | 0 |
      | 3. I soon run out of energy | 62±49 | 0 | 100 | 100 |
      | PPL fatigue item | 81±17 | 71 | 86 | 93 |
      | SFQ | 65±22 | 50 | 71 | 83 |
      | 1. I feel tired | 65±28 | 50 | 67 | 83 |
      | 2. I tire easily | 79±25 | 67 | 83 | 100 |
      | 3. I feel fit | 68±27 | 50 | 67 | 83 |
      | 4. I feel physically exhausted | 48±32 | 17 | 50 | 83 |

      NOTE. Total and item scores are mean ± SD and quartiles.
      Fig 1. Boxplots of normalized total scores (scale range, 0–100) on the second study visit. The box represents the interquartile range with the bold line as median value. The whiskers represent the range of the scores. Abbreviations: NHPE, NHP energy category; PPLF, PPL fatigue item.

      Construct validity

      Scale analysis performed on the 17 items of the 4 questionnaires showed that 15 of the 17 items formed a unidimensional scale (H=.49). The first 2 items of the FSS did not fit into this scale and did not form a separate scale (H=−.11). When scale analysis was performed on the FSS as a separate scale, once again the first 2 items misfitted. The item H for item 1 (“My motivation is lower when I am fatigued”) was .05, and the item H for item 2 (“Exercise brings on my fatigue”) was −.05. Scale analysis on the remaining 7 items of the FSS showed H equal to .63.

      Reproducibility

      The total score of the FSS was higher on the second study visit than on the first (mean difference, 0.2±0.8; P=.03) (table 1). The score increased significantly for items 3 (“I am easily fatigued”; P=.02) and 5 (“Fatigue causes frequent problems for me”; P=.03). The total and item scores of the PPL fatigue item and the SFQ did not differ on retest. The ICCs (95% CI) for the FSS and the SFQ were .83 (.72–.90) and .84 (.73–.90), respectively. The Spearman ρ for the PPL fatigue item was .80 (P<.01).
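For readers who want to reproduce this kind of test-retest coefficient, the sketch below computes an ICC from a 1-way random-effects ANOVA. It assumes the single-measurement form, ICC = (MSB − MSW) / (MSB + (k − 1)·MSW), which the text does not state explicitly. The function name and the example data are hypothetical, not study data.

```python
def icc_oneway(scores):
    """ICC from a 1-way random-effects ANOVA (single-measurement form).

    `scores` is a list of per-subject lists, each with the same number k of
    repeated measurements (for a test-retest design, k = 2 visits).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    # Between-subjects and within-subjects mean squares
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up data for 4 hypothetical subjects (visit 1, visit 2); not study data
example = [[5.0, 5.5], [6.0, 6.0], [4.0, 4.5], [6.5, 6.0]]
print(round(icc_oneway(example), 2))
```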
      The FSS showed good internal consistency on the 2 study visits (Cronbach α=.85 and .80, respectively). The internal consistency of the NHP energy category on the second study visit (Cronbach α=.59) was below the .70 standard recommended for group comparisons.
      • Nunnally J.
      • Bernstein I.
      Psychometric theory.
      The SFQ showed reasonable internal consistency on both study visits (Cronbach α=.79 and .77, respectively).
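The Cronbach α coefficient used here follows the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). A minimal sketch with made-up data is given below; the function name `cronbach_alpha` is hypothetical, and sample variances (n−1 denominator) are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a list of per-subject item-score lists."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up answers of 4 hypothetical subjects to a 3-item yes-no scale
answers = [[1, 0, 1], [1, 1, 1], [0, 0, 0], [1, 0, 0]]
print(round(cronbach_alpha(answers), 2))
```

With only 3 dichotomous items, as in the NHP energy category, α tends to be lower than for longer Likert-type scales, because the coefficient depends on the number of items as well as on their intercorrelations.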
      For the FSS, the PPL fatigue item, and the SFQ, the difference between the scores on the 2 study visits was plotted against the mean of the individual scores on both visits (fig 2). The 95% limits of agreement, when expressed as a percentage of the mean of 2 study visits, were narrowest for the FSS and widest for the SFQ (table 4). The effect of sample size on the smallest detectable change is presented in table 5. For comparison of a group score in a paired situation, the smallest detectable change fell below 10% for the FSS, the PPL fatigue item, and the SFQ at a sample size of 50 subjects.
      Fig 2. Bland-Altman plots for the FSS, the PPL fatigue item, and the SFQ. The difference is calculated as the score on the second study visit minus the score on the first study visit. The solid line represents the mean difference. The dotted lines represent the 95% limits of agreement.
      Table 4. Limits of Agreement

      | Questionnaire | Mean of Scores of the 2 Study Visits | Difference in Scores of the 2 Study Visits | 95% Limits of Agreement |
      |---|---|---|---|
      | FSS (range, 1–7) | 5.6±0.9 | 0.2±0.8 | −1.3 to 1.7 (−23% to 31%) |
      | PPL fatigue item (range, 0–7) | 5.6±1.2 | 0.2±0.9 | −2.0 to 2.0 (−36% to 36%) |
      | SFQ (range, 1–7) | 4.9±1.2 | −0.0±1.0 | −2.0 to 1.9 (−40% to 38%) |

      NOTE. The mean of the scores and the difference in scores of the 2 study visits are mean ± SD. The difference is calculated as scores on the second study visit minus scores on the first study visit. The 95% limits of agreement are calculated as mean difference ±2 SDs of the difference and are expressed in original scale points and as a percentage of the mean of the 2 study visits.
      Table 5. The Effect of Sample Size on the Smallest Detectable Change

      | Questionnaire | Individual | n=25 | n=50 |
      |---|---|---|---|
      | FSS (range, 1–7) | 1.5 (27%) | 0.5 (9%) | 0.3 (6%) |
      | PPL fatigue item (range, 0–7) | 2.0 (36%) | 0.6 (10%) | 0.4 (7%) |
      | SFQ (range, 1–7) | 1.9 (39%) | 0.6 (13%) | 0.4 (9%) |

      NOTE. The smallest detectable change is expressed in original scale points and as a percentage of the mean of the 2 study visits. For an individual case, the change required will be approximately 2 SDs of the mean difference. For comparison of a group score in a paired situation, the smallest detectable change was calculated from the formula $n > k \cdot \sigma^2 / \Delta^2$, where n=number of subjects, k=constant based on tables of standard normal curve (k=10.51 for α=.05 and β=.10), σ²=variance of differences, and Δ=smallest detectable change.
      • Pratt R.K.
      • Fairbank J.C.
      • Virr A.
      The reliability of the Shuttle Walking Test, the Swiss Spinal Stenosis Questionnaire, the Oxford Spinal Stenosis Score, and the Oswestry Disability Index in the assessment of patients with lumbar spinal stenosis.

      Discussion

      Different questionnaires have been used to assess the severity of fatigue in PPS. However, little is known about the comparability and reproducibility of their results. In our study, both the validity and the reproducibility of various fatigue questionnaires were assessed in 65 patients with PPS. The data on reproducibility also provided information about the smallest detectable change for each questionnaire.
      Although the PPS patients selected for our study had elevated levels of fatigue, the scores for fatigue measured with the FSS, the NHP energy category, and the PPL fatigue item were comparable to those reported in the literature.
      • Nollet F.
      • Beelen A.
      • Prins M.H.
      • et al.
      Disability and functional assessment in former polio patients with and without postpolio syndrome.
      ,
      • Packer T.L.
      • Martins I.
      • Krefting L.
      • Brouwer B.
      Activity and post-polio fatigue.
      ,
      • Packer T.L.
      • Sauriol A.
      • Brouwer B.
      Fatigue secondary to chronic illness postpolio syndrome, chronic fatigue syndrome, and multiple sclerosis.
      ,
      • Schanke A.K.
      Psychological distress, social support and coping behaviour among polio survivors a 5-year perspective on 63 polio patients.
      ,
      • Trojan D.A.
      • Collet J.
      • Pollak M.N.
      • et al.
      Serum insulin-like growth factor-I (IGF-I) does not correlate positively with isometric strength, fatigue, and quality of life in post-polio syndrome.
      ,
      • Willen C.
      • Grimby G.
      Pain, physical activity, and disability in individuals with late effects of polio.
      ,
      • Willen C.
      • Sunnerhagen K.S.
      • Grimby G.
      Dynamic water exercise in individuals with late poliomyelitis.
      ,
      • Grimby G.
      • Jonsson A.L.
      Disability in poliomyelitis sequelae.
      ,
      • Thoren-Jonsson A.L.
      • Hedberg M.
      • Grimby G.
      Distress in everyday life in people with poliomyelitis sequelae.
      It must be mentioned that most of the NHP energy category scores reported in the literature were calculated from weighted item scores.
      • McKenna S.P.
      • Hunt S.M.
      • McEwen J.
      Weighting the seriousness of perceived health problems using Thurstone’s method of paired comparisons.
      However, the importance of weighting is under discussion,
      • Jenkinson C.
      Why are we weighting? A critical examination of the use of item weights in a health status measure.
      ,
      • Prieto L.
      • Alonso J.
      • Viladrich M.C.
      • Anto J.M.
      Scaling the Spanish version of the Nottingham Health Profile evidence of limited value of item weights.
      and the median score of 33 found in our study is well within the range of values reported in the literature.
      • Nollet F.
      • Beelen A.
      • Prins M.H.
      • et al.
      Disability and functional assessment in former polio patients with and without postpolio syndrome.
      ,
      • Willen C.
      • Grimby G.
      Pain, physical activity, and disability in individuals with late effects of polio.
      ,
      • Willen C.
      • Sunnerhagen K.S.
      • Grimby G.
      Dynamic water exercise in individuals with late poliomyelitis.
      ,
      • Grimby G.
      • Jonsson A.L.
      Disability in poliomyelitis sequelae.
      ,
      • Thoren-Jonsson A.L.
      • Hedberg M.
      • Grimby G.
      Distress in everyday life in people with poliomyelitis sequelae.

      Validity

      Analysis of concurrent validity showed low to moderate correlations (range, .43–.68) between the total scores of all pairs of questionnaires, which indicates that less than half of the variation in the score of 1 questionnaire was explained by the variation in the score of another questionnaire. Moreover, the normalized total scores differed between most questionnaires. Compared with the scores of the FSS, the PPL fatigue item, and the SFQ, the scores of the NHP energy category were markedly lower. It is well known that the dichotomous items of the NHP have a high threshold for positive scores
      • Franks P.J.
      • Moffatt C.J.
      Health related quality of life in patients with venous ulceration: use of the Nottingham Health Profile.
      ,
      • Prieto L.
      • Alonso J.
      • Ferrer M.
      • Anto J.M.
      Are results of the SF-36 health survey and the Nottingham Health Profile similar? A comparison in COPD patients. Quality of Life in COPD Study Group.
      ,
      • Lamarca R.
      • Alonso J.
      • Santed R.
      • Prieto L.
      Performance of a perceived health measure in different groups of the population: a comprehensive study in Spain.
      and are not likely to detect minor illnesses.
      • Hunt S.M.
      • MacEwen J.
      • MacKenna S.P.
      Measuring health status.
      Item 2 of the NHP energy category in particular (“Everything is an effort”), which was answered affirmatively by only 22% of the patients (table 1), showed a considerable ceiling effect.
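      The correlation range reported at the start of this section can be translated into shared variance by squaring each coefficient. The sketch below does this for the lowest and highest correlations given in the abstract. It assumes a linear (Pearson-type) interpretation of r, which is only an approximation if rank correlations were used; the function name is illustrative.

```python
# Shared variance implied by a correlation coefficient (r squared).
# The two r values are the lowest and highest correlations reported in the abstract.
correlations = {
    "NHP energy category vs PPL fatigue item": 0.43,
    "PPL fatigue item vs SFQ": 0.68,
}


def shared_variance(r: float) -> float:
    """Proportion of variance in one score linearly explained by the other."""
    return r ** 2


for pair, r in correlations.items():
    print(f"{pair}: r = {r:.2f}, r^2 = {shared_variance(r):.2f}")
# prints r^2 = 0.18 for the first pair and r^2 = 0.46 for the second
```

      Even for the most strongly related pair, under this reading, more than half of the variation in one score is not accounted for by the other.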
      In contrast with the NHP energy category, the FSS and the PPL fatigue item seemed to have a low threshold. According to the median item values, at least 50% of the patients gave the maximum score of 7 on FSS items 6 (“My fatigue prevents sustained physical functioning”) and 8 (“Fatigue is among my three most disabling symptoms”). Therefore, item 2 of the NHP energy category and items 6 and 8 of the FSS may assess different aspects of fatigue. However, 15 of the 17 items on the 4 questionnaires formed a homogeneous scale that fell just short of the conventional threshold for a strong scale (H=.49), which indicates that the questionnaires did not measure different constructs or other aspects of fatigue. Interestingly, the first 2 items of the FSS (“My motivation is lower when I am fatigued”; “Exercise brings on my fatigue”) did not fit in the overall scale, nor did they fit in their own 9-item FSS scale. The latter was surprising, because all 9 items of the FSS fit the assumption of unidimensionality when used to assess patients with chronic hepatitis C.
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
      The fitting of these items may depend on the study population in which the FSS was applied. However, it must be stated that also in chronic hepatitis C patients,
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
      the first 2 items of the FSS showed the lowest item-total correlations, which indicates that their scores reflected the total score least accurately.
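      The scalability coefficient H reported above is usually read against rule-of-thumb cutoffs commonly attributed to Mokken: below .30 unscalable, .30 to .40 weak, .40 to .50 medium, and .50 or higher strong. The sketch below encodes those conventional cutoffs; the function name is hypothetical, and the cutoffs are an assumption about the convention applied, not something stated in the text.

```python
def classify_scalability(h: float) -> str:
    """Label a Mokken scalability coefficient H using conventional cutoffs."""
    if h < 0.30:
        return "unscalable"
    if h < 0.40:
        return "weak"
    if h < 0.50:
        return "medium"
    return "strong"


# H = .49 is the value reported for the 15-item overall scale.
print(classify_scalability(0.49))  # prints "medium"
```

      Under these cutoffs the overall scale sits at the upper end of the medium band, just below the .50 boundary for a strong scale.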

      Reproducibility

      The ICCs of the FSS (.83) and the SFQ (.84) and the Spearman ρ of the PPL fatigue item (.80) seemed to be satisfactory. The ICC of the FSS was in accordance with ICCs found in other groups of patients (ICC range, .82–.86).
      • Merkies I.S.
      • Schmitz P.I.
      • Samijn J.P.
      • van der Meche F.G.
      • van Doorn P.A.
      Fatigue in immune-mediated polyneuropathies. European Inflammatory Neuropathy Cause and Treatment (INCAT) Group.
      ,
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
      However, the lower limits of the 95% CIs of the ICCs of the FSS (.72) and the SFQ (.73) found in this study suggest only moderate test-retest reliability.
      • Lee J.
      • Koh D.
      • Ong C.N.
      Statistical evaluation of agreement between two methods for measuring a quantitative variable.
      ,
      • Andresen E.M.
      Criteria for assessing the tools of disability outcomes research.
      This might be due to large day-to-day variations in fatigue in PPS patients. However, because the literature presents no data on the CIs of the ICCs for the FSS, the PPL fatigue item, or the SFQ in PPS patients, this cannot be verified. In addition, the moderate test-retest reliability of fatigue as an outcome measure may be inherent in its subjective character, because the perception of fatigue depends not only on physical but also on mental and emotional status.
      • Tiesinga L.J.
      • Dassen T.W.
      • Halfens R.J.
      Fatigue: a summary of the definitions, dimensions, and indicators.
      The Cronbach α values for internal consistency that were found for the FSS and for the NHP energy category were comparable to those reported in other studies.
      • Krupp L.B.
      • LaRocca N.G.
      • Muir-Nash J.
      • Steinberg A.D.
      The fatigue severity scale. Application to patients with multiple sclerosis and systemic lupus erythematosus.
      ,
      • Merkies I.S.
      • Schmitz P.I.
      • Samijn J.P.
      • van der Meche F.G.
      • van Doorn P.A.
      Fatigue in immune-mediated polyneuropathies. European Inflammatory Neuropathy Cause and Treatment (INCAT) Group.
      ,
      • Kleinman L.
      • Zodet M.W.
      • Hakim Z.
      • et al.
      Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
      ,
      • Franks P.J.
      • Moffatt C.J.
      Health related quality of life in patients with venous ulceration: use of the Nottingham Health Profile.
      ,
      • Jans M.P.
      • Schellevis F.G.
      • van Eijk J.T.
      The Nottingham Health Profile: score distribution, internal consistency and validity in asthma and COPD patients.
      ,
      • Essink-Bot M.L.
      • Krabbe P.F.
      • Bonsel G.J.
      • Aaronson N.K.
      An empirical comparison of four generic health status measures. The Nottingham Health Profile, the Medical Outcomes Study 36-item Short-Form Health Survey, the COOP/WONCA charts, and the EuroQol instrument.
      With only 3 dichotomous items, the internal consistency of the NHP energy category was low, and it was not expected to be higher than that already reported in other patient groups. The internal consistency of the SFQ was acceptable but lower than reported by Alberts et al.
      • Alberts M.
      • Smets E.M.
      • Vercoulen J.H.
      • Garssen B.
      • Bleijenberg G.
      [Abbreviated fatigue questionnaire: a practical tool in the classification of fatigue] [Dutch].
      In addition to the assessment of reproducibility at the group level, it is also important to determine the reproducibility of an instrument at the individual level.
      • Angst F.
      • Aeschlimann A.
      • Stucki G.
      Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF-36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities.
      The limits of agreement found for the FSS, the PPL fatigue item, and the SFQ were wide (table 4) and may indicate large individual day-to-day variations. With a 95% CI, the change in score of an individual, compared with the score on the first study visit, had to be about 1.5 to 2 points on the scale of each questionnaire (1.5 for the FSS, 2.0 for the PPL fatigue item, and 1.9 for the SFQ) to be detected. The smallest detectable changes for the FSS, the PPL fatigue item, and the SFQ ranged from 27% to 39% (table 5). Therefore, at the individual level, the FSS, the PPL fatigue item, and the SFQ show too much variation in score to detect changes in fatigue reliably.
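      As an illustration of how limits of agreement and an individual-level smallest detectable change can be derived from two visits, the sketch below follows the Bland-Altman approach: the limits are the mean difference plus or minus 1.96 times the standard deviation of the differences, and the smallest detectable change is taken as 1.96 times that standard deviation. This is one common definition and is assumed here, not taken from the study's own computations; the scores are hypothetical and are not study data.

```python
from statistics import mean, stdev


def limits_of_agreement(visit1, visit2):
    """Return (mean difference, lower limit, upper limit, smallest detectable change).

    Uses the Bland-Altman 95% limits: mean difference +/- 1.96 * SD of differences.
    """
    diffs = [second - first for first, second in zip(visit1, visit2)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the test-retest differences
    lower = d_mean - 1.96 * d_sd
    upper = d_mean + 1.96 * d_sd
    sdc_individual = 1.96 * d_sd  # change needed to exceed measurement variation
    return d_mean, lower, upper, sdc_individual


# Hypothetical mean item scores for six patients at two visits (not study data).
first_visit = [5.0, 6.0, 4.5, 5.5, 6.5, 5.0]
second_visit = [5.5, 5.5, 5.0, 5.0, 6.5, 6.0]

d_mean, lower, upper, sdc = limits_of_agreement(first_visit, second_visit)
print(f"mean difference {d_mean:.2f}, limits {lower:.2f} to {upper:.2f}, SDC {sdc:.2f}")
```

      Wide limits translate directly into a large smallest detectable change, which is why an individual's score has to shift substantially before the change can be distinguished from measurement variation.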
      Although less appropriate for detecting differences within an individual, the FSS, the PPL fatigue item, and the SFQ may be useful for group comparisons, in which the smallest detectable changes are much smaller. In a sample size of 50, the FSS, the PPL fatigue item, and the SFQ can detect changes of less than 10% from baseline. Similar conclusions with respect to the ability to detect change have been reported for the NHP energy category.
      • Lamarca R.
      • Alonso J.
      • Santed R.
      • Prieto L.
      Performance of a perceived health measure in different groups of the population: a comprehensive study in Spain.
      ,
      • McHorney C.A.
      • Tarlov A.R.
      Individual-patient monitoring in clinical practice: are available health status surveys adequate?
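      The gain in precision at the group level can be illustrated with the usual scaling of the smallest detectable change by the square root of the sample size. The sketch below assumes that group-level SDC equals individual-level SDC divided by the square root of n, and applies it to the 27% and 39% bounds reported above; the function name is illustrative.

```python
import math


def group_sdc(individual_sdc: float, n: int) -> float:
    """Smallest detectable change for a group mean, assuming SDC_group = SDC_individual / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return individual_sdc / math.sqrt(n)


# Lower and upper bounds of the individual-level SDC reported in the text (percent).
for pct in (27.0, 39.0):
    print(f"individual SDC {pct:.0f}% -> group SDC at n=50: {group_sdc(pct, 50):.1f}%")
# prints 3.8% and 5.5%
```

      Under this assumption, both bounds fall below 10% for a sample of 50, in line with the statement above about group comparisons.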

      Conclusions

      When comparing the severity of fatigue in PPS reported in various studies, one should take into account that, although the FSS, NHP energy category, PPL fatigue item, and SFQ measure the same construct of fatigue, the reported severity may differ considerably because the questionnaires cover different ranges of fatigue. The NHP energy category, in particular, appeared to have a high detection threshold for fatigue.
      The choice of the appropriate questionnaire to measure fatigue in PPS may depend on the expected range in the severity of fatigue and the desired responsiveness of the questionnaire. For instance, if one is only interested in identifying high levels of fatigue, the NHP energy category may be preferred. On the other hand, if the main interest is to identify changes in fatigue, for instance due to an intervention, it should be realized that reproducibility was comparable for the FSS, the PPL fatigue item, and the SFQ: sufficient at the group level but lacking precision at the individual patient level. The choice may further depend on the desired simplicity of the instrument, that is, the number of questions. Finally, when applying the FSS to measure fatigue in PPS, one should consider omitting items 1 and 2, because they do not appear to fit in the same fatigue construct as the other FSS items or with the other fatigue questionnaires studied.

      Suppliers

      a. MSP 5.0 for Windows; ProGAMMA BV, A weg 43, PO Box 841, 9700AV Groningen, The Netherlands.
      b. SPSS Inc, 233 S Wacker Dr, 11th Fl, Chicago, IL 60606.

      Acknowledgements

      We thank Bastiaan Hemker for sharing his expertise on Mokken scale analysis.

      References

        • Berlly M.H.
        • Strauser W.W.
        • Hall K.M.
        Fatigue in postpolio syndrome.
        Arch Phys Med Rehabil. 1991; 72: 115-118
        • Nollet F.
        • Beelen A.
        • Prins M.H.
        • et al.
        Disability and functional assessment in former polio patients with and without postpolio syndrome.
        Arch Phys Med Rehabil. 1999; 80: 136-143
        • Agre J.C.
        • Rodriquez A.A.
        • Sperling K.B.
        Symptoms and clinical impressions of patients seen in a postpolio clinic.
        Arch Phys Med Rehabil. 1989; 70: 367-370
        • Ramlow J.
        • Alexander M.
        • LaPorte R.
        • Kaufmann C.
        • Kuller L.
        Epidemiology of the post-polio syndrome.
        Am J Epidemiol. 1992; 136: 769-786
        • Halstead L.S.
        • Rossi C.D.
        Post-polio syndrome.
        Birth Defects Orig Artic Ser. 1987; 23: 13-26
        • Packer T.L.
        • Martins I.
        • Krefting L.
        • Brouwer B.
        Activity and post-polio fatigue.
        Orthopedics. 1991; 14: 1223-1226
        • Packer T.L.
        • Sauriol A.
        • Brouwer B.
        Fatigue secondary to chronic illness.
        Arch Phys Med Rehabil. 1994; 75: 1122-1126
        • Schanke A.K.
        • Stanghelle J.K.
        Fatigue in polio survivors.
        Spinal Cord. 2001; 39: 243-251
        • Schanke A.K.
        Psychological distress, social support and coping behaviour among polio survivors.
        Disabil Rehabil. 1997; 19: 108-116
        • Trojan D.A.
        • Collet J.
        • Pollak M.N.
        • et al.
        Serum insulin-like growth factor-I (IGF-I) does not correlate positively with isometric strength, fatigue, and quality of life in post-polio syndrome.
        J Neurol Sci. 2001; 182: 107-115
        • Willen C.
        • Grimby G.
        Pain, physical activity, and disability in individuals with late effects of polio.
        Arch Phys Med Rehabil. 1998; 79: 915-919
        • Willen C.
        • Sunnerhagen K.S.
        • Grimby G.
        Dynamic water exercise in individuals with late poliomyelitis.
        Arch Phys Med Rehabil. 2001; 82: 66-72
        • Grimby G.
        • Jonsson A.L.
        Disability in poliomyelitis sequelae.
        Phys Ther. 1994; 74: 415-424
        • Tiesinga L.J.
        • Dassen T.W.
        • Halfens R.J.
        DUFS and DEFS.
        Int J Nurs Stud. 1998; 35: 115-123
        • Trojan D.A.
        • Cashman N.R.
        An open trial of pyridostigmine in post-poliomyelitis syndrome.
        Can J Neurol Sci. 1995; 22: 223-227
        • Trojan D.A.
        • Collet J.P.
        • Shapiro S.
        • et al.
        A multicenter, randomized, double-blinded trial of pyridostigmine in postpolio syndrome.
        Neurology. 1999; 53: 1225-1233
        • Bruno R.L.
        • Zimmerman J.R.
        • Creange S.J.
        • Lewis T.
        • Molzen T.
        • Frick N.M.
        Bromocriptine in the treatment of post-polio fatigue.
        Am J Phys Med Rehabil. 1996; 75: 340-347
        • Stein D.P.
        • Dambrosia J.M.
        • Dalakas M.C.
        A double-blind, placebo-controlled trial of amantadine for the treatment of fatigue in patients with the post-polio syndrome.
        Ann N Y Acad Sci. 1995; 753: 296-302
        • Windebank A.J.
        • Litchy W.J.
        • Daube J.R.
        • Iverson R.A.
        Lack of progression of neurologic deficit in survivors of paralytic polio.
        Neurology. 1996; 46: 80-84
        • Fitzpatrick R.
        • Ziebland S.
        • Jenkinson C.
        • Mowat A.
        • Mowat A.
        Importance of sensitivity to change as a criterion for selecting health status measures.
        Qual Health Care. 1992; 1: 89-93
        • Halstead L.S.
        Post-polio syndrome.
        in: Munsat T.L. Post-polio syndrome. Butterworth-Heinemann, Stoneham (MA) 1991: 23-38
        • Krupp L.B.
        • LaRocca N.G.
        • Muir-Nash J.
        • Steinberg A.D.
        The fatigue severity scale.
        Arch Neurol. 1989; 46: 1121-1123
        • Merkies I.S.
        • Schmitz P.I.
        • Samijn J.P.
        • van der Meche F.G.
        • van Doorn P.A.
        Fatigue in immune-mediated polyneuropathies. European Inflammatory Neuropathy Cause and Treatment (INCAT) Group.
        Neurology. 1999; 53: 1648-1654
        • Kleinman L.
        • Zodet M.W.
        • Hakim Z.
        • et al.
        Psychometric evaluation of the fatigue severity scale for use in chronic hepatitis C.
        Qual Life Res. 2000; 9: 499-508
        • Erdman R.A.
        • Passchier J.
        • Kooijman M.
        • Stronks D.L.
        The Dutch version of the Nottingham Health Profile.
        Psychol Rep. 1993; 72: 1027-1035
        • Visser M.C.
        • Koudstaal P.J.
        • Erdman R.A.
        • et al.
        Measuring quality of life in patients with myocardial infarction or stroke.
        J Epidemiol Community Health. 1995; 49: 513-517
        • Alberts M.
        • Smets E.M.
        • Vercoulen J.H.
        • Garssen B.
        • Bleijenberg G.
        [Abbreviated fatigue questionnaire] [Dutch].
        Ned Tijdschr Geneeskd. 1997; 141: 1526-1530
        • Bland J.M.
        • Altman D.G.
        Multiple significance tests.
        BMJ. 1995; 310: 170
        • Molenaar I.W.
        • Sijtsma K.
        User’s manual MSP5 for Windows.
        iec ProGAMMA, Groningen (Netherlands) 2000
        • Mokken R.J.
        A theory and procedure of scale analysis.
        De Gruyter, New York 1971
        • Rankin G.
        • Stokes M.
        Reliability of assessment tools in rehabilitation.
        Clin Rehabil. 1998; 12: 187-199
        • Nunnally J.
        • Bernstein I.
        Psychometric theory.
        3rd ed. McGraw-Hill, New York 2000
        • Bland J.M.
        • Altman D.G.
        Statistical methods for assessing agreement between two methods of clinical measurement.
        Lancet. 1986; 1: 307-310
        • Beckerman H.
        • Roebroeck M.E.
        • Lankhorst G.J.
        • Becher J.G.
        • Bezemer P.D.
        • Verbeek A.L.
        Smallest real difference, a link between reproducibility and responsiveness.
        Qual Life Res. 2001; 10: 571-578
        • Pratt R.K.
        • Fairbank J.C.
        • Virr A.
        The reliability of the Shuttle Walking Test, the Swiss Spinal Stenosis Questionnaire, the Oxford Spinal Stenosis Score, and the Oswestry Disability Index in the assessment of patients with lumbar spinal stenosis.
        Spine. 2002; 27: 84-91
        • Thoren-Jonsson A.L.
        • Hedberg M.
        • Grimby G.
        Distress in everyday life in people with poliomyelitis sequelae.
        J Rehabil Med. 2001; 33: 119-127
        • McKenna S.P.
        • Hunt S.M.
        • McEwen J.
        Weighting the seriousness of perceived health problems using Thurstone’s method of paired comparisons.
        Int J Epidemiol. 1981; 10: 93-97
        • Jenkinson C.
        Why are we weighting? A critical examination of the use of item weights in a health status measure.
        Soc Sci Med. 1991; 32: 1413-1416
        • Prieto L.
        • Alonso J.
        • Viladrich M.C.
        • Anto J.M.
        Scaling the Spanish version of the Nottingham Health Profile.
        J Clin Epidemiol. 1996; 49: 31-38
        • Franks P.J.
        • Moffatt C.J.
        Health related quality of life in patients with venous ulceration.
        Qual Life Res. 2001; 10: 693-700
        • Prieto L.
        • Alonso J.
        • Ferrer M.
        • Anto J.M.
        Are results of the SF-36 health survey and the Nottingham Health Profile similar?
        J Clin Epidemiol. 1997; 50: 463-473
        • Lamarca R.
        • Alonso J.
        • Santed R.
        • Prieto L.
        Performance of a perceived health measure in different groups of the population.
        J Clin Epidemiol. 2001; 54: 127-135
        • Hunt S.M.
        • MacEwen J.
        • MacKenna S.P.
        Measuring health status.
        Croom Helm, London 1986
        • Lee J.
        • Koh D.
        • Ong C.N.
        Statistical evaluation of agreement between two methods for measuring a quantitative variable.
        Comput Biol Med. 1989; 19: 61-70
        • Andresen E.M.
        Criteria for assessing the tools of disability outcomes research.
        Arch Phys Med Rehabil. 2000; 81: S15-S20
        • Tiesinga L.J.
        • Dassen T.W.
        • Halfens R.J.
        Fatigue.
        Nurs Diagn. 1996; 7: 51-62
        • Jans M.P.
        • Schellevis F.G.
        • van Eijk J.T.
        The Nottingham Health Profile: score distribution, internal consistency and validity in asthma and COPD patients.
        Qual Life Res. 1999; 8: 501-507
        • Essink-Bot M.L.
        • Krabbe P.F.
        • Bonsel G.J.
        • Aaronson N.K.
        An empirical comparison of four generic health status measures. The Nottingham Health Profile, the Medical Outcomes Study 36-item Short-Form Health Survey, the COOP/WONCA charts, and the EuroQol instrument.
        Med Care. 1997; 35: 522-537
        • Angst F.
        • Aeschlimann A.
        • Stucki G.
        Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF-36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities.
        Arthritis Care Res. 2001; 45: 384-391
        • McHorney C.A.
        • Tarlov A.R.
        Individual-patient monitoring in clinical practice.
        Qual Life Res. 1995; 4: 293-307