Archives of Physical Medicine and Rehabilitation
Volume 90, Issue 1, Pages 87-94, January 2009

Reliability of Rehabilitative Ultrasound Imaging of the Transversus Abdominis and Lumbar Multifidus Muscles

  • Shane L. Koppenhaver, MPT
    • College of Health, University of Utah, Salt Lake City, UT
    • Officer, United States Army
    • Correspondence to Shane L. Koppenhaver, MPT, 1416 Downington Ave, Salt Lake City, UT 84105
  • Jeffrey J. Hebert, DC
    • College of Health, University of Utah, Salt Lake City, UT
  • Julie M. Fritz, PhD, PT, ATC
    • College of Health, University of Utah, Salt Lake City, UT
    • Intermountain Health Care, Salt Lake City, UT
  • Eric C. Parent, PhD, PT
    • Department of Physical Therapy/Faculty of Rehabilitation Medicine, University of Alberta, Edmonton, Alberta, Canada
  • Deydre S. Teyhen, PhD, PT
    • Officer, United States Army
    • US Army-Baylor University Doctoral Program in Physical Therapy, San Antonio, TX
  • John S. Magel, DSc, PT
    • Intermountain Health Care, Salt Lake City, UT

Abstract 

Koppenhaver SL, Hebert JJ, Fritz JM, Parent EC, Teyhen DS, Magel JS. Reliability of rehabilitative ultrasound imaging of the transversus abdominis and lumbar multifidus muscles.

Objectives

To evaluate the intraexaminer and interexaminer reliability of rehabilitative ultrasound imaging (RUSI) in obtaining thickness measurements of the transversus abdominis (TrA) and lumbar multifidus muscles at rest and during contractions.

Design

Single-group repeated-measures reliability study.

Setting

University and orthopedic physical therapy clinic.

Participants

A volunteer sample of adults (N=30) with current nonspecific low back pain (LBP) was examined by 2 clinicians with minimal RUSI experience.

Interventions

Not applicable.

Main Outcome Measures

Thickness measurements of the TrA and lumbar multifidus muscles at rest and during contractions were obtained by using RUSI during 2 sessions 1 to 3 days apart. Percent thickness change was calculated as $(\text{thickness}_{\text{contracted}} - \text{thickness}_{\text{rest}}) / \text{thickness}_{\text{rest}}$. Intraclass correlation coefficients (ICC) were used to estimate reliability.

Results

By using the mean of 2 measures, intraexaminer reliability point estimates (ICC3,2) ranged from 0.96 to 0.99 for same-day comparisons and from 0.87 to 0.98 for between-day comparisons. Interexaminer reliability estimates (ICC2,2) ranged from 0.88 to 0.94 for within-day comparisons and from 0.80 to 0.92 for between-day comparisons. Reliability estimates comparing measurements by the 2 examiners of the same image (ICC2,2) ranged from 0.96 to 0.98. Reliability estimates were lower for percent thickness change measures than the corresponding single thickness measures for all conditions.

Conclusions

RUSI thickness measurements of the TrA and lumbar multifidus muscles in patients with LBP, when based on the mean of 2 measures, are highly reliable when taken by a single examiner and adequately reliable when taken by different examiners.

Key Words: Abdominal muscles, Low back pain, Rehabilitation, Reproducibility of results, Ultrasonography

List of Abbreviations: ADIM, abdominal drawing-in maneuver; ASLR, active straight leg raise; CI, confidence interval; ICC, intraclass correlation coefficient; LBP, low back pain; LOA, limits of agreement; MDC, minimal detectable change; ODI, modified Oswestry Disability Index; RUSI, rehabilitative ultrasound imaging; TrA, transversus abdominis

 

The transversus abdominis and lumbar multifidus muscles have been proposed to play an important role in spinal stability1, 2, 3 and have been shown to have functional deficits in individuals with LBP.1, 2, 3, 4, 5, 6, 7, 8, 9 RUSI has been advocated as a noninvasive method to quantify muscle morphology and behavior and has been increasingly used both in research and as a clinical tool throughout the rehabilitative process.10, 11 RUSI has been validated as a measure of TrA and lumbar multifidus muscle morphology through comparisons with magnetic resonance imaging measurements12, 13 and as an indicator of muscle activation with indwelling electromyography.4, 14, 15, 16, 17

For RUSI to be useful as a research and rehabilitative tool, the reliability of its measurements must be determined as it is used clinically. Although several researchers have investigated the reliability of RUSI measures of the TrA16, 18, 19, 20, 21, 22, 23, 24, 25, 26 and lumbar multifidus6, 13, 15, 27, 28, 29, 30, 31 muscles, all have done so in small (n<10) and/or asymptomatic samples. Most of these studies have shown very high reliability (ICC>0.90) and good precision (TrA standard error of measurement<1.2mm and lumbar multifidus standard error of measurement<3.7mm); however, estimates obtained in asymptomatic samples cannot be generalized to individuals with LBP, and estimates obtained with small samples are often associated with wide confidence intervals. Furthermore, most researchers have investigated reliability in limited conditions, most commonly only during resting states repeated during a single testing session. Because RUSI is used primarily in symptomatic patients, of muscles during both resting and contracted states, and across different days, the reliability of such measures still needs to be established.

The primary purpose of this study was to evaluate the intraexaminer and interexaminer reliability in obtaining RUSI thickness measurements of the TrA and lumbar multifidus muscles at rest and during contractions both during a single session (within day) and between 2 sessions (between day) in patients with LBP. We hypothesized that RUSI measurements are adequately reliable (ICC>0.75) for research and clinical use in patients with LBP.

Methods 

Participants 

Thirty volunteers aged 18 to 60 years with current nonspecific LBP were recruited for this study, either by responding to fliers posted around the University of Utah campus or by referral from a local orthopedic physical therapy clinic. LBP was defined as current symptoms of pain and/or numbness between the twelfth rib and buttocks, with or without symptoms into 1 or both legs, that limit function. Participants were excluded for prior lumbar surgery; the inability to lie both prone and supine for a minimum of 20 minutes each; or the presence of medical red flags of potentially serious conditions including cauda equina syndrome, major or rapidly progressing neurologic deficit, fracture, cancer, infection, or systemic disease. Participants signed consent forms approved by the institutional review boards of recruiting institutions.

Examiners 

One physical therapist (S.K.) and 1 chiropractor (J.H.) participated as examiners for the reliability analysis. Although both examiners had been practicing clinically for more than 8 years, neither had previously used RUSI in their clinical practice. Before testing, both examiners underwent 16 hours of hands-on training with a coinvestigator (D.T.) experienced with the specific RUSI protocol used in the study. Additionally, the physical therapist completed 70 hours of didactic training including a course and certification by the Burwin Institute in musculoskeletal ultrasound.

Procedures 

This single-group repeated-measures design involved a baseline measurement session and a follow-up session 1 to 3 days later. After providing consent, participants completed self-report measures including demographic/historic information and questionnaires on pain and disability. An 11-point numeric rating scale, ranging from 0 to 10, was used to estimate the mean of current pain intensity and the best and worst pain intensity in the past 24 hours.32, 33, 34 The ODI questionnaire was used to quantify self-reported disability. ODI scores range from 0 to 100, with higher scores representing more disability.35, 36 During the physical examination, the examiner determined the participant's symptomatic side, which was then used for all subsequent RUSI images. If pain was evenly distributed, the side of measurement was determined randomly. After the initial session, participants were asked to avoid any exercises or treatments for LBP between sessions.

Images of the TrA and lumbar multifidus muscles were acquired in B-mode with a Sonosite Titan ultrasound machine and a 60-mm 2- to 5-MHz curvilinear array. Image acquisition for each condition was performed 3 times by each of the 2 examiners. To maximize time efficiency, 1 examiner positioned the transducer and optimized the quality of the image (imaging examiner), whereas the other examiner captured and saved the image. To help avoid an order effect associated with potential learning or fatigue, the order in which each examiner obtained the images and the order in which the muscles (TrA and lumbar multifidus) were imaged were counterbalanced. A total of 108 images were taken of each participant (72 during session 1 and 36 during session 2) to be able to calculate a mean from 2 or 3 measures and to calculate all within-day and between-day intraexaminer and interexaminer comparisons for all muscle conditions.

Transversus abdominis 

Images of the TrA muscle were acquired during the ASLR maneuver37, 38 and during the ADIM.25, 26 Ultrasound images of the TrA muscle were obtained with the transducer positioned just superior to the iliac crest along the midaxillary line and followed the techniques outlined by Teyhen et al26 in which the middle of the muscle belly was centered within the field of view. All images were collected at the end of normal exhalation to control for the influence of respiration.

The ASLR maneuver37, 38 was used in this study to assess automatic changes in the TrA muscle thickness without the subject being asked to volitionally activate the muscle. Participants were positioned supine with hips and knees extended at rest and were instructed to “raise your leg off of the table approximately 8 inches (20cm) without bending your knee.” All participants were given a single practice of the ASLR maneuver before image acquisition.

The ADIM is a fundamental motor control exercise used to train the TrA muscle and has been found to preferentially contract the TrA muscle relative to the more superficial lateral abdominal muscles.25, 26 The ADIM was used in this study to assess changes in muscle thickness associated with a volitional activation of the TrA muscle. The resting position involved the participants lying supine in a hook-lying position.25, 26 To perform the ADIM, participants were instructed to “take a relaxed breath in and out, hold the breath out, and then draw-in your lower abdomen without moving your spine.” Alternate cues of “cut off the flow of urine” or “close your rear passage” were sometimes given in an attempt to maximize a preferential TrA contraction. The cue resulting in the largest preferential TrA contraction was practiced approximately 5 times until a ceiling effect occurred in performance of the ADIM as visualized by changes in muscle thickness on the ultrasound image.

Lumbar multifidus 

Images of the lumbar multifidus muscle at rest and during a submaximal contraction were obtained following techniques outlined by Kiesel et al.15 To assess automatic changes in the lumbar multifidus muscle thickness during a task, a contralateral arm lift maneuver was performed prone with the elbows flexed 90°, shoulders abducted 120°, and holding a hand weight based on the participant's body mass.15 Participants were instructed to “lift your arm approximately 2 inches (5cm) off the table” and were given 1 practice contralateral arm lift trial before image acquisition.

Measurements 

All images were measured offline by using Image J software (V1.38t) on a different date than the images were obtained. TrA thickness measurements were made between the superficial and deep borders of the muscle, as visualized by the hyperechoic fascial lines (fig 1). Lumbar multifidus thickness measurements were made between the posterior-most portion of the L4/5 zygapophyseal joint and the plane between the muscle and subcutaneous tissue (fig 2). Each examiner measured all of the images they generated, allowing for the analysis of intraexaminer and interexaminer reliability. The physical therapist also measured all of the images obtained by the chiropractor to assess the reliability of 2 examiners measuring the same image. By using Image J's automatic measurement function (control M) and concealing the measurement output on the computer screen, examiners were blinded during measurement to the thickness values. Additionally, examiners were blinded to each other's measurements and to their own previous measurements.

Fig 1. Ultrasound images of the TrA, internal oblique (IO), and external oblique (EO) muscles (A) during rest and (B) during an ADIM. Thickness measurements were made between the superficial and deep borders of the TrA muscle.

Fig 2. Ultrasound images of the lumbar multifidus (LM) muscle (A) during rest and (B) during a contralateral arm raise. Thickness measurements were made between the posterior-most portion of the L4/5 facet joint and the plane between the muscle and subcutaneous tissue.

Data Analysis 

Data management and statistical analyses were performed by using the Statistical Package for the Social Sciences version 16.0 software. TrA data from 30 participants across 2 days and 4 different measurement conditions (supine rest, ASLR, hook-lying rest, and ADIM) and lumbar multifidus data from 29 participants across 2 days and 2 different measurement conditions (prone rest and contralateral arm lift) were included for analysis. The dependent measures for the TrA and lumbar multifidus muscles were resting thickness, contracted thickness, and percent thickness change. Percent thickness change was calculated for the TrA and lumbar multifidus muscles by using the following equation: $(\text{thickness}_{\text{contracted}} - \text{thickness}_{\text{rest}}) / \text{thickness}_{\text{rest}}$.
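The percent thickness change calculation can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name is hypothetical, and the multiplication by 100 is an assumption made because the tables report values in percent while the equation in the text gives the ratio only.

```python
def percent_thickness_change(thickness_rest, thickness_contracted):
    """Relative thickness change from rest to contraction, in percent.

    The factor of 100 is an assumption (the source equation gives the ratio).
    """
    if thickness_rest <= 0:
        raise ValueError("resting thickness must be positive")
    return (thickness_contracted - thickness_rest) / thickness_rest * 100.0


# Hypothetical thicknesses in mm, chosen only to illustrate the arithmetic.
print(percent_thickness_change(3.0, 4.5))  # 50.0
```

Because the resting thickness appears in both the numerator and the denominator, an error in the resting measurement affects the result twice, which is relevant to the lower reliability reported for percent thickness change.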

ICCs with 95% CIs were calculated to assess intraexaminer (model 3,k) and interexaminer (model 2,k) reliability both within and between days.39 As recommended by Bland and Altman,40 biases with 95% CIs were estimated by calculating the mean difference between measures, and LOAs were calculated as the mean difference ± 2 × SD of the differences. To assess measurement precision, the standard error of measurement was calculated as $\text{SD} \times \sqrt{1 - \text{ICC}}$.41, 42 MDCs were calculated as $1.96 \times \text{standard error of measurement} \times \sqrt{2}$ and represent the minimal change in thickness that must occur to be 95% confident that a true change occurred.43, 44 To investigate the effect of using the mean of multiple thickness measurements on reliability and measurement precision, ICCs and standard errors of measurement using the mean of the first 2 and 3 measures were compared with those using single measures.
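The statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reproduction of the SPSS analysis: all function names are hypothetical, the ICCs use the usual two-way ANOVA mean-square formulas for average-measure models 2,k and 3,k, the SD passed to `sem` is assumed to be the pooled SD of the condition, and the SD of the differences uses the sample (n−1) denominator. Confidence intervals are omitted.

```python
import math


def icc_average_measures(ratings):
    """Return (ICC(2,k), ICC(3,k)) for a subjects-by-raters table.

    ratings: list of rows, one row per subject, each row holding k ratings.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_raters
    bms = ss_subjects / (n - 1)            # between-subjects mean square
    jms = ss_raters / (k - 1)              # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    icc_2k = (bms - ems) / (bms + (jms - ems) / n)
    icc_3k = (bms - ems) / bms
    return icc_2k, icc_3k


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change: 1.96 x SEM x sqrt(2)."""
    return 1.96 * sem_value * math.sqrt(2.0)


def bias_and_loa(first, second):
    """Mean difference and limits of agreement (mean difference +/- 2 SD)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return mean_diff, mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff


# Hypothetical thickness data (mm): 4 participants, 2 ratings each.
ratings = [[3.1, 3.3], [2.8, 2.9], [4.0, 3.8], [3.5, 3.6]]
icc_2k, icc_3k = icc_average_measures(ratings)
print(round(icc_2k, 2), round(icc_3k, 2))
print(bias_and_loa([r[0] for r in ratings], [r[1] for r in ratings]))
```

Model 3,k ignores systematic differences between raters, whereas model 2,k penalizes them through the between-raters mean square, so ICC(2,k) is typically the lower of the two when raters differ systematically.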

Results 

Demographic and baseline characteristics of the patient sample are provided in table 1. Images from 1 participant for the lumbar multifidus muscle were excluded because examiners were unable to identify muscle boundaries. Although specific pain level was not solicited during imaging, all participants satisfactorily completed all TrA and lumbar multifidus muscle-contraction tasks without verbal complaints of pain.

Table 1. Demographic and Baseline Characteristics of Participants (N=30)
Characteristic | Value
Age (y) | 42.4±11.4
Sex | 43% women
BMI (kg/m2) | 26.6±4.8
Oswestry Disability Score (%) | 20.4±14.1
Numeric pain rating scale* | 2.9±2.0
Pain in back/buttock only (%) | 77
Pain below buttocks, above knee (%) | 10
Pain below knee (%) | 13
Duration of symptoms (d)† | 75 (17, 847)
Prior history of LBP (%) | 80

NOTE: Values are mean ± SD unless otherwise indicated.

*Reports the average of the worst, best, and current scores for pain over the last 24 hours.

†Median (interquartile range).

The standard error of measurement was calculated to determine if a single measure or an average of 2 or 3 images resulted in the greatest precision (table 2). Overall, the mean of 2 measurements across each condition decreased the standard error of measurement by a mean of 32.4%, whereas the mean of 3 measurements decreased the standard error of measurement by a mean of 36.4%. Weighing the 4.0% (95% CI, 2.9%–5.2%) mean additional improvement in precision against the additional time required to acquire and measure a third set of images, we analyzed the remaining reliability data by using the mean of the first 2 measurements of each condition. Reliability coefficients with corresponding 95% CIs, standard errors of measurement, MDCs, bias, and LOAs are presented in table 3 for intraexaminer estimates and table 4 for interexaminer estimates. Means and SDs are also presented in tables 3 and 4 and represent pooled values from all measures in the corresponding condition.

Table 2. Difference in Standard Error of Measurement Using the Mean of 2 and 3 Measures Compared to a Single Measure
Muscle/State | Intraexaminer: Single Measure | Intraexaminer: Mean of 2 Measures (% from 1 measure) | Intraexaminer: Mean of 3 Measures (% from 1 measure) | Interexaminer: Single Measure | Interexaminer: Mean of 2 Measures (% from 1 measure) | Interexaminer: Mean of 3 Measures (% from 1 measure)
Within day
TrA: Rest (supine) | 0.2 | 0.1 (41%) | 0.1 (45%) | 0.4 | 0.3 (26%) | 0.3 (28%)
TrA: Contracted (ASLR) | 0.6 | 0.3 (54%) | 0.2 (61%) | 0.6 | 0.4 (28%) | 0.4 (31%)
TrA: Rest (hook-lying) | 0.4 | 0.2 (44%) | 0.2 (48%) | 0.4 | 0.2 (34%) | 0.2 (36%)
TrA: Contracted (ADIM) | 0.4 | 0.3 (41%) | 0.2 (48%) | 0.6 | 0.5 (21%) | 0.5 (24%)
LM: Rest | 1.5 | 1.0 (31%) | 1.0 (31%) | 2.9 | 2.1 (27%) | 2.1 (28%)
LM: Contracted (CAL) | 0.9 | 0.6 (38%) | 0.5 (43%) | 2.5 | 1.7 (34%) | 1.5 (42%)
Avg% | | (41%) | (46%) | | (28%) | (31%)
Between days
TrA: Rest (supine) | 0.3 | 0.2 (33%) | 0.2 (39%) | 0.4 | 0.3 (27%) | 0.2 (33%)
TrA: Contracted (ASLR) | 0.6 | 0.4 (33%) | 0.3 (40%) | 0.8 | 0.6 (21%) | 0.5 (31%)
TrA: Rest (hook-lying) | 0.3 | 0.2 (27%) | 0.2 (27%) | 0.4 | 0.3 (32%) | 0.2 (39%)
TrA: Contracted (ADIM) | 0.7 | 0.5 (26%) | 0.5 (30%) | 0.6 | 0.4 (28%) | 0.4 (31%)
LM: Rest | 1.3 | 0.9 (34%) | 0.9 (34%) | 2.9 | 2.1 (28%) | 2.1 (30%)
LM: CAL | 1.8 | 1.1 (38%) | 1.1 (40%) | 2.7 | 1.8 (33%) | 1.8 (35%)
Avg% | | (32%) | (35%) | | (28%) | (33%)

Abbreviations: Avg, average; LM, lumbar multifidus; CAL, contralateral arm lift.

Values in millimeters except % change.

Table 3. Intraexaminer Reliability of Examiner 1 Using a Mean of 2 Measures for Each Rating
Muscle/State | Mean ± SD (mm)* | ICC3,2 (95% CI) | SEM (mm) | MDC (mm) | Bias (95% CI) ± 95% LOA† (mm)
Within day
TrA: Rest (supine) | 3.2±0.8 | 0.98 (0.95–0.99) | 0.1 | 0.4 | 0.1 (0.0–0.1) ±0.5
TrA: Contracted (ASLR) | 3.7±1.4 | 0.96 (0.92–0.98) | 0.3 | 0.8 | 0.0 (−0.2–0.2) ±1.1
TrA: % change | 15.7±32.8 | 0.92 (0.84–0.96) | 9.2 | 25.4 | −2.4 (−9.0–4.2) ±35.4
TrA: Rest (hook-lying) | 3.3±1.0 | 0.96 (0.91–0.98) | 0.2 | 0.5 | 0.0 (−0.2–0.1) ±0.8
TrA: Contracted (ADIM) | 5.8±1.5 | 0.97 (0.94–0.99) | 0.3 | 0.7 | 0.1 (−0.1–0.2) ±1.0
TrA: % change | 80.8±39.0 | 0.94 (0.87–0.97) | 9.8 | 27.1 | 5.0 (−2.0–12.0) ±37.4
LM: Rest | 34.6±6.2 | 0.97 (0.94–0.99) | 1.0 | 2.8 | −0.5 (−1.2–0.3) ±4.0
LM: Contracted (CAL) | 37.9±6.5 | 0.99 (0.98–1.00) | 0.6 | 1.6 | 0.0 (−0.5–0.4) ±2.4
LM: % change | 9.8±8.3 | 0.78 (0.53–0.89) | 4.0 | 11.0 | 1.6 (−1.2–4.3) ±14.3
Between days
TrA: Rest (supine) | 3.1±0.8 | 0.94 (0.87–0.97) | 0.2 | 0.6 | 0.1 (0.0–0.3) ±0.8
TrA: Contracted (ASLR) | 3.7±1.5 | 0.93 (0.86–0.97) | 0.4 | 1.1 | −0.1 (−0.4–0.2) ±1.5
TrA: % change | 18.3±36.7 | 0.89 (0.76–0.95) | 12.3 | 34.1 | −7.7 (−16.2–0.8) ±45.5
TrA: Rest (hook-lying) | 3.2±0.9 | 0.93 (0.85–0.97) | 0.2 | 0.7 | 0.2 (0.0–0.3) ±0.9
TrA: Contracted (ADIM) | 5.7±1.4 | 0.87 (0.74–0.94) | 0.5 | 1.3 | 0.2 (−0.1–0.6) ±1.8
TrA: % change | 83.9±37.1 | 0.73 (0.43–0.87) | 19.2 | 53.3 | −1.2 (−14.1–11.7) ±69.1
LM: Rest | 34.4±6.2 | 0.98 (0.95–0.99) | 0.9 | 2.5 | −0.1 (−0.8–0.6) ±3.6
LM: Contracted (CAL) | 38.2±6.6 | 0.97 (0.94–0.99) | 1.1 | 3.1 | −0.6 (−1.4–0.3) ±4.3
LM: % change | 11.2±8.7 | 0.79 (0.56–0.90) | 4.0 | 11.0 | −1.1 (−3.9–1.7) ±14.6

Abbreviations: LM, lumbar multifidus; CAL, contralateral arm lift.

Values in millimeters except % change.

*Pooled from all measures in condition.

†Mean difference ± 2 SDs.

Table 4. Interexaminer Reliability Using a Mean of 2 Measures for Each Rating
Muscle/State | Mean ± SD (mm)* | ICC2,2 (95% CI) | SEM (mm) | MDC (mm) | Bias (95% CI) ± 95% LOA† (mm)
Within day
TrA: Rest (supine) | 3.1±0.9 | 0.89 (0.78–0.95) | 0.3 | 0.8 | −0.2 (−0.4–0.0) ±1.1
TrA: Contracted (ASLR) | 3.5±1.3 | 0.91 (0.79–0.96) | 0.4 | 1.1 | −0.3 (−0.6–0.0)§ ±1.5
TrA: % change | 13.1±29.0 | 0.91 (0.81–0.96) | 8.7 | 24.2 | −2.8 (−9.1–3.5) ±33.8
TrA: Rest (hook-lying) | 3.1±1.0 | 0.94 (0.79–0.98) | 0.2 | 0.7 | −0.3 (−0.4–−0.1)§ ±0.8
TrA: Contracted (ADIM) | 5.6±1.5 | 0.89 (0.75–0.95) | 0.5 | 1.4 | −0.3 (−0.7–0.0)§ ±1.8
TrA: % change | 85.2±36.3 | 0.73 (0.42–0.87) | 19.0 | 52.7 | 3.8 (−8.9–16.5) ±67.9
LM: Rest | 33.2±6.0 | 0.88 (0.63–0.95) | 2.1 | 5.8 | −2.3 (−3.6–−0.9)§ ±7.0
LM: Contracted (CAL) | 37.5±6.4 | 0.93 (0.85–0.97) | 1.7 | 4.7 | −0.8 (−2.1–0.4) ±6.4
LM: % change | 13.4±11.0 | 0.45 (−0.09–0.73) | 8.1 | 22.6 | 5.5 (0.5–10.4)§ ±26.1
Between days
TrA: Rest (supine) | 3.1±0.9 | 0.91 (0.82–0.96) | 0.3 | 0.7 | −0.1 (−0.3–0.1) ±1.0
TrA: Contracted (ASLR) | 3.5±1.4 | 0.80 (0.58–0.91) | 0.6 | 1.7 | −0.4 (−0.8–0.0)§ ±2.2
TrA: % change | 16.9±32.9 | 0.78 (0.53–0.89) | 15.6 | 43.2 | −10.5 (−21.0–−0.1) ±55.9
TrA: Rest (hook-lying) | 3.1±0.9 | 0.92 (0.83–0.96) | 0.3 | 0.7 | −0.1 (−0.3–0.1) ±1.0
TrA: Contracted (ADIM) | 5.5±1.3 | 0.90 (0.80–0.95) | 0.4 | 1.2 | −0.1 (−0.4–0.2) ±1.6
TrA: % change | 85.8±34.0 | 0.55 (0.04–0.79) | 22.8 | 63.3 | 2.6 (−11.6–16.9) ±76.3
LM: Rest | 33.3±6.0 | 0.88 (0.60–0.95) | 2.1 | 5.8 | −2.4 (−3.7–−1.1)§ ±6.8
LM: Contracted (CAL) | 37.7±6.6 | 0.92 (0.82–0.97) | 1.8 | 5.1 | −1.4 (−2.7–−0.1)§ ±6.7
LM: % change | 13.9±10.7 | 0.73 (0.42–0.88) | 5.5 | 15.3 | 4.4 (0.7–8.0)§ ±19.2

Abbreviations: CAL, contralateral arm lift; LM, lumbar multifidus; SEM, standard error of measurement.

Values in millimeters except % change.

*Pooled from all measures in condition.

†Mean difference ± 2 SDs.

§Statistically significant bias (different from zero).

Depending on the muscle (TrA vs lumbar multifidus) and muscle condition (rest vs contraction), intraexaminer reliability point estimates (ICC3,2) of thickness measurements ranged from 0.96 to 0.99 for same-day comparisons and from 0.87 to 0.98 for between-day comparisons (see table 3). Estimates from the 2 examiners were not statistically different from one another (ie, 95% CIs overlapped); therefore, intraexaminer data are presented only for examiner 1 (J.H.). Depending on the muscle and muscle condition, interexaminer reliability estimates (ICC2,2) of thickness measurements ranged from 0.88 to 0.94 for same-day comparisons and from 0.80 to 0.92 for between-day comparisons (see table 4). Reliability estimates comparing thickness measurements by the 2 examiners of the same image (ICC3,2) ranged from 0.96 to 0.98. Reliability estimates were lower for percent thickness change measures than the corresponding single thickness measures for both muscles in all conditions (see tables 3 and 4).

Bias estimates were small and not statistically significantly different from 0 in all intraexaminer comparisons (see table 3). However, statistically significant interexaminer bias was found in approximately 50% of comparisons, with estimates ranging from 0.3 to 0.4mm for the TrA measurements and from 1.4 to 2.4mm for the lumbar multifidus measurements (see table 4). Statistically significant bias was also found in 5 of 6 comparisons performed between examiners measuring the same image. Estimates ranged from 0.2mm (TrA) to 0.7mm (lumbar multifidus), with examiner 1 (chiropractor) consistently measuring a larger value than examiner 2 (physical therapist).

Discussion 

This study evaluated the intraexaminer and interexaminer reliability in obtaining RUSI thickness measurements of the TrA and lumbar multifidus muscles at rest and during submaximal contractions both during a single session and between days in patients with LBP. Intraexaminer comparisons of thickness measures generally showed excellent reliability with only the ICC point estimate of between-day ADIM reliability below 0.90. Although generally lower than intraexaminer estimates, all interexaminer ICC estimates remained above 0.80, indicating good interexaminer reliability. These findings are consistent with previous studies that investigated both symptomatic20, 26, 45 and asymptomatic15, 16, 18, 19, 21, 22, 24, 25, 29, 31, 46 individuals and support our primary hypothesis that RUSI measurements are adequately reliable for research and clinical use in patients with LBP.

The comparison that resulted in the lowest intraexaminer reliability estimate for single thicknesses was that of between-day ADIM measures (ICC=0.87). The ADIM requires examiners to teach participants to volitionally contract the TrA to a specific degree that results in maximal thickness change of the TrA with minimal to no thickening of the more superficial abdominal muscles.26 Factors such as the instructions from the examiner, participant motivation, and participant's skill at motor control may all affect interrepetition performance during an ADIM and could explain the decreased reliability of these measures. In contrast to intraexaminer comparisons, it was the between-day ASLR measures that showed the poorest interexaminer reliability (ICC=0.80). The ASLR was included in this study in an attempt to avoid the additional performance variability that comes with volitional muscle contractions. It was the experimenters' observation in this study that TrA thickness change during the ASLR was highly variable in many participants between repetitions. Although not instructed to do so, it is possible that some participants purposefully altered their abdominal contraction during the ASLR. It is also likely that very small variations between repetitions (eg, 0.2–0.5mm) had moderate adverse effects on reliability because mean TrA thickness only increased approximately 0.5mm during the ASLR.

Percent thickness change measures may be more useful clinically than single thickness measures but incorporate the measurement error from both resting and contracted measurements ($(\text{thickness}_{\text{contracted}} - \text{thickness}_{\text{rest}}) / \text{thickness}_{\text{rest}}$). Therefore, it is not surprising that estimates of the reliability of percent thickness change were consistently lower than those for single thickness measurements; this is likely attributable to the fact that change scores are based on 2 imperfect measurements (rather than 1). To our knowledge, only 1 other study46 investigated the reliability of percent thickness change using RUSI. Although that study found high reliability in patients with LBP for both single thickness measures and percent thickness change of the TrA and lumbar multifidus muscles, its authors investigated only intraexaminer reliability during a single session. A potential problem with the lower reliability of percent thickness change measures is that they result in relatively large standard errors of measurement and MDCs. For example, when using between-day intraexaminer MDCs, if a patient with LBP initially showed a lumbar multifidus thickness change of 10%, after rehabilitation they would have to increase it to at least 21% for the examiner to be 95% confident that a true change occurred. An even larger change would be necessary in the TrA during an ADIM. A patient initially showing an 80% thickness change would have to increase to at least 133% for the examiner to be 95% confident that a true change occurred. In some cases, these minimal detectable postrehabilitation values are larger than the percent thickness changes found in asymptomatic individuals46 and may not be realistically attainable.
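The two worked examples above follow directly from adding the MDC to the baseline value. A minimal sketch, using the between-day intraexaminer MDCs for percent thickness change reported in table 3 (11.0 for the lumbar multifidus and 53.3 for the TrA during the ADIM); the function name is hypothetical:

```python
def minimum_followup_value(baseline, mdc):
    """Smallest follow-up value that exceeds the baseline by at least the MDC."""
    return baseline + mdc


# Lumbar multifidus: baseline 10% thickness change, MDC 11.0 (table 3).
print(f"{minimum_followup_value(10.0, 11.0):.1f}")  # 21.0
# TrA during ADIM: baseline 80% thickness change, MDC 53.3 (table 3).
print(f"{minimum_followup_value(80.0, 53.3):.1f}")  # 133.3
```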

To better identify the sources of variability during RUSI, the reliability of 2 examiners measuring the same image was calculated. Reliability was excellent, with all resting and contracted thickness measure point estimates above 0.96. This finding is consistent with previous work26 and suggests that the great majority of interexaminer measurement “error” is introduced during image acquisition as opposed to during measurement of muscle thickness on a previously obtained image. However, a statistically significant bias was found in all except 1 intraimage comparison with examiner 1 (chiropractor) consistently measuring a larger value than examiner 2 (physical therapist). During image measurement, standardization for the lateral cursor placement consisted of examiners agreeing to measure the horizontal “visual center of the muscle” for the TrA (see fig 1) and at the most posterior portion of the L4/5 facet for the lumbar multifidus (see fig 2). A systematic difference in how each examiner interpreted the “visual center of the muscle” or in the choice of landmark used to represent the muscle-fascial boundary or facet joint may have existed. Regardless of where the specific bias occurred, the actual differences between examiners were very small and did not result in poor reliability.

Throughout this discussion, we have mostly interpreted reliability estimates as suggested by Portney and Watkins41 who advocate that coefficients below 0.50 represent poor reliability, those between 0.50 and 0.75 represent moderate reliability, and coefficients above 0.75 represent good reliability. Other authors propose differing cutoff standards,47, 48, 49 some with different minimal reliability criteria for group comparisons (0.70) and individual comparisons (0.90–0.95).48 In fact, there seems to be a growing consensus that any such standards in interpreting reliability coefficients should also consider both the precision of the measured variable and how the measures will be ultimately used.41, 42 RUSI measurements are most likely used clinically to make patient-management decisions regarding lumbar stabilization exercise. Because the cost of an “incorrect” decision in the clinic would likely be relatively benign (eg, having a patient without TrA deficits perform abdominal motor control exercises), a lower level of reliability of RUSI measures may be acceptable. Using RUSI as an outcome measure during research may similarly allow a lower level of reliability because measures are usually averaged across multiple individuals, thereby decreasing measurement error.
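The interpretation thresholds described above can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and assigning the boundary values 0.50 and 0.75 to the moderate category is an assumption, because the text does not say how exact boundary values are classified.

```python
def interpret_reliability(icc):
    """Label an ICC using the cutoffs cited in the text (below 0.50, 0.50-0.75, above 0.75)."""
    if icc < 0.50:
        return "poor"
    if icc <= 0.75:  # boundary handling is an assumption
        return "moderate"
    return "good"


# ICC point estimates taken from tables 3 and 4.
print(interpret_reliability(0.87))  # good
print(interpret_reliability(0.73))  # moderate
print(interpret_reliability(0.45))  # poor
```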

Study Limitations 

Several limitations exist within this study. Both examiners were clinicians with minimal RUSI experience other than 16 hours of training on the specific ultrasound machine and imaging protocol. Because some evidence suggests that the reliability of RUSI measurement may differ depending on user experience,50 it is unknown whether practitioners with either more or less experience will show a different level of reliability than did the examiners in this study. Additionally, pain was assessed during the initial evaluation only. Although all participants were able to complete the muscle contraction tasks without verbal complaints, the level of pain during contractions was not solicited and may have adversely affected reliability. Moreover, abdominal muscle thickness has been found to vary depending on the exact location of measurement, with more superior portions of the muscle being thicker than inferior locations.24 Although the current study used a standardized transducer placement protocol, specific transducer placement was not marked between image acquisitions and likely varied to some small degree. Finally, some ICC point estimates were associated with wide 95% CIs in which the upper-bound and lower-bound estimates represent very different degrees of reliability. Although the current study is the largest study to date to investigate RUSI reliability in patients with LBP, our results should not be considered definitive. Further studies should continue to investigate the reliability of RUSI measures, especially of percent thickness change in symptomatic samples. Future studies should additionally attempt to better identify the sources of error involved with RUSI image acquisition and the measurement of muscle thickness. Lastly, attempts should be made to identify more reliable contraction strategies for the TrA and methods to reduce error during such measurements.

Conclusions 

RUSI thickness measurements of the TrA and lumbar multifidus muscles in patients with LBP, when based on the mean of 2 measures, are highly reliable when taken by a single examiner and adequately reliable when taken by different examiners. Using the mean of 2 measures substantially increased the reliability and precision of all measurements and is recommended. Percent thickness change measures may be adequately reliable because clinical use of RUSI usually involves benign patient-management decisions regarding lumbar stabilization exercise, and measures are typically averaged across multiple individuals in research.

Suppliers

Acknowledgments 

We would like to thank Aaron Swalberg and Steven Moffit of Intermountain Health Care, Salt Lake City, UT, for their help with participant recruitment.

References 

  1. Hodges PW, Richardson CA. Delayed postural contraction of transversus abdominis in low back pain associated with movement of the lower limb. J Spinal Disord. 1998;11:46–56
  2. Hodges PW, Richardson CA. Altered trunk muscle recruitment in people with low back pain with upper limb movement at different speeds. Arch Phys Med Rehabil. 1999;80:1005–1012
  3. Hungerford B, Gilleard W, Hodges P. Evidence of altered lumbopelvic muscle recruitment in the presence of sacroiliac joint pain. Spine. 2003;28:1593–1600
  4. Ferreira PH, Ferreira ML, Hodges PW. Changes in recruitment of the abdominal muscles in people with low back pain: ultrasound measurement of muscle activity. Spine. 2004;29:2560–2566
  5. Hides JA, Stokes MJ, Saide M, Jull GA, Cooper DH. Evidence of lumbar multifidus muscle wasting ipsilateral to symptoms in patients with acute/subacute low back pain. Spine. 1994;19:165–172
  6. Hodges P, Holm AK, Hansson T, Holm S. Rapid atrophy of the lumbar multifidus follows experimental disc or nerve root injury. Spine. 2006;31:2926–2933
  7. Kiesel KB, Uhl T, Underwood FB, Nitz AJ. Rehabilitative ultrasound measurement of select trunk muscle activation during induced pain. Man Ther. 2008;13:132–138
  8. Yoshihara K, Shirai Y, Nakayama Y, Uesaka S. Histochemical changes in the multifidus muscle in patients with lumbar intervertebral disc herniation. Spine. 2001;26:622–626
  9. Zhao WP, Kawaguchi Y, Matsui H, Kanamori M, Kimura T. Histochemistry and morphology of the multifidus muscle in lumbar disc herniation: comparative study between diseased and normal sides. Spine. 2000;25:2191–2199
  10. Teyhen DS. Rehabilitative ultrasound imaging symposium San Antonio, TX, May 8–10, 2006. J Orthop Sports Phys Ther. 2006;36:A1–A3
  11. Teyhen DS. Rehabilitative ultrasound imaging: the roadmap ahead. J Orthop Sports Phys Ther. 2007;37:431–433
  12. Hides J, Wilson S, Stanton W, et al. An MRI investigation into the function of the transversus abdominis muscle during “drawing-in” of the abdominal wall. Spine. 2006;31:E175–E178
  13. Hides JA, Richardson CA, Jull GA. Magnetic resonance imaging and ultrasonography of the lumbar multifidus muscle: comparison of two different modalities. Spine. 1995;20:54–58
  14. Hodges PW, Pengel LH, Herbert RD, Gandevia SC. Measurement of muscle contraction with ultrasound imaging. Muscle Nerve. 2003;27:682–692
  15. Kiesel KB, Uhl TL, Underwood FB, Rodd DW, Nitz AJ. Measurement of lumbar multifidus muscle contraction with rehabilitative ultrasound imaging. Man Ther. 2007;12:161–166
  16. McMeeken JM, Beith ID, Newham DJ, Milligan P, Critchley DJ. The relationship between EMG and change in thickness of transversus abdominis. Clin Biomech (Bristol, Avon). 2004;19:337–342
  17. Vasseljen O, Dahl HH, Mork PJ, Torp HG. Muscle activity onset in the lumbar multifidus muscle recorded simultaneously by ultrasound imaging and intramuscular electromyography. Clin Biomech (Bristol, Avon). 2006;21:905–913
  18. Ainscough-Potts AM, Morrissey MC, Critchley D. The response of the transverse abdominis and internal oblique muscles to different postures. Man Ther. 2006;11:54–60
  19. Bunce SM, Moore AP, Hough AD. M-mode ultrasound: a reliable measure of transversus abdominis thickness? Clin Biomech (Bristol, Avon). 2002;17:315–317
  20. Critchley DJ, Coutts FJ. Abdominal muscle function in chronic low back pain patients: Measurement with real-time ultrasound scanning. Physiotherapy. 2002;88:322–332
  21. Hides JA, Miokovic T, Belavý DL, Stanton WR, Richardson CA. Ultrasound imaging assessment of abdominal muscle function during drawing-in of the abdominal wall: an intrarater reliability study. J Orthop Sports Phys Ther. 2007;37:480–486
  22. Kidd AW, Magee S, Richardson CA. Reliability of real-time ultrasound for the assessment of transversus abdominis function. J Gravit Physiol. 2002;9:P131–P132
  23. Misuri G, Colagrande S, Gorini M, et al. In vivo ultrasound assessment of respiratory function of abdominal muscles in normal subjects. Eur Respir J. 1997;10:2861–2867
  24. Rankin G, Stokes M, Newham DJ. Abdominal muscle size and symmetry in normal subjects. Muscle Nerve. 2006;34:320–326
  25. Springer BA, Mielcarek BJ, Nesfield TK, Teyhen DS. Relationships among lateral abdominal muscles, gender, body mass index, and hand dominance. J Orthop Sports Phys Ther. 2006;36:289–297
  26. Teyhen DS, Miltenberger CE, Deiters HM, et al. The use of ultrasound imaging of the abdominal drawing-in maneuver in subjects with low back pain. J Orthop Sports Phys Ther. 2005;35:346–355
  27. Hides J. Diagnostic ultrasound imaging for measurement of the lumbar multifidus muscle in normal young adults. Physiother Theory Pract. 1992;8:19–26
  28. Kennelly KP, Stokes MJ. Pattern of asymmetry of paraspinal muscle size in adolescent idiopathic scoliosis examined by real-time ultrasound imaging (A preliminary study). Spine. 1993;18:913–917
  29. Pressler JF, Heiss DG, Buford JA, Chidley JV. Between-day repeatability and symmetry of multifidus cross-sectional area measured using ultrasound imaging. J Orthop Sports Phys Ther. 2006;36:10–18
  30. Stokes M, Rankin G, Newham DJ. Ultrasound imaging of lumbar multifidus muscle: normal reference ranges for measurements and practical guidance on the technique. Man Ther. 2005;10:116–126
  31. Van K, Hides JA, Richardson CA. The use of real-time ultrasound imaging for biofeedback of lumbar multifidus muscle contraction in healthy subjects. J Orthop Sports Phys Ther. 2006;36:920–925
  32. Farrar JT, Berlin JA, Strom BL. Clinically important changes in acute pain outcome measures: a validation study. J Pain Symptom Manage. 2003;25:406–411
  33. Jensen MP, Turner JA, Romano JM, Fisher LD. Comparative reliability and validity of chronic pain intensity measures. Pain. 1999;83:157–162
  34. Li L, Liu X, Herr K. Postoperative pain intensity assessment: a comparison of four scales in Chinese adults. Pain Med. 2007;8:223–234
  35. Fairbank JC, Couper J, Davies JB, O'Brien JP. The Oswestry low back pain disability questionnaire. Physiotherapy. 1980;66:271–273
  36. Fritz JM, Irrgang JJ. A comparison of a modified Oswestry Disability Questionnaire and the Quebec Back Pain Disability Scale [published erratum appears in Phys Ther 2008;88:138-9]. Phys Ther. 2001;81:776–788
  37. Mens JM, Vleeming A, Snijders CJ, Koes BW, Stam HJ. Validity of the active straight leg raise test for measuring disease severity in patients with posterior pelvic pain after pregnancy. Spine. 2002;27:196–200
  38. Mens JM, Vleeming A, Snijders CJ, Stam HJ, Ginai AZ. The active straight leg raising test and mobility of the pelvic joints. Eur Spine J. 1999;8:468–473
  39. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428
  40. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–310
  41. Portney LG, Watkins MP. Foundations of clinical research: applications to practice. 3rd ed. Upper Saddle River: Pearson/Prentice Hall; 2008. p. 912
  42. Streiner DL, Norman GR. Health measurement scales: a practical guide to their development and use. New York: Oxford Univ Pr; 2003. p. 296
  43. Eliasziw M, Young SL, Woodbury MG, Fryday-Field K. Statistical methodology for the concurrent assessment of interrater and intrarater reliability: using goniometric measurements as an example. Phys Ther. 1994;74:777–788
  44. Roebroeck ME, Harlaar J, Lankhorst GJ. The application of generalizability theory to reliability assessment: an illustration using isometric force measurements. Phys Ther. 1993;73:386–395; discussion 396–401
  45. Norasteh A, Ebrahimi E, Salavati M, Rafiei J, Abbasnejad E. Reliability of B-mode ultrasonography for abdominal muscles in asymptomatic and patients with acute low back pain. J Bodyw Mov Ther. 2007;11:17–20
  46. Kiesel KB, Underwood FB, Matacolla C, Nitz AJ, Malone TR. A comparison of select trunk muscle thickness change between subjects with low back pain classified in the treatment-based classification system and asymptomatic controls. J Orthop Sports Phys Ther. 2007;37:596–607
  47. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174
  48. Lohr KN, Aaronson NK, Alonso J, et al. Evaluating quality-of-life and health status instruments: development of scientific review criteria. Clin Ther. 1996;18:979–992
  49. Shrout PE. Measurement reliability and agreement in psychiatry. Stat Methods Med Res. 1998;7:301–317
  50. Hides JA, Wong I, Wilson SJ, Belavý DL, Richardson CA. Assessment of abdominal muscle function during a simulated unilateral weight-bearing task using ultrasound imaging. J Orthop Sports Phys Ther. 2007;37:467–471
  • a Sonosite Inc, 21919 30th Dr SE, Bothell, WA 98021.
  • b National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20892.
  • c SPSS Inc, 233 S. Wacker Dr, 11th Fl, Chicago, IL 60606.

 Supported in part by Sonosite Inc, Bothell, WA, by providing the ultrasound machine used in this study at no charge to the Division of Physical Therapy, University of Utah.

 No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit on the authors or on any organization with which the authors are associated.

 Reprints are not available from the author.

PII: S0003-9993(08)01497-4

doi:10.1016/j.apmr.2008.06.022
