Volume 90, Issue 6 , Pages 1048-1054, June 2009
Assessment of Gluteus Maximus Muscle Area With Different Image Analysis Programs
Abstract
Wu GA, Bogie K. Assessment of gluteus maximus muscle area with different image analysis programs.
Objective
To determine the effectiveness of a percutaneous gluteal stimulation system (GSTIM) by comparing assessments of axial computed tomography (CT) scans for the pelvic area.
Design
Comparing the measurements of the cross-sectional area (CSA) of the gluteus maximus muscle between raters and 2 image analysis programs.
Setting
Retrospective axial CT scans of the pelvic area.
Participants
Men (N=9) with complete (below T6) spinal cord injury (SCI) and at least 2 years postinjury participated in the study (range, 29–75y; mean age, 51.8y).
Intervention
Comparing gluteus maximus CSA before and after a period of GSTIM.
Main Outcome Measure
Measurements made by 2 expert and 2 nonexpert raters were used to compare the repeatability and reliability of measuring muscle CSA. The longitudinal study presented is from repeated CT scans obtained over a 2-year period for 1 representative participant who received a GSTIM system.
Results
For repeatability, nonexpert raters measured a mean CSA of 35.2cm² (range, 20–45cm²), while experts measured 21cm² (range, 10–35cm²). A composite of all raters using the same program had SDs of 2.5 to 2.6cm² for a program available through the National Institutes of Health and 2.5 to 4.4cm² for a commercially available program. For reliability, the SDs of the differences between the 2 programs ranged from 2.2 to 3.7cm².
Conclusions
Using the same rater and program (preferably the more reliable ImageJ) is recommended throughout a longitudinal study; otherwise, significant error would be introduced. Furthermore, significant increases in the CSA of gluteal muscle compared with preintervention (baseline) measurements were observed for the participant receiving GSTIM.
Key Words: Electric stimulation, Rehabilitation, Spinal cord injuries
List of Abbreviations: CSA, cross-sectional area; CT, computed tomography; GSTIM, percutaneous gluteal stimulation system; MRI, magnetic resonance imaging; SCI, spinal cord injury
MUSCLE ATROPHY IS CAUSED by paralysis of muscles in patients with complete SCI as a result of loss of the ability to communicate command signals from the central nervous system to muscles below the level of injury. Rapid widespread loss of muscle mass has been found to occur in the first 6 weeks after SCI¹ and continues for up to 18 months postinjury before the muscle mass eventually plateaus. Muscle atrophy leads to an average of 45% to 80% reduction in muscle CSA after SCI.² An estimated 20% to 30% of these patients have a history of pressure ulcers by 5 to 10 years postinjury.³,⁴ The prevalence, together with high economic and sociologic costs of pressure ulcers, demonstrates the importance of identifying risk factors, preventing pressure ulcer occurrence, and developing cost-effective interventions. The finding that a previous incidence of pressure ulcer is the most significant factor in predicting development of future pressure ulcers suggests that preventive methods must be made a priority.⁵
The use of GSTIM can provide a means for varying the pressure under bony prominences and minimize the loss of muscle bulk. While sitting, GSTIM can contract 1 side of the buttocks at a time, simulating weight-shifting strategies and redistributing pressure.⁶,⁷ Tissue viability is improved when mechanical occlusion of blood flow is alleviated, increasing blood and lymph flow. Additionally, the repeated rhythmic muscle contractions may act as a pump, increasing oxygen supply in the muscle tissue.⁸,⁹ Longer-term studies have shown that continued use produces an increase in weight-shifting efficacy.¹⁰
Minimizing the loss of muscle bulk also decreases the risk of pressure ulcer development. It has been shown that electrical stimulation of paralyzed muscles affected by SCI can increase muscle bulk,¹¹ condition the muscle, and decrease fatigability.¹² Previous studies have shown an increase in gluteal lean muscle mass after 6 months of stimulation in patients with acute SCI.¹¹ CSA measurements of stimulated muscle were used as an indication of the effectiveness of stimulation.
The primary goal of the current study was to evaluate 2 image analysis techniques in CSA measurements of the gluteal region anatomy. The hypothesis was that measurements from 2 different image analysis programs and different raters were repeatable and reliable. The secondary goal was to determine changes over time in gluteal muscle CSA with use of the GSTIM. Because of the long-term nature of the study, it was also important to minimize the number of scans and confine them to the specific area of interest in order to minimize radiation exposure. CT imaging techniques met these goals and minimized the risk of compromising the GSTIM.
Methods
Intervention and Assessment
Nine men with complete SCI below the T6 level participated in the study. Ages ranged from 29 to 75 years old, and the mean age was 51.8. Participants were at least 2 years postinjury. Three participants received GSTIM, while 6 others served as controls.
CT imaging was used as an outcomes measure to monitor changes in the CSA of the gluteus maximus muscle caused by long-term use of GSTIM. Two image analysis software programs to measure the cross-sectional area were evaluated. The goals of this evaluation study were to determine the reliability and repeatability of measurements taken by different raters using either image software program. Two nonexpert raters carried out measurements for all 9 patients, while 2 nonexpert and 2 expert raters performed repeated measurements for 1 research participant.
For participants receiving the GSTIM system, a baseline CT scan was taken shortly after implantation. The research participants initially underwent conditioning stimulation for 2 to 3 months in order to build up the fatigue resistance of the muscles slowly. The participants then started the dynamic stimulation phase using GSTIM daily. The frequency of weight shifting recommended for wheelchair users at risk of tissue breakdown was approximated by applying stimulation for a 3-minute period, with a 17-minute interstimulation interval, giving an overall 20-minute pattern. Alternating stimulation was provided to the left and right gluteal muscles with a duty cycle of 15 seconds on and 15 seconds off; more specifically, stimulation was applied so that while 1 muscle (left) was being stimulated, the other (right) was off. The stimulation activity was then reversed (left off/right on), leading to a 50% active duty cycle for each muscle. CT scans were taken at regular intervals during the 3 years postimplantation. The stimulation protocol has previously been described in detail by Bogie et al.¹⁰
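The timing arithmetic in this protocol can be sketched in a few lines of Python. This is only an illustration of the pattern as described in the text, not the stimulator's actual control logic. The function and constant names are hypothetical, and the choice that the left side fires first in each bout is an assumption made for the example.

```python
# Timing values come from the protocol description; all names are illustrative.
BOUT_S = 3 * 60    # stimulation bout length (s)
REST_S = 17 * 60   # interstimulation interval (s)
ON_S = 15          # per-side on time within a bout (s)
OFF_S = 15         # per-side off time within a bout (s)


def duty_cycles(bout_s=BOUT_S, rest_s=REST_S, on_s=ON_S, off_s=OFF_S):
    """Return (within-bout, overall) duty cycle for one muscle as fractions."""
    within_bout = on_s / (on_s + off_s)
    overall = within_bout * bout_s / (bout_s + rest_s)
    return within_bout, overall


def active_side(t_s, bout_s=BOUT_S, rest_s=REST_S, on_s=ON_S):
    """Return 'left', 'right', or None (rest) at t_s seconds into the pattern.

    ASSUMPTION: on and off times are equal and the left side fires first.
    """
    t = t_s % (bout_s + rest_s)
    if t >= bout_s:
        return None  # inside the 17-minute interstimulation interval
    return "left" if (t // on_s) % 2 == 0 else "right"
```

Under these assumptions, each muscle is active for half of each 3-minute bout, which corresponds to 7.5% of the full 20-minute pattern.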
Retrospective spiral CT scans of the pelvic area were analyzed. Scans had a resolution of 0.5 to 1mm in the axial x-direction and y-direction and 5 to 12mm in the z-direction (slice thickness). Baseline scans were obtained before use of GSTIM with up to 3 further scans obtained over a period of 2 to 36 months for each research participant. Scan locations were determined by anatomical landmarks (fig 1): between the second and third sacral notch (C1), where the caudal head of the femur starts to appear (C2), and where the lateral aspect of the greater trochanter is most prominent (C3). These 3 scan locations were obtained for each participant at each time point. Images were then deidentified and randomized prior to taking measurements.

Fig 1.
CT scan image locations used for assessments. (A), C1: between the second and third sacral notch; (B), C2: caudal head of the femur appears; (C), C3: greater trochanter is most prominent. NOTE. The outlined dotted area is the gluteus maximus. The inset 6 × 6cm square at top left was used as the reference for VeVMD measurements.
Image Analysis Techniques
ImageJᵃ is an image processing and analysis program written in Java for operating systems such as Linux, Windows, and Mac OS X. It is provided free of charge by the National Institutes of Health. ImageJ was used to measure muscle area by outlining the muscle on the image. ImageJ automatically derived the pixel-to-square-millimeter conversion from the information stored in the image. Image editing to ease visualization of the muscle was performed when necessary. The freehand selection tool was used to outline the gluteus maximus muscle. The measure tool was then used to calculate the area within the outline. The muscle outlines were saved separately from the image for future reference. The outline could not be modified after being saved, and measurements were repeated on the original image. Measurement values were saved in square centimeters to a Microsoft Excelᵇ spreadsheet.
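The conversion from an outlined region to a physical area can be illustrated with a small Python sketch. It is not ImageJ's implementation. It assumes the outline is available as a list of (x, y) pixel vertices and that the pixel spacing in millimeters is known from the scan. The function names are hypothetical.

```python
def polygon_area_px(vertices):
    """Shoelace area, in square pixels, of a closed outline of (x, y) vertices."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def area_cm2(vertices, spacing_x_mm, spacing_y_mm):
    """Convert an outline in pixel coordinates to an area in cm^2."""
    area_mm2 = polygon_area_px(vertices) * spacing_x_mm * spacing_y_mm
    return area_mm2 / 100.0  # 100 mm^2 per cm^2


# Example: a 100 x 100 pixel square at 0.5 mm in-plane spacing.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
example_cm2 = area_cm2(square, 0.5, 0.5)
```

With the in-plane resolution of 0.5 to 1mm reported for these scans, one pixel covers 0.25 to 1mm², so small differences in where the outline is drawn add up to only fractions of a square centimeter per boundary pixel row.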
VeVMDᶜ is an image analysis program primarily used for wound measurement, assessment, documentation, tracking, and outcomes. The program calibrates wound area using a 6 × 6cm white template placed in the field of view. In order to process CT images for measurement by VeVMD, Adobe Photoshop CS2ᵈ was used to create a template. This was merged onto the CT axial image to provide the VeVMD software with a reference for pixel to square centimeter conversion. Adjustment of contrast and brightness was performed when necessary using other image editing software. After image preparation, measurements were taken using the outline tool. The outline was saved for future reference. VeVMD allows storage of images for the same patient in the same batch so that one can easily refer to them. It is also possible to edit the measurements after saving. Measurement values were then saved to a Microsoft Excel spreadsheet.
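Template-based calibration of this kind reduces to a single scale factor. The Python sketch below shows the arithmetic under the assumption that both the template and the muscle outline are available as pixel counts. It does not reproduce VeVMD's internal method, and all names are hypothetical.

```python
TEMPLATE_AREA_CM2 = 36.0  # 6 cm x 6 cm reference square


def cm2_per_pixel(template_pixel_count):
    """Scale factor implied by a template covering template_pixel_count pixels."""
    if template_pixel_count <= 0:
        raise ValueError("template must cover at least one pixel")
    return TEMPLATE_AREA_CM2 / template_pixel_count


def calibrated_area_cm2(outline_pixel_count, template_pixel_count):
    """Area of an outlined region, calibrated against the template."""
    return outline_pixel_count * cm2_per_pixel(template_pixel_count)


def relative_area_error(true_side_px, drawn_side_px):
    """Relative error in reported area when the template side is drawn with
    drawn_side_px pixels instead of the correct true_side_px pixels."""
    return (true_side_px / drawn_side_px) ** 2 - 1.0
```

Because area scales with the square of the template side length, a template drawn about 2.5% too long (123 pixels instead of 120) would lower every reported area by roughly 5%. This illustrates why a manually placed template is a potential source of error.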
Evaluation
In order to test the repeatability (accuracy) of measurements, 3 repeated measurements were tested against the mean of the measurements for all raters. The repeatability was assessed by the SD from the mean. The reliability of measurements was assessed using Bland-Altman analysis, which compares 2 highly correlated outcome measures and graphically represents rater bias and variance¹³ by plotting the difference between the individual measures against the average of both measures.
The precision of each rater between image analysis programs was tested using the methods described by Shoukri and Edge.¹⁴ The statistic compares the overall variance ($\sigma^2$) to the variance of each method ($\sigma^2_{\mathrm{ImageJ}}$, $\sigma^2_{\mathrm{VeVMD}}$) and compares the ratio of the variances ($Q=\sigma^2_{\mathrm{ImageJ}}/\sigma^2_{\mathrm{VeVMD}}$) to $\sigma^2$. The statistic is similar in principle to nonparametric statistics. A value of 1 or –1 within the confidence interval of $Q$ indicates comparable precision of the 2 programs.
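Both evaluation steps can be sketched with the Python standard library. This is an illustrative reading of the description, not the authors' analysis code. It assumes repeatability is the SD of each repeated measurement's deviation from the mean for that muscle, and it places the Bland-Altman limits at 2 SDs to match the figure caption. The function names are hypothetical.

```python
from statistics import mean, stdev


def repeatability_sd(repeats_by_muscle):
    """SD of the deviations of repeated measurements from each muscle's mean.

    repeats_by_muscle: list of lists, one inner list of repeats per muscle.
    """
    deviations = []
    for repeats in repeats_by_muscle:
        m = mean(repeats)
        deviations.extend(r - m for r in repeats)
    return stdev(deviations)


def bland_altman(method_a, method_b, k=2.0):
    """Bias and limits of agreement for paired measurements from 2 methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2.0 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,           # x-axis of the Bland-Altman plot
        "diffs": diffs,           # y-axis of the Bland-Altman plot
        "bias": bias,             # mean difference between methods
        "sd": sd,                 # SD of the differences
        "lower": bias - k * sd,   # lower limit of agreement
        "upper": bias + k * sd,   # upper limit of agreement
    }
```

A negative bias for ImageJ minus VeVMD would mean that VeVMD tends to report the larger area. A slope in the differences when plotted against the means would indicate that the disagreement depends on muscle size.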
All 4 raters analyzed a longitudinal series with 4 time points from 1 participant obtained over a period of 2 years. CT scans were obtained at T0, baseline (immediately after implantation of GSTIM); T1, 6 months; T2, 12 months; and T3, 24 months postimplantation. A total of 4 raters evaluated the scans: 2 experts (radiologists) and 2 nonexperts (university students). In order to determine intrarater reliability, each rater repeated the analysis of all images 3 times for each program at intervals of 1 to 2 weeks. The 2 nonexpert raters also analyzed the remaining 8 participants following the same protocol for periods of 3 to 36 months.
General factorial designs were used to test the significance of the following factors in the measurements: rater, image analysis program, longitudinal measures over time (longitudinal), side of the body (side), and cross-sectional imaging location (ImageLoc). The Student t test and Tukey pairwise comparisons were used to compare the difference in measurements of significant factors, specifically to distinguish separate levels for each factor. For example, different time points in the longitudinal factor were compared to see whether the CSA of muscle changed significantly or stayed the same at each time point.
Results
The boundaries of the muscle of interest are hard to find, particularly in patients with muscle atrophy where the fascia is not as prominent, and can lead to widely varying repeated measurements. The differences in repeated measurements as seen in the histogram (fig 2) were centered about 0 and had different SDs for each rater, as seen in table 1. Measurement differences between raters showed an SD of 2.5 to 2.6cm² with ImageJ and an SD of 2.5 to 4.4cm² with VeVMD.

Fig 2.
Histograms showing differences in repeated measurements to the mean measurement for each muscle using (A) ImageJ and (B) VeVMD. — — — R1, – – – – R2, ------ R3, —·—· R4.
Table 1. SDs of the Repeated Measurements for Each Image Analysis Program According to Rater (N=72)
| Expertise | Rater | ImageJ SD (cm²) | VeVMD SD (cm²) |
|---|---|---|---|
| Nonexpert rater | 1 | 2.52 | 4.42 |
| Nonexpert rater | 2 | 2.62 | 3.63 |
| Expert rater | 3 | 2.63 | 2.73 |
| Expert rater | 4 | 2.49 | 2.54 |
The Bland-Altman analysis (fig 3) showed that nonexpert raters had interprogram measurement differences that increased with measurement size, while expert raters were not affected by measurement size with a bias of –1.6 to 0.6. VeVMD measurements are frequently larger than ImageJ measurements, with a trend line less than 0 for all but 1 rater (rater 3). Differences between the 2 analysis programs were found to have SDs ranging from 2.2 to 3.7cm² for all raters (table 2). Measurement means are higher for nonexpert raters (raters 1 and 2) than expert raters (raters 3 and 4). Measurement differences between programs were significant for all raters, except for 1 expert (rater 3). The Shoukri and Edge¹⁴ analysis showed that precision of the 2 programs was comparable for expert raters but not for nonexpert raters (table 3).

Fig 3.
Bland-Altman analysis for each rater between image analysis programs. The analysis compares the difference in measurement to the mean measurement. Y describes the trendline for each rater. Dotted lines signify 2 SDs (2×SD=6.51, 7.42, 5.73, and 4.21, respectively). (A) •R1 (circle), (B) ■R2 (square), (C) ♦R3 (diamond), (D) ▲R4 (triangle).
Table 2. Measurement Means for Each Image Analysis Program
| Expertise | Rater | ImageJ Mean | VeVMD Mean | ImageJ – VeVMD | SD of Mean Difference | t | 95% CI | P |
|---|---|---|---|---|---|---|---|---|
| Nonexpert rater | 1 | 31.44 | 32.85 | –1.42 | 3.25 | –2.13 | –2.79 | <.05 |
| Nonexpert rater | 2 | 37.17 | 39.70 | –2.53 | 3.71 | –3.34 | –4.09 | <.05 |
| Expert rater | 3 | 23.05 | 22.52 | 0.54 | 2.88 | 0.91 | –0.68 | NS |
| Expert rater | 4 | 23.07 | 24.64 | –1.56 | 2.20 | –3.48 | –2.49 | <.05 |
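The t statistics in table 2 can be approximately reproduced from the tabulated mean differences and SDs with a paired t test. The Python sketch below is a reader's check, not the authors' analysis. The number of pairs per rater is not stated in the table. The value n=24 is an assumption, based on 4 time points, 3 slice locations, and 2 sides for the 1 participant measured by all raters. The critical value 2.069 is the two-sided 95% Student t quantile for 23 degrees of freedom. All names are hypothetical.

```python
from math import sqrt

N_PAIRS = 24             # ASSUMPTION: 4 time points x 3 locations x 2 sides
T_CRIT_95_DF23 = 2.069   # two-sided 95% Student t critical value, 23 df

# Rater -> (mean difference ImageJ - VeVMD, SD of difference), from table 2.
TABLE2 = {1: (-1.42, 3.25), 2: (-2.53, 3.71), 3: (0.54, 2.88), 4: (-1.56, 2.20)}


def paired_t_summary(mean_diff, sd_diff, n=N_PAIRS, t_crit=T_CRIT_95_DF23):
    """Return (t statistic, (CI lower, CI upper)) from paired summary values."""
    se = sd_diff / sqrt(n)  # standard error of the mean difference
    t = mean_diff / se
    return t, (mean_diff - t_crit * se, mean_diff + t_crit * se)


results = {rater: paired_t_summary(*vals) for rater, vals in TABLE2.items()}
```

Under this assumption, the computed t values fall within rounding distance of the tabulated ones. The single value in the 95% CI column is close to the computed lower confidence limit for each rater, which suggests, without confirming, that the upper limits were lost from the table. Only the interval for rater 3 spans 0, which agrees with the NS entry.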
Table 3. Shoukri and Edge¹⁴ Analysis Test for Differences in Precision of Image Analysis Programs
| Expertise | Rater | $\sigma^2$ | $\sigma^2_{\mathrm{ImageJ}}$ | $\sigma^2_{\mathrm{VeVMD}}$ | t | $Q=\sigma^2_{\mathrm{ImageJ}}/\sigma^2_{\mathrm{VeVMD}}$ |
|---|---|---|---|---|---|---|
| Nonexpert rater | 1 | 36.31 | 26.51 | –5.33 | 2.0* | –4.98 |
| Nonexpert rater | 2 | 21.77 | –3.01 | 30.53 | –2.36* | –0.099 |
| Expert rater | 3 | 40.79 | 10.14 | 6.27 | 0.24 | 1.62 |
| Expert rater | 4 | 26.47 | 4.82 | 4.02 | 0.08 | 1.199 |
*P<.05.
A general linear model and Tukey comparison test were used to assess the significance of these factors for the measurements. Both the main effects and interaction plot (fig 4) showed significant differences in raters (P<.01), except between the 2 expert raters.

Fig 4.
Longitudinal study for 1 patient. (A) Main effects and (B) interaction plot for 4 raters and 2 different programs over a period of 24 months. •R1 (circle), ■R2 (square), ♦R3 (diamond), ▲R4 (triangle), ○ImageJ (empty circle), ☐VeVMD (empty square).
In figure 5, the muscle CSA increased compared with baseline. Nonexpert raters had larger measurements than expert raters. Expert raters observed statistically significant differences compared with baseline using both programs (see fig 5A, 5B), while significant differences were seen in ImageJ by 1 of the 2 nonexperts (see fig 5A). The differences in measurements between programs were statistically significant (P<.05), with VeVMD (program 2) giving a larger measurement than ImageJ (program 1).

Fig 5.
Interval plot of measurements using (A) ImageJ and (B) VeVMD. Interval bars represent 95% confidence interval for the mean. Measurements of gluteus maximus muscle at the slice where the caudal head of femur appears (C2) on the right side. Tables show significance of measurement differences between time points. Abbreviation: NS, not significant. *P<.05; **P<.01. •R1 (circle), ■R2 (square), ♦R3 (diamond), ▲R4 (triangle).
Longitudinal analysis for 1 research participant receiving GSTIM over a 2-year period postimplantation showed increased muscle CSA over time (see figs 4, 5). The Tukey comparison test showed that the difference in longitudinal measurements was significant only compared with baseline (P<.05).
Examination of the results for the whole study group showed that participants who did not receive a GSTIM system—that is, controls—showed no significant change in muscle CSA from baseline to the last measurement, although there were slight variations within these time points. For the 3 participants who received GSTIM systems, 1 patient had comparable results to those presented, while the other participants showed no significant change in muscle CSA while GSTIM was actively used but had decreased CSA when GSTIM use was stopped.
Discussion
There are several significant risk factors for the development of pressure ulcers after SCI. Increased pressures over bony prominences are a primary risk factor. Muscle loss leads to a reduced contact area for distribution of applied loads on the skin and soft tissues while sitting or lying, leading to increasingly high pressures. The lack of proprioception and inability to reposition in most people with SCI also prolongs this application of increased pressures. Decreased blood flow further compromises the cycling of vital nutrient intake and elimination of toxins in the muscle.¹⁵ When a person repositions, care must be exercised to prevent soft tissue damage caused by shear and friction. In addition, the skin must be checked regularly for wounds and discoloration. Skin hygiene must be maintained because incontinence and abnormal sweating adversely affect the skin health. GSTIM can be used to stimulate and condition the gluteus maximus muscle, providing dynamic weight-shifting in preprogrammed intervals. Stimulating muscle contractions with GSTIM may also prevent or attenuate the muscle atrophy seen in patients with SCI.
This study addresses the assessment of muscle CSA using retrospective CT scans as a measure for the effectiveness of GSTIM to increase muscle mass. Raters used 2 different image analysis programs to measure gluteus maximus CSA in participants with SCI, some of whom received regular GSTIM.
Resolution of CT images is in many cases comparable to MRI, and the technique does not have any contraindications for patients with implants. Furthermore, the technique is widely available and more economical than MRI. CT produces images by measuring the attenuation of x-rays that are sent through the body. Different tissue types attenuate the signal differentially, and thus fat-free skeletal muscle can be assessed very effectively. Both MRI and CT give a direct visualization of the muscle CSA and can be used to estimate the mass of skeletal muscle.¹⁶
Using the same rater will effectively minimize the large interrater differences but may limit researchers to shorter studies or to retrospective scans. The Bland-Altman analysis showed that smaller areas tend to have greater uniformity of measurement. Expert raters were also found to obtain smaller but similar measurements, suggesting that rater training and well-defined criteria for what constitutes the boundaries of cross-sectional muscle area will improve repeatability. Other factors such as prior familiarity with these types of images and rigor of training may also differentiate radiologists from nonexpert users. In future studies, computer-aided diagnosis software may be used to outline the boundaries of the specific muscle of interest. Techniques such as increasing the high frequency content of the image to give contrast to the boundaries of muscles, seeding, and watershed techniques to separate regions of interest may be used. Computer-aided diagnosis may prove useful in further improving reliability and reproducibility of muscle CSA measurements. However, a human rater, preferably expert or well trained, would still need to approve the assessment to prevent errors such as including other muscles in the measurement.
A patient with disuse muscle atrophy typically exhibits higher lipid content in the atrophied muscles. Therefore, not only does the CSA decrease over time but also the relative percentages of fat and fat-free tissue in the muscle may change significantly. This issue was not addressed in this current study; in the future, we may be able to use intrinsic features in CT scans to address the issue of fat content because fat and fat-free muscle exhibit different gray values. The relative percentage of fat in the muscle may be assessed by the gray values that are exhibited within the muscle boundary.¹⁷
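The gray-value idea can be sketched in Python as a simple classification of the pixels inside a muscle outline. This was not performed in the study and is only an illustration. It assumes the gray values are available in Hounsfield units (HU). The ranges used here (about –190 to –30 HU for fat and 0 to 100 HU for muscle) are commonly cited values adopted as assumptions, not values taken from this article. All names are hypothetical.

```python
FAT_HU = (-190, -30)   # ASSUMPTION: commonly cited adipose tissue range
MUSCLE_HU = (0, 100)   # ASSUMPTION: commonly cited skeletal muscle range


def tissue_fractions(hu_values, fat_range=FAT_HU, muscle_range=MUSCLE_HU):
    """Fraction of pixels inside an outline classified as fat, muscle, or other."""
    n = len(hu_values)
    if n == 0:
        raise ValueError("no pixels inside the outline")
    fat = sum(1 for v in hu_values if fat_range[0] <= v <= fat_range[1])
    muscle = sum(1 for v in hu_values if muscle_range[0] <= v <= muscle_range[1])
    return {
        "fat": fat / n,
        "muscle": muscle / n,
        "other": (n - fat - muscle) / n,  # values outside both ranges
    }


# Example with made-up pixel values from inside a hypothetical outline.
example = tissue_fractions([-100, -50, 40, 60, 50, -10, 45, 55, 30, 20])
```

Multiplying the muscle fraction by the outlined CSA would give a rough fat-free muscle area, which could separate true gains in muscle from changes in fat content within the same boundary.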
VeVMD is designed to take measurements of wounds and pressure ulcers and thus has a very robust and flexible data management and handling protocol. However, VeVMD requires a square template of known size to quantify the image, presenting a potential source of error not necessarily found with ImageJ, which automatically reads the pixel conversion information from the CT scans. ImageJ also has more flexibility in image processing and analysis capabilities, such as the potential to apply a threshold based on gray levels. This capability is of value because different tissue types have different ranges of gray levels, and thus ImageJ can be used to segment and choose the muscle of interest via computer algorithms as described.
Conclusions
Either image analysis program can produce muscle measurements where interrater variability is minimized and measurements are similar (see tables 1, 3), but the variability in switching between different programs is significant (see table 2). Nevertheless, the interrater SDs are smaller for ImageJ (see table 1). These findings imply that using 1 program, preferably ImageJ, will be more appropriate for repeatable and reliable muscle measurements.
Longitudinal studies for 1 research participant showed that muscle CSA changes compared with baseline were significant (see fig 4B). These results correspond to similar results of statistically significant decreases in ischial region pressures over time with baseline/postintervention comparisons of sitting interface pressures for gluteal stimulation system users.¹⁸ The error associated with using different image analysis programs and raters may mask or dramatically increase the change in measurements (see fig 4A). It is suggested that longitudinal studies be performed using the same program and same rater to decrease interrater and interprogram errors.
This article presents development of a methodology to observe muscle CSA changes using an image analysis program. Accurately assessing the CSA of muscle has several important applications in rehabilitation, physiology, nutrition, and clinical medicine. Significant increases in CSA of gluteal muscle receiving GSTIM compared with baseline measurements were observed. These findings confirm our previously observed significant changes in pressure relief in these subjects. The increase in gluteal muscle CSA provides improved pressure distribution and implies that GSTIM is an effective intervention for preventing pressure ulcers.
Suppliers
Acknowledgments
We thank the veterans of the Louis Stokes Cleveland Department of Veterans Affairs Medical Center (Wade Park Division), without whom this study would not have been possible. Thanks to Chester Ho, MD, for his professional guidance. Thanks to Xiaofeng Wang, PhD, Steven Sidik, PhD, and Monique Washington for statistical expertise. Thanks to Hossam K. Saad, MD, Paul Rochon, MD, and Jonathan Olbrych as raters. Thanks to Nannette Alvarado, MD, Craig R. George, MD, and Ronald Lew, MD, and the Imaging Department for expertise and medical images. Thanks to Jonathan Sakai, Patricia Banks, Christine Wu, and Arden Bartlett for suggestions, ideas, and technical support.
References
1. Skeletal muscle fibre type transformation following spinal cord injury. Spinal Cord. 1997;35:86–91.
2. Influence of complete spinal cord injury on skeletal muscle cross-sectional area within the first 6 months of injury. Eur J Appl Physiol. 1999;80:373–378.
3. Rehabilitation research and training center in community-oriented services for persons with spinal cord injury: a progress report. Houston: Baylor College of Medicine, Institute for Rehabilitation and Research; 1991.
4. Secondary conditions following spinal cord injury in a population based sample. Spinal Cord. 1998;36:45–50.
5. Late-life spinal cord injury and aging with a long term injury: characteristics of two emerging populations. J Spinal Cord Med. 1995;18:183–193.
6. Electrical muscle stimulation for pressure variation at the seating interface. J Rehabil Res Dev. 1989;26:1–8.
7. Electrical muscle stimulation for pressure sore prevention: tissue shape variation. Arch Phys Med Rehabil. 1990;71:210–215.
8. Blood flow in the gluteus maximus of seated individuals during electrical muscle stimulation. Arch Phys Med Rehabil. 1990;71:682–686.
9. Transcutaneous oxygen tension in subjects with paraplegia with and without pressure ulcers: a preliminary report. J Rehabil Res Dev. 1999;36:202–206.
10. Long-term prevention of pressure ulcers in high-risk patients: a single case study of the use of gluteal neuromuscular electric stimulation. Arch Phys Med Rehabil. 2006;87:585–591.
11. Muscle atrophy is prevented in patients with acute spinal cord injury using functional electrical stimulation. Spinal Cord. 1998;26:463–469.
12. Long lasting muscle trophism in complete upper motor neuron lesion paraplegia. Basic Appl Myol. 2005;15:191-20.
13. Validation of image segmentation by estimating rater bias and variance. Med Image Comput Comput Assist Interv Int Conf Med Image Comput Comput Assist Interv. 2006;9:839–847.
14. Statistical methods for health sciences. Boca Raton: CRC Pr; 1996.
15. Pressure ulcer treatment: a competency-based curriculum. http://www.npuap.org/PDF/treatment_curriculum.pdf. 2001. Accessed May 2, 2008.
16. Cadaver validation of skeletal muscle measurement by magnetic resonance image and computerized tomography. J Appl Physiol. 1998;85:115–122.
17. Skeletal muscle attenuation determined by computed tomography is associated with skeletal muscle lipid content. J Appl Physiol. 2000;89:104–110.
18. A new technique for real-time interface pressure analysis: getting more out of large image datasets. J Rehabil Res Dev. 2008;45:523–535, 10 p following 535.
- a Version 1.37; National Institutes of Health, Bethesda, MD. http://rsb.info.nih.gov/ij/.
- b Microsoft Corp, One Microsoft Way, Redmond, WA 98052.
- c Version 1.1.14; VeVMD c/o Vistamedical, Unit #3, 55 Henlow Bay, Winnipeg, MB, Canada.
- d Version 9.0.2; Adobe Systems Inc, 345 Park Ave, San Jose, CA 95110-2704.
Supported by the Veterans Administration Rehabilitation Research and Development Service (grant no. B4664).
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit on the authors or on any organization with which the authors are associated.
PII: S0003-9993(09)00141-5
doi:10.1016/j.apmr.2008.12.009
Published by Elsevier Inc.
