Volume 88, Issue 1, Pages 37-42, January 2007
Influence of Lever Arm and Stabilization on Measures of Hip Abduction and Adduction Torque Obtained by Hand-Held Dynamometry
Abstract
Krause DA, Schlagel SJ, Stember BM, Zoetewey JE, Hollman JH. Influence of lever arm and stabilization on measures of hip abduction and adduction torque obtained by hand-held dynamometry.
Objective
To examine the reliability of clinical techniques for testing hip abductor and adductor muscle performance.
Design
Repeated measures.
Setting
Academic laboratory.
Participants
A sample of 21 healthy subjects (12 men, 9 women) between 22 and 31 years of age.
Interventions
Not applicable.
Main Outcome Measures
Reliability of repeated measures was estimated by calculating intraclass correlation coefficients. Torque production capability was calculated by multiplying force output obtained with a hand-held dynamometer by the length of the resistance lever arm.
Results
The reliability of abduction testing was greatest in the long-lever condition. Adduction test reliability was greatest in the long-lever condition with bench stabilization. The maximal hip abduction torque tested in the long-lever position was significantly greater (t20=9.21, P<.001) than that in the short-lever position. The maximal hip adduction torque occurred using a long lever for resistance application and a bench to stabilize the nontest leg (F1,20=15.64, P=.001).
Conclusions
Muscle performance testing of hip abductors and adductors with a hand-held dynamometer can be performed with good to excellent intratester and intertester reliability. Hip abduction testing is best performed with a long lever. Hip adduction is best performed with a long lever and a bench to stabilize the nontest extremity.
Key Words: Hip, Muscle, Rehabilitation, Reliability and validity
THE MANUAL MUSCLE TEST (MMT) and hand-held dynamometry (HHD) are common tools used to clinically assess muscle performance. The MMT originated with Wilhelmine Wright and Robert W. Lovett, MD, who used the techniques to test the effects of polio.1 Since the initial introduction of the procedures, the MMT and HHD have become widely used in the clinical setting with a variety of patient and client populations.2, 3, 4, 5, 6, 7, 8, 9
With the MMT, the subjective assessment of muscle performance is given a grade to quantify strength. Grading systems are based on the ability of muscles to function against the force of gravity and withstand additional force provided by an examiner. On a 5-point grading scale, a grade of 3 (fair) represents the ability to function against the force of gravity without any additional external resistance and 5 (normal) represents the ability to function against gravity and resist maximal force provided by an examiner.1, 10 Although the use of MMT grades provides a quick means of recording performance, limitations with their use include limited intertester reliability and sensitivity. Frese et al11 investigated manual muscle testing of the medial trapezius and gluteus medius muscles. They found low interrater reliability with the tests, and questioned the use of MMT grades for making accurate clinical assessments. Escolar et al12 likewise found that poor reliability with multiple examiners compromises the clinical usefulness of the MMT. Knepler and Bohannon13 reported that individual testers were able to graduate resisting forces that were applied with grades above 3 (fair); however, variability among the testers resulted in the recommendation that different testers should not be used when monitoring and making clinical judgments about a patient’s strength in grades above 3 (fair).
In response to the limitations of the MMT, HHD has been gaining in popularity.14, 15 Its advantages include a quick and inexpensive means of providing objective values in the clinical setting compared with the subjective MMT grades. In a study measuring knee extension using the MMT and HHD, Bohannon and Corrigan16 found a wide range of knee extension forces associated with an MMT grade of normal. Likewise, Schwartz et al17 reported that manual muscle testing when compared with HHD is more sensitive for grades less than 4. These studies support the clinical use of HHD to detect improvements in strength in the normal and near-normal ranges, which, because of limited sensitivity, may go undetected with the MMT.
Problems reported with MMT reliability can also influence HHD. Agre et al18 used HHD to test upper- and lower-extremity muscle groups. Correlation coefficients between examiners for lower-extremity testing were much lower than those for upper-extremity testing. For example, the reported mean interexaminer r for hip abduction was .74, whereas for shoulder flexion it was .94. Although the use of correlation coefficients does not provide a direct measure of reliability, a poor correlation coefficient represents a poor level of association between repeated tests. The investigators concluded that HHD was unacceptable for lower-extremity testing. Explanations for this conclusion included difficulty with stabilization of the lower extremities and the strength of the tested muscle groups relative to the strength of the examiner. Stabilization and the strength of muscle groups such as the hip musculature have also been advanced as problematic by other investigators.16, 17, 18, 19
When examining muscle performance, either manually or with HHD, resistance may be applied with either a short or long lever. For most tests a short lever is used. When testing hip abductors, the use of a long lever with resistance applied proximal to the ankle is advocated, because it allows examiners sufficient leverage to overcome the force production capacity of this muscle group.10 In addition, it is proposed that the resultant resistance with a long lever is more representative of the functional demands imposed on the abductors.1, 10 To our knowledge, the effects of lever arm on torque production and reliability have not been investigated for hip adductor testing.
This study investigates the effects of lever arm and stabilization on reliability and torque production when testing the muscle performance of hip abductors and adductors in the clinical setting. Given the testing problems reported with the force-producing capacity of these muscle groups compared with examiner strength, we hypothesized that the reliability and resultant torque production will be the greatest with a long lever when testing the hip abductors and with a long lever with additional stabilization when testing the hip adductors. Information obtained with this study will provide evidence for preferred clinical techniques to test hip muscle performance.
Methods
Participants
Twenty-one healthy subjects (12 men, 9 women) were recruited from Mayo Clinic College of Medicine, Rochester, MN. The age range was 22 to 31 years. Criteria for selection of subjects included a normal grade (5) for both hip abduction and adduction. This was determined by a subjective assessment of an MMT performed by examiners. A normal grade represented the ability of a muscle or groups of muscles to function against gravity and resist strong pressure provided by an examiner.10 Exclusionary criteria were a history of hip or knee injuries or pathology on the tested lower extremity. The lower extremity used to kick a ball was tested. All subjects were informed of their rights and consented to participation in the study, which was approved by the institutional review board at Mayo Clinic.
Procedure
Anthropometric measurements were taken including height, mass, and lever arm length. All measurements and tests were conducted by 2 female, second-year, physical therapy students (examiners A, B) under the direct supervision of a board-certified orthopedic physical therapist with over 20 years of clinical experience. Men were 24±3 years in age, had a height of 183±4cm, and had a body mass of 94±11kg. Women were 25±3 years in age, had a height of 168±6cm, and had a body mass of 67±15kg. The short lever arm length was measured from the superior aspect of the greater trochanter to 7cm above the lateral joint line of the knee. Men had an average short-lever length of 36±3cm, and women had an average of 34±3cm. The long lever arm was measured from the superior aspect of the greater trochanter to 5cm above the lateral malleolus. Men had an average long-lever length of 82±4cm, and women had an average of 77±6cm. The distal references were sites used for the application of resistance with HHD. A reference mark was made on the skin for consistent placement of the dynamometer. The superior aspect of the greater trochanter was used as an estimate of the frontal plane axis of rotation at the hip joint.
Six different testing positions were performed. Each test was administered 3 times on each subject. The sequence of the tests was randomized. Examiner A tested each position twice to examine intrarater reliability, and examiner B tested each position once to examine intertester reliability compared with examiner A. The first test performed by examiner A was used for comparison with examiner B. Examiner A was blinded to the results of all tests. Examiner B was blinded to examiner A’s results. Examiners applied force that was sufficient to overcome subject-generated force in each testing condition. There was a minimum of a 1-minute rest period between every test. Test results were recorded using the MicroFET2 hand-held dynamometer.a
Hip abduction was tested with each subject in the side-lying position with the upper thigh positioned at approximately 30° of abduction. The hips were in neutral flexion, extension, and rotation. Stabilization at the pelvis was provided by examiners to limit rotation of the pelvis and trunk in the transverse plane and so avoid unwanted muscular substitution. Tests were performed with the dynamometer placed at both the short- and long-lever positions as described earlier (Fig 1, Fig 2).

Fig 1.
Abduction short-lever MMT with a hand-held dynamometer showing dynamometer placement and patient stabilization.

Fig 2.
Abduction long-lever MMT with a hand-held dynamometer showing dynamometer placement and patient stabilization.
Hip adduction with manual stabilization was tested with each subject in the side-lying position and an examiner cradling the nontest extremity. Subjects provided additional stabilization by holding on to the side of the table. The pelvis was in neutral sagittal plane tilt with the hip in neutral flexion, extension, and rotation. Each subject actively adducted the test extremity. Tests were performed with long and short lever arms as previously described (Fig 3, Fig 4).

Fig 3.
Adduction short-lever MMT with a hand-held dynamometer showing dynamometer placement and manual stabilization of the nontest leg.

Fig 4.
Adduction long-lever MMT with a hand-held dynamometer showing dynamometer placement and manual stabilization of the nontest leg.
Hip adduction with fixed stabilization incorporated a 36-cm–high padded bench for support of the nontest extremity. The bench was placed anterior to the patient to gain a neutral pelvic tilt. In this series of adduction tests, the bench replaced stabilization traditionally provided by an examiner. The adduction tests were repeated with the same HHD placements as with the manual stabilization tests (Fig 5, Fig 6).

Fig 5.
Adduction short-lever MMT with a hand-held dynamometer showing dynamometer placement and bench stabilization for the nontest leg.

Fig 6.
Adduction long-lever MMT with a hand-held dynamometer showing dynamometer placement and bench stabilization for the nontest leg.
Data Analysis
The dependent variables—hip abduction and adduction torque production—were normalized to height and weight. The corresponding independent variable for hip abduction torque production—lever arm—had 2 levels (short vs long). The corresponding independent variables for hip adduction torque production included the lever arm with 2 levels (short vs long) and bench stabilization with 2 levels (bench vs no bench). Torque was calculated with the equation torque = force (N) × moment arm (m). Normalized torque percentage values were calculated as follows:
Normalized torque (%) = (torque [Nm] / (weight [N] × height [m])) × 100.
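The two calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names, the example force, lever arm, mass, and height are all hypothetical, and the conversion from body mass to weight assumes standard gravity of 9.81 m/s², a value the article does not state.

```python
# Assumed constant: the article does not report the gravity value used.
G = 9.81  # m/s^2


def torque_nm(force_n, lever_arm_m):
    """Torque (N*m) = dynamometer force (N) x resistance lever arm (m)."""
    return force_n * lever_arm_m


def normalized_torque_pct(force_n, lever_arm_m, body_mass_kg, height_m):
    """Torque expressed as a percentage of body weight (N) x height (m)."""
    weight_n = body_mass_kg * G
    return torque_nm(force_n, lever_arm_m) / (weight_n * height_m) * 100.0


# Hypothetical subject: 150 N measured at a 0.80 m long lever,
# body mass 70 kg, height 1.75 m.
example = normalized_torque_pct(
    force_n=150.0, lever_arm_m=0.80, body_mass_kg=70.0, height_m=1.75
)
print(f"{example:.1f}% BW x Ht")
```

Grouping weight and height in the denominator is what makes the result dimensionless, which is why it can be reported as a percentage.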
Intratester reliability was assessed with intraclass correlation coefficient model 3,1 (ICC3,1) described by Shrout and Fleiss.20 Intertester reliability was assessed using ICC2,1.20
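For readers unfamiliar with the two coefficients, the sketch below computes ICC2,1 and ICC3,1 from a subjects-by-raters table using the mean squares of a two-way analysis of variance, following the formulas of Shrout and Fleiss.20 It is an illustration only: the function name is hypothetical and the ratings matrix is example data, not data from this study (which had 21 subjects and 2 sets of measurements per comparison).

```python
def icc_2_1_and_3_1(scores):
    """Return (ICC2,1, ICC3,1) for a table with one row per subject
    and one column per rater or session."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters or sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-subjects mean square
    jms = ss_cols / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square

    # ICC3,1 treats raters as fixed; rater variance is left out.
    icc31 = (bms - ems) / (bms + (k - 1) * ems)
    # ICC2,1 treats raters as random; rater variance enters the denominator.
    icc21 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    return icc21, icc31


# Illustrative ratings only: 6 subjects scored by 4 raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc21, icc31 = icc_2_1_and_3_1(ratings)
print(f"ICC2,1 = {icc21:.2f}, ICC3,1 = {icc31:.2f}")
```

Because ICC2,1 counts systematic differences between raters as error, it cannot exceed ICC3,1 for the same data when the between-raters mean square is larger than the residual mean square.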
A t test was used to assess torque differences between the hip abductor tests. A 2-way repeated-measures analysis of variance (ANOVA) with lever arm (short, long) and bench (with, without) as independent variables was conducted to examine differences in torque production (α=.05). Post hoc comparisons were analyzed by using paired t tests with the Bonferroni-adjusted α. Statistical procedures were performed with SPSS statistical software.b
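The paired comparison and the Bonferroni adjustment can be sketched as follows. This is a hedged illustration using only the Python standard library, not the SPSS procedure the authors ran: the function names and the torque values are hypothetical, and the number of post hoc comparisons (2) is an assumption, because the article does not state how many were made. Converting t to a P value requires the t distribution from a statistics package and is omitted here.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni adjustment."""
    return alpha / n_comparisons


# Hypothetical normalized torque values (% BW x Ht) for 4 subjects.
long_lever = [10.0, 12.0, 9.0, 11.0]
short_lever = [7.0, 8.0, 7.0, 8.0]

t_stat, df = paired_t(long_lever, short_lever)
adjusted_alpha = bonferroni_alpha(0.05, 2)  # 2 comparisons is an assumption
print(f"t({df}) = {t_stat:.2f}, adjusted alpha = {adjusted_alpha}")
```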
Results
The values of intrarater reliability for the various tests ranged from .80 to .93 (table 1). The range of interrater reliability values was .62 to .82 (table 2). The greatest values for intrarater and interrater reliability were found with a long lever for testing hip abductors and a long lever with a bench for stabilization for testing hip adductors.
Table 1. Intrarater Reliability for the 6 Muscle Tests With 95% Confidence Intervals
| Test | ICC3,1 | 95% CI |
|---|---|---|
| Abduction short | .91 | .80−.96 |
| Abduction long | .93 | .84−.97 |
| Adduction short | .89 | .74−.95 |
| Adduction long | .79 | .56−.91 |
| Adduction short with bench | .83 | .62−.93 |
| Adduction long with bench | .89 | .74−.95 |
Table 2. Interrater Reliability for the 6 Muscle Tests With 95% Confidence Intervals
| Test | ICC2,1 | 95% CI |
|---|---|---|
| Abduction short | .68 | .37−.86 |
| Abduction long | .73 | .44−.88 |
| Adduction short | .74 | .42−.89 |
| Adduction long | .64 | .27−.84 |
| Adduction short with bench | .62 | .28−.82 |
| Adduction long with bench | .82 | .61−.92 |
Torque production results are illustrated in figure 7. The maximal hip abduction torque tested in the long-lever position (mean, 10.7%±2.2% of body weight [BW] × height [Ht]) was significantly greater (t20=9.21, P<.001) than torque produced in the short-lever position (mean, 7.1%±1.5%BW×Ht). Descriptive statistics are provided (table 3). Maximal hip adduction torque production was influenced by both the lever arm condition and the use of bench stabilization (table 4). There was a statistically significant lever by bench interaction (F1,20=15.64, P=.001). Maximal hip adduction torque was produced when the upper (nontest) extremity was supported by the bench and the adductors were tested in the long-lever position (mean, 11.9%±2.8%BW×Ht), followed by long-lever testing with manual stabilization (mean, 10.6%±2.5%BW×Ht), short-lever testing with manual stabilization (mean, 6.9%±1.9%BW×Ht), and short-lever testing with bench stabilization (mean, 6.2%±1.4%BW×Ht). The difference in torque production capability between bench stabilization and manual stabilization in the long-lever position was statistically significant (t20=3.97, P=.001).

Fig 7.
Mean normalized torque: MMT using a long lever resulted in greater torque values for all conditions. Use of a bench for adduction stabilization resulted in significantly greater torque values than manual stabilization. *Significant at P<.05.
Table 3. Descriptive Statistics
| Test | Tester 1, Time 1 | Tester 1, Time 2 | Tester 2 |
|---|---|---|---|
| Abduction short | 7.1±1.5 | 7.0±1.4 | 6.8±1.7 |
| Abduction long | 10.7±2.2 | 10.8±2.2 | 10.7±1.9 |
| Adduction short | 6.9±1.9 | 7.0±2.0 | 6.2±1.9 |
| Adduction long | 10.6±2.5 | 11.1±3.0 | 11.8±3.0 |
| Adduction short with bench | 6.2±1.4 | 6.4±1.6 | 6.7±1.8 |
| Adduction long with bench | 11.9±2.8 | 12.1±2.7 | 11.8±2.7 |
Table 4. ANOVA Results for Adduction Muscle Performance Testing
| Source of Variance | Sum of Squares | df | Mean Square | F | P |
|---|---|---|---|---|---|
| Lever | 466.14 | 1 | 466.14 | 182.88 | <.001 |
| Bench | 1.93 | 1 | 1.93 | 5.25 | .033 |
| Lever times bench | 22.07 | 1 | 22.07 | 15.64 | .001 |
| Error | 28.22 | 20 | 1.41 | | |
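As a quick arithmetic check, the interaction F ratio in table 4 can be reproduced from the tabulated sums of squares. The sketch assumes 20 error degrees of freedom, consistent with the reported F1,20 and with 21 subjects; the variable names are illustrative.

```python
# Values taken from table 4 (lever by bench interaction and error rows).
ss_interaction = 22.07
df_interaction = 1
ss_error = 28.22
df_error = 20  # assumed from the reported F1,20 (21 subjects - 1)

ms_interaction = ss_interaction / df_interaction
ms_error = ss_error / df_error
f_ratio = ms_interaction / ms_error

print(f"MS error = {ms_error:.2f}, F = {f_ratio:.2f}")
```

The listed error term reproduces only the interaction F ratio. Dividing the lever or bench mean squares by 1.41 does not give the tabulated F values, which suggests those effects were tested against their own error terms, as is usual in a repeated-measures design; the table does not list them.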
Discussion
The results support our initial hypothesis that reliability and torque production would be greater when testing the hip abductors and adductors with a long lever and using improved stabilization methods. We analyzed our reliability based on the following values: .75 and greater, excellent reliability; .40 to .75, fair to good reliability; and less than .40, poor reliability.20 Given these references, our calculated values associated with adduction testing using a long lever and bench stabilization resulted in excellent reliability for both intratester and intertester reliability. All other adduction situations resulted in excellent intratester reliability and fair to good intertester reliability. Abduction had excellent intratester reliability for both the long- and short-lever situations and fair to good reliability for 2 examiners.
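The cut points above can be applied mechanically. The short sketch below classifies the interrater ICC2,1 values from table 2; the function name is hypothetical, and assigning a value of exactly .75 to the excellent category follows the wording ".75 and greater."

```python
def classify_icc(icc):
    """Label an ICC using the cut points stated in the Discussion."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.40:
        return "fair to good"
    return "poor"


# Interrater ICC2,1 values copied from table 2.
interrater = {
    "Abduction short": 0.68,
    "Abduction long": 0.73,
    "Adduction short": 0.74,
    "Adduction long": 0.64,
    "Adduction short with bench": 0.62,
    "Adduction long with bench": 0.82,
}

labels = {test: classify_icc(value) for test, value in interrater.items()}
for test, label in labels.items():
    print(f"{test}: {label}")
```

Only adduction with a long lever and bench stabilization reaches the excellent category for interrater reliability, which matches the interpretation given in the text.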
This study was conducted using subjects with normal hip strength. We believe these tests should also be reliable for use with people who have hip weakness; however, further research would be necessary to test this hypothesis. Reliability is influenced, in part, by between-subject variability. Our subjects were healthy people between the ages of 22 and 31 years without known hip weakness. This homogeneous group may have limited variability in torque production capability compared with a subject pool of people with impaired hip strength.
Examiner force sufficient to overcome the force produced by subjects was used for testing, because we were interested in evaluating the maximal force production of the tested muscles. A break test has been shown to result in greater forces than those produced using a make test.21, 22 A make test is performed with an examiner maintaining a static resistance while a subject exerts maximal effort against the resistance. A break test, in contrast, is performed with an examiner providing a level of force to overcome the muscular force produced by the subject.22 We believed that a break test would result in a more valid representation of the capacity of muscle to produce force. Stratford and Balsor22 found ICCs for the make test (.95) higher than those for the break test (.87) when testing elbow flexors. Both values, however, would be considered excellent reliability.20 Additional investigators21, 23 have also found the break test to have excellent reliability.
The mean normalized torque values were significantly greater in the long-lever position for testing both hip abductors and adductors. We hypothesize that examiners were able to use the longer lever to produce a resultant force that was a maximum challenge to these muscle groups and, in turn, required maximal torque production. When testing the adductors, some subjects reported that the dynamometer placement was less comfortable in the short-lever test, which may have influenced the ability to generate a maximal effort. Bench stabilization also had a significant effect on torque when testing hip adductors. We believe this was due to decreased physical demands on examiners. The test as described by Kendall et al10 requires manual support of the nontest extremity. Testing in this manner made it more difficult for examiners to produce adequate resistance to overcome the force produced by the patient. In addition, we believe that the less secure stabilization made it more difficult for subjects to generate maximal force compared with use of a bench. Agre et al18 concluded that dynamometry was unreliable for lower-extremity testing in part because of the strength of the tested muscles relative to the strength of the examiner. Mulroy et al24 likewise reported limitations of testing strong muscle groups such as the quadriceps due to relative examiner strength. Our study used a long lever and external stabilization, which we believe allowed examiners to apply a resistance greater than the force produced by the tested muscles. Normalized mean torque values show that a long lever and bench stabilization produce results more representative of the muscle performance capacity of these muscle groups. This gives clinicians techniques to better evaluate patients.
Several investigators4, 6, 8, 25 have suggested that evaluation of hip muscle performance may be important in predicting future injury and also may provide information on recovery from previous injury. Several investigators examined people without impaired hip strength. Tyler et al8 tested the hip abductors and adductors in a group of professional ice hockey players to investigate if the preseason performance on the muscle test correlated with the incidence of adductor strains. Although they stated that the testing procedures used were those described by Kendall,10 reliability values for their testing were not provided. The techniques described by Kendall10 require considerable strength on the part of examiners to support the nontest extremity while providing a counterforce to the test extremity. We believe the methods for assessing hip muscle performance described in our study would enable testers to provide the support and counterforce necessary to produce a more valid measure of hip muscle performance, particularly in people who may not present with impaired muscle strength on a gross muscle test. Similarly, Nadler et al4 examined hip extensor and abductor muscle strength in division I collegiate athletes and found in general that athletes with a history of lower-extremity or low back pain had a difference in hip strength compared with athletes without previous injury. They addressed the problems of patient strength and stabilization by using an anchoring system. The methods described in our study offer a reliable, clinically efficient means of testing the muscle performance of hip abductors and adductors without the use of an anchoring system.
Study Limitations
Limitations of our study include the influence of tester strength, application of results to all populations, and ability to perform the techniques on all subjects. As discussed previously, differences in recorded torque production are influenced by the strength of the examiner. We used healthy young subjects able to produce torque that was a challenge to the examiners. Sufficient examiner strength is necessary to perform the tests used in the study. External validity is questioned, because results may not represent those groups of people with impaired hip strength or those actively involved in high-level sport and strength programs. Last, the use of a long lever when testing hip muscle performance may not be appropriate for people with knee pathology.
Conclusions
We have found that testing of hip abductors and adductors in normal, healthy, young people can be performed with good to excellent reliability. Results show that hip abduction is best performed using a long lever to provide resistance. Hip adduction is best performed using a long lever and a bench to stabilize the nontest extremity.
Supplier
Acknowledgment
We thank Roy Bechtel, PT, PhD, for his assistance with manuscript review.
References
1. Daniels and Worthingham’s muscle testing: techniques of manual examination. 7th ed. Philadelphia: WB Saunders; 2002
2. Test-retest strength reliability: hand-held dynamometry in community-dwelling elderly fallers. Arch Phys Med Rehabil. 2002;83:811–815
3. Reference values for extremity muscle strength obtained by hand-held dynamometry from adults aged 20 to 79 years. Arch Phys Med Rehabil. 1997;78:26–32
4. The relationship between lower extremity injury, low back pain, and hip muscle strength in male and female collegiate athletes. Clin J Sport Med. 2000;10:89–97
5. Management of patellofemoral pain targeting hip, pelvis, and trunk muscle function: 2 case reports. J Orthop Sports Phys Ther. 2003;33:647–660
6. Hip muscle weakness and overuse injuries in recreational runners. Clin J Sport Med. 2005;15:14–21
7. The reliability of upper- and lower-extremity strength testing in a community survey of older adults. Arch Phys Med Rehabil. 2002;83:1423–1427
8. The association of hip strength and flexibility with the incidence of adductor muscle strains in professional ice hockey players. Am J Sports Med. 2001;29:124–128
9. Comparison of maximal voluntary isometric contraction and hand-held dynamometry in measuring muscle strength of patients with progressive lower motor neuron syndrome. Neuromuscul Disord. 2003;13:744–750
10. Muscles: testing and function. 4th ed. Baltimore: Williams & Wilkins; 1993
11. Clinical reliability of manual muscle testing (Middle trapezius and gluteus medius muscles). Phys Ther. 1987;67:1072–1076
12. Clinical evaluator reliability for quantitative and manual muscle testing measures of strength in children. Muscle Nerve. 2001;24:787–793
13. Subjectivity of forces associated with manual-muscle test grades of 3+, 4−, and 4. Percept Motor Skill. 1998;87:1123–1128
14. Research incorporating hand-held dynamometry: publication trends since 1948. Percept Motor Skill. 1998;86(3 Pt 2):1177–1178
15. Adoption of hand-held dynamometry. Percept Motor Skill. 2001;92:150
16. A broad range of forces is encompassed by the maximum manual muscle test grade of five. Percept Motor Skill. 2000;90:747–750
17. Relationship between two measures of upper extremity strength: manual muscle test compared to hand-held myometry. Arch Phys Med Rehabil. 1992;73:1063–1068
18. Strength testing with a portable dynamometer: reliability for upper and lower extremities. Arch Phys Med Rehabil. 1987;68:454–458
19. Manual muscle test scores and dynamometer test scores of knee extension strength. Arch Phys Med Rehabil. 1986;67:390–392
20. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428
21. Make tests and break tests of elbow flexor muscle strength. Phys Ther. 1988;68:193–194
22. A comparison of make and break tests using a hand-held dynamometer and the Kin-Com. J Orthop Sports Phys Ther. 1994;19:28–32
23. Muscle force measured using “break” testing with a hand-held myometer in normal subjects aged 20 to 69 years. Arch Phys Med Rehabil. 2000;81:653–661
24. The ability of male and female clinicians to effectively test knee extension strength using manual muscle testing. J Orthop Sports Phys Ther. 1997;26:192–199
25. Hip strength in females with and without patellofemoral pain. J Orthop Sports Phys Ther. 2003;33:671–676
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated.
PII: S0003-9993(06)01343-8
doi:10.1016/j.apmr.2006.09.011
© 2007 American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
