Volume 90, Issue 2 , Pages 309-313, February 2009
Validity and Interobserver Reliability of Visual Observation to Assess Partial Weight-Bearing
Abstract
Hurkmans HL, Bussmann JB, Benda E. Validity and interobserver reliability of visual observation to assess partial weight-bearing.
Objective
To determine the validity and interobserver reliability of visual observation to assess partial weight-bearing.
Design
Validation and interobserver reliability study.
Setting
University medical center.
Participants
Patients (N=10) with a total hip arthroplasty operated 1 to 12 months prior to the study referred by 10 physical therapists (5 experienced and 5 inexperienced in training patients in partial weight-bearing).
Interventions
Not applicable.
Main Outcome Measures
The amount of weight-bearing assessed by visual estimation (visual analog scale score) in percentage body weight (BW). Actual weight-bearing (percentage BW) as measured with the Pedar Mobile system. The mean difference (systematic error) between visual estimation and the Pedar system and the SD of the differences (random error) were determined by the limits of agreement (LOA) method with multiple observations per subject. The intraclass correlation coefficient (ICC) was calculated as a measure for the interobserver reliability.
Results
The mean difference ± SD between visual observation and the reference method was –9.5±20.1 percentage BW (95% confidence interval, –24.0 to 5.0 percentage BW) with LOA ranging from –49.8 to 30.8 percentage BW. The ICC was .57. The therapists' experience in partial weight-bearing training had no effect on the mean difference (P=.349) between the 2 methods.
Conclusions
Visual observation is not a valid and reliable method to assess partial weight-bearing.
Key Words: Rehabilitation, Reproducibility of results, Weight-bearing
List of Abbreviations: BW, body weight; CI, confidence interval; ICC, intraclass correlation coefficient; LOA, limits of agreement; PWB, partial weight-bearing; THA, total hip arthroplasty; VAS, visual analog scale
RESTRICTION OF LOWER-LIMB loading is frequently instructed by the surgeon or orthopedic surgeon after lower-limb surgery to ensure proper healing of a fracture, osteotomy, or hip arthroplasty.1, 2, 3, 4, 5, 6 The general concept behind PWB is to decrease the forces at the healing site by reducing the external load on the operated leg. For fracture healing (to induce bone growth) and for cementless implant fixation (for osseointegration), limited micromotion is necessary. However, too much micromotion can lead to delayed fracture healing or nonunion and less strong fixation of uncemented implants.1 Therefore, it is standard care for the physical therapist to ensure proper limb loading of the operated leg during rehabilitation.
In clinical practice, visual observation is one of the most common methods used by the physical therapist to estimate the amount of loading under the patient's foot. Other clinical techniques used to assess weight-bearing (for example, palpation and a bathroom scale) have previously been evaluated.7, 8, 9, 10, 11 Palpation by placing the hand under the foot of the patient was found to be subjective guesswork at best, and bathroom scales are able to assess weight-bearing only during standing.10, 11, 12 Unfortunately, no studies are available to support the validity and/or reliability of visual observation to assess PWB. Previous studies found that visual estimation is an inaccurate method to determine BW.13, 14, 15 This may indicate that visual assessment of the amount of weight placed on the leg during standing is also an inaccurate method. However, this provides no information on the accuracy of weight-bearing assessment during walking because the amount of load placed on the legs during walking is caused not only by body mass and acceleration of gravity but also by additional (eg, forward) accelerations of body mass. Hurkmans et al16 found that when physical therapists used visual observation to train and control PWB, 55% of the patients did not load the leg at the prescribed target load. Therefore, visual observation does not seem to be a valid method to assess PWB and can result in inaccurate limb loading, and consequently may lead to complications.
At our department, PWB training is mostly performed by a group of physical therapists who have several years of experience in PWB training. Occasionally, during the weekends, PWB training is performed by physical therapists who do not have this kind of experience. Therefore, we wanted to determine whether inexperienced therapists can visually estimate weight-bearing as well as experienced therapists.
The present study aimed to determine the validity and interobserver reliability of visual observation to assess the amount of weight-bearing on the operated leg of patients with a THA during walking. We also investigated whether the therapist's amount of experience in PWB training had an effect on the systematic error of visual observation to determine the amount of weight-bearing.
Methods
Patient Population
We performed a prospective study in which a convenience sample of 10 patients with a THA who had undergone surgery 1 to 12 months prior to the study was selected. Patients were included if they gave informed consent, were between 40 and 80 years of age, and received PWB postoperatively using elbow crutches. Patients with neuromuscular diseases, foot orthoses, or foot deformities that needed special footwear were excluded. This study was approved by the institutional review board.
Observers
Visual observation was performed by a group of 10 physical therapists (5 experienced and 5 inexperienced in PWB training). The therapists who were experienced in PWB training provided physical therapy in the department of orthopedics and had at least 5 years of experience in PWB training. The therapists who were inexperienced in PWB training provided physical therapy in the department of neurology, the department of lung diseases, and the department of internal medicine and provided PWB training occasionally. The experienced observers had clinical experience ranging from 8 to 23 (mean ± SD, 17.8±9.4) years, of which at least 5 years of experience was in PWB training. The clinical experience of the observers without experience in PWB training ranged from 14 to 33 (mean ± SD, 27.8±8.2) years.
A power analysis was conducted and determined that with an α level of 5%, a sample size of 5 was necessary for a t test to have at least 80% power to detect a difference of 10% BW.17
Instrumentation
The actual amount of weight-bearing was measured with an insole pressure device, the Pedar Mobile system, which was used as the reference method.18, 19 The Pedar Mobile system is a portable device with matrix insoles (2mm thick), each containing 99 capacitive sensors.
A VAS was used by the physical therapists to rate the amount of weight-bearing by visual observation in percentage BW. The VAS used was a horizontal line 100mm in length with 11 equidistantly spaced vertical lines. The descriptions 0% and 100% were placed at the left and right end of the scale, and 50% in the middle of the scale, corresponding to the percentage of BW.
Protocol
The physical therapists were informed about the purpose of the study and instructed how to use the VAS. One day prior to each measurement, the Pedar insoles were calibrated using the Trublu calibration device and a GDH 14AN digital manometer. The pressure loads applied were 4, 7, and 10 to 60N/cm² with intervals of 5N/cm². The sampling frequency was set at 50Hz. A half hour before the measurement, the patient was instructed to load the operated leg at 3 different weight-bearing levels, that is, low (10% BW), mid (50% BW), and high (100% BW), using elbow (forearm) crutches and a 3-point gait.20 Immediately before data collection, the Pedar system was turned on, and a zero setting was performed.18 Then the patients walked 3 times at their own walking speed for 15m at each instructed weight-bearing level. The 3 weight-bearing levels were presented in random order. The patients were observed from the operated side (lateral, anterior, and posterior view) by the physical therapists simultaneously, during each of the 3 walking trials. At the end of each of the 3 walking trials, the physical therapists scored an average weight-bearing load on the VAS. The physical therapists were unaware of both the 10%, 50%, and 100% BW target weight-bearing levels and the weight-bearing level at which the patients were walking during each trial.
Data Analysis
Pedar-m Expert version 8.2 software was used to calculate the vertical force data from the Pedar system. From each walking trial, the first and last 2 steps were excluded from analysis. For each step, the maximum peak load was determined, and from these maximum peak loads, the mean ± SD peak load (percentage BW) was calculated. Normality of the distribution of the mean peak loads and VAS scores was tested using the Kolmogorov-Smirnov test. For comparison of the visual observation method with the reference method, we used the 95% LOA method with multiple observations per subject described by Bland and Altman.21 The mean difference between the 2 methods was calculated, which represents the systematic error (bias) between the measurements. The SD of this difference represents the random error. The LOA were given by the mean difference ±2 × SD, indicating the total error (systematic and random error together).22, 23, 24 The 95% CI for the mean difference and LOA was calculated using, respectively, the SEs

$$\mathrm{SE}_{\bar{d}} = \sqrt{\frac{SD^{2}}{n}}$$

and

$$\mathrm{SE}_{\mathrm{LOA}} = \sqrt{\frac{3\,SD^{2}}{n}}$$

where SD is the SD of the differences and n is the number of patients, with t equal to 2.262 (df=9, α=.05).23 An assumption for using the 95% LOA is that there is no significant relationship between the difference and the mean, that is, the mean and SD of the differences are constant throughout the range of measurements.24 This was determined by plotting the difference against the average and calculating the correlation coefficient.
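As a rough check on the arithmetic, the sketch below recomputes the intervals from the summary statistics reported in this article (mean difference –9.5, SD 20.1, n=10, t=2.262). It assumes the standard Bland-Altman standard errors, sqrt(SD²/n) for the mean difference and sqrt(3·SD²/n) for each limit, and it takes the SD as given rather than deriving it from the repeated observations per subject. The function name `limits_of_agreement` is hypothetical.

```python
import math


def limits_of_agreement(mean_diff, sd_diff, n, t_crit):
    """Bias CI, limits of agreement, and their CIs from summary statistics.

    Assumes SE(mean) = sqrt(SD^2 / n) and SE(limit) = sqrt(3 * SD^2 / n).
    """
    se_mean = math.sqrt(sd_diff ** 2 / n)       # SE of the mean difference
    se_loa = math.sqrt(3 * sd_diff ** 2 / n)    # approximate SE of each limit
    lower = mean_diff - 2 * sd_diff             # lower limit of agreement
    upper = mean_diff + 2 * sd_diff             # upper limit of agreement
    return {
        "bias_ci": (mean_diff - t_crit * se_mean, mean_diff + t_crit * se_mean),
        "loa": (lower, upper),
        "lower_ci": (lower - t_crit * se_loa, lower + t_crit * se_loa),
        "upper_ci": (upper - t_crit * se_loa, upper + t_crit * se_loa),
    }


# Summary values as reported in the Results (percentage BW).
result = limits_of_agreement(mean_diff=-9.5, sd_diff=20.1, n=10, t_crit=2.262)
for name, (low, high) in result.items():
    print(f"{name}: {low:.1f} to {high:.1f}")
```

Run on the rounded summary values, the output should agree with the reported intervals to within roughly 0.1 to 0.2 percentage BW, which is consistent with rounding of the inputs.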
Because no criteria exist for clinically acceptable LOA for visual observation to assess PWB, we defined a priori that a difference of ±10% BW would be acceptable. Therefore, the percentage of agreement within the 10% BW was calculated.
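The percentage of agreement can be sketched as the share of visual estimates that fall within the tolerance of the reference value. The example below is illustrative only: the data are hypothetical, the function name `percent_agreement` is not from the study, and treating a difference of exactly 10% BW as within the limit is an assumption.

```python
def percent_agreement(visual, reference, tolerance=10.0):
    """Percentage of paired scores whose absolute difference is <= tolerance.

    All values are in percentage BW. Counting the boundary as agreement
    is an assumption, not something the study specifies.
    """
    if len(visual) != len(reference) or not visual:
        raise ValueError("need two non-empty sequences of equal length")
    within = sum(1 for v, r in zip(visual, reference) if abs(v - r) <= tolerance)
    return 100.0 * within / len(visual)


# Hypothetical VAS scores and reference peak loads (percentage BW).
visual_scores = [20.0, 55.0, 80.0, 30.0]
reference_loads = [35.0, 50.0, 95.0, 38.0]
print(percent_agreement(visual_scores, reference_loads))
```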
To determine the interobserver reliability, we calculated the ICC. The ICC was calculated as the ratio of the variance between patients (ie, variance of interest) to the total variance (ie, variance of interest and error variance). If Varp is the variance between patients, Varo the variance between observers, and Varpo the variance attributed to the interaction between patients and observers, the ICC is calculated as Varp/(Varp + Varo +Varpo).25, 26
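A minimal sketch of this ratio is given below. The study used SPSS and does not state how the variance components were estimated, so the two-way ANOVA estimators used here (one score per patient-observer pair, with the interaction term absorbing residual error) are an assumption. The function name `icc_from_ratings` and the example scores are hypothetical.

```python
from statistics import mean


def icc_from_ratings(ratings):
    """ICC = Var_p / (Var_p + Var_o + Var_po) from a patients x observers table.

    ratings[i][j] is the score observer j gave to patient i. Variance
    components come from two-way ANOVA mean squares (an assumed estimator).
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(ratings[i][j] for i in range(n)) for j in range(k)]

    ss_p = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_o = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_po = ss_total - ss_p - ss_o                        # interaction/residual

    ms_p = ss_p / (n - 1)
    ms_o = ss_o / (k - 1)
    ms_po = ss_po / ((n - 1) * (k - 1))

    var_po = max(ms_po, 0.0)
    var_p = max((ms_p - ms_po) / k, 0.0)   # negative estimates truncated to 0
    var_o = max((ms_o - ms_po) / n, 0.0)

    total = var_p + var_o + var_po
    if total == 0:
        raise ValueError("no variance in ratings; ICC is undefined")
    return var_p / total


# Hypothetical scores: observer 2 rates every patient 10 points higher.
scores = [[10, 20], [50, 60], [90, 100]]
print(round(icc_from_ratings(scores), 3))
```

Because the variance between observers sits in the denominator, a constant offset between observers lowers the ICC even when they rank patients identically.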
The LOA and ICC were calculated for the total group of physical therapists and for the experienced and inexperienced group separately. The t test was used to determine a possible effect of experience on the mean difference between visual observation and the reference method. All statistical analyses were performed with SPSS for Windows. The level of significance for all tests was set at 5%.
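The reference value for each walking trial, described at the start of this section (peak load per step, averaged after dropping the steps at the start and end of the trial), can be sketched as follows. Several details are assumptions: "the first and last 2 steps" is read as 2 steps at each end, forces are taken to be in newtons, and BW in kilograms is converted with g = 9.81 m/s². The function name `trial_peak_load` and the sample data are hypothetical.

```python
from statistics import mean, stdev

G = 9.81  # m/s^2, assumed conversion from body mass (kg) to body weight (N)


def trial_peak_load(steps_newton, body_mass_kg, skip=2):
    """Mean and SD of per-step peak vertical force, in percentage BW.

    steps_newton: one list of force samples (N) per step of the operated leg.
    skip: number of steps dropped at each end of the trial (assumed to be 2).
    """
    kept = steps_newton[skip:len(steps_newton) - skip]
    if len(kept) < 2:
        raise ValueError("need at least 2 steps after trimming")
    bw_newton = body_mass_kg * G
    peaks = [100.0 * max(step) / bw_newton for step in kept]
    return mean(peaks), stdev(peaks)


# Hypothetical trial of 6 steps for a 100-kg patient.
trial = [[100, 200], [150, 250], [300, 490.5, 200], [294.3, 100], [50], [60]]
mean_peak, sd_peak = trial_peak_load(trial, body_mass_kg=100)
print(f"{mean_peak:.1f} ± {sd_peak:.1f} percentage BW")
```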
Results
The patient characteristics and the raw weight-bearing data of the visual estimation VAS scores are presented in table 1 and figure 1. After the instruction to load the leg at 3 weight-bearing levels (low, mid, high), we observed that most of the patients did not load their legs at these prescribed target loads.
Table 1. Patient Characteristics
| Patient No. | Sex | Age (y) | BW (kg) | Operated Leg |
|---|---|---|---|---|
| 1 | M | 57 | 92 | R |
| 2 | M | 63 | 80 | R |
| 3 | F | 63 | 84 | L |
| 4 | F | 58 | 80 | L |
| 5 | F | 63 | 78 | L |
| 6 | F | 71 | 82 | R |
| 7 | F | 78 | 57 | R |
| 8 | F | 79 | 57 | R |
| 9 | F | 64 | 90 | R |
| 10 | F | 60 | 48 | R |

Fig 1.
The weight-bearing data of the visual estimation VAS score for each of the physical therapists per patient's walking trial, with the mean visual estimation (black markers) and the reference weight-bearing value (horizontal dashes). ○ = VAS scores; ● = mean VAS scores; horizontal dash = reference values.
The mean difference between visual observation and the reference method was −9.5% BW; this difference was not significant (95% CI, –24.0 to 5.0 percentage BW; table 2, figure 2). The upper limit of agreement was 30.8% BW, and the lower limit was –49.8% BW. The percentage of agreement within ±10% BW was 46.7%. There was no significant relationship between the difference and the average of the visual observation and the reference method (r=–.139; P=.464). Therapists' experience in PWB training had no effect on the mean difference between visual observation and the reference method (P=.349).
Table 2. Validity and Interobserver Reliability of Visual Observation of PWB
| Physical Therapists | Systematic Error: Mean (% BW) | Systematic Error: 95% CI | Total Error (LOA): LL | LL 95% CI | Total Error (LOA): UL | UL 95% CI | Agreement (%) Within 10% BW | Reliability: ICC |
|---|---|---|---|---|---|---|---|---|
| Total (n=10) | −9.5 | −24.0 to 5.0 | −49.8 | −74.7 to −24.9 | 30.8 | 5.8 to 55.7 | 46.7 | .57 |
| Experienced (n=5) | −8.6 | −26.4 to 9.2 | −49.6 | −75.0 to −24.2 | 32.4 | 7.0 to 57.8 | 53.3 | .47 |
| Inexperienced (n=5) | −10.1 | −28.4 to 8.2 | −51.9 | −77.8 to −26.0 | 31.7 | 5.8 to 57.6 | 40.0 | .59 |

Fig 2.
Bland-Altman plot with mean error ± SD (–9.5±20.1; solid line) and 95% LOA (upper limit=30.8; lower limit=–49.8; dashed lines) of the total group of physical therapists (n=10).
The ICC was .57 for the total group of physical therapists. For the experienced and inexperienced group of physical therapists, the ICCs were .47 and .59, respectively (see table 2).
Discussion
The aim of our study was to determine the validity and interobserver reliability of visual observation to assess partial weight-bearing. The results showed no significant systematic error for visual observation to estimate the amount of weight-bearing. However, the large random error revealed a considerable lack of agreement between visual observation and the reference method, with discrepancies of up to 50% BW. Also, the interobserver reliability was low (ICC range, .47–.59).
Although visual observation is commonly used in PWB training, no previous studies were found that evaluated its validity and reliability to assess the amount of loading under the foot. However, related studies showed that visual observation is an inaccurate method to determine BW.13, 14, 15 Lorenz et al13 and Leary et al14 found mean errors in estimating BW of 6 to 9 kg. These results are comparable to the mean error of 9.5% BW found in our group of patients (BW range, 48–92kg).
In the present study, experience of the physical therapists in PWB training had no effect on the systematic error of visual observation to determine the amount of weight-bearing. Leary et al14 and Coe et al15 also reported that experienced intensive care staff members inaccurately estimated the patient's BW. For PWB, this might be explained by the fact that physical therapists do not have any reference in clinical practice for how well they estimate the amount of weight-bearing. Therefore, no learning effect caused by repetition can be achieved when no correction for errors in estimating the amount of loading can be given during or after PWB training. Another explanation could be that, in our study, the physical therapists without experience in PWB training had longer clinical experience (mean ± SD, 27.8±8.2 years) than the physical therapists who did have experience in PWB training (mean ± SD, 17.8±9.4 years).
The choice of a measuring technique for PWB depends largely on how accurately one needs to measure the amount of weight-bearing in the clinic to avoid complications. Unfortunately, the amount of error in weight-bearing estimation that is acceptable in clinical practice is unknown. In the present study, we used our clinically acceptable LOA of ±10% BW, which resulted in 53% of the visual weight-bearing scores being inaccurate. However, because these clinically acceptable limits may be too strict, clinical studies are needed to assess the relationship between the amount of weight-bearing and postoperative complications. More research is needed on which local forces are harmful for the healing site (eg, hip joint), which forces occur at the healing site during PWB, and what the relationship is between local forces and the forces under the foot (ie, the vertical ground reaction forces). Until then, currently prescribed target loads (eg, 10% BW, 15% BW, 50% BW, 8kg) with clinically defined upper and lower limits of 10% BW will be used.10, 16 For this, visual observation is too crude a method to ensure correct weight-bearing.
In clinical practice, visual observation is also used in combination with palpation of upper-arm muscles and/or the hand-under-the-foot technique to estimate limb loading; to our knowledge, no studies have evaluated palpation separately (or a combination of these 3 methods) to assess PWB. Only 1 study has evaluated the hand-under-the-foot technique, and the authors concluded that the amount of weight placed on the therapist's hand was subjective guesswork at best.10 Therefore, more research is needed that evaluates these techniques individually but also in combination to assess the amount of weight-bearing.
A promising technique to train patients to load the leg more accurately is audio feedback.27, 28 However, only 2 studies have evaluated audio feedback for weight-bearing training, and no information was given about the instrument's validity to measure the limb load. These 2 studies evaluated audio feedback during the patient's hospital stay, but additionally, because of reduced hospital duration of care, it is important to know whether patients perform proper weight-bearing at home. Therefore, more knowledge is needed regarding the effect of PWB training with audio feedback on the amount of weight-bearing during the entire postoperative recovery—that is, in and outside the hospital.
Study Limitations
The patients with THA participating in this study differed slightly from the patients with THA seen by the physical therapist directly postoperatively. Our patients could load the leg at 100% BW, had no pain, and were not afraid to load the operated leg. Factors like pain and anxiety are known to influence the patient's walking pattern directly postoperatively and could lead to reduced loading of the leg.29 Therefore, these factors might help the physical therapist to estimate the amount of weight-bearing by visual observation because of an altered walking pattern. However, in our experience, patients with THA recover relatively quickly with regard to their postoperative pain and anxiety, and often have to be instructed not to walk too fast.
Conclusions
Visual observation alone is not a valid and reliable method to assess the loading of the leg during partial weight-bearing. Experience of the physical therapist in PWB training did not influence the outcome. Although PWB is frequently performed in the clinic, limited information is available on the validity and reliability of techniques (visual observation, palpation of upper arm muscles, hand-under-the-foot) used by the physical therapist to assess the amount of weight-bearing. Therefore, more studies are needed that evaluate these techniques individually but also in combination to assess the amount of weight-bearing.
Acknowledgments
We thank S. Groenendijk and M. van der Velde for their support in data collection.
References
1. Biomechanical aspects of load-bearing capacity after total endoprosthesis replacement of the hip joint: an evaluation of current knowledge and review of the literature. Z Orthop Ihre Grenzgeb. 1998;136:310–316
2. Fracture of the proximal tibia with immediate weightbearing after a Fulkerson osteotomy. Am J Sports Med. 1997;25:570–574
3. Early, full weightbearing with flexible fixation delays fracture healing. Clin Orthop Relat Res. 1996;194–202(Jul)
4. Failure of trochanteric osteotomy in total hip replacement: a comparison of two methods of reattachment. Ann R Coll Surg Engl. 1996;78:43–44
5. Complications of trochanteric osteotomy: long-term implications. Clin Orthop Relat Res. 1993;209–213(Mar)
6. Complications of trochanteric osteotomy. Orthop Clin North Am. 1992;23:321–333
7. Reproducibility of partial weight bearing. Injury. 2005;36:556–559
8. Partial weight-bearing gait using conventional assistive devices. Arch Phys Med Rehabil. 2005;86:394–398
9. Quantitative analysis of the effects of audio biofeedback on weight-bearing characteristics of persons with transtibial amputation during early prosthetic ambulation. J Rehabil Res Dev. 2000;37:255–260
10. Assessing the accuracy of partial weight-bearing instruction. Am J Orthop. 1998;27:558–560
11. Training procedures and biofeedback methods to achieve controlled partial weight bearing: an assessment. Arch Phys Med Rehabil. 1975;56:449–455
12. Accuracy of weightbearing estimation by stroke versus healthy subjects. Percept Mot Skills. 1991;72:935–941
13. Anthropometric approximation of body weight in unresponsive stroke patients. J Neurol Neurosurg Psychiatr. 2007;78:1331–1336
14. The accuracy of the estimation of body weight and height in the intensive care unit. Eur J Anaesthesiol. 2000;17:698–703
15. The accuracy of visual estimation of weight and height in pre-operative supine patients. Anaesthesia. 1999;54:582–586
16. The difference between actual and prescribed weight bearing of total hip patients with a trochanteric osteotomy: long-term vertical force measurements inside and outside the hospital. Arch Phys Med Rehabil. 2007;88:200–206
17. Statistics in medicine. 1st ed. Boston: Little, Brown & Co; 1974. p. 142–146
18. Validity of the Pedar Mobile system for vertical force measurement during a seven-hour period. J Biomech. 2006;39:110–118
19. Accuracy and repeatability of the Pedar Mobile system in long-term vertical force measurements. Gait Posture. 2006;23:118–125
20. Standing and walking with walking aids—an electromyokinesigraphic examination. Z Orthop Ihre Grenzgeb. 1979;117:247–259
21. Agreement between methods of measurement with multiple observations per individual. J Biopharm Stat. 2007;17:571–582
22. Analysis of method comparison studies. Ann Clin Biochem. 1996;33:1–4
23. Comparing methods of measurement: why plotting difference against standard method is misleading. Lancet. 1995;346:1085–1087
24. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–310
25. Health measurements and scales: a practical guide to their development and use. 2nd ed. Oxford: Oxford Univ Pr; 1995
26. Reliability assessment of isometric knee extension measured with a computer-assisted hand-held dynamometer. Arch Phys Med Rehabil. 1998;79:442–448
27. Feedback-controlled weight bearing following osteosynthesis of the lower extremity. Swiss Surg. 1996;2:252–258
28. Partial weight bearing after total hip arthroplasty: what does the patient really do? (A prospective randomized gait analysis). Hip Int. 1994;4:61–68
29. Postoperative weight-bearing after a fracture of the femoral neck or an intertrochanteric fracture. J Bone Joint Surg Am. 1998;80:352–356
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit on the authors or on any organization with which the authors are associated.
PII: S0003-9993(08)01589-X
doi:10.1016/j.apmr.2008.07.022
© 2009 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
