Inter- and Intraobserver Repeatability of the Salford Gait Tool: An Observation-Based Clinical Gait Assessment Tool

Abstract

Toro B, Nester CJ, Farren PC. Inter- and intraobserver repeatability of the Salford Gait Tool: an observation-based clinical gait assessment tool.

Objective: To evaluate the inter- and intraobserver repeatability of the Salford Gait Tool (SF-GT), a new observation-based gait assessment tool for evaluating sagittal plane cerebral palsy (CP) gait.

Design: Masked comparative evaluation.

Setting: University in the United Kingdom.

Participants: A convenience sample of 23 pediatric physical therapists with varying degrees of clinical experience recruited from the Greater Manchester area.

Intervention: Participants viewed videotapes of the sagittal plane gait of 13 children and used the SF-GT to analyze their 13 different gait styles on 2 occasions. Eleven children had hemiplegic, diplegic, or quadriplegic CP and 2 were neurologically intact.

Main Outcome Measures: Inter- and intraobserver repeatability of hip, knee, and ankle joint positions at 6 different phases of the gait cycle.

Results: The SF-GT demonstrated good interobserver (77%) and intraobserver (75%) repeatability.

Conclusions: We have established that the SF-GT is a repeatable clinical assessment tool with which pediatric physical therapists can guide the diagnosis, treatment planning, and evaluation of interventions for sagittal plane gait deviations in CP.

Clinical- and observation-based gait examination is fraught with potential inaccuracies.
Judgments made from visual observation and subsequent interpretation are subjective and rely heavily on clinicians’ training and experience, which vary widely.1, 2, 3 If assessments of gait vary greatly between clinicians, then the care pathway a patient enters may depend more on the clinician making the assessment than on the true gait problems the patient presents with.4

Several studies have considered whether observation-based gait assessment tools can be used reliably between clinicians or by the same clinician on repeated occasions over time. The Hugh Williamson Gait Laboratory Scale, a modified version of the Physician Rating Scale (PRS),5 had a low interrater repeatability among 4 experienced raters who rated 25 children with spastic diplegic gait (κ=.46 for foot-strike),6 although Corry7 had previously demonstrated an interrater repeatability of κ=.67 for the scale’s foot-strike section. Another variant of the PRS, the Observational Gait Scale, had modest interrater repeatability (κ=.58; range, .29–.86) and intrarater repeatability (κ=.69; range, .30–.91) for its first 6 sections.8 A third variant of the PRS, the Visual Gait Assessment Scale, also had modest interrater repeatability (κ=.67; range, .44–.89) and intrarater repeatability (κ=.53; range, −.04 to .86) for 2 experienced observers.9 The interrater repeatability for 17 items on the Edinburgh Visual Gait Score (EVGS)10 ranged from 96% for initial contact to 55% for knee extension in terminal swing (mean interrater repeatability, 70%). Intrarater repeatability was reported to be good for 5 experts in gait analysis, demonstrated by a mean least significant difference of 3.20 (range, 2.63–4.01).10

Our previous work3, 11 established that clinicians have a need for a gait assessment tool for use in routine clinical settings, and we subsequently developed the Salford Gait Tool (SF-GT) with which to assess the gait of children with cerebral palsy (CP).
Development of the tool has been described elsewhere12 and so only an overview of it is presented here. With the SF-GT, users can assess the position of the hip, knee, and ankle at 6 specific events during gait (initial contact, end double support, mid stance, start double support, toe-off, mid swing). Users watch a video of subjects walking and estimate the angular position (in degrees) of each joint at the instant of each of the 6 gait events. Each estimated angle corresponds to a category (2, 1, 0, −1, −2) for that joint at that gait event. There are therefore 6 categories allocated to each joint over the gait cycle and 18 categories in all to describe each lower limb in gait. The sum of the 6 categories for each joint represents the function of the joint over the entire gait cycle. The output from using the tool is therefore a numeric indicator for each of the 3 joints, and when the scores for the joints are summed they provide an indication of the entire gait pathology.

In related work,13 we described 13 different gait styles in children with CP (diplegia, hemiplegia, quadriplegia, monoplegia, dystonia, ataxia), based on the statistical analysis of quantitative kinematic data of 56 children with CP and 11 children with normal gait (table 1). The quantitative data used to derive these 13 gait styles were also used to define the boundaries between the 5 categories (2, 1, 0, −1, −2) at the hip, knee, and ankle.11 This was done so that users of the SF-GT would have the best possible chance of differentiating between the 13 gait styles. Once the tool was developed we sought to assess its intra- and interclinician repeatability prior to its clinical implementation.

| Gait Style | Comments (comparison of gait styles to normal gait) |
|---|---|
| 1: mild crouch | Increased hip and knee flexion, reduced plantarflexion |
| 2: mobile crouch | Further increased hip and knee flexion, reduced plantarflexion |
| 3: moderate crouch | Increased hip and knee flexion, reduced knee mobility, reduced plantarflexion |
| 4: severe crouch | Increased hip and knee flexion, reduced hip and knee mobility, no plantarflexion |
| 5: mild equinus | Increased hip and knee flexion, no dorsiflexion but increased plantarflexion in stance |
| 6: moderate equinus/knee extension | Increased hip mobility, knee hyperextension in stance, increased plantarflexion in stance |
| 7: moderate equinus/knee flexion | Reduced hip and knee mobility, increased plantarflexion in stance |
| 8: severe equinus | Increased hip flexion in swing, increased knee flexion at initial contact, increased plantarflexion throughout gait cycle |
| 9: stiff leg | Reduced hip and knee flexion, reduced plantarflexion at toe-off |
| 10: weak plantarflexion | Normal hip, increased knee flexion in stance, reduced plantarflexion at toe-off |
| 11: ankle double bump | Increased hip flexion in swing, increased knee flexion at initial contact, 2 dorsiflexion waves in stance |
| 12: near normal | Reduced hip and knee flexion, reduced plantarflexion at toe-off |
| 13: normal | Normal gait |
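
The scoring scheme outlined in the overview (6 categories per joint, 18 per limb, summed per joint and overall) can be sketched in a few lines of Python. This is only an illustration: the function names and the example category values are hypothetical, and the mapping from estimated degrees to categories is deliberately left out because the category boundaries are not given in this paper.

```python
from typing import Dict, List

# The 6 gait events and 5 categories named in the overview of the SF-GT.
GAIT_EVENTS = [
    "initial contact", "end double support", "mid stance",
    "start double support", "toe-off", "mid swing",
]
VALID_CATEGORIES = {-2, -1, 0, 1, 2}


def joint_score(categories: List[int]) -> int:
    """Sum the 6 event categories of one joint over the gait cycle."""
    if len(categories) != len(GAIT_EVENTS):
        raise ValueError("expected one category per gait event")
    if any(c not in VALID_CATEGORIES for c in categories):
        raise ValueError("categories must be in {-2, -1, 0, 1, 2}")
    return sum(categories)


def limb_scores(limb: Dict[str, List[int]]) -> Dict[str, int]:
    """Return the per-joint sums plus their total for one lower limb."""
    scores = {joint: joint_score(cats) for joint, cats in limb.items()}
    scores["total"] = sum(scores.values())
    return scores


# Hypothetical categories for one limb (illustrative values, not study data).
example_limb = {
    "hip": [1, 1, 0, -1, 0, 1],
    "knee": [2, 1, 1, 0, 0, 1],
    "ankle": [-1, -1, 0, 0, -2, -1],
}
print(limb_scores(example_limb))
```

Keeping the per-joint sums separate from the total mirrors the description of the tool's output as a numeric indicator for each of the 3 joints plus an overall indication.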
Methods

Gait Data

Sagittal plane video recordings of the gait of 13 children (11 children with hemiplegic, diplegic, and quadriplegic CP gait; 2 neurologically intact children) were selected from a gait database comprising 67 children (56 with CP gait, 11 with normal gait). Data had been recorded for prior research using a digital camcorder at 25 Hz. That research12 identified homogeneous gait styles of the 67 children based on statistical analysis of quantitative sagittal plane hip, knee, and ankle data. Thirteen different gait styles were identified and the 13 children chosen (mean age, 9y 6mo; range, 6–16y) represented 1 example of each of the 13 different gait styles. This ensured that the observers assessed a wide range of gait styles. Parents of the children had previously given consent for use of the video images for research purposes and the research was approved by both university and health service ethics committees.

Procedure

The observers used the SF-GT to assess the gait of the 13 children on 2 separate occasions 14 days apart. At the first assessment day, they were given a 2-hour update on gait maturation, normal gait kinematics, the characteristics of the different phases of gait, and methods for recording and assessing video recordings of gait. The rationale and history of the SF-GT were also explained,3, 11, 13 after which there was a demonstration of how to use the SF-GT to assess normal and CP gait (using 2 children with normal gait and 1 child with CP gait). Observers were each allocated a workstation with a DVD player and a television screen. They practiced using the SF-GT together for 30 minutes, during which time they familiarized themselves with its layout and the mechanics of pausing the recording at the appropriate events during gait; the observers also discussed their experiences, which stimulated discussion about the observed joint positions and their interpretations of what they had learned in the update lecture.
They informally “calibrated” their observations and interpretations against those of others. The observers then worked individually without discussion using the SF-GT to assess 1 gait cycle of 1 leg of each of the 13 children. They were permitted to work at their own speed, to review the gait cycles as often as required, and were under no time limit. The assessments were completed in 3 to 5 hours.

At assessment day 2, 14 days later, no further training or advice was given. The 17 observers who returned the second day assessed the same video recordings, which were presented to them in an order different from the first day (but, as at the first assessment, the order was the same for all observers). Although there was again no time limit, all assessments were completed within 2 hours 30 minutes.

Data Analysis

Percentage agreements were chosen to describe “exact” agreement between 2 sets of data (assessment 1, assessment 2) and to allow for direct comparison with other studies. The Cohen κ for intraobserver repeatability could not be computed because κ statistics require a symmetric 2-way table in which the first observation uses the same rating categories as the second observation.14 This was not always the case because some observers’ range of category scores (from +2 to −2) allocated at the first assessment did not match the range of category scores at the second assessment (eg, assessment 1 range of categories was 1–2, but assessment 2 range of categories was 0–2).

Interobserver repeatability for the assessments was evaluated by calculating the mean percentage agreement at each of the 6 gait events (table 3). For instance, if 95% of observers scored the hip with category 1 at initial contact, 55% with category 1 at end double support, 100% with category 0 at mid stance, 75% with category −1 at start double support, 100% with category 0 at toe-off, and 95% with category 1 at mid swing, the mean agreement for this hip would be 87%.
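
The interobserver calculation in the worked example above can be reproduced with a short sketch. It assumes that agreement at each gait event is the share of observers who chose the most common (modal) category, which is how the Results describe disagreements; the function names and the 20-observer example are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def modal_agreement(categories: Sequence[int]) -> float:
    """Percentage of observers whose category equals the most common one."""
    counts = Counter(categories)
    modal_count = counts.most_common(1)[0][1]
    return 100.0 * modal_count / len(categories)


def mean_agreement(event_percentages: List[float]) -> float:
    """Mean percentage agreement over the gait events of one joint."""
    return sum(event_percentages) / len(event_percentages)


# Hypothetical event: 19 of 20 observers pick category 1, one picks category 0.
print(modal_agreement([1] * 19 + [0]))

# The hip example from the text: agreement at each of the 6 gait events.
hip = [95, 55, 100, 75, 100, 95]
print(round(mean_agreement(hip)))
```

The second call averages 520/6 ≈ 86.7, which rounds to the 87% quoted in the text.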
Intraobserver repeatability for each of the 3 joints was evaluated by calculating the percentage agreement at the 6 gait events (table 4) for each observer. If 6 of the possible 6 category scores for a joint agreed at both assessments, then 100% of the categories agreed. If 5 of 6 scores for a joint were the same, then 83% of the categories agreed. If 4 of 6 scores were the same, then 67% of the categories agreed, and so on: 3/6=50%, 2/6=33%, 1/6=17%, and 0/6=0% agreement. The intraobserver repeatability for the exact number of degrees allocated to each joint at the 6 phases was also evaluated by counting on how many occasions the exact degree values (eg, 25° of hip flexion at initial contact) agreed at both assessments for each observer (see table 4). Intraobserver repeatability for each of the 13 gait styles was evaluated using the mean percentage agreement for the 3 joints from all observers (table 5). The statistical analysis was performed with SPSS.a

Results

Interobserver Repeatability

Between observers, an average of 77% (range, 67%–83%) of hip, knee, and ankle category scores at the 6 phases of gait agreed (see table 3). The observers agreed on 3875 (77%) of a possible 5004 category scores. Of the 1129 scores that disagreed, 98% (1111) differed by 1 category and 2% (18) differed by 2 categories from the mode category score. The knee joint was assessed with greatest repeatability across all observers (mean agreement, 81%), followed by the hip (77%) and the ankle (75%). Across all assessments the highest interobserver agreement was achieved for the hip in gait style 12 (near normal gait) (91%) and the lowest was for the ankle in gait style 4 (severe crouch gait) (56%). Gait style 13 (normal gait) was assessed with the highest overall interobserver repeatability (mean agreement of hip, knee, and ankle, 83%), followed by styles 1 (mild crouch) (82%) and 12 (near normal gait) (82%).
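
As a hedged illustration, the intraobserver percentages defined in the Data Analysis (6/6=100% down to 0/6=0%) and the pooled interobserver figures reported above can be recomputed as follows. The function names and the two example category lists are hypothetical; only the counts 3875, 5004, 1129, 1111, and 18 come from the text.

```python
from typing import Sequence


def joint_agreement(first: Sequence[int], second: Sequence[int]) -> float:
    """Percentage of gait events with the same category at both assessments."""
    if len(first) != len(second):
        raise ValueError("both assessments must cover the same gait events")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to a whole percentage."""
    return round(100.0 * part / whole)


# Possible outcomes for one joint scored at 6 gait events.
print([percent(k, 6) for k in range(6, -1, -1)])

# Hypothetical observer: 5 of the 6 categories repeat at the second assessment.
day1 = [1, 1, 0, -1, 0, 1]
day2 = [1, 0, 0, -1, 0, 1]
print(round(joint_agreement(day1, day2)))

# Pooled interobserver counts reported in the Results.
print(percent(3875, 5004), percent(1111, 1129), percent(18, 1129))
```

Working from counts rather than from already rounded percentages avoids compounding rounding error when agreement is later averaged over joints or gait styles.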
Gait style 4 (severe crouch) was assessed with the lowest interobserver repeatability, with 67% mean agreement (see table 3).

Intraobserver Repeatability

Within observers, an average of 75% (range, 66%–87%) of category scores agreed between assessments 1 and 2 (see table 4). Of a possible 3728 category scores, 2792 were identical in both assessments. Of the 936 scores that did not agree between days, 97% (911) differed by only 1 category, 2.4% (22) differed by 2 categories, and 0.3% (3) differed by 3 categories from the mode category score. Within observers, agreement was 72% for the hip, 78% for the knee, and 73% for the ankle.

Exact agreement in the estimated position of the hip, knee, and ankle (recorded in degrees) occurred on average on 30% of occasions (see table 4). This means that almost one third of all degrees estimated at the 2 assessments were identical. Observer 13, however, accounted for nearly half of all exact matches between both assessments (47% of exact matches), and there was marked variation between observers, with observer 17 demonstrating only 18% agreement between assessments. Observers 6 and 7 did not provide degree data (see table 4). Style 13 (normal gait) was assessed with the highest intraobserver repeatability (mean agreement, 87%) while style 4 (severe crouch) was assessed with the lowest intraobserver repeatability, with 62% mean agreement (see table 5).

Discussion

We found inter- and intraobserver repeatability to be good, and better than with other observational gait assessment tools.5, 6, 8, 9, 10 The interobserver repeatability was modest for the PRS (κ range, .46–.67) and good for the EVGS (agreement, 70%). The SF-GT’s level of repeatability (interobserver agreement, 77%; intraobserver agreement, 75%) was higher. The SF-GT intraobserver repeatability of the exact position (in degrees) of the joints was better than expected, with nearly one third of all joint positions exactly the same in both assessments.
This aspect of assessment repeatability has not been reported previously.

The results presented here are particularly pleasing because previous studies involved fewer observers, typically between 2 and 5, and the gait assessment tools used have fewer scoring categories with which to describe joint or gait pathology. The larger number of possible scores for each joint and the segmenting of gait into 6 different events mean that the SF-GT offers a far greater number of choices than do other tools. We assume this leads to an increased potential for variation between observers and between repeated assessments by the same observer.

The knee joint was assessed with the highest level of agreement between and within observers (between observers, 81%; within observers, 78%). This result agrees with other research9 that also found the knee joint to be the most repeatable joint to assess visually in the sagittal plane. Observers’ estimations may be facilitated by the clear visibility of the large knee joint, with the femur and the tibia forming 2 natural “goniometric levers.” In contrast, the hip is a less visible joint because soft tissue often obscures the exact pelvis position. Some observers appeared to use the curvature of the spine as a secondary indicator for the pelvis, but this is likely to be a poor indicator of pelvis position. The ankle joint displays the least range of motion between dorsiflexion and plantarflexion (≈30°) and therefore, when coupled with a smaller goniometric lever of the foot, may require more precision during visual estimation.

Normal and near-normal gait styles (normal, mild crouch, near normal) were assessed with a higher level of agreement than severe gait abnormalities (severe crouch). This demonstrates that observers had knowledge of and were able to assess normal gait, a prerequisite for assessing abnormal gait.
This contrasts with the findings of Eastlack et al2 that their observers were unfamiliar with normative values of gait. When severe gait abnormalities were present, however, observers demonstrated the least repeatability, possibly because of the complexity of abnormal gait that manifests itself in 3 planes. The majority of the percentage agreements below 70% occurred in the hip and the ankle of abnormal gait styles (see tables 3, 4, and 5). Observations between 2 successive assessments varied between observers, indicating that intraobserver variability is itself variable.

We identified several factors that can affect observer repeatability during the experiment. We used a range of different DVD players, television screens, and liquid crystal display equipment to play back and display the images. We thought this would reflect the reality of using the tool in a clinical setting. Some brands of DVD player, however, appeared to offer more frames than others, and their sensitivity to manual operation differed. Observers sometimes found it difficult to find the precise frame they required. We used both flat screen and curved screen television sets and found that joint angles were more difficult to assess on a curved screen. The precise image frame at which a specific gait event (eg, initial contact) occurred was interpreted differently between observers. There were various levels of fatigue, stamina, and patience between observers. They made variable allowances for errors in the 2-dimensional image of the leg, particularly in cases where there was medial (internal) rotation at the hip that distorted the users’ perspective of sagittal plane joint motion. Some observers attempted to adjust their estimation of joint angles according to the perceived amount of rotation. They made clinical judgments based on the gait problems they identified and those judgments may have affected their objectivity in estimating joint angles.
For example, when a child was toe walking the clinical judgment would be that the Achilles tendon was likely to be shortened and the ankle would consequently be plantar flexed, while in fact the “toe walking” was caused by increased hip and knee flexion with the ankle being near its 90° (neutral) position. Errors (including errors in basic arithmetic) were made in the process of assigning grades to the joints. None of these factors is specific to the SF-GT, but they have not been mentioned previously.

Our results are also pleasing because they were obtained in a scenario that might not be conducive to attaining the highest possible levels of repeatability. The research design was pragmatic and sought to reflect some realities of undertaking gait assessment in a clinical setting. Participants were not experts in gait assessment and none were routinely involved in quantitative gait analysis. As noted, there were several practical issues raised during the experiment about the use of equipment and interpretation of images. Experience in the training sessions suggested that knowledge of normal gait values and terminology was limited among some observers. Other studies have typically evaluated assessment tools using “experts,”6, 8, 9, 10 which we assume leads to higher intra- and interobserver repeatability.

Observer repeatability could be improved by using markers on bony landmarks to indicate joint centers and by using flat-screen televisions or personal computers. Training with peer support would seem essential in ensuring appropriate implementation of an observation-based clinical gait assessment tool. Using goniometers to measure joint angles on the display screen was suggested as a future possibility by our observers to enhance validity and repeatability of observations. Considering the problems associated with poor repeatability of using goniometers,15, 16 however, a strict protocol would be required.
Also, users may begin to rely too heavily on the goniometers when, for example, errors in the recording process, or medial hip rotation, distort the 2-dimensional image on the screen. Subjectivity would still remain. Alternatively, technology could be developed that would automatically extract segment angles from video images and remove the subjective “visual” element from gait assessment.

Conclusions

The SF-GT demonstrated good inter- and intrauser repeatability, comparable with, and in some cases better than, reports of other clinically orientated observation-based gait assessment tools. The evaluation presented here was generally more pragmatic than evaluations of other tools, involving a more detailed analysis, more observers, and the use of nonexperts to test the tool. Despite good results, problems associated with the subjective nature of observational gait assessment and variations between clinicians remain. The use of the SF-GT to assess sagittal plane motion in CP gait would be complementary to an examination of gait in the frontal and transverse planes, as well as a full clinical examination of joint mobility, muscle power, and tone. There are some practical issues that could be addressed, but training and experience are key aspects that have not been fully explored. In future work we will evaluate whether the SF-GT has the sensitivity to detect treatment effects and changes in gait over time.

Supplier

References

1. Krebs DE, Edelstein JE, Fishman S. Reliability of observational kinematic gait analysis. Phys Ther. 1985;65:1027–1033.
2. Eastlack ME, Arvidson J, Snyder-Mackler L, Danoff JV, McGarvey CL. Interrater reliability of videotaped observational gait-analysis assessments. Phys Ther. 1991;71:465–472.
3. Toro B, Nester CJ, Farren PC. The status of gait assessment among physical therapists in the United Kingdom. Arch Phys Med Rehabil. 2003;84:878–884.
4. Skaggs DL, Rethlefsen SA, Kay RM, Dennis SW, Reynolds RA, Tolo VT.
Variability in gait analysis interpretation. J Pediatr Orthop. 2000;20:759–764.
5. Koman LA, Mooney JF, Smith BP, Goodman A, Mulvaney T. Management of spasticity in cerebral palsy with botulinum-A toxin: report of a preliminary, randomized, double-blind trial. J Pediatr Orthop. 1994;14:299–303.
6. Pirpiris M, Ugoni A, Starr R, et al. The ‘physician rating scale’: validity and reliability. Gait Posture. 2001;13:293.
7. Corry IS. The ‘Koman scale’. Belfast: Queen’s Univ Belfast; 1995.
8. Mackey AH, Lobb GL, Walt SE, Stott NS. Reliability and validity of the Observational Gait Scale in children with spastic diplegia. Dev Med Child Neurol. 2003;45:4–11.
9. Dickens WE, Smith MF. Validation of a visual gait assessment scale for children with hemiplegic cerebral palsy. Gait Posture. 2006;23:78–82.
10. Read HS, Hazlewood ME, Hillman SJ, Prescott RJ, Robb JE. Edinburgh visual gait score for use in cerebral palsy. J Pediatr Orthop. 2003;23:296–301.
11. Toro B, Nester CJ, Farren PC. A review of observational gait assessment in clinical practice. Physiother Theory Pract. 2003;19:137–149.
12. Toro B, Nester CJ, Farren PC. The development and validity of the Salford Gait Tool: an observation-based clinical gait assessment tool. Arch Phys Med Rehabil. 2007;88:321–327.
13. Toro B, Nester CJ, Farren PC. Cluster analysis for the extraction of sagittal gait patterns in children with cerebral palsy. Gait Posture. 2007;25:157–166.
14. Cohen L, Holliday M. Statistics for social scientists: an introductory text with computer programs in basic. London: Harper & Row; 1982.
15. Rome K, Cowieson F. A reliability study of the universal goniometer, fluid goniometer and electrogoniometer for the measurement of ankle dorsiflexion. Foot Ankle Int. 1996;17:28–32.
16. McDowell BC, Hewitt V, Nurse A, Weston T, Dusoir T, Baker R. The variability of goniometric measurements in ambulatory children with spastic cerebral palsy. Gait Posture. 2000;12:114–121.

a Directorate of Physiotherapy, University of Salford, Salford, England b Centre for Rehabilitation and Human Performance Research, University of Salford, Salford, England. Reprint requests to Christopher J. Nester, PhD, Centre for Rehabilitation and Human Performance Research, University of Salford, Salford, M6 6PU, England
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated.

PII: S0003-9993(06)01586-3 doi:10.1016/j.apmr.2006.12.030

© 2007 American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.