Original research | Volume 100, Issue 10, P1907-1915, October 2019

Artificial Neural Network Learns Clinical Assessment of Spasticity in Modified Ashworth Scale

Open Access | Published: April 19, 2019 | DOI: https://doi.org/10.1016/j.apmr.2019.03.016

      Highlights

      • Artificial intelligence was trained to mimic spasticity assessment by humans.
      • The artificial intelligence achieved satisfactory agreement with multiple raters.
      • The contribution of 9 characteristics of spasticity to modified Ashworth scale (MAS) rating was quantified.
      • Each characteristic contributed differently to the decision of different MAS grades.
      • Application of artificial intelligence for clinical assessment was discussed.

      Abstract

      Objective

      To propose an artificial intelligence (AI)-based decision-making rule in modified Ashworth scale (MAS) that draws maximum agreement from multiple human raters and to analyze how various biomechanical parameters affect scores in MAS.

      Design

      Prospective observational study.

      Setting

      Two university hospitals.

      Participants

      Hemiplegic adults with elbow flexor spasticity due to acquired brain injury (N=34).

      Intervention

      Not applicable.

      Main Outcome Measures

      Twenty-eight rehabilitation doctors and occupational therapists examined MAS of elbow flexors in 34 subjects with hemiplegia due to acquired brain injury while the MAS score and biomechanical data (ie, joint motion and resistance) were collected. Nine biomechanical parameters that quantify spastic response described by the joint motion and resistance were calculated. An AI algorithm (or artificial neural network) was trained to predict the MAS score from the parameters. Afterwards, the contribution of each parameter for determining MAS scores was analyzed.

      Results

      The trained AI agreed with the human raters for the majority (82.2%, Cohen’s kappa=0.743) of data. The MAS scores chosen by the AI and human raters showed a strong correlation (correlation coefficient=0.825). Each biomechanical parameter contributed differently to the different MAS scores. Overall, angle of catch, maximum stretching speed, and maximum resistance were the most relevant parameters that affected the AI decision.

      Conclusions

      AI can successfully learn clinical assessment of spasticity with good agreement with multiple human raters. In addition, we could analyze which factors of spastic response are considered important by the human raters in assessing spasticity by observing how AI learns the expert decision. It should be noted that few data were collected for MAS3; the results and analysis related to MAS3 therefore have limited supporting evidence.

      List of abbreviations:

      AI (artificial intelligence), MAS (modified Ashworth scale), MLP (multilayer perceptron), PROM (passive range of motion)
      Spasticity is a common sequela after acquired brain injuries, which influences activities of daily living. It is defined as involuntary muscle activation due to abnormal sensory-motor control resulting from an upper motor neuron lesion.
      • Burridge J.H.
      • Wood D.E.
      • Hermens H.J.
      • et al.
      Theoretical and methodological considerations in the measurement of spasticity.
      One particular characteristic of spastic muscle is sudden resistance (named catch) and/or an increase of tone when the muscle is subjected to passive stretching. Reliable assessment of spasticity is essential to determine the proper treatment and evaluate its efficacy. However, the modified Ashworth scale (MAS)
      • Bohannon R.W.
      • Smith M.B.
      Interrater reliability of a modified Ashworth scale of muscle spasticity.
      is the most popular tool in clinical practice
      • van Wijck F.M.
      • Pandyan A.D.
      • Johnson G.R.
      • Barnes M.P.
      Assessing motor deficits in neurological rehabilitation: patterns of instrument usage.
      mainly due to its simplicity despite reported poor interrater reliability.
      • Blackburn M.
      • van Vliet P.
      • Mockett S.P.
      Reliability of measurements obtained with the modified Ashworth scale in the lower extremities of people with stroke.
      • Mehrholz J.
      • Wagner K.
      • Meissner D.
      • et al.
      Reliability of the Modified Tardieu Scale and the Modified Ashworth Scale in adult patients with severe brain injury: a comparison study.
      • Fleuren J.F.
      • Voerman G.E.
      • Erren-Wolters C.V.
      • et al.
      Stop using the Ashworth Scale for the assessment of spasticity.
      There are 2 approaches to improving the poor interrater reliability. First, quantitative and objective measurement can replace or augment the current subjective instrument. Second, without changing the simple protocol, interrater reliability can still be enhanced by standardized training based on quantitative and statistical analysis. This paper presents artificial intelligence (AI)-based analysis for both approaches.
      Quantitative measures of spasticity have been developed based on different approaches. The first approach quantified onset, latency, and magnitude of neural reflex using electromyography.
      • Pisano F.
      • Miscio G.
      • Del Conte C.
      • Pianca D.
      • Candeloro E.
      • Colombo R.
      Quantitative measures of spasticity in post-stroke patients.
      • Calota A.
      • Feldman A.G.
      • Levin M.F.
      Spasticity measurement based on tonic stretch reflex threshold in stroke using a portable device.
      • Blanchette A.K.
      • Mullick A.A.
      • Moin-Darbari K.
      • Levin M.F.
      Tonic stretch reflex threshold as a measure of ankle plantar-flexor spasticity after stroke.
      • Powers R.K.
      • Mardermeyer J.
      • Rymer W.Z.
      Quantitative relations between hypertonia and stretch reflex threshold in spastic hemiparesis.
      • Lynn B.O.
      • Erwin A.
      • Guy M.
      • et al.
      Comprehensive quantification of the spastic catch in children with cerebral palsy.
      Another approach examined joint properties such as stiffness under controlled kinematic conditions including isokinetic stretching or random perturbation.
      • Pisano F.
      • Miscio G.
      • Del Conte C.
      • Pianca D.
      • Candeloro E.
      • Colombo R.
      Quantitative measures of spasticity in post-stroke patients.
      • Alibiglou L.
      • Rymer W.Z.
      • Harvey R.L.
      • Mirbagheri M.M.
      The relation between Ashworth scores and neuromechanical measurements of spasticity following stroke.
      • Chung S.G.
      • Van Rey E.
      • Bai Z.
      • Roth E.J.
      • Zhang L.Q.
      Biomechanic changes in passive properties of hemiplegic ankles with spastic hypertonia.
      The last approach quantified motion and resistance of the joint during manual assessment.
      • Lynn B.O.
      • Erwin A.
      • Guy M.
      • et al.
      Comprehensive quantification of the spastic catch in children with cerebral palsy.
      • Park H.S.
      • Kim J.
      • Damiano D.L.
      Development of a Haptic Elbow Spasticity Simulator (HESS) for improving accuracy and reliability of clinical assessment of spasticity.
      • Pandyan A.D.
      • Price C.I.
      • Barnes M.P.
      • Johnson G.R.
      A biomechanical investigation into the validity of the modified Ashworth Scale as a measure of elbow spasticity.
      Although various quantitative measures have been proposed, clinical practice is still based on MAS, presumably because the relationship between the suggested measures and MAS ratings is unclear. For a smooth and systematic transition from the old to the new, a user-friendly introduction of new quantitative measures should provide quantitative and statistical analysis explaining how the previously used MAS rating system can be interpreted in the suggested system.
      Beyond quantification of spasticity, there have been a few attempts to mimic clinical assessment based on statistical classification methods.
      • Seth N.
      • Johnson D.
      • Taylor G.W.
      • Allen O.B.
      • Abdullah H.A.
      Robotic pilot study for analysing spasticity: clinical data versus healthy controls.
      • Zupan B.
      • Stokic D.S.
      • Bohanec M.
      • Priebe M.M.
      • Sherwood A.M.
      Relating clinical and neurophysiological assessment of spasticity by machine learning.
      They performed simple classification (eg, severe vs nonsevere spasticity or subjects with spasticity vs healthy controls) based on quantitative parameters from the literature. Compared to these approaches, an artificial neural network, an AI algorithm, can learn more complex tasks such as prediction of the MAS score. It also allows quantitative analysis of how the AI algorithm learns the human decision-making process. This would be useful for designing standardized training if the AI learns what the majority of experts do.
      This paper proposes an AI-based approach to understand how a majority of human experts decide MAS scores in common by evaluating contributions of various characteristic features quantified by biomechanical parameters. The AI-based analysis can be used either to describe the standardized feel of each MAS rating for consistent training of raters, which would eventually improve interrater reliability of MAS itself, or to support a systematic transition from the MAS to new quantitative instruments. Specifically, we collected joint motion, resistance, and MAS scores simultaneously from 848 trials conducted by multiple clinicians. Nine biomechanical parameters quantified the joint motion and resistance. An AI algorithm was then trained to predict the MAS score from the parameters. We quantitatively analyzed which parameters are considered important.

      Methods

      Participants

      Hemiplegic adults with elbow flexor spasticity due to acquired brain injuries were recruited from 2 hospitals (Presbyterian Medical Center, Jeonju, Korea, and Chung-Ang University Hospital, Seoul, Korea). Inclusion criteria were as follows: (1) acquired brain injuries confirmed by computed tomography or magnetic resonance imaging; (2) more than 2 weeks after onset; (3) cognitive ability confirmed by Mini-Mental State Examination (score>18, Chung-Ang University Hospital) or by testing a 3-step command (Presbyterian Medical Center). Exclusion criteria were as follows: (1) no passive movement (ie, MAS4); (2) other neurological or orthopedic conditions that could affect testing the limb. The experimental protocol was approved by the institutional review boards of the hospitals and every subject gave written consent. A total of 34 subjects (table 1) and a total of 28 rehabilitation doctors and occupational therapists (with an average career of 7.6y) participated.
      Table 1. Detailed subject information
      Side = affected or more-affected side (examined side). Columns 0, 1, 1+, 2, and 3 give the MAS score rated by human rater (%).

      Subject  Age (y)  Sex     Side         Type of injury          0    1    1+   2    3
      S01      45       Male    Right        Stroke                  25   57   18   0    0
      S02      59       Male    Right        Stroke                  0    43   57   0    0
      S03      56       Male    Left         Stroke                  0    22   76   2    0
      S04      66       Male    Right        Stroke                  0    0    56   44   0
      S05      50       Male    Left         Stroke                  2    65   33   0    0
      S06      44       Male    Left         Stroke                  0    11   78   11   0
      S07      56       Male    Left         Stroke                  0    60   40   0    0
      S08      36       Male    Right        Stroke                  0    0    60   40   0
      S09      50       Male    Left         Stroke                  10   78   12   0    0
      S10      54       Female  Light [sic]  Stroke                  4    24   72   0    0
      S11      33       Female  Left         Stroke                  0    50   50   0    0
      S12      51       Male    Left         Stroke                  0    0    60   40   0
      S13      65       Male    Left         Stroke                  0    50   36   14   0
      S14      54       Male    Right        Stroke                  0    33   33   33   0
      S15      61       Female  Left         Stroke                  80   20   0    0    0
      S16      50       Male    Right        Stroke                  0    0    31   69   0
      S17      75       Male    Left         Stroke                  73   27   0    0    0
      S18      71       Female  Right        Stroke                  0    100  0    0    0
      S19      41       Female  Right        Traumatic brain injury  7    93   0    0    0
      S20      80       Female  Right        Stroke                  100  0    0    0    0
      S21      59       Male    Right        Stroke                  0    40   10   50   0
      S22      41       Male    Left         Stroke                  100  0    0    0    0
      S23      88       Male    Left         Stroke                  33   67   0    0    0
      S24      66       Male    Right        Brain tumor             100  0    0    0    0
      S25      75       Female  Right        Stroke                  47   53   0    0    0
      S26      67       Female  Right        Stroke                  40   60   0    0    0
      S27      61       Male    Right        Traumatic brain injury  33   67   0    0    0
      S28      50       Male    Right        Stroke                  13   87   0    0    0
      S29      52       Male    Left         Stroke                  0    93   7    0    0
      S30      54       Male    Left         Stroke                  13   87   0    0    0
      S31      76       Male    Left         Stroke                  0    33   20   47   0
      S32      68       Male    Left         Stroke                  0    13   20   67   0
      S33      82       Female  Left         Stroke                  0    0    0    0    100
      S34      47       Male    Right        Stroke                  0    0    0    0    100

      Data collection

      The manual spasticity evaluator
      • Park H.S.
      • Kim J.
      • Damiano D.L.
      Development of a Haptic Elbow Spasticity Simulator (HESS) for improving accuracy and reliability of clinical assessment of spasticity.
      was used to collect joint angle and resistance (fig 1). The forearm brace including the hand support was carefully designed to minimize slippage. At the same time, the device was designed to be light to minimize distortion of force perception. After the device was put on and fastened with straps, the joint was maximally flexed vertically in the initial posture (shoulder flexion of 40 degrees, abduction of 20 degrees, and sitting on a chair), which was selected by referring to previous studies.
      • Park H.S.
      • Kim J.
      • Damiano D.L.
      Development of a Haptic Elbow Spasticity Simulator (HESS) for improving accuracy and reliability of clinical assessment of spasticity.
      • Gregson J.M.
      • Leathley M.J.
      • Moore A.P.
      • Smith T.L.
      • Sharma A.K.
      • Watkins C.L.
      Reliability of measurements of muscle tone and muscle power in stroke patients.
      While extending the joint maximally, the raters held the handle of the device so that the resistive force (felt by the raters) was measured by a force sensor^a beneath the handle. The force reading was converted to torque about the elbow joint afterwards. The joint angle was measured by an angle sensor.^b Each rater assessed spasticity for 5 trials and recorded an MAS score for each trial. Between trials, a rest of random duration between 2 and 5 seconds was given to prevent the subject from predicting the stretching. At least 3 raters (4.9 raters on average) assessed each subject. A rest of 5 minutes was given between raters. Prior to participation, raters were trained and became accustomed to the experimental protocol.
      Fig 1. Spasticity assessment using the manual spasticity evaluator. The rater extends the subject’s elbow joint while holding the handle of the device. The angle sensor aligned to the subject’s elbow joint measures the angle of the joint and the force sensor beneath the handle (area surrounded by a dotted line) measures the force applied to the subject. The device is fastened to the subject’s upper arm, forearm, and hand using fabric straps.

      Quantification of biomechanical factors

      Joint motion and resistance were quantified by 9 biomechanical parameters using custom MATLAB^c code. Eight of them were adopted from the literature. The 3 joint motion parameters are passive range of motion (PROM), the difference between the minimum and maximum extension angles
      • Kumar R.T.
      • Pandyan A.D.
      • Sharma A.K.
      Biomechanical measurement of post-stroke spasticity.
      ; maximum stretching speed
      • Pandyan A.D.
      • Price C.I.
      • Barnes M.P.
      • Johnson G.R.
      A biomechanical investigation into the validity of the modified Ashworth Scale as a measure of elbow spasticity.
      ; and angle of catch based on the maximum deceleration, the angular displacement from the initial angle to the joint angle when the largest deceleration occurs, which was normalized by PROM
      • Lynn B.O.
      • Erwin A.
      • Guy M.
      • et al.
      Comprehensive quantification of the spastic catch in children with cerebral palsy.
      • van den Noort J.C.
      • Scholtes V.A.
      • Becher J.G.
      • Harlaar J.
      Evaluation of the catch in spasticity assessment in children with cerebral palsy.
      (fig 2A-C). Three resistance parameters are the maximum resistance
      • Kumar R.T.
      • Pandyan A.D.
      • Sharma A.K.
      Biomechanical measurement of post-stroke spasticity.
      ; magnitude of catch
      • Park H.S.
      • Kim J.
      • Damiano D.L.
      Development of a Haptic Elbow Spasticity Simulator (HESS) for improving accuracy and reliability of clinical assessment of spasticity.
      , defined as the peak resistance torque due to catch divided by the maximum stretching speed (fig 2E); and the slope of linear regression of the resistance with respect to the joint angle
      • Pandyan A.D.
      • Price C.I.
      • Barnes M.P.
      • Johnson G.R.
      A biomechanical investigation into the validity of the modified Ashworth Scale as a measure of elbow spasticity.
      (fig 2A). Another 2 parameters related to mechanical power (ie, the product of resistance and angular velocity) are angle of catch based on the local minimum of the mechanical power,
      • Lynn B.O.
      • Erwin A.
      • Guy M.
      • et al.
      Comprehensive quantification of the spastic catch in children with cerebral palsy.
      which was calculated in a similar manner to angle of catch based on the maximum deceleration; and the local minimum value of the power
      • Lynn B.O.
      • Erwin A.
      • Guy M.
      • et al.
      Comprehensive quantification of the spastic catch in children with cerebral palsy.
      (fig 2D).
      Fig 2. Nine biomechanical parameters to quantify the spastic response. (A) PROM and resistance-related parameters (slope of linear regression of the resistance with respect to the joint angle, magnitude of catch, resistance after catch, and maximum resistance during the passive stretching); (B) speed-related parameter, maximum stretching speed; (C) acceleration-related parameter, angle of catch based on the maximum deceleration; (D) mechanical power-related parameters, angle of catch based on the local minimum of the mechanical power and local minimum of mechanical power; (E) detailed explanation of the calculation of resistance after catch and magnitude of catch.
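      As a rough illustration of how several of these parameters could be computed from sampled joint angle and torque, the Python sketch below derives PROM, maximum stretching speed, the deceleration-based angle of catch, maximum resistance, and the regression slope. It is not the authors' MATLAB code: the function name `biomech_params`, the finite-difference derivatives, the sign convention (angle increasing with extension), and the absence of any filtering are assumptions made for the example, and the catch-magnitude and mechanical-power parameters are omitted.

```python
def biomech_params(t, angle, torque):
    """Compute five illustrative parameters from one passive stretch.

    t: time stamps (s); angle: elbow extension angle (deg), assumed to increase
    as the joint is extended; torque: resistance about the elbow (N*m).
    """
    n = len(t)
    # Finite-difference angular velocity (deg/s) and acceleration (deg/s^2).
    vel = [(angle[i + 1] - angle[i]) / (t[i + 1] - t[i]) for i in range(n - 1)]
    acc = [(vel[i + 1] - vel[i]) / (t[i + 1] - t[i]) for i in range(n - 2)]

    prom = max(angle) - min(angle)

    # Largest deceleration = most negative acceleration; acc[i] sits at sample i + 1.
    i_catch = min(range(len(acc)), key=lambda i: acc[i]) + 1
    # Displacement from the initial angle to the catch angle, normalized by PROM.
    angle_of_catch = (angle[i_catch] - angle[0]) / prom

    # Ordinary least-squares slope of torque against angle.
    mean_a = sum(angle) / n
    mean_q = sum(torque) / n
    sxx = sum((a - mean_a) ** 2 for a in angle)
    sxy = sum((a - mean_a) * (q - mean_q) for a, q in zip(angle, torque))

    return {
        "prom": prom,
        "max_speed": max(vel),
        "angle_of_catch": angle_of_catch,
        "max_resistance": max(torque),
        "slope": sxy / sxx,
    }


# Synthetic trial (made-up numbers): fast stretch that slows abruptly at 20 deg.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
angle = [0.0, 10.0, 20.0, 22.0, 24.0]
torque = [0.1 * a for a in angle]
example = biomech_params(t, angle, torque)
```

      Real recordings would normally be low-pass filtered before differentiation, because finite differences amplify sensor noise.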
      The ninth parameter, change of resistance after catch, was proposed since it is an apparent factor of MAS.
      • Bohannon R.W.
      • Smith M.B.
      Interrater reliability of a modified Ashworth scale of muscle spasticity.
      For example, it is described as catch and release (MAS1), catch followed by minimal resistance (MAS1+), or increase of muscle tone (MAS2, 3). To express the different shapes of resistance after the catch as a numeric value, we considered the amount of torque variation as well as the amount of angular variation, since the raters would perceive the torque variation as more dominant if the accompanying angular variation is larger. Figure 2E describes how we calculated resistance after catch. All parameters were transformed to have zero mean and unit variance to prevent the AI algorithm from being biased toward parameters with larger scales.
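      The zero-mean, unit-variance transformation mentioned above can be sketched as follows. The helper name `standardize`, the parameter names, and the use of the population standard deviation are assumptions for illustration; the text does not state which variance estimator was used.

```python
import statistics


def standardize(columns):
    """Z-score each parameter column: subtract its mean, divide by its SD.

    columns: dict mapping a parameter name to its values over all trials.
    Uses the population SD (pstdev); this choice is an assumption.
    """
    scaled = {}
    for name, values in columns.items():
        mu = statistics.fmean(values)
        sigma = statistics.pstdev(values)
        scaled[name] = [(v - mu) / sigma for v in values]
    return scaled


# Made-up values: after scaling, both parameters share the same scale.
raw = {"prom_deg": [90.0, 100.0, 110.0], "max_torque_nm": [1.0, 2.0, 3.0]}
z = standardize(raw)
```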

      AI predicting MAS score

      A multilayer perceptron (MLP
      • Gardner M.W.
      • Dorling S.
      Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences.
      ) predicted the MAS score from the 9 biomechanical parameters. It resembles multiple layers of biological neurons. We developed a perceptron which has 9 neurons in each of the first 2 layers and 5 output neurons representing MAS0, 1, 1+, 2, and 3 in the last layer. Each neuron linearly combined the outputs of the neurons in the previous layer (or the biomechanical parameters) with its own weights and bias. The result of the linear combination was then passed through a saturating activation function. Hyperbolic tangent functions were used for the first and second layers, while a softmax function was used for the last layer so that the output neurons had values ranging from 0 to 1, the sum of which was 1. Throughout the training, weights and biases were tuned to make the output neuron of the desired MAS score produce the maximum output value, since the MAS score corresponding to the maximum output is selected as the predicted score for the given sample. The weights and biases were modified after the prediction of each sample based on the gradient descent method (learning rate of 10^-5). The training was repeated for 30,000 epochs, enough for the training result to converge. Note that all samples were used for the training. The perceptron we used was the simplest one that achieved agreement with the human raters for more than 80% of trials, which is equal to or higher than that of previous studies.
      • Seth N.
      • Johnson D.
      • Taylor G.W.
      • Allen O.B.
      • Abdullah H.A.
      Robotic pilot study for analysing spasticity: clinical data versus healthy controls.
      • Zupan B.
      • Stokic D.S.
      • Bohanec M.
      • Priebe M.M.
      • Sherwood A.M.
      Relating clinical and neurophysiological assessment of spasticity by machine learning.
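      For readers who want to see the described architecture concretely, here is a minimal forward-pass sketch in plain Python: 9 inputs, two hidden layers of 9 tanh neurons, and 5 output neurons whose values lie between 0 and 1 and sum to 1 (a softmax). The weights are random and untrained, and the helper names (`dense`, `softmax`, `forward`, `init_params`) and the initialization range are assumptions, not the authors' implementation.

```python
import math
import random

LABELS = ["0", "1", "1+", "2", "3"]  # the 5 MAS scores


def dense(x, W, b):
    """Linear combination of inputs x with weights W[j][i] and biases b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]


def softmax(z):
    """Map scores to values in (0, 1) that sum to 1 (shifted for stability)."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def forward(x, params):
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = [math.tanh(v) for v in dense(x, W1, b1)]
    h2 = [math.tanh(v) for v in dense(h1, W2, b2)]
    return softmax(dense(h2, W3, b3))


def init_params(sizes, rng):
    """Random untrained weights; the range [-0.5, 0.5] is an arbitrary choice."""
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


net = init_params([9, 9, 9, 5], random.Random(0))
x = [0.5, -1.0, 0.0, 1.2, -0.3, 0.8, 0.0, -0.6, 0.2]  # made-up standardized inputs
probs = forward(x, net)
predicted = LABELS[max(range(len(probs)), key=probs.__getitem__)]
```

      Training would then adjust the weights and biases by gradient descent so that the output neuron of the rated MAS score takes the largest value.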
      In addition to the percentage agreement, Cohen’s kappa and Spearman rank correlation were used to evaluate the performance of the AI algorithm. Because a single training run might not represent the average performance, 30 copies of the AI algorithm were trained separately and their results were averaged.
      The contribution of the 9 parameters to the MAS scoring was traced backward using layerwise relevance propagation.
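      Percentage agreement and Cohen's kappa can be computed from two lists of scores as in the sketch below. It assumes the unweighted form of kappa, which the text does not specify; the function names and the toy ratings are illustrative only.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of trials on which the two raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two raters' marginal score frequencies.
    p_chance = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


# Toy ratings (not study data).
human = ["0", "1", "1", "2"]
ai = ["0", "1", "2", "2"]
agreement = percent_agreement(human, ai)
kappa = cohens_kappa(human, ai)
```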
      • Bach S.
      • Binder A.
      • Montavon G.
      • Klauschen F.
      • Muller K.R.
      • Samek W.
      On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation.
      The relevance of each neuron and the bias in 1 layer was calculated by distributing the relevance value of each neuron in the next layer according to their contribution to the linear combination for the neuron of the next layer (ie, the product of the output of each neuron and the corresponding weight, divided by the linearly combined value). To evaluate the contribution to the predicted MAS score only, the value of the output neuron corresponding to the predicted score was distributed to the previous layers. Positive relevance means that the parameter strengthens the prediction, whereas negative relevance means that it weakens the prediction. The relevance value of each parameter was averaged over trials classified as the same score. A one-sample t test was used to test whether each parameter has nonzero average relevance for each score. We defined the overall relevance of each parameter as the sum of the average relevance values over the 5 MAS scores. Based on the overall relevance, the most and least relevant parameters were identified. The AI algorithm was developed and analyzed using custom MATLAB^c code.
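      The redistribution rule described above can be sketched for a single layer as follows. Each input neuron i receives the share a_i * w_ji / z_j of the relevance of output neuron j, and the bias receives b_j / z_j, where z_j is the linearly combined value. The function name `lrp_layer` is hypothetical, and no stabilizing term is added, so the sketch assumes every z_j is nonzero.

```python
def lrp_layer(a, W, b, R_out):
    """Redistribute the relevance of a layer's outputs to its inputs and bias.

    a: outputs of the previous layer; W[j][i]: weight from input i to output j;
    b[j]: bias of output j; R_out[j]: relevance assigned to output j.
    Assumes each linear combination z_j is nonzero (no stabilizer is used).
    """
    R_in = [0.0] * len(a)
    R_bias = 0.0
    for j, R_j in enumerate(R_out):
        z_j = sum(W[j][i] * a[i] for i in range(len(a))) + b[j]
        for i in range(len(a)):
            R_in[i] += a[i] * W[j][i] / z_j * R_j
        R_bias += b[j] / z_j * R_j
    return R_in, R_bias


# Toy layer with 2 inputs and 2 outputs (made-up numbers).
a = [1.0, 2.0]
W = [[0.5, 0.25], [1.0, -0.5]]
b = [0.5, 1.0]
R_in, R_bias = lrp_layer(a, W, b, R_out=[1.0, 2.0])
```

      Because the shares for each output neuron sum to 1, the total relevance is conserved: the input relevances plus the bias relevance equal the sum of `R_out`. Applying the function layer by layer from the output back to the 9 parameters gives one relevance value per parameter per trial.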

      Results

      Data collection

      A total of 848 trials were collected since 1 rater conducted 3 trials at his or her first participation. From MAS0 to 3, 117, 324, 281, 96, and 30 trials were collected, respectively. For most MAS3 trials, a catch was not observed due to slow stretching. For those trials, angle of catch based on the maximum deceleration and angle of catch based on the local minimum of the mechanical power were set to a negative extreme value of -10 since they decrease as the MAS score increases. Similarly, magnitude of catch, resistance after catch, and local minimum of mechanical power were set to a positive extreme value of +10. For these parameters, the average relevance for MAS3 was excluded from the overall relevance calculation since their relevance might be exaggerated due to the artificial extreme values. For MAS0, trials where the catch was observed were included in the further analysis since weak spasticity can sometimes be rated as MAS0. Trials with data errors were excluded; these errors included malfunction of the sensors, failure to hold the handle while stretching, and unsure MAS rating. A total of 648 trials (MAS0: 71, 1: 256, 1+: 224, 2: 78, 3: 19) were used to train the AI. Note that the following results about MAS3 were derived from a small number of samples.
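      The overall relevance, including the exclusion of MAS3 for the catch-related parameters, amounts to a masked sum. The sketch below shows one way to express it; the function name, the parameter names, and all numeric values are hypothetical and are not results from this study.

```python
def overall_relevance(avg_relevance, excluded=frozenset()):
    """Sum each parameter's average relevance over the MAS scores.

    avg_relevance: dict parameter -> dict MAS score -> average relevance.
    excluded: set of (parameter, score) pairs left out of the sum.
    """
    return {
        param: sum(r for score, r in by_score.items()
                   if (param, score) not in excluded)
        for param, by_score in avg_relevance.items()
    }


# Hypothetical average relevance values for two parameters.
avg = {
    "max_speed": {"0": 0.10, "1": -0.05, "1+": -0.02, "2": 0.20, "3": 0.30},
    "magnitude_of_catch": {"0": 0.00, "1": -0.10, "1+": 0.05, "2": 0.10, "3": 0.90},
}
# MAS3 is skipped for a parameter that was filled with an artificial extreme value.
overall = overall_relevance(avg, excluded={("magnitude_of_catch", "3")})
```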

      Artificial neural network-based quantitative analysis of MAS

      The AI algorithm successfully learned MAS rating. Percentage agreement of 82.2% and Cohen’s kappa of 0.743 (95% CI: [0.735, 0.751]) were achieved on average. Percentage agreement values for MAS0, 1, 1+, 2, and 3 were 76.7%, 80.4%, 85.0%, 84.9%, and 88.4%, respectively. The scores chosen by the AI were strongly correlated with those of the human raters (correlation coefficient=0.825). Each parameter and its relevance were averaged over trials classified as the same MAS score (fig 3, 4). Although some parameters had clearly positive or negative relevance, others showed no significant relevance because their relevance was positive or negative depending on the trial. Angle of catch based on the maximum deceleration, resistance after catch, maximum resistance during the passive stretching, and slope of linear regression of the resistance with respect to the joint angle were relevant to the decision of MAS0. In the case of MAS2, angle of catch based on the maximum deceleration, resistance after catch, maximum resistance during the passive stretching, and maximum stretching speed were relevant. Magnitude of catch, resistance after catch, PROM, maximum stretching speed, and slope of linear regression of the resistance with respect to the joint angle were relevant to MAS3. Most parameters had negative relevance to MAS1 and 1+ while the biases for the linear combinations at the 3 layers had positive relevance.
      Fig 3. Distribution of biomechanical parameters of trials classified as each MAS score. Normalized values are shown. Zero level implies the mean value of each parameter. Dots indicate average values of each MAS score. Vertical lines show 1 SD. (A) angle of catch based on the maximum deceleration; (B) magnitude of catch; (C) resistance after catch; (D) PROM; (E) maximum resistance during the passive stretching; (F) maximum stretching speed; (G) angle of catch based on the local minimum of the mechanical power; (H) local minimum of mechanical power; (I) slope of linear regression of the resistance with respect to the joint angle. For some MAS3 trials in which a catch was not clearly observed, angle of catch based on the maximum deceleration, angle of catch based on the local minimum of the mechanical power, magnitude of catch, resistance after catch, and local minimum of mechanical power were set to extreme values (eg, -10 or 10).
      Fig 4. The average relevance of biomechanical parameters and biases to the decision of each MAS score. Red and blue colors indicate positive average relevance (relevant for the prediction) and negative average relevance (against the prediction), respectively. Overall relevance is defined as the sum of the average relevance values over the MAS scores. Relevance values of angle of catch based on the maximum deceleration, magnitude of catch, resistance after catch, angle of catch based on the local minimum of the mechanical power, and local minimum of mechanical power for MAS3 (colored in gray) were excluded from the calculation of overall relevance. The average relevance value is not significantly different from 0 according to 1 sample t test (statistical significance: 5%).
      Angle of catch based on the maximum deceleration, maximum resistance during the passive stretching, and maximum stretching speed were identified as the most relevant 3 parameters based on the overall relevance. With these parameters, the AI algorithm could achieve an agreement of 71.8%. On the other hand, the least relevant parameters (magnitude of catch, angle of catch based on the local minimum of the mechanical power, and local minimum of mechanical power) could achieve an agreement of 64.6%.

      Discussion

      Relevance of biomechanical parameters

      Angle of catch based on the maximum deceleration was the most relevant to MAS0 and 2 (large and small angles of catch respectively, the normalized values far from 0 as shown in fig 3A). As angle of catch has been considered one of the primary factors in the spasticity assessment,
      • Boyd R.N.
      • Graham H.K.
      Objective measurement of clinical findings in the use of botulinum toxin type A for the management of children with CP.
      our result supports that angle of catch based on the maximum deceleration is specifically effective for determining MAS0 and 2. Low slope of linear regression of the resistance with respect to the joint angle (small increase of resistance throughout range of motion) and low maximum resistance during the passive stretching (smaller maximum resistance) were relevant features of MAS0, while high maximum resistance during the passive stretching, low maximum stretching speed (slow stretching), and high resistance after catch (increase of resistance after catch) were relevant to MAS2.
      MAS1 and 1+ had only a few parameters which showed positive but not strong relevance. For example, in the case of angle of catch based on the maximum deceleration, a trial tended to be classified as MAS0 or 2 as the parameter deviated further from its mean. Thus, MAS1 and 1+ can be considered intermediate levels between MAS0 and 2. While most parameters behaved similarly to angle of catch based on the maximum deceleration, a few parameters, such as maximum resistance during the passive stretching for MAS1 and slope of linear regression of the resistance with respect to the joint angle for 1+, contributed weakly. Regarding the ambiguity between MAS1 and 1+,
      • Pandyan A.D.
      • Johnson G.R.
      • Price C.I.
      • Curless R.H.
      • Barnes M.P.
      • Rodgers H.
      A review of the properties and limitations of the Ashworth and modified Ashworth Scales as measures of spasticity.
      understanding them requires a careful analysis, which can be a topic for the next study.
      In the case of MAS3, PROM was prominently reduced. A catch was not observed clearly in most trials. Hence, we inevitably set the catch-related parameters (ie, angle of catch based on the maximum deceleration, magnitude of catch, resistance after catch, angle of catch based on the local minimum of the mechanical power, and local minimum of mechanical power) as the extreme values, which were clearly separable from the other trials. We found that magnitude of catch and resistance after catch, which had extreme values implying no catch, and slow maximum stretching speed were considered important.
      Interestingly, maximum stretching speed was relevant to severe spasticity of MAS2 and 3. Raters stretched the severely impaired joints slowly since they might be stiffer. Considering the velocity dependency of spasticity, it is important to cope with the varying stretching speed. In this study, the speed was considered in decision-making rather than regulating the speed based on visual or auditory feedback.
      When fewer parameters were used to train the AI, classification performance was degraded; however, better performance was achieved when the most relevant parameters were used.

      Discussions on the AI-based approach

      The AI-based approach enables a detailed quantitative analysis of human decision-making, or even performs the decision-making itself. With conventional methods such as correlation or regression analyses, prediction of the clinical scores is still ambiguous. In this study, we found how much human raters rely on each specific parameter as well as the ranges of the parameters corresponding to each MAS score (fig 3). In addition, if automated prediction like what IBM Watson does is required, a different AI algorithm, such as a convolutional neural network specialized for multidimensional data, can directly predict the MAS score from the response of the spastic joint.
      Regarding the clinical application of the AI algorithm, there are several points to discuss. First, the size and quality of the dataset determine the prediction performance. More data generally yield better performance. The quality is improved as the data are collected from diverse cases and as the answers given to the dataset (eg, human-rated MAS score) are consistent. For the diversity, we collected data from multiple subjects and raters. For the consistency, screening out abnormal data may be helpful where unavoidable, as we did by excluding trials with data errors.
      Second, using more neurons and layers can improve the learning capacity of the AI algorithm; however, the chance of learning errors in data may also increase. We had to address the dilemma of how to set the number of neurons and layers of the MLP. We collected MAS score for every trial to cope with intrasubject variation. However, if raters made inconsistent ratings, the more complex AI would learn the specific cases. We therefore used the simplest MLP in this study.
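      To make the capacity trade-off concrete, the sketch below counts the trainable weights and biases of a fully connected MLP. The 9 inputs correspond to the 9 biomechanical parameters and the 5 outputs to the grades discussed here (MAS 0, 1, 1+, 2, 3), assuming one output unit per grade; the hidden-layer sizes are hypothetical and are not the architecture used in the study.

```python
def mlp_parameter_count(n_inputs, hidden_sizes, n_outputs):
    """Count weights and biases of a fully connected MLP."""
    sizes = [n_inputs, *hidden_sizes, n_outputs]
    # Each layer contributes (fan_in + 1 bias) * fan_out trainable values.
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))


N_PARAMETERS = 9  # biomechanical parameters (from the study)
N_GRADES = 5      # MAS 0, 1, 1+, 2, 3 (one output per grade is assumed)

# Hypothetical architectures, chosen only for comparison.
small = mlp_parameter_count(N_PARAMETERS, [10], N_GRADES)
large = mlp_parameter_count(N_PARAMETERS, [50, 50], N_GRADES)
```

      With these illustrative sizes the small network has 155 trainable values and the large one 3305. For a fixed number of rated trials, the larger network has far more freedom to memorize inconsistent ratings.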
      Lastly, the training result varies since the training process is stochastic. The AI algorithm is randomly initialized and guided by samples presented in random order. Thus, it can be trained differently even with the same architecture and dataset. To handle this variability, averaging the results of separate trainings is recommended. In this study, we averaged 30 separate trainings.
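      One way to average separate trainings is sketched below. It assumes that each training run yields a score per MAS grade for every trial and that these scores are averaged before the grade is chosen; which results were actually averaged in the study is not specified here, and the numbers are made up for illustration.

```python
from statistics import mean


def average_runs(score_runs):
    """Average per-class scores over runs, then pick the top class.

    score_runs[r][t][c] is the score that run r assigns to class c
    for trial t. Returns (averaged scores, predicted class indices).
    """
    n_trials = len(score_runs[0])
    n_classes = len(score_runs[0][0])
    averaged = [
        [mean(run[t][c] for run in score_runs) for c in range(n_classes)]
        for t in range(n_trials)
    ]
    predicted = [max(range(n_classes), key=row.__getitem__)
                 for row in averaged]
    return averaged, predicted


# Three illustrative runs, two trials, three classes (made-up scores).
runs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]],
    [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2]],
]
averaged, predicted = average_runs(runs)
```

      In this example the second run alone would disagree with the other two on both trials, but the averaged scores still select class 0 for the first trial and class 1 for the second.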

      Study limitations

      First, the measurement device might have biased the MAS ratings. The device was designed to minimize slippage between the device and the subject as well as the inertia of the device. In addition, we asked the raters in a questionnaire: “Would you have rated the same MAS scores with the bare hand contact?” On a scale from 0 (completely different) to 5 (exactly the same), the mean response was 4.67, which implies that the raters did not think the device affected their rating. Despite the potential bias, the device can detect intrasubject variation, which is another source of poor reliability. The effect of intrasubject variation can be reduced by averaging multiple measurements.
      Second, the analysis of MAS3 was based on a relatively small sample because subjects with severe spasticity were hard to find, possibly owing to recent advances in the management of spasticity. Within this small sample, the artificial neural network learned slow stretching speed as a clear feature of MAS3. We expect this finding to hold with more MAS3 samples; however, with more data the AI algorithm would find more accurate parameter boundaries.
      Last, we did not focus on the robustness of the AI rule because this study aimed at understanding how the AI learns the rule followed by the majority of human experts. Therefore, validation using a separate dataset was not conducted.

      Conclusion

      In this study, an AI-based approach was used to infer the MAS scoring rule from biomechanical data collected from multiple raters and multiple subjects. The AI achieved satisfactory agreement (82.2%, kappa=0.743) and correlation (correlation coefficient=0.825) with the human raters. Among the 9 parameters, angle of catch, maximum resistance, and maximum stretching speed were the most relevant to MAS. MAS0, 2, and 3 have distinct patterns of the biomechanical parameters, whereas MAS1 and 1+ do not. These findings can be used to design more standardized MAS training, to improve the MAS instrument itself, or to provide a systematic transition to other quantitative instruments. Furthermore, the AI-based relevance test can be applied to analyze the subjective nature of other clinical instruments.
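      For readers who want to reproduce agreement statistics of this kind on their own ratings, the sketch below computes percent agreement and Cohen's kappa between two sets of grades. It assumes the unweighted form of kappa, and the example ratings are invented for illustration; they are not data from this study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of trials on which the two raters gave the same grade."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal grade frequencies.
    p_expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Invented example ratings (not study data).
human = ["0", "1", "1+", "2", "3", "1", "1+", "2"]
ai = ["0", "1", "1", "2", "3", "1", "1+", "2"]

agreement = percent_agreement(human, ai)
kappa = cohens_kappa(human, ai)
```

      Here the two raters agree on 7 of 8 trials (0.875), and kappa is lower (0.84) because some of that agreement would be expected by chance given how often each grade is used.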

      Suppliers

      a. LRF325; Futek Advanced Sensor Technology Inc.
      b. HEDM-5500; Avago Technologies US Inc.
      c. MATLAB 2018a; MathWorks Inc.
