2003 IFA Congress: Montreal, Canada

A Sensorimotor Perspective on Stuttering: Insights from the Neuroscience of Motor Control

Ludo Max1,4, Vincent L. Gracco2,4, Frank H. Guenther3,5, Satrajit S. Ghosh3, and Marie E. Wallace1,4
1University of Connecticut, Department of Communication Sciences, 850 Bolton Road Unit 1085, Storrs, CT 06269-1085, USA
2McGill University, School of Communication Sciences and Disorders, 1266 Pine Avenue West, Montreal, Quebec, H3G 1A8, Canada
3Boston University, Department of Cognitive and Neural Systems, 677 Beacon Street, Boston, MA 02215, USA
4Haskins Laboratories, 270 Crown Street, New Haven, CT 06511-6695, USA
5Massachusetts Institute of Technology, Research Laboratory of Electronics, 77 Massachusetts Avenue, Cambridge, MA 02139-4307, USA


We present a theoretical perspective on stuttering based on a wide range of empirical data regarding the neuroscience of motor control. This perspective relies heavily on recent insights into models of motor control incorporating (a) feedforward and feedback control schemes, (b) the formation, consolidation, and updating of inverse and forward internal models of effector system dynamics, and (c) cortical and subcortical activation patterns during speech and nonspeech motor tasks. We suggest that stuttering may result when speech is produced either with inaccurate internal models or with a motor strategy that is weighted too heavily toward feedback control. The overall perspective can account not only for the primary characteristics of the disorder but also for several of the associated phenomena (e.g., age of onset, fluency-enhancing conditions, treatment effects). Furthermore, this sensorimotor perspective on stuttering is consistent with computer simulations implemented in the DIVA model, a neural network model of the central control of speech movements.

  1. Introduction
The contemporary literature on stuttering shows a remarkable lack of new theoretical models. Although some new perspectives on specific aspects of the disorder have been formulated within the last 15 years (e.g., linguistically based models such as the covert-repair and lexical access hypotheses; Postma & Kolk, 1993; Prins et al., 1997), there have been very few recent attempts at formulating a comprehensive framework that would account not only for the primary characteristics (i.e., sound and syllable repetitions, audible and inaudible sound prolongations) but also for the various phenomena known to be associated with stuttering (e.g., typical range for the age of onset, empirical results regarding sensory and motor performance in individuals who stutter, fluency improvements during adaptation paradigms and fluency-enhancing conditions, positive results of specific treatment programs, etc.).

We propose here two hypotheses that are based on current insights into (and modeling of) the neuroscience of motor control, and that identify specific components or processes within a well-documented and widely accepted sensorimotor control scheme as possible sources for the speech dysfluencies in stuttering (see also Max, in press, for a detailed version of one of the hypotheses and Max et al., in press, for an expanded version of both hypotheses presented here). The theoretical position from which these hypotheses are formulated has a strong foundation in a wide variety of experimental data and theoretical constructs regarding (a) the neural processes involved in speech and nonspeech motor control, (b) mechanisms that allow the central nervous system (CNS) to plan movements in ways that take into account the multiple intricate and dynamic transformations from central commands to movement consequences, (c) the neural mechanisms and substrates involved in sensorimotor learning, and (d) cortical and subcortical activation patterns during speech and nonspeech motor tasks. This background of the overall theoretical perspective is presented here only briefly as literature reviews and detailed discussions are available in our related publications (Max, in press; Max et al., in press).

Prior to presenting the two hypotheses that are the focus of this paper, it is helpful to describe a global model of motor control that serves as the context for these hypotheses. First, describing this model will familiarize the reader with the terminology as used in our hypotheses. Second, it will make it possible to consider our hypotheses about stuttering from this general perspective and, in doing so, to identify the components and mechanisms that are suggested as potential sources of involuntary sound/syllable repetitions and prolongations. Third, it will facilitate understanding of our initial computer simulations of one hypothesis as implemented in the DIVA model, a mathematical neural model of speech production (Guenther, 1994; Guenther & Ghosh, 2003).

The global model, shown schematically in Figure 1, proposes two interacting components: a feedforward control system and a feedback control system. The feedforward control system depends on the availability of continually updated and accurate inverse internal models of the system dynamics to inversely compute the necessary motor commands that would result in a planned movement goal given the dynamics of the system and its current state (Desmurget & Grafton, 2000; Shadmehr & Holcomb, 1997; Shadmehr & Mussa-Ivaldi, 1994; Wolpert et al., 2001). Thus, an inverse internal model is conceptualized as a neural map from the desired sensory consequences to the central commands necessary to achieve those consequences in the presence of time-varying influences of effector-specific variables such as neural and muscular physiological factors and biomechanics. After determination of the movement goal and desired movement outcome, the feedforward control system accesses the inverse internal models to accomplish the inverse computation of motor commands that will be sent directly to the effector musculature.

The feedback control system monitors ongoing movements and provides a mechanism for corrections if necessary. It is well known that purely afferent feedback control is problematic because of the significant time lag of the afferent signals relative to the controlled events. However, numerous sources in the recent literature on the neuroscience of motor control suggest that the problems associated with such time delays are circumvented by evaluating a copy of the prepared motor commands (efference copy) with forward internal models (Bhushan & Shadmehr, 1999; Blakemore et al., 2001; Desmurget & Grafton, 2000; Flanagan & Wing, 1997; Mehta & Schaal, 2002; Wolpert & Miall, 1996; Wolpert et al., 2001). A forward model is a neural map from motor commands to sensory consequences. In other words, a forward model allows a prediction of the sensory consequences of a given set of motor commands because it provides the controller with information about the dynamics of an effector system’s response to a given input. In contrast to a pure feedback control scheme in which the sensory consequences for a movement at time point tx are available at a later time tx+DA (DA is the delay associated with afferent input), control schemes based on forward modeling allow a prediction of the sensory consequences to be made available at an earlier time point tx-DF+DE (DF, the delay associated with the feedforward signal, is the time interval between motor command preparation and muscle contraction such that tx-DF is the time when the commands for movement at tx are prepared; DE is a delay associated with forward model-based evaluation of an efference copy of those commands). This prediction is based on the efference copy and forward models in combination with afferent information from time point tx-DF+DA. If the predicted consequences differ from the desired and planned movement goal and/or trajectory, corrections to the efferent signals can be made from the beginning of the movement (as opposed to only at the end of the movement in pure feedback control) and possibly (depending on the extent of delay DE) even during command preparation/execution prior to the actual initiation of muscle contraction.
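The timing argument can be written out as a short worked comparison. This is only a sketch using the delays defined in the text; the numerical values in the comment are illustrative assumptions, not measurements reported here.

```latex
\begin{align*}
t_{\text{afferent}}  &= t_x + D_A            && \text{(pure feedback control)} \\
t_{\text{predicted}} &= t_x - D_F + D_E      && \text{(forward-model prediction)} \\
t_{\text{afferent}} - t_{\text{predicted}} &= D_A + D_F - D_E
\end{align*}
% Illustrative values only (assumed): D_A = 50 ms, D_F = 30 ms, D_E = 10 ms
% give a lead of 50 + 30 - 10 = 70 ms for the prediction.
% The prediction precedes movement onset at t_x whenever D_E < D_F.
```

The last comment restates the condition under which corrections could begin before muscle contraction is initiated.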

  2. Hypothesized sources of stuttering
Inaccurate/unstable internal models hypothesis

As one of two possibilities presented here, we hypothesize that individuals who stutter may have, or may have had during childhood, problems with the acquisition and/or updating of the inverse and/or forward internal models included in the model shown in Figure 1. In essence, this hypothesis1 suggests that an important aspect of the disorder may lie in an inability to acquire stable (e.g., not inappropriately changing in response to time-varying aspects of afferent signals) and correct mappings (inverse, forward, or both) between motor commands and sensory consequences or to continually update these mappings during speech development.


Figure 1. Schematic representation of a global model of motor control. The model represents a hybrid control scheme consisting of a feedforward controller and a feedback controller that make use of inverse and forward internal models, respectively.2

Correct mappings may be particularly critical for speech production because this task requires, in addition to the kinematic and dynamic transformations also associated with limb movements, further transformations: the CNS also needs accurate forward and inverse representations of the conversions from articulatory movements to acoustic output. Moreover, rapid neural and craniofacial developmental changes during childhood require that these internal representations of command-to-output transformations be updated in parallel. It is well documented that dramatic anatomical changes take place in the vocal tract during development (e.g., Kent & Vorperian, 1995). Consequently, children’s motor systems face the challenging task of acquiring and updating multiple internal models for a continually changing neuromotor system. If, for currently unknown but possibly neuroanatomical or neurochemical reasons, the CNS were to fail to accurately update the internal models in order to match the applicable transformations, it would become impossible to correctly derive the necessary commands for a desired sensory outcome or to predict with great precision the sensory consequences of planned motor commands.

Specifically, problems with the inverse models would result in inaccurate computations of the feedforward commands. When these incorrectly prepared motor commands are executed, their sensory consequences would not match the desired consequences. This could result in an increased need for feedback-based corrections, including interruptions or re-sets of the feedforward commands that give rise to sound/syllable repetitions and sound prolongations. In addition to this possibility, however, we speculate that it may be more likely that the types of speech dysfluencies that are characteristic of stuttering result from problems with forward internal models. If the consequences of prepared motor commands cannot be accurately predicted based on an efference copy and concurrent afferent inflow, a mismatch may arise between predicted and actual consequences of the executed movements regardless of whether or not the generated commands were accurate. A result of this mismatch could be that the CNS responds by (a) re-attempting the movement and re-issuing the central commands until the sensory consequences are interpreted as matching the desired consequences, (b) sustaining the already ongoing commands until the conflict is resolved or avoided by relying on online moment-to-moment feedback, or (c) generating a different set of commands. Each of these types of attempted repairs could result in prolonged or repeated muscle contractions, and, thus, sound/syllable repetitions and sound prolongations.
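The difference between the two failure modes can be illustrated with a deliberately minimal sketch. It assumes a hypothetical effector whose sensory outcome is simply the command multiplied by a scalar gain; the function names, the gain values, and the scalar plant are illustrative assumptions and are not part of the model described here.

```python
# Toy illustration (assumed scalar-gain effector, not the authors' model).

def plant(command, true_gain):
    """Actual sensory outcome produced by the effector."""
    return command * true_gain


def inverse_model(desired_outcome, assumed_gain):
    """Map a desired sensory outcome to the command believed to achieve it."""
    return desired_outcome / assumed_gain


def forward_model(command, assumed_gain):
    """Predict the sensory outcome of a command from an efference copy."""
    return command * assumed_gain


TRUE_GAIN = 2.0   # hypothetical current effector dynamics
STALE_GAIN = 2.5  # hypothetical outdated internal estimate
desired = 1.0

# Case 1: inaccurate inverse model -> the executed command misses the goal.
command = inverse_model(desired, assumed_gain=STALE_GAIN)
inverse_case_outcome_error = desired - plant(command, TRUE_GAIN)

# Case 2: accurate inverse model but inaccurate forward model -> the movement
# itself is correct, yet the prediction signals a mismatch with the goal.
command = inverse_model(desired, assumed_gain=TRUE_GAIN)
forward_case_outcome_error = desired - plant(command, TRUE_GAIN)
forward_case_prediction_error = desired - forward_model(command, STALE_GAIN)

print(inverse_case_outcome_error)
print(forward_case_outcome_error, forward_case_prediction_error)
```

In the second case the outcome error is zero while the prediction error is not, which corresponds to the situation in which a mismatch is signaled regardless of whether the generated commands were accurate.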

From this perspective, the often replicated finding that individuals who stutter show longer movement durations in both speech and nonspeech motor tasks (for an overview, see Max, in press) is interpreted to reflect a preferred motor control strategy because longer durations allow more time for the processing and integration of afferent information. Compared with normally fluent speakers, individuals who stutter may rely more on strictly afferent information to compensate for the reduced efficiency of the typical feedback controller that relies on a combination of efference copies, forward models, and afferent information to make a prediction of the sensory consequences of a movement in an anticipatory manner. This interpretation is consistent with experimental demonstrations that speaking at a slower rate is a fluency-enhancing condition for most individuals who stutter (Adams et al., 1973). In addition, in light of the fact that PET studies have indicated that stuttering individuals fail to achieve normal levels of activation in auditory cortical areas during speech production (Braun et al., 1997; Fox et al., 1996), the fluency-enhancing effect of altered auditory feedback conditions (e.g., masking noise, frequency-altered feedback, delayed auditory feedback, chorus reading) may result from the fact that these conditions provide an external stimulation of auditory cortex during speech production. Based on existing evidence that auditory cortex activation in normal subjects partly reflects vocal-to-auditory priming (Paus et al., 1996), we suggest that activation by an external stimulus could accomplish its fluency-enhancing effect through an improved monitoring of the efference copies.

Weak feedforward control and over-reliance on feedback hypothesis

A slightly different perspective on the possible sensorimotor sources of stuttering, but also framed within the global model shown in Figure 1, is formulated in our second hypothesis. One important difference with the hypothesis discussed above is that this second hypothesis does not assume any problems with stuttering speakers’ internal models. Another difference is that the second hypothesis proposes that an over-reliance on strictly afferent feedback is not a strategy selected to avoid stuttering (by circumventing inaccurate or unstable internal models) but rather a strategy that actually results in stuttering due to instabilities inherent in this type of control.

Indeed, there is always a time lag between a motor command and its sensory consequences. When movements are primarily under afferent feedback control (i.e., weighted more toward afferent feedback control than, as is common for well-practiced tasks, toward feedforward control), the delay in arrival of the sensory signals may render the system unstable. Such instabilities, expected particularly for fast movements (such as speech movements), could lead to effector oscillations and system re-sets. Similar to the proposal in our alternative hypothesis discussed above, re-sets of the sensorimotor system would result in the observable speech dysfluencies that are characteristic of stuttering.
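The destabilizing effect of delayed afferent signals can be sketched with a toy discrete-time controller. This is an illustrative assumption, not the DIVA model: a one-dimensional effector is driven toward a target using only a proportional correction computed from a stale position reading, and the correction gain stands in loosely for movement speed. All names and parameter values are hypothetical.

```python
# Toy sketch (assumed): proportional control driven only by delayed feedback.

def simulate_feedback_only(gain, delay_steps, n_steps=400, target=1.0):
    """Move a 1-D effector toward `target` using only delayed afferent feedback.

    At each step the controller sees the position from `delay_steps` steps
    earlier and issues a correction proportional to that stale error.
    """
    positions = [0.0]
    for k in range(n_steps):
        # Position as reported by delayed afferent signals (clamped at start).
        sensed = positions[max(0, k - delay_steps)]
        positions.append(positions[k] + gain * (target - sensed))
    return positions


def final_error(positions, target=1.0, window=50):
    """Largest absolute error over the last `window` samples."""
    return max(abs(target - p) for p in positions[-window:])


# Fast corrections without delay: the error shrinks toward zero.
fast_no_delay = final_error(simulate_feedback_only(gain=0.5, delay_steps=0))
# The same fast corrections with a 5-step delay: growing oscillations.
fast_delayed = final_error(simulate_feedback_only(gain=0.5, delay_steps=5))
# Slower corrections with the same delay: the error shrinks again.
slow_delayed = final_error(simulate_feedback_only(gain=0.1, delay_steps=5))

print(fast_no_delay, fast_delayed, slow_delayed)
```

In this toy system the same delay that is harmless for slow corrections produces oscillations for fast ones, which mirrors the expectation that instabilities arise particularly for fast movements.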

If a control strategy that is biased toward afferent feedback control results in system instabilities and stuttering moments, why then would stuttering individuals continue to use such a strategy that nonstuttering individuals replace with a feedforward strategy after learning the motor task during development? We hypothesize that individuals who stutter may have weakened feedforward control projections, and this, in turn, may lead to the need or preference for a speech motor strategy that depends primarily on feedback. For example, using diffusion tensor imaging to investigate brain structure, Sommer et al. (2002) found that stuttering adults show abnormalities in the white matter pathways underlying the orofacial area of the left-hemisphere primary sensorimotor cortex. Damage to these pathways may compromise the feedforward command from premotor to primary motor areas.

In the context of this hypothesis, stuttering individuals’ preference for longer movement durations during speech and nonspeech movements as well as improvements in speech fluency under conditions of decreased speech rate are attributed to the fact that slower movements, as compared with faster movements, are less affected by the delays associated with afferent information. A possible interpretation for the fluency-enhancing effect of altered auditory feedback conditions is that these conditions may force the system to depend less on strictly afferent feedback control, thus reducing the problems associated with delays in those afferent signals.

  3. Computer simulations with the DIVA model
Since the mid-1990s, Guenther’s research group has been developing, updating, and expanding a neural network model of the central control of speech movements and of the acquisition of that control (e.g., Guenther, 1994; Guenther & Ghosh, 2003). This model, known as the DIVA model (Directions Into Velocities of Articulators), combines mathematical descriptions of underlying commands, cerebral and cerebellar neural substrates corresponding to the model’s components, and computer simulations controlling an articulatory synthesizer.

During an initial babbling phase, the model (schematically represented in Figure 2) learns to control movements of the vocal tract by using the auditory feedback from self-generated speech sounds to learn the mappings between central commands and acoustic consequences. Once these neural mappings have been tuned, production of a particular sequence of speech sounds starts with the activation of speech sound map cells in premotor cortex (in the computer simulations, each speech sound map cell corresponds to one phoneme or syllable). Activation of a speech sound map cell results in the readout of a feedforward command from premotor cortex to primary motor cortex as well as a feedback command passing through the auditory and somatosensory areas before reaching motor cortex.

Early in development, the feedforward command is inaccurate, and the model depends on feedback control. The projections in the feedback control subsystem constitute forward models, thus encoding the expected sensory consequences of the sounds to be produced. The feedback system compares these expectations to the system’s current state as signaled by incoming afferent information. If the current auditory and somatosensory states are outside the target regions for the produced sound, error signals are generated in higher-order sensory areas. The error signals are then transformed into corrective motor commands by projections from the sensory areas to the primary motor cortex. Over time, however, the feedforward command becomes well tuned through monitoring of the movements controlled by the feedback subsystem. Once the feedforward subsystem is accurately tuned, the system can rely almost entirely on feedforward commands because few sensory errors are generated unless external perturbations are applied to the system.

In the DIVA model, cells in the motor cortex generate the overall motor command M(t) that is a combination of feedforward and feedback commands and that is defined as


with αff and αfb representing the amount of weighting toward feedforward and feedback control, respectively, and g(t) representing a speech rate signal that is 0 when not speaking and 1 when speaking at the maximum rate.
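Written out under these definitions, the combined command can be sketched as a weighted sum of integrated feedforward and feedback command rates, each gated by the speech rate signal. The exact form below is an assumption based on published descriptions of the DIVA model; the integral form, the initial term $M(0)$, and the symbols $\dot{M}_{\text{feedforward}}$ and $\dot{M}_{\text{feedback}}$ are not taken from this text.

```latex
M(t) = M(0)
     + \alpha_{ff} \int_{0}^{t} \dot{M}_{\text{feedforward}}(\tau)\, g(\tau)\, d\tau
     + \alpha_{fb} \int_{0}^{t} \dot{M}_{\text{feedback}}(\tau)\, g(\tau)\, d\tau
```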

In computer simulations to date, our hypothesis that stuttering may result from weak feedforward control and over-reliance on feedback (i.e., the second of our two hypotheses described above) has been implemented in the DIVA model by using an inappropriately low value of αff and an inappropriately high value of αfb. Interestingly, introducing such a bias toward (unstable) feedback control, in combination with a reset signal triggered by large sensory errors, leads to stuttering behavior (in particular, sound repetitions) in the vocal tract model. Work is currently underway to determine whether or not our hypothesis that stuttering may result from unstable or incorrect internal models (i.e., the first hypothesis described above) is also feasible within the DIVA model given its specific mathematical representations of internal models in the feedforward and feedback control subsystems.
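The qualitative effect of shifting the weighting toward feedback can be sketched in a toy one-dimensional system. This is not the DIVA implementation: the ramp-shaped target, the fixed afferent delay, the reset threshold, and every name and value below are hypothetical assumptions, and the speech rate signal is taken to be 1 throughout. The feedforward command is assumed to be accurate, so the only change between the two runs is the weighting.

```python
# Toy sketch (assumed): weighted feedforward/feedback control with a reset
# signal triggered by a large sensed error. Not the DIVA implementation.

def first_reset_step(alpha_ff, alpha_fb, delay_steps=3, n_steps=200,
                     ramp_steps=20, reset_threshold=0.5):
    """Return the step at which a reset would fire, or None if it never does."""
    # Target trajectory for one gesture: ramp from 0 to 1, then hold.
    target = [min(1.0, k / ramp_steps) for k in range(n_steps + 1)]
    position = [0.0]
    for k in range(n_steps):
        # Sensory error as it arrives after the afferent delay.
        j = max(0, k - delay_steps)
        sensed_error = target[j] - position[j]
        if abs(sensed_error) > reset_threshold:
            return k  # large sensory error: the reset signal would fire here
        # Accurate feedforward command: the planned change in position.
        feedforward = target[k + 1] - target[k]
        position.append(position[k]
                        + alpha_ff * feedforward
                        + alpha_fb * sensed_error)
    return None


# Weighting toward feedforward control: the gesture completes without a reset.
fluent_run = first_reset_step(alpha_ff=0.85, alpha_fb=0.15)
# Weighting toward delayed feedback control: oscillations trigger a reset.
feedback_biased_run = first_reset_step(alpha_ff=0.15, alpha_fb=0.85)

print(fluent_run, feedback_biased_run)
```

If the gesture were restarted after each reset, the feedback-biased setting would repeat the same initial portion of the movement, which is the toy analogue of a sound repetition.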


Figure 2. Left panel: Schematic representation of the DIVA model, a mathematical neural network model of speech movements (Guenther, 1994; Guenther & Ghosh, 2003). Right panel: Vocal tract model controlled by the DIVA model for speech synthesis.2

  4. Conclusion
Based on current insights into the neural control of movement, we have described here two specific hypotheses about the possible sensorimotor sources of stuttering. Both hypotheses are primarily explanations of the so-called “proximal” sources of stuttering. That is, they propose explanations for what causes a single moment of stuttering when an individual who stutters is speaking. In their current stage of development, the hypotheses do not fully address the “distal” sources of stuttering (i.e., Why does a certain individual have the disorder?) although some preliminary speculations in this regard are included (e.g., anatomical differences in white matter pathways) and understanding this aspect of the disorder is an integral part of the long-term goals of our collaborative work.

Specifically, the two hypotheses suggested here are that stuttering may result from (a) unstable or incorrect internal models in the feedforward and feedback control subsystems for speech movements or (b) an over-reliance on afferent feedback that, due to the time lags inherent in afferent signals, leads to instabilities in the control of these speech movements. In two more detailed publications elsewhere, we have presented several arguments suggesting that the overall perspective taken for this work can account not only for the primary characteristics of stuttering but also for a wide range of phenomena associated with the disorder and its development (Max, in press; Max et al., in press).

Initial computer simulations with the DIVA model have already demonstrated that speech dysfluencies in the form of sound repetitions can indeed occur when motor commands are biased toward afferent feedback control as suggested in the weak feedforward control and over-reliance on feedback hypothesis. Testing of simulations for the alternative hypothesis and initial empirical testing based on recent reports of sensorimotor adaptation in speech (Houde & Jordan, 1998; Max et al., 2003) are currently in progress.

This work was funded, in part, by NIH grants DC 03102 (P.I. Vincent L. Gracco) and DC02852 (P.I. Frank H. Guenther).

1Although there is overlap with work by Neilson and Neilson (1987, 1991), the current hypothesis suggests different explanations for (a) the nature of the speech dysfluencies, (b) the neural processes leading to those dysfluencies, (c) the common observation of prolonged speech and nonspeech movement durations in individuals who stutter, (d) the role of anatomical/neural maturation, and (e) the effect of fluency-enhancing conditions.
2Copyright by NSSLHA and reprinted with permission.

Adams, M. R., Lewis, J. I., & Besozzi, T. E. (1973). The effect of reduced reading rate on stuttering frequency. Journal of Speech and Hearing Research, 16, 671-675.

Braun, A. R., Varga, M., Stager, S., Schulz, G., Selbie, S., Maisog, J. M., Carson, R. E., & Ludlow, C. L. (1997). Altered patterns of cerebral activity during speech and language production in developmental stuttering: An H2(15)O positron emission tomography study. Brain, 120, 761-784.

Bhushan, N., & Shadmehr, R. (1999). Computational nature of human adaptive control during learning of reaching movements in force fields. Biological Cybernetics, 81, 39-60.

Blakemore, S. J., Frith, C. D., & Wolpert, D. M. (2001). The cerebellum is involved in predicting the sensory consequences of action. Neuroreport, 12, 1879-1884.

Desmurget, M., & Grafton, S. (2000). Forward modeling allows feedback control for fast reaching movements. Trends in Cognitive Sciences, 4, 423-431.

Flanagan, J. R., & Wing, A. M. (1997). The role of internal models in motion planning and control: Evidence from grip force adjustments during movements of hand-held loads. Journal of Neuroscience, 17, 1519-1528.

Fox, P. T., Ingham, R. J., Ingham, J. C., Hirsch, T. B., Downs, J. H., Martin, C., Jerabek, P., Glass, T., & Lancaster, J. L. (1996). A PET study of the neural systems of stuttering. Nature, 382, 158-161.

Guenther, F. H. (1994). A neural network model of speech acquisition and motor equivalent speech production. Biological Cybernetics, 72, 43-53.

Guenther, F. H., & Ghosh, S. S. (2003). A model of cortical and cerebellar function in speech. In M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences (pp. 1053-1056). Barcelona, Spain.

Houde, J. F., & Jordan, M. I. (1998). Sensorimotor adaptation in speech production. Science, 279, 1213-1216.

Kent, R. D., & Vorperian, H. K. (1995). Development of the craniofacial-oral-laryngeal anatomy: A review. Journal of Medical Speech-Language Pathology, 3, 145-190.

Max, L. (in press). Stuttering and internal models for sensorimotor control: A theoretical perspective to generate testable hypotheses. In B. Maassen, R. Kent, H. F. M. Peters, P. van Lieshout, & W. Hulstijn (Eds.), Speech motor control in normal and disordered speech. Oxford, UK: Oxford University Press.

Max, L., Wallace, M. E., & Vincent, I. (2003). Sensorimotor adaptation to auditory perturbations during speech: Acoustic and kinematic experiments. In M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences (pp. 1053-1056). Barcelona, Spain.

Max, L., Guenther, F. H., Gracco, V. L., Ghosh, S. S., & Wallace, M. E. (2004). Feedback-biased motor control and inaccurate internal models as sources of dysfluency: A theoretical model of stuttering. Contemporary Issues in Communication Sciences and Disorders, 31, 105-122.

Mehta, B., & Schaal, S. (2002). Forward models in visuomotor control. Journal of Neurophysiology, 88, 942-953.

Neilson, M. D., & Neilson, P. D. (1987). Speech motor control and stuttering: A computational model of adaptive sensory-motor processing. Speech Communication, 6, 325-333.

Neilson, M. D., & Neilson, P. D. (1991). Adaptive model theory of speech motor control and stuttering. In H. F. M. Peters, W. Hulstijn, & C. W. Starkweather (Eds.), Speech motor control and stuttering (pp. 149-156). Amsterdam, The Netherlands: Elsevier.

Paus, T., Perry, D. W., Zatorre, R. J., Worsley, K. J., & Evans, A. C. (1996). Modulation of cerebral blood flow in the human auditory cortex during speech: Role of motor-to-sensory discharges. European Journal of Neuroscience, 8, 2236-2246.

Postma, A., & Kolk, H. (1993). The covert repair hypothesis: Prearticulatory repair processes in normal and stuttered disfluencies. Journal of Speech and Hearing Research, 36, 472-487.

Prins, D., Main, V., & Wampler, S. (1997). Lexicalization in adults who stutter. Journal of Speech, Language, and Hearing Research, 40, 373-384.

Shadmehr, R., & Holcomb, H. H. (1997). Neural correlates of motor memory consolidation. Science, 277, 821-825.

Shadmehr, R., & Mussa-Ivaldi, F. A. (1994). Adaptive representation of dynamics during learning of a motor task. Journal of Neuroscience, 14, 3208-3224.

Sommer, M., Koch, M. A., Paulus, W., Weiller, C., & Büchel, C. (2002). Disconnection of speech-relevant brain areas in persistent developmental stuttering. Lancet, 360, 380-383.

Wolpert, D. M., & Miall, R. C. (1996). Forward models for physiological motor control. Neural Networks, 9, 1265-1279.

Wolpert, D. M., Ghahramani, Z., & Flanagan, J. R. (2001). Perspectives and problems in motor learning. Trends in Cognitive Sciences, 5, 487-494.

