Papers in physics

On-line version ISSN 1852-4249

Pap. Phys. vol. 5 no. 1, La Plata, Jun. 2013

http://dx.doi.org/10.4279/PIP.050004 

 

Revisiting the two-mass model of the vocal folds

 

M. F. Assaneo,1 M. A. Trevisan


*E-mail: florencia@df.uba.ar
•E-mail: marcos@df.uba.ar

Laboratorio de Sistemas Dinámicos, Depto. de Física, FCEN, Universidad de Buenos Aires. Pabellón I, Ciudad Universitaria, 1428EGA Buenos Aires, Argentina.

 

Realistic mathematical modeling of voice production has recently been boosted by applications to different fields such as bioprosthetics, quality speech synthesis and pathological diagnosis. In this work, we revisit a two-mass model of the vocal folds that includes accurate fluid mechanics for the air passage through the folds and nonlinear properties of the tissue. We present the bifurcation diagram for such a system, focusing on the dynamical properties of two regimes of interest: the onset of oscillations and the normal phonation regime. We also provide theoretical support for the nonlinear nature of the elastic properties of the fold tissue by comparing theoretical isofrequency curves with reported experimental data.

 

I. Introduction

In recent decades, considerable effort has been devoted to developing a mathematical model of voice production. The first steps were made by Ishizaka and Flanagan [1], who approximated each vocal fold by two coupled oscillators, providing the basis of the well-known two-mass model. This simple model reproduces many essential features of voice production, such as the onset of self-sustained oscillation of the folds and the shape of the glottal pulses.

Early analytical treatments were restricted to small-amplitude oscillations, allowing a dimensional reduction of the problem. In particular, a two-dimensional approximation known as the flapping model was widely adopted by the scientific community, based on the assumption of a transverse wave propagating along the vocal folds [2, 3]. Moreover, this model was also used to successfully explain most of the features present in birdsong [4, 5].

Faithful modeling of the vocal folds has recently found new challenges: realistic articulatory speech synthesis [6-8], diagnosis of pathological behavior of the folds [9, 10] and bioprosthetic applications [11]. Within this framework, the 4-dimensional two-mass model was revisited and modified. Two main improvements are worth noting: a realistic description of the vocal fold collision [13, 14] and an accurate fluid mechanical description of the glottal flow, allowing a proper treatment of the hydrodynamical force acting on the folds [8, 15].

In this work, we revisit the two-mass model developed by Lucero and Koenig [7]. This choice represents a good compromise between mathematical simplicity and diversity of physical phenomena acting on the vocal folds, including the main mechanical and fluid effects that are partially found in other models [13, 15]. It was also successfully used to reproduce experimental temporal patterns of glottal airflow. Here, we extend the analytical study of this system: we present a bifurcation diagram, explore the dynamical aspects of the oscillations at the onset and at normal phonation, and study the isofrequency curves of the model.

This work is organized as follows: in the second section, we describe the model. In the third section, we present the bifurcation diagram, compare our solutions with those of the flapping model approximation and analyze the isofrequency curves. In the fourth and last section, we discuss our results.

II. The model

Each vocal fold is modeled as two coupled damped oscillators, as sketched in Fig. 1.

[...] force, m is the mass and kc the coupling stiffness. The horizontal displacement from the rest position x0 is represented by x.

We use a cubic polynomial for the restitution term, Eq. (2), adapted from [1, 7]. The term with a differentiable step-like function Θ, Eq. (3), accounts for the increase in the stiffness introduced by the collision of the folds. The restitution force reads


Figure 1: Sketch of the two-mass model of the vocal folds. Each fold is represented by masses m1 and m2 coupled to each other by a restitution force kc and to the laryngeal walls by K1 and K2 (and dampings B1 and B2), respectively. The displacement of each mass from the resting position x0 is represented by x1 and x2. The different aerodynamic pressures P acting on the folds are described in the text.

In order to describe the hydrodynamic force that the airflow exerts on the vocal folds, we have adopted the standard assumption of small inertia of the glottal air column and the model of the boundary layer developed in [7, 11, 15]. This model assumes a one-dimensional, quasi-steady incompressible airflow from the trachea to a separation point. At this point, the flow separates from the tissue surface to form a free jet where the turbulence dissipates the airflow energy. It has been experimentally shown that the position of this point depends on the glottal profile. As described in [15], the separation point located at the glottal exit shifts down to the boundary between masses m1 and m2 when the fold profile becomes more divergent than a threshold, Eq. (7).

Viscous losses are modeled according to a two-dimensional Poiseuille flow, Eqs. (6) and (7). The equations for the pressure inside the glottis are

As sketched in Fig. 1, the pressures exerted by the airflow are: Pin at the entrance of the glottis, P12 at the upper edge of m1, P21 at the lower edge of m2, Pout at the entrance of the vocal tract and Ps the subglottal pressure.

The width of the folds (in the plane normal to Fig. 1) is lg; d1 and d2 are the lengths of the lower and upper masses, respectively. ai are the cross-sections of the glottis, ai = 2 lg (xi + x0); μ and ρ are the viscosity and density of the air; ug is the airflow inside the glottis, and ks = 1.2 is an experimental coefficient. We also assume no losses at the glottal entrance, Eq. (5), and zero pressure at the entrance of the vocal tract, Eq. (8).

The hydrodynamic force acting on each mass reads:

Following [1, 7, 10], these functions represent opening, partial closure and total closure of the glottis. Throughout this work, the piecewise functions P21, f1 and f2 are modeled using the differentiable step-like function Θ defined in Eq. (3).

III. Analysis of the model

i. Bifurcation diagram

The main anatomical parameters that can be actively controlled during vocalization are the subglottal pressure Ps and the fold tension controlled by the laryngeal muscles. In particular, the action of the thyroarytenoid and the cricothyroid muscles controls the thickness and the stiffness of the folds. Following [1], this effect is modeled by a cord-tension parameter Q that scales the mechanical properties of the folds: kc = Q kc0, ki = Q ki0 and mi = mi0/Q. We therefore computed a bifurcation diagram using these two standard control parameters, Ps and Q.
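The tension scaling can be made concrete with a short sketch. It assumes the Ishizaka-Flanagan convention (stiffnesses multiplied by Q, masses divided by Q) and takes the values quoted in the caption of Fig. 2 as the Q = 1 baseline. The function names and the use of the uncoupled natural frequency as a summary quantity are illustrative choices, not part of the model description.

```python
import math

# Baseline (Q = 1) values quoted in the caption of Fig. 2, converted to SI units.
BASELINE = {
    "m1": 0.125e-3,  # kg
    "m2": 0.025e-3,  # kg
    "k1": 80.0,      # N/m
    "k2": 8.0,       # N/m
    "kc": 25.0,      # N/m
}


def scale_by_tension(q, base=BASELINE):
    """Assumed scaling: stiffnesses are multiplied by Q, masses divided by Q."""
    if q <= 0:
        raise ValueError("Q must be positive")
    return {name: (value / q if name.startswith("m") else value * q)
            for name, value in base.items()}


def uncoupled_frequency(k, m):
    """Natural frequency (Hz) of a single undamped, uncoupled mass-spring pair."""
    return math.sqrt(k / m) / (2.0 * math.pi)


if __name__ == "__main__":
    for q in (0.5, 1.0, 2.0):
        p = scale_by_tension(q)
        f1 = uncoupled_frequency(p["k1"], p["m1"])
        print(f"Q = {q:3.1f}: f1 = {f1:6.1f} Hz")
```

Under this assumption the uncoupled frequencies grow linearly with Q, since sqrt(Qk / (m/Q)) = Q sqrt(k/m), which is why Q acts as a single tension-like control.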

Five main regions of different dynamic solutions are shown in Fig. 2. At low pressure values (region I), the system presents a stable fixed point. On reaching region II, the fixed point becomes unstable and an attracting limit cycle appears. At the interface between regions I and II, three bifurcations occur in a narrow range of subglottal pressure (Fig. 3, left panel), all along the Q axis. The right panel of Fig. 3 shows the oscillation amplitude of x2. At point A, oscillations are born in a supercritical Hopf bifurcation. The amplitude grows continuously for increasing Ps until point B, where it jumps to the upper branch. If the pressure is then decreased, the oscillations persist even for pressure values lower than the onset in A. When point C is reached, the oscillations suddenly stop and the system returns to the rest position. This onset-offset oscillation hysteresis was already reported experimentally in [12].

The branch AB depends on the viscosity. Decreasing μ, points A and B approach each other until they collide at μ = 0, recovering the result reported in [3, 10, 14], where the oscillations occur as the combination of a subcritical Hopf bifurcation and a cyclic fold bifurcation.

On the other hand, the branch BC depends on the separation point of the jet formation. In particular, for increasing ks, the folds become stiffer and the separation point moves upwards toward the output of the glottis. From a dynamical point of view, points C and B approach each other until they collapse. In this case, the oscillations are born at a supercritical Hopf bifurcation and the system presents no hysteresis, as in the standard flapping model [17].
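The onset-offset hysteresis described above can be illustrated with a minimal sketch. It does not integrate the two-mass model: it uses the textbook amplitude equation dr/dt = p r + r^3 - r^5, the normal form of a subcritical Hopf bifurcation followed by a fold of cycles. The control parameter p, the noise floor and the amplitude threshold are all hypothetical stand-ins chosen for illustration.

```python
def settle(p, r, t_total=200.0, dt=0.02, floor=1e-3):
    """Integrate dr/dt = p*r + r**3 - r**5 with forward Euler.

    `floor` mimics a small noise level that keeps the state from sitting
    exactly on the rest solution r = 0.
    """
    for _ in range(int(t_total / dt)):
        r += dt * (p * r + r**3 - r**5)
        r = max(r, floor)
    return r


def sweep(params, r_start):
    """Follow the settled amplitude while the control parameter is stepped."""
    amplitudes, r = [], r_start
    for p in params:
        r = settle(p, r)
        amplitudes.append(r)
    return amplitudes


# Control parameter stepped from -0.5 to 0.5 and back (abstract units).
params_up = [-0.5 + 0.05 * k for k in range(21)]
params_down = params_up[::-1]

up = sweep(params_up, r_start=1e-3)
down = sweep(params_down, r_start=up[-1])

THRESHOLD = 0.5  # amplitude above which the state is called "oscillating"
onset_p = next(p for p, r in zip(params_up, up) if r > THRESHOLD)
offset_p = next(p for p, r in zip(params_down, down) if r < THRESHOLD)

if __name__ == "__main__":
    print(f"onset at p = {onset_p:+.2f}, offset at p = {offset_p:+.2f}")
```

In this toy system the upward sweep switches on at a larger value of the control parameter than the value at which the downward sweep switches off, which is the same qualitative picture as points B and C in Fig. 3.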

Regions II and III of Fig. 2 are separated by a saddle-repulsor bifurcation. Although this bifurcation does not represent a qualitative dynamical change for the oscillating folds, its effects are relevant when the complete mechanism of voiced sound production is considered. Voiced sounds are generated as the airflow disturbance produced by the oscillation of the vocal folds is injected into the series of cavities extending from the laryngeal exit to the mouth, a non-uniform tube known as the vocal tract. The disturbance travels back and forth along the vocal tract, which acts as a filter for the original signal, enhancing the frequencies of the source that fall near the vocal tract resonances. Voiced sounds are in fact perceived and classified according to these resonances, as in the case of vowels [18]. Consequently, one central aspect in the generation [...]


Figure 2: Bifurcation diagram in the plane of subglottal pressure and fold tension (Q, Ps). The insets are two-dimensional projections of the flow on the (y1, x1) plane; the red crosses represent unstable fixed points and the dotted lines unstable limit cycles. Normal voice occurs at (Q, Ps) ≈ (1, 800). The color code represents the linear correlation between (x1 − x2) and (y1 + y2): from dark red for R = 1 to dark blue for R = 0.6. This diagram was developed with the help of the AUTO continuation software [20]. The rest of the parameters were fixed at m1 = 0.125 g, m2 = 0.025 g, k10 = 80 N/m, k20 = 8 N/m, kc = 25 N/m, ε1 = 0.1, ε2 = 0.6, lg = 1.4 cm, d1 = 0.25 cm, d2 = 0.05 cm and x0 = 0.02 cm.


Figure 3: Hysteresis at the oscillation onset-offset. Left panel: zoom of the interface between regions I and II. The blue and green lines represent folds of cycles (saddle-node bifurcations in the map). The red line is a supercritical Hopf bifurcation. Right panel: the oscillation amplitude of x2 as a function of the subglottal pressure Ps, at Q = 1.71. The continuation of periodic solutions was performed with the AUTO software package [20].

 

Interestingly, normal phonation occurs in the region near the appearance of the saddle-repulsor bifurcation. Although this bifurcation does not alter the dynamical regime of the system or its time scales, we have observed that part of the limit cycle approaches the stable manifold of the new fixed point (as displayed in Fig. 4), therefore changing its shape. This deformation is not restricted to the appearance of the new fixed point but rather occurs in a coarse region around the boundary between II and III, as the flow changes smoothly in a vicinity of the bifurcation. In order to illustrate this effect, we use the spectral content index SCI [21], an indicator of the spectral richness of a signal: $\mathrm{SCI} = \sum_k A_k f_k / (\sum_k A_k f_0)$, where $A_k$ is the Fourier amplitude of the frequency $f_k$ and $f_0$ is the fundamental frequency. As the pressure is increased, the SCI of x1(t) increases (upper right panel of Fig. 4), with a boost in the vicinity of the saddle-repulsor bifurcation that stabilizes after the saddle point is generated.

Thus, the appearance of this bifurcation near the region of normal phonation could indicate a possible mechanism to further enhance the spectral richness of the sound source, on which the production of voiced sounds ultimately relies.
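As a rough illustration of how the spectral content index behaves, the sketch below evaluates SCI = Σ A_k f_k / (f0 Σ A_k) on synthetic signals using a plain discrete Fourier transform. The test signals, the sampling choices and the decision to skip the zero-frequency bin are assumptions made for this example, not details taken from [21].

```python
import cmath
import math


def dft_amplitudes(samples):
    """Amplitudes |X_k| of the one-sided discrete Fourier transform, k = 1..N/2."""
    n = len(samples)
    amps = []
    for k in range(1, n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        amps.append(abs(x_k))
    return amps


def spectral_content_index(samples, sample_rate, f0):
    """SCI = sum_k(A_k * f_k) / (f0 * sum_k(A_k)); the DC bin is skipped (assumption)."""
    n = len(samples)
    amps = dft_amplitudes(samples)
    freqs = [k * sample_rate / n for k in range(1, n // 2 + 1)]
    weighted = sum(a * f for a, f in zip(amps, freqs))
    return weighted / (f0 * sum(amps))


def synthetic_signal(harmonic_weight, n=256, sample_rate=256.0, f0=8.0):
    """Sine at f0 plus a second harmonic of relative weight `harmonic_weight`."""
    return [math.sin(2 * math.pi * f0 * i / sample_rate)
            + harmonic_weight * math.sin(2 * math.pi * 2 * f0 * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    for w in (0.0, 0.5, 1.0):
        sci = spectral_content_index(synthetic_signal(w), 256.0, 8.0)
        print(f"second-harmonic weight {w:.1f}: SCI = {sci:.3f}")
```

A pure tone gives an index of one, and the index grows as energy moves into higher harmonics, which is the sense in which a deformed limit cycle yields a spectrally richer source.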

 


Figure 4: A projection of the limit cycle for x1 and the stable manifold of the saddle point, for parameters consistent with normal phonatory conditions, (Q, Ps) = (1, 850) (region III). Left inset: projection in the 3-dimensional space (y1, x1, x2). Right inset: spectral content index of x1(t) as a function of Ps for a fixed value of Q = 0.95. In green, the value at which the saddle-repulsor bifurcation takes place.

At the boundary between regions III and IV, one of the unstable points created in the saddle-repulsor bifurcation undergoes a subcritical Hopf bifurcation, changing stability as an unstable limit cycle is created [19]. Finally, on entering region V, the stable and the unstable cycles collide and disappear in a fold of cycles, beyond which no oscillatory regimes exist.

In Fig. 2, we also display a color map that quantifies the difference between the solutions of the model and the flapping approximation. The flapping model is a two-dimensional model that, instead of two masses per fold, assumes a wave propagating along a linear profile of the folds, i.e., the displacement of the upper edge of the folds is delayed 2τ with respect to the lower. The cross-sectional areas at glottal entry and exit (a1 and a2) are approximated, in terms of the position of the midpoint of the folds, by

where x is the midpoint displacement from equilibrium x0, and τ is the time that the surface wave takes to travel half the way from bottom to top. Equation (11) can be rewritten as (x1 − x2) = τ(y1 + y2). We use this expression to quantify the difference between the oscillations obtained with the two-mass model solutions and the ones generated with the flapping approximation, computing the linear correlation coefficient between (x1 − x2) and (y1 + y2). As expected, the correlation coefficient R decreases for increasing Ps or decreasing Q. In the region near normal phonation, the approximation is still relatively good, with R ≈ 0.8. Also as expected, the approximation is better for increasing x0, since the effect of colliding folds is not included in the flapping model.
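The correlation measure can be sketched in a few lines. The example assumes the usual flapping form in which the lower and upper edges follow x1 = x + τ dx/dt and x2 = x − τ dx/dt around the midpoint x, so that (x1 − x2) = τ(y1 + y2) holds exactly and R = 1. The synthetic sinusoidal midpoint motion, the value of τ and the function names are hypothetical. With trajectories from the two-mass model, the same `flapping_correlation` call would return values below one.

```python
import math


def pearson_r(u, v):
    """Linear (Pearson) correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def flapping_trajectory(tau, omega=2 * math.pi * 100.0, n=400, cycles=4):
    """Synthetic edges for a midpoint x = sin(omega t) under the assumed flapping form."""
    total_time = cycles * 2 * math.pi / omega
    x1, x2, y1, y2 = [], [], [], []
    for i in range(n):
        t = total_time * i / n
        x = math.sin(omega * t)
        dx = omega * math.cos(omega * t)
        ddx = -omega ** 2 * math.sin(omega * t)
        x1.append(x + tau * dx)    # lower edge leads the midpoint
        x2.append(x - tau * dx)    # upper edge lags the midpoint
        y1.append(dx + tau * ddx)  # velocities are the time derivatives
        y2.append(dx - tau * ddx)
    return x1, x2, y1, y2


def flapping_correlation(x1, x2, y1, y2):
    """R between (x1 - x2) and (y1 + y2); equal to 1 for an exact flapping motion."""
    diff = [a - b for a, b in zip(x1, x2)]
    total = [a + b for a, b in zip(y1, y2)]
    return pearson_r(diff, total)


if __name__ == "__main__":
    r = flapping_correlation(*flapping_trajectory(tau=1e-3))
    print(f"R for an exact flapping motion: {r:.6f}")
```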

ii. Isofrequency curves

One basic perceptual property of the voice is the pitch, identified with the fundamental frequency f0 of the vocal fold oscillation. The production of different pitch contours is central to language, as they affect the semantic content of speech, carrying accent and intonation information. Although experimental data on pitch control is scarce, it was reported that pitch is actively controlled by the laryngeal muscles and the subglottal pressure. In particular, when the vocalis or interarytenoid muscle is inactive, a rise in the subglottal pressure raises the pitch [16].

Compatible with these experimental results, we performed a theoretical analysis using Ps as a single control parameter for pitch. In the upper panels of Fig. 5, we show isofrequency curves in the range of normal speech for our model of Eqs. (1) to (10). Following the ideas developed in [22] for the avian case, we compare the behavior of the fundamental frequency with respect to pressure Ps in the two most usual cases presented in the literature: the cubic [1, 7] and the linear [10, 14] restitutions. In the lower panels of Fig. 5, we show the isofrequency curves that result from replacing the cubic restitution by a linear restitution $K_i(x_i) = k_i x_i + \Theta\left(\frac{x_i + x_0}{x_0}\right) 3 k_i (x_i + x_0)$.

Although the curves f0(Ps) are not affected by the type of restitution at the very beginning of oscillations, the changes become evident for higher values of Ps, with positive slopes for the cubic case and negative for the linear case. This result suggests that a nonlinear cubic restitution force is a good model for the elastic properties of the oscillating tissue.

IV. Conclusions

In this paper, we have analyzed a complete two-mass model of the vocal folds integrating collisions, nonlinear restitution and dissipative forces for the tissue and jets and viscous losses of the air-stream. In a framework of growing interest for detailed modeling of voice production, the aspects studied here contribute to understanding the role of the different physical terms in different dynamical behaviors.

We calculated the bifurcation diagram, focusing on two regimes: the oscillation onset and normal phonation. Near the parameters of normal phonation, a saddle-repulsor bifurcation takes place that modifies the shape of the limit cycle, contributing to the spectral richness of the glottal flow, which is central to the production of voiced sounds. With respect to the oscillation onset, we showed how jets and viscous losses intervene in the hysteresis phenomenon.

Many different models for the restitution properties of the tissue have been used across the literature, including linear and cubic functional forms. Yet their specific role had not been reported. Here we showed that the experimental relationship between subglottal pressure and pitch is reproduced by a cubic term.

Acknowledgements - This work was partially funded by UBA and CONICET.

[1] K Ishizaka, J L Flanagan, Synthesis of voiced sounds from a two-mass model of the vocal cords, Bell Syst. Tech. J. 51, 1233 (1972).

[2] I R Titze, The physics of small-amplitude oscillation of the vocal folds, J. Acoust. Soc. Am. 83, 1536 (1988).

[3] M A Trevisan, M C Eguia, G B Mindlin, Nonlinear aspects of analysis and synthesis of speech time series data, Phys. Rev. E 63, 026216 (2001).

[4] Y S Perl, E M Arneodo, A Amador, F Goller, G B Mindlin, Reconstruction of physiological instructions from Zebra finch song, Phys. Rev. E 84, 051909 (2011).

[5] E M Arneodo, Y S Perl, F Goller, G B Mindlin, Prosthetic avian vocal organ controlled by a freely behaving bird based on a low dimensional model of the biomechanical periphery, PLoS Comput. Biol. 8, e1002546 (2012).

[6] B H Story, I R Titze, Voice simulation with a body-cover model of the vocal folds, J. Acoust. Soc. Am. 97, 1249 (1995).

[7] J C Lucero, L Koenig, Simulations of temporal patterns of oral airflow in men and women using a two-mass model of the vocal folds under dynamic control, J. Acoust. Soc. Am. 117, 1362 (2005).

[8] X Pelorson, C Vescovi, E Castelli, A Hirschberg, A P J Wijnands, H M A Bailliet, Description of the flow through in-vitro models of the glottis during phonation. Application to voiced sounds synthesis, Acta Acust. 82, 358 (1996).

[9] M E Smith, G S Berke, B R Gerratt, Laryngeal paralyses: Theoretical considerations and effects on laryngeal vibration, J. Speech Hear. Res. 35, 545 (1992).

[10] I Steinecke, H Herzel, Bifurcations in an asymmetric vocal-fold model, J. Acoust. Soc. Am. 97, 1874 (1995).

[11] N J C Lous, G C J Hofmans, R N J Veldhuis, A Hirschberg, A symmetrical two-mass vocal-fold model coupled to vocal tract and trachea, with application to prosthesis design, Acta Acust. United Ac. 84, 1135 (1998).

[12] T Baer, Vocal fold physiology, University of Tokyo Press, Tokyo (1981).

[13] T Ikeda, Y Matsuzaki, T Aomatsu, A numerical analysis of phonation using a two-dimensional flexible channel model of the vocal folds, J. Biomech. Eng. 123, 571 (2001).

[14] J C Lucero, Dynamics of the two-mass model of the vocal folds: Equilibria, bifurcations, and oscillation region, J. Acoust. Soc. Am. 94, 3104 (1993).

[15] X Pelorson, A Hirschberg, R R van Hassel, A P J Wijnands, Y Auregan, Theoretical and experimental study of quasi-steady-flow separation within the glottis during phonation. Application to a modified two-mass model, J. Acoust. Soc. Am. 96, 3416 (1994).

[16] T Baer, Reflex activation of laryngeal muscles by sudden induced subglottal pressure changes, J. Acoust. Soc. Am. 65, 1271 (1979).

[17] J C Lucero, A theoretical study of the hysteresis phenomenon at vocal fold oscillation onset-offset, J. Acoust. Soc. Am. 105, 423 (1999).

[18] I Titze, Principles of voice production, Prentice Hall (1994).

[19] J Guckenheimer, P Holmes, Nonlinear oscillations, dynamical systems and bifurcations of vector fields, Springer (1983).

[20] E Doedel, AUTO: Software for continuation and bifurcation problems in ordinary differential equations, AUTO User Manual (1986).

[21] J Sitt, A Amador, F Goller, G B Mindlin, Dynamical origin of spectrally rich vocalizations in birdsong, Phys. Rev. E 78, 011905 (2008).

[22] A Amador, F Goller, G B Mindlin, Frequency modulation during song in a suboscine does not require vocal muscles, J. Neurophysiol. 99, 2383 (2008).

Creative Commons License. All the content of this journal, except where otherwise noted, is licensed under a Creative Commons License.