Trans Voice Literature Review Part 4
A cool crop of articles here. I set out to find pre-2020 articles, but recovered one 2023 article I missed in earlier searching. Additionally, an article was published just a few days ago, so I included it at the end, having already written most of this. We’re seeing a wide range of techniques being employed, with varying efficacy. While we have no breakout stars, Gallena et al. [16] is the first article I’ve seen that isolated F0, F1, F2, and F3 as test parameters, even if they were digitally altered.
-
Gallena, S. J. K., Stickels, B., & Stickels, E. (2018). Gender Perception After Raising Vowel Fundamental and Formant Frequencies: Considerations for Oral Resonance Research. Journal of Voice, 32(5), 592–601. https://doi.org/10.1016/j.jvoice.2017.06.023
TL;DR: The F0 of recordings of one cis male speaker was digitally altered to 175Hz, and F1 through F3 were gradually incremented to match recordings of one cis female speaker, creating 111 samples (79 modified) to be rated by 27 listeners. A moderate increase of F1-F3 together produced the highest perception of femininity, and a large increase of F1 alone also produced a perception of femininity.
Background: Prior research showed that the lowest F0 perceptually rated as feminine is 156Hz to 165Hz, yet when trans women increase their F0 to the gender ambiguous zone they are perceptually rated as masculine, implying that FF1-3 are the missing cues.
Results: For /ɑ/, F1-F3 together needed to be raised 20% to reach a 70% feminine rating, but F1, F2, or F3 alone needed to be raised 80% to reach the same rating. For /æ/, F1-F3 together needed to be raised 60%, or F1 alone 80% (F2 and F3 had no effect alone). For /i/, no sample was rated as not masculine. For /u/, almost all samples were rated as masculine.
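To make the percentages concrete, here is a minimal sketch of one way to read them. It assumes each step is a fraction of the gap between the male and female speakers' formant values, which this summary does not spell out. The function name `raise_formants` and the formant values are hypothetical round numbers for illustration, not data from the study.

```python
# Sketch only: assumes "raised N%" means N% of the way from the male
# speaker's formant value to the female speaker's value.

def raise_formants(male_hz, female_hz, percent, which=("F1", "F2", "F3")):
    """Move the formants named in `which` `percent` of the way toward female_hz."""
    fraction = percent / 100.0
    return {
        name: (hz + fraction * (female_hz[name] - hz)) if name in which else hz
        for name, hz in male_hz.items()
    }

# Hypothetical round numbers for illustration, not values from the study.
male = {"F1": 700.0, "F2": 1200.0, "F3": 2500.0}
female = {"F1": 900.0, "F2": 1500.0, "F3": 2800.0}

# All three formants raised 20% of the way toward the female values.
all_20 = raise_formants(male, female, 20)

# Only F1 raised 80% of the way; F2 and F3 stay at the male values.
f1_only_80 = raise_formants(male, female, 80, which=("F1",))
```

Under this reading, the /ɑ/ result compares a small shift applied to all three formants against a large shift applied to a single formant.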
Review: While this study has no techniques for voice training whatsoever, it does contribute some interesting findings regarding formant frequencies. Some of the present-day research has focused on F2 (such as Kawitzky & McAllister [1]), while legacy research focused on F0. This study finds that either F1 or the combination of F1 through F3 has the highest impact on getting voices into the feminine category. I went back to Hillenbrand et al. [1.a] (which was cited in the background portion of this paper as well) and pulled the F1 frequencies for reference. Leyns et al. [3] used PET and ART to increment FF0-4, mirroring this study’s finding that incrementing all of the formant frequencies has the best outcome. Bøyesen & Hide [4] used twang and medialization to increase FF0-2 (but not F3). Their conclusion is that F1 is controlled by “oral cavity openness”, which I interpreted as coming from twang. Overall, a fascinating contribution to the body of research.
-
Rapoport, S. K., Varelas, E. A., Park, C., Brown, S. K., Goldberg, L., & Courey, M. S. (2023). Patient Satisfaction and Acoustic Changes in Trans Women after Gender Affirming Voice Training. Laryngoscope, 133(9), 2340–2345. https://doi.org/10.1002/lary.30543
TL;DR: 34 participants underwent voice training that targeted resonance, prosody, and timbre. Most participants experienced an increase in F0. Six sessions were required for participants to reach a desired voice outcome. The authors assert that the focus of voice training should be on participant satisfaction (such as via the TWVQ [Trans Woman Voice Questionnaire]), not on observer ratings of femininity, so no rating was performed.
-
Hirsch, S. (2017). Combining Voice, Speech Science and Art Approaches to Resonant Challenges in Transgender Voice and Communication Training. Perspectives of the ASHA Special Interest Groups, 2(10), 74–82. https://doi.org/10.1044/persp2.SIG10.74
TL;DR: The author presents a framework for a training program for both transfem and transmasc voice goals, providing outcomes for vowels, voiced consonants, nasal consonants, liquid consonants, and open or closed vocal instrument at the end of an utterance. The author has developed this program over 10 years of experience with clients but did not clinically study the efficacy, so the results are anecdotal.
Background: References Carew et al. [2], who claim that raising F3 leads to increased perceptual naturalness. References Garon et al. (2015), which found that F2 increased after acoustic articulatory mapping. Leverages the high resonant forward focus of /i/ to model femininity and low round vowels for masculinity.
Process: Discuss the anatomy and physiology of speech sounds and phonemes. Go through each phoneme category and explain the challenges/solutions. Starting with phrases used often, read through each phrase slowly - phoneme by phoneme. Identify the potential acoustic challenges in each phoneme. Articulate silently, practicing the new articulatory gestures, then speak the sentence and repeat with the edit. Do this many times with different strategies until fully understood.
Transfem framework (problem; solution): Voiced sounds are darker than voiceless; avoid an acoustic burst by articulating with a lighter touch. The nasal sounds are constricted, pressed, dark; lighten or loosen the contact as much as possible, e.g., light touch of tongue on teeth for /n/, lips barely held together for /m/. Vowels are mostly dark except for /i/; produce all vowels with an unstretched /i/ lip shape to lighten tone, almost as if slightly grinning. Liquid sounds are all tense and dark; e.g., loosen lips for /w/, pronounce /ɝ/ with a mid mouth feeling. A closed vocal instrument is dark; keeping the mouth open beyond the end of a sentence facilitates a graceful finish.
Transmasc framework (problem; solution): Voiced sounds are darker than voiceless; a little extra pressure and loudness on the contact in /b/, /g/. The nasal sounds are produced with moderate contact; without pinching add more pressure. The vowels are too light, especially /i/; produce all vowels with a rounded or oval shape as much as possible to darken the tone, a small bit of lip protrusion to enhance, relax the jaw. Liquid sounds; maintain tension to create a greater release after pressure buildup. An open vocal instrument at the end of a sentence is light; a slight abrupt close will capture the dark tones.
Review: Despite not being purely academic, this is really the first article to provide a procedure with specificity. As a linguist who enjoys phonetics and phonology, the whole premise appeals to me. It builds on the conclusion of Merritt & Bent [9.a] that articulatory cues were slightly more important than intonation for perceptual gender when formants were in an ambiguous zone. At least we now know what those articulations are, which might help us plan things out.
-
Garon, G., Nichols, K., Lucarelli, M., Ellis, L., & Menezes, C. (2015, November). Articulatory-Acoustic Mapping in a Single Case Study of Male to Female Transgender Reparation Therapy. Presented at the 2015 Annual Convention of the American Speech-Language-Hearing Association, Denver, CO. [Poster 199]
-
Menezes, C., Lucarelli, M., Koesters, T., Ruta, K., Rymers, A., & Turshon, K. (2019). Articulating a Female Vowel: Male to Female Transgender Voice Therapy. In: Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (Eds.) Proceedings of the 19th International Congress of Phonetic Sciences (pp. 3011 - 3015). https://www.internationalphoneticassociation.org/icphs-proceedings/ICPhS2019/
TL;DR: I pulled this as a replacement for Garon et al. (2015). Two participants applied different articulatory strategies over the course of one year. Acoustic analysis of F1 and F2 showed movement towards a feminine vowel space for /i/, /ɑ/, /u/, and /æ/.
Methodology: Participant 1 focused first on raising F0, then second on forward articulation. Participant 2 focused simultaneously on F0 and articulation (for resonance). An Electromagnetic Articulograph was used to determine the position of the tongue while recording samples.
Results: Participant one increased F1 and F2 to nearly the control female values. Participant two increased F1 and F2 for all vowels but /i/, which had a higher F2 but a lower F1. Both participants ended with a lower tongue height than they started with, but had opposite trajectories for tongue forwardness.
Review: Small sample size, but further building up the literature focused on the efficacy of articulatory training. It does raise some questions about the physiological tongue movement, though, given that the participants used opposite forwardness.
-
Stewart, C. F., & Kling, I. F. (2017). University Practicum for Transgender Voice Modification: A Motor Learning Perspective. Perspectives of the ASHA Special Interest Groups, 2(10), 102–108. https://doi.org/10.1044/persp2.SIG10.102
TL;DR: Commentary on best practices for student clinicians treating trans voice patients. Auditory feedback is misleading because elevation of pitch can be achieved with counterproductive stiffening of the larynx and vocal tract. SOVT (straw phonation, voiced fricatives, /mhm/) enhances the sympathetic vibrations, heightening the tactile/kinesthetic feedback, which is a more accurate gauge of ease versus effort. Random practice promotes better retention than blocked practice.
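For anyone unfamiliar with the blocked versus random distinction, here is a minimal sketch of the two schedule types. Using the SOVT exercises as the practice items, the repetition count, and the function names are all my own assumptions for illustration; the article does not prescribe this schedule.

```python
import random

def blocked_schedule(tasks, reps):
    """Blocked practice: finish every repetition of one task before the next."""
    return [task for task in tasks for _ in range(reps)]

def random_schedule(tasks, reps, seed=None):
    """Random practice: the same repetitions, interleaved in shuffled order."""
    schedule = blocked_schedule(tasks, reps)
    random.Random(seed).shuffle(schedule)
    return schedule

# Hypothetical practice items, borrowed from the SOVT examples above.
tasks = ["straw phonation", "voiced fricatives", "/mhm/"]

blocked = blocked_schedule(tasks, 3)       # AAA BBB CCC
mixed = random_schedule(tasks, 3, seed=1)  # same nine items, shuffled
```

Both schedules contain the same amount of practice per task; only the ordering differs.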
-
Schwarz, K., Fontanari, A. M. V., Costa, A. B., Soll, B. M. B., da Silva, D. C., de Sá Villas-Bôas, A. P., Cielo, C. A., Bastilha, G. R., Ribeiro, V. V., Dorfman, M. E. K. Y., & Lobato, M. I. R. (2018). Perceptual-Auditory and Acoustical Analysis of the Voices of Transgender Women. Journal of Voice, 32(5), 602–608. https://doi.org/10.1016/j.jvoice.2017.07.003
TL;DR: 32 trans women and 28 cis women were recorded for comparison. No voice training was administered, and prospective participants who previously had voice training were omitted. Study focus was on evaluating untrained trans women voices. SLP raters evaluated voices for gender and GRBASI (dysphonia, roughness, breathiness, asthenia, strain, and instability). Trans women exhibited hypernasality and roughness, indicating improper vocal compensations. Trans women (even without voice training) were rated as ambiguous perceptual gender (by SLPs).
Review: Here at least part of the use of SLPs is kind of justified, given that they were also analysing GRBASI in addition to perceptual gender. The presence of hypernasality and roughness indicates that trans women without voice training are still attempting to alter their voices, though to a mild voice quality detriment. As established by Holmberg et al. [7], SLPs are bad at gendering voices, so we have to take the assertion that transfem voices are ambiguous without training with skepticism.
-
Irineu, R. A., Dassie-Leite, A. P., Pereira, E. C., Ferreira, T., & Martins, P. N. (2025). Vocal Markers in the Gender Perception of Trans Women and Trans Men. Journal of Voice. https://doi.org/10.1016/j.jvoice.2025.03.002
TL;DR: 30 trans women and 23 trans men participants were recorded. No voice training was administered, and those with prior voice training were omitted. HRT was not considered an exclusion factor, so trans men with testosterone-related voice changes were included. SLP raters evaluated voices for gender and GRBASI (dysphonia, roughness, breathiness, asthenia, strain, and instability). Femininity was associated with high pitch, articulation, and ascending intonation, while masculinity was related to tense vocal quality, loudness, descending intonation, and low pitch. Ascending intonation was the most significant predictor of femininity, and F1 frequency was the most significant predictor of masculinity.
Review: This shares some similarity with Schwarz et al. [20], most notably in that it is looking at untrained voices. The finding that F1 was fundamental to masculinity kind of lines up with Gallena et al.’s [16] finding that for /æ/ a high F1 led to perceptual femininity.