Michigan State University

Related Projects

The audio recordings in the Tone Perfect database are related to two larger studies oriented around human perception of tone in language, particularly in Mandarin, conducted by Catherine Ryu of Michigan State University’s Department of Linguistics and Languages.

The recordings in the Tone Perfect database have also been used to create the works listed below.

  • Picky Birds, a game for learning tonal languages. More information about Picky Birds is available in news coverage from Michigan Radio and MSU Today.
  • Lingua Incognita, a tone-audio piece by Ben Fuhrman
  • Heart Doubt, a piece of experimental electronic music by Ben Fuhrman and Catherine Ryu that uses sounds from Tone Perfect
  • KUAI, an app by Matthew Scholefield that teaches basic Mandarin pronunciation by providing random syllables
  • Mandarin Sound Table by Matthew Scholefield, for exploring sounds in Mandarin Chinese
  • Mandarin Tone Machine Learning Project by Alice Xue

Heart Doubt

Contact Information

Heart Doubt is a piece of experimental electronic music. It uses both the audio assets from the Tone Perfect database and dramatic utterances of “I love you” by two native speakers (male and female), expressing a wide range of emotions such as anger, irony, tenderness, irritation, solemnity, and playfulness. As such, this piece infuses the richness of Mandarin lexical and emotional tones into a novel sonic experience. To fully appreciate the acoustic intricacy of this piece, a high-quality headset is recommended.

Voice actors: Zijin Liu and Haitian Yan

Program Notes: Benjamin R. Fuhrman

Commissioned by Catherine Ryu of Michigan State University’s Department of Linguistics and Languages, Heart Doubt was created out of the Mandarin syllables “wo ai ni” (trans. “I love you”) to show off the creative possibilities of the Tone Perfect: Multimodal Database for Mandarin Chinese (https://tone.lib.msu.edu). Using these syllables, this piece explores the initial stage of interest and doubt at the beginning of a relationship. The male and female speakers are essentially engaged in an examination of their feelings for each other in a classic “she loves me, she loves me not” type game, trying to make sense of their feelings as outside thoughts intrude on their inner monologues.

The sounds in Heart Doubt were created by first quantizing the speech segments to begin on a controlled pitch (while still allowing for the pitch contour changes of Mandarin) and then running them through multiple instances of the IRCAM Multi Granular and IRCAM Stretch sample oscillators in UVI’s Falcon instrument to create different sonic textures. Additional textural work was done in PaulStretch to produce the “monk choir” chanting the syllables that control the action in each of the three sections of the piece: wo (0:00-2:30), ai (2:30-3:30), and ni (3:30-5:00). The enveloping of each section was roughly mapped to the pitch contour of the respective control syllable in terms of both volume and effects automation.

Additional sounds were created on a modular synthesizer by feeding the DAW output to an envelope follower and using its output to control triggering, additional envelopes, and oscillators, creating the bell and chime sounds heard at different points throughout the piece.
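The three-section structure in the program notes, with each section's envelope roughly following the pitch contour of its control syllable, can be illustrated with a small sketch. This is not the composer's actual automation. The section times come from the notes, but the contour values (citation tones written as Chao tone numerals), the linear interpolation, and the names `section_at` and `gain_at` are assumptions made for illustration only.

```python
# Section boundaries in seconds, taken from the program notes:
# wo (0:00-2:30), ai (2:30-3:30), ni (3:30-5:00).
SECTIONS = [("wo", 0, 150), ("ai", 150, 210), ("ni", 210, 300)]

# Assumption: citation tones as Chao tone numerals (1 = low, 5 = high).
# wo and ni are treated as third tone (2-1-4), ai as fourth tone (5-1).
CONTOURS = {"wo": [2, 1, 4], "ai": [5, 1], "ni": [2, 1, 4]}


def section_at(t):
    """Return (syllable, start, end) for the section containing time t."""
    for name, start, end in SECTIONS:
        if start <= t < end:
            return name, start, end
    raise ValueError(f"time {t} s is outside the piece")


def gain_at(t):
    """Return a 0..1 gain that follows the control syllable's contour."""
    name, start, end = section_at(t)
    levels = CONTOURS[name]
    # Position along the contour, measured in contour segments.
    pos = (t - start) / (end - start) * (len(levels) - 1)
    i = min(int(pos), len(levels) - 2)
    frac = pos - i
    level = levels[i] + frac * (levels[i + 1] - levels[i])
    # Map tone levels 1..5 onto gains 0..1.
    return (level - 1) / 4
```

Under these assumptions, the gain dips in the middle of the wo and ni sections and falls steadily through the ai section, which is one simple reading of "roughly mapped to the pitch contour."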

Mandarin Sound Table

This sound table contains all monosyllabic sounds in Mandarin Chinese in four tones. If you do not know what kinds of sounds exist in Mandarin Chinese, explore them through this sound table.

Explore the full sound table


KUAI

The KUAI app teaches basic Mandarin pronunciation by playing random syllables and asking the user to transcribe them in pinyin.
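The drill described above can be sketched in a few lines. This is a minimal illustration, not the app's code. The sample prompts, the numbered-tone pinyin input format (for example "wo3"), and the names `pick_prompt` and `is_correct` are all assumptions.

```python
import random

# Hypothetical sample of (syllable, tone) prompts; the app itself would
# draw on a much larger inventory of recorded syllables.
PROMPTS = [("ma", 1), ("ma", 3), ("shi", 4), ("wo", 3), ("ai", 4), ("ni", 3)]


def pick_prompt(rng):
    """Choose one syllable at random to play to the learner."""
    return rng.choice(PROMPTS)


def is_correct(answer, prompt):
    """Check a numbered-pinyin answer such as 'wo3' against the prompt."""
    syllable, tone = prompt
    return answer.strip().lower() == f"{syllable}{tone}"
```

A session would call `pick_prompt(random.Random())`, play the matching recording, and pass the learner's typed answer to `is_correct`.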

Mandarin Tone Machine Learning Project

Mandarin Chinese is a tonal language in which detecting the tone is crucial to understanding a word’s meaning. However, for people with cochlear implants and for non-native speakers of Chinese, tone identification is a challenging task, which creates a need for an automated tone recognition system. This study trained a convolutional neural network (CNN) to classify Mandarin tones from audio recordings. The training data was a monosyllabic Mandarin Chinese dataset of 9,860 audio files. The network was trained on male, female, or combined data, and for each dataset split, mel-frequency cepstral coefficients (MFCCs), mel-spectrograms, or pitch contours were extracted from the audio files and fed into the CNN as input features. The highest test accuracy achieved in this research is 99.8%, which outperforms models trained on monosyllabic Mandarin datasets reported in previous literature. Results indicate that separating audio files by gender when training the neural network yields the highest test accuracies, which has important implications for future research in tone recognition.
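The experimental design in this summary crosses three speaker splits with three input feature types. The sketch below only enumerates those combinations and does not extract features or train a model. The identifiers `SPEAKER_SPLITS`, `FEATURES`, and `experiment_grid` are hypothetical names, not taken from the project.

```python
from itertools import product

# Speaker splits and feature types named in the project summary.
SPEAKER_SPLITS = ("male", "female", "combined")
FEATURES = ("mfcc", "mel_spectrogram", "pitch_contour")


def experiment_grid():
    """List every (speaker split, feature type) training configuration."""
    return [
        {"split": split, "feature": feature}
        for split, feature in product(SPEAKER_SPLITS, FEATURES)
    ]


if __name__ == "__main__":
    for config in experiment_grid():
        print(config["split"], config["feature"])
```

Listing the grid this way makes it clear that the reported comparison between gender-separated and combined training spans nine configurations, assuming each split was paired with each feature type as the summary suggests.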