This article provides a step-by-step guide on implementing a simple text-to-speech (TTS) application using C# and the System.Speech.Synthesis namespace. It explains the code used, including creating a SpeechSynthesizer instance and synthesizing speech from text, and offers suggestions for exploring more advanced TTS features.

Introduction

This tip explains how to implement a text-to-speech (TTS) application in C# using the System.Speech.Synthesis namespace. TTS technology has many practical use cases, such as in accessibility tools and speech-enabled applications. By following the step-by-step guide provided, readers will be able to create a simple console application that synthesizes speech from text. The article also provides an explanation of the code used and offers suggestions for exploring more advanced features of the SpeechSynthesizer class.

Implementing Text-to-Speech in C# using System.Speech.Synthesis

Text-to-speech (TTS) technology has been around for a while and has found many use cases, such as in language learning, accessibility tools for visually impaired individuals, and in speech-enabled applications. In this article, we will explore how to implement a simple TTS application using C# and the System.Speech.Synthesis namespace.

Prerequisites

Before we begin, you need to have the following installed on your machine:

  • .NET Framework 4.6.1 or higher
  • Visual Studio 2017 or higher

Implementation

    We will be using the System.Speech.Synthesis namespace, which provides classes for synthesizing speech from text. Follow the steps below to create a console application in C# and implement TTS.

  • Open Visual Studio and create a new Console Application project.
  • Add a reference to the System.Speech assembly. Right-click on the project in Solution Explorer, select Add Reference, and then choose System.Speech from the list of assemblies.
  • In the Program.cs file, add the following code:

    using System;
    using System.IO;
    using System.Speech.Synthesis;

    class Program
    {
        static void Main(string[] args)
        {
            // Basic TTS
            SpeechSynthesizer synth = new SpeechSynthesizer();
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hello, world!");

            // Changing the Voice
            synth.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult);
            synth.Speak("Hello, I am a female voice!");

            // Changing the Rate and Volume
            synth.Rate = -2;
            synth.Volume = 100;
            synth.Speak("Hello, I am speaking slower and louder!");

            // Pausing and Resuming Speech
            synth.Speak("Hello, I will pause for 3 seconds now.");
            synth.Pause();
            System.Threading.Thread.Sleep(3000); // wait for 3 seconds
            synth.Resume();
            synth.Speak("I am back!");

            // Saving Speech to a WAV File
            synth.SetOutputToWaveFile("output.wav");
            synth.Speak("Hello, I am saving my speech to a WAV file!");

            // Setting the Speech Stream
            MemoryStream stream = new MemoryStream();
            synth.SetOutputToWaveStream(stream);
            synth.Speak("Hello, I am being streamed to a memory stream!");
            byte[] speechBytes = stream.ToArray(); // copies only the bytes written so far

            // Changing the Voice and Pronunciation
            PromptBuilder builder = new PromptBuilder();
            builder.StartVoice(VoiceGender.Female, VoiceAge.Adult, 1);
            builder.AppendText("Hello, my name is Emily.");
            builder.EndVoice(); // every StartVoice needs a matching EndVoice
            builder.StartVoice(VoiceGender.Female, VoiceAge.Teen, 2);
            builder.AppendText("I am from New York City.");
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong });
            builder.AppendText("I really love chocolate!");
            builder.EndStyle();
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Reduced });
            builder.AppendText("But I'm allergic to it...");
            builder.EndStyle();
            builder.EndVoice();

            // Route output back to the speakers; it still points at the memory stream
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak(builder);

            Console.ReadLine();
        }
    }

    Code Outline

  • Basic TTS

    Creates a SpeechSynthesizer instance and synthesizes the text "Hello, world!" using the default audio device.

  • Changing the Voice

    Selects a female adult voice and synthesizes the text "Hello, I am a female voice!" using that voice.
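SelectVoiceByHints can only pick from voices that are actually installed and enabled on the machine, so it can help to list them first. The following is a minimal sketch, not part of the original article: the class name ListVoices and the "de-DE" culture are illustrative assumptions, and the output depends entirely on which voices Windows has installed.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Speech.Synthesis;

class ListVoices
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Print every voice the synthesizer can see on this machine.
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine($"{info.Name} | {info.Culture} | {info.Gender} | {info.Age} | enabled: {voice.Enabled}");
            }

            // Look for an enabled voice for one culture ("de-DE" is only an example).
            var culture = new CultureInfo("de-DE");
            InstalledVoice match = synth.GetInstalledVoices(culture).FirstOrDefault(v => v.Enabled);
            if (match == null)
            {
                Console.WriteLine($"No enabled voice found for {culture.Name}.");
                return;
            }

            // Selecting by exact name avoids relying on hint matching.
            synth.SelectVoice(match.VoiceInfo.Name);
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hallo Welt!");
        }
    }
}
```

If no voice is listed for a culture, hint-based selection has nothing to match for that language.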

  • Changing the Rate and Volume

    Sets the speech rate to -2 (slower) and the volume to 100 (louder), and synthesizes the text "Hello, I am speaking slower and louder!".
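The Rate property accepts values from -10 to 10 and Volume from 0 to 100. When these values come from user input, it may be worth clamping them before assigning. The sketch below assumes a hypothetical helper named Clamp and made-up input values; it is one way to do this, not something the original code requires.

```csharp
using System;
using System.Speech.Synthesis;

class RateAndVolume
{
    // Hypothetical helper: forces value into [min, max].
    // Written by hand because Math.Clamp is not available on .NET Framework.
    public static int Clamp(int value, int min, int max)
    {
        return Math.Max(min, Math.Min(max, value));
    }

    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();

            // Made-up values standing in for user input; both are out of range.
            int requestedRate = -15;
            int requestedVolume = 120;

            synth.Rate = Clamp(requestedRate, -10, 10);    // becomes -10
            synth.Volume = Clamp(requestedVolume, 0, 100); // becomes 100

            synth.Speak("Hello, I am speaking at the slowest rate and full volume.");
        }
    }
}
```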

  • Pausing and Resuming Speech

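    Synthesizes the text "Hello, I will pause for 3 seconds now.", pauses the synthesizer for 3 seconds, and then resumes it and synthesizes the text "I am back!". Because Speak is synchronous, the pause falls between the two utterances rather than in the middle of one.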
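To pause in the middle of an utterance, the speech has to be started asynchronously so that the calling thread is free to call Pause while audio is still playing. The following sketch assumes SpeakAsync together with the SpeakCompleted event; the class name, the sample sentence, and the sleep durations are illustrative choices.

```csharp
using System;
using System.Speech.Synthesis;
using System.Threading;

class PauseResume
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        using (var done = new ManualResetEventSlim(false))
        {
            synth.SetOutputToDefaultAudioDevice();

            // Signal the main thread once the queued prompt has finished.
            synth.SpeakCompleted += (sender, e) => done.Set();

            // SpeakAsync returns immediately; audio plays in the background.
            synth.SpeakAsync("This is a fairly long sentence, so that there is enough time to pause it before it ends.");

            Thread.Sleep(1500);  // let some of the sentence play
            synth.Pause();       // stops playback part-way through
            Console.WriteLine("Paused for 3 seconds...");
            Thread.Sleep(3000);
            synth.Resume();      // continues from where it stopped

            done.Wait();         // keep the process alive until speech ends
        }
    }
}
```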

  • Saving Speech to a WAV File

    Sets the output of the SpeechSynthesizer to a WAV file named "output.wav", and synthesizes the text "Hello, I am saving my speech to a WAV file!".
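When the file is needed right after synthesis, it is safer to make sure the synthesizer has let go of it first. A minimal sketch, assuming a hypothetical file name in the temp folder, that detaches the output with SetOutputToNull and disposes the synthesizer before the file is inspected:

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;

class SaveToWav
{
    static void Main()
    {
        // Hypothetical output location; any writable path works.
        string path = Path.Combine(Path.GetTempPath(), "tts-demo.wav");

        using (var synth = new SpeechSynthesizer())
        {
            synth.SetOutputToWaveFile(path);
            synth.Speak("Hello, I am saving my speech to a WAV file!");

            // Detach the output so the file is closed before anyone reads it.
            synth.SetOutputToNull();
        }

        Console.WriteLine($"Wrote {new FileInfo(path).Length} bytes to {path}");
    }
}
```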

  • Setting the Speech Stream

    Sets the output of the SpeechSynthesizer to a memory stream, synthesizes the text "Hello, I am being streamed to a memory stream!", and gets the resulting speech bytes from the memory stream.
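The bytes captured this way are only useful if something consumes them. As a hedged illustration, the sketch below copies them out with ToArray (which returns only the bytes written, whereas GetBuffer returns the whole internal buffer, unused space included) and writes them to a file. The file name from-stream.wav is an assumption, and detaching the output before reading is a cautious choice rather than a documented requirement.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;
using System.Text;

class StreamToBytes
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        using (var stream = new MemoryStream())
        {
            synth.SetOutputToWaveStream(stream);
            synth.Speak("Hello, I am being streamed to a memory stream!");

            // Detach the stream before reading it (cautious, see lead-in).
            synth.SetOutputToNull();

            // ToArray copies exactly the bytes that were written.
            byte[] wav = stream.ToArray();

            // Quick sanity check: WAV data normally starts with the ASCII tag "RIFF".
            bool startsWithRiff = wav.Length >= 4 && Encoding.ASCII.GetString(wav, 0, 4) == "RIFF";
            Console.WriteLine($"Captured {wav.Length} bytes, RIFF header present: {startsWithRiff}");

            // Do something with the bytes; here they are simply saved to disk.
            File.WriteAllBytes("from-stream.wav", wav);
        }
    }
}
```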

  • Changing the Voice and Pronunciation

    Uses the PromptBuilder class to create a more complex prompt, changing the voice for certain parts of the prompt, and adding emphasis and reduced emphasis to certain parts of the prompt. The resulting prompt is then synthesized using the SpeechSynthesizer.
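A PromptBuilder produces SSML markup, and printing that markup with ToXml is a simple way to check that every StartVoice and StartStyle call has a matching EndVoice or EndStyle before the prompt is spoken. The sketch below is illustrative only; the class and method names are assumptions, and the exact XML printed depends on the System.Speech version.

```csharp
using System;
using System.Speech.Synthesis;

class InspectPrompt
{
    // Builds a small prompt with balanced Start/End calls.
    public static PromptBuilder BuildPrompt()
    {
        var builder = new PromptBuilder();

        builder.StartVoice(VoiceGender.Female, VoiceAge.Adult);
        builder.AppendText("Hello, my name is Emily.");
        builder.EndVoice(); // closes the voice element opened above

        builder.StartStyle(new PromptStyle { Emphasis = PromptEmphasis.Strong });
        builder.AppendText("I really love chocolate!");
        builder.EndStyle(); // closes the style element opened above

        return builder;
    }

    static void Main()
    {
        // Print the generated SSML instead of speaking it.
        Console.WriteLine(BuildPrompt().ToXml());
    }
}
```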

    These code examples demonstrate some of the basic and advanced functionality of the SpeechSynthesizer class, including changing the voice, rate, and volume, pausing and resuming speech, and saving synthesized speech to a file or memory stream.

    History

  • 16th March, 2023: Initial version

    As an Artificial Intelligence Engineer with over 15 years of experience, I excel in innovating, designing, and developing state-of-the-art technology. My expertise lies in coding complex algorithms, and engineering robots to automate tasks with precision and efficiency. I'm a passionate coder who is always exploring cutting-edge technologies and pushing the boundaries of what's possible.

    Hello,
    I tried to add English, French, and German voices to the project. It worked for English and French pronunciation but not for German; the German text was still spoken with the French voice. How are voices for different languages installed on Windows? Does anybody have experience with this problem?
    Here is the modified code I tested:

    using System;
    using System.IO;
    using System.Speech.Synthesis;

    class Program
    {
        static System.Globalization.CultureInfo MyCultureInfo = new System.Globalization.CultureInfo("en-US"); // English voice is working correctly
        static System.Globalization.CultureInfo MyCultureInfoGerman = new System.Globalization.CultureInfo("de-DE"); // not working, stays a French voice as standard voice!

        static void Main(string[] args)
        {
            SpeechSynthesizer synthD = new SpeechSynthesizer();
            synthD.SelectVoiceByHints(VoiceGender.Male, VoiceAge.Adult, 1, MyCultureInfoGerman);
            synthD.SetOutputToDefaultAudioDevice();
            synthD.Speak("Hallo, bitte starten Sie jetzt die Initialisierung, falls die Maschine bereit ist!"); // stays standard French voice - why??

            // Basic TTS
            SpeechSynthesizer synth = new SpeechSynthesizer();
            synth.SelectVoiceByHints(VoiceGender.Male, VoiceAge.Adult, 1, MyCultureInfo);
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak("Hello, world!");

            // Changing the Voice
            synth.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult, 0, MyCultureInfo);
            synth.Rate = 2;
            synth.Volume = 40;
            synth.Speak("Hello, I am a female voice!");

            // Changing the Pitch and Rate
            synth.Rate = -2;
            synth.Volume = 100;
            synth.Speak("Hello, I am speaking slower and louder!");

            // Pausing and Resuming Speech
            synth.Speak("Hello, I will pause for 3 seconds now.");
            synth.Pause();
            System.Threading.Thread.Sleep(3000); // wait for 3 seconds
            synth.Resume();
            synth.Speak("I am back!");

            // Saving Speech to a WAV File
            synth.SetOutputToWaveFile("output.wav");
            synth.Speak("Hello, I am saving my speech to a WAV file!");

            // Setting the Speech Stream
            MemoryStream stream = new MemoryStream();
            synth.SetOutputToWaveStream(stream);
            synth.Speak("Hello, I am being streamed to a memory stream!");
            byte[] speechBytes = stream.GetBuffer();

            // Changing the Voice and Pronunciation
            PromptBuilder builder = new PromptBuilder(MyCultureInfo);
            builder.StartVoice(VoiceGender.Female, VoiceAge.Adult, 1);
            builder.AppendText("Hello, my name is Emily.");
            builder.EndVoice();
            builder.StartVoice(VoiceGender.Female, VoiceAge.Teen, 2);
            builder.AppendText("I am from New York City.");
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong });
            builder.AppendText("I really love chocolate!");
            builder.EndStyle();
            builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Reduced });
            builder.AppendText("But I'm allergic to it...");
            builder.EndStyle();
            builder.EndVoice();
            synth.SetOutputToDefaultAudioDevice();
            synth.Speak(builder);

            Console.ReadLine();
        }
    }

    Clearly there are a few things missing here.
    1) Using VS2022, I had to download and install System.Speech > 8.0.0-preview.2.23128.3 using the NuGet Package Manager.
    2) The code 'as is' throws exceptions like 'Cannot generate SSML data: Voice element not closed.' MS should change this message to say something like 'Close Voice element using EndVoice()'
    3) The code following '// Setting the Speech Stream' really does nothing, because the speechBytes are not used anywhere! (Am I missing something??)
    4) To hear the text synthesized in the 'builder' you will need to add the following before the synth.Speak(builder) line
    synth.SetOutputToDefaultAudioDevice(); //otherwise this is going to WavStream??
    5) In System.Speech > 8.0.0-preview.2.23128.3 VoiceAge.Teen sounds exactly like VoiceAge.Adult .
    So, the code that appears to work on my system looks like this:

    SpeechSynthesizer synth = new SpeechSynthesizer();
    synth.SetOutputToDefaultAudioDevice();
    synth.Speak("Hello, world!");

    // Changing the Voice
    synth.SelectVoiceByHints(VoiceGender.Female, VoiceAge.Adult);
    synth.Speak("Hello, I am a female voice!");

    // Changing the Pitch and Rate
    synth.Rate = -2;
    synth.Volume = 100;
    synth.Speak("Hello, I am speaking slower and louder!");

    // Pausing and Resuming Speech
    synth.Speak("Hello, I will pause for 3 seconds now.");
    synth.Pause();
    System.Threading.Thread.Sleep(3000); // wait for 3 seconds
    synth.Resume();
    synth.Speak("I am back!");

    // Saving Speech to a WAV File named 'OUTPUT.WAV'
    synth.SetOutputToWaveFile("output.wav");
    synth.Speak("Hello, I am saving my speech to a WAV file!");

    // Setting the Speech Stream
    MemoryStream stream = new MemoryStream();
    synth.SetOutputToWaveStream(stream);
    synth.Speak("Hello, I am being streamed to a memory stream!");
    byte[] speechBytes = stream.GetBuffer();

    // Changing the Voice and Pronunciation
    PromptBuilder builder = new PromptBuilder();
    builder.StartVoice(VoiceGender.Female, VoiceAge.Adult, 1);
    builder.AppendText("Hello, my name is Emily.");
    builder.EndVoice(); // Added to make it work
    builder.StartVoice(VoiceGender.Female, VoiceAge.Teen, 2);
    builder.AppendText("I am from New York City.");
    builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong });
    builder.AppendText("I really love chocolate!");
    builder.EndStyle();
    builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Reduced });
    builder.AppendText("But I'm allergic to it...");
    builder.EndStyle();
    builder.EndVoice(); // Added to make it work
    synth.SetOutputToDefaultAudioDevice(); // Added otherwise this is going to WavStream??
    synth.Speak(builder); // error without the EndVoice above >> 'Cannot generate SSML data: Voice element not closed.'

    Code had an error: you didn't call "EndVoice()" after the "StartVoice()" calls
    SBGTrading 18-Mar-23 12:33
    I had to add "builder.EndVoice();" calls after the Emily append and the New York City append statements.

    Interesting, even if the article is far from exhaustive; in particular, it only deals with .NET Framework, not .NET.

    You're missing an EndVoice in the builder section

    A few months ago I looked into the code available in this namespace, after spending some time with Edge's Read Aloud function. Edge apparently uses a completely different TTS engine, and comparing the results, it's obvious Edge's sounds a lot better. Like, it's in a completely different league.
    Have you looked into what Edge is using? Any idea if there's some APIs one can hook into to use Edge's TTS rather than what's available here?

    Maybe this is what you are referring to?
    How to get new edge's TTS voices into C# .NET winforms app?
    Text to Speech – Realistic AI Voice Generator | Microsoft Azure
    Bringing cloud powered voices to Microsoft Edge Insiders - Microsoft Edge Blog
    Just guessing and copy-pasting URLs.