Tips for Using Free Text to Speech (TTS) with Switchvox Phone Systems

By Dan Ribar

It’s exciting to see how easy it is for our resellers and customers to customize their phone systems to meet their business needs – and then be willing to share their tips with others. In this guest blog post, Dan Ribar, CIO of 1st Guard Corporation, explains how to use a free version of Text to Speech in a Switchvox phone system. Thanks for sharing, Dan!

The natural progression for a new telephone system is typically to find its way to some form of dynamic IVR that looks to a database for its information. My office has had the Digium Switchvox system for a couple of years now, and I feel pretty good about writing basic IVRs. Now it’s time for some cool text-to-speech (TTS).

Why do I need TTS? In a basic IVR, you can record messages that may never need to change, like “Press one to reach customer service.” But you may want more dynamic text coming back to the end user, like “Your account is in great standing. Your account balance is $123.45 and you have one pending claim….” We wanted that dynamic text and began looking for a way to handle it.

After a few weeks of research, I found several options.

1.  Buy a TTS engine that runs on your server.  (This will cost $1,000 and up.)

2.  Use a free TTS web service.  (I couldn’t find any voices that could be easily understood so this wasn’t a good option.)

3.  Use a paid TTS web service.  (Well, cost is always an issue, but more importantly I didn’t want to rely on the performance of a web-based service to feed a dynamic IVR.)

4.  Use the TTS built into the Microsoft .NET Framework. (Sounds great, but it requires a physical server with a sound card. Not a good option since my office is 100% virtual server based.)

5.  Write your own.

Now, option five seems a bit daunting. I mean, how do you write a TTS engine? The simple answer is that you don’t. What you can do is package some free components to make it all work. Here’s how it was implemented at my office:

I downloaded the free TTS engine called eSpeak and installed it on the IIS server. If you want to see how it works, just install it on your PC and try out the command line.

Then I used Visual Studio to write (or, in this case, enhance) the HTTP listener for the TTS requests. I wanted a REST-like solution so that it would integrate well with the IVR on the Switchvox.

Some code would be nice, too. This is what the listener looks like:

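Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    If Not IsPostBack Then
        Dim myWords As String = Request.QueryString("myWords")

        ' Build a time-stamped name for the output file.
        ' Assumed declaration (not shown in the listing); in practice, start with a full path to a writable folder.
        Dim FileName As String = ""
        FileName &= Trim(Now().Year.ToString) & Trim(Now().DayOfYear.ToString) & Trim(Now().Hour.ToString) & Trim(Now().Minute.ToString) & Trim(Now().Millisecond.ToString)
        FileName &= ".wav"

        Dim p As New Diagnostics.Process
        ' eSpeak options:
        ' s = speed
        ' p = pitch
        ' a = amplitude or volume
        ' w = write the speech to this WAV file
        Dim args As String = "-v en-us -s 150 -a 120 -w " & FileName & " """ & myWords & """ "
        p.StartInfo.Arguments = args
        p.StartInfo.FileName = "d:/data/espeak/command_line/espeak.exe"
        p.StartInfo.UseShellExecute = False
        p.StartInfo.CreateNoWindow = True
        p.StartInfo.RedirectStandardError = True

        ' Assumed step: run eSpeak, read anything it reports on stderr, then wait for it to exit.
        p.Start()
        Dim ttsErrors As String = p.StandardError.ReadToEnd
        p.WaitForExit()

        ' Return the audio to the caller. (WriteFile is an assumed step; the listing stops at the headers.)
        Response.ContentType = "audio/wav"
        Response.AddHeader("Content-Disposition", "inline; filename=test.wav")
        Response.WriteFile(FileName)
    End If
End Sub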
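Because the listener passes the caller's text straight onto the eSpeak command line, it may be worth cleaning the text and generating a collision-proof file name first. The following is only a sketch of one way to do that, not part of the original listener: the module and the helpers CleanWords and NewWavPath are hypothetical names, and the temp folder stands in for whatever writable folder your server uses.

```vb
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module TtsHelpers
    ' Hypothetical helper: drop double quotes and line breaks so the text
    ' cannot close the quoted argument early or span several lines.
    Function CleanWords(ByVal raw As String) As String
        If String.IsNullOrWhiteSpace(raw) Then Return ""
        Return raw.Replace("""", "").Replace(vbCr, " ").Replace(vbLf, " ").Trim()
    End Function

    ' Hypothetical helper: a GUID-based name, so two requests arriving in the
    ' same millisecond do not write to the same WAV file.
    Function NewWavPath(ByVal folder As String) As String
        Return Path.Combine(folder, Guid.NewGuid().ToString("N") & ".wav")
    End Function

    Sub Main()
        Dim words As String = CleanWords("Hello ""World""" & vbCrLf)
        Dim wav As String = NewWavPath(Path.GetTempPath())
        ' Print the argument string that would be handed to espeak.exe.
        Console.WriteLine("-v en-us -s 150 -a 120 -w """ & wav & """ """ & words & """")
    End Sub
End Module
```

In a listener like the one above, the cleaned text and the generated path would simply take the place of myWords and FileName when the argument string is built.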



That’s it! Pretty easy. Right?  If your listener is called tts.aspx,  you just call it with:

tts.aspx?myWords='Hello World'

…and it returns a WAV file.

It’s simple to then integrate it into Switchvox. In your IVR, add an action of type ‘Play Sound From URL’ and enter the address of your listener, followed by ?myWords='Hello World'
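Dynamic text usually contains spaces, currency symbols, and punctuation, so whatever assembles the URL should percent-encode the text first. Here is a minimal sketch of that idea, assuming a made-up host name (iis-server.example) and a hypothetical BuildTtsUrl helper. How Switchvox substitutes IVR variables into the URL is not covered here.

```vb
Imports System
Imports System.Globalization

Module TtsUrlDemo
    ' Hypothetical address; replace it with wherever your tts.aspx is hosted.
    Const ListenerUrl As String = "http://iis-server.example/tts.aspx"

    ' Build the request URL, percent-encoding the text so that spaces,
    ' "$" and "&" reach the listener intact (a space becomes %20).
    Function BuildTtsUrl(ByVal words As String) As String
        Return ListenerUrl & "?myWords=" & Uri.EscapeDataString(words)
    End Function

    Sub Main()
        ' Sample value only; in a real IVR this would come from your database.
        Dim balance As Decimal = 123.45D
        Dim message As String = "Your account balance is $" &
            balance.ToString("0.00", CultureInfo.InvariantCulture)
        Console.WriteLine(BuildTtsUrl(message))
    End Sub
End Module
```

ASP.NET decodes the query string before Request.QueryString hands it to the listener, so the listener itself needs no change to receive encoded text.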

That’s almost everything. All that’s left is to review the following:

Is this solution:

Pretty cool? YES

Free? YES

LAN based? YES

Works well with Visual Studio and IIS? YES

There Are 4 Comments

  • roderickm says:

    For Switchvox fans that don’t dabble in Visual Studio programming and IIS administration, a web service may be a good fit. is a great TTS web service from Cepstral, the same folks that made the “Allison” voice and over 50 others.

  • Jon Daley says:

    I am surprised to see some notable items missing here. Why don’t you mention flite, and the others like it? Free, non-web, standalone libraries that easily integrate into Asterisk?

    They’re not great, and I don’t use them for customer-facing parts, but I use them for employee-facing parts of the system, and it seems like they are worth mentioning on your list, since someone might find them useful, and much more useful than a 3rd party service, and way less work than compiling your own TTS engine.

    As for Roderick’s statement about Cepstral and Allison – I thought Allison was a real person, or maybe Roderick means “the same folks who recorded the Allison voice”, though I didn’t think she was related to Cepstral at all.

  • Dan Ribar says:

    Good points Jon.

    We were looking for a specific integration style (SOAP vs command line vs REST vs API vs etc etc) and just landed on eSpeak. My point was only to show that it is very easy to get a free TTS brewing. Like everything in technology, there are as many options as there are opinions, and most of them work 🙂

    After using eSpeak for a while, we ended up adding in the AT&T Natural Voices for our production system. Not best for everyone, but a great solution for us.

    Thanks again.

  • Tarun Dham says:

    Great Article Dan,
    How do I make use of Anna’s voice from Microsoft in this .Net Application?
