[SOLVED] Linux-Compatible Natural-Sounding Text-to-Speech Synthesizer

julianvb · 08-22-2015, 05:16 AM

2015-08-22

I am looking for a more natural-sounding text-to-speech synthesizer than eSpeak, which actually is very reliable and easy to use in a Linux script. Thus far I haven't been able to find such a product. I've tried several Wine-based TTS programs and found them hard to use and disappointing. I don't mind paying a reasonable sum. Any help will be greatly appreciated.

Julianvb

LinuxUser42 · 08-22-2015, 02:59 PM

You AND Ken Starks (blog of Helios, and FOSS Force writer).
You might consider sending him a note; it would be nice if more people banded together to help show the market need.

ardvark71 · 08-22-2015, 03:47 PM

Quote:

Originally Posted by julianvb

I am looking for a more natural-sounding text-to-speech synthesizer than eSpeak, which actually is very reliable and easy to use in a Linux script.

Hi...

I guess so! I tried listening to a sample of eSpeak here but I barely could understand what was being said.

Have you taken a look at Festival? You can try their online demo here. I found the voice of "Tom (English American male)" pleasant and easy to understand. I'm not sure how you can download the software from their download page but I did find Festvox, which I guess is a program that incorporates Festival's software, here.

Let us know how it goes...

Regards...

frankbell · 08-22-2015, 08:31 PM

According to Jonathan Nadeau, a blind Linux user and maintainer of Sonar Linux, an improved screenreader is sorely needed for Linux. One of his hopes is to help provide one.

He uses Orca in Sonar.

julianvb · 08-24-2015, 11:10 PM

Hi, LinuxUser42, ardvark71 and frankbell,

Thank you all for your helpful input. I've just received an e-mail from Alan W. Black, a TTS expert at Carnegie Mellon University, recommending Festival 2.4 and CMU Flite, a more portable and faster C version of Festival. Today I installed Flite via Synaptic and tried it out on two of my Linux Mint computers. I found the (U.S. English) voices quite natural and its syntax very similar to that of eSpeak. Thus there is no need for me to modify my existing Linux show-and-tell scripts.
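
As a rough illustration of why existing scripts can carry over, here is a minimal sketch of a wrapper that speaks through whichever engine is installed. The function name `say_text` is hypothetical. It assumes only that `flite -t TEXT` and `espeak TEXT` each speak a literal string, and it prefers Flite when both are present.

```shell
#!/bin/sh
# say_text: hypothetical helper, not part of Flite or eSpeak.
# Speaks its arguments with flite if available, otherwise with espeak.
say_text() {
    if command -v flite >/dev/null 2>&1; then
        flite -t "$*"
    elif command -v espeak >/dev/null 2>&1; then
        espeak "$*"
    else
        echo "say_text: neither flite nor espeak found" >&2
        return 1
    fi
}
```

A script would then call, for example, `say_text "Hello from the show-and-tell script"` instead of naming one engine directly.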

In case anyone is interested in an excellent free Chinese-language TTS, I recommend Ekho highly. I've been using it for 3 years. It encompasses Mandarin, Cantonese and other major Chinese languages. Its sound quality is more than sufficient for all my current needs. By the way, Ekho was originally designed as a TTS for the blind in China but now it benefits the entire society.

I'll definitely get in touch with Ken Starks to remind the commercial software world that there's room for it to contribute and benefit even in Linux.

Julianvb

ardvark71 · 08-24-2015, 11:42 PM

Quote:

Originally Posted by julianvb

Thank you all for your helpful input. I've just received an e-mail from Alan W. Black, a TTS expert at Carnegie Mellon University, recommending Festival 2.4 and CMU Flite, a more portable and faster C version of Festival. Today I installed Flite via Synaptic and tried it out on two of my Linux Mint computers. I found the (U.S. English) voices quite natural and its syntax very similar to that of eSpeak. Thus there is no need for me to modify my existing Linux show-and-tell scripts.

In case anyone is interested in an excellent free Chinese-language TTS, I recommend Ekho highly. I've been using it for 3 years. It encompasses Mandarin, Cantonese and other major Chinese languages. Its sound quality is more than sufficient for all my current needs. By the way, Ekho was originally designed as a TTS for the blind in China but now it benefits the entire society.

I'll definitely get in touch with Ken Starks to remind the commercial software world that there's room for it to contribute and benefit even in Linux.

You're welcome, glad you found a solution that works.

If you would, please mark this thread as "SOLVED" by clicking on "Thread Tools" directly above your initial post. Thanks!

Regards...

julianvb · 08-26-2015, 11:28 AM

Hi, ardvark71,

I thought I did mark this thread as [Solved] yesterday from the first post, namely Aug 25. Thanks.

Julianvb

ardvark71 · 08-26-2015, 02:23 PM

Quote:

Originally Posted by julianvb

I thought I did mark this thread as [Solved] yesterday from the first post, namely Aug 25. Thanks.

You did, yes. Thank you.

Regards...

newsgrabber · 11-05-2015, 07:07 PM

There's also the combination of espeak with mbrola voices, but this sounds even worse, especially when run from gespeak, because it inserts unwanted gaps between words.

(BTW: if your gespeak does not see any installed mbrola voices, symlink the espeak-data folder to another location:
ln -s /usr/lib/i386-linux-gnu/espeak-data/ /usr/share/espeak-data
The original folder might also be in /usr/lib/x86_64-linux-gnu/espeak-data; see bugs.launchpad.net for details.)
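
A minimal sketch of that workaround, assuming the two multiarch paths mentioned here are the only candidates. `first_existing_dir` is a hypothetical helper, and the `ln -s` step is left commented out because it usually needs root.

```shell
#!/bin/sh
# first_existing_dir: hypothetical helper that prints the first argument
# that is an existing directory, or returns 1 if none of them exists.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example use (commented out: it needs root and modifies /usr/share):
# src=$(first_existing_dir /usr/lib/x86_64-linux-gnu/espeak-data \
#                          /usr/lib/i386-linux-gnu/espeak-data) &&
#     [ ! -e /usr/share/espeak-data ] &&
#     sudo ln -s "$src" /usr/share/espeak-data
```

The check on `/usr/share/espeak-data` is there so an existing folder or link is not overwritten.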

julianvb · 11-06-2015, 12:35 AM

Hi, newsgrabber,

Thanks very much for your interesting input. Please let me know when you come across a reasonably priced natural-sounding TTS for Linux.

Julianvb

newsgrabber · 11-07-2015, 04:10 AM

Quote:

Originally Posted by julianvb

Hi, newsgrabber,

Thanks very much for your interesting input. Please let me know when you come across a reasonably priced natural-sounding TTS for Linux.

write to me @ poczta.onet.pl

ondoho · 11-07-2015, 06:23 AM

just for kicks i compiled flite from here and it sounds much better than the example from here.
http://iki.fi/dt/stuff/theraven.ogg
(you must also download & use the voices, but even without them it sounds better than the espeak example)

julianvb · 01-06-2017, 09:41 PM

2017-01-06

I am happy to report that I recently came across Cepstral's Swift TTS, a commercial TTS compatible with Windows, OS X and Linux. I've briefly tested its Linux David voice, which sounds quite natural, and I found the syntax very user-friendly. About a dozen voices are available, priced from $10 to $45. According to Cepstral, a licensed copy of Swift may be used on only one computer, and output files created with it may not be used on any computer that lacks its own Cepstral license. I like the firm's pre-purchase policy of allowing the public to test its TTS products for free.

I hope my information will be helpful to Linux users who are still searching for a natural-sounding commercial TTS.
Anyone interested in the Swift TTS will benefit from visiting http://www.cepstral.com and reading its very informative pages.

Julianvb

ardvark71 · 01-06-2017, 09:52 PM

Hi Julian...

Thank you for your update, I'm glad you found another product that fits your criteria. Perhaps your information will be helpful for others looking for this kind of software.

Regards...

ondoho · 01-07-2017, 07:05 AM

not sure why i had to compile it myself then, but flite is in the repos for at least archlinux, ubuntu and debian.
probably most distros.

so after installing with package management it comes with a basic male voice, and 'slt', a very soft (and more natural imo) female voice:

Code:

$ flite -h
flite: a small simple speech synthesizer
  Carnegie Mellon University, Copyright (c) 1999-2011, all rights reserved
  version: flite-2.0.0-release Dec 2014 (http://cmuflite.org)
usage: flite TEXT/FILE [WAVEFILE]
  Converts text in TEXTFILE to a waveform in WAVEFILE
  If text contains a space the it is treated as a literal
  textstring and spoken, and not as a file name
  if WAVEFILE is unspecified or "play" the result is
  played on the current systems audio device.  If WAVEFILE
  is "none" the waveform is discarded (good for benchmarking)
  Other options must appear before these options
  --version   Output flite version number
  --help      Output usage string
  -o WAVEFILE Explicitly set output filename
  -f TEXTFILE Explicitly set input filename
  -t TEXT     Explicitly set input textstring
  -p PHONES   Explicitly set input textstring and synthesize as phones
  --set F=V   Set feature (guesses type)
  -s F=V      Set feature (guesses type)
  --seti F=V  Set int feature
  --setf F=V  Set float feature
  --sets F=V  Set string feature
  -ssml       Read input text/file in ssml mode
  -b          Benchmark mode
  -l          Loop endlessly
  -voice NAME Use voice NAME (NAME can be filename or url too)
  -voicedir NAME Directory contain voice data
  -lv         List voices available
  -add_lex FILENAME add lex addenda from FILENAME
  -pw         Print words
  -ps         Print segments
  -psdur      Print segments and their durations (end-time)
  -pr RelName Print relation RelName
  -voicedump FILENAME Dump selected (cg) voice to FILENAME
  -v          Verbose mode
$ echo "Hello World!" | flite -voice slt

still i decided to download more voices:

Code:

$ cd
$ mkdir -p .config/flite && cd .config/flite
$ wget -r --no-parent --no-directories --accept flitevox http://www.festvox.org/flite/packed/flite-2.0/voices/
$ echo "Hello World!" | flite -voice ./cmu_us_axb.flitevox

that's a big download, ~550MB.
but it's also possible to use the voices without downloading them first:

Code:

echo "Hello World!" | flite -voice http://www.festvox.org/flite/packed/flite-2.0/voices/cmu_us_axb.flitevox
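
To compare the downloaded voices side by side, something like the following sketch could render one sentence with each `.flitevox` file. It assumes the voices sit in `~/.config/flite` as in the steps above. `render_samples` is a hypothetical name, and only the `-voice`, `-t` and `-o` options listed in the help text are used.

```shell
#!/bin/sh
# render_samples: hypothetical helper.
# Usage: render_samples VOICE_DIR OUT_DIR TEXT
# Writes OUT_DIR/<voice name>.wav for every *.flitevox file in VOICE_DIR.
render_samples() {
    voice_dir=$1
    out_dir=$2
    text=$3
    mkdir -p "$out_dir" || return 1
    for voice in "$voice_dir"/*.flitevox; do
        [ -e "$voice" ] || continue    # skip if the glob matched nothing
        name=$(basename "$voice" .flitevox)
        flite -voice "$voice" -t "$text" -o "$out_dir/$name.wav"
    done
}
```

For example, `render_samples ~/.config/flite /tmp/flite-samples "Hello World!"` would leave one WAV file per voice in `/tmp/flite-samples` for listening afterwards.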