I am looking for a more natural-sounding text-to-speech synthesizer than eSpeak, which actually is very reliable and easy to use in a Linux script. So far I haven't been able to find one. I've tried several Wine-based TTS programs and found them hard to use and disappointing. I don't mind paying a reasonable sum. Any help will be greatly appreciated.
You AND Ken Starks (of the Blog of Helios, and a FOSS Force writer).
You might consider sending him a note; it would be nice if more people banded together to help show the market need.
I am looking for a more natural-sounding text-to-speech synthesizer than eSpeak, which actually is very reliable and easy to use in a Linux script.
Hi...
I guess so! I tried listening to a sample of eSpeak here but I could barely understand what was being said.
Have you taken a look at Festival? You can try their online demo here. I found the voice of "Tom (English American male)" pleasant and easy to understand. I'm not sure how you can download the software from their download page but I did find Festvox, which I guess is a program that incorporates Festival's software, here.
According to Jonathan Nadeau, a blind Linux user and maintainer of Sonar Linux, an improved screenreader is sorely needed for Linux. One of his hopes is to help provide one.
Thank you all for your helpful input. I've just received an e-mail from Alan W. Black, a TTS expert at Carnegie Mellon University, recommending Festival 2.4 and CMU Flite, a more portable and faster C version of Festival. Today I installed Flite via Synaptic and tried it out on two of my Linux Mint computers. I found the (U.S. English) voices quite natural and its syntax very similar to that of eSpeak. Thus there is no need for me to modify my existing Linux show-and-tell scripts.
In case anyone is interested in an excellent free Chinese-language TTS, I recommend Ekho highly. I've been using it for 3 years. It encompasses Mandarin, Cantonese and other major Chinese languages. Its sound quality is more than sufficient for all my current needs. By the way, Ekho was originally designed as a TTS for the blind in China but now it benefits the entire society.
I'll definitely get in touch with Ken Starks to remind the commercial software world that there's room for it to contribute and benefit even in Linux.
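For anyone whose scripts should run on machines with either engine, here is a minimal sketch of a wrapper that prefers Flite and falls back to eSpeak. The function names pick_tts and speak are made up for illustration. It assumes flite accepts its text through -t (as listed in its help output) and that espeak speaks the text given as its argument.

```shell
#!/bin/sh
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is installed.
pick_tts() {
    for engine in "$@"; do
        if command -v "$engine" >/dev/null 2>&1; then
            printf '%s\n' "$engine"
            return 0
        fi
    done
    return 1
}

# Speak a string with whichever engine is available.
# Hypothetical helper: prefers flite, falls back to espeak.
speak() {
    engine=$(pick_tts flite espeak) || {
        echo "no TTS engine found (tried flite, espeak)" >&2
        return 1
    }
    case "$engine" in
        flite)  flite -t "$1" ;;   # -t TEXT: explicitly set the input string
        espeak) espeak "$1" ;;     # espeak speaks its text argument
    esac
}
```

After sourcing the file, something like speak "Hello World" in a script should then work with either program installed.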
You're welcome, glad you found a solution that works.
If you would, please mark this thread as "SOLVED" by clicking on "Thread Tools" directly above your initial post. Thanks!
There's also the combination of espeak + mbrola voices, but this sounds even worse, especially when run from gespeak, because it leaves unwanted gaps between words.
(BTW: if your gespeak does not see any installed mbrola voices, symlink the espeak-data folder to another location:
ln -s /usr/lib/i386-linux-gnu/espeak-data/ /usr/share/espeak-data
The original folder might also be in /usr/lib/x86_64-linux-gnu/espeak-data; see bugs.launchpad.net for details.)
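Since the espeak-data folder can live in either of the two locations mentioned, the symlink step could be sketched as below. The function name link_first_existing is made up for illustration. It only assumes that one of the candidate folders exists and that nothing is at the target path yet.

```shell
#!/bin/sh
# Symlink the first existing candidate directory to TARGET.
# Usage: link_first_existing TARGET CANDIDATE...
link_first_existing() {
    target=$1
    shift
    # Do not touch anything that is already there (file, folder or link).
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "$target already exists, leaving it alone" >&2
        return 0
    fi
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            ln -s "$dir" "$target"
            return
        fi
    done
    echo "none of the candidate directories exist" >&2
    return 1
}
```

Run as root, it would be called as: link_first_existing /usr/share/espeak-data /usr/lib/x86_64-linux-gnu/espeak-data /usr/lib/i386-linux-gnu/espeak-data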
just for kicks i compiled flite from here and it sounds much better than the example from here. http://iki.fi/dt/stuff/theraven.ogg
(you must also download & use the voices, but even without them it sounds better than the espeak example)
I am happy to report that I recently came across Cepstral's Swift TTS, a commercial TTS compatible with Windows, OS X and Linux. I briefly tested its Linux David voice, which sounds quite natural, and I found the syntax very user-friendly. About a dozen voices are available, priced from $10 to $45. According to Cepstral, a licensed copy of swift may be used on only one computer, and the output files a user creates with it may not be used on any computer that lacks its own Cepstral license. I like the firm's policy of letting the public test its TTS products for free before purchase.
I hope my information will be helpful to Linux users who are still searching for a natural-sounding commercial TTS.
Anyone interested in the Swift TTS will benefit from visiting http://www.cepstral.com and reading its very informative pages.
Thank you for your update, I'm glad you found another product that fits your criteria. Perhaps your information will be helpful for others looking for this kind of software.
not sure why i had to compile it myself then, but flite is in the repos for at least archlinux, ubuntu and debian.
probably most distros.
so after installing with package management it comes with a basic male voice, and 'slt', a very soft (and more natural imo) female voice:
Code:
$ flite -h
flite: a small simple speech synthesizer
Carnegie Mellon University, Copyright (c) 1999-2011, all rights reserved
version: flite-2.0.0-release Dec 2014 (http://cmuflite.org)
usage: flite TEXT/FILE [WAVEFILE]
Converts text in TEXTFILE to a waveform in WAVEFILE
If text contains a space the it is treated as a literal
textstring and spoken, and not as a file name
if WAVEFILE is unspecified or "play" the result is
played on the current systems audio device. If WAVEFILE
is "none" the waveform is discarded (good for benchmarking)
Other options must appear before these options
--version Output flite version number
--help Output usage string
-o WAVEFILE Explicitly set output filename
-f TEXTFILE Explicitly set input filename
-t TEXT Explicitly set input textstring
-p PHONES Explicitly set input textstring and synthesize as phones
--set F=V Set feature (guesses type)
-s F=V Set feature (guesses type)
--seti F=V Set int feature
--setf F=V Set float feature
--sets F=V Set string feature
-ssml Read input text/file in ssml mode
-b Benchmark mode
-l Loop endlessly
-voice NAME Use voice NAME (NAME can be filename or url too)
-voicedir NAME Directory contain voice data
-lv List voices available
-add_lex FILENAME add lex addenda from FILENAME
-pw Print words
-ps Print segments
-psdur Print segments and their durations (end-time)
-pr RelName Print relation RelName
-voicedump FILENAME Dump selected (cg) voice to FILENAME
-v Verbose mode
$ echo "Hello World!" | flite -voice slt
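Going by the options in the help output above (-voice, -f and -o), a batch conversion of text files to WAV files might look roughly like this. It is only a sketch: the function name texts_to_wav is made up, and it assumes the 'slt' voice is available under that name as in the packaged install described above.

```shell
#!/bin/sh
# Convert every .txt file in a directory to a .wav file next to it.
# Usage: texts_to_wav DIR [VOICE]   (VOICE defaults to slt)
texts_to_wav() {
    dir=$1
    voice=${2:-slt}
    for txt in "$dir"/*.txt; do
        # With no matches the glob stays literal, so skip it.
        [ -e "$txt" ] || continue
        wav="${txt%.txt}.wav"
        # -f TEXTFILE sets the input file, -o WAVEFILE the output file.
        flite -voice "$voice" -f "$txt" -o "$wav" || return 1
    done
}
```

For example, texts_to_wav ~/speeches would write one .wav per .txt file in that folder (the folder name here is just a placeholder).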