Trying To Synchronize Sound Cards on Different Computers
Hi Everybody,
I've written a C program which transmits audio to a number of computers over a TCP LAN connection. I'm using ALSA, the preemptive kernel, and pthreads. After running for 30 minutes or so, the slight variation in sampling rates (roughly ±0.01%) among the computers accumulates and manifests as a noticeable differential delay in the sound from the speakers. I know how to detect the variation and would like to dynamically compensate for it by individually varying the sampling rate (ever so slightly) of each playback device to oppose the variation.
Does anybody out there in Linux Land know how to dynamically vary the playback sample rate? I've tried using snd_pcm_hw_params_set_rate() followed by snd_pcm_hw_params(), to no avail. They don't seem to work while playback is running.
Like Redhatstand said, it's better to just pad the data.
However, there is no need to detect a zero crossing; you can just duplicate (or interpolate) any sample. Duplicating is the safest option.
If the sample rate difference is that small, one in a few thousand or less, the error caused by the added sample is insignificant; you won't be able to hear it, especially if the samples are added not at fixed intervals but with a bit of random fluctuation in the time domain. To simplify, periodic noise is much easier to perceive than random noise. Random dithering algorithms may help.
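A minimal sketch of the padding idea, assuming 16-bit interleaved stereo. The names duplicate_frame() and jittered_interval() are made up for illustration, and the 25% jitter range is an arbitrary choice, not a tuned value.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CHANNELS 2  /* assumption: 16-bit interleaved stereo */

/* Copy `frames` frames from `in` to `out`, playing the frame at index
   `pos` twice. `out` must have room for frames + 1 frames.
   Returns the new frame count. */
size_t duplicate_frame(const int16_t *in, size_t frames, size_t pos,
                       int16_t *out)
{
    assert(in != NULL && out != NULL);
    if (frames == 0)
        return 0;
    if (pos >= frames)
        pos = frames - 1;
    /* frames 0..pos unchanged */
    memcpy(out, in, (pos + 1) * CHANNELS * sizeof(int16_t));
    /* frame pos again, followed by the rest of the block */
    memcpy(out + (pos + 1) * CHANNELS, in + pos * CHANNELS,
           (frames - pos) * CHANNELS * sizeof(int16_t));
    return frames + 1;
}

/* Number of frames to wait before the next correction: the nominal
   interval plus or minus up to 25%, so corrections are not periodic. */
size_t jittered_interval(size_t nominal)
{
    size_t spread = nominal / 4;
    if (spread == 0)
        return nominal;
    return nominal - spread + (size_t)rand() % (2 * spread + 1);
}
```

With a nominal interval of 10,000 frames, the corrections would land anywhere between 7,500 and 12,500 frames apart. Dropping a frame instead of duplicating one is the mirror image: copy everything except the frame at `pos`.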
If you need absolute audio quality, use a high-quality resampling library; see for example https://ccrma.stanford.edu/~jos/resample/. This is much more difficult to get right, since even small changes in the pitch are easily perceived. Also, the precision synchronization required in this case is difficult in a networked environment -- see NTP for example.
Nominal Animal
Last edited by Nominal Animal; 03-21-2011 at 01:34 AM.
multiple soundcards:
--------------------
multiple cards on the same machine can be used if they are synced
together. This can be achieved by either using a central word-clock
or by feeding a digital audio feed from one card into another and
putting it into slave mode.
In order for jack to access them, they must be abstracted to a single
'device' using an alsa config file.
If you don't mind some quality loss due to resampling, you have other
options:
Not bad Andy and please do keep your coat. Empirically I've determined that audio frames can be inserted (duplicated) or removed every pcm block without noticeable loss, at least to my tin ear, of fidelity. The block size is 360 frames. I'm pretty sure your idea would work nicely.
Kinda had my heart set on using sampling rate as my actuator though. I'm doing this work for myself, not the man. No deadline to meet. Only goal is elegance.
A coder after my own heart - elegance is the ONLY criterion IMHO.
That said, the end user becomes the man anyway (boo)
I think I can help you to switch from a 'sample rate' focus to 'sample manipulation' focus by scrapping use of the word 'cheat' and replacing it with 'finer control of audio quality'...
If you are talking about a fixed block size I guess you are sending to a compression library, FLAC or MP3? 360 bytes rings no bells, so you may be doing something totally new (me too :-)
Nominal Animal:
Your response assisted me in a section of my coding too, thanks muchly :-)
It is really remarkable how easily the ear is fooled by tricky coder tactics: but the point about avoiding regularity in sample interventions is utterly vital.
If quality is _not_ of the utmost but you have no real silences, you could set a threshold (say 40 dB below the average) and introduce a:
fade out > sample pad/delete > fade in
lasting a random (but short) period of time.
It should work for most listeners, but there'll always be someone whose aural palette is attuned enough to notice.
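A rough sketch of the fade step only, assuming mono 16-bit samples and a linear ramp; linear_fade() is a hypothetical helper, and the threshold detection and the random choice of duration are left out. The idea would be to fade out the few frames before the splice point, pad or delete a sample, then fade in the frames after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply a linear ramp in place over n samples.
   fade_in != 0: gain rises 1/(n+1), 2/(n+1), ... n/(n+1)
   fade_in == 0: gain falls n/(n+1), ... 2/(n+1), 1/(n+1)
   Integer math keeps the result exact; n is assumed to be small
   (a few hundred samples) so the 32-bit product cannot overflow. */
void linear_fade(int16_t *s, size_t n, int fade_in)
{
    assert(s != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t num = (int32_t)(fade_in ? i + 1 : n - i);
        int32_t den = (int32_t)(n + 1);
        s[i] = (int16_t)((int32_t)s[i] * num / den);
    }
}
```

A raised-cosine ramp would probably sound smoother than a linear one, but the structure would be the same.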
Wow! So much feedback in such a short span of time. Many thanks everybody. You all are right. The sample rates are so close (within 0.01%) that adding/subtracting a frame every 10,000 frames will be inaudible. At a 44100 Hz rate it amounts to, at most, about 4.4 corrected frames a second. But it'd sure be neat to close a PLL around the playback rate using the capture rate as the reference. Just to watch that old sample rate lock in to 44103 or 44099, or interpolate between two integral rates as required. Cool.
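The arithmetic behind those numbers, as a small sketch. The function names are made up for illustration, and the relative error is taken as a fraction (0.01% = 0.0001).

```c
#include <assert.h>

/* Frames gained or lost per second for a given relative clock error. */
double slip_frames_per_second(double rate_hz, double rel_error)
{
    assert(rate_hz > 0.0 && rel_error >= 0.0);
    return rate_hz * rel_error;
}

/* Accumulated time offset in milliseconds after elapsed_s seconds. */
double drift_ms(double rel_error, double elapsed_s)
{
    assert(rel_error >= 0.0 && elapsed_s >= 0.0);
    return rel_error * elapsed_s * 1000.0;
}
```

For 44100 Hz and a 0.01% error this gives about 4.41 frames per second, and about 180 ms of offset after 30 minutes (1800 s) relative to the source. Two playback machines whose clocks err in opposite directions would drift apart at up to twice that rate, which fits the audible delay reported after half an hour.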
Maybe I rejected JACK too quickly. Isn't it just a layer on top of ALSA? Another decision I made early on was to remove Pulseaudio. It's just a layer between the hardware and ALSA. Right? I wonder if I should be using the asound library directly and forget ALSA? Comments please.
Nice tip on random dithering and thanks for the link to the boys in Palo Alto. A little advanced for me right now but I'll keep in mind for the future. Right now I'm focusing on wireless transport. I'm thinking about NTP too to synchronize everything but that's a last resort.
360 audio frames work out to 1440 bytes, which fits perfectly into an Ethernet frame with a few bytes to spare for a message header. I'm not using any compression yet. There's a 2 second de-jitter buffer (FIFO) between the receive socket thread and the ALSA snd_pcm_writei() thread. The FIFO is where frames can be replicated or removed (to/from the tail pointer) to adjust for the slight difference between capture and playback rates.
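One way such a FIFO could look, as a single-threaded sketch. Everything here is an assumption for illustration: the fifo_t layout, the function names, treating the tail as the playback (read) side, and packing one 16-bit stereo frame into a uint32_t. A real version shared between the socket thread and the playback thread would need a mutex or similar around these calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_FRAMES 88200u  /* 2 s of audio at 44100 Hz */

typedef struct {
    uint32_t buf[FIFO_FRAMES]; /* one 4-byte stereo frame per slot */
    size_t head;               /* next slot to write (receive side) */
    size_t tail;               /* next slot to read (playback side) */
    size_t count;              /* frames currently queued */
} fifo_t;

void fifo_init(fifo_t *f)
{
    assert(f != NULL);
    f->head = f->tail = f->count = 0;
}

/* Append one frame; returns -1 if the buffer is full. */
int fifo_push(fifo_t *f, uint32_t frame)
{
    if (f->count == FIFO_FRAMES)
        return -1;
    f->buf[f->head] = frame;
    f->head = (f->head + 1) % FIFO_FRAMES;
    f->count++;
    return 0;
}

/* Normal read: fetch the tail frame and advance. */
int fifo_pop(fifo_t *f, uint32_t *frame)
{
    if (f->count == 0)
        return -1;
    *frame = f->buf[f->tail];
    f->tail = (f->tail + 1) % FIFO_FRAMES;
    f->count--;
    return 0;
}

/* Replicate: fetch the tail frame without advancing, so the next
   pop returns it again (playback gains one frame). */
int fifo_repeat_one(fifo_t *f, uint32_t *frame)
{
    if (f->count == 0)
        return -1;
    *frame = f->buf[f->tail];
    return 0;
}

/* Remove: advance the tail past one frame without playing it
   (playback loses one frame). */
int fifo_drop_one(fifo_t *f)
{
    if (f->count == 0)
        return -1;
    f->tail = (f->tail + 1) % FIFO_FRAMES;
    f->count--;
    return 0;
}
```

The playback thread would call fifo_repeat_one() when the local card is consuming frames faster than they arrive and fifo_drop_one() when it is consuming them more slowly, using the FIFO fill level as the error signal.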
... Maybe I rejected JACK too quickly. Isn't it just a layer on top of ALSA? ...
Not at all. In fact, the JACK site also talks about porting JACK to MacOS, which has no ALSA. If I understand correctly, JACK works on UNIX-like systems; I don't remember about Windows.
...
When I dealt with JACK several years ago, it ended up being the most natural tool for many things.