Need computationally fast approximation to standard deviation.

jiml8 · 11-21-2008, 11:51 AM

Brief description of the problem:

Due to power and lifecycle constraints, satellite downlinks generally occur at the minimum acceptable power level. As a consequence of this, downlink signals appear at levels only slightly above the background noise level (typically less than 10dB above the noise).

I have a situation where I am monitoring a satellite transponder's spectrum and reporting on the appearance and disappearance of carriers. The bandwidth of the carrier can be anything; I have to handle all cases.

I am particularly interested in the case where a broadband carrier is present, and a narrowband carrier appears in part of the spectrum occupied by the broadband carrier. This would represent a collision and needs to be reported as fast as possible, so I must detect the appearance of the narrowband carrier.

Now, the broadband carrier commonly does not have a uniform spectral density (the exact spectral distribution depends on the type of modulation and the type of data it carries), so its amplitude is not constant across the envelope. I have observed variances of as much as 8 dB from peak to valley in one carrier that I have been observing.

The appearance of a narrowband carrier within the broadband envelope should be represented as a +3dB shift in the power amplitude within that band. Thus, should a narrowband carrier appear in what is a valley in the broadband carrier, the +3dB shift might not bring the resultant power up to the level of higher power density components of the broadband carrier, and I therefore will have to do some work to detect it.

The system:

Briefly, an L-band tuner feeds I and Q signals to a digital signal processor which digitizes the data then performs an FFT. The output of the FFT is fed across the PCI bus to the host computer, which is running Linux. I do not have control of the algorithms on this DSP, though I have made a proposal to my client which would have him using a product of ours (mine and an associate) where we COULD move more processing into the DSP. But for now I am stuck.

I take this raw FFT data and find the RMS power spectrum by computing it directly using floating point math (I do wish I could do this on the DSP). This is computationally expensive. I then perform some filtering on the resulting spectrum in order to smooth out the stochastic random noise that almost buries the signal otherwise.

I then do a variety of other processing things to obtain the information I need, then report it to the next module up the line (which I have not yet written).

At the present time, I am getting one complete FFT every 5 ms, but my computation overhead means that I am spending 45 ms on each pass through my processing loop. I would like to speed this up, and a faster processor isn't an option at this point because this is a drop-in second generation code for a deployed system.

To solve this collision issue and reliably identify it when it happens, I have decided that I have to do something like compute the standard deviation of the power level in a given frequency bin in my spectrum over time, after I have filtered it. This way, I can determine the normal variability within a single cell when a signal is present, and I can therefore perform some statistics to determine if the change in power I now see is statistically significant - hence due to another emitter coming up.

The problem is that a standard deviation is computationally VERY expensive. I can ease the pain by storing intermediate results in RAM (even though this will cost a LOT of RAM), but I still have to do squares and square roots using floating point arithmetic.

I am looking for a better way. I can live with an approximation of the standard deviation if the approximation saves me a lot of compute time. Presently, I am wading through textbooks and scholarly papers, but all of them contain lots of information that I have to study, when all I really want is an algorithm.

Does anyone here have something that would help me?

pixellany · 11-21-2008, 12:51 PM

Wow!! Quite a screenful.

I have essentially no experience with finding optimum algorithms for computer programming. All I can say is that the formula for standard deviation is pretty simple. It's not obvious why it would represent a huge processing load.

If speed is an issue, my instinct is that you would want to keep a running tally of Summation(X^2) and then take the root when you are done.

Sorry--I don't have a "sigma" key.

H_TeXMeX_H · 11-21-2008, 01:41 PM

There may be some hints here:
http://en.wikipedia.org/wiki/Algorit...ating_variance
note: variance = stdev * stdev

Also, if you know assembly, that would help.

jiml8 · 11-21-2008, 02:02 PM

Quote:

Originally Posted by pixellany

Wow!! Quite a screenful.

I have essentially no experience with finding optimum algorithms for computer programming. All I can say is that the formula for standard deviation is pretty simple. It's not obvious why it would represent a huge processing load.

If speed is an issue, my instinct is that you would want to keep a running tally of Summation(X^2) and then take the root when you are done.

Sorry--I don't have a "sigma" key.

Yeah, I have been thinking about exactly that. In fact, when I compute the RMS power from the I and Q components, I wind up with those squares - and I take the square root to return the power.

But I then turn around and do a lot of filtering before I do my detection work. I have been looking at the implications of filtering power squared rather than power and I have to do some math to see what it all means.

It does appear to make sense to not take the square root, filter the hell out of it, then square it again, sum it, and square root it again...if I can make it work and show how it is OK. My client is good at this; I'll have to be able to justify it.

I am just wondering if there is a quick algorithm - something that gets me away from those square root computations.

jiml8 · 11-21-2008, 02:31 PM

Quote:

Originally Posted by H_TeXMeX_H

There may be some hints here:
http://en.wikipedia.org/wiki/Algorit...ating_variance
note: variance = stdev * stdev

Also, if you know assembly, that would help.

That page does have some interesting ideas...

And I do know assembly, but will avoid it in this application if at all possible. We do have to upgrade these systems in the future, and assembly is very non-portable.

H_TeXMeX_H · 11-21-2008, 04:06 PM

You don't have to use assembly, but it might be faster if you know what you're doing. But if it must be portable then don't use it.

jlinkels · 11-21-2008, 04:45 PM

Splendid idea to use the statistical value of the amplitude and see if it changes over time.

If your problem is squaring and square-rooting, why not use look-up tables for these functions. If you need more accuracy, you could even linear-interpolate between functions. Much more accuracy you would probably not need.

jlinkels

salasi · 11-21-2008, 05:54 PM

some random-ish thoughts...

Quote:

Originally Posted by jiml8

I am particularly interested in the case where a broadband carrier is present, and a narrowband carrier appears in part of the spectrum occupied by the broadband carrier.

I don't understand what you mean by a broadband carrier; a carrier is inherently a narrowband signal*, except for spread spectrum, for which you have either a simultaneous array of narrow band signals or a succesion of narrow band signals, depending.

(* This comment is incorrect, if you include flaws in the transmission equipment; probably 'should be' should replace 'is')

Quote:

Now, the broadband carrier commonly does not have a uniform spectral density (the exact spectral distribution depends on the type of modulation and the type of data it carries), so its amplitude is not constant across the envelope. I have observed variances of as much as 8 dB from peak to valley in one carrier that I have been observing.

OK, you seem to be using the word 'carrier' in a non-standard way to mean the whole signal, and not just the non-encoding part of the signal; is that the cause of my confusion? Could you also say what the modulation scheme is, please?

Quote:

The appearance of a narrowband carrier within the broadband envelope should be represented as a +3dB shift in the power amplitude within that band. Thus, should a narrowband carrier appear in what is a valley in the broadband carrier, the +3dB shift might not bring the resultant power up to the level of higher power density components of the broadband carrier, and I therefore will have to do some work to detect it.

Could you take moving averages (integer numbers, for preference) for each bin, and when the ratios (to previous values? to the rest of the values?) of certain bins suddenly pop up, would that be sufficiently reliable, or would it be too noise prone?

Quote:

Briefly, an L-band tuner feeds I and Q signals to a digital signal processor which digitizes the data then performs an FFT. The output of the FFT is fed across the PCI bus to the host computer, which is running Linux. I do not have control of the algorithms on this DSP, though I have made a proposal to my client which would have him using a product of ours (mine and an associate) where we COULD move more processing into the DSP. But for now I am stuck.

Not immediately relevant, but does the FFT include any windowing (eg, hamming)?

Quote:

At the present time, I am getting one complete FFT every 5 ms, but my computation overhead means that I am spending 45 ms on each pass through my processing loop. I would like to speed this up, and a faster processor isn't an option at this point because this is a drop-in second generation code for a deployed system.

The obvious suggestion to speed up the floating point math part on a PC, is not to do the floating point math part on a PC!

It will be faster to do the math in some fixed point format; floating point is a whole lot more forgiving when it comes to dynamic range issues and avoiding dynamic range issues with fixed point takes real care.

This may not be a big issue, if you are dealing with a ratio; if you scale the numbers so that the main signal is somewhere in the middle of the dynamic range (approximately), if you do get something that overloads, that is a bad sign...but is that bad sign necessarily the bad sign that you are looking for, or could it be some other effect (eg, broadband interference)?

Quote:

The problem is that a standard deviation is computationally VERY expensive.

Naturally. But, remember, it has been your choice to attack this problems via standard deviations. You could consider a simple approximation to the square roots, but I'd guess that this won't help as much these days as it once did (fp being a lot faster than it once was). But you'd only really know definitively the answer to that, if you tried it.

Quote:

I can ease the pain by storing intermediate results in RAM (even though this will cost a LOT of RAM),

I would have expected that this was the only workable possibility (storing anything to disk at every sample is going to be a disaster), and it needs that amount of ram that it needs... Of course, variables that are integers would take up less space than floating point numbers.

ta0kira · 11-21-2008, 07:19 PM

If you actually use σ then you must have some sort of finite interval, i.e. you can't expand your interval to infinity. That implies that older data should somehow be less relevant; therefore, maybe you can keep two running pseudo-statistics based on an exponential decay function. The first thing that comes to mind is something I use to detect departures from consistent activity (though not for signals or hardware,) which essentially comes down to this:

Code:

     -Ct
I = e

I ≡ importance
C ≡ a constant
t ≡ elapsed time

Using something like that (probably with the Δt since the last update,) you can find something comparable to a time-weighted average with:

Code:

<xnew> = <xold>·I + xnew·(1 - I)

As far as a standard deviation, you might perform that in the same manner with something like this:

Code:

σnew^2 = σold^2·I + (x - <xold>)^2·(1 - I)

ta0kira

jiml8 · 11-23-2008, 05:41 PM

Quote:

Originally Posted by jlinkels

Splendid idea to use the statistical value of the amplitude and see if it changes over time.

If your problem is squaring and square-rooting, why not use look-up tables for these functions. If you need more accuracy, you could even linear-interpolate between functions. Much more accuracy you would probably not need.

jlinkels

I had given no thought at all to lookup tables. Idea never occurred to me.

How delightfully primitive!

However, I could generate it at initialization time and if I wanted square roots of -30.0 dB squared to -90 dB squared, at 0.1 dB intervals, that would only take about 300K of 32 bit numbers. It would save a lot of processing time...

I am going to play with this idea a bit...

jiml8 · 11-23-2008, 06:48 PM

A lookup table turns out to be very simple to generate. The attached program is a standalone that works.

I will integrate the two subroutines into my code and do some timing experiments a bit later on. This looks like it will be pretty fast though. Actually, I can't see any approach being faster, and the accuracy appears to be more than I really need.

To play with this, compile it using the -lm option to gcc then specify a value between 900 and 8100 and it will give the square root. Compare with a calculator and see that this is pretty good.

Code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <stdarg.h>
#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>

float *create_sq_rt(void);
float get_sqrt(float *,double);
int main(int argc, char *argv[])
{
	float *sqrttable,ans;
	double useval;
	if(argv[1] != 0x0)
	{
		useval = atof(argv[1]);
		if(useval >= 900 && useval <= 8100)
		{
			sqrttable = create_sq_rt();
			ans = get_sqrt(sqrttable,useval);
			printf("answer is %f\n",ans);
			if(sqrttable)free(sqrttable);
		}
	}
}
float *create_sq_rt()
{
    float *sqrttable;
    double ii;

    //range is -30^2 to -90^2 by tenths.
    //since 30^2 is 900 and we want integer indexes, we multiply by 10
    // thus we start with index 9000, then 900.1 is 9001, etc.  To start
    // array index at 0 we subtract 9000
    sqrttable = malloc(72000*sizeof(float));
    for(ii=900.0;ii<8100.0;ii+=.1)
    {
	sqrttable[(int)((ii*10)-9000)] = (float)sqrt(ii);
    }
    return sqrttable;
}
float get_sqrt(float *sqrttable,double square)
{
    int sqint;
    sqint = (int)((square * 10)-9000);
    return sqrttable[sqint];
}

Quakeboy02 · 11-23-2008, 07:01 PM

Jim, would it be faster to convert everything up front to integers to allow you to use solely integer math after the conversion process?

jiml8 · 11-23-2008, 07:04 PM

Quote:

Originally Posted by salasi

some random-ish thoughts...

I don't understand what you mean by a broadband carrier; a carrier is inherently a narrowband signal*, except for spread spectrum, for which you have either a simultaneous array of narrow band signals or a succesion of narrow band signals, depending.

(* This comment is incorrect, if you include flaws in the transmission equipment; probably 'should be' should replace 'is')

OK, you seem to be using the word 'carrier' in a non-standard way to mean the whole signal, and not just the non-encoding part of the signal; is that the cause of my confusion? Could you also say what the modulation scheme is, please?

Yes, I am following my client's somewhat sloppy use of the language here.

Quote:

Could you take moving averages (integer numbers, for preference) for each bin, and when the ratios (to previous values? to the rest of the values?) of certain bins suddenly pop up, would that be sufficiently reliable, or would it be too noise prone?

I don't see any way ratios help here, when I have to look at absolute power.

Quote:

Not immediately relevant, but does the FFT include any windowing (eg, hamming)?

Of course.

Quote:

The obvious suggestion to speed up the floating point math part on a PC, is not to do the floating point math part on a PC!

It will be faster to do the math in some fixed point format; floating point is a whole lot more forgiving when it comes to dynamic range issues and avoiding dynamic range issues with fixed point takes real care.

I cannot avoid floating point. I have taken a hard look at that issue.

Quote:

I would have expected that this was the only workable possibility (storing anything to disk at every sample is going to be a disaster), and it needs that amount of ram that it needs... Of course, variables that are integers would take up less space than floating point numbers.

This DSP board brings with it a hard memory limit of 896 Megs. This could become a serious problem, and I have to keep it in mind at all points. I am working on relieving that problem, but so far the vendor has been distinctly unhelpful. This is lending some strength to my attempt to get the client to go with our (my associate and me) solution, which would give me a lot more flexibility AND a bigger piece of the pie

.

jiml8 · 11-23-2008, 07:11 PM

Quote:

Originally Posted by ta0kira

If you actually use σ then you must have some sort of finite interval, i.e. you can't expand your interval to infinity. That implies that older data should somehow be less relevant; therefore, maybe you can keep two running pseudo-statistics based on an exponential decay function. The first thing that comes to mind is something I use to detect departures from consistent activity (though not for signals or hardware,) which essentially comes down to this:

Code:

     -Ct
I = e

I ≡ importance
C ≡ a constant
t ≡ elapsed time

Using something like that (probably with the Δt since the last update,) you can find something comparable to a time-weighted average with:

Code:

<xnew> = <xold>·I + xnew·(1 - I)

As far as a standard deviation, you might perform that in the same manner with something like this:

Code:

σnew^2 = σold^2·I + (x - <xold>)^2·(1 - I)

ta0kira

I am not interested in the intelligence in the signal, just the appearance and disappearance of the envelope. Thus, though I am indeed doing time-weighted averaging, I am not permitting a decay function as I would if I were interested in the signal content; a sample 1 second ago is as relevant as the most recent sample, In fact, handling the statistics when the signal routinely displays major changes in spectral distribution is giving me a bit of heartburn, particularly where I have to identify this collision case. Such a change will show as a larger standard deviation in that signal bin, which decreases the detectability of the new signal.

Right now, I am just trying to determine whether I can reliably detect the appearance of a new signal in an occupied band (which is a problem), and I am trying to determine what kind of false alarm rate will be acceptable, and how I can filter it out upstream.

Also, I am trying to come up with a very fast way to do this...

jiml8 · 11-23-2008, 07:19 PM

Quote:

Originally Posted by Quakeboy02

Jim, would it be faster to convert everything up front to integers to allow you to use solely integer math after the conversion process?

I keep looking at that, and I would love to do it.

But at this point it is not at all obvious to me that I can do it so I am continuing with floating point.

First you make it work, then you optimize it - and it happens that what I am working on here is groundbreaking; no one else is doing what we are doing. The approach was recently patented by my client and I see no showstoppers in it though I do see a number of issues - such as the one being discussed here.

So for now I'll minimize extraneous issues (such as doing integer math where it is naturally floating point) and I'll make it work. Then I'll see if I can speed it up that way.