Bandwidth And Noise

Back to DSP v2.1 page.

The DSP Speech Processor was designed to support a 4 kHz speech bandwidth, to suit the transceiver it was developed with (although it also supports narrower bandwidths). Wider speech bandwidths have been found to improve both speech quality and intelligibility in some noisy conditions (particularly QRN). This goes against the old thinking that a narrow bandwidth is always more readable through noise.

Narrow transmit bandwidths concentrate power in a narrower range, causing the signal to stand out above the noise more. But wider speech bandwidths preserve more speech information, which can enhance intelligibility. This is true even of bass frequencies (< 600 Hz), which are clearly audible through noise and give the listener clues to speech timing and pronunciation.

For this to work, good EQ is needed to keep the power spectrum reasonably flat. Without good EQ, the strong bass frequencies will steal most of the available transmitter power, producing a high average power output with very poor readability. Dynamic pre-emphasis, multiband compression and multiband clipping all help achieve a flatter power spectrum. This can result in a signal that sounds like it has too much high frequency energy, but receiver de-emphasis or a high-cut filter corrects this easily, and improves the perceived S/N ratio.

Widening the bandwidth from 300-3000 Hz to 100-3000 Hz adds 200 Hz, an increase of less than 8%. If the power spectrum is kept flat, this reduces the power of the frequencies from 300 to 3000 Hz by only 0.7 dB (worst case). On a receiver with de-emphasis (and adequate bass response), the signal can appear to get louder and more natural-sounding.
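The arithmetic above can be checked with a short sketch. It assumes an idealized model in which a fixed total power is spread perfectly evenly over the passband; the function name and band tuples are illustrative only. Under that assumption the loss in the original 300 to 3000 Hz range works out to roughly 0.3 dB, comfortably inside the 0.7 dB worst-case figure quoted above.

```python
import math

def flat_spectrum_power_change_db(old_band, new_band):
    """Change (dB) in the power landing inside the old band when the same
    total power is spread evenly over the new, wider band.
    Assumes a perfectly flat power spectrum (an idealization)."""
    old_width = old_band[1] - old_band[0]
    new_width = new_band[1] - new_band[0]
    return 10 * math.log10(old_width / new_width)

old_band = (300.0, 3000.0)   # Hz
new_band = (100.0, 3000.0)   # Hz

old_width = old_band[1] - old_band[0]
new_width = new_band[1] - new_band[0]
widening_percent = 100 * (new_width / old_width - 1)
loss_db = flat_spectrum_power_change_db(old_band, new_band)

print(f"Bandwidth increase: {widening_percent:.1f}%")            # 7.4%
print(f"Power change in 300-3000 Hz: {loss_db:.2f} dB")          # -0.31 dB
```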

SSB and AM systems both have a flat noise spectrum. This is because the transmission bandwidth is a small fraction of the radio frequency used, and the modes themselves respond equally to noise across the passband. SSB systems perform a linear translation of frequencies from audio to RF and from RF back to audio, so as the receiver IF filter bandwidth is increased, the new noise introduced falls on new frequencies. The total noise power in the passband increases, but there is more passband - so the signals stay above the noise floor by the same amount. This is quite easy to observe using an SDR receiver spectrum display. The IF bandwidth of an SSB receiver can therefore be increased without degrading the readability of signals! The total noise may sound louder, but the signals will be just as audible as they were before, as they are the same height above the noise floor. Increasing receiver IF bandwidth also reduces ringing on noise impulses, which can reduce their effect.
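As a rough numerical illustration of the point above, the sketch below assumes a flat noise density and a spectrum display with a fixed resolution bandwidth. The noise density, signal peak power and display bin width are all hypothetical values picked for the example, not measurements. Widening the IF filter raises the total noise power, but the height of a spectral peak above the displayed noise floor does not change.

```python
import math

def noise_power_dbm(density_dbm_per_hz, bandwidth_hz):
    """Noise power in a given bandwidth, assuming a flat noise density."""
    return density_dbm_per_hz + 10 * math.log10(bandwidth_hz)

# All three values are hypothetical, chosen only for illustration.
NOISE_DENSITY_DBM_HZ = -140.0   # flat noise density at the receiver
PEAK_POWER_DBM = -100.0         # one narrow spectral peak of the signal
DISPLAY_BIN_HZ = 10.0           # resolution bandwidth of the spectrum display

def compare(if_bandwidths_hz):
    """Return (IF bandwidth, total noise in dBm, peak height above floor in dB)."""
    rows = []
    for bw in if_bandwidths_hz:
        total_noise = noise_power_dbm(NOISE_DENSITY_DBM_HZ, bw)
        # The displayed floor depends on the bin width, not on the IF width.
        floor = noise_power_dbm(NOISE_DENSITY_DBM_HZ, DISPLAY_BIN_HZ)
        rows.append((bw, total_noise, PEAK_POWER_DBM - floor))
    return rows

rows = compare([2400.0, 4000.0])
for bw, total_noise, margin in rows:
    print(f"IF {bw:.0f} Hz: total noise {total_noise:.1f} dBm, "
          f"peak is {margin:.1f} dB above the floor")
```

Going from 2400 Hz to 4000 Hz adds about 2.2 dB of total noise in this model, while the peak stays 30 dB above the floor in both cases.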

Simple diode AM detectors respond to signals and noise across the passband and will degrade the S/N ratio at all frequencies as the bandwidth is increased. A solution to this is to use synchronous AM detection. This helps improve S/N ratio and reduce distortion and is frequently used for these reasons in the video IF detectors of analog TV receivers (remember those?).

Most speech power falls on specific frequencies, rather than being a perfectly flat spectrum like white noise. This causes the transmit power to concentrate in peaks across the passband. This increases the S/N ratio of the speech by making it rise higher above the noise floor. This allows SSB transmitter bandwidth to be increased with a smaller noise penalty than expected.

Future development in speech processing using FFT methods may allow improvements by modifying the spectrum of the voice to optimize the spectral peaks being transmitted for higher intelligibility, while simultaneously modifying their relative phases to produce a lower peak-to-average output waveform. This may also allow very high loudness with less distortion than current methods.
 

Tests and Observations

Many narrowband signals heard on-air do not have a flat power spectrum. Instead, there is often more energy at the lower frequencies around 300 - 600 Hz. This sounds louder and more natural, but it makes the audio less readable through noise. If more high frequency boost above 1 kHz is used to flatten the power spectrum, the audio can sound overly bright on a normal receiver (too much treble). This can be improved to some extent by receiver EQ or de-emphasis. A transmit passband that includes bass down to below 100 Hz while maintaining a flat power spectrum sounds more natural, and in some cases appears to be slightly more readable or louder through noise. The higher average power of low frequencies plus the receiver de-emphasis cause the low frequencies to sound louder - a well known effect in FM broadcasting. Listen to the tests below and judge for yourself!

Audio + Noise Intelligibility Tests

Method: The processor was set to mode 5 for these tests and the dynamic pre-emphasis EQ was greatly increased, to boost the higher voice frequencies. The audio was processed using the same amount of clipping all the way through each recorded file. The only variable was the Utility mode bandwidth setting. The processed audio was recorded via the S/PDIF output, and compressed again by about 1 dB more to fix the peak level precisely at -6 dB. This simulates ALC action and keeps the peak level constant, as it would be with a limited transmit power level. White noise was then mixed over the audio at a speech-to-noise ratio of -5 dB for normal clipping and -6 dB for maximum clipping (measured in a 5.5 kHz bandwidth). This produces a negative S/N ratio for the speech (i.e., the noise is louder). After the noise was added, the audio was filtered to simulate receiver de-emphasis and to reduce low frequency rumble.
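The noise-mixing step can be sketched as follows. This is not the tool chain used for the recordings; it is a minimal illustration that assumes the ratio is an RMS ratio and measures it over the full sampled bandwidth rather than the 5.5 kHz measurement bandwidth mentioned above. The 1 kHz tone standing in for processed speech, the 8 kHz sample rate and the function names are all placeholders.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_with_noise(speech, snr_db, rng):
    """Add Gaussian white noise scaled so that
    20*log10(rms(speech) / rms(noise)) equals snr_db."""
    noise = [rng.gauss(0.0, 1.0) for _ in speech]
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, noise)]
    return mixed, noise

rng = random.Random(0)          # fixed seed so the run is repeatable
fs = 8000                       # placeholder sample rate, Hz
peak = 10 ** (-6 / 20)          # peak level fixed at -6 dB re full scale
# Placeholder for the processed speech: one second of a 1 kHz tone.
speech = [peak * math.sin(2 * math.pi * 1000 * n / fs) for n in range(fs)]

mixed, noise = mix_with_noise(speech, -5.0, rng)
achieved_snr_db = 20 * math.log10(rms(speech) / rms(noise))
print(f"Achieved speech-to-noise ratio: {achieved_snr_db:.2f} dB")
```

Because the noise is scaled from its measured RMS level, the achieved ratio matches the target regardless of the random seed.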

To hear the results of these tests accurately, the playback system must have adequate bass response. Good quality headphones are recommended.

UPDATE: A narrowband receiver simulation has been added, with sharp FFT filtering removing all frequencies outside of the range 250 Hz to 3 kHz.

Some observations after much testing:

October 14, 2014: The results using a 250 Hz lower limit might be improved by increasing bass energy in the range 300 Hz to 600 Hz. However this will increase mid-bass clipping distortion and may cause a drop in quality.

Tested 10/08/2014:
The Mode 5 tests were done using a contoured audio EQ out of the multiband compressor, reducing bass energy between 125 and 600 Hz, and boosting treble above 1 kHz.

Normal clipping, Mode 5 + EQ boost @ -5 dB SNR - Utility modes stepped through to change bandwidth

Maximum clipping, Mode 5 + EQ boost @ -6 dB SNR - Utility modes stepped through to change bandwidth

Maximum clipping, Mode 5 + EQ boost @ -6 dB SNR - Utility modes stepped through to change bandwidth NARROW RECEIVE 250 Hz - 3 kHz

Bass/Nobass test @ -6 dB SNR: This test compares a highpass cutoff of 60 Hz and 250 Hz at a poor S/N ratio of -6 dB.

Very weak Bass/Nobass test @ -10 dB SNR: This test compares a highpass cutoff of 60 Hz and 250 Hz at a very poor S/N ratio of -10 dB.

Bass/Nobass test - Clean signal, maximum clipping and EQ

Flat spectrum (Mode 3 + input EQ), maximum clipping, diminishing S/N ratio. 60 Hz - 3 kHz and 250 Hz - 3 kHz, in a loop. S/N ratios: Max, -6 dB, -8 dB, -10 dB, -12 dB, -14 dB.
The results are close. The 250 Hz highpass may have slightly better readability, particularly on the words "3", "4" and "5".

Note: Since these tests were done, the mode numbering has been rearranged. Mode 4 now has a contoured output (-10 dB at 300 Hz, +3 dB at 1.5 kHz), and Mode 5 has a flat output.

The tests below have the new mode numbering:
New: Mode 4 + Maximum Clipping at -6 dB SNR with post EQ and lowpass filtering at 3 kHz. Wide: 60 Hz - 3 kHz and Narrow: 250 Hz - 3 kHz
Which is louder? Which is more readable? There isn't much difference, except in tonal quality.

New: Mode 5 + Maximum Clipping at -6 dB SNR with post EQ and lowpass filtering at 3 kHz. Wide: 60 Hz - 3 kHz and Narrow: 250 Hz - 3 kHz
Which is louder? Which is more readable? There isn't much difference, except in tonal quality.

 

Spectrum power plot on voice. Mode 5, Utility mode 3, 60 Hz to 3 kHz, maximum clipping, higher compression on Band 3. Updated 26/08/2014.

Spectrum power plot on voice. Mode 5, Utility mode 5, 250 Hz to 3 kHz, maximum clipping, higher compression on Band 3. Updated 26/08/2014.

Sound quality of clipped speech (added 25/09/2014)

Hilbert clipping sound quality variations due to changes in EQ.

A flat power spectrum is desirable, for keeping readability high and distortion low. The average power spectrum won't be ruler-flat, due to higher voice frequencies having lower average power, and pronunciation causing power spectrum changes.

With moderate clipping levels, small variations from flat can improve quality. Less clipping in the range 300 to 600 Hz sounds better, as distortion here sounds particularly "muddy". However, this also reduces loudness. Slightly more clipping in the range 1 to 2 kHz produces a stronger clarity on pronunciation, and distortion here seems far less obvious.

This contoured response is produced in Mode 4.

The modified Hilbert clipping routine seems to tolerate a higher level at the voice fundamental frequency (100 Hz here), possibly because the distortion products are masked by the existing voice harmonics. Using a good amount of phase rotation can reduce clipper distortion and raise the average level. Phase rotation improves the symmetry of the speech waveform, and it skews the spectrum in time so that the different voice frequencies overlap less and hence clip at different times. If heavy clipping is used for DX purposes, a totally flat power spectrum (adjusted using white noise, with all multiband compressors in gain reduction) produces louder and better sounding audio.

This flat output is produced in Mode 5.

Since Mode 5 is louder and has good quality over typical communications channels, it is now the processor's default mode.
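The effect of phase relationships on peak level, mentioned in the phase rotation discussion above, can be illustrated with a toy signal. The sketch below is not the processor's phase rotator. It assumes a voiced sound can be approximated by 16 equal-amplitude harmonics, and it uses Schroeder-style quadratic phases as a convenient stand-in for phase dispersal. With all harmonics aligned, the peaks pile up at one instant; with the phases spread out, the same power spectrum gives a much lower peak-to-RMS ratio, leaving less for a clipper to remove.

```python
import math

def crest_factor_db(phases, n_samples=2048):
    """Peak-to-RMS ratio (dB) of a sum of equal-amplitude harmonics,
    evaluated over one period of the fundamental."""
    wave = []
    for n in range(n_samples):
        t = n / n_samples
        wave.append(sum(math.cos(2 * math.pi * (k + 1) * t + p)
                        for k, p in enumerate(phases)))
    peak = max(abs(s) for s in wave)
    rms = math.sqrt(sum(s * s for s in wave) / len(wave))
    return 20 * math.log10(peak / rms)

N_HARMONICS = 16   # hypothetical harmonic count for the toy signal

# All harmonics in phase: every cosine peaks together at t = 0.
aligned = [0.0] * N_HARMONICS

# Assumed stand-in for phase rotation: quadratic (Schroeder-style) phases.
dispersed = [-math.pi * k * (k - 1) / N_HARMONICS
             for k in range(1, N_HARMONICS + 1)]

aligned_db = crest_factor_db(aligned)
dispersed_db = crest_factor_db(dispersed)
print(f"Aligned phases:   crest factor {aligned_db:.2f} dB")
print(f"Dispersed phases: crest factor {dispersed_db:.2f} dB")
```

Both signals have identical harmonic amplitudes, so the difference in crest factor comes from phase alone.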
