Field of invention
[0001] This invention relates to an apparatus and method for the detection of simultaneous call transmissions, in particular to double-sideband amplitude modulation (DSB-AM) transmissions.
Background
[0002] Simultaneous Call Transmission (SCT) is a situation in which two or more transmissions occur simultaneously on the same frequency band. The end listener is then (usually) only able to understand the higher-powered of the transmissions. An example is illustrated in Figure 1.
[0003] This is a potentially hazardous situation, as the sender of the weaker transmission may assume that they were actually heard by the end listener and take action accordingly. A particularly dangerous case is where two planes transmit to an air-traffic controller simultaneously, the controller replies, and both planes believe the response is directed to them. Such a scenario may be noticed by an alert, trained human operator listening out for the characteristic phenomena of interfering voice and heterodyne tones arising from the frequency difference between the two transmitters. However, this only works reliably when (1) the power of the weaker transmission is within a certain range of that of the stronger transmission, e.g. 0 to 20dB, and (2) the heterodyne lies in the filtered audio range (e.g. 300Hz to 3.5kHz). The first condition may not be met if one transmission originates much further away than the other (e.g. one plane is overhead and the other is several kilometres away); the second condition may not be met if the frequency difference between the two transmitters (defined by the precision of the quartz in the transmitting equipment) is minor.
[0004] Indeed, SCT can occur far outside of these values due to real world effects such as propagation loss, multipath error and frequency error.
[0005] Hence automatic SCT detection is a desirable feature for radios.
[0006] WO2015/008165A2 describes a method for the detection of more than one signal contained in a received signal. The method comprises: downconverting the received signal, thereby providing a downconverted signal in a complex IQ base band; and at least partially cancelling the strongest user in the downconverted signal, thereby allowing for the detection of a possible secondary user.
[0007] US2010/0067570A1 and DE102007037105A1 describe an apparatus adapted to automatically detect when two transmissions occur simultaneously. This system phase demodulates the sum-signal of the dual transmission by converting the baseband In-Phase and Quadrature signal into unwrapped phase and amplitude. A periodic 'wobble' is present on the unwrapped phase if there are close frequency-separated transmissions, as the difference in frequency causes the phasors of the transmissions to rotate around one another. This phase time-series is then Fourier transformed using a bank of transformers with varying window lengths to determine whether any secondary transmission (i.e. a peak due to the unwrapped phase 'wobble') is present. A warning tone is added to the audio output if a secondary transmission is detected so as to alert the operator to the situation.
[0008] This proposed solution has several significant drawbacks. The step of producing a phase time series is an intrinsically non-linear process (as it involves performing an arctan), and so errors propagate as the process progresses, manifesting as intermodulation products in the output spectrum which are not physically present in the input. Furthermore, in real-world conditions this solution can potentially generate large numbers of 'false positives', where an alert is sounded when only one transmission is present. This is because this system has no way of suppressing common signal impairments such as sinusoidal mains hum, incidental FM (frequency modulation), frequency-selective multipath effects and 1/f^{2} (reciprocal frequency-squared) phase noise. All of these effects can potentially introduce a certain amount of unwrapped phase 'wobble' and then be identified as secondary transmissions.
[0009] False positives are very damaging because they cause an operator to lose faith in the equipment's reliability if it is 'crying wolf' too often. This may result in the operator turning off the automatic SCT feature completely or taking unnecessary mitigating action such as repetition of instructions. Because Air Traffic Control is a safety critical activity, an SCT detection system should be highly tolerant to real signal imperfections so that it is of the highest achievable reliability.
[0010] On the other hand, false negatives are inevitable when the secondary transmission is very weak in power, when it becomes indiscernible from the noise floor, and also in the situation where the secondary transmission is superposed on the stronger transmission with negligible frequency difference.
[0011] An improved solution is therefore needed.
[0012] The invention is defined by the features of independent claims 1 and 14. Further aspects of the invention are the subject of the dependent claims.
[0013] In one embodiment there is provided a method of determining the presence of a secondary transmission in a time-domain sum-signal including a primary transmission and the secondary transmission, the method comprising: transforming the time-domain sum-signal into a frequency-domain sum-signal; wherein the frequency-domain sum-signal is a linear combination of the primary transmission and the secondary transmission; and wherein the transforming is based on a plurality of frequency bins; estimating a primary carrier frequency based on the frequency-domain sum-signal; shifting the frequency bins of the frequency-domain sum-signal based on the estimated primary carrier frequency; determining a conjugate of a sideband of the frequency-bin-shifted frequency-domain sum-signal; attenuating the frequency-bin-shifted frequency-domain sum-signal by using said conjugate of the sideband of the frequency-bin-shifted frequency-domain sum-signal; extracting at least one peak from the attenuated frequency-domain sum-signal; and determining the presence of the secondary transmission in the frequency-domain sum-signal based on said at least one peak.
[0014] In another embodiment there is provided an apparatus for determining the presence of a secondary transmission in a time-domain sum-signal including a primary transmission and the secondary transmission, the apparatus comprising: means for transforming the time-domain sum-signal into a frequency-domain sum-signal, wherein the frequency-domain sum-signal is a linear combination of the primary transmission and the secondary transmission; and wherein the transforming is based on a plurality of frequency bins; means for estimating a primary carrier frequency based on the frequency-domain sum-signal; means for shifting the frequency bins of the frequency-domain sum-signal based on the estimated primary carrier frequency; means for determining a conjugate of a sideband of the frequency-bin-shifted frequency-domain sum-signal; means for attenuating the frequency-bin-shifted frequency-domain sum-signal by using said conjugate of the sideband of the frequency-bin-shifted frequency-domain sum-signal; means for extracting at least one peak from the attenuated frequency-domain sum-signal; and means for determining the presence of the secondary transmission in the frequency-domain sum-signal based on said at least one peak.
[0015] The invention extends to any novel aspects or features described and/or illustrated herein.
[0016] Any feature in one aspect of the invention may be applied to other aspects of the invention, in any appropriate combination. In particular, method aspects may be applied to apparatus aspects, and vice versa.
[0017] Furthermore, features implemented in hardware may be implemented in software, and vice versa. Any reference to software and hardware features herein should be construed accordingly.
[0018] The invention also provides a computer program and a computer program product comprising software code adapted, when executed on a data processing apparatus, to perform any of the methods described herein, including any or all of their component steps.
[0019] The invention also provides a computer program and a computer program product comprising software code which, when executed on a data processing apparatus, comprises any of the apparatus features described herein.
[0020] The invention also provides a computer program and a computer program product having an operating system which supports a computer program for carrying out any of the methods described herein and/or for embodying any of the apparatus features described herein.
[0021] The invention also provides a computer readable medium having stored thereon the computer program as aforesaid.
[0022] The invention also provides a signal carrying the computer program as aforesaid, and a method of transmitting such a signal.
[0023] Any apparatus feature as described herein may also be provided as a method feature, and vice versa. As used herein, means plus function features may be expressed alternatively in terms of their corresponding structure, such as a suitably programmed processor and associated memory.
[0024] It should also be appreciated that particular combinations of the various features described and defined in any aspects of the invention can be implemented and/or supplied and/or used independently.
[0025] In this specification the word 'or' can be interpreted in the exclusive or inclusive sense unless stated otherwise.
[0026] The invention extends to methods and/or apparatus substantially as herein described with reference to the accompanying drawings.
[0027] Purely by way of example, the present invention is illustrated by the accompanying drawings in which:
Figure 1 shows a Simultaneous Call Transmission (SCT) scenario;
Figure 2 shows an example Double SideBand Amplitude Modulated (DSB-AM) signal;
Figure 3 is an example flow diagram of an SCT detection method;
Figure 4 is a schematic diagram of an apparatus operable to perform the method shown in Figure 3;
Figure 5 shows overlapping windows of the sliding window buffer of Figure 3;
Figure 6 is an illustration of the 'zero padded FFT' of Figure 3;
Figure 7 shows example filters for use in an SCT detection method;
Figure 8 shows the effect of the filters of Figure 7 on an example time-domain equivalent (FFT input) envelope;
Figure 9(a) shows an example frequency plot of a secondary signal, superposed on a primary signal carrying voice and 400Hz mains interference present;
Figure 9(b) shows the signal of Figure 9(a) following downconversion;
Figure 10(a) shows the left hand side of the signal of Figure 9(b) reflected onto the right hand side;
Figure 10(b) shows the signal of Figure 10(a) following DSB-AM cancellation and a noise floor estimate;
Figure 11(a) shows the signal of Figure 10(b) following noise floor estimation;
Figure 11(b) shows the asymmetry metric of the peaks detected from the signal of Figure 10(a);
Figure 12 shows a 'feature space' plot of the peak and asymmetry metrics of a large number of simulations;
Figure 13(a) shows a scenario where an SCT and 400Hz mains noise are absent;
Figure 13(b) shows a scenario where an SCT is present and 400Hz mains noise is absent;
Figure 13(c) shows a scenario where an SCT and 400Hz mains noise are present;
Figure 14 shows a flow diagram of an alternative method for SCT detection; and
Figure 15 is an illustration of the in-phase and quadrature components of a signal where an SCT is present.
Detailed description
[0028] In this specification, the term 'primary transmission' or 'primary carrier' refers to the transmission with greatest power. The term 'secondary transmission' or 'secondary carrier' refers to any other (lower-powered) transmission occurring at the same time from another aircraft.
[0029] Air traffic controller (ATC) to aircraft communication is generally very concise: each transmission is typically less than 10 seconds in duration, and can be as short as 2 or 3 seconds. For this reason, the latency in detecting a secondary call transmission (SCT) is preferably less than 2 to 3 seconds for this field of use. The term 'simultaneous' in this specification refers to the situation where two transmissions overlap in time, as in this scenario an ATC would either not hear the SCT or the audio would be filtered out by the radio.
[0030] A typical speech DSB-AM signal can be fully described by the complex time-domain signal:
 x(t) = A (1 + k v(t)) e^{j(ω_{c} t + θ)}   (1)
where
 t is the time in seconds
 A is a gain constant (proportional to the transmitter Root Mean Square (RMS) power)
 v(t) is the real-valued audio signal, normalised to (−1,+1) peak to peak
 k is the modulation depth in the range (0,1) expressed as a percentage.
 ω_{c} is the carrier frequency in radians/sec, typically approximately 2π × 118MHz.
 θ is some notional phase offset (in radians) with respect to t=0
 j is √−1
[0031] The spectrum (i.e. Fourier transform) of such a signal, X(ω), is shown in Figure 2. The signal is centred on a carrier frequency ω_{c} whose spectral line is theoretically infinitesimally narrow. In reality, this carrier band is broadened by system and transmission imperfections. There are 'conjugate symmetric' sidebands either side of ω_{c}, such that, for ω within the bandwidth of v(t), X(ω_{c} + ω) is the same as X(ω_{c} − ω): equal magnitude and conjugate phase.
[0032] For convenient manipulation later, we express DSB-AM conjugate symmetry by first computing the conjugate of the carrier phasor as
where * indicates conjugation.
[0033] The conjugate symmetry property means that the following equality holds (approximate in practice, due to system imperfections and external effects).
[0034] The imperfections in the system are due to noise, cancellation imperfections and superimposed signals (such as a weak secondary transmitter due to SCT). In order to find these, the following calculation is performed:
[0035] In order to perform such a calculation, properties of the primary carrier signal (i.e. ω_{c} and c) must first be known. Once the primary carrier has been identified, it can be isolated and removed, allowing subsequent analysis of Y(ω) to determine whether an SCT has occurred.
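As an illustration of the conjugate symmetry property and of the residual Y(ω), the following Python sketch works on a signal that is already centred on zero hertz. It assumes that c is the unit-magnitude conjugate of the carrier bin and that the residual takes the form Y[k] = X[k]c − (X[−k]c)*. Both forms, the helper names `dft` and `residual`, and all numeric values are assumptions made for illustration and are not taken from the equations of this specification.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
            for k in range(n)]

def residual(spectrum):
    """Assumed residual Y[k] = X[k]*c - conj(X[-k]*c) for k = 1 .. N/2 - 1.

    The carrier is assumed to sit in bin 0, and c is assumed to be the
    unit-magnitude conjugate of the carrier bin.
    """
    n = len(spectrum)
    c = spectrum[0].conjugate() / abs(spectrum[0])
    return [spectrum[k] * c - (spectrum[-k] * c).conjugate() for k in range(1, n // 2)]

N = 64
THETA = 0.7        # arbitrary carrier phase offset (radians)
K = 0.5            # modulation depth
AUDIO_BIN = 5      # audio test tone, in FFT bins
SCT_BIN = 9        # frequency offset of the weak secondary carrier, in FFT bins
SCT_GAIN = 0.01    # secondary carrier 40 dB below the primary

primary = [(1 + K * math.cos(2 * math.pi * AUDIO_BIN * t / N)) * cmath.exp(1j * THETA)
           for t in range(N)]
secondary = [SCT_GAIN * cmath.exp(1j * (2 * math.pi * SCT_BIN * t / N + 1.1))
             for t in range(N)]

y_primary = residual(dft(primary))
y_sct = residual(dft([p + s for p, s in zip(primary, secondary)]))

# residual()[i] holds bin i + 1, so add 1 to recover the bin number
peak_bin = 1 + max(range(len(y_sct)), key=lambda i: abs(y_sct[i]))
print(max(abs(y) for y in y_primary), peak_bin, abs(y_sct[peak_bin - 1]))
```

Under these assumptions the residual of the primary-only signal is numerically zero, while the secondary carrier survives as a single peak at its offset bin.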
[0036] Figure 3 illustrates a high-level dataflow of an exemplary method (referred to as 'Frequency Domain (FD) SCT detection') for detecting the presence of a secondary call transmission in an input signal. The input is a high sample rate IQ (in-phase / quadrature) baseband time-series (real or complex) and the outputs are SCT detection results (for example, a tone inserted into the audio signal or a flag placed into a data stream). Each step is described in more detail after the following brief overview.
[0037] As used herein, the term "sum-signal" (and similar) preferably connotes a signal received by a receiver (for example an air traffic control system). Such a sum-signal preferably comprises a primary carrier signal, noise, and potentially a secondary carrier signal (where present); the term "sum" merely implies, preferably, that such elements (primary carrier signal, noise, etc) are present in one and the same signal.
[0038] A sum signal is received and converted to an IQ baseband sample. This signal is decimated 300 (downsampled) removing unnecessary frequency components so as to reduce the load on later processing steps.
[0039] The decimated signal is sampled using an overlapping, sliding window buffer 302 which stores a given length of signal to be processed. Sampling the signal means that the entire signal does not need to be processed in one go, improving the latency of detection and reducing processor load.
[0040] The length of the buffer is determined by the trade-off between the processor load of analysing lengthy samples and the increased detection accuracy (i.e. higher signal-to-noise ratio) that longer samples afford. The sampling rate (i.e. number of windows per second) is determined by a trade-off between the processor load of processing large numbers of windows within a short time period and the latency of SCT detection.
[0041] Each window sample is then input into a Fast Fourier Transform (FFT) 304, outputting X(ω), for further processing in the frequency domain. An FFT is preferable to a continuous Fourier Transform (FT) as it is much less processor-intensive. Other discrete transforms such as wavelet transforms or spectral line filters may be used.
[0042] The decimation step may introduce 'decimator ripple' in the frequency-domain output; this is corrected for at step 306 before any further processing.
[0043] The frequency of the primary carrier transmission is estimated 308, and the FFT output is downconverted 310 based on this frequency. The in-phase elements of the primary carrier transmission are then cancelled 312 by subtracting the conjugate of the negative sideband frequencies of the primary signal from their counterpart positive sideband frequencies in the (decimated, downsampled, frequency-domain) sum signal.
[0044] The remaining signal, Y(ω), is due to phase noise or other signals at different frequencies (which may be SCTs) to the main carrier transmission in the signal. This signal is analysed for peaks 316 (defined by a threshold) above an estimated noise floor 314. These peaks are indicative of an SCT being present, as they represent significant-magnitude parts of the signal which are not on the same carrier frequency as the primary transmission (e.g. peaks corresponding to a heterodyne tone). However, other effects, such as mains hum and phase noise, may show up as peaks above a nominal noise floor. Noise effects such as these affecting the sidebands of the primary carrier are typically symmetric about the primary carrier frequency; thus an asymmetry analysis 318 is performed to determine whether a particular peak has a corresponding 'mirror image' peak. This analysis is performed using an 'asymmetry threshold'. The peak is also analysed for its magnitude (above the noise floor), as higher-power peaks are more likely to be secondary transmissions rather than variations in noise. These two parameters (and/or others) are combined in a 'Feature Space Classification' 320, and an SCT can be signalled if a sample contains a peak exceeding the predetermined threshold(s).
[0045] This process is performed continuously for every sampled window of the incoming signal. It should be noted that the method has been shown as split into numerous discrete steps whereas in practice many of these steps may occur simultaneously or as part of a single step.
[0046] The process described above (after sampling) is undertaken entirely on the spectrum of time windows of signal rather than any time-series. This is significant as the main (distinguishable) physical difference between primary and secondary transmissions is the slight difference in central frequency. Analysing the spectrum is thus addressing the fundamental problem. Furthermore, all of the above processes are mathematically linear, thus greatly reducing the potential of spurious artefacts being introduced, or for errors to propagate and become amplified as the process continues.
[0047] Figure 4 shows a schematic diagram of a radio receiver 104 adapted to perform the processes involved in the detection of a secondary call transmission as described above.
[0048] A signal is received by an aerial and is input to an Analogue-to-Digital Converter (ADC) 402. The dotted section 400 represents a simplified digital radio receiver without any SCT detection capability. The digital signal is demodulated by demodulating unit 403, with assistance from a central processor 422 and memory 424. This is then passed to an audio output unit 420 and audio is outputted. An actual digital radio may include many additional components (such as tuning, filtering and amplifying circuitry), but such components are omitted for clarity in this figure.
[0049] This audio extraction process occurs independently of the SCT detection, as this represents the primary purpose of the radio 104, to convert received signals into audio (or other useful information). The components used for SCT detection are shown outside of the dotted section 400.
[0050] The digital signal is decimated (downsampled) by decimator 404 before being sampled in a sliding window buffer 406. Each window is then passed through a Fast Fourier Transform (FFT) 408.
[0051] The spectrum outputted from the FFT has filters / windows 410 applied to it so as to produce the signal defined by Y(ω) in equation 4. This output is passed to a comparator 412, which, with logic circuitry 412 and thresholds stored in memory 424, determines whether an SCT has occurred. If so, the operator is notified, for example by a tone being inserted into the audio output and/or a flag (such as an indicator on a user interface) is raised via tone / flag generator 418. Other information regarding the SCT, such as indication of a confidence level, or timestamp of the event, may also be outputted.
[0052] Figure 4 shows components separated for clarity whereas in reality many of these components may be combined as a single component (such as the comparator, logic combined with the processor) or further split into separate components.
[0053] The following description further details the various steps briefly described above.
Decimation 300
[0054] In a DSB-AM radio receiver, the intermediate digital signal after analogue-to-digital conversion is often at a higher sample rate than is required to support the DSB-AM sidebands of a primary signal.
[0055] The decimation stage 300 represents the context-dependent low-pass filtering and downsampling that may be required to reduce the signal bandwidth to a value of e.g. ±7kHz at a sample rate of f_{s}=14kHz. For instance, such a bandwidth will support a primary DSB-AM signal with 4kHz audio bandwidth (A) and up to ±3kHz of frequency error (B). The relationship between A, B and f_{s} is illustrated below, with additional example values:
Audio bandwidth on Primary DSB signal (kHz) | Maximum Primary Frequency Error (kHz) | Complex Sample Rate after decimation (f_{s}) (kHz)
A | ±B | f_{s}=2*(A+B)
5 | 4 | 18
3 | 3 | 12
4 | 3 | 14
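The relationship f_{s}=2*(A+B) in the table can be checked with a few lines of Python. The function name `complex_sample_rate_khz` is a hypothetical helper introduced here for illustration only.

```python
def complex_sample_rate_khz(audio_bw_khz, max_freq_error_khz):
    """Complex sample rate after decimation, f_s = 2 * (A + B), in kHz."""
    return 2 * (audio_bw_khz + max_freq_error_khz)

# (A, B) pairs taken from the example rows of the table
rows = [(5, 4), (3, 3), (4, 3)]
rates = [complex_sample_rate_khz(a, b) for a, b in rows]
print(rates)
```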
[0056] The purpose of decimation is twofold: (1) to reduce the computational load and (2) to reject signals outside the band of interest for SCT detection. The decimation step (and the subsequent ripple equalisation) would not be necessary if these issues were not of any relevance (e.g. if the analogue-to-digital converter has a low sample rate).
[0057] The decimator design preferably has a narrow transition region (for example 10% of passband) with low passband ripple (0-3dB) and high stopband attenuation (for example more than 40dB), i.e. a typical specification for a high-quality decimator for audio applications. A typical low-pass mask specification given the third set of parameters in the above table would be ±1dB of passband ripple up to 5kHz, a transition region from 5kHz to 7kHz, and −60dB of gain in the decimation stopband.
[0058] Note that if the time-series is real-only, then a complex oscillator and mixer are required to downconvert the signal before the decimator. The decimation low-pass filter then requires sufficient stopband attenuation to adequately remove the frequency-shifted conjugate image.
Sliding window buffer 302
[0059] This stage presents the most recent T-second block of sample data to the subsequent processing stages M times a second, as illustrated in Figure 5. Blocks preferably have a high degree of overlap to maximise the chance of detecting a secondary transmission as soon as it starts. Typical values for this stage (for the ATC example discussed above) are T=2 seconds and M=4Hz. In this case the longest possible time between an SCT event beginning and the end of one window, t_{max}, is 1/M = 0.25 seconds. However, not all SCT events may be detected in such a short time period, as the signal may be too weak; in such circumstances, due to the overlap of windows, the next window would have 0.5 seconds of SCT to detect, and so on. Thus a detectable constant SCT would be detected within T + t_{max} (= T + 1/M) seconds. In order to ensure overlap, the inequality T*M>1 must hold; ideally around 8 windows overlap, so that T*M≥8.
[0060] The purpose is to allow strong signals to be detected quickly by the system at a coarse granularity in time, but also to allow an adequate time history for the coherent integration and detection of weak secondary signals.
[0061] In use, M times a second, a buffer comprising the last T seconds of data is processed. This results in each data block of length 1/M seconds being processed T*M times in total. For illustration, T can be in the range 1 to 4 seconds and M can be in the range 2 to 16. T governs the coherent integration period for detecting weak SCT signals, and it is advantageous for T to be long, about the same length as typical primary transmission utterances. The value of 1/M governs the maximum latency for detecting strong SCT signals, and it is advantageous for M to be high for low latency. The processing load is proportional to the product T*M; thus there is a trade-off between performance and processor load when selecting values of T and M.
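The buffering scheme can be sketched in Python as a generator that yields the most recent T seconds of samples M times a second. The function name `sliding_windows` and the toy sample rate of 8Hz are assumptions chosen to keep the example small; a real buffer would run at the decimated rate f_{s}.

```python
def sliding_windows(samples, sample_rate_hz, window_s, updates_per_s):
    """Yield the most recent window_s seconds of data, updates_per_s times a second."""
    window_len = int(round(window_s * sample_rate_hz))   # T * f_s samples per window
    hop = int(round(sample_rate_hz / updates_per_s))     # f_s / M new samples per update
    for end in range(window_len, len(samples) + 1, hop):
        yield samples[end - window_len:end]

SAMPLE_RATE_HZ = 8   # toy value for illustration only
T = 2                # window length in seconds
M = 4                # windows per second

samples = list(range(32))   # 4 seconds of dummy data; each value equals its index
windows = list(sliding_windows(samples, SAMPLE_RATE_HZ, T, M))

# Each 1/M-second block should be seen T*M times once the buffer is full
times_seen = sum(1 for w in windows if 16 in w)
max_latency_s = T + 1.0 / M   # worst-case detection time for a detectable constant SCT
print(len(windows), times_seen, max_latency_s)
```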
[0062] Although in use a large number of window buffers will be processed, for clarity, the following description will solely focus on the processing of a single window.
Oversampled, zero-padded FFT 304
[0063] Figure 6 illustrates how the current analysis window from the buffer is mapped into the FFT input with 'zero padding'. The mapping is unconventional in that the buffer is split into two halves and the first half is mapped into the final part of the FFT input and the second half is mapped into the start of the FFT input, with zero-value inputs occupying the intervening samples. This improves the operation of frequency domain downconversion as described below.
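The split mapping described above can be sketched in Python as follows. The function name `map_to_fft_input` is hypothetical, and the four-sample window is only a toy example.

```python
def map_to_fft_input(window, n_fft):
    """Place the second half of the window at the start of the FFT input and the
    first half at the end, with zeros in between."""
    if n_fft < len(window):
        raise ValueError("FFT size must be at least the window length")
    half = len(window) // 2
    first, second = list(window[:half]), list(window[half:])
    padding = [0] * (n_fft - len(window))
    return second + padding + first

fft_input = map_to_fft_input([1, 2, 3, 4], 8)
print(fft_input)
```

With this layout the centre of the analysis window lands on index 0 of the FFT input, so the window is centred on the time origin of the transform.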
[0064] The FFT size N_{FFT} is chosen to be around twice the size of the buffer window to provide sufficient oversampling for subsequent processing. The choice of an oversampling ratio of approximately ×2 is a compromise between two conflicting factors: (1) critical sampling at around ×1 oversampling is not viable because downconversion would require an unfeasibly long resampling filter for the required DSB-AM cancellation fidelity; (2) oversampling at, say, more than ×3 yields negligible performance benefit at the expense of increased computational complexity in the FFT. Of course, oversampling ratios of greater than ×3 may be used if computational complexity is not an issue, for example if the fidelity of the cancellation is paramount.
[0065] For example, with a signal sampling rate f_{s} of 14kHz and T=2 seconds, the buffer is 28,000 samples long. This indicates that an FFT size of N_{FFT}=65,536 (the oversampling ratio being 2.34) is appropriate, presuming that a standard Digital Signal Processing (DSP) library function is used that requires a power-of-two size (i.e. N_{FFT} = 2^{n}, where n=16 in the example above). The relationship between these variables, and example combinations, are indicated in the table below:
Sample Rate (kHz) | T (seconds) | Buffer Blocksize (kSamples) | Radix-2 FFT Size (kSamples) | Oversampling Ratio
f_{s} | T | C = f_{s}*T | N_{FFT} | N_{FFT} / C
10 | 2 | 20 | 64 (≈2^{16}) | 3.2
12 | 4 | 48 | 64 (≈2^{16}) | 1.3
14 | 6 | 84 | 128 (≈2^{17}) | 1.5
16 | 8 | 96 | 128 (≈2^{17}) | 1.3
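The worked example in the text (f_{s}=14kHz, T=2 seconds) can be reproduced with a short Python sketch. The selection rule used here, the smallest power of two giving at least ×2 oversampling, is an assumption for illustration; the table rows show that other ratios are also contemplated. The function name `radix2_fft_size` is hypothetical.

```python
def radix2_fft_size(buffer_len, min_oversampling=2.0):
    """Smallest power-of-two FFT size giving at least min_oversampling (assumed rule)."""
    n = 1
    while n < buffer_len * min_oversampling:
        n *= 2
    return n

buffer_len = 14000 * 2                 # f_s * T samples
n_fft = radix2_fft_size(buffer_len)    # power-of-two size for a radix-2 FFT
ratio = n_fft / buffer_len             # oversampling ratio N_FFT / C
print(buffer_len, n_fft, round(ratio, 2))
```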
[0066] Depending on operational requirements / constraints, a larger or smaller oversampling ratio may be used. The larger the oversampling, the more processor-intensive the resulting analysis would be (due to the greater number of discrete frequency 'bins' in the frequency domain), but the system would be more accurate due to (at least) the spectrum having greater resolution.
[0067] For later convenience, the FFT output is denoted as the vector x with elements x_{i}, where i={0,1,...,N_{FFT}−1}, counting up from the zero frequency bin.
Decimator Ripple Equalisation 306
[0068] The low-pass filter discussed above with reference to decimation may have significant passband ripple in order to be implementable at realistic cost. Passband ripple is an artefact that appears in the spectrum of a signal to which imperfect (i.e. non-square) bandpass filters have been applied.
[0069] The gain fluctuation across the band of interest can degrade the ability to perform primary carrier double sideband cancellation, as it affects the conjugate symmetry property exploited in equation 3. A low-cost and simple way to compensate for this effect is to calculate the ripple across the decimated band, H(ω), from the FFT of the impulse response of the decimation, and to apply a gain and phase compensation of 1/H(ω) to the output of the FFT.
[0070] The inverse response 1/H(ω) is stored as a vector of N_{FFT} complex weights, which is applied to the output directly after the FFT has been computed.
[0071] Although H(ω) is symmetric about zero hertz, it is not symmetric about the primary carrier, so it would not be cancelled out when calculating Y(ω), which is described in more detail below.
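A minimal Python sketch of the equalisation step follows. The three-tap impulse response `FIR` is a hypothetical stand-in for a real decimator response, the filtering is modelled as circular so that the check is exact, and the function names are assumptions introduced for illustration.

```python
import cmath
import math

def dft(samples):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples))
            for k in range(n)]

def equaliser_weights(impulse_response, n_fft):
    """Vector of n_fft complex weights 1/H, with H the DFT of the zero-padded response."""
    padded = list(impulse_response) + [0.0] * (n_fft - len(impulse_response))
    return [1.0 / h for h in dft(padded)]

def equalise(spectrum, weights):
    """Apply the stored gain and phase compensation bin by bin."""
    return [x * w for x, w in zip(spectrum, weights)]

def circular_filter(signal, fir):
    """Circular convolution, standing in for the decimation low-pass filter."""
    n = len(signal)
    return [sum(h * signal[(t - i) % n] for i, h in enumerate(fir)) for t in range(n)]

FIR = [0.1, 0.8, 0.1]   # hypothetical response with passband ripple and no spectral nulls
N_FFT = 16
signal = [math.sin(0.9 * t) + 0.3 * math.cos(2.2 * t) for t in range(N_FFT)]

weights = equaliser_weights(FIR, N_FFT)
restored = equalise(dft(circular_filter(signal, FIR)), weights)
max_error = max(abs(r - s) for r, s in zip(restored, dft(signal)))
print(max_error)
```

The weights are computed once and reused for every window, which matches the storage of 1/H(ω) as a fixed vector. The approach presumes that H(ω) has no zeros in the band of interest.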
Primary Carrier Frequency Estimation 308
[0072] The highest magnitude FFT output bin (denoted as bin j) is detected and its power and frequency are measured. This is asserted to be the primary carrier (i.e. the strongest sinusoidal tone), and these measurements are passed on to the classification stage in order to detect whether any primary signal is present. Identifying the primary carrier frequency leads to identification of non-primary carrier signals (such as an SCT).
[0073] Taking the magnitude samples of the three FFT output bins {j−1, j, j+1}, a parabolic (quadratic) curve may be fitted to the points, for example using closed-form linear algebra. The fractional bin offset f, in the range −0.5 to 0.5, of the maximum of the fitted parabola is combined with j to give the best estimate of the true primary carrier frequency ω_{c}. The oversampling of the FFT (e.g. twice oversampling) provides an interpolated mainlobe of the primary carrier and thus facilitates an accurate peak position estimate. Accurate primary carrier peak estimation allows for a more accurate downconversion, leading to improved subsequent DSB-AM cancellation as the centre point of the reflection is more accurate.
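The three-point parabolic fit has a well-known closed form, sketched below in Python. The function names and the sample magnitudes are assumptions for illustration; the magnitudes are taken from an exact parabola whose maximum lies 0.3 bins above bin 2.

```python
def parabolic_peak_offset(m_prev, m_peak, m_next):
    """Fractional offset of the vertex of the parabola through three
    equally spaced magnitude samples centred on the peak bin."""
    denom = m_prev - 2.0 * m_peak + m_next
    if denom == 0.0:
        return 0.0   # flat triple: no curvature, so keep the integer bin
    return 0.5 * (m_prev - m_next) / denom

def estimate_carrier_bin(magnitudes):
    """Return (j, f): the peak bin j and the fractional offset f of the fitted maximum."""
    n = len(magnitudes)
    j = max(range(n), key=lambda i: magnitudes[i])
    f = parabolic_peak_offset(magnitudes[(j - 1) % n], magnitudes[j], magnitudes[(j + 1) % n])
    return j, f

# Samples of 5 - (x - 2.3)^2 at x = 1, 2, 3, padded with low values either side
magnitudes = [1.0, 3.31, 4.91, 4.51, 1.0]
j, f = estimate_carrier_bin(magnitudes)
print(j, f)
```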
[0074] At this stage the width of the primary mainlobe may be assessed by searching out from the peak in both negative and positive frequency until bins that are more than 3dB below the peak (i.e. below approximately 0.5 of the peak power) are identified, giving the full width at half maximum (FWHM) of the primary mainlobe. Leading and trailing edges of the primary transmitter in the analysis window cause wide mainlobes, and this measurement may be useful in the later 'feature classification' stage for assessing the time-domain activity of a primary transmitter.
Frequency-Domain Downconversion 310
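The outward search for the half-power points can be sketched in Python as follows. Expressing the width as a count of bins at or above half the peak power is an assumption made here, as is the function name `mainlobe_width_bins`.

```python
def mainlobe_width_bins(power, peak):
    """Count the contiguous bins around the peak whose power is at least half the
    peak power, searching outwards in both directions with wrap-around."""
    n = len(power)
    half_power = 0.5 * power[peak]
    below = 1
    while below < n and power[(peak - below) % n] >= half_power:
        below += 1
    above = 1
    while above < n and power[(peak + above) % n] >= half_power:
        above += 1
    return below + above - 1

power = [0.1, 0.4, 0.8, 1.0, 0.7, 0.3, 0.1]   # toy power spectrum with its peak at bin 3
width = mainlobe_width_bins(power, 3)
print(width)
```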
[0075] Frequency-domain downconversion 310 is performed by generating a finite-impulse filter which shifts the frequency bins by (j+f) bins (i.e. by ω_{c}) so that the underlying maximum of the primary carrier mainlobe is shifted exactly on to the zero frequency bin. This step effectively makes the primary carrier signal symmetric about zero hertz, making later computation and determination of SCT events simpler.
[0076] The formula for the filter is given in Equation 5, where N_{coeffs} = 4 and w = [0.3635819, 0.4891775, 0.1365995, 0.0106411]. The coefficients w generate a Blackman-Nuttall window, which has good sidelobe performance. Other windows may be used, such as 'Kaiser' or 'Equiripple' windows, but cosine-family windows such as the Hamming, Hann and Blackman families have the implementation benefit of combining good sidelobe performance with precise and simple computation using cosines.
[0077] The value x_{LIM} sets the limits for the window (i.e. it is zero-valued for |x| > x_{LIM}), and hence defines the quality of the resampling (a typical value would be x_{LIM}=5). A small value is desirable in order to minimise the processing complexity of downconversion. The choice of x_{LIM} is discussed in more detail below with reference to Figures 7 and 8.
[0078] The components of this frequency-domain downconversion filter are shown in Figure 7. This figure also shows the 'half bin samples' which may be present due to the FFT oversampling described above with reference to Figure 6. The effect of these fractional bins is described in more detail below with reference to Figure 8.
[0079] Downconversion is performed using Equation 6 as a circular convolution on the FFT output x, but only including the non-zero terms from Equation 5 to minimise computational cost. For example, with x_{LIM}=5, only 2x_{LIM}+1 = 11 (-5 to +5) multiply/accumulates are needed per bin. This is analogous to implementing a short Finite Impulse Response (FIR) filter.
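The circular convolution can be sketched as follows. The kernel is assumed to be a sinc function multiplied by the cosine-sum window, offset by the fractional bin f, and the shift direction is assumed to move bin (j+f) on to bin zero. The names `kernel` and `downconvert`, and the test tone, are illustrative only and are not taken from Equations 5 and 6.

```python
import numpy as np

W_COEFFS = np.array([0.3635819, 0.4891775, 0.1365995, 0.0106411])

def kernel(f, x_lim=5):
    """Windowed-sinc taps g(m) for m = -x_lim..x_lim and fractional offset f (assumed form)."""
    m = np.arange(-x_lim, x_lim + 1)
    x = m - f
    k = np.arange(len(W_COEFFS))
    win = np.cos(np.pi * np.outer(x, k) / x_lim) @ W_COEFFS
    win = np.where(np.abs(x) <= x_lim, win, 0.0)   # zero outside the window limits
    return m, np.sinc(x) * win

def downconvert(X, j, f, x_lim=5):
    """Shift the spectrum X so that bin (j + f) lands on bin 0, using 2*x_lim+1 taps per bin."""
    m, g = kernel(f, x_lim)
    out = np.zeros(len(X), dtype=complex)
    for mi, gi in zip(m, g):
        out += gi * np.roll(X, -(j + int(mi)))   # np.roll(X, -s)[k] == X[(k + s) % N]
    return out

# Usage: a tone at 20.3 bins occupying half of a twice-oversampled buffer.
# The signal is placed around time zero (second half first, first half last).
n_fft, j, f = 256, 20, 0.3
n = np.arange(-n_fft // 4, n_fft // 4)
buf = np.zeros(n_fft, dtype=complex)
buf[n % n_fft] = np.exp(2j * np.pi * (j + f) * n / n_fft)
Y = downconvert(np.fft.fft(buf), j, f)
residual = np.fft.ifft(Y)[n % n_fft]   # close to 1 if the tone was moved to 0 Hz
```

In this sketch the residual over the occupied half of the buffer stays close to unity, which corresponds to the nearly flat part of the envelope function.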
[0080] Frequency-domain convolution of two signals is analogous to multiplication of their time-domain equivalents. In this case, the inverse Fourier transform of the g(x) term in Equation 6 is an arbitrary-frequency sinusoid with a sampling-dependent envelope function: this is unity when samples are taken on a grid at integer values of x (i.e. f=0) and (worst-case) has ramping and a zero point when samples are taken on a grid halfway between FFT bins (i.e. f=±0.5), as illustrated in Figure 8. Other values of f create envelopes intermediate between these extremes. Although this process would not be necessary if only integer values of f were used, doing so would introduce errors into the central frequency, and these errors would carry through to the later asymmetry analysis.
[0081] The mathematical explanation for the envelope phenomenon shown in Figure 8 is as follows. Equation 5 comprises the product of two terms: (1) a sin(x)/x function with infinite support on x (which has too many terms to compute practically) and (2) a compactly supported window function (which makes g(x) economic to compute). In the time domain, by analogy, this is the circular convolution of (1) an arbitrary-frequency sinusoid and (2) a band-pass filter corresponding to the frequency-shifted IFFT of the window function. The output of this filtering process is a unit-amplitude sinusoid, except where a phase discontinuity passes through the filter where the two ends of the sinusoid are circularly "spliced" together. This creates the characteristic "dip" in the sinusoid envelope illustrated in Figure 8, which is worst-case when a 180 degree discontinuity passes through (as occurs with the half-bin case).
[0082] Figure 8 also explains the utility of the unconventional zero-padding described above of mapping the "1^{st} half" and "2^{nd} half" of the input buffer to time-domain intervals where the envelope function is almost exactly unity. The mapping of the second half of the time window to the first part of the FFT input and vice versa means that the FFT input maintains its time-order, as the end of the first half is effectively contiguous with the start of the second half (the FFT can be visualised as wrapped around the surface of a cylinder). Hence if the equivalent time-domain product is taken (by taking notional IFFTs of the frequency-domain convolution), we have the desired effect of the signal multiplied by a unit-amplitude complex sinusoid in order to effect high-quality, precise downconversion of the primary signal. Slight deviation from unity over the non-zero-padded part of the envelope function is permitted, as a perfect standard rectangular window is not necessary. A tolerance of deviation from the maximum of the envelope function of approximately 1% is preferable.
[0083] The choice of x_{LIM} is a function of the oversampling ratio: it should be the smallest value that minimises the computational complexity of the window filter whilst not impinging on the 'flatness' of the envelope function. If x_{LIM} is too small, the envelope function would begin to curve over the sections of the IFFT which contain the signal data, resulting in the signal being modified prior to DSBAM cancellation. The value of x_{LIM} which satisfies this trade-off has been empirically found to be approximately (12 / oversampling ratio).
[0084] The final stage in downconversion is to rotate the FFT output such that the primary carrier is zero-phase (phase-rotation). This is performed by Equation 7, where x is the downconverted FFT output derived from Equation 6.
DSBAM Cancellation (of Primary Signal) 312
[0085] DSBAM cancellation 312 as discussed above is effected by applying Equation 8 in order to generate an output vector y comprising N_{FFT}/2+1 bins (the zero-frequency bin and the right-hand side of the spectrum). As Y(ω) is by mathematical definition conjugate symmetric about zero for an ideal primary carrier, only computation of the right-hand side (i.e. positive frequency) is necessary. Only magnitude information is taken into y for the purpose of peak detection, hence the modulus is taken.
[0086] The quality of DSBAM cancellation 312 is dependent on the temporal coherence of the primary signal. Phase noise on the primary carrier can lead to some feed-through of tonal components in the sidebands which may appear as distinct tones in Y(ω). A simple technique for identifying such tones using the concept of power 'asymmetry' is described below.
[0087] Subtracting the conjugate of the negative frequencies from the positive frequencies of the sum-signal (after downconversion) effectively cancels out the part of the signal in-phase with the primary carrier (attenuating the frequency-domain primary carrier within the sum-signal), leaving just signals which have introduced phase noise into the sum-signal. These signals include phase noise (which would generally be at a low level across a wide range of frequencies) and specific tones, which would manifest as peaks in the frequency plot.
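The subtraction described above can be sketched as follows, assuming the spectrum has already been downconverted so that the primary carrier sits in bin 0. The phase rotation of the preceding stage is included so that the example is self-contained; the function name and the test signal are illustrative assumptions.

```python
import numpy as np

def dsbam_cancel(X):
    """Return |X[k] - conj(X[-k])| for k = 0..N/2 after rotating bin 0 to zero phase."""
    n_fft = len(X)
    X = X * np.exp(-1j * np.angle(X[0]))        # phase rotation: carrier becomes real
    k = np.arange(n_fft // 2 + 1)               # zero bin plus positive frequencies
    return np.abs(X[k] - np.conj(X[(-k) % n_fft]))

# Primary: carrier at 0 Hz with real (DSB-AM) modulation at 5 bins and an arbitrary phase.
# Secondary: weak tone at +9 bins, with no counterpart at -9 bins.
n_fft = 64
n = np.arange(n_fft)
primary = (1 + 0.5 * np.cos(2 * np.pi * 5 * n / n_fft)) * np.exp(0.7j)
secondary = 0.1 * np.exp(2j * np.pi * 9 * n / n_fft)
y = dsbam_cancel(np.fft.fft(primary + secondary))
```

In this idealised case the conjugate-symmetric modulation sidebands at 5 bins cancel to numerical precision, while the secondary tone at 9 bins survives with its full magnitude.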
Noise Floor Estimation 314
[0088] SCT tones in y are characterised by isolated narrowband peaks against a noise floor after DSBAM cancellation of the primary carrier transmission. Hence, in order to detect peaks, a noise floor estimate which is not biased by tonal peaks is required. Noise levels may not be constant over the whole frequency range in question, so the noise level at every frequency bin is estimated in order to 1) capture secondary transmissions above the local noise level, but potentially below the noise level elsewhere, and 2) discount frequency bins with higher levels of noise than elsewhere. A single estimate of the noise level across the entire frequency spectrum would not be able to account for such circumstances, resulting in false negatives in case 1) and false positives in case 2). Either of these scenarios is undesirable, false negatives particularly so in an ATC implementation as such events could result in a dangerous situation.
[0089] An effective way to determine a frequency-dependent noise floor estimate is to calculate a moving average of the magnitude across a range of bins centred around a particular frequency bin. If a large enough bin range is used and peaks are not frequent, this would be an accurate representation of the noise floor at that frequency bin. In one example a short sliding-window rank-order statistic filter is applied which extracts, e.g., the median power bin as the noise floor estimate. Analogous filters are used for removing impulsive noise from otherwise smooth functions in applications like image processing.
[0090] This concept is expressed simply in Equation 9 where the median window estimate is over ±N_{NFE} bins (a typical value is N_{NFE}=256 when N_{FFT}=65536). If the window is too long, frequency-dependent changes in the noise floor are smoothed out and the noise floor does not respond to local effects such as colouration from filters. On the other hand, if N_{NFE} is too short, legitimate SCT peaks may adversely bias the noise floor estimate, leading to them being smoothed and subsequently discounted. A value for N_{NFE} of approximately
has been found to satisfy this trade-off, but other information (such as known noise sources) may be taken into account in the choice of N_{NFE}.
[0091] An issue occurs when applying the window to the very start and end of y, where non-existent bins outside the boundary are addressed. A solution is to reflect in the missing bins from the respective boundary such that e.g. bin i=-1 comes from bin i=+1, and similarly for the end of y.
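A sliding-window median with reflected boundaries, as described above, can be sketched as below. The function name is an assumption; NumPy's "reflect" padding mode mirrors about the edge sample without repeating it, so the missing bin at index -1 is taken from bin +1.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def noise_floor(y, n_nfe):
    """Median of y over +/- n_nfe bins around each bin, with reflected boundaries."""
    padded = np.pad(y, n_nfe, mode="reflect")            # bin -1 <- bin +1, and so on
    windows = sliding_window_view(padded, 2 * n_nfe + 1) # one window per original bin
    return np.median(windows, axis=1)

# A flat noise floor with one isolated peak: the median ignores the peak.
y = np.ones(50)
y[20] = 100.0
nf = noise_floor(y, 4)
```

Because the peak occupies only one of the nine samples in any window, the estimate stays at the underlying noise level.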
[0092] Median filtering is costly to compute in terms of processing time and power. A practical optimisation is to decimate y by summing contiguous blocks of D_{1} samples and then using a much shorter median filter over ±D_{2} on the resulting decimated signal. The aggregate window size is N_{NFE}=D_{1}D_{2}. For instance, D_{1}=16 and D_{2}=16 when N_{NFE}=256. This has little performance loss when the series y is dominated by the noise floor and has sparsely located peaks. In one embodiment this splits the processing load evenly between the linear moving-average process and the non-linear median filtering process. In other embodiments fewer, larger windows may be taken or, alternatively, more, smaller windows. The choice of the window lengths D_{1}, D_{2} is also dependent on the trade-off between windows that are too short being dominated by peaks and windows that are too long missing the trend of the noise; for example, D_{1} and D_{2} could each vary between 4 and 64 as a general illustration in these circumstances.
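The two-stage estimate can be sketched as below. The block mean is used in place of the block sum so that the result stays in the same units as y, and the estimate is repeated back to one value per bin; both choices, the handling of any incomplete final block, and the function name are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def noise_floor_decimated(y, d1, d2):
    """Block-average y in blocks of d1 bins, then median-filter over +/- d2 blocks."""
    n_blocks = len(y) // d1                      # any incomplete final block is dropped
    blocks = y[: n_blocks * d1].reshape(n_blocks, d1).mean(axis=1)
    padded = np.pad(blocks, d2, mode="reflect")  # reflected boundaries, as for the full filter
    med = np.median(sliding_window_view(padded, 2 * d2 + 1), axis=1)
    return np.repeat(med, d1)                    # back to one estimate per original bin

y = np.ones(256)
y[100] = 50.0                                    # isolated peak
nf = noise_floor_decimated(y, 16, 4)
```

The median then runs over far fewer samples than in the full-rate filter, at the cost of a coarser frequency resolution in the noise floor estimate.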
[0093] The median is the default rank-order statistic to draw out, but other measures of central tendency are possible, for example the 40^{th} centile, which will be less biased by peaks, but more susceptible to low-power noise samples.
Peak Detection of Secondary Carriers 316
[0094] Peaks are identified in y by identifying local maxima, where y_{i}>y_{i-1} and y_{i}>y_{i+1}. Performing just this analysis may pick up many spurious fluctuations in the noise floor; for this reason only peaks (i.e. values of y_{i}) that satisfy a certain predefined threshold (peak_metric_thresh) are identified as SCT candidates. Example values for peak_metric_thresh are provided below with reference to Figures 11 to 14, but may vary from around 0.85 to 3 (or greater than 3) depending on the situation.
[0095] In one embodiment this is where the distinct peaks in y are 10^{peak_metric_thresh} times higher than the (local) noise floor n. This is denoted the subset P of the set of all possible i values (bin indices) which satisfies Equation 10.
[0096] This gives rise to the value of peak metric (principally for diagnostic purposes) in Equation 11.
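The candidate selection can be sketched as below. The peak metric is assumed here to be log10 of the ratio of peak magnitude to local noise floor, which is consistent with the 10^{peak_metric_thresh} criterion but is not confirmed as the exact form of Equation 11; the function name is also an assumption.

```python
import numpy as np

def detect_peaks(y, n, peak_metric_thresh):
    """Return candidate bin indices P and an assumed peak metric log10(y/n) for each."""
    i = np.arange(1, len(y) - 1)                          # interior bins only
    is_local_max = (y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])
    above = y[1:-1] > (10.0 ** peak_metric_thresh) * n[1:-1]
    P = i[is_local_max & above]
    return P, np.log10(y[P] / n[P])

y = np.ones(20)
y[5] = 1.5        # local maximum, but only 1.5x the noise floor
y[12] = 200.0     # local maximum well above the noise floor
n = np.ones(20)   # local noise floor estimate
P, p = detect_peaks(y, n, 1.0)
```

With a threshold of 1.0 only the peak at bin 12 is retained, since the fluctuation at bin 5 is less than ten times the local noise floor.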
[0097] The threshold peak_metric_thresh is preferably a system-set parameter which may be calculated once upon calibration of the system; alternatively it may be dynamically calculated so as to result in a system with a specific false-positive rate. This may be useful if the variance in the noise floor (i.e. the accuracy of the noise floor estimate) changes over time so that the system becomes more prone to false negatives (if the variance decreases) or more prone to false positives (if the variance increases). In an average situation, a value for the threshold on the peak metric p(i) would be between 1 and 4, more preferably between 2 and 3, as a general illustration in these circumstances.
[0098] Another measure that may be used to reduce the number of candidate peaks is to specify that two peaks must be separated by a minimum frequency, otherwise they are treated as a single peak (i.e. the smaller peak is disregarded). For this purpose the threshold min_freq_sep is defined. In one example this is between 5Hz and 50Hz, preferably between 7Hz and 15Hz, and more preferably approximately 10Hz. Disregarding the smaller peak of a closely separated pair of peaks has negligible impact on the capability to detect genuine secondary tones when peak detections are sparsely separated. Such a feature allows strong peaks from e.g. 400Hz mains hum (which are highly conjugate-symmetric) to absorb their own sidelobe features, which are much weaker in power but more asymmetric and thus can cause false positives. The method identifies the weaker peaks from the set P which are within +/- min_freq_sep of the current secondary tone candidate being analysed, and marks them for deletion from set P by placing them in the set Q as follows (with commentary accompanying each step):
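One way the marking of weaker neighbouring peaks could be realised is sketched below. It assumes that min_freq_sep has already been converted from hertz into a whole number of bins (the hypothetical parameter `min_sep_bins`), and the function name is likewise an assumption.

```python
def prune_close_peaks(P, y, min_sep_bins):
    """Drop any candidate that has a stronger candidate within +/- min_sep_bins."""
    Q = set()                                   # candidates marked for deletion
    for i in P:                                 # current candidate being analysed
        for k in P:
            # k is weaker than i and lies within the minimum separation of i
            if k != i and abs(k - i) <= min_sep_bins and y[k] < y[i]:
                Q.add(k)
    return [i for i in P if i not in Q]         # P with the members of Q removed

y = [0.0] * 64
y[10], y[12], y[40] = 5.0, 50.0, 8.0            # weak sidelobe at 10 next to strong peak at 12
kept = prune_close_peaks([10, 12, 40], y, 3)
```

In this example the weak peak at bin 10 is absorbed by its stronger neighbour at bin 12, while the isolated peak at bin 40 is unaffected.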
[0099] Given secondary tone candidate indices in P, a non-negative real-valued asymmetry metric is computed using Equation 12. This is a measure of how asymmetric the power is between positive and negative frequencies (with respect to the downconverted primary carrier at zero frequency).
[0100] An asymmetry analysis favours 'legitimate' SCT events over other phase noise as SCT events have (by definition) a central frequency offset from that of the primary carrier and are thus asymmetric about the primary carrier (and, after downconversion, are asymmetric about zero hertz). There is a low probability of another tone precisely at the opposite frequency sign as this would correspond to a third SCT at a very specific frequency.
[0101] In contrast, "worst-case" primary signals with the deleterious properties of (1) high phase noise and (2) voice sidebands contaminated with interference tones (from e.g. mains electricity) generate secondary carrier candidates which are very symmetric in power (from the core definition of DSBAM conjugate symmetry).
[0102] Hence the asymmetry metric a(i) provides a useful way to exploit values that are pre-computed elsewhere in the process (i.e. bins from the downconverted X(ω) in vector x) to reject false positives from poor quality primary transmitters.
[0103] A threshold, asym_metric_thresh, for the value of a(i) is defined, where peaks not meeting this threshold are discarded as being too symmetrical, and thus unlikely to be SCTs. The asymmetry threshold provides a means to discount peaks which have a high residual power following subtraction because the symmetric peaks had a high power prior to subtraction, for example if the signal has a high level of noise (which is not perfectly symmetrical), or due to external effects such as mains hum. Figures 12, 13(b) and 13(c) below show scenarios where the asymmetry threshold is utilised to reduce the false-positive rate by limiting the number of events above the power threshold which would otherwise be deemed to be SCT events.
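For illustration only, one plausible non-negative asymmetry measure is the absolute log-ratio of the power at a candidate bin to the power at its mirror-image bin. This form is an assumption and is not taken from Equation 12; the function name and the small constant `eps` (which guards against empty bins) are also assumptions.

```python
import numpy as np

def asymmetry_metric(X, P, eps=1e-30):
    """Assumed form: |log10(power at +i / power at -i)| for each candidate bin i in P."""
    X = np.asarray(X)
    P = np.asarray(P)
    pos = np.abs(X[P]) ** 2                    # power at the positive-frequency bin
    neg = np.abs(X[(-P) % len(X)]) ** 2        # power at the mirrored negative-frequency bin
    return np.abs(np.log10((pos + eps) / (neg + eps)))

X = np.zeros(16, dtype=complex)
X[3], X[-3] = 10.0, 1.0     # strongly asymmetric pair (e.g. a secondary carrier)
X[5], X[-5] = 4.0, 4.0      # symmetric pair (e.g. a modulation tone of the primary)
a = asymmetry_metric(X, [3, 5])
```

Under this assumed form a symmetric pair scores zero, and larger values indicate greater imbalance between the two sides of the downconverted carrier.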
Feature Space Classification 320
[0104] Before a candidate peak from set P can be determined as an SCT event, a number of checks may be performed.
[0105] In order for an SCT to be present, there must be first the presence of a primary peak. This eliminates false positives when there is no transmission being received. A threshold primary_pk_thresh is defined where SCT analysis is only undertaken if the primary peak is above this threshold. This threshold is corrected by the amount of gain applied to the signal (AGC_gain) so as to measure the absolute power of the primary signal.
[0106] A threshold is also set for the maximum allowed width of the primary peak, primary_bw_thresh, where SCT analysis is only undertaken if the width of the primary carrier peak is less than this threshold. Because a primary transmission that occupies only part of the analysis window produces a wider mainlobe, this ensures that a certain lower bound is met on the mark-space ratio of the primary transmitter in the analysis window; for example, it may be desirable for the primary transmission to occupy at least 50% of the time window. This can prevent some anomalies due to rising edges entering or trailing edges leaving the analysis window. The width of the primary carrier peak is an output which is simple to generate and which provides some clear information about the temporal activity of the primary transmitter.
[0107] The following section describes logic which may implement the classification part of the method.
Inputs from Primary Carrier Frequency Estimation
[0108] The following additional inputs are used for detecting the presence of a primary signal (and have associated thresholds):
 - primary_pk: Magnitude value of primary peak
 - primary_bw: Primary peak 3dB width in bins (FWHM)
 - AGC_gain: [0109] The Automatic Gain Control magnitude gain applied elsewhere in the receiver.
Impact of AGC in the RX chain
[0110] Automatic Gain Control (AGC) will modulate the dynamic range of signals; therefore the primary_pk value is scaled by the amount of applied AGC and thus needs to be rescaled by the reciprocal of the AGC gain in order to have an absolute power in terms of dBm.
Example Decision Logic
[0111] The following decision logic is given as an example of how to generate a Boolean detection output.
if (primary_pk > (primary_pk_thresh / AGC_gain)) AND
   (primary_bw < primary_bw_thresh) AND
   there exists any a(i) > asym_metric_thresh; i ∈ P
then
   SCT_detect = TRUE
else
   SCT_detect = FALSE
end (if)
[0112] This analysis would give a Boolean 'yes' or 'no' to any peak that has passed the previous filtering stages so that it remains in candidate set P (e.g. that it is above the peak threshold and is not close in frequency to another peak).
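The example decision logic can be expressed as a small function. This is a sketch that assumes primary_bw_thresh is treated as a maximum allowed width, as stated in the description of that threshold, and that the asymmetry values a(i) for the candidates remaining in P are supplied as a list; the function and argument names are illustrative.

```python
def sct_detect(primary_pk, primary_bw, agc_gain, asym,
               primary_pk_thresh, primary_bw_thresh, asym_metric_thresh):
    """Boolean detection output from the primary-peak checks and the asymmetry values."""
    primary_present = primary_pk > primary_pk_thresh / agc_gain   # AGC-corrected level check
    primary_steady = primary_bw < primary_bw_thresh               # assumed: narrow mainlobe required
    asymmetric_peak = any(a > asym_metric_thresh for a in asym)   # any a(i), i in P
    return primary_present and primary_steady and asymmetric_peak

# Illustrative values only: strong, narrow primary with one asymmetric candidate.
detected = sct_detect(primary_pk=100.0, primary_bw=3, agc_gain=2.0, asym=[0.1, 0.9],
                      primary_pk_thresh=50.0, primary_bw_thresh=6, asym_metric_thresh=0.4)
```

An empty candidate list always yields a negative result, since no a(i) can exceed the threshold.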
[0113] The exact values of the parameters used in the analysis (e.g. a(i) and p(i)) can be used in a 'quadrant' analysis, wherein the combination of them in a feature space leads to a positive SCT determination.
[0114] A more generalised analysis is to fit a suitable likelihood density function of the form prob(peak_metric, asymmetry_metric) given SCT present ("H1") or SCT absent ("H0") and then computing a likelihood ratio to make the decision. The exact form of the likelihood function would depend on the application, as well as other factors such as desired false positive rate.
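A likelihood-ratio decision of this kind can be sketched with independent Gaussian densities for the two metrics under each hypothesis. The Gaussian form, the independence of the two metrics, and every numerical parameter below are hypothetical placeholders chosen for illustration; they are not derived from the simulations described in this specification.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

# Hypothetical (mean, std) of each metric with SCT absent (H0) and SCT present (H1).
H0 = {"peak": (1.5, 0.5), "asym": (0.1, 0.1)}
H1 = {"peak": (2.5, 0.5), "asym": (1.0, 0.4)}

def likelihood_ratio(peak_metric, asymmetry_metric):
    """prob(metrics | H1) / prob(metrics | H0); values above 1 favour 'SCT present'."""
    def lik(model):
        return (gaussian_pdf(peak_metric, *model["peak"])
                * gaussian_pdf(asymmetry_metric, *model["asym"]))
    return lik(H1) / lik(H0)

ratio_sct = likelihood_ratio(2.5, 1.0)     # point near the assumed H1 cluster
ratio_hum = likelihood_ratio(1.5, 0.1)     # point near the assumed H0 cluster
```

The decision threshold applied to the ratio would then be chosen according to the desired false positive rate.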
[0115] A more sophisticated algorithm than the decision logic described above, with some statistical modelling of the parameter density functions under different H1/H0 hypotheses (e.g. Gaussian Mixture Model, Fuzzy Clustering, Neural Network, or Support Vector Machines), would be capable of generating a 'soft' output with a confidence score, for example between zero and one.
[0116] Such a confidence level could be fed back to the end user for information and/or calibration purposes.
Simulation results
[0117] To illustrate the operation of the proposed method the following 'difficult' signal scenario comprising the presence of SCT is demonstrated, the scenario featuring:
 Primary DSBAM signal carrying voice audio and additive loud 400Hz mains hum
 Primary carrier frequency error
 Significant phase noise on the primary carrier
 Secondary DSBAM signal carrying voice
 Additive White Gaussian Noise (AWGN)
[0118] Figure 9 shows the spectrum X(ω) of the input (a) and output (b) signals of Frequency-Domain Downconversion. The primary carrier, voice sidebands, 400Hz mains tone sidebands and the secondary signal (creating an SCT-present scenario) are marked. After downconversion, the primary carrier is shifted to zero frequency, making the two voice and 400Hz mains sidebands symmetric about zero frequency and the carrier of the single secondary signal asymmetric about zero frequency, as discussed above.
[0119] Figure 10 illustrates the results of the DSBAM cancellation spectrum Y(ω) (a) and the noise floor estimated spectrum N(ω) (b) in comparison to the superposed positive and negative frequency halves of X(ω). DSBAM cancellation has achieved around 25dB attenuation of the 400Hz tone with negligible attenuation of the secondary carrier. This is because the 400Hz mains hum modulates the primary carrier and is thus conjugate-symmetric with respect to the primary carrier. This means this feature is largely attenuated by the proposed frequency-domain DSBAM cancellation stage. However, the secondary carrier is not conjugate-symmetric with respect to the primary carrier and is not significantly attenuated.
[0120] The noise floor estimate N(ω), shown in Figure 10(b), follows the underlying spectral envelope of Y(ω) without much bias from isolated peaks in Y(ω). Note that imperfect DSBAM cancellation of the (semi-coherent, poor quality) primary signal has led to some feed-through of primary voice spectrum which is followed by the noise floor estimate N(ω).
[0121] Figure 11 illustrates detected peaks (the peak_metric_thresh is set low to a value of 0.85 to allow false detections through for characterisation). Two peaks are correctly detected for respectively the 400Hz mains tone and secondary carrier. Though the peak metrics are of comparable magnitude (shown in Figure 11(a)), the asymmetry metrics are different (Figure 11(b)).
[0122] By extension, if a Monte Carlo run of 1000 simulations is performed with the same parameters, but randomised noise and frequency offsets, we obtain the informative scatterplot in Figure 12 of peak metric versus asymmetry metric. There are two distinct clusters caused by (1) highly power-symmetric detections due to poorly-cancelled 400Hz tones and (2) highly asymmetric tones due to a genuine secondary carrier. Feature space design as described above may be used to distinguish between these two different sets of candidate peaks even with 'difficult' signal parameters.
[0123] Various thresholds may be used to determine legitimate SCT events. Figure 14 illustrates the utility of such thresholds. For illustration, empirically setting the asym_metric_thresh≈0.4 and the peak_metric_thresh≈3.5 excludes most of the 400Hz false positives and still includes the majority of the genuine secondary signal cluster true positives as shown by Figure 12.
[0124] Three further scenarios are illustrated in Figure 13, as described in the table below:
Primary 400Hz Mains | SCT absent | SCT present
Absent | Figure 13(a): There are a few isolated points, well below the proposed thresholds. No false positives. | Figure 13(b): Most of the SCT true positives lie in the top right quadrant described by the two proposed thresholds.
Present | Figure 13(c): There is a cluster of points, exceeding the peak threshold, but not the asymmetry threshold. This shows the value of the proposed dual threshold idea. No false positives would be generated. | Figure 12: As discussed above.
[0125] Such 'feature space classifications' may be provided to a user for system analysis, or the SCT determination may be performed directly on the data with no graphical output.
'Mixed domain' SCT detection
[0126] An alternative embodiment where both the time series and the spectrum of the received signal are processed is described below. This embodiment may be preferable if processing power is limited, as processing large numbers of FFTs and their outputs can be processor intensive, especially if the FFT is significantly oversampled.
[0127] Figure 14 shows a high-level flow diagram for the 'mixed domain' method; many of the steps have corresponding steps in the frequency-domain SCT detection method. The detail relating to the corresponding steps described above applies to this alternative embodiment unless explicitly indicated otherwise.
[0128] The first steps are as described previously, wherein the incoming signal is decimated 300 and 'chopped up' into overlapping windows 302.
[0129] The method then branches, with one branch performing an FFT 500, estimating the frequency 502 and phase 504 of the primary transmission in the signal. The phase may be estimated by determining the phase of the samples used in determining the peak (e.g. the highest magnitude sample and the two either side). The highest magnitude samples would most likely be from the primary carrier so are most likely to have the primary phase. The primary carrier frequency and phase are used to downconvert 505 the time-domain windows by mixing each window with a complex sinusoid with the same frequency and phase-offset as the primary carrier transmission.
[0130] The signal can be illustrated by Figure 15 where the in-phase (I) and quadrature (Q) components of a frequency downconverted signal (x'(t)) are plotted. If there were only a perfect, phase-noise-free primary carrier transmission, this vector would lie at constant θ with its magnitude (i.e. length) changing with time. If there are any additive signals (such as SCTs or phase noise), the vector's angle would also change.
[0131] In order to measure this part of the signal, the signal is phase-rotated by θ and the part of the vector moving along the Q axis is measured. This step corresponds to the 'Quadrature split' 506 step in Figure 14. This process is mathematically linear and so information is preserved and no artificial intermodulation effects are propagated through to the following processing steps.
[0132] A real-only input FFT 508 is performed on the Q component of the phase-rotated signal. This provides a spectrum from which peaks are detected 510 corresponding to the out-of-phase components of the original signal.
[0133] The analysis of these peaks so as to determine the presence of an SCT event then follows in the same way as described above.
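The mixed-domain branch can be sketched as below, assuming the primary carrier frequency (expressed here in FFT bins of the analysis window) and phase have already been estimated. The function name and the test signal are illustrative assumptions.

```python
import numpy as np

def quadrature_spectrum(x, freq_bins, phase):
    """Mix the window to 0 Hz and zero phase, then take a real-input FFT of the Q component."""
    n = np.arange(len(x))
    mixed = x * np.exp(-1j * (2 * np.pi * freq_bins * n / len(x) + phase))
    return np.abs(np.fft.rfft(mixed.imag))     # only out-of-phase components remain

# Primary: DSB-AM carrier at 40 bins with phase 0.5 rad and real modulation at 3 bins.
# Secondary: weak carrier at 52 bins, i.e. offset by 12 bins from the primary.
n_win = 256
n = np.arange(n_win)
primary = (1 + 0.5 * np.cos(2 * np.pi * 3 * n / n_win)) * np.exp(1j * (2 * np.pi * 40 * n / n_win + 0.5))
secondary = 0.05 * np.exp(2j * np.pi * 52 * n / n_win)
spec = quadrature_spectrum(primary + secondary, 40, 0.5)
```

In this idealised case the primary lies entirely on the I axis after mixing, so the Q spectrum contains only the secondary carrier at its 12-bin offset.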
Alternatives and modifications
[0134] The above specification refers primarily to the situation where two simultaneous transmissions are present, but the same system would be able to alert the user to any number of simultaneous transmissions. The specification has been limited to the former scenario as this is statistically far more likely.
[0135] Furthermore, the specification above is primarily concerned with simultaneous voice transmissions received by an Air Traffic Controller, but it will be appreciated that the signal does not necessarily have to be voice transmissions. For example, it may be digital information encoded into an AM radio transmission.
[0136] In the above description, the conjugate of the negative frequency sideband is subtracted from the related positive frequency sideband of the sum-signal so as to cancel out the primary carrier. The opposite operation is equally possible, whereby the conjugate of the positive frequency sideband is subtracted from the related negative frequency sideband of the sum-signal.
[0137] Various ranges and/or values are provided in this description, often with reference to specific embodiments, notably being derived from values such as buffer window size T, sampling rate f_{s} and audio/signal bandwidth. Those skilled in the art would understand that for different applications or operating conditions, the system and method may operate more effectively with these values modified.
[0138] It will be understood that the present invention has been described above purely by way of example, and modifications of detail can be made within the scope of the invention.
[0139] Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.