In Part 2 of this series on Radar Basics, the use of Doppler processing was discussed as a key method to discriminate in both distance and velocity. Discrimination in direction (or angle of arrival at the antennas) is provided by aiming the radar signal, using either traditional parabolic antennas or more advanced electronically steered array antennas.
Under certain conditions, other methods are required. For example, jammers are sometimes used to prevent detection by radar. Jammers often emit a powerful signal over the entire frequency range of the radar. In other cases, a target such as a person or vehicle moving at walking speed has such a slow motion that Doppler processing is unable to detect it against stationary background clutter. A technique called space time adaptive processing (STAP) can be used to find targets that could otherwise not be detected.
Because the jammer transmits continuously, its energy is present in
all the range bins. And, as shown in Figure 1, the jammer cuts across
all the Doppler frequency bins due to its wideband, noise-like nature.
It does, however, appear at a distinct angle of arrival. Figure 1 also
depicts the degree of ground clutter in a side-looking airborne radar,
which is due to the Doppler of the ground relative to the aircraft
motion. A slow moving target return can easily blend into the
background clutter.
STAP radar processing combines temporal and spatial filtering that can
be used to both null jammers and detect slow moving targets. It requires
very high numerical processing rates as well as low latency processing,
with dynamic range requirements that generally require floating-point
processing.
Figure 1. Clutter and jammer effects in Doppler space
STAP processing requires use of an array antenna. However, in contrast to the active electronically scanned array (AESA), for STAP the antenna receive pattern is not electronically steered as with traditional beamforming
arrays. In this case, the array antenna provides the raw data to the STAP radar processor, while the antenna processor does not perform the beam steering, phase rotation or combining steps, as indicated in Figure 2. Also, while the AESA is depicted in one dimension, this array can be – and often is – two dimensional in both elevation (up and down) and azimuth (side to side). In this way, the antenna receive pattern can be steered or aimed in both elevation and azimuth.
Figure 2. Array antenna necessary for STAP radar
Radar processing can occur over N consecutive pulses, as long as they lie within the coherent processing interval (CPI), considered “slow” time. The L radar samples collected during the pulse repetition frequency (PRF) interval are binned, which corresponds to the range. The PRF interval is referred to as “fast” time. Doppler processing occurs across the N samples in each of the L range bins as shown in Figure 3.
Figure 3. Doppler radar processing diagram
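As a minimal sketch of the Doppler step, assume it takes the common form of a discrete Fourier transform across the N slow-time samples of each range bin. The function name `doppler_dft` and the toy data are illustrative and not taken from this series.

```python
import cmath

def doppler_dft(pulses):
    """DFT across the N slow-time samples of a single range bin."""
    n_pulses = len(pulses)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / n_pulses)
            for n, x in enumerate(pulses))
        for k in range(n_pulses)
    ]

# Toy data (assumed): L = 2 range bins, N = 8 pulses.
# Bin 0 holds a moving return at normalized Doppler 2/8,
# bin 1 holds a stationary (zero Doppler) return.
N = 8
range_bins = [
    [cmath.exp(2j * cmath.pi * 2 * n / N) for n in range(N)],
    [1.0 + 0j for _ in range(N)],
]

spectra = [doppler_dft(row) for row in range_bins]
# Index of the strongest Doppler bin in each range bin.
peak_bins = [max(range(N), key=lambda k: abs(s[k])) for s in spectra]
print(peak_bins)
```

The moving return peaks in Doppler bin 2 and the stationary return in bin 0, which is how a sufficiently fast target separates from stationary clutter.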
Doppler processing operates on an array of data for detection processing. STAP radar, on the other hand, operates on a cube of data as illustrated in Figure 4. The extra dimension, M, comes from the M distinct inputs from the array antenna (for both elevation and azimuth). This will produce a radar data cube of dimensions M (number of array antenna inputs) by L (number of range bins in fast time) by N (number of pulses in the CPI in slow time). Doppler processing occurred over the data slice across the L and N dimensions. In STAP, we will see that slices of data across the M and N dimensions, one slice per range bin, are processed.
Figure 4. Radar Datacube used in STAP Radar
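To make the slicing concrete, the following sketch builds a small data cube and extracts the space-time vector for one range bin. The storage layout `cube[m][r][n]`, the element ordering within the vector, and the toy sizes are assumptions chosen for illustration only.

```python
# Assumed layout: cube[m][r][n] is the complex sample from antenna input m,
# range bin r, pulse n. Toy sizes, far smaller than a real radar.
M, L, N = 3, 4, 2

# Encode the indices in the sample value so the ordering is easy to inspect.
cube = [[[complex(100 * m + 10 * r + n, 0) for n in range(N)]
         for r in range(L)]
        for m in range(M)]

def snapshot(cube, k):
    """Return the length N*M space-time vector y for range bin k.

    Assumed ordering: pulse index n varies slowest, antenna index m fastest.
    """
    n_inputs = len(cube)
    n_pulses = len(cube[0][0])
    return [cube[m][k][n] for n in range(n_pulses) for m in range(n_inputs)]

y = snapshot(cube, k=1)
print(len(y))  # N * M entries
print(y)
```

Each range bin therefore yields one vector with N · M entries, which is the unit of data the rest of the STAP algorithm works on.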
Before discussing the STAP algorithm, it may help to provide some context. STAP is basically an adaptive filter, which can filter over the spatial and temporal (or time) domain. The goal of STAP is to take a hypothesis that there is a target at a given location and velocity, create a filter that has high gain for that specific location and velocity, and apply proportional attenuation to all other signals (clutter, jammers and any other unwanted returns). There can be many suspected targets to generate location and velocity hypotheses for, and these are all normally processed together in real time. This produces very high processing and throughput requirements on the STAP processor.
The volume of data coming from the receive antenna is very high, thus this data must be processed in real time, rather than be stored for later off-line processing. Further, as this is an adaptive filter system, the data is processed immediately as part of a feedback loop to generate the optimal filters used to detect the suspected targets.
This brings up the issue of how the suspected targets are identified for subsequent STAP processing. This can come from weak detections found in Doppler processing, from other IR or visual sensors, from intelligence data, or from many other sources. This issue is beyond the scope of these discussions on how STAP processing works. But as will be shown, STAP has the capability to pull targets that are below the clutter into a range that can be reliably detected. A good analogy is a magnifying glass. Conventional methods are used to view the big picture, but if something of interest is noted, STAP can be used to act as a magnifying glass to zoom into a specific area and see things that would be otherwise undetectable.
For each suspected target, a target steering vector must be computed. This target steering vector is formed by the cross product (strictly, the Kronecker product) of the vector representing the Doppler frequency and the vector representing the antenna angle of elevation and azimuth. For simplicity, we will assume only azimuth angles are used.
The Doppler frequency offset vector is a complex phase rotation:
\[ F_d(n) = e^{-j 2\pi \cdot n \cdot F_{dopp}} \]
for n = 0..N-1
The spatial angle vector is also a phase rotation vector:
\[ A_\theta(m) = e^{-j 2\pi \cdot d \cdot m \cdot \sin(\theta) / \lambda} \]
for m = 0..M-1, for given angle of arrival θ, wavelength λ and antenna element spacing d
The target steering vector $t$ is the Kronecker product of the vectors $F_d$ and $A_\theta$ as shown in Figure 5, and $t$ is a vector of length N · M. This must be computed for every target of interest.
Figure 5. Target steering vector t = f (Angle, Doppler)
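The steering vector construction can be sketched as follows, using the target parameters quoted later in this article (45 degrees azimuth, normalized Doppler of 0.11) with M = 12 and N = 16. The function names are illustrative, and the half-wavelength element spacing (d/λ = 0.5) is an assumed value that the article does not specify.

```python
import cmath
import math

def doppler_vector(n_pulses, f_dopp):
    """F_d(n) = exp(-j*2*pi*n*f_dopp), with f_dopp the normalized Doppler."""
    return [cmath.exp(-2j * math.pi * n * f_dopp) for n in range(n_pulses)]

def angle_vector(n_inputs, theta_deg, d_over_lambda=0.5):
    """A_theta(m) = exp(-j*2*pi*m*(d/lambda)*sin(theta)).

    d_over_lambda = 0.5 (half-wavelength spacing) is an assumption.
    """
    s = math.sin(math.radians(theta_deg))
    return [cmath.exp(-2j * math.pi * m * d_over_lambda * s)
            for m in range(n_inputs)]

def steering_vector(f_d, a_theta):
    """Kronecker product of the two vectors: t[n*M + m] = F_d[n] * A_theta[m]."""
    return [f * a for f in f_d for a in a_theta]

f_d = doppler_vector(16, 0.11)
a_theta = angle_vector(12, 45.0)
t = steering_vector(f_d, a_theta)
print(len(t))  # 192 = N * M
```

Every entry of `t` has unit magnitude; the vector only encodes the phase progression expected across pulses and antenna inputs for the hypothesized target.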
Next, the interference covariance matrix $S_I$ must be estimated. One method is to compute and average this for many range bins surrounding the range of interest. To compute this, a column vector $y$ is built from a slice of the radar data cube at a given range bin k. The covariance matrix by definition will be the vector outer product:
\[ S_I = y^* \cdot y^T \]
Here, the vector $y$ is conjugated and then multiplied by its transpose. As $y$ is of length N · M, the covariance matrix $S_I$ is of size [(N · M) x (N · M)]. Remember that all the data and computations are being performed with complex numbers, representing both magnitude and phase. An important characteristic of $S_I$ is that it is Hermitian, which means that $S_I = S_I^{*T}$, or equal to its conjugate transpose. This symmetry is a property of covariance matrices.
Figure 6. Computing the interference covariance matrix
The covariance matrix represents the degree of correlation across both antenna array inputs and over the pulses comprising the CPI (coherent processing interval). The intention here is to characterize undesired signals and create an optimal filter to remove them, thereby facilitating detection of the target. The undesired signals can include noise, clutter and jammers.
\[ S_I = S_{noise} + S_{clutter} + S_{jammer} \]
The covariance matrix is very difficult to calculate or model, therefore it is estimated. Since the covariance matrix will be used to compute the optimal filter, it should not contain the target data. Therefore, it is not computed using the range data right where the target is expected to be located. Rather, it uses an average of the covariance matrices at many range bins surrounding, but not at, the target location range. This average is an element-by-element average for each entry in the covariance matrix, across these ranges. This also means that many covariance matrices need to be computed from the radar data cube. The assumption is that the clutter and other unwanted signals are highly correlated to that at the target range, if the difference in range is reasonably small.
Figure 7. Estimating the covariance matrix using neighboring range bin data
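A minimal sketch of this estimate is shown below. It forms the per-range-bin outer product exactly as defined above (the conjugated vector times its transpose) and averages element by element over every range bin except the target's. The function names and the toy snapshots are illustrative assumptions.

```python
def outer_conj(y):
    """Single-bin covariance: conjugate of y times its transpose."""
    return [[yi.conjugate() * yj for yj in y] for yi in y]

def estimate_covariance(snapshots, target_bin):
    """Element-by-element average of the per-bin covariance matrices,
    skipping the range bin where the target is expected."""
    size = len(snapshots[0])
    acc = [[0j] * size for _ in range(size)]
    used = 0
    for k, y in enumerate(snapshots):
        if k == target_bin:
            continue  # keep target energy out of the interference estimate
        s_k = outer_conj(y)
        for i in range(size):
            for j in range(size):
                acc[i][j] += s_k[i][j]
        used += 1
    return [[value / used for value in row] for row in acc]

# Toy data (assumed): 5 range bins, each with a length-3 space-time vector.
snapshots = [[complex(k + 1, i) for i in range(3)] for k in range(5)]
s_i = estimate_covariance(snapshots, target_bin=2)

# The estimate is Hermitian: s_i[i][j] equals the conjugate of s_i[j][i].
print(all(abs(s_i[i][j] - s_i[j][i].conjugate()) < 1e-12
          for i in range(3) for j in range(3)))
```

In practice the number of range bins averaged needs to be large relative to the matrix dimension for the estimate to be well conditioned; the processing example later in this article uses 200 range bins for a 192 x 192 matrix.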
The estimated covariance matrix can be used to build the optimal filter. As those of you with experience with adaptive filters may already guess, this is going to involve inversion of the covariance matrix, which is very computationally expensive, and generally requires the dynamic range of floating-point numerical representation. Recall that this matrix is of size [(N · M) x (N · M)] and can be quite large.
Fortunately, the matrix inversion result can be used with multiple targets at the same range. The steps are as follows:
Solve for the vector $u$:
\[ u = S_I^{-1} \cdot t^* \quad \text{or} \quad S_I \cdot u = t^* \]
One method for solving this system is known as QR Decomposition, which we will use here. Another popular method is the Choleski Decomposition.
Perform the substitution $S_I = Q \cdot R$, or product of two matrices. $Q$ and $R$ can be computed from $S_I$ using one of several methods, such as Gram-Schmidt, Householder transformation, or Givens rotation. The nature of the decomposition into two matrices is that $R$ will turn out to be an upper triangular matrix and $Q$ will be an orthonormal matrix, or a matrix composed of orthogonal vectors of unity length. Orthonormal matrices have the key property of:
\[ Q \cdot Q^H = I \quad \text{or} \quad Q^{-1} = Q^H \]
So it is trivial to invert $Q$. Please refer to a text on linear algebra for more detail on QR Decomposition.
\[ Q \cdot R \cdot u = t^* \]
Now multiply both sides by $Q^H$:
\[ R \cdot u = Q^H \cdot t^* \]
Since $R$ is an upper triangular matrix, $u$ can be solved by a process known as “back substitution”. This is started with the bottom row, which has one non-zero element, by solving for the bottom element in $u$. This result can be back-substituted into the second to bottom row, which has two non-zero elements in the $R$ matrix, and the second to bottom element of $u$ solved for. This continues until the vector $u$ is completely solved. Notice that since the steering vector $t$ is unique for each target, the back substitution computation must be performed for each steering vector.
Then solve for the actual weighting vector $h$:
\[ h = \frac{u}{t^H \cdot u^*} \]
where the dot product $(t^H \cdot u^*)$ is a weighting factor (this is a complex scalar, not a vector).
Finally, solve for the final detection result $z$ by the dot product of $h$ and the vector $y$ from the range bin of interest:
\[ z = h^T \cdot y \]
$z$ is a complex scalar, which is then fed into the detection threshold process.
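The steps above can be sketched end to end in a few dozen lines. This is an illustration under stated assumptions rather than a production implementation: it uses modified Gram-Schmidt for the QR step (one of the methods named above), a toy 4-element problem, and a hypothetical interference matrix made of unit noise plus one strong jammer. All function and variable names are illustrative.

```python
import cmath

def qr_gram_schmidt(a):
    """QR by modified Gram-Schmidt. Returns Q as a list of columns, and R."""
    n = len(a)
    q_cols = []
    r = [[0j] * n for _ in range(n)]
    for j in range(n):
        v = [a[i][j] for i in range(n)]  # column j of the input matrix
        for i, q in enumerate(q_cols):
            r[i][j] = sum(qk.conjugate() * vk for qk, vk in zip(q, v))
            v = [vk - r[i][j] * qk for qk, vk in zip(q, v)]
        norm = sum(abs(vk) ** 2 for vk in v) ** 0.5
        r[j][j] = complex(norm)
        q_cols.append([vk / norm for vk in v])
    return q_cols, r

def back_substitute(r, b):
    """Solve R u = b for upper triangular R, starting from the bottom row."""
    n = len(b)
    u = [0j] * n
    for i in range(n - 1, -1, -1):
        partial = sum(r[i][j] * u[j] for j in range(i + 1, n))
        u[i] = (b[i] - partial) / r[i][i]
    return u

def stap_weights(s_i, t):
    """h = u / (t^H . u*), where u solves S_I u = t*."""
    q_cols, r = qr_gram_schmidt(s_i)
    t_conj = [x.conjugate() for x in t]
    # Right-hand side Q^H . t*: each entry is one column of Q, conjugated, dotted with t*.
    rhs = [sum(qk.conjugate() * bk for qk, bk in zip(q, t_conj)) for q in q_cols]
    u = back_substitute(r, rhs)
    scale = sum(tk.conjugate() * uk.conjugate() for tk, uk in zip(t, u))
    return [uk / scale for uk in u]

def detect(h, y):
    """z = h^T . y (plain dot product, no extra conjugation)."""
    return sum(hk * yk for hk, yk in zip(h, y))

# Toy scenario (assumed values): target and jammer differ only in phase slope.
SIZE = 4
t = [cmath.exp(-2j * cmath.pi * 0.1 * m) for m in range(SIZE)]
jam = [cmath.exp(-2j * cmath.pi * 0.3 * m) for m in range(SIZE)]
# Interference = unit-power noise + jammer 1000x stronger, built as y* . y^T.
s_i = [[1000 * jam[i].conjugate() * jam[k] + (1 if i == k else 0)
        for k in range(SIZE)] for i in range(SIZE)]

h = stap_weights(s_i, t)
z_target = detect(h, t)
z_jammer = detect(h, jam)
print(abs(z_target), abs(z_jammer))
```

A return that matches the steering vector passes with a gain very close to one, while the jammer is strongly suppressed. Note that the QR factors depend only on the covariance matrix, so with several targets at the same range only the right-hand side, back substitution and scaling need to be repeated per steering vector.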
After this math, it is worthwhile to try to get an intuitive understanding of what is going on. Shown in Figure 8 is a plot of $S_I^{-1}$, the inverted covariance matrix. In this case, there is a jammer at 60 degrees azimuth angle, and a target at 45 degrees, 1723 meters range and normalized Doppler of 0.11.
Figure 8. Logarithmic plot of inverted covariance matrix
Notice the very small values, on the order of -80 dB, present at 60 degrees. The STAP filtering process is detecting the correlation associated with the jammer location at 60 degrees. By inverting the covariance matrix, this jammer will be severely attenuated. Notice also the diagonal yellow clutter line. This is a side-looking airborne radar, so the ground clutter has positive Doppler at forward-looking angles and negative Doppler at backward-looking angles. This ground clutter is being attenuated at about -30 dB, proportionally less severely than the more powerful jammer signal.
The target is not present in this plot. Recall that the estimated covariance matrix is determined from range bins surrounding, but not at, the expected range of the target. In any case, the target would not likely be visible. However, using STAP processing with the target steering vector can make a dramatic difference, as shown in Figure 9. The top plot shows the high return of the peak ground clutter at a range of 1000 m with a magnitude of ~0.01, and a noise floor of about 0.0005.
With STAP processing, the noise floor is pushed down to ~$0.1 \times 10^{-6}$ and the target signal at about $1.5 \times 10^{-6}$ is now easily detected. It is also clear that floating-point numerical representation and processing will be needed for adequate performance of the STAP algorithm.
Figure 9. STAP processing gain
Next, the processing requirements should be considered, using the following assumptions:
PRF = 1000 Hz
12 antenna array inputs ($A_\theta$ vectors are of length 12, or M = 12)
16-pulse processing (Doppler vectors are of length 16, or N = 16)
Minimum required size of $S_I$ is [192 x 192], in complex single-precision format
Assume 32 likely targets to process (32 target steering vectors)
Use of 200 range bins to estimate SI
Table 1. STAP GFLOPs estimate
In fact, this is a very conservative scenario. The PRF is rather low and the number of antenna array inputs is very small. Should the number of antenna array inputs increase from 12 to 48, a factor of four, the processing load of the matrix processing, in particular QR Decomposition, goes up by the third power, or 64 times. This would require over 3 TeraFLOPs of real-time floating-point processing power. Because of this, the limitations on STAP are clearly the processing capabilities of the radar system.
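The scaling argument can be checked with a few lines of arithmetic. The sketch assumes only what is stated above: the covariance matrix dimension is N · M, and the decomposition cost grows with the cube of that dimension. The helper name is illustrative.

```python
def covariance_dimension(n_inputs, n_pulses):
    """Side length of the square covariance matrix: N * M."""
    return n_inputs * n_pulses

base = covariance_dimension(12, 16)    # scenario assumed in the list above
larger = covariance_dimension(48, 16)  # four times as many antenna inputs

# Cost of the matrix decomposition scales with the cube of the dimension.
growth = (larger / base) ** 3
print(base, larger, growth)  # 192 768 64.0
```

Quadrupling the antenna inputs quadruples the matrix dimension, and the cubic dependence turns that into the 64-fold increase quoted above.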
The theory of STAP has been known for a long time, but the processing requirements have made it impractical until fairly recently. Many radar applications benefiting from STAP are airborne and often have stringent size, weight, and power (SWaP) constraints. Very few processing architectures can meet the throughput requirements of STAP, while even fewer can simultaneously meet the SWaP constraints.
One alternative is to use FPGAs. Several FPGA vendors have long offered floating-point operator libraries such as multiply and add/subtract that have similar areas, performance levels, and latencies. Combining multiple arithmetic operators into higher-level functions, such as a vector dot product operator, is inefficient and suffers from significantly reduced Fmax. Typical latencies for both multipliers and adders are in the range of 10 clock cycles; a dot product operator with a few tens of inputs may therefore exceed a latency of 100 cycles. Routing congestion and datapath latencies have been critical restrictions on floating-point implementations on FPGA architectures. Parallelism is a key advantage of a hardware solution like FPGAs, but it is often not applied to floating-point signal processing because the long latencies make the data dependencies in algorithms such as matrix decomposition difficult to manage. Therefore, the resultant systems offered poor performance levels, uncompetitive with other platforms such as GPU or multi-core CPU architectures.
Altera has developed a floating-point design flow that can overcome these issues. Rather than building a datapath from individual operators, the entire datapath is considered as a single function with inter-operator redundancy factored out. Mantissa representation can be converted to hardware-friendly twos complement and mantissa widths extended to reduce the frequency of normalizations. Elementary functions can be implemented as much as possible using hard multipliers, which offer guaranteed internal routing and timing, as well as low power and latency. New techniques can be applied for matrix decompositions, with the algorithms restructured to remove most of the data dependencies, so that parallel – and therefore high latency – datapaths can be used for these computations.
This approach is known as “Fused Datapath”, and when combined with Altera’s new 28nm Variable Precision DSP block architecture, offers extremely high data processing capabilities, in excess of one Teraflop on a single FPGA die.
In addition, Fused Datapath methodology is actually more accurate than computing on a microprocessor, which uses the standard IEEE754 floating point conventions. This has been measured by analyzing single precision Fused Datapath and single precision computations on a desktop PC and comparing both to a double precision result standard.
Moreover, this toolflow has been optimized specifically for radar applications, with support for vector operators and common linear algebra constructs, as shown in Figure 10. A useful set of floating-point trigonometric and other math library functions is also integrated. Typical Fmax using Fused Datapath on dense floating-point designs in large Stratix FPGAs is between 200 and 250 MHz.
Figure 10. Fused datapath floating-point library support
STAP radar designs have been built with this new floating-point toolflow. The performance of a key function, the matrix inversion, is shown in Table 2. In this case, the Choleski Decomposition performance is shown rather than QR Decomposition, as it turns out to be more efficient for hardware implementations (either algorithm can be used in most STAP applications).
Table 2. FPGA matrix inversion throughput metrics
The designer has two methods to trade throughput against FPGA hardware resources. As part of the toolflow support for vector operations, the tool allows the user to parameterize the vector size for various processing steps. A large vector size can process more data simultaneously, at the expense of more hardware. Smaller vector sizes require more looping to complete the calculations, but use fewer device resources and have reduced throughput.
Performance on 40nm Stratix IV and 28nm Stratix V FPGAs is similar; however, the much higher density and architectural improvements of Stratix V enable more Choleski cores to be built within the same chip, thus allowing for a proportional increase in aggregate throughput through parallelism (see Table 2). All of these metrics were obtained with Fused Datapath technology using single-precision floating point.
This new capability provides superior computational capability for radar system designers. One attractive architecture for STAP or other radar backend processing is to partition the high GFLOPs and predictable processing in an FPGA (for example: covariance matrix computation and inversion) while maintaining the more dynamic and lower GFLOPs on a processor (steering vector generation, detection processing). This can also help preserve the code base investment in legacy processor architectures.
Also see Part 1, Part 2, Part 3, and Part 5 of this five-part mini-series on “Radar Basics”.
Fundamentals of Radar Processing by Mark Richards. Readers are encouraged to refer to this text for further information on STAP.
About the author
As senior DSP technical marketing manager, Michael Parker is responsible for Altera’s DSP-related IP, and is also involved in optimizing FPGA architecture planning for DSP applications.
Mr. Parker joined Altera in January 2007, and has over 20 years of DSP wireless engineering design experience with Alvarion, Soma Networks, TCSI, Stanford Telecom and several startup companies.