Hybrid Electric Vehicles
Formation flying synthetic aperture radar (FF-SAR) systems, an important development direction of
multichannel SAR, can achieve high-resolution wide-swath imaging. Coherently combining data from
satellite receivers strains traditional real-time processing systems based on individual satellites.
Characteristics of the real-time on-orbit processing platform, such as its power, must be properly balanced
against constrained memory and parallel computational resources. This article proposes a distributed SAR real-time
imaging method based on the embedded graphics processing units (GPUs). The parallel computing method
of the chirp scaling algorithm is designed based on the parallel programming model of compute unified
device architecture, and the optimization methods of memory and performance are proposed for the
hardware architecture of embedded GPUs. In particular, the unified memory management method is used
to avoid data copying and communication delays between the CPU and GPU. A hardware verification
system for distributed SAR real-time imaging processing based on multiple embedded GPUs is constructed.
The proposed algorithm takes 5.86 s to image single-precision floating-point complex data of size
8192 × 8192 on a single Jetson Nano platform. The actual power consumption is less than 5 W, and
the performance-to-power ratio is greater than 1.7%. The experimental results show that the real-time
processing method based on embedded GPUs proposed in this article achieves high performance at low
power consumption.
Keywords: Chirp scaling (CS) algorithm, distributed architecture, embedded graphics processing unit
(GPU), on-orbit real-time processing, synthetic aperture radar (SAR).
LIST OF CONTENTS
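1 Introduction
2 Literature Survey
3 Distributed Real-Time Image Processing of Formation Flying SAR Based on Embedded GPUs
4 Results and Discussion
5 Conclusion
6 Future Scope
7 References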
LIST OF FIGURES
LIST OF TABLES
Chapter 1
Introduction
1.1 Overview
Spaceborne synthetic aperture radar (SAR) systems provide high-resolution, all-time, and all-
weather ground observation capabilities. Therefore, they are widely used in important fields, such as disaster
monitoring, resource exploration, and environmental protection.
Formation flying synthetic aperture radar (FF-SAR) is a new operational mode used to achieve high-
resolution wide-swath SAR images. FF-SAR usually relies on a set of very compact, lightweight
satellite platforms, which lowers the overall cost, makes faulty satellites easier to replace, and adapts better
to future fast and flexible launch missions. The TanDEM-X mission and the CanX-4&5 formation mission
have successfully demonstrated important capabilities in this area.
A CubeSat train has been proposed for high-resolution radar detection and imaging missions in
Antarctica. Its formation consists of 50 CubeSats, and the coherent combination of radar echoes collected
by all platforms is expected to guarantee high cross-orbital resolution, demonstrating the significant
potential of FF-SAR for future applications. It is foreseeable that, in future FF-SAR missions, the data
volume to be processed will greatly increase. The processing power of each satellite is limited by
satellite size and power consumption, so on-orbit real-time imaging processing systems
based on single satellites face considerable pressure. Therefore, it is necessary to explore an on-orbit real-
time imaging processing system suitable for the FF-SAR mode.
A multi-satellite distributed data processing system has also been proposed, which effectively reduces the
processing pressure on a single satellite by assigning the computational tasks of the SAR imaging
algorithm to multiple satellites. The system takes a field programmable gate array (FPGA)
as the core processing unit. FPGAs are attractive for on-orbit real-time processing systems because they can
meet the requirements for high performance and low power consumption, and flexibility is their main
advantage. However, the pursuit of computational accuracy requires floating-point operations,
which consume a large share of the available computational resources.
As a high-performance platform, graphics processing units (GPUs) are often used to accelerate the
processing of SAR imaging.
The advantage of GPUs is parallel computing capability, but the challenge is that GPUs are limited
by the ability to interact with data. The coordination between data throughput and computation needs to be
optimized. The traditional GPUs are not feasible as on-orbit real-time processing systems because their size
and power consumption cannot meet the requirements. However, the emergence of embedded GPUs has
provided a new opportunity for many real-time data processing tasks in recent years.
Embedded GPUs have the advantages of high integration, low-power consumption, and high
performance. Benefiting from the compute unified device architecture (CUDA) programming method, the
development cycle is short. Several studies have implemented SAR imaging on
embedded GPUs. In one study, two SAR processing algorithms were implemented and tested on the Jetson TX1
platform, showing that both algorithms run faster on the Jetson TX1 than on a CPU.
However, the overall optimization efficiency is limited because the open-source library ArrayFire was used
for parallel computation. Another study detailed SAR imaging on the Jetson TK1, but its
results suggest that the transfer of redundant data between the CPU and GPU consumes considerable
processing time. In fact, this data transfer between the CPU and GPU could have been avoided on the
embedded GPU. Notably, the feasibility of embedded GPU on-orbit operation has been verified.
Related studies have shown that embedded GPUs can provide considerable advantages for
computationally intensive data processing in low earth orbit applications. Therefore, embedded GPUs have
excellent application prospects in short-term tasks. This article proposes a distributed SAR imaging method
based on embedded GPUs for the FF-SAR system. The proposed method is scalable to different embedded
GPU platforms, and the number of nodes is also flexible. The processing of the chirp scaling (CS)
algorithm has been rescheduled to suit distributed SAR imaging systems. To maximize the
processing performance of the embedded GPU, corresponding optimization methods are proposed for
CUDA parallel computing.
Finally, a distributed simulation system based on embedded GPUs is constructed, and its processing
performance is verified using the raw data of Gaofen-3 (GF-3). The rest of this article is organized as
follows. Chapter 2 surveys related literature. Chapter 3 introduces the CS imaging algorithm and the design
and optimization of the distributed SAR real-time imaging system. Chapter 4 gives the experimental
results and discussion.
Chapter 2
Literature Survey
2.2.1 Real-time processing of spaceborne SAR data with nonlinear trajectory based on variable PRF
Base Paper Methodology: This paper proposes a real-time processing approach for spaceborne synthetic
aperture radar (SAR) data with nonlinear trajectories and variable pulse repetition frequencies (PRFs). The
methodology involves:
1) Utilizing a modified range-Doppler algorithm that accounts for the varying PRF along the azimuth
dimension.
2) Incorporating a trajectory estimator to estimate the nonlinear sensor trajectory based on the range-
compressed data.
3) Performing azimuth compression with the estimated nonlinear trajectory, enabling accurate focusing of
the SAR data.
4) Implementing the processing on a real-time computing platform, demonstrating the feasibility of on-
board processing for spaceborne SAR systems with nonlinear trajectories and variable PRFs.
The key aspects are handling variable PRFs, estimating nonlinear trajectories from the data itself, and
integrating these into a real-time processing chain for on-board SAR data focusing.
2.2.2 Detecting ships in the New Zealand exclusive economic zone: Requirements for a dedicated
small-sat SAR mission
1) Analysing the maritime traffic patterns and ship density in the New Zealand EEZ to determine the
required SAR imaging capabilities.
2) Evaluating the performance of different SAR modes (Stripmap, ScanSAR, and TOPSAR) in terms of
resolution, swath width, and coverage rate for ship detection.
3) Assessing the feasibility of using a small satellite platform with a compact SAR payload for this
mission.
4) Determining the optimal orbit parameters, such as altitude and inclination, to achieve the desired
coverage and revisit times.
5) Investigating the use of advanced signal processing techniques, like ship detection algorithms and
constant false alarm rate (CFAR) detectors, to improve ship detection accuracy.
The primary focus is on defining the technical requirements, including SAR modes, satellite platform, and
orbit design, to enable a dedicated small-sat SAR mission for maritime surveillance and ship detection
within the New Zealand EEZ.
2.2.3 Assessments of ocean wind retrieval schemes used for Chinese Gaofen-3 synthetic aperture
radar co-polarized data
Author Names: L. Ren et al.
1) Collecting and preprocessing Gaofen-3 SAR co-polarized data and corresponding wind measurements
from buoys or numerical models.
2) Implementing and assessing the performance of several ocean wind retrieval algorithms, including
empirical models (CMOD5.N, CMOD-IFR2), semi-empirical models (DWAV-GS, XWAVE), and
physical models (RFSCAT).
3) Evaluating the accuracy of the retrieved wind speeds and directions from these models by comparing
them with the ground truth data from buoys or models.
4) Analyzing the effects of various factors, such as wind speed range, incidence angle, and polarization,
on the retrieval performance of different models.
5) Identifying the most suitable wind retrieval scheme(s) for Gaofen-3 SAR co-polarized data based on
the accuracy assessments and specific application requirements.
The main objective is to comprehensively evaluate and compare the capabilities of different wind retrieval
algorithms in estimating ocean wind fields accurately from the Gaofen-3 SAR co-polarized data,
accounting for various environmental and sensor-related factors.
2.2.4 Spaceborne demonstration of distributed SAR imaging with TerraSAR-X and TanDEM-X
Authors' Names: T. Kraus, G. Krieger, M. Bachmann, A. Moreira
Base Paper Methodology:
This paper describes a spaceborne demonstration of distributed synthetic aperture radar (SAR) imaging
using the TerraSAR-X and TanDEM-X satellites. The methodology involves:
1) Developing a distributed SAR imaging concept, where the two satellites act as a large single-pass
interferometric SAR system with an adjustable baseline.
2) Implementing a bi-static synchronization link between TerraSAR-X (transmitter) and TanDEM-X
(receiver) to ensure precise timing and phase synchronization.
3) Conducting experiments with various baseline configurations, ranging from a conventional along-track
interferometric mode to a pendulum mode with large cross-track baselines.
4) Processing the bi-static SAR data collected by the two satellites using specialized distributed SAR
imaging algorithms.
5) Analyzing the focused bi-static SAR images and interferometric products to assess the performance of
the distributed SAR imaging concept.
6) Demonstrating the potential for enhanced capabilities, such as improved spatial resolution, suppressed
ambiguities, and extended imaging opportunities, compared to conventional monostatic SAR systems.
The key aspects are the synchronization between the two satellites, the implementation of distributed SAR
imaging algorithms, and the evaluation of the obtained bi-static SAR images and interferometric products
to validate the concept and its advantages.
Chapter 3
Distributed Real-Time Image Processing of Formation Flying SAR Based
on Embedded GPUs
The CS algorithm uses phase multiplication instead of interpolation to complete range migration
correction. To make the range migration trajectories of all targets uniform, the CS operation
eliminates the space-varying characteristics of range migration, after which the
remaining range migration is corrected uniformly for all scatterers. The CS algorithm therefore requires
no interpolation and performs accurate image processing using only complex multiplications and
FFT/IFFT operations.
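The idea of replacing interpolation with phase multiplication can be illustrated with the Fourier shift theorem. The following is a minimal Python sketch, not the CS algorithm itself: it moves a signal by two samples using only a forward transform, a complex multiplication, and an inverse transform. The naive `dft` helper and the test signal are illustrative assumptions; a real implementation would use an FFT library and the actual CS phase functions.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, standing in for FFT/IFFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out

def shift_by_phase(x, shift):
    """Circularly delay x by an integer number of samples without resampling."""
    n = len(x)
    spectrum = dft(x)
    # A linear phase ramp in the frequency domain is a delay in the time domain.
    ramp = [cmath.exp(-2j * cmath.pi * k * shift / n) for k in range(n)]
    return dft([s * r for s, r in zip(spectrum, ramp)], inverse=True)

signal = [0, 1, 2, 3, 0, 0, 0, 0]
shifted = shift_by_phase(signal, 2)
recovered = [round(v.real) for v in shifted]
print(recovered)  # [0, 0, 0, 1, 2, 3, 0, 0]
```

The samples are moved purely by multiplication in the frequency domain, which is the property that lets the CS algorithm avoid an interpolation kernel.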
In this section, a real-time imaging processing system adapted to the FF-SAR mode is proposed. The
system is a distributed architecture based on multiple embedded GPUs. The description covers the
hardware architecture of the system, the rescheduled CS algorithm, and the parallel computing optimization
method. First, the distributed hardware system is introduced. Its hardware architecture is
scalable; for convenience, this section takes an FF-SAR system consisting of four
embedded GPUs as an example. Second, the processing of the CS algorithm is rescheduled so that it can be
applied to distributed hardware systems. Finally, the CUDA program optimization method for parallel
computing of the CS algorithm on embedded GPUs is introduced.
In the FF-SAR mission, multiple nodes can be employed to jointly process radar data. The processing flow
therefore differs from one in which the entire CS algorithm is performed on a single embedded GPU.
The processing tasks of the different stages of the CS algorithm need to be rescheduled so that the four
embedded GPUs can cooperate to complete the processing. Coprocessing by multiple embedded GPUs reduces the
data volume processed by each embedded GPU and improves processing efficiency.
Fig. 4 shows the flowchart of GPU implementation of CS algorithm. The CS algorithm is
decomposed into three stages for the convenience of describing the data flow and processing flow of each
stage.
In the first stage, as shown in Fig. 5(a), the master embedded GPU performs a transposition to
obtain data arranged in the azimuth direction. The data are then evenly divided into four parts in the
azimuth direction, and each part is stored contiguously in the azimuth direction. One part is
kept by the master node, and the remaining three parts are sent to the slave nodes through optical
fibers. The master and slave nodes perform a 1-D azimuth FFT to bring the data into the range-Doppler domain.
The CS phase factor, which changes the frequency scale of the linear frequency modulation, is calculated,
and the corresponding data points are multiplied by this factor to obtain the values after the range
curvature has been adjusted. The data processing steps executed on the master node and the slave nodes are
independent and parallel. After the slave nodes complete the first processing stage, the data are sent
back to the master node. Finally, the master node splices the received data in order.
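The divide, process, and splice pattern of this stage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the matrix is a list of rows, each row stands for one contiguously stored line, `process_block` is a hypothetical placeholder for the per-node work (azimuth FFT and CS phase multiplication), and the optical-fiber transfers are not modeled.

```python
def split_rows(data, nodes):
    """Divide the rows into `nodes` equal, contiguous blocks."""
    if len(data) % nodes:
        raise ValueError("row count must be divisible by the number of nodes")
    step = len(data) // nodes
    return [data[i * step:(i + 1) * step] for i in range(nodes)]

def process_block(block):
    """Hypothetical stand-in for the work each node does on its block."""
    return [[2 * v for v in row] for row in block]

def run_stage(data, nodes=4):
    blocks = split_rows(data, nodes)
    # Block 0 would stay on the master; blocks 1..nodes-1 would go to the slaves.
    results = [process_block(b) for b in blocks]
    # The master splices the returned blocks back together in their original order.
    return [row for block in results for row in block]

raw = [[r * 10 + c for c in range(3)] for r in range(8)]
out = run_stage(raw)
print(len(split_rows(raw, 4)), len(out))  # 4 8
```

Because every block is processed independently, the result is identical to processing the whole matrix on one node; only the amount of data handled per node changes.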
CUDA programming is used for the development of embedded GPUs, in the same style as for
traditional GPUs. Unlike traditional GPUs, however, the embedded GPU
has a relatively small memory space, so its memory use must be optimized. Since the
CUDA programming model requires the CPU and GPU to work together, the CS algorithm needs to be
decomposed into two parts, one suitable for GPU parallel computing and one for CPU serial execution.
The main steps of the CS algorithm are matrix transpositions, FFT operations, IFFT
operations, and phase multiplications.
The FFT and IFFT operations are highly parallel. The phase multiplications are element-wise and
therefore also well suited to parallel computing. The following optimization methods are adopted in this article.
1) Unified memory management: As shown in Fig. 5, traditional GPU and CPU
heterogeneous computing architectures are generally discrete. The GPU and CPU have separate memories,
and data must be transmitted over the PCIe bus. The heterogeneous computing architecture of
the embedded GPU, however, is integrated. As shown in Fig. 7(b), the CPU and GPU share the same
physical memory, and no data transmission over the PCIe bus is needed.
Therefore, unified memory management avoids duplicate memory allocation and
data transmission and effectively improves the performance of embedded GPUs.
2) Memory reuse: Because the memory resources of embedded GPUs are limited, memory reuse is adopted
in addition to unified memory to avoid wasting
memory space. Address space needs to be allocated and freed when calling the cuFFT library, and the
time spent allocating and freeing it can even exceed the time of the FFT operation itself. Allocating
address space only on the first call to the cuFFT library and freeing it after all cuFFT calls are complete
is an efficient form of memory reuse. In-place transposition of the matrix is another:
the transposed matrix overwrites the address space of the matrix before transposition, so no second
buffer is needed.
3) Aligned and coalesced access: Global memory is the largest and most frequently used memory in
GPUs, and most applications are limited by memory bandwidth. Therefore, maximizing the
use of global memory bandwidth is the key to optimizing kernel performance.
Unaligned and uncoalesced memory accesses waste bandwidth and slow down GPU memory
access. Matrix transposition can be used to make accesses aligned and coalesced. During azimuth
processing, the data are stored in the azimuth direction. During range processing, the data are
stored in the range direction, which improves the efficiency of reading and writing data in
memory.
4) Shared memory: Latency and bandwidth are the major factors when optimizing memory
performance. Shared memory can be used to reduce the effects of global memory latency and bandwidth on
performance. Bank conflicts need to be avoided when using shared memory; otherwise, the memory
access efficiency is reduced. If two addresses of a memory request fall in the same memory bank,
a bank conflict occurs and the accesses have to be serialized. Memory padding can avoid bank conflicts:
when declaring shared memory, extra space is padded so that the addresses to be accessed fall in
different banks.
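The effect of optimizations 2) to 4) can be checked with simple address arithmetic. The Python sketch below is a model rather than GPU code, and its constants are assumptions about a typical CUDA device that the text does not state: 32 threads per warp, 128-byte global-memory segments, and 32 shared-memory banks of 4-byte words. It counts the segments one warp touches for unit-stride and strided reads, the worst-case number of threads hitting one bank when a warp reads a tile column, and it transposes a square matrix in place.

```python
WARP_SIZE = 32        # threads per warp (assumed)
SEGMENT_BYTES = 128   # size of one global-memory segment (assumed)
NUM_BANKS = 32        # number of shared-memory banks (assumed)
BANK_WORD_BYTES = 4   # width of one bank word (assumed)

def segments_touched(stride_elems, elem_bytes=8):
    """Count distinct segments read by one warp; 8 bytes is one complex64 value."""
    addresses = [t * stride_elems * elem_bytes for t in range(WARP_SIZE)]
    return len({a // SEGMENT_BYTES for a in addresses})

def max_bank_conflict(tile_width, elem_bytes=4):
    """Worst-case threads per bank when a warp reads column 0 of a float tile."""
    hits = [0] * NUM_BANKS
    for row in range(WARP_SIZE):
        word = row * tile_width * elem_bytes // BANK_WORD_BYTES
        hits[word % NUM_BANKS] += 1
    return max(hits)

def transpose_in_place(m):
    """Transpose a square matrix by swapping pairs, without a second buffer."""
    n = len(m)
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j], m[j][i] = m[j][i], m[i][j]
    return m

contiguous = segments_touched(1)     # neighbouring elements
strided = segments_touched(8192)     # one column of an 8192-wide row-major matrix
unpadded = max_bank_conflict(32)     # 32 x 32 tile
padded = max_bank_conflict(33)       # 32 x 33 tile, one padding column
print(contiguous, strided, unpadded, padded)  # 2 32 32 1
```

Under these assumptions, a strided read touches 16 times as many segments as a contiguous one, which is why the data are transposed before switching between azimuth and range processing, and a single padding column turns a 32-way bank conflict into conflict-free access.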
Fig. 7. Schematic diagram of the data processing of the CS imaging algorithm in the distributed processing system. (a)
Schematic diagram of the first stage of data processing. (b) Schematic diagram of the second stage of data processing.
(c) Schematic diagram of the third stage of data processing.
Chapter 4
Results and Discussion
To evaluate the processing performance of the embedded GPU in the FF-SAR task, 3-m-resolution
single-precision floating-point complex raw data from the GF-3 satellite were used. The experimental
platform is the Jetson Nano. For comparison, the same experiment was conducted on the NVIDIA Jetson AGX
Orin platform and the NVIDIA GeForce RTX 2060 Max-Q platform. Table I presents the hardware
parameters of all experimental platforms.
Fig. 6 shows the final imaging results of the experimental platforms and the imaging results in
MATLAB. In the experiments of CS algorithm on a single embedded GPU, the entire data processing
time, excluding data reading, is calculated. With a data volume of 0.5 GB, it only takes about 5.86 s to
complete image processing on Jetson Nano platform, and the power consumption is not higher than 5 W.
It takes 0.395 s to implement the imaging algorithm on the Jetson AGX Orin platform with the power
consumption of 60 W. In addition, the same experiment was conducted for the RTX 2060 Max-Q
platform using the same optimized CUDA program and data volume. It took 0.956 s to complete the entire
imaging process. For GPU platforms on computers, such as RTX 2060 Max-Q, although they provide
powerful performance, the power consumption is generally very high. Therefore, they are not suitable as
real-time processing platforms on satellites. Embedded GPUs balance performance and power
consumption, making them suitable as the on-orbit real-time processing platform.
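The quoted data volume and the implied throughput can be reproduced with a short calculation. This Python sketch assumes that each single-precision complex sample occupies 8 bytes and that "0.5 GB" is meant in binary units (GiB); the processing times are the ones reported above.

```python
BYTES_PER_COMPLEX64 = 8  # two 4-byte floats: real and imaginary parts

def volume_gib(rows, cols):
    """Size of a rows x cols single-precision complex matrix in GiB."""
    return rows * cols * BYTES_PER_COMPLEX64 / 2**30

data_gib = volume_gib(8192, 8192)

# Processing times in seconds, as reported in the text.
times = {"Jetson Nano": 5.86, "Jetson AGX Orin": 0.395, "RTX 2060 Max-Q": 0.956}
throughput = {name: data_gib / t for name, t in times.items()}

print(data_gib)  # 0.5
for name, rate in throughput.items():
    print(f"{name}: {rate:.3f} GiB/s")
```

The 8192 × 8192 data set is exactly 0.5 GiB, so the Jetson Nano processes roughly 0.085 GiB of raw data per second.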
The time the Jetson Nano spends executing the kernel functions and memory copies of the CS
algorithm is analyzed and compared with the results for the Jetson AGX Orin platform and the RTX 2060 Max-Q
platform. The results are shown in Table II. Comparing the execution times of the different tasks on
the experimental platforms shows that the kernel
functions run faster on the Jetson AGX Orin platform and the RTX 2060 Max-Q platform than on the Jetson
Nano. This is related to the number of CUDA cores and the GPU frequency.
Notably, the Jetson AGX Orin platform and the RTX 2060 Max-Q platform have comparable CUDA core
counts. However, the Jetson AGX Orin platform takes less time than the RTX 2060 Max-Q
platform, largely because of CUDA memory copy time. Since embedded GPUs use an integrated
heterogeneous architecture, the CPU and GPU share the same physical storage space.
There is no need to transfer data between the host and the device before and after the execution of
the kernel function. The CPU and GPU in the RTX 2060 Max-Q platform form a discrete architecture, and
data transfer between the CPU and GPU must use the PCIe bus.
Therefore, CUDA memory copies account for a large share of the run time on the RTX 2060 Max-Q platform
and reduce processing performance. The Jetson Nano and Jetson AGX Orin platforms benefit from the
integrated architecture, which saves the CUDA memory copy time.
Fig. 8. Imaging results of GF-3 raw data of 8192 × 8192 points by implementing CS algorithm on different
platforms. (a) Jetson Nano imaging results. (b) Jetson AGX Orin imaging results. (c) RTX 2060 Max-Q imaging results.
(d) MATLAB imaging results
A distributed embedded GPU simulation system is built using four Jetson Nanos. In this
experiment, optical fiber communication between the data processing units is used to simulate
laser communication between satellites. The raw data used in the experiment are 16 384 × 16 384 points
of complex single-precision floating-point data. Fig. 8 shows the architecture of the distributed
embedded GPU simulation system. The system includes a raw data delivery module, embedded GPUs, and
OTP modules. The raw data delivery module simulates the sending of the raw
data of the spaceborne SAR. Each data processing unit includes an embedded GPU and an OTP module,
which are connected through the PCIe bus.
The data are transmitted between OTP modules via optical fibers. The imaging result of
implementing the CS algorithm on the distributed system is shown in Fig. 4. In the three stages of
the CS algorithm, each data division allocates the same data volume to each node. After the
data are divided on the master node, they are transmitted to the slave nodes for processing, and finally
the data are returned to the master node. The data transfer pipeline in each stage is shown in Fig. 8.
Since data transfer is pipelined, transfer times can be overlapped. First, the three pieces of data on the
master node are transferred to the OTP module via the PCIe bus. Then, the data are transferred from the
OTP module to the slave nodes. Each slave node starts processing after receiving its complete data. The
processed data are transmitted from the slave node to its OTP module. Finally, the data processed by each
slave node are transmitted to the master node by the OTP module.
At this point, the OTP module needs to wait until the data block of slave node 1 has been completely
received before the next transmission begins. The transmission rate of the fiber is
5 Gbps, whereas the PCIe transfer rate is about 2 GB/s. Therefore, to avoid rate mismatches,
pipelined transmission is not used here. The processing time of each processing stage in the
distributed system was measured on the Jetson Nano, and the results are shown in Table III. The time
consumed by the distributed system in stages 1, 2, and 3 is about 4.5, 5.2, and 4.8 s,
respectively. Each figure includes the CPU scheduling time, data transfer time, and GPU parallel computing
time of the stage.
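The rate mismatch can be made concrete with a rough estimate of how long one node's share of the data takes on each link. The Python sketch below makes several assumptions the text does not state: the quoted rates are treated as raw rates in decimal units, protocol and encoding overhead are ignored, and one block is taken to be a quarter of the 16 384 × 16 384 complex single-precision data set.

```python
BYTES_PER_COMPLEX64 = 8
NODES = 4

# One node's share of the 16 384 x 16 384 data set, in bytes.
block_bytes = 16384 * 16384 * BYTES_PER_COMPLEX64 // NODES

fiber_bytes_per_s = 5e9 / 8   # 5 Gbps converted to bytes per second
pcie_bytes_per_s = 2e9        # about 2 GB/s

fiber_s = block_bytes / fiber_bytes_per_s
pcie_s = block_bytes / pcie_bytes_per_s

print(f"block size: {block_bytes / 2**30} GiB")
print(f"fiber: {fiber_s:.3f} s, PCIe: {pcie_s:.3f} s")
print(f"rate ratio: {pcie_bytes_per_s / fiber_bytes_per_s:.1f}")
```

Under these assumptions the fiber link is about 3.2 times slower than the PCIe link, so a block arrives over PCIe faster than it can leave over the fiber. These are upper bounds on link speed, not measured transfer times.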
The total time to implement CS algorithm imaging with four Jetson Nanos is about 14.5 s, the sum of the
three stage times. The time to implement the CS algorithm on a single Jetson Nano is about 5.86 s for the
8192 × 8192 data set. Compared with the implementation on a single Jetson Nano, the increase in time is due
to CPU scheduling and data transmission in the process of data division and splicing.
The reliability of the Jetson Nano has been verified. For comparison with other real-time processing
platforms, the performance-to-power ratio is used to measure the processing performance of the different
platforms. It considers the quantity of data processed, the processing time, and the processing power.
The results for the different processing platforms are shown in Table III. The 0.5 GB of data were
processed with the CS strip imaging algorithm in 5.86 s on the Jetson Nano. The Jtop
system monitoring utility on the Jetson system showed that the peak power consumption of the
Jetson Nano during operation did not exceed 5 W, which is consistent with measurements using a power
meter. The performance-to-power ratio is as high as 1.706%. The Jetson AGX Orin platform exhibits very
high processing performance. It takes 0.395 s to process 8192 × 8192 points of data in the 60 W power
mode, and the performance-to-power ratio is 2.110%.
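The text does not spell out how the performance-to-power ratio is computed. The reported values are, however, consistent with dividing the data volume in GB by the product of processing time in seconds and power in watts, expressed as a percentage. The Python sketch below checks this inferred formula against the two quoted figures; the formula is an assumption derived from the numbers, not a definition taken from the source.

```python
def perf_to_power_percent(data_gb, seconds, watts):
    """Inferred metric: data volume / (time x power), expressed in percent."""
    return 100 * data_gb / (seconds * watts)

nano = perf_to_power_percent(0.5, 5.86, 5)      # Jetson Nano, 5 W
orin = perf_to_power_percent(0.5, 0.395, 60)    # Jetson AGX Orin, 60 W mode

print(f"Jetson Nano: {nano:.3f}%")      # Jetson Nano: 1.706%
print(f"Jetson AGX Orin: {orin:.3f}%")  # Jetson AGX Orin: 2.110%
```

Both results match the reported 1.706% and 2.110%, which supports reading the metric as processed data per unit of consumed energy.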
The processing performance of the RTX 2060 Max-Q platform is also very high. For the
same data, its processing time is shorter than that of the Jetson Nano. However, its size and power
consumption cannot meet the requirements of an on-orbit processing platform. The results show that the
Jetson Nano and Jetson AGX Orin have a higher performance-to-power ratio than the other platforms. In
addition, the embedded GPU platform and the FPGA+ASIC platform both show a high performance-to-power
ratio. The optimization method proposed in this article achieves a significantly higher
performance-to-power ratio. Where earlier work reports a low performance-to-power ratio, weak platform
performance and poor CUDA program optimization are the main reasons.
However, under the constraints of power consumption, the FPGA and embedded GPU can
have a higher performance-to-power ratio.
Compared with FPGA platforms, embedded GPU platforms have short development cycles
and are easier to implement. Because of this, embedded GPUs have great
application potential in future satellite launch missions with large numbers and short cycles. The
performance analysis of the distributed simulation system based on four Jetson Nanos shows that,
although memory use is optimized through unified memory management, memory reuse,
and in-place storage, the Jetson Nano's memory of only 4 GB is not enough to process the data.
Thus, there is still a bottleneck in distributed processing in the system. In addition, the 1 × 4 PCIe Gen2
interface makes data transfer in the system more time consuming, which affects the processing performance
of the distributed system.
Fig. 10. Schematic diagram of the data transmission pipeline between the master node and slave node at each stage
However, with the rapid development of embedded GPUs, NVIDIA's newly released 64 GB Jetson
AGX Orin can run in a 15 W power mode, provides greatly improved memory capacity, and supports
2 × 8 PCIe Gen4. Moreover, the data transmission rate of this platform is much higher. If this
platform can pass the on-orbit environmental reliability tests, it will provide significant advantages in FF-
SAR on-orbit real-time imaging.
CONCLUSION
In order to explore a more suitable imaging processing method for FF-SAR system, this article
proposed a distributed real-time imaging processing method for spaceborne SAR based on embedded
GPUs. The original CS algorithm processing was rescheduled to accommodate the distributed
systems. According to the hardware and software architecture of embedded GPU, optimization
methods for memory and parallel computing are proposed to maximize its processing performance.
The simulation system was implemented using the Jetson Nano platform and the proposed method
was verified using GF-3 raw data. The results show that the proposed method has better real-time
performance under low-power consumption. Compared with the previous pieces of literature, it has a
higher performance-to-power ratio. The development cycle of embedded GPU platforms is shorter and
the scalability is more advantageous. It can be seen that embedded GPUs have good application
prospects in the real-time processing of spaceborne SAR.
FUTURE SCOPE
1. Scaling to larger formations: Extending the distributed processing approach to handle data from larger
formations with more satellites/platforms for increased coverage and resolution.
2. Advanced processing techniques: Incorporating more advanced SAR processing algorithms and
techniques, such as interferometry, polarimetry, and moving target indication, into the distributed real-time
processing pipeline.
4. On-board machine learning: Integrating on-board machine learning capabilities for tasks like automatic
target recognition, change detection, or data compression, leveraging the parallel processing power of
embedded GPUs.
5. Inter-satellite communication: Improving inter-satellite communication and data exchange protocols for
efficient distribution of processing tasks and data sharing within the formation.
6. Fault tolerance and redundancy: Developing fault-tolerant and redundant processing strategies to
ensure reliable operations in case of hardware failures or data losses.
7. Power and thermal management: Optimizing power consumption and thermal management strategies
for the embedded GPU-based processing systems to enable sustainable long-term operations.
8. Hybrid architectures: Investigating hybrid architectures that combine on-board processing with ground-
based processing facilities for more complex or computationally intensive tasks.
9. Application to other domains: Adapting the distributed real-time processing approach to other domains
that involve formation flying platforms, such as astronomical interferometry or multi-robot systems.
REFERENCES
[1] J. Chen, J. Zhang, Y. Jin, H. Yu, B. Liang, and D.-G. Yang, “Real-time processing of spaceborne SAR
data with nonlinear trajectory based on variable PRF,” IEEE Trans. Geosci. Remote Sens., vol. 60, 2022,
Art. no. 5205212.
[2] J. Krecke, M. Villano, N. Ustalli, A. C. M. Austin, J. E. Cater, and G. Krieger, “Detecting ships in the
New Zealand exclusive economic zone: Requirements for a dedicated smallsat SAR mission,” IEEE J.
Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 3162–3169, Mar. 2021.
[3] L. Ren et al., “Assessments of ocean wind retrieval schemes used for Chinese Gaofen-3 synthetic
aperture radar co-polarized data,” IEEE Trans. Geosci. Remote Sens., vol. 57, no. 9, pp. 7075–7085, Sep.
2019.
[4] J. Chen, M. Xing, H. Yu, B. Liang, J. Peng, and G.-C. Sun, “Motion compensation/autofocus in
airborne synthetic aperture radar: A review,” IEEE Geosci. Remote Sens. Mag., vol. 10, no. 1, pp. 185–
206, Mar. 2022.
[5] G. Krieger et al., “TanDEM-X: A satellite formation for high-resolution SAR interferometry,” IEEE
Trans. Geosci. Remote Sens., vol. 45, no. 11, pp. 3317–3341, Nov. 2007.
[6] T. Kraus, G. Krieger, M. Bachmann, and A. Moreira, “Spaceborne demonstration of distributed SAR
imaging with TerraSAR-X and TanDEM-X,” IEEE Geosci. Remote Sens. Lett., vol. 16, no. 11, pp. 1731–
1735, Nov. 2019.
[7] D. Giudici, P. Guccione, M. Manzoni, A. M. Guarnieri, and F. Rocca, “Compact and free-floating
satellite MIMO SAR formations,” IEEE Trans. Geosci. Remote Sens., vol. 60, 2022, Art. no. 1000212.
[8] A. Renga, M. D. Graziano, and A. Moccia, “Formation flying SAR: Analysis of imaging performance
by array theory,” IEEE Trans. Aerosp. Electron. Syst., vol. 57, no. 3, pp. 1480–1497, Jun. 2021.
[9] G. Krieger et al., “TanDEM-X,” in Distributed Space Missions for Earth System Monitoring, vol. 31.
New York, NY, USA: Springer, 2013, pp. 387–436.
[10] N. Roth et al., “Flight results from the CanX-4 and CanX-5 formation flying mission,” in Proc.
Small Satellites Syst. Serv. Symp., Valletta, Malta, 2016, p. 30.