Professional Documents
Culture Documents
Fast-Fourier-transform Based Numerical Integration Method For The RayleighSommerfeld Diffraction Formula, PDF
Fast-Fourier-transform Based Numerical Integration Method For The RayleighSommerfeld Diffraction Formula, PDF
Fabin Shen and Anbo Wang, "Fast-Fourier-transform based numerical integration method for the
Rayleigh-Sommerfeld diffraction formula," Appl. Opt. 45, 1102-1110 (2006). doi: 10.1364/ao.45.001102
The numerical calculation of the Rayleigh–Sommerfeld diffraction integral is investigated. The imple-
mentation of a fast-Fourier-transform (FFT) based direct integration (FFT-DI) method is presented, and
Simpson’s rule is used to improve the calculation accuracy. The sampling interval, the size of the
computation window, and their influence on numerical accuracy and on computational complexity are
discussed for the FFT-DI and the FFT-based angular spectrum (FFT-AS) methods. The performance of
the FFT-DI method is verified by numerical simulation and compared with that of the FFT-AS
method. © 2006 Optical Society of America
OCIS codes: 050.1960, 000.4430, 000.3860, 350.5500.
冕冕
to calculate the Rayleigh–Sommerfeld diffraction
integral and use Simpson’s rule to improve the cal- 1
culation accuracy. The selections of the sampling ⫽ A共␣, , 0兲
42
interval and the computation window and their in-
fluence on the calculation accuracy and computa- ⫻ exp共j␣x ⫹ jy ⫹ j冑k2 ⫺ ␣2 ⫺ 2z兲d␣d.
tional load are discussed. The calculation accuracy, (5)
the computational speed, and the applicability re-
gions of the FFT-DI and the FFT-AS methods are
The FFT-AS method for calculating U共x, y, z兲 nu-
compared.
merically has been reported and can be given as
This paper is arranged as follows: In Section 2 a
brief review of the AS method and the DI method is
given. In Section 3 a method of implementing the Q ⫽ IFFT2兵FFT2兵U共xm, yn, 0兲其 ·⫻ G共␣m, n, z兲其,
FFT-DI method is described with improved accu- (6)
racy by use of Simpson’s rule. In Section 4 the sam-
pling intervals, the computation window size, the where U共xm, yn, 0兲 and G共␣m, n, z兲 are samples of
calculation accuracy, and the computational load of U共x, y, 0兲 and G共␣, , z兲, FFT2 and IFFT2 denote a
the two methods are discussed. In Section 5 the 2D FFT and a 2D IFFT, and ·⫻ means element-by-
simulation results for diffraction behind a circular element multiplication.5,10 –12
aperture and a single slit are demonstrated; the
calculation accuracy and computational speeds of B. Direct Integration Method
the two methods are compared. In Section 6 our G共␣, , z兲 in Eq. (4) is the optical transfer function of
conclusions are given. the medium, and its inverse Fourier transformation
g共x, y, z兲 ⫽
1
42
冕冕 G共␣, , z兲 N N
U共xm, yn, z兲 ⫽ 兺 兺 U共i, j, 0兲g
i⫽1 j⫽1
⫻ exp共j␣x ⫹ jy兲d␣d ⫻ 共xm ⫺ i, yn ⫺ j, z兲⌬⌬, (9)
⫽
1 exp共jkr兲 z 1
2 r r r 冉 冊
⫺ jk , (7)
where ⌬ and ⌬ are sampling intervals on the ap-
erture plane.
3. Fast-Fourier-Transform Based Direct
where r ⫽ 冑x2 ⫹ y2 ⫹ z2.10 Thus U共x, y, z兲 can be Integration Method
solved as the convolution of U共x, y, 0兲 and g共x, y, z兲: The Riemann sum in Eq. (9) can be regarded as a
discrete linear convolution of U共i, j, 0兲 and
U共x, y, z兲 ⫽ 冕冕A
U共, , 0兲g共x ⫺ , y ⫺ , z兲dd
g共xm, yn, z兲, which can be calculated effectively by
means of a FFT. We present a FFT-DI implementa-
tion for N2 points located at sampling grids on the
observation plane. The discrete convolution in Eq. (9)
⫽ 冕冕 U共, , 0兲
exp共jkr兲 z
2r r
can be calculated as
冉 冊
A
1
⫻ ⫺ jk dd, (8) where
r
冤 冥
··· |
冋 册
É Ì É | 0N⫻(N⫺1)
U0 0
U⫽ ⫽ U共N, 1, 0兲 ··· U共N, N, 0兲 | , (11)
0 0 (2N⫺1)⫻(2N⫺1)
⫺ ⫺ ⫺ | ⫺
0(N⫺1)⫻N | 0(N⫺1)⫻(N⫺1)
Xj ⫽ 再
x1 ⫺ N⫹1⫺j
xj⫺N⫹1 ⫺ 1
j ⫽ 1, . . . , N ⫺ 1
j ⫽ N, . . . , 2N ⫺ 1
, (13)
Yj ⫽ 再 y1 ⫺ N⫹1⫺j
yj⫺N⫹1 ⫺ 1
j ⫽ 1, . . . , N ⫺ 1
j ⫽ N, . . . , 2N ⫺ 1
. (14)
where r ⫽ 冑共x ⫺ 兲2⫹共y ⫺ 兲2⫹z2. It is exactly the The sampling grid of U共i, j, 0兲 is zero padded to
Rayleigh–Sommerfeld diffraction integral formula, 共2N ⫺ 1兲 ⫻ 共2N ⫺ 1兲 as shown in Eq. (11) because
which can be used for both near and far fields without Eq. (10) gives the circular convolution of U and H.
any approximation. The result S is a 共2N ⫺ 1兲 ⫻ 共2N ⫺ 1兲 complex matrix.
In most cases, the diffraction integral in Eq. (8) has The desired light fields in the observation plane can
to be calculated by direct numerical integration. On be given by the N ⫻ N lower right submatrix of S:
the aperture plane, U共, , 0兲 is sampled to N ⫻ N
equidistant grids. For a point on the observation U共xm, yn, z兲 ⫽ Sm⫹N, n⫹N. (15)
U⫽ 冋 W ·⫻ U0 O
O O 册 ,
(2N⫺1)⫻(2N⫺1)
(17)
sampling of g共x, y, z兲 is always aliased. However,
g共x, y, z兲 is of low frequencies for paraxial points and
relatively long propagation distances. Thus, for a
given diffraction problem, one can estimate the max-
where imum frequency and select the appropriate sampling
interval. The magnitude and the oscillation period of
W ⫽ BTB, (18) g共x, y, z兲 at different observations planes are plotted
in Figs. 2 and 3, respectively, with a simulation wave-
B ⫽ 1⁄3关1 4 2 4 2 . . . 2 4 1兴 (19) length of 0.5 m.
The attenuation of light fields with respect to the
for an odd N.17 The error of Simpson’s rule for 2D offset between the point and the optical axis at planes
numerical integration can be estimated as z ⫽ 0.5, 1, 2.5, 5, 10, 25, 50, 100 m are plotted in
Fig. 2. It is evident that more energy is diffracted to
areas far from the optical axis as z increases, but the
E ⫽ O共⌬4兲 ⫹ O共⌬4兲, (20)
attenuation rate gets smaller. This implies that more
sampling points are required for small z because of
which is smaller than that of Eq. (16). the rapid variations in the light field.
The FFT-DI method discussed above can also be We can estimate the oscillation period of g共x, y, z兲
extended to calculation of the Helmholtz–Kirchhoff by calculating the interval ⌬ between points on the
integrals by use of different values of g共x, y, z兲 in observation plane with 2 phase difference, given as
Eq. (9).
where ε is a small number. Inequality (24) can be When N2 points on the observation plane are arbi-
rewritten as trarily located, the computational complexity of the
traditional DI method is O共N4兲 (Ref. 12) because each
冏 冉 冊冏 冏 冉 冊冏
1 1
r2 r
1 1
⫺ jk ⬍ 2
z z
⫺ jk , (25)
point has a computational complexity of O共N2兲, as
shown in Eq. (9). The FFT-DI method calculates the
light fields on grid points and uses a FFT and an
where r ⫽ 冑2 ⫹ z2. In practice, z is usually much
IFFT to improve the calculation speed. The compu-
tational load of the FFT-DI method in Eq. (10) comes
larger than several wavelengths; thus inequality (23) from (a) FFT(U), (b) calculation of H, (c) FFT(H), (d)
can be simplified to the element-by-element product of FFT(U) and
FFT(H), and (e) the IFFT of the product. For maxi-
1 1
⬍ , (26) mum efficiency of FFT and IFFT, U and H should be
2
r z2 zero padded to have the size 2m ⫻ 2m, where m is an
integer. Assuming that the length of the FFT is NF,
and the minimum can thus be solved as the computational complexity of the FFT-DI method
is
⬎ z关共1兾兲 ⫺ 1兴1兾2. (27)
C2 ⫽ Ca ⫹ Cb ⫹ Cc ⫹ Cd ⫹ Ce
The minimum computation window can then be se-
lected to be 共2 ⫹ a兲 ⫻ 共2 ⫹ b兲, where a ⫻ b is the size
of the aperture. ⫽O共NF 2 log2 NF兲 ⫹ O共NF 2兲 ⫹ O共NF 2 log2 NF兲
The FFT-DI method, however, requires that the
⫹ O共NF 2兲 ⫹ O共NF 2 log2 NF兲, (29)
size of the computation window be the same as that of
the aperture window. When the observation window
is larger than the aperture, one can either enlarge the which is much lower than O共N4兲.
aperture window or divide the observation window If the N ⫻ N sampling array in the FFT-DI method
into subwindows with smaller sizes. In Subsection is zero padded to 2N ⫻ 2N, then the lengths of the
4.D below, we show that the latter method is more FFTs in the FFT-DI method are the same as in the
computationally effective. FFT-AS method when a computation window with
2 ⫻ 2 times the aperture size is used. From Eqs. (28)
D. Computational Complexity and (29), one can find that the FFT-DI method needs
The computational load of the FFT-AS method in only one more 2D FFT than the FFT-AS method does,
Eq. (6) comes from (a) a FFT of U, (b) calculation of G, which means that the computational loads of the
(c) element-by-element multiplication A ⫽ FFT共U兲 FFT-DI and FFT-AS methods are comparable.
· ⫻ G, and (d) IFFT (A). Assuming that the array size In the FFT-DI method the size of the computation
is N ⫻ N, the computational complexity is window is the same as that of the aperture window.
One can divide a large observation window into
C1 ⫽ Ca ⫹ Cb ⫹ Cc ⫹ Cd smaller subwindows that have the same size as the
aperture, such that the maximum computational ef-
⫽ O共N2 log2 N兲 ⫹ O共N2兲 ⫹ O共N2兲 ⫹ O共N2 log2 N兲. ficiency of the FFT algorithm can be obtained. For
(28) example, if the desired observation window is 2 ⫻ 2
H⫽
2 冋
jkz H1共1兲共kr1兲
r1
···
H1共1兲共kr2N⫺1兲
r2N⫺1 册 ,
2N⫺1
(34)
rj ⫽ 再
冑共x1 ⫺ N⫹1⫺j兲2 ⫹ z2,
冑共xj⫺N⫹1 ⫺ 1兲2 ⫹ z2,
j ⫽ 1, . . . , N ⫺ 1
j ⫽ N, . . . , 2N ⫺ 1
, (35)
A. Circular Aperture
It has been shown that the light fields on the optical
Fig. 8. Simulation results of diffraction pattern of a single slit. axis behind a circular aperture under normal uni-
form plane-wave illumination have the following ex-
act solution:
times the aperture, one can divide it into four sub-
冋 册
windows with N ⫻ N ⫽ 共N1兾2兲 ⫻ 共N1兾2兲 arrays, where
N1 is the sampling number for the large computation exp共jkz兲 exp共jk冑z2 ⫹ a2兲
U共z兲 ⫽ U0z ⫺ , (36)
window. The computational complexity for the FFT
and the IFFT for the small windows is
z 冑z2 ⫹ a2
where U0 is the magnitude of the incident light and a
4 ⫻ O共N2 log2 N兲 ⫽ O关N12共log2 N1 ⫺ 1兲兴, is the radius of the circular aperture.19,20
We used both the FFT-AS and the FFT-DI methods
which is less than that for the large computation to calculate the light fields of axial points and com-
windows, whose computational complexity is pared the results with the theoretical solution in
O共N1 2 log2 N1兲. Eq. (36) to evaluate the accuracy of the two methods.
The parameters used in the simulation are as follows:
E. Two-Dimensional Diffraction The wavelength of the chromatic light was
The impulse response of a 2D linear isotropic homog- ⫽ 0.5 m, the radius of the circular aperture was a
enous medium is7,11 ⫽ 10 ⫽ 5 m, and the sampling interval was 0.1.
For the FFT-AS method the computation window was
set to 2 ⫻ 2 times the aperture window.
jkz
h共x, z兲 ⫽ H 共1兲 kr , The simulation results of the axial intensity distri-
2r 1 共 兲
(30)
bution are shown in Fig. 4. The FFT-DI method gives
consistent simulation results for both small and large
where H1共1兲共kr兲 is the first-order, first-kind Hankel z. In contrast, the FFT-AS method gives consistent
function and r ⫽ 冑x2 ⫹ z2. Thus the propagation of results only for small z. When z is increased, because
light can be represented as the convolution of U共x, 0兲 more light power is diffracted outside the computa-
and h共x, z兲: tion window, a larger error occurs. The square errors
of the calculations, |U ⫺ UAS|2 and |U ⫺ UDI|2, are
冕
plotted in Fig. 5, where U is the theoretical value and
UAS and UDI are the simulation results. The FFT-DI
U共x, z兲 ⫽ U共, 0兲h共x ⫺ , z兲d method has a much smaller calculation error than
A the FFT-AS method for a relatively large z. The error
⫽ 冕A
U共, 0兲
jkz
H 共1兲 kr d,
2r 1 共 兲
(31)
of the FFT-AS method tends to increase with z,
whereas the error of the FFT-DI method shows a
maximum at a certain z and is asymptotic to zero
when z is large.
where r ⫽ 冑共x ⫺ 兲2 ⫹ z2. The errors of the FFT-AS method for several sam-
The FFT-DI method for 2D diffraction can be given pling intervals and computation window sizes are
as shown in Fig. 6. Figure 6(a) shows the errors of the
FFT-AS method when the aperture is sampled from
128 ⫻ 128 to 512 ⫻ 512 with the computation window
S ⫽ IFFT关FFT共U兲 ·⫻ FFT共H兲兴⌬, (32) kept at the same size, which is 2 ⫻ 2 of the aperture
window. The results show that oversampling on the
where aperture will not reduce the calculation error. Figure
a
FFT-DI FFT-ASb
6(b) shows the errors of the FFT-AS method when the was set to be 10 m ⫻ 10 m, the same size as the
computation window size is selected from 2 ⫻ 2 to aperture window. The average execution time of 25
6 ⫻ 6 of the aperture size while the sampling num- running cycles and the FFT array sizes of the FFT-DI
bers of the aperture are kept at 128 ⫻ 128. The and the FFT-AS methods are compared in Table 1.
results show that a large computation window will The simulation code was programmed and executed
decrease the calculation error. in Matlab v6.5 on a Windows XP operating system.
The errors of the FFT-DI method for several sam- It is evident that the FFT-DI method can greatly
pling numbers are shown in Fig. 7. The calculation reduce the computational load compared with the
error |U ⫺ UDI|2 at z ⫽ 8 m for sampling numbers traditional DI method. The execution time of the
from 128 ⫻ 128 to 1024 ⫻ 1024 are plotted together FFT-DI method for an array size of 512 ⫻ 512 is
with the calculation errors of Simpson’s rule. It can be 3.36 s , which is much less than that of the traditional
seen that the calculation error decreases when the DI method reported in Ref. 16. The computational
sampling number increases. However, when the sam- loads of the FFT-DI and the FFT-AS methods are
pling interval is small enough, the rate of decrease in comparable when the array sizes in these two meth-
error becomes saturated. Simpson’s rule can greatly ods are the same. The execution times of the FFT-DI
reduce the calculation error and thus improve the method with a 1024 ⫻ 1024 sampling array and of
calculation accuracy. the FFT-AS method with a computation window of
16 ⫻ 16 times the aperture window are 13.34 and
B. Single Slit 10.80 s, respectively. The lengths of the FFT and the
We verified the 2D diffraction formulas in Eqs. (30)– IFFT are 2048 ⫻ 2048 for both cases. The execution
(35) by calculating the diffraction patterns of a single time of the FFT-DI method is only 1.24 times that of
infinite slit under uniform plane-wave illumination. the FFT-AS method.
The wavelength of the light was 0.5 m. The width of The FFT-DI method uses one more FFT than the
the slit was 10 m. The aperture plane was sampled FFT-AS method and thus needs one more array to
to 1024 points along the x axis. The observation win- save the FFT result. For example, for an array size of
dow was set to 80 m width with the center on the 1024 ⫻ 1024 with double precision, the FFT-AS
optical axis. In the FFT-AS method the computation needs 8 Mbytes of memory to save the result of one
window was the same size as the observation win- FFT, while the FFT-DI method needs 16 Mbytes of
dow, which was eight times the size of the aperture. memory to save the results of the two FFTs.
The diffraction patterns at observation planes of
z ⫽ 5, 50, 250 m, calculated by both the FFT-AS 6. Conclusions
and the FFT-DI methods, are plotted in Fig. 8. Based on the investigation of the numerical calcula-
When z is small 共5 m兲, the differences between the tion methods for the Rayleigh–Sommerfeld diffrac-
calculation results from the FFT-AS and the FFT-DI tion integral, a fast-Fourier-transform based DI
methods are so small that they cannot be identified, method was implemented to lower the computational
as Fig. 8(a) shows. However, the error of the FFT-AS load of the numerical integration. Simpson’s rule was
method increases when z becomes larger 共50 m兲, as introduced to improve the calculation accuracy. The
Fig. 8(b) shows. When z is large 共250 m兲, the parameter selections and the performance of the
FFT-AS method fails to give an accurate result, as FFT-DI and the FFT-AS methods were discussed.
shown in Fig. 8(c). The sampling of the light fields on the aperture plane
needs to meet the requirements of the Nyquist sam-
C. Calculation Speed pling theorem in both methods. The accuracy of the
In our simulation we used a Compaq personal FFT-AS method depends on the error caused by the
computer with a Pentium 4, 1.4 GHz CPU and finite size of the computation window. An oversam-
512 Mbytes of RAM to investigate the speed of the pling in the aperture window will not increase the
FFT-AS and the FFT-DI methods. The simulation accuracy of the FFT-AS method. The accuracy of the
was based on the circular aperture diffraction as de- FFT-DI method depends on the sampling intervals in
scribed in Subsection 5.A. The observation window the aperture plane. The size of the computation win-