Quantifying the 2 . 5 D imaging performance of digital holographic systems

D. P. Kelly damien-peter.kelly@tu-ilmenau.de Institut für Mikround Nanotechnologien, Macro-Nano, Fachgebiet Optik Design, Technische Universität Ilmenau, Postfach 100565, 98684 Ilmenau, Germany J. J. Healy Instituto de Ciencias Fı́sicas, Universidad Nacional Autónoma de Mexico, Avenida Universidad S/N, Colonia Chamilpa Cuernavaca Morelos, C.P. 62210 Mexico B. M. Hennelly Department of Computer Science, National University of Ireland, Maynooth, Co. Kildare, Ireland The Callan Institute, National University of Ireland, Maynooth, Co. Kildare, Ireland J. T. Sheridan UCD Communications and Optoelectronic Research Centre, University College Dublin, Belfield, Dublin 4, Ireland SFI Strategic Research Cluster in Solar Energy Conversion, University College Dublin, Belfield, Dublin 4, Ireland School of Electrical Electronic and Mechanical Engineering, College of Engineering, Mathematical and Physical Sciences, University College Dublin, Belfield, Dublin 4, Ireland


INTRODUCTION
The invention of holography was reported by Gabor in a short article in 1948 [1] and then expanded upon by him in several other manuscripts, see for example [2]- [4].Holography is a two-step imaging process, where a hologram is first recorded and later reconstructed.This approach retains information about the complex amplitude of the wavefield, however a static scene and fairly coherent light source are generally required for the duration of the recording process and a known reference wave is necessary to encode the phase information.This latter requirement unfortunately produces three additional and unwanted terms, namely the two DC terms and a conjugate image term, [2]- [5].However, one advantage is that since the complex amplitude of the object field is recovered, it is possible to record the hologram in a non-imaging optical plane.Increasingly, holograms are recorded by digital devices, such as CCD's, and can then be reconstructed numerically, in close to real-time, by desktop computers [6]- [8].A CCD device returns an array, W n×m , of numbers to the user and can be related to the instantaneous intensity, see [9,10] and Chapter 9 of Ref. [11].Proceeding with a 1-D analysis for the sake of simplicity we write this relationship as: where β is a constant of proportionality between the wave field intensity and the values returned by the camera which we henceforth neglect.The functions p γ (x) and p L (x) describe the effect of the finite pixel size and the finite camera extent respectively, and u ref (x) is the reference wavefield.The ' * ' indicates complex conjugation.The function δ T is a comb function [12] which can be defined as and can conveniently be represented as [13] δ where T is the spacing between the centers of adjacent pixels and where ⊗ respresents the convolution operation [12].From Eq. ( 1) we can see that |u(x) + u ref (x)| 2 is first convolved with our pixel function p γ (x) before being apertured (limited in space) by the function p L (x) and then sampled.
The order of operations is very important and has been discussed in the following references [9,10][14]- [16].In Eq. ( 1), I z (x) and I ref (x) refer to the object and reference intensities, while u * (x) is the conjugate image term.In order to produce a quality reconstruction it is necessary to remove or supress these terms which can be achieved in numerous ways, perhaps most famously by Leith and Upatnieks [17], however many other techniques exist [18,19].Here we assume that these terms; the conjugate twin image and the DC intensity terms (see for example Chapter 9 of Ref. [5]), may be removed using phase shifting interferometric techniques [20]- [23], see also Chapter 2 of [24].We concentrate exclusively on the u * ref u term for the remainder of this paper.
In Ref. [25], Collins relates diffraction theory to ray matrices, providing a convenient method of describing the propagation of a field through several optical elements.The field at the output of such a system will in general be in a mixed space-spatial frequency domain, and can be interpreted mathematically as a Linear Canonical Transform (LCT) [26,27].Referring to Figure 1, a field U(X), (with input domain variable X), is input to the system, the effects of which are described with the parameters ABCD A , where the subscript A is used to indicate that the field represents an analytical or continuous mathematical function.The complex field, u * ref u, at the output of this transform is recorded by a digital camera and in the process is subject to three separate operations: (1) the spatial frequency content of the signal is changed by the finite size of the camera pixels, 2γ, used to record the hologram; (2) the finite extent, 2L, of the camera face acts to limit the spatial extent of the field u * ref u; and finally (3) the digital camera samples the field u * ref u, which after PSI processing [10,22], would return a 2-D array of complex numbers to the user.This 2-D array of captured data is then subject to a second numerical LCT operation, defined with the variables ABCD N , to yield our reconstructed field R N (X).Here we use the superscript, N, to indicate that a numerical operation on a finite discrete data set is performed by a computer to generate the continuous field R N (X).We define the LCT as follows: where j = √ −1.We note that since |AD − BC| = 1, we have 3 independent variables and can thus typically eliminate C.This transform reduces to other more well known transforms by appropriately choosing the values for ABCD.For example the following matrices describe the scaled (by a factor q) Fourier Transform (FT), Fresnel (FST) propagation (a distance z), the effect of a thin converging lens (focal length, f ) or Chirp Modulation Transform (CMT), an imaging operation (magnified by M), and a Fractional Fourier Transform (FRT) operation of fractional angle θ [28] and where q is a scaling factor, FRT = cos(θ) q sin(θ) − sin(θ)/q cos(θ) .( 8) We now make some general comments about the results thus far.
We note that the hologram recorded by the camera is sampled in an LCT domain, which in general will be a mixed spacespatial frequency domain.In order to understand the implications of this we need to look at some recent theoretical developments in sampling theory [13,27][29]- [31].
If we choose our ABCD A = [1, 0, 0, 1], i.e. an in-focus unit magnification imaging system, see Eq. ( 7), then our sampling rate T must be such that 1/T > 2 f max to ensure that the maximum spatial frequency, f max is recovered.We note that here we are sampling in the space domain and so our sampling rules reduce to a statement of the Nyquist sampling conditions, i.e. f max = f NQ .The pixel function p γ (x) will in general act to attenuate the higher spatial frequencies present in our signal u(x).
If however we set ABCD A = [1, z, 0, 1], our initial LCT transform then becomes a Fresnel transform, see Eq. ( 5).Once the discrete set of values, Eq. ( 1), are reconstructed numerically (this operation is typically an inverse Fresnel transform that is performed numerically [10]), we find that the resulting image is different in several regards.Similar to the imaging LCT operation the p γ (x) will again act to attenuate the higher spatial frequencies in the image, however the sampling rate, T, now defines the region in space that can be reconstructed without being corrupted by overlapping replicas (assuming an input function of finite spatial extent), see Ref. [10] for more detail.Importantly this does not effect the ability of the system to resolve spatial frequencies [10][32]- [37].
Other different imaging properties occur if we assume that our initial LCT operation performs a scaled FT.In this instance the image is reconstructed by numerically implementing an inverse FT.From the analysis presented in Ref. [38] we see that in this case the function p γ (x) acts to attenuate the complex amplitude of the reconstructed image.We expect other properties for different types of LCT transforms.Indeed depending on the application it may be possible to design optimal optonumeric imaging systems by applying this generalized phase space holographic recording [39]- [42].
Since holographic capture provides access to the complex amplitude of an optical field, it is possible to numerically re-construct our field at various depths providing a pseudo-3D imaging capability over a volume.Kou and Sheppard [43] point out that this can only be considered to be 2.5D imaging due to the finite spatial frequency support of the hologram, unlike optical sectioning where multiple images are acquired.By examining the 3D PSF of the system we find that the detail size or resolution we can expect to recover from our hologram varies as a function of depth and system parameter values, with the result that the image detail is maximized for particular reconstruction planes only.
In the following sections and conclusion we develop a theoretical framework to describe the imaging properties of generalized phase space holographic imaging systems.This framework will not only allow us to analyze both the imaging properties of the FT and FST systems, but also those of more general optical systems.The layout of the paper is as follows: In Section 2.1 we highlight some important performance properties for general imaging systems that ideally should be addressed by an analysis: (i) what spatial frequencies will be passed through the system?(ii) what is the size of the input field that can be imaged?The analysis presented applies to a broad range of DH systems and so to ensure we still maintain some insight into the imaging process, and are not overwhelmed by the complexity of the solution, we introduce a special input function that allows us to specifically examine how the extent and spatial frequency content of an input field change as the field propagates through a quadratic phase space system described by an LCT.In Section 2.2 we examine the role of the finite camera aperture on the imaging performance of the system.Simple but general rules of thumb are derived, which relate the quality of the output image to the input field and the system parameters.In Section 2.3 the filtering effect of the finite pixel size is discussed.Again a simple guideline is presented that allows an imaging system designer to determine the quality of the output image.Numerical examples are presented that illustrate different features of this performance limiting factor.In Section 2.4 the role of sampling is examined.In Section 3 we derive an expression to describe the PSF function for a general DH system and show that it varies as a function of depth and hence does not perform true 3D imaging.Some representative numerical examples are presented.Finally we make some closing remarks in a brief conclusion and discussion section.

THEORETICAL ANALYSIS
Here we present our approach for analysing generalized DH systems.

Representation of the input field
We begin by considering a plane wave normally incident on a thin sinusoidal amplitude diffraction grating of infinite extent.Neglecting the DC term, the resulting diffracted field is composed of two infinite plane waves that travel at an angle to the optical axis, see Section 4.5.2 of Ref. [5].From this analysis [5], it is shown that under special conditions, i.e. at the Talbot distances, the diffracting field will reproduce a perfect image of the original diffracting grating.In practical systems the diffracting grating has a finite extent which places a limit on the power entering the system and also acts to localize the spatial distribution of power in the diffracted field, thus confining the Talbot bands to a particular spatial volume.In Figure 1, we depict such a situation.Here, an input transparency containing different spatial frequencies diffracts the power contained in the illuminating beam into several different 'channels' that propagate until they are incident on the camera plane.Different optical elements, such as DL in Figure 1, can be used to deflect the path of these channels.We now wish to provide a mathematical description for what has been discussed thus far by choosing a specific function as our input wavefield, For values of f x > 1, and α i > 1, Eq. ( 9) limits the power entering our system to ≈ √ π/8 Watts.Furthermore the signal's power is localized and primarily lies within the range −2α i ≤ X ≤ 2α i m, and thus by varying α i we can investigate the region of space that can be viewed with the opto-numeric system or as we shall see later, by choosing a small value for α i and setting f x = 0, we can determine the PSF of the optical system.We note when f x = 0 the power entering the system doubles.The cosine function allows the determination of how individual spatial frequencies are effected by the imaging operation.
To examine the expression for the field u(x) at the camera plane, see Figure 1, we calculate the following expression: where and From Eq. ( 10) and Eq.(11) we see that under an LCT operation, the input modulated Gaussian-type function is transformed into another Gaussian-type function, that is centered at x = Bλ f x and has a width of approximately −2 α c ≤ (x − B λ f x ) ≤ 2 α c .Thus the power associated with a specific spatial frequency is confined to a definite spatial location, see Figure 1.From Eqs. ( 10)-( 14) we also note that this Gaussian-type function is multiplied by system dependent linear and quadratic phase terms, as well as some constant complex term, K c .Noting that cos (2π we write the field at the camera as The transform we have performed on our input signal is a lossless operation and so we expect that the power of the field in the camera plane will be conserved, even though the signal may now be distributed differently in phase space.Loss is introduced into our system for the first time at the camera and our signal is modified irreversibly, due to the finite extent of the camera, the discrete nature of the device (sampling), and the finite active areas of the pixels.Note that so far we have assumed that the extent of various lenses that are used to process the input field prior to capture are effectively infinite.However, as we shall see it is possible to extend this analysis further by limiting the extent of our optical elements using complex ABCD parameters [44,45].In the following sections we now examine conditions under which we may expect to get a reasonable reconstruction of our original object field.Ideally we should try to ensure that as much of the signal field u(x) lies within the finite extent of the camera.We would also like to minimize the impact of the pixel filtering operation, [due to the finite pixel size, see Eq. ( 1)], on our reconstructed signal.Thus for a given signal type and fixed camera parameters, we wish to determine the optimal transform (ABCD A ), that maximizes the information throughput of the opto-numeric system or that attempts to match the Space-Bandwidth Product (SBP) of the signal and the optical system [39]- [42].Using results from Eqs. (10) to (15) we now consider the effect of the finite size camera aperture, the pixel filtering operation, the role of sampling, and finally the ability of our system to reconstruct an input field over a 3D volume.

Camera Aperture
In order to proceed further it is useful to look at a specific numerical result and so in Figure 2 we present a plot of |u(x)| with the following parameters, λ = 633 nm, ABCD A = [1, 0.5, 0, 1] (Standard SI units are used unless specifically stated otherwise), α i = 2 mm, and f x = 10 lines/mm.Referring to Eq. ( 5), we note that this operation is a Fresnel transform.We assume an on-axis plane wave, thus u ref (x) can be arbitrarily set to unity.
From Figure 2(a), we can see the presence of two Gaussiantype functions located at ±λB f x .These two functions [u f x (x) and u − f x (x)] interfere with each other producing the structure in the field at −1 ≤ x ≤ 1 mm.For different parameter choices this structure can vary quite significantly.The finite extent of a digital cameras can be modeled using a rect function which we define as We would now like to determine how large the width of the camera aperture, 2L, should be in order to ensure that the power from our input signal is incident on the light sensitive area of the camera.We first observe that the width of the u f x (x) functions is 4α c and they are centered at ±λB f x ; therefore most of the power of u(x) lies within the range, −SE ≤ x ≤ SE where Substituting the appropriate values for B, α c and f x into Eq.( 17) gives a value of SE ≈ 7.2 mm, which is indicated by a dashed vertical line in Figure 2(a).We arbitrarily choose a value of L = 6 mm for our camera aperture, indicated by a solid vertical line in Figure 2(a).All information outside these lines, i.e. x ≥ |L|, is not recorded by the camera and therefore does not contribute to the final reconstructed field.In Figure 2(b) we present the result of the numerical reconstruction.To generate this reconstruction we sample the field u(x) over the range −L ≤ x ≤ +L in steps of T = 10 µm.Using these values R N (X) was calculated by implementing an inverse Fresnel transform using a trapezoidal integration algorithm.A more in-depth discussion of the role of sampling is presented in Section 2.4.From Figure 2(a) we can see that while most of the power contained in the input signal is incident on the camera, a small portion falls outside the camera  17) we expect that u f x (x) will be shifted further along the x axis with the result that less of the signal power is now intercepted by the camera, since SE ≈ 10.3 mm, see Figure 3(a).This results in a poor reconstruction of the input signal, as can be seen by inspecting Figure 3(b).
From these initial results we hypothesis that the quality of the reconstruction is closely related to the ratio, PR, between the amount of power that is transmitted through the optical system and that captured by the camera.We can relate the power at the input plane (i.e. the total power ≈ √ π/8 W) to that intercepted by the camera aperture using Eq. ( 10) and Eq. ( 15), to give where we assume that the cross-terms, CT, average to zero and hence do not contribute to the total power incident on the camera.In Figure 4 we present a contour plot of Eq. ( 18), [see caption for details].We now further suppose that PR is directly related to the quality of the reconstructed signal.This hypothesis was tested by comparing the input and recovered signals using a Root Mean Square Error (RMSE) metric calculated for a discrete set of points in the reconstruction plane, where M = 600, and δX ≈ 12 µm (−3.6 ≤ X ≤ 3.6 mm), which yields the following results, see Fig. 5.
From Figure 5 and 6 we see that as the signal power incident on the camera reduces, the quality of our reconstructed signal decreases (as indicated by the increasing RMSE values) in keeping with our hypothesis.By comparing the results In Figure 4 and 6, it can be seen that a high quality reconstruction is achieved once PR ≥ 0.8.
We therefore conclude that for a given optical system, described using the parameters ABCD A , we may recover a given spatial frequency when Examining Fig. 5 and 6, it should be noted that this requirement is relatively stringent and that it is possible to reasonably reconstruct a given spatial frequency provided that PR ≥ 0.8, where PR is calculated using Eq. ( 18).
To summarize this section we conclude that as expected the camera aperture size plays a major role in determining the ability of holographic imaging systems to recover spatial frequencies.Whether a particular spatial frequency f x can be recovered depends on the camera extent 2L, the wavelength λ, and lastly the optical system ABCD A parameters, in particular B, which can be varied by combining different lenses and sections of free space, see for example DL in Figure 1.Finally, we note from Figure 5 and 6 that the RMSE values start again to steadily increase for values of B < 100 mm.This however is not related to the relative values of L, λB f x and SE, rather it is a result of the use of an insufficient sampling rate for the given input function extent which leads to aliasing.We will return to discuss this observation later in Section 2.4.

Active Pixel Area
In this section we examine the filtering effect introduced by the size of the camera's pixels in more detail.This operation is described mathematically as a convolution of the pixel function of the CCD and the incident wavefield, In keeping with the standard approach in the literature, [9,10][14]- [16], we assume a uniform rectangular structure for the CCD pixels, thus with a corresponding FT given by pγ (k) = S γ sinc(2πγk), (22) where sinc(x) = sin(x)/x.From Eq. ( 21), we note that the size of a pixel is 2γ.We can relate this size to the sampling rate; T = 2γ/FF, where FF is the 'fill-factor', 0 ≤ FF ≤ 1.
We now examine the impact of the convolution filtering operation on the spatial frequency content of u(x).From the form of Eq. ( 22), we expect that our signal will be subject to a low pass spatial frequency filtering operation.Since this operation is most easily understood in the spatial frequency domain, we look at the FT of our signal u(x).Using Eq. ( 4) we note we can describe the effect of a general LCT followed by a FT as To generate an unscaled FT, we include a λ scaling factor in the above equation to take into account the way we have defined our LCT transform.Using these new ABCD A values we determine the bandwidth, BW, just as we previously defined SE, where We therefore propose that the following 'rule of thumb' be used as a guideline to determine the role of the finite size pixel in the imaging operation: From Eq. ( 23) we note that BW may be controlled by varying the parameter D. This can be achieved either by placing different optical elements between the input and camera planes or alternatively by varying the curvature of the reference field.We now present some numerical examples to illustrate how an optical system may be designed by varying the reference field.
For imaging (or Fresnel) systems the effect of the sinc function in Eq. ( 21) will be to attenuate higher spatial frequencies more than lower ones.Therefore in order to examine how different spatial frequencies are transmitted through the complete 'opto-numeric' system we now present an example involving a signal with two spatial frequency components: f x1 = 12.5 lines/mm and f x2 = 100 lines/mm.
We set α i = 0.1 mm and choose an ABCD A = [1, 2/25, 0, 1], which again corresponds to a Fresnel transform.Our camera extent 2L = 12 mm and sampling rate, T = 10 µm are identical to those used in the previous example and we choose a 100% filter factor, setting FF = 1.In Figure 7 we present the results.
Eq. ( 23) predicts that when D = 1 we expect to see two peaks at k = f x1 and k = f x2 , each having a finite spread (extent) since αc ≈ 3.2 lines/mm.The sinc function will attenuate all of the contribution that exists at k = f x2 , however since in general αc = 0, some of the power associated with the f x2 component of the signal will still be present, leading to some minor distortion of the f x1 component in the reconstructed input signal in Figure 7(b).We have compared the reconstruction in Figure 7(b) with the actual input signal using an RMSE metric, yielding a high value of RMSE ≈ 3.16.
We now modify our optical system by changing our reference field from a plane wave to a diverging spherical wave, defined: The field recorded by our camera, u * ref (x)u(x), can be accounted for, using Eq. ( 6).The original system parameters are thus changed in the following manner (A similar approach is adopted in [46]) Setting z S = 150 mm, the new system parameters, ABCD A , become [1, 2/25, -20/3, 7/15].
We first note that since A and B are invariant under this transformation, SE will not change when we replace our plane wave reference source with a spherical one.In one sense this is not unexpected, since the same part of the scattered object field is captured by the camera face.Nevertheless the new spherical reference will effect the captured spatial frequency distribution, since the effective values of both C and D have been altered.Referring to back Eq. ( 23), and noting that D = 7/15, we expect that our spatial frequency peaks, previously located at k = f x1 and k = f x2 , will be shifted to lower frequencies as can be seen by comparing Figure 7 and 8. Also from Eq. ( 24) we expect that the width of the distributions will be reduced; i.e. αc ≈ 1.8 lines/mm.The fact that these peaks are no longer located at their original spatial frequencies reflects the fact that we are filtering in a mixed spatial-spatial frequency domain.Thus the action of the finite size pixel is now less severe.This is reflected in the reduced RMSE value, RMSE = 1.27, for the resulting reconstruction.
We now choose z S = 2/25, i.e. we locate the source of our spherical reference beam the same distance from the camera as our object field.For this special distance we find that ABCD A becomes [1, 2/25, -12.5, 0].Since D = 0, Eq. ( 23) implies that all spatial frequency components become bunched together at k = 0, while αc ≈ 1.97 lines/mm.This system is referred to as a lens-less Fourier optical setup [8] and we note the similarity (apart from scaling factors) of the distributions in Figure 9.With this system there is a FT relationship between the image field and the captured field.Since the spatial frequency content has been shifted into the center (towards zero) there is no appreciable filtering effect (due to the finite size pixels) and we obtain a good reconstruction of our object field with RMSE ≈ 0.00019.In this sub-section we have chosen a relatively small value for our input field extent, namely 4α i ≈ 0.4 mm.From Figure 9(a) we can see that the sinc function is approximately flat over the range |k| < 4 lines/mm and does not attenuate ũ(k) appreciably.As our input extent increases, so too will the extent of ũ(k).The two domains are related by k = X/(λz S ).Thus we see that the effect of the finite pixel size for this optical system acts to attenuate the complex amplitude of the reconstructed field.
We conclude this section by making several remarks.The effect of the pixel extent, and the associated convolution operation, is to attenuate the spatial frequency content in the FT plane of our capture LCT domain.Indeed signal power located at k = n/(2γ) is entirely removed, where n is a positive integer [10,47].If it were possible to design a camera by combining pixels of different widths, it may be possible to overcome this limitation [10], although other fundamental limits do exist [47].Assuming FF = 1, the first spatial frequency suppressed would be k = 1/(2γ) = 1/T = 2 f NQ .
For a Fresnel system, D = 1.This would seem to imply, Eq. ( 23), that we can recover spatial frequencies up to twice the Nyquist limit.For values in the range 2/(2γ) ≤ k ≤ 3/(2γ), the sinc function in Eq. ( 21) is negative returning a signal that is phase shifted by π.As shown by designing the system using lenses and free space we can vary D, and recover spatial frequencies much higher than this limit.One example of such systems are lens-less Fourier systems.
Finally we note that in a unit magnification imaging system, D = 1; implying that we should be able to recover frequencies above the Nyquist limit without any additional processing along the lines outlined in [48]- [51].As we shall see in the next section, sampling comes into play in this case limiting the recoverable spatial frequencies in a unit magnification imaging system to f NQ .

Sampling
It is necessary now to address the role of sampling in DH.
We have already noted that for the generalized holographic systems our input function will be sampled in a mixed space-spatial frequency domain.Let us refer to the field after filtering by both the camera pixels and aperture as v(x), such that with the help of Eq. ( 2), we may rewrite Hence our filtered field is sampled at intervals of T. We process this data by applying an inverse LCT operation numerically to generate our continuous reconstructed signal, We note that where R(X) is the signal that would be returned if no sampling took place.Appropriate general parameters for performing our inverse numerical LCT are given by We note from this result that the effect of sampling is to produce an infinite number of modulated replicas of our continuous function R(X), which are spatially separated from each other by a distance λB N /T.Provided that the spatial extent of our input field is less than the distance between neighboring replicas, i.e.
then it is possible to avoid aliasing and to recover our signal R(X) from its replicas [13,27].We also note that each replica is multiplied by linear and constant phase terms.
We now return to Figure 5 and Figure 6.Once again we apply the hypothesis that the quality of the reconstructed signal is closely related to the power incident on the CCD.We note that as the incident power decreases, it corresponds to an increase in the calculated RMSE values.One region differs significantly in the two figures namely when B < 100 mm.We expect from Eqs. ( 18), (20) and Eq. ( 25) that a high quality reconstruction is possible, however the RMSE values indicate that the reconstruction in this region is poor.The results presented in these figures assumed λ = 633 nm, T = 10 µm and α i = 2 mm.From Eq. ( 29) we expect the onset of aliasing to occur for B < 126 mm.Therefore the increase in RMSE values shown in Fig. 6 is related to replicas encroaching on our reconstructed signal and distorting it.There is however a slow onset of this aliasing effect in the results presented.Based on Eq. ( 32), aliasing should start when B < 126 mm, however a noticeable increase does not occur until B < 100 mm, this is due to the particular nature of the chosen signal.To clarify we refer back to and note that while aliasing may be occurring, initially it only effects the regions of the reconstruction that have low amplitude and hence does not contribute significantly to increasing the RMSE values until B ≈ 100 mm.
Suppose now that our initial LCT transform performs as an ideal unit magnification imaging operation, in this case ABCD A = [1, 0, 0, 1], and which on substitution into Eq.( 30) yields the following result [note we have dropped the constant phase term exp −jπλB N D N n 2 /T 2 for simplicity], There are several points to note about this result: (i) R N (X) is a continuous function and therefore returns a value for all X, (ii) Since B N = 0 we do not see the appearance of an infinite number of replicas in the reconstruction, indeed this is a special case of Eq. ( 32), (iii) In general for a given value of X, R (X) is multiplied by an infinite number of different cosine values, which are dependent on both T and X.We can ensure that the cosine terms do not effect the recovery of our signal R(X) by carefully choosing our values for T and X, i.e., if where m is an integer.Hence we see that we can only recover the signal at particular values of X that are integer multiples of T, which appears equivalent to the Nyquist sampling conditions.
To close this section we note that while sampling is an important performance limiting factor in DH imaging systems it does not necessarily limit the the spatial frequencies that can be reconstructed.Since the sampling operation occurs in a non-imaging plane more general sampling conditions apply, and therefore in many instances the field at the camera plane can be sampled at rates much lower than the traditional Nyquist rate and yet still be fully recovered.

3D Imaging Performance -Point spread function analysis
Thus far we have examined the ability of our system to recover the spatial frequency content of our input signal.We now wish to examine the imaging performance of the holographic system as a function of depth.A practical example of interest might be imaging small particles that lie within a volume as depicted in Fig. 1 of Ref. [52] or in Fig. 10 of [53].It would seem to be useful to introduce a PSF to aid our analysis.Ideally the model we have been developing should provide some insight into the expected form of the PSF.We motivate this idea by considering some standard descriptions of diffraction-limited optical imaging systems; see for example Chapter 24 of Ref. [54] and Ref. [55].One description of the imaging process [54,55], first calculates the image formed using geometric optics, and then accounts for the effects of diffraction, by convolving this geometrical image with a PSF.
Alternatively one can analyze the behavior of a 4-f imaging system (where the limiting aperture is situated in the Fourier plane) by decomposing the input field into its Fourier components [54].These spatial frequencies are then mapped to Dirac delta functions in the Fourier plane.Only those spatial frequencies that lie within the aperture window can contribute to the final image and thus the degrading effects of diffraction can be interpreted as a spatial frequency filtering operation.
As the width of the aperture increases, more spatial frequencies are passed and the width of the associated convolving PSF narrows, improving the resolving capability of the system, i.e., the quality of the output image.Thus the width of the PSF is directly related to the spatial frequencies that pass through the Fourier plane aperture, see Section 6.2 in Ref. [5].We note that this latter viewpoint (frequency decomposition) is similar to the approach taken thus far.We see from our analysis that in a Fresnel system the power associated with different spatial frequencies is localized and confined to particular optical "channels" that tend to walk off from the optical axis, see also Section 2 of Ref. [56].If a sufficient amount of power associated with a particular spatial frequency in the signal is not intercepted by the camera aperture we cannot recover that spatial frequency, see Section 2.2 above.
Setting f x in Eq. ( 9) to zero and allowing the variable α i to limit to zero, Eq. ( 9) reduces to a Point Source (PS) located on the optical axis.The resulting field, u ps (x), incident on the camera can be expressed as This field is first convolved with the camera pixel function where √ π/λz and z = B/D.We note that Eq. ( 36) is in fact a scaled Fresnel transform integral over a distance z , with a Fresnel number F = γ 2 /λz .This number provides some insight into the form of the distribution described by Eq. (36).For example when F > 1, we are in the "Fresnel" regime and the distribution in Eq. ( 36) tends to be more structured, see for example Section 4.5.1 in Ref [5] and in particular Fig. 4.15 (note that the axes are scaled).Often the Fresnel numbers that are encountered in DH systems are F 1, due to the small size of the camera pixel compared with z .Under these conditions we can, to a good approximation, replace Eq. (37) with the Fraunhofer approximation to give An important case when the Fraunhofer approximation cannot be made includes imaging systems where B → 0 and F → ∞.In the following subsections we examine the effect these approximations have on the system's PSF.

Fraunhofer approximation, F 1
Here we examine the implications for our PSF analysis when the Fraunhofer approximation is valid, i.e. when F 1. Following the convolution operation the field ũps (x) is sampled, apertured and subject to an inverse LCT operation with the parameters ABCD N .We may write the resulting PSF as Choosing we can "numerically refocus" our digital hologram by varying z n f , i.e., by using the Fresnel transform to propagate to parallel nearby planes.Making use of Eqs. ( 2), (37), and (39), and simplifying the resulting expression we can write Eq. ( 38) as where and We note from Eq. ( 40), that the sampling operation again produces an infinite number of replicas in the reconstruction plane separated from each other by λB N /T as well as a linear and constant phase terms.
If we set z n f = 0, the expression in Eq. ( 41) reduces to a scaled FT operation.The form of the resulting PSF distribution depends on the interaction of the width of the sinc function and the camera aperture size.The first null of the sinc function occurs when X = λz /γ.Hence provided that we can assume that the limits of integration in Eq. ( 41) are imposed by the sinc function, rather than the camera aperture and therefore that the imaging system performance is limited by the camera pixels.Setting L → ∞ in Eq. ( 41), and performing the scaled FT we see that h(x) is given by a scaled form of the original camera pixel function.
When z n f = 0, we must take into account the quadratic phase term in Eq. ( 41).This is equivalent to performing an LCT operation on the sinc function.For small values of z n f we expect that h(x) will have a form similar to that of diffraction from a rectangular aperture (1-D slit), i.e. similar to the mathematical form of Eq. (36).
We now examine the case when the camera aperture acts to limit the region of integration of Eq. ( 41).This implies that the sinc function should not vary appreciably over the range −L ≤ X ≤ L. This will approximately hold if the following inequality is satisfied Again for simplicity we initially assume that z n f = 0. Eq. ( 41) can then be expressed as which has the more familiar PSF form, a sinc function.When z n f = 0, Eq. ( 45) will tend to broaden and its peak amplitude will decrease.

Numerical examples
In the previous section we discussed how the mathematical form of the PSF is dependent on the particular DH system.
Here we explore this further by examining how the PSF varies as a function of position for different DH systems.We do this by considering the reconstruction of a PS located at two different locations for three separate system configurations (i) Fresnel, (ii) lens-less Fourier and an (iii) optical FT system.The properties of these different systems are summarized in Table 1.Examining the parameters for ABCD N we note that the location of the replicas vary as a function z nf for systems (i) and (ii), while for system (iii) the locations of the replicas in the reconstruction domain is a function of the lens focal length only.The optical system parameters ABCD A , are more important for determining the performance of the different systems as these variables define both SE and BW.
Keeping the same CCD parameters as before, i.e.L = 6 mm, T = 10 µm, FF = 1, we first examine the reconstruction for system (i).

Fresnel PSF
Here we examine the form of the PSF when the PS is located at z ps = 7/500 and then at z ps = 8/125.Subbing the values into (i) Fresnel (ii) Lens-less Fourier (iii) Optical FT Eq. ( 42) we find that in the first instance L ≈ 0.9 mm, while when the PS is located further away from the CCD, L ≈ 4 mm.We present the plots for h(X, z n f ) in Figure 10(a) and (b) respectively.Since L = 6 mm, Eq. ( 42) holds when z ps = 7/500 and so we should expect a rectangular shaped PSF.For the second PS, Eq. ( 42) only approximately holds and we note that the PSF is shaped similar to a flat-topped sinc function.Nevertheless for both PS locations the limiting effect of the pixels dominate the imaging performance.In Figure 10(b) we plot the longitudinal PSF distribution which is symmetric about z = 0.For both PS the distribution is similar.Closer inspection reveals that the peak amplitude for |h(X, z n f )| does not occur when z n f = 0.This implies that the most sharply focused plane does not lie at the correct focus but rather two planes either side of the contributing PS.To confirm this we also plot h(X, z n f ≈ 58 µm) as a dotted distribution in Figure 10(a).While the peak amplitude is higher at this value for z n f , the width of the distribution is also wider.

Lensless Fourier PSF
From Section 2.3 we recall that by appropriately designing our optical system we can vary the value for D. Turning our attention to system (ii) we choose a value of z r = 7/500, see Table 1.Again we examine the distribution that results when the PS is located at z ps = 7/500 and then at z ps = 8/125.With these parameters D is given by zero and -25/7 respectively.This indicates that for the first PS location, spatial frequencies will not be effected by the convolution action of the pixels resulting in a dramatically improved form for our PSF, see the solid gray distributions in Figure 11(a) and (b).From Eq. ( 44) we find that for the first PS location the camera aperture acts as the limiting performance factor.For the second PS, Eq. ( 42) applies indicating that the finite pixel size limits the imaging performance of the system which can be verified by examining the dashed black distributions in Figure 11(a) and (b).We further note that the PSF for the second PS location is approximately rectangular and 3 times wider than in Figure 10(a), in keeping with Eq. (43).Once again the peak amplitude for |h(X, z n f )| does not occur when z n f = 0.

Optical FT PSF
Finally we turn our attention to our optical FT system.We choose a converging lens of focal length z f = 100 mm and refer the reader to Table 1 for the system parameters.We first note that the value of B is not a function of the location of the PS rather it is determined solely by the focal length of the lens.Neither of the inequalities, Eq.'s ( 42) and ( 44) are satisfied in this instance, hence in this region both the finite size of the pixels and the camera aperture effect the PSF.From Table 1, we recall that B does not vary as function of z ps and so for this particular system, the differences arise due to the filtering action of the camera pixels.Interestingly when the PS is located further from the CCD it gives a smaller value for D [see Eq. ( 23) and Eq. ( 25)], resulting in a narrower distribution, see the dashed line in Figure 12(a) and (b).In this latter instance we note that more power is contained in the side-lobes of the distribution.We attribute these differences to the effect of the pixel filtering operation.

Special Case: Complex ABCD A parameters for imaging architectures
In this section we wish to demonstrate how the analysis can be extended to include Gaussian type apertures in our optical systems, by examining a particular case namely the PSF for an imaging system.Previously we have assumed that the ABCD A are real and implicitly that the lens and other optical elements in the system are of infinite extent.Therefore a PS in the object plane would be mapped to a corresponding PS in the image plane and registered as a single bright spot on the center camera pixel, [we note for an imaging system, B = 0 and hence the role of Eq. ( 35) needs to be considered more carefully, see the definition in Eq. ( 1)].Previous analyses of these types of systems [57], assume that the limiting aperture arises due to the finite extent of other optical elements and not the camera aperture.We now show that an analogous result can be derived using the approach discussed in [44,45].Consider a 4-f unit magnification imaging system with a limiting Gaussian aperture located in the Fourier plane, i.e.
If the effective width of the PSF, 4α ds , can be resolved (in a Nyquist sense) by the sampling rate of the camera then the limits on resolution are imposed by the optical components used to image the input field.Should, 4α ds , be smaller than the pixel spacing then the sampling rate and finite pixel size will both act to limit the maximum spatial frequency that can be recovered [58].

Conclusion
In this manuscript we have developed a model which simplifies the analysis of generalized holographic imaging systems that include as special cases the familiar Fresnel, lensless Fourier and imaging architectures.We began by choosing a relatively straight-forward mathematical expression as the input function that could be used to conveniently model the propagation of different spatial frequency components through the holographic imaging system, see Figure 1.Using this function both the finite extent and spatial frequency content can be varied and their effects on the final reconstruction of the signal estimated using simple but quite general rules of thumb.It was shown that the power associated with a specific spatial frequency component can be visualized as traveling along definite "channels" centered at x = ±λ f x B of spatial extent 4α c .For a reasonable reconstruction we require that the power contained in these "channels" lie within the finite extent of the camera, ±L.This condition is ensured when Eq. ( 20) is satisfied, although quite reasonable reconstructions can be obtained for the less restrictive condition given in Eq. (18).A feature of a generalized DH system is that the path of these "channels" can be controlled using a combination of lenses and sections of free space by fixing the optical system parameters ABCD A , using Eqs.( 4)- (8).
The averaging effect of the finite size pixels were shown to limit the system performance by an amount related to the system parameter D and αc .As a guideline one should ensure that the bandwidth of the signal, BW, is less than 1/T, see Eq. ( 23) and (25).It was shown that the value of D could be varied by changing the curvature of the reference wavefield and this was explored in Section 2.3.
The discrete nature of CCD cameras has significant implications for the resulting reconstructed DH field.The effect of the sampling operation is to produce an infinite series of replicas in the reconstruction plane that are separated from each other by a distance λB/T.Provided that the extent of our input field is finite and does not exceed the separation between replicas we can perfectly reconstruct our signal.This result is a generalization of the commonly quoted Nyquist criterion, which implies that it is possible to recover spatial frequencies far greater than the Nyquist sampling limit provided that Eq. ( 32) holds.We note that in the model presented here, it has been assumed that the all optical elements are of infinite extent, this limitation may be overcome by modeling apertures using a Gaussian function [44,45] and as discussed in Section 3.3.
If a priori knowledge is available regarding the extent and spatial frequency content of an input signal, the main results from Section 2, i.e., Eq. ( 18), Eq. ( 25) and Eq. ( 32), can be used to optimally design an optical system ABCD A in order to image that signal.
In order to examine the 3D imaging performance of generalized DH systems in Section 3 we introduced a Point Spread Function (PSF) analysis.It was shown that the system's PSF was determined by the parameters ABCD A .In general the finite extent of the camera and pixels both contribute to the shape of the system's PSF, although two specific regimes were identified where one of these effects is almost completely dominant, see Eq. ( 42) and Eq.(44).It was shown that the longitudinal and lateral widths of the PSF were dependent on the distance of the point source with respect to the camera, see Table 1 and Eq. ( 42) and Eq.(44).Some implications of this result were discussed with the aid of some numerical examples in Section 3.2.Since the form of PSF is dependent on the parameters ABCD A , one cannot image over a 3D volume with the same detail, as when using a scanning imaging system.A similar conclusion was previously reached by Kou and Sheppard [43].In Section 3.3 the analysis was extended to include soft Gaussian apertures.This approach can be applied more generally along with the results in both Section 2 and 3 to deal with lossy systems, see for example [44].
These algorithms map N x × N y samples in the capture domain to M x × M y samples in the reconstruction domain.The effect of discretizing R N (X) in Eq.( 29) produces a periodicity of the sampled camera distribution.Although no information is destroyed by using these fast algorithms care needs to be taken when implementing them so that other sampling rules are not violated, [27,31,59,60].
In closing we would like to mention several publications that came to our attention after this manuscript was accepted.
First, an early publication by Yaroslavskii and Merzlyakov, [61] and a more recent contribution (see Chap. 3), again from Yaroslavskii, [62].In Ref. [63], Hao et.al. propose a slightly different model of interaction with the camera, swapping the order of the convolution and the aperture, but their model outputs identical sample values to that used in this paper.

D
FIG. 2 (a) u(x) with the following parameters, λ = 633 nm, ABCD A = [1, 0.5, 0 ,1], α i = 2 mm, and f x = 10 lines/mm.SE indicates the spatial extent of the field, u(x), while ±λB f x indicates the location of u fx (x).(b) |R N (X)| is calculated from a series of values of u(x) sampled over the range −L ≤ x ≤ +L at intervals of T = 10 µm.The values of the |u(x)| axis has been multiplied by a factor √ α i for presentation purposes while both the x and X axes are plotted in mm.
FIG. 6 Contour plot of RMSE with the following parameters, λ = 633 nm, L = 6 mm and α i = 2 mm, as a function of 50 ≤ B ≤ 300 mm and 1 ≤ f x ≤ 85 lines/mm.In this plot a reduced range is presented: 0 ≤ RMSE ≤ 4.

DFIG. 7
FIG. 7 Plot of (a) | ũ(k)|, over the range 0 ≤ k ≤ 50 lines/mm with λ = 633 nm, α i = 0.1 mm, FF = 1, f x1 ≈ 5.8 lines/mm and f x2 ≈ 46.7 lines/mm and (b) the resulting reconstruction in the image plane.The values of the | ũ(k)| and |R N (X)| axes have been multiplied by factors of 10 and 0.5 √ α i respectively for presentation purposes, while the units for the k and X axes are lines/mm and mm respectively.

DFIG. 9
FIG. 9 Plot of (a) | ũ(k)|, over the range 0 ≤ k ≤ 4 lines/mm with λ = 633 nm, α i = 0.1 mm, FF = 1, and (b) the resulting reconstruction in the image plane.The values of the | ũ(k)| and |R N (X)| axes have been multiplied by factors of 10 and 0.5 √ α i respectively for presentation purposes, while the units for the k and X axes are lines/mm and mm respectively.