Quantitative characterization of super-resolution infrared imaging based on time-varying focal plane coding

High-resolution imaging has long been a goal of infrared imaging systems. In this paper, a super-resolution infrared imaging method using a time-varying coded mask is proposed, based on focal plane coding and compressed sensing theory. The basic idea is to place a coded mask on the focal plane of the optical system and to sample the same scene repeatedly using a time-varying coding strategy; the super-resolution image is then reconstructed by a sparse optimization algorithm. The simulation results are evaluated quantitatively using the Peak Signal-to-Noise Ratio (PSNR) and the Modulation Transfer Function (MTF), which illustrate the effect of the compressed measurement coefficient r and the coded mask resolution m on the reconstructed image quality. The results show that the proposed method effectively improves infrared imaging quality, which will be helpful for the practical design of new types of high-resolution infrared imaging systems.


INTRODUCTION
Owing to the limitations of infrared sensor processing technology, infrared detectors generally have focal plane arrays with fewer and larger pixels than visible-light detectors, and so cannot satisfy military and civilian requirements for high-resolution imaging [1,2]. Therefore, it is necessary to introduce new concepts, theories or imaging mechanisms to design new types of infrared imaging systems.
Compressive sensing (CS) exploits the sparsity and compressibility of an image in some transform domain, so that fewer measurements are needed than in conventional imaging, yet the image can be reconstructed with minimal loss of information [3,4]. Focal plane coding (FPC) is a means of intelligent sensing and mapping of optical pixels that enables efficient and faithful digital image reconstruction. By combining CS and FPC in an infrared camera, a high-resolution image may be obtained from a few measured samples. Many researchers have studied how to improve optical imaging resolution based on these two ideas. Andrew D. Portnoy et al. [5,6] use focal plane coding to produce non-degenerate data between multiple apertures, and the sub-aperture data are then integrated to form a single high-resolution image. A compressive imaging system using multiplexed and multi-channel measurements with static focal plane coding is described in Ref. [7]; it demonstrates that the system can achieve up to 50% compression with conventional benchmark images. The most noteworthy application of compressed sensing is the single-pixel camera designed by Baraniuk's team in Ref. [8], which acquires a recognizable image with a resolution comparable to N pixels. Inspired by the single-pixel camera, CS has been used for passive millimeter-wave imaging to significantly reduce the imaging time and produce high-fidelity images in Ref. [9]. Refs. [10,11] present a novel approach for improving infrared imaging resolution by the use of CS: the image sensor measures compressed samples of the observed image through a coded aperture mask placed on the focal plane of the optical system, the same scene is sampled repeatedly using multiplexing technology, and the image is then reconstructed from these samples using an optimization algorithm. Inspired by these results, we propose a super-resolution infrared imaging method based on time-varying focal plane coding, and introduce the MTF for quantitative evaluation of image quality.
This paper is organized as follows. Section 2 briefly describes the basic theory of CS. In Section 3 we present a super-resolution infrared imaging method based on time-varying focal plane coding, including the coding sampling strategy and its corresponding reconstruction method. Experimental results and quantitative evaluation are given in Section 4. Section 5 concludes this paper.

COMPRESSED SENSING
The basic idea of CS theory is that when the image of interest is sparse or highly compressible in some basis (i.e., most basis coefficients are small or zero-valued), relatively few well-chosen observations suffice to reconstruct the most significant nonzero components. In particular, judicious selection of the type of image transformation introduced by measurement systems may dramatically improve our ability to extract high-quality images from a limited number of measurements. In this section we review the intuition and theory underlying these ideas. By designing optical sensors to collect measurements of a scene according to CS theory, we can use computational methods to infer critical scene structure and content.
The CS sampling model is described as
$$ y = \Phi f = \Phi \Psi x = \Theta x, \tag{1} $$
where $f \in \mathbb{R}^N$ is an unknown signal that can be sparsely represented as $f = \Psi x$ in an orthonormal basis $\Psi$. If $x$ has only $K$ ($K \ll N$) non-zero components, $f$ is said to be $K$-sparse. $\Phi$ denotes an $M \times N$ ($M \ll N$) matrix called the measurement matrix, $\Theta = \Phi\Psi$ is the sensing matrix and $y \in \mathbb{R}^M$ is the observation vector. CS theory indicates that, subject to a Restricted Isometry Property (RIP) condition [12,13] on the sensing matrix $\Theta$, the high-dimensional vector $f$ can be recovered from the much lower-dimensional observation $y$ with probability close to 1.
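The sampling model can be illustrated numerically. The following is a minimal sketch, not the authors' simulation code: the sizes N, M and K, the random seed and the helper name `dct_basis` are assumptions chosen for illustration, and the DCT is used as the orthonormal basis Ψ only because the simulation section names it as the sparsifying transform.

```python
import numpy as np

def dct_basis(n):
    """Return Psi whose columns are orthonormal DCT-II basis vectors, so f = Psi @ x."""
    k = np.arange(n)[:, None]          # frequency index (rows of the DCT-II matrix)
    i = np.arange(n)[None, :]          # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)            # rescale the DC row so that d is orthonormal
    return d.T                         # Psi = d^T, hence Psi^T Psi = I

rng = np.random.default_rng(0)         # fixed seed, illustrative only
N, M, K = 64, 24, 4                    # hypothetical sizes with K << M << N

Psi = dct_basis(N)                     # orthonormal basis
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)  # K-sparse coefficients
f = Psi @ x                            # signal, f = Psi x

Phi = rng.standard_normal((M, N))      # Gaussian measurement matrix
Theta = Phi @ Psi                      # sensing matrix, Theta = Phi Psi
y = Phi @ f                            # M observations of the N-dimensional signal
```

Here `y` has only M entries although `f` has N, and `y` equals `Theta @ x` up to rounding, which is the identity that the recovery step relies on.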
The sensing matrix $\Theta$ satisfies the RIP of order $3K$ if there exists a constant $\delta_{3K} \in (0, 1/3)$ such that
$$ (1 - \delta_{3K}) \|z\|_2^2 \le \|\Theta_T z\|_2^2 \le (1 + \delta_{3K}) \|z\|_2^2 \tag{2} $$
holds for all $z \in \mathbb{R}^{|T|}$ and all subsets $T \subset \{1, 2, \ldots, N\}$ with $|T| \le 3K$, where $\Theta_T$ is the submatrix obtained by retaining the columns of $\Theta$ corresponding to the indices in $T$. Intriguingly, many kinds of random matrices satisfy the RIP with high probability. A related condition, referred to as incoherence, requires that the rows $\{\phi_j\}$ of the measurement matrix $\Phi$ cannot sparsely represent the columns $\{\psi_i\}$ of the orthonormal basis $\Psi$ (and vice versa). Refs. [14]-[17] proved that if $\Phi$ is a Gaussian random matrix, a random binary Bernoulli matrix, a partial Fourier matrix, a partial Hadamard matrix, a Toeplitz matrix and so on, $\Theta$ satisfies the RIP with high probability. Finally, solving the $\ell_1$ optimization problem described in Eq. (3) yields a high-precision estimate $\hat{f} = \Psi\hat{x}$ of the original signal $f$:
$$ \hat{x} = \arg\min_x \|x\|_1 \quad \text{subject to} \quad y = \Theta x, \tag{3} $$
where $x \in \mathbb{R}^N$ is the sparse representation of the original signal $f$ and has only $K$ ($K \ll N$) non-zero components. The sensing matrix $\Theta$ is an $M \times N$ matrix and $y \in \mathbb{R}^M$ is an incomplete observation vector, whose dimension $M$ is much smaller than the dimension $N$ of $x$ ($M \ll N$). Many optimization methods are available for signal reconstruction, such as Basis Pursuit (BP), Matching Pursuit (MP), Total Variation (TV) and Gradient Projection for Sparse Reconstruction (GPSR) [18].
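The near-isometry expressed by the RIP can be probed empirically for a Gaussian matrix. The sketch below is a Monte Carlo illustration only: it samples random supports T and random vectors z, so it does not certify the RIP (which would require checking every support). The sizes, the 1/sqrt(M) column scaling and the function name `isometry_ratios` are assumptions made for this example.

```python
import numpy as np

def isometry_ratios(Theta, sparsity, n_trials=500, seed=0):
    """Sample ||Theta_T z||^2 / ||z||^2 over random supports T with |T| = sparsity."""
    rng = np.random.default_rng(seed)
    n_cols = Theta.shape[1]
    ratios = np.empty(n_trials)
    for t in range(n_trials):
        T = rng.choice(n_cols, size=sparsity, replace=False)  # random column subset
        z = rng.standard_normal(sparsity)                      # random coefficients on T
        ratios[t] = np.sum((Theta[:, T] @ z) ** 2) / np.sum(z ** 2)
    return ratios

rng = np.random.default_rng(1)
M, N, K = 128, 512, 8                                # hypothetical sizes
Theta = rng.standard_normal((M, N)) / np.sqrt(M)     # scaled so E||Theta z||^2 = ||z||^2
ratios = isometry_ratios(Theta, sparsity=3 * K)

# Largest sampled deviation from 1; only a lower bound on the true RIP constant.
delta_sampled = max(1.0 - ratios.min(), ratios.max() - 1.0)
```

The sampled ratios cluster around 1, which is the behaviour the RIP formalizes; the true constant over all supports can only be larger than `delta_sampled`.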

SUPER-RESOLUTION INFRARED IMAGING MODEL
For a typical infrared imaging system, the pixel pitch of the infrared sensor ranges from 15 µm to 40 µm [19], which is larger than that of visible-light counterparts and cannot satisfy the resolution requirements of many scientific experiments. To improve infrared imaging quality, we design a time-varying coded mask based on CS and place it immediately in front of the focal plane of the optical system. A super-resolution image can then be obtained from a small number of samples by applying a reconstruction algorithm.
Figure 1 shows the time-varying focal plane coding imaging model at one moment. In conventional infrared imaging, the resolution is determined by the detector pixel size (d × d) of the focal plane array. Let the angular resolution be iFov and the scene sampling distance be H. When an m × m coded mask sub-array is used for each pixel together with a reconstruction algorithm, the final imaging resolution depends on the element size of the coded mask rather than on the pixel size of the focal plane array. Theoretically, the angular resolution of the reconstructed image becomes α = iFov/m and its scene sampling distance becomes ∆h = H/m, so the infrared imaging resolution is improved by a factor of m.
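As a small worked example of the relations α = iFov/m and ∆h = H/m, the sketch below evaluates them for one set of values. The numbers (an iFov of 1.0 mrad, a sampling distance H of 2.0 m and m = 4) and the function name are hypothetical and are not taken from the paper.

```python
def reconstructed_resolution(ifov, sampling_distance, m):
    """Return (alpha, delta_h) for an m x m coded mask sub-array per detector pixel."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    alpha = ifov / m                     # angular resolution of the reconstructed image
    delta_h = sampling_distance / m      # scene sampling distance of the reconstructed image
    return alpha, delta_h

# Hypothetical values: iFov = 1.0 mrad, H = 2.0 m, 4 x 4 mask elements per pixel.
alpha, delta_h = reconstructed_resolution(1.0, 2.0, 4)
```

With these inputs the theoretical angular resolution is a quarter of the detector's iFov and the sampling distance a quarter of H, matching the m-fold improvement stated in the text.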
According to CS theory, the coded mask is designed using a Gaussian random matrix with normal distribution N(0, 1) as the measurement matrix. As illustrated in Figure 2(a), a 12 × 12 coded mask array corresponds to 3 × 3 pixels on the sensor, and every pixel is masked with the same 4 × 4 sub-array pattern. Ref. [20] showed that a multi-value mask is superior to a conventional binary mask in an imaging system. Thus, a multi-value time-varying coded mask is adopted in our infrared imaging system, in which the elements "0" and "1" denote that the light is fully blocked or fully passed, and a value between 0 and 1 denotes that the light is partly blocked. White areas let the light pass through the mask, while black areas block it. Each measurement matrix corresponds to one compressed measurement, so a time-varying coding pattern yields multiple compressed measurements of the original scene. Figure 2(b) shows the variation of a single pixel's focal plane coding mask with time t. At any given moment, the sub-arrays of all pixels' coded masks are the same Gaussian random matrix, while different random matrices, drawn from the same Gaussian distribution, are used at different moments. Compared with the spatial multiplexing technology used to obtain multiple samples, the time-varying technique in this paper significantly reduces the volume and mass of an infrared imaging system.
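The mask layout described above (one shared sub-array per moment, a fresh random sub-array at each moment) can be sketched as follows. This is an illustrative construction under stated assumptions: the function name, seed and frame count are hypothetical, the entries are drawn directly from N(0, 1), and any mapping of those values onto physical transmittances between 0 and 1 is left out because the text does not specify it.

```python
import numpy as np

def time_varying_masks(n_det, m, n_frames, seed=0):
    """Build n_frames coded masks; within a frame every detector pixel shares one m x m sub-array."""
    rng = np.random.default_rng(seed)
    sub = rng.standard_normal((n_frames, m, m))   # one Gaussian sub-array per moment
    masks = np.tile(sub, (1, n_det, n_det))        # repeat it over the n_det x n_det pixels
    return sub, masks

# Geometry of Figure 2(a): 3 x 3 detector pixels, 4 x 4 sub-array, hence a 12 x 12 mask.
sub, masks = time_varying_masks(n_det=3, m=4, n_frames=5)
```

Each frame of `masks` is a 12 × 12 array made of nine identical 4 × 4 blocks, and the block changes from one frame to the next.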
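Suppose that $\Delta t$ is the time step between successive coded mask patterns, $M$ is the number of measurements and $t$ is the total detection time of all compressed measurements. The compressed measurement coefficient $r$ is defined as the ratio of $M$ to the coded mask resolution ($m \times m$), with $r \in [0, 1]$. For fixed $m$ and $\Delta t$, the total detection time $t$ increases with $r$. The mathematical form is given as follows:
$$ r = \frac{M}{m \times m}, \tag{4} $$
$$ t = M \, \Delta t = r \, m^2 \, \Delta t. \tag{5} $$
From Eq. (4) and Eq. (5), $M$ measurements are obtained after time $t$, so each detector pixel acquires a compressed measurement vector $y$ with $M$ elements. This process can be written as
$$ y_i = \phi_i f, \quad i = 1, 2, \ldots, M, \tag{6} $$
and its corresponding matrix form is
$$ y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_M \end{bmatrix} f = \Phi f, \tag{7} $$
where $\phi_i$ ($i = 1, 2, \ldots, M$) denotes the $1 \times (m \times m)$ row vector obtained by reshaping the measurement matrix of the $i$th compressed measurement, and $f$ is the column vector of the light field intensity within one pixel area. The same reconstruction process is performed independently for every pixel. As a trade-off between computational time and recovery accuracy, the Gradient Projection for Sparse Reconstruction (GPSR) algorithm is used for image reconstruction. Eq. (3) can be formulated as an optimization problem whose objective function combines $\ell_2$ and $\ell_1$ terms, as described in Eq. (8):
$$ \min_x \; \frac{1}{2} \|y - \Theta x\|_2^2 + \tau \|x\|_1, \tag{8} $$
where $\tau$ is a nonnegative parameter, set to $\tau = 0.02 \, \|\Theta^T y\|_\infty$ in this paper. Formally, by introducing vectors $u$ and $v$, we make the substitution
$$ x = u - v, \quad u \ge 0, \quad v \ge 0. \tag{9} $$
These relationships are satisfied by $u_i = (x_i)_+$ and $v_i = (-x_i)_+$ ($i = 1, 2, \ldots, n$), where $n$ is the length of $x$ and $(\cdot)_+$ denotes the positive-part operator defined as $(x)_+ = \max\{0, x\}$. Eq. (8) can then be rewritten as the following standard bound-constrained quadratic program (BCQP):
$$ \min_z \; c^T z + \frac{1}{2} z^T B z \equiv F(z) \quad \text{subject to} \quad z \ge 0, \tag{10} $$
with, following Ref. [18],
$$ z = \begin{bmatrix} u \\ v \end{bmatrix}, \quad b = \Theta^T y, \quad c = \tau \mathbf{1}_{2n} + \begin{bmatrix} -b \\ b \end{bmatrix}, \quad B = \begin{bmatrix} \Theta^T \Theta & -\Theta^T \Theta \\ -\Theta^T \Theta & \Theta^T \Theta \end{bmatrix}. $$
The two-step gradient projection method defines its iterate $z^{(k+1)}$ from the previous iterate $z^{(k)}$ as
$$ w^{(k)} = \left( z^{(k)} - \alpha^{(k)} \nabla F(z^{(k)}) \right)_+, \qquad z^{(k+1)} = z^{(k)} + \beta^{(k)} \left( w^{(k)} - z^{(k)} \right), \tag{11} $$
where $w^{(k)}$ is a temporary variable, $\alpha^{(k)} > 0$ and $\beta^{(k)} \in [0, 1]$. A detailed derivation and discussion can be found in Ref. [18].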
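The two-step iteration can be sketched for a single detector pixel as below. This is a simplified illustration, not the GPSR implementation of Ref. [18]: it uses a constant step α = 1/λmax(B) and β = 1 instead of the line searches discussed there, takes the sparsifying basis as the identity (so Θ = Φ), and uses hypothetical names, sizes and iteration count. Only the choice τ = 0.02‖Θᵀy‖∞ is taken from the text.

```python
import numpy as np

def gradient_projection_sketch(Theta, y, tau, n_iter=500):
    """Projected-gradient iteration on the BCQP form of the l2-l1 problem (fixed steps)."""
    n = Theta.shape[1]
    b = Theta.T @ y
    c = tau * np.ones(2 * n) + np.concatenate([-b, b])
    # lambda_max(B) = 2 * ||Theta||_2^2, so this constant step guarantees descent.
    alpha = 1.0 / (2.0 * np.linalg.norm(Theta, 2) ** 2)
    beta = 1.0                                   # full step towards the projected point
    z = np.zeros(2 * n)                          # z = [u; v], start from x = 0
    for _ in range(n_iter):
        u, v = z[:n], z[n:]
        g = Theta.T @ (Theta @ (u - v))          # Theta^T Theta (u - v)
        grad = c + np.concatenate([g, -g])       # gradient of F: c + B z
        w = np.maximum(z - alpha * grad, 0.0)    # projection onto z >= 0
        z = z + beta * (w - z)
    return z[:n] - z[n:]                         # x = u - v

def objective(Theta, y, x, tau):
    """Value of 0.5 * ||y - Theta x||_2^2 + tau * ||x||_1."""
    return 0.5 * np.sum((y - Theta @ x) ** 2) + tau * np.sum(np.abs(x))

rng = np.random.default_rng(0)
m, r = 8, 0.75                                   # 8 x 8 sub-array, hypothetical r
n = m * m
M = int(round(r * n))                            # number of compressed measurements
x_true = np.zeros(n)
x_true[rng.choice(n, size=5, replace=False)] = rng.uniform(1.0, 2.0, size=5)
Theta = rng.standard_normal((M, n))              # assumption: identity sparsifying basis
y = Theta @ x_true
tau = 0.02 * np.max(np.abs(Theta.T @ y))         # parameter choice stated in the text
x_hat = gradient_projection_sketch(Theta, y, tau)
```

A constant step keeps the sketch short and makes every iteration decrease the objective; adaptive choices of α and β, as in Ref. [18], typically converge in fewer iterations.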

SIMULATION RESULTS
To verify the feasibility of the proposed super-resolution imaging method based on time-varying focal plane coding, a tank image of 512 × 512 pixels is taken as the original scene. The Discrete Cosine Transform (DCT) is used to form a sparse representation, and a 32 × 32 detector array is adopted to sample the original scene, so that each detector element covers 16 × 16 pixels of the original scene.
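The sampling geometry can be made concrete with a short sketch of conventional imaging without a coded mask, in which each detector element integrates its 16 × 16 block of the scene. Modelling the integration as a block average is an assumption, as are the function name and the synthetic ramp used in place of the tank image.

```python
import numpy as np

def detector_sample(scene, n_det):
    """Average the scene over the block seen by each element of an n_det x n_det detector."""
    rows, cols = scene.shape
    if rows % n_det or cols % n_det:
        raise ValueError("scene size must be a multiple of the detector size")
    br, bc = rows // n_det, cols // n_det        # scene pixels covered per detector element
    return scene.reshape(n_det, br, n_det, bc).mean(axis=(1, 3))

# Synthetic 512 x 512 scene (a ramp), standing in for the tank image.
scene = np.arange(512 * 512, dtype=float).reshape(512, 512)
low_res = detector_sample(scene, 32)             # each element covers 16 x 16 scene pixels
```

The result is the 32 × 32 image that a detector without a coded mask would deliver, which serves as the baseline against which reconstructions are compared.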

Reconstruction of tank image
First, the effect of the compressed measurement coefficient r on the quality of the reconstructed image is discussed. An 8 × 8 coded mask sub-array is assumed for each detector pixel, and the reconstructed images for r = 0.8, 0.4 and 0.2 are shown in Figures 3(b), (c) and (d), respectively. The resolution of the reconstructed image is 256 × 256 pixels. The PSNR is used to evaluate the variation of image quality with the coefficient r; the relationship between the PSNR and r is illustrated in Figure 4. It can be seen that the reconstructed image quality improves significantly as r becomes larger, and the largest PSNR reaches 30 dB. However, for a fixed coded mask, an increase in the compressive measurement ratio r leads to a longer sampling time.
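The quantities used in this comparison can be computed as in the sketch below. The paper does not state its PSNR formula, so the common definition with a peak value of 255 (8-bit images) is assumed; rounding r·m² to the nearest integer to obtain the number of measurements is likewise an assumption, and both function names are hypothetical.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB, assuming the standard 10*log10(peak^2 / MSE) form."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def num_measurements(r, m):
    """Number of compressed measurements M for coefficient r and an m x m sub-array."""
    return int(round(r * m * m))                 # rounding rule is an assumption

# Measurement counts for the 8 x 8 sub-array and the three values of r used above.
counts = [num_measurements(r, 8) for r in (0.8, 0.4, 0.2)]

# Toy check of the PSNR helper: a uniform error of one grey level.
toy_psnr = psnr(np.zeros((4, 4)), np.ones((4, 4)))
```

Because M grows in proportion to r for a fixed mask, the PSNR gain at larger r is paid for directly in sampling time.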
Next, the coefficient r = 0.75 is held fixed and the coded mask resolution m is varied. It can be observed that the reconstructed image quality improves greatly, and more details of the original scene are recovered as m becomes larger. To quantitatively assess the quality of the reconstructed image, the MTF is introduced.

Quantitative evaluation of image quality based on MTF
The MTF is one of the key indicators characterizing the signal transfer characteristics of an imaging system as a function of spatial frequency in terms of linear response theory [21]. Various methods have been proposed to determine the MTF of an imaging system; in this paper, the slanted-edge measurement method is used [22]. The schematic of the MTF calculation by the slanted-edge method is shown in Figure 6.
The MTF gives a deeper and more objective description of resolution; it falls from one towards zero as the spatial frequency increases. In this paper, spatial frequency is measured in cycles per pixel, taking one dark and one light line as a cycle. To connect the MTF with the usual limiting resolution, the spatial frequency at which the MTF falls to 0.05, the minimum contrast ratio perceptible to the human eye, is taken as the limiting resolution.
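The chain from an edge profile to an MTF curve and a limiting resolution can be sketched in one dimension as below. This is a simplified illustration, not the full slanted-edge procedure of Ref. [22]: it omits the edge-angle estimation and the projection of pixels onto an oversampled edge spread function (ESF), and it starts from a synthetic ESF with a Gaussian line spread function of assumed width. The function names and the test profile are hypothetical; the 0.05 threshold is the one stated in the text.

```python
import numpy as np

def mtf_from_edge(esf):
    """Differentiate the ESF to get the LSF, then take the normalized FFT magnitude."""
    lsf = np.diff(np.asarray(esf, dtype=float))  # line spread function
    spectrum = np.abs(np.fft.rfft(lsf))
    mtf = spectrum / spectrum[0]                 # normalize so that MTF(0) = 1
    freqs = np.fft.rfftfreq(lsf.size, d=1.0)     # spatial frequency in cycles per pixel
    return freqs, mtf

def limiting_resolution(freqs, mtf, threshold=0.05):
    """First sampled frequency at which the MTF drops below the threshold, or None."""
    below = np.nonzero(mtf < threshold)[0]
    return float(freqs[below[0]]) if below.size else None

# Synthetic edge: cumulative sum of a Gaussian LSF with sigma = 2 pixels (assumed).
sigma = 2.0
x = np.arange(256)
esf = np.cumsum(np.exp(-0.5 * ((x - 128) / sigma) ** 2))

freqs, mtf = mtf_from_edge(esf)
f_lim = limiting_resolution(freqs, mtf)
```

For this Gaussian test profile the computed curve follows the analytic form exp(−2π²σ²f²), which offers a convenient sanity check before the same functions are applied to measured edge profiles.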
A slanted-edge image of 1024 × 1024 pixels is taken as the original scene. With m = 8 held fixed, the original scene is sampled for r = 0.25, r = 0.50 and r = 0.75, and the corresponding images are reconstructed. In the case of conventional imaging without a coded mask, the sampled image is seriously degraded, and the image contrast falls off quickly as the spatial frequency increases. For r = 0.25, the contrast of the recovered image is well reproduced compared with the downsampled image. The quality of the reconstructed image improves further for r = 0.50 and r = 0.75. For r = 0.25, r = 0.50 and r = 0.75, the limiting resolutions are 0.775 c/p, 0.797 c/p and 0.825 c/p, and the MTF values at the Nyquist frequency are 0.1121, 0.2008 and 0.2175, respectively, as listed in Table 1. For a fixed m, the quality of the reconstructed image improves as r increases. Both the limiting resolution and the MTF at the Nyquist frequency improve significantly compared with conventional imaging without a coded mask.

CONCLUSION
We have presented a super-resolution infrared imaging method based on time-varying focal plane coding for increasing image resolution. This method can obtain high-quality images using a low-resolution detector array. The time-varying focal plane coding scheme makes it possible to acquire enough compressed measurements with one detector, and thereby reduces the volume and mass of an infrared imaging system. Moreover, because the coded mask is not fixed, the system can be adapted to many applications by changing the coding strategy. Both the PSNR and the MTF have been introduced to quantitatively evaluate the reconstructed image quality, and the simulation results indicate that the proposed method recovers more information about the original scene and improves image quality significantly compared with conventional imaging without a coded mask. Optical experiments based on the proposed method will be an important part of our further research. The practical exposure time will be determined in the next step. For non-steady objects, the exposure time will be shorter and the Signal-to-Noise Ratio (SNR) will be reduced.

FIG. 4 PSNR value as a function of r for different reconstructed images.
With r = 0.75 held fixed, we set m = 4, m = 8 and m = 16, i.e., 4 × 4, 8 × 8 and 16 × 16 coded masks are used for each pixel to sample the original scene. The corresponding reconstructed images are shown in Figures 5(b), (c) and (d), respectively. Figure 5(a) is a downsampled image obtained using a 32 × 32 detector array without a coded mask.

The results for different m are listed in Table 2. For a fixed r, the quality of the reconstructed image improves as m increases. Both the limiting resolution and the MTF at the Nyquist frequency improve significantly compared with conventional imaging without a coded mask.

TABLE 2 Image quality characterization for different m