Truncation strategy for the series expressions in the advanced ENZ-theory of diffraction integrals

The advanced ENZ-theory of diffraction integrals, as published recently in J. Europ. Opt. Soc. Rap. Public. 8, 13044 (2013), presents the diffraction integrals per Zernike term in the form of doubly infinite series. These double series involve, aside from an overall azimuthal factor, the products of Jinc functions for the radial dependence and structural quantities $c_t$ that depend on the optical parameters of the optical system (such as NA and refractive indices) and the defocus value. The products in the double series have coefficients that are related to Clebsch-Gordan coefficients and that depend on the order of the Jinc function and the index $t$ of the structural quantity, as well as on the azimuthal order and degree of the involved Zernike term. In addition, the structural quantities themselves are also given in the form of doubly infinite series. In this paper, we give truncation rules for the various infinite series depending on specified required accuracy.


Abstract.
The advanced ENZ-theory of diffraction integrals, as published recently in J. Europ. Opt. Soc. Rap. Public. 8, 13044 (2013), presents the diffraction integrals per Zernike term in the form of doubly infinite series. These double series involve, aside from an overall azimuthal factor, the products of Jinc functions Jinc h for the radial dependence and structural quantities c t that depend on the optical parameters of the optical system (such as NA and refractive indices) and the defocus value. The products in the double series have coefficients that are related to Clebsch-Gordan coefficients and that depend on the order h of the Jinc function and the index t of the structural quantity, as well as on the azimuthal order m and degree n of the involved Zernike term Z m n . The structural quantities themselves are also given in the

Introduction and overview
The advanced ENZ-theory of diffraction integrals, as presented in [1], aims at the computation of the Debye approximation of the Rayleigh integral for the optical point-spread functions of radially symmetric optical systems that range from as basic as having low NA and small defocus value to advanced high-NA systems, with vector fields and polarization, that are meant for imaging of extended objects into a multilayer structure. As in the classical Nijboer-Zernike theory, the generalized pupil function is developed into a series of Zernike terms. This gives rise to diffraction integrals per Zernike term that are expressed in [1] as doubly infinite series In Eq. (1), m and n are the azimuthal order and degree of the involved Zernike term Z m n , the c t = c t (OS, f ) are the Zernike coefficients of the radially symmetric front factor composed of an algebraic factor comprising the parameters of the optical system and a factor comprising the defocus parameter f , the J h+1 (2πr)/2πr are Jinc functions whose order h has the same parity as m with argument 2πr where r is the value of the radial parameter, and the A are to Clebsch-Gordan coefficients related numbers. In [1], Eq. (59), there occurs a slightly more general expression, in which the vectorial nature and polarization conditions are accounted for, leading to 5 series expressions involving an integer j, |j| = 0, 1, 2 , of which Eq. (1) is the case j = 0. We shall not consider this generalization, since for truncation matters all these 5 cases behave the same. Furthermore, in the low-NA, small-defocus case, where a scalar treatment is allowed, the only required diffraction integral is the one with j = 0.
The A-coefficients in the double series in Eq. (1) have attractive properties with respect to their size and the set of h, t for which they are non-vanishing. The main effort in getting truncation rules goes therefore into bounding Jinc functions Jinc h and structural quantities c t . The Jinc functions are directly given in terms of Bessel functions while the structural quantities involve products of spherical Bessel and Hankel functions evaluated at f /2 and f /2v 0 , respectively, where v 0 , 0 < v 0 < 1, is a quantity determined by the optical system. Now it is a fact that (spherical) Bessel functions, considered as a function of the order, are of constant magnitude as long as the order is less than the value of the argument. Beyond this point a super exponential decay as a function of order takes place. The situation for the structural quantities is somewhat complicated by the occurrence of the Hankel functions (causing decay to slow down to exponential for t beyond |f |/2v 0 ). These observations are basic to the approach taken in this paper and lead to the general ruleof-thumb that it suffices to include in Eq. (1) all terms h, t with 0 ≤ h ≤ H, 0 ≤ t ≤ T in which H is slightly larger than 2πr and T is slightly larger than |f |/2. It is the aim of this paper to give a more precise meaning to this ruleof-thumb, in which the required absolute accuracy is included. Furthermore, by taking advantage of the (m, n)-dependent support properties of the Acoefficients, it is possible to formulate a truncation rule per Zernike term Z m n that achieves a particular accuracy with substantially less terms than when the general rule were used.
We shall do this in all detail for the diffraction integral I = I V M of [1], Sec. 8, which is meant for systems with high NA, vector fields and magnification. Explicitly, I assumes the form where are the algebraic, focal, polynomial and Bessel function factor, respectively. Here s 0 is the NA in image space, s 0,M is built from the refractive indices in image and object space and the magnification factor in object space according to [1], Eq. (31), and u 0 = 1 − 1 − s 2 0 . The I V M -case is with respect to truncation issues quite representative for all diffraction integrals considered in [1], except for the case of I V M M L in [1], Sec. 9, with backward propagating waves in a layer of the multilayer structure in image space. The I V M -case is also general enough to illustrate the various intricacies that come with the computation of the Zernike coefficients c t , the structural quantities, of the front factor a(ρ) f (ρ), see [1], Sec. 4, requiring truncation rules as well.
In Sec. 2 we consider rules for the truncation of the double series in Eq. (1) for the I V M -case for which we use bounds on the Jinc functions and on the structural quantities that follow from Debye's asymptotics for Bessel functions. In Sec. 3 we consider the truncation issues associated with the computation of the structural quantities. In Sec. 4 the whole computation scheme and the truncation rules are summarized. In Sec. 5 we illustrate the performance of the truncation rules by plotting actually achieved accuracy and computation times against required accuracy. In Sec. 6 we present our conclusions. In Appendix A we present basic properties of ϕ-functions that arise in bounding the (spherical) Bessel and Hankel functions using Debye's asymptotics. The results of Appendix A are used in Appendix B and C where we develop bounds on Jinc functions and structural quantities. In Appendix D we present some proofs concerning the validity of the truncation rules. In Appendix E we present a number of results containing the computation and asymptotics for the Zernike coefficients of the algebraic factors that occur in the I V M -case.
2 Truncation rules for the double series for I V M

Double series for I V M and truncation strategy
We have as in Eq. (1), where c t are the Zernike coefficients of the front factor a(ρ) f (ρ), with a(ρ) and f (ρ) as in Eqs. (3)(4) so that Our approach to get truncation rules for the double series uses the following observations. The coefficients A are all non-negative and bounded by 1 and satisfy other boundedness properties such as In Subsec. 2.2 we give bounds on the Jinc functions J h+1 (2πr)/2πr and the coefficients c t that show rapid decay after h = 2πr and t = 1 2 |f |, respectively. For values of absolute accuracy ε that are relevant in the optical practice, the double series in Eq. (6) is truncated at values h = H and t = T where both the Jinc functions and the coefficients have reached their plunge ranges. Accordingly, the absolute truncation error in approximating I V M in Eq. (6) by is safely bounded by where S m n is the set of all h, t such that A 0mm 2t,n,h = 0. In the general truncation rule, the dependence on n and m of the supporting set S m n is totally ignored and the functions bounding Jinc h+1 and c t are replaced by simple functions allowing convenient determination of set points H and T for which is below a specified ε > 0.
In the dedicated rule, we use a more careful approximation of the bounding functions, and we include explicitly the supporting set S m n . It thus appears that an inspection of the product of the approximated bounding functions along the boundary ∂ S m n of the supporting set in the (h, 2t)-plane produces numbers H = H m n and T = T m n such that the quantity in Eq. (10) is below a specified ε > 0.
The bound in Eq. (14) is valid for all h ≥ 0, except for a small range of h's near 2πr with r → ∞. In fact, Eq. (14) is valid for all r ≥ 0 and h ≤ 2, it is valid within a factor of 2 for all r ≥ 0 and all h ≤ 175, it is valid within a factor of 4 for all r ≥ 0 and all h ≤ 11194, and so on. Of course, we also have the general bound |J h+1 (2πr)/2πr| ≤ 1 2 . In Figure 1a, we show log 10 |J h+1 (2πr)/2πr| as a function of h, 0 ≤ h ≤ 150, for r = 0.1, 1 and 10, respectively. It can be seen that there is rapid decay from h + 1 = 2πr = 0.63, 6.28 and 62.83, respectively onwards. For the case that r = R = 10, we have plotted in Figure 1b both log 10 |J h+1 (2πr)/2πr| and the bound log 10 [ exp {−ϕ(h + 1; 2πR)}/2π 2 R √ R ], see Eq. (14). The (asymptotic) maximum of log 10 |J h+1 (2πr)/2πr| can be found from Appendix B and equals −2.5609, assumed at h = 58.67 when r = 10. At this point h, the upper bound log 10 [1/2π 2 R √ R] = −2.7953 is slightly lower than the asymptotic maximum. We have also shown in Fig. 1b  For the structural quantities c t a similar result holds. In Appendix C the following is shown. let f be a real number, and set Then where Here it has been assumed that s 0 ≥ s 0,M . In the case that s 0,M > s 0 , we should replace s 0 in Eqs. (17-18) by s 0,M and change the right-hand side of Eq. (16) accordingly. The value of a 0 is in almost all cases well approximated by (midpoint rule or Simpson rule for integration over x = ρ 2 ). The bound in Eq. (16) is shown in Appendix C using a somewhat heuristic approach so as to arrive at manageable expressions. As with the bound in Eq. (14) there are small exceptional ranges of t near 1 2 g and g → ∞, where Eq. (16) holds safe for a factor that grows to infinity very slowly as g → ∞.
In Figure 3, we show the graph of v 0 , as given in Eq. (18), against s 0 , 0 ≤ s 0 ≤ 1. The asymptotic decay of c t is Cv t 0 , and so there is rapid decay of c t for all s 0 until s 0 = 0.95 (with v 0 = 0.5241), and even cases like s 0 = 0.99 are still practicable.

General truncation rule
In Appendix A the functions ϕ(h + 1 ; 2πR) and ϕ(t ; g/2) − ϕ(t ; g/2v 0 ) are bounded from below by piecewise linear functions according to and where respectively. This leads to the following general truncation rule: Let 0 < ε < 1, and let Then the quantity in Eq. (11) is less than ε when See Appendix D for a proof. By observing that we can write T and H in Eq. (24) as where for 0 < γ ≤ 1 we have given precision to the rule-of-thumb that the truncation points should be chosen somewhat larger than 1 2 |f | and 2πr, respectively.
In Figure 4 we depict, for given n and m such that n−|m| is even an nonnegative, the set S m n in the (h, 2t)-plane that contains all non-zero coefficients A 0mm 2t,n,h (S m n is the convex hull of those points (h, 2t)). The boundary ∂ S m n of S m n consists of 4 line segments I, II, III, IV in accordance with the conditions, see [1], Sec. 5, We consider the function F (h, t) of Eq. (27) along ∂ S m n with continuous t ≥ 0, h ≥ 0. We have that F (h, t) is non-negative and increasing and convex in both h and t, and We let B as in Subsec. 2.3, and we let with H gen and T gen from Subsec 2.3. From the monotonicity and convexity properties of F , we then get, see Appendix D, -when M ≤ B, there are two points (h 1 , 2t 1 ) and (h 2 , 2t 2 ) ∈ ∂ S m n such that for any ( The dedicated truncation rule becomes then as follows. Determine . With H and T defined this way, we have that the quantity in Eq. (10) is less than ε.
By the monotonicity and convexity properties of F , the minimum M of F along ∂ S m n is assumed on edge II. Hence, it is sufficient to inspect F along this edge to find M.
The actual variables h, t are non-negative integer, and this should be accounted for. We intersect ∂ S m n with the box (h, 2t), h ≤Ĥ − 1 or t ≤T , whereĤ − 1 is the smallest integer of same parity as n withĤ ≥ H gen and T is the smallest integer withT ≥ T gen . In case that we find 0 or 1 point (h, 2t) in the intersection, the inspection is a trivial matter. In the case that we find two intersection points, we let the inspection start at the point with largest value of h and lowest values of 2t, and we end the inspection at or before the point with lowest value of h and largest value of 2t, following the boundary curve counterclockwise with points (h, 2t), integer h and t and h same parity as n.
3 Computation of structural quantities and truncation issues

Series expressions for structural quantities
We consider in this section computation of the Zernike coefficients of the front factor a(ρ) f (ρ), with a(ρ) and f (ρ) given in Eqs. (3)(4). We make a slight variation of the approach in [1], Sec. 4 and 8, in that we write and we use linearization coefficients A to write where A 000 2l,2k,2t a l b k .
The reason for moving a factor 1 − s 2 0 ρ 2 from the focal factor f (ρ) to the algebraic factor a(ρ) is the fact that this yields the most convenient expression for the expansion coefficients b k , viz.
Here j k and h (2) k are the spherical Bessel and Hankel functions of order k, given as with J ν , Y ν and H (2) ν the Bessel function of first, second and third kind (Hankel function) and of order ν, see [2], Ch. 10. The quantities b k can be computed, via Eqs. (39-40) using MatLab routines, efficiently at any desired accuracy.
As to the coefficients a l , we first write, see Eq. (3), Next, either term on the right-hand side of Eq. (41) is developed into a power series where the coefficients r N are computed recursively according to [1], Eqs. (37-39) and [1], Eq. (106). Finally, the Zernike coefficients a l,αβ are computed from r N according to with b N (l) given explicitly and computed recursively in [1], Eqs. (41-44).

Truncation and accuracy issues
The accuracy by which the c t must be computed is dictated by the absolute accuracy ε in the truncation analysis of Sec. 2 that involves the products of c t 's and Jinc functions J h+1 (2πr)/2πr as in Eqs. (10-11). Now |J h+1 (z)/z| ≤ 1/2 for z ≥ 0. Hence, when c t is computed with absolute accuracy ε, and the truncation rules of Subsecs. 2.3-2.4 are used with ε/2 instead of ε, a final absolute accuracy better than ε results.
Next, given integers L, K > 0, the absolute error due to approximating c t of Eq. (37) by is, as in Eqs. (9-10), safely bounded by max l>L or k>K Now there are the bounds The second bound in Eq. (46) follows from Appendix C, Eq. (C18), while the first bound is obtained by considering in Appendix E, Eq. (E1) the worst case l = 0 with s 0 = 0 and s 0,M close to 1. Hence, when ε ∈ (0, 1), we have that the quantity in Eq. (45) is less than ε when L and K are such that According to Appendix C we have and this is less than 3 16 ε when with γ as in Eq. (22). The quantities b k are computed using Eq. (38), involving the spherical Bessel and Hankel functions j k and h (2) k that can be computed using Matlab routines. From Appendix C we have that where the first inequality holds for all f and the second inequality only holds when |f /v 0 | ≥ 1. In the case that |f /v 0 | < 1, the b k of Eq. (38) is best evaluated using the power series representations of j k and h 92) k that follow from [2], 10.53. Thus it follows that b k is computed with absolute accuracy 3ε/16 for k = 0, 1, · · · , K when j k (f /2) and h respectively.
As to the first condition in Eq. (47), we consider the decomposition of a(ρ) 1 − s 2 0 ρ 2 in terms a αβ (ρ) as in Eq. (42) with α + β = 0 and Zernike coefficients a l,αβ as in Eq. (43). In Appendix E the following is shown. Let δ = |α| = |β|, and let S = max(s α , s β ). Denoting "the R 0 2l -coefficient of where Furthermore, the right-hand side of Eq. (52) is less than η := ε/8 when Therefore, the first condition in Eq. (47) is satisfied when L is the maximum of the two numbers that occur at the right-hand side of Eq. (54) for the choices δ = 3/4, 1/4 (where evidently δ = 3/4 yields the largest value of the two). We finally address the issue of truncating the series in Eq. (43). It is shown in Appendix E that for a given ε > 0 and an integer L > 0 such that all |a l,αβ | < 1 8 ε when l > L, we have that all numbers a l,αβ , l = 0, 1, ..., L , are computed with absolute accuracy ε/16 when the infinite series in Eq. (43) is Figure 5, we show log 10 |a 0,αβ − To summarize, for ε ∈ (0, 1) we replace c t by c t,LK given in Eq. (44) in which -L and K are given by the right-hand sides of Eq. (54) and Eq. (47), respectively, -b k is as in Eq. (38) for which j k (f /2) and h This results into an absolute error in c t bounded by ε + 1 2 ε + 1 4 ε = 7 4 ε, due to respectively, truncating the double series over l and k, approximating b k by computing j k and h (2) k using the Matlab-code, and approximating a l by truncating the series for the two a α,β,l .

Summary of the computation scheme and truncation rules
For integer n and m such that n − |m| is even and non-negative, consider where with given real f , r > 0 and s 0 , s 0,M ∈ [0, 1), and where u 0 = 1 − 1 − s 2 0 . There is the double series representation with summation over h, t = 0, 1, ... and h same parity as n and m. In Eq. (59), we have A 0mm 2t,n,h = (h + 1) in terms of the Clebsch-Gordan coefficients in | | 2 of [2], Chap. 34; the A's are considered in detail in [1], Sec. 5 and Appendix C. Furthermore, the c t are the Zernike coefficients of the front factor a(ρ) f (ρ), so that The c t have a double series representation where the a l are the Zernike coefficients of A(ρ) = a(ρ) 1 − s 2 0 ρ 2 , so that and the A 000 2l,2k,2t are related to Clebsch-Gordan coefficients as in Eq. (60). The b k are given as with j k and h The a l are computed by first writing and then expanding both terms a αβ (ρ) = (1 − s 2 α ρ 2 ) α (1 − s 2 β ρ 2 ) β at the righthand side of Eq. (67) into a power series and subsequently into a Zernike series according to The r N,αβ in Eq. (68) are computed recursively according to for N = 0, 1, ... . The a l,αβ are computed from the r N,αβ according to where the b N (l) are given by and can be computed recursively according to [1], Eqs. (42-44).

Truncating the double series for I
We consider replacing the double series for I in Eq. (59) by where H and T are to be chosen such that the absolute approximation error is less than ε ∈ (0, 1). Let R = max(1/2π, r), and let g = max(1, |f |). Furthermore, let In Eq. (73) and in the definitions of v 0 in Eq. (66) and of w 0 above, we need to replace s 0 by s 0,M when s 0,M > s 0 .

General truncation rule
The absolute approximation error is less than ε, simultaneously for all n and m, when where γ = min(1, ln(1/v 0 )).

Dedicated truncation rule
For c > 0 and x ≥ 0, define and let for h ≥ 0 and t ≥ 0 where for t ≥ 0 with γ 0 = ln(1/v 0 ) and v 0 given in Eq. (66). Let n and m be integers such that n − |m| is even and non-negative. The set S m n in the (h, 2t)-plane containing all non-zero coefficients A 0mm 2t,n,h in the double series in Eq. (59) is given by the constraints The convex hull of this set S m n has a boundary ∂ S m n which is a curve consisting of 4, possibly degenerate, line segments, listed in counterclockwise order as with H gen and T gen as in Eq. (75). The absolute approximation error is less than ε when H = H m n and T = T m n in Eq. (72) are chosen as follows.
Case M > B. Set Case M ≤ B. Follow the boundary curve counterclockwise through points (h, 2t) with integer t and integer h such that h − n is even, starting at the point (h, 2t) on edge I or II with lowest value of h such that h + 1 ≥ H gen and ending at the point (h, 2t) on edge II, III or IV with lowest value of t such that t ≥ T gen . Let (h 1 , t 1 ) be the first point found in this process for which

Truncation issues in computing c t
For t = 0, 1, · · · and 0 < ε < 1, the quantity approximates c t with absolute error less than ε when L and K are such that With γ = min(1, ln(1/v 0 )), the second item in Eq. (84) holds when Subsequently, let S = max(s 0 , s 0,M ), and set Then the first item in Eq. (84) is valid when Furthermore, when the a l and b k required in Eq. (83) are available with absolute accuracy 1 4 ε and 3 16 ε, respectively, while the K and L of Eqs. (85, 87) are used in Eq. (83), all c t are approximated with absolute accuracy 2ε.
As to the availability of a l and b k for l = 0, ..., L and k = 0, ..., K with a required accuracy we give the following comments. The a l have the form a l = a l,3/4,−3/4 + a l,1/4,−1/4 , and either term at the right-hand side of Eq. (88) is computed using the infinite series expression in Eq. (70). When this infinite series is trun- , the absolute error is for all l = 0, ..., L and either term at the right-hand side of Eq. (88) less than ε/16, and then all a l , l = 0, ..., L , are computed with absolute error less than ε/8. Finally, the b k are given by Eq. (65) in terms of spherical Bessel and Hankel functions, and can therefore be computed to any desired accuracy using MatLab routines (employing the expressions for spherical Bessel and Hankel functions in terms of ordinary Bessel and Hankel functions, see [2], Sec. 10.47). When this is done with absolute accuracy 3 32 2 −7/4 εu 0 exp (−ϕ(K; g/2v 0 ))/(2K + 1)v 0 and 3εu 0 /64(2K + 1) for j k and h (2) k , respectively, the b k are computed for k = 0, 1, · · · , K with absolute accuracy 3ε/16. Using these approximations of a l and b k in Eq. (83) with K and L as in Eqs. (85, 87) yields an approximation of c t with absolute error less than 7 4 ε.

Accuracy of assembled scheme
Let ε > 0, and use either one of the truncation rules in Subsec. 4.1. Furthermore, compute c t as in Subsec. 4.2 with absolute accuracy 7 4 ε. Finally, compute the Bessel function J h+1 (2πr) with absolute accuracy 2πrε/4w 0 a 0 , with w 0 and a 0 given in Subsec. 4.1, using Matlab-codes. Then the quantity I in Eq. (59) is approximated with an absolute error that can be bounded by ε + 1 2 7 4 ε + ε = 23 8 ε, due to, respectively, truncation of the double series in Eq. (59), approximating c t as in Subsec. 4.2, and approximating the Jinc function J h+1 (2πr)/2πr by computing J h+1 using the Matlab-code.

Illustration of the truncation rules
In this section, we show the absolute truncation error and the computation time, using the general truncation rule of Subsec. 2.3 and the dedicated truncation rule of Subsec. 2.4 for approximation of the diffraction integral I in Eqs. (1-2) as a function of ε ∈ (0, 1) for a variety of radial values r, maximum defocus values f , numerical aperture values s 0 and s 0,M , and Zernike circle polynomial degrees and orders n and m. The truncation rules are used with ε/2 instead of ε. The structural quantities c t and Jinc functions J h+1 (2πr)/2πr are computed with absolute accuracies ε/2 and ε/16w 0 a 0 , respectively, so that the absolute error due to using these computed quantities is bounded by ε/2 for all n and m simultaneously. The total absolute error using the truncated series with the computed quantities is then expected to be less than 1 2 ε + 1 2 ε = ε. In all figures, we show achieved accuracy (a) and computation time (b) against requested accuracy ε in the range 10 −15 − 10 0 , using the general truncation rule (dashed lines) and the dedicated truncation rule (solid lines). The graphs result from specification of  is achieved amply: the graphs in (a) stay well below and parallel to the graph (ε, ε) (dotted lines). The performance of the dedicated rule in terms of accuracy is most of the time slightly worse but comparable to that of the general rule, while the performance in terms of computation time can be significantly better. The latter situation occurs especially when the degree and order of the radial polynomial are large compared to f /2 and 2πr.

Conclusions
We have formulated and verified truncation rules for the double series expressions that emerge from the advanced ENZ-theory for the computation of the optical diffraction integrals pertaining to optical systems with high NA, vector fields, polarization, and meant for imaging of extended objects. These rules have been devised for the central case j = 0 in the vectorial framework, which can be considered to be representative for all occurring diffraction integrals. Two versions of the truncation rule have been developed. The general rule gives precision to the rule-of-thumb that the required summation range is of the order 2πr times 1 2 |f | with r and f the values of the (normalized) radial and the focal parameters in image space, irrespective of the degree and order of the radial polynomial involved in the diffraction integral. In the dedicated rule, we have also accounted for the specific way the radial polynomial influences the actual summation range, leading to performances comparable in terms of accuracy and better in terms of computation time than what is offered by the general truncation rule. A salient feature of the double series that manifest itself through the truncation rules is that the computation times stay well within what can be considered practicable, more or less independently of the values of the aperture parameters and the magnitudes of the focal and radial variable. In the case that circle polynomials of very high degree and/or order are involved in the diffraction integrals, the general truncation rule becomes impracticable, and one has to resort to using the dedicated rule. With this full understanding of the double series with regard to truncation matters, it can be said that the advanced ENZ-theory is more or less completed.

A Results on ϕ-functions
In this appendix, we present results on the functions and where d > c > 0. In Eq. (A1), we have arccosh(y) = ln(y + y 2 − 1) = and this is a non-negative, non-decreasing function of y. Furthermore, with In the formulation of the general truncation rule, it has been used that one can find piecewise linear functions bounding ϕ(x ; c) and ψ(x ; c, d) from below. Furthermore, in the design of the dedicated truncation rule, it is convenient to have convex functions bounding ψ(x ; c, d) from below (since ϕ(x ; c) is itself convex, such an effort does not have to be made for ϕ).
By convexity of ϕ(x ; c), the graph of ϕ lies above any tangent line, and so for any x 0 > 0, we have For a linear lower bound on ψ(x ; c, d), one must choose x 0 ∈ (c, d) such that ψ ′ (x 0 ; c, d) ≤ ln(d/c), see Eq. (A8), and then where we have set γ = arccosh(x 0 /c). Choosing the largest possible x 0 ∈ (c, d), so that ψ ′ (x 0 ; c, d) = ln(d/c) =: γ 0 , we have Hence, for any γ ∈ (0, γ 0 ], we have Evidently, since ϕ(x ; c) ≥ ψ(x ; c, d), the latter bound is also valid for ϕ(x ; c), without a restriction on γ. The choice γ = 1 leads to The largest convex functon bounding ψ(x ; c, d) from below is given by We conclude this appendix by showing 3 inequalities. The first one of these reads when x ≥ 1 2 √ 13, and is required in Appendix B. We have by Eq. (A5) for and this is negative for all c ∈ (0, 1] when √ x 2 − 1 ≥ 3/2, i.e., when x ≥ √ 13. Since there is equality in Eq. (A18) when c = 1, we get the result.
In Appendix C, the inequality in Eq. (A20) is required for all x ≥ 0. We shall comment on this below.
We next show that for This is required in Appendix C with α = 1/2 and c ≥ 1/2. For x ≥ d, we have by Eqs. (A3-A4) and this is the required inequality.
The inequalities in Eqs. (A20, A28) are required in Appendix C for all x ≥ 0. Since ϕ(x + α ; c) and ϕ(x ; c) vanish when x + α ≤ c, we have that Eq. (A20) holds for all x ≥ 0, except perhaps when c − α ≤ x ≤ c. In this latter case, we have that ϕ(x ; c) = 0, and therefore the left-hand side of Eq. (A20) can be written as , and this minimum increases in c ′ . For the case that c ′ = c/α = 1, we find numerically the minimum value −0.109709667. Hence, for the case that α = 1/2, as considered in Appendix C, we are dealing with a minimum value of the whole left-hand side of Eq. (A20) of the order −0.05. This can safely be ignored, and so we declare Eq. (A20) to be valid for all x ≥ 0. A similar situation arises for the inequality in Eq. (A28) whose validity is ensured for x ≥ d, x ≤ c − α and c ≤ x ≤ d − α (in the latter case, the second term in [ ] in Eq. (A29) is non-positive, while the first term in [ ] is non-negative by Eq. (A20)). So we only need to consider c − α ≤ x ≤ c and d − α ≤ x ≤ d (these two x-intervals overlap when d − α ≤ c). The minimum value of the first term in [ ] in Eq. (A29) has been bounded from below by −0.109709667α. The second term can be written on d − α ≤ x ≤ d as . The function is convex, and so its maximum over [1, 1 + 1/d ′ ] occurs at v = 1, with value f (1) = 0, or at v = 1 + 1/d ′ , with value

C Bounding structural quantities
In this appendix, we bound and estimate the structural quantities c t required in Eq. (1). At this point, we are interested in a manageable bound that can be used to formulate transparent truncation rules. To achieve this, we argue somewhat heuristically. We make the observation that the algebraic factor a(ρ) is composed from functions (1 − s 2 ρ 2 ) δ with |δ| ≤ 3/4. Any such function can be written as where the latter function has the appearance of a focal factor with imaginary value of the normalized focal parameter f /u 0 of order unity. Moving a factor 1 − s 2 0 ρ 2 from the focal factor to the algebraic factor, see Eqs. (34-35), we are led to estimate the Zernike coefficients c t of a(ρ) f (ρ) by those of where g = max(1, |f |) and a 0 is the R 0 0 -coefficient of a(ρ) 1 − s 2 0 ρ 2 as in Eq. (17). Using the explicit form of the Zernike coefficients b t (g) of the modified focal factor, see Eqs. (4, 35, 38), we thus postulate for c t the bound a 0 |b t (g)| = a 0 2t + 1 u 0 g |j t (g/2)| |h (2) t (g/2v 0 )| .
For the sake of computation of the quantities b k in Eq. (38), involving the products of spherical Bessel and Hankel functions, with a specified accuracy, we note the bounds for k ≥ 0 The first bound follows from Eqs. (C4), (C9) and (A20) with α = 1/2 and c = g/2, and the second bound follows from Eqs. (C5), (C13) and (A27) with α = 1/2 and c = g/2v 0 . Since |j k (f /2)| ≤ 1, we may replace the argument g/2 in the first inequality in Eq. (C21) by f /2. In the second inequality, we can replace the argument g/2v 0 by f /2v 0 only when |f /v 0 | ≥ 1.

D Proof of validity of truncation rules
In this appendix, we give the proofs for the results in Subsecs. 2.3-2.4 on truncation rules. We first show that the quantity in Eq. (11) is less than ε ∈ (0, 1) when H and T are chosen according to Eq. (24) with B given in Eq. (23).
From Appendix A, we have for d ≥ c > 0 that where γ ≤ ln(d/c). Taking x = h + 1, c = 2πR, d = ec (so that γ ≤ 1), we get by taking γ = 1 The right-hand side of Eq. (D2) The right-hand side of Eq. (D4) exceeds B of Eq. (23) when t ≥ T , and then from Eq. (16) Since for all h ≥ 0, t ≥ 0 by Eqs. (14, 16) we find that when h + 1 ≥ H and/or t ≥ T . This means that the quantity in Eq. (11) is less than ε.
As to the dedicated truncation rule, we use continuity, monotonicity and convexity of F (h, t) as a function of both h and t, see Eqs. (27-28). It thus follows easily that the right-hand side of Eq. (30) is less than ε when h + 1 > H or t > T when H and T are chosen as H = H m n = max(h 1 , h 2 ) + 1, T = T m n = max(t 1 , t 2 ) (for the case that M in Eq. (31) ≤ B; otherwise we simply have H = 1, T = 0). Here the points (h 1 , t 1 ) and (h 2 , t 2 ) are found as the first and the last point (h, 2t) on ∂S m n with F (h, t) > B when inspecting the 4 line segments of the boundary ∂S m n in counterclockwise manner through integer h and t with h same parity as n. This means that with this choice of H and T the quantity in Eq. (10) is less than ε.
It also follows that F (h, t) increases along both edge I and edge IV in Fig. 1 when (h, 2t) → ∞. Also F (h, t) increases along edge III when t increases and h is kept fixed at |m|. Therefore, the minimum M in Eq. (31) is to be found on edge II. On this edge II, it follows from convexity of F that the minimum is attained on a set of points (n − 2t, 2t) with t in a closed interval contained in [0, 1 2 (n − |m|)] (which reduces to a single point t when F is strictly convex on edge II).

E Asymptotics, bounds and truncation issues for coefficients of algebraic functions
We consider in this appendix the (computation of the) Zernike coefficients of the modified algebraic function see Eqs. (3,41). This A is the sum of two functions We let for such an a αβ S = max(s α , s β ) , s = min(s α , s β ) , so that a αβ (ρ) = (1 − s 2 ρ 2 ) δ (1 − S 2 ρ 2 ) ∆ .
We still owe the reader a proof of the results in Eq. (E7, 11). As to Eq. (E7), we consider the general case in Eq. (E5). Setting x = ρ 2 , we have by Cauchy's formula with integration contour a circle of radius < 1/S 2 in positive sense. We choose principal values of the roots (1 − s 2 z) δ,∆ , and we deform the contour so that the positive real axis from the first branch point z = 1/S 2 onwards, passing along the second branch point z = 1/s 2 , to z = ∞ is enclosed. When N = 1, 2, ... and δ, ∆ > −1, δ + ∆ < 1, this can be done without problems.
as required.