Numerical analysis of a slit-groove diffraction problem

We present a comparison among several fully-vectorial methods applied to a basic scattering problem governed by the physics of the electromagnetic interaction between subwavelength apertures in a metal film. The modelled structure represents a slit-groove scattering problem in a silver film deposited on a glass substrate. The benchmarked methods, all of which use in-house developed software, include a broad range of fully-vectorial approaches from finite-element methods, volume-integral methods, and finite-difference time domain methods, to various types of modal methods based on different expansion techniques. [DOI: 10.2971/jeos.2007.07022]


INTRODUCTION
The electromagnetic properties of metal-dielectric interfaces have attracted a vast amount of research since the early works of Mie and Ritchie on small particles and flat interfaces.The ability of such structures to sustain electron oscillations, known as surface plasmon polaritons (SPPs), has been investigated both to understand the fundamental physics involved and for potential applications.More recently, the development of nanofabrication techniques has lead to a resurgence of interest in this field [1,2].This is partly because metaldielectric interfaces allow miniaturization of optical components to subwavelength dimensions, well below those offered by all-dielectric micro or nano systems.
Numerical computation plays an important role in the analysis of light scattered by subwavelength features in metallodielectric structures.In this work, we benchmark several fully-vectorial methods on a basic scattering problem governed by the physics of the electromagnetic interaction between subwavelength apertures in a metal film.The two-dimensional construct studied is depicted in Figure 1.It is composed of a slit and a groove etched into a silver film on a glass substrate.The separation distance between the centres of the slit and the groove is denoted by d in the following.The other parameters, such as the permittivity of silver and the metal film thickness, are given in the caption.The slit-groove construct is illuminated by a normally incident plane wave (TM polarization) with wavelength in vacuum λ = 852 nm.The choice of the specific geometry depicted in Figure 1 is motivated by the availability of experimental data [3], the involvement of several physical mechanisms [4]- [7] leading to an oscillatory behaviour of the transmitted power as a function of d, the difference between numerical solutions and experimental data [8] providing different interpretations for the mechanisms responsible for the interaction at the metallodielectric interface, and the discrepancies between numerical solutions obtained with finite-difference time domain (FDTD) and modal methods (MM) [9,10].Our objective is to under- 1.1700i, a value that is close to the permittivity of silver used in the experiment [3].
The construct is illuminated by a TM-polarized plane-wave, incident along the normal to the film from air (n = 1) at λ = 852 nm.The width of both groove and slit and the slit depth is 100 nm The metallic film is assumed to be deposited on a semi-infinite glass substrate (n = 1.45).In (a) the silver film thickness is 400 nm.
stand these discrepancies and to provide an overview of the state-of-art of the field of computational plasmonics.
Several groups in Europe and in the United States have been contacted to participate in the benchmark using their in-house developed software.In the following sections, three different numerical exercises are benchmarked in relation with the geometry of Figure 1: • Calculation of S/S 0 as a function of d, for 100 linearly spaced values starting from d = 100 nm to d = 10 µm with an increment of 100 nm.S is defined as the vertical flux of the Poynting vector along a 200 nm horizontal cross section (horizontal thick line in Figure 1) located 400 nm below the substrate-metal interface and centred with respect to the vertical slit axis.S 0 is the same quantity in the absence of the groove.The numerical S/S 0 values can be directly compared with the measured data in [3].
• Study of the convergence of S/S 0 versus a parameter related to the discretisation accuracy of the methods for d = 500 nm.This calculation is intended to compare the convergence performance of the different methods and to estimate the absolute accuracy of the computational results.
• Calculation of the total field on the air-silver interface (y = 0 + ) diffracted by a single groove, as shown in Figure 1b.This particular field, known as the scattered surface wave, gives rise to the oscillatory behaviour observed in the field reflected by the geometry depicted in Figure 1a [8].Special attention has been given to the singular fields at the groove edges, since singularities are known to be difficult to handle numerically [11].
In Table 1 we present the twelve different implementations of fully-vectorial methods that have been benchmarked.We have classified these implementations into five general categories: modal methods (MM), finite difference time domain (FDTD) methods, finite-element methods (FEM), and volume integral method (VIM).Also included is a newly proposed hybrid method (HYB) that relies on a combination of FEM and a-FMM.The methods represent a selection of the most popular numerical methods used nowadays in computational electrodynamics.Clearly, they do not cover all existing methods, but nevertheless the results of the benchmark may be used with confidence by other researchers in the field to test their in-house or commercial software.The remainder of this paper is organized as follows.A general description of all methods is presented in Section 2. The purpose of this section is to provide the reader with sufficient background on the methods in order to obtain a basic understanding of the numerical approaches.In Section 3 we present and compare the numerical results obtained by the different implementations of the methods.The implementations of these methods are described in Section 4. Detailed discussions of the methods are provided in the cited literature.We include, however, an extensive description of the present implementations and explain all specific modifications made in order to improve the performance for the considered geometry.We additionally discuss the discrepancies between the results of the different methods.CPU times and memory requirements are likewise provided in this section.We conclude this paper in Section 5 with a summary of the derived insights.

Aperiodic Fourier Modal Method (MM1, MM2, MM3)
The Fourier modal method (FMM), also called the Rigorous Coupled Wave Analysis, has been originally developed for the rigorous analysis of diffraction by gratings [12].The method, which relies on modal expansions for all the electromagnetic field components in a Fourier basis, is nowadays one of the most popular methods in grating analysis.For 2D problems like the present one, the FMM calculates the electric field parallel to the slits for the transverse electric case (TE), and the magnetic field for the present transverse magnetic case (TM).
The remaining magnetic and electric field components for the TE-and TM-case, respectively, are then obtained as derived from Maxwell's equations.The method solves the eigenproblem in every layer of the structure and then, through the use of S-matrix algorithms, reconstructs the field by matching the boundary conditions imposed by Maxwell's equation.The eigenmodes are obtained using standard linear algebra routines as the eigenvectors of the system matrices for the TE-and TM-case in the various layers of the configuration.To avoid slow convergence, the present implementations use the TMformulation of the eigenvalue-problem as discussed in [13]- [15].
The a-FMM is a generalisation of the FMM approach that allows the method to handle non-periodic situations by artificial periodisation via the introduction of a Perfectly Matched Layer (PML) [16] or of complex nonlinear coordinate transformations [17].The nonlinear transformation is applied to satisfy the outgoing wave condition in the transverse direction.As the total number N of Fourier harmonics retained for the computation increases (NxN also represents the size of the matrix to be diagonalised for calculating N modes), limitations may arise from memory requirements, or from the finite precision of the numerical calculations, which essentially rely on modes calculation and on a few matrix multiplications for matching the tangential field components at the interfaces.

Method of Lines (MM4)
The results labelled MM4 are obtained with the method of Lines (MoL) [18]- [21].The MoL is a bidirectional eigenmode method, where the eigenmodes are determined by means of discretisation in real space.For the particular structure the fields and the derivatives with respect to the x-coordinate were discretised with finite differences.In this way, the wave equation, which is a partial differential equation, is transformed into a system of coupled ordinary differential equations.By computing the eigenvalues and eigenvectors of the operator matrix, which corresponds to a determination of the eigenmodes, this system can be decoupled and solved.As the number N of lines used for the calculation increases (N also represents the number of modes), the accuracy of the computational results increase.Details can be found in [19].

Local Eigenmode-Modal method (MM5)
Conceptually, the MM5 [22] is very similar to the Fourier modal method.The main difference is that the fields are not expanded in a Fourier basis, but in the local eigenmode basis of each section, whose propagation constants are obtained by solving for the roots of a transcendental equation.An advantage compared to e.g. the a-FMM is that the field profiles of the basis functions incorporate more information about the refractive index of the structure to be modelled, and are moreover calculated with high accuracy for two-dimensional problems.However, drawback of this method is that sometimes modes can be missed when searching the complex plane for zeroes of the transcendental function.This is in contrast to the FMM, where all modes are found at the same time by solving for the eigenproblem.To find the roots of the transcendental equation for lossy materials, we first consider the solution of the lossless case.These roots are then easily found as lying on the coordinate axes of the complex plane.Then, the loss in the system is gradually increased to the desired value and the location of the modes is tracked in the complex plane.For the current benchmark, the modes trace out a complex path with many crossings, making it possible that sometimes one or several modes are missed.The PML used in MM5 is implemented by means of complex coordinate stretching [23].
2.4 Finite Difference Time Domain method (FDTD1, FDTD2, FDTD3) The FDTD method was first introduced in [24] and is used extensively in computational electrodynamics.The method is based on the discretisation of Maxwell's equations in space and time [25].The structure is defined by the spatial distribution of the relative dielectric permittivity and of the relative magnetic permeability, and partial derivatives are replaced by centred finite differences.Within the commonly used Yee "leapfrog" scheme, every field component is located at a different node of the grid: electric and magnetic fields being interleaved in time and space on staggered grids.The present implementations use the PML boundary conditions of Bérenger [26].
The computing time and memory requirement for a problem solved with an FDTD method increases gradually with the number of elements.This means that large 2D-and many 3D-geometries can be computed on ordinary desktop computers.Time-domain methods are especially useful for obtaining spectral information of a problem in a single simulation run.
As the domain, and thereby the distance, at which fields need to interact in space and time increases, numerical dispersion becomes an increasing problem.This may result in larger errors for problems with a large slit-to-groove distance d.Corners and angles different from 90 o as well as oblique interfaces lead to staircasing.Since the FDTD operates in the time domain, a dispersion relation is needed to correctly simulate the response of media with a negative real permittivity such as metals in the optical regime.The FDTD implementations considered here employ a Drude dispersion model for the relative permittivity r : where ∞ = 3.36174, ω p = 1.338810 16 rad/s and γ = 7.0759210 13 rad/s.This model provides the correct permittivity at the frequency ω used in the benchmark.
Due to the implementation of the metal dispersion, second order accuracy is not guaranteed at metal-dielectric interfaces even for straight interfaces.This alters the convergence performance as shown in Figures 4 and 5.Because in the studied geometry, the plasmon wave vector is almost parallel to the interface, the PML placed above the structure must be relatively large to sufficiently damp the plasmon.This also holds for FEM methods, but not for MM that handle the vertical ydirection analytically.

Finite Element Method (FEM1, FEM2)
The analysis of the slit-groove problem has, also, been performed using finite element methods.These methods operate under the assumption that any function over a global domain can be approximated by a series of functions operating over a finite number of small sub-domains, called finite elements.The discretisation of Maxwell's equations leads to an algebraic system with unknowns expressed in the nodes or in the edges of the mesh elements.For the studied problem, the vector Helmholtz equation is solved numerically for the complex magnetic field H on a given mesh.The electric field E and Poynting vector S are derived afterwards.The FEM has several advantages.It has been successfully used to model complex geometries, anisotropic dielectric and metallic materials and allows for transient, harmonic or modal analysis.
It is mainly limited in the case of large system.For a large sparse, indefinite and complex symmetric matrix, such as the FEM matrix in the case of the time-harmonic Maxwell's equations, the system of equations has to be preconditioned and reordered appropriately.A large amount of memory is required due to the preconditioning.
Two dual formulations have been used.The first one is based on the curl-curl equation, with edge elements (FEM1) and the second one (FEM2) uses an in-house implementation of a nodal discretisation scheme based on the work described in [27]- [29].The structure is discretized by a mesh of unstructured triangular elements for the first method and with regular rectangular elements for the second one.Both types of elements are of second-order type.To enclose the computational domain and prevent reflections, an anisotropic PML is placed before the outer boundary.In the studied configuration, the PML above the layer must be relatively thick because the SPP is hardly damped in the PML.
PML parameters (thickness, number of elements in the PML, etc) are very important and directly affect the result reliability.Besides the PML, many other parameters, such as the mesh density, number of nodes or edges, and the type of elements, affect the accuracy and require specific attention.In the present study, attention is focused on the effect of mesh density to study the computation convergence, as shown in Figures 3 and 4. To reduce the discretisation error, the mesh has to be refined along the metal-dielectric interfaces.

Volume Integral Method (VIM)
The volume integral method (VIM) is based on Green's theorem, where the vectorial Maxwell's equations are solved by a transformation into the inhomogeneous vector Helmholtz equation for either the electric or magnetic field, of which the homogeneous solution is known, where κ 2 i contains material parameters of the background medium and κ 2 f contains material parameters of the nontrivial scatterers.We shall refer to the right hand side as the virtual source strength.It is non-zero only in the region of nontrivial scatterers.The Green's tensor is the solution of the same equation where the virtual source is given by a mathematical dipole Using Green's theorem, the inhomogeneous equation of Eq. ( 2) is solved by integration over the volume of the virtual source using the Green's tensor solution of Eq. ( 3), This is an integral equation for the unknown final electric field E f , which, for comparison with other methods, is the sum of the incident field E i and the scattered field E s .It can be solved by approximating E f by a linear combination of piecewise basis-functions and applying the method of collocation.This yields a system matrix which is full and often so large that an iterative method has to be used to solve the problem.In our implementation we have approximated the electric field E f and the Green's tensor G by piecewise constant functions, and have computed the volume integral numerically for all collocation points.To solve the discretized system we applied the iterative procedure as described in [30,31] and demonstrated the used implementation for optical data storage in [32].
The effects of a stratified medium are incorporated in κ 2 i and treated analytically, such that only the fine-structure (i.e.groove and/or slit) is considered as the virtual source.The effect of the stratified layers on the Green's tensor solution results in a two dimensional integration in spatial frequency space [33,34].The integration along the azimuthal-direction can be done analytically, while the radial-direction is done numerically using an adaptive Gauss-quadrature algorithm.
Before benchmarking and for debugging purposes, the VIM has been compared to three analytically known problems for one-, two-and three-dimensions, using an infinitely extending slab, a cylindrical rod and a Mie-sphere [35], showing a relative error of the order of 10 −5 , depending on the refractive index of the slab.Using the dependence found in Ref. [30], where the error in the intensity scales with the cell size to the power 3, a very crude estimate of the error for the calculated field would be of the order of 10 −4 for the calculation of the Poynting vector as a function of the distance between slit and groove, and 10 −3 for the calculation of the field from the groove only.Note that this estimate is based on far-field considerations and cannot be used for the field very close to the virtual source.
Since the volume integral method only depends on the volume of the virtual source, the calculation domain can be discretized very efficiently.However, the amount of scattering cells is limited by computing capabilities (usually the memory-limit).The radiation condition is automatically satisfied due to the constraints on the Green's tensor.When the electromagnetic field within the virtual source has been obtained, it is straightforward to obtain the field anywhere else by integration over the source-domain using the Green's tensor corresponding to the homogenous problem.The implemented iterative method avoids the problem of the memory requirements due to the full system matrix and the task of preconditioning the matrix, but involves a lot of operations, making the implementation relatively inefficient.

Hybrid finite element-modal method (HYB)
Institut d'Optique has developed a hybrid method (HYB) that combines FEM and a-FMM techniques.The basic underlying idea is to sample zones with rapidly-varying fields (at singularities for instance) or curved surfaces in real space with finite-element techniques, while using FMM expansion technique in all lamellar areas of the structure where modal expansions are numerically easier and physically more intuitive than methods relying on a full discretisation.zone surrounding the groove for h 2 < y < h 1 is sampled in real space.To improve the numerical performance, the mesh grid is refined at all metal-air interfaces and edges with a non-uniform triangular mesh with first-order elements (H z nodes).For y < h 2 and for y > h 1 , the integration in the y-direction is performed analytically with Fourier modal expansion techniques.The HYB implementation uses complex nonlinear coordinate transformations to satisfy the outgoing wave conditions in the x-direction, like the MM3.
With the a-FMM, and more generally speaking, with modal methods, arbitrary shaped structures with round corners for instance are handled with a staircase approximation.Numerical results show that the staircase approximation introduces sharp maxima in the local field map close to the stair edges of the profile.As a consequence, a large number of layers is generally required.Not only the computation load increases, but also the convergence performance degrades and a greater number of Fourier components is required to correctly represent the electromagnetic field.This is especially pronounced in TM polarization for metallic structures [36].This limitation does not hold for finite-element approaches.
For the benchmark geometry that is fully lamellar, the HYB method does not benefit from any advantage resulting from a staircase approximation.Thus as will be shown in Section 3, the HYB method provides slightly worse results in terms of convergence performance and memory requirements than the MM3.However, we believe that different conclusions would be reached if one has to take into account inevitable curvatures at the groove and slit edges.A draft containing numerical results for non-lamellar structures is in preparation.shows the values of D. Several conclusions may be derived:

NUMERICAL RESULTS AND COMPARISON
• The MM5 results significantly differ from those obtained with all other methods, the minimum value for D in line 12 being 0.111.
• If we consider the subpart formed by the FDTD methods only, see the part of the table outlined by a red box, we observe that the results are consistent with one another, with D values varying between 0.03 and 0.035, but differs twice that value from those obtained by others methods.
• We note that the group formed by the seven first methods, namely MM2, MM3, FEM1, FEM2, HYB, VIM and MM1 belonging to three different categories, provide very similar results, with D values below 0.01.A more careful inspection of the data obtained by these methods (in bold in Table 1) has also revealed that the peak-topeak deviations between this group of methods do not exceed 0.043 over the full range of separation distances.
On average over the one hundred separation distances, the peak-to-peak deviation is 0.0126 (relative value 1.4%).
• Considering the subgroup formed by the first five methods MM2, MM3, FEM1, FEM2 and HYB, even higher consistency is achieved, since the maximum D value does not exceed 0.005.The peak-to-peak deviations between this subgroup is 0.032, and on average over the d p 's, the peak-to-peak deviation is as low as 0.008.This corresponds to an impressive averaged 3-digit agreement.As will be shown in Figures 4 and 5 these implementations also provide the best convergence rates.
Figure 3 shows a comparison between the average transmission S/S 0 obtained from the seven methods mentioned above, and the experimental data (crosses) taken from [3].The depicted error bars delimit the maximum peak-to-peak deviation between the seven methods and the blue dots represent their average values.From the statistical error analysis given, we believe that the quantitative agreement obtained for seven methods belonging to three totally different categories is not fortuitous.Therefore, the average values are likely to represent accurate estimates for the normalized transmission.Thus the significant differences between the numerical and the experimental data may have been caused by unobserved, yet different physical parameters occurring in the measurement condition.As shown in Figure 3, the amplitude of the oscillations is not well reproduced.As confirmed by other calculations not reported here, it is intuitively clear that the amplitude is related to the actual shape of the groove and to its actual depth that might differ from the 100 nm depth used for the benchmark.The difference between the frequency oscillation of the published experiment and the benchmarked computation results is large and increasing as a function of separation distance.Since the material and configuration are carefully chosen to represent the experimental conditions, this difference is quite unexpected.In agreement with [8], all the benchmarked methods have provided an oscillation frequency given by the normalized SPP propagation-constant of the air-silver interface (n eff = 1.01), which differs from the experimental one (n eff = 1.04).The impact of the SPP mode on the oscillation frequency is further confirmed by the very good agreement obtained in Figure 6b between the fields diffracted by the groove on the interface and analytical SSPmodal expressions.

Convergence performance
In this sub-section, we present the convergence performance of the numerical implementations for d = 500 nm.The convergence performance is shown as a function of some parameters labelled by N, which is related to the sampling resolution.N differs from one method to the other and starts from an arbitrary N min value up to a N max value obtained before a memory limitation is reached.For FDTD, FEM and HYB, N is the total number of discretisation elements enclosing the non-trivial scatterers, i.e. the groove and/or slit.For VIM, N is the number of cells in the scattering volume (in this case only inside the slit and/or the groove) and not the total computational space.For Fourier modal methods, N represents the total number of Fourier harmonics retained for sampling the reciprocal space in the x-direction.For MM4, N represents the number of sampling points in the x-direction and for MM5 N represents the total number of local eigenmodes retained in the modal expansion.Note that for methods relying on a full discretisation of the computational space (FEM and FDTD), N grows as the computational area, while N grows as the transverse computational-window width (along the xdirection) for modal methods.More details are given in Section 4.Although the computational effort strongly differs from one method to another, in general, the CPU time also increases with N, since N is related to the size of the matrices, which are involved in the numerical implementation.
Figure 4 shows the absolute convergence performance for S/S 0 as a function of N. We have tried to use the same scale, but because of the variation in convergence rates, this has not been possible for all methods.In addition, since some meth- ods (MM2, MM3, FEM1, FEM2 and HYB) rapidly reach a plateau for small N values, we have used two different vertical scales as N increases.Note that these methods correspond to the group of methods that provide the smallest deviations in Table 2.
In addition, note that the same scale is not used in all plots for the sake of clarity.In Figure 5, we show the convergence rate on a logarithmic scale as a function of N. The relative error for a given N is defined as |[S/S 0 (N)]/[S/S 0 (N max )] − 1|, where S/S 0 (N max ) corresponds to the normalized transmission calculated with the highest resolution.Note that these values provided in parenthesis differ from one method to the other.Thus the figure is not intended to discuss the accuracy of the results.The convergence rates are varying by almost four orders of magnitude from one method to another.Since the benchmarked value S/S 0 is the ratio of two independent simulations, the errors for the two simulations cumulate in the benchmarked value.This may be disadvantageous for some methods, like the FDTD method, which are relatively sensitive to small changes in the discretisation.This is especially sensitive near metallic interfaces, for which FDTD methods are locally first-order accurate, leading to relatively large residual oscillations even for small discretisations.

Field scattered by a single groove
The governing physical phenomenon in the experiment in [3] is the surface wave diffracted by the groove on the metal interface, see Figure 1b.This wave propagates on the surface, interacts with the slit, and because of the delayed coherent interaction with the light directly coupled, results in the oscillating pattern of Figure 2. The analysis of this surface wave is the subject of the third numerical investigation.
Before providing a detailed comparison between the different field predictions obtained by the benchmarked methods, let us first examine some general properties of the field scattered by the isolated groove.Figure 6a displays the normalised zcomponent of the scattered magnetic field Re(H z,scat )/|H z,inc |, where H z,inc is defined as the magnetic field of the normally incident plane wave on the interface at y = 0. To obtain the scattered field, we have subtracted the incident and back-reflected (specular) plane waves in the displayed images for y > 0. Clearly, the scattered field looks like that of a cylindrical wave.This behaviour is expected because of the subwavelength dimensions of the groove, which therefore acts similarly as a line current [8]. Figure 6b is intended to roughly discuss additional properties of the scattered wave on the interface y = 0.The black curves represent the three components of the scattered field on the interface, calculated with MM3 for a normalized incident plane wave H z,inc (y = 0) = 1.The superimposed dotted curves for H z , E y and E x are respectively the analytical functions |S exp(ik where S is determined with the rigorous treatment in [37], k SP being the SPP propagation constant of the air-silver interface, k SP /k 0 = [ Ag /( Ag + 1)] 1/2 .The scattered field on the interface is shown to be essentially composed of a SPP for x > 1.5λ.This will be confirmed by the numerical data provided by the other methods in Figures 7 and 8. Further investigation on the numerical data shows that the transmission curve in Figure 3 and the magnetic-field curve in Figure 6b can almost be superimposed after renormalization.This indicates the key role played by the SPP on the silver-air interface in the oscillatory behaviour reported in the experiment.A related discussion can be found in [8,38,39].
After these preliminary considerations, we now compare the total fields calculated by the various numerical methods at the interface y = 0 + .For the sake of brevity, we simply compare the x-component of the electric field on the interface and do not report on H z and E y .This tangential component is continuous, but exhibits a singularity near the groove edges.In Figure 7, we show the normalised total electric field |E x,tot /E x,inc | as a function of the x-coordinate.The two plots on the top are related to short x-distances, 2 µm < x < 3 µm, while the plots in the bottom are related to larger x's, 9 µm < x < 10 µm.We first note that the oscillation amplitudes of the FDTD data are different from each other and are also different from those provided by the other methods.The x-locations of their minima and maxima are also different and are shifted towards small x-values.The systematic offset explains why the group of FDTD implementations provides similar distances in Table 2.We believe that the difference between the FDTD results and the other ones is due to the metal dispersion that prevents second-order accuracy at metal-dielectric interfaces even for lamellar interfaces as those considered in this work.Except for the FDTD data, a quantitative agreement is achieved over the full range of x's for all other methods (the MM5 data are not considered in this comparison).However a more careful inspection shows that the remaining results can be divided into two sets.These two sets are formed by the data obtained from MM2, MM3, MM4, and FEM2 as one set, and from MM1, FEM1, VIM as another set.The slight difference in oscillation amplitudes (not frequency) is due to the fact that for the second set of methods, the field has been calculated at y = 1 nm above the interface, instead at y = 0 as requested for the benchmark (see the next section for more details).The 1 nm difference may appear negligible at first sight, but as shown by additional computations obtained with the MM1 and MM3 at y = 1 nm, it is almost fully responsible for the slight difference between the two sets of results.
To assess the robustness of a particular technique, it is interesting to consider the computation of the electromagnetic fields near the groove edges.When the radius or curvature of the edge is negligible with respect to other relevant lengths as is the case for the ideal geometry considered in this work, singularities appear.The problem of the scattering of a monochromatic plane wave by a homogeneous wedge has received much attention in the literature [11], since knowledge of singular behaviour may be incorporated into numerical algorithms to increase their speeds of convergence.Reversely, when the singularity is not incorporated into the method (this is the case for all considered methods), the method accuracy can be tested by looking at the divergence properties of the computed fields in the vicinity of singular points and by comparing it to analytical expression for the critical exponent, see Section 4.
In Figure 8, we compare the results obtained by all methods in the immediate vicinity of the upper groove edge located at x = 50 nm.Note the use of vertical logarithmic scales.A good overall agreement is achieved for x < 50 nm, where the medium is uniform (n = 1).For x > 50 nm, strong deviation occurs and the overall relative accuracy is rather low.Note the quantitative agreement between MM2, MM3 and HYB over the full range of x values, even at the immediate vicinity of the singularities, where the field theoretically diverges.As discussed hereafter, this can be explained by the specific attention paid by these methods to accurately sample the field in the vicinity of edges, in Fourier space for MM2 and MM3 or in real space for HYB.

METHOD IMPLEMENTATION AND DISCUSSION
The main objective of this section is to underscore the different computing specifications and to give additional explanations of the obtained results relative to each method.Note that all co-authors have been able to provide numerical data in a 60-day term.This short duration implies that no optimisation of the in house-developed software have been performed for the benchmark.For each method, the CPU time and the mem-ory load are also given in order to estimate typical computational efforts that one may be faced with in reproducing the computed data for a given accuracy.However, let us keep in mind that these numbers are only estimates that do not allow a quantitative comparison between the method implementations that have been run on different hardware with different software.
MM1.The MM1 used in this work is an electromagnetic field solver that is based on the a-FMM.While the main procedure of implementation is similar to that in the general description of the method, special care is taken to prevent numerical instability of the evanescent contributions.We have combined the S-and R-matrix algorithm [40,16], such that the outward travelling waves are obtained using the S-matrix and the inward travelling waves using the R-matrix.This implementation is inherently stable and especially relevant for systems with many layers, as has been confirmed with the VIM [35].
The calculation for the benchmarked non-periodic structures has been performed by using a complex nonlinear coordinate transformation combined with a perfectly matched layer (PML) as described in [17].This modification, however, increases the computational time, since it requires that even for the non-corrugated layers (the incident air medium and the glass substrate) the modes are found numerically.Also, the number of modes that have to be taken for convergence depends on the PML and the nonlinear complex transformation.
In Figures 7 and 8, the MM1 requires more modes for an accurate field calculation, especially to include the evanescent modes.Finite numerical accuracy results in singularities in the system matrix.This in turn results in inaccuracies of the largest eigenvalues and corresponding eigenvectors used for calculating the near-field, which becomes apparent when the distance to the interface gets very close.
The convergence tests in Figures 4 and 5 were carried out using an artificial period width of 21.85λ and show a convergence from 350 modes onwards.The complex parameters for the nonlinear coordinate transformation f ct = (1 − i) −1 and for the PML f PML = (1 + i) −1 [17] are applied over a 2λ range at both sides of the artificial period.For the calculation as a function of d, an artificial period width in the x-direction of 18.85λ is used to obtain the data discussed in Table 2.We use 302 modes, with 3501 points for the numerical integration of the permittivity function to obtain its Fourier components.The same parameters but with 300 modes were used for computing the electromagnetic field in Figure 7, by using 1011 sampling points at a distance of 1 nm above the metal interface.For the calculation presented in Figure 8, we have used 10001 sampling points in the integration for obtaining the electric permittivity, 201 sampling points in the interval of −50 nm to 150 nm for the reconstruction of the field, a computational width of 8.29λ and 1000 modes.
A typical calculation takes 12 minutes and requires 0.7 GB of memory on an AMD Opteron 265 running at 1.8 GHz.The field calculation far from the edge takes 20 minutes and 0.7 GB of memory, and the one near to the edge takes 6.5 hours and 5.7 GB of memory on the same computer.In comparison with MM2 or MM3, MM1 offers a slower convergence and a larger computational effort.This is due to the fact that the large distance between the slit and the groove is not squeezed by any coordinate transformation.In MM1, the coordinate transformation is only used to create an aperiodic structure by incorporating PMLs.

MM2.
Since the problem under consideration is translation invariant in one direction, any modal method is an efficient tool that not only calculates but also gives physical insight in the simulations.Here we have used a Fourier Modal Method.This method is very popular although it presents intrinsic weak points.For instance, the Fourier basis is certainly not the best choice to represent permittivity functions that exhibit simultaneously high contrast and narrow geometrical details.The use of a parametric representation of space allows an adaptive spatial resolution [41].The coordinate transformation maps non-uniformly spaced points along x-direction in the physical domain to uniformly spaced points in the transformed domain.In the present context, one demands that the coordinate lines in the physical space are stretched in the neighbourhood of the transitions of the permittivity function.Furthermore, the transformation has to allow different spacing between the transitions in the physical and in the transformed domains.Such transformations have been used by Granet in [42] and improved by Vallius [43].In the present benchmark, the struc-ture under consideration is aperiodic.In order to take into account the radiation boundary for a periodic basis, we have used Perfect Matched Layers.Chew and Weedon have shown in [23] that PMLs are equivalent to a complex stretching on the coordinate space of Maxwell's equations.Hence, from the practical point of view, adaptive spatial resolution and PML are very similar.
The a-FMM introduces at least two arbitrary parameters: the artificial periodicity and the PML thickness.In the x-direction, we call a transition point MM3.The MM3 benchmarked in this work uses nonlinear complex coordinate transforms to map the semi-infinite open spaces (x → ±∞) onto finite segments in the computational space.The width of the coordinate transform, q in the notations of [17], is equal to one wavelength.When applied to the analysis of metallo-dielectric structures, the a-FMM, or the FMM in general, suffers from a lack of accuracy.The difficulty arises from the fact that the electromagnetic fields rapidly vary at the metal boundaries (skin depth effect) and that these variations have to be correctly sampled in Fourier space.Conceptually, fine sampling in Fourier space can be achieved by real coordinate-transforms that locally map segments of the real space onto enlarged or reduced segments of the computational space.
Figure 9 sketches the two coordinate transformations that have been applied to remove this difficulty.In addition to the complex coordinate transforms that are applied on the two boundaries of the computational space, two different real PML-like coordinate transforms have been implemented.The real PML-like transform in the centre (dashed bold lines) compresses the real space, and avoids the difficulty related to the sampling of large computational domains for large slit-groove separation distances.For large d's, this enhances the numerical accuracy.The real dilatation PML-like transform (dashed lines) applied at the slit and groove boundaries (in the figure, a single transform is shown) implement a zoom that results in a better sampling in the Fourier space.The concept of using real PML-like transforms for space dilatation or compression is similar to the adaptive spatial resolution [41,43], although the former is easier to implement since the real PMLlike transform simply results in a renormalization of the permittivity and permeability coefficients by a factor f PML , letting unchanged the eigenproblem formulation in the Fourier space [17].Figure 10  PML-like transforms, respectively.Basically, the net effect is an additional two-digit-accuracy improvement.The compression PML-like transform is not used for this computation.
All the calculations have been performed on a PC computer equipped with a 3 Ghz processor and with Matlab 7 software.In Figures 4 and 5, N represents the total number of Fourier harmonics retained for the computation.A typical CPU time for a single computation is 3 minutes for N = 401.
The fields calculation in Figures 7 and 8 have been performed for N = 1001.As shown by comparisons with the MM4 and HYB methods, the field singularities at the edge are accurately described in the Fourier domain.This may appear surprising since one could expect that a method relying on Fourier expansions is not capable to calculate field discontinuities because of the inevitable Gibbs phenomenon.However, this possible source of artefact can be largely removed by considering Fourier expansions only for the fields which are continuous and then from their computation to derive the other discontinuous field components in real space directly from the constitutive relations [44].Additionally, we have verified that the field divergence at the edge singularities scales with x −p up to edge distances of 10 −4 nm, where p = 0.3563 is the critical exponent calculated with the method described in [11].Although such small distances rule out the domain of validity of the macroscopic Maxwell's equations in photonics, the agreement with the asymptotic closed-form expression provides evidence for the accuracy of the calculation.A related discussion without using dilatation PMLs at the edges can be found in [44].
MM4.The actual structure was modelled with a discretisation distance of 25 nm.The total size of the computational window was 11.1 µm with a 0.4 µm spacer-width right from the slit leading to 444 discretisation lines.The required CPU-time per point was 70 seconds (Matlab 6.5 on a Pentium IV processor).Zero reflection for the special angles of 0, 40 and 85 degrees were chosen [19].All the parameters were varied, and gave virtually identical results.Closer examinations for d = 0.5 µ m have shown that the results weakly depend on the size of the computational window.The variations occur in the second digit, i.e. for an uncertainty of less than 5%.
To examine the reason for this, the electric field in the output plane (400 nm below the silver layer) was studied (no groove).Figure 11 shows that we do not see an exponential decrease to zero of the field within the computational window, even though this was already quite large compared to size of the slit.Instead we have an oscillatory behaviour.This is in contrast to results obtained for "classical" waveguide problems, with the concatenation of two (or more) waveguides for instance.In the latter case only a few eigenmodes are responsible for the main features of the device.Here, we have a high number of modes that are important for the overall behaviour.Therefore optimising the absorbing boundary conditions (ABC) and the computational window is a much more difficult task in the present case.
In view of the above remarks, it might be expected that the position of the boundaries and the actual ABCs play a crucial role for the results.For this reason, both the size of the computational window as well as the angles for zero reflection were modified to examine their influence.We see in (Figure 11), that we have relatively small variations of the maximum field in the "slit-area".Therefore, the determined transmission is stable in spite of the variation of the field.However, we can also conclude that it might be a difficult task to improve the accuracy (i.e.above the 5% mentioned before).

MM5.
As illustrated by Figures 4 and 5, the convergence of our implementation at the moment is worse for this particular benchmark as compared to other modelling problems, see for instance the recently published mode-solver benchmark [45], where several methods presented here are also compared.The main limitation currently is that, especially for small separations between the slit and the groove, the coupling between the identical slit and groove waveguides results in a complicated modal structure, with e.g.near-degeneracies.This means that the search for zeroes in the complex plane sometimes misses a mode, which results in poorer convergence as compared to larger slit-groove separations.As for calculation times, a calculation of S/S 0 for a single geometry with 100 modes takes about 20 sec on a 2 GHz Pentium IV.
FDTD1.In the FDTD1 implementation, the scattered field is defined as the total field minus the analytic solution of the airsilver-silica multilayer.During the last time interval before termination of the simulation, the amplitudes and the phases of the fields are determined from the calculated real fields to obtain the complex field amplitudes.The total electromagnetic field and the Poynting vector are then calculated as a postprocessing step from the scattered electromagnetic field and the analytic solution.
A computational domain is defined by an area that encloses all non-trivial scatterers, which are in this case the slit and groove.The mesh in this domain consists of rectangular elements, but it is non-uniform.By aligning the non-uniform mesh with the material interfaces, staircasing errors of the in general non-conforming mesh are minimized.In addition, better accuracy is obtained by using a relatively denser mesh in the slit, groove and metal skin depth area.The absorbing boundary surrounding the computational domain is a Complex Frequency-Shifted PML implemented as a convolutional PML (CPML), see [46].
The Drude model for silver is implemented using an Auxiliary Differential Equations (ADE) method [47].This gives an update rule for the polarization density in the metal.For every element, field values at a certain discrete time are obtained by subsequently updating the displacement field D, the electric field E, the polarization densities P, and the magnetic field H using values from a previous discrete time.Maxwell's equations for the scattered field include a virtual source term inside the groove and slit, with a source density which depends on the incident field.The time-domain simulations are stopped when the fields have become time harmonic that is when the relative RMS error between two subsequent time periods of the absolute magnetic field component averaged over the entire grid drops below a predefined small value.
Due to the implementation of the dispersion in the metal, second order accuracy is not guaranteed at metal-dielectric interfaces even when they are straight.This explains the ripples in the convergence results of Figures 4 and 5.The S/S 0 values at the requested line in space are converged to the final value long before the entire domain becomes steady-state and timeharmonic.The simulated time is therefore probably too long.Because in the studied geometry, the plasmon wave vector is almost parallel to the interface, the PML above the structure must be quite large to sufficiently damp the plasmon.This also holds for FEM methods.
In Figure 8, the FDTD1 results do not show a distinct singularity.In the used discretisation the E x field component is always located half a cell away from the edge.The field decays then more smoothly and causes the general higher values on the surface elsewhere.The E x field could be relocated to the edge, but since the field values are located in a staggered grid, the E z would then not be exactly on the edge.The grid could be made denser in the vicinity of the edges at the cost of computational time.Similar considerations hold for the other FDTD implementations.
In Figures 4 and 5, N is defined as the number of grid elements in the entire domain including the PML.For all simulations except for the convergence curve, N = 44810 3 .In the convergence curve N max = 61910 3 .We used a discrete time step ∆t = 7.3810 −3 fs, which corresponds to a Courant stabilityfactor of 0.9.The simulation is stopped when the relative RMS error described above drops below 0.5%.The total simulated time ranges then between 56 fs and 85 fs.
The 2D computational domain stretches from x = −10800 nm to x = 800 nm and from y = −900 nm to y = 500 nm, with the origin at the centre of the slit on the metal-air interface.The mesh is three times more dense in the x direction for x = −150 nm to 150 nm, as well as in the y direction for y = −120 nm to 20 nm.
The PML is 80 elements wide with the characteristic parameters described in [25] set as σ max = σ opt , a max = 2σ max and κ max = 15.The parameters are smoothly varied using third order polynomial grading.(At the moment of publication, a correction in the PML implementation now gives the same accuracy using only 20 elements and a max = 0.006σ max .) A typical problem used 215 MB of memory and took 50 minutes on an AMD Opteron running at 2.6 GHz.For N = 21010 3 , this reduces to 98 MB of memory and 10 minutes of computing time.
FDTD2.In FDTD2, the electric and magnetic conductivities of the PML media are chosen in order to completely annihilate the back-reflected wave on the external side of the PML region.The number of the PML layers is fixed to 8 throughout all our simulations (i.e. a PML thickness of 320 nm).
For dispersive materials, the dielectric constant is described by a Drude model that implies the computation of an additional field, the electric displacement vector D. The constitutive equation linking the E and D fields is expressed as . This equation can be recast to and then Fourier transformed to give the time-differential equation Eq. 6 is also discretized both in space and in time via finite centred differences and then integrated into the Yee scheme [48,49].Furthermore, in the case of large structures with fine details (as it is the case here), a very small value of the spatial step is needed to accurately describe all the small features.Consequently, this leads to a huge number of nodes which is both memory and time consuming.To bypass this problem, the FDTD2 method implements a non-uniform discretisation of the structure [25].
The spatial meshing step is then set to ∆ m x,y = 5 nm (i.e.λ/160) for fine geometrical features and to ∆ M x,y = λ/20 = 40 nm elsewhere.To avoid large local errors, virtual reflections, due to the large change of ∆ between the two meshes, an intermediate temperate domain (100 cells) is introduced with a spatial step varying gradually from ∆ m x,y to ∆ M x,y or conversely.With these values of ∆ M x,y and ∆ m x,y , a total parasitical reflection of about 3 × 10 −4 in amplitude is obtained from the intermediate area, while the PML reflections are of 1 × 10 −7 magnitude.
The computational time strongly depends on the distance d between the groove and the slit.Indeed as d increases, the number of time steps needed to reach the steady state also increases.Thus, the number of time steps is varied between 4000 and 40000.More precisely, the mean time spent to compute the curve of Figure 3 is about 15 hours on a personal computer equipped with a Centrino processor running at 2.8 GHz.

FDTD3.
With the FDTD3, the optical response of slit-groove structures is simulated using a finite-difference time-domain (FDTD) approach [25].The light source is an x-polarized plane wave generated along a horizontal line parallel to the sur-face of the metal film.Time propagation of Maxwell equations is performed by a time-stepping leapfrog technique.In order to absorb all outgoing electromagnetic (EM) waves on the grid boundaries, we implement perfectly matched layers (PML) absorbing boundaries with the exponentially differenced equations so as to avoid diffusion instabilities [26].In regions of space occupied by a metal, we implement the auxiliary differential equations method [25] and add equations to describe the current densities to the set describing the electromagnetic field components.As a convergence criterion, we use the comparison of the transmission S/S 0 and the electromagnetic intensity W/W 0 , where W is the timeaveraged EM energy.In the CW limit, both functions result in the same slit-groove dependence.Results are converged with a total propagation time of 200 fs (in calculations the electromagnetic field components are propagated in time for 250 fs).The simulations are performed on distributed memory parallel computers at the National Energy Research Scientific Computing Center and San Diego Supercomputer Center.The parallel technique used in our simulations is described in details in [50].
Similar simulations to those provided here have been performed recently [51] and showed better agreement with the experimental observations.Although the effective wavelength of the surface waves in the present calculations is the same as was found in [51], the amplitude of oscillations of the normalized transmission S/S 0 as a function of the slit-groove distance in Figure 2 differs slightly.The main difference between the calculations leading to the FDTD data presented here and the ones reported in [51] is the choice of a detection line shown in Figure 1a.We found that simulations with a closed contour, as performed in [51], lead to more reliable transmission as a function of the slit-groove distance, since the closed detection line allows one to register electromagnetic waves that escape laterally from the output side of the slit.
Numerical convergence is achieved with spatial steps δx = δy ≤ 4 nm and a time step δt = δx/(1.5c),where c denotes the speed of light in vacuum.Simulations were carried out for a total number of mesh points (including 16 steps in each PML layer) of 5300 × 320.The total number of time iterations was 32000.A single run (single point in Figure 2) on 64 processors at the San Diego Supercomputer Center on DataStar IBM cluster took about 1 minute.

FEM1.
The discretisation by finite elements leads to an algebraic system represented by a sparse matrix which FEM1 solves using an iterative solver.It first employs an Incomplete LU Threshold Pivoting (ILUTP) [52] pre-conditioner.Subsequently, an iterative solver based on the BiConjugate Gradient Stabilized Algorithm (Bi-CGSTAB) gives the solution.Although the system matrix is sparse and hence does not require much storage, the preconditioning causes a partial fill-in and consequently more memory is required.A computational domain is defined by an area that encloses all non-trivial scatterers [53], which are in this case the slit and groove.The computational domain contains a non-uniform conformal mesh of triangular elements.Because of the tri-angular shape, the elements can be small in some areas and large in others, depending on the discretized geometry and the field distribution.Denser meshes are used in critical regions.Similar to the FDTD method, the computational domain is extended in every non-periodic direction by a complex stretched-coordinate PML [26] to prevent reflections.
The number N, defined as the total number of elements in the computational domain including the PML, is N = 37, 000.The meshed computational domain is kept fixed for all simulations.It stretches from x = −10.15µm to x = 0.15 µm and from y = −900 nm to y = 100 nm, with the origin at the centre of the slit on the metal-air interface.The PML is 100 nm wide and is meshed with 10 elements in the PML direction.
The simulations were all performed on Linux machines stationed at Philips Research Laboratory in Eindhoven with an AMD Opteron running at 2.4 GHz.A single simulation uses a typical CPU time of about 41 seconds and requires a 1.1 GB memory.Computation time decreases with the number of elements used for discretisation.For example, for N = 16, 000, the CPU time is only 21 seconds with a memory requirement of 0.5 GB.

FEM2.
The implementation of FEM2 uses a rectangular mesh, which is well adapted for the lamellar benchmarked geometry, and allows for a reduction of the required CPU time for the mesh generation and for the computation.
For a computational example with a mesh of 59, 000 elements, the CPU is equal to 240 seconds with a Intel processor PIV-2.8GHz.The algebraic system is solved by using a sparse LU method provided in Matlab 7.This algorithm is very fast but requires a large memory.For 179, 000 nodes, 1 GB memory is required to solve the problem.As shown in Figure 4a, the convergence of the method is monotone and stable as N increases.N max is equal to 282, 000 and the extrapolated value for S/S 0 is 2.2012.A relative variation of 10 −3 is obtained for N = 14, 000 elements.

VIM.
For the slit-groove calculations we used 1, 620 cells, with a typical dimension of 5.56 nm of each cell.For a more accurate near-field, we used for the single slit calculations 10000 cells, with a typical dimension of 1 nm of each cell.The numerical integration of the Green's tensor has a relative accuracy better than 10 −5 .The field is calculated at 1 nm above the metal interface since the distance to the scattering cells should not be smaller than the typical dimension of the cell.
For the case of Figures 7 and 8, the number of the required scattering cells increases dramatically.At low number of scatterers, oscillations may occur and get more pronounced closer to the surface, where the results will start to deviate quite strongly from other methods.This is mainly caused by the assumption that the field (and Green's tensor) is constant within a cell.If an observation point is chosen very close to the cell, either more cells should be taken into account (effectively increasing the distance to the cell, since it is normalized on the cell-width) or the integration over the cell should be done numerically.
The calculation for 1, 620 cells took about 1 hour and 260 MB of memory on an AMD Opteron 852 running at 2.6 GHz.The calculation for 10, 000 cells took about 34 hours and 9.8 GB of memory on an AMD Opteron 248 running at 2.2 GHz.

HYB.
The HYB method shares basic features with the MM3.For instance, it uses the same complex nonlinear transform and the same compression PML between the groove and the slit.In this HYB approach, the dilatation PML is not used, but it is replaced by a very fine sampling in real space in the vicinity of the singularities at the slit and groove edges.The resolution of the sampling has been shown to quantitatively impact the accuracy of the method.All results have been obtained for h 1 = 50 nm and h 2 = −150 nm, see Figure 2 for a definition of these parameters.
A relative inspection of data provided by the MM3 and the HYB methods has shown that the computational results obtained with the MM3 with dilatation PMLs and those achieved with the HYB are rather similar, the maximum peakto-peak deviation in Figure 3 remaining smaller than 0.005 for the one hundred values of d.
In Figures 4 and 5 the parameter N used for convergence tests represents the number of nodes of the non uniform triangular mesh.For every N values, the number of Fourier harmonics of the Fourier expansion above and below the finite-element area is equal to one third of the number of nodes at the upper boundary of the meshed region.The calculated fields in Figures 7 and 8 have been obtained with a refined mesh in the vicinity of the groove edges, with a minimum feature size of 10 −5 λ at the groove edges.
All the calculations have been performed on a PC computer equipped with a 3 Ghz processor and with Matlab 7 software.A typical value for the CPU time for a single computation is 10 minutes for N = 150, 000.

CONCLUSIONS
We have presented a comparison of twelve different implementations of five numerical methods (Table 1) used to solve the electromagnetic field distribution of a slit-groove structure at a metallo-dielectric interface.The methods cover the most commonly used techniques, modal methods, finite-differencetime-domain methods, finite-element methods, a volume integral method and a hybrid method.Three different numerical problems have been defined for benchmarking purposes, for which experimental data is also available, to identify the intrinsic differences between the various methods and implementations.
The transmitted flux as a function of the separation distance between the slit and the groove shows a very good agreement for eleven different implementations of the five different methods, see Figure 3.As shown in Table 2, a subset of seven implementations has an error of less than one percent.One modal method implementation, MM5, yields substantially different results since some of the essential modes have been missed for this particular problem.The results from the three implementations of the FDTD method show a small difference for the amplitude and frequency of the transmission curve in Figure 3, due to only a first order accuracy near metallo-dielectric interfaces.The convergence of every implementation has been estimated for a fixed groove-slit separation distance, see Figures 4 and 5.The achieved accuracy varies from one implementation to another, from 4 − 5 digits to 2 digits.Note that even though convergence is achieved for a particular implementation, this does not necessarily mean that the calculated result is accurate or reliable.Finally, the electric and magnetic field for only a single groove has been calculated.Again, the agreement between the subset of seven implementations is very good, as is clear from the overlapping curves in Figure 7, shown for a range of more than 10 wavelengths.Special attention has been given to the singularity located at the edge of the groove, see Figure 8.
Based on the quantitative agreement obtained between different results obtained with various numerical methods and implementations, we conclude that for the chosen simulation parameters, the simulations are all reliable.Thus the differences between the numerical predictions and the experimental data, may have been caused by unobserved, yet different physical parameters occurring in the measurements.
We would like to emphasize that the conclusions derived from this exercise concerning the numerical accuracy achieved with the various benchmarked methods strongly depend on the considered problem.Perfectly vertical walls with 90 ˚corners have been assumed in this work.The incorporation of inevitable rounded corners or slanted groove walls for instance, may reduce the numerical performance of some methods, like modal or FDTD methods for instance, while the performance achieved by other methods relying on an accurate sampling in real space, like FEMs or HYB or VIM, may remain unchanged.However, these deviations from the ideal geometry do not impact the physical interpretation that is related to the actual nature of the surface waves involved in the interaction.As shown by other computational results not presented here, the assumptions made for the ideal geometry only affect the oscillation amplitude but not the oscillation frequency, the latter being only governed by the actual dielectric constants involved at the air-silver interface, see Figure 6.
Aside from a discussion of the accuracy of the obtained results, we have provided per method a general description, typical characteristics of the method and of the implementation, and the computational requirements.Although the results were calculated and compared for a two-dimensional slit-groove problem, the physics involved is universal and important for almost all structures with features of the order of a wavelength or less.

FIG. 1
FIG. 1 (a) The slit-groove doublet considered in this work.The centre-to-centre slitgroove separation distance is denoted by d.(b) The single groove geometry used for the near-field calculations.All computations are performed for Ag = −33.22+

Figure 2 FIG. 2
FIG. 2 Sketch of the zone handled with FEM and a-FMM in the HYB method.

3. 1
Transmission as a function of the slit-groove separation distance In the first numerical exercise, we calculate the normalized transmission S/S 0 as a function of the slit-groove separation distance, for one hundred separation distances d p , p = 1, 2, . . ., 100.For the sake of comparison, we define the averaged deviation D between two different implementations (labelled by α and β) as the mean value (over the 100 separation distances) of the normalized transmissions calculated by the implementations α and β, D = (1/100) ∑ p=100 p=1 |(S/S 0 ) α p − (S/S 0 ) β p |. Table

FIG. 3
FIG. 3 Normalized transmission S/S 0 as a function of the slit-groove separation distance.The error bars represent the vertical intervals that bound the numerical data provided by seven different methods (MM1, MM2, MM3, FEM1, FEM2, VIM and HYB).The maximum peak-to-peak error between the four methods is smaller than 0.043 over the full range of separation distances.The dashed curve represents the numerical data obtained by interpolation of the average values.The red crosses are experimental data taken from Ref. [3].The dotted vertical lines indicate the locations of other maximum in transmission of the averaged numerical data.

FIG. 4
FIG. 4 Absolute convergence performance of the different methods as a function of the parameter N related to the sampling accuracy of the method, starting from an arbitrary N min value up to a N max value obtained before memory or time limitations.Two curves corresponding to two different PML implementations are provided for MM5.

FIG. 5
FIG. 5 Convergence performance of the different methods as a function of the parameter N starting from one arbitrary N min value up to a N max value obtained before memory or time limitations.The error is defined as |[S/S 0 (N)]/[S/S 0 (N max )] − 1|, and the number in parenthesis located below the method acronym represents S/S 0 (N max ).

M
FIG. 6 General properties of the electromagnetic fields scattered by a single groove on an air-silver interface under normal incidence.Note that the incident and specular reflected plane waves have been removed for y > 0 in (a) and (b).The groove depth and width is 100 nm.(a) Overall display scattered field Re(H z ) was obtained with FEM2.Red and blue colours represent positive and negative values, respectively.(b) Surface wave scattered on the interface at y = 0 + by a single groove.The solid curves were obtained with MM3.The dotted curves correspond to the analytical SPP contribution, see text.
FIG. 7 Comparison between the different methods for the normalized total field |E x /E x,inc | as function of the x-coordinate at y = 0 + .Top: small x values.Bottom: large x values.The repartition of the method results between the left and right sides of the figure is arbitrary.This choice is simply motivated by clarity reasons.

FIG. 8
FIG. 8 Comparison between the different methods for |E x /E x,inc | as function of the x-coordinate at y = 0 + in the immediate vicinity of the upper groove edge located at x = 0.05 µm.
x l the x-location where the medium experiences a discontinuity.The geometry of the structure is described by 6 transition points.The first arbitrary transition point, x 1 , corresponds to the thickness of the PML.The distances between the left PML and the groove and between the slit and the right PML are given by x 2 − x 1 and x 6 − x 5 , respectively.For the numerical implementation, we chose x 1 = λ and x 2 − x 1 = x 6 − x 5 = λ.All the other transition points are imposed by the structure under consideration, namely by the metallo-dielectric interfaces.The overall periodicity is thus: L = 2x 1 + 2(x 2 − x 1 ) + d + w.For d = 500 nm, it corresponds to L = 4.7042λ.In the transformed domain, we keep the same periodicity and make the transition points equally spaced.In addition to this first transformation, the resolution around the transition points has been refined.The functions used for the mapping between the two domains allow to calculate analytically all the coefficients of the Fourier series.All the calculations have been performed on a PC computer Dell latitude 820 equipped with a 2.16 Ghz processor and with a Matlab 7 software.For 201 retained Fourier harmonics (from −100 to +100), the code runs in about 11 seconds and for 601 Fourier harmonics it takes 4 minutes.Except for the convergence test, all other calculations in this benchmark have been performed with 561 Fourier harmonics.

FIG. 9
FIG.9Real and complex coordinate transforms applied in the MM3 from the real space (a) to the computational space (b).Black dashed lines: nonlinear complex coordinate transform to satisfy outgoing wave conditions for x → ±∞.Dotted blue lines: real PML-like compression transform used for reducing the transversal size L of the computational window.Solid red lines: real dilatation PML-like transform used for enlarging in the new space the vertical silver-air boundaries.This transform is applied four times, but for the sake of clarity, only the left boundary transform is shown.Note that the transversal dimension x' in the computation space is regularly sampled in the Fourier domain.

FIG. 10
FIG. 10 Impact of the dilatation transform at the slit and groove boundaries on the performance of the MM3 for d = 500 nm.Stars and dots are respectively obtained without and with the transform.N represents the total number of Fourier harmonics retained in the calculation.The relative error is defined as |S/S 0 − T 0 |/T 0 , with T 0 = 2.200952 being the best extrapolated value for S/S 0 .

FIG. 11
FIG. 11 Influence of the computational window and of the absorbing boundaries on |E x | obtained with the MM4 in the horizontal plane located at 400nm below the silver interface.The computational window is 20.5 µm for (a) and (b), 10.5 µm for (c) and (d).The three angles for zero reflection are [0˚, 0˚, 0˚] for (b) and (c), and [0˚, 80å nd 85˚] for (a) and (d).