Figure 2a schematically illustrates the optical behavior of the proposed see-through reflective metalens. Unlike conventional metalenses, our metalens is optimized to focus light incident at 45° while maintaining a good transmission spectrum at normal incidence. The collimated, obliquely incident light is reflected towards the normal direction while being converged by the metalens, which thus functions like an off-axis reflective lens. To realize this optical behavior, the phase profile φ(x, y) of the metalens should be designed as (see Supplementary S1 and Fig. S1 for the detailed derivation):
$$\varphi \left( {x,y} \right)=\frac{{2\pi }}{{{\lambda _d}}}\left( {f - \sqrt {{x^2}+{y^2}+{f^2}} - \frac{x}{{\sqrt 2 }}} \right) \tag{1}$$
where x and y are the coordinates of each unit cell, λd is the design wavelength, and f is the focal length. To realize this phase distribution, we designed a Pancharatnam-Berry (PB) phase metalens, whose unit cell structure is shown in Fig. 2b. It consists of a crystalline silicon (Si) nanorod on a sapphire substrate with fixed length L, width W, height H, and period P, but a spatially varying rotation angle θ(x, y). For circularly polarized incident light, the rotation angle of each nanorod generates a phase shift φ(x, y) = 2θ(x, y), which is caused by the birefringence arising from the asymmetric nanorod structure [37, 38]. Therefore, the desired deflection and focusing effects can be achieved.
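As an illustration of how Eq. (1) and the PB relation φ = 2θ translate into a nanorod layout, the following minimal sketch evaluates the target phase at each unit-cell center and converts it to a rotation angle. It is not the authors' design code. The numerical values (λd = 633 nm, f = 20 mm, P = 250 nm) are taken from later in this section, while the function names and the small 101 × 101 grid are hypothetical choices made only to keep the example light.

```python
import numpy as np


def phase_profile(x, y, wavelength, focal_length):
    """Target phase of Eq. (1), in radians, at position (x, y) in meters."""
    lens_term = focal_length - np.sqrt(x**2 + y**2 + focal_length**2)
    deflection_term = -x / np.sqrt(2)  # compensates the 45-degree incidence tilt
    return (2 * np.pi / wavelength) * (lens_term + deflection_term)


def rotation_angle(phase):
    """PB phase: phi = 2 * theta, so theta = phi / 2 (a rod is unchanged by a pi rotation)."""
    return np.mod(phase / 2.0, np.pi)


def unit_cell_centers(n_cells, period):
    """Centers of an n_cells x n_cells square lattice centered on the origin."""
    idx = np.arange(n_cells) - (n_cells - 1) / 2.0
    return np.meshgrid(idx * period, idx * period, indexing="xy")


wavelength = 633e-9    # design wavelength (from the text)
focal_length = 20e-3   # focal length of about 20 mm (from the text)
period = 250e-9        # unit-cell period P (from the text)

# Hypothetical 101 x 101 patch around the lens center; the fabricated
# sample has 4000 x 4000 cells, which the same functions would handle.
x, y = unit_cell_centers(101, period)
phi = phase_profile(x, y, wavelength, focal_length)
theta = rotation_angle(phi)
```

Near the lens center the phase is dominated by the linear deflection term, so neighboring nanorods along x differ in rotation by a nearly constant amount, with the focusing term adding a slowly varying correction.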
In the design and optimization of this optical see-through reflective metalens-visor, both reflection (for the virtual images) and transmission (for the environmental scenes) should be considered. First, according to Fresnel theory, achieving high reflection requires a high refractive index contrast at the interface. Silicon is a high-refractive-index material that is compatible with complementary metal-oxide-semiconductor (CMOS) mass production and industrialization, and it has already been used to realize high-performance metalenses [21, 39–41]. Compared to amorphous silicon, crystalline silicon has a lower absorption coefficient at visible wavelengths and has been reported as a material platform for high-efficiency metalenses [42–44]. Second, to preserve a good see-through property, the size of the nanorod and the aperture ratio of the unit cell are critical because of the non-negligible absorption of silicon in the visible region [43, 45], and therefore need to be carefully tuned. Third, to achieve high diffraction efficiency for the reflected light, the orthogonal polarization components should experience similar amplitude attenuation but a phase retardation close to π upon reflection, according to the PB phase. Therefore, we conducted systematic simulations to optimize the structure parameters using a commercial finite-difference time-domain (FDTD) solver (Lumerical Inc., Vancouver). We varied the length, width and height of the nanorods in a meta-grating cell with six gray levels (Fig. S2), aiming for high 1st-order diffraction efficiency in reflection and a balanced transmission spectrum for normally incident light from 400 nm to 700 nm. The detailed optimization process is described in Supplementary S2 and the simulated results are shown in Fig. S3.
Figure 2c shows the 1st-order diffraction efficiencies of the reflected light as the length and width of the nanorods vary while the height is kept at 230 nm. Figure 2d shows the simulated transmission spectra for three selected groups of structures. After balancing the reflection and transmission performance, we chose the following optimized parameters for the metalens: P = 250 nm, L = 160 nm, W = 100 nm, and H = 230 nm.
Figure 1b schematically illustrates an AR eyeglasses architecture using the proposed metalens visor: an off-axis incident image from the micro-display is reflected and converged by the metalens towards the eye pupil. Meanwhile, because the visor is see-through in the visible region, ambient natural light can pass directly through it and enter the eye. Hence, the eye sees virtual images augmented on the real world through the ultra-compact single-piece visor.
To validate our design, we fabricated a metalens at the design wavelength of λd = 633 nm. Figure 3 shows scanning electron microscope (SEM, Zeiss Auriga) images of a fabricated sample. Figure 3a shows the top view of the central portion of the metalens, and Fig. 3b shows the sidewalls of the nanorods in a 45° tilted view. From the figures, we find that the structure parameters of these nanorods agree well with the design values. It should be noted that, because of the limitations of our fabrication facility, the sample has a small area of 1 × 1 mm² (4000 × 4000 unit cells), resulting in a small numerical aperture (NA) of 0.035 (see Supplementary S3 and Fig. S4 for the detailed characteristic parameter description). Here, the NA of the metalens is calculated based on the diagonal of the square aperture.
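The quoted NA can be checked with a short calculation. The sketch below assumes the common definition NA = sin(arctan(a/f)) in air, with a taken as half the diagonal of the square aperture and f ≈ 20 mm as measured later in this section. The exact definition used by the authors is given in Supplementary S3, so this is only a consistency check, and the function name is hypothetical.

```python
import math


def numerical_aperture(half_aperture, focal_length):
    """NA in air for a lens of given half-aperture and focal length (same units)."""
    return math.sin(math.atan(half_aperture / focal_length))


side = 1e-3            # aperture side length: 1 mm (from the text)
focal_length = 20e-3   # focal length of about 20 mm (from the text)
period = 250e-9        # unit-cell period P (from the text)

half_diagonal = side * math.sqrt(2) / 2      # about 0.707 mm
na = numerical_aperture(half_diagonal, focal_length)
n_cells = round(side / period)               # unit cells along one side

print(f"NA = {na:.3f}, cells per side = {n_cells}")
```

Under these assumptions the result agrees with the stated NA of 0.035 and with the 4000 × 4000 unit-cell count.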
We first characterized the focusing performance of our reflective metalens using a HeNe laser beam (633 nm). At about 20 mm from the metalens in the normal reflection direction, we observed the minimal focal spot on a screen, which confirms a focal length of approximately 20 mm. To precisely measure the focal spot size, we adopted the experimental setup depicted in Fig. S5, and the focal spot images captured by a charge-coupled device (CCD) camera are shown in Fig. 4. Figures 4a to 4c show the measured symmetric focal spots of our metalens at three discrete illumination wavelengths: the design wavelength of 633 nm and two other wavelengths, 532 nm and 457 nm. The corresponding cross-sections of the three focal spots are shown in Fig. 4d to 4f, with the full width at half maximum (FWHM) labeled in the legends. The metalens shows slight defocusing under blue illumination (Fig. 4f) because the phase mask is designed for λ = 633 nm rather than 457 nm. We also simulated the focusing properties based on the Rayleigh-Sommerfeld diffraction formulas in Matlab and obtained the cross-sections of the focal spots at the three wavelengths, with the corresponding FWHM labeled in the legends (Fig. 4d to 4f). The focal spots of our metalens at the three illumination wavelengths are nearly diffraction limited. Although the metalens is designed for λ = 633 nm, it shows good focusing performance for red, green and blue light. Fig. S6 shows the enlarged focal spot of the fabricated metalens at the illumination wavelength of 633 nm, compared with the FDTD simulation results. We further measured the focusing efficiencies of our fabricated metalens with a detector (Newport 918D-SL-OD3) placed at the focal plane in front of the metalens. Here, the focusing efficiency is defined as the ratio of the optical power collected by the detector at the focal plane to that of the circularly polarized incident light.
The measured focusing efficiencies are 16.03%, 3.92% and 2.01% at λ = 633 nm, 532 nm and 457 nm, respectively. The efficiency for red light is competitive with the values reported in refs. [21], [36] and [46]. The relatively low efficiencies for green and blue are mainly attributed to the unit cell structure not being optimized for these two wavelengths, as well as to the stronger absorption of silicon at shorter wavelengths.
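For a rough sense of what "near diffraction limited" means at this aperture size, the sketch below estimates the focal-spot FWHM at the design wavelength. It does not reproduce the Rayleigh-Sommerfeld simulation mentioned above. It assumes a uniformly illuminated 1 mm square aperture, a paraxial focus at f = 20 mm, and a cross-section taken parallel to an aperture edge, for which the focal intensity follows sinc²(D·x/(λ·f)). The function names are hypothetical.

```python
import numpy as np


def sinc2_half_width(tol=1e-12):
    """Solve sinc(u)^2 = 1/2 on (0, 1) by bisection, with sinc(u) = sin(pi u) / (pi u)."""
    lo, hi = 0.0, 1.0  # sinc^2 decreases monotonically from 1 to 0 on this interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.sinc(mid) ** 2 > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def square_aperture_fwhm(wavelength, focal_length, side):
    """FWHM of the focal-plane intensity along an axis parallel to an aperture edge."""
    return 2 * sinc2_half_width() * wavelength * focal_length / side


half_width = sinc2_half_width()
fwhm_633 = square_aperture_fwhm(633e-9, 20e-3, 1e-3)
print(f"Estimated diffraction-limited FWHM at 633 nm: {fwhm_633 * 1e6:.1f} um")
```

The estimate is only a yardstick of order ten micrometers. The values to compare against are the measured and simulated FWHMs labeled in Fig. 4d to 4f, and at 532 nm and 457 nm the chromatic shift of the focal plane would also have to be taken into account.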
We also measured the visible transmission spectrum of the metalens using a spectrometer, with the light source again set to right-handed circular polarization. As shown in Fig. 5, the metalens has good see-through characteristics, with relatively high transmission efficiency across most of the visible spectrum. Overall, the experimental result is in good agreement with the simulation, except that the dips in the measured spectrum are shallower than expected. Such a deviation might be attributed to measurement errors or fabrication imperfections. For instance, although we attached a diaphragm aperture with a diameter of 0.8 mm to the sample to prevent the incident light from reaching beyond the effective region (1 mm × 1 mm), some unwanted stray light might still have passed outside the effective region because of imprecise alignment.
Given its good see-through property and focusing performance, we employed the fabricated metalens-visor in a prototype AR display, where it serves as an eyepiece and a combiner simultaneously. Figure 6 shows the experimental setup of the AR display system. Illuminated by collimated laser light, the reflective spatial light modulator (SLM) generates the desired virtual image pattern for the AR display. After passing through a circular polarizer, the virtual image light is converted to right-handed circular polarization. A telescope system consisting of two positive lenses (lens1: f = 400 mm; lens2: f = 50 mm) then reduces the beam size while keeping the beam collimated, so that the output light illuminates only the effective region of the sample. Next, the orientation of the sample is carefully adjusted to ensure that the beam impinges on the sample at an incident angle of 45°. If the human eye is positioned at the focal point, it should be able to see both the reflected virtual image and the real-world scene. In our experiment, we used a camera (CANON EOS M10 with CANON ZOOM LENS EF-M 15–45 mm) in place of the human eye to capture the displayed images. Here, a black card with a 1 mm × 1 mm hole was also affixed to the back of the sample and strictly aligned with it, to make sure that the effective region was the only window through which real-world light could pass.
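The beam reduction provided by the telescope follows directly from the two focal lengths. The sketch below computes the magnification for the stated lenses and, as a purely geometric aside, the largest collimated beam diameter whose footprint at 45° incidence still fits within the 1 mm effective region. The input beam size is not stated in this section, so the back-computed input diameter is illustrative only, and the function names are hypothetical.

```python
import math


def telescope_magnification(f_first, f_second):
    """Beam-diameter ratio (output / input) of a two-lens telescope."""
    return f_second / f_first


def footprint_length(beam_diameter, incidence_deg):
    """Length of a collimated beam's footprint along the tilt direction on a tilted surface."""
    return beam_diameter / math.cos(math.radians(incidence_deg))


magnification = telescope_magnification(400e-3, 50e-3)   # lens1 and lens2 from the text

# Largest output beam whose footprint at 45 degrees stays within 1 mm.
max_output_diameter = 1e-3 * math.cos(math.radians(45.0))
max_input_diameter = max_output_diameter / magnification

print(f"Beam size is scaled by {magnification:.3f} (an 8x reduction)")
print(f"Output beam up to {max_output_diameter * 1e3:.2f} mm, "
      f"input beam up to {max_input_diameter * 1e3:.2f} mm")
```

The footprint elongation by 1/cos(45°) is why the beam must be somewhat narrower than the 1 mm aperture along the tilt direction.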
Figures 7a to 7c show the AR imaging results when the camera was focused at different distances. In the figures, one can see the virtual image “R” augmented on a real-world scene containing three representative objects: a dial, a puppet, and a piece of white paper with the black letters “SJTU” printed on it, placed 150 mm, 600 mm, and 2000 mm from the camera, respectively. The quality of the virtual image is reasonably good, except for some noise caused by diffraction from the small SLM pixels and by the coherent laser light employed. This problem can be alleviated by optimizing the distance between the telescope system and the metalens-visor, by choosing another image source such as a laser scanning projector [47], or by adopting an incoherent light-emitting diode (LED) light source. The real-world scene is clearly seen with high color fidelity. The proposed single-piece metalens-visor thus realizes the functions of a reflective lens and an optical combiner simultaneously. Such a metalens-visor would decrease the system complexity, weight, and form factor of an AR display, which are critical for the comfortable wearing of an AR headset.
Moreover, because collimated virtual-image light was employed in our experiment, an interesting phenomenon was observed. As shown in Fig. 7a, when the camera was focused at the depth of the dial (150 mm), the virtual image “R” was clearly displayed but the distant puppet and “SJTU” were blurred. In Fig. 7b and 7c, when the camera was focused at the depth of the puppet (600 mm) and “SJTU” (2000 mm), respectively, the virtual image remained clear. We observed a similar phenomenon when we used a black scattering screen to receive the real images with the experimental setup shown in Fig. S7; Fig. S8 shows the corresponding imaging results. We also used a receiving screen with a graduated scale to measure the image size at different depths. As shown in Fig. S9, the image size first decreases and then increases, and the minimum image size (a point) is reached at the focal length of our metalens. Thus, the collimated image light is converged to a point by the metalens visor. If the pupil is positioned at the focal point, the system behaves like a pinhole imaging system and allows the user to observe always-clear images on the retina regardless of eye accommodation. The display technique employed here is the Maxwellian view display technique [48] (see Supplementary S6 and Fig. S10 for a more detailed explanation). By employing the Maxwellian view display technique in an AR system, the vergence-accommodation conflict, which induces 3D visual fatigue in conventional AR displays, could be eliminated [49, 50]. However, the technique suffers from a small eyebox, so the virtual image is easily lost as the eye moves. Possible solutions to this problem include employing a multi-view display technique [51] or a pupil duplication technique [52, 53], as described in Supplementary S7 and shown in Fig. S11.
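The "decreases first and then increases" trend of the image size can be illustrated with a simple geometric-optics sketch of a collimated beam converging to the focal point. The model ignores diffraction, so the minimum size is exactly zero rather than a finite spot, and it assumes a hypothetical beam width of 0.7 mm at the metalens. It is not a reproduction of the measurement in Fig. S9.

```python
def beam_width(distance, initial_width, focal_length):
    """Geometric width of an initially collimated beam at a given distance past a focusing element."""
    return initial_width * abs(1.0 - distance / focal_length)


focal_length = 20e-3    # focal length of about 20 mm (from the text)
initial_width = 0.7e-3  # hypothetical beam width at the metalens

# Sample the width before, at, and beyond the focal plane.
for z_mm in (0, 10, 20, 30, 40):
    w = beam_width(z_mm * 1e-3, initial_width, focal_length)
    print(f"z = {z_mm:2d} mm -> width = {w * 1e3:.2f} mm")
```

Because every ray bundle passes through the same small region at the focal plane, a pupil placed there acts like the pinhole described above, which is also why the usable eyebox is small.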
As discussed above, the reflective metalens-visor also shows good focusing characteristics at green (532 nm) and blue (457 nm) wavelengths, although it is designed for a single wavelength of 633 nm. Thus, we further employed red, green and blue lasers (633 nm, 532 nm and 457 nm) as the light sources for the AR display to perform RGB color imaging. In principle, if the metalens-visor were achromatic, we could conveniently realize full-color imaging by strictly aligning the red, green and blue beams as shown in Fig. S12. Because our metalens-visor is dispersive, which yields distinctly different diffraction angles for 532 nm and 457 nm, we carefully adjusted the incident angles and divergence of the three beams (see Fig. S13 for the experimental setup) so that the red (letter “R”), green (letter “G”) and blue (letter “B”) virtual images could be captured by the camera simultaneously. Figures 7d and 7e show the multi-color AR images at different focusing distances, and Fig. 7f shows the corresponding VR image for the same virtual content. Similarly, as seen in Fig. 7d and 7e, the virtual images remain clear as the camera’s focus distance varies. Here, the virtual image quality is improved because diffraction is suppressed by replacing the pixelated SLM with three patterned masks to generate the red, green, and blue images.
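The wavelength dependence of the deflection can be estimated from the linear term of Eq. (1) alone. In the sketch below, the phase gradient fixed at λd = 633 nm is treated as a grating that sends 45° incidence to the normal at the design wavelength, so that sin θ_out = sin θ_in − (λ/λd)·sin 45°. The sign convention and the function names are assumptions, and the focusing term, which is also chromatic and is the reason the beam divergence had to be adjusted, is ignored. The actual angles used in the experiment are those of Fig. S13, not the values computed here.

```python
import math

SIN_45 = math.sin(math.radians(45.0))


def output_angle_deg(wavelength, design_wavelength, incidence_deg=45.0):
    """Deflection angle from the normal for light hitting the fixed phase gradient."""
    s = math.sin(math.radians(incidence_deg)) - (wavelength / design_wavelength) * SIN_45
    return math.degrees(math.asin(s))


def incidence_for_normal_output_deg(wavelength, design_wavelength):
    """Incidence angle that steers a given wavelength along the normal."""
    return math.degrees(math.asin((wavelength / design_wavelength) * SIN_45))


design_wavelength = 633e-9
for wavelength in (633e-9, 532e-9, 457e-9):
    out = output_angle_deg(wavelength, design_wavelength)
    inc = incidence_for_normal_output_deg(wavelength, design_wavelength)
    print(f"{wavelength * 1e9:.0f} nm: exits at {out:5.2f} deg for 45 deg incidence; "
          f"needs {inc:5.2f} deg incidence for normal exit")
```

Under these assumptions, green and blue light incident at 45° leave the metalens several degrees off the normal, which is consistent with the need to give each color its own incident angle.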
Here, because of the polarization dependency of the PB metalens, an additional circular polarizer (a linear polarizer plus a quarter-wave plate) is required to generate circularly polarized incident light from our linearly polarized laser beam. However, adding such an ultrathin circular polarizer would barely increase the energy loss, weight, or cost of the system.