1 Introduction
Head-mounted displays (HMDs) are of great interest in current research, including augmented reality (AR) smart glasses [1], virtual reality (VR) devices [2], and mixed reality (MR) technology. In conventional HMDs, accommodation and convergence are inherently dissociated, which causes eye fatigue and compromises image quality [3]. However, retinal projection displays (RPDs), a recent and important development in HMDs [4], can overcome this inherent dissociation between accommodation and convergence [5,6] and support an extremely long focal depth.
Retinal projection technology developed from retinal scanning technology, which originated from the scanning laser ophthalmoscope (SLO) built by Webb et al. in 1980 [7]. The SLO relies on specially designed scanning and detecting devices that steer a laser beam two-dimensionally through the pupil of the eye onto the retina and reconstruct the retinal image from the light reflected by the fundus. This technology has been widely used in medical imaging to obtain images of the retinal fundus [8–10]. Webb et al. concluded that the user can observe a digital image if the scanning beam is modulated with virtual digital information [7].
In 1986, Yoshinaka proposed using a scanning device to control the light beam projected onto the retina for image display, in order to reduce the volume and energy consumption of the display device. He pointed out that such a device could potentially display stereoscopic images and could therefore be used in near-eye displays [11,12]. Based on the concept of scanning an image directly onto the retina, a retinal scanning display prototype was developed at the Human Interface Technology Laboratory (HIT Laboratory) by Kollin et al. in 1992 [13]. Kollin also noted that the spatial position of the focal point on the retina is determined by the angle of the incident beam; thus, ocular accommodation does not blur the digital image [13–15].
Several companies have completed RPD prototypes as near-eye displays. Based on technology spun off from the HIT Laboratory, Microvision Inc., founded in 1993, built a prototype called the Nomad Personal Display System in 2001 using micro-electro-mechanical system (MEMS) scanning mirrors for laser projection [16,17]. Brother Inc. demonstrated its eyewear, named AiRScouter, at the Brother World Japan 2010 Expo [16,18]. However, neither company produced its prototype commercially, and both ceased development, probably because eyewear terminal equipment did not gain popularity until recently. In 2014, Avegant Int. formally released a VR device called Glyph, which provides stereoscopic vision by means of retinal projection technology [19]. In the same year, QD Laser Int. demonstrated its AR eyewear, Laser Eyewear (LEW), based on laser retinal imaging optics [20,21] and composed primarily of a MEMS mirror, an RGB laser module, and a free-surface reflection mirror [16]. With these developments, retinal projection HMD technology is likely to be used in display devices for the next generation of computer terminals.
Drawing on the state of the art in RPD design and development, this work reviews the literature from an optical engineering perspective. The principles and applications of four underlying theories [4,22] are examined: the Maxwellian view and its modified form [23], as well as the monocular and binocular depth cues of stereoscopic objects in the physiology of the human visual system [22]. To realize the Maxwellian view and build retinal projection systems, results from previous design works using different methods [23–32] are analyzed. A prototype of a full-color stereoscopic see-through RPD system with an extremely long focal depth is then discussed. Finally, a brief outlook on future development trends and applications of RPDs is presented.
2 Principles
This section explores the geometric optics of the Maxwellian view, the key technology of the retinal projection HMD [33,34]. The imaging method of the conventional HMD is compared with the Maxwellian view to illustrate how virtual information is projected onto the human retina. A modified Maxwellian view [23], which overcomes the drawbacks of the original, is then introduced.
2.1 Maxwellian view
Figure 1 shows that in the Maxwellian view, thin parallel beams emitted from a spatial light modulator (SLM), such as a liquid crystal display (LCD) or an organic light-emitting diode (OLED) panel, are all converged at the center of the pupil by the lens system and projected directly onto the retina. Each pixel of the SLM, which forms a clear image at any position behind the SLM, stimulates a certain unique point on the retina, so the image pattern modulated by the SLM is recognized. Note that the image pattern can be observed in the Maxwellian view without ocular accommodation, because all the beams pass through the center of the crystalline lens. The image pattern therefore supports an extremely long focal depth.
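As an illustrative sketch (a simplified reduced-eye model with assumed values, not taken from the cited works), the accommodation invariance can be expressed as follows. A thin beam that passes through the nodal point of the crystalline lens is essentially undeviated, so the retinal position of each pixel is set only by its incident angle $\theta$:

$$x_{\mathrm{retina}} \approx L \tan\theta,$$

where $L \approx 17\ \mathrm{mm}$ is the assumed nodal-point-to-retina distance. Because changing the accommodation state (the lens power) does not deviate the ray through the lens center, $x_{\mathrm{retina}}$ is essentially unchanged, which is the geometric reason for the extremely long focal depth.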
In a conventional HMD, by contrast, beams are emitted from the display, adjusted by the lens system, focused by the crystalline lens, and projected onto the retina, as shown in Fig. 2. Ocular accommodation is therefore indispensable in a conventional HMD to obtain a clear image.
In addition, a conventional HMD displays virtual images at a fixed focal distance, whereas the eye dynamically accommodates to the changing fixation points of real objects. Real and virtual objects therefore cannot both be observed steadily, because the observer can accommodate to only one of them at a time [23], as indicated in Fig. 3(a). Using such AR smart glasses in complex environments is also unsafe, because it forces the observer's eyes to accommodate continually. Compared with the conventional HMD, the thin parallel beams of the RPD support an extremely long focal depth, giving the observer a clear view of the virtual object even while the eyes accommodate to real objects, as shown in Fig. 3(b).
The Maxwellian view makes it possible to resolve the inherent dissociation between accommodation and convergence, support an extremely long focal depth, and minimize the effect of the eye's optical aberrations. However, the Maxwellian view is limited by its extremely narrow viewing area, because the beams must converge strictly at the center of the crystalline lens.
2.2 Modified Maxwellian view
To overcome the narrow viewing area of the Maxwellian view, Takahashi proposed the modified Maxwellian view based on a holographic optical element (HOE) [23]. This innovation allows the observer to keep viewing an image while the eyes rotate within a limited range. In contrast to the Maxwellian view, the converging point of the beams in the modified Maxwellian view is located at the rotation center of the human eyeball, as shown in Fig. 4. The observer can keep viewing an image with rotating eyes, although the field-of-view angle is narrowed by the aperture of the pupil [23]. This method overcomes the narrow viewing area to a certain extent, but it neglects the accommodation of the crystalline lens, and the pupil aperture restricts the field of view.
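As an illustrative estimate (with assumed anatomical values, not taken from Ref. [23]), when the beams converge at the rotation center of the eyeball, a beam inclined at angle $\theta$ to the eye's axis crosses the pupil plane at an offset of $d\tan\theta$ and is vignetted once this exceeds the pupil radius. The instantaneous field of view is therefore roughly

$$\mathrm{FOV} \approx 2\arctan\!\left(\frac{D_p}{2d}\right) \approx 2\arctan\!\left(\frac{4\ \mathrm{mm}}{2\times 10\ \mathrm{mm}}\right) \approx 23^{\circ},$$

where $D_p$ is the pupil diameter (assumed 4 mm) and $d$ the distance from the pupil plane to the rotation center (assumed about 10 mm).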
3 Methods supporting the Maxwellian view
The discussion of the Maxwellian view in Section 2 shows that obtaining thin light beams and converging them are the key points of the approach. This section summarizes methods for achieving the Maxwellian view from previous design works by examining the state-of-the-art literature on retinal projection; their advantages and disadvantages are also analyzed.
3.1 Obtaining thin light beams
The importance of thin light beams is evident in Fig. 3(b): each thin light beam from an image pixel stimulates a certain unique point on the retina, allowing the image pattern to be recognized. Two methods to obtain thin light beams are described below.
The first method is the pinhole imaging model shown in Fig. 5(a). The image of the light source is modulated by the pinhole and projected onto the retina in the Maxwellian view. The thin light beam from each image pixel of the light source passes through the pinhole, converges at the center of the crystalline lens, and is projected onto the retina. Image clarity is independent of the accommodation and focus position of the crystalline lens. However, because only a small portion of the light energy passes through the pinhole, the observed image is dark and the energy efficiency of the light source is low.
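As a rough illustrative estimate (with assumed dimensions, not taken from Fig. 5(a)), for a Lambertian pixel at a distance $d$ from a pinhole of area $A$, the fraction of emitted power collected on axis is approximately

$$\eta \approx \frac{A}{\pi d^{2}} = \frac{\pi\,(0.5\ \mathrm{mm})^{2}}{\pi\,(50\ \mathrm{mm})^{2}} = 10^{-4},$$

i.e., on the order of 0.01% of each pixel's light reaches the eye, which is why the pinhole image is dark and a bright, well-collimated source such as a laser is attractive.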
The other method to obtain thin light beams is laser imaging, as shown in Fig. 5(b) [35]. This method offers good collimation and miniaturization. Combined with a MEMS mirror [36], the laser can form a two-dimensional image by scanning across the human retina through the center of the crystalline lens. Laser imaging can achieve very high luminance for both day and night operation; however, the speckle effect blurs the laser image.
The current mainstream approach is a combination of the two methods. In this approach, the laser is used to overcome the low light intensity of the pinhole image, while the pinhole serves as a modulator to reduce the speckle effect and as a spatial filter to suppress higher-order diffraction.
3.2 Methods of convergence
The convergence of thin light beams at the center of the crystalline lens is another distinctive feature of the RPD. Methods of convergence are summarized from previous design works, including the freeform free-space magnifier, the half-mirror, the non-axisymmetric free-surface reflection mirror, and HOEs. Other methods with the potential to support convergence are also presented.
A feasible strategy for converging thin light beams is to use a magnifier to produce a distinguishable image of the SLM. Glyph, released by Avegant Int. [19], used a patented freeform free-space magnifier to obtain an occluded view. By means of this magnifier, low-power LEDs with digital light processing (DLP) provide a virtual image converged at the center of the crystalline lens. The magnifier alone, however, provides convergence with only a non-see-through view unless it is combined with other optical elements.
As illustrated in Fig. 6, Yang et al. [24,25] used a half-mirror with a convex lens to reflect the image from a liquid crystal on silicon (LCoS) display, converge the thin light beams at the center of the crystalline lens, and project them onto the retina, while keeping real-world objects visible through the RPD system. The half-mirror superimposes the virtual image on the real-world scene. However, an RPD with a half-mirror cannot deliver both the virtual image and the real-object rays at appropriate intensities because of the tradeoff between reflectance and transmittance.
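As an illustrative relation (not taken from Refs. [24,25]), for a lossless half-mirror with reflectance $R$ and transmittance $T$,

$$R + T \le 1, \qquad I_{\mathrm{virtual}} = R\,I_{\mathrm{display}}, \qquad I_{\mathrm{real}} = T\,I_{\mathrm{scene}},$$

so increasing the reflected virtual-image intensity necessarily reduces the transmitted real-scene intensity and vice versa; with $R = T = 0.5$, at most half of either source reaches the eye.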
Ando et al. [27] presented another method, which uses an HOE to superimpose a virtual image on a real scene. The HOE is a diffraction grating fabricated by holographic techniques, and the light source is an LCD modulated by a digital micro-mirror device (DMD). Figure 7 shows the configuration of the HOE, in which the parallel rays from the left and right are diffracted by the HOE and converged at the corresponding eye. Appropriate intensities of both the real-world object and the diffracted image can therefore be seen simultaneously, because the HOE diffracts only rays of specific wavelengths without reducing the intensity of the other visible rays. This binocular system can potentially support a stereoscopic see-through RPD, as Takahashi hypothesized [23]. However, the angular and wavelength selectivity of the HOE makes the virtual image monochromatic. Accurate fabrication of such an HOE is particularly demanding because of the reflection and aberration introduced by the curved shape of the spectacle lens. Moreover, it is also difficult to accurately align the optical axis of the HOE with the optical axis of the HMD [28]. These drawbacks hinder the application of HOEs to some extent.
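As an illustrative sketch of this wavelength selectivity (a generic grating relation, not the specific design of Ref. [27]), a volume HOE with fringe spacing $\Lambda$ strongly diffracts only rays near the Bragg condition

$$2\Lambda\sin\theta_B = \lambda,$$

so a grating recorded for, say, $\lambda = 532\ \mathrm{nm}$ redirects the projected green beam toward the eye while rays of other visible wavelengths from the real scene pass through nearly unaffected; this same selectivity is also why a single HOE yields a monochromatic virtual image.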
The universal-design compact RPD eyewear from QD Laser Int., based on a free-surface reflection mirror, was presented by Sugawara et al. [16]. As shown in Fig. 8, the free-surface reflection mirror is designed to collimate the RGB laser beam that is first scanned by the MEMS mirror; the beam is then converged at the center of the crystalline lens. As with the other convergence methods, its ability to match the convergence point to the center of the crystalline lens must be considered.
Based on the analysis above, the methods of obtaining convergence for the Maxwellian view have been summarized. This study further holds that the freeform total internal reflection (TIR) element, the prism mirror, and the waveguide, which have been used in conventional HMDs, can also potentially support the Maxwellian view.
4 Realization of depth cues
In Sections 2 and 3, the principles of the Maxwellian view and its modified form were explored and summarized, and the methods of obtaining two-dimensional virtual information were applied to verify the theory. However, a two-dimensional virtual image on its own can be observed without any depth relation to real-world objects, whereas the viewer needs to perceive the virtual information in registration with the real-world scene. This section therefore examines how an RPD with depth cues provides stereoscopic vision for the viewer, in line with HMD research.
In the physiology of the human visual system, depth cues can be divided into monocular and binocular depth cues [22]. Based on these theories, the current methods of providing depth cues in the RPD [2,23,29] are analyzed in this section.
4.1 Monocular depth cues and their realization in the RPD
To obtain monocular depth cues, the viewer adjusts the focal length of the crystalline lens through the ciliary muscles, so that different parts of the scene are brought into focus, as illustrated in Fig. 9.
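As an illustrative calculation with a simplified single-lens eye model (assumed values, not taken from Fig. 9), the required lens power follows from the thin-lens relation

$$\frac{1}{f} = \frac{1}{d_o} + \frac{1}{d_i},$$

with an assumed lens-to-retina distance $d_i \approx 17\ \mathrm{mm}$: focusing at infinity requires about $1/0.017 \approx 59\ \mathrm{D}$, whereas focusing on an object at $d_o = 0.25\ \mathrm{m}$ requires roughly $59 + 4 = 63\ \mathrm{D}$. This change in lens power is the accommodation that monocular depth cues exploit.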
In practice, Takahashi et al. used an HOE composed of three types of HOEs (H1, H2, and H3), as demonstrated in Fig. 10(a), to provide a monocular multi-view with three images at different depth cues [29]. Each type of HOE converges the corresponding projected image at its respective convergent point (C1, C2, and C3) on the central plane of the crystalline lens, as shown in Fig. 10(b). However, this multi-view system displays the virtual image at different distances within the range of eye accommodation, which does not strictly agree with the Maxwellian view. Because the HOE comprises a finite number of HOE types supporting a corresponding number of depth cues, the multi-view system can provide only a limited number of depths. Moreover, monocular depth cues support an effective perception distance of only a few meters unless they are combined with binocular convergence. Hence, the system is confined to a limited perception distance.
4.2 Binocular parallax and its realization in the RPD
When one views an object, the light beams from the object are focused at the center of the retina of each eye. Owing to the interpupillary distance (6.5 cm on average), the relative position of the same scene differs between the left and right eyes, which provides the binocular parallax indicated in Fig. 11. The human brain constructs stereoscopic impressions from this binocular parallax. This mechanism allows two cameras separated by a certain distance to capture a stereogram in side-by-side format, as shown in Fig. 12. The two-dimensional images carrying depth-parallax information are transmitted to the left and right eyes of the viewer and fused by the human brain into a stereoscopic percept.
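As an illustrative relation (using the interpupillary distance quoted above and assumed viewing distances), the binocular convergence angle for a fixation point at distance $D$ is approximately

$$\theta \approx 2\arctan\!\left(\frac{b}{2D}\right),$$

with $b = 6.5\ \mathrm{cm}$: an object at $D = 0.5\ \mathrm{m}$ subtends $\theta \approx 7.4^{\circ}$, whereas one at $D = 5\ \mathrm{m}$ subtends only about $0.74^{\circ}$. It is this distance-dependent difference between the two retinal images that the rendered stereogram must reproduce.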
Based on the binocular parallax mechanism, a prototype of a full-color stereoscopic see-through RPD system, notable for its extremely long focal depth, is presented. To address the disadvantages of previous stereoscopic see-through RPDs, namely non-fusion and monochromaticity [23], the universal-design optical structure shown in Fig. 13 is used to achieve the Maxwellian view. A micro-pore P of specific size, which modulates the thin light beam, is located at the front focal plane of lens L3, and the pupil of the human eye is located at the back focal plane of the lens system, so that the micro-pore and the pupil are conjugate. To overcome the drawback that an RPD with a half-mirror cannot deliver both sets of rays at appropriate intensities, the laser intensity is controlled and adjusted according to the reflectance and transmittance of the half-mirror. Figure 14 shows the schematic diagram of the binocular stereoscopic RPD. The screens display the corresponding binocular-parallax images of the stereogram. On the left side, P1 and P2 indicate different points of the stereogram, which are projected through the lens system and the center of the crystalline lens and finally onto the retina of the viewer. In this symmetric system, when the binocular light beams projected on the retinas are extended backward, they intersect at P1″ and P2″, and a stereogram is constructed in the human brain. To demonstrate the effectiveness of the binocular stereoscopic see-through RPD, the experimental symmetric prototype in Fig. 15 was constructed. This prototype is not head-mounted but desktop-based, limited by the miniaturization of the laser projection. The long focal depth of the Maxwellian view is verified in Fig. 16: a clear view of the virtual image is obtained whether the eye focuses on a near or a far scene.
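As a minimal numerical sketch of this intensity-balancing idea (the quantities, values, and function below are illustrative assumptions, not the authors' actual calibration), the laser output can be scaled so that the reflected virtual image roughly matches the transmitted real scene:

```python
# Minimal sketch: balance virtual-image and real-scene intensities at the eye
# for a half-mirror combiner (illustrative assumptions only).

def required_laser_intensity(scene_luminance, reflectance, transmittance,
                             target_ratio=1.0):
    """Return the laser (display) luminance needed so that the virtual image
    reflected by the half-mirror reaches `target_ratio` times the real-scene
    luminance transmitted through it.

    Simple lossless-combiner model:
        L_virtual = reflectance   * L_laser
        L_real    = transmittance * L_scene
    Setting L_virtual = target_ratio * L_real gives the expression below.
    """
    return target_ratio * transmittance * scene_luminance / reflectance

# Example: an assumed 50/50 half-mirror and an indoor scene of ~250 cd/m^2.
L_laser = required_laser_intensity(scene_luminance=250.0,
                                   reflectance=0.5,
                                   transmittance=0.5)
print(f"Laser/display luminance for equal brightness: {L_laser:.0f} cd/m^2")

# With a 30/70 mirror (more see-through), the laser must be brighter:
print(required_laser_intensity(250.0, reflectance=0.3, transmittance=0.7))
```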
5 Conclusions
This paper presented a review of state-of-the-art RPD design, focusing on optical engineering aspects including the Maxwellian view, its modified form, and the realization of human depth-cue perception. The key differences between the optical design of RPDs and that of conventional HMDs were discussed with regard to the methods of obtaining thin light beams and realizing convergence. To address these two problems, methods from previous design works, including the pinhole image, laser image, magnifier, half-mirror, HOE, and free-surface reflection mirror, were summarized and analyzed. To address the disadvantages of previous designs, namely non-fusion [23] and monochromaticity, a prototype of a full-color stereoscopic see-through RPD system with an extremely long focal depth was presented.
Compared with conventional HMDs, the RPD can relieve eye fatigue by matching accommodation and convergence. Thus, RPDs can be widely used in entertainment, education, industrial, surgical-assistance, and military applications, as well as in other long-duration tasks. Opportunities for further research lie in matching the convergence point to the center of the crystalline lens, expanding the viewing area, and refining models of depth perception; these define the research directions for RPDs. We are confident that RPD development will have a bright future in the fields of AR and VR.