
Received 17 October 2023; revised 21 December 2023; accepted 3 January 2024. Date of publication 28 February 2024; date of current version 14 March 2024. The review of this paper was arranged by Associate Editor Arokia Nathan.
Digital Object Identifier 10.1109/OJID.2024.3370888

Display and Optics Architecture for Meta's AR/VR Development

LINGHUI RAO1, NAAMAH ARGAMAN2, JIM ZHUANG1, AJIT NINAN2, CHEONHONG KIM2, DAOZHI WANG2, AND SHIZHE SHEN2

(Invited Paper)

1 Meta, Redmond, WA, USA
2 Meta, Sunnyvale, CA, USA

ABSTRACT We believe in the future of connection in the metaverse. The addition of immersive technologies such as augmented reality (AR) and virtual reality (VR) is the next step in this progression. In this paper, we provide an overview of Meta's progress in display and optics development for AR and VR. The overview covers architectures that may lead to wide adoption of AR and VR devices in the near future, as well as solutions that will mature over time. We also discuss the maturity of the different components and how human perception guides the building of systems made for human vision.

INDEX TERMS Augmented reality, display, optics, virtual reality.

I. INTRODUCTION
The metaverse is the next evolution in social connection and the successor to the mobile internet, which helps people connect and get closer together [1]. Meta is moving toward the metaverse through extensive research and development of innovative concepts. The addition of immersive technologies such as augmented reality (AR) and virtual reality (VR) is the next step in this progression.

To support the most optimized visual experience for Meta AR/VR applications, we have designed the architecture of Meta's Display and Optics system and developed visual performance evaluation metrics. In this paper, we introduce the design concepts and challenges, and discuss the human perception studies used to maximize visual performance.

II. INFINITE DISPLAY SYSTEM FOR VR ARCHITECTURE
Meta's Infinite Display features high-resolution fast-switch display panels with an advanced lens system. Various display technologies, including LCDs [2], OLEDs (organic light emitting diodes) [3], and micro displays [4], such as µOLED (micro-OLED) and µLED (micro light emitting diode) displays, have been introduced in VR systems. OLEDs on glass or flexible substrates give wide color gamut and high contrast, but their pixel density is very limited. Micro displays provide the highest pixel density thanks to silicon wafer backplane technology, but they cost significantly more than the other display types. Meanwhile, the LCD can achieve fairly high pixel density at an affordable price. From Oculus Rift to Oculus Go, Quest, Rift S, Quest 2, and Quest Pro, Meta has advanced VR display and optical architecture design and manufacturing to enable an immersive user experience. In this section, we focus on the most recent Quest 2 and Quest Pro products.

For both Quest 2 and Quest Pro, LCD technology was selected considering its mass-production maturity, affordable price, and headroom for pixel density increases. The innovative pancake lens works by folding the light inside the optical system, and reduces the optical stack thickness by 40% compared with Quest 2. Fig. 1 shows the unique display and optics system designed to be harmonized with the product design and to bring a hyper-realistic visual experience to Quest Pro. Overall, we were able to increase the system resolution of Meta Quest Pro (22 pixels per degree) by 10% compared to Meta Quest 2 (20 pixels per degree). In addition, Quest Pro achieved a 25% full-field visual sharpness improvement in the center view, a 50% improvement in the peripheral region, and a larger color gamut than Quest 2. The Infinite Display system also allows consumers to adjust the lens distance from the eyes with a new eye relief dial to optimize fit, face tracking, and the viewing experience.
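As a rough intuition for the folded-path thickness saving mentioned above, here is a minimal sketch with illustrative numbers (the 45 mm path and the three passes are assumptions for illustration, not Quest Pro specifications):

```python
def folded_depth_mm(optical_path_mm: float, passes: int) -> float:
    """Physical depth of the display-to-lens cavity when the optical path
    is folded so that light traverses the cavity `passes` times (a pancake
    lens folds the path so light crosses the cavity roughly three times)."""
    if passes < 1:
        raise ValueError("need at least one pass")
    return optical_path_mm / passes

# Illustrative numbers only: a 45 mm unfolded path, traversed 3 times,
# fits in a 15 mm cavity. The 40% system-level figure quoted above is
# smaller because the lens elements and the display itself do not shrink.
depth = folded_depth_mm(45.0, 3)            # 15.0 mm cavity
saved = 1.0 - depth / 45.0                  # fraction of the cavity depth saved
```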

© 2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 1, 2024

FIGURE 1. Infinite display system architecture for Quest Pro.
FIGURE 2. Typical timing diagram of the Quest 2 LCDs with the split backlight.

TABLE 1 Specifications of Meta Quest Pro and Quest 2 Displays


A. VR DISPLAY DESIGN
To lead VR innovation for fitness, gaming, and work, we introduced high-resolution, high-efficiency, fast-switch panel designs for Meta VR products. Meta Quest 2 has a single fast-switch LC display panel design for both eyes [5]. Meta Quest Pro is equipped with two individual fast-switch liquid crystal display panels. The LCD panels are 1800 × 1920 pixels per eye. For Quest Pro, the display resolution density is 1058 PPI, which is 37% higher than Quest 2. In addition, local dimming backlight technology was introduced to Quest Pro to enhance the display color gamut and contrast and to improve power consumption [6].

In order to suppress visual artifacts, such as screen door effects, mura, motion blur, ghosts, and trailing effects, and to enhance power efficiency, fast-switch (FS) LCDs have been developed with the considerations below.

1) HIGH EFFICIENCY AND HIGH RESOLUTION FAST-SWITCHING PIXEL DESIGN
The Meta VR display team has driven the display technology to over 1000 PPI to reduce artifacts, such as the screen door effect (SDE), and to improve text readability. SDE describes a mesh-like appearance on a screen or projected image [7]. SDE can be observed along the gate and data lines, which is related to the display fill factor design. For the Infinite Display system, special display panel and pixel designs were implemented to improve the aperture ratio uniformity. With the increased resolution, panel process optimization was also implemented to ensure robust manufacturing operation.

A fast-response display is needed to improve the system latency, which impacts the user experience on VR devices [8]. The LC response time was engineered through LC material tuning, cell design/process control, and panel driving optimization.

We customized the DDIC (display driver integrated circuit) design to minimize the horizontal line time, and the optimum value was chosen considering panel loading and the MIPI (mobile industry processor interface) data rate.

Power consumption is important for battery life and system thermal management. A higher-PPI display intrinsically has more challenges for power efficiency. Meanwhile, compared to the Fresnel lens system, the display brightness demand of Quest Pro was ∼4–5 times higher due to the pancake lens system. We took all of the system requirements into consideration in architecting the backlight design, panel pixel design, and overall optical stack innovation to maximize the optical transmittance improvement.

2) BACKLIGHT DESIGN OPTIMIZATION
Low-persistence illumination: Motion blur is a significantly noticeable spatiotemporal artifact which occurs when an image is presented at a finite display frame rate and the eyes move across the image. It also reduces visual acuity due to the blurred image. This artifact becomes even more apparent, and users will notice it everywhere in VR applications, because the display images are world-locked instead of head-locked. In order to minimize the motion blur artifact, low-persistence illumination has been adopted by controlling the duty ratio in the backlight timing. Fig. 2 shows the typical timing diagram of the Quest 2 FS LCD with the split backlight. The left LED bars are illuminated after the LC completely settles down in the left half of the active area. Meanwhile, the right LED bars may extend into the next frame, but their illumination should finish before the right half active area scan starts. The illumination timing has been optimized for each frame rate to get the best visual performance, such as no ghosts and minimized stereoscopic disparity.

FIGURE 3. Comparison of Fresnel lens in Rift CV1 and Quest 2.
FIGURE 4. Cross section of Fresnel lens.
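The low-persistence trade-off above can be sketched with a toy model (all numbers are illustrative, not Quest specifications): the retinal smear of a world-locked image tracked by the eye scales with pixel density, eye velocity, and backlight on-time.

```python
def persistence_s(frame_rate_hz: float, duty_ratio: float) -> float:
    """Backlight on-time per frame for a given illumination duty ratio."""
    return duty_ratio / frame_rate_hz

def motion_blur_px(ppd: float, eye_speed_deg_s: float, persistence: float) -> float:
    """Approximate retinal smear, in display pixels, while the eye tracks
    a world-locked image sweeping across the panel during illumination."""
    return ppd * eye_speed_deg_s * persistence

# Illustrative numbers: 90 Hz refresh, 10% duty, 20 PPD, 100 deg/s eye motion.
t_on = persistence_s(90.0, 0.10)            # ~1.1 ms of light per frame
smear = motion_blur_px(20.0, 100.0, t_on)   # ~2.2 px of smear
```

Halving the duty ratio in this model halves the smear, which is why short, well-timed illumination pulses matter more than raw frame rate for perceived sharpness in motion.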
Local Dimming Innovation: To provide consumers with better contrast and a better visual experience, Quest Pro adopts local dimming technology. The backlight consists of over 500 independently controllable LED zones. The local dimming technology applied to Quest Pro is carefully designed to cope with various challenges. For example, with the VR optical stack, the zoning plan, and the field of view design, certain artifacts could be more noticeable compared to traditional consumer devices. In addition, for better battery life, the Quest Pro local dimming needs to be highly efficient in computing resources and power consumption. We have deliberately optimized the algorithm with advanced techniques in the graphics pipeline, such as image statistics analysis, backlight level adjustment, and post-spatial and temporal filtering. Thus, we have minimized the power consumption while maintaining the visual benefits and suppressing unwanted artifacts, such as halo.

FIGURE 5. AR/VR display image quality axes.

3) VR DISPLAY DESIGN CHALLENGES
There are other display module design challenges that need to be addressed for optimized optical system performance. For example, a special panel design was introduced to reduce stray light in the optical system, and the display optical profile distribution output was finely tuned to match the lens design. Beyond the optical visual design, it was also important to give architectural consideration to display module weight, mechanical dimensions, system latency, and power consumption optimization.

B. VR OPTICS DESIGN
The 2nd generation of Fresnel lens from Meta can be found in products such as Oculus Go, Rift S, Quest, and most recently Quest 2, the most popular standalone VR headset on the market. As Fig. 3 shows, the Quest 2 Fresnel lens is a bi-convex design which provides additional design degrees of freedom to further optimize the optical performance, such as sharpness and pupil swim, compared with plano-convex optics.

The design and manufacturing of Fresnel features, including the pitch, draft surfaces, and peak/valley tip radius, shown in Fig. 4, are critical for the god-ray/glare performance of the lens when users view content in a VR environment with a 6DOF headset. All these aspects are carefully co-optimized, and the associated manufacturing processes are further advanced for Meta's 2nd generation Fresnel lens.

The pancake lens in Quest Pro represents another leapfrog advancement compared with the Quest 2 Fresnel lens. The pancake lens is a polarized catadioptric optical system in which the refraction and reflection of the light are combined to fold the light from the display multiple times through the optical system before it reaches the user's eyes. As a result, a VR optical system that is compact, lightweight, high resolution, and wide-FOV is achieved. A polarized catadioptric optical system was first proposed by LaRussa [9], and more recently by Lacroix [10] and Huxford [11] for head-mounted display applications. Meta has collaborated with the industry to advance the technologies of optics, material science, and high-precision, high-volume manufacturing, which makes it possible to design and fabricate such complex optical systems at scale and cost effectively for applications in consumer electronics.

The Quest Pro pancake lens features a two-element lens design. Both elements have plano-convex surfaces, as shown in Fig. 1. This optical architecture offers additional design degrees of freedom over the single Fresnel lens and allows optimization for the visual performance that is important to users, such as the field of view and full-field sharpness.

The field of view of the Quest Pro display optical system is designed at 15 mm eye relief (defined as the distance from the eye pupil to the apex of the first lens element), to accommodate users wearing spectacle lenses. To achieve the same field of view, the longer the eye relief, the more challenging


it is from the design standpoint to minimize off-axis aberrations such as color, distortion, and astigmatism.

The polarization nature of the pancake lens demands precise control of the polarization state of the light propagating from the display through the optical stack, over a large range of field angles and the broad RGB spectrum of the display. The film stacks are uniquely designed and integrated with the optics to minimize some of the intrinsic artifacts of polarized catadioptric optical systems, such as ghosting, and to achieve better contrast and color performance.

Quest Pro pancake lenses are made of optical thermoplastic resins through an injection molding process. This process is important for cost and scale, and is equally important for minimizing the optical aberrations to deliver the desired comfortable visual experience to users. Some of the optical surfaces in pancake lenses are reflective (polarization-state dependent), and it is well known that the lens optical performance is more prone to imperfect quality of the reflective surfaces than to the refractive surfaces in non-pancake lens systems. In the high-volume manufacturing processes, Quest Pro pancake lens quality is tightly controlled at each step, including surface form error, its 1st and 2nd derivatives, and surface roughness.

Another unique requirement of the pancake lens is to achieve ultra-low birefringence (on the order of single-digit nanometers) in these thermoplastic lenses. Similar to the engineering of the polarization film stacks mentioned above, the low-birefringence lens is essential to the desired contrast and ghosting performance.

III. AR DISPLAY AND OPTICS DEVELOPMENT
A. INTRODUCTION TO AR OPTICAL SYSTEMS
It is not accidental that VR and AR systems are often referred to as one technology, developed together and used together. Both are designed as head mounted displays (HMDs) and as such target a small form factor, aiming to be lightweight and as comfortable as possible while maintaining as long a battery life as possible. But although the two share many similarities, they differ in the optical path, which leads to many differences in rendering, system design, and integration.

The use cases for VR and AR applications merge in pass-through VR, where world-facing cameras capture the real world around the user and overlay digital content on top of the pass-through video, delivered by a display positioned in front of the user's eyes. While the benefits of this approach are significant from the user's perspective, one can expect limited adoption of such devices for activities that require social interaction or appearance in public locations.

An AR system is designed to overlay digital content on top of the real world while keeping the see-through qualities of the system. This mission is quite challenging to achieve in a battery-driven system, especially while competing with the optical flux coming in from the world around us on a brightly lit day. This section will discuss the options in front of an AR designer who wishes to build a system that would be useful and valuable to consumers.

When considering the adoption of both VR and AR devices, the user experience is significant but not the only goal to consider. The comfort, both physical and visual, of the system is a significant part of the design goal, in addition to the size, weight, power consumption, cost (SWaP-C), and performance.

One driver for comfort is a significant weight reduction compared to systems available on the market today. Spectacles worn today for vision correction weigh between 15 and 65 g and are considered an acceptable weight. Adding visual, audio, and AI features to the same form factor leads to constant trade-offs in weight acceptability and a constant push for smaller, lighter, and more efficient components across the system.

An AR display system consists of two main components: the Display Engine, or Light Engine, which converts bits into photons projected at predefined angles and predefined locations toward an optical component that merges this light with light arriving at the eye from the world around the user. This second component could be a mirror or a waveguide (diffractive or refractive), but must allow light from the world to reach the user's eyes. Both components are discussed in detail below.

Due to the nature of the projection system, the image is projected at infinity, making it uncomfortable to interact with when objects in the real world are close to the user. Over time, eye strain from the focus point changes could cause visual fatigue or discomfort. For that reason, AR systems often use push-pull lens systems that bring the projection plane closer to the user, usually anywhere between 30 cm and 2 m. These lenses could be diamond-turned, molded, or even 3D printed. Meta's recent acquisition of Luxexcel, an optical 3D printing company, allows us to utilize this no-waste, fast, durable, and flexible platform for making such lenses and other optical components at an optical grade level [11]. 3D printing optical components allows us to utilize complex geometries, increase our integration flexibility, and reduce the size and weight of components made with other methods.

B. AR DISPLAY ENGINES
Three main approaches are considered for the Display Engine for Immersive Displays that are bright enough to support both indoor and outdoor usage: Laser Beam Scanning (LBS), Liquid Crystal on Silicon (LCoS), and micro-LEDs (uLED). With the recent improvement in organic LED (OLED) performance, it is used in AR devices, but mostly for low field of view systems where the pupil efficiency is higher, and it is therefore kept out of this discussion of Immersive Display options [13]. DLP technology, although mature and very bright, is power-hungry and poses challenges in size and wearability. Future development in the field might prove valuable to the world of AR, though this paper will not review it in detail.

Laser Beam Scanning (LBS) is based on three separate laser panels projecting globally onto a set of 1D or 2D MEMS-based mirrors that scan across the scene pixel by pixel to

generate the image. LBS offers the potential of a very small light engine at high brightness and contrast. It enables high resolution and reduced latency, key factors for AR immersion. However, challenges include power consumption constraints, complexity in packaging, and production cost considerations, which can hinder widespread adoption. Additionally, safety concerns related to laser radiation and eye strain need careful mitigation. Historically, both the Intel Vaunt, which was never launched, and the North Focals, launched in 2018, chose LBS as their Display Engine, but the FoV was small at 15° [14] and was considered a limiting factor. Microsoft chose LBS for its more immersive HoloLens 2, launched in 2019, which has a FoV of 43° by 29°, demonstrating that with two sources per color [14] and 2D MEMS scanning, a wide FoV is possible with LBS. To date, no other major player in the AR space has launched an immersive, mass-produced commercial AR system using LBS.

uLED light engines, which are just emerging in commercial small-FoV devices, give the optical display designer a glimpse into their future potential. Unlike LBS, uLED panels generate the whole image spatially, eliminating the requirement for a high scan rate. The image is collected through a collimating optical component to form an image on the optical combiner. The resolution is set by the number of pixels in the panel and the system field of view, and the limiting factor is the size of the pixels themselves. The technology provides exceptional contrast and color accuracy, significantly enhancing the visual fidelity of AR content. Its miniature size allows for compact device designs, increasing portability and wearability. Yet, to this day, the processing of very small pixel sizes at the brightness required for an AR Immersive Display remains a challenge. Furthermore, the field of AR optics is characterized by two distinct color generation methodologies. The first approach involves segmenting red photon generation from blue and green, necessitating the use of 2 or 3 panels. However, the challenge lies in the intricate merging of the light emitted by these panels, resulting in a display system that is notably large, limiting wearability. Conversely, the second approach, RGB pixelation, while promising, remains in its developmental stages and is not yet ready for commercial application. This method encompasses various technologies, such as localized epitaxial growth and the conversion of blue or UV light to green and red via quantum dots, requiring further refinement before practical implementation.

LCoS technology is the most mature of the three display options. An LCoS light engine utilizes liquid crystal panels placed on top of silicon backplanes, reflecting flood or global illumination coming in from LEDs. The liquid crystals modulate the light's intensity based on the displayed image on a pixel-by-pixel basis, and the reflected light is then directed into the user's eye through optics. Due to their reflective nature, LCoS engines sometimes suffer from low contrast ratios and complex integration and alignment between the different components. In addition, the liquid crystals modulate just one polarization, thus reducing the overall efficiency of the system. If solved for, LCoS light engines provide a good balance between manufacturability and cost, size, performance, and power consumption. Among the immersive systems already shipped in volume are the HoloLens 1, the Magic Leap 1 and its successor Magic Leap 2, and numerous smaller-FoV systems.

C. AR COMBINERS
Historically, AR displays have shown clear trade-offs between wearability, size, power consumption, see-through qualities (both inside-out and outside-in), and the visual experience of the user. Immersive Displays, even ones designed for consumers or light-industrial applications like Microsoft's HoloLens [15] or Magic Leap's series [16], chose a very different design point compared with Google Glass [17] or the North Focals, aimed at a low field of view audience.

Our goal at Meta is to explore that gap and bring to market the best design point products that bring our users and the world closer together. In the following section, we discuss the options for AR combiners, from birdbath, single, and multiple reflector combiners to pupil-expansion geometric and surface-relief combiners.

Like many other components in an AR system, birdbath combiners for Immersive Displays also suffer from a trade-off between form factor and efficiency. The smaller, lighter, and more "normal looking" the glasses are, the higher the distortion and the lower the efficiency. Although such combiners can be molded, making their cost at volume quite attractive, the bulkiness and weight of a system designed around them make their adoption quite limited.

On the other side of the combiner option space, one can find Total-Internal-Reflection (TIR) combiners. Total internal reflection occurs when light travels from a medium with a higher refractive index to a medium with a lower refractive index, and the angle of incidence exceeds a critical angle. When this critical angle is exceeded, no refraction occurs, and all of the light is reflected back into the higher refractive index medium. This phenomenon is exploited in AR combiners when the virtual images from the Display Engine are directed into the high-refractive-index medium at specific angles exceeding the critical angle for TIR. As a result, the virtual images are internally reflected at the boundary and become visible to the user at the eyebox. When the number of bounces increases, the combiner is referred to as a waveguide.

The first in this review is a reflective waveguide using 100% reflective pin-mirrors. In recent years, pin-mirror projection systems were able to show a trend towards a wide field of view at relatively high efficiency, low manufacturing cost, and compelling size and weight [18]. In such systems, tiny mirrors are embedded into the eyepiece to overlay digital content on top of the real-world view. The dots are small enough and close enough to the user's eyes so that no occlusion occurs. Yet, the expansion into a wide field of view is still challenging, and the social acceptability of the outside-in view is still unknown.

Other geometric reflective waveguides use partially reflective mirrors to control and direct the light from the Display Engine [19]. The latest designs using this approach show very


good color and luminance uniformity at very high efficiency and see-through transparency. The field of view has also increased systematically over the past several years, though the trade-offs of FoV against thickness (and hence weight) and efficiency still exist. Glares, ghosts, and other visual comfort artifacts remain a challenge with geometric reflective waveguides.

Surface Relief Grating (SRG) and Volume Bragg Grating (VBG) waveguides use a different methodology to control and manipulate the light. In VBG, diffractive gratings are inscribed within a waveguide substrate using a photosensitive holographic material with a refractive index slightly different from the medium, and are designed to diffract specific wavelengths of light while transmitting others. In contrast, SRG waveguides employ nanostructures imprinted or etched onto the surface of a transparent substrate [20]. Leveraging TIR, and by controlling the spacing and orientation of these gratings, SRG waveguides can precisely steer and overlay a digital image from the Display Engine to the user's eyes. Both VBG and SRG waveguides offer very thin and small form factors compared with any other combiner technology. Since the refractive index and surface materials of SRG are higher than those of VBG, SRG also offers higher field of view opportunities for the design of Immersive Displays. Yet, due to the diffractive nature of these technologies, they show color non-uniformity that reduces the visual comfort of the user.

D. FUTURE-LOOKING AR SYSTEMS
When looking forward to the adoption of Immersive Displays, the road ahead of us is still long to enable price-sensitive, low weight and size, highly reliable, and low-power Immersive Displays that would be appealing to the mass market. In addition, some of the issues identified above remain a challenge, like the single plane of projection, mitigating the vergence-accommodation conflict, and the requirement for a comfortable visual experience from the perspective of both the user and a bystander.

IV. VISUAL PERFORMANCE
Display technology for AR/VR, from a visual performance point of view, needs to be thought of differently than traditional display systems. The North Star for AR/VR is to impedance match the display system to the human visual system. The device needs to fit in a form factor that takes into account human ergonomics, which implies that if we are wasteful with our system parameters, such as power and light, we have probably not achieved the ideal form factor and ideal ergonomics. On the other hand, if we underdeliver on image quality, we have compromised our imaging experience. This equilibrium is what we target for the delivery of an optimal visual experience. The goal is to achieve perceptual quality for immersive experiences. We strive to deliver experiences for humans, not metrology.

The additional challenge AR/VR has beyond traditional displays is the fact that we have two display systems which deliver an image to the left and right eye that must behave seamlessly as one spatial display delivering an image to the brain. The traditional vectors of visual performance apply to these displays, but the complexity of the binocular aspects working together compounds the issue of visual performance further. This also offers unique opportunities for optimization. To help guide us, we use the 9-axis framework described below.

At a high level, we can think of the traditional 4 axes of display image quality that drive AR/VR as: (1) resolution, (2) dynamic range, (3) color gamut, and (4) frame rate. The good thing about these four characteristics is that the industry has had over fifty years to define and standardize them. With small modifications, we are able to extend the existing tools to quantify them and make them applicable to AR/VR. The binocular head-mountable image specifics are, however, new, and lack any standardization or quantifiable minimum viable experience. The remaining AR/VR-specific axes are: (5) field of view, (6) stereo presentation, (7) latency, (8) degrees of freedom, and (9) accommodation-vergence. These nine define not only image quality, but also visual comfort. At Meta, we have focused perception and image quality teams who are dedicated to this aspect of AR/VR display system design.

A. RESOLUTION
Resolution for normal displays is quantified by horizontal and vertical resolution. This does not adequately describe an AR/VR system, since it does not tell us anything about what the brain will see. A better measure is pixels per degree (PPD), since this gives us a better idea of how the display appears to the human visual system. The human visual system is capable of resolving 120 PPD or higher for some specific content, while 60 PPD is good enough for most content. The Meta Quest Pro has 22 PPD at the center of the FOV, which provides a 10% improvement over the Meta Quest 2 at 20 PPD [21].

PPD may not be enough of a measure for resolution, since the theoretical number is simply the display total resolution divided by the field of view. This theoretical number will be degraded by the lens system: the MTF of the lens will drop the resolution further. This, along with rendering algorithms such as antialiasing, lowers it to what we refer to as an effective PPD. This MTF sharpness has a spatial component; it is inevitable for AR/VR devices to have different degrees of sharpness from the center to the edge area, due to the optical aberrations of lenses or imperfect optical quality from the manufacturing processes. From this standpoint, the goal of a well-designed and well-manufactured lens is to achieve a high MTF at both the center and the periphery, and to provide edge-to-edge full-field visual clarity to users. The Meta Quest Pro delivers a center sharpness of 0.98 and an edge sharpness of 0.85 (measured at 5 lp/mm), which is a 25% and 50% improvement over Meta Quest 2, respectively. To make matters worse, the persistence time has a dynamic effect on PPD, which changes the perceptual PPD. All of these must be evaluated in order to quantify the real resolution quality when setting a target.
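The theoretical versus effective PPD relationship described above can be sketched as follows. A crude scalar model is assumed here, with the per-axis pixel count and field of view chosen purely for illustration (they are not published Quest specifications); the 0.98 and 0.85 MTF values echo the sharpness figures quoted in the text.

```python
def theoretical_ppd(pixels: int, fov_deg: float) -> float:
    """Display pixels along one axis divided by the field of view."""
    return pixels / fov_deg

def effective_ppd(pixels: int, fov_deg: float, mtf: float) -> float:
    """Degrade the theoretical PPD by the lens MTF at the spatial
    frequency of interest (a crude scalar model of sharpness loss)."""
    return theoretical_ppd(pixels, fov_deg) * mtf

# Assumed numbers: 1920 px vertical over a 90 deg field of view.
center = effective_ppd(1920, 90.0, 0.98)   # ~20.9 effective PPD at center
edge = effective_ppd(1920, 90.0, 0.85)     # ~18.1 effective PPD at the edge
```

In practice the MTF varies continuously across the field and with spatial frequency, so a full characterization replaces the single scalar with a measured MTF surface.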

B. DYNAMIC RANGE
Most of us are familiar with contrast ratio. However, there is also another aspect of control over the dynamic range to render smooth gradients. The peak brightness together with the simultaneous contrast and the sequential contrast defines the display system contrast ratio. In AR systems the additive contrast ratio is usually more important to characterize [22]: the light coming from the world mixes with the light from the display, and both contribute to the visual performance of the changing dynamic range. The content must map to playback effectively in a way that makes content visible while preserving the intent of the image. On the other hand, the VR Infinite Display features local dimming with 500 zones. This enables 75% higher contrast compared to Quest 2, allows for finer control over the rendering, and results in deeper blacks and an enhanced viewing experience. Additionally, the optical system has polarization film stacks on the display to minimize ghost images, especially from the mid-periphery to the edge of the FOV. This contributes to its overall improved contrast.

C. COLOR GAMUT
Color plays a key role in contributing to the consumers’ visual experience. The quality of a VR headset’s color is measured by color gamut and color accuracy. Color gamut is the range of colors within a spectrum that can be reproduced by the overall system, or a metameric equivalent. Color accuracy refers to the display system’s ability to reproduce colors and shades as intended. Meta Quest Pro has a wide DCI-P3 color gamut, which enables vivid color. The calibration process to deliver accurate color is a critical aspect of image quality. In AR, this is exceptionally challenging due to the additive nature of the displays. Color becomes critical to match content to the real world for an immersive experience. The resulting gamut combined with the dynamic range defines how the content needs to be played back.

D. FRAME RATE
Frame rate plays into the brain’s ability to fuse images into motion and can play an important role in comfort. Displaying of the image due to head motion and eye motion must be rendered seamlessly for smooth pursuit and for the vestibulo-ocular reflex of the eye. These are dependent on the display’s ability to render new frames quickly. Currently, the lower bound for comfort is at 90 fps for world-locked content. The ability to lock content accurately to the world for mixed reality is limited by this capability. Combined with the requirement for higher frame rate, there is a need for low-persistence displays. Low persistence allows for sharper rendering and reduced retinal blur, which contributes to key image quality characteristics [23].

E. FIELD OF VIEW
Field of view determines how immersive the experience will be. Different applications will require different fields of view. Field of view cuts both ways: while delivering a more immersive experience, it also requires other axes to be tighter, since visual comfort can be compromised if characteristics such as latency and frame rates are not maintained correspondingly.

F. STEREO PRESENTATION
The key to delivering a comfortable experience is delivering a correct stereo image that is right for the user. This requires the display to take into account how the eyes are spaced and where your eyes are. Unlike 3D TVs that just deliver horizontal disparity of some kind, AR/VR will require this to be done right. This is especially true since extended use is expected and the distances to the virtual objects need to mix with the real world. The system will need to do correct rendering by leveraging all the sensors. Lens designs will need to take into account the pupil swim that can create errors. The brightness [24] and color matching between the displays all contribute to correct depth rendering. To add to this, vertical disparity of the images will result in diplopia if it exceeds the limit of binocular fusion. More exact rendering that includes interocular parallax would add to the naturalness of the experience.

G. LATENCY
The latency of the display not only includes how long it takes from receiving the first information of the image but also includes the round-trip time from the motion of the head to when the photon is registered in the brain. The larger the perceptual latency, the more uncomfortable the experience is. This is true for world-locked content and user motion but is not as critical for content motion.

H. DEGREES OF FREEDOM (DOF)
The ability for the display to render 3 degrees of freedom or 6 degrees of freedom rests on the needs of the application and its ability to locate the displays in space. 6 DoF requires other axes such as frame rate, latency, and stereo rendering to be tighter to accommodate this level of freedom. While this is not a display characteristic, the capability to do this is dependent on the display’s capabilities.

I. ACCOMMODATION VERGENCE
The need to interact with content for extended periods of time where content is in close proximity requires us to be able to verge with a correct stereo image but also with corresponding focus/accommodation. A disconnect in matching accommodation and vergence has its effect on different parts of the population in different ways. But when it works it is natural to everyone, and for most it becomes a critical feature for comfort. If the display system is capable of rendering holograms or lightfields, there are usually trade-offs with the other aspects mentioned above.

For each of these axes, quantified perceptual models are built and JND (just noticeable difference) or JOD (just objectionable difference) thresholds established. These are then partitioned into what would be an acceptable minimum viable experience and what would be diminishing returns for that
axis. This can be further broken down into good, better, best recommendations for the product. Since this is done per application, the final product combines these requirements together to decide target specifications.

V. CONCLUSION
In this paper, we have reviewed the key design architecture and user experience metrics to maximize Meta’s AR/VR user experience development. We are looking forward to bringing the next leap in display and optics technologies into reality for our next generation products.

REFERENCES
[1] Meta. [Online]. Available: https://about.meta.com/
[2] T. Matsushima et al., “Optimal fast-response LCD for high-definition virtual reality head mounted display,” SID Symp. Dig., vol. 49, pp. 667–670, 2018.
[3] J. Cho, Y. Kim, S. Jung, H. Shin, and T. Kim, “Screen door effect mitigation and its quantitative evaluation in VR display,” SID Symp. Dig., vol. 48, pp. 1154–1156, 2017.
[4] G. Haas, “Microdisplays for augmented and virtual reality,” SID Symp. Dig., pp. 506–509, 2018.
[5] C. Kim, A. Klement, E. Park, J. Han, L. Rao, and J. Zhuang, “High-ppi fast-switch display development for oculus quest 2 VR headsets,” SID Symp. Dig. Tech. Papers, vol. 53, pp. 40–43, 2022, doi: 10.1002/sdtp.15410.
[6] L. Rao et al., “Infinite display for meta quest pro,” SID Symp. Dig. Tech. Papers, vol. 54, pp. 32–35, Aug. 2023, doi: 10.1002/sdtp.16480.
[7] J. Nguyen, C. Smith, Z. Magoz, and J. Sears, “Screen door effect reduction using mechanical shifting for virtual reality displays,” Proc. SPIE, vol. 11310, pp. 200–210, 2020, doi: 10.1117/12.2544479.
[8] S. T. Murdison, C. McIntosh, J. Hillis, and K. J. MacKenzie, “Psychophysical evaluation of persistence- and frequency-limited displays for virtual and augmented reality,” SID Symp. Dig. Tech. Papers, vol. 50, pp. 1–4, 2019, doi: 10.1002/sdtp.12840.
[9] J. A. LaRussa and A. T. Gill, “The holographic pancake window TM,” Proc. SPIE, vol. 162, pp. 120–129, 1978, doi: 10.1117/12.956898.
[10] M. Lacroix, “Collimation device of small size,” French Patent 2 690 534, 1993.
[11] R. B. Huxford, “Wide FOV head mounted display using hybrid optics,” Proc. SPIE, vol. 5249, pp. 230–237, 2004.
[12] G. Groet, “Luxexcel: 3D printing: Combining prescription power and a waveguide into a lightweight lens,” Proc. SPIE, vol. 11764, 2021, Art. no. 117640F.
[13] E. Hsiang, Z. Yang, Q. Yang, P. Lai, C. Lin, and S. T. Wu, “AR/VR light engines: Perspectives and challenges,” Adv. Opt. Photon., vol. 14, pp. 783–861, 2022.
[14] KGOnTech. [Online]. Available: https://kguttag.com/
[15] B. C. Kress and W. J. Cummings, “Optical architecture of HoloLens mixed reality headset,” Proc. SPIE, vol. 10335, pp. 124–133, 2017.
[16] K. R. Curtis, “Unveiling magic leap 2’s advanced AR platform and revolutionary optics,” Proc. SPIE, vol. 11932, 2022, Art. no. 119320P.
[17] B. Kress and T. Starner, “A review of head-mounted displays (HMD) technologies and applications for consumer electronics,” Proc. SPIE, vol. 8720, pp. 62–74, 2013.
[18] LetinAR. [Online]. Available: https://letinar.com/en/pintilt
[19] A. Frommer, “11-3: Invited paper: Lumus optical technology for AR,” SID Symp. Dig. Tech. Papers, vol. 48, pp. 134–135, 2017.
[20] B. Kress, “Optical architectures for augmented-, virtual-, and mixed-reality headsets,” 2020.
[21] Meta. [Online]. Available: https://www.meta.com/blog/quest/vr-display-optics-pancake-lenses-ppd/
[22] D. R. Blanc-Goldhammer and K. J. MacKenzie, “The effects of natural scene statistics on text readability in additive displays,” Proc. Hum. Factors Ergonom. Soc. Annu. Meeting, vol. 62, pp. 1281–1285, 2018.
[23] A. Goettker, K. J. MacKenzie, and T. S. Murdison, “Differences between oculomotor and perceptual artifacts for temporally limited head mounted displays,” J. Soc. Inf. Display, vol. 28, pp. 509–519, Jun. 2020.
[24] T. Doi, L. Wilcox, and T. S. Murdison, “Stereopsis from interocular temporal delay: Disentangling the effects of target versus background luminance,” in Proc. Vis. Sci. Soc. Annu. Meeting, 2023, Art. no. 5159.