Virtual and augmented realities: Asking the right questions and traveling the path ahead
The past year has been big for virtual and augmented reality: new VR hardware for consumers came to market, from smartphone-powered options to desktop-powered systems. Developers started exploring AR in earnest as the first devices to build on became available. And, our team at Google was focused on getting some major products out the door. Six months ago, we launched Daydream, our platform for high quality mobile virtual reality. Soon after, the first Tango-enabled phone landed on store shelves, putting smartphone-based augmented reality capabilities into the hands of consumers for the first time.
In a short time, there have been many advances across the industry, and we’ve made a lot of progress ourselves. But, there are still plenty of questions left unanswered. We’re already at a point where millions of people are beginning to enjoy some of what these new developments can offer, but it’s early days.
Now feels like the right time to take a step back, look at the landscape of VR and AR, and share a bit on how we think about the space: where we are, where we’re going, why it matters to Google, and why it matters to the world.
What’s in a name?
What do the terms “VR” and “AR” really mean? I often say: VR can put you anywhere, and AR can bring anything to you. VR can transport you somewhere else. AR leaves you where you are and brings objects and information to you, in context, making them seem like they’re there with you. They both give you superpowers. This can happen by wearing a headset, or in a more limited way by looking through the viewfinder of your smartphone.
Many people ask me which technology will “win”. The problem is that the question suggests two competing, mutually exclusive technologies. But it’s a false distinction. VR and AR are more like two points on a spectrum: a spectrum of how much computer-generated imagery gets woven into natural environments. VR completely replaces the real world with computer-generated imagery, for instance to transport you inside a virtual representation of the Louvre. By contrast, AR adds pieces of computer-generated imagery to your environment. If you were at the Louvre in real life, AR could overlay digital footsteps on the ground in front of you, leading the way to the Mona Lisa.
But in time, these points on the spectrum will blur: we’ll have AR headsets that can augment your whole field of view, VR headsets that can pull in photo-realistic digital representations of your environment, and devices in between that do a bit of both. Once the technology progresses to this point, the distinction between VR and AR will be far less relevant than it is today.
In the meantime, if VR and AR are two points on a spectrum, what should we call the spectrum? Here are a few ideas — immersive computing, computing with presence, physical computing, perceptual computing, mixed reality, or immersive reality. This technology is nascent, and there’s a long way to go on our definitions, but for now, let’s call this immersive computing.
The arc of computing interfaces
Why does it matter that immersive computing can make things seem real? And why invest in technology that makes this possible?
To look forward, let’s first look back at the history of computers and how we’ve interacted with them. Over the past several decades, every time people made computers work more like we do—every time we removed a layer of abstraction between us and them—computers became more broadly accessible, useful, and valuable to us. We, in turn, became more capable and productive.
In the beginning, a person could “talk” to the computer only by literally rewiring it. Punch cards were an improvement; computers became more easily programmable. Later came the command line, and typing replaced punching cards.
Computer punch cards at the U.S. Bureau of the Census, ca. 1940
The real breakthrough was the graphical user interface (GUI). Computers became visual for the first time, and suddenly far more people could relate to and use them. People started using them for everything from creating school reports to designing jet engines.
Smartphones pushed the accessibility and power of computing even further, right into the palm of your hand. Touchscreens enabled us to interact with our computers directly, with our fingers, and smartphone cameras let us capture the world just as we see it. And most recently, conversational interfaces like the Google Assistant made interacting with your devices even more natural and seamless by enabling you to talk like you would with another person.
But as far as we’ve come, abstractions remain. When you video call a friend, you see them not as you would in real life, but as a small, flat version of themselves on your screen. When you’re trying to figure out where a restaurant is, you’re served up an incredibly detailed map, but then left to figure out how your blue dot relates to the map around you. Your phone can’t just walk you there.
With immersive computing, instead of staring at screens or constantly checking our phones, we’ll hold our heads up to the real and virtual worlds around us. We’ll be able to move things directly using our hands, or simply look at them to take action. Immersive computing will remove more of the abstractions between us and our computers. You’ll have access to information in context, with computing woven seamlessly into your environment. It’s the inevitable next step in the arc of computing interfaces.
So, why Google?
Google’s mission is to organize the world’s information and make it universally accessible and useful. We started with webpages (text and images), and then books, maps, and video. As the information available on the web got richer, so too did the tools to search and access it.
Google.com homepage in 1998
Immersive computing will take this further. If you want to learn more about Machu Picchu, instead of reading or watching a video about it, you’ll visit it virtually. We’ll have VR cameras that can capture moments of our lives, and enable us to step back inside of them years later. Instead of using a 2D street-level map to find a restaurant, your AR device will know precisely where it is in space, and walk you to within inches of the table you reserved. Surgeons will overlay 3D medical scans right on top of their patients to better understand what’s going on. With immersive computing around us and woven into our environment, information will be richer, more relevant, and more helpful to us.
It’s not just about the information itself, though. It’s about how people get access to it. That’s why over the years we’ve invested in building broadly available computing platforms. With Chrome, we wanted to make the web faster, more secure, and more powerful. With Android, our goal was to make mobile computing available to vastly more people. With Cardboard and Daydream, we want to make immersive computing accessible through devices that power diverse, useful, and interesting experiences.
Together with artificial intelligence and machine learning, we see VR and AR as part of the next phase of Google’s mission to organize the world’s information. That’s the larger context for how we think about immersive computing, where it sits in the larger story of computing, and why Google is investing in it.
Where are we now?
We’re often asked when VR and AR will “be ready” and what the “killer app” will be. This question suggests there will be a singular moment when VR and AR suddenly “work”, are immensely useful, and everyone wants to use them.
First, it’s important to consider where we are in the evolution of immersive computing, and to make sure we’re making the right comparisons. One instinct is to look at where we are with mobile phones. The iPhone launched a decade ago, and now smartphones are ubiquitous. So some assume immersive computing will follow the same curve over the same timescale. That’s not quite right, though, and to see why, it’s useful to look at the history of mobile phones.
The first commercially available mobile phone, the DynaTAC 8000X, was released in 1984. That’s 33 years ago, not ten. Since we’re on the very first generation of consumer-available immersive computing devices today, it’s more appropriate to compare them to the mobile phones from the 1980s than to anything that has been released in the past decade. That isn’t to say that I believe VR and AR will take 30 years to fully mature and achieve a similar level of adoption and impact. I’m far more optimistic than that. But smartphones from 10 years ago are the wrong comparison point.
Businessman using a Motorola DynaTAC 8000X portable cellular phone, ca. 1984, Image: Motorola, Inc. Legacy Archives Collection
There’s another relevant lesson from the evolution of mobile phones, too. Early in the life of a technology, it can be tempting to see it as niche, only serving a small subset of people. GPS was seen as something that would be narrowly useful: definitely to emergency responders, and maybe to people hiking in the remote wilderness, but to whom else? The first camera phones were extremely poor quality by today’s standards. And yet today, GPS and smartphone cameras are ubiquitous.
Regardless of the specific timescales involved, what happened with the mobile phone will happen with VR and AR. The capabilities will improve. Devices will become less expensive and easier to use. There will be breakthroughs in user interfaces and applications. As value goes up and costs come down, immersive computing will make sense for more and more people. And this isn’t a question of if; it’s a question of when.
What will it take?
Our team’s mantra is “immersive computing for everyone”. We began that mission with Cardboard, and Daydream was the next step, bringing high quality mobile VR to phones across the Android ecosystem. Tango brings a form of AR to smartphones, and can be used for anything from gaming, to entertainment, to education with Expeditions AR. But what’s next? And what needs to happen to get there?
The short answer: It’s not any one thing. Everything needs to get better.
Here’s the long answer.
First, friction. We’ve got to remove more of the friction in using these devices. Headsets have to get easier to use, more comfortable, and more portable. Daydream standalone VR headsets, which we just announced at Google I/O, are a step in the right direction. They have everything you need for VR built in. Getting into VR is as easy as picking them up. But we’re hardly done streamlining the user experience.
Next, there are the underlying technologies. To make VR more transporting, and AR more convincing and useful, everything behind these experiences must improve: displays, optics, tracking, input, GPUs, sensors, and more. As one benchmark, to achieve “retina” resolution in VR — that is, to give a person 20/20 vision across their full field of view — we’ll need roughly 30 times more pixels than we have in today’s displays. To make more refined forms of AR possible, smartphones will need more advanced sensing capabilities. Our devices will need to understand motion, space, and very precise location. We’ll need precision not in meters, but in centimeters or even millimeters.
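To get a feel for where a figure like “roughly 30 times” could come from, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration, not a figure from our own calculations: about 60 pixels per degree as a common approximation of 20/20 acuity, a full field of view of about 200 by 135 degrees, and a 2560 x 1440 phone panel standing in for “today’s displays”. It also treats the view as a flat angular grid, which real optics are not.

```python
# Back-of-the-envelope estimate of the pixels needed for "retina" VR.
# All numbers below are illustrative assumptions, not figures from the article.


def pixels_needed(h_fov_deg: float, v_fov_deg: float, pixels_per_degree: float) -> int:
    """Total pixels for a flat angular grid covering the given field of view."""
    width = round(h_fov_deg * pixels_per_degree)
    height = round(v_fov_deg * pixels_per_degree)
    return width * height


# Assumption: 20/20 acuity resolves about 1 arcminute, i.e. ~60 pixels per degree.
PIXELS_PER_DEGREE = 60
# Assumption: a full human field of view of roughly 200 x 135 degrees.
H_FOV_DEG, V_FOV_DEG = 200, 135
# Assumption: a 2560 x 1440 phone panel standing in for "today's displays".
CURRENT_PIXELS = 2560 * 1440

required = pixels_needed(H_FOV_DEG, V_FOV_DEG, PIXELS_PER_DEGREE)
ratio = required / CURRENT_PIXELS

print(f"required pixels: {required:,}")
print(f"current pixels:  {CURRENT_PIXELS:,}")
print(f"ratio:           {ratio:.1f}x")
```

Under these assumptions the ratio comes out to about 26x, the same order of magnitude as the benchmark above. Different choices for field of view or baseline display would move the number, which is why it is best read as a rough guide rather than a precise target.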
These may sound like huge technological leaps. And they are. But we’re already making progress. WorldSense, which enables positional tracking for VR, and VPS — a “visual positioning service,” like GPS, but for precise indoor location — will be two important enablers here. And we should remember that as advanced as they may seem, today’s VR and AR devices are largely made from repurposed smartphone components. It’s like we’re building airplanes from bicycle and car parts. You can do it — it’s how the Wright brothers started — but it’s hardly where things converge. Over the next few years, we’ll start to see more components built from the ground up for VR and AR, and these will enable far more capable devices.
Another part here is enabling a broader, deeper set of experiences. As the underlying technology improves, new applications and experiences will become possible. To illustrate what I mean, take an app like Tilt Brush. It simply couldn’t have existed before we had controllers that are positionally tracked in space. Painting with a gamepad just didn’t make sense. The hardware and software had to evolve together. We’ll see more of this co-evolution in the future. Accurate hand tracking will make a new set of interfaces possible, and eye tracking will enable new types of social applications that will rely on eye contact and realistic facial expressions.
And, there just needs to be more to experience, do, and use. We need experiences and apps for every person, use case, and interest. To help, we’re supporting creators in YouTube Spaces and with Jump cameras to bring more ideas and perspectives to the medium. We’ve worked closely with dozens of artists to help them explore VR with our Tilt Brush Artists in Residence program. But of course, we’ll need help from developers, creators, storytellers, and cinematographers everywhere to explore and build.
The cave of possibilities
There’s a metaphor I use to think about where we are today with immersive computing: we’re exploring the cave of possibilities. It’s huge, with many branches and potential paths ahead — and it’s mostly dark. While there are glimmers in places, it’s hard to see very far ahead. But by doing research, building prototypes, creating products, and most importantly seeing how people use and benefit from these technologies, we light up large parts of the cave, and see more clearly where this all goes. We make progress.
From our present vantage point, it can be hard to clearly see how this all unfolds. What is clear, though, is that it will unfold. I’m very optimistic, and believe that immersive computing will, in time, transform our lives for the better. We’re already seeing glimpses of that today — helping kids explore more of the world from their classrooms, letting journalists bring their audience to the front lines of world events, and enabling artists to create previously unimaginable works of art. One day, we’ll wonder how we ever got along without computing that works like we do — computing that’s environmentally aware, that displays information to us in context, and that looks, feels, and behaves like the real world.
This won’t all happen overnight. But it will happen. And it will change how we work, play, live, and learn.
Play Store on Daydream