Machine Vision and Visual AI in the Spotlight with Apple, Niantic, and Others Eyeing the AI and XR Overlap


By Eric Abbruzzese | 4Q 2024 | IN-7594

Researchers within Apple’s Artificial Intelligence (AI) research lab recently revealed Depth Pro, a new machine vision model for better depth sensing with a single camera. Niantic, a major name in Extended Reality (XR) and location-based content, recently integrated its Visual Positioning System (VPS) into its content creation platform. Both announcements highlight the growing presence and importance of visual data, both for XR devices directly and more broadly.


Iterative, but Exciting Vision Advancements

NEWS


Interest in the combination of machine vision and Extended Reality (XR) has always been high, but the past year of Artificial Intelligence (AI) hype continues to push this further. While only a few companies are capable of scaling XR efforts today, the market is growing, and the overlap between XR and more broadly applicable technology like AI is increasingly interesting. In preparation for that larger and more competitive market, companies are currently focused on improving relevant machine vision products and the capability to support them. Two recent announcements in the machine vision and XR space stand out:

  • Apple’s AI research arm published information on Depth Pro, a new AI model that promises to significantly improve depth sensing capability with a single camera.
  • Niantic integrated its Visual Positioning System (VPS) into the Studio creation platform to allow for location-aware creation of Three-Dimensional (3D) content and experiences within one development tool.

While these announcements are not directly related, what ties them together is the growing importance of visual and spatial data.

Visual Data and Machine Vision in the Spotlight

IMPACT


The XR market is a tale of two segments: hardware and software. The XR hardware market is in flux, with a mix of forward-looking flagship products like Meta’s Orion, established present-day headsets like Meta Quest and Pico 4, and now, increasingly, AI-first glasses like Ray-Ban Meta smart glasses. Meta being mentioned in all three of these categories is no coincidence. The company has been focused on both XR and AI for years and, to its credit, has shown a commitment to the XR space and successfully launched a number of devices already. Competition will come, especially from other big names in tech like Google and Samsung.

Software and, relatedly, development platforms are a proving ground for new capabilities not necessarily tied to specific hardware. Depth Pro should prove valuable across machine vision-focused applications regardless of end device, including autonomous vehicles, XR devices, smartphones and tablets, and more. Niantic’s VPS and Studio stand as a go-to-market offering that will be increasingly valuable over time.

Monocular 3D capability will be a boon for smart glasses, especially lower-cost devices with a single display or no display at all. Equipping these devices with an expensive sensor suite is unnecessary and cost-prohibitive, but there are still applications usable on these devices that would benefit from some accurate 3D spatial information. A combination of efficiency and accuracy is required for any machine vision solution. Traditionally, accuracy came from an array of sensors, including stereo cameras and depth sensors, while steadily improving algorithms for interpreting that sensor data delivered compute efficiency. This left a performance gap between these higher-cost devices and the rest of the market. Improvements in visual computation algorithms reduce this gap.
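As a rough illustration of what 3D spatial information from a single camera can look like in practice, the sketch below back-projects one pixel with an estimated depth into a 3D point using the standard pinhole camera model. The function name, the camera intrinsics, and the depth value are all hypothetical and chosen only for illustration; this is not Depth Pro's interface, only the kind of geometry a per-pixel depth estimate makes possible.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with a metric depth estimate to a 3D point
    in the camera frame, using the pinhole camera model."""
    x = (u - cx) * depth_m / fx  # horizontal offset from the optical axis
    y = (v - cy) * depth_m / fy  # vertical offset from the optical axis
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 camera (assumed values).
FX = FY = 500.0        # focal length in pixels
CX, CY = 320.0, 240.0  # principal point (image center)

# A pixel 100 columns right of center, estimated to be 2 m away.
point = backproject(420, 240, 2.0, FX, FY, CX, CY)
print(point)
```

With a depth estimate for every pixel, the same calculation yields a point cloud from a single image, which is the role a stereo pair or dedicated depth sensor would otherwise fill.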

On the visual positioning side, integration into more resources is key for maintaining and expanding a user base. Niantic is in a very favorable position when it comes to location content and visual positioning, having built up a strong foundational platform over the past decade through its location-based gaming content like Pokémon Go. Leveraging that data both internally, as it has been, and as a value add for content creation makes sense, and others that can are likely to copy the approach. Few have the resources or existing visual/location data outside of Apple, Google, and some specific mapping companies like Mapbox, but partnerships will play a role here.

Another Uncertain, but Promising XR Growth Opportunity

RECOMMENDATIONS


Visual AI will be the next battleground for device enablement, with XR front and center. AI-first smart glasses will aim to use as inexpensive a camera/sensor setup as possible while getting the most capability out of an AI assistant or similar solution. Forgoing some complex 3D visualization or spatial compute capabilities still leaves room for valuable additions to AI platforms: look for translation, media capture improvements, and broad AI recognition/processing expansion as a result of greater access to capable machine vision models.

There is also an important relationship between Two-Dimensional (2D) and 3D sensing. Devices will support a mixture of camera and sensor setups, so there will almost never be a one-size-fits-all approach to sensing or to leveraging that sensor data. As platforms try to scale to more users, support for a mixture of content types is critical—Depth Pro and similar advancements will help improve accuracy of capture and sensing on a wider swath of devices, and thus present a larger addressable market for developers and content providers. Being able to also leverage broad location data, no matter the provider, fills an additional data gap.

The market is young here, with XR hardware strong in some areas (Virtual Reality (VR), visualization) and weak in others (mixed reality smart glasses). The competitive ecosystem is dominated by a few key players, and that will continue for some time. However, ways to differentiate are growing, and the big tech names will be keeping an eye on high-quality, specialized vendors to partner with, acquire, or copy. Both Apple and Google have yet to confirm much of their XR plans, but visual AI will certainly be a part of them. It will be a matter of cleanly integrating new features into both new and existing products for those platforms to become established, and there will certainly be growing pains figuring out what works, what is possible, and ultimately what users find valuable. During that time, these advancements in visual AI and integration into more solutions at all levels, from creation to end user, present opportunity.
