Hardware Abstraction is Essential to Lower the Barrier to Entry for New Technology
NEWS
Silicon hardware vendors have long been acutely aware that the more specialized the hardware is, the higher the barrier to entry is for potential customers. To maximize the performance gain from adopting specialized hardware, the enterprise has traditionally needed to engage with that hardware at a lower level. The best-known example is the NVIDIA Graphics Processing Unit (GPU), which was first touted as an accelerator for purposes other than graphics. The benefits could be clearly demonstrated on paper, in the lab, or in a Proof of Concept (POC) study, but general adoption of the technology did not start to take off until the CUDA Toolkit became available. CUDA is NVIDIA’s parallel computing platform and Application Programming Interface (API), which allows programmers to write code specifically for NVIDIA GPUs. With CUDA, NVIDIA effectively lowered the barrier to entry for this technology. Developers were able to exploit the new functionality it unlocked to gain a competitive edge, and they could do so without interacting with the hardware directly. NVIDIA has benefitted greatly from its investment in the CUDA ecosystem, and it has become obvious that any company that wants the enterprise to adopt its specialized silicon needs a similarly well-thought-out and robust method for interacting with that silicon.
The migration of the high-end, heterogeneous compute system to the mainstream enterprise data center has brought a boom in specialized acceleration technologies from multiple hardware manufacturers. Intel has the Mount Evans Infrastructure Processing Unit (IPU) and the Habana Gaudi Artificial Intelligence (AI) processor, NVIDIA has the Ampere GPU and the BlueField-2 Data Processing Unit (DPU), and Marvell has the OCTEON 10 DPU. As these products face the same barriers to entry that NVIDIA did before CUDA was available for the GPU, it is no surprise that all these vendors are investing in software engineering to produce layers of abstraction between the applications and the specialized hardware.
This shift towards higher levels of hardware abstraction is great for the consumer, as it lowers that barrier to entry for specialized hardware that gives them a competitive advantage, and it means that the lower level of coding becomes the headache of the hardware manufacturers. Surely this is a double win for the tech consumer?
Abstraction Layers Give Consumers Choice
IMPACT
The layers of abstraction vary in detail between solutions, but the principles remain the same. The top layer is the application platform layer, which is designed for high-level developer interaction. In the case of NVIDIA, this layer includes its many purpose-oriented applications, such as Riva for conversational AI or Clara for healthcare. The next layer down consists of the acceleration libraries and Software Development Kits (SDKs), which can be based on open standards or can be proprietary and optimized for specific hardware. This layer is designed for more advanced developer interaction: the developer needs to understand not only the application task and requirements, but also the SDKs and acceleration libraries that drive the underlying interaction with the hardware. This layer allows developers to optimize the code to the attributes and requirements of both the application workload and the hardware it is running on. The base layer gives the closest interaction with the hardware and requires very specific low-level programming skills. Specialist high-performance programmers used to work at this layer when optimizing platforms, but today it is largely the hardware vendor that operates here.
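The three layers can be pictured as a simple call chain. The sketch below is purely illustrative: every class and method name is hypothetical and does not correspond to any vendor's actual API. It only shows how each layer hides the one beneath it.

```python
"""Illustrative three-layer stack; all names are hypothetical."""


class VendorDriver:
    """Base layer: stands in for vendor-maintained, hardware-specific code."""

    def launch(self, op, data):
        # A real driver would dispatch work to the accelerator; here the
        # operation simply runs on the host so the sketch stays runnable.
        return [op(x) for x in data]


class AccelerationLibrary:
    """Middle layer: an SDK-style library that wraps the driver."""

    def __init__(self, driver):
        self._driver = driver

    def scale(self, data, factor):
        # The library decides how to map the task onto the driver.
        return self._driver.launch(lambda x: x * factor, data)


class Application:
    """Top layer: task-oriented interface with no hardware details."""

    def __init__(self, library):
        self._library = library

    def normalize(self, readings):
        # Scale the readings so that the largest value becomes 1.0.
        peak = max(readings)
        return self._library.scale(readings, 1.0 / peak)


app = Application(AccelerationLibrary(VendorDriver()))
result = app.normalize([2.0, 4.0, 8.0])
print(result)
```

In this arrangement, a developer working at the top layer only ever calls `normalize`; swapping the driver for a different one would not change that call.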
The upper two layers of hardware abstraction give potential users of this specialized hardware an enormous number of options when deciding which hardware to select and how they choose to interact with it. The barriers to entry for this level of specialized technology have been significantly lowered because of these layers of abstraction. Developers work with these abstracted software layers, not the underlying code, and each layer makes them more productive: programming in Python rather than CUDA, for example. The effort required to port code to accelerated technology is significantly lower because of the existence of higher-level APIs and their ecosystems. Another benefit is that the headache of maintaining forward and backward compatibility between generations of processors falls to the hardware manufacturers. But there is a tradeoff to this productivity gained through convenience. Has anyone considered the impact of five years of optimization for a specific API or ecosystem when the time comes to move that code base to a different hardware platform?
Can You Have Your Layer Cake and Eat It?
RECOMMENDATIONS
It is fair to say that moving between hardware platforms will not be a trivial task (it never is), but in this case the vendor lock-in is neither cynical nor avoidable, and there are things that the enterprise can do to minimize the impact. The logic behind the hardware-specific lock-in is simple: each processor vendor is investing heavily in its own platform to ease the path of access to its innovative new products. To do this, vendors have to write instructions specific to their hardware and further optimize the software ecosystems to take advantage of the unique properties of that hardware. Technology vendors also go to great lengths to enhance the developer experience and provide a common look and feel across the whole acceleration technology ecosystem. In addition, they support common building blocks and harmonized environments across multiple products, which enhances productivity when programming and further commits an organization to that technology.
It is worth noting that open standards do exist, and that all of these ecosystems are based on or integrated with open APIs, development kits, and frameworks such as DPDK, SPDK, IPDK, TensorFlow, or PyTorch, to name a handful. This means that portability between platforms should not require a full rewrite of the code base. Intel’s oneAPI, for example, is a cross-industry, open, standards-based unified programming model that delivers a common developer experience across accelerator architectures. Access to hardware-specific features will obviously not be portable between platforms, so any optimization that depends on such a feature will need to be rewritten, and the performance gain it delivered would be lost. An example of this is NVIDIA’s TensorRT, an NVIDIA-specific inference library that provides APIs via C++ and Python. Its TensorFlow integration leverages TensorRT’s inference optimization on NVIDIA GPUs within the TensorFlow framework; TensorFlow itself is an open framework for AI and Machine Learning (ML) developed by the Google Brain Team. TensorRT can accelerate diverse inference workloads, and NVIDIA claims up to six times the performance compared to using TensorFlow alone, which demonstrates that the benefits of buying into a platform ecosystem can be considerable despite the obvious downsides.
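One way to limit how much code is tied to a hardware-specific feature is to keep the optimized path behind a single narrow interface with a portable fallback. The following is a minimal sketch of that pattern, not a description of how any vendor library works: `run_inference`, `portable_infer`, `load_backend`, and the `vendor_accel` module are hypothetical names and are not part of TensorRT, TensorFlow, or any real library.

```python
"""Keep hardware-specific optimizations behind one narrow seam.

`vendor_accel` is a hypothetical module name used for illustration only.
"""
import importlib


def portable_infer(weights, inputs):
    """Portable reference path: a plain dot product that runs anywhere."""
    return sum(w * x for w, x in zip(weights, inputs))


def load_backend(module_name="vendor_accel"):
    """Return (function, label): the vendor path if present, else the fallback."""
    try:
        module = importlib.import_module(module_name)
        return module.infer, "vendor"
    except (ImportError, AttributeError):
        # No vendor library on this platform: only the speed-up is lost,
        # and the application code above this seam is unchanged.
        return portable_infer, "portable"


def run_inference(weights, inputs, backend=None):
    """Application-facing entry point; callers never import vendor code."""
    infer, _label = backend or load_backend()
    return infer(weights, inputs)


backend = load_backend()
output = run_inference([0.5, 0.25], [2.0, 4.0], backend)
print(backend[1], output)
```

With this structure, a move to a different platform would mean rewriting only the vendor-specific module behind `load_backend`, while the portable path continues to work in the meantime at reduced performance.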
The enterprise can further ease the potential lock-in pain associated with hardware abstraction in several ways. Firstly, it needs to be aware of this class of lock-in. The lock-in is mostly unavoidable, and the benefits of adopting new platforms increasingly outweigh the downsides, but being aware that it exists will help the enterprise make informed decisions. Secondly, the enterprise should take the time to make technology choices for the long term, not just over the depreciation cycle of the hardware, because software code persists within an organization for considerably longer. Technology choices should also be made at the organizational level; try to pick a software ecosystem and hardware platform that meets all your needs, as isolated departmental choices increase the likelihood of deploying multiple specialist platforms with no interoperability. Lastly, when selecting a vendor to partner with, pick one with a clear commitment to, and track record of, investing in and maintaining hardware abstraction ecosystems. Systems Integration (SI) specialists should also be engaged where appropriate, as they have broad knowledge of the different platforms and, once they understand your workloads and requirements, can guide you towards a long-term solution that minimizes inefficiencies at all stages of the lifecycle.
Hardware abstraction is unavoidable if your aim is to be productive and efficient in today’s AI and ML workspace, even more so if you consider tomorrow’s workspace. The convenience provided by these abstraction layers comes with this new class of vendor lock-in. It is a familiar double-edged sword, and one which the enterprise has gone to great lengths to avoid historically, but this sword is much blunter than those that have gone before it and the benefits far outweigh the costs.