The Market Is Shifting Its Focus from Large Language Models to Small Language Models | NEWS
The Artificial Intelligence (AI) market is seeing a profound shift in foundational model innovation. Although AI leaders are still innovating at the top end, targeting Artificial General Intelligence (AGI), opportunities in the enterprise market are driving investment and developer activity toward a new class of model: Small Language Models (SLMs). SLMs are foundational models with fewer than 15 billion parameters, typically built by extracting and compressing application-specific parameters from Large Language Models (LLMs) and then fine-tuning on targeted datasets for specific tasks, verticals, or applications. SLMs have much lower memory and power requirements, offering improved AI inferencing efficiency and faster time-to-train.

Innovation in SLMs is being spurred by several interconnected supply-side factors. Open-source expansion has accelerated innovation, with platforms like Hugging Face and big tech investors like Meta acting as a foundation on which developers can build new leading-edge models. Part of this has been the availability and improvement of Machine Learning Operations (MLOps) tools: big strides in, and open-sourcing of, model compression, optimization, quantization, and pruning are supporting the development of competitive SLMs (see the sketch below). The transition from cloud to edge and on-device computing has also necessitated software innovation to bring productivity-focused AI applications to market. However, certain complex applications, such as mathematical reasoning and video understanding, still require larger models.
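As a concrete illustration of the quantization tooling now available in the open-source ecosystem, the following minimal sketch loads an open 7 billion parameter model in 4-bit precision using the Hugging Face transformers library with the bitsandbytes backend. The model ID and settings are illustrative, not a prescribed recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization via the bitsandbytes backend: roughly quarters
# the weight memory of an fp16 checkpoint at a modest accuracy cost.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "mistralai/Mistral-7B-v0.1"  # illustrative; any Hub causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPU(s)/CPU
)

# Roughly 4 GB for a 7B model in 4-bit, versus roughly 14 GB in fp16.
print(f"Weight footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```

Post-training quantization of this kind is only one of the MLOps levers mentioned above; pruning and distillation reduce the parameter count itself rather than the precision of each weight.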
Two types of SLMs ("tailored" and "tiny") emerged in 2023, which was a breakout year for "tailored" models (vertical- or application-specific models with between 2 billion and 15 billion parameters). Marquee companies like Meta (Llama 2 7B and 13B) and Microsoft (Orca 13B) have poured capital into innovation, while startups have also shown success. Mistral, then a 9-month-old startup, built an open-source 7 billion parameter language model offering performance comparable to OpenAI's much larger GPT-3.5.
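Tailored models of this kind are typically produced by fine-tuning an open base model rather than training from scratch. The sketch below, assuming the Hugging Face transformers and peft libraries, shows the widely used Low-Rank Adaptation (LoRA) approach; the base checkpoint and hyperparameter values are illustrative.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative base model; a tailored SLM would start from whichever open
# checkpoint best fits the target vertical.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# LoRA trains small low-rank adapter matrices on selected projections
# instead of updating all 7 billion base weights.
lora_config = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # typically well under 1% of base weights
# From here, train with a standard transformers Trainer on the
# vertical-specific dataset, then ship or merge the small adapter.
```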
Moving into 2024, ABI Research sees new innovation targeting tiny language models. These models, with fewer than 2 billion parameters today and potentially fewer than 1 billion in the future, will further reduce the power, memory, and compute demands of running generative AI applications without compromising processing efficiency, allowing deployment in even more constrained locations. AI leaders remain mostly in the Research and Development (R&D) phase, though some tiny models are already being deployed in commercial applications.
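To see why parameter count matters so much in constrained deployments, here is a back-of-the-envelope sizing sketch; the figures cover model weights only and ignore activations, KV cache, and runtime overhead.

```python
# Weight-only memory footprint: parameters x bits per parameter / 8.
# Activations, KV cache, and runtime overhead are deliberately ignored.
def weight_footprint_gb(params_billions: float, bits_per_param: int) -> float:
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for params in (7.0, 2.0, 1.0):
    for bits in (16, 8, 4):
        print(f"{params:.0f}B params @ {bits:>2}-bit: "
              f"~{weight_footprint_gb(params, bits):.2f} GB")
# A 1 billion parameter model quantized to 4 bits needs roughly 0.5 GB of
# weights, within reach of phones and embedded devices, whereas a 7 billion
# parameter fp16 model needs roughly 14 GB.
```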
Tiny models remain fairly basic, but the evidence suggests they are moving in the right direction. For example, Microsoft's Phi-1.5, a 1.3 billion parameter model, has shown higher performance on difficult reasoning tasks than much larger models like Vicuna-13B.
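Claims like this can be checked hands-on. A minimal sketch that loads Phi-1.5 from the Hugging Face Hub and prompts it with a small reasoning problem; the Hub ID and the prompt are assumptions for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"  # 1.3B parameters; Hub ID assumed here
# Note: older transformers releases required trust_remote_code=True for Phi.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = ("Question: A train travels 60 km in 40 minutes. "
          "What is its speed in km/h?\nAnswer:")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```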
Can "Tailored" and "Tiny" Language Models Support Enterprise Deployment? |
IMPACT |
Enterprises still face enormous challenges that hinder generative AI adoption and deployment. First, models are not reliable enough for mission-critical or customer-facing use cases, as they remain subject to hallucinations and bias. Second, enterprises often lack the infrastructure, datasets, organizational structures, and skill sets to support effective deployment. Third, running generative AI is costly. Leveraging multiple targeted SLMs, rather than "giant" LLMs, could offer enterprises a better route to AI deployment.
But challenges remain. Knowledge will be limited, as SLMs are memory constrained. Retrieval-Augmented Generation (RAG) can alleviate this, though it may not suit every use case; a minimal sketch follows below. In addition, running hundreds of task-specific SLMs within or across business units will be operationally challenging. Both LLMs and SLMs offer enterprise opportunities and can serve different purposes. Decision makers must focus on aligning the task, expectations, and available infrastructure and skill sets to determine which model to deploy.
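For reference, RAG amounts to retrieving relevant documents at query time and prepending them to the model's prompt, so the SLM does not need to store that knowledge in its weights. A minimal sketch, assuming the sentence-transformers library and a toy in-memory document store standing in for an enterprise knowledge base:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy in-memory document store standing in for an enterprise knowledge base.
docs = [
    "Refunds are processed within 14 days of receiving the returned item.",
    "Premium support contracts include a 4-hour response-time guarantee.",
    "On-premises deployments require a minimum of 32 GB of RAM per node.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity, since vectors are normalized
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

query = "How fast are refunds issued?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
# `prompt` would then go to the SLM's generate() call; the retrieved context
# supplies knowledge the memory-constrained model does not hold in weights.
print(prompt)
```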
Supply Chain Has Opportunities to Take Advantage of SLM Innovation | RECOMMENDATIONS
So far, the generative AI market has been dominated by the loss-leading, cloud-focused consumer segment (which continues to expect free access to applications like ChatGPT), while the enterprise and device spaces lag behind. Competitive SLMs may offer stakeholders an opportunity to build new monetization strategies and crack the enterprise space. ABI Research believes key players in the value chain must embrace, invest in, and support SLMs as part of a wider AI commercialization strategy.
In 2024, the SLM market will continue to expand. "Tailored" models will begin to mature, with commercial initiatives starting to leverage 3 billion to 15 billion parameter models, as exemplified by the PC and smartphone markets. Tiny models will receive greater R&D attention and investment, supported by open-source commitments and growing developer interest. One area that needs attention in 2024 is multi-modality, as productivity applications will rely on this capability. Looking forward, accelerated innovation in MLOps techniques will enable further model compression, resulting in even smaller (sub-1 billion parameter) and more competitive models.