
Over the past few years, large language models (LLMs) have revolutionized the software industry. Given their ability to excel at multi-step reasoning, LLMs have helped enterprises streamline workflows and adapt to the unknown. However, employing such models comes with sky-high costs, latency issues, and limited flexibility. In the realm of IT operations, it is generally wiser to employ smaller, domain-specific models instead.
Most tasks within ITOps do not require LLMs. Using a multi-billion-parameter model to automate routine ITOps tasks, like ticket triaging and classification, is likely a colossal waste of money. What's more, LLMs typically have much higher latency than smaller, domain-specific models, and repetitive IT operations, such as alert creation and incident triage, require fast responses. Importantly, most ITOps tasks are well-defined, making right-sized, domain-specific models a great fit.
The Nature of ITOps
ITOps tasks are generally latency-sensitive, high-volume, and domain-specific; within many enterprises, thousands of events are processed within a narrow and predictable context. By nature, many of these tasks, including ticket routing, anomaly detection, and log generation, are repetitive and highly structured.
Put simply, high-volume, routine tasks require low-latency responses, and right-sized, domain-specific models provide exactly this. Not only are right-sized models faster, but they are also more predictable and easier to fine-tune for a specific environment.
Before rolling out a language model of any size into an organization's workflows, it is important to match the solution to the need at hand. When it comes to ITOps, there is often a need to automate narrow, routine tasks. By fine-tuning small models with proprietary data and right-sizing these models to their specific workloads, enterprises can maximize accuracy and speed while keeping costs under control.
When weighing costs, IT personnel should look beyond the price of the model itself and track operational metrics, such as latency, throughput, system uptime, and cost per query. It's best not to over-engineer AI in ITOps, as doing so introduces unnecessary complexity, adds latency, and increases costs for the organization.
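As a rough illustration of tracking these metrics, the sketch below summarizes a batch of inference requests into 95th-percentile latency, throughput, and cost per query. The `QueryRecord` type, the `summarize` function, and the sample figures are all hypothetical and chosen only for illustration; a real deployment would pull these values from its own monitoring stack.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueryRecord:
    """One inference request (hypothetical record shape)."""
    latency_ms: float
    cost_usd: float


def summarize(records: List[QueryRecord], window_seconds: float) -> Dict[str, float]:
    """Summarize the requests observed during a window of window_seconds."""
    if not records or window_seconds <= 0:
        raise ValueError("need at least one record and a positive window")

    latencies = sorted(r.latency_ms for r in records)
    n = len(latencies)
    # Nearest-rank 95th percentile: rank = ceil(0.95 * n), done in integer math.
    rank = (95 * n + 99) // 100
    p95 = latencies[rank - 1]

    return {
        "p95_latency_ms": p95,
        "throughput_qps": n / window_seconds,
        "cost_per_query_usd": sum(r.cost_usd for r in records) / n,
    }


# Illustrative numbers only, not measurements.
sample = [
    QueryRecord(latency_ms=10.0, cost_usd=0.001),
    QueryRecord(latency_ms=20.0, cost_usd=0.001),
    QueryRecord(latency_ms=30.0, cost_usd=0.001),
    QueryRecord(latency_ms=40.0, cost_usd=0.001),
]
report = summarize(sample, window_seconds=2.0)
```

Running the same summary against a small model and an LLM on identical traffic gives a like-for-like basis for the right-sizing decision.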
The Compliance and Data Privacy Benefits of Domain-Specific Models
Given that smaller, domain-specific models are easier to run locally, all of the enterprise's sensitive infrastructure can often be retained in-house. With models running on-prem or in a private cloud, sensitive data can remain within the confines of the enterprise, reducing privacy and regulatory risks. In highly regulated industries, such as healthcare and financial services, this benefit cannot be overstated.
By not using cloud-based APIs or sending sensitive data to third-party platforms, enterprises gain more control and can more easily adhere to compliance requirements. The limited scope of smaller models makes them easier to audit as well. For many cost-conscious enterprises, running self-hostable, right-sized models is far better than having an API dependency, which can expose the organization to threats outside of its control.
Trade-Offs, Challenges, and Limitations of Smaller Models
Although domain-specific, right-sized models certainly offer strong enterprise alignment, they are not without limitations. Smaller models have more limited general reasoning capabilities, causing them to perform best within narrowly defined domains, as opposed to within cross-domain or open-ended reasoning environments.
Additionally, because the training datasets are smaller and more focused, it is vital that smaller models' training data is of particularly high quality. Poor training data will severely hamper small model accuracy and degrade trustworthiness.
Successful deployment of a domain-specific model requires in-house expertise, not only in integration and fine-tuning, but also in MLOps and inference optimization. Within the enterprise, IT personnel must be tasked with monitoring, fine-tuning, and retraining the small models as the organization's processes, workflows, and data evolve.
Lastly, even though right-sized models are simpler than LLMs, they are not without governance complexity; like their LLM counterparts, small models still require controls around explainability, versioning, access, and auditability.
Opting for a Hybrid Approach
Depending on the size and nature of an enterprise's environment, there are instances where smaller models complement, rather than replace, larger models. A hybrid approach could involve using a domain-specific model for initial data processing, with only the more complex cases routed to an LLM.
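One way to picture this routing is a confidence threshold: the small model handles a ticket when it is sure of its answer, and anything else is escalated. The sketch below is a minimal illustration under that assumption. The `route` function, the 0.8 threshold, and both toy models are hypothetical stand-ins, not a real classifier or LLM client.

```python
from typing import Callable, Tuple

# Hypothetical interfaces: the small model returns (label, confidence),
# while the large model returns only a label.
SmallModel = Callable[[str], Tuple[str, float]]
LargeModel = Callable[[str], str]


def route(ticket: str, small_model: SmallModel, large_model: LargeModel,
          threshold: float = 0.8) -> Tuple[str, str]:
    """Return (label, handler), escalating when the small model is unsure."""
    label, confidence = small_model(ticket)
    if confidence >= threshold:
        return label, "small"
    # Low confidence: hand the ticket to the larger model instead.
    return large_model(ticket), "large"


def toy_small_model(ticket: str) -> Tuple[str, float]:
    """Keyword stub standing in for a fine-tuned domain-specific classifier."""
    if "disk" in ticket.lower():
        return "storage", 0.95
    return "unknown", 0.30


def toy_large_model(ticket: str) -> str:
    """Stub standing in for a call to a hosted LLM."""
    return "needs-human-review"


routine = route("Disk usage at 95% on db01", toy_small_model, toy_large_model)
unusual = route("Intermittent failures after the change window",
                toy_small_model, toy_large_model)
```

In this sketch the routine disk alert stays with the small model, while the ambiguous ticket is escalated. The threshold then becomes a tunable trade-off between cost and coverage: raising it sends more traffic to the LLM.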
Another hybrid strategy could be to use smaller models for repeatable and sensitive workflows, while concurrently utilizing larger language models for exploratory or broad reasoning tasks. Such an approach can keep costs relatively low, while still maximizing value.
Key Takeaways
When it comes to model usage and ITOps, bigger isn't always better. Despite delivering an impressive multi-step reasoning capability, LLMs come with their fair share of baggage, including high costs, latency issues, limited controllability, and governance risks.
For most ITOps activities, smaller language models designed for precision and efficiency are preferable. After all, most routine ITOps tasks, such as log analysis and anomaly detection, require low latency and a very specific knowledge base.
By fine-tuning smaller models on domain-specific data, IT personnel can effectively optimize their environment for unique use cases. To put it simply, most commonplace ITOps tasks do not require LLM capabilities; in fact, a domain-specific model can often get the job done more quickly, more cheaply, and, in many cases, more safely.
