Language models have revolutionized the way we interact with knowledge and information. From chatbots to text summarization, their wide range of enterprise applications will enable us to transform the way we work.
However, with so many providers and models available, it can be overwhelming to choose the right approach. Being hasty here could cost you in terms of iteration speed, or potential deals from security-conscious enterprise customers. In this blog, we’ll discuss the two main options for using language models in production, and the respective pros and cons of each option.
Something for everyone
Before diving into the options, it’s important to note that not all language models are created equal. Some models are better suited for specific tasks than others, and the quality of predictions can vary greatly depending on the provider. Even though a lot of machine learning research is available open-source, there is great business incentive for these providers to hold on tightly to proprietary techniques, or their “secret sauce”.
To compare the quality of different models, benchmarks like Stanford’s HELM (Holistic Evaluation of Language Models) can be useful. HELM evaluates large language models (LLMs) built by different providers on a common set of tasks and metrics, providing a standardized way to compare performance. Of course, however, the best way to test models is to build your own set of evaluations and metrics that suit your own needs and requirements.
Once you’ve identified the right model for your use case, you have two options for using it in production: either using a closed-source provider’s API, or hosting an open-source model.
Option 1 – Use a closed-source provider’s API
Closed-source providers like OpenAI, Cohere, and Anthropic offer access to their language models through subscriptions to their APIs. The process is simple – once you sign up for a provider, they’ll give you access to their API. You’ll then be able to send text to the API and receive a response. Users are typically charged based on the length of input and output.
- Ease of setup – This option requires no infrastructure or maintenance of your own. APIs provide a standardized way to access the model, making integration straightforward and adoption easy.
- No strings attached – There’s minimal investment on your side when it comes to cloud-source APIs. Continuously compare different LLMs, and readily move on to greener pastures when a different provider offers something better for your use case.
- Low to no maintenance – You won’t need any ML Ops know-how in-house to set up and utilize your LLM, saving time, costs, and headaches.
- Security & privacy – If you’re sending data to a third party, there’s an inherent risk of data leaks, along with your proprietary information being used to train and improve the third party’s models. If you have enterprise customers, this option may be a tough sell to them.
- Lack of flexibility – Closed-source models are often simultaneously hard to customize, while being eye-wateringly expensive to fine-tune. This may or may not be a problem depending on how specialized your application is. For example, any standard LLM should be able to classify text as being in English or Spanish. However, if you’re looking to build a text classifier that’s specific to custom data (like classifying a user command into an internal schema for a voice assistant), then you might need your model to be fine-tuned.
- Pricing – The cost of your subscription is entirely at the discretion of the provider, and is subject to their SLAs and pricing scheme. At scale, closed-source solutions may end up being far more expensive than models hosted in-house.
- Product defensibility – Onboarding a closed-source solution will enable competitors to more easily copy your product’s approach and market differentiation. If AI usage is a core strength for your product, this may be of considerable concern.
Option 2 – Host an open-source model
Open-source models like HuggingFace BLOOM, Meta LLaMA, and Google Flan-T5 are freely available for anyone to use. However, solutions or companies which host the model for you and provide API-based access (e.g. HuggingFace and Replicate) are very nascent, so you’ll often end up having to host them yourself. The pros and cons of closed-source models are almost (expectedly) reversed if you choose to go with open-source models.
- Security & privacy – Hosting the model yourself grants you full control over the data and how it’s processed. Potential customers concerned with data privacy might find this prospect much more appealing.
- Additional flexibility – It’s considerably easier to customize and fine-tune the model to your specific use case, enabling more specialized applications and quick responses to sudden needs.
- Pricing – If your usage has high throughput, and scaling is a concern, this option might be much more affordable in the long-run.
- Product defensibility – It’s your model, and yours alone. You’ll be free to continuously tune the “secret sauce” to uniquely fit your particular use-case, making your product harder to imitate and more resilient to competition.
- Difficult setup – Hosting the model yourself requires more technical expertise and infrastructure, making it more time-consuming and complex to set up and integrate.
- Upgrading models – Any and all upgrades you need, you’ll have to build in-house. It might end up being an expensive and tricky affair.
- In-house ML requirements – You’ll be required to house dedicated experts with know-how for fine-tuning models and MLOps. Progress and speed may also be impacted by turnover and onboarding for new hires.
Find what’s best for you
Each model differs in their number of parameters and tradeoffs. Smaller models are cheaper and easier to manage, but might deliver predictions of poorer quality. It’s why companies often start with closed-source models for testing and iterating on ideas, then transition to open-source or in-house models once those ideas find product-market fit.
Regardless, there’s a model fit for everyone’s use case and needs out there. The field is rapidly advancing, both in terms of technology and business models – so expect only more options to choose from moving forward!
Here at Glean, we use an optimal combination of these approaches to ensure that our users have a great product experience without having to sweat over implementation. To learn more and see Glean in action, sign up for a demo today!