Enterprise search is hard: why it’s so behind—and what it’ll take to catch up
All too often, we take search for granted.
Google works amazingly well—so much so that it’s painfully obvious where enterprise search software has fallen behind. Trying to find something in our day-to-day work just isn’t as seamless as trying to find something on the Internet.
Why is that? Simply put, enterprise search is hard. Each company’s content is unique to that organization, unlike the shared web that billions of people search daily. Within a company, employees are usually looking for one specific piece of information that exists in a single place, whereas on the web millions of pages might answer the same question. And all this content is usually comprehensible only to those working at the company, making it hard to learn from usage patterns and feedback.
Given the complexity, enterprise search software has been years behind web search for far too long. Here’s what it’ll take to catch enterprise search software up to the current needs of companies today.
Get the basics right
There’s a lot of functionality you’d expect the search feature in popular SaaS (Software as a Service) tools to have, but most of the time, their native search simply doesn’t work. That’s because they often miss out on some of the basics.
Enterprise search software should search across all of your content, not just document titles. It should come with knowledge of standard acronyms and synonyms, automatically including results that mention “Chief Executive Officer” when you search for “CEO,” and results that mention “holiday calendar” when you search for “vacation calendar.” It should also understand that different parts of your search query are intended for different purposes; when you search for “board meeting slides,” it should know to surface the slides not just because of what they contain, but because of what they are.
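To make the acronym and synonym idea concrete, here is a minimal sketch of query expansion. The `SYNONYMS` table and the function names are assumptions made up for illustration; a production system would learn or curate a much larger mapping and would not rely on plain substring matching.

```python
# Hypothetical synonym table; a real system would learn or curate these.
SYNONYMS = {
    "ceo": ["chief executive officer"],
    "vacation": ["holiday"],
}


def expand_query(query: str) -> list[str]:
    """Return the normalized query plus variants with known synonyms swapped in."""
    tokens = query.lower().split()
    variants = [" ".join(tokens)]
    for i, token in enumerate(tokens):
        for alt in SYNONYMS.get(token, []):
            variants.append(" ".join(tokens[:i] + [alt] + tokens[i + 1:]))
    return variants


def matches(query: str, document_text: str) -> bool:
    """True if any expanded variant appears anywhere in the document body."""
    text = document_text.lower()
    return any(variant in text for variant in expand_query(query))
```

With this sketch, a search for “vacation calendar” would also match a document whose body only ever says “holiday calendar.”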
Ranking algorithms should constantly learn from feedback. If you click on a search result low down the list, it should understand that the result probably should’ve been ranked higher, while being careful not to overfit to this one data point (unlike with Google search, feedback signals here are much sparser and less reliable). In the not infrequent moments when you slip up, it should recognize and correct your typos, based on the language of your company.
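One simple way to learn from clicks without overreacting to a single data point is to blend the observed click rate with a prior. The sketch below is an illustration under assumed names and weights (`prior_rate`, `prior_weight`, and the result fields are all hypothetical), not a description of any particular product’s ranking.

```python
def smoothed_click_rate(clicks: int, impressions: int,
                        prior_rate: float = 0.1,
                        prior_weight: float = 20.0) -> float:
    """Blend the observed click rate with a prior so sparse data moves the estimate slowly."""
    return (clicks + prior_rate * prior_weight) / (impressions + prior_weight)


def rerank(results: list[dict]) -> list[dict]:
    """Sort by base relevance score scaled by smoothed click feedback."""
    return sorted(
        results,
        key=lambda r: r["base_score"]
        * (1.0 + smoothed_click_rate(r["clicks"], r["impressions"])),
        reverse=True,
    )
```

Because of the prior, one click on a lower-ranked result nudges its score only slightly, while a sustained pattern of clicks eventually moves it up the list.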
All of this should exist in one unified interface across all apps, with no manual tuning required, ready to go from Day 1.
Understand your company’s language
Typing in keywords and hoping for a match has been the dominant paradigm for search. But often, that’s not enough. Sometimes you might remember what a document talked about, but not how it was worded. Enterprise search software should come with built-in semantic search, so you can look for information the way you remembered it—even if you replace “what’s the wifi password” with “where are the internet settings.”
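As a rough illustration of how semantic search differs from keyword matching, the sketch below compares a query and documents by the angle between their vectors instead of by shared words. The three-dimensional vectors are hand-written stand-ins and purely hypothetical; a real system would get embeddings from a trained language model.

```python
import math

# Hypothetical, hand-written "embeddings"; a real system would compute
# these vectors with a trained language model.
DOC_VECTORS = {
    "Office wifi password and network setup": [0.9, 0.1, 0.0],
    "Q3 board meeting slides": [0.0, 0.2, 0.9],
}


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vector: list[float],
                    doc_vectors: dict[str, list[float]]) -> str:
    """Return the title whose vector is closest in direction to the query."""
    return max(doc_vectors, key=lambda title: cosine(query_vector, doc_vectors[title]))
```

If a query like “where are the internet settings” were embedded near the first document’s vector, it would retrieve the wifi page even though the two share no keywords.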
Of course, how you communicate within your company could differ wildly from how people communicate at other companies. Depending on whether you build software or grow fruit, “apple” could refer to very different concepts. An effective enterprise search system needs custom deep learning models to help it understand your company’s specific language. These models not only drive semantic search, but also learn what words you and your colleagues use as synonyms, whether it’s that project that got renamed or the clever acronym you created for it. The amount of data in a company is usually many orders of magnitude smaller than the web or public data sources, so robust domain adaptation on such low volume requires careful, nuanced application of transfer learning.
Understand how your company works
Your company is unique. Different teams work on different documents, talk about varied projects, and use an assortment of software in their own idiosyncratic ways. None of that is shared by other companies, yet understanding all of that is critical to a search experience that just works. Constantly building a knowledge graph of all the buzzing activity within your company enables search to surface the most important, relevant and fresh content, for every query. Graph learning techniques also enable an understanding of how all documents, people, and concepts within the company relate to each other.
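The knowledge graph idea can be sketched as a small in-memory structure that links people and documents and gives recent activity more weight. Everything here is an assumption for illustration: the class name, the 30-day half-life, and the scoring rule are hypothetical, and a real graph would cover many more entity and edge types.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph of people and documents linked by activity."""

    def __init__(self) -> None:
        # node -> neighbor -> accumulated edge weight
        self.edges = defaultdict(lambda: defaultdict(float))

    def record(self, person: str, document: str, age_days: float,
               half_life_days: float = 30.0) -> None:
        """Add an interaction; older activity contributes exponentially less."""
        weight = 0.5 ** (age_days / half_life_days)
        self.edges[person][document] += weight
        self.edges[document][person] += weight

    def related_people(self, person: str) -> dict[str, float]:
        """Score colleagues by the documents they share with `person`."""
        scores: dict[str, float] = defaultdict(float)
        for document, w1 in self.edges[person].items():
            for other, w2 in self.edges[document].items():
                if other != person:
                    scores[other] += w1 * w2
        return dict(scores)
```

In this sketch, a colleague who edited the same document yesterday scores higher than one who last touched it months ago, which is one simple way to keep results fresh.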
Aggregated data from various sources should provide a 360-degree view of all your employees—who they are, what they work on, who they work closely with, and what they’ve been up to. A similar view for customers should help teams track leads and opportunities in one unified interface.
Understand how you work
What you need to know to get work done is very different from what other people in the company might need. A search for, say, “quarterly goals” should take into account whether you’re a software engineer or a sales account executive, instead of just showing the same results to everyone. Every search should be deeply personalized, and the ranking algorithms should leverage an understanding of the documents you work on, the tools you use, the projects you talk about, and the region you work from. The knowledge graph must tell the search system which of your coworkers’ content you care about most, so that every search is tailored specifically for you. The same understanding of each user should also drive autocomplete, so that suggested queries and documents get you what you need quickly.
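A minimal sketch of personalized re-ranking, assuming each result carries an author and a team and that some upstream component supplies a per-user affinity score for each coworker. The field names and the 0.5 team boost are hypothetical, chosen only to show the shape of the idea.

```python
def personalize(results: list[dict], user: dict,
                affinity: dict[str, float]) -> list[dict]:
    """Re-rank results by boosting close coworkers' content and the user's own team.

    results:  dicts with "title", "base_score", "author", "team"
    user:     dict with "name" and "team"
    affinity: author name -> closeness to this user, in [0, 1]
    """
    def score(doc: dict) -> float:
        boost = 1.0 + affinity.get(doc["author"], 0.0)
        if doc["team"] == user["team"]:
            boost += 0.5  # arbitrary illustrative weight
        return doc["base_score"] * boost

    return sorted(results, key=score, reverse=True)
```

With this sketch, the same “quarterly goals” query would put the engineering goals first for an engineer and the sales goals first for an account executive.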
Building a great search experience for the enterprise requires solving previously unsurmounted challenges. It requires deeply understanding how you work, and what information matters to you. At Glean, we tackle these problems every day. And we’re excited to give teams the work assistant that catches enterprise search up to where it should be.