Nowadays, when we hear about AI, the topic is almost always Generative AI, specifically Generative Language Models. However, as Gartner puts it, “AI does not revolve around GenAI.” After a couple of years of getting used to various chatbot and copilot-like use cases, we believe the next phase where AI will have a real impact on business and industry is Hybrid AI, which combines Generative and Non-Generative AI.
This blog reflects on the role of Non-Generative AI in RAG solutions (Retrieval Augmented Generation) and introduces the value of predictive models in a new Hybrid AI framework that we could call PAG, from the words "Prediction Augmented Generation".
Recap: What's happened so far
Let’s start with a recap. At the start of 2023, we witnessed a game-changer emerge. ChatGPT came along and, finally, after years of skepticism and misunderstanding of AI, people could get a feel for its potential. If you’re like me, you tested out ChatGPT and thought, “Wow, it actually talks!”
Chatbots were answering questions just like a human would. However, we quickly noticed the problems with the generative models behind tools like GPT. Real, valuable questions aren't just fun trivia; they look more like this…
These questions demand internal knowledge about your company’s operations, business, products, customers, internal guidelines, and more. For these questions, GPT either responds with “I don’t know” or it fabricates an answer.
The unfortunate truth is that chatbots based on GenAI were answering questions just like a human who doesn’t work at your company.
Introducing Semantic Search
Luckily, we have another trick up our sleeve: Semantic Search. This allows a chat solution to search through mountains of internal documentation to find the most relevant pieces of information to use in the answer to a question. Generative AI then elaborates on that information, for better or worse, thanks to its uncanny ability to mimic human language, learned mainly from internet text. Whatever its accuracy, this elaboration makes the answer more digestible and actionable.
Semantic Search is the “Retrieval” part of RAG, Retrieval Augmented Generation. It is not Generative AI, but it is still AI because a good semantic search model is language-aware. It has been trained on human language, so it understands the context of a question. For example, if you ask a Semantic Search model to find the most relevant text for the question “Which cells are allocated to first offenders?” it will return text from guidelines on space management in a prison and skip all the documentation you have on battery cells or cancer cells.
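Under the hood, a semantic search engine embeds both the query and the documents as vectors and ranks documents by similarity. The sketch below uses made-up three-dimensional embeddings standing in for the output of a real sentence-embedding model; the document names are illustrative only:

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; real ones come from a trained language model
query = [0.9, 0.1, 0.2]  # "Which cells are allocated to first offenders?"
docs = {
    "prison space management guideline": [0.8, 0.2, 0.1],
    "battery cell datasheet": [0.1, 0.9, 0.3],
}
best = max(docs, key=lambda name: cosine(query, docs[name]))
```

Because the embedding model was trained on human language, "cells" in the question lands near the prison guideline, not the battery datasheet, even though the word matches both.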
This use of Non-Generative AI for semantic search has its roots in Natural Language Processing (NLP) and Natural Language Understanding (NLU). And although the chat experience of RAG makes us think of it as “Generative AI,” it is arguably the Non-Generative “R” in “RAG” that provides the most value.
So, spoiler alert: anyone using RAG today is already using Hybrid AI.
But RAG only goes so far.
Now, people are noticing a bigger problem with chatbots. The most valuable questions actually look like this…
This is where RAG breaks down, as there is nothing useful that can be “Retrieved”:
- A sales team in office supplies has no magic document that tells them whether John Smith is likely to buy a new LCD display.
- A controller of connected equipment in a mining company has no magic document telling them which component needs replacing in a drilling unit.
Machine Learning and predictive insights
Luckily, we have another trick up our sleeve: Machine Learning.
Remember ML? It has been around for years but never enjoyed the same game-changing fanfare as GPT. Perhaps ML even got a bad reputation for consuming a lot of PoC time with relatively little impact. Part of the reason might be that the raw outputs from Machine Learning are probabilities. What’s the probability that someone will buy a product? What’s the probability that someone belongs to this category? It has proven difficult to make use of a machine that just spits out probabilities — in other words, ML didn’t talk…
So, let's recap Machine Learning. It is a model that learns from past data to predict and explain the present and future.
For a typical business use case of ML, we can train a predictive model on sales data from quotes that have been accepted and rejected in the past. Then, for a new quote, the model predicts whether it will be accepted or rejected. It works because the accept/reject decision is influenced by all the details we have on the prospect, the product, the price, the market—even the time of year, who knows.
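As an illustration only (a real project would use an ML library such as scikit-learn, and the feature names here are invented), the learning step can be reduced to estimating acceptance rates from past quotes and averaging them for a new quote:

```python
from collections import defaultdict

# Hypothetical past quotes: feature values -> was the quote accepted?
past_quotes = [
    ({"region": "north", "plan": "monthly", "band": "low"}, True),
    ({"region": "north", "plan": "upfront", "band": "high"}, False),
    ({"region": "south", "plan": "monthly", "band": "low"}, True),
    ({"region": "south", "plan": "monthly", "band": "high"}, True),
    ({"region": "north", "plan": "upfront", "band": "low"}, False),
]

def train(quotes):
    """Learn, per feature value, the historical acceptance rate."""
    counts = defaultdict(lambda: [0, 0])  # (feature, value) -> [accepted, total]
    for features, accepted in quotes:
        for key, value in features.items():
            counts[(key, value)][1] += 1
            if accepted:
                counts[(key, value)][0] += 1
    return {k: accepted / total for k, (accepted, total) in counts.items()}

def predict(model, features):
    """Average the per-feature acceptance rates into one probability."""
    rates = [model[(k, v)] for k, v in features.items() if (k, v) in model]
    return sum(rates) / len(rates) if rates else 0.5

model = train(past_quotes)
p = predict(model, {"region": "south", "plan": "monthly", "band": "low"})
```

A production model would weigh a hundred such facts jointly rather than averaging them, but the shape is the same: past outcomes in, a probability out.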
As an added bonus, ML predictions come with an indication of why the outcome is likely to be one way or the other. In the prospecting example above, the model could predict a quote to be accepted and at the same time reveal that, for this particular case, it is the geographical location of the prospect or the payment plan of the product that most influences their decision to buy. Such insights are valuable ammunition for sales professionals when discussing with the prospect.
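One simple way to surface such a reason, sketched here with invented numbers, is to compare each feature's individual contribution with a neutral baseline and report the biggest deviation (principled explainability methods such as SHAP do this far more rigorously):

```python
def most_influential(contributions, baseline=0.5):
    """Report the feature whose contribution deviates most from a neutral baseline."""
    return max(contributions, key=lambda f: abs(contributions[f] - baseline))

# Hypothetical per-feature scores for one predicted-accept quote
contribs = {"location": 0.92, "payment_plan": 0.58, "price_band": 0.44}
reason = most_influential(contribs)
```

Here the prospect's location stands out as the dominant driver, which is exactly the kind of detail a salesperson can act on.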
For a typical industrial use case, we can train a predictive model on sensor data from online machines that have been running fine and those that have failed over the past year. This model can then recognise early signs of failure and relate them to specific machine components — the bread-and-butter of predictive maintenance.
Predictions where facts are unavailable
How predictive models replace Semantic Search
Predictive models like these can replace the semantic search of Retrieval Augmented Generation (RAG), creating something we could call PAG: Prediction Augmented Generation.
A PAG solution works in much the same way as RAG, but it does not search for the answer to a question in some magic internal document that can tell the future. Instead it exploits predictive models.
The PAG architecture complements ML with GenAI, using GenAI as both a translator and a communicator.
As a translator, GenAI will:
- First, translate the human question into input for a predictive model.
- Then, translate the model output (probabilities and reasons) into a meaningful, human answer.
As a communicator, GenAI will then elaborate on that information — accurately or not — drawing on its uncanny ability to mimic human responses, primarily based on internet text.
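Put together, the translator and communicator roles form a simple pipeline. In this sketch, `stub_llm` and `stub_model` are deterministic placeholders for a real language model and a deployed predictive model:

```python
def pag_answer(question, llm, model_api):
    # Translator: the LLM turns the free-text question into structured model input
    model_input = llm(f"Extract prospect details as JSON: {question}")
    # Prediction: the NonGen model returns a probability and its main driver
    score, driver = model_api(model_input)
    # Communicator: the LLM turns the raw numbers into human-sounding advice
    return llm(f"Advise a salesperson: win probability {score:.0%}, driven by {driver}.")

# Deterministic stubs so the sketch runs end to end
stub_llm = lambda prompt: prompt
stub_model = lambda _input: (0.78, "payment plan")
answer = pag_answer("Will John Smith buy the new LCD display?", stub_llm, stub_model)
```

Note that the predictive model sits between the two LLM calls: the generative layer never produces the probability, only the language around it.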
This use of Non-Generative AI for predictions puts the “P” in “PAG” and provides the most value, while the chat experience ensures that, unlike many ML models in the past, the benefits will be realised by humans who just want someone to talk to.
As with RAG, the result is an example of Hybrid AI, combining GenAI with NonGen AI—but unlike RAG, the solution provides useful advice even in the absence of hard facts.
Hybrid AI: combining NonGen AI with GenAI (the PAG example).
Two types of use cases for Prediction Augmented Generation (PAG)
There are many types of use cases for PAG, taking advantage of cloud technologies that both consume and serve model predictions.
One group of use cases consumes predictions in batch processes that feed a company’s data warehouse. For example, a prospecting model may have been trained on data from thousands of previous quotes. Each quote may have involved 100 facts used to train the model, including information about products, prices, customers, prospects, etc. At the end of each working day, new data are available on these or new customers, prospects, etc., and the predictive model is used to ‘score’ all prospects in the daily batch according to how likely they are to buy certain products. Those scores are saved in a database along with other details of the prospect, ready to assist sales staff the following day.
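A nightly batch job of this kind can be sketched with Python's built-in sqlite3 module standing in for the data warehouse; the one-line scoring rule is an invented stand-in for the trained model:

```python
import sqlite3

def nightly_batch_score(prospects, predict):
    """Score every prospect and persist the results for tomorrow's sales work."""
    con = sqlite3.connect(":memory:")  # stand-in for the real warehouse
    con.execute("CREATE TABLE scores (prospect_id TEXT, score REAL)")
    con.executemany(
        "INSERT INTO scores VALUES (?, ?)",
        [(p["id"], predict(p)) for p in prospects],
    )
    con.commit()
    return con

# Stand-in model: more purchase history, higher score
predict = lambda p: min(1.0, 0.1 * len(p.get("history", [])))
prospects = [{"id": "A1", "history": [1, 2, 3]}, {"id": "B2", "history": []}]
con = nightly_batch_score(prospects, predict)
top = con.execute(
    "SELECT prospect_id FROM scores ORDER BY score DESC LIMIT 1"
).fetchone()
```

The key design point is that scoring happens once, off-line, so the chat agent only ever reads cheap, pre-computed rows the next day.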
Another group of use cases gets predictions from the model ‘on demand.’ For this, we create an API to serve the model, which any application can use by providing the details the model needs to make a prediction. For example, the same prospecting model above can be used on demand when a new prospect comes along. First, the same 100 facts are gathered for the new prospect (available in the sales database), and these are input into the model API. A single score—indicating how likely the prospect is to buy a certain product—is returned by the API for use by the application.
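The on-demand variant is just a request/response wrapper around the same model. A framework-agnostic sketch of the handler (the `segment` rule is a stand-in for a real model consuming ~100 facts):

```python
import json

def score_endpoint(request_body: str) -> str:
    """JSON in, JSON out: the shape of a model-serving API handler."""
    facts = json.loads(request_body)
    # Stand-in model; a deployed one would evaluate all gathered facts jointly
    score = 0.8 if facts.get("segment") == "enterprise" else 0.3
    return json.dumps({"score": score})

response = score_endpoint('{"segment": "enterprise", "region": "north"}')
```

In practice this handler would sit behind a web framework and an authenticated route, but any application that can gather the required facts can call it.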
Example of a PAG case using anonymized customer data: Prospector Agent
Let’s see how this looks from the perspective of a salesperson who has PAG enabled in an AI Assistant. In the following three examples, the AI Agent uses batch-scoring and performs the following steps:
- Takes the natural language question from the user.
- Recognizes from the context that the user seeks information on ready-scored prospects.
- Translates the natural language question into requirements for the model input: the number of prospects required and the fact that they should be ordered by priority.
- Fetches the top prospects from the database where prospects were scored the previous night.
- Fetches the contact details for these prospects, knowing that the purpose is to assist salespeople with prospecting tasks.
- Structures the resulting information in an easy-to-use format and delivers it along with human-like advice.
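The retrieval and structuring steps above can be sketched as one orchestration function; the scored table and contact book below are hypothetical stand-ins for the nightly batch output and the CRM:

```python
def top_prospects(scored, contacts, n):
    """Join last night's scores with contact details, best prospects first."""
    ranked = sorted(scored, key=lambda row: row["score"], reverse=True)[:n]
    return [
        {"name": row["prospect"], "score": row["score"], "phone": contacts[row["prospect"]]}
        for row in ranked
    ]

scored = [{"prospect": "Acme", "score": 0.91}, {"prospect": "Globex", "score": 0.40}]
contacts = {"Acme": "+358 40 123", "Globex": "+358 40 456"}
shortlist = top_prospects(scored, contacts, 1)
```

The agent's remaining job, turning this structured shortlist into conversational advice, is handled by the generative layer.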
The chatbot has become a predictive support tool. It provides everything the user needs in the same chat window that can also retrieve facts about customers and products, using RAG and PAG interchangeably.
The benefits of using GenAI to interpret ML outputs are even more striking when we introduce an ML trick called ‘explainability’. This is where the model gives reasons for a prediction, based on the influence of certain data points. The best way to describe this is through another demonstration, this time using on-demand predictions.
In the next three examples, the AI Agent recognises that the user has a new prospect and wants it scored by the model, then:
- Confirms interactively which precise company is being considered as a prospect, enabling the necessary details to be fetched for that prospect.
- Fetches the prospect details and inputs them to the model API, which returns a score for the probability of winning the sale, along with the data point that most influenced that prediction.
- Translates the score and influential data point into human-sounding advice for the user.
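The final translation step can be as simple as mapping score bands to advice and attaching the model's reason; the thresholds and wording here are illustrative, and in a real agent the LLM would phrase the message:

```python
def advice(score: float, driver: str) -> str:
    """Turn a raw probability and its main driver into actionable advice."""
    if score >= 0.7:
        tone = "Strong lead: prioritise a follow-up call."
    elif score >= 0.4:
        tone = "Possible lead: consider a tailored offer."
    else:
        tone = "Weak lead: deprioritise for now."
    return f"{tone} The model's main signal was the {driver}."

message = advice(0.78, "payment plan")
```
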
The final step demonstrates the true synergy of predictive modeling and generative AI: the score on its own is of little use to the user. History has shown that numbers can be hard to interpret and trust when intended for use in decision support. With PAG, the user receives human-like advice with reasoning, not just a number. At the end of the day, time is saved by not pursuing weak leads, and there’s a higher chance that favorable quotes are made to happy new customers.
The previous example demonstrated hybrid AI with structured, operational data from business applications. This represents a large family of operational use cases, providing decision support in the context of sales, customer support, and CRM.
However, Machine Learning models also excel with unstructured data, such as physical measurements and real-time monitoring of machinery. This opens the door for another family of use cases in predictive maintenance in industries where equipment and sensors generate time series data.
The next example represents this family of use cases, using data from sensors in wearable devices. The data were collected in a research study of employees experiencing stress during shift work.
Example use case with IoT data: Employee Stress Alert
The picture above illustrates the training of a stress predictor model, using both physical measurements from a wearable device and text data from a written survey.
The physical measurements relate to heart rate, sweat, and motion signals, collected throughout a working shift. The survey text combines a simple one-liner description of the shift, provided by the employee, with an assessment of their stress levels.
Combining these data, the ML model is trained so that stress levels can be predicted after any future shift: the employee provides a one-line summary, and the corresponding physical signal data from the wearable device complete the input to the model.
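Assembling the model input might look like the following sketch, with invented feature names combining the one-liner text with the wearable signals:

```python
def build_model_input(shift_summary: str, signals: dict) -> dict:
    """Merge free-text and physical measurements into one feature vector."""
    heart_rate = signals["heart_rate"]
    return {
        "summary_words": len(shift_summary.split()),
        "mentions_overtime": int("overtime" in shift_summary.lower()),
        "mean_heart_rate": sum(heart_rate) / len(heart_rate),
        "motion_peaks": signals["motion_peaks"],
    }

features = build_model_input(
    "Busy shift with unexpected overtime",
    {"heart_rate": [88, 95, 102], "motion_peaks": 7},
)
```

A real solution would derive richer text features (typically embeddings) and richer signal features (trends, variability), but the principle of one combined input holds.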
In the following demonstration, the AI Agent uses on-demand scoring and performs the following steps:
- Takes the natural language shift summary from the user.
- Fetches the same user’s physical data from cloud storage (knowing the user’s ID).
- Provides the combined input to the ML model API.
- Receives a score indicating the likelihood that the employee is stressed.
- Translates the score into meaningful, tailored advice.
Combining predictive models with GenAI
The final step demonstrates the power of combining predictive models with GenAI: advice on what to do, specific to the level of stress predicted by the model, is enriched with the ‘knowledge’ held by models like GPT from vast amounts of internet text.
However, advice regarding health and safety should be treated with care, and in most cases, we would choose to further enrich the Agent’s advice with internal guidelines regarding available help or procedures to follow according to the employer. This brings back our old friend, Semantic Search, in a seamless pipeline of Prediction, Retrieval, and Generation (PRAG?).
The Employee Stress Agent represents the broader family of “predictive maintenance” use cases, where more value is expected to come from industrial usage of PAG with connected equipment, such as in the mining or manufacturing industries. In this context, the need for intervention can be recognized by a solution that monitors IoT data in two ways:
- Anomaly detection model: when a certain signal shows unexpected behavior, justifying an inspection.
- Machine Learning model: when the complex combination of multiple, interacting signals resembles circumstances that led to failures in the past—something only revealed by a model that has learned from historical data.
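The first approach needs no training labels at all; a minimal z-score version is enough to show the idea (production systems use more robust statistics and multivariate methods):

```python
def is_anomalous(history, new_value, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the historical mean."""
    mean = sum(history) / len(history)
    variance = sum((x - mean) ** 2 for x in history) / len(history)
    std = variance ** 0.5
    return std > 0 and abs(new_value - mean) / std > threshold

# Hypothetical vibration readings from a healthy drilling unit
history = [10.0, 10.2, 9.8, 10.1, 9.9] * 4
```

A reading of 15.0 against this history would trigger an inspection, while 10.1 would not; the second, learned-model approach is needed when failure only shows up as a combination of signals.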
In both cases, predictive maintenance benefits from Hybrid AI as follows:
- The model provides the prediction (likelihood) of a problem along with indicators of why a problem is likely, such as which particular signals have the largest impact on a prediction or which historical failures were preceded by the most similar behavior.
- The prediction, along with reasons for recommending intervention, can be used in a final Semantic Search through technical documentation to advise on which machine components should be inspected and provide details on their standard maintenance requirements, life cycle, etc.
PAG and the demonstrations here are a major step toward realizing the next revolution in AI, as anticipated by Gartner, who suggested the advent of Hybrid AI this summer with the following statements:
"Overfocusing on Gen AI can lead to ignoring the broader set of alternative and more established AI techniques, which are a better fit for the majority of potential AI use cases."
"Organizations that develop the ability to combine the right AI techniques are uniquely positioned to build AI systems that have better accuracy, transparency and performance, while also reducing costs and need of data."
Leinar Ramos, Gartner Senior Director Analyst
In summary, the arrival of ChatGPT sparked renewed interest in AI as a powerful tool in the digitalization of business and industry because—“Wow, it TALKS!”
However, since the introduction and success of RAG solutions, it has become clear that the role of Generative AI is to summarize, enrich, and combine the results of NonGen AI, making those results actionable with greater confidence and specific recommendations.
In this blog, we have extended this hybrid philosophy to show the benefits of combining GenAI with predictive models from Machine Learning. The resulting Hybrid AI framework, which we have here called Prediction Augmented Generation (PAG), has been demonstrated to benefit operational use cases (e.g., sales) as well as IoT use cases (shown for personal health and, by extension, predictive maintenance of machinery).
About the author
Tony Shepherd holds a Ph.D. in Computer Science from University College London, with his dissertation focusing on machine learning. He works as a Senior Consultant in twoday's Data & Analytics team.