Using RAG and Long Context on AIBots — when to use which?

Published on January 19, 2025

Two approaches to optimise LLMs, both possible on AIBots.

RAG and/or Long Context are available on AIBots

Both RAG and Long Context have their merits, and on AIBots users can use both approaches. Files uploaded to AIBots are processed with RAG, while Bot Instructions support up to 300,000 characters (~40,000–70,000 words), i.e. Long Context. Text from files can also be copy-pasted into Bot Instructions.
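A minimal sketch of how a Bot Creator might check whether their knowledge-base text fits wholesale into Bot Instructions, using the 300,000-character limit quoted above. The helper names and the sample text are illustrative, not part of AIBots itself.

```python
# Check whether knowledge-base text fits within the Bot Instructions limit
# (300,000 characters), i.e. whether Long Context alone could suffice.

CHAR_LIMIT = 300_000

def fits_in_bot_instructions(text: str) -> bool:
    """Return True if the text can be pasted wholesale (Long Context)."""
    return len(text) <= CHAR_LIMIT

def estimated_words(text: str) -> int:
    # Rough whitespace-based estimate; actual counts vary by language.
    return len(text.split())

notes = "Refunds are processed within 7 working days. " * 1000
print(fits_in_bot_instructions(notes))  # a ~45,000-character sample fits
```

If the text exceeds the limit, RAG (via file upload) is the way to bring that knowledge in.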

Some of the 10,000 AI Bots created, using a hybrid of RAG and Long Context

What are the different approaches?

When ChatGPT first caught the world’s attention in Nov 2022, many of us saw how useful it could be. However, when it came to using it for more impactful use cases, we felt that the model needed more contextual, internal information.

Model fine-tuning was a known solution, but it was extremely expensive and time-consuming.

Retrieval-Augmented Generation (RAG) was an emerging approach, which AIBots adopted in our hackathon prototype in July 2023. It allows relevant information from a Bot’s knowledge base to be incorporated into every prompt. OpenAI’s custom GPTs, launched worldwide in Nov 2023, adopted this approach too.
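The core RAG idea described above can be sketched in a few lines: retrieve the most relevant knowledge-base chunks for a query and prepend them to the prompt. The scoring function here is a toy word-overlap measure for illustration; real systems (including AIBots) use vector embeddings and semantic similarity.

```python
# Toy sketch of RAG prompt assembly: rank knowledge-base chunks by relevance
# to the query, then incorporate the top chunks into the prompt.

def score(chunk: str, query: str) -> int:
    # Toy relevance score: count of shared words.
    return len(set(chunk.lower().split()) & set(query.lower().split()))

def build_prompt(query: str, knowledge_base: list[str], top_k: int = 2) -> str:
    top_chunks = sorted(knowledge_base, key=lambda c: score(c, query), reverse=True)[:top_k]
    context = "\n".join(top_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"

kb = [
    "AIBots supports file uploads processed with RAG.",
    "Bot Instructions support up to 300,000 characters.",
    "The office cafeteria opens at 8am.",
]
print(build_prompt("What is the Bot Instructions character limit?", kb))
```

Because only the retrieved chunks enter the prompt, the knowledge base can be far larger than any model's context window.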

As LLMs became cheaper to operate, context windows also expanded, making Long Context a viable option. This meant that for some use cases, where the background knowledge needed for meaningful responses could fit within roughly 50,000 words, RAG would not be needed.

RAG vs. Long Context

However, Long Context cannot completely replace RAG, and its accuracy is not always better: long prompts are susceptible to the Lost-in-the-Middle problem, where models pay less attention to information buried in the middle of the context. Below is an in-depth comparison of RAG and Long Context, and when to use which.

A comparison of RAG and Long Context which can be used in hybrid on AIBots

There are benefits to both approaches, which is why AIBots offers Bot Creators both options, so they can experiment and find the combination that best suits their use case and produces the best responses.

Developing a good RAG solution is not easy

While there are many tools to help users incorporate RAG, such as LangChain, these may not be suitable at scale or for further customisation.

RAG solutions involve parsing uploaded files, converting them into vectors, storing those vectors, and running the search and retrieval process. There are multiple variations of each step, and significant research, experimentation, and design effort went into optimising the trade-offs between speed, cost, accuracy, volume, and scope.
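The pipeline steps above can be sketched end to end. The embedding here is a toy bag-of-words vector and the chunking is naive fixed-width splitting, both assumptions for illustration; production systems such as GovText's use learned embeddings, smarter chunking, and a dedicated vector database.

```python
# Illustrative RAG pipeline: (1) parse files into chunks, (2) convert chunks
# to vectors, (3) store the vectors, (4) search and retrieve at query time.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Step 2: toy bag-of-words "embedding".
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class VectorStore:
    def __init__(self):
        self.entries = []  # Step 3: chunks stored alongside their vectors

    def add_file(self, text: str, chunk_size: int = 200):
        # Step 1: parse an uploaded file into fixed-width chunks.
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        for c in chunks:
            self.entries.append((embed(c), c))

    def retrieve(self, query: str, top_k: int = 3) -> list[str]:
        # Step 4: rank stored chunks by similarity to the query.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(e[0], q), reverse=True)
        return [c for _, c in ranked[:top_k]]
```

Each step admits many variations (chunking strategy, embedding model, index type, ranking function), which is where the trade-offs between speed, cost, and accuracy arise.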

The GovText team, comprising skilled and dedicated AI Engineers, developed an impressive RAG solution in a short time, which AIBots chose to adopt. Many teams attempt to build such RAG solutions, including us in the early stages, but GovText’s offering works much better.

In the AIBots team, we value collaboration between product teams and stakeholders as this will help us scale sustainably. The potential of this new wave of AI developments far exceeds what our team of five can handle.

GovText also offers RAG-as-a-Service, a subsequent offering that AIBots creators can consider. After creating a RAG chatbot on AIBots (which can be completed in under 15 minutes) and testing it, users who wish for further customisation, particularly of the user interface, can now call the GovText RAG API directly. Do contact their Product Manager, Xiu Quan, to find out more.

Create or use an AI Bot here today at https://aibots.gov.sg
(for Singapore public officers and National Healthcare Group staff only)


Using RAG and Long Context on AIBots — when to use which? was originally published in Government Digital Products, Singapore on Medium.