Hi, all! I’m Li Yin, author of AdalFlow and ex-AI researcher @ Meta AI.
AdalFlow is an LLM library that not only helps developers build but also optimizes LLM task pipelines. Embracing a design pattern similar to PyTorch, AdalFlow is light, modular, and robust, with a 100% readable codebase.
There are many tutorials that show users how to call high-level agent APIs, but none of them explain in depth how agents really work. This is where the AdalFlow library aims to make a difference.
In this blog, you will not only learn how to use the ReAct Agent but, more importantly, also understand how it is implemented and how you can customize or build your own agent with AdalFlow.
Let’s get started!
[Cover image credit: Growtika]
Introduction
“An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.”
— Franklin and Graesser (1997)
Alongside the well-known RAG applications, agents [1] are another popular family of LLM applications. What makes agents stand out is their ability to reason, plan, and act via accessible tools. When it comes to implementation, AdalFlow has simplified it down to a generator that can use tools, taking multiple steps (sequential or parallel) to complete a user query.
Table of Contents:
What is ReAct Agent
Introduction to tools/function calls
ReAct Agent implementation
ReAct Agent in action
1. What is ReAct Agent
ReAct [2] is a general paradigm for building agents that sequentially interleaves thought, action, and observation steps.
Thought: The reasoning behind taking an action.
Action: The action to take from a predefined set of actions. In particular, these are the tools/functional tools introduced in the tool tutorial [3].
Observation: The simplest scenario is the execution result of the action in string format. To be more robust, this can be defined in any way that provides the right amount of execution information for the LLM to plan the next step.
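Before diving into AdalFlow’s implementation, it helps to see the shape of this loop in code. Below is a minimal, library-agnostic sketch (not AdalFlow’s actual code); plan and execute are hypothetical helpers standing in for the LLM planner and the tool executor:

from typing import Callable, List, Tuple

# History entries are (thought, action, observation) triples.
Step = Tuple[str, str, str]

def react_loop(
    query: str,
    plan: Callable[[str, List[Step]], Tuple[str, str]],  # hypothetical LLM planner
    execute: Callable[[str], str],  # hypothetical tool executor
    max_steps: int = 6,
) -> str:
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action = plan(query, history)  # Thought + Action from the LLM
        observation = execute(action)  # Observation: run the chosen tool
        history.append((thought, action, observation))
        if action.startswith("finish"):
            return observation  # the final answer
    return "Failed to produce a final answer within max_steps."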
Prompt and Data Models
The prompt is the most straightforward way to understand any LLM application. Always read the prompt.
AdalFlow uses jinja2 syntax for the prompt.
DEFAULT_REACT_AGENT_SYSTEM_PROMPT is the default prompt for the ReAct agent’s LLM planner. We can categorize the prompt template into four parts:
Task description
This part is the overall role setup and task description for the agent.
task_desc = r"""You are a helpful assistant.
Answer the user's query using the tools provided below with minimal steps and maximum accuracy.
Each step you will read the previous Thought, Action, and Observation (execution result of the action) and then provide the next Thought and Action."""
Tools, output format, and example
This part of the template is exactly the same as how we call functions in the tool tutorial [3]. The output_format_str is generated by FunctionExpression via JsonOutputParser. It includes the actual output format and examples of a list of FunctionExpression instances. We use the thought and action fields of the FunctionExpression as the agent’s response. You can easily visualize the whole pipeline later by simply calling print(react).
tools = r"""{% if tools %}
<TOOLS>
{% for tool in tools %}
{{ loop.index }}.
{{tool}}
------------------------
{% endfor %}
</TOOLS>
{% endif %}
{{output_format_str}}"""
Task specification to teach the planner how to “think”.
We provide more detailed instructions to ensure the agent always ends with the ‘finish’ action to complete the task. Additionally, we teach it how to handle simple and complex queries.
For simple queries, we instruct the agent to finish with as few steps as possible.
For complex queries, we teach the agent a ‘divide-and-conquer’ strategy to solve the query step by step.
task_spec = r"""<TASK_SPEC>
- For simple queries: Directly call the ``finish`` action and provide the answer.
- For complex queries:
- Step 1: Read the user query and potentially divide it into subqueries. And get started with the first subquery.
- Call one available tool at a time to solve each subquery/subquestion. \
- At step 'finish', join all subqueries answers and finish the task.
Remember:
- Action must call one of the above tools with name. It can not be empty.
- You will always end with 'finish' action to finish the task. The answer can be the final answer or failure message.
</TASK_SPEC>"""
We put all three of these parts together within the <SYS></SYS> tag.
Agent step history.
We use StepOutput to record the agent’s step history, including:
action: This will be the FunctionExpression instance predicted by the agent.
observation: The execution result of the action.
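Conceptually, StepOutput is a small record. Here is a simplified sketch of its shape, inferred from the printout in section 4 (the real class ships with AdalFlow and carries more detail):

from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class StepOutputSketch:  # hypothetical stand-in for AdalFlow's StepOutput
    step: int  # 1-based step index
    action: Any  # the FunctionExpression predicted by the agent
    function: Optional[Any] = None  # the parsed Function (name, args, kwargs)
    observation: Any = None  # execution result of the action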
In particular, we format the steps history after the user query as follows:
step_history = r"""User query:
{{ input_str }}
{# Step History #}
{% if step_history %}
<STEPS>
{% for history in step_history %}
Step {{ loop.index }}.
"Thought": "{{history.action.thought}}",
"Action": "{{history.action.action}}",
"Observation": "{{history.observation}}"
------------------------
{% endfor %}
</STEPS>
{% endif %}
You:"""
2. Introduction to tools/function calls
In addition to the tools provided by users, by default, we add a new tool named finish to allow the agent to stop and return the final answer.
def finish(answer: str) -> str:
    """Finish the task with answer."""
    return answer
Simply returning a string might not fit all scenarios, and we might consider allowing users to define their own finish function in the future for more complex cases.
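For illustration only, such a custom finish might return a structured object instead of a plain string. This is a hypothetical sketch, not a feature AdalFlow supports today:

from dataclasses import dataclass

@dataclass
class FinalAnswer:  # hypothetical structured result type
    answer: str
    succeeded: bool

def finish(answer: str, succeeded: bool = True) -> FinalAnswer:
    """Finish the task with a structured answer instead of a plain string."""
    return FinalAnswer(answer=answer, succeeded=succeeded)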
Additionally, since the provided tools cannot always solve user queries, we allow users to configure whether an LLM model should be used to solve a subquery via the add_llm_as_fallback parameter. This LLM will use the same model client and model arguments as the agent’s planner. Here is our code to specify the fallback LLM tool:
_additional_llm_tool = (
    Generator(model_client=model_client, model_kwargs=model_kwargs)
    if self.add_llm_as_fallback
    else None
)

def llm_tool(input: str) -> str:
    """I answer any input query with llm's world knowledge. Use me as a fallback tool or when the query is simple."""
    # use the generator to answer the query
    try:
        output: GeneratorOutput = _additional_llm_tool(
            prompt_kwargs={"input_str": input}
        )
        response = output.data if output else None
        return response
    except Exception as e:
        log.error(f"Error using the generator: {e}")
        print(f"Error using the generator: {e}")
        return None
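For reference, the planner expresses every action as a FunctionExpression whose action field is a Python-like call string. The examples below are copied from the run in section 4:

# Action strings emitted by the planner (copied from the printout in section 4):
'llm_tool(input="What is the capital of France?")'
'multiply(a=465, b=321)'
'finish(answer="The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.")'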
3. ReAct Agent implementation
We define the class ReActAgent to put everything together. It will orchestrate two components:
planner: A Generator that works with a JsonOutputParser to parse the output format and examples of the function calls using FunctionExpression.
ToolManager: Manages a given list of tools, the finish function, and the LLM tool. It is responsible for parsing and executing the functions.
Additionally, the agent manages step_history as a list of StepOutput instances as its internal state.
To answer a query, the agent prompts the planner with the input query and processes the steps in a loop until it reaches the ‘finish’ action or max_steps, as sketched below.
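The orchestration looks roughly like the following. This is a simplified reconstruction based on the design described above and the _run_one_step method name visible in the logs, not the actual library source:

# Simplified sketch of the agent's call loop (reconstruction, not AdalFlow source).
def call(self, input_str: str) -> str:
    self.step_history = []
    for step in range(1, self.max_steps + 1):
        # The planner generates a FunctionExpression; the tool manager
        # parses and executes it, producing a StepOutput.
        step_output = self._run_one_step(step, input_str)
        self.step_history.append(step_output)
        if step_output.function and step_output.function.name == "finish":
            return step_output.observation  # the final answer
    return "Agent could not finish within max_steps."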
4. ReAct Agent in action
We will set up two sets of models, llama3-70b-8192 by Groq and gpt-3.5-turbo by OpenAI, to test two queries. For comparison, we will also capture a vanilla LLM response without using the agent. Here are the code snippets:
from lightrag.components.agent import ReActAgent
from lightrag.core import Generator, ModelClientType, ModelClient
from lightrag.utils import setup_env
setup_env()
# Define tools
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

def divide(a: float, b: float) -> float:
    """Divide two numbers."""
    return float(a) / b
llama3_model_kwargs = {
    "model": "llama3-70b-8192",  # llama3 70b works better than 8b here
    "temperature": 0.0,
}

gpt_model_kwargs = {
    "model": "gpt-3.5-turbo",
    "temperature": 0.0,
}
def test_react_agent(model_client: ModelClient, model_kwargs: dict):
    tools = [multiply, add, divide]
    queries = [
        "What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2?",
        "Give me 5 words rhyming with cool, and make a 4-sentence poem using them",
    ]
    # define a generator without tools for comparison
    generator = Generator(
        model_client=model_client,
        model_kwargs=model_kwargs,
    )
    react = ReActAgent(
        max_steps=6,
        add_llm_as_fallback=True,
        tools=tools,
        model_client=model_client,
        model_kwargs=model_kwargs,
    )
    # print(react)
    for query in queries:
        print(f"Query: {query}")
        agent_response = react.call(query)
        llm_response = generator.call(prompt_kwargs={"input_str": query})
        print(f"Agent response: {agent_response}")
        print(f"LLM response: {llm_response}")
        print("")
You can inspect the structure of ReAct using print(react), which shows the initialization arguments and the two major components: tool_manager and planner. You can visualize the structure from our colab.
Now, let’s run the test function to see the agent in action.
test_react_agent(ModelClientType.GROQ(), llama3_model_kwargs)
test_react_agent(ModelClientType.OPENAI(), gpt_model_kwargs)
Our agent shows the core steps to developers via a colored printout, including input_query, steps, and the final answer. The printout for the first query with llama3 is shown below (without color here):
2024-07-10 16:48:47 - [react.py:287:call] - input_query: What is the capital of France? and what is 465 times 321 then add 95297 and then divide by 13.2
2024-07-10 16:48:48 - [react.py:266:_run_one_step] - Step 1:
StepOutput(step=1, action=FunctionExpression(thought="Let's break down the query into subqueries and start with the first one.", action='llm_tool(input="What is the capital of France?")'), function=Function(thought=None, name='llm_tool', args=[], kwargs={'input': 'What is the capital of France?'}), observation='The capital of France is Paris!')
_______
2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 2:
StepOutput(step=2, action=FunctionExpression(thought="Now, let's move on to the second subquery.", action='multiply(a=465, b=321)'), function=Function(thought=None, name='multiply', args=[], kwargs={'a': 465, 'b': 321}), observation=149265)
_______
2024-07-10 16:48:49 - [react.py:266:_run_one_step] - Step 3:
StepOutput(step=3, action=FunctionExpression(thought="Now, let's add 95297 to the result.", action='add(a=149265, b=95297)'), function=Function(thought=None, name='add', args=[], kwargs={'a': 149265, 'b': 95297}), observation=244562)
_______
2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 4:
StepOutput(step=4, action=FunctionExpression(thought="Now, let's divide the result by 13.2.", action='divide(a=244562, b=13.2)'), function=Function(thought=None, name='divide', args=[], kwargs={'a': 244562, 'b': 13.2}), observation=18527.424242424244)
_______
2024-07-10 16:48:50 - [react.py:266:_run_one_step] - Step 5:
StepOutput(step=5, action=FunctionExpression(thought="Now, let's combine the answers of both subqueries.", action='finish(answer="The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.")'), function=Function(thought=None, name='finish', args=[], kwargs={'answer': 'The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.'}), observation='The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.')
_______
2024-07-10 16:48:50 - [react.py:301:call] - answer:
The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.
The comparison between the agent and the vanilla LLM response is shown below:
Answer with agent: The capital of France is Paris! and the result of the mathematical operation is 18527.424242424244.
Answer without agent: GeneratorOutput(data="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", error=None, usage=None, raw_response="I'd be happy to help you with that!\n\nThe capital of France is Paris.\n\nNow, let's tackle the math problem:\n\n1. 465 × 321 = 149,485\n2. Add 95,297 to that result: 149,485 + 95,297 = 244,782\n3. Divide the result by 13.2: 244,782 ÷ 13.2 = 18,544.09\n\nSo, the answer is 18,544.09!", metadata=None)
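You can verify the arithmetic yourself; the agent’s tool-computed result is exact, while the vanilla LLM’s mental math drifts at the multiplication step (465 × 321 is 149,265, not 149,485):

>>> (465 * 321 + 95297) / 13.2
18527.424242424244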
The ReAct agent is particularly helpful for answering queries that require capabilities like computation or more complicated reasoning and planning. However, using it for general queries might be overkill, as it might take more steps than necessary to answer the query.
5. [Optional] Customization
Please refer to our tutorial for how to customize ReAct to your use case.
References
[1] A survey on large language model based autonomous agents: Paitesanshi/LLM-Agent-Survey
[2] ReAct: https://arxiv.org/abs/2210.03629
[3] Tool Tutorial: https://lightrag.sylph.ai/tutorials/tool_helper.html