ReAct: Synergizing Reasoning and Acting in Language Models
“ReAct: Synergizing Reasoning and Acting in Language Models” paper summary.
A groundbreaking paper widely regarded as a founding work of agentic LLMs, “ReAct: Synergizing Reasoning and Acting in Language Models” introduces a novel framework that empowers large language models (LLMs) to solve complex tasks with greater accuracy and human-like intuition. The proposed “ReAct” framework enables LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, leading to a powerful synergy between thought and action.
The Core Concept: A Fusion of Thought and Action
At its heart, ReAct allows a language model to not only “think” about a problem but also to “act” upon its thoughts by interacting with external tools and environments. This is a significant departure from previous approaches that treated reasoning and acting as separate functions. With ReAct, the model can create and adjust plans, track its progress, and even handle unexpected situations by dynamically reasoning about the task at hand. Simultaneously, it can gather new information from external sources, such as a knowledge base like Wikipedia, to inform its reasoning process.
This interleaved process of reasoning and acting allows the model to perform dynamic reasoning to create, maintain, and adjust high-level plans for acting (reason to act), while also interacting with external environments to incorporate additional information into reasoning (act to reason).
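To make the interleaving concrete, an illustrative trace (constructed for this summary, not taken verbatim from the paper) might look like this, with each thought grounding the next action and each observation feeding back into the reasoning:

```
Question: What is the capital of the country where the Eiffel Tower is located?
Thought 1: I need to find which country the Eiffel Tower is in, then name that country's capital.
Action 1: search[Eiffel Tower]
Observation 1: The Eiffel Tower is a wrought-iron lattice tower on the Champ de Mars in Paris, France...
Thought 2: The Eiffel Tower is in France, and the capital of France is Paris.
Action 2: finish[Paris]
```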
How ReAct Works: An Interleaved Process
The ReAct framework guides an LLM to produce a sequence of thoughts, actions, and observations. This step-by-step process can be broken down as follows:
- Thought: The model analyses the current situation and generates a reasoning trace, outlining its understanding of the problem and a plan to solve it.
- Action: Based on its thought process, the model takes a specific action, such as searching a database or interacting with a simulated environment.
- Observation: The model receives feedback from its action, which could be new information or a change in the environment’s state.
This cycle of thought-action-observation continues until the task is completed, allowing the model to adapt its strategy based on new information and the outcomes of its actions.
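As a minimal sketch of that cycle in Python, assuming a hypothetical `llm()` text-completion call and a hypothetical `run_action()` tool executor (neither is the paper's actual implementation):

```python
# Minimal sketch of a ReAct-style thought-action-observation loop.
# `llm` and `run_action` are hypothetical placeholders, not real APIs.

def react_loop(question: str, max_steps: int = 7):
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        # Thought + Action: ask the model to continue the trace up to the next observation.
        step = llm(prompt, stop=["Observation:"])  # hypothetical LLM call
        prompt += step

        # Parse the chosen action, e.g. "Action: search[Eiffel Tower]".
        action = step.split("Action:")[-1].strip()
        if action.startswith("finish["):
            return action[len("finish["):-1]  # task completed, return the answer

        # Observation: execute the action and feed the result back into the prompt.
        observation = run_action(action)  # hypothetical tool/environment executor
        prompt += f"Observation: {observation}\n"
    return None  # no answer within the step budget
```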
Outperforming the Status Quo
The paper demonstrates ReAct’s effectiveness across a variety of challenging tasks, including knowledge-intensive reasoning and interactive decision-making.
In tasks like question answering (HotpotQA) and fact verification (Fever), ReAct was shown to overcome common issues faced by other models, such as “hallucination” (generating false information) and error propagation. By interacting with a Wikipedia API, ReAct produced more factual and trustworthy results. While the Chain-of-Thought (CoT) method, which focuses solely on reasoning, can be prone to factual inaccuracies, ReAct’s ability to ground its reasoning in external information leads to more reliable outcomes.
The research found that combining ReAct with CoT often yields the best results, leveraging both internal knowledge and externally gathered information.
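One way this combination can be realised is the ReAct-to-CoT fallback the paper describes: if the ReAct trajectory fails to finish within its step budget, back off to Chain-of-Thought with self-consistency. A sketch, reusing `react_loop` from above and a hypothetical `cot_self_consistency` helper that majority-votes over sampled CoT answers:

```python
# Hedged sketch of a ReAct -> CoT self-consistency fallback.
# `cot_self_consistency` is a hypothetical helper, not part of the paper's released code.

def answer_with_fallback(question: str) -> str:
    answer = react_loop(question)  # grounded, tool-using attempt first
    if answer is None:             # ReAct ran out of steps without a finish[] action
        answer = cot_self_consistency(question, num_samples=21)
    return answer
```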
In the realm of decision-making, ReAct was tested on complex interactive environments like ALFWorld (a text-based game) and WebShop (a simulated online shopping environment). In these scenarios, ReAct significantly outperformed existing imitation and reinforcement learning methods. For instance, in ALFWorld, ReAct achieved an absolute success rate improvement of 34%, and in WebShop, a 10% improvement, often with very few examples to learn from. Without the reasoning component, the action-only models struggled to decompose goals into smaller steps and keep track of their progress.
Key Advantages of the ReAct Framework
The ReAct framework offers several key benefits:
- Improved Performance and Robustness: ReAct has demonstrated strong performance and generalisation on a diverse set of tasks, consistently outperforming models that rely on reasoning or acting alone.
- Enhanced Interpretability and Trustworthiness: By making the model’s reasoning process transparent, ReAct allows humans to understand how the model arrived at its decisions, increasing trust and making it easier to diagnose errors.
- Greater Flexibility and Adaptability: The framework is designed to be general and can be applied to a wide range of tasks with different action spaces and reasoning requirements.
- Human-like Problem Solving: The interleaved nature of reasoning and acting mirrors how humans approach complex problems, making the model’s behaviour more intuitive and controllable.
Reference
Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629v3 [cs.CL]. https://arxiv.org/pdf/2210.03629
