Meet SELF-DISCOVER: Google DeepMind’s New Method for LLM Reasoning

The new technique addresses some of the limitations of existing reasoning methods such as chain of thought.

Jesus Rodriguez
6 min read · Feb 12, 2024


Reasoning continues to evolve as one of the most fascinating areas in generative AI, with research papers pushing the boundaries of our imagination. Chain of thought (CoT), tree of thought (ToT), and System 2 are some of the recent LLM reasoning techniques that explore the ability of LLMs to break down complex problems. Recently, researchers from Google DeepMind published a paper outlining SELF-DISCOVER, a somewhat novel take on LLM reasoning.

As mentioned before, there is no lack of reasoning methods in the LLM space, but DeepMind’s seems to have been inspired by the way humans tackle reasoning problems. The researchers looked at methods like few-shot and zero-shot chain-of-thought prompting, which mimic the human approach of solving problems step by step. Another method, decomposition-based prompting, draws from the human ability to break down complex problems into smaller, manageable parts and then address each part individually. Additionally, they explored step-back prompting, which reflects on the nature of the task to derive general principles for solving it.

Despite the effectiveness of these techniques, there are some limitations that jump off the page. Each method acts as an isolated reasoning module, assuming a one-size-fits-all approach to problem-solving. The DeepMind researchers argue instead that each task has its own unique structure, and that this structure should guide the reasoning process for more efficient problem-solving. For example, they note that least-to-most prompting is significantly more effective than chain-of-thought for tasks that involve symbolic manipulation, owing to the inherent structure of these tasks.

Image Credit: Google DeepMind


SELF-DISCOVER is inspired by the human ability to devise an internal reasoning plan for solving problems. This approach allows an LLM to compose a reasoning structure tailored to the specific task at hand without relying on predefined labels.

SELF-DISCOVER operates in two fundamental stages:

Image Credit: Google DeepMind

1) Stage 1

At the task level, it uses a set of actions to guide the LLM in generating a task-specific reasoning structure. This involves identifying and organizing atomic reasoning modules described in natural language, such as “breakdown into subtasks” and “critical thinking”.

The initial phase of SELF-DISCOVER is dedicated to meta-reasoning, where the goal is to reveal the task-specific reasoning structure. DeepMind employs a methodical strategy using three meta-prompts to assist LLMs in selecting, adapting, and applying a coherent reasoning framework without the need for labels or extensive training. The chosen structure is organized in a key-value pair format, akin to JSON, to enhance interpretability and the quality of reasoning and generation. This formatting decision is based on observations that structured data formats can significantly improve the performance of LLMs.
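To make the key-value idea concrete, here is a minimal sketch of what such a reasoning structure might look like in Python. The keys below are invented for illustration and are not taken from the paper; the helper names `render_structure` and `unfilled_keys` are likewise hypothetical.

```python
import json

# Hypothetical reasoning structure: each key is a reasoning step described in
# natural language, and each value is left empty for the model to fill in.
# These keys are illustrative only, not the ones used in the paper.
reasoning_structure = {
    "Identify the quantities given in the problem": "",
    "Break the problem into sub-tasks": "",
    "Solve each sub-task step by step": "",
    "Final answer": "",
}


def render_structure(structure: dict) -> str:
    """Serialize the plan as indented JSON so it can be pasted into a prompt."""
    return json.dumps(structure, indent=2)


def unfilled_keys(structure: dict) -> list:
    """Return the steps whose values have not been filled in yet."""
    return [key for key, value in structure.items() if not value.strip()]
```

Keeping the plan as plain JSON means the same text can be shown to the model and inspected by a person, which fits the interpretability argument above.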

In practice, SELF-DISCOVER’s first stage simplifies the task-solving process into three distinct actions:

1) Select: This involves choosing appropriate reasoning modules from a provided set that best match the problem-solving requirements of the task.

2) Adapt: The next step is to tailor the descriptions of these selected modules, making them more specific to the context and demands of the task at hand.

3) Implement: Finally, the customized reasoning descriptions are structured into an actionable plan. This structured plan is then used to guide the LLM in solving the task, with the model filling in each key to progress towards the final answer.
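The three actions above can be sketched as three chained prompts. This is a rough outline under stated assumptions, not DeepMind's implementation: `LLM` stands for any function that takes a prompt and returns text, the prompt wording is my own paraphrase rather than the paper's meta-prompts, and the function names are hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def select(llm: LLM, modules: Sequence[str], examples: Sequence[str]) -> str:
    """SELECT: ask the model which seed reasoning modules fit the task."""
    prompt = (
        "Select the reasoning modules that are most useful for this task.\n"
        "Modules:\n" + "\n".join(f"- {m}" for m in modules) + "\n"
        "Task examples:\n" + "\n".join(examples)
    )
    return llm(prompt)


def adapt(llm: LLM, selected: str, examples: Sequence[str]) -> str:
    """ADAPT: rephrase the selected modules so they are specific to the task."""
    prompt = (
        "Rephrase each selected module so it is specific to this task.\n"
        "Selected modules:\n" + selected + "\n"
        "Task examples:\n" + "\n".join(examples)
    )
    return llm(prompt)


def implement(llm: LLM, adapted: str, examples: Sequence[str]) -> str:
    """IMPLEMENT: turn the adapted descriptions into a key-value reasoning plan."""
    prompt = (
        "Turn the adapted modules into a step-by-step reasoning plan in JSON, "
        "with one key per step and empty values.\n"
        "Adapted modules:\n" + adapted + "\n"
        "Task examples:\n" + "\n".join(examples)
    )
    return llm(prompt)


def discover_structure(llm: LLM, modules: Sequence[str], examples: Sequence[str]) -> str:
    """Run Select, Adapt, Implement in order; each step feeds the next."""
    selected = select(llm, modules, examples)
    adapted = adapt(llm, selected, examples)
    return implement(llm, adapted, examples)
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client, and makes it easy to test with a stub.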

Image Credit: Google DeepMind

DeepMind’s approach ensures that SELF-DISCOVER only needs to be applied once per task at the task level, streamlining the problem-solving process. By utilizing the discovered reasoning structure, LLMs can efficiently tackle each instance of the task, following the structured plan to arrive at conclusive solutions.

2) Stage 2

The LLM then applies this self-discovered structure to solve individual task instances, leading to the final solution.
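A minimal sketch of this second stage, assuming the reasoning structure has already been produced once for the task and is available as a string. As before, `llm` is a hypothetical prompt-to-text callable and the prompt wording is my own, not the paper's.

```python
from typing import Callable, Iterable, List


def solve_instances(
    llm: Callable[[str], str],
    structure: str,
    instances: Iterable[str],
) -> List[str]:
    """Reuse one discovered reasoning structure for every instance of a task."""
    answers = []
    for instance in instances:
        # The same structure is embedded in every prompt; only the instance changes.
        prompt = (
            "Follow the reasoning plan below, filling in the value for each key, "
            "and then give the final answer.\n"
            "Reasoning plan:\n" + structure + "\n"
            "Task:\n" + instance
        )
        answers.append(llm(prompt))
    return answers
```

The point of the split is cost: the structure is discovered once per task, while this loop makes a single model call per instance.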

This two-stage process not only enhances the LLM’s problem-solving capabilities but also makes the reasoning process more coherent and aligned with the unique demands of each task. DeepMind’s approach exemplifies a significant step forward in making LLMs more adaptable and efficient in reasoning and problem-solving, mirroring the complexity and versatility of human cognition.

The Results

Google DeepMind’s SELF-DISCOVER has shown promising results in enhancing the reasoning abilities of cutting-edge language models like PaLM 2-L and GPT-4 across a broad array of reasoning tasks. The evaluation covered complex reasoning tasks in various domains, including BBH, T4D, and MATH, demonstrating SELF-DISCOVER’s effectiveness compared to traditional methods such as direct prompting, chain-of-thought (CoT), and Plan-and-Solve (PS).

Image Credit: Google DeepMind

Performance Enhancements

The analysis revealed that SELF-DISCOVER led to significant improvements in performance. Specifically, it outperformed the chain-of-thought and Plan-and-Solve approaches by 7% and 6% respectively on PaLM 2-L, with similar improvements observed on GPT-4. These enhancements were not limited to a single type of task but were evident across a diverse set of challenges, particularly those requiring broad world knowledge, such as sports understanding, movie recommendation, and the BBH "ruin names" task (picking the humorous edit that "ruins" a movie or artist name).

Analyzing Reasoning Processes

In a detailed examination of the reasoning process on a geometric shape task from BBH, the limitations of CoT and Plan-and-Solve became apparent. Both methods incorrectly concluded that the path did not form a regular shape, mistakenly identifying it as not closed. In contrast, SELF-DISCOVER’s approach was markedly different. It analyzed each line segment and its coordinates, then deduced that the path does form a closed shape because it returns to the starting coordinate. This methodical breakdown allowed SELF-DISCOVER to arrive at the correct conclusion.
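The check at the heart of that example, whether a path returns to its starting coordinate, is easy to state in code. The sketch below is only an illustration of the reasoning step, not part of SELF-DISCOVER itself. It assumes a simplified path string with a single `M` command followed by `L` commands and `x,y` pairs written without spaces around the comma; the function names and the sample path in the usage note are made up.

```python
import re
from typing import List, Optional, Tuple

Point = Tuple[float, float]

# Matches "x,y" pairs such as "31.00,73.00" or "-4,0".
_PAIR = re.compile(r"(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")


def parse_points(path: str) -> List[Point]:
    """Extract the (x, y) vertices from a path made of one M and several L commands."""
    return [(float(x), float(y)) for x, y in _PAIR.findall(path)]


def is_closed(points: List[Point], tol: float = 1e-9) -> bool:
    """A path is closed when its last vertex coincides with its first."""
    if len(points) < 3:
        return False
    (x0, y0), (xn, yn) = points[0], points[-1]
    return abs(x0 - xn) <= tol and abs(y0 - yn) <= tol


def count_sides(points: List[Point]) -> Optional[int]:
    """Number of line segments in a closed path, or None if the path is open."""
    return len(points) - 1 if is_closed(points) else None
```

For the made-up path `"M 0,0 L 4,0 L 4,3 L 0,0"`, `parse_points` yields four vertices, `is_closed` returns `True`, and `count_sides` returns `3`, so the shape is a triangle.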

Image Credit: Google DeepMind

Google DeepMind’s SELF-DISCOVER represents a significant step forward in the application of artificial intelligence to complex reasoning tasks. By meticulously breaking down tasks and applying a structured reasoning process, SELF-DISCOVER not only achieves higher accuracy but also provides a more intuitive and logical path to solving problems, much like the approach a human expert might take.


