
AAAI Spring Symposium Series (SSS-24)

Toward Autonomy: Metacognitive Learning for Enhanced AI Performance


Brendan Conway-Smith, Robert L. West
Carleton University
brendan.conwaysmith@carleton.ca, robert.west@carleton.ca

Abstract

Large Language Models (LLMs) lack robust metacognitive learning abilities and depend on human-provided algorithms and prompts for learning and output generation. Metacognition involves processes that monitor and enhance cognition. Learning how to learn (metacognitive learning) is crucial for adapting and optimizing learning strategies over time. Although LLMs possess limited metacognitive abilities, they cannot autonomously refine or optimize these strategies. Humans possess innate mechanisms for metacognitive learning that enable at least two unique abilities: discerning which metacognitive strategies are best, and automatizing learning strategies. These processes have been effectively modeled in the ACT-R cognitive architecture, providing insights on a path toward greater learning autonomy in AI. Incorporating human-like metacognitive learning abilities into AI could potentially lead to the development of more autonomous and versatile learning mechanisms, as well as improved problem-solving capabilities and performance across diverse tasks.

Introduction

Currently, Large Language Models (LLMs) do not possess a robust set of self-directed learning abilities, and rely on human-designed algorithms, training data, and prompts to learn and generate outputs. For an LLM to become adept at metacognitive learning would require significant advancements in AI, including the development of AI systems capable of self-modification, self-assessment, and autonomous strategy development.

Metacognition is an array of cognitive processes that monitor and guide ordinary cognition in order to improve its functioning (Flavell 1979). For example, a student can recognize that they learn better when they study in the morning instead of the evening. Generally, we think of metacognition as conscious, deliberate effort to control and enhance cognitive processes; however, practicing a metacognitive strategy can result in it becoming automatic (Conway-Smith, West, and Mylopoulos 2023). For example, a student's choice to work in the morning can become an automatic habit.

An effective way to model metacognitive automatization is through the proceduralization mechanism in ACT-R, which takes explicit strategies stored in declarative memory and compiles them into automatic productions in procedural memory, making them faster and unconscious (Anderson 2013). Importantly, once compiled, the automatic productions (which do not consult declarative memory) compete with the original productions (which do consult declarative memory). This competition is significant, as productions must prove their utility over time and can be unlearned (via time-delayed learning algorithms). The proceduralization mechanism has been tested in experiments focused on learning strategies and shows speed-up curves similar to those of humans (Anderson et al. 2019). However, we are unaware of any broader testing of this mechanism in situations where multiple strategies compete and their effectiveness varies over time or conditions. In theory, this mechanism should prevent automatization except in cases where it is reliably effective.

The process of proceduralization in ACT-R requires that strategies have been stored in declarative memory. This maps most directly onto humans who have decided that a strategy is useful, memorized it, and then practiced it to make it automatic, which is an example of metacognition (i.e., cognition designed to influence, control, or improve cognition).

Here it is important to distinguish between learning to do a particular task better and learning how to learn a task better. Learning how to learn is metacognitive learning. Metacognitive learning produces knowledge about different types of learning strategies and where they are best applied (Van Velzen 2015). While humans can ordinarily perform metacognitive learning, it takes practice to become skilled. An example is a research scientist with expert knowledge of strategies for learning about different topics in their field.

Metacognitive Learning in LLMs

In the case of LLMs, prompt engineering can be considered a type of metacognition that is provided by humans. Prompts are explicit instructions that are intended to direct computational processes during the completion of some task.

____________________________________
Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
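The proceduralization and utility-competition mechanism described in the Introduction can be illustrated with a minimal sketch. All names here are hypothetical, and this is a deliberate simplification of ACT-R's actual subsymbolic machinery, not its implementation:

```python
class Production:
    """A strategy as a production rule. Declarative-backed productions are
    slow (they retrieve the strategy from memory); compiled productions are
    fast and automatic. Both carry a learned utility estimate."""
    def __init__(self, name, compiled=False, utility=0.0):
        self.name = name
        self.compiled = compiled
        self.utility = utility

def proceduralize(declarative):
    """Production compilation: derive an automatic production from a
    declarative strategy. It inherits the current utility estimate and
    must then prove itself in competition with the original."""
    return Production(declarative.name + "-compiled", compiled=True,
                      utility=declarative.utility)

def update_utility(production, reward, alpha=0.2):
    """Time-delayed utility learning: nudge the estimate toward the
    observed reward. Productions that stop paying off lose utility and
    are effectively unlearned."""
    production.utility += alpha * (reward - production.utility)

def select(productions):
    """Conflict resolution: the highest-utility production fires."""
    return max(productions, key=lambda p: p.utility)

# The compiled production comes to dominate only if reliably rewarded.
slow = Production("rehearse-aloud", utility=0.5)
fast = proceduralize(slow)
for _ in range(20):
    update_utility(fast, reward=1.0)   # automatic version keeps succeeding
    update_utility(slow, reward=0.4)   # retrieval version pays off less
assert select([slow, fast]) is fast
```

Because utilities are updated continuously, a compiled production that stopped being rewarded would drift back below its declarative competitor, which is one way to read the point above that automatization should occur only where a strategy is reliably effective.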

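A prompt engineer's monitor-and-adjust cycle, which this section likens to metacognitive learning, can be sketched as a simple strategy-selection loop. Everything here is a stand-in: `ask_llm` and `score` are hypothetical placeholders for a real model call and a real output evaluator.

```python
def refine_prompt(task, prompt_strategies, ask_llm, score, rounds=3):
    """Monitor-and-adjust loop: try each candidate prompting strategy,
    score the output it elicits, and accumulate a utility for each.
    The strategy that proves most useful for this class of task wins."""
    utilities = {p: 0.0 for p in prompt_strategies}
    for _ in range(rounds):
        for p in prompt_strategies:
            answer = ask_llm(p.format(task=task))
            utilities[p] += score(answer)  # reward based on output quality
    return max(prompt_strategies, key=lambda p: utilities[p])

# Toy stand-ins: a fake "LLM" that answers better when told to work in
# steps, and a scorer that counts how much stepwise structure appears.
def fake_llm(prompt):
    return "step one, then step two" if "step by step" in prompt else "an answer"

def stepwise_score(answer):
    return answer.count("step")

best = refine_prompt(
    "summarize the paper",
    ["Answer directly: {task}", "Work step by step: {task}"],
    fake_llm, stepwise_score)
assert best == "Work step by step: {task}"
```

A metacognitive-learning version of this loop would go a step further and also revise how utilities are updated, that is, learn which monitoring strategy works, not merely which prompt.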
In addition to providing the initial prompt, a human prompt engineer monitors the LLM's responses and adjusts the prompts on an ongoing basis to improve the output. This is very much like a human engaged in metacognitive learning: they begin with a strategy and then monitor and adjust it as they move forward. In the example of the research scientist, when exploring a novel problem they may begin with one research method but alter it when monitoring identifies it as not the best choice. By generalizing what is learned and how, the scientist understands which research strategies elicit useful results for a specific class of problem. Similarly, a prompt engineer learns which prompting techniques elicit the most useful results for a given class of interaction with the LLM.

We argue that LLMs such as ChatGPT, which have been trained on extensive databases, have some limited metacognitive abilities. Specifically, if an LLM has absorbed enough of the right content, it is able to output metacognitive suggestions as part of its answers. For example, when ChatGPT was asked for advice on writing a scientific abstract, it responded (among other things):

"Maintain a clear focus on your objective and what you want to communicate. This helps to organize your thoughts and present your findings coherently."

"Approach your writing with an organized plan. Decide on the structure of your abstract beforehand and allocate your focus accordingly."

In this sense, LLMs motivate a question that has been ignored in the ACT-R approach, i.e., where do metacognitive strategies come from in the first place? In this example, one could argue that ChatGPT simply reproduced strategies that were in its training set. Alternatively, one could argue that ChatGPT completely understands that metacognitive suggestions should be part of the answer. Either way, it is able to supply some forms of metacognitive strategies. This is not entirely different from humans, who mainly receive metacognitive strategies from other sources (e.g., teachers, books) and store them in declarative memory for later use.

However, beyond this, LLMs are limited in their ability to employ metacognition effectively. While a metacognitive strategy may be included in a prompt, this inclusion is not the same as actually applying the strategy. The prompt would allow the strategy to moderately influence the process, but not to guide it. Furthermore, LLMs cannot perform the two types of metacognitive learning that we described above: automatization and learning which metacognitive strategies are best. Both of these are important and related to each other. Metacognitive automatization compares and seeks the best metacognitive strategy using reinforcement learning. This method could also be used to find the best metacognitive learning strategy, but it would require more overhead. Also, we should consider whether or not the automatization learning algorithm is open to modification through metacognitive practice. A simple example of this would be learning to occasionally check whether an already automatized procedure is still the best, something that humans are capable of but often struggle with. This is common in Cognitive Behavioural Therapy, where therapists will encourage clients to periodically re-evaluate their coping strategies to determine their current effectiveness (Beck 2020).

An important component of almost all cognitive architectures (Laird, Lebiere, and Rosenbloom 2017) is the separation of associative learning in declarative memory and reinforcement learning in procedural memory. The most direct way to implement this type of system would be to treat the LLM as declarative memory and implement prompt engineering in procedural memory. Procedural memory (in most architectures) represents the task using graph structures. Hence, some method to convert language outputs to graph-based outputs would be needed. To some degree, LLMs are already capable of this, and are able to interpret graph-based code as well. However, unlike LLMs, which operate from a bottom-up approach, cognitive architectures, like humans, can exert strong top-down control, effectively mimicking an expert system. This is also reflected in learning, as procedural memory rewards are largely based on achieving task goals.

We argue that equipping LLMs with human-like metacognitive capabilities would require a distinct procedural module embodying the characteristics we've outlined. Such a module would allow for the ongoing refinement of internal strategies for prompt optimization, promoting autonomous learning and the creation of more effective prompts. This could be used to augment human-provided prompts or enable self-prompting, allowing the LLM to better act as an independent agent. Metacognition is particularly important for these agents, as it facilitates the self-monitoring and self-correction necessary for addressing safety and ethical issues.

References

Anderson, J. R. 2013. Cognitive Skills and Their Acquisition. Psychology Press.

Anderson, J.; Betts, S.; Bothell, D.; Hope, R.; and Lebiere, C. 2019. Learning rapid and precise skills. Psychological Review, 126(5), 727.

Beck, J. S. 2020. Cognitive Behavior Therapy: Basics and Beyond. Guilford Publications.

Conway-Smith, B.; West, R. L.; and Mylopoulos, M. 2023. Metacognitive skill: how it is acquired. In Proceedings of the 45th Annual Conference of the Cognitive Science Society.

Flavell, J. 1979. Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906.

Laird, J. E.; Lebiere, C.; and Rosenbloom, P. S. 2017. A standard model of the mind: Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics. AI Magazine, 38(4), 13-26.

Van Velzen, J. 2015. Metacognitive Learning. New York, NY: Springer International Publishing.
