Evolutionary Agent (Evolving Agent)
Figure 6-8 shows an evolutionary (evolving) agent. The set of actions the agent can perform is $A$. Each action, when taken (do function), causes the environment to change to another state ($A \times S \to S$). During each cycle, the environment exists in a state ($s \in S$); however, the agent can detect only one of a set of partitions (groups of states) of $S$, called $T$. The agent perceives the environment (see function) and detects which partition ($t \in T$) the environment is in ($S \to T$). The agent modifies its internal representation (model function) based on its new perceptions, its knowledge of the world, its goals, and its utility values ($M \times G \times K \times U \times T \to M$). Having perceived the environment and updated its internal model, the agent selects an action ($a \in A$) (action function) using the internal models, goals, knowledge, utility values, and its perceptions ($M \times G \times K \times U \times T \to A$). The agent then performs the selected action (do function), causing the environment to transition to a new state ($A \times S \to S$). The agent's knowledge, $K$, can be modified by the agent (learn function). Pieces of knowledge can be added, removed, or modified ($M \times G \times K \times U \times T \to K$). The alter function allows the agent to change its goals over time ($M \times G \times K \times U \times T \to G$). The assess function allows the agent to adjust its utility values ($M \times G \times K \times U \times T \to U$).
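The cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture's definitive form: the class name `EvolvingAgent`, the method name `cycle`, the toy integer environment, and the order in which learn, alter, and assess are applied within a cycle are all assumptions made for the example. Each of the five internal functions receives the same arguments (M, G, K, U, T), mirroring the signatures in the text.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Every internal function maps (M, G, K, U, T) to one updated component.
Fn = Callable[[Any, Any, Any, Any, Any], Any]


@dataclass
class EvolvingAgent:
    M: Any                     # internal model
    G: Any                     # goals
    K: Any                     # knowledge
    U: Any                     # utility values
    see: Callable[[Any], Any]  # S -> T
    model: Fn                  # M x G x K x U x T -> M
    action: Fn                 # M x G x K x U x T -> A
    learn: Fn                  # M x G x K x U x T -> K
    alter: Fn                  # M x G x K x U x T -> G
    assess: Fn                 # M x G x K x U x T -> U

    def cycle(self, s, do):
        """Run one cycle and return the environment's next state."""
        t = self.see(s)                                       # perceive
        self.M = self.model(self.M, self.G, self.K, self.U, t)
        # Assumption: learn, alter, and assess all read the same inputs
        # and are applied together before an action is selected.
        k = self.learn(self.M, self.G, self.K, self.U, t)
        g = self.alter(self.M, self.G, self.K, self.U, t)
        u = self.assess(self.M, self.G, self.K, self.U, t)
        self.K, self.G, self.U = k, g, u
        a = self.action(self.M, self.G, self.K, self.U, t)    # select
        return do(a, s)                                       # A x S -> S


# Toy environment (hypothetical): the state is an integer, but the agent
# perceives only which of three partitions the state falls in.
def see(s):
    return "neg" if s < 0 else "pos" if s > 0 else "zero"


def do(a, s):
    return s + a


agent = EvolvingAgent(
    M={"last": None},
    G={"target": "zero"},
    K={"neg": +1, "pos": -1, "zero": 0},  # which action suits each percept
    U={"zero": 1.0},
    see=see,
    model=lambda M, G, K, U, t: {"last": t},
    action=lambda M, G, K, U, t: K[t],
    learn=lambda M, G, K, U, t: K,   # identity: nothing learned in this toy
    alter=lambda M, G, K, U, t: G,   # identity: goals unchanged
    assess=lambda M, G, K, U, t: U,  # identity: utilities unchanged
)

s = 3
for _ in range(5):
    s = agent.cycle(s, do)
```

In this toy run the learn, alter, and assess functions are identities, so the agent behaves like a simpler, non-evolving agent; replacing any of them with a function that returns a changed K, G, or U is what makes the agent evolve.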
Fig. 6-8: An evolving agent.
The ability of an agent to change its utility values and goals gives it further ability to adapt to real-world situations and achieve highly robust behavior. For example, the goal of "recharge batteries" for a bird-photographing agent would have a fairly low utility value when the battery level is high. In such a condition, the agent will proceed with its normal activity of detecting and photographing birds. However, as the battery level drops, the agent can change the utility value of "recharge batteries," thereby changing the goal's degree of importance to the agent. Changing the utility value to an intermediate level might cause the agent to begin navigating toward a recharging station but also continue with its bird detection and photographing tasks along the way. As the battery level gets critically low, the utility value of "recharge batteries" can be changed to a level that makes achieving that goal the most important thing to the agent, even surpassing bird detection and photographing. The agent in this condition will stop trying to detect and photograph birds and perform only actions relevant to reaching the recharging station.
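The battery example can be made concrete with a small sketch of an assess-style function. All numbers here are hypothetical: the thresholds (0.5 and 0.2), the utility levels (0.1, 0.5, 1.0), the fixed utility of 0.6 for photographing, and the rule that a goal is pursued when its utility lies within 0.2 of the highest utility are assumptions chosen only to reproduce the three behaviors described above.

```python
PHOTO_UTILITY = 0.6  # hypothetical fixed utility of "photograph birds"
MARGIN = 0.2         # hypothetical: pursue goals within this margin of the top


def assess_recharge(battery_level):
    """Map a battery level in [0, 1] to the utility of 'recharge batteries'."""
    if battery_level > 0.5:
        return 0.1   # battery high: recharging is unimportant
    if battery_level > 0.2:
        return 0.5   # battery dropping: intermediate importance
    return 1.0       # battery critical: recharging dominates


def active_goals(battery_level):
    """Return the goals the agent pursues at this battery level."""
    utilities = {
        "photograph birds": PHOTO_UTILITY,
        "recharge batteries": assess_recharge(battery_level),
    }
    top = max(utilities.values())
    return sorted(g for g, u in utilities.items() if u >= top - MARGIN)


high = active_goals(0.9)      # only photographing
middle = active_goals(0.4)    # both goals pursued together
critical = active_goals(0.1)  # only recharging
```

A step function keeps the example easy to follow; a smooth mapping from battery level to utility would serve the same purpose.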
Likewise, the ability to alter its own goals gives an agent a degree of autonomy not available in other architectures. For example, as conditions change for our bird-photographing agent, the agent could establish new goals such as "photograph bears," "count birds," or "classify and count birds." By acquiring new knowledge, setting new goals for itself, and adjusting utility values, the agent evolves new behaviors.
Indeed, this is a picture of a human. When born, a baby possesses a certain set of abilities but immediately begins learning and continuously does so for the rest of its life. Basic goals such as "seek food," "seek comfort," and "seek safety" are established quite early, but humans evolve other goals as time passes such as "graduate from college" and "buy a car." Over time, evolving agents, human or artificial, evolve into individual entities each with a different set of goals, utility values, and knowledge.
In this book, we are concerned with artificial entities, cogs, able to perform at or above human-expert levels. Later, we will argue cogs should be evolving agents as described in this architecture. We will introduce a more detailed architecture—our model of expertise—but at its core, that architecture will be equivalent to the architecture of an evolving agent introduced here.
The cognitive architectures presented so far in this chapter are collectively called the "formal models." The next series of cognitive architectures is based on studying human cognition. We call this group the "humanistic models." The most successful cognitive architecture in this group is the Soar architecture. However, Soar has evolved to incorporate elements from several other architectures, including EPIC, ACT-R, and CLARION.
The Executive Process-Interactive Control (EPIC) architecture, shown in Fig. 6-9, was developed in the 1990s by David E. Kieras and David E. Meyer at the University of Michigan (Kieras and Meyer, 1997). Compare this architecture with the formal models described above. EPIC has the same basic perceive-reason-act structure. The "perceive" function is broken down into auditory, visual, and tactile processing. The "act" function is broken down into vocal, ocular, and manual processing. Since EPIC is derived from studying human cognition, it is not surprising these are the only sensor and effector categories. The perceptual processors perform the "see" function depicted in the formal models and the motor processors perform the "act" function depicted in the formal models.
Fig. 6-9: The EPIC cognitive architecture.
Reasoning is done in the cognitive processor. In EPIC, all cognition is done by executing production rules. Production rules are stored in the production memory and are retrieved into the rule interpreter when needed. Information needed to execute the production rules comes from two sources: long-term memory and short-term memory. Some information is retained over significant lengths of time in long-term memory. Presumably, this would include general knowledge (K in the formal models) and could include goals (G), utility values (U), and models (M), but these are not expressly represented in the EPIC architecture.
Information in short-term memory (working memory) comes from the perceptual processors and represents the agent's perceptions about the environment (T in the formal models). Production rule execution places more information in working memory, which is used by the motor processors to carry out actions (A in the formal models).
Note that EPIC does not include learning. Knowledge is encoded a priori into the production rules and long-term memory, but there is no mechanism for updating this knowledge or creating new production rules.
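To illustrate how production rules, working memory, and motor output fit together, here is a minimal sketch of a rule interpreter. It is not EPIC's actual implementation: the `Rule` class, the `run_rules` function, the string-encoded facts, and the choice to fire every matching rule on each cycle are simplifying assumptions. The rule set is fixed up front, reflecting the absence of learning noted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # facts that must all be in working memory
    additions: frozenset   # facts placed in working memory when the rule fires


def run_rules(rules, working_memory, max_cycles=10):
    """Fire matching rules until no rule adds anything new."""
    wm = set(working_memory)
    for _ in range(max_cycles):
        new_facts = set()
        for rule in rules:
            if rule.conditions <= wm:            # all conditions satisfied
                new_facts |= rule.additions - wm
        if not new_facts:
            break                                # nothing changed: stop
        wm |= new_facts
    return wm


# Hypothetical rules for the bird-photographing example.
rules = [
    Rule("detect", frozenset({"visual:bird"}), frozenset({"goal:photograph"})),
    Rule("shoot", frozenset({"goal:photograph"}),
         frozenset({"motor:press-shutter"})),
]

# A percept enters working memory from a perceptual processor.
wm = run_rules(rules, {"visual:bird"})

# The motor processors would read their commands out of working memory.
motor_commands = sorted(f for f in wm if f.startswith("motor:"))
```

When no rule matches the percept, working memory is returned unchanged and no motor command is produced.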
The Adaptive Control of Thought-Rational (ACT-R) architecture, shown in Fig. 6-10, was developed by John R. Anderson and Christian Lebiere at Carnegie Mellon University (Anderson, 2013). Work leading to ACT-R began in the 1970s, influenced by Allen Newell and the idea of developing a unified theory of cognition by studying human cognition.
Fig. 6-10: The ACT-R cognitive architecture.
Like EPIC and the formal models, ACT-R features a basic perceive-reason-act structure. However, ACT-R's perception is depicted as only a visual module. Presumably, other modules could be added as needed to cover other forms of sensory capabilities. The "act" function is represented by the motor module and likewise could be expanded to include other effector elements as needed. Also, like EPIC, ACT-R maintains a set of production rules which feed a pattern matching and production rule execution module. Buffers serve as ACT-R's short-term, or working, memory. Information from the perceptual module (T in the formal models) is fed into the buffers and used in production rule execution.
ACT-R assumes knowledge is divided into two types of representations: declarative and procedural. Declarative knowledge (e.g., "Washington, D.C. is the capital of the United States") is represented in the form of chunks (structured collections of labeled data). Chunks are accessible through buffers and may be stored in declarative memory. Procedural memory consists of production rules representing knowledge about how to do things (e.g., how to type the letter "Q" on a keyboard). Together, declarative and procedural knowledge represent the agent's knowledge (K in the formal models).
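The split between declarative chunks and procedural rules can be sketched as follows. This is not the ACT-R software's API; `Chunk`, `retrieve`, `answer_capital`, and the buffer names are hypothetical, and retrieval here is a plain exact match on slot values, a deliberate simplification for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    slots: dict  # labeled data, e.g. {"isa": "capital", "country": ...}


def retrieve(declarative_memory, **pattern):
    """Return the first chunk whose slots match every pattern pair, or None."""
    for chunk in declarative_memory:
        if all(chunk.slots.get(k) == v for k, v in pattern.items()):
            return chunk
    return None


# Declarative knowledge: facts stored as chunks.
declarative_memory = [
    Chunk("fact-1", {"isa": "capital", "country": "United States",
                     "city": "Washington, D.C."}),
    Chunk("fact-2", {"isa": "capital", "country": "France", "city": "Paris"}),
]


# Procedural knowledge: a production whose condition tests the goal buffer
# and whose action places a retrieved chunk into the retrieval buffer.
def answer_capital(buffers, memory):
    goal = buffers.get("goal")
    if goal is not None and goal.slots.get("isa") == "find-capital":
        buffers["retrieval"] = retrieve(
            memory, isa="capital", country=goal.slots["country"])
        return True   # the production fired
    return False      # condition not met


buffers = {"goal": Chunk("goal-1", {"isa": "find-capital",
                                    "country": "United States"})}
fired = answer_capital(buffers, declarative_memory)
answer = buffers["retrieval"].slots["city"]
```

The production never touches declarative memory directly; it communicates only through the buffers, which is the role the text assigns to them as working memory.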
The Connectionist Learning with Adaptive Rule Induction On-Line (CLARION) cognitive architecture, shown in Fig. 6-11, was developed in the early 2000s by a research group led by Ron Sun at the Rensselaer Polytechnic Institute (Sun, 2002).
Fig. 6-11: The CLARION cognitive architecture.
CLARION makes a distinction between implicit and explicit knowledge (corresponding to K in the formal models). Implicit knowledge is gained through experience and incidental activities (e.g., learning how to ride a bicycle). Explicit knowledge is knowledge easily represented and transmitted to others (e.g., Earth's atmosphere is 21% oxygen). Note that learning is embedded in the architecture, so the agent's implicit and explicit knowledge can change over time.
The other distinction is between action-centered and non-action-centered activities. The Action-Centered Subsystem (ACS) is where all of the agent's actions (both instinctive and learned) are stored (A in the formal models). The Non-Action-Centered Subsystem (NACS) maintains general knowledge (K in the formal models). Some knowledge is semantic knowledge (general knowledge statements about the world) and some is episodic knowledge (knowledge about specific situations).
The Motivational Subsystem (MS) is a unique feature of CLARION, providing the agent with motivation for perception, cognition, and action. These motivations correspond to goals and utility values in the formal models (G and U). Drives include low-level motivations (e.g., hunger, avoidance) and high-level motivations (e.g., affiliation, fairness).
Another unique feature is the Meta-Cognition Subsystem (MCS). MCS monitors and directs the operations in the other three subsystems.
Learning is an example of this. Reinforcement learning allows the agent to modify its knowledge. Both actions and non-actions can be learned. The agent can also set/modify its own goals. Therefore, CLARION has elements corresponding to the alter, assess, and learn functions found in the evolutionary agent formal model.
CLARION also differs from the other architectures in this group because it employs connectionism. Connectionism is a branch of artificial intelligence using artificial neural networks (ANNs) to describe and replicate intelligence (McCulloch and Pitts, 1943; Hebb, 1949; Medler, 1998). Connectionism is inspired by the structure of the human brain and is based on interconnected networks of simple units (like neurons in the human brain). When such a network is presented with a stimulus (some form of input information), it causes signals to flow from unit to unit, regulated by weighting factors on each connection. The network's response can be fine-tuned by changing the weights on the connections. Over several trials, the response of the network can be tuned to be different for different stimuli. After training, when presented with a similar stimulus, an ANN can determine the kind of stimulus by comparing its response to the responses formed by the training set. This gives connectionist networks a great deal of robustness in dealing with unstructured and partial data, which historically have caused problems for production rule systems.
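The idea of tuning a network's response by adjusting connection weights can be shown with the smallest possible case: a single unit with two weighted inputs. The sketch below uses the classic perceptron learning rule, chosen here only as an illustration; it is not CLARION's learning mechanism, and the learning rate, epoch count, and training set (the logical OR of two inputs) are assumptions.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train(samples, epochs=20, lr=0.1):
    """Adjust the weights over several trials using the perceptron rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            # Strengthen or weaken each connection in proportion to its input.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Training set: each stimulus is paired with the desired response (OR).
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(samples)

# After training, the unit reproduces the desired responses.
outputs = [predict(weights, bias, x) for x, _ in samples]

# A stimulus that is similar to, but not exactly, a training example
# still produces the response of its nearest neighbor.
noisy_response = predict(weights, bias, (0.9, 0.0))
```

A single unit can separate only linearly separable stimuli; the networks the text has in mind connect many such units, but the principle of shaping the response through weight changes is the same.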