Knowledge-based Multi-agent Coordination

Abstract

In many domains, intelligent agents must coordinate their activities in order for them to be successful both individually and collectively. Over the last ten years, research in distributed artificial intelligence has emphasized building knowledge-lean systems, where coordination emerges either from simple rules of behavior or from a deep understanding of general coordination strategies. In this paper we contend that there is an alternative for domains in which the types and methods of coordination are well structured (even though the environment may be very unstructured and dynamic). The alternative is to build real-time knowledge-based agents that have a broad, but shallow understanding of how to coordinate. We demonstrate the viability of this approach by example. Specifically, we have built agents that model the coordination performed by Navy and Air Force pilots and controllers in air-to-air and air-to-ground missions within a distributed interactive simulation environment. The major contribution of the paper is an examination of the requirements and approaches for supporting knowledge-based coordination, in terms of the structure of the domain, the agents' knowledge of the domain, and the underlying AI architecture.


1 Introduction

Intelligent agents often must coordinate their activities in order to achieve both individual and collective goals. Over the last ten years, research in distributed artificial intelligence has emphasized building knowledge-lean, domain-independent systems, where coordination emerges either from simple rules of behavior or from a deep understanding of general coordination and negotiation strategies. However, in some domains, knowledge-intensive approaches can have a distinct advantage because they can use pre-compiled, domain-specific organizational structures that provide coordination with minimal communication and runtime planning. Furthermore, if computer-based agents are to coordinate their activities with humans, who already have well-developed organizational structures and languages for communication, the agents must be able to understand those languages and behave appropriately within the organizational structures. Thus, the agents must have significant domain-specific knowledge. Our hypothesis is that just as knowledge-based systems became an alternative to early domain-independent search and planning methods within AI in general, knowledge-based systems may also be an alternative to current domain-independent collaboration techniques.

With a knowledge-based approach to collaboration, the agents may have a broad, but shallow understanding of how to coordinate. Thus, although an agent knows how to coordinate, it may not know why its particular action or method of collaboration is the best. (It just knows it is the appropriate action to take.) This alternative is only appropriate for domains in which there exist well-structured methods for coordination (even though the environment itself may be unstructured and dynamic). The advantage of this approach is runtime efficiency and improved coordination.
It eliminates many of the runtime negotiations that would be required of other methods, thereby reducing communication. It also eliminates the runtime deliberations that would be required of each agent deciding how to coordinate; instead the agents just know what to do, although "knowing what to do" may be dependent on an understanding of the current situation and the agent's goals. This approach should improve coordination because we can encode within our agents the results of more extensive deliberations and planning than would be possible at runtime. All of these are critical advantages in domains where communication bandwidth is limited and reaction to changes in the world must be as fast as possible. A serious question with this approach is whether it is tractable to obtain and encode the knowledge for reasonably complex and realistic domains.

In this paper we demonstrate the viability of the knowledge-based approach by example. Specifically, we have constructed AI agents that model the coordination performed by Navy and Air Force pilots and controllers in a variety of air-to-air and air-to-ground missions using existing Navy and Air Force tactics and doctrine. These agents are medium-size real-time intelligent systems (4800 rules), built using an existing AI architecture (Soar [Laird and Rosenbloom, 1990; Laird and Rosenbloom, 1994; Rosenbloom et al., 1991; Rosenbloom et al., 1993]). The architecture was not modified to support reasoning about coordination. Rather, all capabilities for coordination are implemented as problem-solving knowledge. The system in which these agents are constructed is called TacAir-Soar. An earlier version of TacAir-Soar, with only limited coordination capabilities, has been described by Tambe et al. (1995) and a preliminary discussion of coordination in TacAir-Soar was presented by Laird
et al. (1994).

This domain is an appropriate test bed for our thesis because there are existing procedures used by the military to coordinate behavior, with the intent to minimize deliberation and communication among Navy and Air Force personnel. The domain also requires a wide variety of coordination types and methods. The agents must coordinate their actions, sensors, internal state, goals, and the organization structure (flow of information and control), and the agents must use a variety of sources of knowledge (sensors, communication, shared knowledge, and shared goals) to coordinate their activities. As a result, our agents display a richness in coordination rarely seen in AI multi-agent systems. This domain also allows us to examine the requirements and approaches for supporting complex knowledge-based coordination. We analyze the requirements in terms of the domain structure, the organizational structure, the agents' knowledge of these structures, and the underlying AI architecture used to develop the agents.

2 The Domain: Military Aviation

The practical goal of our research is to produce AI systems whose behavior is tactically indistinguishable from humans for all Navy and Air Force missions, within a large-scale, entity-level, distributed simulation environment. The agents that we develop are to be used for training of command and control personnel, such as a human AWACS operator giving commands to our agents acting as fighters.
This general approach can also be used for training and mission rehearsal for humans flying missions with and against our agents. Currently, our systems fly a variety of Navy and Air Force missions, which serve as the basis for our analysis of knowledge-based coordination.

The specific simulation environment we are working in is DIS [steering committee, 1994], which is based on a network protocol for communication between independent, distributed simulators linked together via a network. The simulators can include human-controlled simulators, as well as computer-generated forces. All entities in the simulation run independently, in real time, over a collection of 3D real-world terrain cells.

Our agents include pilots of fighter, attack, tanker, reconnaissance, and early warning aircraft and a variety of ground- and ship-based controllers that provide mission and routing information to the planes. Each of our agents is an independent Soar system situated in its own virtual vehicle (such as an F/A-18), perceiving information similar to a human in such a vehicle (via radar, vision, and radio). Different agents can run on workstations distributed possibly hundreds (or thousands) of miles apart. Thus, there is no shared state among agents. Rather, communication between agents takes place via simulated radios using messages that approximate the messages sent by humans. The agents interact with the DIS world through ModSAF [Calder et al., 1993], which provides simulations of vehicle dynamics, sensors, and weapons.

One limitation on the fidelity of our agents is the fidelity of the underlying simulation. Currently, there are only limited types of weather in the simulation, and planes never have partial damage. These limitations eliminate some of the types of coordination required in real aircraft.
However, we have attempted to implement all the relevant types of coordination that the fidelity of the simulation permits.
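The communication model just described, in which independently running agents share no state and coordinate only through simulated radio messages, can be sketched as a toy. This is an illustrative assumption, not code from TacAir-Soar, Soar, or ModSAF; the class names and call signs below are invented.

```python
# Toy sketch of no-shared-state agents coordinating over a simulated radio.
from dataclasses import dataclass

@dataclass
class RadioMessage:
    sender: str
    frequency: str
    text: str

class Radio:
    """Broadcast medium: every agent tuned to a frequency hears the message."""
    def __init__(self):
        self.listeners = {}                     # frequency -> list of agents

    def tune(self, agent, frequency):
        self.listeners.setdefault(frequency, []).append(agent)

    def transmit(self, msg: RadioMessage):
        for agent in self.listeners.get(msg.frequency, []):
            if agent.callsign != msg.sender:    # sender does not hear itself
                agent.inbox.append(msg)

class PilotAgent:
    """Each agent keeps only its own state; nothing is shared directly."""
    def __init__(self, callsign):
        self.callsign = callsign
        self.inbox = []

radio = Radio()
lead, wing = PilotAgent("LEIA-1"), PilotAgent("LEIA-2")
radio.tune(lead, "indigo")
radio.tune(wing, "indigo")
radio.transmit(RadioMessage("LEIA-1", "indigo", "go-to-formation trail"))
print(wing.inbox[0].text)   # the wingman receives it; the lead does not
```

The point of the sketch is the absence of any shared data structure: the only path between agents is an explicit message, mirroring the DIS setup described above.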


verify that the planes are where they belong, to control air traffic to avoid collisions, and to relay mission changes from other controllers.

At time 3, let us assume that an E-2c, which is an early warning aircraft with a very powerful radar, informs the planes of an approaching threat (the MiG-23's). Suppose further that the mission dictates that the ground attack cannot be abandoned. The lead decides to have half of the division "strip" and intercept the MiG's while the other section continues with the original mission. Each section then becomes an autonomous unit, in which a section lead has control.

The MiG-23's mission is to protect assets on the ground by attacking the F/A-18's. Because the MiG's have relatively poor radar, they must depend on a ground controller (GCI), which has more powerful radar, to guide them toward the engagement. The GCI does this by providing the bearing, range, altitude, speed, and heading of the F/A-18 aircraft. Similarly, the F/A-18's receive information about the MiG's from the E-2c. The E-2c must also give the fighters permission to fire. Once the F/A-18's have the MiG's on their own radar they will prosecute the engagement on their own, receiving information only by request.

During the engagement, the lead of the intercepting F/A-18 section orders its partner, or wingman, to change to a formation that provides more mutual support and to "sort" the MiG's so each F/A-18 has a separate target. Similarly, the MiG's lead communicates to perform a tactical maneuver called a pincer. In this engagement we assume that both MiG's are destroyed (at time 4) and that the intercepting section of F/A-18's returns to the carrier.

The other section of F/A-18's continues to the next waypoint (Cougar). At about this time, a forward air controller (FAC), whose job is to locate and guide the planes to their targets, comes under attack by enemy tanks.
The FAC calls for support from another controller (the TAD), who decides to reroute the incoming air strike to attack the tanks instead of Target 1. The TAD contacts the F/A-18's (at time 5) with a new mission brief, then contacts the FAC with information about the incoming mission. The lead of the F/A-18's must replan the final attack altitude and geometry to the target and communicate it to the wingman.

As the F/A-18's approach the "initial point" (Wanda), the lead checks in with the FAC to verify the mission and receive any further mission changes. Once the FAC verifies the mission, visually acquires the planes, and determines they will attack the target without endangering friendly forces, the FAC gives them final permission to drop their ordnance.

In the final bombing run, the F/A-18's perform a tactic called a 90-10 split (planned and communicated earlier by the lead) to provide separation between the two aircraft. Since the tanks are moving, the F/A-18's must visually acquire them, modify their approach, and drop their bombs (at time 6). They then exit the area and fly back on their egress route (not shown).

3 Coordination Analysis

The need for coordination in this domain is obvious. The overall goals of the agents as a group are very general, such as to take control of land, suppress hostile forces, or protect friendly forces and civilians. However, no individual agent has the physical capabilities to achieve these goals alone, so multiple agents are required. There are many further complications to
coordination:

1. Heterogeneous nature of the agents. Some agents have very good sensors (E-2c), some have better weapons, while still others have resources, such as fuel, that can be distributed to other agents. Some agents have weapons more appropriate for air-to-air combat than air-to-ground. Even the same type of plane can have very different capabilities depending on its load-out of weapons (an F/A-18 can be configured for air-to-air or air-to-ground, and have different air-to-ground weapons depending on its intended targets). Any specific mission will often involve coordinating a variety of agents with different capabilities.

2. Limitations on communications. Radio communication among planes is minimized for several reasons. First, it is a relatively slow method of conveying information. Second, communications can be intercepted by the enemy and used to identify and locate threats. Some missions are flown under complete radio silence. Third, generating and processing messages disrupts the other activities of a pilot. Even if the pilot is not the intended recipient of a message, filtering useless chatter can affect performance. Fourth, a pilot can only listen to two radios at once, so many planes must share the same frequencies. This greatly limits the bandwidth of communication between planes.

3. Unpredictability. Although missions can be planned in great detail, the domain is nondeterministic. For example, it is impossible to predict the reaction of the enemy, both in the large (will there be any enemy fighters) and in the small (what direction will they come from, which tactic will they use, when will they use it). It is also impossible to predict the effect of the enemy's actions. For example, what happens if they shoot down the leader of the division?

To carry out their orders, the agents must coordinate many different activities, which requires them to coordinate their sensing, actions, and goals.
They must also coordinate their organization structure, which itself might change during a mission (such as when a group splits or joins). Below we discuss the different classes of agents and the types of coordination they exhibit. This analysis is summarized in Figure 2. These are all of the types of coordination we have observed in human pilots, and they have all been implemented in our agents.

Figure 2: Types of Coordination Agents.

1. Flight groups: A flight group consists of fighters and attack aircraft (F/A-18's and F-14's) with a common mission. Usually, attack aircraft will be in flight groups of two (called sections) or four (called divisions). A division is made up of two sections flying together. Divisions and sections can be organized into larger groups of six to sixteen planes (called packages). Fighters will be grouped in sections or divisions, and they may be attached to packages as escorts.

Within a section, one of the planes is designated as the lead and the other the wingman. The lead dictates the general actions of the section. This is not a pure master-slave relationship, because although the wingman must fly in formation with the lead, the wingman decides how best to achieve and stay in formation. The wingman must also keep track of the group's progress in the mission, so that he can take over as lead if the lead is damaged or destroyed.

The left column of Figure 2 lists the types of coordination found within flight groups. As detailed in the example scenario, the aircraft coordinate their actions in terms of maneuvering, flying in formation, and employing their weapons. Our agents can fly in six different section formations (defensive combat spread, offensive combat spread, fighting wing, bearing, parade, and trail) and eight different division-level formations (line-abreast, box, offset box, wall, VIC, fighting wing, bearing, and trail).

The agents coordinate their sensing by directing their radar so that they can cover a larger area than a single plane can cover. They explicitly communicate radar and visual sightings. The planes will also communicate information about their current state, such as their fuel status. They also coordinate their goals, and the lead will communicate changes to the mission plan, and specific intent, such as the decision to intercept enemy planes. Finally, there are specific procedures for coordinating the organizational structure of the flight group by joining, stripping, or changing the lead when appropriate.

2. Controllers: The GCI and E-2c (as well as the TACC and TAD) can provide centralized information and control for many groups of planes. The TACC, TAD, and FAC form a distributed control network in which requests for missions are propagated through the network and
assigned to flight groups. The controllers coordinate the activities of multiple aircraft by routing them, and by assigning altitudes, communication frequencies, and attack times.

The middle column of Figure 2 shows the types of coordination found between a flight group and a controller. The only coordination of actions is the marking of targets with smoke by a forward air controller, although the coordination of goals provides coordination of one flight group with other flight groups and ground forces. The GCI and E-2c controllers have much better radar capabilities than individual planes, allowing them to communicate information that would otherwise be unavailable to the flight groups. Forward air controllers can also spot targets and evaluate damage. Controllers (and flight groups) use authentication procedures to determine that a flight group is where it is supposed to be.

The most complex coordination involves goals, where the controllers can change almost any aspect of the mission for a flight group, including the altitude for flying routes, the routes themselves, the controllers the flight group contacts along the route, the radio frequency to use during the contact, the target location, and the time of the attack. The controllers also give final authorization to flight groups with pre-specified missions. The organizational structure between controllers and flight groups is relatively fixed.

3. Support aircraft: This includes tankers for in-flight refueling. There is limited coordination between these aircraft and the flight groups, but it is not significantly different from the other types of coordination we have presented.

4. Others: The missions of a flight group are part of an overall plan, and thus involve interactions with other flight groups as well as ground troops.
This coordination is not accomplished through explicit communication, but usually through timing of actions. For example, a division of fighters might be scheduled to fly through an area five minutes before an air-to-ground attack, in order to disperse any enemy fighters. Similarly, an air-to-ground attack may be coordinated with a pause in an artillery barrage. In such cases the air-to-ground attack must take place at a quite precise time (+/- ten seconds), which forces the planes to adjust their speed during a mission so that they arrive on time.

4 Knowledge as the Basis of Coordination

In this domain, the key to coordination is knowledge. The agents must know the appropriate techniques and methods for performing their specific tasks, such as maneuvering, sensing, and employing their own weapons. They must know their responsibilities for their current mission, the details of that mission, and with which other agents they must interact. They also must know, in general terms, when and what to communicate to other agents during the mission, and what to do in response to messages from others. To support all these types of coordination, we have identified four different general types of knowledge. These appear to be all of the sources used by humans [Corps, 1988], and our agents use all of these sources.


Background Knowledge: Common Doctrine and Tactics. Most of the long-term knowledge in our agents consists of knowledge about how to perform missions. This includes how to maneuver, sense, and employ weapons. It also includes doctrine and tactics that specify methods and procedures for coordinating with other agents. We attempt to model current military doctrine and tactics realistically. Our sources include unclassified military documents, books, extensive interviews with former U.S. pilots, and observations of U.S. Navy and Air Force pilots training. Doctrine dictates specific roles for individuals (such as lead, wingman, TACC, TAD, FAC) and the specific duties to be performed. Thus, for example, there is no need for the lead and wingman to negotiate how to maintain the formation. The wingman just does it, and the lead doesn't have to deliberate about it at all. This is a social contract, where agents implicitly create coordinated behavior by behaving according to certain prespecified rules [Shoham and Tennenholtz, 1992].

The reliance on common background knowledge to support coordination in the military is not surprising. The military has sufficient planning and training time to develop and implement common doctrine and tactics. The individual agents need not determine the best coordination strategy on their own, but can rely on efficient, compiled, common versions of the coordination strategy. Although this is the backbone of all coordination, the behavior of the agents is not purely scripted, because of the complexity and unpredictability of the environment. Thus, other sources of knowledge are critical to coordination.

Mission Briefing.
Before a mission, the participants are briefed on the tactical situation (such as weather and enemy activity), their responsibilities, and the responsibilities of others. The briefing helps establish specific operational parameters required for coordination, such as the partners of a section, their initial formations, the methods for communication (radio frequencies, call signs), the default radar contract, the default method for sorting enemy planes, any specific tactics the section plans to employ, the waypoints of the mission, the controllers who will be contacted during the mission, the authentication procedures, and so on. In our agents, the mission briefing knowledge is not "compiled" into the agent, but is instead represented declaratively in the agent's working memory. This allows an individual agent to take part in different missions at different times, and to accept modifications to a mission at any time.

Coordination for an air mission starts with the highest command levels generating air tasking orders (ATOs) using the CTAPS system. CTAPS generates an ATO that is delivered to the air wings on a daily basis. In our simulation, the ATO generated by CTAPS is electronically transmitted to a program we have developed, called the Exercise Editor [Coulter and Laird, 1996]. The Exercise Editor allows a human to fill in the details of the missions that would normally be determined at the wing level for each aircraft. This information consists of up to 130 parameters and includes items such as routes, targets, radio frequencies, names of members of flight groups, controllers, etc. Given the mission briefing, the lead aircraft will fill in any missing details, such as an attack plan, based on the mission and the tactical situation, and communicate it to the rest of his command group.

Observed Behavior.
During a mission, the members of a flight can directly observe each other's behavior through a variety of sensors, including vision and radar. These observations will sometimes have a direct impact on behavior for coordination. For example, a wingman observes its lead's position to get into and maintain formation; an attack plane identifies a target when it sees smoke dropped by a controller; and flight groups react appropriately when they see one of their members destroyed.

Explicit Communication. The most flexible way to coordinate behavior is to explicitly communicate knowledge and goals between two agents. In this domain, all communication occurs via simulated radio. We have attempted to replicate the verbal communication used by humans for the missions performed by our agents, although text is transmitted instead of sound. There are approximately 250 message templates used for communication. These message templates approximate the language used by military aviators (called "comm brevity") and are easily understandable by humans. Interaction with humans is supported via CommandTalk, a speech-to-text and text-to-speech system [Moore et al., 1997]. Figure 3 shows an example transcript of text communications among the lead of a flight group (LEIA-1) and a TACC controller (ENDOR-1) during a mission.

0:16 LEIA-1: Endor-1 this-is Leia-1
0:19 ENDOR-1: Endor-1 go-ahead
0:29 LEIA-1: Leia-1 mission-number ed1-ev2-4 proceeding-to Elmer angels 25 time-on-station 1+30 checking-in-as fragged
0:31 ENDOR-1: roger Leia-1
0:31 ENDOR-1: radar-contact
0:33 ENDOR-1: cleared-to-enter-aoa
0:35 ENDOR-1: proceed-as-briefed
0:35 ENDOR-1: maintain angels 24
0:37 ENDOR-1: check-in-with Obi-1 on indigo at Cougar
0:40 LEIA-1: Leia-1 wilco
2:45 LEIA-1: Leia-1 pushing-button indigo

Figure 3: Example communication between flight group and TACC controller.

5 Agent Design for Tactical Air Missions

Up to this point, we have emphasized how coordination is used in this domain, and how different types of knowledge support coordination.
We have mentioned in passing the abilities of TacAir-Soar in terms of coordination, but we have not gone into the details of its design and how the design supports coordination. Nor have we described the structure of the underlying architecture, Soar. This section presents a general review of the Soar architecture and the TacAir-Soar system. It serves as preparation for the next section, where we examine how TacAir-Soar meets the requirements of multi-agent coordination.

The original purpose of Soar was to support the construction of AI systems that could use many different problem-solving methods, on many different problems. It quickly evolved to
include an integration of problem solving, planning, and learning, as well as detailed modeling of human behavior. However, for TacAir-Soar, the emphasis has been on knowledge-rich, goal-driven, and reactive behavior. Soar's planning abilities and learning mechanisms have not been emphasized.

The basic operation of Soar consists of the selection and application of operators, which correspond to the basic deliberative acts of the task at hand. TacAir-Soar has operators such as fly-flight-plan, intercept, communicate-message, accept-message, execute-turn, and so on. Altogether it has over 400 operators. TacAir-Soar's behavior consists of the repeated selection and application of operators.

Although operators are the basis for generating its behavior, operators are not represented as monolithic data structures in Soar. For example, there is not a single data structure that represents all of the conditions under which a turn should be attempted, nor is there a single data structure or procedure for executing a turn. Instead, Soar represents all of its knowledge about selecting and applying operators as if-then rules that fire whenever their conditions are matched. For example, an operator for firing a missile consists of rules that test for situations in which a missile should be fired, and other rules that test that the fire-missile operator has been selected and then perform the appropriate actions. In general, the conditions of rules test the contents of working memory. Working memory consists of the system's current data structures, which include all sensory information (from vision, radar, ...), intermediate computations (such as determinations of whether another plane is a threat), and selected operators.

Soar's rules can be classified based on their problem-solving function.

1. Operator Proposal: Rules that propose operators encode precondition knowledge. Proposal is done by creating an acceptable preference for the operator and creating the operator data structure in working memory.
There may be many distinct rules that propose the same operator, thus providing disjunctive preconditions.

2. Operator Comparison: Rules that compare operators encode control knowledge. Comparison is done by creating a variety of preferences (best, worst, better, indifferent, ...) which are interpreted by a fixed decision procedure and used for selecting the next operator. Once again, there may be many distinct rules that contribute to the final decision.

3. Operator Application: Rules that make changes to the current state encode operator application knowledge. Application rules test the currently selected operators and make appropriate changes to the current situation. In contrast to the other types of knowledge, these changes will persist even after the rules no longer match. These changes can include the initiation of external actions. A single operator may have a large number of application rules associated with it that provide support for complex, conditional actions.

4. Operator Termination: Rules that detect that all required changes have been made and that a new operator can be selected. These rules encode post-condition knowledge. Once again, there can be many for a single operator, providing disjunctive post-conditions.
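The four rule classes above can be illustrated with a minimal propose/select/apply/terminate cycle over a working memory. This is a hedged sketch, not Soar's actual production system; the rules, preferences, and working-memory contents below are invented for illustration.

```python
# Toy decision cycle mirroring the four Soar rule classes described above.
wm = {"enemy-on-radar": True, "missile-fired": False, "selected": None}
trace = []

# 1. Proposal rules: conditions -> candidate operators (acceptable preferences)
def propose(wm):
    candidates = set()
    if wm["enemy-on-radar"] and not wm["missile-fired"]:
        candidates.add("fire-missile")
    candidates.add("wait")              # waiting is always acceptable
    return candidates

# 2. Comparison rules: preferences interpreted by a fixed decision procedure
def compare(wm, candidates):
    prefs = {"wait": 0}
    if "fire-missile" in candidates:
        prefs["fire-missile"] = 1       # "better than wait" in this situation
    return max(candidates, key=lambda op: prefs.get(op, 0))

# 3. Application rules: persistent changes to the state / external actions
def apply_op(wm, op):
    if op == "fire-missile":
        wm["missile-fired"] = True
        trace.append("launch!")

# 4. Termination rules: detect the operator is done, allow a new selection
def terminated(wm, op):
    return op != "fire-missile" or wm["missile-fired"]

for _ in range(2):                      # two decision cycles
    wm["selected"] = compare(wm, propose(wm))
    apply_op(wm, wm["selected"])
    if terminated(wm, wm["selected"]):
        wm["selected"] = None

print(trace)   # fire-missile is selected once; afterwards only wait proposes
```

Note how the disjunctive character of the knowledge survives even in this toy: more proposal conditions or application effects can be added as independent rules without touching the cycle itself.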


Figure 4: Decomposition of the intercept operator into sub-operators.

Using this approach, the basic operation of the system involves the continued selection, application, and termination of a stream of operators. For flying a plane, this might include a sequence of commands to move the stick, as well as operators to wait when no new command is required. However, for complex environments, it is likely that the agent will have many operators that cannot be performed directly by rules. For example, one operator in TacAir-Soar is called intercept, and it is used to intercept an enemy plane. There is no simple set of rules that can perform all of the actions required over the extended period of time required in an intercept. In this situation, Soar automatically generates a subgoal to achieve the selected operator. In this subgoal, the problem is now to implement intercept using operators, essentially decomposing intercept into simple actions. TacAir-Soar has many operators that are relevant to an intercept, and the appropriate ones get proposed and selected. Some of these are complex, and in turn lead to subgoals. Ultimately, the problem is decomposed to a level where primitive operators can be applied directly by rules. This leads to a stack of operators/goals that are simultaneously being attempted by TacAir-Soar. Figure 4 shows a portion of the operator hierarchy that is encoded in TacAir-Soar for the intercept operator.

In general, these operator hierarchies correspond to task and subtask decompositions described by experts in the domain. The majority of knowledge in TacAir-Soar is organized around this top-down hierarchical structure, so that most operators are only proposed within the context of executing another, more abstract operator. However, some operators are relevant across the hierarchy. These are operators that help achieve implicit goals, such as survive or maintain situational awareness.
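The automatic subgoaling and goal stack just described can be sketched as a toy recursive decomposition. This is an assumed illustration, not TacAir-Soar code; the operator names loosely echo the intercept example and the decomposition table is invented.

```python
# Toy sketch of hierarchical operator decomposition via subgoals.
# Hypothetical decomposition knowledge: abstract operator -> sub-operators.
DECOMPOSE = {
    "intercept": ["achieve-proximity", "employ-weapons"],
    "achieve-proximity": ["execute-turn"],      # bottoms out in a primitive
}

applied = []

def run(operator, stack=()):
    """Push a goal for the operator; decompose until rules can apply it."""
    stack = stack + (operator,)
    if operator not in DECOMPOSE:               # primitive: rules apply it
        applied.append((operator, list(stack)))
        return
    # Non-primitive: Soar would reach an impasse here and create a subgoal;
    # cross-hierarchy operators like survive would be proposed at every level.
    for sub in DECOMPOSE[operator]:
        run(sub, stack)

run("intercept")
print(applied)
```

Each entry records the primitive operator together with the goal stack that was active when it was applied, mirroring the stack of operators/goals maintained by the architecture.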
These operators are proposed at all levels of the current hierarchy, so that they can be selected as soon as possible. For example, in order to accept a message from another agent, the accept-message operator is proposed for all goals whenever there is a new message coming in on the radio. This operator will be selected, and then can process the radio message immediately.

6 Agent Design for Coordination

The main purpose of this section is to analyze the cognitive capabilities required to support coordination in our agents. Our analysis is based on types of coordinated behavior (acting, sensing, goals, etc.), and the methods for sharing knowledge and goals. These capabilities serve as a requirement list for constructing coordinating agents in knowledge-rich domains like tactical air combat. For each of these requirements we describe how TacAir-Soar supports it.

6.1 Individual Entity Representation

In order to support truly distributed simulation of a large number of entities, any of which can be human, it is necessary to represent each entity independently. For our domain, an entity corresponds to an individual plane. By representing these independently, it is possible to fly missions that are a mixture of human and computer-generated agents. For example, a human could fly the lead of a group of four planes, with all of the other planes being computer generated. This requirement is violated by some of the other work being done in multi-agent systems research in artificial intelligence. For example, the closest work on coordinating aircraft has been done by Rao et al. (1993). In their work, extensive knowledge is used for division-level air-to-air combat coordination. However, in addition to having agents control each aircraft, they introduce "team" agents that contain knowledge about the goals, state, and knowledge of a group of agents.
Their system supports the free flow of information from individual agents into the team agent, which then reasons about higher level goals (such as tactics) and then sends subgoals for the individual agents to perform. This approach does not model real-world procedures because it completely finesses the issue of how the lead of a group intermixes executing its own goals with the goals of the group, and how the group coordinates its behavior without any directly available shared knowledge. This is the general technique used within computer-generated forces, where a separate process or agent contains shared state and control of a group of coordinating forces; however, it makes it impossible for a human to be included in the simulation.

Although our work is on knowledge-based coordination, our approach should not be confused with work on Blackboard architectures [Nii, 1986] and other approaches for using coordination within the construction of intelligent systems. In these systems, independent knowledge sources share a common blackboard and the activities of the knowledge sources must be coordinated. In our approach, independent knowledge-based agents must coordinate their behavior, but without a common blackboard. Each of our agents has its own sensors and effectors and can only share data via explicit, human-like communication.


6.2 Extensive and Efficient Knowledge Base

Once the decision is made to represent each entity independently, it is also necessary for each individual agent to be an expert at performing its missions and interacting with others. Each agent must have an extensive knowledge base that includes all of the tactics and doctrine applicable to the roles and missions the agent may be given. For example, a wingman must have the same knowledge of doctrine and tactics as the lead, as well as track the progress of the group during a mission, so that the wingman can take over when necessary. Other researchers have used knowledge for driving coordination. One example is the work of Hayes-Roth, Brownston, and van Gent (1995) on improvisational characters. However, to date, improvisational agents (and other work using knowledge for coordination) do not appear to require as extensive a background knowledge or communication repertoire for coordination as is required in this domain.

Soar directly supports extensive knowledge bases through its temporary working memory and its long-term rule-based memory. Research in Soar has focused on how to scale up to very large production systems containing from hundreds of thousands to millions of rules without slowing down [Doorenbos, 1994]. Currently, over 10 Soar agents together with the ModSAF simulation environment can be run simultaneously on a 200MHz Pentium Pro workstation.

In TacAir-Soar, an agent's knowledge of its current mission is represented in working memory based on parameters it reads in at start up. It encodes all of its knowledge about doctrine and tactics as rules that are used to propose, select, apply, and terminate operators. The operators form a tangled hierarchy of goals that are dynamically instantiated based on the current situation and mission.
The doctrine for coordination is encoded as additional operator knowledge that mixes with knowledge for performing other aspects of missions. As an example of low-level doctrine, consider the problem of a wingman flying in formation. In service of the operator of following a lead, a sub-operator to achieve proximity to the lead will be activated if the lead is out of visual range. This might involve further sub-operators of contacting the lead to find his position, or just flying on a course to the lead if the lead is on radar. Once the wingman has the lead in visual range, the wingman will select an operator to fly in formation. This operator in turn will lead to calculations of the appropriate position of the wingman relative to the sensed position of the lead based on the current formation. As part of the flying in formation operator, rules will propose changes to the agent's speed, altitude, and heading to steer the plane into the right position.

There is other work on getting groups of agents to maneuver in formation. Balch and Arkin (1995) report on an approach for keeping ground-based vehicles in formation, and investigate three different reference techniques: unit-center-referenced, leader-referenced, and neighbor-referenced. Our approach models the methods used by pilots, which is a hybrid of leader-referenced and neighbor-referenced techniques, in that a group of four vehicles comprises two groups (sections), each with two vehicles, in which there is a wingman and a lead. One of the section leads is also the division lead, with the other lead being the "second lead". In each section, the wingman flies in reference to its lead, and the second lead flies in reference to the division lead. In implementation, this ends up being most similar to Balch and Arkin's neighbor-referenced approach. However, the lead does not maneuver to stay in formation. The justification given by our experts for this approach is that it is inadvisable to have two agents attempting to modify their behavior in relation to each other, because it will lead to overcorrections.

6.3 Parameter-driven Behavior

An agent must not be limited to only one type of behavior, but must be able to perform a variety of activities in coordination with others. Thus, it is not sufficient to build agents that can perform only one type of mission. In this domain, a single plane can perform both air-to-air and air-to-ground attacks at different times during its mission. In our case, the mission briefing received before launch and the mission changes received from controllers dynamically determine the goals of the agents. An agent's behavior must be designed so that the knowledge relevant to the current mission parameters can be used. For some complex missions, the information in the briefing may involve fragments of plans that the agent must integrate into its overall behavior at the appropriate times. Thus, the generators of the agent's behavior must be flexible enough so that they can be modified at any time.

In TacAir-Soar, all mission-related behavior is based on a representation of the current mission that is held in working memory. This can be examined and modified by the rules that make up an agent's long-term knowledge. The mission is specified at briefing time, but also can be dynamically changed later through communication. The best example of dynamic changes to the missions comes in on-call close-air support missions. Here, a set of planes is prepared to perform an air-to-ground attack in support of ground troops. At take-off, the planes do not know their target, but are tasked to fly to a specific holding point and await further instructions. Once a plane gets to the holding point, it checks in with the forward air controller (FAC), which will assign it a mission.
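The on-call close air support sequence just described can be sketched as deliberate edits to a working-memory mission structure. The field names and message format below are hypothetical illustrations, not TacAir-Soar's actual representation:

```python
# Sketch of parameter-driven mission modification for on-call close air
# support. Field names and message shape are hypothetical illustrations
# of the working-memory mission representation described in the text.

def start_mission(briefing):
    """Working memory holds the mission as read from the briefing at start up."""
    return {
        "type": briefing["type"],                  # e.g. "on-call-cas"
        "holding_point": briefing["holding_point"],
        "target": None,                            # unknown at take-off
    }

def on_fac_message(mission, message):
    """Deliberately modify the mission when the FAC assigns a target."""
    if message["kind"] == "assign-target":
        mission["target"] = message["target"]
        mission["attack_heading"] = message["heading"]
    return mission
```

Because the mission lives in working memory rather than being compiled into the behavior, a message received mid-flight can change what the agent's rules subsequently propose.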
When a mission is assigned, the attack plane deliberately modifies its internal representation of the mission, adding the target and other parameters it has received via communication with the FAC. These modifications in turn lead it to plan its specific attack tactic, which, as with all other reasoning in Soar, is done via the deliberate application of operators.

6.4 Explicit Representation of Organizational Structure

Related to the need for parameter-driven behavior is the need to represent, examine, and change any organizational structure in which the agent participates. For example, an individual plane must be able to join into a section, take on the responsibilities of either the wingman or lead, and then later join up in a division. Divisions can be organized into packages, which include dissimilar aircraft with different missions. Some of the planes may be on a ground attack mission, while the others are flying as escorts. Each agent must keep track of the other members of its flight group, possibly changing formations along the way, splitting away for a while and then rejoining.

Soar's working memory serves as the storage for the explicit representation of the organizational structure. In TacAir-Soar, each agent maintains in working memory an ordered list of all groups it is a member of. Each group structure contains information about the role the agent has in that group and the formation being flown by that group. A mark specifies whether each group is currently active, and at any one time, there is a single "primary" group (although multiple groups may be active). The primary group is the group used by the agent to fly in formation. For example, a wingman in a section that is also part of a division would have both the division and section active, but the section would be the primary group because it is that group which determines how the wingman should fly. The agent's role in a group can be modified based on its perception and communication. For example, in perception, an agent may see a partner destroyed, or it may see a tanker it is trying to rendezvous with for refueling. Similarly, an agent that is an escort may sense an enemy plane on radar, and "strip" from its package, sending a radio message which informs the other planes in the package of its plan.

6.5 Reactive Execution and Interruptible Processing

A wingman must respond quickly to changes in the lead's behavior. Computer-generated forces must in general be reactive, but coordination also requires that they can interrupt their current goals to process and respond to urgent messages. Thus, it must be possible both to react to changes in the world and to respond to the current goals and subgoals that implement the current mission.

Soar's representation of knowledge as rules in support of operators provides for fast, reactive interruption if appropriate. Sensory data is deposited into working memory where rules can match and suggest new operators, or the termination of existing operators. For example, in TacAir-Soar, the wingman's main goal is to fly in formation with the lead. Whenever the wingman is out of position, rules fire to modify the heading, speed, or altitude. Whenever the wingman receives radio messages from the lead, rules fire to select operators that respond to the message appropriately in the current situation.
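The reactive interruption described above can be sketched as priority-based operator selection. The numeric priorities and operator names below are hypothetical; in Soar the same effect falls out of preference rules rather than a fixed numeric table:

```python
# Sketch of reactive, interruptible processing: a higher-priority proposal
# (e.g. an incoming radio message) preempts the current operator.
# Priorities and operator names are hypothetical illustrations.

PRIORITY = {"evade-missile": 4, "accept-message": 3, "fly-formation": 1}

def select_operator(current, proposed):
    """Keep the current operator unless a strictly higher-priority one is proposed."""
    if not proposed:
        return current
    best = max(proposed, key=lambda op: PRIORITY.get(op, 0))
    if current is None or PRIORITY.get(best, 0) > PRIORITY.get(current, 0):
        return best
    return current
```

Run on every cycle against fresh sensory data, this lets an agent keep flying formation while remaining able to drop that task the instant a radio message or threat appears.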
These rules may interleave with each other, with higher priority tasks interrupting other tasks.

6.6 Generate and Comprehend Messages

In order to communicate with other agents, an agent must be able to translate internal information about its goals, state, perception of the world, and current actions into a form that can be understood by other agents. Conversely, the agent must translate messages from other agents into an appropriate internal representation. To support interaction with humans requires full natural language, which is still beyond the state of the art.

In the current version of TacAir-Soar, we finesse the general problem of natural language understanding and instead use a template-based approach that prespecifies the specific form of the messages that the system can generate and accept. The agents know when to generate these messages. They also know how to interpret these messages and modify their own internal knowledge structures appropriately. For example, the message to hand off an aircraft to another controller would be "check-in-with controller-name on radio-frequency at location." Message generation consists of binding variable values and transmitting the resultant form as a list. The receiver uses pattern matching to select the appropriate template and then extract the variable values. For human interaction we provide the words for speech synthesis, and are working with others to develop a speech-to-text capability [Moore et al., 1997], both of which have been demonstrated in tests.

We have implemented the types of communication required by our agents, but gone no further. However, the interactions between human pilots are also very stylized. Pilots are trained on the specific methods of how to communicate to reduce misunderstandings and compensate for noisy radios. There is a strong emphasis on encoding information into short phrases whenever possible. For example, a pilot might send to a ground controller the message "bogey dope", which is a request for information on the current bogey that the pilot is engaging. This approach has been successful for the types of communication our agents need to exchange among themselves, and is natural enough that humans can fly as lead or wingman with our agents using a simulation interface [van Lent and Wray, 1994]. However, this approach will break down when extended to unrestricted natural language interactions that can occur without such a constraining interface. To that end, others are investigating general natural language approaches [Rubino and Lehman, 1994].

The problem of communication is further confounded by the need to send and receive messages from synthetic agents that do not possess even this basic level of parsing message templates. To address this problem we are using the Command and Control Simulation Interface Language (CCSIL) [Salisbury, 1995], which is essentially an enumerated set of data structures for all military communications. For example, the directive to hand off a flight to another controller (given above) would be message 1107 with three data fields specified: controller, frequency, and location. We provide translation to and from these structures and the human-understandable message templates described above. Thus, to send a CCSIL message our agents would first form an understandable utterance, translate it into CCSIL, and send this form on the radio. At the receiving end the process is reversed: it receives a CCSIL communication, translates it into an understandable utterance, then interprets this and acts appropriately.
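The hand-off example can be sketched end to end: a template utterance on the human-facing side, an enumerated structure on the CCSIL side. The template wording and the message number 1107 follow the text above; the code itself, including the dictionary field names, is an illustrative assumption:

```python
# Sketch of template-based message generation and the CCSIL-style
# translation described above. The template and message number 1107
# come from the text; the rest is an illustrative assumption.

TEMPLATE = "check-in-with {controller} on {frequency} at {location}"

def to_ccsil(controller, frequency, location):
    """Bind variable values into an enumerated CCSIL-style structure."""
    return {"msg": 1107, "controller": controller,
            "frequency": frequency, "location": location}

def from_ccsil(msg):
    """Translate a CCSIL-style structure back into an utterance the
    agent (or a human listener) can interpret."""
    assert msg["msg"] == 1107, "sketch only handles the hand-off directive"
    return TEMPLATE.format(controller=msg["controller"],
                           frequency=msg["frequency"],
                           location=msg["location"])
```

Keeping the agent's own reasoning on the utterance side of this boundary is what lets the same knowledge serve both human listeners and synthetic agents that only speak the enumerated form.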
This approach limits what our agents need to understand to a single representation and facilitates interaction with the subject matter experts, who do not have to learn the low-level representation details.

This overall requirement to support computer-to-human communication is not achieved by most multi-agent systems being developed today. Most approaches develop their own language of goals, methods, and plans which is based on the internal representation of knowledge in the agent, or on a standard declarative representation (such as KIF). These systems often use a more general approach to coordination that is shared among all of the agents, which in turn does not require as much domain-specific knowledge as is required in TacAir-Soar. One example of this is the work of Tambe (1996) on teamwork in helicopters. Tambe's system is an outgrowth of the original TacAir-Soar system; however, he has used a deep theory of teamwork to organize communication and coordination. Although this provides more generality, to date it does not produce the same communications, using the same language, that a human would use, making it difficult to embed this in a mixture of human and computer forces.

7 Evaluation

This work is not yet at the stage where it is appropriate to perform detailed quantitative comparisons of the behavior of our agents to humans performing the same tasks. Our emphasis has been on breadth of coverage. This is appropriate because current limitations in the simulation environment, specifically the level of control our agents have of their aircraft, make it impossible for our agents to display the same behavior as pilots at the level of quantitative measurements (such as specific altitudes and speeds).

However, some quantitative measurements of completeness of coverage of the domain can be made. Prior to the development of these agents, an independent group (BMH Associates) developed an extensive list of requirements for flight behavior. This list was not used during the development of our agents, but it can serve as a partial benchmark for the completeness of capabilities. This is really just a gross measure because some individual items on the list that we've implemented involve hundreds of rules in our systems, while others require only a few. Although the list is quite extensive, listing more than 800 independent behaviors, only about 145 are directly relevant to coordination in tactical missions (the remainder have to do with pre- and post-mission activities, as well as activities that must be performed independent of coordination). Of those remaining activities, 21 cannot be done because of limitations in the simulation environment (such as that the fidelity of aircraft control is insufficient to perform the maneuver). Thus we are left with 124 coordination activities, of which the agents currently perform 114 (92%). Those capabilities that are missing include management of radar searching within a division and some close-in combat behaviors.

How good is 92%? Is it sufficient for meeting our goals of creating agents that can fly coordinated missions among themselves and with humans, or is the missing 8% critical to reasonable performance? Thus, another type of evaluation is to have our agents fly various missions and then evaluate their behavior at a qualitative level as to whether the planes fly the correct mission, whether they use appropriate communications at specific points in the mission, and whether they achieve their objectives.
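The coverage arithmetic behind the 92% figure can be checked directly from the numbers quoted above:

```python
# Worked check of the coverage figures given in the text.
coordination_relevant = 145   # of the >800 listed behaviors
blocked_by_simulation = 21    # impossible given simulation limitations
feasible = coordination_relevant - blocked_by_simulation   # 124
implemented = 114
coverage_pct = round(100 * implemented / feasible)
print(feasible, coverage_pct)  # -> 124 92
```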
As part of a large-scale demonstration of distributed simulation (called ED-1), we prepared three different missions (defensive counter air, close air support, and integrated strike) involving 21 agents spread across the three missions. These missions were designed to exercise the entire spectrum of behaviors implemented within the agents. They were flown in the context of a larger exercise involving simulated ground and surface forces. All agents ran in real-time. Each mission took between 40 minutes and an hour to complete, with each agent making many thousands of individual decisions.

Although our agents had specific missions, the overall exercise was unscripted, with many unexpected interactions between our agents and other forces. Overall, each mission was performed successfully, and was robust in the face of unexpected interactions. For example, during one close-air support mission flown by our agents, a large number of enemy planes started to attack the fleet. Our agents correctly abandoned their mission and engaged the attackers in air-to-air combat. This disrupted the point of our demonstration (the close-air support mission), but demonstrated the ability of our agents to respond to novel situations.

Finally, as mentioned earlier, our agents are sufficiently complete that they can fly in coordination with vehicles controlled by humans using a special interface. Humans are able to control a lead or wing aircraft, as well as act as a controller (such as in an E-2C) or the director of an exercise. As an exercise director, a human can direct our agents to new positions, changes in target, and so on, so as to improve the simulation from a training perspective.


8 Future Work

In October of 1997 TacAir-Soar will participate in a full-scale military training exercise called "Synthetic Theater of War" (STOW-97), where it will fly all fixed-wing aircraft on all missions in support of that exercise. This will involve approximately 200 aircraft being "airborne" at a time. In this exercise, the planes will be completely autonomous and will interact with humans using only doctrinally correct simulated radio messages.

Beyond STOW-97, we hope to transition TacAir-Soar so that it will be part of a new Air Force mission training system. In addition to training, we are also using TacAir-Soar as the basis of modeling pilot behavior under various conditions, including heavy workload and fatigue. Finally, as part of DARPA's Advanced Simulation Technology Thrust (ASTT), we are performing research on using learning within TacAir-Soar to quickly build and extend computer-based agents.

9 Discussion

The long-term goal of our work is to build intelligent autonomous agents. In this paper, we have demonstrated that it is possible to create agents for a complex environment in which extensive knowledge of the structure and procedures for coordination is available at the time when the agents are being constructed. Our approach has been straightforward. We try to model the coordination methods used by humans, and to date, we have implemented coordination without negotiation, extensive internal agent modeling, or special architectural mechanisms. The coordination arises out of shared doctrine and tactics, shared knowledge of missions, observations of behavior, and explicit communication.
Our success is heavily dependent on four characteristics of our domain, which simplify the implementation of coordination: the shared goals of the agents, the expert-level performance (and knowledge) of the agents, the well-defined methods and procedures of the military that we are modeling, and the availability of experts who are willing and able to provide the details of procedures.

On the surface, our approach might appear to suffer from rigidity because it depends on a set of "canned" interactions based on existing doctrine and tactics. However, our agents are not blindly applying a fixed doctrine independent of changes in the environment. Instead, our agents are continually reassessing the situation, dynamically stringing together bits and pieces of existing doctrine and tactics that are appropriate to each situation, possibly generating novel behavior (when viewed over time). Thus, our agents do very well as long as the situation is covered by some combination of existing military practice (which includes defining new missions and many types of changes to the organizational structure). In completely novel situations, our agents will use whatever pieces of doctrine are relevant to the situation. In exercises to date, this has been sufficient. However, our agents currently do not have the ability to step back and reason from first principles about what would be a new, possibly novel coordinated response to the situation. Soar has the ability to support this type of planning; however, the additional knowledge required by the agents to internally simulate their environment has not been added.


Acknowledgments

This research was supported at the University of Michigan under contract N00014-92-K-2015 from the Advanced Systems Technology Office of the Defense Advanced Research Projects Agency and the Naval Research Laboratory, and contract N66001-95-C-6013 from the Advanced Systems Technology Office of the Defense Advanced Research Projects Agency and the Naval Command and Ocean Surveillance Center, RDT&E division. TacAir-Soar was developed in cooperation with Frank Koss, Paul Rosenbloom, Karl Schwamb, and Milind Tambe. The authors would like to thank BMH Associates, Inc. for their technical assistance.


References

[Balch and Arkin, 1995] T. Balch and R. C. Arkin. Motor schema-based formation control for multiagent robot teams. In Proceedings of the First International Conference on Multi-Agent Systems, pages 10-16, Menlo Park, CA, June 1995. AAAI Press/The MIT Press.

[Calder et al., 1993] R. Calder, J. Smith, A. Courtemanche, J. Mar, and A. Ceranowicz. ModSAF behavior simulation and control. In Proceedings of the Third Conference on Computer Generated Forces and Behavioral Representation, 1993.

[Corps, 1988] U.S. Marine Corps. FMFM 5-4A: Close Air Support and Close-in Fire Support. Department of the Navy, Headquarters United States Marine Corps, Washington, D.C., 1988.

[Coulter and Laird, 1996] K. J. Coulter and J. E. Laird. A briefing-based graphical interface for exercise specification. In Proceedings of the Sixth Conference on Computer Generated Forces and Behavioral Representation, pages 203-207, 1996.

[Doorenbos, 1994] R. B. Doorenbos. Combining left and right unlinking for matching a large number of learned rules. In Proceedings of AAAI-94, Seattle, WA, August 1994.

[Hayes-Roth et al., 1995] B. Hayes-Roth, L. Brownston, and R. van Gent. Multiagent collaboration in directed improvisation. In Proceedings of the First International Conference on Multi-Agent Systems, pages 148-154, Menlo Park, CA, June 1995. AAAI Press/The MIT Press.

[Laird and Rosenbloom, 1990] J. E. Laird and P. S. Rosenbloom. Integrating execution, planning, and learning in Soar for external environments. In Proceedings of the Eighth National Conference on Artificial Intelligence, pages 1022-1029, July 1990.

[Laird and Rosenbloom, 1994] J. E. Laird and P. S. Rosenbloom. The evolution of the Soar cognitive architecture. Technical report, Computer Science and Engineering, University of Michigan, 1994. To appear in Mind Matters, T. Mitchell, editor, 1996.

[Laird et al., 1994] J. E. Laird, R. M. Jones, and P. E. Nielsen. Coordinated behavior of computer generated forces in TacAir-Soar. In Proceedings of the Fourth Conference on Computer Generated Forces and Behavioral Representation, May 1994.

[Moore et al., 1997] R. Moore, J. Dowding, H. Bratt, J. M. Gawron, Y. Gorfu, and A. Cheyer. CommandTalk: A spoken-language interface for battlefield simulations. Available at http://www.ai.sri.com/~lesaf/commandtalk.ps.gz, March 1997.

[Nii, 1986] H. P. Nii. Blackboard systems: The blackboard model of problem solving and the evolution of blackboard architectures. AI Magazine, 7(2):38-53, 1986.

[Rao et al., 1994] A. Rao, A. Lucas, M. Selvestrel, and G. Murray. Agent-oriented architecture for air combat simulation. Technical report, The Australian Artificial Intelligence Institute, 1994. Technical Note 42.
