Sociality and the life-mind continuity thesis - Dr. Tom Froese

Sociality and the Life-Mind Continuity Thesis:

A Study in Evolutionary Robotics

Tom Froese

Submitted for the degree of D.Phil.

University of Sussex

June 2009

Declaration

I hereby declare that this thesis has not been submitted, either in the same or different

form, to this or any other University for a degree.

Signature:

Summary

The life-mind continuity thesis holds that mind is prefigured in life and that mind

belongs to life. Its biggest challenge is the problem of scalability: how can the same

explanatory framework that accounts for basic phenomena of life and mind be extended

to incorporate the highest reaches of human cognition? So far there has been little

systematic response to this ‘cognitive gap’. The main argument of this thesis is that the

problem appears insurmountable because of the prevalent focus on the individual agent

alone, and that it can start to be addressed by an appreciation of the constitutive role of

sociality for mind and behavior. This argument is developed in a theoretical,

experimental, and phenomenological manner. In terms of theory, the enactive paradigm

of cognitive science is developed in a novel direction by highlighting the specific

manner in which the dynamics of the interaction process opens up new behavioral

domains. This provides the motivation for using an evolutionary robotics methodology

to synthesize a set of minimalist simulation models that are based on experiments in

social psychology. A detailed dynamical analysis of these models supports the enactive

approach; the behavior of the agents is not an individual achievement alone but rather

co-determined by their mutual interaction and organized effectively by this multi-agent

interaction process. Some phenomenological observations complement these results by

indicating that the detached perceptual attitude that is characteristic of adult human

perception is essentially an intersubjective and socially mediated ability. Finally, the

systemic and phenomenological insights are combined to provide the beginnings of a

novel perspective on the origins of cumulative cultural development that gives further

support to the main argument of this thesis. It is concluded that the life-mind continuity

thesis is a viable working hypothesis even when accounting for specifically human

abilities, and that an appreciation of the constitutive role of sociality for life and mind

confirms it to be a serious contender for a unified theory of cognitive science.

Table of contents

1 Introduction
2 A brief history of cognitive science
2.1 Toward embodied-embedded cognitive science
2.2 Further: Toward enactive cognitive science
2.3 An empirical stalemate
2.4 A phenomenological resolution
2.5 Summary
3 Enactive cognitive science
3.1 The life-mind continuity thesis
3.2 Constitutive autonomy is necessary for intrinsic teleology
3.3 Adaptivity is necessary for sense-making
3.4 Constitutive autonomy is necessary for sense-making
3.5 Summary
4 The enactive approach to social cognition
4.1 The autonomy of the interaction process
4.2 Social interaction
4.3 Cultural interaction
4.4 Summary
5 Beyond methodological individualism
6 Studies in social psychology: A critical analysis
6.1 Body image and body schema
6.2 Case studies in social psychology
6.2.1 Non-pathological face-to-face interaction
6.2.2 Facial imitation by human neonates
6.2.3 Gesturing by a deafferented subject (I)
6.2.4 Perceptual crossing in a virtual space
6.2.5 Deafferented subject under a blind
6.2.6 Gesturing by a deafferented subject (II)
6.2.7 Bodily coordination in a virtual space (I)
6.2.8 Bodily coordination in a virtual space (II)
6.3 An integrative motor theory
6.4 Summary
7 Toward the synthesis of minimally social behavior
7.1 Evolutionary robotics
7.2 An integrative methodology
7.3 Implementation details
8 Investigating sensitivity to social contingency
8.1 Methods
8.2 Results
8.3 Behavioral analysis
8.4 Dynamical analysis
8.5 Summary
9 Investigating the interaction process
9.1 Methods
9.2 Experiments
9.2.1 Experimental setup 1: Original setup
9.2.2 Experimental setup 2: Switched receptor fields
9.2.3 Experimental setup 3: Conflicting behaviors
9.3 Dynamical analysis
9.4 Discussion
9.5 Summary
10 Investigating social interaction
10.1 Experimental setup 4: Infinitely small objects
10.2 Experimental setup 5: Maximally distant shadows
10.3 Experimental setup 6: Coordinated behavior
10.4 Summary
10.5 Discussion
11 Beyond methodological physicalism
12 Phenomenological considerations
12.1 The phenomenology of perception
12.2 The phenomenology of intersubjectivity
12.3 A phenomenologically informed continuity thesis
13 Toward an enactive approach to culture
13.1 The ‘ratchet effect’
13.2 Primatology
13.3 Developmental and social psychology
13.4 Evolutionary anthropology
14 Conclusion
15 References

Acknowledgments

I would like to give special thanks to my supervisor Ezequiel Di Paolo for being a much

needed critical filter for the stream of ideas that were produced by my irresistible urge

to adopt the working assumption that there is never an end to relevant context. I am also

appreciative of my D.Phil. research committee Inman Harvey and Anil Seth for keeping

me focused and on target. Many thanks are owed to my academic colleagues, especially

everyone from the CCNR and PAICS, as well as the participants of the Life and Mind

seminars for their many helpful discussions.

Of course, I would not be where I am now without my parents Rainer and Sabine, and

my sister Nele. I dedicate this thesis to them. I am also extremely grateful to my friends

and especially to my partner Iliana for the constant support, encouragement and friendly

background noise that kept me sane in those moments of crisis. Without the invaluable

help of this extended family the completion of this thesis would not have been possible.

Some of the chapters of this thesis have benefited from extensive comments made by

anonymous reviewers as a result of their being published elsewhere. In particular,

Chapter 2 is based on a paper that appeared as Froese (2007). Chapter 3 includes large

parts from Froese and Ziemke (2009), as well as some text from Froese and Di Paolo

(2009). Chapter 4 grew out of some of the ideas presented in De Jaegher and Froese

(2009). A shorter version of Chapter 8 has previously been published as Froese and Di

Paolo (2008a). Chapters 9 and 10 are based on Froese and Di Paolo (in press-a) and (in

press-b), respectively. The content of Chapter 12 is largely taken from Froese and Di

Paolo (2009). I am indebted to all the reviewers for their criticisms and comments on

how to improve the manuscripts. Recognition is also due to Shaun Gallagher and Mike

Beaton for providing detailed comments on Chapters 6 and 13, respectively. I would also

like to give many thanks to my examiners Phil Husbands and Mike Wheeler for their

constructive feedback. Finally, I am grateful to Eörs Szathmáry and the Collegium

Budapest for hosting me during the summer of 2006, where some of the initial ideas for

Chapter 4 took shape, as well as to Tom Ziemke and the University of Skövde for

hosting me for a few months in 2007 and 2008 (with the generous financial assistance

of the euCognition network), where substantial parts of Chapter 3 were written.

1 Introduction

Out of all the traditional difficulties faced by mainstream cognitive science, the mind-body

problem has all the makings of morphing into a paradigm buster. Even though

consciousness has recently started to become a hot topic in science, it is still not clear

what precisely the nature of the dilemma is, let alone what form a systematic response

should take. The status of the problem, recently selected as one of the top outstanding

problems in a special issue of Science (cf. Miller 2005), indicates that there is more at

stake than revising our understanding of mentality: it can potentially challenge a

particular way of doing science that dates back to the problem’s Cartesian origins in the

scientific revolution of the 17th century.

While most cognitive scientists continue the attempt to somehow get a grip on the mind-body

problem within the conventional Cartesian framework, this thesis is part of a

growing trend to change the fundamental terms of the debate. More specifically, it

builds on what has become known as enactive cognitive science, an approach which

replaces the computer metaphor of mind with a focus on life – a phenomenon that

incorporates body and mind as two aspects of a unified whole. By placing the

phenomenon of life at the heart of its conceptual framework, the enactive paradigm has

turned the intractable mind-body problem into a novel research program that is based on

the principles of biological autonomy and phenomenological philosophy.

However, this shift in terms of the debate from computer science to what might be

called ‘bio-phenomenology’ has also made the enactive approach vulnerable to the

criticism that its foundational principles, which are largely based on minimal forms of

life, are irrelevant for the interests of cognitive science. In particular, proponents of the

Cartesian mainstream, who prefer to treat the human mind as a computer, have argued

that such biological foundations are essentially incapable of accounting for ‘higher’

cognitive faculties. And, indeed, even though the explicit working hypothesis of the

enactive approach is that there actually is continuity between life and mind, so far it has

been difficult – if not impossible – to conceive of a satisfactory way to bridge the

‘cognitive gap’ that lies, for example, between the capacity for adaptive behavior of a

simple bacterium and the ability for abstract cognition of an adult human being.

Accordingly, in response to this situation the main goal of this thesis is to argue for two

complementary claims: (i) that the apparent inconceivability of a satisfactory version of

the life-mind continuity thesis largely results from the widely unquestioned assumption

of methodological individualism in cognitive science (an assumption which treats all

cognition as essentially an individual achievement), and (ii) that the enactive approach

has the means to bridge this cognitive gap in a principled manner through a systematic

consideration of the constitutive role of sociality for mind and behavior.

In Chapter 2 the stage for this twofold argument is set by means of a brief history of

recent cognitive science, framed in terms of issues related to philosophy of science. In

particular, several possibilities for irresolvable stalemates between different paradigms

are identified. Special emphasis is placed on the role of research in artificial intelligence

(AI) and robotics in breaking a longstanding philosophical stalemate, and supporting the

subsequent turn toward more embodied-embedded approaches. It is argued that progress

in this experimental domain is nevertheless still threatened by an empirical stalemate,

which is related to the necessity of an observer to adopt some interpretative perspective

in order to make sense of the experimental data. There is thus a need for analyzing the

constitutive conditions of our scientific perspective, a task which motivates the role of

phenomenology for the development of the enactive paradigm. This background chapter

therefore acts as a first introduction to the three main approaches pursued in the thesis,

namely theoretical argumentation, experimental investigation, and phenomenological

observation, as well as to some of the basic ideas of the enactive paradigm.

This general introduction is followed in Chapter 3 by a more detailed description of the

conceptual framework of enactive cognitive science. The life-mind continuity thesis is

presented as a strong working hypothesis that has the potential to become a unified

theory of cognitive science. Some outstanding problems with the continuity thesis are

identified, especially what we call the ‘cognitive gap’: the seemingly insurmountable

distance between the basic phenomena of life and the higher cognitive functions of adult

human beings. It is suggested that this perceived problem is largely due to the

methodological individualism that is present in most cognitive science, and that a

consideration of the constitutive role of sociality from the perspective of the enactive

approach can systematically resolve this issue. As a first step toward this goal, the

biological foundations of enactive cognitive science are introduced, in particular the

notions of autonomy and sense-making, which denote a system’s capacity to generate

its own identity under precarious conditions and the capacity to adaptively regulate its

interactions in relation to those conditions, respectively. The chapter finishes by briefly

indicating how the notions of autonomy and sense-making inform the current version of

the life-mind continuity thesis.

The second step follows in Chapter 4 which provides a critical analysis of what has

already been published about the enactive approach to social cognition. The starting

point is an appraisal of the claim that the defining aspect of social interaction is its

autonomy, i.e. that the interaction process between two or more interacting agents can

itself take on an autonomous organization and thereby effectively organize the behavior

of those interactors to expand (or constrain) their individual domains of interaction. It is

argued that the autonomy of the interaction process is a necessary but not sufficient

condition for social interaction, and that this necessary condition is better captured by

the notion of ‘multi-agent interaction’. Accordingly, a revised definition of social

interaction is offered: it is a type of multi-agent interaction whereby an agent’s action

necessarily requires an appropriate response by another agent for its completion. This

co-regulation of activity opens up specifically social ways of sense-making, i.e. forms

of participatory sense-making, and thereby introduces a qualitative change to the agents’

cognitive domains. However, this type of social interaction is still not sufficient to

account for the specificity of cultural forms of interaction, which depend on pre-existing

practices, and a provisional account of cultural interaction is suggested that takes such

heteronomy into consideration. Each of these transitions in sociality entails constitutive

changes to the structures of agency which originally give rise to them, and thereby lead

to increases in an individual’s behavioral capacity. In this way the enactive approach to

social cognition provides a theoretical opening for a research program that addresses the

cognitive gap of the life-mind continuity thesis by means of a systematic investigation

of the constitutive role of sociality for mind and behavior.

The development of novel definitions for multi-agent systems and social interaction

completes the theoretical part of the thesis. Chapter 5 provides a brief recap of how the

enactive approach to social cognition has responded to methodological individualism,

and it situates this achievement in a wider scientific and historical context. The next part

of the thesis is concerned with experimental evidence.

The aim of Chapter 6 is to show that the enactive approach to social cognition can be

used to provide a fresh interpretation of some important experiments in developmental

and social psychology. The results of these experiments are critically analyzed in order

to reveal the significant problems that are faced by the traditional explanations based on

methodological individualism. These problems serve as a motivation to broaden the

acceptable range of explanations to include a consideration of the constitutive role of

the interaction process. It is argued that otherwise even the integrative explanations in

embodied-embedded cognitive science can be forced to return to the traditional method

of postulating hypothetical neuro-physiological structures to explain the existence of

social phenomena. This chapter also provides the empirical backdrop for some of the

modeling experiments presented in subsequent chapters.

The methodology for these simulation models, namely the development of a mutually

informative relationship between the artificial and empirical sciences, is introduced in

Chapter 7. In particular, the aim is to promote a dialogue between evolutionary robotics

and social psychology. The models are used mainly as tools for thinking that can

challenge established positions and explanations, serve as proofs of concept, and lead to

the generation of novel predictions and hypotheses. In all cases the intention is to

capture relevant phenomena in the most minimalist manner possible such that these

insights do not get lost in unnecessary complexity. The next three chapters of the thesis

present novel modeling experiments.

To begin with, Chapter 8 presents a model of a famous psychological experiment on

infants’ sensitivity to social contingency. The results demonstrate that, contrary to

traditional expectations, it is not necessary to postulate innate cognitive modules in

order to explain this capacity, and that a consideration of the interaction process itself

could provide a more parsimonious explanation of the empirical data. Due to the

minimalism of the model it is also possible to give a detailed dynamical explanation of

the evolved behavior. It is shown that it is the interaction process itself which invests

these agents with the role of interactors, because the mutual interaction perturbs their

structure such that the behavior required for the interaction becomes possible. Isolated

agents have a more limited behavioral domain.

The aim of the model presented in Chapter 9 is to further investigate how the interaction

process itself can organize the behavior of individuals. This is achieved by modeling a

recent psychological experiment that was also specifically designed for this purpose. A

number of modifications to the original experimental design demonstrate the robustness

of the interaction process to organize behaviors even under impaired and unfavorable

conditions. The results of these modifications lead to the generation of novel hypotheses

that are open to verification by future psychological experiments. The simplicity of the

model also allows a detailed dynamical description of the individuals’ behavior and how

this behavior is constituted by the interaction, thereby generating skepticism about

traditional ways of schematizing sub-personal processes. It is shown that even simple

multi-agent interactions can expand individual behavioral domains by a process of co-determination

of agential structures.

Chapter 10 further explores some of the implications of this model by fine-tuning the

experimental design so as to further reduce the potential for agents to rely on individual-based

behavioral strategies. The results of these modifications lead to novel predictions

about what precisely are the essential elements of the experimental setup of the original

psychological study. Moreover, a simple modification to the task requiring coordinated

behavior results in a model of social interaction, as defined in Chapter 4. The results

demonstrate that this particular type of interaction process can further increase the

behavioral repertoire of the agents. The model leads to a novel hypothesis about the

minimal conditions for human participants of the psychological study to experience the

experimental situation as qualitatively social.

In all of these modeling experiments the minimalist approach afforded by evolutionary

robotics is demonstrated as an effective antidote against the widespread assumption of

methodological individualism, especially because it is possible to resolve doubts in a

non-mysterious manner by providing detailed dynamical accounts of the constitutive

role of the interaction process. This completes the experimental contribution of this

thesis. It has been argued that the enactive approach to social cognition can provide a

theoretical framework to close the cognitive gap. These modeling experiments have

demonstrated that this theory can be put into scientific practice and that the results are

amenable to analysis in dynamical terms.

In Chapter 11 this improved dynamical understanding of the constitutive role of

sociality is placed into a wider scientific and historical context. It is argued that a simple

rejection of methodological individualism based on this kind of systems theory alone is

not sufficient to break out of the conventional framework entirely. For that to happen it

is also necessary to complement this work with another defining aspect of enactive

cognitive science, namely experiential considerations. This phenomenological approach

reveals another widespread assumption that has limited mainstream approaches to social

cognition: the idea that the primary function of perception is to process information

about an independent world of abstract physical quantities. We refer to this assumption

as ‘methodological physicalism’. It has led much mainstream research in the field of

social cognition to be concentrated on the ‘problem of other minds’, i.e. the question of

how understanding of others is possible on the basis of perceiving their abstract physical

details alone. This misguided but deeply engrained focus has regrettably come at the

expense of a more phenomenologically plausible research program.

Fortunately, the enactive paradigm has the capacity to provide an effective remedy to

methodological physicalism (and those aspects of methodological individualism that are

derived from it) by appealing to careful phenomenological analyses of our immediate

experience. Thus, Chapter 12 introduces some central insights of the phenomenology of

intersubjectivity. In particular, the consideration of intersubjectivity is motivated by a

critical analysis of our perception of objects. It is argued that the experience of an object

as independent of our current perspective of concern is constitutively dependent on what

Husserl calls ‘open intersubjectivity’, i.e. the potential presence of other perspectives in

the world. This opens up the way for more detailed observations of how other subjects

appear in our experience, and how their presence impacts on how we make sense of the

world. In particular, it is argued that the categories of objectivity and subjectivity are

impossible to appreciate experientially without open intersubjectivity. Accordingly, the

phenomenological perspective enables us to refine the life-mind continuity thesis from a

‘top-down’ perspective, namely by starting from the specificity of human (inter-)

subjectivity. This perspective provides novel insights into the qualitative dimensions of

the continuity thesis that complement the ‘bottom-up’ approach of previous chapters.

The phenomenological return to our immediate experience provides not only a vantage

point from which to question the validity of methodological physicalism, but also makes

it possible for us to take a fresh perspective on some controversial empirical data. This

is the task of Chapter 13, which completes the investigation of sociality by focusing on

some crucial aspects of cumulative cultural development. More specifically, it proposes

to turn the traditional framework in primatology and infant studies on its head by means

of a novel explanation of cumulative cultural development based on the systemic and

phenomenological accounts of sociality of the enactive paradigm. Interestingly, this

perspective reveals a blind spot in the primary literature, which fails to account for the

capacity of humans (and enculturated chimpanzees) to perceive others in terms of their

abstract physical properties, an ability that is necessary for imitative learning (a primary

mechanism of cultural development). Some empirical evidence is presented which, in

combination with the phenomenological insights developed in Chapter 12, points to a

socially mediated origin of this perceptual capacity. Finally, once cumulative cultural

development is underway, it appears that it takes on properties that can be captured by

concepts akin to the basic organizational principles of life.

On the basis of the theoretical, experimental, and phenomenological insights developed

in this thesis it is concluded in Chapter 14 that the life-mind continuity thesis of the

enactive paradigm is indeed a viable working hypothesis for cognitive science. It has

been shown that the organizational principles which can be derived from minimal forms

of life, complemented by phenomenological considerations, can help us to understand in

a unified manner the processes which connect individual agency and simple interaction

processes to human agency and cultural cognition. At the heart of this understanding lie

the complementary notions of biological autonomy and enacted meaning. More work

surely needs to be done, but this thesis has contributed to the beginnings of a research

program that has the potential to provide a unified theory of cognitive science. In fact,

we can expect that this approach will not only impact how we scientifically approach

life, mind and sociality, but also how we perceive ourselves, others and the world.

2 A brief history of cognitive science

Over the last two decades the field of artificial intelligence (AI) has undergone some

significant developments (cf. Anderson 2003; Froese & Ziemke 2009). Good old-fashioned

AI (GOFAI) has faced considerable problems whenever it attempts to extend

its domain beyond simplified ‘toy worlds’ in order to address context-sensitive real-world

problems in a robust and flexible manner (Dreyfus 1981; 1972). A few well-known

examples are the commonsense knowledge problem (Dreyfus 1991, p. 119), the

frame problem (McCarthy & Hayes 1969), and the symbol grounding problem (Harnad

1990). These difficulties motivated the Brooksian revolution toward an embodied and

situated robotics in the early 1990s (Brooks 1991a; 1991b). Since then this approach has

been further developed (e.g. Pfeifer & Scheier 1999; Pfeifer 1996; Brooks 1997), and

has also significantly influenced the emergence of a variety of other successful

methodologies, such as the dynamical approach (e.g. Beer 1995a; 2003), evolutionary

robotics (e.g. Harvey et al. 2005; Nolfi & Floreano 2000; Cliff, et al. 1993), and

organismically-inspired robotics (e.g. Di Paolo 2003; Iizuka & Di Paolo 2007a; 2008;

Wood & Di Paolo 2008). These approaches are united by the claim that cognition is best

understood as embodied and embedded in the sense that it emerges out of the dynamics

of an extended brain-body-world systemic whole.

These developments make it evident that the traditional GOFAI mainstream, with its

emphasis on perception as representation and cognition as computation, is being

challenged by the establishment of an alternative paradigm in the form of embodied-embedded

AI. How is this major shift in AI related to the ongoing paradigm shift within

the cognitive sciences¹? Section 2.1 analyzes the role of AI in the emergence of what

has been called ‘embodied-embedded’ cognitive science (e.g. Clark 1997; Wheeler

2005). Recently, there has also been a noticeable shift in interest toward ‘enactive’

cognitive science (e.g. Thompson 2007; Di Paolo, et al., in press), a paradigm which

radicalizes the embodied-embedded approach by placing autonomous agency and lived

subjectivity at the heart of cognitive science. How the field of AI relates to this further

shift is still in need of clarification. The rest of this chapter takes some initial steps in

this direction by providing a general introduction to the conceptual framework of the

enactive paradigm (Section 2.2). Nevertheless, some methodological worries still

remain (Section 2.3). The brief history of cognitive science concludes with some

remarks about the need for a practice-oriented phenomenology, especially when trying

to promote a more widespread acceptance of the enactive approach (Section 2.4).

¹ Whether any of the major changes in AI or cognitive science are in fact paradigm shifts in the strict

sense introduced by Kuhn (1962) is an interesting open question but beyond the scope of this chapter.

Here the notion is used in the more general sense of a major shift in experimental practice and focus.

2.1 Toward embodied-embedded cognitive science

Much of contemporary cognitive science owes its existence to the founding of the field

of AI in the late 1950s by the likes of Herbert Simon, Marvin Minsky, Allen Newell,

and John McCarthy². These researchers, along with Noam Chomsky, put forth ideas that

were to become the major guidelines for the computational approach which has

dominated the cognitive sciences since its inception (cf. Boden 2006a). In order to

determine the impact of AI on the ongoing shift from such orthodox computationalism

toward embodied-embedded cognitive science, it is necessary to briefly consider some

of the central claims associated with these competing theoretical frameworks.

The paradigm that came into existence with the birth of AI, and which was essentially

identified with cognitive science itself for the ensuing three decades and which still

represents the mainstream today, is known as cognitivism (e.g. Fodor 1975). The

cognitivist claim, that cognition is a form of computation (i.e. information processing

through the manipulation of symbolic representations), is famously articulated in the

‘Physical-Symbol System Hypothesis’ which holds that such a system has the necessary

and sufficient means for general intelligent action (Newell & Simon 1976). From the

cognitivist perspective cognition is essentially a centrally controlled, disembodied, and

decontextualized reasoning and planning algorithm as epitomized by abstract problem

solving. Accordingly, the mind is conceptualized as a digital computer and cognition is

viewed as fundamentally distinct from the embodied action of an autonomous agent that

is situated within the continuous dynamics of its environment.

² The origins of this early symbolic AI, and the computationalist cognitive science that was to be founded

on it, can be traced to the influential cybernetics tradition of the ’40s and ’50s, which is best known for

the work by Wiener, von Neumann and other participants of the Macy conferences (cf. Dupuy 2009). The

enactive paradigm has related roots, though it was influenced more by British cyberneticists such as Pask

and Ashby (cf. Husbands, et al. 2008), as well as by the ‘second-order cybernetics’ of von Foerster and its

further development into Maturana and Varela’s ‘biology of cognition’ (cf. Varela 1996a).

The cognitivist orthodoxy remained unchallenged until connectionism arose in the early

1980s (e.g. McClelland, Rumelhart et al. 1986). The connectionist alternative views

cognition as the emergence of global states in a network of simple components, and

promises to address two practical shortcomings of cognitivism, namely by (i) increasing

efficiency through parallel processing, and (ii) achieving greater robustness through

distributed operations. Moreover, because it makes use of artificial neural networks as a

metaphor for the mind, its theories of cognition are often more biologically plausible.

Nevertheless, connectionism still retains many cognitivist commitments. In particular, it

maintains the idea that cognition is essentially a form of information processing in the

head which converts a set of inputs into an appropriate set of outputs in order to solve a

given problem. In other words, “connectionism’s disagreement with cognitivism was

over the nature of computation and representation (symbolic for cognitivists,

subsymbolic for connectionists)” (Thompson 2007, p. 10), rather than over the notion

of computationalism as such (see also Wheeler 2005, p. 75). Accordingly, most of

connectionism can be regarded as constituting a part of orthodox cognitive science.

Since the early 1990s this computationalist orthodoxy has begun to be challenged by the

emergence of embodied-embedded cognitive science (cf. Clark 1997; Wheeler 2005), a

paradigm which claims that an agent’s embodiment is constitutive of its perceiving,

knowing and doing (e.g. Gallagher 2005; Noë 2004; Varela, et al. 1991; Thompson &

Varela 2001). Furthermore, the computational hypothesis has been confronted by the

dynamical hypothesis that cognitive agents are best understood as dynamical systems

(van Gelder 1998; van Gelder & Port 1995). Thus, while the embodied-embedded

paradigm has retained the connectionist focus on self-organizing dynamic systems, it

further holds that cognition is a situated activity which spans a systemic totality

consisting of an agent’s brain, body, and world (e.g. Beer 2000). In order to assess the

importance of AI for this ongoing shift toward embodied-embedded cognitive science, it

is helpful to first consider the potential impact of theory for this shift alone.

The theoretical premises of orthodox and embodied-embedded cognitive science can

generally be seen as Cartesian and Heideggerian in character, respectively (cf. Wheeler

2005; Dreyfus 2007; Anderson 2003). The traditional Cartesian philosophy accepts the

assumption that any kind of phenomena can be reduced to a combination of more basic

atomic elements which are themselves irreducible. On this view cognition is seen as a

general-purpose reasoning process by which a relevant representation of the world is

assembled through the appropriate manipulation and transformation of basic mental

states. Orthodox cognitive science adopts a similar kind of reductionism in that it

assumes that symbolic (or, in the case of connectionism, sub-symbolic) structures are

the basic representational elements which ground all mental states³, and that cognition is

essentially treated as the appropriate computation of such representations which pick

out facts about the physical world. What are the arguments against such a position?

The Heideggerian critique starts from the phenomenological claim that the world is first

and foremost experienced as a significant whole and that cognition is grounded in the

skilful disposition to respond flexibly and appropriately as demanded by contextual

circumstances. Dreyfus (1991, p. 117) has argued that such a position questions the

validity of the Cartesian approach in two fundamental ways. First, the claim of holism

entails that the isolation of a specific part or element of our experience as an atomic

entity appears as secondary because it already presupposes a background of significance

as the context from which to make the isolation. From this point of view a reductionist

attempt at reconstructing a meaningful whole by combining isolated parts appears

nonsensical since the required atomic elements were created by stripping away exactly

that contextual significance in the first place:

Facts and rules are, by themselves, meaningless. To capture what Heidegger calls

significance or involvement, they must be assigned relevance. But the predicates

that must be added to define relevance are just more meaningless facts. (Dreyfus

1991, p. 118)

³ In contrast to the Cartesian claim that mental stuff is ontologically basic, orthodox cognitive science

holds that these constitutive elements are not basic in any metaphysical sense because they are further

reducible to binary logic. And, even though it is only this domain which ultimately constitutes the mental,

there is no problem of it being realized in a physical system. Nevertheless, this change in position does

not make any difference with regard to Heidegger’s critique.

From the Heideggerian perspective it therefore appears that the Cartesian position is

faced with a problem of infinite regress. Second, if we accept the claim of skills, namely

that cognition is essentially grounded in a kind of skilful know-how or context-sensitive

coping, then the orthodox aim of reducing such behaviour into a formal set of

input/output mappings which specify the manipulation and transformation of basic

mental states appears to be hopelessly misguided.

Judging from these philosophical considerations it seems that the Heideggerian critique

of the Cartesian tradition could have a significant impact on the paradigm shift from

orthodox toward embodied-embedded cognitive science. However, since the two

approaches have distinct underlying constitutive assumptions (e.g. reductionism vs.

holism), there exists no a priori theoretical argument which would force someone

holding a Cartesian position to accept the Heideggerian critique from holism and skills.

Similarly, it is not possible for the Cartesian theorist to prove that worldly significance

can indeed be created through the appropriate manipulation and transformation of

abstract and de-contextualized representational elements. The problem is that, like all

rational arguments, both accounts of cognition are founded on a particular set of

premises which one is at liberty to accept or reject. Thus, even if the development of a

strong philosophical position is most likely a necessary factor in the success of the

embodied-embedded paradigm, it is by itself not sufficient. In other words, there is a

fundamental stalemate in the purely philosophical domain; a shift in constitutive

assumptions cannot be engendered by argumentation alone.

It has often been proposed that this theoretical stalemate has to be resolved in the

empirical domain of the cognitive sciences (e.g. Dreyfus & Dreyfus 1988; Clark 1997,

p. 169; Wheeler 2005, p. 187). The authors of the Physical-Symbol System Hypothesis

(Newell & Simon 1976) and the Dynamical Hypothesis (van Gelder 1998) are also in

agreement that only sustained empirical research can determine whether their respective

hypotheses are viable. Empirical research in AI is thereby awarded the rather privileged

position of being able to help resolve theoretical disputes which have plagued the

Western philosophical tradition for decades if not centuries⁴. This reciprocal

relationship between AI and theory has been captured with the slogan ‘understanding by

building’ (e.g. Pfeifer 1996; Pfeifer & Scheier 1999, p. 299).

In what way has AI research managed to fulfill this role? It can do so negatively, such as

when insurmountable problems appear in practice. Dreyfus (1991, p. 119), for example,

has argued that the Heideggerian philosophy of cognition has been vindicated because

GOFAI faces significant difficulties whenever it attempts to apply its Cartesian

principles to real-world situations which require robust, flexible, and context-sensitive

behavior. In addition, he demonstrates that the Heideggerian arguments from holism

and skills can provide powerful explanations of why this kind of AI has to wrestle with

the frame and commonsense knowledge problems.

But AI can also fulfill this role positively, as when philosophical assumptions lead to the

successful design and implementation of actual systems. Wheeler (2005, p. 188), for

instance, argues compellingly that the growing success of embodied-embedded AI

provides important experimental support for the shift toward a Heideggerian position in

cognitive science. He suggests that Heidegger’s claim that a cognitive agent is best

understood from the perspective of ‘being-in-the-world’ is put to the test by embodied-embedded

AI experiments which investigate cognition as a dynamical process which

emerges out of a brain-body-world systemic whole.

In light of these developments it seems fair to say that AI can have a significant impact

on the ongoing shift from orthodox toward embodied-embedded cognitive science.

However, while embodied-embedded AI has managed to overcome some of the

significant challenges faced by traditional GOFAI, it has also started to encounter some

of its own limitations. Considering the seemingly insurmountable challenge of making the

artificial agents of current embodied-embedded AI behave in a more robust, flexible,

and generally more life-like manner, particularly in the way that more complex living

organisms do, the embodied robotics pioneer Brooks was led to entertain the following

skeptical reflections on the topic:

Perhaps we have all missed some organizing principle of biological systems, or

some general truth about them. Perhaps there is a way of looking at biological

systems which will illuminate an inherent necessity in some aspect of the

interactions of their parts that is completely missing from our artificial systems.

[…] I am suggesting that perhaps at this point we simply do not get it, and that

there is some fundamental change necessary in our thinking in order that we

might build artificial systems that have the levels of intelligence, emotional

interactions, long term stability and autonomy, and general robustness that we

might expect of biological systems. (Brooks 1997, p. 301)

⁴ It is worth noting that there are compelling arguments for claiming that the results generated by AI

research are not ‘empirical’ in the same way as those of the natural sciences, and that this is likely to

weaken their impact outside the field. Nevertheless, it is still the case that AI, just like a good empirical

experiment, can provide valuable tools for re-organizing and probing the internal consistency of a

theoretical position (cf. Di Paolo, et al. 2000). We will return to this issue in Chapter 7.

Has the field of AI managed to find this missing ‘organizing principle of biological

systems’ during the decade of research since Brooks’ pronouncement? Unfortunately,

we do not need to look far to find reasons for continued skepticism.

The existential philosopher Dreyfus, while mostly known in the field of AI for his

scathing criticisms of GOFAI (e.g. Dreyfus 1972), has recently referred to the current

work in embodied-embedded AI as a ‘failure’. He points to the lack of “a model of our

particular way of being embedded and embodied such that what we experience is

significant for us in the particular way that it is. That is, we would have to include in our

program a model of a body very much like ours” (Dreyfus 2007, p. 265). Similarly, Di

Paolo (2003) has argued that embodied-embedded robots, while in many respects an

improvement over traditional GOFAI, can never be truly autonomous. Moreover, the

mere presence of a physical body and a closed sensorimotor loop in such robots does

not fully solve the problem of grounding meaning (cf. Ziemke 1999; 2001). These

problems are even further amplified because, while embodied-embedded AI has focused

on establishing itself as a viable alternative to the traditional computational paradigm,

relatively little effort has been made to connect its practical and experimental work with

theories outside the field of AI, such as with theoretical biology, in order to address

issues of autonomy and embodiment (Ziemke 2007).

It appears that there is a growing awareness in the field of embodied-embedded AI that

something crucial is still missing in the current implementations of cognitive systems,

and that this shortcoming is likely related to their particular manner of embodiment (cf.

Ziemke 2003). But what could this elusive factor be? What is so special about the body

of living systems? In order to answer these questions we need to shift our focus back to

recent developments in the cognitive sciences.

2.2 Further: Toward enactive cognitive science

The enactive paradigm originally emerged as a part of embodied-embedded cognitive

science in the early 1990s with the publication of the influential book The Embodied

Mind (Varela, et al. 1991). It has recently distinguished itself by more explicitly placing

the phenomenon of life at the heart of cognitive science (e.g. Thompson 2007). In order

to determine what is missing in current embodied-embedded AI, we will therefore

consider how such work could contribute to the enactive account. In particular, we are

interested in how it could inform theories of how bodily activity relates to the mind at

three interrelated ‘dimensions of embodiment’: (i) bodily self-regulation, (ii) sensory-motor

coupling, and (iii) intersubjective interaction (cf. Thompson & Varela 2001).

While the development of such fully ‘enactive’ AI is a significant challenge to existing

AI methodologies, it has the potential of providing a fresh perspective on some of the

issues currently faced by the embodied-embedded approach.

(i) Bodily self-regulation. This dimension of embodiment is central to the enactive

paradigm in cognitive science, because its theoretical framework builds on the notion of

biological autonomy (Di Paolo, et al., in press). Since embodied-embedded AI has

always been involved in extensive studies of ‘autonomous systems’ (e.g. Pfeifer &

Scheier 1999), it might seem that such AI research is particularly destined to relate to

the enactive paradigm in a mutually informative manner. Unfortunately, things are not

as straightforward; the enactive account of biological autonomy has a very different

view of what constitutes autonomy when compared to most embodied-embedded AI,

which is why it is sometimes referred to more specifically as constitutive autonomy (cf.

Froese, et al. 2007). Its distinctive approach can be traced to the notion of autopoiesis, a

systems concept which originated in the theoretical biology of the 1970s (e.g. Maturana

& Varela 1980). We will return to this concept in Chapter 3.

In brief, we can say that the enactive paradigm broadly defines an autonomous agent as

a self-producing network of processes which constitutes its own identity; the

paradigmatic example being a minimal living organism (cf. Di Paolo 2009). The

existence of this self-constituted system is necessarily precarious, because it continually

needs to sustain its own identity against the equalizing forces of its environment.

Drawing from the bio-philosophy of Hans Jonas (1968), it is claimed that such an

autonomous system, one whose being is its own doing, should be conceived of as an

individual in its own right. Moreover, as a consequence this process of self-constitution

brings forth, in the same stroke of identity generation, what is outside of this identity,

namely its world (cf. Thompson 2007, p. 153). In other words, it is proposed that the

continuous process of self-construction, which constitutes the autonomous system as a

precarious individual, also furnishes it with a meaningful perspective on its physical

environment. In sum, biological autonomy lies at the basis of sense-making (Weber &

Varela 2002).

It follows from these considerations that today’s robotic AI systems are not autonomous

in the enactive sense. They do not constitute their own identity, and the only ‘identity’

which they can be said to possess is projected onto them by the observing researcher (cf.

Barandiaran, et al. 2009; Froese, et al. 2007). The popular methodology of evolutionary

robotics, for example, presupposes that an ‘individual’ is already defined by the

experimenter as the basis for selection by the evolutionary algorithm. And in the

dynamical approach to AI it is up to the investigator to distinguish which subpart of the

systemic whole actually constitutes the ‘agent’ (Beer 1995a). The enactive notion of

autonomous agency therefore poses a significant difficulty even for current embodied-embedded

AI methodologies (Froese & Ziemke 2009).

Nevertheless, it is worth noting that AI researchers do not have to synthesize actual

living beings in order for their work to provide some relevant insights into the

dimension of bodily self-regulation. To demand otherwise would be to misunderstand the purpose of a good model

(cf. Chapter 7, p. 114). Following the organismic approach first proposed by Di Paolo

(2003; Di Paolo & Iizuka 2008), an initial step would be to investigate artificial systems

with some kind of self-sustaining dynamic structures. In this manner embodied-embedded

AI can move beyond its current focus on closed sensory-motor feedback

loops by implementing systems which have a reciprocal link between internal

organization and external behavior (cf. Iizuka & Di Paolo 2008). Indeed, there are signs

that a shift toward more concern with bodily self-regulation is starting to develop. This

is demonstrated by an increasing interest in homeostasis as a regulatory mechanism for

investigating, for example, sensory inversion (e.g. Di Paolo 2003), the emergence of

sensory-motor coupling and development (e.g. Ikegami & Suzuki 2008; Wood & Di

Paolo 2007), mechanisms of behavioral preference (e.g. Iizuka & Di Paolo 2007a), and

active perception (e.g. Harvey 2004). Of course, looking at the emergence of behavior

from the perspective of modeling chemical self-assembly should be considered as well

(e.g. Egbert & Di Paolo 2009), especially since the notion of autonomy is currently best

understood in the chemical domain (Froese, et al. 2007).

(ii) Sensory-motor coupling. Since sensory-motor embodiment or situatedness is the

research target of most current embodied-embedded AI, its results can have an impact

on the sensory-motor theories of the enactive paradigm. However, since the vast

majority of such work is not concerned with how the constraints of constitutive

autonomy are related to the emergence of sensory-motor behavior, it is not contributing

to the enactive account of how an autonomous agent is able to bring forth its own

relational domain (Froese & Ziemke 2009). To become more relevant in this respect,

the field of system modeling needs to adapt its methodologies so as to deal with the

enactive proposal that an agent’s sense-making is grounded in the active regulation of

ongoing sensory-motor coupling in relation to the viability of a precarious, dynamically

self-sustaining identity. So far this is an area which has been practically unexplored,

although some promising work has begun from the perspective of evolutionary robotics

(e.g. Di Paolo 2003; Iizuka & Di Paolo 2008). Another route that shows potential,

though radically different from the usual evolutionary robotics methodology, is to

follow an incremental approach in simplified artificial chemistries, which has already

been used to model the emergence of autonomous systems that move and can follow

gradients (e.g. Ikegami & Suzuki 2008; Egbert & Di Paolo 2009).

(iii) Intersubjective interaction. The considerations regarding sensory-motor

embodiment can be extended to the domain of intersubjective interaction, since this

dimension of embodiment also involves distinctive forms of sensory-motor coupling

(Thompson & Varela 2001). An enactive account of social understanding based on this

continuity, further discussed in Chapter 3, has recently been outlined by Di Paolo,

Rohde and De Jaegher (in press). They make the important suggestion that the

traditional focus on the embodiment of individual interactors needs to be complemented

by an investigation of the interaction process that takes place between them. This shift

in focus enables them to extend the enactive notion of sense-making into the realm of

social cognition in the form of participatory sense-making (De Jaegher 2006), a shift we

will specify in more detail in Chapter 4 and support in subsequent chapters.

The development of such an account is important for embodied-embedded AI, because

most of its current research remains limited to ‘lower-level’ cognition. Exploring the

domain of social interaction might provide it with the necessary means to tackle the

problem of ‘scalability’ (cf. Clark 1997, p. 101) by bridging the cognitive gap, in

particular because such interaction can constitute new ways of sense-making that are not

available to the individual alone (Froese & Di Paolo, in press; De Jaegher & Froese

2009). The challenge is to implement AI systems that constitute the social domain by

means of an interaction process that is essentially embodied and situated, as opposed to

the traditional means of formalized transmissions of abstract information over pre-specified

communication channels. Di Paolo, Rohde and De Jaegher review some initial

work in this direction which demonstrates that these models have the possibility to

capture the rich dynamics of reciprocity that are left outside of traditional individualistic

approaches. A more detailed review of this methodology is presented in Chapter 7.

There is thus a possibility for modeling work to inform each of these central dimensions

of embodiment. However, it is debatable if AI research should be considered as enactive

rather than merely embodied-embedded if it does not address some form of bodily

self-regulation, or lead to the constitution of autonomy in novel domains of interaction⁵. In

this sense the authors of The Embodied Mind perhaps got slightly carried away when

they referred to the emergence of Brooks’s behaviour-based robotics as a “fully enactive

approach to AI” (Varela, et al. 1991, p. 212). However, this is not to say that embodied-embedded

AI does not have an impact on the shift toward the enactive framework; it

certainly does, but only to the extent that there is an overlap between the paradigms. Its

current influence is therefore by no means as significant as it has been on the shift

toward embodied-embedded cognitive science. For example, Thompson’s recent book

Mind in Life, which can be considered as a successor to The Embodied Mind, does not

even include robotic AI as one of the cognitive science sub-disciplines from which it

draws its insights (cf. Thompson 2007, p. 24). Of course, it goes without saying that all

of these dimensions of embodiment are open to further refinement through artificial

modeling, and that some initial work in this direction has already begun. Nevertheless,

for AI to have a more significant impact on the ongoing shift toward enactive cognitive

science, it must address some considerable methodological challenges (Froese &

Ziemke 2009). The field needs to extend its current preoccupation with sensory-motor

interaction in the behavioral domain to include a concern with the constitutive processes

that give rise to that domain in living systems. Maybe Brooks (1997) was right when he

suggested that in order for AI to be more life-like perhaps there has to be some

fundamental change in our thinking. Fortunately, such a change might be provided by

the development of enactive AI (Froese 2007).

Indeed, at the moment it seems more likely that the influence will run more strongly

from enactive cognitive science to AI instead. Its account of autonomous agency, for

example, has the potential to provide embodied-embedded AI with exactly the kind of

bodily organizational principle that has been identified as missing by Brooks (2001). In

addition, the enactive notion of sense-making, as a biologically grounded account of

how a system must be embodied in order for its encounters to be experienced as

significant, can be used as a response to Dreyfus’s vague requirement of a detailed

description of our body, which in terms of AI apparently has not even “a chance of

being realized in the real world” (Dreyfus 2007, p. 265). Furthermore, there is a good

possibility that the field’s current restriction to ‘lower-level’ cognition could be

overcome in a principled manner by extending its existing research focus on sensory-motor

embodiment to also include participatory sense-making. All of these concepts

will be introduced in more detail in Chapters 3 and 4, and some new AI models of social

interaction will be presented in Chapters 7 to 10.

⁵ In a similar manner it could be argued that since recent work in enactive perception (e.g. Noë 2004) is

more concerned with sensory-motor contingencies than with autonomous agency or lived subjectivity,

such work might be more usefully classified as part of embodied-embedded cognitive science. For a more

in-depth discussion of this issue, cf. Froese and Ziemke (2009), Thompson (2005) and Torrance (2005).

Nevertheless, we can already ask to what extent such modeling work can impact

the current developments in cognitive science. The following section argues that, while

clearly an important aspect, results in AI are not sufficient to displace the orthodox

mainstream on their own. More than just having to make Heideggerian AI more

Heideggerian, as Dreyfus (2007) proposes, Heideggerian cognitive science as a whole

must become more Heideggerian by complementing its methodological focus on AI

with considerations of phenomenology, a shift which coincides with a movement from

embodied-embedded to enactive cognitive science.

2.3 An empirical stalemate

Over two decades ago Dreyfus and Dreyfus (1988) characterized GOFAI as a project in

which the rationalist tradition had finally been put to an empirical test, and it had failed.

Nevertheless, despite this supposed ‘failure’ no alternative has yet succeeded in fully

displacing the orthodox mainstream in AI or cognitive science. While it could be argued

that more progress in embodied-embedded or enactive AI will eventually remedy this

situation, a more serious problem becomes apparent when we consider why this

perceived ‘failure’ did not remove the orthodox framework from the mainstream. As

Wheeler (2005, p. 185) points out, this did not happen for the simple reason that

researchers are always at liberty to interpret practical problems as mere temporary

difficulties which will eventually be eliminated through more scientific research and

additional technological development. Accordingly, Wheeler goes on to conclude that a

resolution of the standoff must await further empirical evidence.

However, while Wheeler’s appeal to more experimental data is applicable when there is

a need to resolve theoretical issues within a particular paradigm, it is not clear whether it

is also valid when deciding between different paradigms: you always already have to

choose (whether explicitly or not) one paradigm over the others from which to interpret

the data. Furthermore, the impact of this choice is significant:

The conceptual framework that we bring to the study of cognition can have

profound empirical consequences on the practice of cognitive science. It

influences the phenomena we choose to study, the questions we ask about these

phenomena, the experiments we perform, and the ways in which we interpret

the results of these experiments. (Beer 2000, p. 91)

Since data is only meaningful in a manner which crucially depends on the underlying

premises of the investigator, the current empirical stalemate in AI appears to be partly

due to a lack of empirical evidence, but also largely due to the fact that the impact of

experimental results fundamentally depends on an interpretive aspect.

Again, this is not to say that experimental evidence has no effect on moving forward a

paradigm shift; of course, it is certainly helpful. Indeed, an important step will be to reinterpret

the existing empirical evidence that has already been accumulated (a strategy

we will pursue in Chapters 6 and 13). However, the point is simply that such evidence is

a necessary but not sufficient condition for a successful paradigm shift. In other words,

in order for experimental data to be turned into scientific knowledge it first has to be

interpreted according to (often implicitly) chosen constitutive assumptions. Moreover,

our premises even ground the manner in which we distinguish between noise and data 6 .

It follows from this that the major cause of the standoff in the philosophical domain also

plays a significant role in the current empirical stalemate: both domains of enquiry

require an interpretative action on the part of the observer. And, more importantly,

6 Consider, for example, the fact that the fossil record shows long periods of stasis interspersed with

layers of rapid phyletic change. Someone who believes that evolution proceeds gradually will treat this

fact as irrelevant noise (e.g. due to accidental differences in preservation), while someone who claims that

it proceeds as punctuated equilibria will view it as supporting evidence (Eldredge & Gould 1972).

while it is possible to influence this act of interpretation through research progress, its

outcome cannot be fully determined by such external events since any kind of

understanding always already presupposes interpretative activity. In addition, the impact

of this potential influence is also limited because the significance of such advances

might not become apparent if one does not already hold the kind of constitutive

assumptions required to understand them appropriately.

From the perspective of enactive cognitive science this constitutive role of interpretation

for scientific activity is hardly surprising (Varela, et al. 1991, pp. 10-12) 7 . In fact, at one

point the enactive approach was actually called “the hermeneutic approach” (Thompson

2007, p. 24), and it can even ground these epistemological reflections in the biology of

autonomy by claiming that a living system always constitutes its own perspective of

value on the world (we will return to this idea in Chapter 3). Nevertheless, these

considerations give a rather bleak outlook for the possibility of actively generating a

successful paradigm shift in the cognitive sciences. At this point it might seem relatively

futile to worry about such abstract problems and better to just get on with the work.

Considering the overall state of affairs this is in many respects a sensible and pragmatic

course of action, and one that is evidently also pursued in this thesis. Nevertheless, in

order to better set the stage for the final chapters in this thesis, especially for Chapters

11 and 12 on phenomenology, it will be useful to paint the bigger picture at least in a

broad outline. For it is still the case that we at least implicitly choose a paradigm for our

research. However, if rational argument combined with empirical data is still not

sufficient to establish this choice, then what is it that determines which premises are

assumed? And how can this elusive factor be influenced? The rest of this chapter

7 The philosophy of science that is associated with the enactive paradigm is typically a combination of the

operational epistemology of Maturana (1988) and the phenomenological ontology of Heidegger (1927)

and the later Husserl (1936). See, for example, the excellent paper by Bitbol (2002). This view of the

scientific method fits nicely with the content of the enactive approach, but whether it is the most

compatible one is still open to debate. To be sure, it clearly differs from the view of science adopted by

the sensory-motor ‘enactive’ approaches that prefer to retain a realist stance (Pascal & O’Regan 2008;

Wheeler 2005). The difference, in essence, is that realism necessitates a role for mental representations.

provides a tentative answer to these questions by focusing on a crucial aspect of

enactive cognitive science that has not been addressed so far.

2.4 A phenomenological resolution

The enactive account of autonomous agency as expressed in terms of systems biology is

complemented by a concern with the first-person point of view, by which is meant the

subjectively lived experience associated with cognitive and mental events (cf. Varela &

Shear 1999). This culmination of the recent developments in the cognitive sciences is

illustrated in Figure 2-1.

[Figure 2-1 labels: Cognitivist (1970s; computational), Connectionist (1980s; dynamic - emergent), Embodied / Embedded (1990s; embodied - embedded), Enactive (today; living - lived).]

Figure 2-1. This schematic summarizes the paradigm shift which is ongoing in cognitive science. There

has been a systematic trend toward more inclusive frameworks which incorporate and ground the

previous insights in a more extended context. With enactive cognitive science we have finally returned to

the point from which all of our investigations must necessarily originate in the first place, namely the

subjectivity of human existence: our lived experience as living beings.

Since the enactive framework incorporates both biological agency (the living body) and

phenomenological subjectivity (the lived body), it has the capacity to recast the

traditional mind-body problem in terms of what has recently been called the ‘body-body

problem’ (Hanna & Thompson 2003). On this view the traditional ‘explanatory gap’

(Levine 1983) between our best explanation and ‘what it is like to be’ that which is to

be explained (Nagel 1974) is no longer absolute since the concepts of subjectively lived

body and objective living body both require the notion of life. Though more work needs

to be done to fully articulate the details, such as the development of an account that

integrates these dual aspects into a coherent conception of the embodied subject, this

reformulation of the ‘hard problem’ of consciousness (Chalmers 1996) can be seen as

one of the major contributions of the enactive paradigm (cf. Torrance 2005).

Nevertheless, it is not yet clear how a concern with subjective experience could provide

us with a way to move beyond the stalemate that we have identified in the previous

sections. Surely the enactive approach is just more philosophical theory? However, to

say this is to miss the point that it derives many of its crucial insights from a source that

is quite distinct from standard theoretical or empirical enquiry, namely from careful

phenomenological observations that have been gained through the principled

investigation of the structure of our lived experience (see Ch. 2 in Thompson 2007 for

an overview; for an introduction, cf. Gallagher & Zahavi 2008). But what about the

insights from which Heidegger originally deduced his claims? If his analysis of the

holistic structure of our Dasein or ‘being-in-the-world’ (Heidegger 1927) is one of the

most influential accounts of the continental phenomenological tradition, then why did it

not succeed in convincing mainstream cognitive scientists? The regrettable answer is

that while his claims have sometimes been probed in the philosophical or empirical

domain, there have not been many sustained and principled efforts in orthodox

cognitive science to verify their validity in the phenomenological domain.

If the enactive paradigm is to avoid a similar fate then it needs to focus less on the

development of better, more enactive AI (an aim which will, to a large extent, already

be pursued by embodied-embedded cognitive science), and more on the promotion of

principled first-person phenomenological studies. Indeed, according to Di Paolo, Rohde

and De Jaegher (in press) the central importance of experience is perhaps one of the

most revolutionary implications of the enactive approach, especially since a

phenomenologically informed science goes beyond black marks on paper and

experimental procedures for measuring data, and dives straight into the realm of

personal experience. They point out, for example, that no amount of rational argument

will convince a reader of Jonas’s claim that, as an embodied organism, he is concerned

with his own existence if the reader cannot see this for himself. Thus, development of

enactive cognitive science implicates an element of personal en-action.

Accordingly, Varela and Shear (1999) outline the beginnings of a project where neither

experience nor external mechanism have the final word, but rather stand to each other in

a relationship of generative mutual constraints. They point out that the process of

collecting phenomenological data requires disciplined training in the skilful exploration

of one’s lived experience. Such an endeavor to raise awareness might already be

worthwhile in itself, but in the context of the stalemate in the cognitive sciences it

comes with an added benefit. To be sure, it is still the case that phenomenological data

first has to be interpreted from a particular point of view before it can be integrated into

a conceptual framework. But, in a nutshell, generating such data also requires a change

in our mode of experiencing. Moreover, this change in our experiential attitude is

constituted by a change in our mode of being, and this in turn entails a change in our

understanding (cf. Varela 1976). It is not primarily a matter of theoretical knowledge, or

of deriving facts. Rather, it is this being, the structure of our everyday existence, which

determines how we interpret our world. Of course, since we are autonomous agents this

does not mean that actively practicing phenomenological inquiry necessarily commits

us to an enactive approach. But perhaps by changing our awareness in this manner we

will be able to understand more fully the reasons, other than in terms of theory and

empirical data, which are at the root of why we prefer one paradigm over another.

2.5 Summary

The field of AI has had a significant impact on the ongoing shift from orthodox toward

embodied-embedded cognitive science, especially because work in AI has made it

possible for philosophical disputes to be addressed in an experimental manner.

Conversely, enactive cognitive science can have a strong influence on AI because of its

biologically and phenomenologically grounded account of autonomous agency, sense-making,

and social interaction. The development of such enactive AI, while challenging

to current methodologies, has the potential to address some of the problems currently

impeding significant progress in embodied-embedded AI.

However, if this alternative paradigm is to be successful in actually displacing the

orthodox mainstream, then it is likely that theoretical arguments and empirical evidence

alone are necessary but not sufficient. For this shift to happen it will additionally be

necessary that a phenomenological pragmatics is established as part of the accepted

methodological toolbox of contemporary cognitive science (cf. Depraz, et al. 2003).

This shift of focus from AI to phenomenology coincides with a shift from embodied-embedded

to enactive cognitive science. Unfortunately, however, most of our current

cognitive science institutions are not concerned with supporting first-person

phenomenological inquiry in any principled manner. One promising opportunity for

change is the increasing interest in sensory augmentation technology (cf. Froese &

Spiers 2007). It will be one of the major challenges faced by those wanting to make the

enactive approach accepted as part of mainstream science to devise appropriate ways of

overcoming this impasse. In this context, early AI practitioner Terry Winograd’s

decision to turn toward teaching Heidegger in computer science courses at Stanford,

after he became disillusioned with his pioneering work in symbolic language parsing

(Winograd 1972), appears in a new light (Dreyfus 1991, p. 119) 8 . In return, this shift in

understanding also had a profound impact on his work in AI, which prefigured some of

the concerns in embodied-embedded cognitive science (cf. Winograd & Flores 1986).

The rest of this thesis will unfold according to the pattern established in this chapter. In

the first part the theoretical framework of the enactive paradigm is presented in more

detail (Chapter 3), with a special focus on its approach to sociality (Chapter 4). Some

empirical evidence on social interaction will then be re-interpreted from this theoretical

perspective (Chapters 5 and 6). In the second part of the thesis this perspective is further

supported by means of an integrative evolutionary robotics methodology, which is used

to synthesize a series of agent-based models that investigate the dynamics of social

8 Note that it was Winograd’s practical frustration with AI design that motivated his phenomenological

and embodied-embedded turn. We will consider this kind of pedagogical value of engaging in AI-based

research more fully in Chapter 7.

interaction (Chapters 7 to 10). The results of these models support the enactive approach

to social cognition. Still, they say almost nothing about what it is like for someone to be

involved in social situations. Accordingly, in the final part of the thesis this modeling

approach is complemented by a phenomenological analysis of how our experience is

modulated by the presence of others (Chapters 11 and 12). The thesis finishes with a

brief look at potential future work in related fields (Chapter 13), and a summary of what

has been done (Chapter 14).

3 Enactive cognitive science

The aim of this chapter is to unpack the biological foundations of enactive cognitive

science in more detail. First, the focus on basic biological principles is motivated by a

closer consideration of the life-mind continuity thesis, which forms the theoretical

backbone of enactive cognitive science. On this basis the fundamental notions of

autopoiesis, organizational closure, and constitutive autonomy are introduced, followed

by a consideration of the notion of sense-making and its necessary dependence on

adaptivity and constitutive autonomy. Finally, all of these notions are combined in order

to indicate the broader framework of enactive cognitive science.

3.1 The life-mind continuity thesis

A radical element of the recent embodied turn in cognitive science has become known

as the life-mind continuity thesis (LMCT). The LMCT has been proposed in a wide

variety of formulations (e.g. Di Paolo 2003; Godfrey-Smith 1996; Wheeler 1997;

Stewart, in press; 1996; Maturana & Varela 1987). Most of these essentially revolve

around what has been called ‘strong’ or ‘deep’ continuity, i.e. that the phenomena of life

and mind have a common set of basic organizational properties:

In more concrete terms, the thesis of strong continuity would be true if, for

example, the basic concepts needed to understand the organization of life turned

out to be self-organization, collective dynamics, circular causal processes,

autopoiesis, etc., and if those very same concepts and constructs turned out to be

central to a proper scientific understanding of mind. (Clark 2001, p. 118)

This version of the LMCT is especially attractive for embodied, dynamical approaches

to cognitive science for obvious reasons: for if the thesis turns out to be correct, then the

applicability of these approaches is not limited to mere low-level, ‘implementation’

details of adaptive behavior. Instead, they would actually be providing the very

foundations of a general theory of mind and cognition, one that would also include the

highest reaches of human cognition (cf. Clark 2001, pp. 128-130).

The most comprehensive framework based on the LMCT is currently being developed

by enactive cognitive science (e.g. Di Paolo, et al., in press; Thompson 2007; 2004;

Barandiaran & Moreno 2006; 2008; Froese 2009). The enactive approach is just as

interested in the single-cell organism, as the paradigmatic case of individual agency, as

it is in human existence, as the paradigmatic case of enculturation. It is important to

clarify from the start that this version of the LMCT does not involve a reductive form of

continuity, whereby ‘higher-level’ phenomena would be reduced to ‘lower-level’ ones.

The notion of autonomy, which is applicable to novel phenomena in each of the major

transitions of life, guards against such trivialization. In other words, the enactive

paradigm proposes a view of life-mind continuity where that continuity is more like an

open-ended set of autonomous domains of dynamics that are partially decoupled and

constitutively interrelated by multiple interdependencies (cf. Di Paolo 2009). To use an

example that will be discussed at length in this thesis, we can note that a process of

social interaction is enabled and constrained by the behavior of autonomous individuals,

but that the behavioral capacity of these interacting individuals is simultaneously

enabled and constrained by the dynamics of the autonomous interaction process (e.g.

Froese & Di Paolo 2008). The characterization of this kind of interdependency as a

form of continuity is justified by the fact that the same conceptual framework is applied

at all levels, in this case both for the description of behavioral and social dynamics.

Moreover, as Thompson (2007, p. 129) points out, the enactive approach goes further

than other life-mind continuity theories by following Hans Jonas’ phenomenological

claim that certain basic experiential categories that are needed to understand human

experience turn out to be applicable to life itself:

The great contradictions which man discovers in himself – freedom and

necessity, autonomy and mortality – have their rudimentary traces in even the

most primitive forms of life, each precariously balanced between being and not-being,

and each already endowed with an internal horizon of ‘transcendence’.

(Jonas 1966, p. ix)

In other words, the LMCT is not only based on an organizational (or behavioral)

continuity, but also on a corresponding phenomenological continuity. In this manner our

understanding of the phenomenon of life can be seen to comprise biology and cognitive

science, as well as the philosophy of the organism and philosophy of mind. To be sure,

the development of such a radical LMCT is not without its problems:

The danger, of course, is that by stressing unity and similarity we may lose sight

of what is special and distinctive. Mind may indeed participate in many of the

dynamic processes of life. But what about our old friends, the fundamentally

reason-based transitions and the grasp of the absent and the abstract characteristic of

advanced cognition? (Clark 2001, pp. 118-119)

We can unpack this concern into two related but distinct aspects, namely the problem of

agency and scalability. Thus, on the one hand, there is the morally and scientifically

motivated worry that the LMCT “threatens to eliminate the idea of purposive agency

unless it is combined with some recognition of the special way goals and knowledge

figure in the origination of some of our bodily motions” (Clark 2001, p. 135). In

response to this concern it is important to emphasize that the enactive approach is

acutely aware of the problem of agency, and most of its efforts are directed toward

gaining a better understanding of this phenomenon (cf. Di Paolo 2009; Moreno &

Etxeberria 2005). Indeed, the very turn toward the LMCT is largely motivated by a

perceived lack of any coherent notion of agency in current cognitive science, and the

possibility that a closer examination of biological autonomy can fill this gap.

However, there still remains another problem that is closely associated with the LMCT:

“What, in general, is the relation between the strategies used to solve basic problems of

perception and action and those used to solve more abstract or higher level problems?”

(Clark 2001, p. 135). Is it a question of mere complexity, of just having more of the

same kind of organizations and mechanisms? Then why is it seemingly impossible to

properly address the hallmarks of human cognition with these basic principles? In a

recent paper, De Jaegher and Froese (2009) have referred to this missing link as the

‘cognitive gap’ of the LMCT. They propose that this gap is a symptom of the still

prevalent methodological individualism of cognitive science (cf. Boden 2006b), i.e. an

exclusive focus on individual agency, and that it can be addressed by taking the role of

sociality into account.

To be sure, a related response has been developed by ‘extended mind’ theorists such as

Clark, who proposes “to depict much of advanced cognition as rooted in the operation

of the same basic kinds of capacity used for on-line, adaptive response, but tuned and

applied to the special domain of external and/or artificial cognitive aids” (2001, p. 141).

However, these efforts have largely focused on the role of language (e.g. Clark 2008,

pp. 44-60) and technology (e.g. Clark 2003), thereby relating specifically human

cognition with specifically human abilities and their cultural context. Thus, while this

consideration of ‘cognitive technology’ helps to spread the explanatory burden outside

of the individual human agent, and thereby indeed makes basic embodied-embedded

accounts more plausible, it still leaves the main cognitive gap of the LMCT largely

unaddressed.

It is certainly crucial to adopt an externalist view of cognition as a first step to make the

LMCT plausible. But this entails nothing more than a commitment to the hypothesis of

embodied-embedded cognitive science that cognition emerges out of the dynamics of a

brain-body-world systemic whole (e.g. Beer 2000). What is additionally needed is a

non-species-specific operational mechanism to account for the transformative potential

of such cognitive extension. To be sure, the desirability of a more encompassing

account is not denied by extended mind theorists. Clark (2005), for example, suggests

the sound-amplifying burrow of the mole cricket as a loose analogy to the cognition-transforming

symbols found in human culture. But it is important to note that the

chirping cricket in its burrow is passively interacting with a static physical structure.

Moreover, this example completely ignores the fact that human symbols only exist

within a social context.

Accordingly, De Jaegher and Froese (2009) would agree with Clark that “interactive

complexity characterizes almost all forms of advanced human cognitive endeavor”

(Clark 2001, p. 154), but they argue that such interactive complexity is already

prefigured in the interactive and social co-constitution of more basic cognitive domains.

Even simple interactions between agents can give rise to an interaction process

characterized by autonomous dynamics that self-sustain by modulating the behavior of

the interactors. Here we have an example of cognitive extension that involves active

coordination, dynamic emerging structures, and an interactive social context that

removes the need for static physical structures. In the enactive approach to social

cognition this transformative potential of the interaction process is the basis of what has

been called ‘participatory sense-making’ (De Jaegher 2006). In order to understand this

notion properly we will have to introduce the basic conceptual framework of enactive

cognitive science in more detail, beginning with its origins in the autopoietic tradition.

3.2 Constitutive autonomy is necessary for intrinsic teleology

The notion of autopoiesis (from Greek: self-producing) as the minimal organization of

the living first originated in the work of the Chilean biologists Maturana and Varela in

the 1970s (e.g. Maturana & Varela 1980; for a more accessible introduction, cf.

Maturana & Varela 1987). While the concept was developed in the context of

theoretical biology, it was right from its inception also associated with computer

simulations (Varela, et al. 1974) long before the term ‘artificial life’ was first introduced

in the late 1980s by Langton (1989). Nowadays the concept of autopoiesis continues to

have a significant impact on the field of artificial life in both the computational and

chemical domain (see McMullin (2004) and Luisi (2003), respectively, for overviews of

these two kinds of approaches). Moreover, there have been recent efforts to integrate the

notion of autopoiesis more tightly into the overall framework of enactive cognitive

science (e.g. Weber & Varela 2002; Thompson 2007; 2005; Di Paolo 2005; 2009;

McGann 2007; Colombetti, in press). The reasons for this ongoing integration will be

clarified in this chapter.

What precisely is autopoiesis? Since the notion of autopoiesis was first

coined in 1971 9 its exact definition has slowly evolved in the works of both Maturana

and Varela (cf. Thompson 2007, pp. 99-101; Bourgine & Stewart 2004). For the

purposes of this thesis we will use a definition that has been used extensively by Varela

in a series of publications throughout the 1990s (e.g. Varela 1991; 1992; 1997), but

which has also been used as the definition of choice in more recent work (e.g. Weber &

9 See Varela (1996a) and Maturana (2002) for more detailed accounts of the historical circumstances

under which the notion of autopoiesis was first conceived and developed.

Varela 2002; Di Paolo 2003; 2005; Froese, et al. 2007). This more-or-less standard

definition states:

An autopoietic system – the minimal living organization – is one that

continuously produces the components that specify it, while at the same time

realizing it (the system) as a concrete unity in space and time, which makes the

network of production of components possible. More precisely defined: an

autopoietic system is organized (defined as a unity) as a network of processes of

production (synthesis and destruction) of components such that these components:

1. continuously regenerate and realize the network that produces them, and

2. constitute the system as a distinguishable unity in the domain in which they

exist.

(Varela 1997, p. 75)

In addition to these two explicit criteria for autopoiesis we can add another important

point, namely that the self-constitution of an identity entails the constitution of a

relational domain between the system and its environment. The shape of this domain is

not pre-given but rather co-determined by the organization of the system, as it is

produced by that system, and its environment. Accordingly, any system which fulfils

the criteria for autopoiesis also generates its own domain of possible interactions in the

same movement that gives rise to its emergent identity (Thompson 2007, p. 44).
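
As a purely illustrative aside, the two criteria of the definition above can be given a toy operational reading. The following Python sketch is not taken from Varela's work or from the models presented later in this thesis. The names Process and is_self_producing, as well as the example components (an enzyme, a lipid, a nutrient), are hypothetical and chosen only for illustration. The sketch checks a much weaker condition than autopoiesis proper: that every component of a system is produced by some process of the network, and that every process can run on components of the system or on resources taken up from the environment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """A hypothetical production process: consumes inputs, yields outputs."""
    name: str
    inputs: frozenset
    outputs: frozenset


def is_self_producing(components, processes, environment):
    """Toy reading of criterion 1 (an assumption, not Varela's formalism).

    Returns True if (a) every process only needs components of the system
    or resources from the environment, and (b) every component of the
    system is an output of at least one process in the network.
    """
    components = set(components)
    available = components | set(environment)
    produced = set()
    for process in processes:
        if not process.inputs <= available:
            return False  # the process depends on something never supplied
        produced |= process.outputs
    return components <= produced


# Illustrative two-process network loosely inspired by the minimal cell:
# the metabolism yields the enzyme, and the enzyme is needed to build the
# lipid that forms the boundary (a crude stand-in for criterion 2).
components = {"enzyme", "lipid"}
environment = {"nutrient"}
processes = [
    Process("metabolism", frozenset({"nutrient", "lipid"}), frozenset({"enzyme"})),
    Process("membrane_synthesis", frozenset({"enzyme", "nutrient"}), frozenset({"lipid"})),
]

print(is_self_producing(components, processes, environment))
print(is_self_producing(components, processes[:1], environment))
```

In this toy network the check succeeds only while both processes are present. Removing the hypothetical membrane_synthesis process leaves the lipid unproduced, so the network no longer regenerates all of its own components. Nothing in the sketch captures the relational domain discussed above, which is one reason why it should be read as an illustration rather than as a model.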

Considering that current embodied AI fails to fully capture what is needed for life-like,

intentional agency (cf. Chapter 2), it is interesting to note that the autopoietic tradition

has been explicitly referred to by Varela (1992) as a ‘biology of intentionality’. In other

words, for enactive cognitive science the phenomenon of autopoiesis not only captures

the basic mode of identity of the living, but is moreover at the root of how living beings

enact their world of significance. Thus, the notion of autopoiesis in many ways

continues a particular philosophy of the organism, such as Kant’s, von Uexküll’s, and

Jonas’ intuitions regarding the organization of the living (cf. Froese & Ziemke 2009, pp.

476-479). However, as a more recent development, it has the added advantage that it

formalizes these intuitions in a systemic, operational manner.

The term ‘operational’ denotes that the autopoietic definition of life can be used to

distinguish living from non-living entities on the basis of a concrete instance and

without recourse to wider contextual (e.g. functional, historical) considerations.

Autopoiesis can be considered as a response to the question of how we can determine

whether or not a system is a living being on the basis of what kind of system it is rather

than on how it behaves or where it came from. As such it can be contrasted with

functional (e.g. Nagel 1977) or historical (e.g. Millikan 1989) approaches to teleology.

Kant (1790) already speculated that since a living system is characterized by a form of self-organizing

reciprocal causality, it follows that all relations of cause and effect in the system

are also at the same time relations of means and purpose. More importantly, this reciprocal

causality entails that such a natural purpose then, as an interrelated totality of means and

goals, is strictly intrinsic to the organism (Weber & Varela 2002). Kant’s philosophy thus

provides the beginning of a theory of the self-producing organization of life, which attempts

to capture the observation that organisms generate their own goals. In other words, a living

system, as an autopoietic system, is both cause and effect of itself, and therefore also of the

feedback systems underlying its goal-directed behavior. This intrinsic generation of goals is

generally lacking in current AI systems (cf. Haselager 2005). Embodied-embedded AI made

an advance when it included its systems within a sensory-motor loop (e.g. Cliff 1991), but

these systems are nevertheless lacking the kind of intrinsic teleology that is characteristic of

biological systems.

The paradigmatic instance of an autopoietic system is a minimal, living cell (Varela, et

al. 1974), which is often cited as an illustration of the circularity that is inherent in

metabolic self-production. In the case of the cell this circularity is expressed in the co-dependency

between the (boundary) semi-permeable membrane and the (internal)

metabolic network. The metabolic network constructs itself as well as the membrane,

and thereby distinguishes itself as a unified system from the (external) environment. In

turn, the membrane boundary makes the metabolism possible by preventing the network

from fatally diffusing into the environment.

While there are cases in the literature where multi-cellular organisms are also classed as

autopoietic systems in their own right, this is an issue that is far from trivial and still

remains controversial (cf. Thompson 2007, pp. 105-107). For instance, would there be a

difference whether the multi-cellular organism is distinguished on the level of chemical

processes, cells, or organs? Whatever the case, we intuitively want to say that such

organisms meet the requirements for autonomy. A multi-cellular organism might be

different from an autopoietic minimal entity in its mode of identity, but it is also

essentially similar at an abstract level of organization: its activity demarcates it as an

entity from its environment (Varela 1991).

In the late 1970s Varela became dissatisfied with the way that the concept of autopoiesis

was starting to be applied loosely to other systems, with its use even extended to non-material

systems such as social institutions. He complained that such characterizations

“confuse autopoiesis with autonomy” (Varela 1979, p. 55). Nevertheless, there was still

a need to make the explanatory power offered by the systemic approach to autonomy

available for use in other contexts than the molecular domain. Thus, while autopoiesis is

a form of autonomy in the biochemical domain, “to qualify as autonomy, however, a

system does not have to be autopoietic in the strict sense (a self-producing bounded

molecular system)” (Thompson 2007, p. 44).

Accordingly, Varela put forward the notion of organizational closure 10 by taking “the

lessons offered by the autonomy of living systems and convert them into an operational

characterization of autonomy in general, living or otherwise” (Varela 1979, p. 55):

We shall say that autonomous systems are organizationally closed. That is, their

organization is characterized by processes such that

1. the processes are related as a network, so that they recursively depend on

each other in the generation and realization of the processes themselves, and

10 In recent literature the term organizational closure is often used more or less interchangeably with the

notion of operational closure. However, the latter seems better suited to describe any system which has

been distinguished in a certain epistemological manner by an external observer, namely so as not to view

the system under study as characterized by inputs/outputs, but rather as a self-contained system which is

parametrically coupled to its environment. On this view, an organizationally closed system is a special

kind of system, namely one which is characterized by some form of self-production or identity-generation

when it is appropriately distinguished by an external observer in an operationally closed manner.

2. they constitute the system as a unity recognizable in the space (domain) in

which the processes exist.

(Varela 1979, p. 55)
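The first condition of this definition can be given a rough computational reading. The following is a minimal sketch, assuming that "recursive dependence" may be approximated by mutual reachability in a directed graph of processes. The function names and the two example networks are hypothetical illustrations and are not taken from Varela. The sketch says nothing about the second condition (the constitution of a unity in a domain) or about precariousness.

```python
from typing import Dict, Set


def reachable(enables: Dict[str, Set[str]], start: str) -> Set[str]:
    """Processes that `start` helps to generate, directly or indirectly."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in enables.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_recursively_dependent(enables: Dict[str, Set[str]]) -> bool:
    """Assumed reading of condition 1: every process contributes to the
    generation of every process in the network, including itself."""
    processes = set(enables) | {p for targets in enables.values() for p in targets}
    if not processes:
        return False
    return all(reachable(enables, p) == processes for p in processes)


# Hypothetical minimal cell: metabolism regenerates itself and the membrane,
# while the membrane keeps the metabolism from diffusing away.
cell = {"metabolism": {"membrane", "metabolism"}, "membrane": {"metabolism"}}

# Hypothetical input/output chain: no process feeds back into its own conditions.
pipeline = {"sensor": {"controller"}, "controller": {"motor"}}

print(is_recursively_dependent(cell))      # True
print(is_recursively_dependent(pipeline))  # False
```

The graph test is only a necessary-looking structural check under the stated assumption. A network can pass it without being organizationally closed in the full sense intended by the definition.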

This definition of autonomy applies to multi-cellular organisms (Maturana & Varela

1987, pp. 88-89), but moreover to a whole host of other systems such as the immune

system, the nervous system, and even to social systems (Varela 1991). Maturana and

Varela (1987) introduced a couple of simple ideograms to denote systems which are

characterized by organizational closure (Figure 3-1):

Figure 3-1. Maturana and Varela’s ideograms for autonomous systems, namely those systems which can

be characterized by organizational closure. The ideogram on the left depicts a basic autonomous system:

the closed arrow circle indicates the system with organizational closure, the rippled line its environment,

and the bidirectional half-arrows the ongoing structural coupling between the two. The ideogram on the

right extends this basic picture by introducing another organizational closure within the autonomous

system, which could be the nervous system, for example.

We will refer to the autonomy entailed by organizational closure as constitutive

autonomy in order to demarcate it from the concept’s more general usage (cf. Froese, et

al. 2007). Since it does not specify the particular domain of the autonomous system, it is

also to some extent more amenable to the sciences of the artificial, though some

fundamental problems remain (cf. Froese & Di Paolo 2008b). For a more detailed

description of how the notions of emergence through self-organization, constitutive

autonomy, and autopoiesis relate to each other, see Froese and Ziemke (2009),

especially Appendix C (p. 497).

In summary, when we are referring to an autonomous system we denote a system

composed of several processes that actively generate and sustain their systemic identity

under precarious conditions (cf. Di Paolo & Iizuka 2008). The precariousness of the

identity is explicitly mentioned in order to emphasize that the system’s identity is

actively constituted by the system under conditions which tend toward its disintegration,

and which is therefore constantly under threat of ceasing to exist. Accordingly, this

working definition of constitutive autonomy captures the essential insights of both the

situation of the organism as described in the philosophical biology tradition, as well as

the operational definitions provided by the autopoietic tradition. Both of these traditions

converge on the claim that it is this self-constitution of an identity, an identity that could

at each moment become something different or disappear altogether, which grounds our

understanding of intrinsic teleology (Weber & Varela 2002). Living systems are not just

goal-directed because they are feedback systems; they are also the source of those goals

because they are autonomous systems. These considerations allow us to state the first

core claim of the enactive paradigm as a systemic requirement (SR-1): autonomy is

necessary and sufficient for intrinsic teleology 11 .

3.3 Adaptivity is necessary for sense-making

In contrast to the agents of embodied AI whose identity and domain of interactions are

externally defined, constitutively autonomous systems (SR-1) bring forth their own

identity and domain of interactions, and thereby constitute their own ‘problems to be

solved’ according to their particular affordances for action (SR-2). Such autonomous

systems and their worlds stand in relation to each other through mutual specification or

co-determination (Varela 1992). In other words, there is a mutual dependence between

the intentional agent (which must exist in some world) and its world (which can only be

encountered by such an agent): in addition to self-production, there is thus another

fundamental circularity at the core of intentionality (cf. McGann 2007).

Furthermore, what an autonomous system does, due to its precarious mode of identity,

is to treat the perturbations it encounters from a perspective of significance which is not

11 Froese and Ziemke (2009) give a weaker version of SR-1, claiming that autonomy is merely necessary

for intrinsic teleology. However, since any autonomous system always operates according to at least one

internally defined goal, namely self-production, it is more accurate to claim sufficiency as well.

intrinsic to the encounters themselves. In other words, the meaning of an encounter is

not determined by that encounter. Instead it is evaluated in relation to the ongoing

maintenance of the self-constituted identity, and thereby acquires a meaning which is

relative to the current situation of the agent and its needs. This process of meaning

generation in relation to the perspective of the agent is what is meant by the notion of

sense-making (Weber & Varela 2002). Translating this concept into von Uexküll’s

(1934) terms we could say that sense-making is the ongoing process of active

constitution of an Umwelt for the organism.

It is important to note that the significance which is continuously brought forth by the

endogenous activity of the autonomous agent is what makes the world, as it appears

from the perspective of that agent, distinct from the physical environment of the

autonomous system, as it is distinguished by an external observer (Varela 1997). Sense-making

is the enaction of a meaningful world for the autonomous agent.

Note that the enactive account of autonomy and sense-making entails that meaning is

not to be found in the elements belonging to the environment or in the internal dynamics

of the agent alone. Instead, meaning is an aspect of the relational domain established

between the two (Di Paolo, et al. in press). It depends on the specific mode of co-determination

that an autonomous system realizes with its specific environment, and

accordingly different modes of structural coupling will give rise to different meanings

(Colombetti, in press). However, it is also important to note that the claim that meaning

is grounded in such relations does not entail that meaning can be reduced to those

relational phenomena. There is an asymmetry underlying the relational domain of an

autonomous system since the very existence of that domain is continuously enacted by

the endogenous activity of that system. In contrast to most embodied AI, where the

relational domain exists no matter what the system is or does, the relational domain of a

living system is not pre-given but depends on precarious processes of self-production. It

follows from this that any model that only captures the relational dynamics on their

own, as is the case with most work on sensory-motor situatedness, will only be able to

capture the functional aspects of the behavior. A functional model will not reproduce

the intrinsic meaning such behavior would have for an autonomous system whose

existence is constitutively linked with its relational domain.

In order for these considerations to be of more specific use for the development of better

models of natural cognition, we need to unpack the notion of sense-making in more

detail. Essentially, it requires that the perturbations which an autonomous agent

encounters through its ongoing interactions must somehow acquire a valence that is

related to the agent’s viability. Varela (1992) has argued that the source of this world-making

is always the breakdowns in autopoiesis. However, the concept of autopoiesis

(or constitutive autonomy more generally) by itself allows no gradation – either a

system belongs to the class of such systems or it does not. The self-constitution of an

identity can thus provide us only with the most basic kind of norm, namely that all

events are good for that identity as long as they do not destroy it (and the latter events

do not carry any significance because there will be no more identity to which they could

even be related). On this basis alone there is no room for accounting for the different

shades of meaning which are constitutive of an organism’s Umwelt. Furthermore, the

operational definitions of autopoiesis and constitutive autonomy neither require that

such a system can actively compensate for deleterious internal or external events, nor

address the possibility that it can spontaneously improve its current situation. What is

missing from these definitions? How can we extend the meaningful perspective that is

engendered by constitutive autonomy into a wider context of relevance?

Di Paolo (2005) has recently proposed a resolution of this problem. He starts from the

observation that minimal autopoietic systems have a certain kind of tolerance or

robustness: they can sustain a certain range of perturbations as well as a certain range of

internal structural changes before they lose their autopoiesis, where these ranges are

defined by the organization and current state of the system. We can then define these

ranges of non-fatal events as an autonomous system’s viability set, which is “assumed to

be of finite measure, bounded, and possibly time-varying” (Di Paolo 2005, p. 438).

However, in order for an autopoietic system to actively improve its current situation, it

must (i) be capable of determining how the ongoing structural changes are shaping its

trajectory within its viability set, and (ii) have the capacity to regulate the conditions of

this trajectory appropriately. These two criteria are provided by the property of

adaptivity, for which Di Paolo (2005) provides the following definition:

A system’s capacity, in some circumstances, to regulate its states and its relation

to the environment with the result that, if the states are sufficiently close to the

boundary of viability,

1. Tendencies are distinguished and acted upon depending on whether the

states will approach or recede from the boundary and, as a consequence,

2. Tendencies of the first kind are moved closer to or transformed into

tendencies of the second and so future states are prevented from reaching the

boundary with an outward velocity.

(Di Paolo 2005, p. 438)
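The difference between mere robustness and adaptivity in this sense can be illustrated with a toy model. The following is a minimal sketch, assuming a one-dimensional state, a fixed viability interval, and a constant environmental drift. All names and parameter values are hypothetical and are chosen only for illustration. Two simplifications should be kept in mind: the regulator reads the state directly, and the viability bounds are stipulated by the programmer rather than constituted by the system itself, so the model does not address SR-1.

```python
from dataclasses import dataclass


@dataclass
class ViabilitySet:
    """Assumed toy viability set: an open interval of non-fatal states."""
    lower: float
    upper: float

    def contains(self, x: float) -> bool:
        return self.lower < x < self.upper

    def distance_to_boundary(self, x: float) -> float:
        return min(x - self.lower, self.upper - x)


def approaches_boundary(v: ViabilitySet, x: float, velocity: float) -> bool:
    """Distinguish tendencies: does this velocity bring the state closer to the boundary?"""
    return v.distance_to_boundary(x + velocity) < v.distance_to_boundary(x)


def stays_viable(adaptive: bool, steps: int = 50) -> bool:
    """Run the toy system and report whether it remained inside its viability set."""
    viability = ViabilitySet(lower=0.0, upper=1.0)
    x, drift, margin = 0.5, 0.05, 0.2
    for _ in range(steps):
        velocity = drift  # unregulated tendency imposed by the environment
        near = viability.distance_to_boundary(x) < margin
        if adaptive and near and approaches_boundary(viability, x, velocity):
            # Regulation: turn an approaching tendency into a receding one.
            velocity = -drift
        x += velocity
        if not viability.contains(x):
            return False  # the boundary of viability was crossed
    return True


print(stays_viable(adaptive=False))  # False: the drift carries the state out
print(stays_viable(adaptive=True))   # True: tendencies near the boundary are reversed
```

The non-adaptive system tolerates the drift for a while, which corresponds to robustness, but it has no means of acting on the direction of its own trajectory. The adaptive variant acts only when the state is close to the boundary and the tendency is unfavourable, which mirrors the two numbered conditions of the definition.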

Similar to the case of robustness, the notion of adaptivity 12 implies tolerance of a range

of internal and external perturbations. However, in this context it entails a special kind

of context-sensitive tolerance which involves both actively monitoring perturbations

and compensating for their tendencies, rather than mere homeostasis. In this context the

notion of active monitoring and compensating not only refers to the asymmetry of self-production,

which entails that the system is the active source of activity, but also to an

internal differentiation of operations that involves some partially decoupled, specialized

adaptive mechanisms (cf. Barandiaran & Moreno 2008). The explicit requirement of

active monitoring is crucial for two reasons: (i) it allows the system to distinguish

between positive and negative tendencies, and (ii) it ensures that the system can

measure the type and severity of a tendency according to a change in the internal,

regulative resources that are required for compensation of negative tendencies.

It is important to note that the capacity for (i) does not contradict the organizational

closure of the autonomous system because of (ii). In other words, the system does not

have any special epistemic access to an independent (non-relational) environment, and it

therefore does not violate the relational nature of constitutive autonomy, but this is not a

problem since it only needs to monitor internal effort. Furthermore, it is worth

emphasizing that the capacity for (ii) already implies the need for suitable

12 Note that this form of adaptivity, as a special kind of self-regulatory mechanism, must be clearly

distinguished from the more general notion of ‘adaptedness’. This latter sense is usually used to indicate

all viable behaviour that has evolutionary origins and contributes to reproductive success.

compensation. In the context of sense-making we can therefore say that both elements,

i.e. self-monitoring and appropriate regulation, are necessary to be able to speak of

different kinds of meaning from the perspective of the organism. Thus, “if autopoiesis

in the present analysis suffices for generating a natural purpose, adaptivity reflects the

organism’s capability – necessary for sense-making – of evaluating the needs and

expanding the means towards that purpose” (Di Paolo 2005, p. 445).

While it is likely that some form of adaptivity as it is defined here was assumed to be

implicit in the definition of autopoiesis as constitutive of sense-making by Weber and

Varela, it is nevertheless useful to turn this implicit assumption into an explicit,

operational specification. Di Paolo’s work thus allows us to state the second core claim

of the enactive paradigm in the form of another systemic requirement (SR-2): adaptivity

is necessary for sense-making. In the next section we will use a recent debate on the

relationship between autopoiesis and cognition to illustrate SR-1 and SR-2. It will be

argued that autonomy and adaptivity are necessary and sufficient for sense-making.

3.4 Constitutive autonomy is necessary for sense-making

We have argued that the systemic requirements of autonomy and adaptivity are

necessary for intrinsic teleology and sense-making, respectively. Together they are also

necessary and sufficient for sense-making and adaptive agency, where the latter is

defined as the capacity of an agent for adaptive behavior in a purposeful and meaningful

context. However, we are not making the stronger claim that autonomy and adaptivity

are also sufficient conditions for cognitive agency. In fact, we expect that more systemic

requirements will be added to this list as the enactive approach begins to address a

wider range of phenomena. Some promising lines of research in this regard are the

development of an enactive approach to cognition (Barandiaran & Moreno 2006), to

emotion theory (Colombetti, in press), to goals and goal-directedness (McGann 2007)

and to social cognition (De Jaegher & Di Paolo 2007). All of these developments are

consistent and continuous with the fundamental notions of autonomy and sense-making

as they have been presented here, and we will return to some of these additional

requirements in Chapter 4. We can now summarize the insights of the previous sections

as shown in Table 3-1.

#      Systemic requirement   Entailment            Normativity
SR-1   autonomy               intrinsic teleology   uniform
SR-2   adaptivity             sense-making          graded

Table 3-1. Summary of the enactive approach to intentional agency, which includes at least two systemic

requirements: (SR-1) autonomy is necessary and sufficient for intrinsic teleology, and (SR-2) adaptivity is

necessary for sense-making. Since the viability constraints of adaptivity depend on the autonomous

identity, we can say that both autonomy and adaptivity are necessary and sufficient for sense-making.

When it comes to the practical challenge of how to go about realizing the two systemic

requirements in the form of artificial systems it might be tempting to initially avoid

tackling SR-1 and SR-2 in combination. Would it not be better to implement them as

independent modules first and then think about integrating them later? Evolutionary

roboticists in particular will want to avoid SR-1, which we will call the ‘hard problem’

of enactive AI (cf. Froese & Ziemke 2009), and first focus on the problem of SR-2

alone. However, is it possible to design artificial systems with adaptivity as the basis for

sense-making independently of constitutive autonomy? Conversely, those interested in

modeling the chemical basis of autonomy might be inclined to approach things the other

way around. Is autopoiesis perhaps not sufficient for sense-making after all?

While an affirmative answer to these questions might sound desirable for AI research,

unfortunately things are not that simple. This is best illustrated by an analysis of the

relationship between autopoiesis and cognition as it has been presented by Bourgine and

Stewart (2004) in a paper which bases its insights on a mathematical model of

autopoiesis. Whereas traditionally it was held that ‘autopoiesis = life = cognition’ (e.g.

Maturana & Varela 1980; Stewart 1996; 1992), Bourgine and Stewart propose:

Analytically, the interactions between a system and its environment can be

subdivided into two sorts […]. Firstly, there are those interactions that have

consequences for the internal state of the organism: we may call these type A

interactions. Secondly, there are those interactions that have consequences for

the state of the (proximal) environment, or that modify the relation of the

system to its environment: we may call these type B interactions. This

terminology allows us to propose a definition of “cognition”: A system is

cognitive if and only if type A interactions serve to trigger type B interactions

in a specific way, so as to satisfy a viability constraint. (Bourgine & Stewart

2004, p. 338)

Of course, in most situations type A interactions can be termed ‘sensations’, and type B

interactions can be termed ‘actions’ 13 . Bourgine and Stewart also note that their notion

of ‘viability constraint’ has been deliberately left vague so that their definition of

cognition (similar to what we have been calling ‘adaptivity’) can by ‘metaphorical

extension’ also be usefully applied to non-living systems (cf. Bourgine & Stewart 2004,

pp. 338-339). This is supposed to make room for the ‘autonomous’ robots designed in

the field of artificial life.

As a hypothetical example they describe a robot that navigates on the surface of a table

by satisfying the constraints of neither remaining immobile nor falling off the edge.

Since this robot is cognitive by definition when it satisfies the imposed viability

constraint, but certainly not autopoietic, Bourgine and Stewart claim that autopoiesis is

not a necessary condition for cognition (in contrast to what has been argued here in

Sections 3.2 and 3.3). Furthermore, they provide a mathematical model of a simple

chemical system, which they maintain is autopoietic but for which it is nevertheless

impossible to speak of ‘action’ and ‘sensation’ in any meaningful manner. Accordingly,

they also make the second claim that autopoiesis is not a sufficient condition for

cognition, a claim which appears to be compatible with SR-2.

However, while this last claim might sound at least vaguely analogous to Di Paolo’s

argument that minimal autopoiesis is insufficient to account for sense-making, there are

some important differences. It is worth noting that, as a side effect of not further

restricting what is to count as a ‘viability constraint’, Bourgine and Stewart’s definition

of cognition is different from Di Paolo’s notion of sense-making for two important

13 They further clarify their position by stating: “It is only analytically that we can separate sensory inputs

and actions; since the sensory inputs guide the actions, but the actions have consequences for subsequent

sensory inputs, the two together form a dynamic loop. Cognition, in the present perspective, amounts to

the emergent characteristics of this dynamical system.” (Bourgine & Stewart 2004, p. 339)

reasons: (i) the viability constraint can be externally defined (as illustrated by the

example of the robot), and (ii) even if the viability constraint was intrinsic to the

cognitive system, there is no explicit requirement for that system to actively measure

and actively regulate its performance with regard to satisfying that constraint.

To illustrate the consequences of (i) we can imagine defining an additional arbitrary

constraint for the hypothetical navigating robot, namely that it must also always stay on

only one side of the table. Accordingly, we would have to treat it as cognitive as long as

it happens to stay on that side, but as non-cognitive as soon as it moves to the other side

of the table. Clearly, whether the robot stays on one side or the other does not make any

difference to the system itself but only to the experimenter who is imposing the viability

criteria (whether these criteria are externalized into a component of the robot or not).

Thus, the only overlap between Bourgine and Stewart’s definition of cognition and Di

Paolo’s conception of sense-making is that both require the capacity for some form of

sensory-motor interaction, a capacity which is not sufficient for grounding meaning by

itself (Froese & Ziemke 2009).
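The argument about externally defined constraints can be made concrete with a toy version of the table robot. The following is a minimal sketch, assuming a one-dimensional table spanning the interval from -1 to 1 and a robot that reverses direction when an edge sensor fires. The controller, the parameter values, and the two constraint functions are hypothetical and do not reproduce any model in Bourgine and Stewart (2004). The point is only that one and the same trajectory is classified differently depending on which constraint the observer chooses to impose.

```python
from typing import Callable, List

Trajectory = List[float]


def run_robot(steps: int = 40) -> Trajectory:
    """Hypothetical robot on a table spanning [-1, 1] along one axis."""
    x, v = 0.0, 0.15
    trajectory: Trajectory = []
    for _ in range(steps):
        # 'Type A' interaction: the edge sensor changes the internal state.
        edge_sensed = abs(x + v) >= 1.0
        # 'Type B' interaction: the internal state triggers a change of heading.
        if edge_sensed:
            v = -v
        x += v
        trajectory.append(x)
    return trajectory


def table_constraint(traj: Trajectory) -> bool:
    """Observer's constraint: never fall off the edge and never remain immobile."""
    on_table = all(abs(x) < 1.0 for x in traj)
    moving = all(a != b for a, b in zip(traj, traj[1:]))
    return on_table and moving


def one_side_constraint(traj: Trajectory) -> bool:
    """Additional arbitrary constraint: also stay on the non-negative side."""
    return table_constraint(traj) and all(x >= 0.0 for x in traj)


def counts_as_cognitive(traj: Trajectory, constraint: Callable[[Trajectory], bool]) -> bool:
    """Assumed reading: the system is 'cognitive' while the imposed constraint is satisfied."""
    return constraint(traj)


trajectory = run_robot()
print(counts_as_cognitive(trajectory, table_constraint))     # True
print(counts_as_cognitive(trajectory, one_side_constraint))  # False
```

Nothing in the function run_robot refers to either constraint, so swapping one for the other changes the verdict without changing anything about the robot or its behavior.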

It will also be interesting from the perspective of AI to draw out the consequences of the

second difference (ii). Bourgine and Stewart claim that a given interaction between a

system and its environment “will not be cognitive unless the consequences for the

internal state of the system are employed to trigger specific actions that promote the

viability of the system” (2004, p. 338). How do we know what constitutes an action as

opposed to mere physical change? They define actions as “those interactions that have

consequences for the state of the (proximal) environment, or that modify the relation of

the system to its environment” (2004, p. 338). However, this criterion is trivially met by

all systems which are structurally coupled to their environment since any kind of

interaction (whether originating from the system or the environment) changes the

relation of the system to its environment at some level of description. Thus, while their

definition enables the movement of the hypothetical navigating robot to be classed as an

action, it also has the undesirable effect of making it impossible to distinguish whether

it is the system or the environment that is the ‘agent’ giving rise to this action. In order

to remove this ambiguity we can adopt Di Paolo’s view on adaptivity:

[There is an] important distinction between structural coupling and the

regulation of structural coupling. The former is an ongoing happening, the

necessary outcome of non-lethal physical encounters between organism and

medium. Only the latter, the parametrical action that regulates coupling, fully

deserves the name of behaviour because such regulation is done by the

organism […] as opposed to simply being undergone by it. Unregulated

coupling is better described as suffering an exchange while behaviour is the

control and selection of what exchanges to suffer. (Di Paolo 2005, p. 442)

As such, autonomy and the regulative capacity of adaptivity can account for the fact that

“cognition requires a natural centre of activity on the world as well as a natural

perspective on it” (Di Paolo 2005, p. 443). We have already seen that it is the principle

of autonomy which introduces this required asymmetry: an autonomous system brings

forth the relational domain that forms the basis for adaptive regulation by constituting

its own identity, which is the reference point for its domain of possible interactions. It is

essentially the lack of this relational asymmetry in Bourgine and Stewart’s conception

of cognition that makes their proposal problematic.

From this discussion we can conclude that autonomy and adaptivity are both necessary

and sufficient for sense-making. Accordingly, while it might be desirable from a

practical engineering and scientific standpoint to treat autonomy and adaptivity as

separate requirements for sense-making, this abstraction might be the root cause of

the continuing difficulties faced by embodied AI. Better models of natural cognition

will require us to combine the two core systemic requirements of the enactive paradigm,

SR-1 and SR-2, into one internally integrated system. Note, however, that the systemic

requirements are not quite as constraining as they might at first appear: an operational

definition says nothing about how the required organization is structurally realized.

The line of reasoning that we have developed in this section is further supported by

recent work on chemical autopoiesis by Bitbol and Luisi (2004). While they broadly

agree with Bourgine and Stewart’s definition of cognition, provided that extended

homeostasis is considered to be a special variety of sensory-motor behavior, they

nevertheless reject its proposed radical dissociation from autopoiesis. Thus, while

Maturana and Varela’s original provocative assertion was that autopoiesis is strictly

equivalent to cognition, Bitbol and Luisi weaken this claim slightly by holding that

minimal cognition requires both (i) the self-constitution of an identity (autonomy), and

(ii) dynamical interaction with the environment. Since they maintain that minimal

autopoiesis is sufficient for (i) but does not necessarily entail (ii), their position, like Di

Paolo’s, falls between the extreme positions of radical identity (e.g. Maturana & Varela

1980; Stewart 1992; 1996) and radical dissociation (e.g. Bourgine & Stewart 2004).

The upshot of these last two sections is that the enactive paradigm might have the

conceptual tools to effectively diagnose the problems which have prevented embodied

AI from designing more life-like artificial systems. In a nutshell, it turns out that

sensory-motor interaction alone is not sufficient to ground intrinsic meaning or goal

ownership, and it does not entail sense-making or cognition. The embodied AI approach

has attempted to capture what biological agents do, e.g. their sensory-motor behavior,

while leaving out what these agents are, e.g. autonomous and adaptive 14 . Ironically,

the supposedly ‘Heideggerian AI’ has ignored the question of being.

3.5 Summary

This has been a long chapter dealing mainly with issues that belong to theoretical

biology. How do these systemic foundations relate to the life-mind continuity? In order

to answer this question we will briefly relate the main argument, i.e. that relational

phenomena such as cognition, behavior, and sense-making cannot be decoupled from

the operations of an autonomous and adaptive system without rendering them

intrinsically meaningless, to the rest of enactive cognitive science. Effectively, this

summary will be a first pass through the enactive version of life-mind continuity.

As the starting point it is important to realize that an autonomous system, by generating

its own identity in separation from what it is not, simultaneously generates the particular

14 For a more extensive discussion of why natural agency and cognition entail autonomy and adaptivity in

terms of more detailed biological considerations, see Barandiaran and Moreno (2006; 2008) and Moreno

and Etxeberria (2005). For the most up-to-date enactive account of agency that will likely be influential

for future work, see Barandiaran, Di Paolo and Rohde (2009).

conditions by which it can relate to its environment. Indeed, it is this fundamental

asymmetry of the organism-environment relationship which partly constitutes the

organism’s perspective on that environment:

Now, in this dialogic coupling between the living unity and the physicochemical

environment, there is a key difference on the side of the living since it has the

active role in this reciprocal coupling. In defining what it is as unity, in the very

same movement it defines what remains exterior to it, that is to say, its

surrounding environment. […] the autopoietic unity creates a perspective from

which the exterior is one, which cannot be confused with the physical

surroundings as they appear to us as observers, the land of physical and chemical

laws simpliciter, devoid of such perspectivism. (Varela 1997, p. 78)

Of course, this is not to say that we cannot provide a description of the organism and its

environment in physicochemical terms 15 . The point is merely that this type of

description does not exhaust the domain of phenomena with which biology should be

concerned: “One could envisage the circularity metabolism-membrane entirely from the

outside (this is what most biochemists do). But this is not to deny that there is, at the

same time, the instauration of a point of view provided by the self-construction” (Weber

& Varela 2002, p. 116). More precisely, this generation of a point of view for the

organism consists in two essential aspects, namely the constitution of (i) an identity, and

(ii) a relationship to what is ‘other’, whereby (i) acts as a reference point for (ii):

In other words by putting at the center the autonomy of even the minimal

cellular organism we inescapably find an intrinsic teleology in two

complementary modes. First, a basic purpose in the maintenance of its own

identity, an affirmation of life. Second, directly emerging from the aspect of

concern to affirm life, a sense-creation purpose whence meaning comes to its

surrounding, introducing a difference between environment (the physical

impacts it receives), and world (how that environment is evaluated from the

point of view established by maintaining an identity). (Weber & Varela 2002, p. 117)

15 Nor should Varela be misunderstood as implying that the physical surroundings that are described by

scientists are devoid of perspectivism in that they reflect an absolute reality (cf. Varela, et al. 1991, pp. 9-12).

We can better understand his point by realizing that the scientific description of the surroundings is

generated precisely by stripping the world of its significance (cf. Dreyfus 1991, pp. 112-121), a capacity

that depends on a highly developed sensitivity to intersubjectivity (cf. Chapter 12).

Let us briefly consider these two aspects in turn. First, there is the notion of intrinsic

teleology in terms of the organism’s relation to its own identity, i.e. its capacity to

constitute its own purposeful and goal-directed existence. Weber and Varela begin to

derive this notion by combining Kant’s conception of a natural purpose, namely the

idea that a self-organizing system that is both cause and effect of itself is also its own

means and purpose (cf. Kant 1790, §64-65), with our modern understanding of

autopoiesis. Then, by appealing to the philosophical biology of Jonas, they move

beyond Kant’s conception of teleology as a useful regulative idea for the observer, and

posit teleology as intrinsic to the phenomenon of life itself. Indeed, Jonas argues that the

precarious situation of the living furnishes the organism with more than just an

inherently purposeful existence: “The organism has to keep going, because to be going

is its very existence – which is revocable – and, threatened with extinction, it is

concerned in existing” (Jonas 1966, p. 126, emphasis added). He thus argues that it is

the generation of a precarious identity through self-production that simultaneously

enables the generation of existential values. Poetically expressed:

The basic clue is that life says yes to itself. By clinging to itself it declares that it

values itself. […] Are we then, perhaps, allowed to say that mortality is the

narrow gate through which alone value – the addressee of a yes – could enter the

otherwise indifferent universe? (Jonas 1992, p. 36)

This brings us to the second aspect of the organism’s perspective, namely its capacity

for sense-creation or sense-making. This notion highlights that the generation of values

always happens in the context of a particular organism-environment relationship. The

internal and external encounters that perturb the process of identity generation take on a

value in relation to this perturbation: “The perspective of a challenged and self-affirming

organism lays a new grid over the world: a ubiquitous scale of value. To have

a world for an organism thus first and foremost means to have value which it brings

forth by the very process of its identity” (Weber & Varela 2002, p. 118). In other words,

an organism’s world is first and foremost a meaningful context that is related to its

particular manner of realizing its identity. To quote a famous example from Varela:

“There is no food significance in sucrose except when a bacteria swims upgradient and

its metabolism uses the molecule in a way that allows its identity to continue” (1997, p.

79). This nicely illustrates how meaningful behavior entails an autonomous identity.

One of the most important consequences of the argument that we have developed in this

chapter is that it strongly underlines the deep continuity between life and mind, and it is

this continuity which forms the very core of the theoretical foundation of enactive

cognitive science. In order to better illustrate this link between the systemic approach to

biology, as it has been presented in this chapter, and the enactive approach

as a cognitive science research program, we have adapted Thompson’s (2004) five steps

from life to mind for the present context. As a first rough pass we can say that:

1. Life = constitutive autonomy + adaptivity 16

2. Constitutive autonomy entails emergence of an identity

3. Emergence of an adaptive identity entails emergence of a world

4. Emergence of adaptive identity and world = sense-making

5. Sense-making = cognition

This chapter has provided only the basic theoretical elements for an understanding of

these five steps. We would only like to emphasize that, as Thompson points out, these

steps amount to an explicit hypothesis about the natural roots of intentionality. In other

words, they form the basis of the claim that the ‘aboutness’ of our cognition is not due

to some presumed representational content that is matched to an independent external

reality (by some designer or evolution), but is rather related to the significance that is

continually enacted by the precarious activity of the organism during its ongoing

encounters with the environment. Here we thus have the beginnings of how the enactive

approach might go about incorporating its two main foundations, namely

phenomenological philosophy and systems biology, into one coherent theoretical

framework (cf. Thompson 2007).

16 Step 1 is a more refined version of the traditional claim that ‘autopoiesis = life’, though it is likely that

the additional requirement of adaptivity was already implied by a looser conception of autopoiesis (cf.

Section 3.3 of this chapter). Also, as will become clear in Section 4.2 of Chapter 4, it is more accurate to

say that cognition entails sense-making, but that sense-making alone is not sufficient for cognition.

Let us close this chapter by highlighting some issues for future research in this area. The

concept of adaptivity has enabled us to become much clearer about what we mean by

the notion of sense-making and its necessary conditions of realization. Our own

perspective, and that of most other organisms, too, is evidently characterized by a whole

range of different shades of meaning, and this phenomenal differentiation had to be

matched in operational terms by a notion capable of giving rise to such graded

distinctions. However, while this conceptual advance is an important accomplishment in

itself, it is nonetheless just the beginning of the task of developing a more precise notion

of sense-making. Indeed, the term’s general applicability to all living beings, except

perhaps for a few degenerate bacteria that have lost their adaptive mechanisms (cf.

Barandiaran & Moreno 2008, pp. 333-334), cries out for further specification. It follows

that more work needs to be done in order to account for different kinds of sense-making

activities and their qualitative variations. How might we account for this variation?

One possibility is to further develop the emotive aspects of sense-making, for example

in terms of a bodily cognitive-emotional form of understanding (e.g. Colombetti, in

press). Such work is particularly important with respect to providing an enactive

account of a special type of autonomous agent, namely animals, which are specifically

characterized by motility, perception, and emotion (cf. Jonas 1966, pp. 99-107).

Another approach is to further clarify the way in which sense-making is related to action

(e.g. Thompson 2005). This can happen in terms of basic adaptivity (Di Paolo 2005), as

well as for the kind of goals and goal-directedness that are specifically characteristic of

human agency (e.g. McGann 2007), in particular with respect to the role of play (e.g. Di

Paolo, et al., in press). All of these are promising avenues of further research. The

approach pursued in the next chapter, however, is to clarify how sense-making is

transformed by interaction in a social context, especially because this will enable us to

develop a principled response to the problem of the LMCT’s ‘cognitive gap’.

4 The enactive approach to social cognition

In this chapter we refine the concepts introduced in the previous chapter in order to

develop the enactive approach to social cognition from the most basic forms of inter-individual

interaction to cultural interaction. First, the notion of adaptive agency is

introduced as the most basic form of agency that can become part of a multi-agent

system, in which inter-individual interactions can themselves take on an autonomous

organization. This is followed by a consideration of the conditions for cognitive agency

and properly social interaction. On this basis a possible definition of cultural interaction

is suggested. Finally, the main arguments are summarized in relation to the life-mind

continuity thesis.

4.1 The autonomy of the interaction process

In Chapter 3 we defined an autonomous system to be a system composed of several

processes that actively generate and sustain their systemic identity under precarious

conditions. More precisely, an autonomous system is a network of processes in which

each constituent process has as part of its enabling conditions one or more other

processes in this network, and is itself also an enabling condition for one or more other

processes. It is this organizational closure which underlies the self-constitution of an

identity by that very system. The existence of this identity is qualified as being

precarious in order to emphasize that the constituent processes would disintegrate in the

absence of this autonomous organization. This notion of autonomy is fundamental to the

enactive approach because it provides a way to naturalize the concept of an identity with

intrinsic teleology. In other words, only to a system which is characterized by autonomy

is it possible to attribute goal-states that belong to that system itself, rather than being

imposed on the system from the outside by some external designer, structure or process.

Intrinsic teleology should not be misunderstood in an anthropomorphic fashion; it is

merely a way of specifying a certain quality of systemic behavior.
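
As a rough illustration of the organizational closure condition stated above, the following Python sketch checks whether every process in a network both depends on and enables at least one other process of that network. The function name, the dictionary representation, and the example processes are our own assumptions for illustration. The check captures only the graph-theoretic part of the definition and says nothing about precariousness or intrinsic teleology.

```python
def is_organizationally_closed(enables):
    """Check the closure condition on a process network.

    `enables` maps each process name to the set of processes for which
    it is an enabling condition (a hypothetical representation).
    """
    processes = set(enables)
    for p in processes:
        # Other processes in the network that p helps to enable.
        enabled_by_p = {q for q in enables[p] if q != p and q in processes}
        # Other processes in the network that help to enable p.
        enablers_of_p = {q for q in processes if q != p and p in enables[q]}
        if not enabled_by_p or not enablers_of_p:
            return False
    return True


# Toy example: metabolism and membrane enable each other.
cell = {"metabolism": {"membrane"}, "membrane": {"metabolism"}}
# A chain in which the last process enables nothing else is not closed.
chain = {"a": {"b"}, "b": {"c"}, "c": set()}
```

On this reading the mutually enabling pair satisfies the condition, whereas the open chain does not, because its first process has no enabler and its last process enables nothing.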

Figure 4-1. The relationship between constitutive autonomy and adaptive agency: the autonomous system

self-constitutes an identity which is conserved during structural coupling with its environment (black

arrows); adaptive agency requires additional regulation by the system which is aimed at adjusting this

coupling relationship appropriately (dotted arrows).

It is important to emphasize again that the property of autonomy is a necessary but not a

sufficient condition for adaptive agency (see Figure 4-1). Autonomy as such only

ensures the passive conservation (homeostasis) of the self-constituted identity during

structural coupling with the environment. In the previous chapter we argued that

adaptivity, i.e. the capacity of an autonomous system to actively regulate its states in

relation to self-constituted viability constraints, is also necessary for agency. We will

now re-state this claim with more precision.

As Barandiaran and Moreno point out, adaptation can happen either by means of the

internal reorganization of constructive processes, or by regulation of an extended

interactive cycle; in both cases there is some degree of decoupling from the basic

constitutive processes: “we are now talking about two dynamic ‘levels’ in the system:

the constitutive level, which ensures ongoing self-construction, and the (now decoupled)

interactive subsystem, which regulates boundary conditions of the former” (Barandiaran

& Moreno 2008, p. 332). It is only when the mechanisms of regulation operate by

modulating structural coupling, such that adaptation is achieved through recursive

interactions with the environment (interactive adaptivity), that we speak of adaptive

agency. In contrast to internal compensation, this adaptive regulation of system-environment

relations opens up a novel relational domain that can be traversed by

means of behavior (i.e. regulated sensory-motor interactions).

Of course, constitutive autonomy and interactive adaptivity are only the minimal

conditions of agency (what we could call agency ‘mark I’), exemplified, for example,

by bacteria capable of performing chemotaxis. Many forms of life are likely to be more

of a “veritable topology of processes of identity generation (intersecting, embedded,

hierarchical, shared, etc.)” (Di Paolo 2009, p. 18). Still, the phenomenon of adaptive

agency is sufficient to allow us to consider a simple extension to the basic scenario

shown in Figure 4-1, namely by introducing two adaptive agents into a shared

environment. This change results in the situation depicted in Figure 4-2.

Figure 4-2. The relationship between two adaptive agents sharing the same environment: the manner in

which one agent’s movements affect the environment can result in changes to sensory stimulation for the

other agent, and vice versa, creating the basis for a multi-agent recursive interaction.

The sensory stimulation of a solitary agent is largely determined by its own structure

and movements, thus giving rise to a closed sensory-motor feedback loop. This closed

loop makes it possible for the agent to engage in sensory-motor coordination so as to

structure its own perceptual space (cf. Pfeifer & Scheier 1999, pp. 377-434). However, in

the case where two adaptive agents share a particular environment together, one agent’s

movements can affect that environment in such a way that it results in changes of

sensory stimulation for the other agent, and vice versa. Moreover, when these changes

in stimulation for one agent in turn lead to changes in its movement that change the

stimulation for the other agent, and so forth in a way that recursively sustains this

mutual interaction, the result is a special kind of interaction process. This process can be

characterized as an autonomous structure in the relational domain that is constituted by

coordinated behaviors. Accordingly, we can simplify Figure 4-2 slightly by focusing on

the autonomy of this interaction process, as shown in Figure 4-3.

Figure 4-3. The autonomy of the interaction process: it is possible that, when two adaptive agents share an

environment and engage in sensory-motor interaction, their activities become entwined in such a

manner that their mutual interaction results in an autonomous interaction process.

Famous examples of the emergence of autonomous structures from the interactions

between adaptive agents are the slime molds (mycetozoans). Spores of Physarum begin

life as unicellular amoebae, and multiply while feeding on bacteria. If they encounter

the correct mating type, they can form zygotes which grow into large plasmodia, a

unified life form containing many nuclei that are not separated by cell membranes. A

plasmodium is even capable of migrating toward more favorable conditions by shifting

concentrations of protoplasm. Note, however, that in this particular case the emergence

of an autonomous structure in the relational domain actually coincides with the

dissolution of autonomy of the constitutive cells. It is therefore better described as a

transformation between two types of adaptive agent, where the multi-agent interaction

between amoebae is limited to a transitional phase. In the case of the slime mold

Dictyostelium, however, amoeboid individuals are capable of forming a fructiferous body

without cellular fusion, and with a clear diversity of cellular types. To be sure, all multi-cellular

organisms are clear examples that mutual interactions between adaptive agents

(as defined above) can lead to the emergence of structures that are autonomous in their

own right (cf. Maturana & Varela 1987, pp. 74-89).

Note that the self-organized emergence of multi-cellularity highlights the importance of

considering a developmental systems perspective (e.g. Oyama 2009) for overcoming the

‘cognitive gap’ of the life-mind continuity thesis. Thus, at first sight the task of

establishing this continuity on the basis of insights gained from minimal, single-cell

forms of life appears to equate the ‘gap’ with the whole history of life on earth. Surely it

would be better to start with something of medium complexity, as often practiced by

embodied AI? But notice that whereas Brooks’ insect-like robots still face an immense

phylogenetic gap (and hence the provocative title of Kirsh’s (1991) paper ‘Today the

earwig, tomorrow man?’), the single-cell models often favored by the enactive paradigm

can be viewed as confronting us with an ontogenetic gap instead. With this shift in

perspective the cognitive gap has been narrowed from the whole extent of evolutionary

history, to the developmental lifespan of a single human individual.

While the integration of the developmental systems approach and the enactive paradigm

is beyond the scope of this thesis, we will nevertheless focus our efforts on addressing

the cognitive gap of the continuity thesis in terms of lifetime changes in behavior. More

precisely, the task will be to investigate how ‘higher-level’ autonomous identities can

appear due to the coordinated interaction between two or more adaptive agents, and can

be realized as more or less stable structures. In the most general terms we can define

such a type of interaction as follows:

Multi-agent interaction is the regulated coupling between at least two adaptive

agents, where the regulation is aimed at aspects of the coupling itself so that it

constitutes an emergent autonomous organization in the domain of relational

dynamics, without destroying in the process the autonomy of the agents

involved (though the latter’s scope can be augmented or reduced).

This definition is based on a related one proposed by De Jaegher and Di Paolo 17 , but it

puts more specific requirements on the necessary form of agency (adaptive), and refers

to this type of interaction as ‘multi-agent’ rather than ‘social’. The motivation for this

distinction is that it gives us a more fine-grained conceptual handle on the variety of

phenomena that involve more than one agent, including a more specific definition of the

social which we will develop later in this chapter. Note that the definition of multi-agent

interaction is sufficiently abstract so as not to be limited to interactions between single-cell

organisms. It is equally applicable to social interaction between humans as well.

17 De Jaegher and Di Paolo’s definition reads: “Social interaction is the regulated coupling between at

least two autonomous agents, where the regulation is aimed at aspects of the coupling itself so that it

constitutes an emergent autonomous organization in the domain of relational dynamics, without

destroying in the process the autonomy of the agents involved (though the latter’s scope can be

augmented or reduced)” (2007, p. 493). We will consider this definition more fully in Section 4.2.

It is helpful to illustrate this idea briefly by means of a simple concrete case study which

will be described in more detail later (cf. Chapter 6, p. 90). A recent psychological

experiment by Auvray, Lenay, and Stewart (2009) has investigated the dynamics of

human interaction under minimal conditions. Two participants were asked to locate

each other in a simple 1-D virtual environment using only left-right movement and an

all-or-nothing tactile feedback mechanism, which indicated whether their virtual

‘avatar’ was overlapping any objects within the virtual space. They could encounter

three types of objects: (i) a static object, (ii) the avatar of the other participant, and (iii) a

‘shadow’ copy of the other participant’s avatar that exactly mirrored the other’s

movement at a displaced location. Since all objects were of the same size and only

generated an all-or-nothing tactile response, the only way to differentiate between them

was through the interaction dynamics that they afforded. And, indeed, participants did

manage to locate each other successfully because ongoing perceptual crossing afforded

the most stable situation under these circumstances. Thus, even though the participants

‘failed’ to achieve the task individually, i.e. there was no significant difference between

their clicking response to the other’s avatar and the other’s shadow (cf. Auvray, et al.

2009, p. 39), they managed to solve the task because of the self-sustaining dynamics of

the interaction process.
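
One feature of this setup that may help to explain the stability of perceptual crossing can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: the class name, the ring-shaped space, and all numerical values (length, object width, shadow displacement, positions) are hypothetical and are not taken from Auvray et al. (2009). It shows only that contact with the other's avatar produces feedback for both participants, whereas contact with the other's shadow produces feedback for one participant alone.

```python
from dataclasses import dataclass


@dataclass
class MinimalWorld:
    length: float = 600.0        # assumed size of the 1-D space (wraps around)
    width: float = 4.0           # assumed common width of all objects
    shadow_offset: float = 48.0  # assumed displacement of each shadow

    def overlaps(self, a, b):
        """True if two objects centered at a and b overlap on the ring."""
        d = abs(a - b) % self.length
        return min(d, self.length - d) < self.width

    def feedback(self, own, other, static):
        """All-or-nothing tactile signal for one participant."""
        shadow = (other + self.shadow_offset) % self.length
        return any(self.overlaps(own, x) for x in (static, other, shadow))


world = MinimalWorld()
static_a, static_b = 300.0, 400.0  # hypothetical static object positions

# Case 1: the two avatars cross, so both participants are stimulated.
crossing = (world.feedback(100.0, 100.0, static_a),
            world.feedback(100.0, 100.0, static_b))

# Case 2: A touches B's shadow (B at 100, shadow at 148); B feels nothing.
shadowing = (world.feedback(148.0, 100.0, static_a),
             world.feedback(100.0, 148.0, static_b))
```

Because all three object types produce the same signal at any single moment, a participant in this sketch could only tell them apart over time, which is the sense in which differentiation depends on the interaction dynamics.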

De Jaegher and Di Paolo (2007; 2008) suggest another paradigmatic example, namely

any situation in which the individual interactors are attempting to stop interacting, but

where the interaction process self-sustains even in spite of this intention. That can easily

occur, for instance, when two people attempt to walk past each other in a corridor, but

happen to move in mirroring directions at the same time. They thereby co-create a

symmetrical coordinated relation, which is likely to result in them moving in mirroring

directions again, thus leading to further interaction. In this case the individual intention

of terminating the interaction process is actually prevented from being realized due to

the emerging coordination patterns at the inter-individual level. In other words, in these

kinds of cases the overall organization of the interaction subsumes the individual actions

of the interactors in such a way that the identity of the interactive situation is retained, at

least temporarily, despite their efforts to the contrary. Accordingly, De Jaegher and Di

Paolo suggest that the reciprocal relationship between the two autonomous domains,

namely the individual and the interactional, may more easily be studied in situations

where they are in conflict.
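
The corridor example can be caricatured with a toy model, given here as a minimal Python sketch. The function name, the two-sided corridor, and the rule 'switch sides whenever blocked' are our own simplifying assumptions and not part of De Jaegher and Di Paolo's account. The sketch only illustrates how two agents who each act to end the encounter can, by acting symmetrically, keep it going.

```python
def corridor_encounter(steps, a_switches=True, b_switches=True):
    """Return, for each time step, whether the two walkers block each other.

    Each walker occupies side 0 or 1 of the corridor (sides are given in
    a shared frame of reference). A walker that follows the rule moves to
    the other side whenever the way ahead is blocked.
    """
    a_side, b_side = 0, 0  # both start on the same side, i.e. blocked
    history = []
    for _ in range(steps):
        blocked = a_side == b_side
        history.append(blocked)
        if blocked:
            if a_switches:
                a_side = 1 - a_side
            if b_switches:
                b_side = 1 - b_side
    return history


# Both walkers apply the same rule at the same time: the blockage persists.
symmetric = corridor_encounter(4)
# Only one walker sidesteps: the encounter ends after one step.
asymmetric = corridor_encounter(4, b_switches=False)
```

In the symmetric case neither walker's rule is at fault when taken individually. The persistence of the blockage is a property of the coordination pattern, which is why it is lost as soon as the symmetry is broken.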

In sum, with this definition of multi-agent interaction we have taken a first step toward

an enactive approach to social cognition. The co-constitution of an interactive cycle by

two adaptive agents is a necessary (but not sufficient) condition. It is worth emphasizing

that the insufficiency of multi-agent interactions with regard to sociality does not make

it meaningless to investigate them in their own right. On the contrary, it is clear that the

effects of such an interaction process are irreducible to individual capacities, and that

they can nevertheless significantly shape an individual’s behavioral domain. Indeed, this

intermediate level between the individual and the social works in favor of the

life-mind continuity thesis because it shows the transformative potential of basic multi-agent

interactions even without the presence of sociality (i.e. without the need for the

presence of others as such). A multi-agent system therefore provides the foundation for

the emergence of more involved interactions.

4.2 Social interaction

The notion of ‘multi-agent interaction’ has provided us with a general way of

characterizing interactions between adaptive agents that result in autonomous structures,

and which can radically alter the behavioral domains of the interacting individuals.

However, as it stands the notion is too broad to capture what is specific about social

interactions. As a first step, we can note that there is a mismatch of values: failure to

regulate a social interaction does not necessarily imply a direct failure of material self-maintenance.

However, for an adaptive agent this independence of social purpose is

impossible because its capacity for regulating interactions is, while partially decoupled

from constructive processes, still too closely tied to its metabolic existence. The norms

that are constitutive of its regulatory activity, while being potentially constrained by the

dynamics of multi-agent interaction, cannot be specifically social norms because their

success is largely determined by basic energetic and material needs. What is needed for

sociality is the creation of a new domain that can have its own internal coherency. The

foundation for this social domain is provided by the cognitive domain.

What is cognition? This question could be the topic of another whole thesis, so we will

restrict ourselves here to mentioning some essential aspects. In order to define what is

special about the cognitive, we can fortunately draw on the work of Barandiaran and

Moreno (2006; 2008) who have recently argued that cognition is given by the adaptive

preservation of a dynamical network of autonomous sensory-motor structures sustained

by continuous interactions with the environment and the body:

The hierarchical decoupling achieved through the electrochemical functioning of

neural interactions and their capacity to establish a highly connected and nonlinear

network of interactions provides a dynamic domain with open-ended

potentialities, not limited by the possibility of interference with basic metabolic

processes (unlike diffusion processes in unicellular systems and plants). It is

precisely the open-ended capacity of this high-dimensional domain that opens

the door to spatial and temporal self-organization in neural dynamics and

generates an extremely rich dynamic domain mediating the interactive cycle,

overcoming some limitations of previous sensorimotor control systems.

(Barandiaran & Moreno 2008, p. 338)

Paradigmatic examples of such structures are habits, which encompass partial aspects

of the nervous system, physiological and structural systems of the body and patterns of

behavior and processes in the environment (Di Paolo 2003). Due to the partial

hierarchical decoupling of the electro-chemical activity of the nervous system from

metabolic-constructive processes, the normative regulation of sensory-motor interaction

is underdetermined by basic material and energetic needs. This is because the stability

of a cognitive structure largely depends on the activity of the nervous system as well as

the way the structure is coupled to sensory-motor correlations. In sum, following on

from Barandiaran and Moreno, we can define cognitive interaction as follows:

Cognitive interaction is the regulated coupling between a dynamic agent and its

environment, where the regulation is aimed at aspects of the coupling itself so

that it constitutes an emergent autonomous organization in the domain of internal

and relational dynamics, without destroying in the process the autonomy of the

agent (though the latter’s scope can be augmented or reduced).

It is important to emphasize once more that the minimal form of agency required for

cognitive interaction (what we call ‘dynamic’ or ‘cognitive’ agency) is more complex

than that provided by adaptive agency, though it is difficult to capture this difference in

operational terms. Effectively, there is a need for a hierarchically decoupled domain of

dynamics that can generate its own, non-metabolic goals (e.g. determined by neurodynamic

forms of autonomy), and be able to regulate its own activity and the

organism’s sensory-motor behavior accordingly. Partly this is already a possibility for

adaptive agents, since the mechanisms of adaptive regulation are partially decoupled

from the metabolic-constructive processes. But the behavior of these agents is limited

because the regulatory goals are largely determined by metabolic needs, rather than by

the activity that is generated via sensory-motor interaction and within the adaptive

mechanism itself. Cognition, as an open-ended domain of behavior, only becomes

possible when the adaptive mechanism is partially decoupled from the rest of the body

in such a way that it is possible for autonomous structures to arise via recurrent

dynamics (cf. Barandiaran & Moreno 2006, p. 180). This form of dynamic agency

(what we might call agency ‘mark II’) is typically based on the nervous system.

Once dynamic agency is in place it is possible that the continuation of certain patterns

of sensory-motor interaction becomes a goal in itself, for example due to the

autonomous dynamic structures which they induce in neural activity. Moreover, these

patterns can involve coordination with another agent in a multi-agent system. Thus, only

an agent capable of cognitive interaction can help to give rise to a social domain that is

defined by its own specific normativity 18 . But is cognitive interaction in a multi-agent

system necessary and sufficient for that interaction to be called social? What is the

precise role of the other agent? De Jaegher and Di Paolo rightly insist that:

if the autonomy of one of the interactors were destroyed, the process would

reduce to the cognitive engagement of the remaining agent with his non-social

world. The ‘other’ would simply become a tool, an object, or a problem for his

individual cognition (such a situation would epitomise what we have diagnosed

traditional perspectives on social cognition as suffering from: namely, the lack

of a properly social level). (De Jaegher & Di Paolo 2007, p. 492)

18 Might this be the beginning of a radicalization of the ‘social brain hypothesis’ (cf. Dunbar 1998)? The

question is why a form of life should evolve that is controlled by a system whose operations are largely

decoupled from its essential metabolic (self-relative) values, especially since this immediacy increases the

precariousness of the organism. But perhaps this is the price to pay for being able to regulate behavior in

relation to social (other-relative) values? Is sociality related to the origin of the nervous system as such?

It is certainly the case that the other agent must remain autonomous for an interaction to

be characterized as social. The question that remains, however, is whether a cognitive

interaction between two or more dynamic agents in a multi-agent system is also a

sufficient criterion. Can such a cognitive inter-agent interaction capture what is specific

about sociality in the sense presumably intended by De Jaegher and Di Paolo? What is

needed is a notion of the social that not only excludes interactions that destroy the

autonomy of the other, but also excludes those situations in which the other is simply

encountered as a mere tool, object or problem to be solved by an individual’s cognitive

ability (if the other appears as something to be encountered at all).

Unfortunately, the notion of an interaction in a multi-agent system of dynamic agents is

not specific enough. There are situations in which dynamic agents can interact (such

that all of De Jaegher and Di Paolo’s requirements are fulfilled), but in which the other

agent is simply treated as part of the non-social environment. A famous example is the

cognitive domain of an autistic person who is embedded within the social world of

others, but who does not perceive this sociality as such.

An illustration of this possibility is provided by the psychological experiment by

Auvray and colleagues (briefly described above, also cf. Chapter 6, p. 90), whereby the

participants constitute an autonomous interaction process, but without actually being

able to meaningfully differentiate between the socially contingent and non-contingent

situations. What this example demonstrates is that it is not sufficient for two cognitive

agents to give rise to an autonomous interaction process if they are to break out of their

individual cognitive domains. While the behavior of the participants is, unbeknownst to

them, guided by the dynamics of the interaction process to an appropriate solution to the

given task, their sense-making ability remains qualitatively unaffected with respect to its

solitary point of reference. It is impossible for individuals to distinguish between the

movements of the other participant and its copy, even though they are ‘collectively’

solving the task due to the dynamics of the multi-agent system.

In sum, multi-agent interaction between dynamic agents is a necessary but not sufficient

condition for the constitution of social significance. Since we have argued that it is

regulation of structural coupling which is constitutive of the qualitative aspect of sense-making

activity (cf. Chapter 3, p. 35), we need to take a closer look at this regulative

aspect. But what kind of regulation is characteristic of a social interaction such that it

attains meaning as a social event? What is needed is a way of defining the operational

basis of participatory sense-making:

If regulation of social coupling takes place through coordination of movements,

and if movements – including utterances – are the tools of sense-making, then

our proposal is: social agents can coordinate their sense-making in social

encounters. […] This is what we call participatory sense-making: the

coordination of intentional activity in interaction, whereby individual sense-making

processes are affected and new domains of social sense-making can be

generated that were not available to each individual on her own. (De Jaegher &

Di Paolo 2007, p. 497)

This ‘regulation of social coupling’ is precisely what has to be made explicit in De

Jaegher and Di Paolo’s definition of ‘social interaction’ if it is to do all the intended

work. For if we cannot find a qualitative difference in terms of the regulation of

coupling then the constitution of a novel social domain of sense-making will remain

mysterious. What specific regulatory process is involved in the coordination of

intentional activity during social interaction?

Let us proceed by means of a concrete example. De Jaegher and Di Paolo (2008),

drawing on Fogel (1993), provide an insightful description of a paradigmatic social act:

the act of giving. Fogel describes a filmed session between a 1-year-old baby and his

mother, in which the infant extends his arms with an object, and keeps them relatively

stationary only to gently release the object as the mother’s hand takes hold of it. From

this description it is already evident that giving has an essentially different structure of

behavior that distinguishes it from merely individual cognitive engagements. In essence,

in order for the act to be completed successfully, it requires acceptance from the other

agent. In a more recent paper Di Paolo comments:

Assuming for a moment that the infant is the initiator of the act, we realise that

he must create an opening by his action that may only be completed by the

action of the mother. The giving involves more than orientation of the mother’s

sense-making; it involves a request for her not only to orient towards the new

situation, but also to create an activity that will bring the act to completion. In

other words: to take up the invitation for an intention to be shared. […] an

invitation to participate is experienced as a request to create an appropriate

closure of a sense-making activity that was not originally hers. To accept this

request is to produce the ‘other half of the act’ bringing it to a successful

completion. (Di Paolo, in press; emphasis added)

On the basis of the act of giving we can now make explicit what was already implicit in

the enactive approach to social interaction that was first proposed by De Jaegher and Di

Paolo (2007). The regulation involved in social interaction is indeed of a special kind:

one cognitive agent’s regulation of interaction creates an opening for an act that can

only be realized through the complementary regulation of interaction by another. In

other words, social interaction is a manifestation of co-regulation. More precisely, we

can provide the following definition:

Social interaction is the co-regulated coupling between at least two dynamic

agents, where the regulation is aimed at aspects of the coupling itself so that:

1. It constitutes an emergent autonomous organization in the domain of internal

and relational dynamics, without destroying in the process the autonomy of

the agents involved (though the latter’s scope can be augmented or reduced),

and

2. An agent’s regulation of coupling can only be completed by the coordinated

regulation of at least one other agent.
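
The logical structure of this definition can be made explicit in a minimal sketch. It is only an illustration under stated assumptions: the record `Interaction`, its field names, and the two example instances are hypothetical, and each field stands for a judgment that would have to be established by dynamical analysis rather than simply asserted. The sketch shows that multi-agent interaction is treated as necessary but not sufficient, with the second requirement (necessary co-regulation) marking the difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record of an observer's judgments about one interaction."""
    n_dynamic_agents: int              # number of dynamic agents that are coupled
    regulation_targets_coupling: bool  # regulation is aimed at the coupling itself
    interaction_is_autonomous: bool    # requirement 1: emergent autonomous organization
    agents_remain_autonomous: bool     # requirement 1: agents' autonomy is not destroyed
    completion_requires_other: bool    # requirement 2: completion needs another's regulation


def is_multi_agent_interaction(x: Interaction) -> bool:
    # Necessary condition only: regulated coupling that satisfies requirement 1.
    return (
        x.n_dynamic_agents >= 2
        and x.regulation_targets_coupling
        and x.interaction_is_autonomous
        and x.agents_remain_autonomous
    )


def is_social_interaction(x: Interaction) -> bool:
    # Requirement 2 (necessary co-regulation) is added to the necessary condition.
    return is_multi_agent_interaction(x) and x.completion_requires_other


# Illustrative classifications that follow the discussion in the text.
perceptual_crossing = Interaction(2, True, True, True, False)
act_of_giving = Interaction(2, True, True, True, True)
```

On this reading the perceptual crossing setup counts as a multi-agent interaction but not as a social interaction, whereas the act of giving satisfies both requirements.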

With respect to the goal of further developing the notion of sense-making, two aspects

of this definition are particularly noteworthy: (i) since the interacting agents are

autonomous systems and they adaptively regulate this interaction, it follows that they

engage with each other in terms of sense-making, and (ii) since the regulation of the

interaction by one agent changes not only its own coupling but also that of the other

agent, it follows that the agents can enable and constrain each other’s sense-making. But

even though sense-making can be modulated by the interaction process in this manner,

it essentially remains an individual affair if all we are dealing with is a multi-agent

system. It only takes on a social significance when it is the result of co-regulation in the

strict sense, such that it could not be achieved by individual regulation alone. It is the

addition of the second requirement, i.e. of necessary co-regulation, that gives meaning

to the notion of participatory sense-making as such.

It is worth emphasizing the basic idea of this proposal again: if agents mutually enable

and constrain their sense-making activities in a multi-agent system, they can certainly

open up behavioral domains that would have otherwise remained inaccessible to the

individual agents. This is nicely illustrated by the psychological experiment conducted

by Auvray and colleagues (2009), where the relative stability and instability of the

interaction process causes the participants to succeed at a task that they are individually

incapable of solving. But the fact that participants are equally likely to ‘locate’ the other

during mutual interaction as when interacting with the irresponsive mobile lure also

shows that, while the interaction process has organized their behavior appropriately, it

has not affected their sense-making activity. To the participants there is no meaningful

difference between the two situations. For that to happen the task must be changed such

that an intended activity of one participant can only become realized by the coordinated

activity of the other. Only then can we properly speak of participatory sense-making and

expect a qualitative difference in experience.

Of course, these two scenarios are not mutually exclusive. Participatory situations

necessarily emerge out of the interactions of a multi-agent system, and may in turn

influence the structure of that system so as to lead to further openings for co-regulated

participation. Di Paolo, for example, suggests that when we remove the assumption that

the infant intentionally originated the act of giving we open up the possibility of even

richer degrees of interaction: “A certain movement extending the object in the direction

of the mother, without yet intending to give it, may now be opportunistically invested

with a novel meaning through joint sense-making. Latent intentions become crystallised

through the joint activity so that not only the completion of the act is achieved together,

but also its initiation” (Di Paolo, in press). In other words, interaction in a multi-agent

system can not only extend the relational domains of the individual agents (i.e. their

cognitive and behavioral capacities), but also lead to a novel way of participatory sense-making

(via co-regulation of activities) that inaugurates a social domain specific to their

history of interactions.

4.3 Cultural interaction

The act of giving, as a paradigmatic social act, is widespread throughout the animal

kingdom, most often in the context of parenting (e.g. giving food) or courtship (e.g.

making more or less arbitrary offerings). As such, it is one of the most fundamental

social acts on the basis of which other forms of sociality can develop. The act itself does

not presuppose much and, following De Jaegher and Di Paolo’s interpretation of the

infant giving an object to its mother, it is possible that none of the interactors

intentionally originated the act. An arbitrary exchange can be subsequently invested

with social significance when its joint completion changes the very meaning of the

relationship to that of ‘giver’ and ‘receiver’.

However, do the abstract categories of ‘giver’ and ‘receiver’ actually have any meaning

in the animal kingdom apart from their use by human beings? Typically, we would

expect that the roles are much more concretely situated in non-human cases of social

interaction, e.g. as ‘feeder’ and ‘fed’ or ‘courter’ and ‘courted’. The example of the

object exchange between the infant and its mother thus points to the need for some

additional clarification. Where do the norms which guide the mother’s response to the

infant’s behavior come from? And how do they provide a measure for the successful

completion of the act as a whole?

It is here that the socio-cultural background, in which the interactors and the unfolding

interaction process are embedded, comes into play (cf. Steiner & Stewart 2009). Indeed,

the mother might be moved to accept the held object because that is ‘what one does’

when offered something by another. From her perspective, treating the gesture as the

infant’s attempt to ‘give’ the object is a natural way of making sense of the situation,

and this sense-making is implicitly achieved in terms of a pre-established social

practice. Moreover, this meaning, once it has been actualized in the situation, is not lost

on the infant, either, who has now discovered a novel way of interacting with his

mother. In other words, to characterize this example as a social interaction alone misses

the fact that we are dealing with a process of enculturation.

The appeal to a pre-existing order of shared practices indicates that an approach to

social cognition which only focuses on the momentary constitution of norms in social

interaction is not sufficient to capture the whole of sociality. In particular, it is missing

what is specific about those social interactions that unfold within a cultural context. As

Steiner and Stewart emphasize, the latter kind of social interactions also necessarily

involve a form of ‘heteronomy’, i.e. the abiding by a heritage of pre-established social

structures. Indeed, the claim that there are heteronomous cultural values that guide our

behavior and understanding points to a more general phenomenon, since enculturation

has similarly profound effects on our solitary behavior. A castaway like Robinson

Crusoe does not immediately cease to behave like an Englishman when he finds himself

socially isolated on a tropical island. Enculturation thus involves at least some form of

internalization of heteronomy (cf. Vygotsky 1978).

In terms of the social, Steiner and Stewart argue that only enculturated forms of

interaction deserve to be called social interactions, in order to distance them from the

kind of multi-agent interactions that are paradigmatic of De Jaegher and Di Paolo’s

approach. However, while we agree that the latter approach is too inclusive, which is

why we have re-conceptualized it as merely a necessary condition for social interaction

(i.e. in terms of multi-agent interaction), Steiner and Stewart’s approach is also overly

exclusive. They make sociality a specifically human phenomenon, thereby excluding

everything from the so-called social insects to our closest primate relatives.

In contrast to both of these approaches, the definition of social interaction that we have

provided in the previous section takes up a middle ground. On the one hand, it excludes

cognitive interactions that merely contingently happen to involve another agent, but on

the other hand it includes co-regulated interactions that are not already guided by pre-established

cultural norms. Of course, this is not to deny that Steiner and Stewart are

correct in insisting that there is something special about many human forms of sociality,

including their heteronomous character, but this specificity is perhaps better captured by

the notion of culture rather than by sociality as such.

While the enactive paradigm has acknowledged the constitutive role of the cultural

context for life and mind (cf. Thompson 2007; Steiner & Stewart 2009; Di Paolo 2008;

in press), so far there has been no attempt to provide an operational definition of those

social interactions whose unfolding is partially determined by a pre-existing socio-cultural

background. Nevertheless, the target phenomenon is starting to be clarified and

the bottom-up approach of the enactive paradigm is systematically developing an

explanation in a step-by-step manner. In this chapter we have taken the important step

of clarifying social interactions as being a special kind of multi-agent interaction where

meaning is co-created by the joint action of the interactors.

An important problem that remains is to explain how such social interaction is shaped

by ‘external’ cultural values. In response we can note that one way to begin to

understand the heteronomy of cultural structures, and the one pursued in this thesis, is to

first consider the autonomy of a social interaction process in more detail. After all, this

autonomy is, when viewed from the perspective of the interacting agents, also a form of

heteronomy that has its own intrinsic teleology, and which can enable and constrain the

behavior of the individual agents. Of course, future work will need to determine more

precisely what is special about the heteronomy of culture. In particular, how is it

possible that behavior implicitly adheres to cultural norms even when others are not

immediately present? But even here we should be able to approach this problem from

the perspective of social interaction, especially social learning. If we want to know how

culture can shape our behavior even outside of an immediate social context, then we

first need to better understand how an agent involved in a social interaction, faced with

the heteronomy of another agent and the heteronomy of the interaction process itself,

can undergo a change in behavior that we would call learning.

A final question to consider is whether the constitutive impact of cultural values is not a

problem for the life-mind continuity thesis. Do we not have to provide a biological

foundation for these values? Yes and no. Yes, in the sense that these values can only

exist for certain kinds of sense-making agents, and these agents are biological in that

they are alive (autonomous and adaptive). No, in the sense that this is not a reduction of

cultural values to their biological conditions of possibility; the socio-cultural domain

retains its own autonomy. As such, the emergence of the heteronomy of culture is the

appearance of another discontinuity in the system of discontinuities which constitutes

life, mind, and sociality. More specifically, the continuity thesis is preserved because

the heteronomy of culture turns out to be mutually interdependent with the heteronomy

of sociality, and the same conceptual framework of autonomy, which forms the very

foundation of the enactive paradigm, is applicable to both.

It is already clear that, like the previous transitions along the life-mind continuity, a

cognitive agent’s entrance into a cultural domain is both enabling and constraining. It is

constraining because taking part in shared practices requires the alignment of an

individual’s autonomy with a pre-established order. But despite this constraining, or rather –

because of it, there is also an expansion of possibilities. A good example of this is play,

the freedom of which lies in a player’s capability to create new meaningful constraints

by which it can steer its sense-making activity and set new laws for itself and others to

follow (Di Paolo, et al. in press). Moreover, by inaugurating a historical trace of shared

individual and social practices that can go beyond an individual’s lifetime, cultural

interaction provides the foundation for cumulatively building on previous more or less

viable ways of living.

The subsequent chapters of this thesis will investigate the dynamics of social interaction

more generally, but we will return to some speculations about the possible mechanisms

of cumulative cultural development in Chapter 13.

4.4 Summary

In this chapter we have traced the conceptual framework of the enactive paradigm from

autonomy to culture, as summarized in Figure 4-4.

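[Figure 4-4: nested-layer schematic. Operational specificity (top of each layer, outer to inner): Autonomy, Adaptivity, Agency I, Agency II, Sociality, Culture. Qualitative change (bottom of each layer, outer to inner): Intrinsic teleology, Sense-making, Behavior, Cognition, Social cognition.]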

Figure 4-4. This schematic summarizes the relationships between core concepts of the enactive paradigm

as we have developed them in this chapter. Any inner layer necessarily depends on all of the outer layers.

Thus, for each phenomenon specified at the bottom of a layer (e.g. ‘sense-making’), the operational

requirements specified at the top of that layer, including those of all previous outer layers, together form

its necessary and sufficient conditions (e.g. ‘autonomy’ and ‘adaptivity’). ‘Agency I’ refers to an

autonomous system that achieves adaptation not only through internal re-organization (i.e. adaptivity,

more generally), but also by regulation of its structural coupling (adaptive agency). ‘Agency II’ denotes a

form of adaptive agency, whereby the norms of the regulation of structural coupling are underdetermined

by metabolic criteria alone (dynamic agency). As the operational specificity increases with each inner

layer, we can attribute an expansion of qualitative existence to the system. The central layer, culture, is

still in need of further clarification in both operational and phenomenological terms.
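
The cumulative reading of the layers described in this caption can be sketched in a few lines of code. This is an illustrative sketch only: the list `LAYERS` and the function `conditions_for` are hypothetical names, and the pairing of each operational requirement with a phenomenon is an assumption read off the ordering of the figure labels. Only the pairing of sense-making with autonomy and adaptivity is stated explicitly in the caption.

```python
# (operational requirement, phenomenon), ordered from outermost to innermost layer.
# Pairings other than sense-making are assumed from the figure's label order.
LAYERS = [
    ("autonomy", "intrinsic teleology"),
    ("adaptivity", "sense-making"),
    ("agency I", "behavior"),
    ("agency II", "cognition"),
    ("sociality", "social cognition"),
    ("culture", None),  # central layer: phenomenon still in need of clarification
]


def conditions_for(phenomenon):
    """Collect the requirements of a layer together with all outer layers."""
    requirements = []
    for requirement, name in LAYERS:
        requirements.append(requirement)
        if name == phenomenon:
            return requirements
    raise KeyError(phenomenon)


sense_making_conditions = conditions_for("sense-making")
social_cognition_conditions = conditions_for("social cognition")
```

Each inner phenomenon thus inherits every requirement of the layers that enclose it, which is the sense in which inner layers depend necessarily on outer ones.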

This figure shows that the enactive paradigm indeed promotes a form of life-mind

continuity that is not exhausted by unification into one conceptual framework. In other

words, all of the latter, more specialized phenomena (inner layers) depend necessarily

(and not just historically, i.e. evolutionarily and developmentally) on the existence of all

of the former, more inclusive phenomena (outer layers). Note, however, that even

though every new domain emerges on the basis of activity in the preceding domains, it

cannot be reduced to that enabling activity. This operational asymmetry between

successive domains is what the recurring concepts of partial decoupling, emergence,

and autonomy provide. It is also what guarantees that we are actually dealing with a

non-reductive life-mind continuity, rather than a progression of heuristics that could be

collapsed into a purely metabolic level on the basis of a more advanced science.

Of course, we should not misunderstand this operational asymmetry as prescribing a

one-sided interaction between the different phenomenal domains. On the contrary, once

the different domains of activity have been established for an agent, their relationship is

not one of hierarchical dependence, but rather of multiple interdependence. For any

agent it is possible (and likely) that its activities in the different domains all mutually

constrain and enable each other in various non-trivial ways. Thus, even cultural norms

can be re-inscribed back into the normativity operative on the metabolic level (Di Paolo,

in press). For example, I might take up drinking due to the kind of socio-cultural

environment of which I am a part, but that appropriated activity might itself become

sustained as a self-constituting habit, and that habit can even begin to re-organize my

metabolism in such a way that it reinforces the frequency of my drinking behavior,

which thereby starts to shape the socio-cultural environment faced by others around me,

perhaps making them more inclined to take up drinking as well. Accordingly, we can

identify multiple, interdependent, mutually enabling and constraining autonomous

systems within and across different phenomenal domains. In general, working out how

these multiple interdependencies precisely operate, and how they combine to bring forth

coherent forms of agency, is one of the most important research problems for enactive

cognitive science. In particular, it remains to be explained how it is possible for us to

reflectively live out a unified existence, though it is likely that this has to do with our

interactions in a linguistic domain (Di Paolo 2009, p. 19) and processes of self-other co-determination

more generally (Thompson 2001).

5 Beyond methodological individualism

A fundamental assumption of mainstream cognitive science is that the individual agent

(whatever that may mean in the absence of a mainstream definition of agency) is the

correct unit of analysis for understanding life, mind, cognition, and behavior, as well as

all social phenomena. This approach has been termed “methodological individualism”

(cf. Boden 2006b), after the doctrine in social science that was introduced by Max

Weber at the beginning of the 20th century, and continued by Friedrich von Hayek and

Karl Popper among others. The history of this doctrine in social science is complex; it

became embroiled in highly politicized debates, largely because it was often invoked as

a way of discrediting historical materialism (Heath 2009). In scientific terms the central

claim of methodological individualism is that social phenomena must be explained by

showing how they result from individual actions, which in turn must be explained

through reference to the intentional states that motivate the individual actors.

More recently, the validity of this widespread assumption in cognitive science has started

to be questioned on the basis of research from a variety of its sub-disciplines. The idea

that cognition is at least partly constituted by social interactions and cultural context has

received support from cognitive anthropology (Hutchins 1995), developmental and

social psychology (cf. Tomasello 1999; Lindblom & Ziemke 2003), social studies

(Pentland 2007), primatology (Savage-Rumbaugh, et al. 2005; Tomasello 2000),

interaction studies (Auvray, et al. 2009; Di Paolo, et al. 2008), as well as philosophical

and phenomenological considerations (e.g. Zahavi 2001). In terms of understanding

social cognition this shift is expressed by positing embodied interaction as the primary

mechanism of our understanding of other minds, rather than theoretical inference or

empathic simulation (cf. Gallagher 2001).

One formidable challenge that needs to be addressed by the critics of methodological

individualism is to build a framework that enables them to specify precisely and in a

non-mysterious manner how it is possible for social phenomena to play a constitutive

role in the unfolding of individual behavior (De Jaegher 2009). In Chapters 2 to 4 we

have provided the foundation for this task by introducing and developing the conceptual

framework of the enactive approach to social cognition. Interestingly, the historical

roots of this approach were never that far away from methodological individualism (Di

Paolo 2008). The influential traditions of radical constructivism (e.g. von Glasersfeld

1984) and especially second-order cybernetics (e.g. von Foerster 1973) were certainly

concerned with the existence of others, but largely as a one-sided response to the specter

of solipsism that was haunting their subject-centered worldview. Maturana and Varela’s

(1980) biology of cognition continued the work of these traditions (cf. Varela 1996a),

but offered an essential improvement. They insisted on a clear logical accounting in

biology that separated constitutive (individual) and relational (interactive) phenomena

into two non-intersecting domains, whereby the relational phenomena may constrain the

constitutive processes and vice versa. This opened the door to a fuller appreciation of

the role of interaction in a social, linguistic, and cultural context for the development of

higher cognitive functions (e.g. Maturana, et al. 1995), even leading to the radical

conclusion that “we are constituted in language in a continuous becoming that we bring

forth with others” (Maturana & Varela 1987, pp. 234-235). Nevertheless, this strong

idea of self-other co-determination has remained marginalized in the biology of

cognition, in particular because its doctrine of non-intersecting domains simply leaves it

unexplained precisely how relational phenomena constrain processes in the constitutive

domain. It thus remains unclear how the social is afforded any proper constitutive role.

A detailed comparative analysis of how the enactive approach relates to these traditions

is desirable, but unfortunately beyond the scope of this thesis. In its early formulations it

was certainly still afflicted by a lingering methodological individualism, though in

recent work this has become less of a problem (see footnote 19). As should have become clear from the

preceding chapters, one crucial difference is that the notion of identity conservation has

been replaced by a focus on precariousness and normativity. The profound implication

of this simple change in focus is that a richer grasp of the interrelationship between the

constitutive and relational domains is now conceivable, to the extent that it becomes

possible to think about how socio-cultural values can transform even basic metabolic

processes (Di Paolo, in press). Enactive cognitive science has thus traded the framework

19 The related tradition of the enactive or, more precisely, sensory-motor approach to perception (e.g. Noë

2004; O’Regan & Noë 2001), which took inspiration from the early work by Varela and colleagues, has

remained committed to methodological individualism. We will return to this point in Chapter 12.

of methodological individualism for an approach that is more akin to the dialectical and

historical materialism of Vygotsky’s psychology, as first proposed by Marx and Engels,

though stripped of its metaphysical pretense to universality and instead embedded in

systems thinking and a closed-loop epistemology.

However, it is one thing to say that social interaction plays a constitutive role, and

another to say exactly how it plays that role. The focus of Chapters 6 to 10 is therefore

to demonstrate more concretely the constitutive interplay between the individual and

interactional levels. First, the problems faced by the doctrine of methodological

individualism in accounting for a range of psychological phenomena are highlighted,

and a possible role of the interaction process is indicated (Chapter 6). In order to get a

better understanding of how it is possible for the dynamics of the interaction process to

be constitutive of an individual’s behavior we introduce evolutionary robotics as a

methodology to generate complete models of minimal complexity (Chapter 7). This is

followed by a range of novel modeling experiments which show concretely and in

mathematically analyzable terms how individual and interaction levels are interrelated

in multi-agent systems (Chapters 8 to 10).

6 Studies in social psychology: A critical analysis

The aim of this chapter is to show that the enactive approach to social cognition can be

used to provide a fresh perspective on some important experiments in developmental

and social psychology. Most contemporary studies of social cognition attempt to explain

the widespread phenomenon of bodily coordination, for example facial imitation or

gestural exchange, in terms of a combination of three factors pertaining to the individual

interlocutors. These factors consist of two kinds of self-perception and one kind of

other-perception: (i) visual self-perception, (ii) proprioceptive self-perception, and (iii)

visual other-perception. As we will see, most traditional explanations have no problems

accounting for bodily coordination as long as they can appeal to either or both kinds of

self-perception and some form of other-perception. If this is not possible, for instance in

some pathological cases, additional ad hoc neural systems are typically postulated to fill

the explanatory gap. From the perspective of the enactive approach to social cognition,

it appears that what is missing from these traditional explanations in psychology is an

appreciation of the role of the interaction process in organizing individual behavior.

In order to motivate a consideration of the interaction process for explaining empirical

data in developmental and social psychology we will proceed as follows. First, some of

the main concepts used by traditional approaches, i.e. body image, body schema and

proprioception, need to be clarified. This is followed by a discussion of several case

studies which map out empirically the various possibilities of the conceptual space

afforded by the combinations of these three concepts. On this basis an attempt is made

to adopt a more progressive approach to social cognition in recent cognitive science,

namely the integrative theory of gesture, to make sense of this data. Nevertheless, some

potential problems are identified. Finally, it is suggested that the enactive approach to

social cognition, with its focus on the constitutive role of the interaction process, is a

promising candidate to resolve these difficulties.

Note that it is beyond the scope of this chapter to develop a full response to each of the

chosen case studies. However, the modeling experiments presented in the next chapters

indicate one potential methodology for beginning to do this in a more systematic

manner. They begin to develop the conceptual language of dynamics that would be

needed for a more thorough review of these psychological studies.

6.1 Body image and body schema

We will follow Gallagher and Cole (1995) in making a conceptual distinction between

two aspects of embodiment. On the one hand there is the body image (BI), which

consists of a system of perceptions, attitudes, and beliefs pertaining to one’s own body.

On the other hand, there is the body schema (BS), namely a system of sensory-motor

capacities that functions without conscious awareness or the necessity of perceptual

monitoring. In other words, “the difference between body image and body schema is

like the difference between a perception (or conscious monitoring) of movement and the

actual accomplishment of movement, respectively” (Gallagher 2005, p. 24). There is

empirical evidence that this conceptual distinction does indeed pick out two different

aspects of our embodiment, and some of this evidence will be presented as part of the

case studies.

Another important concept that we need to define more clearly is that of proprioception,

which is often used to refer to somatic information about joint position, limb extension,

as well as bodily position and body posture more generally. This notion of somatic

information can be meant in the form of a pre-reflective pragmatic awareness which,

while not taking the body as an object, still contributes a certain spatial structure to the

perceptual body image. But the notion can also be used to refer to a non-conscious

process, whereby physiological stimuli activating peripheral proprioceptors, which in

turn are registered at certain strategic sites in the brain, operate as part of the system that

constitutes the body schema. Following Gallagher (2005, p. 46) we will refer to these

two different ways of conceptualizing proprioception in terms of proprioceptive

awareness (PA) and proprioceptive information (PI), respectively.

The distinction between body image and body schema is related to the important

phenomenological distinction between the body-as-object and the body-as-subject (cf.

Legrand 2006). In the former case our embodiment is reflectively experienced by means

of sensory perception, and in the latter case it is experientially transparent as a means of

being-in-the-world. Note, however, that in contrast to the notion of body schema, which

refers to an integrated system of physiological capacities and proprioceptive information

that remains outside our awareness, the body-as-subject is a phenomenological notion.

As such it refers to a structure of our lived experience, for example our pre-reflective

proprioceptive awareness.

With these distinctions in place we can now return to the three factors traditionally used

to explain the existence of bodily coordination. First, we have visual self-perception,

which is an essential part of the body image we have of ourselves (Self-BI). And then

there is proprioceptive self-perception, which is usually taken as an essential part of the

sub-personal mechanisms that are outside of our awareness (PI). And finally there is

visual other-perception, which forms an essential part of the body image that we have of

the other interlocutor (Other-BI). On the basis of these three factors we can now

perform a simple meta-analysis of the literature by focusing on case studies that map out

the space of possibilities, as summarized in Table 6-1.

Case  Self-BI  Other-BI  PI   Example of bodily coordination
1     √        √         √    Non-pathological face-to-face interaction
2     -        √         √    Facial imitation by human neonates
3     √        √         -    Gesturing by a deafferented subject (I)
4     √        -         √    Perceptual crossing in a virtual space
5     -        √         -    Deafferented subject under a blind
6     √        -         -    Gesturing by a deafferented subject (II)
7     -        -         √    Body imitation in a virtual space (I)
8     -        -         -    Body imitation in a virtual space (II)

Table 6-1. A list of case studies (1-8) that spans all possible combinations of the three factors

traditionally used to explain bodily coordination: a self-directed body image (Self-BI), an other-directed

body image (Other-BI), and proprioceptive information (PI). Each case study provides a brief description

of an instance of bodily coordination and the particular factors which would be available in terms of the

traditional explanatory framework. Note that there are some instances of coordination which appear to fall

outside the scope of this traditional framework. Nevertheless, all of these cases allow the possibility of

responsive interaction between the participants, and thus retain a role for the interaction process.
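That three binary factors yield exactly eight cases can be checked with a minimal Python sketch. The names FACTORS and factor_combinations are our own and purely illustrative, and the order in which the combinations are generated does not follow the case numbering of Table 6-1.

```python
from itertools import product

# The three factors of the traditional explanatory framework (from the text).
FACTORS = ("Self-BI", "Other-BI", "PI")


def factor_combinations():
    """Return every present/absent assignment of the three factors."""
    return [
        dict(zip(FACTORS, flags))
        for flags in product((True, False), repeat=len(FACTORS))
    ]


combos = factor_combinations()
for combo in combos:
    # List only the factors that are available in this combination.
    present = [name for name, available in combo.items() if available]
    print(", ".join(present) if present else "(none)")
```

The enumeration contains one combination with all three factors, three with two, three with one, and one with none, which matches the number of case studies listed in the table.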

6.2 Case studies in social psychology

We will now expand on the information provided in Table 6-1 by providing a more

detailed description of the example of bodily coordination for each case, as well as the

kind of explanations that have been offered to account for them. The focus on bodily

coordination is justified because it is one of the most basic forms of inter-individual

interaction. In all of the cases the aim is to use the empirical evidence to develop an

understanding of the necessary and sufficient conditions for the establishment of bodily

coordination between cognitive agents. What will emerge out of this analysis is a more

precise grasp of the explanatory gaps in the traditional framework of social psychology,

and a sense of the kind of explanations that can potentially be offered by the enactive

approach to social cognition.

6.2.1 Non-pathological face-to-face interaction

Let us begin this analysis of psychological studies of bodily coordination with a brief

consideration of unencumbered, everyday social interaction between human beings. It is

well known in social psychology that unconscious gestural imitation is prevalent during

human interactions, and that the particular unfolding of this imitation even influences

the meaning of the social encounter (e.g. LaFrance 1982). In this unconstrained case the

full explanatory framework of traditional social psychology is available to account for

this phenomenon of bodily coordination, as represented schematically in Figure 6-1.

This basic case covers a whole range of social phenomena, including what has been

described as primary intersubjectivity (Trevarthen 1979), secondary intersubjectivity

(Trevarthen & Hubley 1978), as well as joint attention and linguistic interaction

(Tomasello 1988). Indeed, out of all the listed cases this is the one that allows one of the

richest forms of embodied interaction to emerge between the participants, i.e. the full

range of human intersubjectivity. However, because explanations of these forms of

bodily coordination can appeal to any combination of the three factors, it is difficult to

sort out which of them are necessary and/or sufficient conditions.

[Figure 6-1 diagram. Box labels: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body Schema (BS); Proprioception (PI); Motor Act.]

Figure 6-1. A schematic of the three factors that are traditionally used to explain bodily coordination

during social interaction: (i) the body image that we have of ourselves (Self-BI), which is largely formed

through visual perception, (ii) the proprioceptive information (PI) of one’s bodily posture and movement,

and (iii) the visual perception of the other’s body (Other-BI).

In order to better determine whether any combination of the three traditional factors

depicted in Figure 6-1 is necessary and/or sufficient to explain the phenomenon of

bodily coordination in general we can consider a series of more restrictive case studies

which systematically eliminate their potential influence.

6.2.2 Facial imitation by human neonates

It is well known in developmental psychology that human infants are good at imitating a

wide variety of gestures performed by adult experimenters (e.g. Meltzoff & Moore

1977; 1989). Moreover, it has been shown that even newborn infants less than an hour

old can successfully imitate facial gestures such as mouth-opening and tongue

protrusion (Meltzoff & Moore 1983). In general, the range of imitative capacities

exhibited by young infants is too extensive and specific to be explained in terms of predetermined

innate sensory-motor structures (BS) alone. Indeed, the fact that infants

retain this imitative capacity even when the experimenters introduce considerable delays

between the presentation of a stimulus and the infant’s ability to respond implies that

early imitation is not entirely stimulus bound, directly triggered, or reflexive (cf.

Meltzoff & Moore 1997, p. 182).

In the case of adult imitation it is possible to provide an explanation in the traditional

framework that is based on a comparison between self-BI and other-BI. But how can we

explain the infants’ ability to imitate facial gestures given that they have not yet had the

chance to construct a body image based on visual perception of their face? For theorists

who hold that perception is unorganized in early infancy even this very phenomenon

itself appears to be impossible:

Thus since the child cannot see his own face, there will be no imitation of

movements of the face at this stage. […] For imitation of such movements to be

possible, there must be co-ordination of visual schemas with tactilo-kinesthetic

schemas. (Piaget 1962, p. 45; quoted by Gallagher 2005, p. 68)

However, that such imitation by even very young infants is indeed possible has now

been conclusively demonstrated. How are we to explain this phenomenon? Meltzoff and

Moore propose that early facial imitation is based on ‘active intermodal mapping’

(AIM), whereby the target matching process is captured by a proprioceptive (PI)

feedback loop. In essence, the experiments on neonate imitation demonstrate, contrary

to the traditional position of theorists such as Piaget (1962), Merleau-Ponty (1945/1962)

and others, that a functioning body schema is present at least from birth onwards, if not

even earlier. The AIM model posits that a PI feedback loop from the infant’s motor acts

enables it to perform the appropriate gesture even without visual awareness of its own

face. The equivalence between the target other-BI and the infant’s PI is therefore

determined by means of a supra- or intermodal matching process (Meltzoff & Moore

1997). A simple schematic of the AIM hypothesis is shown in Figure 6-2.
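The logic of such a feedback loop can be illustrated with a deliberately crude Python sketch. It is not Meltzoff and Moore's model: we assume, purely for illustration, that the seen target act and the felt act reported by PI can each be expressed as a single number in a shared supramodal code, and the names imitate, gain and tolerance are hypothetical.

```python
def imitate(target, motor=0.0, gain=0.5, tolerance=0.01, max_steps=100):
    """Adjust a motor command until proprioceptive feedback matches the target.

    All quantities are scalars in an assumed common (supramodal) code.
    Returns the final motor command and the number of corrections made.
    """
    for step in range(max_steps):
        # Assumption: PI reports the current motor act faithfully.
        proprioceptive = motor
        # The 'equivalence detector': mismatch between seen target and felt act.
        mismatch = target - proprioceptive
        if abs(mismatch) <= tolerance:
            return motor, step
        # Correct the motor act in proportion to the detected mismatch.
        motor += gain * mismatch
    return motor, max_steps


final_motor, corrections = imitate(target=1.0)
print(final_motor, corrections)
```

Note that no visual perception of the imitator's own act enters the loop; the comparison runs only between the seen target and the proprioceptive feedback, which is the feature of the AIM hypothesis that matters for the present argument.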

[Figure 6-2 diagram. Box labels: Visual Perception of Target Adult Facial Acts; Supramodal Representation of Acts; Equivalence Detector; Proprioceptive Information; Infant Motor Acts.]

Figure 6-2. A conceptual schematic of the ‘active intermodal mapping’ (AIM) hypothesis adapted from

Meltzoff and Moore (1997). They make use of a traditional form of explanation: the imitative ability of

human neonates is accounted for by appealing to an intermodal equivalence detector between the

neonate’s visual percept of the other (Other-BI) and its proprioception (PI).

[Figure 6-3 diagram. Box labels: Perception of Target Acts (Other-BI); Self/Other Comparator; Body schema (BS); Proprioception (PI); Motor act.]

Figure 6-3. A revised version of the schematic in Figure 6-1. In the case of neonate imitation traditional

explanations cannot appeal to the existence of a body image based on visual perception of the neonate’s

body (Self-BI). Accordingly, Meltzoff and Moore (1977) propose their ‘active intermodal mapping’

(AIM) hypothesis which is illustrated in Figure 6-2.

Apparently the existence of such a general matching process does not require extensive

periods of learning appropriate intermodal correlations since neonatal imitation has been

demonstrated with infants even less than a month old (Meltzoff & Borton 1979). Here

we thus have an example of how bodily coordination can be explained in the traditional

framework by appealing to the existence of an Other-BI and PI alone without the need

for a perceptually formed self-BI as well. This possibility is illustrated in Figure 6-3.

While the existence of an innate BS can explain the ability of the neonates to engage the

appropriate part of their bodies, it is difficult to see how this BS alone can explain their

ability to actually improve this performance. How do they know which PI matches the

target PI that they would receive when accurately copying the intended movement? 20 To

be fair, it could be argued that at least a minimal Self-BI is also present for the neonates

because of the existence of proprioceptive awareness (PA). Gallagher and Meltzoff

(1996), for example, suggest that PA, as a tacit, pre-reflective awareness, constitutes the

very beginning of a primitive form of Self-BI. Thus, a primitive self-BI might also play

an essential role, namely as the comparative goal-state to be attained by the innate BS.

6.2.3 Gesturing by a deafferented subject (I)

Is it possible to better differentiate between the contributions of the self-BI and PI for

bodily coordination? We can gain some further insights by considering the case of a

deafferented subject, Ian Waterman (sometimes referred to as IW in the literature), who

has lost all sense of touch and proprioception in his body below the neck. This acute

sensory neuropathy developed in 1971, when Ian was 19 years old, because of an illness

which damaged the large myelinated fibers below his neck. The case of Ian has been

well documented by his doctor, Jonathan Cole (1995). Ian’s case is of interest here

because one way to describe his condition is that of a lost body schema:

20 Since some readers might be tempted to answer ‘mirror neurons’ to this question (e.g. Rizzolatti, et al.

2001; Gallese & Goldman 1998), it is fitting to very briefly consider why this response is fundamentally

inadequate. First, there are deep conceptual difficulties with the notion of ‘mirroring’, including the lack

of pretense at the level of neurons (Gallagher 2007). And even if these difficulties could be resolved there

is experimental evidence that the activation of these neurons is nothing specifically social, but rather a

contingent outcome of associative learning that can even be reversed (Catmur, et al. 2007). Finally, the

reference to ‘mirror’ system activation just shifts the explanatory problem to the neural domain: what is

the mechanism which makes these neurons fire when the infant sees the other’s movement, and what

mechanism can relate this firing activity to the infant’s accurate copying of the target movement?

At the earliest stage of his illness he had no control over his movements and was

unable to put intention into action. There was, one might say, a disconnection of

will from the specifics of movement. If Ian decided to move his arm in a certain

direction, and then tried to carry out the intended movement, the arm and other

parts of his body would move in unpredictable ways. Without support, Ian was

unable to maintain anything other than a prone posture. He had no knowledge of

limb position unless he saw the limb. But even with vision, he had no control

over his movement. Because of the absence of proprioceptive and tactile

feedback his entire body schema system failed. (Gallagher 2005, p. 44)

The case of Ian thus highlights the importance of proprioception for the basic

maintenance of posture and the governance of movement. However, even though Ian

never recovered from the original neuropathy he nevertheless recovered control over his

movement and regained a close to normal life as a result of extreme effort. He has been

able to address the motor problem on a cognitive and behavioral level by rebuilding a

partial and very minimal body schema, in terms of nearly automated cognitive

processes, and by using his well-developed body image to help consciously control

movement:

When we say that he has regained control over his posture and movement, we do

not mean that he has recovered from the neuropathy that destroyed his sensory

nerves. Rather his control of movement is based primarily on visual attention

and cognitive effort (although some aspects of walking have become close to

automatic due to consciously guided practice). In the dark, controlled movement

is impossible since he has not visual access to current position of his limbs and

cannot tell where they are in relation to one another. (Cole, et al. 2002, p. 51)

In other words, even after his behavioral recovery “Ian still does not know, without

visual perception, where his limbs are or what posture he maintains. In order to maintain

motor control he must conceptualize his movements and keep certain parts of his body

in his visual field. His movement requires constant visual and mental concentration”

(Gallagher 2005, p. 44). How has Ian’s neurophysiological loss of a body schema and

subsequent recovery by means of a highly developed body image affected his ability to

gesture? Cole, Gallagher and McNeill (2002a; 2002b) report on a series of experiments

conducted with Ian that were designed to investigate this question. When Ian was asked

to narrate a cartoon he had just seen they found that “when vision of his hands was

available, he made numerous meaningful gestures well synchronized with his co-expressive

speech, confirming that he had the ability to produce gestures. His gestural

performance looked essentially identical to non-neuropathic performance, and further

computerized analysis of the video confirmed this” (Cole, et al. 2002a, p. 55).

Ian reports that his relatively normal gestural performance is made possible by the fact

that he consciously initiates the gestures, and that he controls them just like he does for

the case of instrumental movement, namely by continually checking his body by means

of visual perception. This possible explanation is depicted in Figure 6-4.

[Figure 6-4 diagram. Box labels: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body schema (BS); Motor act.]

Figure 6-4. The case of Ian Waterman presents us with the opposite situation of human neonates (at least

to some extent). The neurophysiological loss of proprioception (PI and PA) and the subsequent behavioral

compensation via visual perception have left him with a highly developed body image (Self-BI) but only

a primitive body schema (BS), indicated by the non-solid box. Under these conditions it is still possible to

explain his successful gestural coordination in terms of the traditional framework, namely by appealing to

a reflective (cognitive) comparison between the Self-BI and Other-BI.

Note that the case of Ian seemingly presents us with the opposite situation of human

neonates. He can achieve bodily coordination in terms of his reliance on a visually

guided Self-BI, and without proprioception (PI and PA). Accordingly, the cases of

human neonate imitation and Ian’s gestural competence have indicated that neither a

Self-BI nor pre-reflective PI is a necessary condition for bodily coordination. On the

other hand, either of these factors in combination with a visually formed Other-BI

appears to be a sufficient condition to explain the existence of such bodily coordination.

But is this kind of visual Other-BI itself a necessary condition? How can we make sense

of coordination between agents that cannot perceive each other as such?

6.2.4 Perceptual crossing in a virtual space

In order to answer this question we need to find a case in which a normal participant,

i.e. one with a visually guided Self-BI and pre-reflective PI, can engage in mutual

interactions with another participant, but where that partner is not perceived in terms of

an Other-BI. One way to consider this situation is by means of Auvray, Lenay and

Stewart’s (2009) psychological experiment in perceptual crossing. This study attempts

to explore the necessary conditions for participants to locate each other through minimal

technologically mediated interaction in a shared virtual space. A schematic of the

experimental setup is shown in Figure 6-5.

Figure 6-5. The experimental setup of Auvray, Lenay and Stewart’s (2009) study in perceptual crossing.

Two participants face each other in a 600 unit long 1-D wrap-around environment. Note that each

participant’s receptor field (white) can encounter three different objects: a static object (gray), the other

participant’s receptor field (gray), and the other participant’s ‘shadow’ of that receptor field (gray).

Since this experimental setup will be the basis for the models presented in Chapters 9

and 10, it will be helpful to describe it in a bit more detail here. Two adult participants,

acting under the same conditions, can move a mouse cursor left and right along a shared

one-dimensional virtual tape that wraps around at the edges. They are asked to indicate

the presence of the other partner by clicking their mouse pointer. The participants are

blindfolded, in separate rooms, and all they can sense are on/off tactile stimulations on a

finger when their cursor crosses an entity on the tape. Apart from each other’s receptor

field, participants can encounter two other objects: a static object on the tape, and a

displaced ‘shadow image’ of the partner, which moves strictly in an identical manner to

the partner’s receptor (though it is not a source of stimulation for that partner). Thus,

each participant can encounter three different types of object in the shared environment:

(i) The four-unit wide sensory receptor field of the other participant. When any of

the four sensors of a participant overlaps with any sensors of the other

participant, both of them receive sensory stimulation. This possibility of

‘perceptual interaction’, or shared perception, represents the way in which an

embodied perceiver cannot observe someone else without also at the same time

being perceivable in some manner, at least potentially (e.g. mutual gaze or

touch).

(ii) A four-unit wide static object that is placed at a specific location. There are two

static objects, one for participant ‘up’ and one for participant ‘down’, which are

located between 148-152 units and 448-452 units, respectively. Each participant

can only perceive its specific static object. The objects were chosen to be

participant specific and placed in diametrically opposite locations of the

environment in order to encourage the participants’ displacement within the

whole 1-D space.

(iii) A four-unit wide ‘shadow’ object, whose position tracks that of the other

participant at a fixed distance. The mobile shadow object reproduces the exact

same movement as the receptor field of the other participant, but is displaced by

48 units. Nevertheless, in contrast to an actual inter-individual encounter it does

not give rise to the possibility of two-way or mutual perceptual interaction.

Thus, if a participant encounters the other’s shadow object, it will indeed

receive the same sensory stimulation as if it had encountered the other

participant. However, since the other participant does not receive any

corresponding sensory stimulation, there is only the possibility of ‘one-way’

interaction.

In summary, there are three distinct types of objects which can be encountered by a

participant, one of which is placed at a fixed location and two of which are moving

within the 1-D space. All objects are four units wide. The two mobile objects exhibit

exactly the same movement, but only an overlap of the receptor fields of both

participants gives rise to mutual sensory stimulation. Note that the difference between

these three types of objects cannot be directly provided by the sensors, which in all

cases can only produce a binary, all-or-nothing response depending on whether

something is overlapping their particular receptor field or not. There is no Other-BI

available; all entities produce the same immediate percept. Thus, if the participants are

to be successful at distinguishing which of the encountered objects is the other agent (or

more precisely, its receptor field), they must accordingly rely on differences in the kinds

of interactions that these objects afford.
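The sensory contingencies just summarized can be sketched in a few lines of Python. This is only an illustrative reading of the setup and not the software used by Auvray, Lenay and Stewart: the function names, the treatment of each position as the left edge of a four-unit field, and the direction of the 48-unit shadow displacement are all assumptions.

```python
TAPE_LENGTH = 600    # length of the wrap-around 1-D space (from the text)
WIDTH = 4            # every object and receptor field is four units wide
SHADOW_OFFSET = 48   # shadow displacement; the positive direction is assumed

# Left edges of the participant-specific static objects (148-152 and 448-452).
STATIC_START = {"up": 148, "down": 448}


def overlaps(a, b, width=WIDTH, length=TAPE_LENGTH):
    """True if two width-wide intervals with left edges a and b overlap on the ring."""
    d = (a - b) % length
    return d < width or d > length - width


def stimulation(self_pos, other_pos, who):
    """Binary sensor output for participant `who` ('up' or 'down')."""
    shadow_pos = (other_pos + SHADOW_OFFSET) % TAPE_LENGTH
    return (
        overlaps(self_pos, other_pos)                # other's receptor field (mutual)
        or overlaps(self_pos, shadow_pos)            # other's shadow (one-way)
        or overlaps(self_pos, STATIC_START[who])     # own static object only
    )


# Two-way contact: receptor fields overlap, so both participants are stimulated.
print(stimulation(100, 100, "up"), stimulation(100, 100, "down"))

# One-way contact: 'up' sits on the shadow of 'down' (200 + 48 = 248).
print(stimulation(248, 200, "up"), stimulation(200, 248, "down"))
```

Under these assumptions the sensor returns the same value for all three kinds of object, so the only difference available to a participant is whether the partner is stimulated at the same moment, which the second call pair shows is not the case for the shadow.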

The results of the psychological study show that, at least under the minimalist

conditions of this experiment, the successful recognition of an ongoing interaction with

another person is impossible for individual participants. There is no significant

difference in the probability of responding to an encounter with the other’s receptor or

with the ‘shadow image’. Thus, overall success in this task cannot be due to individual

capacities alone. It is also based on certain properties that are intrinsic to the joint

perceptual activity itself. The important issue is that the scanning of an object

encountered will only stabilize in the case that both partners are in contact with each

other. If interaction is only one-way, between a participant and the other’s shadow, the

shadow will eventually move away, because the participant it is shadowing (the partner)

is still engaged in searching activity. Therefore, the solution to the task does not only

rely on individuals performing the right kind of perceptual discrimination between

different momentary sensory patterns, but also emerges from the mutual perceptual

activity of the experimental subjects that is oriented towards each other. Moreover, this

perceptual activity typically involves the spontaneous emergence of coordinated

behaviours, namely both participants moving the mouse pointer so as to oscillate around

each other. Bodily coordination in the form of two-way mutual scanning is, given the

task and experimental setup, the only long-term stable solution 21 .

[Figure 6-6 diagram. Box labels: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body Schema (BS); Proprioception (PI); Motor Act.]

Figure 6-6. Under the minimalist conditions of perceptual crossing that have been investigated by

Auvray, Lenay and Stewart (2009) bodily coordination emerges despite the lack of any evident Other-BI.

In sum, Auvray, Lenay and Stewart’s study indicates that the factor of Other-BI that has

been appealed to in the previous cases is not a necessary condition in order to explain

the existence of bodily coordination. The participants of the study are individually not

able to distinguish between the other’s avatar and the linked shadow, but can still

collectively engage in coordinated behavior (cf. Figure 6-6). Accordingly, this study

also indicates something much more important: there is another factor available that the

traditional form of explanation has not even properly considered yet, namely the self-organizing

properties of the interaction process itself. In other words, in this case the

direct perception of the other participant (presumably necessary for the constitution of

an Other-BI) was not necessary. The overall experimental situation was organized in

such a way that the appropriate individual behavior emerged spontaneously out of the

stabilities and instabilities of the interaction process (IP) within a multi-agent system.

21 Strictly speaking, a solitary interaction with the static object is another stable behavior, as long as the

participant does not realize her mistake and move elsewhere. The possibility of getting stuck around

static objects can therefore cause some problems. We will return to this issue in Chapter 10.

6.2.5 Deafferented subject under a blind

Another way to develop a better appreciation of the role of the IP as a potential factor

which can be appealed to in order to explain bodily coordination is to consider the more

extreme cases. We have already seen that it is possible for a deafferented subject like

Ian Waterman to be integrated within everyday life, even if he has to compensate for his

lack of proprioception with a constantly visually updated Self-BI (cf. Section 6.2.3). We

also know that he is incapable of performing instrumental actions without such visual

feedback: “In reaching to grasp, for example, he has to see not only the target object,

but also his hand. On the basis of what he sees, he needs to think about how to shape his

hand in order to pick up the object” (Gallagher 2005, p. 50). Accordingly, it is to be

expected that his ability to gesture would be severely impaired when he is not able to

guide his hands and arms by means of visual feedback.

To discover whether Ian controlled his gestures using visual feedback, a blind was

placed to block the view of his hands. In this blind condition he is able to perform

gestures similar to those seen without the blind, even though he is now lacking both a

Self-BI and any sense of proprioception. This situation is depicted in Figure 6-7.

[Figure 6-7 diagram. Box labels: Perception of Target Acts (Other-BI); Self/Other Comparator; Body Schema (BS); Motor Act.]

Figure 6-7. When Ian gestures under the blind he lacks both visual and proprioceptive (PI and PA)

feedback. But even under these conditions he is able to coordinate his behavior meaningfully with his

interlocutors. How can traditional accounts explain this situation?

Gallagher (2005, p. 113) offers three factors to explain Ian’s performance: (i) prosodic

feedback in speech, (ii) semiotic feedback, i.e. the co-expressiveness of gesture and

speech, and (iii) pragmatic feedback relating to the communicative situation. It is factor

(iii) that we are interested in here, which may include the meter or cadence of the

conversation, and response gestures made by one’s interlocutor. These may offer some

important cues for maintaining synchronization. Since Ian can perform gestures almost

normally and fluently under conditions in which instrumental action would be difficult

or impossible for him, namely without visual feedback, we can conclude that gestural

and instrumental movement cannot be the same with regard to the mechanisms that

generate them. Gallagher explains:

[G]esture is an action that helps to create the narrative space that is shared in the

communicative situation. This suggests that it is part of and is controlled by a

linguistic/communicative system rather than a motor system. […] The fact that

Ian’s gestures can be decoupled from visual monitoring and can be performed

without sensory feedback of any kind, and yet remain relatively accurate in time

and form, suggests that gestural movements are controlled by a system that is in

some measure independent from the system that controls the same muscles in

instrumental actions. (Gallagher 2005, pp. 117-118)

To be sure, due to Ian’s condition it is not sufficient to appeal to the motor system in

this case, but what does the additional control by this linguistic/communicative system

precisely consist in? Unfortunately, due to Cole, Gallagher and McNeill’s (2002a) focus

on Ian’s behavior alone, nowhere do we find a description of the interaction process or

gesturing and general social presence of Ian’s interlocutors. Why is nothing said about

the pragmatic feedback relating to the communicative situation that is available?

For our present purposes we can fortunately refer to two pictures of the experimental

situation that are depicted in Gallagher’s (2005) book on pages 112 and 115. The former

shows Ian under a blind, facing a visible interlocutor and another potentially visible

participant sitting in a corner. The latter shows Ian wearing an eye-tracking device

explaining, with the aid of maps and floor plans, how he had arrived at the lab that

morning. At first sight it appears that he is alone in the room, but a closer look reveals

that there is someone sitting diagonally in front of him. She would be potentially visible

to Ian if he only slightly turned his head.

This makes it likely that Ian had access to an Other-BI during the gesturing. But in what

way was there mutual interaction between him and the others? Was there coordination

of gestures? The lack of detailed description on this point is surprising considering that

Gallagher’s integrative theory of gesture portrays gesture as embodied (constrained and

enabled by motoric possibilities), communicative (pragmatically intersubjective), and

cognitive (contributing to thought). What is the role of pragmatic intersubjectivity for

gesture here, other than perhaps to function as a source of inputs like it does in the AIM

hypothesis shown in Figure 6-2? Unfortunately, not much more is said on this matter,

other than a statement to the effect that “gesture, as a movement concerned with the

construction of significance rather than with doing something, is organized primarily by

the linguistic-communicative context” (Gallagher 2005, p. 120). But how precisely does

the social context organize mere movement into gesture? And does the efficacy of this

context require the existence of a visually formed Other-BI?

6.2.6 Gesturing by a deafferented subject (II)

As a first step toward addressing these questions we can ask: Can Ian gesture normally

under conditions when (i) he has visual access to his own movements (Self-BI), but (ii)

has no access to the movements of his interlocutors (Other-BI)? This question has not

yet been specifically investigated with Ian or another deafferented subject, but we can

try to piece a possible answer together by considering what has been reported by Cole,

Gallagher and McNeill (2002a; 2002b) and Gallagher (2005, pp. 107-129).

We have already noted with respect to the previous case that, even though it is difficult

to tell from the descriptions provided of the experimental situations, it appears that Ian’s

gesturing has always been observed in conditions where others were visually present,

thereby providing him with access to a visually formed Other-BI to coordinate his

gestural movements. This Other-BI can therefore be appealed to by a traditional account

in order to explain Ian’s ability to gesture meaningfully under the blind where he cannot

see his hands or arms. However, a more enactive explanation that emphasizes the role of

the interaction process is also possible.

Consider, for example, the ability of participants to coordinate their movements during

Auvray et al.’s psychological study. In this setup a minimalist technological interface

prevented the participants from directly perceiving the other as such, thereby leading to

an individual failure of distinguishing that other’s presence, but nevertheless resulted in

collective coordination of perceptual crossing. It is therefore possible that in the case of

Ian the interaction process could similarly be sufficient to organize his gestural activity

appropriately, even without visual access to others. It is evident, for example, that Ian’s

gesturing is shaped by the general context of the social situation, including its meaning,

verbal structure and timing, even without conscious awareness (i.e. under the blind):

IW’s gestures reflected the meaning he was attempting to convey with a

significant degree of morphokinetic precision, suggesting that meaning plays a

part in how those gestures are shaped and how his hands move in gesture. IW’s

gestures were also precisely synchronized with the verb phrases he used to

describe various events. His gestures are normal both with respect to

morphokinesis and with respect to timing. (Cole, et al. 2002a, p. 57) 22

Moreover, it is clear that Ian also sometimes guides his gesturing by means of visual

feedback, even while not directly observing his interlocutors at the time, for example

during situations of joint attention. Such a situation occurred during the eye-tracking

experiment when Ian was asked to describe how he had arrived at the lab that morning:

“While recounting his journey and arrival most of his gestures involved pointing at the

map or floor plan, with his gaze directed at the same place as he was pointing”

(Gallagher 2005, p. 115). In his book Gallagher complements this account with a picture

of Ian looking at a floor plan while pointing to a particular location, with an onlooker

observing this action. This kind of social situation, where neither Other-BI nor PI is

immediately available, is schematically depicted in Figure 6-8.

22 Note that a lingering trace of methodological individualism appears in Cole, Gallagher and McNeill’s

approach to synchrony and timing. Instead of using these terms in relation to the dynamics of the inter-individual

interaction process and the unfolding of bodily coordination between Ian and others, they are

only used to indicate intra-individual coordination (e.g. the internal relation between hand movement and

speech production). In terms of this internalist focus their position is somewhat akin to the traditional

framework of the ‘active intermodal mapping’ hypothesis of Meltzoff and Moore.

[Figure 6-8 diagram. Boxes: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body Schema (BS); Motor Act.]

Figure 6-8. Experiments with Ian Waterman could present us with an example of gestural coordination

that neither depends on the visual perception of bodily movements of the other interlocutor (Other-BI),

nor on any form of PI. The restricted availability of an Other-BI, which may still be informed by the

overall social context, is indicated by the non-solidity of the circle.

To be fair, such an intermittent episode during a joint attentional scene does not fully

qualify for the case we are looking for, especially since Ian often visually faced the

other during the experiment: “Several times the gaze-tracking cross-hairs zeroed in on

the interlocutor’s face while Ian’s gestures remained in perfect synchrony with his

speech and were semantically appropriate for non-deictic comments – normal gestures

in every way” (Gallagher 2005, p. 115). Nevertheless, we can derive a novel empirical

prediction from these considerations. If Ian is engaged in a mutually responsive social

interaction during which he is able to control his movements by means of a visually

informed Self-BI, but the other is not visually present (e.g. a conversation with someone

in the next room), he will still be able to perform appropriate gestures. Moreover, if a

Self-BI is sufficient for Ian’s ability of bodily coordination even when only a severely

restricted Other-BI is available, we can hypothesize that for a normal experimental

participant PI alone could be sufficient. These conjectures may appear to be forced, but

the point is precisely to drive home the pervasive structuring presence of social context.

6.2.7 Bodily coordination in a virtual space (I)

Is there a way to investigate whether bodily coordination is possible for a subject with

PI, but where there is neither a Self-BI nor any substantial Other-BI available? Indeed,

this is made possible by means of another minimalist psychological experiment devised by

Charles Lenay and his group at the University of Compiègne in France. The study has

not been published yet, so the description of the experimental setup and the results

provided here are based on recent talks given by Lenay 23 .

The general setup of the experiment is quite similar to that of the perceptual crossing

study by Auvray, Lenay and Stewart (2009) that was described in Section 6.2.4 (cf. pp.

90-94). Two participants face each other in a 1-D virtual environment that wraps around

on itself after 400 units of space. Each participant can control their virtual embodiment,

which consists of two aspects: (i) the ‘body as subject’, through which the environment

is perceived, and (ii) the ‘body as object’, which can be perceived in the environment by

others. The subjective aspect of this virtual body is implemented as a basic receptor

field that is activated (tactile stimulus) when the field overlaps with the other’s body

object, otherwise the receptor remains off (no tactile stimulus). The receptor field is

moved horizontally by means of a mouse pointer, and it remains invisible to the other

participant. The objective aspect is represented simply by an object which is attached by

a variable-length connection to the receptor field. The receptor field and body object

each occupy 8 units of space.

The horizontal displacement of the body object from the receptor field is controlled by

means of mouse clicks (i.e. left click for decrease and right click for increase), and it is

the only object that is detectable by the other’s receptor field (i.e. only the body object

gives rise to a tactile stimulation on contact). Importantly, participants are told the

direction of correlation between clicks and changes to body displacement at the start of

the experiment (e.g. left-click for body size increase, etc.). This enables each participant

to change the relative displacement of their body object according to the unfolding of

the interaction process. This setup is illustrated schematically in Figure 6-9.

23 One talk was given in August 2008, at the Workshop on Enactive Approaches to Social Cognition, held

in Battle, UK, and the other was given in March 2009, as part of the Life and Mind seminars at the

University of Sussex, Brighton, UK. See: http://lifeandmind.wordpress.com

Figure 6-9. A schematic of the experimental setup of the minimalist psychological study in body

imitation designed by Lenay and colleagues at the University of Compiègne, France. Two participants

face each other in a 400 unit long 1-D virtual environment that wraps around at the boundaries. Their

virtual embodiment consists of two aspects: (i) a receptor field, and (ii) a body object, which is attached

by a variable-length connection to the receptor field. Note that each participant’s receptor field (white)

can encounter only one object in the environment that gives rise to tactile stimulation: the other

participant’s body object (gray).

The task for the participants is to engage in mutual perceptual crossing as best as

possible. If they are to achieve optimal performance they will have to coordinate the

displacements of their body objects such that they can be mutually close to each other’s

receptor field. Otherwise the situation will be inherently unstable: if one agent has to

move its receptor field to locate the other’s body object, its own body will follow this

movement (since it is attached to the agent’s receptor field with a certain displacement),

thereby in turn forcing the other agent to move as well, and so forth.
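
The stability condition described in this paragraph can be made explicit with a short worked derivation. The notation is introduced here purely for illustration and is not taken from Lenay's talks: x_A and x_B are the positions of the two receptor fields, d_A and d_B the displacements of the two body objects, L = 400 the length of the environment, w = 8 the width of a receptor field or body object, and |u|_L the shortest distance from u to 0 on a ring of length L. Treating fields and objects as segments centred on their positions is likewise an assumption.

```latex
% Assumed notation (not from the original description):
%   x_A, x_B : receptor field positions;  d_A, d_B : body object displacements
%   L = 400  : length of the wraparound environment
%   w = 8    : width of a receptor field or body object
%   |u|_L    : shortest distance from u to 0 on a ring of length L
\begin{align}
  \text{A senses B:}\quad & \lvert x_A - (x_B + d_B) \rvert_L < w \\
  \text{B senses A:}\quad & \lvert x_B - (x_A + d_A) \rvert_L < w \\
  \text{hence (triangle inequality):}\quad & \lvert d_A + d_B \rvert_L < 2w
\end{align}
```

Under these assumptions, both participants can be stimulated at the same time only if their displacements are roughly complementary, d_A ≈ -d_B (modulo L). For any other pair of displacements, one participant centring on the other's body object necessarily pulls their own body object away from the other's receptor field, which is the instability described above.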

The task of coordinating displacements of body objects is made non-trivial due to the

following factors: (i) the receptor fields begin at random positions at the start of each

trial (range [0, 400]), (ii) each participant’s body object is attached to its receptor field

by a random displacement at the start of each trial (range [-100, 100]), (iii) they have no

knowledge of the position of their receptor field nor the displacement of their body

object, and (iv) they have no knowledge of the position of the other’s receptor field, nor

the displacement of the other’s body object. The only information directly available to

the participants is the binary activation status of their receptor field, which indicates

whether the field is currently overlapping with the other’s body object or not.
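
To make the geometry of this setup concrete, the following is a minimal sketch of the virtual environment in Python. It is not the software used by Lenay and colleagues. The environment length (400) and the width of receptor field and body object (8) come from the description above; the names `Participant`, `ring_distance`, `senses` and `mutual_crossing`, the choice of centred segments for the overlap test, and the size of a click step (left as the free parameter `delta`) are all assumptions made for illustration.

```python
from dataclasses import dataclass

WORLD = 400  # length of the 1-D wraparound environment (from the text)
WIDTH = 8    # width of receptor field and body object (from the text)


def ring_distance(a: float, b: float, world: float = WORLD) -> float:
    """Shortest distance between two positions on the wraparound line."""
    d = abs(a - b) % world
    return min(d, world - d)


@dataclass
class Participant:
    receptor: float      # position of the receptor field (moved with the mouse)
    displacement: float  # offset of the body object from the receptor field

    @property
    def body(self) -> float:
        """Position of the body object, which is attached to the receptor field."""
        return (self.receptor + self.displacement) % WORLD

    def move(self, dx: float) -> None:
        """Horizontal mouse movement: the body object follows the receptor field."""
        self.receptor = (self.receptor + dx) % WORLD

    def click(self, delta: float) -> None:
        """Mouse click: change the displacement by `delta` (step size is assumed)."""
        self.displacement += delta


def senses(me: Participant, other: Participant) -> bool:
    """Binary tactile signal: does my receptor field overlap the other's body object?

    Assumption: both are segments of width WIDTH centred on their positions,
    so they overlap when their centres are less than WIDTH apart.
    """
    return ring_distance(me.receptor, other.body) < WIDTH


def mutual_crossing(a: Participant, b: Participant) -> bool:
    """True when both participants receive tactile stimulation at the same time."""
    return senses(a, b) and senses(b, a)
```

With receptor fields at 10 and 70 and displacements of 60 and -60, `mutual_crossing` holds. If the second participant then changes their displacement to -30, only they remain stimulated; when the first participant moves their receptor field by 30 units to regain contact, their own body object follows and the second participant loses contact instead. This reproduces, in miniature, why participants can only rely on the binary signal returned by `senses` and must regulate their displacements jointly.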

We have thus found an experimental setup to investigate this particular distribution of

traditional explanatory factors since the participants (i) have some proprioceptive access

to the movement of their receptor field (in terms of horizontal mouse movement) and

the changing displacement of their body object (in terms of mouse clicks), but (ii) have

no perceptual access to the current displacement of their own or the other’s body object,

and (iii) nevertheless have to engage in movements and regulatory behavior that results

in bodily coordination. This situation is depicted in Figure 6-10.

[Figure 6-10 diagram. Boxes: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body Schema (BS); Proprioception (PI); Motor Act.]

Figure 6-10. In the minimalist psychological study of body imitation designed by Lenay and colleagues

we find an experimental situation where participants (i) have some proprioceptive access to the relative

changes of position and displacement of their body object, but (ii) have no perceptual access to the

current displacement of their own or the other’s body object, and (iii) nevertheless have to engage in

coordinated movements that are appropriate in a shared context of interaction.

Some preliminary studies by Lenay and his colleagues with this experimental setup

have indicated that it is possible for participants to regulate their movements and body

size so as to engage in stable perceptual crossing, and that they even manage to improve

their performance during the interaction. It appears that they can use the overall stability

of perceptual crossing as an indicator to appropriately change their respective body

objects, even though they do not know the actual current displacements. In other words,

during the interaction process they quickly learn how to left- and right-click so as to

appropriately regulate their relative body displacement in a coordinated fashion, namely

in relation to improvement of the overall stability of the interaction.

Here we have a case that would be difficult to explain in terms of the three traditional

factors alone. The role of both the Self- and Other-BI has been reduced to a minimum in

the virtual environment, and even the role of proprioception is limited: active clicking

can inform the participants about the direction of changes to the distance between their

receptor field and body object, but it gives them no sense of the actual magnitude of

displacement. In other words, the results of this experimental study indicate that neither

a perceptually informed Self-BI nor an Other-BI is a necessary condition for the

emergence of bodily coordination during interaction in a multi-agent system. It appears

that the autonomous dynamics of the interaction process, with a little help from PI, can

organize the individual gestures appropriately. Might this remaining PI, in the sense of

information about relative change, be a necessary condition? Considering the case when

Ian was gesturing under the blind, we can expect that this is not the case.

6.2.8 Bodily coordination in a virtual space (II)

We have now reached the final case listed in Table 6-1 (p. 82), whereby the potential role

of all three traditional explanatory factors can be called into question. The possibility of

bodily coordination under this extreme condition has not yet been investigated

explicitly. We can imagine such an extreme situation occurring when Ian was under the

blind, if he happened to talk to someone out of view. In this case he would not only lack

proprioception and a visually formed Self-BI, but also a visually formed Other-BI.

Thus, considering Ian’s spontaneous gesturing under the blind, we can hypothesize that

normal communicative gestures will occur even during such a non-visual interaction.

Given that this kind of experiment has not been conducted with a deafferented subject,

is there a way to at least approximate this case? One possibility is to effectively retain

the experimental setup of the body imitation experiment described in the

previous section, but further reduce the possible role of PI. Fortunately, precisely this

has been done by Lenay and colleagues in a second set of experiments, in which the

participants were uninformed about the relationship between their left/right clicking

activity and the relative changes to the body object displacement. In other words, not

only did they have no perceptual access to their virtual embodiment (Self-BI) or that of

the other participant (Other-BI), but they also had no sense of how their clicking

activities changed the manner of this embodiment in terms of the displacement between

their receptor field and body object (PI). This extremely constrained situation is

depicted schematically in Figure 6-11.

[Figure 6-11 diagram. Boxes: Perception of Target Acts (Other-BI); Self/Other Comparator; Perception of Actual Acts (Self-BI); Body Schema (BS); Proprioception (PI/PA); Motor Act.]

Figure 6-11. In a second version of the minimalist psychological study of body imitation designed by

Lenay and colleagues, the participants (i) have no ‘proprioceptive’ access to the displacement of their

body object, nor its relative changes due to their clicking activity, and (ii) have no ‘perceptual’ access to

the current displacement of their own or the other’s body object, and (iii) nevertheless have to engage in

coordinated movements that are appropriate in a shared context of interaction.

As we would expect by now, the fact that the participants were unaware of the current

extent of their virtual embodiment, and that they had no knowledge of how their

clicking behavior changed this displacement, did not prevent them from collectively

accomplishing the task. The relative stability of the interaction process in the form of

perceptual crossing effectively enabled them to organize their movement and clicking

behavior appropriately, such that bodily coordination, in terms of a complementary

displacement of body objects, emerged out of regulatory behavior informed by the

stability of ongoing mutual interaction. It therefore seems that not even a single

traditional explanatory factor is necessary to account for this result. Instead, the

most important explanatory factor appears to be the interaction dynamics themselves.

We have now considered all possible combinations of the three traditional factors used

to explain bodily coordination. The cases have indicated that the popular approach of

positing a Self-BI and/or PI alongside an Other-BI, related by some internal comparator

module, needs to be called into question. More precisely, while the sufficiency of some

combinations appears to be supported by the experimental data, any claims of necessity

must be rejected on grounds of the competing cases. But what kind of theoretical

framework can account for all of these eight cases, considering (i) that no combination

of factors remains constant throughout, and (ii) that the case of Ian Waterman and the

other psychological experiments demonstrate that the traditional factors are largely

dispensable without significantly impairing bodily coordination?

To be sure, we have already indicated that there actually might be a common factor to

all of these different cases, namely the self-organizing dynamics of the interaction

process. However, in order to properly motivate a closer analysis of this potential fourth

factor, let us first consider a recent attempt to make sense of all this experimental data in

terms of the ‘integrative motor theory’ proposed by Gallagher and colleagues.

6.3 An integrative motor theory

Cole, Gallagher and McNeill use the case of Ian Waterman to argue against what they

call a motor theory of gesture, which holds that “gesturing is primarily a matter of

movement, falling within the domain of sensory-motor behavior. Gesture is the same

sort of movement as instrumental or locomotive movement” (Cole, et al. 2002a, p. 53).

On this view, gesture comes under the control of the body schema, and we would

therefore expect Ian to consciously control and monitor his gesturing, just like his

instrumental and locomotive movements. But if this does not turn out to be the case,

then the motor theory of gesture may not provide the whole story and the case of Ian

would establish “a clear differentiation between expressive movement and other kinds

of movement (e.g. reflex, locomotive, and instrumental)” (Cole, et al. 2002a, p. 55).

To test the prediction of the motor theory, i.e. that Ian controlled his gestures using

visual feedback, Cole et al. placed a blind before him in such a way as to block view of

his hands. If he was indeed using visual feedback to control his gesturing, like Ian

himself appears to believe (cf. Cole, et al. 2002a, p. 57), then this setup should prevent

him performing appropriate gestures under this condition. He was then asked to narrate

the plot of a cartoon he had just seen.

Once IW allowed his gestures to get under way, they seemed to have a mind of

their own. That is, they did not seem to be under IW’s attentional control, and

they were consistent with normal measurements in terms of timing and shape,

relative to IW’s speech acts. An especially striking illustration of this came later

in the experiment when IW, no longer recounting the cartoon story, continued to

converse while his hands remained under the blind. His hands began to form

gestures but did so outside of awareness. During the first 20 seconds of the

conversation he performed a string of gestures (14 in all) and then said,

revealingly “… and I’m starting to use my hands now …” while continuing to

gesture. His gestures were utterly non-exceptional in timing and shape during

the critical 20 seconds when they were outside of awareness. (Cole, et al. 2002a,

p. 56)

In contrast to the expectations of the motor theory of gesture, which predicts significant

impairment under this condition, Ian’s gesturing in fact appears to be close to normal. It

should be conceded, however, that the spatial organization of his gestures lacks some topokinetic

precision, which indicates an impaired ability to move to a target position in space. This

topokinetic precision, in combination with morphokinetic precision (i.e. the ability to

shape hands appropriately), is usually required for the performance of instrumental

movements. But topokinetic precision is less important for gesture than morphokinetic

precision. In other words, it could be argued that Ian’s ability to gesture under the blind

is aided by the fact that gesturing requires less topokinetic precision than instrumental

action. Nevertheless, this concession still does not explain why he is able to gesture at

all outside of awareness, given that this is simply impossible in the case of instrumental

movement. Accordingly, Cole et al. suggest that Ian’s gesturing “is much more

integrated with linguistic behavior, and controlled by factors that go beyond ordinary

sensory-motor control” (Cole, et al. 2002a, p. 58). What precisely are these additional

factors that go beyond motor theory?

The self-organizing intentionality of instrumental and locomotive movement

normally depends on the implicit workings of body schemas. For IW, as we

have seen, self-organizing motility breaks down. Yet, the self-organizing

intentionality of language remains intact, and gesture, temporarily disrupted by

IW’s illness, has been re-established to a higher degree than his capacity for

instrumental or locomotive movement. On the communicative theory of gesture

the reason gesture can be re-established with such proficiency is that gesture, as

a movement concerned with the construction of significance rather than doing

something, is organized primarily by the linguistic-communicative context.

(Cole, et al. 2002a, p. 61)

And, as these experiments indicate, whereas the operation of body schemas requires

feedback via proprioceptive information, the efficacy of the linguistic-communicative

context requires feedback related to the social situation. At first sight this position

therefore appears to be compatible with the enactive approach to social cognition,

especially since it also appeals to the construction of significance and the role of the

social context. However, this apparent similarity hides some essential differences.

Most importantly, for Cole, Gallagher and McNeill the case of Ian provides us with an

important distinction between (i) instrumental action, and (ii) communicative action, “or

action with meaning mapped onto it” and as such gesture “comes under the control of

linguistic/communicative systems rather than the instrumental motor system” (Cole, et

al. 2002a, p. 61). The basis of this distinction is deeply problematic, and leads to tensions

in the rest of their account. Let us consider these consequences in more detail.

First of all, note that Cole, Gallagher and McNeill draw their distinction between

instrumental and communicative action by asserting that only the latter form of action

expresses meaning (that is somehow ‘mapped onto it’). This way of carving up the

space of actions goes strictly against the enactive approach, which holds (i) that all

action is a form of sense-making, and (ii) that this sense is intrinsic to the performance

of the action. Moreover, this enaction of meaning is not an internal or private event. On

the contrary, the meaning of embodied action, as an expression of dynamics in relation

to the world, is typically directly perceivable by others (cf. Gallagher 2008b). Action in

the world is characterized by a pre-reflective expression of intention, mood, attitude and

general state of being that does not require any additional communicative intent to be

perceived as such 24 . It follows that instrumental action, and even locomotive action, is

always more than mere physical displacement. If we were ever faced by movement that

was not also expressive of meaning in some manner, it is unlikely that we would even

perceive it as an action, instrumental or otherwise.

Second, by characterizing communicative action as an ‘action with meaning mapped onto

it’ Cole, Gallagher and McNeill lose sight of what makes social interaction special,

namely that it is a shared form of action. To be fair, they occasionally do hint at the role

of the intersubjective context in generating gestures. For example, they admit that “it is

possible that the semantic and communicative (pragmatic) aspects of gesture provide

sufficient feedback to sustain control of gestural movement” (Cole, et al. 2002a, p. 64),

and they finish their paper by stating that “it is, of course, another person like myself

who moves, motivates, and mediates this process. To say that language moves my body

is already to say that other people move me” (ibid., p. 65). However, despite this appeal

to the intersubjective context we already noted that descriptions of the experiment with

Ian lacked any consideration of the role of his interlocutors. Ian could presumably see

their gestures, if any were made, but even this has not been made explicit. Did they

mirror, imitate, or support one another? Unfortunately, Cole, Gallagher and McNeill are

silent on these matters of social interaction and intersubjectivity.

24 Consider this phenomenological observation, for instance: When I pass a young man on his way to the

park, I experience his relaxed attitude in terms of the ease of his gait, and by the way he casually swings

his gym bag over the shoulder; I perceive happiness in the melodies he is humming to himself, and in the

satisfied smile of a good day’s end; I see his intent to play a game of football in his attire, and in the

direction of his chosen path. In other words, his whole bodily presence exudes a purposeful, emotional

and intentional being-in-the-world, a manner of being which is intersubjectively accessible to others. In

addition, this meaningful presence of the other subsumes and is accentuated by every little bodily action,

even if they can ultimately be abstracted as being merely instrumental or locomotive by an observer.

Third, Cole, Gallagher and McNeill’s lingering methodological individualism becomes

even more evident in the explanation given for Ian’s gestural ability. Instead of further

exploring the tantalizing idea that ‘other people move me’, an idea which presumably

involves the embodiment of sociality in a self-world-other structure, they hypothesize

that the source of Ian’s ability may be localized in his brain’s activity: “As IW speaks

and gestures brain regions responsible for the generation of language may be

contributing to control of gestural movement by enabling access to motor programmes

that underpin his gesture stream” (ibid., p. 60). Of course, the enactive approach to

social cognition would certainly not deny that some form of brain activity is involved in

Ian’s gesturing. Nevertheless, it would strongly insist that this neural component can

only ever be part of a more encompassing explanation. We should not forget that the

performance of gesture, like behavior in general, can only be described as arising out of

an integrated brain-body-world systemic whole (cf. Chiel & Beer 1997).

Fourth, we can note that Cole, Gallagher and McNeill’s hypothetical neuro-centric

explanation throws up an apparent riddle that is difficult to resolve from within their

perspective. The problem is that “under the blind, and without proprioception, IW has

no explicit, conscious knowledge of the specifics of his gestural production. In this case,

what does he gain from his gestures?” (Cole, et al. 2002a, p. 62). However, while this

might be a puzzling problem for a functionalist theory of mind, it is not a worry for the

enactive paradigm for several reasons. To begin with, the enactive approach can appeal

to careful phenomenological observation which shows that the entire body’s movement

is involved in the expression of the lived meaning of a situation. Accordingly, from this

holistic perspective it should not be surprising that a subject’s involvement in a social

situation becomes embodied in what we describe as gesturing, even if that movement is

not actually visible to others.

Fifth, if we assume that another way to understand the question of ‘gain’ is in terms of

the movement’s ‘value’, then the enactive approach can appeal to the notion of sense-making.

In other words, the gesture might not only be an embodied expression of the

subject’s involvement in a social situation, it might even be an essential component of

the actual realization of the social meaning of that situation. This interpretation is

further supported by recent psychological research which has revealed that people with

Möbius Syndrome (the congenital absence of pre-reflective facial expression) not only

have difficulty in expressing emotion, as might be expected, but also suffer from an impaired

experience of emotion itself (cf. Cole 2009). Or, conversely, we can note that even the

reflective positioning of facial muscles into a smile can lead to a more elevated mood

(an experience which the reader is encouraged to try out). This inseparability of the

meaning of a movement from the movement itself is precisely what the enactive notion

of embodied action as sense-making is trying to capture.

In contrast to this interpretation from the enactive paradigm, Cole, Gallagher and

McNeill offer what could be construed as essentially a cognitivist response. The focus

of the explanation is again on the individual’s brain, and the ‘gain’ is cashed out in

terms of improved cognitive performance. They propose that gesture, as an aspect of

language rather than mere hand movement, might assist in the accomplishment of Ian’s

thought. However, since Ian does not receive any feedback from movement in the blind

condition, they conclude that the mechanism by which an individual might receive this

cognitive benefit must be sought in the brain: “His cognitive gain from gesture without

visual feedback, we suggest, may be due to pre-motor preparatory processes involved in

the generation of the gestural movement rather than from the gestural movement itself”

(Cole, et al. 2002a, p. 62). The fundamental contrast between the interpretation offered

by the enactive paradigm and this functionalist explanation, which separates the ‘gain’

from the actual realization of the movement and instead localizes the source of efficacy

in the brain, should by now be sufficiently clear.

Sixth, we can note that it still remains mysterious as to precisely why Ian’s gesture

should have been “re-established to a higher degree than his capacity for instrumental or

locomotive movement” (Cole, et al. 2002a, p. 61, emphasis added). What causes this

relative difference in performance? Cole, Gallagher, and McNeill hint at a possible

answer in that “gesture is never a mere motor phenomenon; it draws the body into a

communicative order defined by its own pragmatic rules” (ibid., p. 65). We have seen

that one possibility is to attribute this ‘communicative order’ to a special region of the

brain. However, might it not be possible, as indicated by the experiments conducted by

Lenay and colleagues, that this order is realized by the distributed organization of the

social situation itself? An important advantage of explaining Ian’s relative improvement

in relation to the scaffolding provided by an appropriate interaction process is that it

better generalizes to other neuropathological cases. For example, it might also account

for the finding that patients with apraxia, although they are not paralyzed, are

unable to execute learned movements even if they want to, but can act almost normally

when situated in social settings. One such patient, for instance, was unable to move a

cup-like object from the table to her face area on command in an experimental setting,

but was able to serve and drink tea in a social setting that involved expressions of

hospitality (cf. Marcel 1992; Gallagher & Marcel 1999). Can the ability of this apraxia

patient also be explained in terms of an appeal to the supposed cognitive gain of a

communicative gesture that as an action has ‘meaning mapped onto it’, as Cole, et al.

suggest for the case of Ian? While gesturing is certainly an important aspect of this

setting (e.g. expressions of hospitality), nevertheless the act of drinking tea as such, for

example, appears to primarily be an instrumental action. It would therefore seem to lack

the potential cognitive gain that they associated with communicative gesturing, and

therefore could not bootstrap the ability of drinking in social circumstances.

Accordingly, it appears that Cole, Gallagher and McNeill’s position is confronted by a

choice between two options: (i) they concede that instrumental actions can also be

meaningful and expressive, or (ii) they persist in claiming that instrumental behavior is

meaningless movement. But each of these two options comes at a price. If they choose

to pursue option (ii) the internal consistency of the integrative theory of gesture can be

retained, but the applicability of the theory has been limited such that it can account for

the case of Ian, but not for the case of apraxia. However, if they choose option (i) they

will have to give up the very distinction between instrumental and gestural behavior that

formed the basis for their original explanation, thereby leaving them in no position to

account for either the case of Ian or that of the apraxia patient 25 .

25 To be fair, their original distinction could perhaps be saved by an attempt to systematically distinguish

between those bodily actions that are part of language and those that are not. However, given the holistic

manner in which our social presence is embodied, the validity of this distinction could be questioned.

The enactive approach to social cognition, on the other hand, manages to avoid this

dilemma by treating all embodied action as inherently meaningful, as captured by the

notion of sense-making. Moreover, this meaning generation, as a relational phenomenon

pertaining to the interaction between an agent and its world (including other agents), is

inherently expressive. There is no need to appeal to some internal process of mapping

meaning onto physical movement in order to explain the phenomenon of gesture. In

addition, this starting point of bodily subjectivity enables the enactive approach to take

better into account the shared nature of social situations in which the interactors enable

and constrain each other’s behavior. Of course, the individual actor with its abilities and

limitations remains a crucial element in this account, but as an embodied and situated

being-in-the-world it is also essentially a being-with-others, an integrated part of the

unfolding dynamics of social interaction (cf. Heidegger 1927). With this shift toward a

consideration of the constitutive role of sociality for the kind of agents that we are, it

becomes possible to offload an important part of the explanatory burden from the

individual level to the social context in which that individual is embedded.

6.4 Summary

The critical analysis of the various psychological case studies and their potential

explanations in terms of the traditional framework of social psychology, as well as the

engagement with the more recently developed integrative theory of gesture, have shown

that it is neither necessary nor sufficient to explain Ian Waterman’s gesturing in terms of

special brain regions. In fact, the analysis of the case studies points the other way: since

we cannot appeal to any of the three traditional factors to explain Ian’s ability to gesture

normally under the blind without any proprioceptive or visual feedback, where does this

leave the hypothetical ‘Self/Other Comparator’ module which is traditionally postulated

as an internal mechanism to determine intersubjective equivalence on the basis of these

factors? Could this supposed brain module itself also be removed as a necessary

condition given that we can appeal to the organizing dynamics of the interaction process

instead? This additionally reduced situation is depicted in Figure 6-12.

[Figure 6-12: schematic with the components Perception of Target Acts (Other-BI), Self/Other Comparator, Perception of Actual Acts (Self-BI), Body Schema (BS), Proprioception (PI/PA), and Motor Act.]

Figure 6-12. A fully reduced schematic of the traditional explanatory paradigm. The fact that Ian can

gesture under the blind without any visual or proprioceptive feedback eliminates the necessity for the

three factors of Self-BI, Other-BI and PI/PA. Without these factors the need for an internally localized

‘Self/Other Comparator’ module has been called into question as well.

Note that we now have almost rehabilitated the original motor theory of gesture, which

did not posit any difference between instrumental and gestural action. However, as

should have become clear in the discussion of the integrative theory of gesture, we have

recovered this motor theory with an added enactive twist. As shown in Figure 6-12, in

terms of the individual actor the core of the explanation is centered on the existence of a

body schema and motor action. But the enactive approach leaves the motor theory (and

with it Figure 6-12) behind in two essential aspects: (i) it holds that all embodied action

is inherently meaningful, and directly perceivable by others as such, because it is bodily

expression of the sense-making that is enacted by an autonomous agent, and (ii) it holds

that explanations of individual agency need to take into account the constitutive role of

the interaction process in the organization of behaviors.

In order to get a better grasp of these two essential aspects of the enactive approach to

social cognition the next part of this thesis will proceed as follows. First, we will use an

evolutionary robotics approach to clarify the constitutive role of the interaction process

in dynamical terms (Chapters 7 to 10). In particular, we will demonstrate that there is no

need to posit a Self/Other Comparator module to account for bodily coordination, and

that the interaction process can organize such bodily coordination appropriately even

under minimal conditions that resemble the situation depicted in Figure 6-12. These

models are designed to help us better understand the precise form of such

interaction dynamics, and how we can use them to explain behavioral capacities of

individual agents in a non-individualistic and non-mysterious manner.

It is then argued that this systemic approach alone, however, is not sufficient to support

the claim that all embodied action is inherently expressive, as it leaves out the

qualitative aspects of intersubjectivity (Chapter 11). The systemic approach will thus be

complemented by more detailed phenomenological considerations of the presence of

others (Chapter 12). Finally, the systemic and phenomenological insights are combined

to develop a novel perspective from which it is possible to begin to probe the current

consensus on cumulative cultural development by taking perceived meaning and multi-agent

interaction as the starting point for explaining the empirical data (Chapter 13).

7 Toward the synthesis of minimally social behavior

In response to the growing challenge of methodological individualism Boden (2006b, p.

54) asks: “How can we drop it without descending into mysterianism about social

forces, group minds, or personal interactions?” She notes that the social behavior of

agents has become a practical issue in the field of artificial intelligence (AI), and cites

Di Paolo’s (2000, 1999) work in evolutionary robotics (ER) modeling as an example of

how the solipsistic attitude may be rejected in a non-mysterian way. His models

demonstrate how the unfolding of social behavior can depend on the dynamical

properties of the interaction process itself. This general result has since been replicated

by several other modeling studies, some of which are based on actual psychological

experiments and have in turn cast new light on the interpretation of empirical data (e.g.

Rohde 2008).

In this chapter we will explain the methodology of ER in more detail, and then propose

an integrated approach which links ER models to empirical science by means of

hypothesis generation and verification. Since the next three chapters will be based on

ER models that have been synthesized in a similar fashion, unnecessary repetition is

avoided by providing the common implementation details here.

7.1 Evolutionary robotics

There was a time – and for many there still is – when doing computer science was no

different from doing cognitive science, a tradition that is captured well by the slightly

nostalgic slogan ‘Good Old-Fashioned AI’ (Haugeland 1985). The computer metaphor

of mind ensured, to the delight of programmers and engineers, that when something was

learned about computers and algorithms, ipso facto something was learned about mind

and cognition as well (cf. Harnish 2002). To be sure, much of this research continues

today under the label of Artificial Intelligence, but a closer look reveals that as a field it

is not much different from Applied Informatics (cf. Russell & Norvig 2002).

However, even though the computer metaphor of mind has outlived its usefulness for

many cognitive scientists, the use of computers itself has not. Ironically, they have

proven to be indispensable tools for the development of a variety of alternative

approaches to the study of life and mind, most of which explicitly oppose the

computationalist tradition. The boundaries between these alternative approaches are quite

fluid, but at least two general trends can be distinguished. On the one hand, there is a

focus on questions related to basic processes of the living, especially their autonomy

and metabolic realization (Bourgine & Varela 1992). This tradition has been around for

a while (e.g. Varela, et al. 1974), but it really started to blossom in the late 1980s when

the name of the field “Artificial Life” (Alife) was coined by Langton (1989). Since then

Alife has generated a wide variety of research programs which are united by their

interest in the framework of non-linear dynamics and the concepts of self-organization

and emergence (cf. Langton 1995; Boden 1996).

Alongside the establishment of Alife as a more or less well defined field of research

there was a related transformation in robotics toward a biologically inspired

consideration of embodiment and situatedness (cf. Brooks 1991a). This new robotics

movement shared much of the general conceptual framework of Alife, but it focused its

efforts on understanding the emergence of embodied cognition (Steels 1994). Today the

field of such biologically inspired robotics has established itself as a viable alternative

to more traditional engineering approaches (Pfeifer, et al. 2007). A particularly

important breakthrough occurred when this approach was combined with the insights

gained from research in evolutionary algorithms (Holland 1975; Goldberg 1989). Since

the early 1990s this combined approach is referred to as “Evolutionary Robotics” (Cliff,

et al. 1993), though it also often involves simulations of artificial agents rather than

actual hardware implementations 26 . This approach explicitly rejects the computer

metaphor, and views cognition in terms of complex agent-environment interactions and

non-linear dynamics instead (Harvey 1996; Beer 1995a).

26 There is an unresolved tension lurking here: a core assumption of Alife is that life is a result of the

systemic organization of matter rather than something that inheres in matter itself (Langton 1989). But

others argue that a physical realization is necessary for the phenomenon to be ‘real’ (Steels 1994), or that

it is at least needed for establishing a serious dialogue with the empirical sciences (Webb 2001). There

might indeed be something specific about materiality that cannot be replicated in an abstract domain of

pure logic (Ruiz-Mirazo & Moreno 2004), but for the study of behavioral dynamics a well-designed

model is often sufficient (Beer 2003). The continuing appeal of actual physical robots for scientific use of

the synthetic method can perhaps also be attributed to the fact that the physical domain is where GOFAI

was first outcompeted by alternative approaches (Brooks 1991b), a historic moment for the field.

The basic idea behind evolutionary robotics (ER) can be summarized as follows:

An initial population of different artificial chromosomes, each encoding the

control system (and sometimes the morphology) of a robot, are randomly created

and put in the environment. Each robot (physical or simulated) is then let free to

act (move, look around, manipulate) according to a genetically specified

controller while its performance on various tasks is automatically evaluated. The

fittest robots are allowed to reproduce (sexually or asexually) by generating

copies of their genotypes with the addition of changes introduced by some

genetic operators (e.g., mutations, crossover, duplication). This process is

repeated for a number of generations until an individual is born which satisfies

the performance criterion (fitness function) set by the experimenter. (Nolfi &

Floreano 2000, p. 1)
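The generational loop described in this quotation can be sketched in a few lines of code. What follows is only a minimal illustration under stated assumptions, not the algorithm used in the later chapters: the genotype length, population size, elitist truncation selection, Gaussian mutation, and the placeholder function `toy_fitness` are all hypothetical choices made for brevity. In an actual ER study the fitness score would be obtained by letting an embodied agent behave in its environment.

```python
import random

def toy_fitness(genotype):
    # Placeholder for "run the robot and score its behavior" (an assumption):
    # fitness is highest when every gene is close to 0.5.
    return -sum((g - 0.5) ** 2 for g in genotype)

def mutate(genotype, rng, sigma=0.05):
    # Gaussian mutation, with genes clipped to the range [0, 1].
    return [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genotype]

def evolve(fitness, n_genes=8, pop_size=20, n_elite=5, generations=50, seed=0):
    rng = random.Random(seed)
    # Random initial population of artificial chromosomes.
    population = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Evaluate every individual and rank them, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        elite = ranked[:n_elite]
        # Asexual reproduction: the elite is copied unchanged,
        # the remaining slots are filled with mutated copies of elite members.
        offspring = [mutate(rng.choice(elite), rng) for _ in range(pop_size - n_elite)]
        population = elite + offspring
    best = max(population, key=fitness)
    return best, best_per_generation

best, history = evolve(toy_fitness)
```

Because the elite is carried over unchanged, the best fitness recorded in `history` cannot decrease from one generation to the next. The experimenter specifies only the fitness criterion, while the particular genotype that satisfies it is left to the search process.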

Since we will be making extensive use of ER in the following chapters, it is worth

describing the methodology in more detail. First, it takes a holistic approach to

behavior. This is in contrast to much traditional AI, which largely studied the internal

operations of abstract systems and isolated components (Dennett 1978). ER, on the

other hand, is concerned with how adaptive behavior emerges out of the non-linear

interactions within a systemic whole of brain, body and world (Beer 1997). One simple way to

achieve this is to embed a control system within a sensory-motor loop and place it

within an environment (Cliff 1991). The aim of this “synthetic ethology” (MacLennan

1992) is to combine the simplicity and control of behaviorist methods with the

ecological and contextual validity of empirical ethology.

This holistic approach to behavior is complemented by an evolutionary approach to its

optimization. In contrast to traditional AI, which mostly specified a system’s cognitive

functions programmatically, ER tries to take the human designer out of the loop as

much as possible. To be sure, it is still necessary to specify what defines an agent, its

environment and the desired behavior, but the particular way in which this behavior is

realized depends on an evolutionary process. In this way we only have a minimal and

controllable impact of design assumptions, and it is easier to investigate the minimal

conditions for a behavioral capacity (Harvey, et al. 2005). Often the evolutionary

process leads to novel and surprising mechanisms that undermine our preconceptions

about the necessary conditions for a certain behavior to emerge.

Finally, the holistic approach to behavior and the evolutionary approach to its

optimization are complemented by a dynamical approach to its realization (Beer 1997).

Whereas the control systems for traditional AI are typically implemented in terms of

symbolic representations that are specified by the designer, ER tries to minimize the

impact of prior assumptions about what internal operations might be necessary. The aim

is to provide a generic substrate for the controller which can then be shaped by the

selective pressures of the evolutionary algorithm. One popular way of implementing this

generic substrate is in terms of continuous-time dynamical systems (Beer 1995b). An

advantage of using such systems is that the autonomous system is no ‘black box’; it is

possible to use dynamical systems theory to understand and formalize the system’s

behavior (Beer 1997). Moreover, this theory is especially attractive in relation to a

holistic approach to behavior because it can deal with changes in behavior in a unified

mathematical manner that spans brain, body and world (Kelso 1995) as well as various

temporal scales (Thelen & Smith 1994).

We can also identify three broad contexts in which the ER methodology (and Alife

more generally) is used. First, it is by and large the case that the sciences of the artificial

are part of theoretical science. To be sure, many have succumbed to the functionalist

temptation to view their artificial systems as actual empirical instances of the

phenomena they are investigating, especially in the early years of the field (e.g. Langton

1989). However, there is a growing consensus that this confuses the model with what is

being modeled. Note that this does not diminish the scientific value of the synthetic

approach, but shifts its emphasis toward creating “opaque thought experiments” by

which it is possible to systematically explore the consequences of a theoretical position

(Di Paolo, et al. 2000). The idea is that we use theories about the empirical world to

inform the design of ER models, and that these models in turn constrain the

interpretation of the theories (Moreno 2002). This can happen negatively, such as when

the ER methodology is used as a subversive tool to undermine theoretical claims for

necessity, but also positively, as when it is used to synthesize a model that serves as a

proof of concept (Harvey, et al. 2005).

What makes ER attractive to science, namely its capacity for the synthesis of systematic

thought experiments of indefinite complexity, also aligns it with the aims of analytic

philosophy (Dennett 1994). Broadly speaking, we can capture this aspect of ER with the

slogan “philosophy of mind with a screwdriver” (Harvey 2000). It is not always the aim

to explicitly model one’s philosophical assumptions in the process of synthesizing an

artificial system, but in practice it is difficult – if not impossible – to avoid doing so at

least implicitly. The fact that the systems we create embody our presuppositions has

been exploited with great effect by Dreyfus, who traces the limited success of traditional

AI to its underlying Cartesian philosophy (Dreyfus & Dreyfus 1988). Moreover, it has

been argued that the subsequent turn toward embodied-embedded AI coincides with a

shift to a more Heideggerian philosophy (Wheeler 2005). The advantages of probing

philosophical positions with ER rather than with traditional thought experiments are the

increased capacity to deal with complex systems, as well as to test them in more

realistic, but still fully controllable settings.

Finally, an important but often underappreciated aspect is the synthetic methodology’s

pedagogical value. It is quite a formative experience to spend countless hours in front of

the computer trying to get an artificial agent to solve what should be a simple task, but

getting nothing but senseless behavior (cf. Dennett 1984). It quickly becomes clear that

these systems do not know what they are doing; they have no understanding of their

situation (Haugeland 1997) nor do they care about the fact that they don’t (Di Paolo

2003). Similarly, it is a humbling experience to consistently have your own and others’

cherished presuppositions and expectations undermined by an opportunistic

evolutionary algorithm. Over time this subversive process starts to affect the way in

which you approach problems, expanding the range of possible explanations to be

considered, while at the same time teaching you to be careful about positing necessary

and sufficient conditions.

In order to illustrate the effectiveness of the ER methodology it is helpful to mention a

few concrete examples. While ER has been successful at synthesizing artificial agents

that are capable of engaging their environment in a robust, timely and adaptive manner,

there has been some debate about the internal mechanisms that are necessary for these

agents to switch between qualitatively different behaviors depending on situational

changes. In other words, it has been demonstrated that the ‘intra-context frame problem’

can be resolved, but a solution to the ‘inter-context frame problem’ arguably requires a

different kind of mechanism (Wheeler 2008). In response to this debate it is possible to

cite a recent ER study by Izquierdo and Buhrmann (2008), where a single dynamical

system was optimized to perform two qualitatively different behaviors, chemotaxis and

legged locomotion, without providing a priori structural modules, explicit learning

mechanisms, or an external signal for when to switch between them. The agent’s ability

to switch its behavior appropriately when moved from one situation into another is

explained in terms of the interactions between the controller’s dynamics, its body and

environment, thereby calling into question the internalist assumption that the necessary

and sufficient conditions for context-switching behavior must reside in the individual

alone.

In fact, there are many examples of ER models that teach us to be careful about what

internal conditions we presuppose on the basis of observed behavior, and vice versa.

Consider, for instance, the common assumption that some form of neural plasticity is a

necessary condition for learning, an assumption which has come under attack by a

number of ER studies. It has been shown that an embodied agent that is controlled by a

continuous-time recurrent neural network without synaptic plasticity (i.e. connection

weights remain fixed during a trial) or any other a priori modular structures can

perform a continuous associative learning task (Izquierdo, et al. 2008). One simple

solution to this problem is that the continuous to-be-remembered signal is simply

associated with the activity of a network component that has a slower timescale.

However, while this is an instance where it is possible to match a specific behavioral

property to a localized internal component, such structural isomorphism is itself not a

necessary condition. Buckley and colleagues (2008), for example, have shown that the

capacity to solve a task demanding multiple behavioral modes does not directly say

anything about the complexity of the attractor structure of the internal dynamics.

Contrary to common intuition, the agents were able to satisfy the task with only

transient dynamics around a single fixed point attractor, thereby demonstrating that it is

not necessary for a distinct behavioral mode to be associated with a distinct attractor. In

fact, further doubt has been cast on how much can be understood about the limitations

of an agent’s behavior from the limitations on its internal dynamics. A nice illustration

of this idea is an ER study which demonstrates that even a purely reactive (stateless)

system, i.e. a system whose outputs are at each moment only determined by its current

inputs, can engage in non-reactive behavior due to the ongoing history of interaction

resulting from its situatedness (Izquierdo-Torres & Di Paolo 2005). It is therefore

conceivable that a natural agent’s behavior that appears to depend on some form of state

may actually depend on a relational rather than an internal form of state. This work

reinforces the idea that embodied behavior can exhibit properties that cannot be deduced

directly from those of the individual’s internal milieu itself.
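The idea of a relational rather than an internal form of state can be made concrete with a deliberately simple toy example. It is not a reconstruction of the cited study; the walls, the bump sensor, and the names `reactive_controller` and `simulate` are hypothetical. The controller below is a pure function of its current input, yet the agent passes through the same position moving in opposite directions at different times, because its heading is carried by the body-environment coupling rather than by the controller.

```python
def reactive_controller(bump):
    # Purely reactive: the output depends only on the current input.
    # Returns (turn_around, speed).
    return (True, 0.0) if bump else (False, 1.0)

def simulate(steps=40, left_wall=0.0, right_wall=5.0, dt=1.0):
    # Body state (position and facing direction) belongs to the situated
    # agent-environment system, not to the controller.
    x, heading = 2.0, 1
    trajectory = []
    for _ in range(steps):
        # The bump sensor is active when the next step would cross a wall.
        next_x = x + heading * dt
        bump = next_x < left_wall or next_x > right_wall
        turn, speed = reactive_controller(bump)
        if turn:
            heading = -heading
        x += heading * speed * dt
        trajectory.append((x, heading))
    return trajectory

trajectory = simulate()
# Headings observed whenever the agent is at position 2.0.
headings_at_2 = {h for (x, h) in trajectory if x == 2.0}
```

An observer who records only positions might be tempted to attribute a memory of the last wall encountered to the controller, although in this sketch no such internal variable exists.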

7.2 An integrative methodology

Since its beginnings in the early 1990s ER has established itself as a viable

methodology for synthesizing models of what has become known as ‘minimally

cognitive behavior’, namely the simplest behavior that raises issues of genuine cognitive

interest (Beer 1996). We have described some general illustrative examples of this

methodology in the previous section. Within the context of this research framework

there has also been a growing interest in using this synthetic method in order to

investigate the interaction dynamics of social cognition (e.g. Williams, et al. 2008; Di

Paolo, et al. 2008; Froese & Di Paolo 2008a; Ikegami & Iizuka 2007; Iizuka & Di Paolo

2007b; Iizuka & Ikegami 2004a; Quinn 2001; Di Paolo 2000; 1999). As a specialization

of the ER methodology, we can conceptualize this research as a theoretical investigation

into the dynamics of ‘minimally social behavior’ (Froese & Di Paolo in press-b).

What is especially interesting about some of these recent advances in ER is that the

synthetic method has been used to create models which are explicitly inspired by

psychological experiments. Moreover, some of these models have been specifically

designed to generate insights that have the potential to become the starting point for

mutually informing collaborations between ER and the traditional empirical sciences,

especially social psychology (cf. Di Paolo, et al. 2008; Rohde 2008). Here we will

continue this effort to move ER into a more productive relationship with the rest of

cognitive science. The crucial step of moving ER beyond mere technological wizardry

or model collection and into a principled scientific research program is to link ER and

science together in terms of hypothesis generation and verification. This integrative

methodology consists of four essential steps:

(i) Synthesis of model: The first step is generally the identification of an interesting

empirical experiment whose theoretical interpretation could benefit from a modeling

study. This might be the case for a variety of reasons. For example, it could be that

the original interpretation of the experiment is incomplete and that a more detailed

dynamical analysis is desirable, or that the given explanation posits as necessary some

conditions that are potentially dispensable and might have been introduced due to hidden

philosophical presuppositions. Another motivation could be to test the viability of a

novel hypothesis without having to replicate an entire empirical study.

(ii) Emergence of behavior: How the target behavior is realized is not pre-specified by

the experimenter. The behavior is an emergent phenomenon that depends on the

particular history of agent-environment interactions that is realized by the simulation

synthesized in step (i). As such, it cannot be found as a static element within the

program but must be observed as a pattern of activity.

(iii) Analysis of behavior: The behavioral phenomena that emerge in step (ii) are

essentially opaque, especially if the agent-environment system is characterized by

complex non-linear dynamics. In other words, the observed behavior is typically in

need of further systematic analysis to determine its essential structures and

conditions of possibility. This typically takes the form of behavioral psychophysical

tests, lesion studies, and formal dynamical analysis.

(iv) Generation of hypotheses: The insights gained in step (iii) form the basis for a

theoretical response in relation to the study’s original motivation. They also inform

the process of generating novel hypotheses, which then become the basis for the

design of novel simulations for step (i).

The relationship between the synthetic, emergent, analytic, and generative aspects of the

ER methodology is illustrated in Figure 7-1.

[Figure 7-1: diagram with the labels Generation of hypotheses, Analysis of behavior, Computer, Science, Evolutionary Robotics, Model, Phenomenon, Synthesis of model, and Emergence of behavior.]

Figure 7-1. Illustration of the key steps involved when using evolutionary robotics as a tool for cognitive

science. The methodological circle typically starts with a theoretical, empirical or simply exploratory

motivation that leads to (i) the synthesis of a new simulation model, which when run gives rise to (ii) the

emergence of model behavior, whose complex non-linear realization necessitates (iii) a behavioral and

dynamical analysis of that behavioral phenomenon. Finally, the insights gained lead to (iv) the generation

of novel hypotheses, and the circle can start again.

Note that steps (i)-(ii) and (ii)-(iii) already exist in current ER work (and in Alife more

generally). It is step (iv) which crucially turns these disparate elements into a coherent

scientific research program. Actually, a full-blown ER-based scientific study should

ideally include an empirical element in this methodological circle so that it consists of

two distinct phases: (i) An ER model, which can be novel or based on an existing

empirical experiment, is used to generate a novel hypothesis, and (ii) this hypothesis

then becomes the basis for an empirical experiment to verify its validity. It is also

possible for ER to be even more involved with empirical sciences: before the

psychological experiment it can be used as a means of testing experimental designs and

running simple pilot studies, and afterwards it is helpful for interpreting ambiguous

empirical data (cf. Rohde & Di Paolo 2007; 2008). However, this ideal situation where

ER and empirical science go hand-in-hand all the way is a demanding interdisciplinary

endeavor that takes considerable effort to realize in practice. Nevertheless, one

promising approach is to address work in psychology that already makes use of minimal

technological interfaces as part of its experimental design (Rohde 2008).

Previous work in ER on the topic of social cognition has shown that simple models can

be used to highlight the important role of the interaction process itself for the

appropriate unfolding of inter-individual coordination behavior. These results support

the “interaction theory” of social cognition which holds that primary intersubjectivity

(Trevarthen 1979), i.e. an embodied intersubjective interaction that is best understood in

terms of enactive perception, is the foundation of our everyday social abilities (e.g.

Gallagher 2008d; 2001). This explicit recognition of the essential role of the interaction

process for social cognition provides a much needed challenge to the classical

cognitivist ‘problem of other minds’, which is traditionally solved by postulating some

kind of ‘mind-reading’ ability, either in terms of theoretical inference and reasoning

(e.g. Carruthers 1996), or in the form of internal simulation (e.g. Gallese & Goldman

1998).

However, an explicit recognition of the role of the interaction process itself for social

cognition should only be seen as the beginning, as pointing out a new problem space

that demands to be further investigated. Indeed, what is needed is a much more

extensive reevaluation of the constitutive relationship between individual agency, social

interactions and societal context (De Jaegher & Froese 2009). The enactive approach

attempts to go beyond mere recognition of the importance of the interaction process to

assessment of its constitutive role for the unfolding of social cognition (De Jaegher

2009). In particular, it has been proposed that the interaction process itself can take on a

form of constitutive autonomy, i.e. that multi-agent interactions can organize into a

dynamical system that maintains its own identity (cf. Chapter 4, p. 58). It has also been

argued that when the individual activities of cognitive agents become coupled in this

kind of manner, previously inaccessible domains of co-regulated cognition can become

available in the form of participatory sense-making (cf. Chapter 4, p. 64). And the

effects of the social interaction process do not remain limited to the cognitive domain,

but constitute the embodied mind intersubjectively at even the most fundamental levels;

a process of “self-other co-determination” (Thompson 2001). The ER experiments

presented in Chapters 8-10 will illustrate these conceptual and methodological ideas in

more concrete terms.

7.3 Implementation details

After this general introduction to the ER methodology, and a brief assessment of its

applicability to the problem of investigating the dynamics of social cognition, it is

necessary to describe in more detail the way in which the models presented in the

following chapters have been implemented as computer simulations. These details do

not vary significantly between the experiments so they are only described once here. If

there are any specific differences pertaining to a particular experiment, these will be

noted explicitly in the text.

All simulated agents are controlled by two structurally identical continuous-time

recurrent neural networks (CTRNNs), as described by Beer (1995b). They were chosen

to be clones because work by Iizuka and Ikegami (2004a) on a related task suggests that

genetically similar agents are potentially better at coordination. The agents face each

other in an unlimited continuous 1-D space (i.e. one agent faces „up‟ and one agent

faces „down‟). Distance and time units are of an arbitrary scale. Each agent can only

move horizontally, and sense only via a binary receptor field. The field is activated (set

to 1) when the agents cross an object in their environment, otherwise it is set to 0. The

location of each agent is represented by a continuous variable, and the velocity is

controlled by a fully inter-connected CTRNN with self-connections. No symmetry is

imposed on the network structure. The time evolution of node activation $y_i$ is

determined by Equation 7-1.

Equation 7-1: $\tau_i \dot{y}_i = -y_i + \sum_{j=1}^{N} w_{ji} z_j(y_j) + I_i, \qquad z_i(x) = 1/(1 + e^{-x - b_i})$

In this equation $y_i$ represents the activation of node i, $z_i$ is the node output as calculated

by the standard sigmoid function, $\tau_i$ is its time constant, $b_i$ is a bias term, and $w_{ji}$ is the

strength of the connection from node j to node i. Each node i receives a weighted input $I_i$

from the agent’s receptor field, which is calculated according to Equation 7-2.

Equation 7-2: $I_i = w_I \, \mathrm{Sense}(RF)$

The function Sense(RF) returns the current state of the agent’s receptor field (0 or 1).

This receptor state is connected to each CTRNN node i via a dedicated input weight

matrix $w_I$. While such a distributed network structure is more complex than having a

dedicated „input‟ node, some initial exploratory evolutionary runs revealed that this

more distributed way of perturbing the state of the system resulted in solutions that were

more easily optimized (i.e. increased „evolvability‟). The ranges of these parameters

vary between experiments, and are described in the chapters.
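As an illustration, Equations 7-1 and 7-2 can be integrated with a simple forward Euler scheme. The following Python sketch is not the original simulation code: the names `ctrnn_step`, `w_in` and `sense` are hypothetical, one input weight per node is assumed, and the default step size of 0.1 is the value reported for the Chapter 8 model.

```python
import math


def sigmoid(x):
    """Standard logistic function used for the node output z."""
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(y, tau, bias, w, w_in, sense, dt=0.1):
    """Advance all node activations by one Euler step of Equation 7-1.

    y     : list of node activations y_i
    tau   : list of time constants tau_i
    bias  : list of bias terms b_i
    w     : weight matrix, w[j][i] is the connection from node j to node i
    w_in  : assumed per-node input weights (Equation 7-2)
    sense : current receptor state, 0 or 1
    """
    n = len(y)
    z = [sigmoid(y[j] + bias[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        recurrent = sum(w[j][i] * z[j] for j in range(n))
        external = w_in[i] * sense  # I_i from Equation 7-2
        dy = (-y[i] + recurrent + external) / tau[i]
        new_y.append(y[i] + dt * dy)
    return new_y
```

A full trial would call `ctrnn_step` once per time step for each agent, feeding in that agent's current receptor state.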

The behavior of the agents is optimized by using a genetic algorithm (GA) which is

based on the “microbial GA”, a steady-state GA with tournament selection (cf. Harvey

2001). Until some termination criterion is reached, two members of the population are

chosen at random, both have their fitness evaluated, and while the „winner‟ of the

tournament remains unchanged in the population, the „loser‟ is replaced by a slightly

mutated copy of the „winner‟. No crossover operator was used, especially as this might

conflict with the increased genetic diversity entailed by niching (Froese & Spier 2008).

Each member of the population is a clonal pair of agents whose overall performance

will be tested in a given experimental setup. In this GA we define a generation as the

number of tournaments required to generate a number of offspring equal to the

population size. An evolutionary run finishes at some maximum number of generations,

though it is sometimes manually terminated before the maximum is reached if solutions

are sufficiently good.
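The tournament scheme described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the experiments: `evaluate` and `mutate` are placeholders supplied by the caller, and breaking ties in favor of the first contestant is an assumption.

```python
import random


def run_tournament(population, evaluate, mutate, rng=random):
    """One microbial-GA tournament; the population is modified in place."""
    a, b = rng.sample(range(len(population)), 2)
    if evaluate(population[a]) >= evaluate(population[b]):
        winner, loser = a, b
    else:
        winner, loser = b, a
    # The winner stays unchanged; the loser becomes a mutated copy of it.
    population[loser] = mutate(population[winner])


def run_generation(population, evaluate, mutate, rng=random):
    """One generation: as many tournaments as there are population members,
    because every tournament produces exactly one offspring."""
    for _ in range(len(population)):
        run_tournament(population, evaluate, mutate, rng)
```

Because the winner is never overwritten, the best solution found so far can only be lost through a tie or through noise in the evaluation itself.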

Niching. In order to enhance the evolvability of the CTRNNs, the standard microbial

GA has been extended with a simple „geographical‟ method to allow different

subpopulations to evolve semi-independently within the overall population (cf. Spector

& Klein 2006; Izquierdo, et al. 2008). A minimalist, 1-D wrap-around „geography‟ was

used. This was implemented by running 10 evolutionary runs in parallel, each with a

population size of 10 solutions. Thus, for a particular evolutionary run $ER_i$, after every

generation, two solutions get selected at random to compete against the best, most

recent solutions of the two neighboring subpopulations that are being evolved by runs

$ER_{i-1}$ and $ER_{i+1}$. The winner of each competition remains in the subpopulation of $ER_i$ for

the next generation. The rest of the next generation is determined by tournament

competition within $ER_i$.
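One possible reading of this migration step is sketched below. The function name `migrate` is hypothetical, and two details are assumptions: the "best, most recent" solution of a neighbor is taken to be the fittest member of its current subpopulation, and each of the two randomly chosen residents competes against a different neighbor.

```python
import copy
import random


def migrate(subpops, fitness, i, rng=random):
    """Let two random members of subpopulation i compete against the best
    member of each ring neighbour; the winners stay in subpopulation i."""
    n = len(subpops)
    neighbours = [(i - 1) % n, (i + 1) % n]  # 1-D wrap-around geography
    slots = rng.sample(range(len(subpops[i])), 2)
    for slot, nb in zip(slots, neighbours):
        challenger = max(subpops[nb], key=fitness)
        if fitness(challenger) > fitness(subpops[i][slot]):
            # Copy so that later mutations do not alias the neighbour's genes.
            subpops[i][slot] = copy.deepcopy(challenger)
```

With 10 subpopulations of 10 solutions each, `migrate` would be called once per subpopulation after every generation.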

Genetic encoding. All CTRNN parameters and gains are encoded by a real-valued

vector (gene range [0, 1]). At the start of the GA the gene vector is initialized with

random values drawn from a uniform distribution (range [0, 1]). The mutation operator

changes each gene by a random value drawn from a Gaussian distribution (μ = 0; σ² =

0.02) with reflection at the gene boundaries. Before every fitness evaluation, each gene

is decoded linearly to the corresponding parameter range, except for the gains and time

constants, which are exponentially scaled.
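The encoding can be illustrated as follows. This sketch assumes that σ² = 0.02 denotes the variance (so the standard deviation is its square root) and that exponential scaling means log-uniform interpolation between the range limits; the text does not specify the exact form, and all function names are hypothetical.

```python
import math
import random


def reflect(g):
    """Reflect a value back into the gene range [0, 1]."""
    while g < 0.0 or g > 1.0:
        g = -g if g < 0.0 else 2.0 - g
    return g


def mutate(genes, variance=0.02, rng=random):
    """Perturb every gene with Gaussian noise, reflecting at the boundaries."""
    sd = math.sqrt(variance)
    return [reflect(g + rng.gauss(0.0, sd)) for g in genes]


def decode_linear(g, lo, hi):
    """Map a gene in [0, 1] linearly onto [lo, hi]."""
    return lo + g * (hi - lo)


def decode_exponential(g, lo, hi):
    """Assumed exponential scaling: log-uniform map onto [lo, hi], lo > 0."""
    return lo * (hi / lo) ** g
```

For example, under these assumptions a gene value of 0.5 decodes to the midpoint of a weight range but to the geometric mean of a time-constant range.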

Evaluation function. When the desirability of a solution is evaluated, it is tested for a

number of trials, typically between 15 and 100 (the precise number is

specified in the chapters). A relatively large number of trials can be beneficial for the

evolutionary process because the behavior of the CTRNN solutions is highly sensitive

to initial conditions, e.g. the respective starting positions of the agents. Each trial run

consists of 800 units of time, unless specified otherwise. At the start of each trial, the

agents have their internal node activations set to 0.

Fitness weighting. Even with the increased genetic diversity due to „geographical‟

niching, and the random spread of the initial conditions, it is still the case that

evolutionary runs are highly prone to getting stuck in local optima of the search space,

especially because the difficulty associated with the starting positions is highly variable.

For example, agents evolved to engage in perceptual crossing can easily locate each

other when they begin a trial in each other‟s proximity, but fail when they start near

their respective static objects. Under these conditions a typical evolutionary run will

simply start optimizing solutions for starting positions that are most easily optimized,

because this will at least result in some improvement. However, eventually the solutions

will become too specialized to be adapted later so as to generalize over all starting positions.

This problem was overcome by using a weighted measure of the overall success of a

solution, whereby the contribution of a trial’s score was inversely proportional to its

ranking among all of the trials for that evaluation (cf. Husbands, et al. 1998). This

arrangement dynamically ensures that those starting positions which are difficult are

given more weight.
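A weighting of this kind might look as follows. The exact scheme is not given in the text, so this sketch makes two assumptions: trials are ranked from the lowest score (rank 1) to the highest, and the weights 1/rank are normalized to sum to one. The function name is hypothetical.

```python
def rank_weighted_fitness(scores):
    """Combine trial scores so that poorly scoring trials count for more."""
    ordered = sorted(scores)  # lowest score first, i.e. rank 1
    weights = [1.0 / rank for rank in range(1, len(ordered) + 1)]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, ordered)) / total
```

Under these assumptions the result is never higher than the plain mean, so a solution cannot score well by excelling only at the easy starting positions.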

In the following chapters we will make use of simulations based on these principles to

investigate a range of phenomena. Chapter 8 provides an initial motivation for an

integrative perspective that includes the interaction process as a crucial element of its

explanatory framework. This is followed in Chapter 9 by a more detailed investigation

into the capacity of the interaction process to organize the behavior of individual agents,

including under impaired and unfavorable conditions. Chapter 10 builds on these

insights and develops them further by focusing on the minimal conditions for an

interaction process to become the basis for social interaction, and how this affects the

behavioral repertoire of the agents.

8 Investigating sensitivity to social contingency

The overall aim of this thesis is to show that a consideration of sociality helps to address

the cognitive gap faced by the life-mind continuity thesis by shifting a part of the

explanatory burden away from the individual agent to the enabling dynamics of the

interaction process, social interactions and cultural context. So far we have argued for

this shift theoretically. As a first experimental step toward achieving this goal it is

helpful to demonstrate and evaluate the possibility of a scientific perspective that

includes the dynamics of interaction as an essential element of its explanations.

One promising target for such an endeavor is Murray and Trevarthen‟s (1985) double

TV monitor experiment. In this psychological study 2-month-old infants were animated

by their mothers to engage in coordination via a live double video link. However, when

the live video of the mother was replaced with a video playback of her actions recorded

previously, the infants became distressed or withdrawn. These results, and those of a more

rigorous follow-up study by Nadel and colleagues (1999), indicate that 2-month-old

infants are sensitive to social contingency, i.e. the mutual responsiveness during an

interaction, and that this sensitivity plays a role in the unfolding of coordination.

Figure 8-1. The double TV monitor experiment. The original illustration from Nadel, et al. (1999) has

been modified so as to indicate the abstractions made for the simulation model. The dashed circles and

arrows represent the two interconnected agents, implemented as embodied dynamical systems.

Traditional explanations of this sensitivity have focused on inborn factors. For example,

Gergely and Watson (1999) have postulated the presence of an innate cognitive module

which enables the detection of social contingency, and Russell (1996) hypothesizes that

infants have a native capacity to understand intentionality and to process agency. Are

these postulations of innate capacities on the part of the infant necessary in order to

explain the empirical results?

Iizuka and Di Paolo (2007b) used an evolutionary robotics (ER) approach to test

whether simpler solutions could also emerge from the dynamics of the interaction

process itself. In their simulation model the evolved agents, which will be described in

more detail later (cf. Figure 8-2), successfully acquired the capacity to discriminate

between „live‟ (two-way) and „recorded‟ (one-way) interaction. Moreover, an analysis

of the resulting dynamics suggests that the interaction process itself plays an important

role in enabling this behavior. Similar results were also found by other related

simulation studies (e.g. Di Paolo, et al. 2008; Ikegami & Iizuka 2007; Iizuka & Ikegami

2004a; Di Paolo 2000; 1999).

It could be argued that the result of Iizuka and Di Paolo‟s modeling study only

represents a specific subset of the general solution space, in particular because they used

ER to explicitly generate agents that terminate the ongoing interaction when there is a

lack of social contingency. In other words, they employed a fitness function which adds

fitness scores to those solutions which lead to this avoidance behavior. We address this

issue more indirectly by testing whether termination of interaction emerges under more

general conditions. Answering this question is important if the argument is made that

these findings might apply more generally. Moreover, by changing the simulation setup

in this manner we have moved the model closer to the original double TV monitor

experiment: the infants presumably did not have the specific goal to detect whether they

were dealing with a live video or just a recording. It is more likely that they were simply

attempting to establish social coordination with their mothers but were unable to do so.

In summary, we will use an ER methodology to generate pairs of simulated agents

capable of reliably establishing and maintaining a coordination pattern under noisy

conditions. Unlike previous related work, agents are only evolved for this ability and

not for their capacity to discriminate social contingency (i.e., a live responsive partner)

from non-contingent engagements (i.e., a recording). However, as it turned out, when

they are made to interact with a recording of their partner made during a successful

previous interaction, the coordination pattern cannot be established. An analysis of the

system‟s underlying dynamics reveals (i) that stability of the coordination pattern

requires ongoing mutuality of interaction, and (ii) that the interaction process is not only

constituted by, but also constitutive of, individual behavior. We suggest that this

stability of coordination is a general property of a certain class of interactively coupled

dynamical systems, and conclude that psychological explanations of an individual‟s

sensitivity to social contingency need to take into account the role of the interaction

process.

8.1 Methods

We implemented a minimal ER model analogous to Murray and Trevarthen‟s (1985)

double TV monitor experiment by building on recent work by Iizuka and Di Paolo

(2007b). The goal of the agents is to cross their sensors as far away from their starting

positions as possible, a task which requires mutual localization, convergence on a target

direction, and movement in that direction while not losing track of each other. This task

is non-trivial since sensory stimulation only correlates with the overlapping of position

(when the centers of the agents are less than 20 units of space apart); it does not convey

the direction or speed of movement of the other agent. The agents are 40 units wide,

have an on/off sensor at their center, and can only move left or right by controlling the

output of their left and right motor nodes (see Figure 8-2).

Figure 8-2. A schematic view of the model adapted from Iizuka and Di Paolo (2007b). The two identical

agents are 40 units wide, only able to move in a horizontal direction, and equipped with a single on/off

sensor at their centre. They face each other in an unlimited continuous 1-D space.

Agents are controlled by a CTRNN consisting of 3 fully-connected nodes with self-connections

(see footnote 27). Similar settings have already been successfully used by Iizuka and Di

Paolo (2007b). The main differences are that (i) the agents of the current study only

have 3 nodes, (ii) the input is fed to all nodes instead of one dedicated sensory node,

and (iii) each actuator node has its own gain parameter. The first difference was chosen

to further minimize the conditions of the model and facilitate analysis; differences (ii)

and (iii) were implemented because they were found to increase the evolvability of the

solutions.

Noise is introduced into the simulation for two main reasons: (i) since the agents are

identical they will need to make use of noise in order to break the symmetry of their

movements and converge on a common target direction, and (ii) robustness against

noise increases the ability of „live‟ agents to cope with playback situations (Iizuka &

Ikegami 2004a). Accordingly, at each Euler time step there is a 5% probability that the

current sensory state is flipped into its opposite state. In addition, a small perturbation drawn from a

Gaussian distribution (μ = 0; σ² = 0.05) is added to the motor outputs at each time step.

The noise is applied to the outputs before the application of motor gains. In order to

further increase the robustness of the behavioral strategies, the initial relative

displacement between the agents varies (range [-25, 25]). Starting from any of these

possible relative positions, the task for the agents is to coordinate their behavior such

that they cross each other as far away from position 0 as possible. Since the agents are

started in opposite orientation („up‟ vs. „down‟), it is not possible for the evolutionary

algorithm to hard code any trivial solution (e.g. „always move left‟).
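The sensing and noise model can be sketched as below. Several details are assumptions made for illustration: motor noise is added after mapping each output onto [-1, 1] and before the gain is applied, positive velocity means rightward movement (right motor minus left motor), and σ² = 0.05 is treated as a variance. The velocity mapping follows the description in footnote 27, and all function names are hypothetical.

```python
import random


def in_contact(x_up, x_down, half_width=20.0):
    """True when the agents' centres are less than 20 units apart."""
    return abs(x_up - x_down) < half_width


def noisy_sensor(contact, flip_prob=0.05, rng=random):
    """Binary sensor state, flipped with 5% probability per time step."""
    state = 1 if contact else 0
    return 1 - state if rng.random() < flip_prob else state


def agent_velocity(z_left, z_right, gain_left, gain_right,
                   motor_var=0.05, rng=random):
    """Velocity from two motor node outputs in [0, 1], with motor noise."""
    sd = motor_var ** 0.5
    left = (2.0 * z_left - 1.0) + rng.gauss(0.0, sd)    # mapped to [-1, 1]
    right = (2.0 * z_right - 1.0) + rng.gauss(0.0, sd)  # noise before gain
    return gain_right * right - gain_left * left
```

Setting `flip_prob` and `motor_var` to zero gives the noise-free behavior, which is convenient for checking a controller by hand.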

Footnote 27. For each of the three nodes of the CTRNN the parameter ranges are as follows: time constant $\tau_i$ has range [1,

100], bias $b_i$ has range [-3, 3], and weights $w_{ji}$ have range [-8, 8]. All nodes receive the same sensory

input, namely the sensor state multiplied by an input gain with range [1, 100]. The overall agent velocity

is calculated as the difference between the left and right motor nodes. The velocity of each motor node is

calculated by mapping its output onto the range [-1, 1] and then multiplying it by an output gain

parameter (range [1, 50]). The time evolution of each agent’s controller is calculated by using Euler

integration with a time step of 0.1.

In terms of the evolutionary algorithm, the population size is 40 and the algorithm

terminates at 5000 generations. No niching was used. During each fitness evaluation an

agent is tested in 15 trials; to increase the robustness of the evolving solutions to noise

and variations in initial conditions only the lowest score achieved in any of the trials is

chosen as the overall score. Each trial run consists of 50 units of time (500 Euler time

steps). At the start of each trial agents have their internal node activations set to small

random values drawn from a standard Gaussian distribution (μ = 0; σ² = 1). The

initial distance between the agents varies; agent ‘down’ always gets placed at position 0,

while agent ‘up’ starts at a different position for each trial (15 different positions evenly

distributed across range [-25, 25]).

The fitness score of a trial run is calculated on the basis of a single factor, namely the

absolute value of the final crossing position of the two agents (divided by a factor of

10). Thus, in contrast to the work done by Iizuka and Di Paolo (2007b), these agents

were not explicitly evolved to break off the interaction pattern when detecting a lack of

social contingency. Instead, we simply aimed to generate a model that under normal

(but noisy) circumstances results in highly fit coordination behavior. Presumably, and

this is our null hypothesis, the agents capable of such robust behavior should be able to

sustain an interaction even when faced with the ‘playback’ condition.
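The scoring scheme of this chapter reduces to a few lines. In this sketch the function names are hypothetical, and the final crossing position of each trial is assumed to be supplied by the simulation.

```python
def start_positions(n=15, lo=-25.0, hi=25.0):
    """Evenly spaced starting positions for agent 'up' across [lo, hi]."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def trial_score(final_crossing_position):
    """Score of one trial: absolute final crossing position divided by 10."""
    return abs(final_crossing_position) / 10.0


def evaluation_score(final_crossing_positions):
    """Overall score: the lowest trial score among all trials."""
    return min(trial_score(p) for p in final_crossing_positions)
```

Taking the minimum means that a single failed trial determines the overall score, which is what pushes evolution toward solutions that cope with every starting position.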

8.2 Results

The GA was run 4 times. The fittest agent, with a score of 244.8, was produced during

the 4th run in generation 3477. This solution was then tested extensively; agent ‘down’

was always placed at position 0, while agent ‘up’ started at a different position for each

trial (101 positions evenly distributed across range [-50, 50]). Each trial was repeated 150

times. The mean score across this range of initial conditions is plotted in Figure 8-3

(left). The agents are able to generalize their behavior well beyond the range that they

were originally evolved to cope with. On average the best initial position for agent „up‟

turned out to be at 11 (mean score: 292.9).

[Figure 8-3 plots: mean fitness (0 to 400) against relative displacement (-50 to 50), left and right panels.]

Figure 8-3. Left: Mean score achieved by the fittest agent starting from various initial positions, with

standard deviation. Right: Mean score by the fittest agent but this time interacting with non-responsive,

recorded movements obtained from the original trials.

In order to demonstrate the general robustness of the evolved agents under this initial

condition, we ran another set of trials from this initial position, while varying noise

levels (see Figure 8-4). The motor noise was varied while the sensor noise remained

constant at evolutionary strength (5%), and sensor noise was varied while motor noise

remained constant (σ² = 0.05). At each noise level we tested the agents for 150 trials.

[Figure 8-4 plots: mean fitness (0 to 400) against motor noise level (0 to 2.5, left panel) and sensor noise level (0 to 50, right panel).]

Figure 8-4. Robustness to noise: mean fitness score achieved over 150 trials by the fittest evolved agent

starting from position 11 for a range of noise levels, with standard deviation. Original noise strength

during evolution is 0.05 for motor (left) and 5% for sensor noise (right).

The agents are able to cope with a wide range of perturbations. Indeed, their overall

performance degrades gracefully until the sensor and motor signals are completely

swamped by noise. In the case of sensor noise, for example, average performance only

approaches 0 just before reaching the 50% mark (at which point sensory activation

becomes completely arbitrary). This demonstrates that the agents are able to produce

highly robust coordination behavior.

Finally, another 150 trials were conducted with agent „up‟ at position 11 (under normal

noise conditions). The movement of agent „down‟ during the best trial (score: 321) was

recorded for playback. Another 150 trials were then run under playback conditions: the

initial conditions reflect those of the recorded best trial run (agent „up‟ always starts at

position 11 and with the same initial internal activation), and the movements of agent

‘down’ replicate those which it produced during the recording. While the sensorimotor

noise for agent „up‟ was different during each of these trials, no additional noise was

added to the recorded movement of agent „down‟.

The results are striking: whereas the original 150 trials of mutual (two-way) interaction

were highly successful (mean score: 268), the 150 trials of playback (one-way)

interaction were a drastic failure (mean score: 19). Effectively, agent „up‟ was not able

to sustain an interaction with the playback movements of agent „down‟. The severity of

this failure is especially surprising since under normal conditions the active agent is

robust against various forms of noise, and able to cope effectively with a wide range of

initial conditions. Moreover, during the playback condition its „partner‟ performs what

had previously been a highly fit behavioral repertoire. Still, the active agent is unable to

adapt to the situation of interacting with a non-responsive „partner‟. It could be argued

that this result is unique to the chosen situation. However, this is not the case: when

testing agent „up‟ with each of the original trials we get the same result (see Figure 8-3,

right).

8.3 Behavioral analysis

In order to explain these results we will first analyze the behavior of the agents. The

behavior under normal conditions can be broken down conceptually into three important

aspects: (i) localization, (ii) alignment, and (iii) coordination. We will briefly discuss

the first two aspects and then focus on the third. The activity during the first time steps

of the best trial run is shown in Figure 8-5.

Figure 8-5. Initial activity of the two agents during the best trial run. From top to bottom the traces show

the evolution over time of (i) their relative displacement, (ii) their noisy input signal and actual sensory

contact, (iii) their velocity, and (iv) the CTRNN node outputs of agent „up‟.

Initially the agents have no knowledge of how their own position relates to that of their

partner. Moreover, they have no way of gaining that information except when changing

their sensory input by engaging in movement. However, it turns out that one

stereotypical behavioral pattern is sufficient to solve the non-trivial problem of reliable

localization. First, each agent moves rightwards for a few units of time, and then starts

moving leftwards. This sweeping behavior usually takes up to 5 units of time and under

evolved conditions always enables the agents to locate each other. In the case of

negative initial displacement they will encounter each other during their rightward

sweep; otherwise they will cross their positions during their leftward return.

Interestingly, the agents always end up with positive relative displacement after their

initial localization. With this clever maneuver the agents have significantly reduced the

complexity of their coordination task: while sensory input is ambiguous (there is no

indication about the direction or speed of the other agent‟s movement), it has now been

co-arranged as a „touching on the left‟ indicator! This change of the sensory meaning is

possible because the CTRNN controllers are not symmetric.

How does the final oscillatory coordination pattern emerge out of the relative

movements of the agents? Before analyzing the behavior of the agents in more detail it

is necessary to briefly describe the evolved CTRNN controller, as shown in Figure 8-6.

Most importantly, the sensory input excites all of the nodes with a gain of S = 10.9, and

the right output gain (44.5) is almost twice as high as the left output gain (24.9). The

two motor nodes are inhibited by the non-motor node and they also inhibit each other

while hardly affecting the non-motor node.

Figure 8-6. The best evolved CTRNN controller. The circles represent the nodes with their time constants

and biases. The arrows represent connections, with the size indicating the weight strength. Dotted arrows

represent inhibitory connections. All nodes receive identical sensory input.

As an example, we can see in Figure 8-5 that the output of the right motor node ($z_3$) of

agent ‘up’ starts to slightly decrease just before time t = 8, due to lack of sensory

stimulation. This shift in velocity entails that agent ‘down’ catches up with agent ‘up’

and they remain in contact ($I_i = 1$) until just before t = 9. During this contact agent ‘up’

regains its previous rightward velocity due to sensory stimulation. After separating

again ($I_i = 0$) the firing of the left motor node goes down followed by the right motor

node which eventually leads to the behavioral pattern being reinitiated. This behavior

appears to be an individual achievement, and we therefore might expect that agent „up‟

should be able to engage with a playback recording.

Figure 8-7. Initial activity during a playback trial run in which the movements of agent „down‟ are the

same as in Figure 8-5. From top to bottom the traces show the evolution over time of (i) the relative

displacement between the agents, (ii) their noisy input signal and the actual moments of sensory contact,

(iii) their velocities, and (iv) the CTRNN node outputs of agent „up‟.

The activity during the playback trial run is shown in Figure 8-7. At first the „live‟ agent

aligns itself with the „playback‟ agent as it did in the original situation. During mutual

(two-way) interaction agent „down‟ would always respond to contact by moving away

slightly; however, in the playback situation this co-regulation is prevented from

occurring. Accordingly, every noise-displaced encounter results in a slight decrease of

relative displacement between the two agents, thereby in turn making it more likely that

there will be another sensory stimulation. Up to about t = 3, agent „up‟ is still able to

partially regulate this displacement on its own by adjusting the output of its right motor

node. However, from that point onwards the right motor node saturates at $z_3 = 1$, and

thereafter remains unaffected by further sensory stimulation. Finally, at around t = 6 the

positive feedback loop between increasing sensory stimulation and mounting leftward

velocity becomes unstable in such a way that recovery from breakdown is impossible.

The live agent falls behind the playback agent, and heads in the opposite direction.

Why does this breakdown of coordination not occur when both agents engage in „live‟

interaction? The simple answer provided by this model is that the stability of ongoing

coordination requires mutuality of interaction. After the initial alignment we find that

coordinated movement in one direction consists of continuous co-regulated oscillatory

behavior. Agents control their respective velocities such that they cross their sensors at

relatively regular intervals. This iterative reaction chain constitutes an ongoing pattern

of turn-taking; noise perturbations get amplified in a way that requires continuous co-regulated

re-establishment of the interaction (cf. Ikegami & Iizuka 2007).

8.4 Dynamical analysis

Can we account for the oscillating pattern in dynamical terms? Since the output of the

non-motor node $z_1(y_1)$ is saturated at 1 during coordination, it can be treated as a fixed

parameter. The rest of the system consists only of the two motor nodes (see footnote 28). If agents are

not in contact ($I_i = 0$), there is a stable equilibrium point at (-3.4, -7.5). This state slows

down rightward velocity, and the agents make contact. When $I_i = 1$ the equilibrium

point is shifted to (0.3, 1.9). This speeds up the rightward velocity of the agent. The

vector field of this autonomous dynamical system is shown in Figure 8-8.

Interestingly, under normal conditions the dynamical system never reaches either of

these two equilibrium points, because their existence is made transitory through the

ongoing interaction. This is illustrated in Figure 8-9 (left) in terms of the motor node

firing rates for agent „up‟ over a whole run (50 units of time). The trajectory settles

down into an oscillatory pattern that traces the corner near point (0, 1), in the middle of

the two equilibrium points (located at (0.95, 1) when $I_i = 1$, and at (0.30, 0) when $I_i = 0$).

Footnote 28. The parameters for these two nodes are $\tau_2 = 1.6$, $b_2 = 2.6$, $w_{12} = -3.7$, $w_{22} = 1.0$, $w_{32} = -7.9$, and $\tau_3 = 1.1$,

$b_3 = 2.9$, $w_{13} = -5.8$, $w_{23} = -5.8$, $w_{33} = 2.3$ (values rounded to one decimal place).
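The equilibrium points reported in this section can be checked numerically from the rounded parameters in footnote 28, together with the input gain S = 10.9 and the saturated non-motor output $z_1 = 1$. The sketch below integrates the reduced two-node system with Euler steps until it settles; the function name `settle` and the choice of 5000 steps are assumptions, and because the published parameters are rounded the agreement can only be approximate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Rounded values from footnote 28; node 2 is the left and node 3 the right motor.
TAU2, B2, W12, W22, W32 = 1.6, 2.6, -3.7, 1.0, -7.9
TAU3, B3, W13, W23, W33 = 1.1, 2.9, -5.8, -5.8, 2.3
S = 10.9   # sensory input gain reported in Section 8.3
Z1 = 1.0   # non-motor node output, saturated during coordination


def settle(sense, y2=0.0, y3=0.0, dt=0.1, steps=5000):
    """Integrate the two motor nodes with a fixed sensor state (0 or 1)."""
    for _ in range(steps):
        z2, z3 = sigmoid(y2 + B2), sigmoid(y3 + B3)
        dy2 = (-y2 + W12 * Z1 + W22 * z2 + W32 * z3 + S * sense) / TAU2
        dy3 = (-y3 + W13 * Z1 + W23 * z2 + W33 * z3 + S * sense) / TAU3
        y2, y3 = y2 + dt * dy2, y3 + dt * dy3
    return y2, y3
```

Calling `settle(0)` and `settle(1)` should return activations close to the two equilibrium points quoted above; passing the results through `sigmoid` with the corresponding biases gives the output-space coordinates used in Figure 8-9.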

The state trajectory for the playback situation of the same run is displayed in Figure 8-9

(right). At first the trajectory moves into the same region of state space but then, during

the period of prolonged contact, the left motor node gets saturated while the right motor

node remains at 1. This continues until the system almost reaches the equilibrium point

at (0.95, 1), but it eventually causes agent „up‟ to slow down too much thereby breaking

out of the coordination pattern.

Figure 8-8. The vector fields of the autonomous dynamical system consisting of the left and right motor

nodes only (the remaining node is saturated at $z_1(y_1) = 1$). Left: sensory input I = 0, and there is a globally

attracting stable equilibrium point at (-3.4, -7.5). Right: input I = 1, and the equilibrium point is (0.3, 1.9).

Figure 8-9. State trajectory of the outputs for the 2 motor nodes of agent „up‟ during 50 units of time. The

trace starts at the top right corner of each graph. The gray and black dots represent the globally attracting

stable equilibrium points when sensory input I = 0 and I = 1, respectively. Left: mutual (two-way)

interaction. Right: playback (one-way) interaction.

After agent „up‟ slows down enough such that the playback movement of agent „down‟

overtakes it, its input $I_i$ stays at 0. This causes its motor system to settle near the

equilibrium point at (0.30, 0), from where it is occasionally perturbed by sensor noise.

Thus, without the responsive help of the other agent, agent „up‟ is unable to regulate its

behavior so as to avoid falling into this attractor, an event which limits its further

behavior to mere leftward movement. The ability to oscillate is co-determined by the

agents through their interaction.

So far we have said nothing about the kind of dynamics that underlie the process by

which the agents can co-arrange their behavior so as to coordinate their oscillatory

interactions in either the left- or right-hand direction. Indeed, the state trajectory that is

shown in Figure 8-9 is focused on one stable regime of behavior and its breakdown,

namely when agent „up‟ is coordinating its oscillatory behavior rightwards. There exists

a complementary mode of behavior for the leftward direction. How are these two modes

of behavior separated dynamically in the state space of the CTRNN? If we look closely

at Figure 8-8 we see that the transient trajectories in the area between the two attractors

all run in parallel directions, namely alongside a hypothetical line that would cross both

attractors if they co-existed in one state space. Thus, as the system switches back and

forth between the two attractors, it is possible for transient dynamics to oscillate back

and forth either on their left- or right-hand side. It is likely therefore that the collective

decision of direction that emerges through the interaction of the two agents during the

beginning of a trial is the result of a coordinated bifurcation into one or the other of

these two transitory regions of state space.²⁹

8.5 Summary

Our simulation study found that stable and robust coordination can be

reliably established between simulated agents. While the agents were only selected on

the basis of this coordination ability (rather than their capacity to detect social

contingency), coordination still breaks down when a ‘live’ agent is forced to interact

with a playback of movements from a previous, successful trial. Agents interacting with

such a non-responsive ‘partner’ do not have the capacity to generate and sustain the

kind of oscillatory behavior necessary for coordination. Thus, what at first appears to be

a behavioral capacity of the individual agent is shown to actually emerge out of a

combination of the internal dynamics and the interaction process.

29 Further work is required to determine the precise dynamics underlying the agents’ coordinated behavior

at the decision point. Collaborations with Jose Fernandez-Leon, who has replicated and extended this

simulation study, are underway in order to better understand the operations of the system.

We are thus faced with a peculiar situation in which the behavior of the individual

agents brings forth the interaction process, and that interaction process enables the

behavior of the individual agents (cf. Figure 8-10). This makes a reduction of the

coordination breakdown to an individual agent’s capacity to detect social contingency

impossible. Moreover, it points to the autonomy of the interaction process, as postulated

by the enactive approach to social cognition (cf. Chapter 4). A more detailed analysis of

the dynamics of the interaction process in this context is desirable, especially in terms of

an artificial life investigation into the systemic basis of constitutive autonomy (cf.

Froese & Di Paolo 2008b). Finally, this focus on the efficacy of the interaction process

also has practical relevance for the design of multi-agent systems, especially in cases

where an agent’s ‘role’ in a formation is not an intrinsic property of that individual but

emerges from the mutual interactions between multiple agents (cf. Quinn, et al. 2003).

[Diagram: ‘Interaction Process’ and ‘Individual Behavior’, linked in both directions by the label ‘Enables/Constrains’.]

Figure 8-10. An illustration of the reciprocal relationship between the coordinated interaction process and

the individual behavior of the agents. The mutual enabling/constraining makes a reduction of the

coordination breakdown to an individual agent’s capacity to detect social contingency impossible.

It is worth emphasizing that we do not claim that our model instantiates the behavioral

phenomenon which we are investigating, nor that the baby-mother interaction studied

by Murray and Trevarthen (1985) is reducible to such a simple system. The model is

purely conceptual in that it shows at work a possible explanation that may later be

considered and tested in specific empirical cases. Thus, by generating simple models

which do not presuppose the methodological individualism which prevails in social

cognitive science and psychology, we can re-conceptualize the space of possible

explanations (Di Paolo, et al. 2008). In particular, the model presented in this chapter

suggests that the capacity for social behavior is strongly dependent on the existence of

an appropriate social context, one whose stability is in turn dependent on the active and

responsive engagement of the participants.

On this basis we propose that an explanation for the infants’ distressed reaction, which

is observed when confronting them with a video recording rather than a live stream of

their mother, also needs to take into account the role of the interaction process. Of

course, this does not mean that the infants cannot by themselves detect social

contingency or that they cannot develop and internalize this ability. But this model does

open up the possibility for explanations that do not suppose any necessity for inborn

behavioral capabilities and/or a complex perceptual strategy on the part of the infant.

Nevertheless, it might still be argued that what the results show is only the possibility of

the constitutive role of the interaction process, but that they say nothing about what the

explanation for the empirical cases actually is. After all, we adults are able to perceive the presence

of others without having to directly interact with them at the time, for instance when I

perceive someone walking in the distance with his back turned to me. The possibility of

this detached other-perception clearly shows that more research needs to be done. It is

worth emphasizing again, however, that we have not excluded the possibility that

internal mechanisms are playing a role in the full explanation, but have rather advocated

a more inclusive explanatory strategy that makes available a more encompassing

scientific perspective. For example, it may well be that sensitivity to social contingency

is initially a socially mediated phenomenon, but that this external mediation becomes

internalized during development (cf. Vygotsky 1934). However, even in adult life the

primary basis of social understanding might still be interaction (Gallagher 2001). That

this is indeed a possibility has been supported by the psychological study by Auvray

and colleagues (2009), which showed that the behavior of adult participants can be

effectively organized in the presence of an appropriate interaction process (cf. Chapter

6, p. 90). The aim of the modeling experiments presented in the next two chapters is to

get a better understanding of this finding.

9 Investigating the interaction process

In this chapter we continue the investigation into the organizing dynamics of the

interaction process by means of another set of modeling experiments. These are based

on the minimalist psychological study of perceptual crossing by Auvray, Lenay and

Stewart (2009), which has been described as a detailed case study in Chapter 6 (p. 90).

The value of modeling this experiment has already been shown by Di Paolo, Rohde and

Iizuka (2008), who used evolutionary robotics to generate a simulation model which

successfully replicated the main empirical results while at the same time gaining some

additional insights into the dynamics of the interaction process. For example, the

problems that the model agents had with avoiding interactions with their respective

static objects led them to predict similar difficulties for human participants. This

prediction was already supported by the empirical data presented by Auvray and

colleagues, but previously went unnoticed. Moreover, they found it practically

impossible to artificially evolve a robust behavioral strategy without introducing

temporal delays into the simulation, thereby leading to the additional hypothesis that

there is a crucial role of timing between external stimulation and the participants’

behavior. In the cognitive sciences this combination of empirical and modeling work on

minimal perceptual crossing has already been used to support the development of the

interactionist approach to social cognition (Gallagher 2008c, pp. 162-166), as well as

the enactive approach to social interaction (De Jaegher & Froese 2009).

In this chapter we will continue this modeling research with the aim of gaining a better

appreciation of the further potential of this general experimental setup and, at the same

time, of improving our understanding of the constitutive role of the interaction process.

We begin by using a similar modeling setup as that used by Di Paolo and colleagues

(2008), and provide a comprehensive analysis of the evolved behavioral strategy by

means of a set of psycho-physical tests. Their results are successfully replicated. The

novel aspect of this re-implementation is the great simplicity of the evolved agents,

which enables a detailed dynamical understanding of their behavior. The original task is

also modified in two ways in order to further test the extent to which successful

behavior depends on the dynamics of the interaction process. In a first variation, the

outputs of the receptor fields are switched between the agents, a modification that

cripples their ability to make sense of sensory-motor correlations. Nevertheless, it is

found that even under this impaired condition stable perceptual crossing reliably

emerges from the inter-agent interactions. In a second variation, we changed the task so

as to introduce a conflict between individual behavior and global stability, namely by

evolving agents to locate the mobile object which is not the other agent. It is found that

agents can temporarily succeed at this task, but only by regularly falling back into stable

patterns of perceptual crossing. These variations lead to novel hypotheses about human

behavior that are open to verification by additional psychological experiments.³⁰

Finally, we use the psycho-physical studies of the evolved agents to derive a traditional

hypothesis about the sub-personal processes which give rise to their behavior. However,

a detailed analysis of the internal dynamics of the agents refutes this hypothesis in favor

of one which instead focuses on temporality and the interaction process. Our inability to

predict the operations of even such simple dynamical systems serves as a warning

against similar attempts to understand sub-personal mechanisms on the basis of

behavioral observations.

9.1 Methods

The simulation model includes two agents which, following Auvray, et al. (2009), face

each other in a 1-D environment (cf. Figure 6-5, p. 90). The 1-D environment wraps

around on itself after 600 units of space (i.e. the environment is a circle with a

circumference of 600 units). In the simulation all distance and time units are of an

arbitrary scale. Each model agent can control the horizontal movement of its ‘body’, i.e.

the position of its receptor field that occupies a total of four units of space. The sensory

input of an agent is activated (set to 1) when its receptor field overlaps with another

object in the 1-D space, otherwise the input remains off (set to 0). The position of each

agent is represented by a continuous variable, and the velocity is determined by a fully

inter-connected CTRNN with self-connections.³¹ No noise was applied to any part of

the simulation.

30 In fact, the switched receptor field condition has already been the target of a recent pilot study by Di

Paolo and De Jaegher (personal communication) at the University of Sussex. It appears that participants

are able to deal with this condition effectively.
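The geometry described above can be sketched in a few lines. This is only an illustrative sketch, not the code used for the experiments: the function names are hypothetical, and the width of the objects (`object_width`) is an assumption, since only the four-unit width of the receptor field is stated here.

```python
CIRCUMFERENCE = 600.0  # the 1-D environment wraps around after 600 units
FIELD_WIDTH = 4.0      # total width of an agent's receptor field


def wrap(position):
    """Map any position onto the circle [0, CIRCUMFERENCE)."""
    return position % CIRCUMFERENCE


def circular_distance(a, b):
    """Shortest distance between two positions on the circle (at most 300)."""
    d = abs(wrap(a) - wrap(b))
    return min(d, CIRCUMFERENCE - d)


def sensor(agent_pos, object_positions, object_width=4.0):
    """Return 1 if the receptor field overlaps any object, otherwise 0.

    object_width is an assumed value; positions are taken to be the
    centers of the receptor field and of each object.
    """
    reach = (FIELD_WIDTH + object_width) / 2.0
    overlaps = (circular_distance(agent_pos, p) < reach for p in object_positions)
    return int(any(overlaps))
```

Because distances are measured around the circle, an agent at position 599 still senses an object at position 1.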

In terms of the evolutionary algorithm, the population size was set to 100 and an

evolutionary run finished at a maximum of 5000 generations, though it was sometimes

manually terminated beforehand if solutions were already sufficiently good. When the

desirability of a solution is evaluated, it is tested for a total of 100 trials that are evenly

spread out across the set of possible initial conditions: 10 * 10 trials over 600 * 600

different possible starting positions, where the difference in positions is determined by a

step-size of 60 (= 600 / 10) units of space. Moreover, to prevent the CTRNNs from

simply learning how to deal with an arbitrary set of starting positions, for each

evaluation, the whole set of trials is adjusted by a general position offset drawn from a

uniform random distribution (offset range [0, 60]), and each particular position is also

displaced by a random value drawn from a Gaussian distribution (μ = 0; σ² = 30). Each

trial run consists of 800 units of time. At the start of each trial, both agents have their

internal node activations set to 0.
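One possible reading of this sampling scheme is sketched below. The function name and its parameters are hypothetical. Whether the general offset is shared by both agents, and whether the Gaussian displacement is drawn separately for each agent's position, are assumptions that the description above does not settle.

```python
import random


def starting_positions(rng, grid=10, circumference=600.0, variance=30.0):
    """Return grid * grid pairs of starting positions (agent 'up', agent 'down')."""
    step = circumference / grid      # 60 units between neighboring grid points
    offset = rng.uniform(0.0, step)  # one offset for the whole set of trials
    sigma = variance ** 0.5          # the text gives the variance, not sigma
    trials = []
    for i in range(grid):
        for j in range(grid):
            # Assumption: each position receives its own Gaussian displacement.
            up = (i * step + offset + rng.gauss(0.0, sigma)) % circumference
            down = (j * step + offset + rng.gauss(0.0, sigma)) % circumference
            trials.append((up, down))
    return trials


trials = starting_positions(random.Random(0))
```

Drawing a fresh offset and fresh displacements for every evaluation means that no fixed set of 100 starting configurations is ever repeated exactly.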

It is important to note that Di Paolo, Rohde and Iizuka (2008) introduced a time delay

between the activation of an agent’s receptor field, i.e. due to an encounter in the 1-D

environment, and the perturbation of the agent’s CTRNN. This was apparently needed

in order to evolve more robust and dynamic solutions for this particular task. The need

for an explicit time delay is peculiar because a CTRNN, as a universal function

approximator (cf. Funahashi & Nakamura 1993), should in principle be capable of

incorporating such a delay within its own network structure. Accordingly, we have

spent a considerable amount of effort trying to evolve similarly robust and dynamic

solutions without the external imposition of such a delay. However, this effort supported

the findings of Di Paolo and colleagues that in practice such an approach does not seem

to generate solutions that robustly generalize over all initial conditions. Consequently,

we introduced a delay of 25 units of time into the simulation, which finally made the

evolution of more robust solutions possible. This delay is an integral part of these

solutions. Preliminary tests with shorter delays showed that the evolved agents are

capable of dealing with small alterations to some extent, but shortening the delay below

20 units makes them completely incapable of distinguishing between an encounter with

a static object and the other agent.

31 For each of the three nodes of the CTRNN the parameter ranges are: time constant τ_i has range [1, 200], bias

b_i has range [-8, 8], and weights w_ji have range [-8, 8]. All nodes receive the same sensory input, namely

the sensor state multiplied by an input gain with range [1, 100]. The overall agent velocity is calculated as

the difference between the left and right motor nodes (range [-1, 1]). No output gains were used. The time

evolution of each controller is calculated by means of Euler integration with a time step of 0.1.
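To make the externally imposed delay concrete, the following sketch combines an Euler step of a CTRNN with a simple delay line. It assumes the commonly used CTRNN formulation, which is not restated in this section, and all names (`ctrnn_step`, `DelayLine`) are hypothetical. With an Euler step of 0.1, a delay of 25 units of time corresponds to 250 integration steps.

```python
import math
from collections import deque

DT = 0.1                      # Euler integration step
DELAY_STEPS = round(25 / DT)  # 25 units of time correspond to 250 steps


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(y, tau, bias, w, gain, sensed):
    """One Euler step of the assumed standard CTRNN equation:

    tau_i * dy_i/dt = -y_i + sum_j w[j][i] * sigmoid(y_j + b_j) + gain * sensed
    """
    n = len(y)
    out = [sigmoid(y[j] + bias[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        total = sum(w[j][i] * out[j] for j in range(n))
        new_y.append(y[i] + DT * (-y[i] + total + gain * sensed) / tau[i])
    return new_y


class DelayLine:
    """Return each pushed value only after a fixed number of further pushes."""

    def __init__(self, steps=DELAY_STEPS):
        self.buffer = deque([0] * steps)  # the sensor reads 0 before the trial

    def push(self, value):
        self.buffer.append(value)
        return self.buffer.popleft()
```

In a simulation loop the raw sensor state would be passed through `DelayLine.push` at every step, and only the returned (delayed) value would be fed to `ctrnn_step`.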

The reason why the evolutionary process is unable to incorporate a time delay into

the CTRNNs used in this experiment, and why the inclusion of such a time delay is

practically needed for robust solutions at all, deserves further study in the future. It is

likely that a CTRNN, as a purely formal system where each node immediately affects

all the nodes to which it is connected, is inherently unsuitable for giving rise to delayed

activity. In material systems, on the other hand, delays are widespread. Moreover, they

are an important property of biological neural systems, where synaptic and conduction

delays depend on the length of the synaptic path. From this biological perspective it

might appear that time delays are merely an unintended side-product of the material

underpinnings of the nervous system. Moreover, since delays increasingly disrupt the

possibility of synchrony in large networks with long-distance connectivity, it seems that

their effects are mainly deleterious and in need of compensation, for example through

inter-neuron ‘shortcut’ connections (cf. Buzsáki 2006, p. 78). This is in contrast with

our finding that the extension of the CTRNN controller with a delay structure appears to

be an integral part of the evolved solutions to the task.

What might the role of the delay consist of? Further study of this aspect of the model is

still needed, but we can already propose a hypothesis. In this particular experiment, the

delay in the sensory-motor loop is constitutive of a loose system-environment coupling,

i.e. delay provides a source of relative decoupling. At first sight the need for decoupling

might appear counterintuitive, especially from the point of view of embodied-embedded

robotics. After all, a decisive point of that approach was precisely to get away from the

highly decoupled systems of GOFAI. Instead, the focus has shifted to robotic systems

that are tightly embedded in their environment via continuous and immediate sensory-motor

interaction. In other words, from this perspective we would expect that the

presence of decoupling will interfere with an embodied-embedded system’s ability to

respond to environmental changes in a robust and timely manner, and therefore is a

factor that should be eliminated if possible.

However, there is a growing amount of research in robotics which demonstrates that

small amounts of decoupling might not only be desirable but actually essential for a

variety of behaviors. In effect, decoupling provides a form of mediacy between a system

and its environment. And, as Jonas (1966) has argued at length, mediacy and autonomy

are complements of each other. They give rise to a kind of dialectic at the core of life, a

tension which is most clearly expressed in developmental changes and throughout the

major transitions of evolution. While it is beyond the scope of this chapter to present

Jonas’ philosophy in more detail, it is important to note that there are already a number

of studies in embodied-embedded robotics which have begun to explore these ideas. For

example, the essential role of sensory-motor decoupling for active perception has been

investigated in terms of a dedicated ‘gating’ neuron (Iizuka & Ikegami 2004b), a

homeostatic neural mechanism (Di Paolo & Iizuka 2008), and externally imposed time

delays (Rohde & Di Paolo 2008). Moreover, even control engineering can practically

benefit from considering the role of relative decoupling. For instance, it has been shown

that adding slippery soles to a quadruped robot increases the stability of locomotion

while saving motor energy (Iida & Pfeifer 2004). In general, these studies show how the

incorporation of mediacy can increase a system’s robustness, because with less rigid

coupling it is less at the whim of environmental perturbations, and flexibility, because

relative decoupling creates a gap that can be filled by active perception strategies.

Given these advantages for the design of embodied-embedded robotics, it is likely that

the role of mediacy in biological systems will become an important area of research in

the future. Furthermore, it is possible that such practical concerns will lead robotics to

align itself more closely with the theoretical framework of the enactive paradigm, in

which the bio-philosophy of Jonas plays a central role (cf. Di Paolo 2003). While there

clearly remains much more to be done in this area, here we will simply follow Di Paolo,

Rohde and Iizuka’s (2008) approach by using an externally defined time delay in order

to bootstrap the artificial evolution of more active behavioral strategies.

9.2 Experiments

In this section we describe three sets of experiments. First, the performance of the

simulated agents was evaluated in terms of the experimental setup of the original

psychological study. We then introduce two novel studies that are specifically aimed at

investigating the extent of the self-organizing properties of the inter-individual

interaction process. In the first instance we tested the robustness of the solution that was

evolved for the standard task by exchanging the input signals of the receptor fields

between the two agents, as this significantly limits their ability for engaging in

individual sensory-motor exploration. Then we changed the experimental setup to one

in which the ‘intentions’ of the individual agents and the overall interaction dynamics

are in conflict with each other, namely by evolving agents that were rewarded for interacting

with each other’s ‘shadow’ object, i.e. an object that is active (mobile) but not interactive

(non-contingent).

9.2.1 Experimental setup 1: Original setup

As a first step we tried to replicate the modeling work that has already been done by Di

Paolo, Rohde and Iizuka (2008). Their simulation model is relatively faithful to the

details of Auvray, Lenay and Stewart’s (2009) experimental setup, with one significant

difference: while the original task for the participants was to click a mouse whenever

they encountered each other, the task for the model agents is to locate the partner agent

and spend as much time as possible as close to each other as possible. Implicit in this

task is the requirement for the agents not to become ‘trapped’ by the static object or

‘shadow’ object of the other agent. Accordingly, in their model the fitness score F of

each trial is calculated to be inversely proportional to the average distance between the

two agents (it is therefore the same for both). The score F for a particular trial is

determined by Equation 9-1.

Equation 9-1

$$F = 1 - \frac{1}{T} \sum_{t=0}^{T} \frac{d(t)}{300}$$

In this equation T is the total number of time steps per trial, 300 is the maximum spatial

distance between the agents (since the 1-D environment wraps around between 0 and

600 units), and d(t) is the spatial distance between the agents at time step t. Fitness

scores vary continuously (fitness range [0, 1]), with 0 being the worst.
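As a minimal sketch, and assuming that Equation 9-1 is read as one minus the time-averaged distance normalized by the maximum distance of 300, the score of a trial could be computed as follows. The function name `fitness` and its signature are hypothetical.

```python
def fitness(distances, max_distance=300.0):
    """Return 1 minus the mean of d(t) / max_distance over one trial.

    distances holds the spatial distance d(t) between the two agents
    at each time step of the trial.
    """
    if not distances:
        raise ValueError("a trial needs at least one time step")
    normalized = sum(d / max_distance for d in distances)
    return 1.0 - normalized / len(distances)
```

Under this reading, a trial spent entirely at the maximum distance scores 0, and a trial spent entirely at zero distance scores 1, in line with the stated fitness range.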

The advantage of this evaluation function is that it simplifies the task for the model

agents, since they do not have to engage in any additional explicit classification

response (i.e. some form of ‘clicking’). Moreover, because the score is a continuous

measure this increases the evolvability of the solutions. In contrast to a discrete

evaluation measure of a number of successful clicks, here every approaching behavior is

rewarded in a proportional manner, even if the agents do not happen to find each other

(i.e. they do not actually need to cross).

However, there are also drawbacks in using the average distance between the agents as

a measure of the desirability of the evolved solutions. Most importantly, this measure

fails to properly distinguish between (i) an agent’s general exploratory movements and

interactions, and (ii) the explicit distinction of an ongoing interaction as an encounter

with the receptor field of the other agent. In contrast, it is possible for a human

participant in the original psychological experiments to spend a considerable amount of

time engaging in interactions with the static and/or shadow object of the other, which is

often the case, but then still decide against a mouse click response. Nevertheless, due to

the fact that this fitness measure turned out to be much more evolvable than equivalent

measures based on ‘clicking’ accuracy, we chose to retain it for this study. Future work

could investigate whether the addition of an explicit ‘clicking’ ability changes the

general behavioral strategies of the agents, or perhaps even the inclusion of a model of

arm morphology (cf. Rohde & Di Paolo 2008).

We explored a range of CTRNN network sizes, starting with 11 nodes (the maximum

size used by Di Paolo and colleagues), but were able to find robust solutions with as

few as 4 nodes. We then chose the highest scoring solution out of the population which

had achieved the highest average fitness out of the 10 different evolutionary runs for

further testing. During testing the duration of each trial was doubled to 1600 units of

time in order to better assess the general robustness of the selected solution. In addition,

we were able to manually prune several connections in the evolved CTRNN without

significantly affecting its performance. Indeed, this pruning eventually allowed us to

reduce the solution to a 3-node CTRNN network, which is significantly smaller than the

11-node network obtained by Di Paolo and colleagues. This pruned CTRNN was then

further optimized for 800 generations. The resulting CTRNN is shown in Figure 9-1.

Figure 9-1. The CTRNN controller used for experimental setup 1. Legend: Circles represent CTRNN

nodes with time constants τ_i, block-tailed arrows represent bias connections b_i, diamond-tailed arrows

represent weighted input connections Iw_i, normal arrows represent motor outputs z_i (left motor: L-M;

right motor: R-M), circle-headed arrows represent weighted inter-node connections w_ij (including self-connections).

Negative connections are depicted as dashed lines, while positive connections are solid. The

size of the arrows is roughly proportional to the strength of the connection, while the size of the circles is

roughly proportional to the speed of the CTRNN node.

In order to get an initial rough idea of what the different fitness scores mean in terms of

agent behavior for this solution, it is helpful to consider the following illustrative cases:

(i) A fitness score of 0 is an absolute limit point that is only ever attained when

evolving solutions without delay, which often produces agents that suddenly stop

moving when their receptor field becomes activated. Thus, when these agents start

the trial on their respective static objects, they will not move for the rest

of the trial, and remain maximally distant from each other (the static objects are

located 300 units apart).

(ii) A fitness score of 0.5 is obtained, for example, when permanently turning off the

receptor field of the evolved agents, which then continually circle the 1-D

environment until the end of the trial. The behavior is also often displayed by

randomly initialized CTRNNs that do not yet respond to perturbations. In these

cases the agents move continuously in opposite directions around the 1-D

environment, and therefore spend equal amounts of time near the maximally-distant

and maximally-close positions.

(iii) A fitness score of 1 is another absolute limit point that is only ever attained by

agents that have been evolved without delay, and which immediately stop moving

when their receptor field gets activated. Thus, when these agents start the trial on

top of each other, they will not move for the rest of the trial, and remain maximally

close to each other.
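The three illustrative cases can be checked with a small worked example. This is a sketch under simplifying assumptions: the agents move at exactly one unit of space per time step, the trial lasts 600 steps (two full cycles of the distance between counter-rotating agents), and the score is taken to be one minus the mean distance divided by 300. All names are hypothetical.

```python
CIRCUMFERENCE = 600


def circular_distance(a, b):
    """Shortest distance between two positions on the circular environment."""
    d = abs(a - b) % CIRCUMFERENCE
    return min(d, CIRCUMFERENCE - d)


def mean_fitness(distances):
    """One minus the mean distance, normalized by the maximum distance (300)."""
    return 1.0 - (sum(distances) / len(distances)) / (CIRCUMFERENCE / 2)


STEPS = 600

# Case (i): both agents stay frozen, 300 units apart.
frozen_apart = [circular_distance(0, 300) for _ in range(STEPS)]

# Case (ii): the agents circle in opposite directions at one unit per step.
circling = [circular_distance(t, 300 - t) for t in range(STEPS)]

# Case (iii): both agents stay frozen at the same position.
frozen_together = [circular_distance(42, 42) for _ in range(STEPS)]

print(mean_fitness(frozen_apart))     # 0.0
print(mean_fitness(circling))         # 0.5
print(mean_fitness(frozen_together))  # 1.0
```

In case (ii) the distance sweeps evenly between 0 and 300, so its mean is 150 and the score is 0.5, which is why agents that simply circle the environment obtain this intermediate score.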

Accordingly, the most interesting behavior for this CTRNN will be found by examining

trials that have fitness scores of around 0.75. In these cases the agents have successfully

engaged in perceptual crossing for at least some amount of time during the trial, but

they must have also faced some difficulties. This usually means that an agent was

delayed by an encounter with its static object or the shadow of the other agent during

the beginning of the trial, or that there was some other kind of interference which

caused the perceptual crossing to break down temporarily before being reestablished.

In order to obtain a comprehensive overview of this solution’s performance we tested it

for each possible combination of starting positions of the two agents (600 x 600 trials).

The results of these tests are depicted graphically in Figure 9-2.

[Figure 9-2: grayscale map of fitness scores. Both axes run from 0 to 600 in steps of 100; the grayscale bar runs from 0.6 to 1. Regions 1 to 6 are marked on the map.]

Figure 9-2. Graphical representation of fitness scores achieved at each possible combination of starting

positions for agent ‘up’ (x-axis) and agent ‘down’ (y-axis). Note that the axes wrap around due to the 1-D

circular shape of the virtual environment. Fitness scores range from 0.60 to 0.96 with an average of 0.87.

See text for an explanation of the different regions.

We can identify several salient regions in Figure 9-2 which enable us to make some

general comments about the evolved solution. First, it should be noted how well the

model agents manage to generalize over all initial conditions; no trial resulted in a

fitness score of less than 0.60. Second, the evolved solution is not strictly symmetrical.

However, this is not surprising since we did not enforce any structural symmetry on the

CTRNN controller. Third, there are some regions of clear qualitative changes

indicated by sharp grayscale differences. These regions have been marked as 1 to 6 in

Figure 9-2 for ease of reference. A brief description of the behavior of the agents when

initially starting from these different regions gives an overview of their general

behavioral domain:

1) The fitness in this region is relatively low because both agents get perturbed

by their respective static objects during the beginning of the trials. This will

cause them to briefly oscillate around the object. After a few contacts the agents

proceed to break out of the oscillation, continue to move on, and eventually

come into contact. From this point onward they engage in perceptual crossing

until the end of the trial.

2) The low fitness diagonal region going on the right-hand side from starting

position (0, 0) to (600, 600) can be explained as follows: agent ‘up’ (x-axis)

moves rightwards without any input, while agent ‘down’ (y-axis) moves

leftwards. This means that when agent ‘up’ starts within 52 units of space to the

right of agent ‘down’, it will encounter the other agent’s shadow object. This

will trigger its receptor field and cause it to briefly turn back. However, agent

‘down’ has not been perturbed by this encounter and continues moving

leftwards. Agent ‘up’ will thus not make any further contact and continue

moving rightwards. In sum, what happens in this region is that agent ‘up’ gets

delayed, and the agents thus spend more time apart from each other. This ceases

to be a problem when agent ‘up’ starts further along the x-axis compared to the

starting position of agent ‘down’.

3) This whole region is relatively flawless. The agents move unperturbed past each

other’s static object, eventually come into contact, and engage in perceptual

crossing until the end of the trial. The increasing fitness gradient toward the

corner (600, 0) reflects the fact that the agents have to travel less distance to

meet each other.

4) This streak of uneven fitness distribution reflects the fact that the agents, when

they start from this region, sometimes encounter each other in the vicinity of the

static object of agent ‘up’ (located at x = 148). This interference causes the

agents to break off perceptual crossing eventually, at least under some

conditions.

5) This streak of uneven fitness distribution reflects the fact that the agents, when

they start from this region, sometimes encounter each other in the vicinity of the

static object of agent ‘down’ (located at y = 448). This interference causes the

agents to break off perceptual crossing eventually, at least under some

conditions.

6) This solid line of relatively low fitness represents cases where the agents

encounter each other at the beginning of the trial, but eventually disengage,

move on, and finally establish proper perceptual crossing until the end of the

run. These cases are the only trials in which the two agents break off perceptual

crossing even though there is no external interference present. Such spontaneous

disengagement did not occur with the original 4-node solution; the 3-node

solution is thus slightly less robust. The behavior appears to be related to the

way in which the agents initially engage each other. In certain cases they are

unable to stabilize the interaction appropriately.

We now have a rough understanding of the behavioral domain of the agents. However,

so far we have not gained any detailed understanding of what allows the agents to be

sensitive to the social contingency of their interactions. In other words, how do the

agents distinguish an interaction with their partner from an interaction with the static or

shadow object? To avoid prolonging any interaction with the shadow object is actually

relatively straightforward, since the situation itself is unstable. The other agent will not

be perturbed by the encounter with its shadow and therefore move away, trailing its

shadow object behind it, and thus eventually terminate the one-sided interaction

process. However, if an agent happens to encounter its static object then things are more

complicated, since oscillating around this object is based on a stable environmental

situation. Nevertheless, the evolved agents do manage to disengage from interactions

with their static objects eventually. How is this possible?

In order to get a better understanding of this discriminatory capacity it is helpful to

study a particular interaction in more depth. We chose a representative trial starting

from position (100, 500), which is illustrated in Figure 9-3. The fitness score was 0.74.

During this trial the agents first encounter their respective static objects (before 1000

time steps), continue to oscillate around this object (from 1000 to 3000 steps), then

disengage and continue searching (from 3000 to 4500 steps), then finally locate each

other (after ca. 4500 steps), and establish perceptual crossing until the end (16000

steps). On the basis of this trial it is possible to investigate why the agents disengage

from the interactions with their static objects, but continue to engage in perceptual

crossing once they make contact with each other.

Figure 9-3. Illustration of the behavior of the agents during a representative trial starting from point (100,

500) with a score of 0.74. They first encounter their respective static objects, then continue searching, and

finally locate each other and establish perceptual crossing until the end of the run (16000 time steps). Top:

the position of the agents and objects in the 1D environment over time. Middle: the node outputs of agent

‘up’ over time. Bottom: the status of the receptor field and the velocity of agent ‘up’ over time. Note that

a change of receptor field status reaches the agent’s controller only after a delay of 25 units of time (250

steps).

How does the sensitivity to social contingency emerge in this system? The first thing to

notice is that both agents disengage from their respective static objects after the third

time that their receptor field has become activated. We will therefore initially focus our

analysis on this particular moment in time. We know from Di Paolo and colleagues

(2008) that it is possible for the agents to base their behavior on the duration of

stimulation afforded by an encounter. Similarly, in this trial the third encounter between

agent ‘up’ and its static object lasts 116 steps, while the third encounter with the other

agent only lasts 56 steps. The static object therefore perturbs the agent almost exactly

twice as long as the other agent. The reason for this, of course, is that the other agent

moves with an opposite velocity, while the static object remains stationary. Do the

agents make use of this difference in stimulus duration in order to distinguish between

types of interaction?
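
The factor of two can be made explicit with a back-of-the-envelope calculation. The following is a minimal sketch, assuming purely for illustration that contact lasts for as long as two bodies of fixed extent overlap and that velocities stay constant during the encounter. The function name and all numerical values are hypothetical and are not taken from the model.

```python
def contact_duration(overlap_length, v_agent, v_other=0.0):
    """Time during which two bodies overlap while passing each other in 1-D.

    overlap_length -- distance over which contact is registered (assumed fixed)
    v_agent, v_other -- signed, constant velocities (units of space per unit of time)
    """
    relative_speed = abs(v_agent - v_other)
    if relative_speed == 0:
        raise ValueError("bodies with identical velocities never pass each other")
    return overlap_length / relative_speed


# Hypothetical values, chosen only to illustrate the ratio.
OVERLAP = 8.0   # e.g. two bodies of 4 units each (assumption)
SPEED = 0.9     # units of space per unit of time (assumption)

static_contact = contact_duration(OVERLAP, SPEED)          # object at rest
mobile_contact = contact_duration(OVERLAP, SPEED, -SPEED)  # partner moving the opposite way
ratio = static_contact / mobile_contact                    # close to 2 for equal and opposite speeds
```

Under these idealized assumptions the ratio is 2, which is close to the ratio of the durations observed in the trial (116 and 56 steps).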

In order to determine if this is indeed the case we performed some psycho-physical tests

on the model agents. By altering the size of the objects within the 1D environment, it is

possible to systematically vary the length of stimulation encountered by the agents.

Similarly, we can also alter the size of the body of the agents. Thus, if the sensitivity of

the agents relies on this temporal factor (duration of contact), then it should be possible

to alter their behavior accordingly. To explore this hypothesis we conducted two tests:

1) We start the trial from the same initial conditions as before. However, just before

the 3rd interaction with the static object (at 2100 steps) we decrease the size of the

static object to 3 units of space (from the usual 4). In this case the performance of

the agents is drastically altered (score 0.06); both agents fail to disengage from their

static objects and continue to oscillate around them until the end of the trial. As

expected, the decrease in static object size entailed a shorter stimulation during the

3rd contact (101 rather than 116 steps).

2) We start the trial from the same initial conditions as the original setup. However,

just before the 3rd perceptual crossing (at 5500 steps) we increase the size of the

agents to 10 units of space (from the usual 4). In this case the performance of the

agents drops significantly (score 0.59). After making the 3rd contact they drift apart.

As expected, the increase in agent size entailed a longer stimulation during the 3rd

encounter (108 rather than 56 steps).

These tests appear to demonstrate that an agent’s sensitivity to social contingency

largely depends on the duration of the 3rd contact. It therefore seems that we can explain

this sensitivity in terms of two thresholds related to the duration of contact, t1 and t2.

These determine the cut-off points between two distinct behaviors, namely continued

oscillation and linear exploration. More precisely, we expect to find a mechanism

such that: 0 < t1 < ‘continue to oscillate around stimulus’ < t2 < ‘continue to explore the

rest of the environment’. Is it possible to deduce the values for these thresholds from further

psycho-physical tests? We will return to this issue during the more detailed analysis

presented in Section 9.3.
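
The two-threshold hypothesis can be written down as a simple decision rule and checked against the four contact durations reported above. This is only a sketch of the hypothesis, not of the agents' evolved controller. The function name and both threshold values are hypothetical: any upper threshold between 101 and 108 steps is consistent with the four reported durations, and the lower threshold is not constrained by these tests at all.

```python
def predicted_behavior(duration, t1, t2):
    """Behavior predicted by the two-threshold hypothesis (duration in time steps)."""
    if duration <= t1:
        return "undetermined"   # the hypothesis makes no claim below t1
    if duration < t2:
        return "oscillate"      # continue to oscillate around the stimulus
    return "explore"            # continue to explore the rest of the environment


T1_GUESS = 10    # hypothetical lower threshold
T2_GUESS = 105   # hypothetical upper threshold, chosen between 101 and 108

# Durations of the 3rd contact reported for the original trial and the two tests.
observed_durations = {
    "static object, size 4": 116,
    "static object, size 3": 101,
    "other agent, size 4": 56,
    "other agent, size 10": 108,
}

predictions = {
    label: predicted_behavior(duration, T1_GUESS, T2_GUESS)
    for label, duration in observed_durations.items()
}
```

With these values the rule predicts exploration for the contacts of 116 and 108 steps and continued oscillation for those of 101 and 56 steps, in line with the behavior reported for each case.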

9.2.2 Experimental setup 2: Switched receptor fields

It could be argued on the basis of experimental setup 1 that the agents actually employ a

solitary strategy to solve the task, namely by internally distinguishing between two

different lengths of stimulation. To be sure, part of the basis of this distinction, namely

the duration of contact between agents, is co-determined by their velocities. However,

this co-determination does not necessarily require coordination. As long as both agents

are moving at different velocities the possibility of making this distinction effectively

remains the same, and no mutual interaction is necessary to establish this difference in

the first place. Moreover, the experimental setup ensures that interactions between an

agent and the other’s shadow are inherently unstable, thereby removing the shadow as a

possibility for further entrainment. On this view, the distinction between a static object

and the other agent would be largely an individual accomplishment without any need

for the autonomous dynamics of an interaction process 32 .

In order to test this null hypothesis we modified the experimental setup slightly, namely

by switching the inputs to the receptor fields between the two agents. Under this setup

both agents still control the location of their respective receptor fields, but their CTRNN

controllers receive the perturbations due to the other agent’s field. In this manner we

have severely disrupted the individual sensory-motor correlations, but crucially the

possibility for engaging in mutual interactions leading to the establishment of perceptual

crossing has remained unchanged from the original experiment. Consequently, if the

agents in addition to their individual capacities also rely on mutually responsive

interaction in order to distinguish between a static object and each other, they should

still be able to perform the task even if they are incapable of using reliable sensory-motor

correlations at the individual level. The results of a comprehensive test of this

experimental setup are shown in Figure 9-4.

32 Note that this discussion of the model raises difficult questions about how to best define the concept of

‘interaction process’. When do we mark its beginning, when its end? Does the interaction process have to

involve mutual perturbation of internal state or, more abstractly, does it require mutual inter-dependence

of conditions for the sustainment of behaviors? In this thesis we make use of the former, more intuitive

interpretation, but the latter might in the end turn out to be more accurate. After all, it is possible that one

agent’s erratic searching behavior partly constitutes, via the rigid agent-shadow link, the condition for the

other agent to continue interacting with the seemingly responsive shadow object, a behavior which in turn

partly constitutes the condition for the first agent, namely that it remains without stimulation from its

partner and continues its solitary search, which potentially keeps the other further entrained, etc.

Figure 9-4. Graphical representation of fitness scores achieved at each possible combination of starting

positions for agent ‘up’ (x-axis) and agent ‘down’ (y-axis). Note that the axes wrap around due to the 1-D

circular shape of the virtual environment. Fitness scores range from 0.40 to 0.965 with an average score

of 0.87.

As predicted, the average fitness of this modified condition (0.87) is not different from

that of the normal condition (0.87). Moreover, even the fitness distribution of the

comprehensive results of this modified condition looks strikingly similar to that of the

normal condition shown in Figure 9-2. To be sure, some of the salient patterns and

asymmetries have shifted slightly, but these general differences might be expected,

especially considering that we have effectively cross-wired the sensory-motor loops of

the two agents. How are the agents able to cope with the distortion of their sensory-motor

coupling? As a comparison it is helpful to examine again the conditions for the trial

shown in Figure 9-3, but with switched input parameters. A trace of the movement of

the agents is shown in Figure 9-5.

Figure 9-5. Illustration of the behavior of the agents during a representative trial starting from point (100,

500). Same experimental setup as shown in Figure 9-3, except that the input to the receptor fields of the

agents has been exchanged between them. Fitness score: 0.71

During the beginning of the trial the agents encounter similar situations, thereby

providing each other with relatively matching stimulation. At 3000 time steps, however,

agent ‘down’ moves across its static object but, due to the input switching, remains

unperturbed, while agent ‘up’ is stimulated and turns back, does not find anything, and

continues moving rightwards. Finally, the agents encounter each other and engage in

perceptual crossing until the end of the trial. The switched inputs might thus even be

helpful in some circumstances because most of the time the agents do not cross their

respective static objects at the same time. Thus, when an agent turns back after being

perturbed by the other agent who has just passed its own static object, it does not find

anything there and does not get held back any further. Similarly, the other agent will

have remained oblivious to this occurrence as well. Moreover, the only time when there

will be no interference from the swapped receptor fields at all is when the agents engage

in mutual interaction, as this interaction results in identical (matching) receptor

activations.
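
The last point can be illustrated by enumerating the possible receptor states. The sketch below is a deliberately reduced picture of the switched condition: it ignores the controllers, the delay, and all dynamics, and only asks in which receptor states the swap changes what a controller receives. The function and variable names are hypothetical.

```python
from itertools import product


def controller_inputs(field_up, field_down, swapped):
    """Return the receptor signals delivered to (controller 'up', controller 'down').

    field_up, field_down -- 1 if the respective receptor field overlaps an object, else 0
    swapped -- True for the switched condition of experimental setup 2
    """
    if swapped:
        return field_down, field_up
    return field_up, field_down


# Keep the receptor states in which swapping changes what at least one controller receives.
affected_states = [
    (up, down)
    for up, down in product((0, 1), repeat=2)
    if controller_inputs(up, down, True) != controller_inputs(up, down, False)
]
```

Only the one-sided states, in which exactly one receptor field is active, are affected by the swap. When both fields are active at once, as during mutual interaction, each controller receives the same signal as in the normal condition (the same holds trivially when neither field is active).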

In sum, by modifying the original experimental setup in the current manner we have

thus demonstrated that the interaction process not only makes interaction with the

shadow object unstable, thereby removing it as a possibility for further entrainment, but

that it also plays a role in making perceptual crossing a stable possibility. Even without

any consistent sensory-motor correlations as a basis for individual behavior, the agents

essentially negate this lack by means of mutually responsive interactions. In this manner

it is possible for successful perceptual crossing to self-organize in terms of the relative

stabilities of the interaction process. On this basis we can hypothesize that if we

changed the psychological study accordingly, human participants would similarly be

able to continue to accomplish the task successfully.

9.2.3 Experimental setup 3: Conflicting behaviors

It could be argued that the only reason why perceptual crossing emerges under the

modified conditions of experimental setup 2 is because the two agents actively establish

the interaction process. After all, this is precisely what they were originally evolved for,

and the success of their strategy might therefore still be better accounted for by

appealing to their individual behavioral efforts rather than to the self-organizing

dynamics of the interaction process. How can we separate out the contribution of these

two factors?

It is certainly the case that in the original experimental setup of Auvray, Lenay and

Stewart’s psychological study, both the individual interactors and the interaction

process are essentially ‘cooperating’ in the successful completion of the task:

(i) if one individual finds the other’s shadow, then the other will still be looking and the

shadow will move away, thereby preventing further interactions, and (ii) if one

individual finds the receptor field of the other, the other has effectively found that

individual, too, thereby entailing further interactions. It follows that the stability of the

interaction process in the experimental situation and the intentions of the individual

interactors are reciprocally reinforcing. But just how important is the organization of the

interaction process for its own stability? Is its existence as a process largely supported

by the behavior of the individual agents or does it also possess some self-organizing

efficacy of its own?

Fortunately, it is also possible to investigate a ‘competitive’ situation in which the

interaction process self-sustains even despite the individual intentions of the interactors.

Consider, for example, the case of self-perpetuating arguments in which all participants

actually want to stop arguing. Such a case could provide strong support for theories

which propose that social interactions can be characterized by their autonomous

dynamics (cf. De Jaegher & Di Paolo 2007). Note that the notion of a ‘competitive’

situation, as it is used here, refers to the competing goal-directedness of an individual

interactor and the stability of the collective interaction process. Of course, this does not

exclude the possibility that these interactors themselves also have conflicting intentions,

but such inter-individual conflict is not necessary for an individual to be in conflict with

the stability of the interaction process itself.

In order to further investigate the autonomous role of the interaction process within this

particular experimental situation, we therefore need to change the basic setup of Auvray

and colleagues to give rise to this kind of ‘competitive’ situation. The task of detecting

social contingency remains the same as before: the individuals must distinguish between

those interactions that occur with the other’s receptor field, and those that result from

the mobile shadow object, as well as avoid any interaction with the static object.

However, in contrast to the original psychological study, here the agents are required to

stay with their partner’s shadow object, rather than staying with the receptor field of

their actual partner. The task is therefore to detect a certain kind of mobile object that

gives rise to non-contingent interactions, a task that can only be achieved by detecting

and avoiding interactions with contingently responsive mobile objects.

Due to the asymmetry inherent in this setup (i.e. agents face in opposite directions, but

their shadows are displaced in the same direction), it is impossible for both participants

to be interacting with each other‟s shadow at the same time. Therefore, in order to

complete the task it is now necessary for the participants to avoid engaging in inter-individual

interaction with each other. This will not be easy because (i) engaging in

perceptual crossing is still a relatively stable behavior, at least for as long as both

interactors remain convinced that they are interacting with the other’s shadow, and (ii)

crossing with the other’s shadow remains inherently unstable, since that other

participant will keep on looking for the shadow of its partner. In this manner we have

created an experimental setup in which the intentions of the individuals and the

dynamics of the inter-individual interaction process are in direct conflict.
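
The geometric part of this argument, namely that both participants cannot be on each other's shadow at the same time, can be checked by brute force. The sketch below rests on our own simplifying reading of the setup: positions are points on a circular 1-D environment, each shadow sits at a fixed offset in the same direction from its owner, and contact means being within a small tolerance of the target. The circumference, offset, and tolerance values are hypothetical placeholders.

```python
def both_on_shadow(x_up, x_down, offset, circumference, tolerance):
    """True if each agent is within `tolerance` of the other agent's shadow."""
    def circular_distance(a, b):
        d = abs(a - b) % circumference
        return min(d, circumference - d)

    up_on_shadow_of_down = circular_distance(x_up, x_down + offset) <= tolerance
    down_on_shadow_of_up = circular_distance(x_down, x_up + offset) <= tolerance
    return up_on_shadow_of_down and down_on_shadow_of_up


CIRCUMFERENCE = 600   # hypothetical size of the circular environment
OFFSET = 48           # hypothetical agent-shadow displacement
TOLERANCE = 4         # hypothetical contact range

# Only the relative displacement matters, so agent 'down' is fixed at position 0.
simultaneous_contacts = [
    x_up
    for x_up in range(CIRCUMFERENCE)
    if both_on_shadow(x_up, 0, OFFSET, CIRCUMFERENCE, TOLERANCE)
]
```

Under these assumptions the list is empty: one agent would have to be displaced by the offset in one direction and, at the same time, by the same offset in the other direction. This could only happen if twice the offset were close to zero or to the full circumference.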

In order to implement this setup in terms of a simulation model we used the same

parameters as for experimental setup 1, but with the essential difference that the

function to evaluate an agent’s fitness (cf. Equation 9-1) now measures the average

distance to the other agent’s shadow. Since the asymmetry of this setup means that the

evaluation function can now give different values for the two agents (i.e. they cannot be

in contact with each other’s shadow at the same time), we decided to have each agent

controlled by a different CTRNN. In other words, whereas in the previous experiments

we evaluated a solution by having two identical CTRNNs cooperating on the task, here

we are evaluating two solutions by having them compete with each other during the

trials. More specifically, the evolutionary algorithm has been changed so that two

randomly selected parents are removed from the current generation, and tested against

each other for 100 trials of evaluation. The losing solution is discarded. The winning

solution and a mutated copy of its genome are added to the population of the next

generation. This tournament is repeated until there are no more parents in the current

generation, at which point the process is repeated with the newly created population.
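
The tournament scheme just described can be sketched as follows. This is a minimal illustration rather than the code used for the experiments: the genome representation, the `compete` and `mutate` callables, and the handling of an odd-sized population are all assumptions made for the sake of a runnable example, and the toy scoring function stands in for the 100 evaluation trials.

```python
import random


def next_generation(population, compete, mutate, rng, trials=100):
    """One generation of pairwise tournaments.

    compete(a, b, trials) -> (score_a, score_b); mutate(genome, rng) -> new genome.
    """
    parents = list(population)
    rng.shuffle(parents)                 # random pairing of parents
    offspring = []
    while len(parents) >= 2:
        a = parents.pop()                # both parents are removed from the current generation
        b = parents.pop()
        score_a, score_b = compete(a, b, trials)
        winner = a if score_a >= score_b else b   # the losing solution is discarded
        offspring.append(winner)
        offspring.append(mutate(winner, rng))     # winner plus a mutated copy
    offspring.extend(parents)            # assumption: an unpaired parent is carried over unchanged
    return offspring


# Toy stand-ins (hypothetical): the score is simply the sum of the genome.
def toy_compete(a, b, trials):
    return sum(a), sum(b)


def toy_mutate(genome, rng):
    return [gene + rng.gauss(0.0, 0.1) for gene in genome]


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(10)]
new_population = next_generation(population, toy_compete, toy_mutate, rng)
```

Because every tournament contributes exactly two individuals to the next generation, the population size is preserved, and the best individual of a generation always survives unmutated.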

It could be argued that this approach still does not represent proper competition between

the solutions because of the likelihood of genetic convergence of the population (cf.

Froese & Spier 2008). Genetic convergence could make the solutions almost identical,

and therefore effectively turn this situation into a ‘cooperative’ one once again, at least

at the genetic level. Nevertheless, this worry is unnecessary as several attempts to

evolve agents to solve this task under these conditions did not succeed. The target

solution appears to be too unstable to make such convergence possible, and even after

thousands of generations the evolved behavior is nowhere near as successful in terms of

fitness as that evolved for experimental setup 1. These difficulties indicate that there are

limits to the ability of the interaction process to entrain the behavior of the individual

agents so as to sustain appropriate patterns of interaction. Suitable initial conditions

must be present for the emergence of a self-organizing interaction process.

In response to these difficulties of evolving a competitive scenario from scratch we

chose a slightly more advantageous starting point for the evolutionary process, namely

by seeding the populations with the best evolved agent from experimental setup 1. In

this case we still have two contrasting influences on the behavior of the agents, i.e. a

competitive fitness evaluation forcing the agents to disengage, and the entraining

stability of the interaction process in the form of perceptual crossing. The results of

comprehensive tests of the best agent taken from an evolutionary run that lasted almost

5000 generations are shown in Figure 9-6a. The results for the same agent, but with

swapped receptor fields, are shown in Figure 9-6b. Note that for the purposes of these

trials the agent competes against a clone of itself.

Figure 9-6. Graphical representation of fitness scores achieved at each possible combination of starting

positions for agent ‘up’ (x-axis) and agent ‘down’ (y-axis). Note that the axes wrap around due to the 1-D

circular shape of the virtual environment. (a) Normal condition. Fitness scores range from 0.09 to 0.94

with an average of 0.80. (b) Swapped receptor fields. Fitness scores range from 0.06 to 0.94 with an

average of 0.80.

The comprehensive results shown in Figure 9-6 indicate that in terms of fitness scores

the best evolved solution is in many respects qualitatively similar to the solutions found

for the previous experiments, though there is a slight reduction in overall robustness.

For example, there is a noteworthy exception that can be seen as a small black square in

the top left corner of Figure 9-6a, an area where this solution received almost no score

because both agents got stuck on their respective static objects. However, this particular

problematic situation is largely resolved when we modify this setup to the ‘swapped

receptor field’ condition, as is shown in Figure 9-6b. This result is consistent with what

we have learned on the basis of experimental setup 2. Of course, some points of low

fitness remain, but this is to be expected considering the conflicting influences shaping

agent behavior.

How are these two competing factors reflected in the behaviors and mutual interactions

of the agents? We find that the original perceptual crossing evolved for experimental

setup 1 is retained across generations, although the precise strategy of the agents has

adapted slightly to the new constraints. Figure 9-7 shows a sample trial run selected

from Figure 9-6a.

Figure 9-7. Illustration of the behavior of the agents during a representative trial starting from point (100,

500) with a score of 0.72. They first encounter their respective static objects, then continue searching, and

finally locate each other and establish perceptual crossing until the end of the run (16000 time steps). Top:

the position of the agents and objects in the 1-D environment over time. Middle: the node outputs of agent

‘up’ over time. Bottom: the status of the receptor field and the velocity of agent ‘up’ over time. Note that

a change of receptor field status reaches the agent’s controller only after a delay of 25 units of time (250

steps).

It is revealing to compare this trial to the one illustrated in Figure 9-3. The scores of the

two trials are almost identical, but there are some qualitative differences in behavior.

The first thing to notice is that in this modified setup the agents make much less contact

with the objects in their environment. For example, they only make contact with their

respective static objects once (rather than three times) before moving on. Similarly,

perceptual crossing is established with only two contacts (rather than periods of three).

These shorter periods of interaction might have evolved to make it easier for the agents

to break away from the stable, but contingent interaction before becoming too entrained.

The second thing to notice is that the perceptual crossing is now characterized by a

certain spatial drift, whereas before it was localized to one area of the environment. This

is due to the asymmetry introduced by the competitive fitness function (both agents

have their shadows displaced in the same direction). For example, after engaging in

basic perceptual crossing between time steps 4000 and 8000, agent ‘down’ traces

along the shadow object of agent ‘up’ until about 11000 time steps. At this point the

other agent, not having made any contact with agent ‘down’ for a while, starts to speed

up its rightward exploratory movement, and the agents are eventually forced back into

establishing perceptual crossing.

This inherent trade-off between (i) staying in contact with the other’s shadow and (ii)

the self-organizing stability of the interaction process is especially noticeable when

agent ‘up’ starts the trial located between agent ‘down’ and its shadow (trial not

depicted). In this case agent ‘up’ moves rightwards, makes contact with the other’s

shadow object, and traces its movement leftwards for a short period. However, since

agent ‘down’ has not been perturbed during this activity it keeps increasing its leftward

velocity until it eventually breaks away. In other words, lack of interaction makes this

an unstable strategy and prevents the self-organizing dynamics of the interaction

process from emerging. However, when the agents do make contact they generally engage in

a pattern of interaction similar to that shown in Figure 9-7, where short bursts of

perceptual crossing are interspersed with periods of distancing. Here we thus have a

situation in which the interaction process sustains itself, even though the individual

agents have been specifically evolved to break this mutual interaction in favor of

localizing the other’s shadow.

Of course, individual behavior still plays an enabling role in this situation. It is the

agents who make contact with each other, and thus allow the interaction process to take

hold in the first place. And it is the agents who at times manage to break away from this

entrainment as well. But nevertheless it is the interaction process which constrains their

behavior most of the time, even despite their individual goals. On this basis we can

hypothesize that if we changed the psychological study accordingly, human participants

would encounter great difficulty in accomplishing the task successfully. To some extent

this hypothesis is already supported by the empirical data presented by Auvray and

colleagues who found in their original study that the probability of a participant’s click

after stimulation by either the other’s receptor field or his shadow object was not

significantly different (cf. Auvray, et al. 2009, p. 39).

9.3 Dynamical analysis

The results of these simulated experimental setups and the original empirical data

presented by Auvray and colleagues undermine any attempt to attribute sensitivity to

social contingency to the individual agents under these conditions. It is the collective

interaction process, itself enabled by simple exploratory and discriminatory individual

behavior, which in turn enables and constrains appropriate ‘social’ behavior on the

individual level so as to self-sustain its entraining presence. We know, for example, that

human participants of the study cannot tell from their perspective alone whether they

are actually interacting with another responsive partner or merely with an unresponsive

copy. Yet their individual behavior is clearly influenced by the presence of social

contingency as such, since on average they spend more time interacting in contingent

situations. How can we explain this interaction between the individual and collective

levels of dynamics?

In order to get a better understanding of how the individual and interaction levels relate

to each other, let us for a moment entertain the traditional perspective of methodological

individualism, i.e. we adopt the common perspective that the individual agent is the

only correct unit of analysis. The two behavioral tests described in Section 9.2.1 support

the idea that an agent’s discriminatory ability depends on the duration of the 3rd contact.

If this contact is too long it is a static object; if it is short enough then it is the other or

the other’s shadow. We have also seen that this ambiguity in the latter half of the

discrimination is not a problem. Note, however, that already here we have to appeal to

mutual responsiveness in order to discount the shadow as a serious possibility, since it

only affords unstable interactions. Nevertheless, it still appears possible that the agent’s

controller is implementing a simple thresholding circuit that determines which of the

two behaviors, i.e. oscillation or exploration, should become active depending on the

length of stimulation. A hypothetical circuit of this traditional internalist explanation is

illustrated in Figure 9-8.

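[Figure 9-8: circuit diagram. Recoverable labels: Input I; delays D1 and D2; ‘body’; ‘brain’; integrator ∫ I dt (window K); thresholds T1 and T2; gains G1 and G2; left motor; right motor; summing junction (-, +); Velocity v; ‘world’.]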

Figure 9-8. Diagram of a hypothetical circuit that could explain the discriminatory behavior of the agents.

D1 and D2 are delays (total delay: 25 units of time), ∫ I dt is the integral of the input signal I for K units of

time, T1 and T2 are thresholds, and G1 and G2 are output gains.

We know from the implementation of the agents that their controller is separated from

their environment by a delay, which can be represented as part of their body (D1 and

D2). We also know that the velocity v of the agents is the difference between left and

right motor activation. In addition, we have established that the crucial factor is duration

of stimulation. Accordingly, we posit an integrator element, which takes the delayed

input signal I and outputs the integral, calculated over K units of time. This integral is

then passed through a threshold unit which becomes active (i.e. it outputs 1, rather than

0) when the integral falls between two thresholds T1 and T2. The initial behavioral tests

described in Section 9.2.1 allow us to set T2 at about 105 time steps. The output of the

threshold unit is then multiplied by the left motor gain G1 and subtracted from the

agent’s overall velocity. At the same time we know that when the agent receives no

input (I = 0) it continues to move rightwards. Accordingly, we posit another connection

from the delayed input signal that passes through an inverter, gets multiplied by the

right motor gain G2, and is added to the agent’s overall velocity.
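
The circuit described in this paragraph can be transcribed almost literally. The following sketch does so under several assumptions that are ours rather than part of the hypothesis: time is discretized into steps, the two delays D1 and D2 are lumped into a single delay on the input side, the integral is a running sum over the last K steps, and all parameter values other than the total delay (250 steps) and the approximate value of T2 (105 steps) are hypothetical.

```python
from collections import deque


class ThresholdCircuit:
    """Discrete-time transcription of the hypothetical thresholding circuit."""

    def __init__(self, delay_steps, window_steps, t1, t2, g1, g2):
        self.delay_line = deque([0] * delay_steps, maxlen=delay_steps)
        self.window = deque([0] * window_steps, maxlen=window_steps)
        self.t1, self.t2 = t1, t2
        self.g1, self.g2 = g1, g2

    def step(self, receptor_input):
        """Advance one time step and return the resulting velocity."""
        delayed = self.delay_line[0]              # input from `delay_steps` steps ago
        self.delay_line.append(receptor_input)    # the oldest entry drops out automatically
        self.window.append(delayed)
        integral = sum(self.window)               # running sum over the last K steps
        active = 1 if self.t1 < integral < self.t2 else 0
        # The inverted input drives the right motor; the threshold unit drives the left motor.
        return self.g2 * (1 - delayed) - self.g1 * active


# Hypothetical parameters: K = 200 steps, T1 = 20, G1 = 1.0, G2 = 0.9.
circuit = ThresholdCircuit(delay_steps=250, window_steps=200, t1=20, t2=105, g1=1.0, g2=0.9)
stimulus = [1] * 56 + [0] * 444                   # a single contact lasting 56 steps
velocities = [circuit.step(s) for s in stimulus]
```

Without input the sketch moves rightwards at velocity G2. Once the delayed stimulation has accumulated to a value between T1 and T2, the threshold unit switches on and G1 is subtracted. Note that the sketch leaves open at which moment the integral should be read out, a point that the verbal description of the circuit does not settle either.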

Is it possible to determine the precise values for these parameters, especially T1 and T2?

Let us consider a trial from the experimental setup 1 in which the agents did not do very

well in order to get an idea of the range of these values. We can hypothesize, for

example, that the low fitness found in Region 6 of Figure 9-2 can be explained in terms

of a long contact time between the agents. To test this hypothesis we chose a

representative trial starting from (557, 32). In this case the agents spontaneously

disengage their interaction (after 2000 time steps) after having found each other in the

very beginning of the trial. They then continue to explore the environment, finally re-establish

perceptual crossing (after 10000 time steps), and continue to interact

appropriately until the end of the trial.

This initial coordination failure highlights the importance of co-regulation for

perceptual crossing to be established. It is not enough for the agents to simply encounter

each other, but they also have to encounter each other with the right kind of velocity. It

is also interesting to note that the decisive contact was the 4th encounter in this case

rather than the usual 3rd. Might this have something to do with the cause of the

spontaneous failure? Initially this idea appears to be rejected by a quick test of

decreasing the size of both agents to 3 units of space (instead of the normal 4) just

before the 4th encounter, which results in a performance with an almost maximum score.

It is a matter of stimulation length after all, but the number of contacts appears to be less

important. Moreover, this modified contact lasts 55 steps, which is short even for the

duration of normal contact with another agent. Might this be an indication for the value

of T1? However, there is a serious problem for our hypothesis: the original final

encounter between the two agents only lasted 64 steps as well. This value is still within

the range of contact durations with the other agent that we determined in the previous

series of tests (between T1 and T2). So why do the agents disengage?

At this point we have to reject the possibility that we can explain the agent’s sensitivity

to social contingency in terms of an objective cut-off point of contact duration, and turn

to a more dynamical explanation. As a first step in this direction, it is important to get

an idea of the general shape of the dynamical landscape of the CTRNN shown in Figure

9-1. When the CTRNN is decoupled from the 1-D environment it is characterized by

two fixed point attractors, depending on whether the input parameter I is set to 0 or 1

(see Figure 9-9). It turns out that the velocity of the agents is strongly coupled to the

value of this parameter. The velocity of the agent at attractor 0, when input I = 0, is 0.86

units of space per unit of time, whereas for attractor 1, when I = 1, the velocity is -0.96.

This is indeed the basis for a tight sensory-motor coupling: the value of the input

parameter is largely determined by the movement of the agent, and the movement of the

agent is largely determined by the input parameter (for a similar result on the basis of a

related task, see Froese & Di Paolo 2008a).

Figure 9-9. The attractor landscape for the CTRNN shown in Figure 9-1. For 50 times the node

activations were initialized to random activation values drawn from the trial shown in Figure 9-3, and the

network was allowed to settle for 8000 time steps. The input was either (a) set to 0, which revealed a

fixed point attractor at (0.04, 0.91, 0.02), or (b) set to 1, which revealed a fixed point attractor at (0.98,

0.02, 0.88). The attractors are represented by a '*'. Input-driven switching between the two attractors

results in a hysteresis of motor outputs.
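
The settling procedure described in the caption of Figure 9-9 can be illustrated with a minimal sketch. It assumes the standard CTRNN state equation integrated with forward Euler steps. All parameter values (weights, biases, time constants, input gains, step size) are hypothetical and were chosen only so that the example converges; they are not the evolved parameters of the network in Figure 9-1. The initial states are also drawn uniformly at random here, rather than from a recorded trial.

```python
import math
import random

# Hypothetical 3-node parameters (NOT the evolved values of the thesis model).
# weights[j][i] is the connection from node j to node i.
EXAMPLE_PARAMS = {
    "weights": [[1.0, -1.5, 0.5], [-1.5, 1.0, -0.5], [0.5, -0.5, 1.0]],
    "biases": [0.0, 0.0, 0.0],
    "taus": [1.0, 2.0, 1.5],
    "in_gains": [4.0, -4.0, 2.0],
}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(y, inp, weights, biases, taus, in_gains, dt=0.1):
    """One forward-Euler step of
    tau_i * dy_i/dt = -y_i + sum_j w_ji * sigmoid(y_j + b_j) + g_i * I."""
    n = len(y)
    out = [sigmoid(y[j] + biases[j]) for j in range(n)]
    new_y = []
    for i in range(n):
        total = sum(weights[j][i] * out[j] for j in range(n))
        dy = (-y[i] + total + in_gains[i] * inp) / taus[i]
        new_y.append(y[i] + dt * dy)
    return new_y


def settle_outputs(inp, params, steps=8000, rng=None):
    """Start from a random state, hold the input fixed, and return the
    sigmoided node outputs after `steps` integration steps."""
    rng = rng or random.Random(0)
    y = [rng.uniform(-5.0, 5.0) for _ in params["biases"]]
    for _ in range(steps):
        y = ctrnn_step(y, inp, **params)
    return [sigmoid(v + b) for v, b in zip(y, params["biases"])]


def attractor_estimates(inp, params, runs=50, steps=8000, seed=0):
    """Repeat the settling procedure from `runs` random initial states."""
    rng = random.Random(seed)
    return [settle_outputs(inp, params, steps, rng) for _ in range(runs)]
```

Calling `attractor_estimates(0.0, EXAMPLE_PARAMS)` and `attractor_estimates(1.0, EXAMPLE_PARAMS)` gives one cloud of end points per input value. If all runs for a given input end at the same point, that point is a candidate fixed point attractor for the decoupled network under that input.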

It is also noteworthy that the switching behavior between the two attractors is

characterized by a form of hysteresis (see Figure 9-9 and Figure 9-10). We will explain

this behavior by means of a binary approximation of the sigmoided outputs (left motor

node, right motor node, node 3). The only way for the system to settle at attractor 0 (0, 1,

0), for instance, is to approach it from (0, 0, 0). Thus, if input I = 0 and the system

happens to be at (1, 1, 0), then it first needs to pass through (0, 1, 0) and (0, 0, 0) before

finally reaching (0, 1, 0). Similarly, for the system to settle at attractor 1, if input I = 1

and the system currently happens to be at attractor 0 (0, 1, 0), then it first switches to (1,

1, 0) before finally shifting toward attractor 1 at (1, 0, 1).

It is also important to note that there are significant differences in the trajectory speeds

of the different regions of activation space, something which cannot be seen in Figure

9-9 (but cf. Figure 9-10). In the case of attractor 0 the trajectories going from (1, 1, 0) to

(0, 1, 0) are initially relatively slow. But when the output of the right motor node drops

into the region of somewhat below 0.5, the system almost instantaneously switches off

the output of the left motor node. Then the trajectories continue to slowly make their

way from somewhere near (0, 0, 0) to the attractor 0 at (0, 1, 0). In the case of attractor 1,

if the system happens to be near attractor 0 at (0, 1, 0), then the system almost

immediately switches on the left motor by going into state (1, 1, 0), and only then

slowly begins to turn off the right motor as it approaches the attractor 1 at (1, 0, 1). This

effectively means that the time taken to switch between output velocities depends not

only on the current state of the input parameter, but is also determined by the current

state of the system. In other words, the behavior of the agents is not purely reactive to

the input parameter but, due to the hysteresis of the motor outputs with different

trajectory speeds, crucially depends on the agent's history of interactions. The influence

of internal state is completely missed out in the hypothetical circuit diagram shown in

Figure 9-8.

Moreover, this historical dimension in terms of the influence of previous interactions

brings us to the next step of our dynamical analysis, since up to now we have only

considered the CTRNN in isolation. How does the system behave when it is coupled to

the 1-D environment and interacts with other objects including the other agent? The

change in state of the system during the first 8000 time steps of the trial starting from

position (100, 500), as was shown in Figure 9-3, is of interest here. How does the

dynamical analysis help us to better understand what is going on?

Let us consider the hysteresis relationship between the left and right motors during this

trial, as shown in Figure 9-10. Before engaging in a new interaction in the environment,

the system is usually near attractor 0 (top-left) and thus with the output of the right motor

on full power. When the input parameter I changes from 0 to 1 the system jumps to

Region 1, where it starts to slowly pass down through Region 2. Here both motors are

competing somewhat and thus slow down the agent on the way back toward the source

of stimulation. This typically results in another contact that is more extended, and which

pushes the system far down into Region 3. From this region in state space the activation

of the left motor node decays rapidly toward 0, thereby returning the agent to its initial

rightwards motion, and slowly moving its state upwards to attractor 0.

Figure 9-10. Change of state between the left and right motors during the trial that was shown in Figure

9-3. Attractor 0 and attractor 1 are marked by '*' in the top-left and bottom-right, respectively. Arrows

indicate direction and relative velocity of trajectories (i.e. long arrows = fast trajectories, short arrows =

slow trajectories). For an explanation of Regions 1-3 in the state space, see the description in the text.

While this description of the hysteresis between the two motor nodes explains the

observed oscillatory behavior of the agent when it passes objects in its environment, it

does not yet explain how the agent distinguishes between the static object and the other

agent. We have already determined that the third contact is crucial for this decision

process. How can we explain this dynamically?

The third contact happens after the agent has made initial contact, turned around for

another contact, and is now on its way to return to its original rightward velocity. In

other words, the system is still tracing its way back up toward attractor 0 when it gets

perturbed again, and this pushes it directly across into Region 2 (rather than Region 1,

which happens after it is perturbed for the first time). The duration of stimulation during

this encounter determines how far down in the state space from Region 2 to Region 3

the system will end up. The more stimulation, the further toward Region 3, and the more

quickly the activation of the left motor node will decay, rapidly shifting the state back

leftwards. In this case the system will quickly resume its rightward velocity. In other

words, a lengthy stimulation results in a significantly quicker return to rightward

velocity, which prevents another contact with the object from occurring, and the agent

therefore moves away. Otherwise, if the third stimulation is short enough, for example

due to the responsiveness of the other agent, the system will spend some time slowly

moving down Region 2, before eventually reaching Region 3 and then switching

quickly as before. This difference in time needed for the left motor to become

deactivated, which is provided by the responsive counter-movement of the other agent,

is essentially at the basis of the agent's ability to distinguish between its static object

and that other agent.

In summary, contrary to the methodologically individualistic perspective that we briefly

entertained at the start of this dynamical analysis, we found that the discriminatory

ability exhibited by the agents only emerges during interaction. The processes that drive

the necessary internal state changes via appropriate input-switching are external to the

agent, and in this case they are partly constituted by the responsive behavior of the other

agent. An agent in an empty 1-D environment would be forever doomed to linear

movement, lacking the ability to internally switch between the two attractor landscapes.

Moreover, the hypothetical circuit of the agent's internal operations turned out to be

wholly inadequate. A more detailed dynamical analysis failed to find internal threshold

mechanisms, but instead revealed the important role of different temporal scales

distributed across the state-space, as well as the externalization of processes necessary

for the agent's discriminatory ability. These external processes did not perturb the

system along trajectories on a fixed state-space, but caused the entire state-space itself

to switch between different attractor landscapes. The behavior of the agents was thus

largely determined by dynamical transients rather than fixed attractors.

9.4 Discussion

The modeling results presented in this chapter have provided support for the idea that

the organization of an inter-individual interaction process can enable and constrain

individual behavior in ways that are beneficial for problem-solving. With experimental

setup 1 we replicated the findings of previous studies (Auvray, et al. 2009; Di Paolo, et

al. 2008), namely that the interaction process can enable the agents to complete a task

that appears impossible from an individual's perspective. The emergence of ongoing

perceptual crossing between two interacting agents gives rise to a form of multi-agent

interaction that depends on the ongoing interaction process, and at the same time also

makes it more likely for that kind of mutual interaction to persist. This reciprocal

dependency between individual agent behavior and overall interaction dynamics in this

modeling experiment is a paradigmatic example of the constitutive autonomy of the

interaction process (De Jaegher & Froese 2009).

The fact that the results of Auvray and colleagues can be replicated in a simulation

model involving relatively simple interacting dynamical systems indicates that such

autonomous interaction processes (and their enabling/constraining effects) might be

much more pervasive than at first assumed. Thus, while De Jaegher and Di Paolo (2007)

illustrate their enactive approach to social cognition with examples drawn from human

interactions, these models make a plausible case that more basic forms of life can give

rise to autonomous interaction processes, too. Indeed, it turns out that it is not even

necessary for an agent involved in such an interaction to intend to interact with another

agent 33 . Interactions in a multi-agent system are sufficient to effectively organize

individual behavior into joint actions that exceed the capacities of each agent alone,

even without the agents realizing that this is the case. As such, this model provides

concrete support for the claim that the interaction process has the capacity to expand an

agent's behavioral domain, even in the most minimal cases of multi-agent interaction.

We have shown how the interaction process is constitutive of the agents' successful

strategy, an essential contribution that remains hidden when focusing on the behavior of

an individual alone. If this constitutive impact is relevant for even such minimal forms

of interaction, we can begin to wonder: How much of abstract reasoning, the traditional

hallmark of human cognition, is based on our being embodied and embedded in the

complex multi-agent systems that define our socio-cultural context? This model can

thus provide us with a first sense of how sociality can play an important explanatory

role in defending the life-mind continuity thesis.

In experimental setup 2, we disrupted the sensory-motor loops of the modeled agents in

such a way that they could no longer properly regulate their individual behaviors.

However, engaging in reciprocal perceptual crossing remained a possibility within this

modified setup. The results show that the agents managed to complete the task in spite

of this sensory-motor disruption. It turns out that their remaining interactional capacity

was sufficient for the emergence of the relevant interaction dynamics, which then

effectively organized the impaired individual behaviors appropriately. This is an

indication that, given the right circumstances, an interaction process can enable and

organize the capacities of the interactors in such a way that individual impairment is

overcome in a collective manner, even without the need for external control. Here we

might have the beginning of an explanation of why human subjects with impaired

sensory-motor capacities can perform normally under social conditions (cf. Chapter 6,

pp. 104-111).

33 Moreover, in the case of these model agents, they do not even fulfill the requirement for constitutive

autonomy (since their identity is externally defined), and thus their interaction does not strictly satisfy the

definition of multi-agent interaction developed in Chapter 4. On the other hand, could this be a model for

the emergence of an autonomous entity out of non-autonomous elements (cf. Froese & Di Paolo 2008b)?

In experimental setup 3 we modified the task of the agents in such a way that their

individual behaviors and the overall interaction dynamics are in conflict, namely by

requiring the agents to interact with the other's shadow (an inherently unstable situation

in this setup). The results for this experiment give further support to De Jaegher and Di

Paolo's (2007) claim that, under some circumstances, an interaction process can

constrain the behaviors of the interacting agents such that it continues to persist even

despite the efforts of the individual interactors. In the simulation model, the agents

'struggle' to stay close to the other's shadow object, and nevertheless continually fall

back into the more stable interaction pattern of mutual perceptual crossing. This

experiment indicates that we should be careful when assessing responsibility for an

individual's behavior. The outcome of our actions can be unconsciously constrained in

undesirable directions by certain processes existing in our social context, thus leading to

conflict despite our intentions to behave otherwise.

Lastly, a lesson to be learnt from the dynamical analysis presented in Section 9.3 is that

we should be wary of positing hypothetical cybernetic circuits (box diagrams) that could

explain observed behavior. As Hurley writes in relation to her own 'shared circuits

model' (SCM) of cognition: “While SCM is described cybernetically, dynamic systems

theory could represent interactions of its implementing neural processes and embodied

activity over time as evolution of a phase space, and investigate its attractor structure”

(Hurley 2008, p. 20). In this simple example we determined that the system was

sensitive to the duration of a stimulus, posited a comparator mechanism on the

operational level, and finally had to concede that no such mechanism, as a reified unit of

operation, exists within the actual system. The observed sensitivity is an emergent

outcome of the shape of the state space of the system and the temporality of the

environment with which it is coupled. It is the interaction process between the agents

which constrains their behavioral response so as to sustain this interaction. The duration

of stimulation is not an independent feature of the agent's environment, but something

which is co-determined through the velocities of both agents and their manner of

interacting. Thus, at best such models might help us to make the system's operations

more intelligible, but there is the danger that we are merely providing a re-description of

the behavior which emerges from the system's interactions with its environment (a

danger we have already alluded to in terms of the ER models presented in Chapter 7, pp.

114-120). It would be a category mistake to reify components of such a re-description

as real entities existing at the sub-personal level.

9.5 Summary

We replicated a recent simulation model of a minimalist experiment in perceptual

crossing and confirmed the results with significantly simpler artificial agents. A series

of psycho-physical tests of their behavior informed a hypothetical circuit model of their

internal operation. However, a detailed study of the actual internal dynamics reveals this

circuit model to be unfounded, thereby offering a tale of caution for those hypothesizing

about sub-personal processes in terms of behavioral observations (e.g. additional neural

mechanisms to account for Ian's gesturing, cf. Chapter 6, pp. 104-111). In particular, it

has been shown that the appropriate behavior of the agents largely emerges out of the

interaction process itself rather than being an individual achievement alone.

We also extended the original simulation model in two novel directions in order to

further test the extent to which perceptual crossing between agents can self-organize in

a robust manner. These modeling results suggest new hypotheses that can become the

basis for further psychological experiments. This chapter thereby has contributed to the

ongoing efforts to establish a mutually informative dialogue between psychology and

evolutionary robotics (cf. Rohde 2008), especially in order to investigate the dynamics

of social interaction.

The simulation experiments presented in this chapter lend support to the enactive

approach to social cognition. The results show that, under some circumstances, an

interaction process can take on a self-organizing identity of its own, and that such a

process can effectively enable and constrain the behavioral repertoire of the individuals

involved in the interaction. More precisely, the experiments indicate (i) that some

interaction processes can beneficially extend the behavioral domain of individuals in

novel directions without requiring any form of external supervision, and conversely (ii)

that some interaction processes can organize behavioral repertoires in ways that are in

conflict with an individual's aims. More research is needed in order to better understand

the circumstances which lead to one or the other situation. Future work might eventually

help us to better structure our social environment such that beneficial situations are

more likely to emerge spontaneously.

10 Investigating social interaction

The original experimental setup by Auvray, et al. (2009) presented participants with a

task that is largely epistemic in nature, i.e. click when you think that you have located

the other participant. Under the circumstances this kind of task invites an individual

strategy based on detached reflection and action. The surprising result is that the

organization of the setup made it possible for the participants to solve the task with

which they were individually faced in a collective manner, though they were themselves

unaware of the collective nature of their success. The autonomy of the interaction

process itself ensured that their individual strategies were effectively complementary in

their impact. The modeling experiments presented in the previous chapter presented

further evidence for the robustness of this effect.

But are these modeling experiments of the previous chapter also investigations into

social interaction as we have defined it in Chapter 4 (cf. pp. 64-71)? It appears that what

we have been investigating so far is the kind of interaction we have defined as a multi-agent

interaction (cf. Chapter 4, pp. 58-64). The existence of an autonomous interaction

process between two or more agents is not enough. What is missing is a co-regulated

act. In other words, a social act requires that a participant regulates their sensory-motor

coupling such that the behavior is completed by the regulation of at least one other

agent. Only with this particular kind of co-regulation of interaction does an agent's

behavior qualify as properly social, rather than being essentially a solitary behavior that

sometimes just happens to involve another individual. Accordingly, the experimental

setups we have investigated in the previous chapter have given us insights into the

dynamics of the interaction process and its power to organize the behavior of individual

agents even without the presence of social interaction. This is an achievement in its own

right, as well as an important step toward showing that a consideration of sociality

enables us to address the 'cognitive gap' of the life-mind continuity thesis from the

bottom-up.

In this chapter we will continue along this path by using an evolutionary robotics (ER)

approach to fine-tune the original experimental design so as to determine the minimal

conditions under which it becomes possible to study social interaction within a

perceptual crossing setup. First, we will present two modeling experiments that

motivate changes to the original setup by further marginalizing the role of individual-based

strategies, and then we report on a modeling experiment which motivates a

change to the task given to the participants in order to encourage social interaction. The

chapter concludes with a novel hypothesis about the role of social interaction in

detecting social contingency that is open to empirical verification.

10.1 Experimental setup 4: Infinitely small objects

It has been demonstrated in Chapter 9 that the simulated agents make use of the

duration of contact with objects in order to discriminate interactions with a static object

(always same length of stimulation for same velocity) and the other agent (potentially

shorter or longer stimulation, depending on whether the other passes by in in-phase or

anti-phase movement). This is a reliable basis of distinction because the other agent is

always moving. It is unlikely, however, that this is the main strategy employed by

humans in the original psychological study.

To be sure, during the training phase the participants were asked to interact with a four-pixel-wide

object in three conditions. The target object was either (i) static, (ii) moving

at a constant speed of 15 units/second, or (iii) moving at a constant speed of 30

units/second, and each of these one-minute training phases was announced as such. It

could thus be possible that the participants learned the correlation between contact

duration and whether an object is static or moving. In practice, however, the difference

in duration is small enough such that it is unlikely to be the main strategy of the

participants, though there is some evidence that shorter contact duration made a

difference, leading to 31.3% of clicking response (cf. event E6; Auvray, et al. 2009, p. 40). Still, we

hypothesize that the successful behavior of the participants could be based on different

types of interactions afforded by the static object and the other active participant, rather

than their differing durations of isolated contacts.

We know from the experiments in Chapter 9 that even when a difference in duration of

contact was detectable, the evolved strategy still depended constitutively on some co-regulatory

activity of the agents, even though they were greatly aided by an external

factor: body and object size. The question we want to address is: can we use ER to

investigate the kinds of strategies that are available when such a duration-based strategy

is excluded from the experimental design? One way to approach this is to make all

objects (i.e. agents, shadow objects, and static objects) within the virtual environment

infinitely small. This can be done by simply checking whether the sign of the difference

of the locations (of the agent and some target object) has changed compared to the

previous time step. If the sign has changed, then we activate the agent's receptor field.

Since in this case all objects afford an equal duration of contact (i.e. 1 time step) no

matter the velocity, it is no longer possible for the agents to trivially rely on the fact that

other moving objects entail a shorter (or longer) duration of contact. Can we use ER to

generate a strategy that enables the agents to successfully locate each other even under

this more ambiguous situation?
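
The sign-change test for infinitely small objects described above can be sketched as follows. This is only an illustration under stated assumptions: the world size of 600 units is taken from the text, but the function names, the use of a signed shortest offset on the ring, and the guard against the sign flip that also occurs on the opposite side of the circular world are additions made for this example and are not taken from the original implementation.

```python
WORLD_SIZE = 600.0  # circumference of the circular 1-D world, as stated in the text


def signed_offset(a, b, world=WORLD_SIZE):
    """Signed shortest offset from position a to position b on the ring,
    in the interval [-world/2, world/2)."""
    return (b - a + world / 2.0) % world - world / 2.0


def crossed(prev_agent, prev_obj, agent, obj, world=WORLD_SIZE):
    """Return True if agent and object passed each other during the last step.

    The contact is registered when the sign of the positional difference has
    changed compared to the previous time step. The `near` guard (an assumption
    of this sketch) ignores the spurious sign change that occurs when the two
    are half a world apart.
    """
    before = signed_offset(prev_agent, prev_obj, world)
    after = signed_offset(agent, obj, world)
    near = abs(before) < world / 4.0 and abs(after) < world / 4.0
    return near and ((before < 0.0) != (after < 0.0))
```

Because a crossing is registered for exactly one time step regardless of the relative velocity, every object affords the same duration of contact, which is what removes the duration-based cue.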

We successfully evolved a 6-node CTRNN to cope with this modified experimental

setup. To make the solutions more evolvable it was necessary to allow input gains of large

magnitude (range [-1000, 1000]) so as to compensate for the minimal

period of stimulation; reducing the sensory delay to 5 units of time was

also helpful. As in the previous modeling experiments, the solutions were evaluated in

terms of how close the two agents were to each other on average during a trial. To test

the robustness of this solution we performed a comprehensive set of test trials. An

overview of this solution's fitness is depicted in Figure 10-1. The average score is not

significantly different from that of experimental setup 1 (cf. Chapter 9, pp. 148-157).

Figure 10-1. Graphical representation of fitness scores achieved at each possible combination of starting

positions for agent 'up' (x-axis) and agent 'down' (y-axis). Note that the axes wrap around due to the 1-D

circular shape of the virtual environment. Fitness scores range from 0.62 to 0.94 with an average of 0.84.
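
The evaluation criterion used here, namely how close the two agents are to each other on average during a trial, can be illustrated with a minimal sketch. The exact normalization used in the model is not given in this section. The version below assumes that the score is one minus the mean ring distance divided by the largest possible distance (half the 600-unit world), so that it lies between 0 and 1 like the scores reported in Figure 10-1. The function names are hypothetical.

```python
def ring_distance(a, b, world=600.0):
    """Shortest distance between two positions on a circular 1-D world."""
    d = abs(a - b) % world
    return min(d, world - d)


def proximity_score(pos_up, pos_down, world=600.0):
    """Mean closeness of two position traces sampled at the same time steps.

    Returns 1.0 if the agents always coincide and 0.0 if they are always half
    a world apart. This normalization is an assumption of the sketch.
    """
    max_d = world / 2.0
    closeness = [1.0 - ring_distance(a, b, world) / max_d
                 for a, b in zip(pos_up, pos_down)]
    return sum(closeness) / len(closeness)
```

For example, two agents that stay 20 units apart across the wrap-around point (positions 590 and 10) would score 1 - 20/300 under this assumed normalization.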

How does the evolved solution manage to consistently solve the task under these

modified conditions? It is helpful to describe the strategy in terms of a representative

trial (shown in Figure 10-2). Unfortunately, it turns out that the strategy of the agents is

based on the close proximity of the two shadow objects. All the agents have to do is

distinguish between one stimulation and two consecutive stimulations. This is a robust

individual-based strategy to locate the other since: (i) passing the static object only

causes one activation of the receptor, and (ii) passing the other agent with its attached

shadow results in two activations. In other words, the evolutionary process has found

another solution that essentially relies on a factor that is external to the interaction

process, namely the spatial relationship between the agents and their shadows.

Figure 10-2. Illustration of the behavior of the agents during a representative trial starting from point

(246, 436) with a score of 0.79. They first encounter their respective static objects, then continue

searching, and finally locate each other and establish perceptual crossing until the end of the run (16000

time steps). Top: the position of the agents and objects over time. Middle: the status of the receptor field

and the velocity of agent 'up' over time. Bottom: the status of the receptor field and the velocity of agent

'down' over time. Note that a change of receptor field status reaches the agent's controller only after a

delay of 5 units of time (50 time steps).
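
The sensory delay mentioned in the caption (5 units of time, i.e. 50 time steps) can be illustrated with a simple delay line. This is a minimal sketch: the class name and interface are hypothetical, and the original model may implement the delay differently.

```python
from collections import deque


class DelayLine:
    """Fixed-length delay: a value pushed at step t is returned at step t + delay_steps."""

    def __init__(self, delay_steps, initial=0.0):
        # Pre-fill the buffer so that the first outputs equal `initial`.
        self._buf = deque([initial] * delay_steps)

    def push(self, value):
        """Store the current receptor status and return the delayed one."""
        self._buf.append(value)
        return self._buf.popleft()
```

At every time step the current receptor status would be passed to `push`, and the returned value would be fed to the controller as its input.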

Ironically, this behavioral strategy is robust because the shadow, which was meant to

introduce an essential ambiguity into the experiment, has been appropriated to

disambiguate the target from the static object. Is this a strategy that would be used by

the human participants of the original study? Participants were indeed told about the

experimental setup, including the three types of objects that they could encounter, but

“the precise relation of the mobile lure yoked to the avatar was not explained” (Auvray,

et al. 2009, p. 38). Nevertheless, a large percentage of responses was preceded by a

double stimulation (event E2, 32.3%, ibid. p. 40), indicating that the shadow-link might

have played a role in the positive empirical results.

10.2 Experimental setup 5: Maximally distant shadows

What kind of strategies would be available if participants cannot take advantage of the

external agent-shadow relationship? Of course, we do not want to completely sever the

link between the movements of the agents and their shadow objects, since this is an

essential aspect of the experimental setup (socially contingent vs. active but non-contingent

interactions). Instead, we make the link between them maximally distant

(150 units) 34 . The rest of the setup remains the same as in the previous experiment. In

this manner we want to test whether the evolutionary algorithm can come up with

strategies that depend on the co-regulated activity of the agents alone. We evolved 6-

node CTRNNs for several thousand generations and then chose the fittest solutions to

run test trials. The outcome of a typical trial is shown in Figure 10-3.

34 Strictly speaking, in a 600 unit-wide circular world, being apart 300 units would be maximally distant.

However, in this case there is another stable perceptual crossing situation, in which the agents can interact

at a distance by stimulating each other with their shadow objects.

Figure 10-3. Illustration of the behavior of the agents during a representative trial starting from point

(314, 411) with a score of 0.56. See caption of Figure 10-2 and text for a more detailed description of the

graphs. Note the same patterns of receptor stimulation for agent 'up' (middle graph) both when it interacts

with its static object as well as when it first interacts with agent 'down' during the middle of the trial.

First, the agents explore their static objects for some time. Then they proceed to explore

the rest of the space, encounter each other and engage in some initial perceptual

crossing. This mutual interaction breaks down after a while, and they continue exploring

until they re-establish perceptual crossing at another location. Such break-downs occur

more often than in the original setup, because agents are more likely to miss each other

with infinitely small object sizes. Indeed, the possibility of interaction break-down

could be a first indication that the agents have to be much more active and responsive in

their interaction in order to disambiguate the situation. They cannot make use of

persistent and reliable external factors to assess the viability of their behavior, and thus

they are more prone to committing errors and making mistaken responses. Due to this

variability the behavior of the agents during the trial run thus looks much more lifelike

than that of previous solutions. This modeling experiment leads us to the prediction that

the performance of human participants under these modified conditions would not be

significantly different from that in the original setup. In other words, while object size and

shadow link width might aid some individual-based strategies, these factors are neither

necessary nor essential for the collective success.

However, there still remains a problem in terms of this model. When the agents meet

without receiving different stimulation beforehand, they engage on the basis of identical

controllers (same structure and same internal state) such that they will mirror their

behavior perfectly. This produces the same sensory-motor correlation as if they were

oscillating around their static object. And since agents are more likely to encounter each

other, evolution produces solutions which treat the occurrence of this sensory-motor

pattern as always being due to the other rather than to the static object (a good choice,

given the circumstances). However, occasionally this will result in both agents getting

stuck oscillating around their static objects for the whole of a trial, giving rise to what

looks like truly pathological behavior (for similar problems on a related task, cf. Rohde

& Di Paolo, 2008).

10.3 Experimental setup 6: Coordinated behavior

How can we use ER to generate solutions that are better at distinguishing the other

agent from the static object? As a first step, we remove the possibility of functionally

identical CTRNNs encountering each other by simply activating the receptor field of a

randomly chosen agent at the start of the trial, thus ensuring a minimal difference in

individual histories. Moreover, in Chapter 8 we showed that sensitivity to social

contingency can emerge from the interaction process if agents are required to coordinate

their behaviors in a way that forces them to break the symmetry of their interactions.

We therefore introduce some additional requirements into the fitness function. First, we

also explicitly reward the agents for crossing each other, rather than just remaining in

spatial proximity. Second, if they engage in perceptual crossing (defined as at least two

consecutive crosses, less than 10 units of time apart), we increase the reward

proportionally to the distance traveled together (they are not rewarded for traveling

alone). The evaluation function combines these factors as follows:

    // If no perceptual crossing (PC), then just use mean distance
    if (NumOfPercCrossings == 0)
        trialFitness = meanDistance;
    // If there has been some PC, then increase reward incrementally
    else if (NumOfPercCrossings < 20)
        trialFitness = meanDistance + NumOfPercCrossings;
    // If more PC, then increase reward according to displacement
    else
        trialFitness = meanDistance + 20 + DistanceTraveled;
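The evaluation function above can be restated as a self-contained function. The following is a minimal Python sketch, not the original simulation code: the function name, the argument names, and the `crossing_cap` parameter are illustrative assumptions, and the three input quantities (mean distance score, number of perceptual crossings, distance traveled together) are taken as already computed by the simulation.

```python
def trial_fitness(mean_distance: float,
                  num_perc_crossings: int,
                  distance_traveled: float,
                  crossing_cap: int = 20) -> float:
    """Combine the three factors of the evaluation function for one trial.

    All names are hypothetical; only the three-way branching and the
    cap of 20 are taken from the text.
    """
    # No perceptual crossing: fall back on the mean distance score alone.
    if num_perc_crossings == 0:
        return mean_distance
    # Some perceptual crossing: reward grows with the number of crossings.
    if num_perc_crossings < crossing_cap:
        return mean_distance + num_perc_crossings
    # Cap reached: the crossing bonus is fixed and the distance traveled
    # together is added on top of it.
    return mean_distance + crossing_cap + distance_traveled


if __name__ == "__main__":
    # A trial with 25 perceptual crossings and 100 units traveled together.
    print(trial_fitness(0.5, 25, 100.0))
```

Note that the distance traveled together contributes nothing until the cap is reached, so under this scheme coordinated travel is only rewarded once sustained perceptual crossing has been established.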

We found that capping the influence of the number of perceptual crossings at 20, and only

then taking the distance traveled together into account, increased the evolvability of the

solutions. Note that since the agents are structural clones, traveling together is a nontrivial

activity because it requires them to break the symmetry of their interactions. Note

also that because it is impossible to coordinate this maneuver with a static object, we

have further emphasized the possibility of distinguishing the other in terms of its

responsiveness (i.e. response to attempts at breaking behavioral symmetry).
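The definition of perceptual crossing used for this fitness function (at least two consecutive crosses, less than 10 units of time apart) can be made concrete with a short sketch. The Python code below is only one possible reading and is not taken from the original simulation: the text does not state how the crossings are tallied, so the rule used here (every cross that belongs to a run of closely spaced crosses counts once) is an assumption, as are the function and parameter names.

```python
from typing import Sequence


def count_perceptual_crossings(cross_times: Sequence[float],
                               max_gap: float = 10.0) -> int:
    """Count crosses that belong to a run of consecutive crosses.

    cross_times: ascending time stamps at which the agents crossed.
    max_gap:     two consecutive crosses count as perceptual crossing
                 only if they are less than this many time units apart.
    The tallying rule is an assumption made for illustration.
    """
    count = 0
    in_run = False
    for prev, curr in zip(cross_times, cross_times[1:]):
        if curr - prev < max_gap:
            # The first pair of a run contributes both of its crosses;
            # each further cross in the same run contributes one more.
            count += 1 if in_run else 2
            in_run = True
        else:
            # Gap too long: the run ends and isolated crosses are ignored.
            in_run = False
    return count


if __name__ == "__main__":
    # Three closely spaced crosses, one isolated cross, then a final pair.
    print(count_perceptual_crossings([5, 12, 18, 40, 90, 95]))
```

Under this reading an isolated cross never counts, which matches the requirement that merely passing the other once is not sufficient.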

We evolved 8-node CTRNNs with this modified fitness function. The agents are able to

coordinate their behavior so as to travel together while interacting (cf. Figure 10-4).

While engaging in perceptual crossing, the agents eventually start to drift together

horizontally. In other words, even though the agents are structurally identical, have

minimally different histories (internal states), and are not affected by noise during the

trial, they are nevertheless able to regulate the interaction such that the symmetry of

their individual behaviors is broken. In fact, when one of the agents encounters its static

object during this coordination process, the agents are able to re-negotiate the direction

of drift and return the other way, much like what was found in the pioneering work by

Quinn, et al. (2003). It is important to emphasize that the ability of the agents to

negotiate the direction of travel in either direction amounts to an interactively mediated

expansion of their individual behavioral domains. As solitary agents they can only

traverse the environment in one direction.

Figure 10-4. A representative trial run starting from (90, 390) and scoring 122 points. Initially, one agent

gets stuck on its static object, and the other on the other’s shadow. Then the shadow interaction breaks

down, the agents meet and start moving together until the end of the trial. Note that they jointly bounce

back from the static object located at position 448.

Nevertheless, it is still the case that this strategy is not as robust as the solutions which

we have excluded by modifying the experimental design. For instance, if both agents

happen to encounter their static objects at the start of the trial, it is possible that they

simply continue to oscillate around them until the trial is terminated (typically obtaining

a fitness score between 0.25 and 0). A related problem occurs if the two agents meet

while having the same internal state (e.g. due to same history of interactions), perform

exactly the same behavior, and then move apart. In both cases the problem is that there

is no principled way for the agents to tell apart an interaction with a static object and

interaction with another agent with identical internal state. Both give rise to the same

basic pattern of stimulation. Surprisingly, this problem occurs even if the possibility of

identity of internal state has been removed in principle. The initial binary difference in

stimulus, which we introduced to give the agents at least a minimally different history

of interactions, is ineffective because it was not exploited by any long-term activity of

the CTRNN controllers. The fast time constants of the evolved CTRNN entail that this

difference is not carried for long as a difference in internal state.
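The effect of fast time constants on an initial difference in stimulation can be illustrated with a deliberately simplified sketch. The Python code below does not reproduce the evolved controllers: it integrates a single isolated leaky node, tau * dy/dt = -y + I(t), without the recurrent weights and sigmoid activations of a full CTRNN, and all parameter values (time constants, step size, pulse length) are assumptions chosen for illustration. Two clones start in the same state, only one receives a brief initial pulse, and the remaining difference in state is compared at the end of the run.

```python
def euler_leaky_node(y0, tau, dt, steps, inputs=None):
    """Euler-integrate tau * dy/dt = -y + I(t) for one isolated node."""
    y = y0
    trajectory = [y]
    for n in range(steps):
        # External input at the current time (zero if none is given).
        i_ext = inputs(n * dt) if inputs is not None else 0.0
        y += dt * (-y + i_ext) / tau
        trajectory.append(y)
    return trajectory


def state_difference(tau, dt=0.1, steps=200, pulse_len=1.0):
    """Absolute state difference between a pulsed and an unpulsed clone."""
    pulsed = euler_leaky_node(
        0.0, tau, dt, steps,
        inputs=lambda t: 1.0 if t < pulse_len else 0.0)
    silent = euler_leaky_node(0.0, tau, dt, steps)
    return [abs(a - b) for a, b in zip(pulsed, silent)]


if __name__ == "__main__":
    fast = state_difference(tau=1.0)    # hypothetical fast node
    slow = state_difference(tau=20.0)   # hypothetical slow node
    print("fast node, final difference:", fast[-1])
    print("slow node, final difference:", slow[-1])
```

In this toy case the difference left by the pulse shrinks by a constant factor at every step once the pulse has ended, so a node with a short time constant forgets the initial asymmetry long before a node with a long time constant does. Whether the same holds for a particular evolved network depends on its recurrent dynamics, which this sketch leaves out.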

At first sight the failure to break away from the static object appears to be evidence that

the agents are not sensitive to the social contingency of their interaction; otherwise they

would presumably notice the lack of responsiveness of the object and eventually move

away. But this way of looking at the problem essentially demands an individual

response alone, i.e. detecting a lack of social contingency when there is none to be

detected. In contrast, perhaps the existence of this pathological behavior is an indication

of the truly social nature of the evolved solution? Indeed, if only one agent becomes

trapped it will eventually be freed by the other agent, which entrains it in an interaction

process such that they move away together. In other words, this solution depends on the

responsiveness of the other to such an extent that if the presumed ‘other’ is not

responsive to the interaction the strategy fails 35 .

This individual failure can also be related to an important insight we can learn from the

experimental design process, namely just how difficult it is to evolve a behavioral

strategy that is primarily (or exclusively) based on mutually contingent interaction. One

important aspect of this difficulty is surely that detecting an object’s responsiveness as

such is a much more demanding task than detecting environmental cues (e.g. difference

in stimulus duration, difference in number of contacts, difference in noise, etc.). This is

because the latter phenomena can become manifest in conditions that are largely

independent of an agent’s behavior (e.g. as long as an agent moves, passing the other

‘agent-shadow’ will cause two stimulations, while passing the static object will cause

one stimulation). Moreover, basing a behavioral strategy on the other introduces an

inherent risk to the situation. The other’s behavior can be influenced by your own

actions, but it evades your direct control in principle. This is especially problematic

when the presumed ‘other’ does not react to you in a suitable manner, but your

individual ability depends on the other’s behavior. This is the case for the ‘pathological’

behavior of the simulated agents.

The increasing complexity of the task is also indicated by the practical need

to increase the number of nodes in the evolving CTRNN controller to support

35 The fact that the agents in this modeling experiment are unable to distinguish between the static object

and the other individually, but can do so when the other is present, deserves further study, especially in

relation to empirical findings in the cognitive sciences. For example, studies of rehabilitation after brain

damage have shown that patients often find (i) sensory-motor tasks impossible to achieve individually in

an abstract context, (ii) have difficulty with them in a pragmatic context, and (iii) can function almost

normally in socially situated circumstances (cf. Gallagher & Marcel 1999).

better evolvability (three nodes for the experiments in Chapter 8, four for Chapter 9, six

for Section 10.1 and 10.2, eight for this section). In order to determine whether it is

possible to synthesize a solution that performs robustly even when confronted by the

static object in isolation, we therefore repeated the evolutionary process but now with

10-node CTRNNs. Presumably, giving the individual agents more complex controllers

will make it more likely that they are able to resolve the situation appropriately in terms

of individual-based strategies as well. For this setup we also removed the random initial

stimulus, since this attempt to influence the internal state of the agents made no

difference to the previous strategy. We comprehensively tested the best solution after a

few thousand generations of optimization. The results are shown in Figure 10-5.

Figure 10-5. Graphical representation of fitness scores at each possible combination of starting positions

for agent ‘up’ (x-axis) and agent ‘down’ (y-axis). Note that the axes wrap around due to the 1-D circular

shape of the environment. Left: Trials of 1600 units of time. Fitness scores range from 3.53 to 316 with an

average of 128. Right: Trials of 3200 units of time. Fitness scores range from 121 to 316 with an average

of 129.

Two things are immediately evident from these test results. First, the evolved strategy is

very robust in coping with different starting positions, and second, even the worst trials

are not complete failures. Moreover, the regions of low fitness visible in the fitness map

shown on the left of Figure 10-5 are due to unfavorable starting conditions, which

require more time to resolve successfully. A comprehensive test with trials that are

twice as long does not show any problematic regions (cf. the fitness map on the right of

Figure 10-5). This demonstrates that the agents are able to disentangle themselves from

the static object even without the aid of the other agent (otherwise there would be

regions of low fitness for those initial conditions where both agents first encounter their

respective static objects). A representative trial of this situation is shown in Figure 10-6.

Figure 10-6. A representative trial run starting from (238, 289) and scoring 122. Initially, the agents get

stuck on their static objects. Then this interaction breaks down without outside interference, the agents

eventually meet and start moving away together until the end of the trial. Note that agent ‘down’ gets

perturbed by the other’s shadow on the way to its static object and then undergoes an additional iteration

of interactions.

We have thus managed to evolve a strategy that can cope with the high ambiguity of

this modified experimental setup successfully. The agents never get completely trapped

by their static objects or the shadow object of their partner. When an agent meets its

static object it engages in bursts of interaction that slowly decrease in frequency until

the agent eventually breaks free and continues on its way. To be sure, if the agents meet

each other with identical internal states then they will disengage from the interaction in

the same manner, because this situation is in principle identical to interacting with a

static object. But if their internal states are at variance due to differing histories of

interaction, then they are capable of breaking the symmetry of their behavior and

engaging in perceptual crossing while jointly traveling around the environment. Indeed,

their internal states are different most of the time, especially because the occasional and

uncorrelated perturbations due to the other agent’s shadow object function as a source

of noise. In Figure 10-6, for example, we can see that this kind of perturbation causes

agent ‘down’ to undergo an additional burst of interactions with its static object. Thus, if

the two agents meet with different states we know that their behavior will not be

identical and that they will create a qualitatively different pattern of interaction. Toward

the end of the trial shown in Figure 10-6 we can see this: the mutual interaction does not

exhibit the same slow decrease in frequency of bursts of interactions, but is more

irregular.

Since these agents can successfully solve the task, we can hypothesize that the outcome

of the original psychological study will not be significantly altered when making objects

infinitely small and displacing the shadow object by 150 units. The evolutionary

robotics methodology has thus allowed us to fine-tune the essential elements of the

experimental design. Indeed, we can venture a further hypothesis that part of the reason

why the original study found less response to the static object, when compared to the

two mobile objects, was that entrainment with this object was often broken by the

actions of the other participant (either because of its shadow or its receptor).

10.4 Summary

The strategy of the agents in the final experimental setup is composed of both individual

and interactive factors: the agents are individually capable of avoiding static objects

(and shadow objects), and they can also perturb each other through their interaction so

as to sustain their mutual interaction without depending on factors that are external to

that interaction. But have we finally managed to synthesize a model of social interaction

rather than just multi-agent interaction as proposed in the introduction to this chapter?

We know that we are dealing with models of social interaction in the case of those

experiments where the agents are required to travel together, because such coordinated

movement is a type of activity that cannot be achieved by the regulation of any

individual alone. In particular, the agents must break the symmetry of their behaviors,

coordinate a direction of movement, and then move together in that direction while

continuing to interact. Moreover, this coordination is flexible in that the direction of

movement can be re-negotiated if necessary, for example when faced by a static object.

It is also worth emphasizing that, because we eliminated external clues about the

location of the other agent (all objects give rise to the same minimal, solitary stimulus

upon contact), the agents have to disambiguate the experimental situation by means of

the properties of the interaction process itself. A successful strategy requires, in

anthropomorphic terms, that an agent proposes a particular direction of travel, and that

this offer is accepted by the other agent. Otherwise there is an opening for further

negotiation, or the interaction simply fails (as in the case of interacting with the static

object). Here we thus have all the necessary ingredients to speak of a model of social

interaction.

On the basis of these modeling results it is possible to hypothesize that if the

psychological experiment is repeated with this modified setup (e.g. infinitely small

objects and maximally distant shadow objects) and modified task (e.g. primarily

pragmatic rather than epistemic), we will find, in contrast to the original study, that

there is a statistically significantly higher probability of clicking in response to

encounters with the other participant. In fact, it is easy to imagine that if participants

only clicked when they have managed to establish sideways perceptual crossing 36 , then

the clicking responses could easily be 100% correct (since such co-regulated behavior is

simply impossible with the shadow or the static object). This would be an example

where the presence of social interaction significantly improves the individual abilities

when compared to multi-agent interaction alone.

36 That such coordinated behavior is in fact possible for human participants, at least under original

experimental conditions, has already been demonstrated by some exploratory studies. The subjects were

asked to engage in sideways movement together without being told any direction in advance (Di Paolo,

personal communication).

Finally, in addition to this increase in behavioral performance, we can expect to find

some qualitative differences in the participants’ experience as well. Indeed, whereas in

the original psychological experiment the socially contingent and non-contingent

situations entailed no difference in meaning for the participants (i.e. there was no

statistically significant difference in clicking response), we can hypothesize that

successful completion of this modified task is associated with a specifically social

phenomenology. As such, we might have found minimal experimental conditions for

participatory sense-making, which would make the modified psychological study an

especially appropriate target for phenomenological investigation, perhaps by means of

explicitation interviews (e.g. Petitmengin 2006). This would be a novel opportunity to

determine the structural and qualitative differences between object- and other-perception

under minimalist and controllable conditions. We will return to the

phenomenology of intersubjectivity in Chapter 12.

10.5 Discussion

The simulation models that were presented in Chapters 8 to 10 together form a series of

thought experiments that were inspired by the enactive approach to cognition, especially

by the theoretical framework that was developed in Chapter 4. However, in order to count

as strong support for the life-mind continuity thesis proposed by the enactive paradigm,

a final problem must be addressed. There is still a lingering worry that the results of

these modeling experiments can simply be appropriated by a more general, embodied

and dynamical approach to cognitive science (cf. Section 2.2). In other words, if the

model agents are not autonomous in the enactive sense of the term, then what makes

these results supportive of that paradigm rather than of a broadly conceived embodied-embedded

cognitive science? To be sure, the models can be insightful for a variety of

different approaches. But do they also provide insights that are specifically ‘enactive’

such that they could not also have simply been accounted for by, for instance, a generic

dynamical approach to social cognition?

In response to these worries it is important first to emphasize the widespread impact

which the enactive paradigm has had ever since the publication of The Embodied Mind

by Varela et al. in 1991. In other words, many of the core themes of contemporary

embodied-embedded cognitive science, including the role of embodiment, situatedness,

dynamics, emergence and active perception, have been strongly influenced by the ideas

of that book. Moreover, certainly the evolutionary robotics methodology that we have

used to synthesize the models is also a popular tool for other embodied-embedded

approaches. But their practitioners are rarely aware of the fact that the methodology’s dynamical

perspective on behavior and cognition follows directly from an autopoietic perspective

on life when two key abstractions are made (Beer 1995a): (i) we focus our investigation

on an agent’s behavioral dynamics alone, and (ii) we abstract the set of destructive

perturbations that this agent can undergo as a viability constraint. Since these two

abstractions basically make or break the relevance of the dynamical approach for the

enactive paradigm, it is worth spelling out Beer’s argument in more detail.

Beer (2004) begins with the observation that a natural agent’s normal behavior takes

place within a highly structured subset of its total domain of interaction. This makes it

possible to capture the behavioral dynamics while ignoring other structural details

which may not be directly relevant. Moreover, since it is only meaningful to study an

agent’s behavior while it is living, we can largely take an agent’s ongoing metabolic

autonomy for granted in our models. This abstraction fits nicely with Barandiaran and

Moreno’s (2006) claim that the hierarchical decoupling of the nervous system entails

that the dynamic organization of cognition is metabolically underdetermined. Finally,

the possibility of undergoing a lethal interaction is represented as a viability constraint

on the agent’s behavior such that if any actions are ever taken that carry the agent into

this terminal state, no further behavior is possible.

It follows from these considerations that research with artificial embodied-embedded

systems has the potential to develop a mutually informing relationship with some of the

theoretical foundations of enactive cognitive science. However, the insights generated

by this research are equally informative for other approaches in cognitive science, even

those whose interest in embodied, embedded and dynamical phenomena is merely in

terms of an extension to functionalism (e.g. Wheeler 2005; Clark 1997). This ongoing

appropriation should come as a warning to the enactive paradigm, especially because

there is a growing consensus that functionalism is fundamentally incompatible with the

enactive notion of autonomy (cf. Di Paolo 2009; Thompson & Stapleton 2009). Indeed,

due to its functional level of abstraction much of embodied-embedded research cannot

aid our understanding of how natural cognition arises through the precarious, self-constituted

activity of biological systems (cf. Section 3.4). As such, the embodied-embedded

approach is unable to systematically address the kind of criticisms which

have recently been leveled against it by Dreyfus and others (cf. Section 2.1). Have the

models presented in this thesis managed to avoid this dilemma?

Arguably, this is at least partly the case. There is one clear sense in which this modeling

work is specifically situated within the enactive paradigm alone, namely the constant

focus on constitutive autonomy. To be sure, the simulated agents themselves were not

autonomous in this sense, but this was the result of a pragmatic choice to focus on the

dynamics of the interaction process instead. Given the current state of the art, the design

of a system, which when run gives rise to autonomous entities which happen to interact

with their environment in such a manner so as to engage in mutual interactions that give

rise to an autonomous interaction process, would have first required us to address the

unresolved problem of ‘second-order emergence’ (Froese & Ziemke 2009). While this

is a worthwhile goal in itself, it would have unnecessarily distracted from the actual

target of this investigation, namely the constitutive role of the autonomous interaction

process for bootstrapping individual behavior. This view on the interaction process in

itself, as an autonomous system, is thoroughly enactive. It has opened up the possibility

of a systematic research program that can bridge the ‘cognitive gap’ of the life-mind

continuity thesis (Froese & Di Paolo 2009; De Jaegher & Froese 2009): the enactive

paradigm can potentially integrate the distance between the life of a single cell and the

mind of a human being into one complex meshwork of autonomous systems.

In addition, by means of the enactive detour through a focus on the autonomy of the

interaction process, these models might actually represent an important step toward

using evolutionary robotics not as a method to optimize predefined ‘agents’, but rather

as a way to generate the conditions for the self-organization of such systems (cf. Froese

& Di Paolo 2008b). As such, the insights generated by these models will not be easily

appropriated by some form of dynamical functionalism. In fact, it is not even clear how

the organization of the self-sustaining dynamics of the interaction process can be

captured in quantitative terms, or if that is going to be possible at all (cf. Stewart 2000).

It remains to be seen to what extent we can go beyond psycho-physical tests and

qualitative descriptions when it comes to the study of an autonomous organization.

Finally, there is another crucial difference between the enactive paradigm and generic

embodied-embedded cognitive science apart from the concept of autonomy, and that is

the role played by our very own experience. In fact, lived experience has arguably been

the central theme of its research program, though many proponents of ‘enaction’ have

preferred to focus on the ‘embodied-embedded’ insights of The Embodied Mind, rather

than on developing the implications of its provocative subtitle Cognitive Science and

Human Experience. Nonetheless, the inception of the enactive paradigm started on the

foundation of human experience, and its recent incorporation of the autopoietic tradition

was similarly motivated by existential concerns (Weber & Varela 2002): How can we

account for the fact that we care, that we are concerned beings with a point of view that

enables a world to show up in a meaningful manner? We must have recourse to our own

experience in order to verify whether we in fact are such beings and, consequently,

whether these questions require an answer or not. It is from this experiential basis that

the theoretical framework of the enactive paradigm ultimately derives, and it is to this

basis that our investigations must ultimately return.

At this point we can submit a more general criticism of embodied-embedded cognitive

science. While the field has been happy to appeal to the existential phenomenology of

Heidegger, Merleau-Ponty, Dreyfus and others in order to support the use of an

embodied, embedded and dynamical framework, this effort has largely remained a

scholarly exercise rather than an experiential inquiry. To be sure, so far this approach

has been a productive one, but signs of stagnation are already appearing. We ourselves

need to start diving into these forgotten realms of experience if we want to continue to

push cognitive science in new directions. More yet: we need to make sure that the kind

of science we derive from our insights lets us return to lived experience with newfound

understanding. This is a tall order indeed, with profound implications, and the methods

employed in this thesis have hardly done justice to the task. But at least the models have

in the end led us to propose a testable hypothesis about the dynamical conditions that

must hold in order for an experience to be lived as qualitatively social. Of course,

further work remains to be done before such a test is possible with sufficient scientific

rigor, especially because there is still a need to devise adequate means of specifying the

structure and content of experience. While a more detailed response to this challenge is

beyond the scope of this thesis, in the following chapters we will conduct some initial

explorations in the phenomenology of social life.

11 Beyond methodological physicalism

In the previous chapters we have used evolutionary robotics and a dynamical systems

approach to gain a better understanding of the processes by which individual and social

domains of phenomena can be constitutively interrelated. The modeling experiments

have demonstrated that it is possible to question the widespread assumption of

methodological individualism, and to do so without thereby descending into some

mysterious notion of social life. On the contrary, the systemic approach has enabled us

to capture something of the specificity of social phenomena, namely the organization of

their relational structure, and thereby given novel credibility to the life-mind continuity

thesis. We have used the mutually supportive concepts of the enactive approach, i.e. the

notions of autonomy, sense-making, embodiment, and emergence, to outline the

beginnings of a research program into the origins of cognition on the basis of these

foundational biological principles.

It appears that we have accomplished what we set out to achieve, and so it might be

expected that the thesis ends here. But if we stopped our investigation here we would

neglect an essential component of enactive cognitive science, namely the investigation

of the subjective dimension of our bodily existence. The systematic integration of the

living body (systems biology) and the lived body (phenomenological philosophy)

around a normatively laden conception of life is one of the great achievements of the

enactive approach. However, apart from indicating some structural constraints, so far

we have said nothing about what it is like to be in a social situation. And it is precisely

by addressing this first-person aspect of subjectivity that we move beyond the purely

systemic (e.g. Luhmann 1984; Maturana & Varela 1987) and dialectical materialist (e.g.

Vygotsky 1934) approaches to the social. Both of these have much in common with the

enactive approach, though they are limited by an impoverished conceptualization of the

target phenomenon.

Of course, the task of explicating the characteristic nature of experiential phenomena in

a manner that is scientifically respectable is beset by many difficulties, and an important

part of future research will be to develop sound first- and second-person methodologies

(e.g. Depraz, et al. 2003; Petitmengin 2006). It is beyond the scope of this thesis to enter

into this complex debate (cf. Varela & Shear 1999), except to point out that even in this

endeavor minimalist technological interfaces could feature as a crucial tool (cf. Froese

& Spiers 2007; Auvray, et al. 2007). Instead, we will have recourse to insights from the

phenomenological tradition, an important philosophical movement that was founded by

Husserl at the beginning of the 20th century and was continued by Heidegger, Merleau-Ponty,

Scheler, Sartre and others. Phenomenology has recently been brought into closer

connection with cognitive science (cf. Gallagher & Zahavi 2008; Zahavi 2005; Roy, et

al. 1999; Gallagher 1997), and has been an integral part of the enactive approach from

the start (e.g. Varela, et al. 1991; Varela 1996b). Here we are going to appeal to central

insights that have more or less withstood the test of time. Nevertheless, it is essential

that this is not just a scholarly exercise, so the validity of these insights will be verified

as best as possible by illustrating them with concrete examples from everyday

experience.

So what is it like to live through a social phenomenon? This question, if directed to our

everyday selves, throws us right into the middle of a complex mixture of experience,

preconception, and interpretation which is hard to disentangle. As already alluded to in

Chapter 2, just as our assumptions determine the way we make sense of theories and

empirical data, so they shape the sense-making of our experience. We have already

addressed the problem of methodological individualism, but there is another widespread

assumption that is preventing a proper appreciation of the social. To put it boldly: most

research in social cognition is trapped within an idealized world of physical forces. This

is because much of mainstream science, with the exception of some progressive physics

(cf. Bitbol 2002), is governed by a degenerate form of Cartesian metaphysics. To be

sure, it differs somewhat from the substance dualism of the 17th century, which divided

all phenomena into the physical (res extensa) and the mental (res cogitans), by

collapsing both domains into the physical 37 . However, as a remainder of this original

abstraction, materialism is no different from idealism: Both give ontological primacy to

the abstract organization of knowledge (ideas) over the concrete manifestation of the

37 The shift from a dualistic to a degenerate monist metaphysics has not been limited to the scientific

community alone, as attested to by the common reference to the brain when actually talking about mental

phenomena (e.g. “that is too much for my brain to compute”).

phenomena (being) which they are meant to explain. In the case of materialism the most

basic form of this knowledge is expressed in terms of the laws of physics, which is why

we will call this assumption methodological physicalism. In its most extreme form this

doctrine is expressed as what has been called ‘scientism’, i.e. the belief that nothing

really exists but what is scientifically demonstrated. It will be the aim of the following

two chapters to question the validity of methodological physicalism.

How is this assumption expressed in cognitive science? For example, it is accepted as

an unquestioned fact that our primary and most basic access to the world is object-centered

and defined by purely physical terms, such that “all mammals live in basically

the same sensory-motor world of permanent objects arrayed in a representational space”

(Tomasello 1999, p. 16). In the case of humans this basic access eventually gets

complemented during development by a folk psychology that lets us ‘see through the

surface’ of bodily movements such as arm extensions, finger curlings, etc., to the hidden

intentions of others (Meltzoff 1995). But this is only an achievement of higher cognitive

functions. Thus, for example, when an adult talks to an infant too young to comprehend

a joint attentional scene, for the infant “the adult is just making noises” (Tomasello

1999, p. 100), i.e. the physical sounds produced by wet tissue rapidly slapping together

in the throat cavity. Methodological physicalism is thus at the root of the famous

‘problem of other minds’, i.e. the question of how it is possible to ‘see through the

surface’, and guides the interpretation of most empirical results.

In contrast to the metaphysically biased starting point of mainstream cognitive science,

the enactive approach is based on phenomenological considerations, i.e. a return to the

primacy of experience itself. Its central starting point is the claim that our primary mode

of understanding the world is in terms of sense-making. If we pay close attention to the

manner in which we live through concrete situations, for example, we can say that the

brewing storm feels menacing, a cozy pub looks inviting, a wind chime can sound

playful, etc. And these expressive qualities are not mere poetic qualifiers added to our

perceptual experience after we are presented with a raw physical object. More precisely,

this expressiveness comes before the object; it is the object’s condition of possibility: an

uneasy feeling about the weather focuses our attention on the darkening horizon, the bits

of joyful conversation coming from a soft light down an alleyway reveal a pub on closer

inspection, soft melodies drifting in from the balcony turn out to be dangling pieces of

wood gently making contact in a late summer breeze, and so forth. In this manner we

primarily perceive aspects of the world in their immediate sense for us, as menacing,

inviting, playful, etc., even before attentive reflection reveals ‘a storm’, ‘a pub’, ‘a wind

chime’, etc., as detached from our situation and devoid of any significance.

By taking the phenomenology of our situatedness as our starting point, rather than the

materialist half of Cartesian dualist metaphysics, we transform the ‘problem of other

minds’ into something more manageable. Instead of explaining why children start

perceiving others as acting for reasons on the basis of seeing mere physical objects and

movements (an absolute gap between meaningless physical change and meaningful

behavior), we need to account for how a general perception of sense can develop into a

perception of goals and intentions (a merely relative gap between different kinds of

meaningful behavior). As a first step in this direction, we introduce the phenomenology

of intersubjectivity (Chapter 12). On the basis of this change from a metaphysical to a

phenomenological starting point it is possible to organize the empirical data of social

psychology, developmental studies, and primatology in a novel manner, thereby giving

us a fresh perspective on the issue of cumulative cultural development (Chapter 13).

Finally, the thesis finishes by offering some reflections on what has been accomplished

and pointing to potential implications that are still in need of further investigation.

12 Phenomenological considerations

In this chapter we will draw on some central insights from Husserlian phenomenology,

especially in relation to the problem of intersubjectivity which has taken center stage for

most of the tradition. While there is a growing recognition of the importance of a

phenomenologically informed approach to intersubjectivity for the cognitive sciences

(e.g. Gallagher & Zahavi 2008; Zahavi 2005; 2001), and the enactive approach in

particular (e.g. Thompson 2007; 2001), these are only the tentative beginnings of a

mutually informative research program. For an insightful discussion of the relevant

phenomenological literature the reader is referred to Zahavi’s (1996) excellent treatment

of the phenomenology of intersubjectivity, to which this chapter owes much. In the

context of this thesis, we will limit our efforts to a consideration of how the

phenomenology of intersubjectivity can contribute to a better understanding of the life-mind

continuity thesis. Of particular interest will be to determine how the presence of

others impacts on the structures of agency and sense-making (perception).

This chapter will unfold in three parts: In Section 12.1 we will review aspects of the

phenomenology of intersubjectivity in relation to perception, which is how Husserl first

got drawn into a serious appreciation of the constitutive role of intersubjectivity, and

which has also recently been used to motivate a similar turn in the cognitive sciences

(e.g. Gallagher 2008a; Zahavi 2005, pp. 166-167). In Section 12.2 we will clarify the

way in which others are given to us in our experience, giving special attention to how

this encounter affects our perspective on the world. These phenomenological reflections

indicate how the objectivist epistemic perspective that is characteristic of detached

human sense-making (e.g. the scientific attitude) is constitutively dependent on our

relationship with others. In Section 12.3 the life-mind continuity thesis is reformulated

to include this phenomenological support for the constitutive role of sociality.

12.1 The phenomenology of perception

One of the key insights of the phenomenology of intersubjectivity is that the existence

of others plays a constitutive role for our perception. In order to illustrate this insight, let

us begin with a concrete example of object perception: how is the wall that is located

behind the desk given to me in my experience? How is its presence for me as an object

constituted? The general proposal of the enactive approach to perception is that the

world (our experiential world) is enacted, that is, it claims that perception consists in

perceptually guided action. The perceiver does not have access to some metaphysically

supposed pre-given, perceiver independent world of objects; an autonomous agent is

always embedded within a particular sensory-motor loop that is shaped by the overall

dynamics of the organism-environment systemic whole. Accordingly, as discussed at

length in Chapter 3, such an agent can only perceive its surroundings by appropriately

regulating its sensory-motor interactions:

Thus the overall concern of an enactive approach to perception is not to

determine how some perceiver-independent world is to be recovered; it is,

rather, to determine the common principles or lawful linkages between sensory

and motor systems that explain how action can be perceptually guided in a

perceiver-dependent world. (Varela, et al. 1991, p. 173)

On this view, perceived objects appear as the invariants that happen to emerge from the

closed loop of an agent’s ongoing embodied activity and the resulting stimulations (an

idea inherited from the cybernetic tradition, e.g. von Foerster 1976). It is important to

emphasize that we are specifically talking about sensory-motor invariants, which differ

from the ecological invariants of Gibsonian psychology in that only the former give a

constitutive role to motor activity (cf. Mossio & Taraborelli 2008). In brief, as the

philosopher Alva Noë has recently put it, “the invariant structure of reality unfolds in

the active exploration of appearances” (Noë 2004, p. 85). Since it is my behavior which

enables me to establish such regularities, it follows that my perceptual capabilities are

enabled and constrained by my behavioral capabilities:

How things look to me is constrained by my sensorimotor knowledge. It is my

possession of basic sensorimotor skills (which include the abilities to move and

point and the dispositions to respond by turning and ducking, and the like) that

enables my experience to acquire visual content at all. (Noë 2004, p. 90; cf.

O’Regan & Noë 2001).

Thus, according to the enactive approach, my perception of the wall behind my desk is

constituted by my possession of basic sensory-motor skills, such as perhaps scanning it

with my eyes or moving my head and body in a certain manner 38 . Perception is, after

all, conceived as perceptually guided action. This sensory-motor account of perception

has generated a lot of excitement in cognitive science, but also a number of criticisms

(e.g. Prinz 2006; Clark 2006; Velmans 2007). These need to be taken seriously and

carefully addressed in order to move the theory of sense-making and enactive

perception forward (e.g. Thompson 2005; Di Paolo 2009). Here we will focus on one

problematic aspect of postulating a sensory-motor basis for perception, namely the

perceptual presence of absent phenomena, or what is sometimes called the problem of

“perceptual presence” (Noë 2004, p. 59). This problem is of particular interest in the

current context because a related worry in phenomenology has been resolved by appeal

to the constitutive role of open intersubjectivity.

Let us return to the example of the wall behind the desk. How come it is given to me in

my perceptual experience as a 3D object that has another side which is currently out of

view? How come it is given to me as separating my current location from whatever is

outside the room, rather than just as a flat 2D appearance? Noë suggests that our

experience of absent profiles should be understood as a type of ‘virtual’ presence:

“They are present to perception as accessible” (Noë 2004, p. 63; emphasis added). In

other words, on this account I do not experience the wall as a flat facade that separates

me from some meaningless void because of my embodied sensory-motor skills, which

would let me view its other side from the outside, if I left the room to inspect it.

But is this appeal to ‘virtual’ presence and sensory-motor accessibility an adequate

description of our perceptual experience? According to the later work of Husserl and the

subsequent phenomenological tradition, as well as recent work in cognitive science, this

is not the case (cf. Gallagher 2008a; Zahavi 1996, pp. 43-51; Gallagher & Zahavi 2008,

38 It is important to emphasize again that there are important differences between the enactive paradigm,

which has been developed by Varela and colleagues and is the focus of this thesis, and the ‘enactive’

sensory-motor approach to perception that has been proposed by Noë (cf. Torrance 2005; Thompson

2005). However, they are sufficiently similar with respect to the role of embodied action for perception

that, for the current issue, these differences need not concern us at this point.

pp. 100-104; Thompson 2007, p. 384). The idea that we constitute the absent profiles of

an object in terms of past or future embodied action necessitates an appeal to a temporal

separation, an aspect that is itself not given in our perceptual experience. Even though I

momentarily do not visually perceive the backside directly, I nevertheless experience

the wall as having such a backside now while I am looking at it from inside my room. In

other words, it is argued that since leaving the room to check the wall’s backside would

involve a temporal and spatial displacement from my current perceptual situation,

neither the movement nor its potential accessibility can explain the fact that I currently

perceive the wall as a whole object, no matter which side aspect is given to me, rather

than as a temporally distributed set of profiles. As Gallagher, following Husserl, points

out: “When I perceive an object the present front is not a front with respect to a past or

future back, but is determined through its reference to a present co-existing back. The

object is perceived at any given moment as possessing a plurality of co-existing

profiles” (Gallagher 2008a, p. 172). But since I can only be in one place at a time, how

then can we account for this perceptual presence of co-existing profiles?

We can begin to resolve this problem for the enactive approach by noting that Varela

and colleagues make use of a much broader notion of embodiment than that used in the

sensory-motor approach by Noë and O’Regan. The term “embodied action” is indeed

meant to highlight that cognition and perception depend on having a body with various

kinds of skills. But the term is similarly meant to emphasize “that these individual

sensorimotor capacities are themselves embedded in a more encompassing biological,

psychological, and cultural context” (Varela, et al. 1991, pp. 172-173). To be sure,

much of enactive cognitive science has focused on the biological and psychological

context, but it is also informed by considerations of our social and cultural background,

especially in terms of our role as practicing scientists (cf. Varela, et al. 1991, pp. 9-12;

Thompson 2007; 2001). Moreover, even in the very beginning of the enactive approach

the role of social context enters into the very definition of what it means to be an

intelligent agent, such that “intelligence shifts from being the capacity to solve a

problem to the capacity to enter into a shared world of significance” (Varela, et al. 1991, p.

207; emphasis added). How does this broader notion of embodiment, which includes

our situatedness in a social and cultural context, help us to account for our perception of

the wall as a full-fledged object?

First, following Gallagher (2008a, p. 171), we can note that there is good evidence from

developmental psychology that we gain access to a meaningful world of objects through

our interactions with others. Not only are the most dominant and central experiences for

a young infant its relations to others, these relations also shape its perception of the

world through joint attention. In other words, environmental objects, such as a

wall, first take on meaning in the pragmatic contexts within which we see and imitate

the actions of others (cf. Merleau-Ponty 1960; Trevarthen & Hubley 1978; Tomasello

1999). Second, even in our adult life we find that the wall is given as ‘a wall’ within a

common public totality of surroundings, i.e. in phenomenological terms the situation of

our individual being is always already a form of ‘being-with others’ (cf. Heidegger

1927; Zahavi 1996, pp. 124-127). Similarly, research in cognitive anthropology has

demonstrated that our relations to the objects of our perception continue to be deeply

interwoven with our relations to others even in adult life (cf. Hutchins 1995). Thus, the

capacity for worldly engagement that is characteristic for adult humans is neither

acquired nor performed in isolation. As Husserl famously puts it:

Thus everything objective that stands before me in experience and primarily in

perception has an apperceptive horizon of possible experience, own and foreign.

Ontologically speaking, every appearance that I have is from the very beginning

a part of an open endless, but not explicitly realized totality of possible appearances

of the same, and the subjectivity belonging to this appearance is open

intersubjectivity. (Hua XIV/289; see also Hua IX/294, XV/497; quoted by

Zahavi 2005, p. 167)

In this manner Husserl was led via a deepening phenomenological analysis of object

perception from an individualistic theory of the constitution of objects, which was

somewhat reminiscent of Noë and O’Regan’s recent sensory-motor approach, to an

appreciation of the constitutive role of other subjects. More precisely, following

Zahavi’s and Gallagher’s interpretation, in order to account for the phenomenology of

an object as something that transcends the current perceptual profile that we have of it,

we need to posit the possibility of other subjects, which can potentially perceive the

other profiles at the same time. We could even go as far as to say that the transcendence

of the world as such is derived from the radical transcendence of the other (e.g. Levinas

1978). Or, as Sartre puts it: “it is not in the world that the Other is first to be sought but

at the side of consciousness as a consciousness in which and by which consciousness

makes itself be what it is” (1943, p. 296). Here we have a phenomenological equivalent

to the self-other co-determination of the enactive approach (Thompson 2001), though

the latter would be more inclined to view the self-world-other relation in an essentially

reciprocal manner since it is also the world which frames the factual possibility of

encountering the other.

For our present purposes we can leave the broader implications of intersubjectivity for

the phenomenon of worldhood aside, and focus on the claim that it is the possibility of

engaging in intersubjective interaction, an open intersubjectivity, which accounts for the

presence of the whole object. According to this phenomenological account, the reason

that I experience the wall of the room in its full presence through its current partial

profile (what is sometimes called ‘transcendence in immanence’) is that I can

engage in intersubjective interactions that shape the sense of my experience. I can, for

example, hold a conversation with someone outside my window who informs me that

the outer surface of the house could use some cleaning 39 . Of course, the claim is not that

it is necessary that another subject is factually present at the time for object-perception

to be possible, which is why it specifically is an open intersubjectivity. Rather, it is

claimed that objects are experienced as existing independently of us because we co-constitute

the experiential domain in which the object is located as a public realm of

shared meaning that includes the possibility of another perspective.

39 While it is beyond the scope of this chapter to address the role of language in the co-constitution of our

objective world, it is clearly a crucial element: “In the experience of dialogue, there is constituted

between the other person and myself a common ground; […] we are collaborators for each other in

consummate reciprocity. Our perspectives merge into each other, and we co-exist through a common

world” (Merleau-Ponty 1945, p. 413). It is this kind of merging of perspectives which is necessary to

account for the presence of an object as an object, namely by providing a synthesis of co-existing

perceptual profiles. More fascinating work remains to be done here, especially because language was a

focal topic in the biology of cognition by Maturana (e.g. Maturana 1978; Maturana, et al. 1995).

But this intersubjective resolution of the problem of ‘perceptual presence’ leaves us

with a dilemma with regard to the continuity aspect of the LMCT. For if we accept that

the full presence of the wall as a 3D object is constituted in terms of the potentiality of

my experience of others as others that perceive its currently hidden profiles, then what

does this entail for other forms of life? We seem to have two controversial options: (i)

we claim that open intersubjectivity (in the sense of being able to take another’s

perspective on the world) is present for other living beings as well, or (ii) we assert that

the presence of a 3D world is a unique aspect of human phenomenology. While the

former claim remains highly contentious even with regard to our closest primate

relatives (Tomasello, et al. 2003) and lacks scientific support for most other species

(Tomasello 1999), the latter is in tension with the evident skill with which these species

negotiate their environments. Should we really conceive of the lived world of these

species as lacking the dimension of depth? At least for animals with evident depth-adapted

sense organs (e.g. vision, hearing, forms of touch, etc.), this is an unacceptable

conclusion. We propose to resolve this dilemma in two steps.

First, let us reconsider the constitution of perceptual presence in terms of sensory-motor

invariants. In the case of humans, at least, psychological experiments using minimalist

haptic interfaces have shown that the emergence of an experience of distal presence (in

terms of an obstacle located in 3D space) is entailed by the skilful mastery of basic

sensory-motor correlations and laws (Auvray, et al. 2005; 2007). However, as already

indicated above, phenomenologists have argued that this kind of sensory-motor activity

includes a temporal progression which prevents it from accounting for the perceptual

presence of a complete object, because “the object is perceived at any given moment as

possessing a plurality of co-existing profiles” (Gallagher 2008a, p. 172). How can the

sequence of sensory-motor events be integrated into a coherent whole? Is the appeal to

open intersubjectivity necessary, even if it prevents us from attributing the experience of

distal presence to solitary animals?

We suggest a novel resolution of the tension by appealing to another foundational theme

of phenomenology and enactive cognitive science, namely the issue of temporality (cf.

Varela 1999). Husserl himself has argued that every experiential moment is not simply

an isolated point in time, but is rather temporally extended in terms of a tripartite

retention-present-protention structure (cf. Hua X). Importantly, it is in this temporal

horizon of backward retention and forward protention that perception incorporates its

non-actualized possibilities. Accordingly, the existence of the temporal horizon of the

present moment allows us to explain the solitary constitution of an object as presently

consisting of a plurality of other possible perspectives. These potential profiles can be

contained in the present moment’s retention and protention, even without the need for

open intersubjectivity.

While Husserl’s account of inner time consciousness is derived from human experience,

it is possible that his account of its tripartite structure defines a more general condition

of lived experience as such. It could therefore potentially hold for other living beings as

well. In other words, if we accept that depth perception can be grounded in tripartite

temporality, then we have managed to recover the possibility of such perception even

for non-social animals, as long as their lived existence is characterized by this kind of

temporality. To be sure, in the phenomenological literature the status of temporality for

non-human animals remains ambiguous and controversial (e.g. Buchanan 2007; Hayes

2007). From the perspective of the enactive paradigm, however, there are reasons to be

optimistic. For instance, there has been work by Varela (1999) and van Gelder (1999)

which links the phenomenology of time consciousness with a particular organization of

neural dynamics. Accordingly, we can hypothesize that animals with nervous systems

that are organized in a manner sufficiently similar to ours are also embedded in a

tripartite temporality. More fundamentally, as Jonas (1996) has argued at length, there

are good reasons to claim that even metabolic forms of life have some of the existential

credentials characteristic of human beings. The only time when a living being coincides

with itself is when it has died. Life is essentially a temporal form of existence, as its

essential form is only maintained by continuous material change. More precisely, the

satisfaction of metabolic needs is necessarily a precarious affair which creates a need to

project toward future possibilities on the basis of past events. Whether this primordial

temporality of life formally matches Husserl’s description of human tripartite time

consciousness would require further work. In any case, there is a strong possibility that

the problem of perceptual presence faced by the sensory-motor approach can be

resolved in terms of temporal integration without appeal to open intersubjectivity,

perhaps even for non-human forms of life.

Have we gone too far? It appears that our argument has undermined the phenomenology

of intersubjectivity in preference for a solution that gives support to methodological

individualism. To be sure, with the appeal to the temporal structure of experience we

have indicated how it is possible to retain the insights of the sensory-motor approach to

perceptual presence without requiring that other subjects must potentially be present as

others. But where does this leave the constitutive role of open intersubjectivity? This is

where the second step comes into play. We claim that the aspect of perception which is

intersubjectively co-constituted is a certain kind of detachment, an attitude of ‘taking

as’, which enables us to specifically perceive something as something. For example, I

don’t just see the wall from the perspective of my own current concerns (e.g. as trapping

me inside the room) but also from the perspective of alternative concerns including

those others might have (e.g. as blocking the view, making parking difficult, etc.). These

alternative perspectives on the situation need not be actually thematized; their

potentiality as legitimate concerns makes me see the wall as ‘a wall’ that is independent

of my current perspective on it, i.e. as an object in the strict sense of the word. Here

we need to appeal to the possible presence of others as others in order to account for the

existence of these potential co-existing perspectives of concern, as well as the

corresponding relativization or de-centering of our own current perceptual perspective.

On this phenomenological view, it is open intersubjectivity which co-constitutes the

characteristically human de-centered presence, a presence for which the sensory-motor

constitution of spatiality is a necessary but not sufficient condition. This appeal to the

constitutive role of others as a means of going beyond the limitations of individual

sensory-motor knowledge has occasionally been recognized in sensory-motor approaches

to psychology (e.g. Piaget 1967), but is still lacking in cognitive science. We will return

to this ‘de-centering’ presence of the other in the next section, in the context of a more

detailed phenomenological investigation of how we perceive others.

12.2 The phenomenology of intersubjectivity

How precisely do we encounter others as others? In orthodox cognitive science this

‘problem of other minds’ is usually addressed according to the computationalist sense-

model-plan-act schema: (i) we first perceive a set of physical facts, such as material

bodies and movements; (ii) these facts are the input for some kind of cognitive

processing, such as inference or simulation, thereby producing a representation of the

other’s mental states; (iii) that representation allows us to plan how to act in a socially

appropriate manner; and (iv) we respond by executing that plan with motor commands

as best as possible (for a critical analysis of inference and simulation based approaches,

cf. Gallagher 2001).
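
The four steps of this schema can be made concrete with a minimal sketch in Python. The sketch is purely illustrative and is not taken from any of the theories cited here: the names (PhysicalFacts, sense, model, plan, act) and the toy inference rules are hypothetical and were chosen only to show the strictly sequential flow, in which the meaning of the other's behavior first appears at step (ii) as an attribution added to meaningless physical input.

```python
from dataclasses import dataclass


@dataclass
class PhysicalFacts:
    """Step (i): bare bodily facts, with no meaning attached (hypothetical type)."""
    arm_extended: bool
    hand_open: bool


def sense(raw: dict) -> PhysicalFacts:
    # (i) Perceive a set of physical facts about the other's body.
    return PhysicalFacts(arm_extended=raw["arm_extended"], hand_open=raw["hand_open"])


def model(facts: PhysicalFacts) -> str:
    # (ii) Infer a hidden mental state from the physical facts (toy rules).
    if facts.arm_extended and facts.hand_open:
        return "wants_to_shake_hands"
    if facts.arm_extended:
        return "is_pointing"
    return "unknown"


def plan(inferred_state: str) -> str:
    # (iii) Choose a socially appropriate response to the inferred state.
    responses = {
        "wants_to_shake_hands": "extend_own_hand",
        "is_pointing": "look_where_pointed",
    }
    return responses.get(inferred_state, "wait")


def act(planned_action: str) -> str:
    # (iv) Execute the plan as a motor command.
    return f"motor:{planned_action}"


def sense_model_plan_act(raw: dict) -> str:
    # The four stages run strictly in sequence; nothing flows back from
    # acting to sensing within a single pass.
    return act(plan(model(sense(raw))))
```

Note that in this sketch the interaction itself plays no role: the other's body is only a source of input data, which is precisely the feature of the schema that the phenomenological critique below calls into question.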

Though there are some essential differences between the competing theories of social

cognition in mainstream cognitive science, they are in agreement that the meaningful

expressions of the other are essentially a secondary attribution to the primary perception

of merely physical circumstances. This assumption is a direct outcome of what we have

called methodological physicalism, and has already been criticized extensively by more

phenomenologically oriented theorists. What is most peculiar about this assumption is

that it does not match what is given in our experience: we directly perceive the other as

another subject in its own right without having to engage in inference or simulation (cf.

Gallagher 2008b; Zahavi 2001). Moreover, if we accept that part of what it means to

experience an objective world is that we encounter it as a shared existence for other

subjects, i.e. the world is experienced as our common world (Hua I/123; cf. Zahavi

1996, pp. 25-26), then the traditional approach, based as it is on an objectivist premise,

presupposes what it sets out to explain. The attempt to reduce the presence of others to a

combination of physical facts is doomed to failure because the condition of possibility

for those facts includes the presence of others. In brief, it is impossible to conceive of

objectivity without positing intersubjectivity at the same time.

However, the claim that others are immediately given in our experience should not be

misunderstood as asserting that we have direct access to another’s experience. On the

contrary, a crucial element that defines the other is its peculiar ‘otherness’; otherwise

the phenomenon of intersubjectivity would be logically inexplicable. In the words of

Husserl: “if what belongs to the other’s own essence were directly accessible, it would

be merely a moment of my own essence, and ultimately he himself and I myself would

be the same” (Hua I/139). To be sure, to say that the other is defined by his otherness is

not enough. After all, all real objects of our experience, including self and world, are

characterized by a certain type of transcendence in relation to the constituting subject.

Moreover, we can even find structures of otherness (alterity) within the constituting

subject itself (cf. Zahavi 1999). What is special about the transcendence of the other is

that the other necessarily eludes our grasp in a unique manner.

While it is true that objects also elude our grasp, i.e. they are necessarily only given in

profiles that never exhaust the constituted object as such, we can still posit their identity

as an ideal limit point, which we pre-reflectively understand through our mastery of

sensory-motor engagement with them 40 . Another subject, in contrast, is always prone to

change in such a way that it escapes any attempt at grasping its identity in the form of a

simple object perception. As long as the other remains an autonomous subject in its own

right, there is always the possibility that the affordances of interaction change in

surprising and unexpected ways. In relation to the transcendence of things we can say

that “the real lends itself to unending exploration; it is inexhaustible” (Merleau-Ponty

1945, p. 378). To be sure, this is also the case for our encounters with other subjects, but

there it runs into what we could call a meta- or second-order transcendence: others do

not only lend themselves to unending exploration, they also spontaneously lend

themselves to unending explorations of different styles of unending exploration.

This insistence on the radical otherness of the other might at first seem like a minor

technical point, but it actually is at the heart of why a consideration of intersubjectivity

is so important for any adequate account of human cognition. It is only in the case when

the subject encounters the particular transcendence of the other that we can say that its

immanent sphere of ‘ownness’ is surpassed toward a shared world of objects (Hua

XIV/442). In this way we have turned the traditional problem of other minds on its

head: “the otherness of ‘someone else’ becomes extended to the whole world, as its

‘Objectivity’, giving it this sense in the first place” (Hua I/173). We thus find that the

40 “The ipseity is, of course, never reached: each aspect of the thing which falls to our perception is still

only an invitation to perceive beyond it, still only a momentary halt in the perceptual process. […] What

makes the ‘reality’ of the thing is therefore precisely what snatches it from our grasp. The aseity of the

thing, its unchallengeable presence and the perpetual absence into which it withdraws, are two

inseparable aspects of transcendence” (Merleau-Ponty 1945, p. 271). Might we here have the basis for a

phenomenological explanation of Heisenberg’s uncertainty principle?

categories of transcendence, objectivity and reality are intersubjectively constituted, that

is, they can only be constituted by a subject who has experienced other subjects. More

precisely, these categories are co-constituted, as illustrated by the following example of

their application to our understanding of ourselves:

Thus for me the Other is first the being for whom I am an object; that is, the

being through whom I gain my objectness. If I am to be able to conceive of even

one of my properties in the objective mode, then the Other is already given. […]

In experiencing the look, in experiencing myself as an unrevealed object-ness, I

experience the inapprehensible subjectivity of the Other directly and with my

being. (Sartre 1943, p. 294)

Accordingly, even our presence to ourselves as a temporal-spatial object in the world is

a phenomenon that is mediated by the presence of the other (Sartre 1943, pp. 290-291).

Moreover, Husserl claims that the same holds for the categories of immanence,

appearance, and inwardness. It is only when I experience myself as an object under the

gaze of another subject, that I can distinguish my personal inwardness from its public

external manifestation. Similarly, it is only when a subject experiences that the same object

can be experienced by several subjects, and that it is given in various profiles, that the

subject is in a position to realize that there is a distinction between the object itself and

its appearance, its being-for-me (cf. Zahavi 1996, pp. 38-39).

What might lived experience be like for a subject who has not been able to perceive

others as others? While this is extremely difficult to imagine from our socialized and

enculturated perspective, the transformative power of the other is still evident from

within the perspective of our own adult existence, so we can entertain some tentative

comparative reflections. Merleau-Ponty, for instance, observes that “no sooner has my

gaze fallen upon a living body in the process of acting than the objects surrounding it

immediately take on a fresh layer of significance: they are no longer simply what I

myself could make of them, they are what this other pattern of behavior is about to

make of them” (1945, pp. 411-412). This is a good example of how our own sense-making

can be shaped by the presence of another subject. Another fitting example is

Sartre’s description of the experience of being in a public park, facing a lawn with a row

of benches along its edge, when a man happens to walk by those benches:

Thus suddenly an object has appeared which has stolen the world from me.

Everything is in place; everything still exists for me; but everything is traversed

by an invisible flight and fixed in the direction of the new object. The

appearance of the Other in the world corresponds therefore to a fixed sliding of

the whole universe, to a decentralization of the world which undermines the

centralization which I am simultaneously effecting. (Sartre 1943, p. 279)

It is in situations like these, namely when we are prompted to make sense of the world

in relation to the perspective of another subject, that we also become aware of our own

contributions to the structure of our experience, for example the centralization which we

ourselves continually effect on our perceptual world. In this way we can reaffirm that

the de-centered presence, which is characteristic of our existence in the world, is co-constituted

by the presence of the other. It is likely that non-social forms of life exist in

a centralized world that is not given as centralized (because the comparative experience

of sharing the world with another centralizing perspective is missing). We will return to

these comparative considerations in the next section.

This completes the brief review of the phenomenology of intersubjectivity. Of special

interest was how our experience of self and world is shaped and co-constituted by the

possible and actual presence of other subjects. We have seen that this guiding question

touched upon the basic perceptual presence of objects, as well as on the conditions for

notions of objectivity, appearance, inwardness, and transcendence. Our detached and

de-centered presence in the world, which is a condition of possibility for the scientific

attitude, is an intersubjective achievement. On this basis we can now return to the

LMCT and provide some clarification of its key concepts.

12.3 A phenomenologically informed continuity thesis

The main motivation for this thesis was an analysis of the constitutive role of sociality

for mind and cognition in order to support the LMCT as a unifying framework for

cognitive science. We have pursued this goal by drawing on insights developed in two

traditions, namely the enactive paradigm and Husserlian phenomenology. The aim of

this section is to combine the insights of these traditions in a mutually informative

manner, and reformulate the LMCT accordingly.

It should be clear that relating the enactive and the phenomenological traditions in a

fruitful manner is both a challenge and an opportunity. We have argued that the

organizational (or behavioral) approach to the LMCT is not enough, and that we need to

incorporate phenomenological considerations. However, the appeal to the first-person

perspective might cause some resistance in cognitive science, especially in the

cognitivist mainstream, while introducing the enactive approach into current debates in

phenomenology could also be met with some skepticism:

Phenomenologists never conceive of intersubjectivity as an objectively existing

structure in the world that can be described and analyzed from a third-person

perspective. On the contrary, intersubjectivity as a relation between subjects

must be analyzed, as such, from a first-person and a second-person perspective.

(Zahavi 2005, p. 176)

In order to resolve these tensions it is helpful to remind ourselves that we are dealing

with a subjectivity that is embodied and embedded, both of which are characteristics

that can be explored from the perspectives of science and phenomenology. And the

same applies to intersubjectivity as well: “we must consider the relation with others not

only as one of the contents of our experience but as an actual structure in its own right”

(Merleau-Ponty 1960, p. 140). Indeed, the enactive approach to social cognition is well

positioned to provide the phenomenological tradition with a dynamical account of the

interaction process, while the latter can sharpen the sensitivity of the former to the

phenomena that need explaining. Moreover, both traditions are joined in their focus on

the primacy of embodied (rather than linguistic) interactions (cf. Zahavi 2005, p. 176).

It is with respect to the constitutive impact of such embodied interactions that the

enactive approach can be of help to the phenomenological tradition, for example by

relating the changing qualitative presence of the other to the particular dynamics of

changes in ongoing bodily coordination:

[The other’s] autonomy demands frequent readjustments of my individual sense-making.

When interaction and individual intentions coordinate, we feel mutually

skilful to navigate the interaction: we experience a kind of transparency of the

other-in-interaction. But when, for a variety of reasons, a breakdown occurs, and

until a new coordination is attained, we experience the other as opaque. (De

Jaegher & Di Paolo 2007, p. 504)

It is therefore likely that an integrative methodology that combines both dynamical and

phenomenological insights would be mutually beneficial. However, there still remains a

lingering conceptual tension. The phenomenological analysis of intersubjectivity has led

us to argue that “intersubjectivity exists and develops in relation between world-related

subjects, and the world is brought to articulation only in the relation between subjects”

(Zahavi 2005, p. 177). However, this stands in contrast to the basic enactive account of

agency and sense-making, which posits the bringing forth of a world for the adaptive

agent without any mention of the constitutive role of other agents.

We will use this tension to our advantage by letting it force us to become clearer about the

enactive approach to agency and sense-making. The lack of intersubjectivity is not fatal

to the enactive account of these basic notions, but we must be careful that we do not

over-anthropomorphize them. But how should we conceive of the experiential world of

an agent who is incapable of interacting with others as others in their own right? To be

sure, it is likely that it is characterized by some minimal transcendence, since sensory-motor

regularities are partly dependent on environmental affordances, and thus

inherently escape the agent’s grasp to some extent. Moreover, we have argued that even

basic sensory-motor skills will give rise to the perceptual presence of complete objects

without the need to appeal to open intersubjectivity, namely due to the intrinsic tripartite

temporal structure of lived experience. Nevertheless, without the relativizing and de-centering

presence of the other as other, this kind of alterity of the world is likely to

remain undifferentiated from the alterity already encountered within the isolated subject

itself (i.e. the opacity of the agent’s relation to itself). Thus, whereas Jonas claims that

“inwardness is coextensive with life” (1966, p. 58), a position which has been

influential in the recent development of the notion of sense-making (cf. Weber & Varela

2002; Di Paolo 2005; Thompson 2007), this is no longer a precise enough description of

the phenomenon of life. As Husserl remarks in the case of an imagined solitary human:

For the human being who has not undergone the experience of empathy, or from

the standpoint of the abstraction from any empathy, there is no “inwardness” of

an “externality”; such a human being would have all of the lived experiences –

and all of the objectivities, of whatever sort – that are included under the title of

inwardness, but the concept of inwardness would be lost. (Hua XIII/420; quoted

by Zahavi 1996, p. 39)

The solitary basic agent is thus best described as experiencing an ‘inwardness’, but an

inwardness which is not experienced as an inwardness. This agent would still be a

center of needs and concerns embedded in a meaningful context related to its particular

circumstances and viability constraints, as described by Jonas, but this differentiation as

a center for a world is not a structure of its experience as such (cf. Heidegger 1929).

What we have in this case is the ‘organism-Umwelt’ dyad that is so well described in the

work of the biologist von Uexküll (1934). But in order for the distinction between

‘inner’ and ‘outer’ to become present in experience as such it is necessary that we are

dealing with an intersubjectively constituted form of life, one which unfortunately does

not get addressed by Jonas (1966) 41 .

On this view, it is only with a certain process of socialization that there is a possibility

of making sense of one’s existence as a sense-making existence. Thus, if we want to

talk about the kind of sense-making activities that are involved in enacting the physical

world which we experience from a detached human perspective, then appealing to a

simple ‘organism-Umwelt’ dyad is not sufficient. Following the phenomenological

tradition, we first have to elucidate the human condition in terms of a ‘self-world-other’

triad. This is because the subject-object dichotomy, which is at the heart of the

theoretical attitude that makes abstract knowledge possible, is only an idealized and

41 Indeed, considering that all known forms of life are closely interconnected in various kinds of

networks, e.g. the relationships of predator and prey, mating partners and rivals, symbiosis, and the whole

ecosystem context, the social nature of life is a striking omission in Jonas’ bio-phenomenology.

derivative perspective which requires intersubjectivity as its necessary foundation. The

central question that needs to be addressed by the LMCT, therefore, is not how to get

from the basic ‘organism-Umwelt’ to the human ‘self-world’ structure, but rather how to

get from the former to a ‘self-world-other’ structure. And then, only on the basis of this

transition, is it possible to determine the conditions for the emergence of the subject-object

dichotomy which has been mistakenly taken as the primary epistemic attitude by

mainstream cognitive science. Nothing specific has been said about the conditions for

this dichotomy in this thesis, but there is no reason to believe that the LMCT cannot

also accommodate this final transition. Indeed, it appears that Heidegger’s (1927) claim

that this transition is caused by break-downs in ongoing coping can be addressed in the

framework of enactive cognitive science, and is perhaps amenable to evolutionary

robotics modeling (cf. Di Paolo & Iizuka 2008).

13 Toward an enactive approach to culture

Culture is a rich and important topic for the enactive approach that has only recently

begun to be addressed (e.g. Thompson 2007; Steiner & Stewart 2009; Stewart, in press;

Froese 2009). In Chapter 4 we ventured for a moment into the debate about culture by

analyzing Steiner and Stewart’s (2009) emphasis on the essential heteronomy of cultural

values. In brief, the idea is that in order to undergo enculturation the living subject has

to constrain its own behavioral autonomy in order to appropriate the pre-existing social

practices that already form an established context of cultural normativity. Paradoxically,

only through the incorporation of these heteronomous constraints does the subject

eventually gain an increase in autonomy that reaches beyond the acquired skills on the

socio-cultural stage and enables the expansion of individual behavioral abilities. This

dialectic of autonomy/constraint, which Jonas (1966) founded on the emergence of life

and traced throughout the later transitions in evolution, thus finds its expression once

more during enculturation. In effect, the process of becoming a part of a culture appears

to be a more specific form of social learning whereby an especially large body of pre-established

practices ends up being incorporated.

However, a simple emphasis on the continuity between sociality and culture should not be the

final word on this matter. Indeed, this thesis would not be complete without at least a

cursory discussion of how the enactive approach could also address the perplexing

phenomenon of human culture. In other words, while these first steps toward an

enactive approach to cultural cognition are promising in the sense that they appear to

follow the same bio-logic that we applied to our analysis of sociality, there still remains

one major worry: the specificity of human culture. Thus, even if we accept the relatively

controversial claim that there are legitimate ways for us to attribute culture to other

species (cf. Byrne, et al. 2004), it nevertheless remains to be explained why there is

such a surprising diversity and prevalence of human culture. Can the enactive paradigm

perhaps shed some light on this mystery?

This chapter attempts to respond to this question in a twofold manner. First, it presents a

preliminary critical analysis of the mainstream’s answer to the challenge of explaining

humanity’s cumulative cultural development. The results of this analysis point to some

significant problems that are derived from the mainstream’s uncritical assumption of

methodological individualism (cf. Chapter 5) and methodological physicalism (cf.

Chapter 11). Second, throughout this analysis the enactive paradigm is presented as a

favorable position from which to move the debate forward. More precisely, it can

provide a fresh perspective on cumulative cultural development by diagnosing the

origins of the traditional framework’s shortcomings, and also by offering some new

paths for future research. Of course, at this stage much of this work remains speculative,

and it is offered here merely as an opportunity for stimulating further debate.

13.1 The ‘ratchet effect’

One popular way of explaining the mechanism of cumulative cultural development is in

terms of the “ratchet effect” (Tomasello, et al. 1993): a combination of faithful imitation

and creative innovation. The basic idea is that when a practice is first invented by an

individual it can be quite primitive, but when it is replicated by others they might also

make small improvements, which are then in turn copied and potentially improved by

others, and so on over historical time. In this process the source of innovation can be as

simple as trial and error, and imitation apparently only requires accurate copying of the

physical movements of the other. Note that these two factors appear to be so basic that

they should also be within the behavioral capacity of most non-human social animals.

However, cumulative cultural development is arguably a phenomenon that is unique to

humans (Tomasello 2001). What explains this discrepancy?

It might be thought that other animals lack the creative capacity for innovation, but this

hypothesis is not supported by the empirical evidence: “Perhaps surprisingly, for many

animal species it is not the creative component, but rather the stabilizing ratchet

component, that is the difficult feat” (Tomasello 1999, p. 5). Indeed, it turns out that

many non-human animals, including our closest primate relatives, have great difficulty

in copying the precise physical movements of others. The traditional explanation of this

incapacity is that imitative learning is “made possible by a very special form of social

cognition, namely, the ability of individual organisms to understand conspecifics as

beings like themselves who have intentional and mental lives like our own” (Tomasello

1999, p. 5). And, so the rest of the argument goes, since (i) this intentional stance is a

type of social understanding that non-human animals supposedly lack, and (ii) imitation

is a necessary prerequisite for the emergence of the ratchet effect, we appear to have

found a potential biological factor that can explain why human beings have given rise to

cumulative cultural development while other species have not.

As might be expected, from the point of view of the enactive paradigm this explanation

is unsatisfactory in several respects. Even if we assume that something like the ratchet

effect is the driving mechanism behind cumulative cultural development, neither of its

two essential components can simply be accepted at face value. However, since this

thesis is primarily concerned with sociality, the following discussion will not include an

analysis of what is required for the ability to innovate and will focus only on the origins of

the stabilizing component (i.e. imitative learning) 42 . In brief, the central problem of the

mainstream position is the unquestioned assumption of what we have diagnosed as

‘methodological physicalism’. This assumption has created an explanatory blind spot

because our ability to attend to others in terms of abstract physical properties has been

taken for granted. In other words, perception is understood as a form of information

processing whereby mental representations of the physical world (including the bodies

of others), i.e. of the world as it is described by physics, are made available in the mind

for further processing by other cognitive processes. It follows that the mainstream

position’s methodological physicalism leaves individual deficits in social understanding

as the only valid explanation for a lack of imitative ability.

In contrast to this position a critical analysis of some of the key empirical evidence from

the perspective of the enactive approach lends support to three claims: (i) for most

animals the default mode of perceiving others is to perceive them directly in terms of

goals, intentions and general mental attitude, (ii) what makes humans special is their

capacity to override this default mode of perception by attending to others in terms of

their abstract physical properties, and (iii) it is our capacity to override the default mode,

42 There certainly remains a story to be told about how to ground creativity in the notion of autonomous

agency. However, the intrinsic openness of an autonomous system to change its own organization in

relation to its current structure and history of interactions already gives us a good starting point for this

endeavor. Indeed, it has already been suggested that autonomy may be at the heart of creative play among

animals, and thus partly responsible for their higher cognitive functions (Di Paolo, et al. in press).

rather than other animals’ supposed lack of social understanding, which explains why

we are better at imitative behavior, and therefore why specifically our species has given

rise to cumulative cultural development.

In the rest of this chapter we will spell out these three claims in more detail. In

particular, we will evaluate important experimental evidence from primatology as well

as developmental and social psychology from the perspective provided by the enactive

paradigm. This critical analysis will give empirical support to the claims and raise some

questions for future research. The chapter concludes with brief reflections on some

evidence in evolutionary anthropology.

13.2 Primatology

Just a decade ago it was widely believed that non-human primates did not understand

the intentions of others (cf. Tomasello 1999), but in recent years this consensus has been

subjected to drastic revisions even though the form that a new hypothesis should take is

not entirely clear (Tomasello, et al. 2003). Here we will suggest that the empirical

evidence of primatology supports our phenomenological starting point, and that other

primates also experience others primarily in terms of their goals, intentions and a

general mental attitude. As a first step toward the establishment of this new hypothesis

we must question the validity of previous findings that appeared as evidence to the

contrary. Why has it taken researchers so long to realize that non-human primates have

the capacity to understand others as other intentional beings?

First of all, we must consider the psychological state of the lab animals, who often

suffer from profound traumatic experiences. Accordingly, we should always be careful

not to over-generalize any empirical findings in this area of research (cf. Racine, et al.

2008). This is especially true of negative results, which might instead be the result of

social withdrawal and other individual deficits contingent on a rather unnatural

developmental history, as well as the general arbitrariness of lab experiments from the

perspective of those being tested. For example, the reason why studies which required

chimpanzees to follow a communicative sign to the location of food resulted in negative

evidence could be that this kind of gesturing is something that they would never do in

the wild and therefore failed to understand (Tomasello, et al. 2003). Indeed, it should

thus come as no surprise that implementing experimental situations in more ecologically

plausible ways has led to more positive results of social understanding (cf. Call &

Tomasello 2008) 43 . Similarly, as would be expected, enculturated apes are better at

coping with experimental settings, probably because their social understanding is more

akin to that of the experimenters (e.g. Savage-Rumbaugh, et al. 2001). We should also

not forget the subtle but prevalent power structures impinging on the animals that spend

their life under lab conditions. If the social and cognitive identity of the animal has been

significantly impaired (i.e. a kind of mental ‘death’), then we should expect to find little

evidence for sociality. Thus, we follow De Jaegher and Di Paolo (2008, pp. 38-39) in

insisting that the interactors must be autonomous agents in some broad sense so as to be

even capable of engaging in social interactions. In sum, it cannot be emphasized enough

that the practice of using a lab animal’s unresponsiveness to social cues as the basis for

denying social abilities to its species as a whole is simply unacceptable.

Second, we can identify a strong tendency toward negative interpretative bias. This is at

least implicitly acknowledged by Tomasello and colleagues who, as if almost doubting

their own findings, assure the reader that their “studies show what they seem to show,

namely, that chimpanzees actually know something about the content of what others see

and, at least in some situations, how this governs their behavior” (Tomasello, et al.

2003, p. 155; emphasis added). How does this apparent negative bias manifest itself in

other experiments? For example, it has been argued in a well-known study by Povinelli

and Eddy (1996) that because some apes were found gesturing (i.e. begging for food) to

‘blind’ experimenters (i.e. who were wearing buckets over their heads), they cannot be said to

perceive others as intentional beings. However, it has also been shown that congenitally

blind humans gesture normally, even when they speak to another blind listener alone

(Iverson & Goldin-Meadow 1998). Strictly speaking, we must therefore either insist that

blind humans also have no understanding of others as intentional beings, or treat

Povinelli and Eddy’s empirical data as it is, namely deeply inconclusive. As it turns

43 Human beings may also appear as socially incapable when confronted with unfamiliar situations, an

experience many of us have had in the context of foreign cultures. For example, in the Philippines it is

common practice to indicate the location of a joint attentional target by pointing with the lips or the

mouth, a gesture whose intended meaning might be lost on a foreigner, if it is noticed at all.

out, this latter interpretation is even supported by the evidence presented by Povinelli

and Eddy (1996). They had also found that chimpanzees did discriminate situations in

which one human was facing them and another had her back turned.

The unfortunate prevalence of developmental disorders, unnatural experimental settings,

and a strong negative interpretative bias can help us to explain some of the negative

findings of primatology, but they do not cover all relevant cases. Consider for example

an experimental study by Tomasello and colleagues (1997) in which a chimpanzee was

removed from her group and taught arbitrary signals by means of which she obtained

desired food from a human. When she was eventually returned to the group and began

to use these newly learned gestures to obtain treats from the experimenters while in full

view of the other chimpanzees, there was not one observed instance of others attempting

to imitate the effective gestures. This lack of imitation is especially surprising since the

rest of the chimpanzees were observing the gesturer in action and they were themselves

highly motivated for the food. The conclusion proposed by Tomasello and colleagues

was that chimpanzees do not understand each other as intentional agents, since this is a

necessary pre-requisite for imitation and the chimps failed to exhibit such behavior.

But our phenomenologically clarified starting point offers a different explanation: what

if the other chimpanzees did understand the trained individual to be gesturing for food,

but they failed to attend to the physical manner in which this gesture was realized? In

other words, they perhaps perceived the expressiveness of the gesture as a goal-directed

action aimed at obtaining some desired food, but they could not abstractly attend to the

particular physical movements by which this gesture was embodied, and therefore failed

to imitate the gesture correctly. Thus, if we bracket the assumption of methodological

physicalism, it becomes conceivable that the problem that is faced by the chimpanzees

is not a failure to understand the intentions of the others, but rather the inability to adopt

a detached perspective which could help them to reveal aspects of the other’s presence

in terms of its ‘mere’ physical properties. In this way we have reversed the traditional

epistemic hierarchy such that directly perceiving others as expressive, intentional beings

is the primary experience, and the capacity of perceiving others as physical objects is a

secondary, additional achievement.

We are thus led to a reversal of the guiding questions of this field. Rather than being

faced by the ‘problem of other minds’ we are now confronted with the ‘problem of

other bodies’: what are the evolutionary and developmental origins of our ability to

attend to the presence of others in terms of their abstract physical properties? In order to

begin answering this question we can turn to the evidence of child development.

13.3 Developmental and social psychology

If we want to better understand the origin of our ability to abstractly attend to the

physical properties of other subjects, it is useful to consider work on neonate imitation

because this latter ability might presuppose some capacity for the former. In Chapter 6

we already discussed the case of neonate imitation as an interesting example of bodily

coordination because it can take place even without knowledge of a visually formed

body image (e.g. Meltzoff & Moore 1977). But did we not suggest that such imitation

demonstrated the effective role of the interaction process for organizing an individual’s

behavior? Is the ability to adopt a detached perceptual attitude thus even a necessary

postulate to explain these results?

A closer look at the experimental protocol reveals that the authors indeed invested a

considerable amount of effort to make sure that elements of social interaction could not

be an explanatory factor in the emergence of imitative behavior. For instance, the adults

were instructed by the experimenters to perform gestures at equally long intervals,

separated by a neutral face that was unresponsive to the infant’s gestures. The idea was

to show that neonates were capable of imitating gestures, as well as even improving

these gestures, without social feedback about how well they were doing. Of course, if

social feedback was really such a confounding factor, then we can hypothesize that they

would perform even better within an appropriate social context. Moreover, it is very

likely that the non-social experiments were conducted within a larger social context,

though not much is said of this in the papers on neonate imitation. For example, the

baby probably has to be sufficiently animated and engaged by the experimenter so that

it is inclined to direct its attention appropriately and to concentrate on the relevant

events of what is going on around it.

The problem is that if we cannot even appeal to such a minimal socially mediated

context in order to explain neonate imitation, then the results appear to be in conflict

with those of the ‘double TV monitor’ experiments by Murray and Trevarthen (1985),

Nadel, et al. (1999) and others (cf. Chapter 8). There it was demonstrated that if infants

are faced with a ‘social’ interaction that lacks social contingency, i.e. where the other

participant is not responsive to their gestures, they become distressed and/or removed.

Though the double TV monitor experiments lack a temporal analysis of the progression

of infant behavior during the replay condition, we can hypothesize that there is a short

period during which infants are still responsive, and that this period is sufficiently long

to make Meltzoff and Moore’s studies possible. Nevertheless, since Meltzoff and Moore

are essentially interested in demonstrating the ability of neonates to imitate in situations

that lack social contingency, it might be an appropriate challenge for future research to

combine the two experimental paradigms so as to test the response of the neonates to a

video playback of the experimenter’s gestures. If the outcome in this control condition

is significantly different then we would have confirmed our suspicion that even in the

original study there might have been some subtle social contingency at play that was

affecting the behavior of the infants.

Be this as it may, let us assume for the sake of argument that it is indeed possible to get

young infants to imitate gestures of an unresponsive ‘partner’, especially if there was a

preparatory period of social interaction beforehand and the experiment itself does not

last too long. In other words, we take it as given that human infants are well predisposed

to displaying imitative behavior and that this capacity can be expressed to some extent

even without the need for an ongoing interaction process. Moreover, we assume that the

internal enabling conditions for this imitative ability are much less developed or lacking

in other primate species. Nevertheless, at this point it needs to be emphasized that, of

course, these assumptions are highly speculative and that more research remains to be

done. Moreover, it is clear that simply shifting the burden of the explanation to some

kind of innate bias that separates human infants from the young of other primate species

does not accomplish much in terms of our understanding of the phenomenon. Let us

therefore turn our attention to an investigation of the kind of situations in which the

imitative behavior of human infants is typically expressed.

To begin with we can note that there is empirical evidence which suggests that whereas

chimpanzees are more likely to reproduce the target goal of a demonstration but not the

particular means of attaining that goal (a form of behavior sometimes called ‘emulation

learning’), human children typically imitate the demonstrator’s actual physical actions

(e.g. Call, et al. 2005). The traditional explanation of this behavioral difference, as we

might already expect, is that non-human primates simply cannot understand others as

being intentional subjects like themselves. But this explanation is awkward given that

(i) nothing is said about why the capacity to adopt an intentional stance should lead to

imitation rather than emulation (i.e. it might be necessary but not sufficient for the

former option), and (ii) both chimpanzees and human children appear to understand

equally well the goal of the demonstrator’s unfolding behavior. Therefore, a much more

parsimonious explanation is that both understand the other in terms of goal-directed

behavior, but that human infants appear to have the additional capacity to pay attention

to the particular means of how this goal is achieved. In other words, it seems that human

infants can more easily attend to others in terms of their abstract physical properties and

then make use of this additional information. Can we specify more precisely under what

circumstances the infants make use of this ability?

First of all, it is important to note that it has indeed been shown that infants are not

simply imitators of mindless physical movements, but are actually able to understand

the intentions of others as well. In a well-known study by Meltzoff (1995), for example,

it was demonstrated that 18-month-old children could infer the adult’s intended act by

watching failed attempts, since they proceeded to complete the intended act themselves.

Moreover, compelling evidence suggests that cases in which infants directly imitate the

means of another’s goal-directed behavior can be explained by the fact that the other’s

action appeared to be arbitrary under the given circumstances (cf. Gergely, et al. 2002).

In other words, if a performed action does not appear to be contingent on any factors

that are relevant to the current situation, the performance of that particular action might

nevertheless be a necessary means of achieving the goal, albeit a means which the infant

does not (yet) understand. Gergely and colleagues argue that in such ambiguous

situations it is more rational for the infant to imitate the action, since it is better to err on

the side of safety, but otherwise it is more appropriate to simply select what appears to

be one’s best available means for achieving the goal (emulation). Indeed, it has been

shown that even infants as young as 12 months understand a demonstrator’s action in

terms of the meaning these choices have to achieve the intended goal, and that they use

this understanding when deciding whether or not to imitate particular aspects of that

action (Schwier, et al. 2006). We therefore suggest that human infants primarily make

sense of others as being agents like themselves, and only resort to an abstract evaluation

of the other’s physical movements when their primary sense-making of the other agent

remains inconclusive about that other’s particular intentions.

13.4 Evolutionary anthropology

In the previous section we have presented some empirical evidence which nicely

complements the phenomenologically informed critique of methodological physicalism

that we developed in Chapters 11 and 12. It is interesting to recall that at the end of that

phenomenological analysis we were led to suggest that the human ability for abstraction

is perhaps based on a socially mediated form of object perception, i.e. object perception

that has been decentered by open intersubjectivity. Interestingly, this appears to place us

into a typical ‘chicken and egg’ dilemma: (i) we have argued that detached object

perception is an outcome of (adult) human intersubjectivity, and (ii) we have accepted

the claim of Tomasello and colleagues that human culture requires imitation, which is a

behavior that arguably depends on detached object perception itself. It therefore appears

to be impossible to say which of the two came first!

Fortunately, the enactive paradigm is well positioned to transform this apparent

vicious circle into another one of its ‘creative circles’ (cf. Varela 1984). Indeed, the

modeling experiments that were presented in Chapters 7 to 10 have already shown the

possibility of an inter-individual interaction process that simultaneously enables and is

enabled by the individual behavior. The relationship between imitation and sociality

might therefore be another example of a co-dependence that bootstraps itself into

existence. We can now reformulate the question of cumulative cultural development in

enactive terms: Could the crucial difference between chimpanzees and humans perhaps

be that the latter are somehow better at providing the conditions of emergence for this

autonomous process of enculturation? While it is beyond the scope of this thesis to

provide an adequate response to this question, we can nevertheless submit it as a novel

working hypothesis for the enactive paradigm. At least the rest of this thesis has already

begun to provide the theoretical, mathematical and phenomenological framework which

could be the basis for a more systematic investigation. Finally, we will simply conclude

this chapter by highlighting some points of interest related to evolutionary anthropology

which could be potential starting points of this future research.

Let us begin by considering what is entailed by our newly developed understanding of

imitative behavior. We have argued that the phenomenological standpoint commits us to

the view that perception of others occurs primarily in terms of meaning and intentions,

and that perception of others in terms of their abstract physical properties is a secondary

achievement. This perspective has allowed us to propose a novel interpretation of some

crucial evidence in primatology and developmental studies, and led us to replace the

‘problem of other minds’ with the ‘problem of other bodies’. Note that this enactive

reformulation of the central problem has important consequences for our understanding

of hominid evolution. Thus, when investigating the historical beginnings of cumulative

cultural development, we are no longer looking for the origin of the capacity for ‘mindreading’,

but rather for the emergence of our ability to specifically attend to the abstract

physical properties and movements of others’ bodies in relation to the world.

In fact, changing our perspective in this manner throws up new puzzles for evolutionary

anthropology because the ability to perceive others in terms of their intentions clearly

has adaptive benefits. For example, it is more flexible in that it enables an understanding

of others not only in “previously observed or highly similar situations but also in novel

situations” (Call & Tomasello 2008, p. 187). Conversely, it can even be maladaptive to

copy the precise movements of someone else, especially if the purpose of the action is

beyond one’s understanding (i.e. it might be idiosyncratic and unnecessary or even plain

wrong and dangerous). Nevertheless, we have seen that there is strong evidence that this

is precisely the typical behavior of young infants. We can therefore ask: what adaptive

benefits could be associated with the ability to attend to others in terms of the abstract

movements of their physical bodies?

One interpretation of the empirical evidence in infant studies is that imitation is the

rational choice of means when the reasons for the other’s particular action are unclear in

relation to the current circumstances and the target goal (cf. Gergely, et al. 2002). But

we have already noted that imitation is not always beneficial. So the question is: what is

the context that enables this to be a rational choice in most cases? To be sure, imitative

behavior is advantageous in situations which often involve arbitrary actions of others,

and where it is important to attend to these actions in detail. A paradigmatic example of

such a situation is a complex symbolic context, in which the meaning of actions is fixed

by historical convention alone. In other words, physical imitation highly out-competes

emulation when trying to interact with others in terms of language. It should therefore

come as no surprise that Homo sapiens appears to be the only primate species which is

capable of accomplished vocal imitation (Fitch 2000). However, this brings us back to

the creative circularity of enculturation because we can now ask: is our ability to imitate

an evolutionary outcome or an enabling condition of this linguistic context?

We can get a better handle on this question by considering the evidence of comparative

psychology. For example, it has recently been demonstrated that at least enculturated

chimpanzees are also capable of performing imitative behavior. Like the young infants

in Gergely et al.’s (2002) study, they appear to have a similar understanding of the

rationality of others’ intentional actions in relation to the goal of the task, and they can

use this understanding to evaluate when it is better to imitate an action rather than to

emulate it (Buttelmann, et al. 2007). This study is noteworthy in two important respects:

(i) it provides additional evidence that chimpanzees can understand others as intentional

beings like themselves, which is what we already would expect, and (ii) it demonstrates

that they can also attend to the abstract physical movements. Most importantly, it turns

out that the chimpanzees make use of this ability for abstraction only in situations when

the other’s reason for choosing a particular means of acting toward the intended goal

has remained ambiguous. Thus, the fact that chimpanzees do not appear to imitate like

human infants when observed in the wild, but are spontaneously able to do so when they

are appropriately embedded within a human cultural background, suggests that human

imitation might have historically originated primarily as part of a developmental rather

than an evolutionary process. Of course, it is possible that the prevalent existence of this

special context put additional selective pressure on individuals to develop the ability to

abstract and imitate as quickly as possible. In other words, the possibly innate ability of

human neonates to imitate facial gestures might thus be a result of the Baldwin effect

(Baldwin 1897) rather than a cause of cumulative cultural development.

This enactive approach to culture lends itself to another speculation. At a certain point

in our history, when the ability to attend to the movement of others abstractly and the

concomitant capacity for imitation were developed in a sufficiently detailed manner, new

complex practices could begin to be preserved in a precise manner. In other words, we

suggest that it was through this refined combination of abstraction and imitation that a

social medium could first be established. This medium has at least three important

properties: (i) its manifestation is concrete because it is effectively embodied in the

behavioral practices of the individual participants; (ii) its structure is arbitrary because

these practices are essentially contingent on historically determined conventions; and

(iii) its existence is independent of the particular set of individuals which momentarily

manifest it concretely through their arbitrarily structured behavior. All of these factors

potentially entail qualitative changes in the historical trajectories that are related to this

medium: (i) its concrete manifestation can give rise to new forms of creative innovation

that are based on the particular properties of the medium itself (Shanon 1998); (ii) its

arbitrary structure enables it to accommodate an open-ended complexity of forms that are

underdetermined by immediate needs; and (iii) its relative independence enables these

forms to become almost self-sufficient.

The combination of all of these factors can potentially help us to explain the origin of

cumulative cultural development. Indeed, together they appear to indicate the historic

point when social processes can for the first time achieve conditions comparable to those

of the biological autonomy of life, namely a kind of ‘needful freedom’ (Jonas 1966) in

relation to their constituent components. Accordingly, here we could have the birth of a

cultural form of autonomy, which from the perspective of its component individuals

would be encountered as a form of heteronomy (cf. Steiner & Stewart 2009). Since

these social processes are largely freed from the material and energetic necessities of

realizing biological autonomy, their autonomous development could be less constrained

than that of living systems. Indeed, this possibility is supported by empirical evidence in

evolutionary anthropology which shows that human cumulative cultural evolution is a

process that begins around 0.3 million years ago, at which point its development

becomes “increasingly autocatalytic” (Ambrose 2001, p. 1752). Of course, it remains to

be seen to what extent the rather conservative concepts of the enactive paradigm, e.g.

autonomy as maintenance of identity and adaptivity as compensation of perturbation,

are adequate for dealing with such a process of developmental becoming. What is it that

drives autonomous systems to be en-active in this radical way?

14 Conclusion

If enactive cognitive science wants its basic notions of autonomous agency and sense-making

to be the foundation for a general theory of mind and cognition, then it needs to

appeal to a strong version of the life-mind continuity thesis. Only if the continuity thesis

is in fact a valid working hypothesis can this bottom-up approach to cognitive science

systematically reject the criticism that it is dealing with interesting, but ultimately

irrelevant, descriptions of biological phenomena. So far, however, proponents of the

enactive paradigm have been plagued by what we have called the cognitive gap, i.e. an

inability to conceive of how the principles applicable to simple organisms can be used

systematically to explain the highest reaches of human cognition. It has been argued that

this impasse is largely due to the methodological individualism that is still prevalent in

cognitive science, and that a proper consideration of the constitutive role of sociality for

agency and sense-making is needed in order to make the life-mind continuity thesis a

viable approach. We therefore have addressed the cognitive gap from a theoretical,

experimental and phenomenological perspective.

In terms of theory, the enactive approach to social cognition is developed in a novel

direction by highlighting the specific manner in which the dynamics of the interaction

process opens up new behavioral domains. In particular, we develop novel definitions of

multi-agent systems and social interaction, the latter of which emphasizes the essential

co-regulation of social acts. This theoretical background provides the motivation for

using an evolutionary robotics methodology to synthesize a set of novel minimalist

simulation models. These are based on actual experiments in social psychology, so as to

promote a mutually informative dialogue between the two disciplines. A detailed

dynamical analysis of these models supports the enactive approach; the behavior of the

agents in a multi-agent system is not an individual achievement alone but rather co-determined

by their mutual interaction and organized effectively by this interaction

process. It is demonstrated that when this interaction process is co-regulated as part of a

social context (rather than just a multi-agent system), the extension to individual

behavioral capacities is even further increased.

These results are complemented by a phenomenological investigation which has

revealed another common assumption in mainstream cognitive science that we have

called methodological physicalism, i.e. the idea that the function of perception is to

furnish the perceiver with information about the abstract physical properties of the

world. It is argued that this assumption prevents a proper appreciation of sociality

because it mistakenly focuses scientific efforts on the ‘problem of other minds’: how

social understanding is possible on the basis of physical facts that are devoid of any

significance. A critical analysis of phenomenological observations, on the other hand,

indicates that the detached perceptual attitude that is characteristic of adult human

perception is essentially an intersubjective and socially mediated ability. Finally, the

systemic and phenomenological insights are combined to provide a novel perspective on

the origins of cumulative cultural development. This perspective suggests a more

coherent interpretation of the available empirical data. It is concluded that the life-mind

continuity thesis is a viable working hypothesis even when accounting for specifically

human abilities, and that an appreciation of the constitutive role of sociality for life and

mind confirms it to be a serious contender for a unified theory of cognitive science.

This thesis has demonstrated that the enactive paradigm can enable us to understand

agency, sociality, and culture in a novel manner by drawing on theoretical, experimental

and phenomenological methods. On this basis it has been possible to dissolve some

outstanding problems faced by more traditional approaches, as well as to formulate new

hypotheses that are open to validation by future empirical experiments. It has been

acknowledged that the novel interpretations that have been suggested for evidence in

social psychology, infant studies, primatology, and evolutionary anthropology still need

to be worked out in more detail, but the outlines of a possible working hypothesis have

been indicated. Indeed, in many ways this thesis has only outlined the beginnings of

what could become an important research program in its own right, by suggesting how

the enactive approach is able to integrate scientific and phenomenological, artificial and

empirical, intellectual and practical traditions into one coherent framework.

15 References

Ambrose, S. H. (2001), “Paleolithic Technology and Human Evolution”, Science, 291, pp. 1748-1753

Anderson, M. L. (2003), “Embodied Cognition: A field guide”, Artificial Intelligence, 149(1), pp. 91-130

Auvray, M., Hanneton, S., Lenay, C. & O’Regan, K. (2005), “There is something out there: Distal

attribution in sensory substitution, twenty years later”, Journal of Integrative Neuroscience, 4(4), pp.

505-521

Auvray, M., Lenay, C. & Stewart, J. (2009), “Perceptual interactions in a minimalist virtual

environment”, New Ideas in Psychology, 27(1), pp. 32-47

Auvray, M., Philipona, D., O’Regan, J. K. & Spence, C. (2007), “The perception of space and form

recognition in a simulated environment: The case of minimalist sensory-substitution devices”,

Perception, 36, pp. 1736-1751

Baldwin, J. M. (1897), “Organic Selection”, Nature, 55, p. 558

Barandiaran, X., Di Paolo, E. A. & Rohde, M. (2009), “Defining Agency: Individuality, Normativity,

Asymmetry, and Spatio-temporality in Action”, Adaptive Behavior, 17(5), pp. 367-386

Barandiaran, X. & Moreno, A. (2006), “On what makes certain dynamical systems cognitive: A

minimally cognitive organization program”, Adaptive Behavior, 14(2), pp. 171-185

Barandiaran, X. & Moreno, A. (2008), “Adaptivity: From Metabolism to Behavior”, Adaptive Behavior,

16(5), pp. 325-344

Beer, R. D. (1995a), “A dynamical systems perspective on agent-environment interaction”, Artificial

Intelligence, 72(1-2), pp. 173-215

Beer, R. D. (1995b), “On the dynamics of small continuous-time recurrent neural networks”, Adaptive

Behavior, 3(4), pp. 471-511

Beer, R. D. (1996), “Toward the evolution of dynamical neural networks for minimally cognitive

behavior”, in: P. Maes, M. J. Mataric, J.-A. Meyer, J. Pollack & S. W. Wilson (eds.), From Animals

to Animats 4: Proc. of the 4th Int. Conf. on Simulation of Adaptive Behavior, Cambridge, MA: The

MIT Press, pp. 421-429

Beer, R. D. (1997), “The dynamics of adaptive behavior: A research program”, Robotics and Autonomous

Systems, 20(2-4), pp. 257-289

Beer, R. D. (2000), “Dynamical approaches to cognitive science”, Trends in Cognitive Sciences, 4(3), pp.

91-99

Beer, R. D. (2003), “The dynamics of active categorical perception in an evolved model agent”, Adaptive

Behavior, 11(4), pp. 209-243

Beer, R. D. (2004), “Autopoiesis and Cognition in the Game of Life”, Artificial Life, 10(3), pp. 309-326

Bitbol, M. (2002), “Science as if situation mattered”, Phenomenology and the Cognitive Sciences, 1(2),

pp. 181-224

Bitbol, M. & Luisi, P. L. (2004), “Autopoiesis with or without cognition: defining life at its edge”,

Journal of the Royal Society Interface, 1(1), pp. 99-107

Boden, M. A. (1996) (ed.), The Philosophy of Artificial Life, New York, NY: Oxford University Press

Boden, M. A. (2006a), Mind as Machine: A History of Cognitive Science, New York, NY: Oxford

University Press

Boden, M. A. (2006b), “Of Islands and Interactions”, Journal of Consciousness Studies, 13(5), pp. 53-63

Bourgine, P. & Stewart, J. (2004), “Autopoiesis and Cognition”, Artificial Life, 10(3), pp. 327-345

Bourgine, P. & Varela, F. J. (1992), “Introduction: Towards a Practice of Autonomous Systems”, in: F. J.

Varela & P. Bourgine (eds.), Towards a Practice of Autonomous Systems: Proc. of the 1st Euro. Conf.

on Artificial Life, Cambridge, MA: The MIT Press, pp. 1-3

Brooks, R. A. (1991a), “Intelligence without representation”, Artificial Intelligence, 47(1-3), pp. 139-160

Brooks, R. A. (1991b), “New Approaches to Robotics”, Science, 253, pp. 1227-1232

Brooks, R. A. (1997), “From earwigs to humans”, Robotics and Autonomous Systems, 20(2-4), pp. 291-

304

Brooks, R. A. (2001), “The relationship between matter and life”, Nature, 409(6818), pp. 409-411

Buchanan, B. (2007), “The Time of the Animal”, PhaenEx, 2(2), pp. 61-80

Buckley, C. L., Fine, P., Bullock, S. & Di Paolo, E. A. (2008), “Monostable controllers for adaptive

behavior”, in: M. Asada, J. C. T. Hallam, J.-A. Meyer & J. Tani (eds.), From Animals to Animats 10:

Proc. of the 10th Int. Conf. on Simulation of Adaptive Behavior, Berlin, Germany: Springer-Verlag,

pp. 103-112

Buttelmann, D., Carpenter, M., Call, J. & Tomasello, M. (2007), “Encultured chimpanzees imitate

rationally”, Developmental Science, 10(4), pp. F31-F38

Buzsáki, G. (2006), Rhythms of the Brain, New York, NY: Oxford University Press

Byrne, R. W., Barnard, P. J., Davidson, I., Janik, V. M., McGrew, W. C., Miklósi, Á. & Wiessner, P.

(2004), “Understanding culture across species”, Trends in Cognitive Sciences, 8(8), pp. 341-346

Call, J., Carpenter, M. & Tomasello, M. (2005), “Copying results and copying actions in the process of

social learning: chimpanzees (Pan troglodytes) and human children (Homo sapiens)”, Animal

Cognition, 8, pp. 151-163

Call, J. & Tomasello, M. (2008), “Does the chimpanzee have a theory of mind? 30 years later”, Trends in

Cognitive Sciences, 12(5), pp. 187-192

Carruthers, P. (1996), “Simulation and self-knowledge: A defense of theory-theory”, in: P. Carruthers &

P. K. Smith (eds.), Theories of Theories of Mind, Cambridge, UK: Cambridge University Press, pp.

22-38

Catmur, C., Walsh, V. & Heyes, C. M. (2007), “Sensorimotor learning configures the human mirror

system”, Current Biology, 17, pp. 1527-1531

Chalmers, D. J. (1996), The Conscious Mind: In Search of a Fundamental Theory, New York, NY:

Oxford University Press

Chiel, H. J. & Beer, R. D. (1997), “The brain has a body: Adaptive behaviour emerges from interactions

of nervous system, body and environment”, Trends in Neurosciences, 20(12), pp. 553-557

Clark, A. (1997), Being There: Putting brain, body, and world together again, Cambridge, MA: The MIT

Press

Clark, A. (2001), Mindware: An Introduction to the Philosophy of Cognitive Science, Oxford, UK:

Oxford University Press

Clark, A. (2003), Natural-Born Cyborgs: Minds, Technologies, and the Future of Human Intelligence,

Oxford, UK: Oxford University Press

Clark, A. (2005), “Beyond the Flesh: Some Lessons from a Mole Cricket”, Artificial Life, 11(1-2), pp.

233-244

Clark, A. (2006), “That lonesome whistle: a puzzle for the sensorimotor model of perceptual experience”,

Analysis, 66(289), pp. 22-25

Clark, A. (2008), Supersizing the Mind: Embodiment, Action, and Cognitive Extension, New York, NY:

Oxford University Press

Cliff, D. (1991), “Computational Neuroethology: A Provisional Manifesto”, in: J.-A. Meyer & S. W.

Wilson (eds.), From Animals to Animats: Proc. of the 1st Int. Conf. on Simulation of Adaptive

Behavior, Cambridge, MA: The MIT Press, pp. 29-39

Cliff, D., Harvey, I. & Husbands, P. (1993), “Explorations in Evolutionary Robotics”, Adaptive Behavior,

2(1), pp. 73-110

Cole, J. (1995), Pride and a Daily Marathon, Cambridge, MA: The MIT Press

Cole, J. (2009), “Impaired embodiment and intersubjectivity”, Phenomenology and the Cognitive

Sciences, 8(3), pp. 343-360

Cole, J., Gallagher, S. & McNeill, D. (2002a), “Gesture following deafferentation: A phenomenologically

informed experimental study”, Phenomenology and the Cognitive Sciences, 1(1), pp. 49-67

Cole, J., Gallagher, S. & McNeill, D. (2002b), “Social cognition and primacy of movement revisited”,

Trends in Cognitive Sciences, 6(4), pp. 155-156

Colombetti, G. (in press), “Enaction, sense-making and emotion”, in: J. Stewart, O. Gapenne & E. A. Di

Paolo (eds.), Enaction: Towards a New Paradigm for Cognitive Science, Cambridge, MA: The MIT

Press

Dennett, D. C. (1978), “Why not the whole iguana?”, Behavioral and Brain Sciences, 1, pp. 103-104

Dennett, D. C. (1984), “Cognitive Wheels: The Frame Problem of AI”, in: C. Hookway (ed.), Minds,

Machines, and Evolution: Philosophical Studies, Cambridge, UK: Cambridge University Press, pp.

129-151

Dennett, D. C. (1994), “Artificial Life as Philosophy”, Artificial Life, 1(3), pp. 291-292

Depraz, N., Varela, F. J. & Vermersch, P. (2003), On Becoming Aware: A pragmatics of experiencing,

Amsterdam, Netherlands: John Benjamins Publishing

De Jaegher, H. (2006), Social Interaction Rhythm and Participatory Sense-Making: An Embodied,

Interactional Approach to Social Understanding, with Some Implication for Autism, unpublished

D.Phil. thesis, Brighton, UK: University of Sussex

De Jaegher, H. (2009), “Social understanding through direct perception? Yes, by interacting”,

Consciousness and Cognition, 18, pp. 535-542

De Jaegher, H. & Di Paolo, E. A. (2007), “Participatory sense-making: An enactive approach to social

cognition”, Phenomenology and the Cognitive Sciences, 6(4), pp. 485-507

De Jaegher, H. & Di Paolo, E. A. (2008), “Making Sense in Participation: An Enactive Approach to

Social Cognition”, in: F. Morganti, A. Carassa & G. Riva (eds.), Enacting Intersubjectivity: A

Cognitive and Social Perspective on the Study of Interactions, Amsterdam, Netherlands: IOS Press,

pp. 33-47

De Jaegher, H. & Froese, T. (2009), “On the Role of Social Interaction in Individual Agency”, Adaptive

Behavior, 17(5), pp. 444-460

Di Paolo, E. A. (1999), On the Evolutionary and Behavioral Dynamics of Social Coordination: Models

and Theoretical Aspects, unpublished D.Phil. thesis, Brighton, UK: University of Sussex

Di Paolo, E. A. (2000), “Behavioral coordination, structural congruence and entrainment in a simulation

of acoustically coupled agents”, Adaptive Behavior, 8(1), pp. 25-46

Di Paolo, E. A. (2003), “Organismically-inspired robotics: homeostatic adaptation and teleology beyond

the closed sensorimotor loop”, in: K. Murase & T. Asakura (eds.), Dynamical Systems Approach to

Embodiment and Sociality, Adelaide, Australia: Advanced Knowledge International, pp. 19-42

Di Paolo, E. A. (2005), “Autopoiesis, adaptivity, teleology, agency”, Phenomenology and the Cognitive

Sciences, 4(4), pp. 429-452

Di Paolo, E. A. (2008), “A Mind of Many”, Constructivist Foundations, 3(2), pp. 89-91

Di Paolo, E. A. (2009), “Extended Life”, Topoi, 28(1), pp. 9-21

Di Paolo, E. A. (in press), “Overcoming autopoiesis: An enactive detour on the way from life to society”,

in: R. Magalhaes & R. Sanchez (eds.), Autopoiesis in Organizations and Information Systems,

Elsevier

Di Paolo, E. A. & Harvey, I. (2003), “Decisions and Noise: The Scope of Evolutionary Synthesis and

Dynamical Analysis”, Adaptive Behavior, 11(4), pp. 284-288

Di Paolo, E. A. & Iizuka, H. (2008), “How (not) to model autonomous behavior”, BioSystems, 91(2), pp.

409-423

Di Paolo, E. A., Noble, J. & Bullock, S. (2000), “Simulation Models as Opaque Thought Experiments”,

in: M. A. Bedau, J. S. McCaskill, N. H. Packard & S. Rasmussen (eds.), Artificial Life VII: Proc. of

the 7th Int. Conf. on Artificial Life, Cambridge, MA: The MIT Press, pp. 497-506

Di Paolo, E. A., Rohde, M. & De Jaegher, H. (in press), “Horizons for the Enactive Mind: Values, Social

Interaction, and Play”, in: J. Stewart, O. Gapenne & E. A. Di Paolo (eds.), Enaction: Towards a New

Paradigm for Cognitive Science, Cambridge, MA: The MIT Press

Di Paolo, E. A., Rohde, M. & Iizuka, H. (2008), “Sensitivity to social contingency or stability of

interaction? Modelling the dynamics of perceptual crossing”, New Ideas in Psychology, 26(2), pp.

278-294

Dreyfus, H. L. (1972), What Computers Can’t Do: A Critique of Artificial Reason, New York, NY:

Harper and Row

Dreyfus, H. L. (1981), “From Micro-Worlds to Knowledge Representation: AI at an Impasse”, in: J.

Haugeland (ed.), Mind Design: Philosophy, Psychology, Artificial Intelligence, Cambridge, MA: The

MIT Press, pp. 161-204

Dreyfus, H. L. (1991), Being-in-the-World: A Commentary on Heidegger’s Being and Time, Division 1,

Cambridge, MA: The MIT Press

Dreyfus, H. L. (2007), “Why Heideggerian AI failed and how fixing it would require making it more

Heideggerian”, Philosophical Psychology, 20(2), pp. 247-268

Dreyfus, H. L. & Dreyfus, S. E. (1988), “Making a mind versus modelling the brain: artificial intelligence

back at a branch-point”, Daedalus, 117(1), pp. 15-44

Dunbar, R. I. M. (1998), “The Social Brain Hypothesis”, Evolutionary Anthropology, 6, pp. 178-190

Dupuy, J.-P. (2009), On the Origins of Cognitive Science: The Mechanization of Mind, Cambridge, MA:

The MIT Press

Egbert, M. D. & Di Paolo, E. A. (2009), “Integrating Autopoiesis and Behavior: An Exploration in

Computational Chemo-ethology”, Adaptive Behavior, 17(5), pp. 387-401

Eldredge, N. & Gould, S. J. (1972), “Punctuated equilibria: An alternative to phyletic gradualism”, in: T.

J. M. Schopf (ed.), Models in Paleobiology, San Francisco, CA: Freeman, Cooper & Co.

Fitch, W. T. (2000), “The evolution of speech: a comparative review”, Trends in Cognitive Sciences, 4(7),

pp. 258-267

Fodor, J. A. (1975), The Language of Thought, Cambridge, MA: Harvard University Press

Fogel, A. (1993), Developing through relationships: Origins of communication, self and culture, London,

UK: Harvester Wheatsheaf

Froese, T. (2007), “On the role of AI in the ongoing paradigm shift within the cognitive sciences”, in: M.

Lungarella, F. Iida, J. Bongard, & R. Pfeifer (eds.), 50 Years of Artificial Intelligence: Essays

Dedicated to the 50th Anniversary of Artificial Intelligence, Berlin, Germany: Springer, pp. 63-75

Froese, T. (2009), “Hume and the enactive approach to mind”, Phenomenology and the Cognitive

Sciences, 8(1), pp. 95-133

Froese, T. & Di Paolo, E. A. (2008a), “Stability of coordination requires mutuality of interaction in a

model of embodied agents”, in: M. Asada, J. C. T. Hallam, J.-A. Meyer & J. Tani (eds.), From

Animals to Animats 10: Proc. of the 10th Int. Conf. on Simulation of Adaptive Behavior, Berlin,

Germany: Springer-Verlag, pp. 52-61

Froese, T. & Di Paolo, E. A. (2008b), “Can evolutionary robotics generate simulation models of

autopoiesis?”, Cognitive Science Research Paper (CSRP), 598, Brighton, UK: University of Sussex

Froese, T. & Di Paolo, E. A. (2009), “Sociality and the life-mind continuity thesis”, Phenomenology and

the Cognitive Sciences, 8(4), pp. 439-463

Froese, T. & Di Paolo, E. A. (in press-a), “Modeling social interaction as perceptual crossing: An

investigation into the dynamics of the interaction process”, Connection Science

Froese, T. & Di Paolo, E. A. (in press-b), “Toward Minimally Social Behavior: Social Psychology Meets

Evolutionary Robotics”, Advances in Artificial Life: Proc. of the 10th Euro. Conf. on Artificial Life,

Berlin, Germany: Springer-Verlag

Froese, T. & Spier, E. (2008), “Convergence and Crossover: The Permutation Problem Revisited”,

Cognitive Science Research Paper (CSRP), 596, Brighton, UK: University of Sussex

Froese, T. & Spiers, A. (2007), “Toward a Phenomenological Pragmatics of Enactive Perception”, in:

Enactive/07: Proc. of the 4th Int. Conf. on Enactive Interfaces, Grenoble, France: Association

ACROE, pp. 105-108

Froese, T., Virgo, N. & Izquierdo, E. (2007), “Autonomy: a review and a reappraisal”, in: F. Almeida e

Costa, L. M. Rocha, E. Costa, I. Harvey & A. Coutinho (eds.), Advances in Artificial Life: Proc. of

the 9th Euro. Conf. on Artificial Life, Berlin, Germany: Springer-Verlag, pp. 455-464

Froese, T. & Ziemke, T. (2009), “Enactive Artificial Intelligence: Investigating the systemic organization

of life and mind”, Artificial Intelligence, 173(3-4), pp. 466-500

Funahashi, K. & Nakamura, Y. (1993), “Approximation of dynamical systems by continuous time

recurrent neural networks”, Neural Networks, 6(6), pp. 801-806

Gallagher, S. (1997), “Mutual enlightenment: Recent phenomenology in cognitive science”, Journal of

Consciousness Studies, 4(3), pp. 195-214

Gallagher, S. (2001), "The Practice of Mind: Theory, Simulation or Primary Interaction?", Journal of

Consciousness Studies, 8(5-7), pp. 83-108

Gallagher, S. (2005), How the Body Shapes the Mind, New York, NY: Oxford University Press

Gallagher, S. (2007), “Simulation trouble”, Social Neuroscience, 2(3-4), pp. 353-365

Gallagher, S. (2008a), “Intersubjectivity in perception”, Continental Philosophy Review, 41(2), pp. 163-

178

Gallagher, S. (2008b), “Direct perception in the intersubjective context”, Consciousness and Cognition,

17(2), pp. 535-543

Gallagher, S. (2008c), Brainstorming: Views and Interviews on the Mind, Exeter, UK: Imprint Academic

Gallagher, S. (2008d), "Inference or interaction: social cognition without precursors", Philosophical

Explorations, 11(3), pp. 163-174

Gallagher, S. & Cole, J. (1995), “Body schema and body image in a deafferented subject”, Journal of

Mind and Behavior, 16, pp. 369-390

Gallagher, S. & Marcel, A. J. (1999), “The self in contextualized action”, Journal of Consciousness

Studies, 6(6), pp. 4-30

Gallagher, S. & Meltzoff, A. N. (1996), “The earliest sense of self and others: Merleau-Ponty and recent

developmental studies”, Philosophical Psychology, 9, pp. 213-236

Gallagher, S. & Zahavi, D. (2008), The Phenomenological Mind: An Introduction to Philosophy of Mind

and Cognitive Science, London, UK: Routledge

Gallese, V. & Goldman, A. (1998), "Mirror neurons and the simulation theory of mind reading", Trends

in Cognitive Sciences, 2(12), pp. 493-501

Gergely, G., Bekkering, H. & Király, I. (2002), “Rational imitation in preverbal infants”, Nature, 415, p.

755

Gergely, G. & Watson, J. (1999), “Early social-emotional development: contingency perception and the

social biofeedback model”, in: P. Rochat (ed.), Early social cognition: Understanding others in the

first months of life, Hillsdale, NJ: Lawrence Erlbaum, pp. 101-137

Godfrey-Smith, P. (1996), “Spencer and Dewey on Life and Mind”, in: M. Boden (ed.), The Philosophy

of Artificial Life, Oxford, UK: Oxford University Press, pp. 314-331

Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization, and Machine Learning, Redwood

City, CA: Addison-Wesley

Hanna, R. & Thompson, E. (2003), “The Mind-Body-Body Problem”, Theoria et Historia Scientiarum,

7(1), pp. 24-44

Harnad, S. (1990), “The symbol grounding problem”, Physica D, 42, pp. 335-346

Harnish, R. M. (2002), Minds, Brains, Computers: An Historical Introduction to the Foundations of

Cognitive Science, Malden, MA: Blackwell Publishers

Harvey, I. (1996), “Untimed and misrepresented: connectionism and the computer metaphor”, Artificial

Intelligence and Simulation of Behavior Quarterly, 96, pp. 20-27

Harvey, I. (2000), “Robotics: Philosophy of Mind using a Screwdriver”, in: T. Gomi (ed.), Evolutionary

Robotics. From Intelligent Robots to Artificial Life: Proc. of the 7th Int. Symposium on Evolutionary

Robotics, Ontario, Canada: AAI Books, pp. 207-230

Harvey, I. (2001), “Artificial Evolution: A Continuing SAGA”, in: T. Gomi (ed.), Evolutionary Robotics.

From Intelligent Robotics to Artificial Life: Proc. of the 8th Int. Symposium on Evolutionary Robotics,

Berlin, Germany: Springer-Verlag, pp. 94-109

Harvey, I. (2004), “Homeostasis and Rein Control: From Daisyworld to Active Perception”, in: J.

Pollack, M. Bedau, P. Husbands, T. Ikegami, & R. A. Watson (eds.), Artificial Life IX: Proc. of the

9th Int. Conf. on the Simulation and Synthesis of Living Systems, Cambridge, MA: The MIT Press, pp.

309-314

Harvey, I., Di Paolo, E. A., Wood, R., Quinn, M. & Tuci, E. A. (2005), “Evolutionary Robotics: A new

scientific tool for studying cognition”, Artificial Life, 11(1-2), pp. 79-98

Haselager, W. F. G. (2005), “Robotics, philosophy and the problems of autonomy”, Pragmatics &

Cognition, 13(3), pp. 515-532

Haugeland, J. (1985), Artificial Intelligence: The Very Idea, Cambridge, MA: The MIT Press

Haugeland, J. (1997), “What is mind design?”, in: J. Haugeland (ed.), Mind Design II: Philosophy,

Psychology, Artificial Intelligence, Cambridge, MA: The MIT Press, pp. 1-28

Hayes, J. (2007), “Heidegger’s Fundamental Ontology and the Problem of Animal Life”, PhaenEx, 2(2),

pp. 42-60

Heath, J. (2009), “Methodological individualism”, in: E. N. Zalta (ed.), The Stanford Encyclopedia of

Philosophy (Summer 2009 Ed.), http://plato.stanford.edu/archives/sum2009/entries/methodological-individualism/

Heidegger, M. (1927), Sein und Zeit, trans. by: J. Macquarrie & E. Robinson, Being and Time, Oxford,

UK: Blackwell Publishing Ltd., 1962

Heidegger, M. (1929), Die Grundbegriffe der Metaphysik: Welt, Endlichkeit, Einsamkeit, trans. by: W.

McNeill & N. Walker, The Fundamental Concepts of Metaphysics: World, Finitude, Solitude,

Bloomington, IN: Indiana University Press, 1995

Holland, J. H. (1975), Adaptation in Natural and Artificial Systems, Ann Arbor, MI: University of

Michigan Press

Hurley, S. (2008), “The shared circuits model (SCM): How control, mirroring, and simulation can enable

imitation, deliberation, and mindreading”, Behavioral and Brain Sciences, 31(1), pp. 1-22

Husbands, P., Holland, O. & Wheeler, M. (eds.) (2008), The Mechanical Mind in History, Cambridge,

MA: The MIT Press

Husbands, P., Smith, T., Jakobi, N. & O’Shea, M. (1998), “Better Living through Chemistry: Evolving

GasNets for Robot Control”, Connection Science, 10(3-4), pp. 185-210

Husserliana (Hua)

Husserliana I: Husserl, E. (1929), Cartesianische Meditationen und Pariser Vorträge, Den Haag,

Netherlands: Martinus Nijhoff, 1950; trans. by: D. Cairns, Cartesian Meditations: An Introduction to

Phenomenology, The Hague, Netherlands: Martinus Nijhoff, 1960

Husserliana VI: Husserl, E. (1936), Die Krisis der europäischen Wissenschaften und die transzendentale

Phänomenologie. Eine Einleitung in die phänomenologische Philosophie, Den Haag, Netherlands:

Martinus Nijhoff, 1976; trans. by: D. Carr, The Crisis of European Sciences and Transcendental

Phenomenology. An Introduction to Phenomenology, Evanston, IL: Northwestern Uni. Press, 1970

Husserliana IX: Husserl, E. (1925), Phänomenologische Psychologie. Vorlesungen Sommersemester

1925, Den Haag, Netherlands: Martinus Nijhoff, 1962; trans. by: J. Scanlon, Phenomenological

Psychology: Lectures, Summer Semester, 1925, The Hague, Netherlands: Martinus Nijhoff, 1977

Husserliana X: Husserl, E. (1893-1917), Zur Phänomenologie des inneren Zeitbewusstseins (1893-1917),

Den Haag, Netherlands: Martinus Nijhoff, 1966; trans. by: J. B. Brough, On the Phenomenology of

the Consciousness of Internal Time (1893-1917), Dordrecht, Netherlands: Kluwer Academic, 1991

Husserliana XIII: Husserl, E. (1905-1920), Zur Phänomenologie der Intersubjektivität. Texte aus dem

Nachlass. Erster Teil: 1905-1920, Den Haag, Netherlands: Martinus Nijhoff, 1973

Husserliana XIV: Husserl, E. (1921-1928), Zur Phänomenologie der Intersubjektivität. Texte aus dem

Nachlass. Zweiter Teil: 1921-1928, Den Haag, Netherlands: Martinus Nijhoff, 1973

Husserliana XV: Husserl, E. (1929-1935), Zur Phänomenologie der Intersubjektivität. Texte aus dem

Nachlass. Dritter Teil: 1929-1935, Den Haag, Netherlands: Martinus Nijhoff, 1973

Hutchins, E. (1995), Cognition in the wild, Cambridge, MA: The MIT Press

Iida, F. & Pfeifer, R. (2004), “‘Cheap’ Rapid Locomotion of a Quadruped Robot: Self-Stabilization of

Bounding Gait”, in: F. C. A. Groen, N. Amato, A. Bonarini, E. Yoshida & B. Kröse (eds.), Intelligent

Autonomous Systems 8, Amsterdam, Netherlands: IOS Press, Inc., pp. 642-650

Iizuka, H. & Di Paolo, E. A. (2007a), “Toward Spinozist robotics: Exploring the minimal dynamics of

behavioral preference”, Adaptive Behavior, 15(4), pp. 359-376

Iizuka, H. & Di Paolo, E. A. (2007b), “Minimal Agency Detection of Embodied Agents”, in: F. Almeida

e Costa, L. M. Rocha, E. Costa, I. Harvey & A. Coutinho (eds.), Advances in Artificial Life: Proc. of

the 9th Euro. Conf. on Artificial Life, Berlin, Germany: Springer-Verlag, pp. 485-494

Iizuka, H. & Di Paolo, E. A. (2008), “Extended Homeostatic Adaptation: Improving the link between

internal and behavioral stability”, in: M. Asada, J. C. T. Hallam, J.-A. Meyer & J. Tani (eds.), From

Animals to Animats 10: Proc. of the 10th Int. Conf. on Simulation of Adaptive Behavior, Berlin,

Germany: Springer-Verlag, pp. 1-11

Iizuka, H. & Ikegami, T. (2004a), “Adaptability and diversity in simulated turn-taking behavior”,

Artificial Life, 10(4), pp. 361-378

Iizuka, H. & Ikegami, T. (2004b), “Simulating autonomous coupling in discrimination of light

frequencies”, Connection Science, 16(4), pp. 283-299

Ikegami, T. & Iizuka, H. (2007), “Turn-taking interaction as a cooperative and co-creative process”,

Infant Behavior & Development, 30(2), pp. 278-288

Ikegami, T. & Suzuki, K. (2008), “From homeostatic to homeodynamic Self”, BioSystems, 91(2), pp.

388-400

Iverson, J. M. & Goldin-Meadow, S. (1998), “Why people gesture when they speak”, Nature, 396, p. 228

Izquierdo, E. & Buhrmann, T. (2008), “Analysis of a dynamical recurrent neural network evolved for two

qualitatively different tasks: walking and chemotaxis”, in: S. Bullock, J. Noble, R. Watson, & M. A.

Bedau (eds.), Artificial Life XI: Proceedings of the Eleventh International Conference on the

Simulation and Synthesis of Living Systems, Cambridge, MA: The MIT Press, pp. 257-264

Izquierdo-Torres, E. & Di Paolo, E. A. (2005), “Is an embodied system ever purely reactive?”, in: M.

Capcarrere, A. A. Freitas, P. J. Bentley, C. G. Johnson & J. Timmis (eds.), Advances in Artificial

Life: Proc. of the 8th Euro. Conf. on Artificial Life, Berlin, Germany: Springer-Verlag, pp. 252-261

Izquierdo, E., Harvey, I. & Beer, R. (2008), “Associative Learning on a Continuum in Evolved

Dynamical Neural Networks”, Adaptive Behavior, 16(6), pp. 361-384

Jonas, H. (1966), The Phenomenon of Life: Toward a Philosophical Biology, Evanston, Illinois:

Northwestern University Press, 2001

Jonas, H. (1968), “Biological Foundations of Individuality”, International Philosophical Quarterly, 8, pp.

231-251

Jonas, H. (1992), “The Burden and Blessing of Mortality”, The Hastings Center Report, 22(1), pp. 34-40

Kant, I. (1790), Kritik der Urteilskraft, trans. by: W. S. Pluhar, Critique of Judgment, Indianapolis, IN:

Hackett Publishing Company, 1987

Kelso, J. A. S. (1995), Dynamic Patterns: The Self-Organization of Brain and Behaviour, Cambridge,

MA: The MIT Press

Kirsh, D. (1991), “Today the earwig, tomorrow man?”, Artificial Intelligence, 47(1-3), pp. 161-184

Kuhn, T. S. (1962), The Structure of Scientific Revolutions, Chicago, IL: University of Chicago Press

LaFrance, M. (1982), “Posture mirroring and rapport”, in: M. Davis (ed.), Interaction Rhythms:

Periodicity in communicative behavior, New York, NY: Human Sciences Press, pp. 279-298

Langton, C. G. (1989), “Artificial Life”, in: C. G. Langton (ed.), Artificial Life: Proceedings of an

Interdisciplinary Workshop on the Synthesis and Simulation of Living Systems, Santa Fe Institute

Studies in the Sciences of Complexity, vol. 4, Redwood City, CA: Addison-Wesley, pp. 1-47

Langton, C. G. (1995) (ed.), Artificial Life: An Overview, Cambridge, MA: The MIT Press

Legrand, D. (2006), “The bodily self: The sensori-motor roots of pre-reflexive self-consciousness”,

Phenomenology and the Cognitive Sciences, 5(1), pp. 89-118

Levinas, E. (1979), Le temps et l’autre, trans. by: R. A. Cohen, Time and the Other, Pittsburgh, PA:

Duquesne University Press, 1987

Levine, J. (1983), “Materialism and Qualia: The Explanatory Gap”, Pacific Philosophical Quarterly, 64,

pp. 354-361

Lindblom, J. & Ziemke, T. (2003), “Social Situatedness of Natural and Artificial Intelligence: Vygotsky

and Beyond”, Adaptive Behavior, 11(2), pp. 79-96

Luhmann, N. (1984), Soziale Systeme: Grundriß einer allgemeinen Theorie, Frankfurt, Germany:

Suhrkamp Verlag

Luisi, P. L. (2003), “Autopoiesis: a review and reappraisal”, Naturwissenschaften, 90, pp. 49-59

MacLennan, B. J. (1992), “Synthetic Ethology: An Approach to the Study of Communication”, in: C. G.

Langton, C. Taylor, D. Farmer & S. Rasmussen (eds.), Artificial Life II, Redwood City, CA:

Addison-Wesley, pp. 631-658

Marcel, A. J. (1992), “The personal level in cognitive rehabilitation”, in: N. von Steinbuchel, E. Poppel &

D. von Cramon (eds.), Neuropsychological Rehabilitation, Berlin, Germany: Springer, pp. 155-168

Maturana, H. R. (1978), “Biology of Language: The Epistemology of Reality”, in: G. Miller & E.

Lenneberg (eds.), Psychology and Biology of Language and Thought: Essays in Honor of Eric

Lenneberg, New York, NY: Academic Press, pp. 27-63

Maturana, H. R. (1988), “Reality: The Search for Objectivity or the Quest for a Compelling Argument”,

The Irish Journal of Psychology, 9(1), pp. 25-82

Maturana, H. R. (2002), “Autopoiesis, Structural Coupling and Cognition: A history of these and other

notions in the biology of cognition”, Cybernetics & Human Knowing, 9(3-4), pp. 5-34

Maturana, H. R., Mpodozis, J. & Letelier, J. C. (1995), “Brain, Language and the Origin of Human

Mental Functions”, Biological Research, 28, pp. 15-26

Maturana, H. R. & Varela, F. J. (1980), Autopoiesis and Cognition: The Realization of the Living,

Dordrecht, Holland: Kluwer Academic Publishers

Maturana, H. R. & Varela, F. J. (1987), The Tree of Knowledge: The Biological Roots of Human

Understanding, Boston, MA: Shambhala Publications

McCarthy, J. & Hayes, P. J. (1969), “Some philosophical problems from the standpoint of artificial

intelligence”, in: B. Meltzer & D. Michie (eds.), Machine Intelligence 4, Edinburgh, UK: Edinburgh

University Press, pp. 463-502

McClelland, J. L., Rumelhart, D. E. & the PDP Research Group (1986), Parallel Distributed Processing:

Explorations in the Microstructure of Cognition. Vol. 2: Psychological and Biological Models,

Cambridge, MA: The MIT Press

McGann, M. (2007), “Enactive theorists do it on purpose: Toward an enactive account of goals and goal-directedness”,

Phenomenology and the Cognitive Sciences, 6(4), pp. 463-483

McMullin, B. (2004), “Thirty Years of Computational Autopoiesis: A Review”, Artificial Life, 10(3), pp.

277-295

Meltzoff, A. N. (1995), “Understanding the Intentions of Others: Re-Enactment of Intended Acts by 18-

Month-Old Children”, Developmental Psychology, 31(5), pp. 838-850

Meltzoff, A. N. & Borton, R. W. (1979), “Intermodal matching by human neonates”, Nature, 282, pp.

403-404

Meltzoff, A. N. & Moore, M. K. (1977), “Imitation of facial and manual gestures by human neonates”,

Science, 198, pp. 75-78

Meltzoff, A. N. & Moore, M. K. (1983), “Newborn infants imitate adult facial gestures”, Child

Development, 54, pp. 702-709

Meltzoff, A. N. & Moore, M. K. (1989), “Imitation in newborn infants: exploring the range of gestures

imitated and the underlying mechanisms”, Developmental Psychology, 25, pp. 954-962

Meltzoff, A. N. & Moore, M. K. (1997), “Explaining facial imitation: A theoretical model”, Early

Development and Parenting, 6, pp. 179-192

Merleau-Ponty, M. (1945), Phénoménologie de la perception, trans. by: C. Smith, Phenomenology of

perception, New York, NY: Routledge & Kegan Paul, 1962

Merleau-Ponty, M. (1960), “Les relations avec autrui chez l’enfant”, Paris, France: Cours de Sorbonne,

trans. by W. Cobb, “The Child’s Relations with Others”, in: M. Merleau-Ponty (1964), The Primacy

of Perception And Other Essays on Phenomenological Psychology, the Philosophy of Art, History

and Politics, J. M. Edie (ed.), Evanston, IL: Northwestern University Press, pp. 96-155

Miller, G. (2005), “What is the biological basis of consciousness?”, Science, 309(5731), p. 79

Millikan, R. G. (1989), “Biosemantics”, The Journal of Philosophy, 86(6), pp. 281-297

Moreno, A. (2002), “Artificial Life and Philosophy”, Leonardo, 35(4), pp. 401-405

Moreno, A. & Etxeberria, A. (2005), “Agency in Natural and Artificial Systems”, Artificial Life, 11(1-2),

pp. 161-175

Mossio, M. & Taraborelli, D. (2008), “Action-dependent perceptual invariants: From ecological to

sensorimotor approaches”, Consciousness and Cognition, 17, pp. 1324-1340

Murray, L. & Trevarthen, C. (1985), “Emotional regulations of interactions between two-month-olds and

their mothers”, in: T. M. Field & N. A. Fox (eds.), Social perception in infants, Norwood, NJ: Ablex

Publishing, pp. 177-197

Nadel, J., Carchon, I., Kervella, C., Marcelli, D. & Réserbat-Plantey, D. (1999), “Expectancies for social

contingency in 2-month-olds”, Developmental Science, 2(2), pp. 164-173

Nagel, T. (1974), “What is it like to be a bat?”, Philosophical Review, 83(4), pp. 435-450

Nagel, E. (1977), “Teleology revisited: Goal-Directed Processes in Biology”, The Journal of Philosophy,

74(5), pp. 261-279

Newell, A. & Simon, H. A. (1976), “Computer Science as Empirical Enquiry: Symbols and Search”,

Communications of the Association for Computing Machinery, 19(3), pp. 113-126

Noë, A. (2004), Action in Perception, Cambridge, MA: The MIT Press

Nolfi, S. & Floreano, D. (2000), Evolutionary Robotics: The biology, intelligence, and technology of self-organizing

machines, Cambridge, MA: The MIT Press

O’Regan, J. K. & Noë, A. (2001), “A sensorimotor account of vision and visual consciousness”,

Behavioral and Brain Sciences, 24(5), pp. 939-1031

Oyama, S. (2009), “Friends, Neighbors, and Boundaries”, Ecological Psychology, 21(2), pp. 147-154

Pascal, F. & O’Regan, J. K. (2008), “Commentary on Mossio and Taraborelli: Is the enactive approach

really sensorimotor?”, Consciousness and Cognition, 17(4), pp. 1341-1342

Pentland, A. (2007), “On the Collective Nature of Human Intelligence”, Adaptive Behavior, 15(2), pp.

189-198

Petitmengin, C. (2006), “Describing one’s subjective experience in the second person: An interview

method for the science of consciousness”, Phenomenology and the Cognitive Sciences, 5(3-4), pp.

229-269

Pfeifer, R. (1996), “Building ‘Fungus Eaters’: Design Principles of Autonomous Agents”, in: P. Maes, M.

J. Mataric, J.-A. Meyer, J. Pollack & S. W. Wilson (eds.), From Animals to Animats 4: Proc. of the

4th Int. Conf. on Simulation of Adaptive Behavior, Cambridge, MA: The MIT Press, pp. 3-12

Pfeifer, R., Lungarella, M. & Iida, F. (2007), “Self-Organization, Embodiment, and Biologically Inspired

Robotics”, Science, 318, pp. 1088-1093

Pfeifer, R. & Scheier, C. (1999), Understanding Intelligence, Cambridge, MA: The MIT Press

Piaget, J. (1962), Play, Dreams, and Imitation in Childhood, New York, NY: Norton

Piaget, J. (1967), Biologie et connaissance, Paris, France: Editions Gallimard; transl. as: Biology and

Knowledge: An Essay on the Relations between Organic Regulations and Cognitive Processes,

Chicago, IL: The University of Chicago Press, 1971

Povinelli, D. J. & Eddy, T. J. (1996), “What young chimpanzees know about seeing”, Monographs of the

Society for Research in Child Development, 61, pp. 1-152

Prinz, J. (2006), “Putting the Brakes on Enactive Perception”, Psyche, 12(1), pp. 1-19

Quinn, M. (2001), “Evolving communication without dedicated communication channels”, in: J. Kelemen

& P. Sosik (eds.), Advances in Artificial Life: Proc. of the 6th Euro. Conf. on Artificial Life, Berlin,

Germany: Springer-Verlag, pp. 357-366

Quinn, M., Smith, L., Mayley, G. & Husbands, P. (2003), “Evolving controllers for a homogeneous

system of physical robots: structured cooperation with minimal sensors”, Phil. Trans. R. Soc. Lond.

A, 361, pp. 2321-2343

Racine, T. P., Leavens, D. A., Susswein, N. & Wereha, T. J. (2008), “Conceptual and Methodological

Issues in the Investigation of Primate Intersubjectivity”, in: F. Morganti, A. Carassa & G. Riva (eds.),

Enacting Intersubjectivity: A Cognitive and Social Perspective on the Study of Interactions,

Amsterdam, Netherlands: IOS Press, pp. 65-79

Rizzolatti, G., Fogassi, L. & Gallese, V. (2001), “Neurophysiological mechanisms underlying the

understanding and imitation of action”, Nature Reviews Neuroscience, 2, pp. 661-670

Rohde, M. (2008), Evolutionary Robotics Simulation Models in the Study of Human Behaviour and

Cognition, unpublished D.Phil. thesis, Brighton, UK: University of Sussex

Rohde, M. & Di Paolo, E. A. (2007), “Adaptation to sensory delays: An evolutionary robotics model of

an empirical study”, in: F. Almeida e Costa, L. M. Rocha, E. Costa, I. Harvey & A. Coutinho (eds.),

Advances in Artificial Life: Proc. of the 9th Euro. Conf. on Artificial Life, Berlin, Germany: Springer-Verlag, pp. 193-202

Rohde, M. & Di Paolo, E. A. (2008), “Embodiment and Perceptual Crossing in 2D: A Comparative

Evolutionary Robotics Study”, in: M. Asada, J. C. T. Hallam, J.-A. Meyer & J. Tani (eds.), From

Animals to Animats 10: Proc. of the 10th Int. Conf. on Simulation of Adaptive Behavior, Berlin,

Germany: Springer-Verlag, pp. 83-92

Roy, J.-M., Petitot, J., Pachoud, B. & Varela, F. J. (1999), “Beyond the Gap: An Introduction to

Naturalizing Phenomenology”, in: J. Petitot, F. J. Varela, B. Pachoud & J.-M. Roy (eds.),

Naturalizing Phenomenology: Issues in Contemporary Phenomenology and Cognitive Science,

Stanford, CA: Stanford University Press, pp. 1-80

Ruiz-Mirazo, K. & Moreno, A. (2004), “Basic Autonomy as a Fundamental Step in the Synthesis of

Life”, Artificial Life, 10(3), pp. 235-259

Russell, J. (1996), Agency: Its role in mental development, Hove, UK: Taylor & Francis

Russell, S. & Norvig, P. (2002), Artificial Intelligence: A Modern Approach, 2nd ed., Prentice Hall

Sartre, J.-P. (1943), L’Être et le néant, trans. by H. E. Barnes, Being and Nothingness: An essay on

phenomenological ontology, Oxon, UK: Routledge, 2003

Savage-Rumbaugh, S., Fields, W. M., Segerdahl, P. & Rumbaugh, D. M. (2005), “Culture prefigures

cognition in Pan/Homo Bonobos”, Theoria: An International Journal for Theory, History and

Foundations of Science, 20(54), pp. 311-328

Savage-Rumbaugh, S., Fields, W. M. & Taglialatela, J. P. (2001), “Language, Speech, Tools and Writing:

A Cultural Imperative”, Journal of Consciousness Studies, 8(5-7), pp. 273-292

Schwier, C., van Maanen, C., Carpenter, M. & Tomasello, M. (2006), “Rational imitation in 12-month-old

infants”, Infancy, 10, pp. 303-311

Shanon, B. (1998), “What is the function of consciousness?”, Journal of Consciousness Studies, 5(3), pp.

295-308

Spector, L. & Klein, J. (2006), “Trivial Geography in Genetic Programming”, in: T. Yu, R. Riolo, & B.

Worzel (eds.), Genetic Programming Theory and Practice III, Springer US, pp. 109-124

Steels, L. (1994), “The artificial life roots of artificial intelligence”, Artificial Life, 1(1-2), pp. 75-110

Steiner, P. & Stewart, J. (2009), “From autonomy to heteronomy (and back): the Enaction of Social Life”,

Phenomenology and the Cognitive Sciences, 8(4), pp. 527-550

Stewart, J. (1992), “Life = Cognition: The epistemological and ontological significance of Artificial

Life”, in: F. J. Varela & P. Bourgine (eds.), Toward a Practice of Autonomous Systems: Proc. of the

1st Euro. Conf. on Artificial Life, Cambridge, MA: The MIT Press, pp. 475-483

Stewart, J. (1996), “Cognition = life: Implications for higher-level cognition”, Behavioral Processes, 35,

pp. 311-326

Stewart, J. (2000), “From Autopoiesis to Semantic Closure”, Annals of the New York Academy of

Sciences, 901, pp. 155-162

Stewart, J. (in press), “Foundational issues in enaction as a paradigm for cognitive science: From the

origin of life to consciousness and writing”, in: J. Stewart, O. Gapenne & E. A. Di Paolo (eds.),

Enaction: Towards a New Paradigm for Cognitive Science, Cambridge, MA: The MIT Press

Thelen, E. & Smith, L. (1994), A Dynamic Systems Approach to the Development of Cognition and

Action, Cambridge, MA: The MIT Press

Thompson, E. (2001), “Empathy and Consciousness”, Journal of Consciousness Studies, 8(5-7), pp. 1-32

Thompson, E. (2004), “Life and mind: From autopoiesis to neurophenomenology. A tribute to Francisco

Varela”, Phenomenology and the Cognitive Sciences, 3(4), pp. 381-398

Thompson, E. (2005), “Sensorimotor subjectivity and the enactive approach to experience”,

Phenomenology and the Cognitive Sciences, 4(4), pp. 407-427

Thompson, E. (2007), Mind in Life: Biology, Phenomenology, and the Sciences of Mind, Cambridge, MA:

The Belknap Press of Harvard University Press

Thompson, E. & Stapleton, M. (2009), “Making Sense of Sense-Making: Reflections on Enactive and

Extended Mind Theories”, Topoi, 28(1), pp. 23-30

Thompson, E. & Varela, F. J. (2001), “Radical embodiment: neural dynamics and consciousness”, Trends

in Cognitive Sciences, 5(10), pp. 418-425

Tomasello, M. (1988), “The role of joint attentional process in early language development”, Language

Sciences, 10, pp. 69-88

Tomasello, M. (1999), The cultural origins of human cognition, Cambridge, MA: Harvard University

Press

Tomasello, M. (2000), “Primate Cognition: Introduction to the Issue”, Cognitive Science, 24(3), pp. 351-

361

Tomasello, M. (2001), “Cultural Transmission: A View From Chimpanzees and Human Infants”, Journal

of Cross-Cultural Psychology, 32(2), pp. 135-146

Tomasello, M., Call, J. & Hare, B. (2003), “Chimpanzees understand psychological states – the question

is which ones and to what extent”, Trends in Cognitive Sciences, 7(4), pp. 153-156

Tomasello, M., Kruger, A. C. & Ratner, H. H. (1993), “Cultural learning”, Behavioral and Brain

Sciences, 16, pp. 495-552

Tomasello, M., Call, J., Warren, J., Frost, G. T., Carpenter, M., & Nagell, K. (1997), “The ontogeny of

chimpanzee gestural signals: A comparison across groups and generations”, Evolution of

Communication, 1(2), pp. 223-259

Torrance, S. (2005), “In search of the enactive: Introduction to special issue on enactive experience”,

Phenomenology and the Cognitive Sciences, 4(4), pp. 357-368

Trevarthen, C. B. (1979), “Communication and cooperation in early infancy: A description of primary

intersubjectivity”, in: M. Bullowa (ed.), Before Speech, Cambridge, UK: Cambridge University Press, pp.

321-347

Trevarthen, C. & Hubley, P. (1978), “Secondary intersubjectivity: Confidence, confiding and acts of

meaning in the first year”, in: A. Lock (ed.), Action, Gesture and Symbol: The Emergence of

Language, London, UK: Academic Press, pp. 183-229

van Gelder, T. (1998), “The dynamical hypothesis in cognitive science”, Behavioral and Brain Sciences,

21(5), pp. 615-665

van Gelder, T. (1999), “Wooden iron? Husserlian phenomenology meets cognitive science”, in: J. Petitot,

F. J. Varela, B. Pachoud & J.-M. Roy (eds.), Naturalizing Phenomenology: Issues in Contemporary

Phenomenology and Cognitive Science, Stanford, CA: Stanford University Press, pp. 245-265

van Gelder, T. & Port, R. F. (1995), “It’s About Time: An Overview of the Dynamical Approach to

Cognition”, in: R. F. Port & T. van Gelder (eds.), Mind as Motion: Explorations in the Dynamics of

Cognition, Cambridge, MA: The MIT Press, pp. 1-43

Varela, F. J. (1976), “Not One, Not Two”, The Co-Evolution Quarterly, 12, pp. 62-67

Varela, F. J. (1979), Principles of Biological Autonomy, New York, NY: Elsevier North Holland

Varela, F. J. (1984), “The creative circle: Sketches on the natural history of circularity”, in: P.

Watzlawick (ed.), The Invented Reality: How do we know what we believe we know? Contributions

to constructivism, New York, NY: W. W. Norton & Company, pp. 309-323

Varela, F. J. (1991), “Organism: A meshwork of selfless selves”, in: A. I. Tauber (ed.), Organisms and

the Origins of Self, Dordrecht, Netherlands: Kluwer Academic Publishers, pp. 79-107

Varela, F. J. (1992), “Autopoiesis and a Biology of Intentionality”, in: B. McMullin & N. Murphy (eds.),

Proc. of Autopoiesis and Perception: A Workshop with ESPRIT BRA 3352, Dublin, Ireland: Dublin

City University, pp. 4-14

Varela, F. J. (1996a), “The early days of autopoiesis: Heinz and Chile”, Systems Research, 13(3), pp. 407-416

Varela, F. J. (1996b), “Neurophenomenology: A Methodological Remedy for the Hard Problem”, Journal

of Consciousness Studies, 3(4), pp. 330-349

Varela, F. J. (1997), “Patterns of Life: Intertwining Identity and Cognition”, Brain and Cognition, 34(1),

pp. 72-87

Varela, F. J. (1999), “The Specious Present: A Neurophenomenology of Time Consciousness”, in: J.

Petitot, F. J. Varela, B. Pachoud & J.-M. Roy (eds.), Naturalizing Phenomenology: Issues in

Contemporary Phenomenology and Cognitive Science, Stanford, CA: Stanford University Press, pp.

266-317

Varela, F. J. & Shear, J. (1999), “First-person Methodologies: What, Why, How?”, Journal of

Consciousness Studies, 6(2-3), pp. 1-14

Varela, F. J., Maturana, H. R. & Uribe, R. (1974), “Autopoiesis: The organization of living systems, its

characterization and a model”, BioSystems, 5, pp. 187-196

Varela, F. J., Thompson, E. & Rosch, E. (1991), The Embodied Mind: Cognitive Science and Human

Experience, Cambridge, MA: The MIT Press

Velmans, M. (2007), “Where experiences are: Dualist, physicalist, enactive and reflexive accounts of

phenomenal consciousness”, Phenomenology and the Cognitive Sciences, 6(4), pp. 547-563

Von Foerster, H. (1960), “On Self-Organizing Systems and their Environments”, in: M. C. Yovits & S.

Cameron (eds.), Self-Organizing Systems, London, UK: Pergamon Press, pp. 31-50

Von Foerster, H. (1973), “On constructing a reality”, in: W. F. E. Preiser (ed.), Environmental Design

Research, vol. 2, Stroudsburg, PA: Dowden, Hutchinson and Ross, pp. 35-46

Von Foerster, H. (1976), “Objects: Tokens for (Eigen-)Behaviors”, ASC Cybernetics Forum, 8(3-4), pp.

91-96

von Glasersfeld, E. (1984), “An Introduction to Radical Constructivism”, in: P. Watzlawick (ed.), The

Invented Reality: How do we know what we believe we know? Contributions to constructivism, New

York, NY: W. W. Norton & Company, pp. 17-40

von Uexküll, J. (1934), “Streifzüge durch die Umwelten von Tieren und Menschen: Ein Bilderbuch

unsichtbarer Welten”, trans. by: C. H. Schiller, “A stroll through the worlds of animals and men: a

picture book of invisible worlds”, in: C. H. Schiller (ed.), Instinctive Behavior: The Development of a

Modern Concept, New York, NY: International Universities Press, 1957, pp. 5-80. Also appeared in:

Semiotica, 89(4), 1992, pp. 319-391

Vygotsky, L. S. (1934/1978), Mind in society: The development of higher psychological processes,

Cambridge, MA: Harvard University Press

Webb, B. (2001), “Can robots make good models of biological behaviour?”, Behavioral and Brain

Sciences, 24(6), pp. 1033-1050

Weber, A. & Varela, F. J. (2002), “Life after Kant: Natural purposes and the autopoietic foundations of

biological individuality”, Phenomenology and the Cognitive Sciences, 1(2), pp. 97-125

Wheeler, M. (1997), “Cognition’s Coming Home: the Reunion of Life and Mind”, in: P. Husbands & I.

Harvey (eds.), Proc. of the 4th Euro. Conf. on Artificial Life, Cambridge, MA: The MIT Press, pp. 10-19

Wheeler, M. (2005), Reconstructing the Cognitive World: The Next Step, Cambridge, MA: The MIT

Press

Wheeler, M. (2008), “Cognition in Context: Phenomenology, Situated Robotics and the Frame Problem”,

International Journal of Philosophical Studies, 16(3), pp. 323-349

Williams, P., Beer, R. & Gasser, M. (2008), “Evolving referential communication in embodied dynamical

agents”, in: S. Bullock, J. Noble, R. Watson & M. Bedau (eds.), Artificial Life XI: Proceedings of the

11th Int. Conf. on the Simulation and Synthesis of Living Systems, Cambridge, MA: The MIT Press,

pp. 702-709

Winograd, T. (1972), Understanding Natural Language, New York, NY: Academic Press

Winograd, T. & Flores, C. F. (1986), Understanding Computers and Cognition: A New Foundation for

Design, Norwood, NJ: Ablex Publishing Corporation

Wood, R. & Di Paolo, E. A. (2007), “New Models for Old Questions: Evolutionary Robotics and the ‘A

Not B’ Error”, in: F. Almeida e Costa, L. M. Rocha, E. Costa, I. Harvey & A. Coutinho (eds.),

Advances in Artificial Life: Proc. of the 9th Euro. Conf. on Artificial Life, Berlin, Germany:

Springer-Verlag, pp. 1141-1150

Zahavi, D. (1996), Husserl und die transzendentale Intersubjektivität: Eine Antwort auf die

sprachpragmatische Kritik, trans. by: E. A. Behnke, Husserl and Transcendental Intersubjectivity: A

response to the Linguistic-Pragmatic Critique, Athens, OH: Ohio University Press, 2001

Zahavi, D. (1999), Self-Awareness and Alterity: A Phenomenological Investigation, Evanston, IL:

Northwestern University Press

Zahavi, D. (2001), “Beyond Empathy: Phenomenological Approaches to Intersubjectivity”, Journal of

Consciousness Studies, 8(5-7), pp. 151-167

Zahavi, D. (2005), Subjectivity and Selfhood: Investigating the First-Person Perspective, Cambridge,

MA: The MIT Press

Ziemke, T. (1999), “Rethinking Grounding”, in: A. Riegler, A. von Stein & M. Peschl (eds.),

Understanding Representation in the Cognitive Sciences, New York, NY: Plenum Press, pp. 177-190

Ziemke, T. (2001), “Are Robots Embodied?”, in: C. Balkenius, J. Zlatev, H. Kozima, K. Dautenhahn & C.

Breazeal (eds.), Proc. of the 1st Int. Workshop on Epigenetic Robotics: Modeling Cognitive

Development in Robotic Systems, Lund University Cognitive Studies, 85, pp. 75-93

Ziemke, T. (2003), “What’s that thing called embodiment?”, in: R. Alterman & D. Kirsh (eds.), Proc. of

the 25th Annual Conf. of the Cognitive Science Society, Mahwah, NJ: Lawrence Erlbaum, pp. 1305-1310

Ziemke, T. (2007), “What’s life got to do with it?”, in: A. Chella & R. Manzotti (eds.), Artificial

Consciousness, Exeter, UK: Imprint Academic, pp. 48-66
