
Computer Animation for Articulated 3D Characters

Szilárd Kiss

October 15, 2002

Abstract

We present a review of the computer animation literature, concentrating mainly on articulated characters and on methods that offer at least some degree of interactivity or real-time simulation. Advances in different techniques such as key-framing, motion capture (also known as mocap), dynamics, inverse kinematics (IK), controller systems, and even neural networks are discussed. We analyze these methods from different aspects: semantics, data input, animation types, generating animations, combining animations, varying animations, and target type; even the path-planning perspective is taken into account for the reviewed animation systems.

1 Introduction

This report is an overview of motion (animation) control methods for articulated figures. These articulated figures consist of rigid links (segments in the terminology of H-anim [42], a standard for articulated human figures) connected by joints with 1, 2 or 3 degrees of freedom (DOFs). The control methods use kinematics (time-based joint rotation values) or dynamics (force-based simulation of movement) to drive the articulated figures; the concepts and basics of both are presented in many animation textbooks [61] [59]. Besides the task of animating, which can be quite complex by itself, some of the control methods presented take into account the context (environment) in which the character is being animated, such as environment objects and obstacles or even group and social behaviour [76]. The methods discussed differ not only in their motion control approach, but also in their animation possibilities, such as quality and speed, and thus they suit different target applications.

This research will be integrated into my own work via an animation editor, which is currently being developed. The animation editor complements a geometry editor [46] and an H-anim based bone hierarchy editor [47] to provide a system capable of producing the avatar or agent bodies needed to populate virtual environments. Since the aim is not just to provide a simple animation editor but a whole animation system that can be used for specifying basic movements and creating combinations of movements, the goal of this report was to find previous research dealing with animation in general, and with animation combination and the qualitative assessment of animations in particular. In our system, interactivity is the key requirement, which calls for speed, and more importantly, for animation generation and combination possibilities that result in compelling animation sequences. The outcome of this review is formulated in section 6.



2 Animation data

Traditionally, computer animation data is stored as key-frame data. This data structure has two components: the key, which specifies a relative time moment, and the animation data values at that time moment, such as rotation angles, displacements, etc. [58] [61].
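As a minimal illustration of this key/value structure (a generic sketch, not any particular system's file format), the following Python snippet stores a joint rotation channel as sorted (time, value) keys and linearly interpolates between them:

```python
import bisect

class KeyframeChannel:
    """One animation channel: sorted (time, value) keys, e.g. a joint angle."""
    def __init__(self, keys):
        self.keys = sorted(keys)          # [(time, value), ...]
        self.times = [t for t, _ in self.keys]

    def sample(self, t):
        """Linearly interpolate the channel value at time t."""
        if t <= self.times[0]:
            return self.keys[0][1]
        if t >= self.times[-1]:
            return self.keys[-1][1]
        i = bisect.bisect_right(self.times, t)
        (t0, v0), (t1, v1) = self.keys[i - 1], self.keys[i]
        alpha = (t - t0) / (t1 - t0)
        return v0 + alpha * (v1 - v0)

# Example: an elbow rotation (degrees) keyed at three relative time moments.
elbow = KeyframeChannel([(0.0, 0.0), (0.5, 90.0), (1.0, 45.0)])
print(elbow.sample(0.25))  # 45.0
```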

These data values can be created manually (by talented animators) or captured automatically using different types of capture hardware [62] [75] or simpler mechanical devices like a bicycle [17]. Mechanical, force-based measurements were used in the past for (medical) musculoskeletal measurements and analysis, such as the walking characteristics analysis presented in [43]. Newer, magnetic or optical commercial motion capture systems use markers for tracking coordinates, and even mechanical (or optical fiber) wearable systems exist for capturing movement; the latter are more cumbersome but provide a greater area of movement freedom. Markerless, image-based motion capture systems are also starting to emerge, where the input is one or more video camera feeds [28]. Motion capture systems usually store data as absolute coordinate values, but there are motion capture file formats that specify rotation data, which can be extracted from the absolute coordinates for use with hierarchically articulated characters.

Next to body animation data, special motion capture systems are also used to capture facial animation data. [67] uses this technique to identify MPEG4 Facial Animation Parameters (FAPs) in captured data and reuses them with 3D and cartoon heads consisting of a fixed skeleton animated with spring (muscle) simulations.

Motion capture data can be used for directly animating (directing in real time) characters, as in the tennis game and virtual dancing in [45] and [62], possibly enhanced with additional motion/gesture recognition routines [26]. This type of animation may be useful for interactive systems, television, etc. If the data is applied to characters later, it is stored in key-frame format, usually as rotation angle data.

There are systems that extract motion key-frames from any kind of data source (manual, IK, procedural, mocap) and apply that data to their own purposes, for instance qualitative variance [23] or composing animations [64] [13]; these methods are described in section 3.2 and its subsections.

2.1 Animation data types

Several distinctions can be applied to animation data. One of them is reusability. Some data can be reused in the sense of re-run cycles (like walking or running [81] [18]) or by simply repeating a certain movement sequence when needed. There are also animation sequences that depend on the character's properties (position, orientation) relative to target entities and thus cannot be used in a pre-recorded manner; these are animations like grasping or pointing [40], which are usually achieved using IK or similar techniques described in section 3.1.



3 Generating animation

Table 3.1 enumerates and roughly classifies the animation methods that are discussed in the rest of this section. The classification dimensions are:

1. Interactivity/Speed. The higher this value is, the more usable the system is in interactive virtual systems. Simple motion combination methods are the easiest and fastest form of animation.

2. Reusability/Speed. This dimension rewards motion control methods that reuse recorded movements, not requiring new motion data generated from scratch.

3. Generality. Physics-based and IK motion systems score higher, since in theory they can produce an unlimited number of motion types, while procedural systems are limited by the motion data they possess (a limitation of both quality and quantity).

4. Quality. Physics-based and qualitative variance methods score higher by producing unique, context-sensitive, adapted motions.

3.1 Physical simulation

A widespread method for animating articulated figures is kinematics. It is based on the motion properties time, position and velocity. The term forward kinematics is used when joint rotations are specified as a function of time. The term inverse kinematics refers to the problem of finding the forward kinematics values when the end-point (end effector) of the character articulation is known. There are many different methods to solve this problem, and this attention transformed the concept into a well-known animation (simulation) alternative. The most common examples of forward and inverse kinematics are key-frame editors (which interpolate forward kinematics values from a few intermediate values) and the possibilities, present in many modeling/animation packages [30], of fixing the extremities of articulated figures while moving the other components or of dragging the extremities themselves (in which case IK methods resolve the joint rotations).
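To make the forward/inverse distinction concrete, here is a small generic sketch (not the method of any cited system) of analytic IK for a planar two-link limb: given link lengths and a reachable end-effector target, the two joint rotations are resolved in closed form.

```python
import math

def two_link_ik(x, y, l1, l2):
    """Analytic IK for a planar 2-link chain: returns (shoulder, elbow) angles
    in radians so the end effector reaches (x, y), or None if out of reach."""
    d2 = x * x + y * y
    # Law of cosines gives the elbow angle.
    c = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= c <= 1.0:
        return None                      # target outside the reachable annulus
    elbow = math.acos(c)                 # the "elbow-down" solution
    shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow),
                                             l1 + l2 * math.cos(elbow))
    return shoulder, elbow

print(two_link_ik(1.0, 1.0, 1.0, 1.0))   # (0.0, pi/2)
```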

Dynamic simulations are force-based simulation methods that take into account the forces that occur in articulated characters as well as additional properties or events such as velocities, mass, torques and collisions [40]. This animation method is computationally expensive; it is used for instance in realistic simulations [37] [83] [78] and in usability tests [57] [5].

With these simulation methods, the end result generally consists of time-dependent values representing the forward kinematics angle data that can be used to drive the target articulated character.

The IK problem has been approached from different angles. There are numerical and analytical methods [54] [5], but there are also simpler, intuitive, non mathematics-inspired methods aiming for speed gains: [34] tries to approximate target points with forward kinematics, while [30] describes techniques where the hierarchies (limbs) are inverted.



Procedural animation (based on data):
- Fourier transformations to extract and alter qualitative features
- image and signal processing techniques: filtering motion features
- signal processing techniques for emotional transforms
- learning and applying motion styles through HMMs and analogy
- LMA properties based alterations (for effort and shape)
- knowledge-driven alterations for specific subdomains
- noise functions for qualitative variance
- constraints and displacement maps
- constraint-based retargeting with a spacetime approach
- motion warping (time based alteration)
- simulated annealing (scaling & fine-tuning)
- path-based motion alteration
- motion prototypes annotated with behavioral tags
- combination based on height (foot elevation) criteria
- 2nd order derivative to extract and combine basic motions
- priority (importance) levels and weights based combination
- segmenting data and recombining with transitional motions
- combining non-overlapping data based on spacetime constraints
- timewarping to match motion sequences
- fade-in, fade-out functions
- curve blending functions
- interpolation between motion sequences
- grouping mutually exclusive motions for faster combination

Physical simulation (control systems):
- dynamic Newton equations system
- mass data and moments of inertia with motion equations
- inverse dynamics with energy minimization constraints
- decentralized, agent-based approach (forces = agents)
- inverse kinetics (IK with mass)
- muscles as motion actuators with sensor feedback
- proportional-derivative (damped spring) controllers (PD)
- PD servos for dynamic simulation of joint torques
- Genetic Programming method to create PD controllers
- closed loop method for balancing in open loop walking
- dynamics system with feedback control method
- simplified inverse dynamics with mass (Newton-Euler eq.)
- iterative numerical integration for IK
- IK constraints with forward kinematics and feedback
- inverting hierarchies (articulations)
- "sagittal elevation" resolution method for IK
- direct manipulation
- Neural Networks based learning from examples/feedback
- Machine Learning (adaptive state machines)

Table 3.1: Classification dimensions for different animation methods. The original table arranges these two method groups along the four dimensions Interactivity/Speed, Reusability/Speed, Generality and Quality.


[73] describes a molecular simulation based IK algorithm used to calculate the motion of a skeleton's joints in real time, using 5 magnetic track sensors as end effectors. The algorithm iteratively performs numerical integration of constrained motion in a Lagrange framework.

[8] describes a heuristic, iterative approach to IK resolution. In this approach, each functional segment of a limb hierarchy (in their specific case a robot arm) is regarded as a set of agents, yielding a decentralized approach. The constraints are regarded as forces, and each force is represented by an agent. Four types of forces are used: goto (target), pain (operation range), relax (equilibrium point) and orientation (for grabbing). Path smoothing must be applied after the IK resolution to eliminate jitter.

[78] uses kinematic constraints for physics-based simulation of artificial animals (fish). Inverse dynamics is applied by solving motion equations (the Newtonian laws of motion), resulting in forces that are physically correct but not necessarily realistic. Constrained optimization is used to find motions requiring less energy (an open-loop controller). Another method presented for physical animation is motion synthesis using muscles as motion actuators, which allows the integration of virtual sensor feedback. Finite-state machines are used to add and remove constraint equations.

The paper [14] presents a concept called "Inverse Kinetics", a variant of IK incorporating center of mass information. Besides the pseudo-inverse Jacobian matrix calculations for IK, the mass distribution information is considered in the calculations to provide center of mass position control and posture optimization for a more physically correct appearance. The same idea is presented in more detail in [15].

Another type of control system is the Proportional-Derivative (damped spring) controller [52] [34], PD for short. It combines a proportional term with a deviation-minimizing derivative term, which means that it generates a force or torque proportional to the differences in position and velocity between source and target coordinates.
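In code terms the PD rule is a simple spring-damper law. The sketch below, with made-up gain values kp and kd and a unit-inertia joint integrated with symplectic Euler, shows a single joint being driven towards a target angle:

```python
import math

def pd_torque(theta, theta_target, omega, omega_target=0.0,
              kp=120.0, kd=12.0):
    """Proportional-derivative (damped spring) controller: torque is
    proportional to the position error, damped by the velocity error."""
    return kp * (theta_target - theta) + kd * (omega_target - omega)

# One simulated joint with unit inertia, stepped with symplectic Euler.
theta, omega, dt = 0.0, 0.0, 0.01
for _ in range(500):
    torque = pd_torque(theta, math.radians(45.0), omega)
    omega += torque * dt
    theta += omega * dt
print(round(math.degrees(theta), 1))  # settles near 45.0
```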

Similar is the work presented in [52], where an open loop cyclic motion (walking) is extended with closed loop feedback to include balancing properties. A finite state automaton is used for balancing feedback, while a PD controller is used for the walking simulation.

[39] also uses the PD technique, with servos computing joint torques for the dynamic simulation of human athletics movements (running, bicycling, vaulting). Manually constructed control systems (incorporating intuition, observations and biomechanical data, and built using a control techniques toolbox) are used for characters and environment objects. The user has the possibility to change certain high-level properties of the simulation (speed, orientation, etc.). They also use mass-spring cloth simulation to add secondary motion to the animations.

[71] describes a dynamics method with a musculoskeletal basis structure and uses a dynamic Newton equations system to resolve the animation. The system is called dynamic because equations can be added or removed and attributes changed on the fly. Articulation relationships are modeled with constraint equations, which are also dynamically alterable.

The dynamics system in [49] uses, as an efficiency enhancement, a feedback control scheme with a two-stage collision response algorithm. The collision response algorithm consists of a system for calculating the velocity change at impact and a second system for calculating forces and counter-forces for non-penetrating objects. The user has the possibility to specify kinematic trajectories for selected DOFs.

[41] describes an inverse dynamics technique applied to a simplified skeleton with mass information attached to the body segments. A recursive algorithm based on the Newton-Euler motion equations is used to calculate the force and torque of joint actuators.

A walking gait controller is presented in [72] that generates walking motion along a curved path and over uneven terrain, with a high degree of control over path specification. The high-level parameters are step length, step height, heading direction and toe-out. A "sagittal elevation" motion algorithm specifies the foot-floor interaction model, using the sagittal elevation angle for realistic contact. The motion algorithm takes into account the properties of parent joints and segments (rotation, body properties) to calculate position and positioning data relative to the basis key-frame data.

[37] and [83] present human running and human diving animation methods, respectively, based on the same principles. They use mass data and moments of inertia in combination with a commercial package that generates motion equations. The movements are separated into phases, and a state machine is used to control the action selection mechanism. The movement actions in the case of diving are more diverse than in the case of running, including jumping and twisting next to running and balancing.

[53] presents interactive techniques for controlling physically-based animated characters using the mouse and the keyboard. By assigning mouse movements or keyboard shortcuts to DOFs, direct manipulation of characters of limited complexity (2D, few DOFs) is possible. The input devices also enforce limitations; for instance, a mouse can control 4 DOFs at a time (the control procedure is time-dependent). A state automaton is used to combine the animations.

Another category of controllers stems from the field of artificial intelligence and neural networks (NN). [36] uses neural networks for learning motion (from examples or positioning feedback), but applies them only to simple articulations (no humans): a pendulum (3 limbs), a car (learning how to park), a lunar module (learning to land) and a dolphin (learning to swim). Similar is their earlier work [35] concentrating on automatic locomotion learning for fish and snake-like creatures with highly flexible, many-DOF joints, but also lacking the complexity that is necessary for human articulated figures. Their work was certainly inspired by the pioneering work in this area conducted by Sims [70].

Similar is the approach taken by [34], which achieves physical simulation with the aid of Genetic Programming (GP) to produce controller programs. Key marks (a subset of key frames and tracks) are used to specify the goals of an animation; they are comparable to spline control points with uncertainty incorporated in both values and time. From these, a Proportional-Derivative controller is calculated.

[7] presents a data-driven NN approach that aims to grasp the essence and intrinsic variability of human movement. They use a recurrent RPROP artificial neural network with state neurons and weighted cost functions, combined with a pre-processing normalization step based on a mannequin with average human characteristics, with the aim of learning and simulating human movements.

[27] describes physics based animation using adaptive state machines as controllers. The state machines adapt through machine learning techniques, and they use balancing, pre and post states for optimal control. Controller selection is based on a query with a weighted appropriateness result, controllers having scopes and weights that define their capabilities. The technique is presented as a more general, longer-term solution than data blending and warping, but it is rather a question of physical simulation versus a data based approach.

[29] uses high level cognitive modeling for action planning. Domain knowledge is represented by preconditions, actions and effects, while characters are directed by goals. The AI formalism "situation calculus" is used to describe changing worlds using sorted first-order logic, which means using the possible successor state axiom instead of the effect axiom (what the world can turn into). The paper presents an animation method hierarchy (a pyramid hierarchy) with geometric, kinematic, physical, behavioral and cognitive modeling representing higher and higher levels of animation techniques. They use cognitive modeling to direct behavioral animation.

[40] presents another type of domain knowledge. Virtual touch sensors are used in the hands of virtual humans to simulate grasping behavior. Small spheres serve as collision sensors in combination with grasping knowledge (deduced from the grasped object's properties) for a physically correct grasping sequence.

A particular method <strong>for</strong> character animation is presented in [63]. A timed, parametric,<br />

stochastic and conditional production L-system expanded with <strong>for</strong>ce<br />

fields, synthetic vision and audition is used to simulate the development and<br />

behavior of static objects, plants and autonomous creatures. An extra iteration<br />

step is used to derive the state of objects at increasing time moments. The<br />

animation system used is a real-time multi-L-system interpreter enhanced with<br />

ray tracing image production. Emergent behaviors can emerge by using virtual<br />

sensor functions in the conditions of production rules. A physically based virtual<br />

sensor is the tactile sensor (the other two virtual sensors are the image-based<br />

visual sensor and the acoustic sensor) which evaluates the global <strong>for</strong>ce field at<br />

a certain position.<br />

[20] uses different levels of detail for physical simulations. Similar to graphics LOD techniques, different simulation levels are used for animation: dynamic simulation, hybrid kinematic/dynamic simulation and point-mass simulation. These levels are selected according to interaction probabilities. Objects have a space of influence, determining collision possibilities and thus the need for a detailed dynamic simulation. The switching of simulation levels is done during simulation steps that match certain criteria for a smooth transition, like the character being airborne. Another possible simulation LOD selection criterion the authors mention is a cost/benefit heuristic, which requires the developer to specify importance values in the simulation environment.
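A sketch of such a selection policy (the thresholds are made-up illustration values, not from [20]): the desired level follows the interaction probability, approximated here by distance against the sphere of influence, and switching is deferred to a smooth-transition moment such as being airborne.

```python
def pick_level(dist, influence_radius):
    """Desired simulation level by interaction probability; the distance
    thresholds are hypothetical illustration values."""
    if dist < influence_radius:
        return 2        # full dynamic simulation
    if dist < 3.0 * influence_radius:
        return 1        # hybrid kinematic/dynamic simulation
    return 0            # point-mass simulation

def update_level(current, dist, influence_radius, airborne):
    """Switch simulation LOD only at smooth-transition moments, e.g. while
    the character is airborne (in the spirit of [20])."""
    desired = pick_level(dist, influence_radius)
    return desired if airborne else current

level = update_level(2, dist=20.0, influence_radius=5.0, airborne=True)
print(level)  # 0: far away and airborne, so drop to point-mass
```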

3.2 Procedural animation

We divide the procedural animation methods from two points of view. First, we look at combination techniques for motion data (left column, bottom in table 3.1), which provide the easiest interactivity possibilities by simply reusing a motion library. In addition to combining motion sequences, there is another category of motion generating methods (left column, top) which aim to alter the motion data based on properties defined for the character, its mental model, behavior, etc. The latter methods are generally used together with motion combination techniques, so this category is possibly less interactive than the motion combining methods alone (for instance by being more computationally demanding and not performing in real time), but it also provides more qualitative variance in the generated motion sequences.

An animation method that does not fit the following classifications is the "impostor" technique presented in [2]. Their aim was to produce real-time (fast) animations for geometrically complex characters, possibly multiple characters. Therefore they use the 2D sprite technique in 3D by pre-generating snapshot pictures of the complex characters at several levels of detail and mapping these onto billboards instead of really animating the characters. Re-rendering is controlled by a temporal coherence model: only if threshold values (differences between the snapshot and the animated character) are exceeded is the impostor re-rendered.

3.2.1 Combining animation data

Combining animation sequences (motion capture data) can be a tedious task if done by hand. Therefore, automatic motion combination techniques have emerged, from the simplest to the most complex, from fading functions to overlapping and blending techniques.

One of the simplest motion combination methods is the use of fade-in and fade-out functions [65] [54] or the motion parameter curve blending method used by [82]. Similar is the method presented by [3], where motions are segmented into motion phases that can be named, stored and executed separately, possibly connected via transitional motions. Slightly more complex is the motion interpolation technique used by [72] to combine walking movement sequences into data with height changes at foot contact, fitting the terrain variations the character is walking on.
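A minimal sketch of such a fade-in/fade-out transition, assuming both clips are already resampled to per-frame joint angles (real systems typically blend joint quaternions, and the cubic smoothstep weight is just one common choice, not necessarily the one in the cited systems):

```python
def smoothstep(a):
    """Ease-in/ease-out weight in [0, 1] (a cubic step function)."""
    return a * a * (3.0 - 2.0 * a)

def crossfade(clip_a, clip_b, overlap):
    """Blend the last `overlap` frames of clip_a into the first `overlap`
    frames of clip_b. A clip is a list of frames; a frame is a list of
    joint angles (plain angles keep the sketch short)."""
    out = clip_a[:-overlap]
    for i in range(overlap):
        w = smoothstep((i + 1) / overlap)
        fa, fb = clip_a[len(clip_a) - overlap + i], clip_b[i]
        out.append([(1.0 - w) * a + w * b for a, b in zip(fa, fb)])
    out.extend(clip_b[overlap:])
    return out

walk = [[0.0, 10.0]] * 30     # two-joint toy clips
run  = [[5.0, 40.0]] * 30
blended = crossfade(walk, run, overlap=10)
print(len(blended), blended[25])
```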

In addition to the simple fade-in and fade-out functions, [65] uses radial basis functions to interpolate between distinct motion data sequences. The approach is based on the "verbs and adverbs" analogy, where the original motion (verb) is interpolated according to some property (adverb) for motion transition purposes. This way, a smooth transition is achieved between motion sequences. In an earlier publication [66] they use spacetime constraints for the same purpose, for reassembling non-overlapping motion sequences. Although motion data is used, it is broken up and converted into a dynamics formulation incorporating spacetime and IK constraints, and an energy (torque) minimizing approach is used to enhance the naturalness of the generated movement.

[21] and [22] use motion prototypes, a dictionary of gestures, to create animations from text and dialogue theory. The gesture motions are annotated and synchronized for animating dialogues using Parallel Transition Networks (PaT-Nets); see section 5.3 for a description of this and other animation architectures.

However, these methods work satisfactorily only with non-conflicting motion sequences. When the motion data sets act upon the same target joints, the results can be unexpected. To cope with this problem, different methods for combining motion data were proposed, based mainly on different weighting and summing techniques.

[40] [69] use priority (importance) weights and weight-based combination of conflicting data in the initiating and terminating movement phases, with an ease-in and ease-out technique based on cubic step functions, to achieve smoothness next to the meaningful combination of movements. These methods were collected in a software library [13] for the management of animation data combinations. In this library, actions have scope (affected joints) and priority (relative importance of the action) properties, while agents (articulated characters) also have scope: they can accept or reject actions. The blending of motions is executed depending on the action type, motion or posture.
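The following sketch illustrates the weighting idea with hypothetical data structures (not the actual interface of the library in [13]): each action carries a priority, a weight and a scope of joints; conflicting contributions to a joint are resolved by keeping only the highest priority and averaging those by weight.

```python
from collections import defaultdict

def combine(actions):
    """Each action: (priority, weight, {joint: angle}). For every joint, keep
    only the contributions of the highest priority acting on it, then take
    their weighted average."""
    per_joint = defaultdict(list)        # joint -> [(priority, weight, angle)]
    for prio, weight, pose in actions:
        for joint, angle in pose.items():
            per_joint[joint].append((prio, weight, angle))
    result = {}
    for joint, entries in per_joint.items():
        top = max(p for p, _, _ in entries)
        winners = [(w, a) for p, w, a in entries if p == top]
        total = sum(w for w, _ in winners)
        result[joint] = sum(w * a for w, a in winners) / total
    return result

wave = (2, 1.0, {"r_shoulder": 80.0, "r_elbow": 90.0})   # higher priority
walk = (1, 1.0, {"r_shoulder": 10.0, "hip": 20.0})
print(combine([wave, walk]))  # r_shoulder follows the wave, hip the walk
```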

[64] uses the traditional animators' knowledge that "human motions are created from combinations of temporally overlapping gestures and stances", but goes one step further and uses groups of mutually exclusive animations, composing levels with different importance weights; next to motion combination, this achieves a certain amount of speed gain by simplifying the search for conflicting motions.

[12] uses a range of motion generator techniques: dynamic, motion capture (live motion), motion capture playback and IK motion tasks, where the motion generator techniques can be used in combination (one technique per motion generator, which acts on a set of joints). In their case, conflicting joints are summed using priority levels and weights.

A particular case of motion combination is presented in [44], where cyclic motion capture data is used in combination with a PD controller. The controller is responsible for executing the linear and angular acceleration of state variables over time as a result of the path planning algorithm.

3.2.2 Altering (varying) animation data

If motion sequences are reused frequently, the repetitions lend a monotonous feeling to the animation. Since motion data (motion libraries) are limited, researchers have tried to introduce variations into motion data to achieve non-repetitive movement sequences. They achieved this in various ways, ranging from noise functions to B-splines and from IK to neural networks.

Qualitative variations

The first set of methods discussed uses key-frame data and alters it qualitatively using different functions and methods. The goal is to introduce qualitative variations into the animation data, making non-repetitive re-use of the same data set possible.

[79] applies Fourier transformations to motion data and uses frequency analysis to extract basic factors (like walk) and qualitative factors (like brisk, slow, etc.). Interpolation and extrapolation in the frequency domain are then used to create new motion types from neutral data and emotional mood variations. The new motions can be generated continuously, and additional manual control possibilities exist to influence the animation by hand.
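A sketch of this frequency-domain inter/extrapolation, under the assumption that a neutral and an emotional version of the same cyclic motion are available and time-aligned (NumPy's FFT is used; a factor s greater than 1 extrapolates beyond the captured emotional style):

```python
import numpy as np

def emotional_mix(neutral, emotional, s):
    """Interpolate (0 <= s <= 1) or extrapolate (s > 1) between the Fourier
    coefficients of two time-aligned joint-angle signals, in the spirit
    of [79]."""
    fn, fe = np.fft.rfft(neutral), np.fft.rfft(emotional)
    return np.fft.irfft(fn + s * (fe - fn), n=len(neutral))

t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
neutral_walk = 20.0 * np.sin(t)                        # toy hip-angle cycle
brisk_walk   = 25.0 * np.sin(t) + 4.0 * np.sin(3.0 * t)
brisker = emotional_mix(neutral_walk, brisk_walk, 1.5)  # exaggerate briskness
print(brisker.shape)
```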

[1] performs emotional transforms on motion data using signal processing techniques. They analyze and extract neutral motion and emotional differences from motion data and apply these to other types of neutral data. The varying components used (extracted) are speed and spatial amplitude.



[82] uses motion warping techniques to alter key-frame data. By warping the motion parameter curves to animator-specified constraint key-frames, local variations are introduced in the motion data without changing the fine structure of the original motion. Motion sequences are combined by overlapping and blending the parameter curves. The technique can be used to create libraries of reusable motion by specifying a small number of constraint key-frames.

[19] uses similar techniques from the image and signal processing domain as complementary techniques to key-framing, motion capture and procedural animation. The techniques are multiresolution motion filtering, multitarget motion interpolation with dynamic timewarping, waveshaping and motion displacement mapping. In multiresolution motion filtering, low frequencies contain the general motion, while high frequencies contain the detail, subtleties and noise; by filtering certain bands, different styles of motion can be obtained. Timewarping techniques are used to match movement patterns regardless of the animation time, resulting in the combination (warping) of movement sequences. Motion waveshaping is used to introduce motion limits or to alter movement styles. Motion displacement mapping is used to locally alter a movement sequence without changing the global aspect of the movement. The authors state that the most important feature of this latter technique would be the alteration of a grasping movement towards a different target object (although the new target object should be close to the original one).
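The band-splitting idea can be sketched with a simple moving-average low-pass as a crude stand-in for the paper's multiresolution pyramid: separate a joint-angle curve into a low-frequency base and a high-frequency detail band, then re-weight the detail to exaggerate or smooth the motion style.

```python
import numpy as np

def reweight_detail(signal, detail_gain, kernel_size=9):
    """Split a joint-angle curve into low/high bands with a moving-average
    low-pass (a rough stand-in for multiresolution filtering) and scale the
    high band: gain > 1 exaggerates subtleties, gain < 1 smooths them."""
    kernel = np.ones(kernel_size) / kernel_size
    pad = kernel_size // 2
    padded = np.pad(signal, pad, mode="edge")
    low = np.convolve(padded, kernel, mode="valid")
    high = signal - low
    return low + detail_gain * high

t = np.linspace(0.0, 4.0 * np.pi, 200)
angles = 30.0 * np.sin(t) + 2.0 * np.random.randn(200)  # motion + detail
print(reweight_detail(angles, 0.2).shape)               # smoothed variant
```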

[16] uses "Style" Hidden Markov Models (HMMs) to learn patterns, in particular the style of movements, from motion capture data and then uses these Style HMMs to generate motions with different style and scope. The goal is to extract a parameterized model that covers the generic behavior of a family of motion capture data. The Style HMMs are applied using analogy rules; the technique is applicable to long motion data sequences and is regarded by the authors as an alternative to motion libraries.

New motion from data

A second set of methods uses motion data as the basis for new motion calculations, such as noise functions or knowledge-driven methods; even physical simulation methods are sometimes used.

[64] uses noise functions on existing key-frame data, in combination with IK-based motion calculations, to simulate vivid, natural movements from existing key-frame motion data.
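A sketch of the noise idea (a cheap sum-of-sines stand-in for Perlin-style noise functions, not the authors' actual implementation): perturbing a per-frame clip with a seeded, band-limited offset makes each replay differ slightly.

```python
import math, random

def smooth_noise(t, seed, amplitude=2.0, frequency=1.3):
    """Band-limited noise from a few incommensurate sines with seeded random
    phases (a stand-in for Perlin-style noise)."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)]
    return amplitude * sum(
        math.sin(frequency * (k + 1) * t + phases[k]) for k in range(3)) / 3.0

def vary(frames, dt, seed):
    """Perturb a per-frame joint-angle clip so each replay differs slightly."""
    return [[a + smooth_noise(i * dt, seed + j) for j, a in enumerate(frame)]
            for i, frame in enumerate(frames)]

clip = [[0.0, 10.0]] * 30
replay1 = vary(clip, 1.0 / 30.0, seed=1)
replay2 = vary(clip, 1.0 / 30.0, seed=2)   # same clip, different feel
print(replay1[5], replay2[5])
```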

A hybrid numerical IK method is presented in [54] for interactive motion editing. Constraints are specified on the key-frame motion data, to which B-splines are fitted. These fittings result in hierarchical displacement maps that are used for motion alterations, from which the animation data is calculated (using IK) by eliminating redundant coordinates. Example constraints specify bending below doorways, terrain following, motion combination and even motion retargeting (the next group of motion altering methods).

[18] uses interactive, knowledge driven altering techniques for running data, based on empirical knowledge (relations between parameters), physical knowledge (for the body trajectory) and constraint knowledge (for flight/support states and stance/swing phases). Examples of parameter relations are presented, such as: the increase in velocity is linear with the increase in step length up to a certain speed (24 km/h); beyond that, only the step frequency is increased to further increase the velocity. Other relations regard the level of expertise (experts have greater stride length) or the relation between leg length and the Froude number (inertial force / gravitational force).

[23] uses movement analysis and alteration based on LMA (Laban Movement Analysis) properties. The effort and shape properties are used (the other three properties used by LMA are body, space and relationship) to introduce qualitative differences (mood, internal state). Key-frame data from various sources is used; the shape parameter influences the key-frame values, while the effort parameter influences the execution speed of the key-frame data, also specifying the frame distribution between key-points. The altered key-frames are regarded as 'via' and 'goal' type end-effectors for an interactive IK system. The same requirements for virtual human animation are presented in an earlier work [4]. Based on an analysis of animation methods, the paper concludes that both empirical data and cognitive properties are important in animation, resulting in an approach based on LMA and human properties like personality, emotions and temperament. They express the view that directing attention and attention-dependent motions (walking, reaching) require sensing actions and interleaving motions.

Motion retargeting

A different set of methods concentrates on motion retargeting. The motion data is altered to be usable for articulated characters of different sizes and types.

[10] parameterizes the motion capture data and applies it to other target articulations. The second order derivative is used to search for significant changes in motion, analyzing zero-crossings to extract basic action segments, which are later interpolated using cubic splines to yield new motion sequences. Different types of tags are used for the objects participating in an action.

[31] retargets motion to new characters using specific features (contact points) of the original motion as constraints. Their argument is that procedural and simulation retargeting means new data, without the qualities of the original detailed mocap or key-frame data. They use a spacetime approach which eliminates the spatial jerkiness of frame-based retargeting methods through look-ahead and look-behind techniques (anticipating constraints) over the whole motion sequence, resulting in smooth motion data generation.

[38] uses a two-step motion adaptation method for new characters. The first step consists of geometric scaling enhanced with mass scaling, and in a second step the parameters are fine-tuned using a "simulated annealing" technique (progressive adaptation of multivariate functions towards target states). Although the method is applied to control systems and not to motion data like the other methods in this section, it is still relevant from the retargeting perspective.

One method presented in [32], which differs from the previous ones in its scope, is the use of different paths to retarget the motion data. Paths represented as editable B-spline curves are altered, and foot constraints are used to eliminate foot skating (a method also presented in the earlier work [31]).



3.3 Alternative classifications

There are of course other types of animation classification. For instance, [74] and [75] use a physical/geometrical classification instead of the physical/procedural one, where kinematics is listed as a geometrical animation type, next to procedural animation methods. [29] defines a pyramid hierarchy with geometric, kinematic, physical, behavioral and cognitive animation methods, all building on top of each other.

Animation could also be classified from the point of view of target systems. Film and animation production software uses complex physical simulations to make compelling animations, since its target objective is not the real-time applications market. On-line environments and games, on the other hand, have the real-time requirement and need fast, reusable animations, so they orient more towards procedural animation.

4 Movement semantics

Semantic analysis and semantic generation of movement are important components in the behavior (brain) specification for animated characters, as stated by [3] [4], where this semantics is used as a method to alter movement sequences. [18] uses a high-level motion control system based on dynamic parameters in combination with empirical, physical as well as constraint knowledge to procedurally generate walking/running animation data. Others like [40] use object grasping knowledge for sensor-based grasping simulation of different types of objects in virtual environments.

Classical cartoon creators use different animation sequences to emphasize the movement semantics of their cartoon characters. Computer character animators can use the same techniques when creating key-frame animations [51] to achieve expressiveness and believable behavior. Additionally, film theory concepts could be used for animations [56], mainly for staging, viewpoint animation, etc.

On the other hand, there are exact medical and psychological-social measurements for the semantics of human movement. Regarding medical measurements, [43] presents a series of walking characteristics, while [81] discusses the phases of walking, forces and moments, and muscle activity resulting from human gait analysis. Psychological textbooks analyze different human poses, assigning meanings to them. McNeill [60] analyzed the meaning of gestures and the gesturing behavior of humans during dialogs. He categorized arm gestures into five types — iconics, metaphorics, deictics, beats and emblems — and studied their connection to dialogue characteristics. Similarly, human facial movements were analyzed by Ekman [25] to extract the expression of different psychological effects such as mood or mind state from face configurations (features), resulting in the Facial Action Coding System (FACS).

These results found their way into computer animation. [21], and more recently the BEAT system [22], uses the McNeill gesture and dialogue semantics to generate the gesturing behavior of autonomous characters. [21] uses gestural annotations, gestural lexicon look-up and timing from speech generation in producing the animation. Similarly, [22] creates gestures from speech and dialogue rules. An arbitrary animation system is wrapped with behavior tags and used in synchronization with speech and the semantic context of the text to produce the gesture animations. Even FACS data is used to generate the facial expressions for the simulated dialogues.

[80], [55] and other facial animation systems rely heavily on the findings of Ekman in creating facial expressions. The Waters facial model [80] uses muscle simulations and the FACS system to model facial expressions and emotions. [21] uses the same FACS system to generate facial expressions such as lip shapes, conversational signals, punctuators, manipulators and emblems, animated to accompany their dialogue utterances.

Behavioral animation is also gaining momentum. Simple animal behaviors [78], complex human dialogue behaviors [22] and some of the motion alteration methods from section 3.2.2 use behavior, moods and emotions to influence the created animation, making it more expressive, natural and meaningful.

5 Applications, tools, systems

5.1 Editor types

Historically, 3D editor systems used different views for editing: the 3+1 approach, representing the scene from top, front, side and 3D views [61]. However, there are other approaches, where the editor interface is inserted into the 3D environment, breaking away from the multiple viewpoints and also from the WIMP interface towards a fully integrated 3D interface [33] [47].

Classical key-frame setting operations in commercial modeling packages handle, besides articulated characters, also the environment animations. They do not produce interactive environments, so this is a logical animation approach. There are also software packages that concentrate only on the animation of articulated characters, and the result of these animations can be applied in interactive environments. However, these latter animations are still rigid, and running them repeatedly will convey unnaturalness to the environment.

These packages have different tools for enabling easy foot positioning. [30] presents a number of these approaches used by commercial packages. The first method is IK with locks, where 3D character articulations are locked (pinned down) and IK calculations are used to calculate joint rotations when the character is moved. A typical example is pinning the foot down on the floor to eliminate foot skating. Some packages offer the same pinning functionality for forward kinematics as well, where the user positions the unconstrained articulations manually to achieve the desired result. A similar animation possibility is dragging limb extremities, where the internal joint rotations are resolved by IK. An advanced animation possibility is the "footstep generator", where the animator specifies footprints on the ground and the software animates the character's movements to match the footprints. This calculation is usually a physical simulation which has to be refined, for instance by adding personality or individual characteristics. An odd method for animation is the use of inverse (or broken) hierarchies. It is based on the forward kinematics animation of inverted limbs, typically the legs and arms. It is called broken because, to achieve inverse hierarchies with non-repeating leaves, the torso has to be separated from the hierarchies and animated (kept in synchronization with the rest of the character) separately. This can be a little cumbersome, but the advantage is that the foot can be positioned without IK. Generally, when characters are being animated, it is also important to have references to the previous frames. Some packages have a built-in onion-skin effect, or help by allowing the animator to place dummy objects as markers.

Another approach to animation, for which intuitive interfaces are also starting to emerge, is the use of key-frame or mocap data together with tools to test the manipulation possibilities on these data (combination and alteration techniques) as an interactive system would apply them in real time. Such a system is presented in [19], where filters can be used to alter animations.

A complex, high-level animation interface is presented in [68], which allows the directing of multiple actors and cameras in real time. Key words serve as high level controls, predefining actions, while text-based sentences are used for high level actions. There are also tools for preprogramming actions, since there are many virtual humans to control. A scripting tool records all actions played by an actor and plays them back as requested. Actors can be programmed one by one.

[33] uses a network of interrelated objects and constraints to visually edit the properties of virtual environments, including animation. The interesting aspect of this approach is that the edited environment and the controls are all three-dimensional, making the editing interface integrated into the environment.

[67] presents a facial animation editor system based on motion capture data and on editing constraints representing laws of co-activation, symmetry laws and animation limits, where the animation itself is driven by emotion parameters.

5.2 Planning

A classical path planning algorithm is the A* algorithm, used by [44]. They use an orthogonal projection of the environment onto a 2D plane divided into cells. Collision detection and path planning are done based on the occupied/free cells of the plane. The free cells of the plane are transformed into a graph, to which the A* algorithm is applied. Variations exist on this method, for instance boundary cell skipping for collision avoidance or multilevel cell maps [6].
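For reference, a compact sketch of A* over such a cell grid (a generic textbook formulation, not the exact variant of [44]), using 4-connectivity and a Manhattan-distance heuristic:

```python
import heapq, itertools

def astar(grid, start, goal):
    """A* over the free cells of a 2D occupancy grid (grid[y][x] == 0)."""
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])
    tie = itertools.count()                       # tie-breaker for the heap
    frontier = [(h(start), next(tie), 0, start, None)]
    came_from = {}
    while frontier:
        _, _, g, cell, parent = heapq.heappop(frontier)
        if cell in came_from:                     # already expanded
            continue
        came_from[cell] = parent
        if cell == goal:                          # walk parents back to start
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if (0 <= ny < len(grid) and 0 <= nx < len(grid[0])
                    and grid[ny][nx] == 0 and nxt not in came_from):
                heapq.heappush(frontier,
                               (g + 1 + h(nxt), next(tie), g + 1, nxt, cell))
    return None                                   # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],     # 1 = cell occupied by an environment object
        [0, 0, 0]]
print(astar(grid, (0, 0), (0, 2)))
```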

A planning method frequently used in games takes advantage of space subdivision techniques, with different types of subdivision. It can be applied in both 2D planes and 3D space, providing both a notation for freely navigable space and a collision detection mechanism. [77] uses an octree division of free and occupied spaces in a virtual environment to notate free space for path planning, with dynamic possibilities for updating the octree data due to animations, and possibilities for adding or removing objects in case of content change.

[63] uses an autonomous planning method based on a reactive, timed behavioral L-system that handles visual and tactile virtual sensors. The values of the sensors are used in production rules to determine the behavior of the characters.

[50] provides a method of high level animation by specifying approximate paths and synchronization constraints. Their system handles collisions and animation automatically, using forward dynamics and a synchronization mechanism that depends on the relative positions of objects to each other, not on a common coordinate system. Splines are used for motion (path) specification and are altered by scripting depending on the synchronization constraint states.

[11] presents an autonomous virtual human directing system based on a synthetic vision sensor (the viewpoint of the virtual human). The autonomous humans are directed by a behavioral system at the motivational level (disregarding the behavior system), the task level (if the behavioral system allows it) and the direct motor level (influencing the modality of the current action), according to the information extracted from the visual sensors using vision techniques.

[29] uses a reactive path planning method based on the possible successor state axiom instead of the effect axiom (using the "what the world can turn into" metaphor). Their characters are directed by goals and domain knowledge, incorporating actions, preconditions and effects. For the actual path planning, a sensor and interval (effects range) based narrowing of the possible world states (regarded as a tree structure), and thus of the plan sequences, is used as a preliminary step for a faster resolution time.

The following methods are complementary to path planning, but necessary to execute the different paths selected. [72] uses a method of gait generation along a curved path and over uneven terrain, with a high degree of control over path specification. The high-level, intuitive parameters used are step length, step height, heading direction and toe-out. Similarly, [32] retargets motion data along different paths, using constraints to eliminate foot skating. The path is represented by a B-spline curve, which is editable to produce new movement paths.

[37] uses a path following method based on redirecting the runner character to face the point it will reach in a two-second animation cycle. This redirection method allows a dynamic path planning method to be used.

5.3 Animation systems (architectures)

For real-time virtual humans, [45] presents a number of criteria or guidelines that are recommended to achieve good animation results. These are low polygon count, fast skin deformations, non-physical skeleton animation, etc. Given that the statement was made in 1998 with reference to SGI workstation hardware, it is still partially valid for today's ever more powerful PC environments, especially if we consider a setting of multiple characters that must be animated in complex environments with complex behavior. However, physics-based methods are also approaching the real-time level and will start to emerge, especially in professional settings.

[3] and [9] describe parallel finite-state machine controllers called Parallel Transition Networks (PaT-Nets), used together with additional parameters and information coded in a higher-level representation called Parameterized Action Representation (PAR). A PAR is defined as the description of a single action in terms of objects, agent, applicability conditions, preparatory specifications, execution steps, manner, termination conditions and post assertions. This is used as a sense-control-act architecture to animate "smart avatars": avatars whose reactive behavior is manipulated in real time to be locally adaptive. The architecture integrates autonomy control levels (to optimize lower level reactivity or plan higher level complex tasks), gesture control based on an object-specific relational table and non-verbal communication, attention control directed by visual sensing requirements, and locomotion with anticipation for favorable positioning in interaction scenarios. The paper [4] presents in detail their attention control method based on sensing events, which simulates deliberate, spontaneous and idle attention or behavior using an automata system with an arbitrating mechanism controlling it.

Part of the animation architecture presented above is also used by [21] for synchronizing gaze and hand movements with dialogue. Gesture PaT-Nets send hand and arm timing, position and shape to the animation system, which produces a file of motions from this data. Motions are categorized as beats and gestures: beats are based on speech rhythm, while gestures comprise the deictic, iconic and metaphoric gestures derived from meaning representations. Beats are often superimposed on other gestures (illustrated in the sketch below), and coarticulation (two gestures without relaxation), preparation and foreshortening (anticipation) effects are also reproduced using the PaT-Net synchronization architecture. A further extension of the system is discussed in [22], where an automatic text-to-dialogue conversion method generates synchronized animation and speech from the semantic context and a behavior-tagged animation system.
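
As a rough illustration of beat superimposition (our own simplification, not the PaT-Net implementation), a beat can be modeled as a brief, speech-rhythm-timed displacement added on top of the running gesture curve:

    def superimpose_beats(gesture_curve, beat_times, amp=0.1, length=0.2):
        """Overlay speech-rhythm beats on a gesture trajectory.
        gesture_curve maps time to a 1D value (a stand-in for full
        arm posture); beat_times come from the speech rhythm."""
        def curve(t):
            value = gesture_curve(t)
            for b in beat_times:
                if 0.0 <= t - b < length:
                    phase = (t - b) / length
                    value += amp * (1.0 - abs(2.0 * phase - 1.0))  # triangular pulse
            return value
        return curve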

[64] uses a distributed multi-agent system (MAS) with blackboard-based agent synchronization for animating multi-user virtual environments. An animation engine runs at each LAN, and a global behavior engine is used to synchronize the local animation frames (a one-mind-multiple-bodies, or parallel-universes, analogy). For consistency, all environment objects behave as agents, which avoids multiple locking (similar to database techniques).
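
A minimal blackboard sketch under our reading (hypothetical names): agents post and read shared state through one synchronized store, so no agent ever holds more than a single lock.

    import threading

    class Blackboard:
        """Shared store for agent synchronization; a single lock guards
        all entries, mirroring the avoid-multiple-locking idea above."""
        def __init__(self):
            self._lock = threading.Lock()
            self._state = {}

        def post(self, key, value):
            with self._lock:
                self._state[key] = value

        def read(self, key, default=None):
            with self._lock:
                return self._state.get(key, default)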

The aim of the animation system presented in [68] and [69] is to provide integrated virtual humans with (multiple-source) facial and body animation and speech, as well as a straightforward interface for designers and directors to handle these virtual humans. Speech-animation synchronization is achieved using duration information from the text-to-speech system. The system is built on a server/client architecture with a virtual human message protocol that enables connections over networks and makes it possible for many clients to control the virtual humans, which thus behave as virtual actors. Keywords serve as high-level controls, predefining actions and text-based sentences for high-level actions. Tools for preprogramming actions and a scripting tool that records and plays back all actions performed by an actor exploit the system's possibilities. The animation management methods used for this directing system are described in detail in [13].
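
Purely as an illustration of such a keyword-driven protocol (the actual VHD message format is not specified at this level of detail in [68] and [69], so all names below are hypothetical):

    import json

    def make_control_message(actor, keyword, **params):
        """Build a high-level control message for a virtual actor;
        clients send such messages over the network to the server
        driving the virtual humans."""
        return json.dumps({"actor": actor, "action": keyword, "params": params})

    # Many clients can direct the same actor with predefined keywords.
    msg = make_control_message("actor_1", "walk", speed=1.2, target=[4.0, 0.0])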

[48] presents an animation framework for general types of animation (not just bipeds) that combines several techniques (kinematics, dynamics and control theory). The desired velocity and heading given as input are transformed by a feedback controller into forces and torques, which a linear programming algorithm distributes to the legs in contact with the ground. Forward dynamics is used to calculate the actual body displacement, while a kinematic gait controller calculates the stepping pattern. The framework is tested in uneven-terrain and target-following experiments.
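
A minimal sketch of the feedback step, assuming a simple proportional law on the velocity and heading errors (the actual controller and the linear-programming force distribution in [48] are more involved, and the gains below are purely illustrative):

    import math

    def body_controller(vel, desired_vel, heading, desired_heading,
                        kv=80.0, kh=40.0):
        """Map velocity/heading errors to a body force and a turning
        torque, to be distributed among the stance legs and fed to
        forward dynamics."""
        force = (kv * (desired_vel[0] - vel[0]),
                 kv * (desired_vel[1] - vel[1]))
        # wrap the heading error to (-pi, pi] before applying the gain
        err = math.atan2(math.sin(desired_heading - heading),
                         math.cos(desired_heading - heading))
        return force, kh * err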

With such complex systems it becomes possible to produce the animation even for text-to-scene conversion systems such as [24], or for text-to-animation systems, as the BEAT system [22] already does.


6 Conclusions

During the review of the different animation systems we examined the applicability of the described systems to our constraints, which are inherent in the setup we work with. We use internet-enabled 3D virtual reality technologies to give a wide range of users access, so the target platforms are desktop computers with consumer graphics cards. Desktop hardware provides ever-increasing computational and graphics power, but real-time complex simulation methods are not yet feasible for interactive, web-based applications.

We analyzed the animation systems from three points of view: animation quality, computational overhead and reusability, since our aim is a complete system that provides compelling animation with a small computational overhead and allows the created animations to be reused. The systems analyzed against these criteria were themselves complex systems that presented results for all three considerations or only for some of them; either way, they were included in our review.

With regard to the production of movement data, the two main categories of animation methods are motion-capture based and simulation based. One of these forms the basis of an animation system, while additional methods enhance the animations produced by the base method.

Regarding animation quality, there are two subconsiderations: variation and naturalness. Motion-capture based methods record motion performed by humans, so they also capture the naturalness of the movements, and not only those of movement experts. Their drawback is that the motion sequences are invariant: when they are reused, the repetition is easily noticed. Simulation methods, on the other hand, produce varied movement data, but they often lack the naturalness of captured movement data and are therefore mainly used for simulating robots.

The second consideration is computational overhead. Here motion-capture methods are at an advantage, since they need no computation to produce the movement data for each animation sequence; they simply replay prerecorded data. The price of the variety offered by simulation-based methods is their high computational requirements, which make them too expensive for certain application areas.

The first two considerations show that motion capture is more favorable for our purposes, but animation sequences based solely on motion-capture data still have a negative property: the resulting animations are invariant. The third criterion we examined is therefore reusability, with emphasis on motion-data analysis and handling. Simulation-based systems always produce new data, so this criterion does not apply to them. With a small computational overhead (less than what simulation methods require), various methods can analyze motion-capture data, extract features, style, basic motions, etc., and use this motion knowledge to modify the captured data in a natural, compelling way, producing varied animations.

Our conclusion is that motion-capture methods combined with motion analysis and alteration methods provide quality, naturalness and reusability with little impact on speed, and thus on interactivity, since the computational overhead is small. Motion-data analysis and processing is particularly important with the advent of methods that capture movement from 2D video images, especially if the movement is extracted from general-purpose video material in which motions are not executed separately and expressively, with motion capture in mind. We need qualitative analysis to break captured movements (regardless of their source) into basic movements that represent separate actions, enabling a variety of combinations, together with qualitative variation of the animation data based on movement parameters and psychological (mood, emotion) factors, in order to produce compelling animations for agent and avatar 3D characters.

Regarding the actual motion analysis and alteration methods, many options exist. However, in light of the possibility of using complex movement data captured from video sources, an obvious choice is a motion segmentation method based on motion-curve parameters, where the derivatives indicate movement endpoints and thus make it possible to separate movement sequences into basic movements. This method would, however, have to be extended to extract the overlapping movement sequences of video-based motion-capture data, since in that case the movements are most likely mixed.
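
A sketch of that segmentation criterion as we envisage it (our own reading, not an established implementation): the endpoints of basic movements appear where the motion curve's velocity, approximated here by finite differences, falls to zero.

    import numpy as np

    def segment_motion(curve, dt, eps=1e-3):
        """Split a sampled motion curve (e.g. one joint angle over time)
        into basic movements at points where the velocity vanishes.
        Returns (start, end) index pairs; the threshold eps is an
        assumption and would need tuning."""
        velocity = np.gradient(curve, dt)
        still = np.abs(velocity) < eps               # near-zero velocity
        cuts = [0]
        for i in range(1, len(curve) - 1):
            if still[i] and not still[i - 1]:        # a movement just ended
                cuts.append(i)
        cuts.append(len(curve) - 1)
        return list(zip(cuts[:-1], cuts[1:]))

As noted above, overlapping movements in video-captured data would defeat this simple criterion and call for an extended method.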

References

[1] Kenji Amaya, Armin Bruderlin, and Tom Calvert. Emotion from Motion. In Graphics Interface ’96, 1996.
[2] Amaury Aubel, Ronan Boulic, and Daniel Thalmann. Animated Impostors for Real-Time Display of Numerous Virtual Humans. In Virtual Worlds ’98, pages 14–28, 1998.
[3] Norman I. Badler, Rama Bindiganavale, Juliet Bourne, Jan Allbeck, Jianping Shi, and Martha Palmer. Real Time Virtual Humans. In Proceedings of International Conference on Digital Media Futures ’99. British Computer Society, 1999.
[4] Norman I. Badler, Diane Chi, and Sonu Chopra. Virtual Human Animation Based on Movement Observation and Cognitive Behavior Models. In Computer Animation ’99, pages 128–137, 1999.
[5] Norman I. Badler, Cary B. Phillips, and Bonnie L. Webber. Simulating Humans: Computer Graphics, Animation, and Control. Oxford University Press, 1993.
[6] Srikanth Bandi and Daniel Thalmann. Space discretization for efficient human navigation. In Proc. Eurographics ’98, Computer Graphics Forum, volume 17(3), pages 195–206, 1998.
[7] Y. Bellan, M. Costa, G. Ferrigno, F. Lombardi, L. Macchiarulo, A. Montuori, E. Pasero, and C. Rigotti. Artificial Neural Networks for Motion Emulation in Virtual Environments. In CAPTECH’98, pages 83–99. Springer, 1998.
[8] Uwe Beyer and Frank Śmieja. A Heuristic Approach to the Inverse Differential Kinematics Problem. In Journal of Intelligent Robotic Systems: Theory and Applications, volume 18(4), pages 309–327, 1997.



[9] Rama Bindiganavale, William Schuler, Jan M. Allbeck, Norman I. Badler, Aravind K. Joshi, and Martha Palmer. Dynamically Altering Agent Behaviors Using Natural Language Instructions. In Autonomous Agents 2000, pages 293–300, 2000.
[10] Ramamani N. Bindiganavale. Building Parameterized Action Representations from Observation. PhD thesis.
[11] Bruce M. Blumberg and Tinsley A. Galyean. Multi-Level Direction of Autonomous Creatures for Real-Time Virtual Environments. In Computer Graphics (Siggraph ’95 Proceedings), pages 47–54, 1995.
[12] Vincent Bonnafous, Eric Menou, Jean-Pierre Jessel, and René Caubet. Cooperative and Concurrent Blending Motion Generators. In International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision (WSCG’2001), 2001.
[13] Ronan Boulic, Pascal Bécheiraz, Luc Emering, and Daniel Thalmann. Integration of Motion Control Techniques for Virtual Human and Avatar Real-Time Animation. In VRST’97, pages 111–118. ACM, 1997.
[14] Ronan Boulic, Ramon Mas, and Daniel Thalmann. Inverse Kinetics for Center of Mass Position Control and Posture Optimization. In European Workshop on Combined Real and Synthetic Image Processing for Broadcast and Video Production, 1994.
[15] Ronan Boulic, Ramon Mas, and Daniel Thalmann. Robust Position Control of the Center of Mass with Second Order Inverse Kinetics. In Computers & Graphics, 1996.
[16] Matthew Brand and Aaron Hertzmann. Style Machines. In Computer Graphics (Siggraph ’00 Proceedings), pages 183–192, 2000.
[17] David C. Brogan, Ronald A. Metoyer, and Jessica K. Hodgins. Dynamically Simulated Characters in Virtual Environments. In IEEE Computer Graphics and Applications, volume 15(5), pages 58–69, 1998.
[18] Armin Bruderlin and Tom Calvert. Knowledge-Driven, Interactive Animation of Human Running. In Graphics Interface ’96, 1996.
[19] Armin Bruderlin and Lance Williams. Motion Signal Processing. In Computer Graphics (Siggraph ’95 Proceedings), pages 97–104, 1995.
[20] Deborah A. Carlson and Jessica K. Hodgins. Simulation Levels of Detail for Real-time Animation. In Graphics Interface ’97, 1997.
[21] Justine Cassell, Catherine Pelachaud, Norman Badler, Mark Steedman, Brett Achorn, Tripp Becket, Brett Douville, Scott Prevost, and Matthew Stone. Animated Conversation: Rule-based Generation of Facial Expression, Gesture & Spoken Intonation for Multiple Conversational Agents. In Computer Graphics (Siggraph ’94 Proceedings), pages 413–420. ACM Press, 1994.
[22] Justine Cassell, Hannes Högni Vilhjálmsson, and Timothy Bickmore. BEAT: the Behavior Expression Animation Toolkit. In Computer Graphics (Siggraph ’01 Proceedings), pages 477–486, 2001.



[23] Diane Chi, Monica Costa, Liwei Zhao, and Norman Badler. The EMOTE Model for Effort and Shape. In Computer Graphics (Siggraph ’00 Proceedings), pages 173–182, 2000.
[24] Bob Coyne and Richard Sproat. WordsEye: An Automatic Text-to-Scene Conversion System. In Computer Graphics (Siggraph ’01 Proceedings), pages 487–496, 2001.
[25] P. Ekman and W. V. Friesen. The Facial Action Coding System (FACS): A Technique for the Measurement of Facial Action. Consulting Psychologists Press, 1978.
[26] Luc Emering, Ronan Boulic, and Daniel Thalmann. Interacting with Virtual Humans through Body Actions. In IEEE Computer Graphics and Applications, volume 18(1), pages 8–11. IEEE, 1998.
[27] Petros Faloutsos, Michiel van de Panne, and Demetri Terzopoulos. Composable Controllers for Physics-Based Character Animation. In Computer Graphics (Siggraph ’01 Proceedings), pages 251–260, 2001.
[28] P. Fua, R. Plänkers, and D. Thalmann. Realistic Human Body Modeling. In Fifth International Symposium on the 3-D Analysis of Human Movement, 1998.
[29] John Funge, Xiaoyuan Tu, and Demetri Terzopoulos. Cognitive Modeling: Knowledge, Reasoning and Planning for Intelligent Characters. In Computer Graphics (Siggraph ’99 Proceedings), pages 29–38, 1999.
[30] George Maestri. Learning to Walk: The Theory and Practice of 3D Character Motion. http://www.siggraph.org/education/materials/HyperGraph/animation/character_animation/walking/learning_to_walk.htm.
[31] Michael Gleicher. Retargetting Motion to New Characters. In Computer Graphics (Siggraph ’98 Proceedings), pages 33–42, 1998.
[32] Michael Gleicher. Motion path editing. In Symposium on Interactive 3D Techniques, pages 195–202, 2001.
[33] Enrico Gobbetti and Jean-Francis Balaguer. An Integrated Environment to Visually Construct 3D Animations. In Computer Graphics (Siggraph ’95 Proceedings), pages 395–398, 1995.
[34] Larry Israel Gritz. Evolutionary Controller Synthesis for 3-D Character Animation. PhD thesis, 1999.
[35] Radek Grzeszczuk and Demetri Terzopoulos. Automated Learning of Muscle-Actuated Locomotion Through Control Abstraction. In Computer Graphics (Siggraph ’95 Proceedings), pages 63–70, 1995.
[36] Radek Grzeszczuk, Demetri Terzopoulos, and Geoffrey Hinton. NeuroAnimator: Fast Neural Network Emulation and Control of Physics-Based Models. In Computer Graphics (Siggraph ’98 Proceedings), pages 9–20, 1998.



[37] Jessica K. Hodgins. Three-Dimensional Human Running. In Proceedings of the IEEE Conference on Robotics and Automation, 1996.
[38] Jessica K. Hodgins and Nancy S. Pollard. Adapting Simulated Behaviors for New Characters. In Computer Graphics (Siggraph ’97 Proceedings), pages 153–162, 1997.
[39] Jessica K. Hodgins, Wayne L. Wooten, David C. Brogan, and James F. O’Brien. Animating Human Athletics. In Computer Graphics (Siggraph ’95 Proceedings), pages 71–78, 1995.
[40] Zhiyong Huang. Motion control for human animation. PhD thesis, 1997.
[41] Zhiyong Huang, Nadia Magnenat-Thalmann, and Daniel Thalmann. Interactive Human Motion Control Using a Closed-form of Direct and Inverse Dynamics. In Proc. Pacific Graphics ’94, 1994.
[42] Humanoid Animation Working Group, Web3D Consortium. H-Anim: Specification for a Standard VRML Humanoid, version 1.1. http://www.h-anim.org/Specifications/H-Anim1.1/, 1999.
[43] Verne T. Inman, Henry J. Ralston, Frank Todd, and Jean C. Lieberman. Human Walking. Williams & Wilkins, 1981.
[44] James J. Kuffner, Jr. Goal-Directed Navigation for Animated Characters Using Real-Time Path Planning and Control. In CAPTECH’98, pages 171–186. Springer, 1998.
[45] Prem Kalra, Nadia Magnenat-Thalmann, Laurent Moccozet, Gael Sannier, Amaury Aubel, and Daniel Thalmann. Real-Time Animation of Realistic Virtual Humans. In IEEE Computer Graphics and Applications, volume 18(5), pages 42–56, 1998.
[46] Szilárd Kiss. Web Based VRML Modeling. In IV2001 Information Visualisation, pages 612–617, London, England, 2001.
[47] Szilárd Kiss. 3D Character Modeling in Virtual Reality. In IV2002 Information Visualisation, pages 541–548, London, England, 2002.
[48] Evangelos Kokkevis, Dimitri Metaxas, and Norman I. Badler. Autonomous Animation and Control of Four-Legged Animals. In Graphics Interface ’95, 1995.
[49] Evangelos Kokkevis, Dimitri Metaxas, and Norman I. Badler. User-Controlled Physics-Based Animation for Articulated Figures. In Computer Animation ’96, 1996.
[50] Alexis Lamouret and Marie-Paule Gascuel. Scripting Interactive Physically-Based Motions with Relative Paths and Synchronization. In Graphics Interface ’95, 1995.
[51] John Lasseter. Principles of Traditional Animation Applied to 3D Computer Animation. In Proc. Siggraph ’87, pages 35–44. ACM Press, 1987.
[52] Joseph Laszlo, Michiel van de Panne, and Eugene Fiume. Limit Cycle Control and its Application to the Animation of Balancing and Walking. In Computer Graphics (Siggraph ’96 Proceedings), pages 155–162, 1996.



[53] Joseph Laszlo, Michiel van de Panne, and Eugene Fiume. Interactive Control for Physically-Based Animation. In Computer Graphics (Siggraph ’00 Proceedings), pages 201–208, 2000.
[54] Jehee Lee and Sung Yong Shin. A Hierarchical Approach to Interactive Motion Editing for Human-like Figures. In Computer Graphics (Siggraph ’99 Proceedings), pages 39–48, 1999.
[55] Yuencheng Lee, Demetri Terzopoulos, and Keith Waters. Realistic Modeling for Facial Animation. In Computer Graphics (Siggraph ’95 Proceedings), pages 55–62, 1995.
[56] Ruqian Lu and Songmao Zhang. Automatic Generation of Computer Animation: Using AI for Movie Animation, volume 2160 of LNAI. Springer.
[57] M. Grosso, R. Quach, E. Otani, J. Zhao, S. Wei, P.-H. Ho, J. Lu, and N. I. Badler. Anthropometry for Computer Graphics Human Figures. Technical Report MS-CIS-89-71, 1989.
[58] George Maestri. Digital Character Animation 2, volume 1 - Essential Techniques. New Riders Publishing, 1999.
[59] George Maestri. Digital Character Animation 2, volume 2 - Advanced Techniques. New Riders Publishing, 2001.
[60] D. McNeill. Hand and Mind: What Gestures Reveal About Thought. University of Chicago Press, 1992.
[61] Stuart Mealing. The Art and Science of Computer Animation. Oxford, 1992.
[62] Nadia Magnenat-Thalmann and Daniel Thalmann. Computer Animation. In Handbook of Computer Science, pages 1300–1318. CRC Press, 1996.
[63] Hansrudi Noser and Daniel Thalmann. The Animation of Autonomous Actors Based on Production Rules. In Proc. Computer Animation ’96, pages 47–57. IEEE Computer Society Press, 1996.
[64] Ken Perlin and Athomas Goldberg. Improv: A System for Scripting Interactive Actors in Virtual Worlds. In Computer Graphics (Siggraph ’96 Proceedings), pages 205–216, 1996.
[65] Charles Rose, Bobby Bodenheimer, and Michael F. Cohen. Verbs and Adverbs: Multidimensional Motion Interpolation using Radial Basis Functions. In IEEE Computer Graphics and Applications, 1998.
[66] Charles Rose, Brian Guenter, Bobby Bodenheimer, and Michael F. Cohen. Efficient Generation of Motion Transitions using Spacetime Constraints. In Computer Graphics (Siggraph ’96 Proceedings), pages 147–154, 1996.
[67] Zsófia Ruttkay, Paul ten Hagen, Han Noot, and Mark Savenije. Facial Animation by Synthesis of Captured and Artificial Data. In CAPTECH’98, pages 268–271, 1998.
[68] G. Sannier, S. Balcisoy, N. Magnenat-Thalmann, and D. Thalmann. An Interactive Interface for Directing Virtual Humans. In Proc. ISCIS’98. IOS Press, 1998.



[69] G. Sannier, S. Balcisoy, N. Magnenat-Thalmann, and D. Thalmann. VHD: A System for Directing Real-Time Virtual Actors. In The Visual Computer, volume 15(7/8), pages 320–329. Springer, 1999.
[70] Karl Sims. Evolving Virtual Creatures. In Computer Graphics (Siggraph ’94 Proceedings), pages 15–22. ACM Press, 1994.
[71] A. James Stewart and James F. Cremer. Beyond Keyframing: An Algorithmic Approach to Animation. In Graphics Interface ’92, 1992.
[72] Harold C. Sun and Dimitris N. Metaxas. Automatic Gait Generation. In Computer Graphics (Siggraph ’01 Proceedings), pages 261–270, 2001.
[73] Wen Tang, Marc Cavazza, Dale Mountain, and Rae Earnshaw. Real-Time Inverse Kinematics through Constrained Dynamics. In CAPTECH’98, pages 159–170. Springer, 1998.
[74] Daniel Thalmann. A New Generation of Synthetic Actors: the Real-time and Interactive Perceptive Actors. In Proc. Pacific Graphics ’96, 1996.
[75] Daniel Thalmann. Physical, Behavioral, and Sensor-Based Animation. In Graphicon’96, 1996.
[76] Daniel Thalmann. The Foundations to Build a Virtual Human Society. In Intelligent Virtual Agents Workshop (IVA2001), volume 2190 of LNAI. Springer, 2001.
[77] Daniel Thalmann, Hansrudi Noser, and Zhiyong Huang. Autonomous Virtual Actors based on Virtual Sensors. In Creating Personalities for Synthetic Actors: Towards Autonomous Personality Agents, volume 1195 of LNAI, pages 25–42. Springer, 1997.
[78] Xiaoyuan Tu. Artificial Animals for Computer Animation: Biomechanics, Locomotion, Perception, and Behavior, volume 1635 of LNCS. Springer, 1999.
[79] Munetoshi Unuma, Ken Anjyo, and Ryozo Takeuchi. Fourier Principles for Emotion-based Human Figure Animation. In Computer Graphics (Siggraph ’95 Proceedings), pages 91–96, 1995.
[80] Keith Waters. A muscle model for animating three-dimensional facial expression. In Proc. Siggraph ’87, pages 17–24. ACM Press, 1987.
[81] Michael W. Whittle. Musculo-Skeletal Applications of Three-Dimensional Analysis. In Paul Allard and Ian A.F. Stokes, editors, Three-Dimensional Analysis of Human Movement. Human Kinetics Publishers, 1995.
[82] Andrew Witkin and Zoran Popović. Motion Warping. In Computer Graphics (Siggraph ’95 Proceedings), pages 105–108, 1995.
[83] Wayne L. Wooten and Jessica K. Hodgins. Animation of Human Diving. In Graphics Interface ’95, 1995.

