Computer Animation for Articulated 3D Characters
Computer Animation for Articulated 3D Characters
Computer Animation for Articulated 3D Characters
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
<strong>Computer</strong> <strong>Animation</strong> <strong>for</strong> <strong>Articulated</strong> <strong>3D</strong> <strong>Characters</strong><br />
Szilárd Kiss<br />
October 15, 2002<br />
Abstract<br />
We present a review of the computer animation literature, mainly concentrating<br />
on articulated characters and at least some degree of interactivity<br />
or real time simulation. Advances in different techniques such as<br />
key-frame, motion capture (also known as mocap), dynamics, inverse kinematics<br />
(IK), controller systems, and even neural networks are discussed.<br />
We try to analyze these methods from different aspects: semantics,<br />
data input, animation types, generating the animations, combining animations,<br />
animation varying, target type, and even the path planning perspective<br />
is taken into account <strong>for</strong> the reviewed animation systems.<br />
1 Introduction<br />
This report is an overview of motion (animation) control methods <strong>for</strong> articulated<br />
figures. These articulated figures consist of rigid links (segments in H-anim<br />
[42] terminology which is a standard <strong>for</strong> articulated human figures) connected<br />
by joints with 1, 2 or 3 Degrees Of Freedom (DOFs). The control methods<br />
use kinematics (time-based joint rotation values) or dynamics (<strong>for</strong>ce-based<br />
simulation of movement) to drive the articulated figures, methods <strong>for</strong> which<br />
the concepts and basics are presented in many animation textbooks [61] [59].<br />
Besides the task of animating which could be quite complex by itself, some of<br />
the control methods presented take into account the context (environment) in<br />
which the character is being animated, such as the environment objects and<br />
obstacles or even the group and social behaviour [76]. The methods discussed<br />
differ not only in their motion control methods, but also in their animation possibilities<br />
like quality, speed, etc. and thus they provide possibilities <strong>for</strong> different<br />
target applications.<br />
This research will be integrated into my own work via an animation editor,<br />
which is being currently developed. The animation editor comes to complete a<br />
geometry editor [46] and an H-anim based bone hierarchy editor [47] to provide<br />
a system capable of producing the avatar or agent bodies needed to populate<br />
virtual environments. Since the aim is not just to provide a simple animation<br />
editor but a whole animation system which can be used <strong>for</strong> specifying basic<br />
movements and creating combinations of movements, the goal of this report was<br />
to find previous research dealing with animation in general, and with animation<br />
combination respectively qualitative assessment of animations in particular.<br />
In our system, interactivity is the key requirement, which asks <strong>for</strong> speed in<br />
the system, and more importantly, <strong>for</strong> animation generation and combination<br />
possibilities that result in compelling animation sequences. The outcome of this<br />
review is <strong>for</strong>mulated in section 6.<br />
1
2 <strong>Animation</strong> data<br />
Traditionally, computer animation data is stored as key-frame data. This data<br />
structure has two components: the key which specifies relative time moments<br />
and the animation data values at these time moments such as rotation angle/displacement/etc.<br />
values [58] [61].<br />
Creating these data values can be done manually (by talented animators) or can<br />
be captured automatically using different types of capture hardware [62] [75]<br />
or simpler mechanical devices like a bicycle [17]. Mechanical, <strong>for</strong>ce-based measurements<br />
were used in the past <strong>for</strong> (medical) musculoskeletal measurements<br />
and analysis such as the walking characteristics analysis presented in [43].<br />
Newer, magnetic or optical commercial motion capture systems use different<br />
markers <strong>for</strong> tracking coordinates and even mechanical (or optical fiber) wearable<br />
systems exists <strong>for</strong> capturing movement, these latter being more cumbersome,<br />
but providing a greater area of movement freedom. Markerless, image-based<br />
motion capture systems also start to emerge, where the input is one or more<br />
video camera feed [28]. Motion capture systems usually store data as absolute<br />
coordinate values, but there are motion capture file <strong>for</strong>mats that specify rotation<br />
data, which can be extracted from absolute coordinates data to be used <strong>for</strong><br />
hierarchically articulated characters.<br />
Next to body animation data, special motion capture systems are also used<br />
to capture facial animation data. [67] uses this technique to identify MPEG4<br />
Facial <strong>Animation</strong> Parameters (FAPs) in captured data and reuse it with <strong>3D</strong><br />
and cartoon heads, consisting of a fixed skeleton and animated with spring<br />
simulations (muscle).<br />
Motion capture data can be used <strong>for</strong> directly animating (directing in real-time)<br />
characters similar to the tennis game and virtual dancing in [45] and [62], even<br />
enhanced with additional motion/gesture recognition routines [26]. This type<br />
of animation may be useful <strong>for</strong> interactive systems, television, etc. If the data<br />
is applied later to characters, then it is stored in key-frame <strong>for</strong>mat, usually<br />
rotation angle data.<br />
There are systems that extract motion key-frames from any kind of data source<br />
(manual, IK, procedural, mocap) and apply that data to their purposes, <strong>for</strong><br />
instance qualitative variance [23] or composing animations [64] [13], methods<br />
described in section 3.2 and its subsections.<br />
2.1 <strong>Animation</strong> data types<br />
There are several distinctions that can be applied to animation data. One of<br />
them is reusability. Some data can be reused in the sense of re-run cycles<br />
(like walking, running [81] [18]) or by simply repeating a certain movement<br />
sequence when needed. There are also animation sequences that are dependent<br />
on the character’s properties (position, orientation) relative to target entities<br />
and thus cannot be used in a pre-recorded manner, and these are animations<br />
like grasping or pointing [40] which are usually achieved using IK or similar<br />
techniques described in section 3.1.<br />
2
3 Generating animation<br />
Table 3.1 enumerates and additionally roughly classifies the animation methods<br />
that are discussed in the rest of the paragraph. The classification dimensions<br />
are:<br />
1. Interactivity/Speed. The higher this value is, the more usable the system<br />
is in interactive virtual systems. Simple motion combination methods are<br />
the easiest and fastest <strong>for</strong>m of animation.<br />
2. Reusability/Speed. This dimension appreciates motion control methods<br />
that reuse recorded movements, not requiring new motion data generated<br />
from scratch.<br />
3. Generality. Physics based, IK motion systems score higher, since in theory<br />
they can produce an unlimited number of motion types, while procedural<br />
systems are limited by the motion data they possess (quality and nor<br />
quantity limitation).<br />
4. Quality. Physics based and qualitative variance methods score higher by<br />
producing unique and context-sensitive, adapted motions.<br />
3.1 Physical simulation<br />
A wide-spread method <strong>for</strong> animating articulated figures is kinematics. It is<br />
based on the motion properties time, position and velocity. The term <strong>for</strong>ward<br />
kinematics is used when joint rotations are specified in function of time. The<br />
term inverse kinematics refers to the problem of specifying the <strong>for</strong>ward kinematics<br />
values when the end-point (end effector) of the character articulation<br />
is known. There are many different methods to solve this problem, and this<br />
attention trans<strong>for</strong>med the concept into a well-known animation (simulation)<br />
alternative. The most common examples of <strong>for</strong>ward and inverse kinematics are<br />
the key-frame editors (which interpolate <strong>for</strong>ward kinematics values from a few<br />
intermediate values) and the possibilities of fixing the extremities of articulated<br />
figures while moving the other components or dragging the extremities themselves<br />
(in this case IK methods resolve the joint rotations) present in many<br />
modeling/animation packages [30].<br />
Dynamic simulations are <strong>for</strong>ce-based simulation methods that take in account<br />
the <strong>for</strong>ces that occur in articulated characters as well as additional properties<br />
or events like velocities, mass, torques, collision [40]. This animation method<br />
is computationally expensive and it is used <strong>for</strong> instance in realistic simulations<br />
[37] [83] [78] and tests <strong>for</strong> usability [57] [5].<br />
With these simulation methods, the end-result generally consists of timedependent<br />
values which represent the <strong>for</strong>ward kinematics angle data that can<br />
be used to drive the target articulated character.<br />
The IK problem has been approached from different angles. There are numerical<br />
and analytical methods [54] [5], but there are simpler, intuitive, non<br />
mathematics-inspired methods also that are aiming <strong>for</strong> speed gain: [34] tries to<br />
approximate target points with <strong>for</strong>ward kinematics, while [30] describes techniques<br />
where the hierarchies (limbs) are inverted.<br />
3
Interactivity/Speed<br />
Quality<br />
Procedural animation (based on data)<br />
- Fourier trans<strong>for</strong>mations to extract and alter qualitative features<br />
- image and signal processing techniques: filtering motion features<br />
- signal processing techniques <strong>for</strong> emotional trans<strong>for</strong>ms<br />
- learning and applying motion styles through HMMs and analogy<br />
- LMA properties based alterations (<strong>for</strong> ef<strong>for</strong>t and shape)<br />
- knowledge-driven alterations <strong>for</strong> specific subdomain<br />
- noise functions <strong>for</strong> qualitative variance<br />
- constraints and displacement maps<br />
- constraints based retargeting with spacetime approach<br />
- motion warping (time based alteration)<br />
- simulated annealing (scaling & fine-tuning)<br />
- path-based motion alteration<br />
- motion prototypes annotated with behavioral tags<br />
- combination based on height (foot elevation) criteria<br />
- 2nd order derivative to extract and combine basic motions<br />
- priority (importance) levels and weights based combination<br />
- segmenting data and recombining with transitional motions<br />
- combining non-overlapping data based on spacetime constraints<br />
- timewarping to match motion sequences<br />
- fade-in, fade-out functions<br />
- curve blending functions<br />
- interpolation between motion sequences<br />
- grouping mutually exclusive motions <strong>for</strong> faster combination<br />
Reusability/Speed<br />
Physical simulation (control systems)<br />
- dynamic Newton equations system<br />
- mass data and moments of inertia with motion equations<br />
- inverse dynamics with energy minimization constraints<br />
- decentralized, agent-based approach (<strong>for</strong>ces = agents)<br />
- inverse kinetics (IK with mass)<br />
- muscles as motion actuators with sensor feedback<br />
- proportional-derivative (damped spring) controllers (PD)<br />
- PD servos <strong>for</strong> dynamic simulation of joint torques<br />
- Genetic Programming method to create PD controllers<br />
- closed loop method <strong>for</strong> balancing in open loop walking<br />
- dynamics system with feedback control method<br />
- simplified inverse dynamics with mass (Newton-Euler eq.)<br />
- iterative numerical integration <strong>for</strong> IK<br />
- IK constraints with <strong>for</strong>ward kinematics and feedback<br />
- inverting hierarchies (articulations)<br />
- ”sagittal elevation” resolution method <strong>for</strong> IK<br />
- direct manipulation<br />
- Neural Networks based learning from examples/feedback<br />
- Machine Learning (adaptive state machines)<br />
Generality<br />
Table 3.1: Classification dimensions <strong>for</strong> different animation methods
[73] relates about a molecular simulation based IK algorithm used to calculate<br />
the motion of a skeleton’s joints in real-time, using 5 magnetic track sensors as<br />
end effectors. The algorithm is an iterative algorithm <strong>for</strong> numerical integration<br />
of constrained motion in a Lagrange framework.<br />
[8] describes a heuristic, iterative approach <strong>for</strong> IK resolution. In this approach,<br />
each functional segment of a limb hierarchy (in their specific case a robot arm) is<br />
regarded as a set of agents, yielding in a decentralized approach. The constraints<br />
are regarded as <strong>for</strong>ces, while each <strong>for</strong>ce is represented by an agent. There are 4<br />
types of <strong>for</strong>ces used: goto (target), pain (operation range), relax (equilibrium<br />
point) and orientation (<strong>for</strong> grabbing). Path smoothing must be used after the<br />
IK resolution to eliminate jitter.<br />
[78] uses kinematics constraints <strong>for</strong> physics-based simulation of artificial animals<br />
(fish). Inverse dynamics is used through solving motion equations (the<br />
Newtonian laws of motion) resulting in <strong>for</strong>ces that are physically correct but<br />
not realistic. Constrained optimization is used to find motions requiring less energy<br />
(open-loop controller). Another method presented <strong>for</strong> physical animation<br />
is motion synthesis using muscles as motion actuators, method which allows<br />
the integration of virtual sensors feedback. Finite-state machines are used to<br />
add and remove constraint equations.<br />
The paper [14] presents a concept called ”Inverse Kinetics”, a variant of the IK<br />
incorporating center of mass in<strong>for</strong>mation. Beside the pseudo-inverse Jacobian<br />
matrix calculations <strong>for</strong> IK, the mass distribution in<strong>for</strong>mation is considered in the<br />
calculations to provide center of mass position control and posture optimization<br />
<strong>for</strong> a more physically correct appearance. The same idea is related with more<br />
detail in [15].<br />
Another type of control system is the Proportional-Derivative (damped spring)<br />
controller [52] [34], PD <strong>for</strong> short. It is a proportional scale and deviation minimizing<br />
derivative system, which means that it generates a <strong>for</strong>ce or torque<br />
proportional to differences of positions and velocities of source and target coordinates.<br />
Similar is the work presented in [52] where an open loop cyclic motion (walking)<br />
is extended with a closed loop feedback <strong>for</strong> inclusion of balancing properties.<br />
Finite state automata is used <strong>for</strong> balancing feedback, while a PD controller is<br />
used <strong>for</strong> the walking simulation.<br />
[39] uses also the PD technique <strong>for</strong> servos to compute joint torques <strong>for</strong> the dynamic<br />
simulation of human athletics movements (running, bicycling, vaulting).<br />
Manually constructed control systems (incorporating intuition, observations,<br />
biomechanical data and built using a control techniques toolbox) are used <strong>for</strong><br />
characters and environment objects. User has the possibility to change certain<br />
high-level properties of the simulation (speed, orientation, etc.). They also use<br />
mass-spring clothes simulation to add secondary motion to the animations.<br />
[71] describes a dynamics method with a musculoskeletal basis structure and<br />
uses a dynamic Newton equations system to resolve the animation. The system<br />
presented is defined as dynamic having the possibility to add or remove equations<br />
respectively to change attributes on the fly. Articulation relationships are<br />
modeled with constraint equations, also dynamically alterable.<br />
The dynamics system in [49] uses as enhancement <strong>for</strong> efficiency a feedback<br />
control scheme with a two-stage collision response algorithm. The collision<br />
5
esponse algorithm consists of a system <strong>for</strong> calculating the velocity change at<br />
impact and of a second system <strong>for</strong> calculating <strong>for</strong>ces and counter <strong>for</strong>ces <strong>for</strong><br />
non-penetrating objects. The user has the possibility to specify kinematics<br />
trajectories <strong>for</strong> selected DOFs.<br />
[41] relates over an inverse dynamics technique applied to a simplified skeleton<br />
having mass in<strong>for</strong>mation attached to the body segments. A recursive algorithm<br />
is used based on the Newton-Euler motion equations to calculate the <strong>for</strong>ce and<br />
torque of joint actuators.<br />
A walking gait controller is presented in [72] that generates walking motion<br />
along a curved path and uneven terrain with high control path specification.<br />
The high-level parameters are step length, step height, heading direction and<br />
toe-out. A ”sagittal elevation” motion algorithm is used to specify the footfloor<br />
interaction model using sagittal elevation angle <strong>for</strong> realistic contact. The<br />
motion algorithm takes into account the properties of parent joints and segments<br />
(rotation, body properties) to calculate position and positioning data relative<br />
to the basis key-frame data.<br />
[37] and [83] present human running respectively human diving animation methods<br />
based on the same principles. They use mass data and moments of inertia<br />
in combination with a commercial motion equations generating package. The<br />
movements are separated in phases, and a state machine is used to control the<br />
action selection mechanism. The movement actions in case of diving are more<br />
diverse as in the case of running, including jumping, twisting next to running<br />
and balancing.<br />
[53] presents interactive techniques <strong>for</strong> controlling physically-based animated<br />
characters using the mouse and the keyboard. Assigning mouse movements or<br />
keyboard shortcuts to DOFs, direct manipulation of limited complexity characters<br />
(2D and few DOFs) is possible. The input devices also en<strong>for</strong>ce limitations,<br />
<strong>for</strong> instance a mouse can control 4 DOFs at a time (the control procedure is<br />
time-dependent). It also uses a state automaton to combine the animations.<br />
Another category of controllers span from the field of artificial intelligence and<br />
neural networks (NN). [36] uses neural networks <strong>for</strong> learning motion (from examples<br />
or positioning feedback), but applies it only to simple articulations (no<br />
humans): a pendulum (3 limbs), a car (learning how to do parking), a lunar<br />
module (learn landing) and a dolphin (learn swimming). Similar is their earlier<br />
work [35] concentrating on automatic locomotion learning <strong>for</strong> fish and snakelike<br />
creatures, having highly flexible, many DOF joints, but also lacking the<br />
complexity that is necessary <strong>for</strong> human articulated figures. Their work was<br />
certainly inspired by the pioneering work in this area conducted by Sims [70].<br />
Similar is the approach taken by [34] to achieve physical simulation with the<br />
aid of Genetic Programming (GP) to produce controller programs. Key marks<br />
(a subset of key frames and tracks) are used to specify the goals of an animation,<br />
compared with spline control points which have uncertainty incorporated<br />
through values and time. From this, a Proportional-Derivative controller is<br />
calculated.<br />
[7] presents a data-driven NN approach that is aimed to grasp the essence and<br />
intrinsic variability of human movement. They use a recurrent RPROP artificial<br />
neural network with state neurons and weighted cost functions in combination<br />
with a pre-processing normalization step according to a mannequin based<br />
6
on average human characteristics, with the aim to learn and simulate human<br />
movements.<br />
[27] relates about physics based animation using adaptive state machines as<br />
controllers. The state machines adapt by machine learning techniques, and they<br />
use balancing, pre and post states <strong>for</strong> optimal control. The controller selection<br />
is based on query and weighted appropriateness result, controllers having scopes<br />
and weights defining their capabilities. The technique is presented as a more<br />
general, longer-term solution than data blending and warping, but it is rather<br />
a question of physical simulation versus data based approach.<br />
[29] use high level cognitive modeling <strong>for</strong> action planning. Domain knowledge is<br />
represented by preconditions, actions and effects, while characters are directed<br />
by goals. The AI <strong>for</strong>malism ”situation calculus” is used to describe changing<br />
worlds using sorted first-order logic which means the use of the possible successor<br />
state axiom instead of the effect axiom (what the world can turn into). The<br />
paper presents an animation method hierarchy (pyramid hierarchy) with geometric,<br />
kinematic, physical, behavioral, cognitive modeling representing higher<br />
and higher levels of animation techniques. They use cognitive modeling to<br />
direct behavioral animation.<br />
[40] presents another type of domain knowledge. Virtual touch sensors are used<br />
in the hands of virtual humans to simulate grasping behavior. Small spheres<br />
are used as collision sensors in combination with grasping knowledge (deducted<br />
from the grasped object’s properties) <strong>for</strong> a physically correct grasping sequence.<br />
A particular method <strong>for</strong> character animation is presented in [63]. A timed, parametric,<br />
stochastic and conditional production L-system expanded with <strong>for</strong>ce<br />
fields, synthetic vision and audition is used to simulate the development and<br />
behavior of static objects, plants and autonomous creatures. An extra iteration<br />
step is used to derive the state of objects at increasing time moments. The<br />
animation system used is a real-time multi-L-system interpreter enhanced with<br />
ray tracing image production. Emergent behaviors can emerge by using virtual<br />
sensor functions in the conditions of production rules. A physically based virtual<br />
sensor is the tactile sensor (the other two virtual sensors are the image-based<br />
visual sensor and the acoustic sensor) which evaluates the global <strong>for</strong>ce field at<br />
a certain position.<br />
[20] uses different levels of detail <strong>for</strong> physical simulations. Similar to graphics<br />
LOD techniques, different simulation levels are used <strong>for</strong> animation: dynamic<br />
simulation, hybrid kinematic/dynamic simulation and point-mass simulation.<br />
These levels are selected according to the interaction probabilities. Objects<br />
have space of influence, determining collision possibilities and thus the need<br />
<strong>for</strong> a detailed dynamic simulation. The switching of simulation levels is done<br />
during simulation steps that match certain criteria <strong>for</strong> a smooth transition,<br />
like the character being airborne. Another possible simulation LOD selection<br />
criteria the authors mention could be the cost/benefit heuristic, which requires<br />
the developer to specify importance values in the simulation environment.<br />
3.2 Procedural animation<br />
We divided the procedural animation methods from two point of views. First,<br />
we take a look at the combination techniques <strong>for</strong> motion data (left column, bottom<br />
in table 3.1), which provide the easiest interactivity possibilities by simply<br />
7
eusing a motion library. In addition to combining motion sequences, there is<br />
another category of motion generating methods (left column, top) which aim<br />
to alter the motion data based on properties defined <strong>for</strong> the character, its mental<br />
model, behavior, etc. The latter methods are used generally together with<br />
motion combination techniques, thus this category is possibly less interactive<br />
than the motion combining methods category alone (<strong>for</strong> instance by being more<br />
computationally demanding and not per<strong>for</strong>ming in real-time) but also provides<br />
more qualitative variance in the generated motion sequences.<br />
An animation method that does not fit the following classifications is the ”impostor”<br />
technique presented in [2]. Their aim was to produce real-time (fast)<br />
animations <strong>for</strong> geometrically complex characters, possible multiple characters.<br />
There<strong>for</strong>e they use the 2D sprites technique in <strong>3D</strong> by pre-generating snapshot<br />
pictures of the complex characters, with several levels of detail, and mapping<br />
these on billboards instead of really animating the characters. Re-rendering<br />
is controlled by a temporal coherence model, meaning if threshold values (differences<br />
between the snapshot and the animated character) are exceeded, only<br />
then the impostor is re-rendered.<br />
3.2.1 Combining animation data<br />
Combining animation sequences (motion capture data) can be a tedious task<br />
if done by hand. There<strong>for</strong>e, automatic motion combination techniques have<br />
emerged from the simplest ones to most complex ones, from fading functions<br />
to overlapping and blending techniques.<br />
One of the most simple motion combination methods is the use of fade-in and<br />
fade-out functions [65][54] or the motion parameter curve blending method used<br />
by [82]. Similar is the method presented by [3] where motions are segmented<br />
into motion phases that can be named, stored and executed separately, and possibly<br />
connected via transitional motions. Slightly more complex is the motion<br />
interpolation technique used by [72] to combine walking movement sequences<br />
into data with height changes at foot contact that fits the terrain variations the<br />
character is walking on.<br />
In addition to the simple fade-in and fade-out functions, [65] uses radial basis<br />
functions to interpolate between distinct motion data sequences. The approach<br />
is based on the ”verbs and adverbs” analogy, where the original motion (verb) is<br />
interpolated according to some property (adverb) <strong>for</strong> motion transition purposes.<br />
This way, a smooth transition is achieved between motion sequences. In an<br />
earlier publication [66] they use spacetime constraints <strong>for</strong> the same purpose, <strong>for</strong><br />
reassembling the non-overlapping motion sequences. Although motion data is<br />
used, this is broken up and converted into dynamics <strong>for</strong>mulation incorporating<br />
spacetime and IK constraints, and an energy (torque) minimizing approach is<br />
used to enhance the naturalness of the generated movement.<br />
[21] and [22] uses motion prototypes, a dictionary of gestures to create animations<br />
from text and dialogue theory. The gesture motions are annotated and<br />
synchronized <strong>for</strong> animating dialogues using parallel transition networks (PaT-<br />
Nets), see section 5.3 <strong>for</strong> a description of this and other animation architectures.<br />
However, these methods work satisfactory only with non-conflicting motion<br />
sequences. When the motion data sets try to act upon the same target joints,<br />
the results can be unexpected. To cope with this problem, different methods<br />
8
<strong>for</strong> combining motion data were proposed, based mainly on different weighting<br />
and summing techniques.<br />
[40] [69] uses priority (importance) weights and weights-based combination of<br />
conflicting data in initiating and terminating movement phases, with an ease-in<br />
and ease-out technique based on cubic step functions, to achieve smoothness<br />
next to the meaningful combination of movements. These methods were collected<br />
in a software library [13] <strong>for</strong> the management of animation data combinations.<br />
In this library, actions have scope (affected joints) and priority (relative<br />
importance of action) properties, while agents (articulated characters) also have<br />
scope, they can accept or reject actions. The blending of motions is executed<br />
depending on the action types motion or posture.<br />
[64] uses the traditional animators knowledge: ”human motions are created from<br />
combinations of temporarily overlapping gestures and stances” but goes one<br />
step further and uses groups of mutually exclusive animations which compose<br />
levels with different importance weights to achieve next to motion combination<br />
a certain amount of speed gain by simplifying the search <strong>for</strong> conflicting motions.<br />
[12] uses a range of motion generator techniques: dynamic, motion capture<br />
(live motion), motion capture playback and IK motion tasks, where the motion<br />
generator techniques can be used in combination (one technique per motion<br />
generator which acts on a set of joints). In their case, conflicting joints are<br />
summed using priority levels and weights.<br />
A particular case of motion combination is presented in [44] where cyclic motion<br />
capture data is used in combination with a PD controller. The controller is<br />
responsible <strong>for</strong> executing the linear and angular acceleration of state variables<br />
over time as a result of the path planning algorithm.<br />
3.2.2 Altering (varying) animation data<br />
If motion sequences are reused frequently, then the repetitions lend a monotone<br />
feeling to the animation. Since motion data (motion libraries) are limited,<br />
researchers tried to introduce variations in motion data to achieve non-repetitive<br />
movement sequences. They achieved this in various ways, ranging from noise<br />
functions to B-splines and from IK to neural networks.<br />
Qualitative variations<br />
The first set of methods discussed uses key-frame data and alters it qualitatively<br />
using different functions and methods. The goal is to introduce qualitative<br />
variations into the animation data, making non-repetitive re-usage of the same<br />
data set possible.<br />
[79] applies Fourier trans<strong>for</strong>mations to motion data and uses frequency analysis<br />
to extract basic factors (like walk) and qualitative factors (like brisk, slow,<br />
etc.). Intrapolation and extrapolation in the frequency domain is then used<br />
to create new motion types from neutral data and emotional mood variations.<br />
The new motions can be generated continuously, and additional manual control<br />
possibilities exist to influence the animation by hand.<br />
[1] per<strong>for</strong>ms emotional trans<strong>for</strong>ms on motion data using signal processing techniques.<br />
They analyze and extract neutral motion and emotional differences<br />
from motion data and apply these to other types of neutral data. The varying<br />
components used (extracted) are speed and spatial amplitude.<br />
9
[82] uses motion warping techniques to alter key-frame data. By warping the<br />
motion parameter curves to animator-specified constraint key-frames, local variations<br />
are introduced in the motion data without changing the fine structure of<br />
the original motion. Motion sequences are combined by overlapping and blending<br />
the parameter curves. Can be used to create libraries of reusable motion<br />
by specifying a small number of constraint key-frames.<br />
[19] uses similar techniques from the image and signal processing domain as<br />
complementary techniques to key-framing, motion capture, and procedural animation.<br />
The techniques are multiresolution motion filtering, multitarget motion<br />
interpolation with dynamic timewarping, waveshaping and motion displacement<br />
mapping. In multiresolution motion filtering, low frequencies contain the general<br />
motion, while high frequencies the detail, subtleties and noise. By filtering<br />
certain bands, different styles of motions can be obtained. Timewarping techniques<br />
are used to match movement patterns indifferently of the animation time,<br />
resulting in combination (warping) of movement sequences. Motion waveshaping<br />
is used to introduce motion limits or to alter movement styles. Motion<br />
displacement mapping is used to locally alter the movement sequence without<br />
changing the global aspect of the movement. The authors state that the most<br />
important feature of this latter technique would be the alteration of a grasping<br />
movement towards a different target object (although the new target object<br />
should be also close to the original target object).<br />
[16] uses ”Style” Hidden Markov Models (HMMs) to learn patterns, in particular<br />
the style of movements from motion capture data and then uses these<br />
Style HMMs to generate motions with different style and scope. The goal is to<br />
extract a parameterized model that covers the generic behavior of a family of<br />
motion capture data. The Style HMMs are applied by using analogy rules, and<br />
it is applicable to long motion data sequences and this technique is regarded by<br />
the authors as an alternative <strong>for</strong> motion libraries.<br />
New motion from data<br />
A second set of methods could be defined as methods using motion data as<br />
basis <strong>for</strong> new motion calculations, such as noise functions or knowledge-driven<br />
methods, and even physical simulation methods are used sometimes.<br />
[64] uses noise functions on existing key-frame data in combination with IKbased<br />
motion calculations <strong>for</strong> simulating vivid, natural movements from existing<br />
key-frame motion data.<br />
A hybrid numerical IK method is presented in [54] <strong>for</strong> interactive motion editing.<br />
Constraints are specified on the key-frame motion data, <strong>for</strong> which B-splines are<br />
fitted. These fittings result in hierarchical displacement maps that are used <strong>for</strong><br />
motion alterations, from which the animation data is calculated (using IK) by<br />
eliminating redundant coordinates. Example constraints specify bending below<br />
doorways, terrain following, motion combination and even motion retargeting<br />
(next group of motion altering methods).<br />
[18] uses interactive, knowledge driven altering techniques <strong>for</strong> running data<br />
which are based on empirical (relations between parameters), physical (<strong>for</strong><br />
body trajectory) and constraint knowledge (<strong>for</strong> flight/support states and s-<br />
tance/swing phases). Examples of parameter relations are presented such as:<br />
increase in velocity is linear with the increase in step length up to a certain<br />
speed (24km/h) then only the step frequency is increased to further increase<br />
10
the velocity. Other relations regard the level of expertise (experts have greater<br />
stride length) or the relation between leg length and ”froude” number (inertial<br />
<strong>for</strong>ce / gravitational <strong>for</strong>ce).<br />
[23] uses LMA (Laban Movement Analysis) properties based movement analysis<br />
and alteration. Ef<strong>for</strong>t and shape properties are used (the other three properties<br />
used by LMA are body, space and relationship) to introduce qualitative differences<br />
(mood, internal state). Key-frame data is used from various sources, and<br />
the shape parameter influences the key-frame values, while the ef<strong>for</strong>t parameter<br />
influences the execution speed of the key-frame data, specifying also the frame<br />
distribution between key-points. The altered key-frames are regarded as ’via’<br />
and ’goal’ type end-effectors <strong>for</strong> an IK interactive system. The same requirements<br />
<strong>for</strong> virtual human animation are presented in an earlier work [4]. Based<br />
on an analysis of animation methods, the paper concludes that both empirical<br />
data and cognitive properties are important in animation, resulting in the approach<br />
taken based on LMA and human properties like personality, emotions,<br />
temperament. They express the view that <strong>for</strong> directing attention and attentiondependent<br />
motions (walking, reaching) sensing actions and interleaving motions<br />
are needed.<br />
Motion retargeting<br />
A different set of methods are the ones that concentrate on motion retargeting.<br />
The motion data is altered to be usable <strong>for</strong> articulated characters of different<br />
sizes and types.<br />
[10] parameterizes the motion capture data and applies it to other target articulations.<br />
The second order derivative is used to search <strong>for</strong> significant changes<br />
in motion, by analyzing zero-crossings to extract basic action segments, which<br />
are later interpolated using cubic splines to yield new motion sequences. Uses<br />
different types of tags <strong>for</strong> action participant objects.<br />
[31] retargets motion to new characters using specific features (contact points)<br />
of the original motion as constraints. Their argument is that procedural and<br />
simulation retargeting means new data, without the qualities of the original<br />
detailed mocap, key-frame data. They use a spacetime approach which eliminates<br />
the spatial jerkiness of frame-based retargeting methods by look-ahead<br />
and look-behind techniques (anticipating constraints) <strong>for</strong> the whole motion sequence<br />
which results in smooth motion data generation.<br />
[38] uses a two-step motion adapting method <strong>for</strong> new characters. The first step<br />
consists of a geometric scaling enhanced with mass scaling and in a second step<br />
the parameters are fine-tuned using a ”simulated annealing” technique (progressive<br />
adaptation of multivariate functions towards target states). Although<br />
the method is applied to control systems and not motion data as other methods<br />
in this section, the method is still relevant from the retargeting perspective.<br />
One method which differs from the previous ones in its scope presented in [32]<br />
is the use of different paths to retarget the motion data. Paths represented as<br />
B-spline curves are altered (they are editable) and foot constraints are used to<br />
eliminate foot skating (method presented also in an earlier work [31]).<br />
11
3.3 Alternative classifications<br />
There are of course other type of animation classifications. For instance, [74]<br />
and [75] uses instead of the physical/procedural a physical/geometrical classification,<br />
where kinematics is enlisted as geometrical animation type, next to<br />
procedural animation methods. [29] defines a pyramid hierarchy with geometric,<br />
kinematic, physical, behavioral, cognitive animation methods, all building<br />
on top of each other.<br />
<strong>Animation</strong> could be classified also from the point of view of target systems.<br />
Film and animation production software use complex physical simulations to<br />
make compelling animations, since their target objective is not the real-time<br />
applications market. On-line environments, games on the other hand have the<br />
real-time requirement, thus they need fast, reusable animations, thus they orient<br />
more in the direction of procedural animation.<br />
4 Movement semantics<br />
Semantic analysis and semantic generation of movement are important components<br />
in the behavior (brain) specification <strong>for</strong> animated characters as stated by<br />
[3] [4], where they use this semantics as a method to alter movement sequences.<br />
[18] uses a high-level motion control system based on dynamic parameters in<br />
combination with empirical, physical as well as constraints knowledge to procedurally<br />
generate walking/running animation data. Others like [40] use object<br />
grasping knowledge <strong>for</strong> sensor-based grasping simulation of different types of<br />
objects in virtual environments.<br />
Classical cartoon creators use different animation sequences to empathize the<br />
movement semantics in their cartoon characters. <strong>Computer</strong> character animators<br />
can use the same techniques when creating key-frame animations [51] to achieve<br />
expressiveness, believable behavior. Additionally, film theory concepts could be<br />
used <strong>for</strong> animations [56], mainly <strong>for</strong> staging, viewpoint animation, etc.<br />
On the other hand, there are exact, medical and psychological-social measurements<br />
<strong>for</strong> the semantics of human movement. Regarding medical measurements,<br />
[43] presents a series of walking characteristics, while [81] discusses phases of<br />
walking, <strong>for</strong>ces and moments, muscle activity resulting from human gait analysis.<br />
Psychological textbooks analyze the different human poses, assigning meanings<br />
to them. McNeill [60] analyzed the meaning of gestures and the gesturing<br />
behavior of humans during dialogs. He categorized arm gestures into the following<br />
five types: iconics, metaphorics, deictics, beats, emblems and studied their<br />
connections to dialogue characteristics. Similarly, the human facial movements<br />
were analyzed by Ekman [25] <strong>for</strong> extracting the expression of different psychological<br />
effects such as mood or mind state from face configurations (features)<br />
resulting in the Facial Action Coding System (FACS).<br />
These results found their way into computer animation. [21], and more recent<br />
the BEAT system [22] uses the McNeill gesture and dialogue semantics to<br />
generate the gesturing behavior of autonomous characters. [21] uses gestural<br />
annotations, gestural lexicon look-up, timing from speech generation in producing<br />
the animation. Similarly, [22] creates gestures from speech and dialogue<br />
rules. An arbitrary animation system is wrapped around with behavior tags<br />
and used in synchronization with speech and the semantic context of the text<br />
12
to produce the gesture animations. Even the FACS system data is used to<br />
generate the facial expressions <strong>for</strong> the simulated dialogues.<br />
[80], [55] and other facial animation systems rely heavily on the findings of Ekman<br />
in creating facial expressions. The Waters facial model [80] uses muscle<br />
simulations and the FACS system to model facial expressions, emotions. [21]<br />
uses the same FACS system to generate facial expressions such as lip shape, conversational<br />
signal, punctuator, manipulator and emblem, animated expressions<br />
to accompany their dialogue utterances.<br />
Behavioral animation is also gaining momentum. Simple animal behaviors [78],<br />
complex human dialogue behaviors [22] and some of the different motion alteration<br />
methods from section 3.2.2 use behavior, moods, emotions to influence<br />
the created animation, making it more expressive, natural and meaningful.<br />
5 Applications, tools, systems<br />
5.1 Editor types<br />
Historically, <strong>3D</strong> editor systems used different views <strong>for</strong> editing: the 3+1 approach,<br />
representing the scene from top, front, side and <strong>3D</strong> view [61]. However,<br />
there are other approaches also, where the editor interface is inserted into the<br />
<strong>3D</strong> environment, breaking away from the multiple viewpoints and also from the<br />
WIMP interface towards a fully integrated <strong>3D</strong> interface [33] [47].<br />
Classical key-frame setting operations in modeling commercial packages handle<br />
besides articulated characters also the environment animations. They do not<br />
produce interactive environments, thus this is a logical animation approach.<br />
There are also software packages that concentrate only on animation of articulated<br />
characters, and the result of these animations can be applied in interactive<br />
environments. However, these latter animations are still rigid, and running<br />
them repeated times will convey unnaturalness to the environment.<br />
These packages have different tools <strong>for</strong> enabling easy foot positioning. [30]<br />
presents a number of these approaches that are used by commercial packages.<br />
The first method is IK with locks, when <strong>3D</strong> character articulations are locked<br />
(pinned down) and IK calculations are used to calculate joint rotations when the<br />
character is moved. A typical example is pinning down the foot on the floor to<br />
eliminate foot skating. Some packages offer the same pinning functionality also<br />
<strong>for</strong> <strong>for</strong>ward kinematics, when the user positions manually the unconstrained<br />
articulations to achieve the desired result. A similar animation possibility is<br />
to drag limb extremities, where internal joint rotations are resolved by IK. An<br />
advanced animation possibility is the ”footstep generator”, when the animator<br />
specifies footprints on the ground, and the software animates the character’s<br />
movements to match the footprints. This calculation is usually a physical simulation<br />
which has to be refined, <strong>for</strong> instance by adding personality, individual<br />
characteristics. An odd method <strong>for</strong> animation is the use of inverse (or broken)<br />
hierarchies. It is based on the <strong>for</strong>ward kinematics animation of inverse limbs,<br />
typically the legs and arms. It is called broken because to achieve inverse<br />
hierarchies with non-repeating leaves, the torso has to be separated from the<br />
hierarchies and it has to be animated (kept in synchronization with the rest of<br />
the character) separately. This can be a little cumbersome, but the advantage<br />
13
is that the foot can be positioned without IK. Generally, when characters are<br />
being animated, it is also important to have references to the previous frames.<br />
Some software have built-in onion-skin effect or aid by allowing to place dummy<br />
objects as markers.<br />
Another approach to animation <strong>for</strong> which intuitive interfaces also start to e-<br />
merge is the use of key-frame or mocap data and tools to test the manipulation<br />
possibilities on these data (combination, alteration techniques) as an interactive<br />
system would do in real-time. Such a system is presented in [19] where filters<br />
can be used to alter animations.<br />
A complex, high-level animation interface is presented in [68] which allows the<br />
directing of multiple actors and cameras in real-time. Key-words are high level<br />
controls, predefining actions, while text-based sentences are used <strong>for</strong> high level<br />
actions. There are also tools <strong>for</strong> preprogramming actions, since there are many<br />
virtual humans to control. A scripting tool records all actions played by an<br />
actor and plays it back as requested. Actors can be programmed one by one.<br />
[33] uses a network of interrelated objects and constraints to visually edit the<br />
properties of virtual environments, including animation. The interesting approach<br />
is the fact that the edited environment and the controls are all threedimensional,<br />
making the editing interface integrated into the environment.<br />
[67] presents a facial animation editor system based on motion capture data<br />
and editing constraints representing laws of co-activation, symmetry laws and<br />
animation limits, where the animation itself is driven by emotion parameters.<br />
5.2 Planning<br />
A classical path planning algorithm is the A* algorithm, used by [44]. They<br />
use an orthogonal projection of the environment to a 2D plane, divided into<br />
cells. Collision detection and path planning is done based on the occupied/free<br />
cells of the plane. The free cells of the plane are trans<strong>for</strong>med into a graph, <strong>for</strong><br />
which the A* algorithm is applied. Variations exist on this method, <strong>for</strong> instance<br />
boundary cell skipping <strong>for</strong> collision avoidance or multilevel cell maps [6].<br />
A planning method that is used frequently in games takes advantage of space<br />
subdivision techniques, with different types of subdivisions. It could be applied<br />
in both 2D planes and <strong>3D</strong> space, providing both a notation <strong>for</strong> freely navigable<br />
space and a collision detection mechanism. [77] uses an octree division of free<br />
and occupied spaces in a virtual environment to notate free space <strong>for</strong> path<br />
planning, incorporating dynamic possibilities <strong>for</strong> updating octree data due to<br />
animations or possibilities <strong>for</strong> adding or removing objects, in case of content<br />
change.<br />
[63] uses an autonomous planning method based on a reactive, timed behavioral<br />
L-system that handles visual and tactile virtual sensors. The values of<br />
the sensors are used in production rules <strong>for</strong> determining the behavior of the<br />
characters.<br />
[50] provides a method of high level animation by specifying approximate paths<br />
and synchronization constraints. Their system handles automatically collisions<br />
and animation using <strong>for</strong>ward dynamics and a synchronization mechanism that<br />
is depending on relative positions of objects to each other, and not a common<br />
coordinate system. Splines are used <strong>for</strong> motion (path) specification and are<br />
14
altered by scripting depending on the synchronization constraint states.<br />
[11] presents an autonomous virtual human directing system that is based on<br />
synthetic vision sensor (the viewpoint of the virtual human). The autonomous<br />
humans are directed by a behavioral system at motivational (disregarding the<br />
behavior system), task (if the behavioral system allows it) and direct motor<br />
levels (influence the modality of the current action) according to the in<strong>for</strong>mation<br />
extracted from the visual sensors using vision techniques.<br />
[29] uses a reactive path planning method based on the possible successor state<br />
axiom instead of the effect axiom (using the ”what the world can turn into”<br />
metaphor). Their characters are directed by goals and domain knowledge, incorporating<br />
actions, preconditions and effects. For the actual path planning,<br />
a sensor and interval (effects range) based narrowing of possible world states<br />
(regarded as a tree structure) and thus plan sequences is used as a predecessor<br />
step <strong>for</strong> a faster resolution time.<br />
The following methods are methods complementary to path planning, but necessary<br />
to execute the different paths selected. [72] uses a method of gait generation<br />
along a curved path and uneven terrain with high control path specification.<br />
The high-level, intuitive parameters used are step length, step height, heading<br />
direction and toe-out. Similarly, [32] retargets motion data with different paths,<br />
using constraints to eliminate foot skating. The path is represented by a<br />
B-spline curve, which is editable to produce new movement paths.<br />
[37] uses a path following method based on the redirection of the runner character<br />
to face the point it will reach in a two-second animation cycle. This<br />
redirecting methods allows a dynamic path planning method to be used.<br />
5.3 <strong>Animation</strong> systems (architectures)<br />
For real-time virtual humans, [45] presents a number of criteria or guidelines<br />
that is recommended to achieve good animation results. These are low polygon<br />
count, fast skin de<strong>for</strong>mations, non-physical skeleton animation, etc. Given that<br />
the statement was made in 1998 referring to SGI workstation hardware, the<br />
statement is still partially valid referring to today’s ever powerful PC environments,<br />
especially if we consider a setting of multiple characters that must be<br />
animated in complex environments, with complex behavior. However, physicsbased<br />
methods also approach the real-time level, and will start to emerge, especially<br />
in professional settings.<br />
[3] and [9] describes parallel finite-state machine controllers called Parallel Transition<br />
Networks (PaT-Nets) used together with additional parameters and in<strong>for</strong>mation<br />
coded in a higher-level representation called Parameterized Action<br />
Representation (PAR). A PAR is defined as the description of a single action in<br />
terms of objects, agent, applicability conditions, preparatory specifications, execution<br />
steps, manner, termination conditions and post assertions. This is used<br />
as a sense-control-act architecture to animate ”smart avatars”, avatars whose<br />
reactive behavior is manipulated in real-time to be locally adaptive. The architecture<br />
integrates autonomy control levels (to optimize lower level reactivity<br />
or plan higher level complex tasks), gesture control based on a object-specific<br />
relational table and non-verbal communication, attention control directed by<br />
visual sensing requirements and locomotion with anticipation <strong>for</strong> favorable positioning<br />
in interaction scenarios. The paper [4] presents in detail their attention<br />
15
control method based on sensing events that simulates deliberate, spontaneous<br />
and idle attention or behavior using an automata system with an arbitrating<br />
mechanism controlling it.<br />
Part of the animation architecture presented previously is used also by [21]<br />
<strong>for</strong> the synchronization of gaze and hand movements with dialogues. Gesture<br />
PaT-Nets are used to sends hand and arm timing, position and shape to the<br />
animation system which produces a file of motions depending on this data. Motions<br />
are categorized as beats and gestures, the beats being based on speech<br />
rhythm, while gestures summarize the deictic, iconic, metaphoric gestures based<br />
on meaning representations. Beats are often superimposed on other gestures<br />
and coarticulation (2 gestures without relaxation), preparation, <strong>for</strong>eshortening<br />
(anticipation) effects are also reproduced using the PaT-Nets synchronization<br />
architecture. The further extension of the system is discussed in [22] where<br />
an automatic text-to-dialogue conversion method is used to generate synchronized<br />
animation and speech from the semantic context and a behavior-tagged<br />
animation system.<br />
[64] uses a distributed multi-agent system (MAS) with blackboard agent synchronization<br />
<strong>for</strong> animating multi-user virtual environments. Multiple animation<br />
engine values at each LAN and a global behavior engine value is used <strong>for</strong> local<br />
animation frame synchronization (one mind, multiple bodies or parallel universes<br />
analogy). For consistency, all environment objects behave as agents to<br />
avoid multiple locking (similar to database techniques).<br />
The aim of the animation system presented in [68] and [69] is to provide integrated<br />
virtual humans with (multiple source) facial and body animation and<br />
speech, as well as to provide a straight<strong>for</strong>ward interface <strong>for</strong> designers and directors<br />
to handle these virtual humans. The speech-animation synchronization<br />
is done using duration in<strong>for</strong>mation from the text-to-speech system used. The<br />
system is built upon a server/client architecture using a virtual human message<br />
protocol that enables connection over networks, and makes possible <strong>for</strong> many<br />
clients to control the virtual humans, behaving thus as virtual actors. Keywords<br />
are used as high level controls, predefining actions, text-based sentences<br />
<strong>for</strong> high level actions. Tools <strong>for</strong> preprogramming actions and a scripting tool to<br />
record and play back all actions played by an actor are used to exploit the system’s<br />
possibilities. The animation management methods used <strong>for</strong> this directing<br />
system is described in detail in [13].<br />
[48] presents an animation framework <strong>for</strong> general type of animations (not just<br />
biped) with a combination of techniques (kinematics, dynamics and control<br />
theory). The input desired velocity and heading is trans<strong>for</strong>med by a feedback<br />
controller into <strong>for</strong>ces and torques, which are distributed to the legs in contact<br />
with the ground by a linear programming algorithm. Forward dynamics is<br />
used to calculate actual body displacement, while a kinematic gait controller<br />
calculates the stepping pattern. The animation framework is tested in uneven<br />
terrain and target following experiments.<br />
With such complex systems it is possible to achieve even the animation <strong>for</strong> text<br />
to scene conversion systems such as [24] or text to animation systems as it is<br />
already being done by the BEAT system [22].<br />
16
6 Conclusions<br />
During the review of the different animation systems we looked at the applicability<br />
of the described systems to our constraints, constraints that are inherent<br />
from the setup we work on. We use internet-enabled <strong>3D</strong> virtual reality technologies<br />
to provide a wide user access possibility, there<strong>for</strong>e the target plat<strong>for</strong>ms are<br />
desktop computers with consumer graphics cards. The desktop hardware provides<br />
ever increasing computational and graphics power, but real-time complex<br />
simulation methods are not yet feasible <strong>for</strong> interactive, web-based applications.<br />
We analyzed the animation systems from three points of view. These considerations<br />
were animation quality, computational overhead and reusability, since<br />
our aim is a complete system that provides compelling animation with a small<br />
computational overhead and provides the possibility to reuse the created animations.<br />
The systems analyzed looking at these criteria were themselves complex<br />
systems that presented results <strong>for</strong> all of the three consideration directions or<br />
just <strong>for</strong> some of them, either way they were considered in our review.<br />
The two main categories of animation methods with regard to movement data<br />
production are the motion-capture based and simulation based animation<br />
methods. One of these <strong>for</strong>ms the base <strong>for</strong> an animation system, while different<br />
additional methods provide enhancements to the animations produced by the<br />
base methods.<br />
Regarding animation quality, there are two subconsiderations: variation and<br />
naturalness. Motion capture based methods are capturing motion per<strong>for</strong>med<br />
by humans, thus they capture also the naturalness of the movements, and not<br />
only with movement experts. The drawback lies in the fact that the motion<br />
sequences are invariant and in case of reuse the repetition is easily observable.<br />
On the other hand, simulation methods produce variant movement data, but<br />
they often lack the naturalness of captured movement data, and there<strong>for</strong>e are<br />
mainly used in simulating robots.<br />
The second consideration is computational overhead. In this case the motion<br />
capture methods are in advantage since they don’t need computations to calculate<br />
the movement data <strong>for</strong> each animation sequence, they just use prerecorded<br />
data. The price of the varieties provided by the simulation based animation<br />
methods is in the high computing requirements, making it too expensive <strong>for</strong><br />
certain application areas.<br />
From the first two considerations, we can see that motion capture is more favorable<br />
<strong>for</strong> our purposes, but there is still a negative property <strong>for</strong> animation<br />
sequences based solely on motion capture data: the resulting animations are<br />
invariant. The third criteria we looked at is there<strong>for</strong>e reusability, with emphasis<br />
on motion data analysis and handling. Simulation based systems always<br />
produce new data, thus the criteria does not apply to simulation. With a little<br />
computational overhead (which is less than what is needed <strong>for</strong> simulation<br />
methods), different methods can be used to analyze motion capture data, extract<br />
features, style, basic motions, etc. and use this new motion knowledge<br />
to modify the motion capture data in a natural, compelling way to produce<br />
variant animations.<br />
Our conclusion is that motion capture methods in combination with motion<br />
analysis and alteration methods provide quality, naturalness and reusability<br />
with a small footprint on speed and thus interactivity, since the computational<br />
17
overhead is small. Motion data analysis and processing is particularly important<br />
with the advent of movement capture methods from 2D video images, especially<br />
if the movement is extracted from general purpose video material where motions<br />
are not executed separately and expressively, with motion capture in mind.<br />
We need qualitative analysis to break captured movements (regardless from the<br />
source) into basic movements that represent separate actions, <strong>for</strong> a variety of<br />
combination possibilities, together with animation data qualitative variation<br />
based on movement parameters and psychological (mood, emotion) factors to<br />
simulate compelling animations <strong>for</strong> agent and avatar <strong>3D</strong> characters.<br />
Regarding the actual motion analysis and alteration methods, many options<br />
exist. However, in the light of possibly using complex movement data captured<br />
from video sources, an obvious choice would be a motion segmentation method<br />
based on motion curve parameters, where the derivatives show movement endpoints,<br />
making the separation of movement sequences into basic movements<br />
possible. However, this method would have to be extended to extract the overlapping<br />
movement sequences of video based movement capture data, since in<br />
that case the movements are most likely mixed.<br />
References<br />
[1] Kenji Amaya, Armin Bruderlin, and Tom Calvert. Emotion from Motion.<br />
In Graphics Interface ’96, 1996.<br />
[2] Amaury Aubel, Ronan Boulic, and Daniel Thalmann. Animated Impostors<br />
<strong>for</strong> Real-Time Display of Numerous Virtual Humans. In Virtual Worlds<br />
’98, pages 14–28, 1998.<br />
[3] Norman I. Badler, Rama Bindiganavale, Juliet Bourne, Jan Allbeck, Jianping<br />
Shi, and Martha Palmer. Real Time Virtual Humans. In Proceedings<br />
of International Conference on Digital Media Futures ’99. British <strong>Computer</strong><br />
Society, 1999.<br />
[4] Norman I. Badler, Diane Chi, and Sonu Chopra. Virtual Human <strong>Animation</strong><br />
Based on Movement Observation and Cognitive Behavior Models. In<br />
<strong>Computer</strong> <strong>Animation</strong> ’99, pages 128–137, 1999.<br />
[5] Norman I. Badler, Cary B. Phillips, and Bonnie L. Webber. Simulating<br />
Humans: <strong>Computer</strong> Graphics, <strong>Animation</strong>, and Control. Ox<strong>for</strong>d University<br />
Press, 1993.<br />
[6] Srikanth Bandi and Daniel Thalmann. Space discretization <strong>for</strong> efficient<br />
human navigation. In Proc. Eurographics ’98, <strong>Computer</strong> GraphicsForum,<br />
volume 17(3), pages 195–206, 1998.<br />
[7] Y. Bellan, M. Costa, G. Ferrigno, F. Lombardi, L. Macchiarulo, A. Montuori,<br />
E. Pasero, and C. Rigotti. Artificial Neural Networks <strong>for</strong> Motion Emulation<br />
in Virtual Environments. In CAPTECH’98, pages 83–99. Springer,<br />
1998.<br />
[8] Uwe Beyer and Frank Śmieja. A Heuristic Approach to the Inverse Differential<br />
Kinematics Problem. In Journal of Intelligent Robotic Systems:<br />
Theory and Applications, volume 18(4), pages 309–327, 1997.<br />
18
[9] Rama Bindiganavale, William Schuler, Jan M. Allbeck, Norman I. Badler,<br />
Aravind K. Joshi, and Martha Palmer. Dynamically Altering Agent Behaviors<br />
Using Natural Language Instructions. In Autonomous Agents 2000,<br />
pages 293–300, 2000.<br />
[10] Ramamani N. Bindiganavale. Building Parameterized Action Representations<br />
from Observation. PhD thesis.<br />
[11] Bruce M. Blumberg and Tinsley A. Galyean. Multi-Level Direction of<br />
Autonomous Creatures <strong>for</strong> Real-Time Virtual Environments. In <strong>Computer</strong><br />
Graphics (Siggraph ’95 Proceedings), pages 47–54, 1995.<br />
[12] Vincent Bonnafous, Eric Menou, Jean-Pierre Jessel, and René Caubet. Cooperative<br />
and Concurrent Blending Motion Generators. In International<br />
Conference in Central Europe on <strong>Computer</strong> Graphics, Visualization and<br />
<strong>Computer</strong> Vision (WSCG’2001), 2001.<br />
[13] Ronan Boulic, Pascal Bécheiraz, Luc Emering, and Daniel Thalmann. Integration<br />
of Motion Control Techniques <strong>for</strong> Virtual Human and Avatar<br />
Real-Time <strong>Animation</strong>. In VRST’97, pages 111–118. ACM, 1997.<br />
[14] Ronan Boulic, Ramon Mas, and Daniel Thalmann. Inverse Kinetics <strong>for</strong><br />
Center of Mass Position Control and Posture Optimization. In European<br />
Workshop on Combined Real and Synthetic Image Processing <strong>for</strong> Broadcast<br />
and Video Production , 1994.<br />
[15] Ronan Boulic, Ramon Mas, and Daniel Thalmann. Robust Position Control<br />
of the Center of Mass with Second Order Inverse Kinetics. In <strong>Computer</strong>s<br />
Graphics Journal, 1996.<br />
[16] Matthew Brand and Aaron Hertzmann. Style Machines. In <strong>Computer</strong><br />
Graphics (Siggraph ’00 Proceedings), pages 183–192, 2000.<br />
[17] David C. Brogan, Ronald A. Metoyer, and Jessica K. Hodgins. Dynamically<br />
Simulated <strong>Characters</strong> in Virtual Environments. In IEEE <strong>Computer</strong><br />
Graphics and Applications, volume 15(5), pages 58–69, 1998.<br />
[18] Armin Bruderlin and Tom Calvert. Knowledge-Driven, Interactive <strong>Animation</strong><br />
of Human Running. In Graphics Interface ’96, 1996.<br />
[19] Armin Bruderlin and Lance Williams. Motion Signal Processing. In <strong>Computer</strong><br />
Graphics (Siggraph ’95 Proceedings), pages 97–104, 1995.<br />
[20] Deborah A. Carlson and Jessica K. Hodgins. Simulation Levels of Detail<br />
<strong>for</strong> Real-time <strong>Animation</strong>. In Graphics Interface ’97, 1997.<br />
[21] Justine Cassell, Catherine Pelachaud, Norman Badler, Mark Steedman,<br />
Brett Achorn, Tripp Becket, Brett Douville, Scott Prevost, and Matthew<br />
Stone. Animated Conversation: Rule-based Generation of Facial Expression,<br />
Gesture & Spoken Intonation <strong>for</strong> Multiple Conversational Agents.<br />
In <strong>Computer</strong> Graphics (Siggraph ’94 Proceedings), pages 413–420. ACM<br />
Press, 1994.<br />
[22] Justine Cassell, Hannes Högni Vilhjálmsson, and Timothy Bickmore.<br />
BEAT: the Behavior Expression <strong>Animation</strong> Toolkit. In <strong>Computer</strong> Graphics<br />
(Siggraph ’01 Proceedings), pages 477–486, 2001.<br />
19
[23] Diane Chi, Monica Costa, Liwei Zhao, and Norman Badler. The EMOTE<br />
Model <strong>for</strong> Ef<strong>for</strong>t and Shape. In <strong>Computer</strong> Graphics (Siggraph ’00 Proceedings),<br />
pages 173–182, 2000.<br />
[24] Bob Coyne and Richard Sproat. WorldsEye: An Automatic Text-to-Scene<br />
Conversion System. In <strong>Computer</strong> Graphics (Siggraph ’01 Proceedings),<br />
pages 487–496, 2001.<br />
[25] P. Ekman and W. V. Friesen. The Facial Action Coding System (FACS): A<br />
Technique <strong>for</strong> the Measurement of Facial Action. Consulting Psychologists<br />
Press, 1978.<br />
[26] Luc Emering, Ronan Boulic, and Daniel Thalmann. Interacting with Virtual<br />
Humans through Body Actions. In IEEE Journal of <strong>Computer</strong> Graphics<br />
and Applications, volume 18(1), pages 8–11. IEEE, 1998.<br />
[27] Petros Faloutsos, Michiel van de Panne, and Demetri Terzopoulos. Composable<br />
Controllers <strong>for</strong> Physics-Based Character <strong>Animation</strong>. In <strong>Computer</strong><br />
Graphics (Siggraph ’01 Proceedings), pages 251–260, 2001.<br />
[28] P. Fua, R. Plänkers, and D. Thalmann. Realistic Human Body Modeling.<br />
In Fifth International Symposium on the 3-D Analysis of Human Movement,<br />
1998.<br />
[29] John Funge, Xiaoyuan Tu, and Demetri Terzopoulos. Cognitive Modeling:<br />
Knowledge, Reasoning and Planning <strong>for</strong> Intelligent <strong>Characters</strong>. In<br />
<strong>Computer</strong> Graphics (Siggraph ’99 Proceedings), pages 29–38, 1999.<br />
[30] George Maestri. Learning to Walk: The Theory<br />
and Practice of <strong>3D</strong> Character Motion.<br />
http://www.siggraph.org/education/materials/HyperGraph/animation/<br />
character animation/walking/learning to walk.htm.<br />
[31] Michael Gleicher. Retargetting Motion to New <strong>Characters</strong>. In <strong>Computer</strong><br />
Graphics (Siggraph ’98 Proceedings), pages 33–42, 1998.<br />
[32] Michael Gleicher. Motion path editing. In Symposium on Interactive <strong>3D</strong><br />
Techniques, pages 195–202, 2001.<br />
[33] Enrico Gobbetti and Jean-Francis Balaguer. An Integrated Environment<br />
to Visually Construct <strong>3D</strong> <strong>Animation</strong>s. In <strong>Computer</strong> Graphics (Siggraph<br />
’95 Proceedings), pages 395–398, 1995.<br />
[34] Larry Israel Gritz. Evolutionary Controller Synthesis <strong>for</strong> 3-D Character<br />
<strong>Animation</strong>. PhD thesis, 1999.<br />
[35] Radek Grzeszczuk and Demetri Terzopoulos. Automated Learning of<br />
Muscle-Actuated Locomotion Through Control Abstraction. In <strong>Computer</strong><br />
Graphics (Siggraph ’95 Proceedings), pages 63–70, 1995.<br />
[36] Radek Grzeszczuk, Demetri Terzopoulos, and Geoffrey Hinton. NeuroAnimator:<br />
Fast Neural Network Emulation and Control of Physics-Based<br />
Models. In <strong>Computer</strong> Graphics (Siggraph ’98 Proceedings), pages 9–20,<br />
1998.<br />
20
[37] Jessica K. Hodgins. Three-Dimensional Human Running. In Proceedings<br />
of the IEEE Conference on Robotics and Automation, 1996.<br />
[38] Jessica K. Hodgins and Nancy S. Pollard. Adapting Simulated Behavios<br />
<strong>for</strong> New <strong>Characters</strong>. In <strong>Computer</strong> Graphics (Siggraph ’97 Proceedings),<br />
pages 153–162, 1997.<br />
[39] Jessica K. Hodgins, Wayne L. Wooten, David C. Brogan, and James F.<br />
O’Brien. Animating Human Athletics. In <strong>Computer</strong> Graphics (Siggraph<br />
’95 Proceedings), pages 71–78, 1995.<br />
[40] Zhiyong Huang. Motion control <strong>for</strong> human animation. PhD thesis, 1997.<br />
[41] Zhiyong Huang, Nadia Magnenat-Thalmann, and Daniel Thalmann. Interactive<br />
Human Motion Control Using a Closed-<strong>for</strong>m of Direct and Inverse<br />
Dynamics. In Proc. Pacific Graphics ’94, 1994.<br />
[42] Humanoid <strong>Animation</strong> Working Group, Web<strong>3D</strong> Consortium. H-Anim:<br />
Specification <strong>for</strong> a Standard VRML Humanoid, version 1.1. http://www.hanim.org/Specifications/H-Anim1.1/,<br />
1999.<br />
[43] Verne T. Inman, Henry J. Ralston, Frank Todd, and Jean C. Lieberman.<br />
Human Walking. Williams Wilkins, 1981.<br />
[44] Jr. James J. Kuffner. Goal-Directed Navigation <strong>for</strong> Animated <strong>Characters</strong><br />
Using Real-Time Path Planning and Control. In CAPTECH’98, pages<br />
171–186. Springer, 1998.<br />
[45] Prem Kalra, Nadia Magnenat-Thalmann, Laurent Moccozet, Gael Sannier,<br />
Amaury Aubel, and Daniel Thalmann. Real-Time <strong>Animation</strong> of Realistic<br />
Virtual Humans. In IEEE Journal of <strong>Computer</strong> Graphics and Applications,<br />
volume 18(5), pages 42–56, 1998.<br />
[46] Szilárd Kiss. Web Based VRML Modeling. In IV2001 In<strong>for</strong>mation Visualisation,<br />
pages 612–617, London, England, 2001.<br />
[47] Szilárd Kiss. <strong>3D</strong> Character Modeling in Virtual Reality. In IV2002 In<strong>for</strong>mation<br />
Visualisation, pages 541–548, London, England, 2002.<br />
[48] Evangelos Kokkevis, Dimitri Metaxas, and Norman I. Badler. Autonomous<br />
<strong>Animation</strong> and Control of Four-Legged Animals. In Graphics Interface ’95,<br />
1995.<br />
[49] Evangelos Kokkevis, Dimitri Metaxas, and Norman I. Badler. User-<br />
Controlled Physics-Based <strong>Animation</strong> <strong>for</strong> <strong>Articulated</strong> Figures. In <strong>Computer</strong><br />
<strong>Animation</strong> ’96, 1996.<br />
[50] Alexis Lamouret and Marie-Paule Gascuel. Scripting Interactive<br />
Physically-Based Motions with Relative Paths and Synchronization. In<br />
Graphics Interface ’95, 1995.<br />
[51] John Lasseter. Principles of Traditional <strong>Animation</strong> Applied to <strong>3D</strong> <strong>Computer</strong><br />
<strong>Animation</strong>. In Proc. Siggraph’87, pages 35–44. ACM Press, 1987.<br />
[52] Joseph Laszlo, Michiel van de Panne, and Eugene Fiume. Limit Cycle<br />
Control and its Application to the <strong>Animation</strong> of Balancing and Walking.<br />
In <strong>Computer</strong> Graphics (Siggraph ’96 Proceedings), pages 155–162, 1996.<br />
21
[53] Joseph Laszlo, Michiel van de Panne, and Eugene Fiume. Interactive Control<br />
<strong>for</strong> Physically-Based <strong>Animation</strong>. In <strong>Computer</strong> Graphics (Siggraph ’00<br />
Proceedings), pages 201–208, 2000.<br />
[54] Jehee Lee and Sung Yong Shin. A Hierarchical approach to Interactive<br />
Motion Editing <strong>for</strong> Human-like Figures. In <strong>Computer</strong> Graphics (Siggraph<br />
’99 Proceedings), pages 39–48, 1999.<br />
[55] Yuencheng Lee, Demetri Terzopoulos, and Keith Waters. Realistic Modeling<br />
<strong>for</strong> Facial <strong>Animation</strong>. In <strong>Computer</strong> Graphics (Siggraph ’95 Proceedings),<br />
pages 55–62, 1995.<br />
[56] Ruqian Lu and Songmao Zhang. Automatic generation of <strong>Computer</strong> <strong>Animation</strong>:<br />
Using AI <strong>for</strong> Movie <strong>Animation</strong>, volume 2160, LNAI. Springer.<br />
[57] M. Grosso and R. Quach and E. Otani and J. Zhao and S. Wei and P.-H.<br />
Ho and J. Lu and N. I. Badler. Anthropometry <strong>for</strong> <strong>Computer</strong> Graphics<br />
Human Figures. Technical Report MS-CIS-89-71, 1989.<br />
[58] George Maestri. Digital Character <strong>Animation</strong> 2, volume 1 - Essential Techniques.<br />
New Riders Publishing, 1999.<br />
[59] George Maestri. Digital Character <strong>Animation</strong> 2, volume 2 - Advanced<br />
Techniques. New Riders Publishing, 2001.<br />
[60] D. McNeill. Hand and Mind: What Gestures Reveal About Thought. University<br />
of Chicago, 1992. Characterizes gestures.<br />
[61] Stuart Mealing. The art and science of computer animation. Ox<strong>for</strong>d, 1992.<br />
[62] Nadia Magnenat-Thalmann and Daniel Thalmann. <strong>Computer</strong> <strong>Animation</strong>.<br />
In Handbook of <strong>Computer</strong> Science, pages 1300–1318. CRC Press, 1996.<br />
[63] Hansrudi Noser and Daniel Thalmann. The <strong>Animation</strong> of Autonomous<br />
Actors Based on Production Rules. In Proc. <strong>Computer</strong> <strong>Animation</strong> ’96,<br />
pages 47–57. IEEE <strong>Computer</strong> Society Press, 1996.<br />
[64] Ken Perlin and Athomas Goldberg. Improv: A System <strong>for</strong> Scripting Interactive<br />
Actors in Virtual Worlds. In <strong>Computer</strong> Graphics (Siggraph ’96<br />
Proceedings), pages 205–216, 1996.<br />
[65] Charles Rose, Bobby Bodenheimer, and Michael F. Cohen. Verbs and<br />
Adverbs: Multidimensional Motion Interpolation using Radial Basis Functions.<br />
In IEEE Journal of <strong>Computer</strong> Graphics and Applications, 1998.<br />
[66] Charles Rose, Brian Guenter, Bobby Bodenheimer, and Michael F. Cohen.<br />
Efficient Generation of Motion Transitions using Spacetime Constraints.<br />
In <strong>Computer</strong> Graphics (Siggraph ’96 Proceedings), pages 147–154, 1996.<br />
[67] Zsófia Ruttkay, Paul ten Hagen, Han Noot, and Mark Savenije. Facial<br />
<strong>Animation</strong> by Synthesis of Captured and Artificial Data. In CAPTECH’98,<br />
pages 268–271, 1998.<br />
[68] G. Sannier, S. Balcisoy, N. Magnenat-Thalmann, and D. Thalmann. An<br />
Interactive Interface <strong>for</strong> Directing Virtual Humans. In Proc. ISCIS’98. IOS<br />
Press, 1998.<br />
22
[69] G. Sannier, S. Balcisoy, N. Magnenat-Thalmann, and D. Thalmann. VHD:<br />
A System <strong>for</strong> Directing Real-Time Virtual Actors. In The Visual <strong>Computer</strong>,<br />
volume 15, No 7/8, pages 320–329. Springer, 1999.<br />
[70] Karl Sims. Evolving Virtual Creatures. In <strong>Computer</strong> Graphics (Siggraph<br />
’94 Proceedings), pages 15–22. ACM Press, 1994.<br />
[71] A. James Stewart and James F. Cremer. Beyond Keyframing: An Algorithmic<br />
Approach to <strong>Animation</strong>. In Graphics Interface ’92, 1992.<br />
[72] Harold C. Sun and Dimitris N. Metaxas. Automatic Gait Generation. In<br />
<strong>Computer</strong> Graphics (Siggraph ’01 Proceedings), pages 261–270, 2001.<br />
[73] Wen Tang, Marc Cavazza, Dale Mountain, and Rae Earnshaw. Real-Time<br />
Inverse Kinematics through Constrained Dynamics. In CAPTECH’98,<br />
pages 159–170. Springer, 1998.<br />
[74] Daniel Thalmann. A New Generation of Synthetic Actors: the Real-time<br />
and Interactive Perceptive Actors. In Proc. Pacific Graphics 96, 1996.<br />
[75] Daniel Thalmann. Physical, Behavioral, and Sensor-Based <strong>Animation</strong>. In<br />
Graphicon’96, 1996.<br />
[76] Daniel Thalmann. The Foundations to Build a Virtual Human Society.<br />
In Intelligent Virtual Agents Workshop (IVA2001), volume 2190, LNAI.<br />
Springer, 2001.<br />
[77] Daniel Thalmann, Hansrudi Noser, and Zhiyong Huang. Autonomous Virtual<br />
Actors based on Virtual Sensors. In Creating Personalities <strong>for</strong> Synthetic<br />
Actors: Towards Autonomous Personality Agents, volume 1195, LNAI,<br />
pages 25–42. Springer, 1997.<br />
[78] Xiaoyuan Tu. Artificial Animals <strong>for</strong> <strong>Computer</strong> <strong>Animation</strong>: Biomechanics,<br />
Locomotion, Perception, and Behavior, volume 1635 of LNCS. Springer,<br />
1999.<br />
[79] Munetoshi Unuma, Ken Anjyo, and Ryozo Takeuchi. Fourier Principles<br />
<strong>for</strong> Emotion-based Human Figure <strong>Animation</strong>. In <strong>Computer</strong> Graphics (Siggraph<br />
’95 Proceedings), pages 91–96, 1995.<br />
[80] Keith Waters. A muscle model <strong>for</strong> animation three-dimensional facial expression.<br />
In Proc. Siggraph ’87, pages 17–24. ACM Press, 1987.<br />
[81] Michael W. Whittle. Musculo-Skeletal Applications of Three-Dimensional<br />
Analysis. In Paul Allard and Ian A.F. Stokes, editors, Three-Dimensional<br />
Analysis of Human Movement. Human Kinetics Publishers, 1995.<br />
[82] Andrew Witkin and Zoran Popović. Motion Warping. In <strong>Computer</strong> Graphics<br />
(Siggraph ’95 Proceedings), pages 105–108, 1995.<br />
[83] Wayne L. Wooten and Jessica K. Hodgins. <strong>Animation</strong> of Human Diving.<br />
In Graphics Interface ’95, 1995.<br />
23