animation streams identified as Charisma, which can be applied to animation systems used with games and virtual characters on the web.

E. Physics-Based Muscle Modelling

Many attempts have been made at physics-based muscle modelling to model anatomical facial behaviour. These fall into three classes: mass-spring systems, vector representations, and layered spring meshes. Mass-spring approaches propagate muscle forces through an elastic spring mesh that models skin deformation [28]. The vector method deforms a facial mesh using motion fields in delineated regions of influence [29]. The layered spring mesh extends a mass-spring structure into three connected mesh layers [30].

The limitation of blendshapes is that they span only a linear subspace. Recently, researchers have tended to use physical simulation to achieve more expressive, non-linear facial animation. One of the first approaches to physics-based facial animation was suggested by Sifakis et al. [31], who construct a detailed face rig comprising a complete, anatomically accurate muscle structure, generated manually from the actor’s medical data. Constructing the muscle structure for an actor is a time-consuming process. Cong et al. [32] presented an automatic method to transfer an anatomy template to target input faces. Ichim et al. [33] fit a template model of muscles, bones and flesh to face scans. This approach solves for the muscle activation parameters that best fit the input scans through forward simulation, and generates an actor-specific physical face mesh for animation. Ma et al. [34] use a mass-spring system to construct a blendshape model that incorporates physical interaction. Kozlov et al. [35] concentrate on producing expression-specific physical effects; the drawback, however, is that spatially varying material parameters must be painted and set manually for each expression.

Fig. 3: The facial feature points defined in the MPEG-4 standard [36].

Fig. 4: Example of physics-based facial animation [31].

F. Performance-Driven Facial Animation

Performance capture is a method that uses motion capture technology to represent the performance of the character. In conventional motion capture, the face and the body are recorded at different times and then blended together. In performance capture, the face and the body are captured at the same time to describe the entire performance of the performer. The Polar Express [37] was the first film to successfully use facial motion capture for an entire computer graphics (CG) movie, as shown in Fig. 5.

Fig. 5: Example of performance-driven facial capture utilising markers, used in The Polar Express [37].

Artist-driven manual key-frame animation may never capture the subtleties of a human face. The trend in facial animation has moved towards using the human face itself as the driver and input device. Extracting information from an actual performance of facial movements is natural, easy, and fast, which led to the concept of performance-driven facial animation. A ‘performance’ can be understood as a visual capture of an actor’s face talking and emoting, from which information is extracted and the motion is then retargeted onto a digital character. Williams [38] introduced the term performance-driven facial animation to the computer graphics community at SIGGRAPH 1990. Since then, many studies have extended the main concept. Hardware motion capture systems became common in the mid-90s and were used regularly in short demos [38].

The process of performance-driven facial animation can be divided into three stages: modelling, capture, and retargeting. The modelling stage concerns representing the human face so that it can be digitally stored, displayed and modified. The choice of representation affects the final animation, as the model inherently limits the expressive abilities of the face. Modelling approaches vary from mesh propagation-based approaches, where a single 3D mesh is deformed over the performance [39, 40] as shown in Fig. 6, to 2D and 3D statistical models based on PCA [41, 42], blendshape models

Table II

FAP groups in MPEG-4.

Group                        Number of FAPs
Viseme and expressions       2
Cheeks                       4
Eyebrow                      8
Tongue                       5
Lip, Chin and Jaw            26