The uncanny valley is a hypothesis, first described in psychiatry more than a century ago, which states that as a character gets close to looking truly human it begins to seem creepy and unsettles human observers. It can be observed across the fields of robotics, medicine and computer animation, and the modern term was coined by Japanese robotics professor Masahiro Mori in 1970.
The film industry first became aware of the uncanny valley 25 years ago, in 1988. This was the year Pixar’s short film “Tin Toy” presented the world’s first computer animated baby, named Billy.
Pixar at the time was still a hardware company working on the Pixar Image Computer, intended for high-end scientific visualization, and the animated shorts were simply produced as a way to show off its capabilities.
This changed with “Tin Toy”. Initially made as a test for RenderMan, it became the film for which Pixar developed the world’s first dedicated animation program, Menv, short for “modeling environment”, which separated the animation process into modeling, animation and lighting.
The story, seen from the perspective of a toy being chased by baby Billy, was inspired by John Lasseter’s visit to The Tin Toy Museum in Yokohama, Japan, as well as his observations of a friend’s baby. It proved challenging to balance the real and the cartoony when depicting an actual human being.
With a budget of 300,000 dollars, the five-minute short earned Pixar its first Academy Award for Best Animated Short Film for 1988 and also sparked the debate about the uncanny valley in computer animation. Opinions were mixed. At SIGGRAPH 1988 it was received positively by scientists and engineers for its technology and innovation. But while many were amazed by the use of technology, some saw it as the most frightening and disturbing piece of animation ever created. These negative audience reactions to Billy caused the film industry to take the uncanny valley concept seriously, and may explain why attempts at realistic digital humans still rarely reach the big screen today.
“Tin Toy” gained attention from Disney, which led to the agreement to make “Toy Story”, the world’s first computer animated feature, released in 1995. By winning the Best Animated Short Academy Award it was also important in establishing computer animation as a legitimate medium beyond SIGGRAPH and animation film festivals. Following this, Pixar sold off its hardware business and became the animation studio we know today. The Pixar Image Computer was too expensive for mass deployment and never sold more than 300 machines. Several of these were bought by Disney and used for the first digital ink and paint system, known as “CAPS”, which came into use during the Disney renaissance in the 1990s and received an Academy Award of its own in 1992.
In the 1990s the mainstream animation revival took place. There was exponential growth in the use of CGI to enhance both animated sequences and live-action special effects, and following the success of “Toy Story” more computer animated features started being produced, although the characters were anthropomorphic animals and highly stylized cartoon humans.
Flash forward a few years to 2001, when the first photorealistic computer animated feature film arrived. This was the Japanese-American “Final Fantasy: The Spirits Within”, which at 137 million dollars still holds the record as the most expensive video-game inspired film ever made.
The story centered on a heroine, the scientist Aki Ross, trying to free a post-apocalyptic Earth from a mysterious alien race known as the Phantoms. Created by Hironobu Sakaguchi, the video game designer behind the series and director of the film, she was supposed to become the first photorealistic computer-generated actress, appearing in other hyper-realistic CG movies after the same kind of changes to clothes, hair and make-up that a live actress would need. The idea was that it would be a waste of time and resources to start from scratch on a character design that had been so meticulously made. The world’s most famous character model was called an “It girl” and voted one of the sexiest women ever by Maxim, the first fictional character ever to make the list.
Aki was composed of 400,000 polygons and 60,000 hairs, all fully animated and rendered, which required a render farm of 960 workstations. The film’s 141,964 frames took 90 minutes each to render, and the production as a whole took four years and 200 members of staff to complete. By the end, the production company Square Pictures was sitting on approximately 15 terabytes (15,000 GB) of artwork and the team had put in a combined 120 years of work.
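To put those render figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and assumes, purely for illustration, that every frame is rendered exactly once and that the work divides perfectly across the farm. Neither assumption is likely to hold for a real production, where frames are rendered in many layers and re-rendered many times.

```python
# Figures quoted in the article.
FRAMES = 141_964
MINUTES_PER_FRAME = 90
WORKSTATIONS = 960


def total_render_hours(frames, minutes_per_frame):
    """Machine-hours needed to render every frame once."""
    return frames * minutes_per_frame / 60


def wall_clock_days(frames, minutes_per_frame, machines):
    """Days of wall-clock time if the work splits perfectly across machines.

    Perfect parallelism is an illustrative assumption, not a claim about
    how the actual render farm was scheduled.
    """
    return total_render_hours(frames, minutes_per_frame) / machines / 24


if __name__ == "__main__":
    hours = total_render_hours(FRAMES, MINUTES_PER_FRAME)
    days = wall_clock_days(FRAMES, MINUTES_PER_FRAME, WORKSTATIONS)
    print(f"Single machine: {hours:,.0f} hours ({hours / 24 / 365:.1f} years)")
    print(f"{WORKSTATIONS} workstations: {days:.1f} days for one full pass")
```

Under these assumptions a single pass comes to roughly 213,000 machine-hours, about 24 years on one machine or a little over nine days on the full farm, which suggests that most of the four-year schedule went into everything other than one final render pass.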
“Final Fantasy: The Spirits Within” was the first mainstream feature to be made entirely with the motion capture technique, also known as performance capture or mocap for short, which is the process of recording live actors and using the data to animate digital characters. Unfortunately the lifelike features of the heroine unsettled western audiences, and the film became a box office bomb, failing to bring in anything near its astronomical production cost. Just like that, the uncanny valley had killed the career of the world’s first digital film star, although the film remains a landmark in the history of computer animation.
In the 2000s, following Final Fantasy, the use of motion capture on feature films became more common. It was already widely used in video games, having been pioneered on the Atari Jaguar in 1995. The next attempt at realistic, life-like human characters came in 2004 with “The Polar Express”, which was the first feature film to be shot entirely on a motion capture stage. The story was based on a children’s book about a boy who travels to the North Pole on a magical train after he starts to doubt Santa’s existence.
Three motion capture stages were built for the film, and with a budget of 165 million dollars it was more expensive than the average Pixar animation at one million dollars a minute. It was an experiment in driving the technology forward, in which director Robert Zemeckis worked together with Sony Imageworks, who developed the motion capture systems as well as other effects.
Although the film contained many children, surprisingly there were no child actors in it; instead adults performed as children. The illusion was created by scaling props and sets to make the adults appear to be children in relation to their environments. Every set had three scale versions: the normal sized set was standard scale, sets with children were at 120 percent, and sets with elves, which were just two feet tall, were at 200 percent. Tom Hanks played five characters in the film, and to keep eye lines correct for his child character a special rig called a “snorkel” was devised. It was a backpack with a three-foot rod with a ball on top of it, worn by the adult playing opposite him whenever he was playing a child.
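The scaling trick is simple proportion, and a small Python sketch makes it concrete. The set percentages are the ones quoted above; the function names, the example prop and the 180 cm actor are hypothetical, chosen only for illustration.

```python
# Set scales quoted in the article, as percentages of real-world size.
SET_SCALE_PERCENT = {"adult": 100, "child": 120, "elf": 200}


def scaled_prop_size(real_size, character):
    """Size a prop must be built at so an adult actor reads as `character`."""
    return real_size * SET_SCALE_PERCENT[character] / 100


def apparent_actor_height(actor_height, character):
    """How tall the actor appears relative to the enlarged set."""
    return actor_height * 100 / SET_SCALE_PERCENT[character]


if __name__ == "__main__":
    table_cm = 75   # hypothetical real-world table height
    actor_cm = 180  # hypothetical adult actor height
    for character in SET_SCALE_PERCENT:
        print(
            character,
            scaled_prop_size(table_cm, character),
            apparent_actor_height(actor_cm, character),
        )
```

On a 120 percent set the hypothetical 180 cm actor reads as 150 cm, and on a 200 percent set as 90 cm. That is still taller than a two-foot elf, so the set scale alone presumably did not account for the final proportions of the digital characters.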
The film was unique in its highly detailed facial motion capture. The markers used were both smaller and more numerous than on other films: a standard system at the time used 30-60 markers, while Imageworks used 152 in total. Their system also allowed facial and body data to be captured at the same time and at a much greater distance from the camera. This allowed for much more flexibility and realism, as actors who had previously been sitting two feet away from the camera were now moving around freely at distances of up to 26 feet. Custom-tailored Japanese bodysuits made from velcro carried 72 larger body markers, and this data was much easier to capture and apply to digital characters than the facial data, where the small marker size was more susceptible to noise and to occlusion by set props and other actors.
Motion capture data was applied to the digital characters, which had been modeled in Maya. It worked as a foundation, with the animators filling in the parts where data was missing or making specific edits.
In applying the performance data from actors to digital characters, a few problems occurred which caused a lot of negative audience feedback and placed the film in the uncanny valley category, although it didn’t flop at the box office in the way Final Fantasy had three years earlier.
The first problem was scaling between the digital child characters and the adults playing them. The dynamic motion of the actors’ faces sometimes looked strange when applied to the child characters, and as the motion was scaled down it was difficult to preserve the detail that would help with the realism. There were also some issues with the motion capture data where spines and shoulders came out very stiff and had to be loosened up by animators.
MotionBuilder, which was relatively new at the time, played an important role in dealing with the scaling issues, and MotionBuilder rigs were created that redistributed motion to make up for the lack of clavicle and spine rotation in the body data.
Each character had three control rigs. The mocap control rig contained the source data, which was never compromised, and the animation control rig held the keyframed performance animation. The animator could blend between these two. In addition, the mocap offset rig allowed the animator to add animation on top of the mocap. The resultant motion was a composite of all three sources: the mocap, mocap offset and animation control rigs.
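One way to picture how three such layers might combine is the minimal Python sketch below. It is not the Imageworks or MotionBuilder implementation, whose details are not given here. It assumes a simple linear blend between the mocap and keyframed values for each channel, with the offset added on top, and it treats every channel as a plain number. The function name, channel names and values are hypothetical, and real rigs would blend rotations more carefully.

```python
def compose_pose(mocap, animation, offset, blend):
    """Combine three rig layers into one pose for a single frame.

    mocap     -- captured channel values; read only, never modified
    animation -- keyframed values for some or all of the same channels
    offset    -- additive adjustments layered on top of the result
    blend     -- 0.0 uses pure mocap, 1.0 uses pure keyframed animation

    The linear blend is an illustrative assumption, not the studio's method.
    """
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must be between 0.0 and 1.0")
    pose = {}
    for channel, captured in mocap.items():
        # Channels without keyframes fall back to the captured value.
        keyed = animation.get(channel, captured)
        base = (1.0 - blend) * captured + blend * keyed
        pose[channel] = base + offset.get(channel, 0.0)
    return pose


if __name__ == "__main__":
    mocap = {"spine_rx": 10.0, "clavicle_rz": 2.0}  # hypothetical channels
    animation = {"spine_rx": 20.0}
    offset = {"spine_rx": 5.0}
    print(compose_pose(mocap, animation, offset, blend=0.25))
```

Because the function builds a new dictionary and only reads from `mocap`, the source data stays untouched, which mirrors the point that the captured performance was never compromised.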
Finally, there were no markers on the tongues, eyelids or eyeballs of actors, so these important expressive elements had to be done by animators, and combining captured data with key-framed elements was bound to make the characters appear uncanny, even if the film itself was a technical achievement. It was also the first feature-length all-CG film to be created in stereoscopic 3D, as well as the first feature to be released simultaneously for 35 mm projection and IMAX 3D.
Although some critics called “The Polar Express” a failed experiment, director Robert Zemeckis continued to work with motion capture technology on “Beowulf” in 2007, followed by a two-film deal with Disney that ended with “Mars Needs Moms” in 2011, a massive commercial failure and the worst box-office reception ever for a Disney-branded film.
The most recent film in this group of the uncanny was “The Adventures of Tintin” in 2011, the first fully animated feature done at Weta Digital, which drew on their experience in performance capture on “Avatar” in 2009. It is also the only non-Pixar film to win a Golden Globe for Best Animated Feature Film.
While audiences remain skeptical of realistic computer animated humans, large film studios are continuing to develop this technology, which has huge potential: Tintin was shot in just 32 days and The Polar Express in 44, something that would have taken around eight months to shoot in live action and required traveling to all the locations. And if characters like Aki Ross can be recycled indefinitely once created, anything becomes possible, especially at a time when rendering engines are taking huge steps forward.
We’ve come a long way in the past 25 years of computer animation since Pixar created Billy with a cement-style diaper, but for now we’ve still got a bit left to go to get past the uncanny valley.