Transformed social interaction
Overview of TSI
Over time, our modes of remote communication have evolved from written letters to telephones, email, internet chat rooms, and videoconferences. Collaborative virtual environments (CVEs) promise to further change the nature of remote interaction. CVEs track the verbal and nonverbal signals of multiple interactants and render those signals onto avatars: three-dimensional digital representations of people in a shared digital space. Unlike participants in telephone conversations and videoconferences, interactants in virtual reality can systematically filter the physical appearance and behavioral actions of their avatars in the eyes of their conversational partners, amplifying or suppressing features and nonverbal signals in real time for strategic purposes. Because CVEs render the world separately for each user, it is possible to break the normal physics of social interaction and render the interaction differently for each participant at virtually the same time. In other words, the information relevant to each CVE participant is transmitted to the other participants as a stream that summarizes his or her current movements and actions, and that stream can be transformed on the fly for strategic purposes. This strategic, real-time transformation is called transformed social interaction (TSI).
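To make the mechanism concrete, here is a minimal sketch in Python of the per-recipient rendering pipeline described above. All of the names (AvatarState, Transform, render_for_all) and fields are illustrative assumptions, not the API of any actual CVE system; the point is only the architecture: a sender's tracked state passes through a per-recipient transform before it reaches each viewer, so different viewers can be shown different states.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class AvatarState:
    """One frame of tracked data for a single participant (hypothetical)."""
    sender: str
    head_yaw: float   # degrees; where the sender is actually looking
    smiling: bool     # one tracked facial-expression cue

# A transform maps (true state, recipient) -> the state that recipient sees.
Transform = Callable[[AvatarState, str], AvatarState]

def veridical(state: AvatarState, recipient: str) -> AvatarState:
    """No transformation: the recipient sees the sender's true state."""
    return state

def always_smiling(state: AvatarState, recipient: str) -> AvatarState:
    """A strategic transform: this recipient always sees a smile."""
    return replace(state, smiling=True)

def render_for_all(state: AvatarState,
                   recipients: Tuple[str, ...],
                   transforms: Dict[str, Transform]) -> Dict[str, AvatarState]:
    """Route one sender's frame to every other participant, applying each
    recipient's transform; recipients without one see the true state."""
    return {r: transforms.get(r, veridical)(state, r)
            for r in recipients if r != state.sender}
```

Because each recipient's copy of the world is rendered locally, nothing in this scheme requires the transformed states to agree with one another, which is precisely what the dimensions of TSI below exploit.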
The first dimension of TSI is self-representation: the strategic decoupling of the rendered appearance or behavior of an avatar from the actual appearance or behavior of the human driving it. Because CVE interactants can modulate the flow of information, thereby transforming the way specific avatars are rendered to others, rendered states can deviate from an interactant's actual state. Consider distance learning: some students may learn better from teachers who smile, while others may learn better from teachers with serious expressions. In a CVE, the teacher can render herself differently to each type of student, tailoring her facial expression to each one in order to maximize attention and learning.
The second dimension is transforming social-sensory abilities. These transformations complement human perceptual abilities. One example is rendering 'invisible consultants': either algorithms or human-driven avatars who are visible only to particular participants in a CVE. These consultants can provide real-time summary information about the attention and movements of other interactants (information the CVE collects automatically) or can scrutinize the actions of the user herself. For example, teachers using distance-learning applications can rely on automatic registers to ensure they are spreading their attention equally across students.
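A sketch of how such a register might work follows; the class name and imbalance threshold are assumptions for illustration, not an existing system. The CVE only needs to accumulate per-student gaze time and flag students who are falling far behind the best-attended student.

```python
from collections import defaultdict

class AttentionRegister:
    """Accumulates how much of the teacher's gaze each student receives
    and flags imbalances (illustrative sketch, not an existing API)."""

    def __init__(self, imbalance_ratio: float = 1.5):
        self.seconds = defaultdict(float)    # student id -> gaze time so far
        self.imbalance_ratio = imbalance_ratio

    def record(self, gaze_target: str, dt: float) -> None:
        """Call once per frame with the student the teacher is looking at."""
        self.seconds[gaze_target] += dt

    def neglected_students(self) -> list:
        """Students receiving far less gaze than the best-attended student."""
        if not self.seconds:
            return []
        most = max(self.seconds.values())
        return [s for s, t in self.seconds.items()
                if most > self.imbalance_ratio * t]
```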
The third dimension is transforming the social environment. The contextual setup of a virtual meeting room can be configured optimally for each participant. For example, while giving a speech, a speaker can replace the distracting gestures of audience members with gestures that make it easier for the speaker to concentrate. Furthermore, by altering the flow of rendered time in a CVE, users can strategically "pause," "rewind," and "fast forward" the actions of other interactants during a conversation in an attempt to increase comprehension and productivity.
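One plausible way to implement rewindable rendered time, sketched below under the simplifying assumption of a fixed frame rate (the class and its methods are hypothetical), is to buffer each remote participant's frames and decouple the local playback cursor from the live stream.

```python
class TimeShiftBuffer:
    """Buffers a remote participant's frames so the local user can pause,
    rewind, or fast-forward that participant's rendered actions."""

    def __init__(self):
        self.frames = []   # every frame received so far, in arrival order
        self.cursor = 0.0  # local playback position (may lag the live end)
        self.rate = 1.0    # 0.0 = paused, 1.0 = live speed, 2.0 = 2x

    def receive(self, frame) -> None:
        """Append the newest frame arriving from the network."""
        self.frames.append(frame)

    def pause(self) -> None:
        self.rate = 0.0

    def fast_forward(self, rate: float = 2.0) -> None:
        self.rate = rate   # catch back up to the live stream

    def rewind(self, n_frames: int) -> None:
        self.cursor = max(0.0, self.cursor - n_frames)
        self.rate = 1.0

    def next_frame(self):
        """Return the frame at the local cursor, then advance by the rate."""
        if not self.frames:
            return None
        frame = self.frames[int(self.cursor)]
        self.cursor = min(len(self.frames) - 1, self.cursor + self.rate)
        return frame
```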
TSI is relevant not just to CVEs but to any form of communication media that uses digital representations of people: cell phones, videoconferences, textual chat rooms, online videogames, and many other forms of digital media. Currently, over 60 million people use internet chat each day. In Korea, an estimated one-twentieth of the general population spends a significant amount of time playing online video games while interacting with digital representations of other people, and five million people per day interact via avatars in online games, engaging in meaningful, sustained social interaction. Cell phones are ubiquitous and now include digital photo and video capabilities, and some companies even claim to offer limited face tracking and rendering on cell phone avatars. In any communication medium in which there is a digital representation of another person, TSI is not only possible but inevitable. TSI has the potential to drastically change the nature of distance education, communication practices, political campaigning, and advertising. Consequently, it is crucial to understand both the effectiveness of these transformations and people's ability to detect them.
Examples of TSI
Facial Similarity
Human beings are biologically driven to prefer faces similar to their own. Facial cues convey more than a person's gender, race, or age; they also evoke strong affective responses. Over the years, researchers have found that similarity between two people fosters altruism and trust. Biological explanations for this effect argue that phenotype matching (implicit recognition of subtle physical cues) is a mechanism organisms use to identify genetically related kin. Indeed, the brain processes facial images morphed with the self in different areas than images morphed with familiar others. Social explanations argue that people use physical similarity as a proxy for compatible interests and values.
In the context of political campaigns, a candidate's face could by itself influence voters' impressions of the candidate, especially when substantive information is unavailable. Simply put, voters may prefer candidates whose faces resemble their own. It is inevitable that political candidates, advertisers, educators, and others who seek social influence will resort to methods of dynamically transforming appearance, especially in state and local elections, where voters possess very little information about the candidates on the ballot. In such 'low-information' races, voters fall back on visual, affective cues as the dominant basis for electoral choice.
A study carried out by the Virtual Human Interaction Lab at Stanford University demonstrated that the outcome of the 2004 presidential election could be manipulated by digitally altering the pictures of Kerry and Bush. Results indicated that in a low-information context, a candidate could increase electoral support by as much as 20 percentage points simply by incorporating elements of individual voters' faces into his or her campaign photograph. Other studies have shown that this preference also holds for lesser-known, generic faces.
To test the effect of facial identity capture on vote choice, digital photographs of a national random sample of voting-age citizens were passively acquired. One week before the 2004 presidential election, participants completed a survey of their attitudes toward George Bush and John Kerry while viewing photographs of both candidates side by side (see Figure 1). A software application called Magic Morph was used to digitally blend two images. For a random one-third of the subjects, their own faces were morphed with Kerry while unfamiliar faces were morphed with Bush; for another one-third, their own faces were morphed with Bush while unfamiliar faces were morphed with Kerry. The remaining one-third viewed unmorphed pictures of the candidates.
Figure 1: Two subjects (Panels A and B), the morph of Subject 1 and Bush (Panel C), the morph of Subject 2 and Kerry (Panel D), and the vote intention score by condition (Panel E). The difference in vote intention for Bush and Kerry by condition was statistically significant (p < .05).
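The article does not describe Magic Morph's algorithm or the blend ratio used. A full morph warps corresponding facial landmarks before blending, but the final cross-dissolve step can be approximated in a few lines with the Pillow library; the file names and the 0.4 weight below are placeholders, not the study's actual parameters.

```python
from PIL import Image

def blend_faces(path_a: str, path_b: str, weight_b: float) -> Image.Image:
    """Cross-dissolve two pre-aligned face images.

    weight_b is the proportion of image B in the output (0.0 to 1.0);
    a true morph would also warp facial landmarks into correspondence.
    """
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB").resize(a.size)
    return Image.blend(a, b, weight_b)

# Hypothetical usage: a candidate photo carrying a minority share of a
# voter's face, subtle enough to go undetected.
# blend_faces("candidate.jpg", "voter.jpg", 0.4).save("morph.jpg")
```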
Post-experiment interviews revealed that not a single participant detected that his or her image had been morphed into the photograph of the candidate. Participants were more likely to vote for the candidate morphed with their own face than for the candidate morphed with an unfamiliar face. The effects of facial identity capture on candidate support were concentrated among weak partisans and independents; for 'card-carrying' members of the Democratic and Republican parties, the manipulation made little difference.
Within this national random sample, facial identity capture was sufficient to shift candidate support by a double-digit margin, enough to change the outcome of the presidential election. It is well documented that candidates' party affiliation, their positions on major issues, their personal traits, and the state of the economy all affect vote choice in presidential elections. These results suggest that implicit facial similarity should be added to that list.
Behavioral Mimicry
Non-Zero-Sum Gaze
Non-zero-sum gaze (NZSG) is directing mutual gaze at more than one interactant in a CVE at once. Previous research has demonstrated that eye gaze is an extremely powerful tool for communicators seeking to garner attention, persuade, and instruct. People who use mutual gaze are better able to engage an audience and to accomplish a number of conversational goals. In face-to-face interaction, gaze is zero-sum: if Person A looks directly at Person B for 65 percent of the time, Person A cannot look directly at Person C for more than 35 percent of the time. Interaction among avatars in CVEs is not bound by this constraint. Because the virtual environment and the other avatars in it are rendered individually and locally for each interactant, Person A can have his avatar rendered differently to each other interactant and thereby appear to maintain mutual gaze with both B and C for a majority of the conversation. Three separate projects (Bailenson, Beall, Blascovich, Loomis, & Turk, 2005; Bailenson, Beall, Loomis, Blascovich, & Turk, 2004; Beall, Bailenson, Loomis, Blascovich, & Rex, 2003) have utilized a paradigm in which a single presenter read a passage to two listeners inside a CVE. All three interactants were of the same gender, wore stereoscopic head-mounted displays, and had their head and mouth movements tracked and rendered. The presenter's avatar either looked directly at each of the two listeners simultaneously for 100 percent of the time (augmented gaze) or exhibited normal, zero-sum gaze. Results across these studies demonstrated three important findings: (1) participants never detected that the augmented gaze was not in fact backed by real gaze, (2) participants returned gaze to the presenter more often in the augmented condition than in the normal condition, and (3) participants (females to a greater extent than males) were more persuaded by a presenter implementing augmented gaze than by one implementing normal gaze.
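Augmented gaze is a direct application of the per-recipient transform sketched in the overview. The fragment below is again illustrative (the geometry convention and field names are assumptions): it overrides the presenter's rendered head yaw so that each listener's local copy of the presenter faces that listener.

```python
import math
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class AvatarState:
    sender: str
    position: Tuple[float, float]  # (x, z) on the floor plane
    head_yaw: float                # degrees; the tracked head direction

def yaw_toward(src: Tuple[float, float], dst: Tuple[float, float]) -> float:
    """Yaw (degrees, measured from the +z axis) pointing from src to dst."""
    dx, dz = dst[0] - src[0], dst[1] - src[1]
    return math.degrees(math.atan2(dx, dz))

def augmented_gaze(state: AvatarState,
                   recipient_pos: Tuple[float, float]) -> AvatarState:
    """Render the presenter as looking straight at whichever listener is
    viewing, for every listener at once, which no real head could do."""
    return replace(state, head_yaw=yaw_toward(state.position, recipient_pos))
```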
Non-Zero Sum Proxemics
The Proteus Effect
Augmented Social Perception
Academic Articles about Transformed Social Interaction
Bailenson, J.N., Yee, N., Patel, K., & Beall, A.C. (2007, in press). Detecting Digital Chameleons. Computers in Human Behavior.
Yee, N. & Bailenson, J.N. (2007, in press). The Proteus Effect: Self Transformations in Virtual Reality. Human Communication Research.
Yee, N., & Bailenson, J.N. (2006). Walk a Mile in Digital Shoes: The Impact of Embodied Perspective-Taking on the Reduction of Negative Stereotyping in Immersive Virtual Environments. Proceedings of PRESENCE 2006: The 9th Annual International Workshop on Presence, August 24-26, Cleveland, Ohio, USA.
Bailenson, J.N. (2006). Transformed Social Interaction in Collaborative Virtual Environments. In Messaris, P., & Humphreys, L. (Eds.), Digital Media: Transformations in Human Communication (pp. 255-264). New York: Peter Lang.
Bailenson, J.N., Yee, N., Blascovich, J., & Guadagno, R.E. (2006, in press). Transformed Social Interaction in Mediated Interpersonal Communication. In Konijn, E., Tanis, M., Utz, S. & Linden, A. (Eds.), Mediated Interpersonal Communication, Lawrence Erlbaum Associates.
Bailenson, J.N., Garland, P., Iyengar, S., & Yee, N. (2006). Transformed Facial Similarity as a Political Cue: A Preliminary Investigation. Political Psychology, 27, 373-386.
Bailenson, J.N., & Beall, A.C. (2006). Transformed Social Interaction: Exploring the Digital Plasticity of Avatars. In Schroeder, R., & Axelsson, A. (Eds.), Avatars at Work and Play: Collaboration and Interaction in Shared Virtual Environments (pp. 1-16). Springer-Verlag.
Bailenson, J.N., Beall, A.C., Blascovich, J., Loomis, J., & Turk, M. (2005). Transformed Social Interaction, Augmented Gaze, and Social Influence in Immersive Virtual Environments. Human Communication Research, 31, 511-537.
Bailenson, J.N., & Yee, N. (2005). Digital Chameleons: Automatic Assimilation of Nonverbal Gestures in Immersive Virtual Environments. Psychological Science, 16, 814-819.
Bailenson, J.N., Beall, A.C., Loomis, J., Blascovich, J., & Turk, M. (2004). Transformed Social Interaction: Decoupling Representation from Behavior and Form in Collaborative Virtual Environments. PRESENCE: Teleoperators and Virtual Environments, 13(4), 428-441.
Beall, A.C., Bailenson, J.N., Loomis, J., Blascovich, J., & Rex, C. (2003). Non-Zero-Sum Mutual Gaze in Collaborative Virtual Environments. Proceedings of HCI International, 2003, Crete.
Bailenson, J.N., Beall, A.C., Blascovich, J., Weisbuch, M., & Raimmundo, R. (2001). Intelligent Agents Who Wear Your Face: Users' Reactions to the Virtual Self. Lecture Notes in Artificial Intelligence, 2190, 86-99.