In recent years, topics such as the “Metaverse” have gained enormous attention with the advancement of related technologies. The Metaverse is a multiuser environment merging physical reality with digital virtuality, based on the convergence of technologies such as virtual reality (VR) and augmented reality (AR) that enable multisensory interactions with virtual environments, digital objects, and people (Mystakidis, 2022). As a result of the various restrictions imposed during COVID-19, working, skill-learning, exercising, and socializing are likely to shift further towards VR and cyberspace in the future. The Metaverse is expected to become a realistic society in which the concepts of race, gender, and even physical disability are weakened (Duan et al., 2021). Virtual embodiment enables us to experience illusory body ownership towards bodies of different skin colors (Peck et al., 2013), genders (Lopez et al., 2019), and sizes (Tambone et al., 2021), bodies that are invisible or transparent (Guterstam et al., 2013, 2015; Martini et al., 2015; Kondo et al., 2018, 2020), non-human agents (Hoffmann et al., 2018), and animals (Krekhov et al., 2019). Here, we are interested in minimizing the physical disabilities of users in virtual worlds through specially designed virtual avatars (Inami et al., 2022). Human movement augmentation such as supernumerary limbs, body-part remapping, and exoskeletons expands physical abilities for both impaired and unimpaired individuals (Eden et al., 2022; Sasaki et al., 2017; Umezawa et al., 2021; Kondo et al., 2020). In social VR platforms, with appropriate software development and accurate body-movement tracking, it would be possible for multiple users to embody one avatar and engage in various tasks together. This may allow disabled VR users to receive support from others, increasing work efficiency as well as user satisfaction. The present study takes this social approach.
Although the concept of two individuals co-embodying one avatar in first-person view is rather new, it has been proposed in a few recent studies (Fig. 1A). Hagiwara et al. (2020) conducted an experiment in which two participants were embodied within a concurrent shared avatar in VR and showed that the movements of the shared avatar were straighter and less jerky than the movements of the individual participants and the solo-body avatar. Fribourg et al. (2020) introduced the concept of “virtual co-embodiment” and showed that participants were good at estimating their real levels of control but significantly overestimated their “sense of agency” when they could anticipate the motion of the avatar. These previous studies on virtual co-embodiment have examined avatars whose movements were determined by averaging the movements of the two participants in real time (occasionally with different ratios of control). Although averaging movements may benefit some tasks for typical VR users, allowing a partner to completely control the virtual avatar’s limb corresponding to a person’s missing or immovable limb would be more applicable for disabled VR users such as hemiplegic patients and amputees. Such partner assistance would allow amputees to mitigate their disabilities and perform complicated tasks in one avatar, with each participant focusing on one detailed subtask. We have developed this type of co-embodiment avatar as a joint body avatar, in which the left half of the avatar is controlled by one user and the right half by another (Fig. 1B, Video S1). However, allowing another person to fully control a limb of a virtual avatar in which an individual is immersed in first-person perspective may cause a severe lack of “sense of embodiment” towards that limb, which may lead to an uncomfortable experience associated with the feeling of being partly possessed by someone else.
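The two control schemes discussed above differ only in how the shared avatar’s pose is composed from the two users’ tracked movements. A minimal sketch, assuming simplified per-joint position tracking (the joint names and data layout are illustrative assumptions, not the actual implementation of any of the cited systems):

```python
# Two ways to compose a shared avatar pose from two users' tracked poses.
# Joint names and 3D-tuple layout are illustrative assumptions.

def averaged_pose(pose_a, pose_b, ratio_a=0.5):
    """Shared control: every joint is a weighted average of both users
    (ratio_a is user A's level of control, as in Fribourg et al., 2020)."""
    return {
        joint: tuple(ratio_a * a + (1.0 - ratio_a) * b
                     for a, b in zip(pose_a[joint], pose_b[joint]))
        for joint in pose_a
    }

def joint_body_pose(pose_a, pose_b):
    """Joint body: user A drives left-side joints, user B all others."""
    return {
        joint: pose_a[joint] if joint.startswith("left_") else pose_b[joint]
        for joint in pose_a
    }

pose_a = {"left_hand": (0.0, 1.0, 0.0), "right_hand": (0.25, 1.0, 0.0)}
pose_b = {"left_hand": (0.5, 1.5, 0.0), "right_hand": (0.75, 1.5, 0.0)}

averaged = averaged_pose(pose_a, pose_b)      # each joint at the midpoint
joint_body = joint_body_pose(pose_a, pose_b)  # left from A, right from B
```

Under averaging, both users partially influence every limb; under the joint-body scheme, each limb is exclusively driven by one user, which is what removes the “control” factor for the non-driving partner.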
Therefore, addressing this lack of sense of embodiment towards limbs controlled by others is necessary for a pleasant virtual co-embodiment experience.
According to Kilteni et al. (2012), the “sense of embodiment” consists of three subcomponents: (1) the sense of self-location, the ability to perceive the location of one’s body parts (Lenggenhager et al., 2009; Blanke & Metzinger, 2009); (2) the sense of agency, the sense of having control over motion (Haggard, 2005; Blanke & Metzinger, 2009); and (3) the sense of body ownership, one’s self-attribution of the body (Gallagher, 2000). Multiple studies using VR (Gonzalez-Franco et al., 2010; Aymerich-Franch & Ganesh, 2016) as well as physical objects such as rubber hands (Botvinick & Cohen, 1998; Kalckert & Ehrsson, 2014) have investigated these subcomponents and shown that humans can experience embodiment towards virtual and physical objects other than their own bodies. While studies on inducing ownership towards external objects, such as the rubber hand illusion (Botvinick & Cohen, 1998), have claimed that visuo-tactile synchrony or synchronized active or passive movements are necessary for embodiment of a rubber hand, studies on virtual avatars in VR identify avatar appearance (Paludan et al., 2016), visuomotor congruence or avatar control (Caspar et al., 2015), and user point of view (Gorisse et al., 2017) as factors affecting embodiment. Since most of these studies on the embodiment of virtual avatars have been conducted on a one-to-one basis (one person controlling all limbs of one avatar), there is a lack of knowledge regarding the factors affecting embodiment towards body parts controlled by others. In such cases, control of the limb, known to be one of the most crucial factors affecting the sense of agency (Fribourg et al., 2020), is lost, consequently reducing or even abolishing embodiment towards the limb. Therefore, it is crucial to explore other possible factors that would affect embodiment towards such uncontrolled limbs.
In our previous study, we connected two participants’ backs with a hard brace and found that connection force feedback synchronized with the movements of the joint virtual avatar could increase illusory body ownership towards a limb controlled by another (Harin et al., under review). However, physically connecting users is not practical for realizing the joint body in virtual environments. Therefore, other factors are needed to increase embodiment towards a limb controlled by another person without connection force feedback.
In this study, we investigated whether the sense of embodiment towards a limb fully controlled by a partner changes depending on the nature of the task performed in the co-embodied avatar. The task of the joint avatar can be considered ‘joint action’. Sebanz et al. (2006) defined joint action as any form of social interaction whereby two or more individuals coordinate their actions in space and time to bring about a change in the environment. For example, two people carrying a table together must coordinate their actions in a temporally and spatially precise manner to successfully execute the task. Some coordination mechanisms depend on sensorimotor information shared between co-actors, thereby making joint attention, prediction, non-verbal communication, or the sharing of emotional states possible (Vesper et al., 2017). Previous studies related to joint attention have suggested that the ability to direct one’s attention to where an interaction partner is attending provides a basic mechanism for sharing representations of objects and events (Frischen & Tipper, 2004; Tollefsen, 2005), which may encourage the ‘we’ intention or ‘we-mode’ (Gallotti & Frith, 2013) believed by some philosophers to be what joint action mostly depends on (Tollefsen, 2005). Here, we considered the possibility of enhancing embodiment towards a co-embodied avatar by encouraging the we-mode through the manipulation of sensory (visual) information during joint action, to eventually investigate the relationship between embodiment and joint attention (or the sensory information responsible for joint attention). Therefore, we hypothesized that embodiment would increase when the individual and the partner shared one common intention/goal, compared to when each had a separate intention, before observing the partner’s action during the task.
Furthermore, we hypothesized that, in the case of two different targets, embodiment would be higher when the partner’s target is visible than when it is invisible. To test these hypotheses, we conducted an experiment using a joint avatar: an avatar co-embodied by two individuals, in which one participant fully controlled the left-side limbs and the other fully controlled the right-side limbs, both in first-person perspective. In our experiment, participant dyads performed three types of reaching tasks, involving and not involving joint action (Fig. 2).
We named the three reaching tasks/conditions of our study “Same goal”, “Different goals: Visible”, and “Different goals: Invisible”. In the “Same goal” condition, the two participants in the dyad reached for yellow spheres in front of them using the two hands of the joint avatar. In the “Different goals: Visible” and “Different goals: Invisible” conditions, the two participants reached for two different targets with the left and right arms: red targets were reached with the left hand of the avatar and blue targets with the right hand. In each condition, 100 common targets or 100 pairs of targets appeared, and each condition was blocked as a session.
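The three conditions can be restated as data. The following sketch uses hypothetical field names to summarize the design (target colors, visibility, and trial counts as described above); it is an illustrative reformulation, not the study’s actual task code:

```python
# Summary of the three reaching conditions as data.
# Field names are illustrative assumptions; values follow the text.

CONDITIONS = {
    "Same goal": {
        "targets_per_trial": 1,            # one shared yellow sphere
        "left_hand_target": "yellow",
        "right_hand_target": "yellow",
        "partner_target_visible": True,
    },
    "Different goals: Visible": {
        "targets_per_trial": 2,            # red for left hand, blue for right
        "left_hand_target": "red",
        "right_hand_target": "blue",
        "partner_target_visible": True,
    },
    "Different goals: Invisible": {
        "targets_per_trial": 2,
        "left_hand_target": "red",
        "right_hand_target": "blue",
        "partner_target_visible": False,   # partner's target hidden from view
    },
}

TRIALS_PER_SESSION = 100  # 100 targets or target pairs per blocked session
```

The two “Different goals” conditions differ only in whether the partner’s target is rendered, isolating the visual information available for joint attention.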
Participants were instructed to reach for each target and keep the hand touching it until it disappeared before returning their hands to the original position. On the final target reach of each session, we displayed a stimulus of a knife stabbing one of the hands of the joint avatar to measure startle responses via the skin conductance response (SCR) (Fig. 3). Participants answered a questionnaire regarding the senses of body ownership and agency at the end of each session.