After the data extraction and synthesis processes, we started analyzing the research questions separately and then integrated the results into a framework to better understand and connect the variables relevant to videoconference negotiations, as described in this section.
3.1. Research questions
3.1.1. RQ1: Which articles specifically study videoconference negotiations?
According to our reference-based search, we have eight relevant studies, and according to our database search, ten. However, some of them are duplicates, which leaves us with 13 relevant papers for our review, as displayed in Table 2. Twelve of them compare videoconference negotiation to another communication medium, with Swaab & Swaab (2009) being the only one that focuses exclusively on videoconferencing, in a study of sex differences in negotiations. Whereas the first articles compare video negotiations to face-to-face and audio (Short, 1974; Turnbull, Strickland, and Shaver, 1976), instant messaging or synchronous text communication appears in 1999 with Wachter (1999) and Suh (1999). In the latest article, by Kornfield et al. (2021), a new form of interaction appears, namely negotiation through a telepresence robot.
Additionally, in Daly-Jones et al. (1998), we can see the combination of video communication and the visualization of a document, as an early approximation of the current videoconferencing tools for sharing documents.
3.1.2. RQ2: What specific aspects of video negotiations have been studied?
A broad range of variables is studied in the 13 articles. They relate either to the negotiation or to other characteristics of the interaction, and they rely on either objective or subjective measures, as summarized in Table 2. Most studies analyze at least one objective negotiation variable; economic outcomes and negotiation time are the most researched, complemented by profit distribution or the amount of consensus change. The authors also usually combine these with subjective negotiation variables such as process or outcome satisfaction, power, trust, or other perception-related variables.
We can also find other variables that are related not to the negotiation but to the interaction itself or the way it was carried out. For example, among the objective variables studied we find the verbal fluency of videoconference communication and the screen size of the device used. Regarding the other subjective variables, several articles analyze the perceived richness of the communication media used, the cognitive effort required to use a certain medium or the ease of use of the technology, and the negotiation partner’s engagement with the local environment.
Table 2
Media and variables studied per article
Nr. | Article | Media studied (a) | Negotiation variables (b): objective | Negotiation variables (b): subjective | Other variables (c): objective | Other variables (c): subjective |
1 | Short (1974) | FTF – V – A | EO | | | |
2 | Turnbull et al. (1976) | FTF – V – A | EO | P, OP | | |
3 | Daly-Jones et al. (1998) | V – A | | OP | VF | CE |
4 | Suh (1999) | FTF – V – A – IM | EO, NT | S | | MR |
5 | Wachter (1999) | FTF – V – A – IM | EO, PD | S, P, T | | |
6 | Purdy et al. (2000) | FTF – V – A – IM | EO, NT, PD | S, OP | | MR |
7 | Mennecke et al. (2000) | FTF – V – A – IM | NT, CC | | | MR |
8 | Schweitzer et al. (2002) | V – A | EO | T, OP | | |
9 | Hausen et al. (2006) | FTF – V – A – IM | | OP | | |
10 | Swaab & Swaab (2009) | V | EO | OP | | |
11 | Wang & Doong (2014) | FTF – V – IM | NT | S | | CE |
12 | Kurtzberg et al. (2018) | V – IM | EO | T | SS | |
13 | Kornfield et al. (2021) | V – R | PD | P | | PE |
(a) Media studied – FTF: face-to-face, V: video, A: audio, IM: instant messaging, R: robot
(b) Negotiation variables – EO: economic outcomes, NT: negotiation time, PD: profit distribution, CC: consensus change, S: process and outcome satisfaction, P: power/dominance, T: trust, OP: other perception variables
(c) Other variables – VF: verbal fluency, SS: screen size, MR: perceived media richness, CE: cognitive effort/ease of use of technology, PE: partner’s engagement with the local environment
3.1.3. RQ3: How has video negotiation evolved throughout the years?
The articles in our study date from 1974 to 2021, so we can see how the technologies used evolved over almost 50 years. Whereas the earliest articles stress the fact that video communication also includes audio communication (Short, 1974; Turnbull, Strickland, and Shaver, 1976), the latest ones already assume that everyone knows how to communicate via videoconference (Kurtzberg, Kang, and Naquin, 2018; Kornfield, Rae, and Mutlu, 2021).
In the beginning, there was a distinction between video communication or videoconferencing, which referred to the use of a TV monitor (Short, 1974; Daly-Jones, Monk, and Watts, 1998), and computer conferencing or desktop conferencing, which meant using a computer (Suh, 1999; Wachter, 1999). Daly-Jones et al. (1998), for example, already describe different technologies associated with videoconferencing and “media spaces”. Around 2000, researchers start using “video-based media” to refer to videoconference or videophone (Mennecke, Valacich, and Wheeler, 2000) and also teleconference (Schweitzer, Brodt, and Croson, 2002). Other concepts such as multimedia or interactive media group together videoconference and instant messaging communication (Hausen, Fritz, and Schiefer, 2006). Videoconferencing might also be treated as equivalent to a negotiation support system (NSS), but only when this software is used purely as a video communication tool, that is, as a passive technology without other negotiation-enhancing features (H.-C. Wang and Doong, 2014). However, NSS usually involve other features that aim at reducing the negotiator’s cognitive effort by helping to structure the negotiation process, evaluate different alternatives, or even predict the other party’s moves (Kersten and Lai, 2007). In 2018, we see that video-based interactions can additionally mean the use of a smartphone (Kurtzberg, Kang, and Naquin, 2018), and more recently videoconferencing is described as “a well-established mode of distance communication” (Kornfield, Rae, and Mutlu, 2021).
Analyzing the different technologies used in the relevant articles, we can see an evolution in the hardware and software used, the screen size, the concern for connection speed, and the familiarity of users with the technology. Although these variables are usually just described when explaining the experiment researchers performed (except screen size), they are key to understanding the evolution of videoconference negotiations and the comparability of the studies performed throughout the years.
3.2. Videoconference negotiations variables’ framework
In studying the defined research questions, we saw that researchers generally examine the impact of a negotiation carried out through a certain communication medium on a set of negotiation or communication-related variables. The technological conditions under which a videoconference is performed are mostly noted, but they are neither the object of the study itself nor controlled for. However, those technological conditions and their evolution are key factors when we want to compare different studies. That is why we propose a framework to classify the different variables we found in this SLR, as well as other variables that might be studied in future research on videoconference negotiations, as displayed in Fig. 2. We can classify the variables according to their field of study into negotiation, communication, and technological variables, and also according to the data used to measure them into objective or subjective variables. This way, we stress the importance of technological variables in this specific part of negotiation research, as we integrate the business management and the computer science perspectives in the human-computer interaction that a videoconference negotiation is.
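As an illustration only, the two classification axes described above can be sketched as a small lookup structure. This is not a reproduction of Fig. 2: the class and function names (`Field`, `Measure`, `Variable`, `variables_in`) are hypothetical, only the variable codes of Table 2 are included, and the assignment of the "other" variables to the communication or technological field is taken from how Sections 3.2.2 and 3.2.3 discuss them.

```python
from enum import Enum
from typing import NamedTuple


class Field(Enum):
    NEGOTIATION = "negotiation"
    COMMUNICATION = "communication"
    TECHNOLOGICAL = "technological"


class Measure(Enum):
    OBJECTIVE = "objective"
    SUBJECTIVE = "subjective"


class Variable(NamedTuple):
    name: str
    field: Field
    measure: Measure


# Codes follow the Table 2 legend; the field/measure assignment is an
# illustrative reading of Sections 3.2.1 to 3.2.3, not the authors' figure.
FRAMEWORK = {
    "EO": Variable("economic outcomes", Field.NEGOTIATION, Measure.OBJECTIVE),
    "NT": Variable("negotiation time", Field.NEGOTIATION, Measure.OBJECTIVE),
    "PD": Variable("profit distribution", Field.NEGOTIATION, Measure.OBJECTIVE),
    "CC": Variable("consensus change", Field.NEGOTIATION, Measure.OBJECTIVE),
    "S": Variable("process and outcome satisfaction", Field.NEGOTIATION, Measure.SUBJECTIVE),
    "P": Variable("power/dominance", Field.NEGOTIATION, Measure.SUBJECTIVE),
    "T": Variable("trust", Field.NEGOTIATION, Measure.SUBJECTIVE),
    "OP": Variable("other perception variables", Field.NEGOTIATION, Measure.SUBJECTIVE),
    "VF": Variable("verbal fluency", Field.COMMUNICATION, Measure.OBJECTIVE),
    "MR": Variable("perceived media richness", Field.COMMUNICATION, Measure.SUBJECTIVE),
    "PE": Variable("partner's engagement with the local environment", Field.COMMUNICATION, Measure.SUBJECTIVE),
    "SS": Variable("screen size", Field.TECHNOLOGICAL, Measure.OBJECTIVE),
    "CE": Variable("cognitive effort/ease of use of technology", Field.TECHNOLOGICAL, Measure.SUBJECTIVE),
}


def variables_in(field: Field, measure: Measure) -> list[str]:
    """Return the sorted variable codes that fall into one cell of the framework."""
    return sorted(
        code
        for code, variable in FRAMEWORK.items()
        if variable.field is field and variable.measure is measure
    )


if __name__ == "__main__":
    for field in Field:
        for measure in Measure:
            print(field.value, measure.value, variables_in(field, measure))
```

Technological conditions that the reviewed studies only describe, such as hardware, software, or connection speed, have no code in Table 2 and would need to be added as further entries.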
3.2.1. Negotiation variables
Starting with the objective negotiation variables, economic outcomes are the most studied; they are measured as profits or payoffs by adding up the scores set for the negotiation task. The authors find that negotiation through video leads to higher profits than audio (Short, 1974; Turnbull, Strickland, and Shaver, 1976) and than IM (Kurtzberg, Kang, and Naquin, 2018), and to similar profits to FTF (Purdy, Nye, and Balakrishnan, 2000; Turnbull, Strickland, and Shaver, 1976). Some specific variables have an effect on economic outcomes: scores are higher when negotiators argue in line with their personal views (Short, 1974), when female negotiators have eye contact, and when male negotiators do not have eye contact (Swaab and Swaab, 2009). However, Suh (1999) does not find differences in economic outcomes between video, FTF, audio, and IM.
Regarding negotiation time, also referred to as the time to reach an agreement or negotiation efficiency, video negotiation takes either the same time as FTF (Mennecke, Valacich, and Wheeler, 2000; Suh, 1999; H.-C. Wang and Doong, 2014) or more time than FTF (Purdy, Nye, and Balakrishnan, 2000). Compared to audio, the results are contradictory, as video is found to take more time (Purdy, Nye, and Balakrishnan, 2000; Suh, 1999) or less time (Hausen, Fritz, and Schiefer, 2006; Mennecke, Valacich, and Wheeler, 2000). Compared to IM, the results consistently state that video negotiation takes less time (Hausen, Fritz, and Schiefer, 2006; Mennecke, Valacich, and Wheeler, 2000; Purdy, Nye, and Balakrishnan, 2000; Suh, 1999; H.-C. Wang and Doong, 2014).
Another measure related to economic outcomes is the difference between the negotiation profits achieved by the two negotiators, referred to as profit inequity or profit distribution (Purdy, Nye, and Balakrishnan, 2000) or contract balance (Wachter, 1999). This variable also gives us information about the negotiators’ behavior, which is more cooperative when the difference is smaller and more competitive when the difference is larger (Kornfield, Rae, and Mutlu, 2021; Purdy, Nye, and Balakrishnan, 2000). Indeed, Purdy et al. (2000) find more competitive behavior in video than in FTF negotiation. However, when comparing video to robot negotiation, Kornfield et al. (2021) do not find a significant difference. They do find that, when negotiating through videoconference, negotiation outcomes appear to be more equal between negotiators when users are perceived to be geographically close and less equal when they are perceived to be geographically far (Kornfield, Rae, and Mutlu, 2021). Regarding differences between female and male negotiators, Wachter (1999) does not find a significant difference.
Also related to the negotiation profits, we can find the variable measuring the amount of consensus change, calculating the difference between the payoffs of the initial allocations made by negotiators and the payoffs of the final agreement allocations. In this case, Mennecke et al. (2000) cannot find a significant difference between video, FTF, audio, and IM.
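The objective payoff-based measures described above can be illustrated with a minimal sketch. The function names and the example numbers are hypothetical, and two details are assumptions rather than definitions taken from the reviewed studies: profit distribution is taken as an absolute difference, and consensus change is taken as final minus initial payoff.

```python
def economic_outcome(issue_scores: list[int]) -> int:
    """Profit of one negotiator: the sum of the scores set for each issue of the task."""
    return sum(issue_scores)


def profit_distribution(profit_a: int, profit_b: int) -> int:
    """Difference between the two negotiators' profits (absolute value is an assumption).

    A smaller value points to more cooperative behavior, a larger one to more
    competitive behavior.
    """
    return abs(profit_a - profit_b)


def consensus_change(initial_payoff: int, final_payoff: int) -> int:
    """Payoff of the final agreement minus payoff of the initial allocation (sign is an assumption)."""
    return final_payoff - initial_payoff


if __name__ == "__main__":
    # Hypothetical scores for a three-issue negotiation task.
    profit_a = economic_outcome([40, 25, 10])
    profit_b = economic_outcome([30, 20, 10])
    print(profit_a, profit_b)
    print(profit_distribution(profit_a, profit_b))
    print(consensus_change(initial_payoff=50, final_payoff=profit_a))
```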
Concerning the subjective negotiation variables and focusing on satisfaction with the negotiation process, studies find that satisfaction in the video condition is the same as FTF (Suh, 1999; H.-C. Wang and Doong, 2014) and is higher than in audio (Suh, 1999). Compared to IM, we find contradictory results as Wang & Doong (2014) find satisfaction to be higher and Suh (1999) to be lower.
Satisfaction with the negotiation outcome is found to be similar between video and FTF (Purdy, Nye, and Balakrishnan, 2000), and Suh (1999) finds no significant differences between FTF, video, audio, and IM.
Power, or the dominance of one user in the negotiation, appears to be easier for women to exert in video and audio than in FTF, as there is more physical distance between negotiators (Wachter, 1999). No significant difference in power is found between interaction through video and through a telepresence robot (Kornfield, Rae, and Mutlu, 2021).
Regarding trust, although Wachter (1999) does not report conclusive results on the video condition, Kurtzberg et al. (2018) find higher levels of trust when the negotiation is performed through videoconference than through IM. More specifically, they find that trust partially mediates the relationship between the communication medium used and the negotiation outcome. Although trust is not a requirement for a higher negotiation outcome, they argue that trust makes the integrative potential of a negotiation easier to achieve. While they do not find untrustworthy behavior in their study, they point out that future research should explore trust and distrust in videoconferences. Also, Schweitzer et al. (2002) find a higher level of trust in videoconference than in audio negotiation when studying deception, as their users are better able to monitor the effect of the lies they are telling. In general, they find that trust decreases when a person is the target of a lie, but that trust increases when a person is the target of a lie in the case of monitoring-dependent lies and visual access. The negotiator who is lying is especially attentive to the other person’s reactions because he or she is monitoring how the other handles the lie, to avoid capitulation risk. This creates a high level of (fake) trust, similar to the trust experienced by negotiators who do not tell monitoring-dependent lies. In fact, Schweitzer et al. (2002) point out that visual access may help negotiators use deception more successfully and increase the likelihood that they will use deception.
Considering the other relationship-related variables, we can highlight Daly-Jones et al. (1998), who study interpersonal awareness, that is, being aware of the attention the other person is paying to the conversation. When the negotiation is performed via video rather than audio, interpersonal awareness is higher. This means that one negotiator can monitor whether the other is focusing on what is being said and actively listening rather than concentrating on something else. However, they point out that this increased attentional status might be due to the novelty effect of using video communication back in 1998 or to a discrepancy in eye contact, which can force users to focus more. They also argue that those barriers will be overcome with time, as people get used to video communication.
The desire for future interaction does not appear to be significantly different in the video, FTF, and audio negotiations; it is only found to be higher than in IM (Purdy, Nye, and Balakrishnan, 2000). Concerning the development of a shared understanding during the negotiation, meaning that negotiators take into account the other party’s interests to find a common viewpoint, Swaab & Swaab (2009) find different results depending on the negotiator’s gender. For females, they find that eye contact is needed to develop a shared understanding that will bring higher negotiation outcomes. For males, however, this shared understanding is better developed without eye contact, which then leads to higher negotiation outcomes.
3.2.2. Communication variables
In terms of the objective communication variables, we can highlight Daly-Jones et al. (1998) studying verbal fluency, measured by the number of turns spoken, the number of words used in every turn, overlapping speech, and vocal or visual backchannels. The authors find that it is not significantly higher in video communication than in audio.
Regarding the subjective communication variables, some studies have focused on media richness theory (Daft and Lengel, 1984, 1986) and evaluated whether the negotiators’ perceptions of the media used corresponded to the theory’s predictions. Video appears to be a richer medium than audio (Suh, 1999; Mennecke, Valacich, and Wheeler, 2000) and IM (Mennecke, Valacich, and Wheeler, 2000), but its richness is similar to that of FTF (Purdy, Nye, and Balakrishnan, 2000; Mennecke, Valacich, and Wheeler, 2000).
Also, Daly-Jones et al. (1998) analyze interpersonal awareness, detecting that it is higher when the negotiation is performed via video rather than audio. Finally, Kornfield et al. (2021) are also concerned with the partner’s engagement with the local environment. Comparing videoconference to a telepresence robot, they observe that partners negotiating via videoconference are less engaged with the local environment.
3.2.3. Technological variables
Starting with the objective technological variables and regarding the hardware used, we can see that in the first articles, a closed-circuit television with a TV monitor was used for the video negotiation (Short, 1974; Daly-Jones, Monk, and Watts, 1998). After that, in 1999–2000, negotiators use a videophone (Wachter, 1999; Mennecke, Valacich, and Wheeler, 2000), and the use of desktop computers also begins and continues to the present (Suh, 1999; Purdy, Nye, and Balakrishnan, 2000; Schweitzer, Brodt, and Croson, 2002; Hausen, Fritz, and Schiefer, 2006; H.-C. Wang and Doong, 2014; Kornfield, Rae, and Mutlu, 2021). We can also see the use of newer versions of personal computers such as laptops in 2018, when the first study with a smartphone also appears (Kurtzberg, Kang, and Naquin, 2018). In the latest article, the use of a desktop computer is compared to a telepresence robot (Kornfield, Rae, and Mutlu, 2021).
If we focus on the software used in the negotiations performed via computer, when specified, an early Intel ProShare Video System 150 is used by Suh (1999), and custom software is developed for the studies of Hausen et al. (2006) and Wang and Doong (2014). From 2018 onward, with Kurtzberg et al. (2018) and Kornfield et al. (2021), the widely known software Skype is used in videoconference negotiations.
Among the studies that specify screen size, we can find either very small screens, such as the 3-inch screens of videophones (Wachter, 1999; Mennecke, Valacich, and Wheeler, 2000) or smartphone screens (Kurtzberg, Kang, and Naquin, 2018), or relatively large or very large screens, from 17 to 27 inches to be exact (Short, 1974; Daly-Jones, Monk, and Watts, 1998; Purdy, Nye, and Balakrishnan, 2000; H.-C. Wang and Doong, 2014; Kurtzberg, Kang, and Naquin, 2018; Kornfield, Rae, and Mutlu, 2021). Mennecke et al. (2000) already point out that, although only limited research on video communication and screen size had been performed at that point, there are findings stating that users’ ratings of small screens are more negative than those of larger screens (Duncanson and Williams, 1973). Indeed, Kurtzberg et al. (2018) focus on the effect of screen size on negotiation performance, finding that larger screens lead to higher individual and joint gains. In fact, the highest joint gains are achieved when both negotiators use large screens.
Another interesting finding showing the evolution of videoconferencing is the concern of some studies for the speed of their connections or the frame transmission. For example, Suh (1999) uses a connection transmitting 15 frames per second, whereas Mennecke et al. (2000) report a refresh rate of 2–5 frames per second. Schweitzer et al. (2002) use a 112–128 kbps line, and Wang and Doong (2014) make sure all their users have the same high network speed to perform the experiment. However, in the two latest studies, namely Kurtzberg et al. (2018) and Kornfield et al. (2021), the use of the Internet already seems to be normal, and there is no concern about the transmission speed anymore. Indeed, researchers already note in their limitations that the transmission speed may have affected their results. Earlier articles such as Daly-Jones et al. (1998) already analyze different video systems, showing concern about the speed of the video links and discussing bandwidth restrictions that impact transmission clarity. Also, Purdy et al. (2000) argue that with the growth of internet bandwidth, technologies will also evolve and combinations of different media, like videoconferencing with instant messaging, will be developed, which is an area of future research. Furthermore, Mennecke et al. (2000) identify that the slow refresh rate of their videophone needs to be taken into account when analyzing the use of that medium.
Referring to the subjective technological variables, cognitive effort is measured in two studies. Videoconference is found to demand more effort than audio negotiation (Daly-Jones, Monk, and Watts, 1998), but when the task matches the medium used, for example, when the task is less analyzable and the negotiation is performed through videoconference or FTF, the cognitive effort is lower (H.-C. Wang and Doong, 2014). Similarly, Kornfield et al. (2021) measure the ease of use of the technology, finding that videoconference is perceived to be easier to use than the telepresence robot.
With the evolution of videoconference systems, the familiarity of users with the technology also varies. For example, in Suh’s (1999) study, participants have very little experience with videoconferencing, as stated in the limitations, and possible novelty effects are not controlled for in the experiments. Mennecke et al. (2000) also point out that the experience of users with the video medium needs to be taken into account. In their experiment, negotiators had more experience with communication via traditional channels such as FTF and telephone than via videoconference. Therefore, they also indicate in their limitations that this should be considered when interpreting their results.
Similarly, in Purdy et al. (2000), study participants did not have any prior experience with videoconferencing. However, we can see a shift from 2006 onward, as one of the assumptions in Hausen et al. (2006) is that users are familiar with the handling of the communication system designed. Although Swaab & Swaab (2009) do not specify whether the users are acquainted with the system, they mention the term “regular videoconference”, indicating familiarity. The concern with familiarity is also present in Wang & Doong (2014), as they question their participants before the experiment, finding that 122 out of 144 users have online negotiation experience. Another change can be noticed from 2018 onward, as in Kurtzberg et al. (2018) the familiarity with videoconference or video devices is already so high that they even allow the participants to use their own laptop and smartphone devices for the study. Analogously, Kornfield et al. (2021) describe communicating via videoconference as mainstream, with participants used to collaborating remotely with geographically dispersed professional teams, online communities, or even romantic relationships.
Indeed, the use of new technology might create a “novelty effect” that can distort results. For example, Daly-Jones et al. (1998) point out the enthusiasm users show in their subjective ratings in the video condition and Suh (1999) notes that video participants are using a “state-of-the-art” technology which might explain a higher process satisfaction. Also, Mennecke et al. (2000) state that the introduction of a novelty element, for example, video as a new communication medium, changes the negotiation process, as users need to first figure out other processes before focusing on the negotiation itself.
Moreover, Purdy et al. (2000) specify that the introduction of a new medium means that users need to compensate for the reduction of conventional nonverbal and paralinguistic cues, which might lead to a higher joint outcome, in this case similar to that of FTF. We can also see this novelty effect in Kornfield et al. (2021), but this time the new medium is a telepresence robot instead of videoconferencing. The authors point out that negotiators’ perceptions might be altered by the enthusiasm of communicating via such a new medium.
The technological evolution of videoconference negotiations in terms of hardware and software used, screen sizes, connection speed, user familiarity with technology, and novelty effects means that one cannot directly compare studies performed at different points in time.