According to the participants' subjective descriptions after the user testing, 39 out of 43 participants experienced binocular rivalry when dichoptically viewing the puppy image and the human image at the same time. Figure \ref{223093} summarizes the emotions chosen under the three stimuli by the 39 participants who experienced binocular rivalry. The bars show how often each of the 7 options was chosen under each condition. For all three stimuli, the participants chose the target emotion (happiness), which the Kuleshov Effect was expected to create, more frequently than any alternative option. If perceiving the dog image had no effect on people's interpretation of the man's emotion, each of the 7 options should be selected with a similar probability of around 0.167 (1/7). But as shown in Figure \ref{223093}, the option "happiness" was chosen with frequencies of 0.605, 0.443, and 0.407 in the montage, single image, and binocular rivalry conditions respectively. The 95% confidence intervals of the proportion of "happiness" in the three conditions also lay above the 0.167 baseline. However, the option "other" under the binocular rivalry condition was also selected more often than the baseline, with a frequency of 0.368. By contrast, when the participants viewed the montage and the single image, no option other than "happiness" was chosen significantly more often than 16.7%.
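The comparison of a confidence interval against the chance baseline described above can be sketched as follows. This is a minimal illustration, not the analysis actually used in this study: the Wilson score interval is one common choice for a proportion, the function names are hypothetical, and the counts (12 of 20) are invented for the example because the per-condition sample sizes are not stated in this section.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


def exceeds_baseline(successes, n, baseline, z=1.96):
    """True if the whole interval lies above the chance baseline."""
    lower, _ = wilson_interval(successes, n, z)
    return lower > baseline


# Hypothetical counts (NOT the study's data): 12 of 20 viewers chose "happiness".
# The baseline of 0.167 is the value used in the text.
lower, upper = wilson_interval(12, 20)
above_chance = exceeds_baseline(12, 20, baseline=0.167)
```

With these illustrative counts, the option would be counted as chosen above chance only if `above_chance` is true, that is, if the lower bound of the interval exceeds the baseline passed in.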
Other subjective feedback is also worth mentioning. When viewing the montage of the man and the dog in VR, most participants stated that they felt the man changed into the dog. This impression differs from that reported in the Kuleshov Effect experiment with 2-dimensional motion pictures, where people were more likely to think, as in continuity editing, that the man was looking at the dog \cite{Barratt_2016}. In the single image viewing scenario, most participants could tell that the image was composed of a man and a dog overlaid on each other.

Discussion & Conclusion

Before turning to the discussion, a few limitations of the experiment need to be stated. Previous research has shown that dichoptic presentation can cause visual fatigue and visual discomfort \cite{Lambooij_2009}. However, this paper does not examine how such fatigue and discomfort influenced the results of the user testing. Moreover, because the user study was designed as a single-trial between-subject experiment, the data are prone to noise. More trials with different images need to be introduced in the future.
Based on the results of the current experiment, two bold interpretations could be made: (1) dichoptically viewing two images under binocular rivalry could yield more meanings than viewing the two images in isolation, and (2) it would affect people's perception differently from viewing the same pair of images sequentially.
Interestingly, for about 40% of the participants, the new idea created under binocular rivalry was similar to the thought they had when watching the same content in a motion picture as a montage. There are usually two types of rationale for the Kuleshov Effect: one holds that the two shots are related purely on the cognitive level (i.e., the man and the dog exist independently and are only connected in the audience's mind) (Eisenstein 1949), while the other holds that the shots are actually connected spatially and/or temporally (i.e., the man is looking at the dog because of the continuity editing effect) \cite{Plantinga_1997}. Although Barratt et al. \cite{Barratt_2016} have shown that the Kuleshov Effect does exist in continuity editing, the subjective feedback and the statistical results of the current experiment in the montage condition suggest that the Kuleshov Effect could still exist on the cognitive level. Although continuity editing is indeed hard to achieve in VR by simply swapping the content, this does not mean that montages cannot be used in VR; if used, they could still create new meanings in storytelling.
Since "other" was also chosen frequently by participants under binocular rivalry, a pattern that did not appear in the other two viewing conditions, the following interpretations could be inferred. According to Eisenstein's argument, the interpretation of a montage comes from the conflict between the original thoughts derived from the shots. People may therefore have chosen "other" so often because no definite idea was initially formed. In other words, during binocular rivalry, people may not read (cognitively perceive) any meaning from either the left-eye image or the right-eye image even though they are constantly seeing them both. This inference is in line with the epistemological theory that our brain is a predictive machine and that a perception is actually a hypothesis held with 100% probability \cite{Hohwy_2008}. Hohwy et al. argue that in binocular rivalry people see the two images constantly switching because neither of the two hypotheses made by the brain (1. I am looking at a man; 2. I am looking at a dog) can reach 100% probability. The binocular rivalry will stop once the brain considers one hypothesis 100% probable: a perception is formed (one of the images is read).
However, there is another interpretation of the high percentage of "other" in the participants' choices. The original 6 options of basic human emotions come from research based on animals that actually live in the real world, whereas a dog-man does not exist on earth. A completely new perception may therefore have been created during the binocular rivalry, one that none of the "old" emotions could describe. Future research is needed to explain why the participants chose "other" so frequently.