\section{Discussion}
\label{sec:disc}
Several factors might explain our failure to observe any effect of social
facilitation or social inhibition: the small effect sizes of social
facilitation, certain methodological weaknesses in the psychology literature,
and a broader publication bias toward positive results that might mask how
brittle some effects are.
\subsection{The difficulty of observing social facilitation}
What does this failed attempt at reproducing a ‘classic’ result of social
psychology tell us? Beyond possible experimental confounds, our failure to
reproduce these results is likely due to the small effect size of social
facilitation. In their meta-analysis of studies on social facilitation, Bond and
Titus \cite{Bond1983} showed that the overall mean effect sizes are low, ranging
from 0.03 to 0.36. According to Cohen \cite{Cohen1977}, an effect size of 0.2
should be regarded as small, an effect size of 0.5 as medium, and one of 0.8 as large.
Social facilitation or inhibition is also affected by a combination of
several other psychological effects: the observer effect (also known as the
Hawthorne effect), demand characteristics, cultural conventions, and
personality orientation, among others.
These effects are potential confounds, and adequately accounting for each of
them in the experimental design proved problematic.
For example, Uziel \cite{Uziel2007} compared the effect sizes of studies in the
literature that examined personality effects, contrasting subjects with a
negative orientation (trait anxiety and low self-esteem) with subjects with a
positive orientation (extraversion and high self-esteem). The results suggest
that orientation matters more than task complexity in determining performance:
in the mere presence of another person, subjects with a positive orientation
improved on both simple and complex tasks, while those with a negative
orientation performed worse on both types of tasks. However, these results were
based on only a few studies that examined personality effects, and the author
accordingly cautions that they cannot be generalised.
We believe that these two observations, the small effect sizes on the one hand
and the complexity of multi-faceted psychological effects on the other,
apply beyond the specific case of social facilitation.
\subsection{Weak methods in older psychology literature}
Beyond the caution required when studying one specific psychological effect, a
broader range of methodological issues with older research in psychology might
explain why some results in the field are incorrectly believed to be reliable.
For instance, Bond and Titus' \cite{Bond1983} meta-analysis of research on social
facilitation claims to have exhaustively examined every publication prior to
the meta-analysis itself (published in 1983). Indeed, the oldest study they
cite dates from 1898, and 35 of the 241 studies were published before 1965. As
such, social facilitation is a good example of an old, classical psychological
effect. However, this also suggests that its characterisation might have relied
on research methodologies that are weak by today's standards. Bond and Titus
raise interesting points in that regard: only 100 of the 241 studies state that
the experimenter was in a different room in the \alonecondition{} (and in 96
studies, we know the experimenter was in the room). This would be considered a
serious confound today. Similarly, Bond and Titus report that 72.3\% of the
total participants were undergraduate students, pointing to a possibly serious
demographic bias.
\subsection{Biases in scientific publishing: the ‘file drawer’ problem}
Coined in 1979 by Robert Rosenthal \cite{Rosenthal1979}, the file drawer
problem refers to the bias introduced into the scientific literature by mainly
publishing positive results, and rarely negative or non-confirmatory ones. As
a consequence, an effect could be reported and believed reliable simply for the
lack of literature showing the contrary. Rosenthal proposes to account for this
problem by reporting in meta-analyses the ‘fail-safe N’ measure: N is the number
of null effects that would be required to make the original result
non-significant. Rosenthal proposes to consider an effect resistant to the ‘file
drawer problem’ of unreported null effects if and only if the fail-safe N is above \(5k+10\), with \(k\) the number of reported effects.
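For completeness, we restate how the fail-safe N is obtained (this is
Rosenthal's derivation, in our notation): under Stouffer's combined test,
adding \(N\) null studies (each with \(Z=0\)) to the \(k\) reported studies
brings the combined \(Z\) down to the one-tailed \(\alpha=.05\) threshold when
\[
\frac{\sum_{i=1}^{k} Z_i}{\sqrt{k+N}} = 1.645
\quad\Longrightarrow\quad
N = \frac{\left(\sum_{i=1}^{k} Z_i\right)^{2}}{2.706} - k,
\]
where \(Z_i\) is the standard normal deviate of study \(i\) and
\(2.706 = 1.645^{2}\).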
Bond and Titus report the fail-safe N for some of the effects of social
facilitation. For instance, their meta-analysis shows that the performance
quantity of participants on complex tasks reliably decreases in the presence of
an observer (even though the effect size is small). 54 effects are reported,
and the fail-safe N value is 160: 160 is clearly smaller than
\(5\times 54+10=280\), and as such, this result could well be subject to the
problem of unreported null effects. The finding that social presence inhibits
performance on complex tasks is thus not robust in the face of the bias towards
publishing only positive results.
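As an illustration of how this criterion is applied in practice, the following
minimal Python sketch (our own; the function names are ours and do not come
from \cite{Rosenthal1979}) computes the fail-safe N from a set of standard
normal deviates and applies the \(5k+10\) tolerance criterion, checked here
against Bond and Titus' figures:
\begin{verbatim}
def fail_safe_n(z_scores, z_alpha=1.645):
    # Rosenthal (1979): number of unreported null results (Z = 0)
    # needed to pull the Stouffer combined Z below the one-tailed
    # alpha = .05 threshold.
    return sum(z_scores) ** 2 / z_alpha ** 2 - len(z_scores)

def is_robust(n_fail_safe, k):
    # Rosenthal's tolerance criterion: robust iff fail-safe N > 5k + 10.
    return n_fail_safe > 5 * k + 10

# Bond and Titus: quantity on complex tasks, k = 54, fail-safe N = 160.
print(is_robust(160, 54))  # False: vulnerable to the file drawer problem
\end{verbatim}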
By contrast, the fail-safe N for performance quantity on simple tasks is 6,183,
and for performance quality on complex tasks, 5,697. Both values are
comfortably above the \(5k+10\) threshold (which cannot exceed
\(5\times 241+10=1{,}215\) for any subset of the 241 studies), so the
corresponding results, that quantity increases on simple tasks and quality
decreases on complex tasks, are robust to the file drawer problem.
A weighted calculation of the fail-safe number has been
proposed \cite{rosenberg2005file} that addresses some of the concerns with
Rosenthal's proposal. While not systematically reported in the literature,
this metric is a valuable tool for HRI researchers when assessing how robust a
result in psychology is.