yen-lin.pan@univ-amu.fr
INTRODUCTION
Individuals tend to remember speech addressing emotional experiences,
perhaps due to the use of emotionally-charged words (Kensinger &
Corkin, 2003). The emotion that a word elicits has been frequently
measured and contrasted by vector models, suggesting that each emotion
can be described in terms of two main dimensions: valence and arousal
(Rubin & Talarico, 2009). Valence refers to the degree to which an
emotion is positive or negative; for instance, words like “paradise”
have high, positive valence, while those such as “earthquake” have
low, negative valence. Arousal refers to the intensity of the activation
that an emotion elicits; a word such as “death” evokes a high level of arousal,
whereas a more neutral word such as “carousel” is associated with low
arousal ratings. In vector models, words with higher arousal tend to be
located at the two extreme ends of the valence scale, while words with lower
arousal are rated in the middle of the valence scale. Additionally,
negatively valenced words generally show a stronger correlation with
arousal than positively valenced words. Both this arousal bias for negative
words and the U-shaped distribution along the arousal and valence
dimensions have been observed across several European languages
(Warriner et al. 2013; Söderholm et al. 2013; Monnier & Syssau 2014;
Stadthagen-Gonzalez et al. 2017) as well as in Mandarin (Yu et al.
2015). These findings suggest a universal pattern of emotion
classification, as revealed by questionnaires where participants
evaluate their perception of a word’s arousal and valence on a numeric
scale.
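The U-shaped relation described above can be illustrated with a minimal sketch. The ratings below are invented for illustration only (they are not taken from any published norm set), and the 1-9 scale with a midpoint of 5 is an assumption. The sketch compares the linear correlation between valence and arousal with the correlation between arousal and valence extremity, i.e. the distance of a valence rating from the scale midpoint. Under a U-shaped distribution, the second correlation should be clearly stronger.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical (valence, arousal) ratings on an assumed 1-9 scale.
# These numbers are made up for illustration; they are not real norms.
ratings = [
    (1.5, 7.5), (2.0, 7.0), (3.0, 5.5), (4.0, 4.0),  # negative words
    (5.0, 3.0),                                      # neutral word
    (6.0, 3.5), (7.0, 4.5), (8.0, 5.5), (8.5, 6.0),  # positive words
]
MIDPOINT = 5.0  # assumed midpoint of the valence scale

valence = [v for v, _ in ratings]
arousal = [a for _, a in ratings]
extremity = [abs(v - MIDPOINT) for v in valence]

linear_r = pearson(valence, arousal)      # weak if the relation is U-shaped
u_shape_r = pearson(extremity, arousal)   # strong if the relation is U-shaped

print(f"valence vs. arousal:   r = {linear_r:.2f}")
print(f"extremity vs. arousal: r = {u_shape_r:.2f}")
```

In these toy ratings the negative words were deliberately given higher arousal than equally extreme positive words, mirroring the arousal bias for negative words mentioned above.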
The impact of emotional valence on word processing has been investigated
through several physiological measures, such as heart rate (Iffland et
al. 2020), skin conductance (Jankowiak et al. 2018) or facial muscle
activity (Niedenthal et al. 2009). A number of studies using
electroencephalography (EEG) have demonstrated specific ERP signatures
evoked by the processing of valenced words presented in written format,
both at initial as well as at later stages (for a review of early
studies, see Citron, 2012). Two early ERP components, the P2 and the
Early Posterior Negativity (EPN), have been frequently observed in
response to these stimuli. The P2 component, peaking at approximately
150-300 msec over centro-frontal sites, is characterized by more
positive amplitudes for highly arousing stimuli compared to less
arousing stimuli, reflecting the automatic allocation of attentional
resources to words that elicit emotion (Hajcak et al. 2012). The EPN
response is similar to the P2 in terms of its time window and its
sensitivity to the emotional content of verbal stimuli. The two, however,
differ in polarity and scalp distribution. Specifically, the EPN shows larger
negative-going amplitudes for emotionally valenced words compared to
neutral words, observed mostly over occipito-temporal sites. In
contrast, the ERP components elicited by emotional words during the
later stages of processing, notably the N400 and the Late Positivity
Component (LPC), tend to be influenced by task demands. Some studies have
reported reduced N400 effects for valenced words compared to neutral
words when participants performed a lexical decision task (Kanske &
Kotz, 2007; Schacht & Sommer, 2009; Pauligk et al., 2019), an
emotion-color Stroop task (Sass et al., 2010) or a gender decision task
(Kanske & Kotz, 2011). These findings thus suggest facilitated lexical
or semantic processing of emotional stimuli. The LPC is a positive
deflection that occurs at a latency of around 500-800 msec post stimulus
onset over parietal regions. Emotionally valenced (positive and/or
negative) words generally elicit a greater response than neutral words
(Carretié et al. 2008; Citron et al., 2013; Herbert et al., 2008;
Hinojosa et al., 2010; Hofmann et al., 2009; Palazova et al., 2011),
reflecting sustained attention that supports a more in-depth evaluation of
the emotional features of a stimulus.
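Component effects of the kind described above are commonly quantified as the mean voltage within a latency window. The following is a minimal sketch of that computation, not the analysis pipeline of any study cited here. The function name, the 100 Hz sampling rate, the epoch limits and the synthetic waveform are all assumptions made for illustration; only the two latency windows (150-300 msec for the P2, 500-800 msec for the LPC) come from the text.

```python
def mean_amplitude(waveform_uv, sfreq_hz, epoch_start_ms, window_ms):
    """Mean voltage (in microvolts) of one waveform inside a latency window.

    waveform_uv    -- list of voltages, one per sample
    sfreq_hz       -- sampling rate in Hz
    epoch_start_ms -- latency of the first sample relative to stimulus onset
    window_ms      -- (start, end) of the window in msec, both inclusive
    """
    start_ms, end_ms = window_ms
    step_ms = 1000.0 / sfreq_hz
    in_window = [
        voltage
        for i, voltage in enumerate(waveform_uv)
        if start_ms <= epoch_start_ms + i * step_ms <= end_ms
    ]
    if not in_window:
        raise ValueError("latency window lies outside the epoch")
    return sum(in_window) / len(in_window)


# Synthetic waveform (invented, not real EEG): an epoch from -100 to 800 msec
# sampled at an assumed 100 Hz, flat except for a 2 microvolt plateau
# between 150 and 300 msec.
SFREQ_HZ = 100
EPOCH_START_MS = -100.0
times_ms = [EPOCH_START_MS + i * 10.0 for i in range(91)]
waveform = [2.0 if 150.0 <= t <= 300.0 else 0.0 for t in times_ms]

P2_WINDOW_MS = (150.0, 300.0)   # window given in the text for the P2
LPC_WINDOW_MS = (500.0, 800.0)  # window given in the text for the LPC

p2_amplitude = mean_amplitude(waveform, SFREQ_HZ, EPOCH_START_MS, P2_WINDOW_MS)
lpc_amplitude = mean_amplitude(waveform, SFREQ_HZ, EPOCH_START_MS, LPC_WINDOW_MS)
print(p2_amplitude, lpc_amplitude)
```

In practice such a value would be computed per participant, condition and electrode cluster (for example centro-frontal sites for the P2, parietal sites for the LPC) before being compared across valence conditions.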
The auditory processing of valenced words has been less widely
documented in comparison to written words. In a seminal study,
Mittermeier and colleagues (2011) reported an early modulation, in the
P2 time window, for valenced words in comparison to simple tones. Using
the same materials as Mittermeier et al. (2011) in a combined fMRI/ERP
design, Jaspers-Fayer and colleagues (2012) reported a similar early
modulation of the P2 component, which was coupled with early activation
of the anterior and orbito-frontal cortex specifically for emotionally
laden auditory words. Importantly, however, in both studies, these
effects were obtained by contrasting auditory words with negative or
positive valence to either simple tones or meaningless syllables. In
contrast to these results, studies that compared the auditory processing
of valenced words to neutral words rather than to non-linguistic
stimuli found no evidence of such early modulations. Grass and
colleagues (2016) reported effects of valence 370-530 msec post-stimulus
onset, evidenced by an increased frontal positivity and
parieto-occipital negativity, which the authors suggested to be a mix of
an N400 response and the auditory equivalent of the visual EPN. These
later components, linked to lexical-semantic processing, were not
affected by manipulating a physical characteristic of the auditory words
(volume), which instead modulated the N1-P2
complex. Grass et al. (2016) argued that the auditory response evoked by
the emotional content of words is thus distinct from early auditory
evoked potentials. It is important to note that none of the above
studies manipulated the prosodic contours of the auditory words, which
were produced in a neutral manner. Indeed, the emotion conveyed by
spoken words can be transmitted by both semantic content and by the
speaker’s prosody, and the latter can affect earlier components such as
the P2 (cf. Kotz & Paulmann, 2007). Hatzidaki and colleagues (2015)
reported that valenced words evoked an increased late positivity that
emerged after the offset of the auditory stimuli and resembled an LPC. Rohr
and Rahman (2015), in contrast, did not find a reliable effect of
valence on EEG signatures of auditory word processing at either early or
later stages in pre-defined time windows or electrode sites. Post hoc
exploratory analyses revealed a small but reliable increase in
negativity at a central ROI in response to negatively valenced words from
300 to 400 msec. To further explore how valence and arousal may affect
processing in the auditory domain, Kanske and Kotz (2011) contrasted the
cortical response to negatively valenced and neutral auditory
words in a task that involved response conflict. Both the ERP and fMRI
results showed an interaction between response conflict and emotion,
with an increased positivity at anterior sites between 420 and 550 msec
and increased activation in the ventral anterior cingulate cortex when
processing conflict in an emotional context compared to a neutral context.
Taken together, these studies show that the processing of isolated
auditory words pronounced in a neutral tone elicits a rather wide range
of cortical responses, with none of the effects reported for valenced
relative to neutral words occurring earlier than 300 msec.
Valence not only has an immediate impact on processing, as indexed by
changes of neural activity, but its longer-lasting effects on our
cognitive functions, especially memory, have also been studied with
behavioral measures (for a review, see Kensinger & Schacter, 2008) and
ERPs. Indeed, across several types of visual stimuli, including images
(Jaeger et al., 2022), faces (Johansson et al., 2004), sentence contexts
(Maratos et al., 2001) and isolated words (Leclerc & Kensinger, 2011),
people tend to remember highly arousing, emotionally valenced items
better than low-arousal, neutral items. Several ERP studies have also
been conducted to understand the neural underpinnings of how recognition
memory is modulated by
emotional valence. A common experimental design is the “study-test”
paradigm, in which participants are first presented with a set of
stimuli they are instructed to memorize, followed by a test phase during
which they are asked to indicate whether items are old or new. Windmann
and Kutas (2001) used this paradigm under the hypothesis that valence
would bias participants’ recognition memory, leading them to both
correctly identify and falsely recognize more negatively-valenced words
as old, which would in turn impact the ERP response. Their hypothesis
was borne out behaviorally (see Inaba et al., 2005 for similar behavioral
evidence). In contrast, valence produced no effect on ERPs prior to 450
msec and only a limited effect on the LPC. Using a similar design, Inaba
and colleagues (2005) reported an “increased positivity” starting at
150 msec and continuing through 700 msec for correctly identified
negative and positive words (new and old) compared to neutral words. The
difference between the two studies thus lies in the ERP signature, which
showed an early effect of valence, most likely related to the N400, in
Inaba et al. (2005) but only a late effect, linked to the LPC, in
Windmann and Kutas (2001). Santaniello and colleagues (2018) employed a
short and long lag repetition priming paradigm to examine the influence
of valence both behaviorally and on ERPs. They demonstrated that,
compared to neutral and positive words, repeated negative words elicited
a reduced N400 in central-posterior regions, suggesting a stronger
episodic trace for these words. Critically, the facilitation was
short-lived as the reduced N400 was significant only for very short lag
repetition. Auditory stimuli, both linguistic (Schirmer, 2010) and
non-linguistic (Alonso et al. 2015), have also been used in behavioral
research to investigate the effects of valence on recognition memory.
Schirmer (2010) showed that emotional prosody of auditory neutral words
modulated the subsequent valence ratings of written words, but did not
increase their recognition accuracy. In sum, previous ERP research on
the effect of valence on the recognition of printed words has produced
inconsistent results. The present study aimed to further address this
question.
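The distinction drawn above between remembering more items and being biased to call items "old" can be made concrete with signal detection measures. The sketch below is an illustration under stated assumptions, not the scoring procedure of any study cited here: the function name and the response counts are hypothetical, and the log-linear correction (adding 0.5 to each count) is one common convention among several. Sensitivity (d') indexes how well old and new items are told apart, while the criterion (c) indexes the overall tendency to respond "old"; a more negative c means a more liberal bias.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one old/new recognition condition.

    Rates use a log-linear correction so that rates of exactly 0 or 1
    do not produce infinite z-scores.
    """
    n_old = hits + misses
    n_new = false_alarms + correct_rejections
    hit_rate = (hits + 0.5) / (n_old + 1)
    fa_rate = (false_alarms + 0.5) / (n_new + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion


# Hypothetical counts for 40 old and 40 new words per condition.
# These numbers are invented for illustration, not data from any study.
d_negative, c_negative = sdt_measures(
    hits=32, misses=8, false_alarms=14, correct_rejections=26
)
d_neutral, c_neutral = sdt_measures(
    hits=28, misses=12, false_alarms=8, correct_rejections=32
)

print(f"negative: d' = {d_negative:.2f}, c = {c_negative:.2f}")
print(f"neutral:  d' = {d_neutral:.2f}, c = {c_neutral:.2f}")
```

In this toy example the negative condition has both more hits and more false alarms than the neutral condition, which shows up as a lower criterion rather than as higher sensitivity.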
To our knowledge, no research to date has tested whether the valence of
auditory words enhances the subsequent recognition of written words, nor
how valence may modulate the underlying neural mechanisms. To address these
questions, we conducted an ERP experiment where participants were
presented with positive, negative and neutral words in auditory format,
and were later tested for their recognition of these words in written
format. Based on previous literature, we hypothesized that participants
would show enhanced behavioral recognition for valenced words, compared
to neutral words. In relation to ERPs, previous research on auditory
processing of valenced words has produced mixed results such that no
clear hypotheses can be made. For written words, we predicted that
valenced stimuli would induce increased early attention, as indexed
by the P2 or EPN. We also predicted facilitated processing of valenced
words, with valence defined both by pre-existing norms and by the ratings
of individual participants.
EXPERIMENT 1
The first experiment had three main objectives. The first was to expose
participants to the set of stimuli, presented as individual spoken words
in Mandarin. The second was to reexamine the effect of valence and
arousal on the cortical processing of spoken words as evidenced by ERPs.
Only valence and arousal were manipulated; prosody was neutral for all
stimuli. The third aim was to provide a cross-linguistic validation of
the valence of the auditory stimuli, originally rated in English, by
asking participants to rate each item on a 5-point scale, from negative
to positive, in Mandarin.