General discussion

Our study showed the influence of tonal expectations on pitch perception with three experimental tasks: the pitch of related target tones was judged more accurately (i.e., via tuning/mistuning judgments) than the pitch of less-related target tones, even though related targets were judged as more in tune overall (Experiment 1); a correctly tuned target tone was processed faster when tonally related to the context than when less related (Experiment 2); and discrimination sensitivity for slightly mistuned tones was improved when these tones were tonally related to the context (Experiment 3).

The influence of Western tonal context on pitch perception had previously been shown for the detection of mistunings (Francès, 1958; Lynch & Eilers, 1992) and for subjective pitch judgments (Warrier & Zatorre, 2002). This previous research contrasted Western tonal contexts with non-Western contexts (e.g., Lynch & Eilers, 1992) or with single-tone, repeated-tone, and random contexts (Warrier et al., 1999; Warrier & Zatorre, 2002). Our study investigated the influence of tonal context at a more fine-grained level, within a tonality: differences in pitch perception were observed between tones inside a tonal context, notably by comparing target tones close in tonal relatedness. Because we used pairs of almost identical tonal melodies, our data suggest that knowledge-based expectations can elicit different degrees of facilitation within a single tonal context. The findings indicate the strength of tonal knowledge in nonmusician listeners (i.e., listeners without explicit musical training) and stress the importance of implicit knowledge in auditory perception, notably via the influence of tonal expectations on the processing of low-level sensory features such as pitch.

Psychoacoustic research on contextual pitch perception has shown improved pitch processing due to direct cues (Greenberg & Larkin, 1968; Scharf, 1998) and indirect cues (Hafter et al., 1993; Howard et al., 1984). Expectations based on bottom-up patterns (notably following Gestalt properties) also exist in music perception (Narmour, 1990; Schellenberg et al., 2002). Their influence on pitch processing was investigated by Jones, Johnston, and Puente (2006): contextual patterns of pitch intervals influenced the detection of a change in the pitch height of a pattern’s penultimate tone. In our study, these pattern-like influences (e.g., linked to pitch proximity or continuity) were kept constant between the melodies of a pair in order to focus on the influence of listeners’ tonal knowledge. Our study thus took the investigation of central factors in pitch processing one step further by showing an effect of tonal knowledge; that is, a top-down effect of a set of abstract features that are contingent neither on the experimental session nor on the immediately preceding tone pattern.

Some top-down influences on pitch processing have been reported with a cued two-alternative forced-choice (2AFC) task (Green & McKeown, 2001, Experiment 3). The percentage of valid cues (i.e., cues with the same frequency as the signal) was either 75% or 25%, making the cues informative or uninformative, respectively, about the frequency of the signal. The observed cueing effect was larger when cues were informative. The authors suggested that listeners intentionally focused on the frequency of the cue when it was likely to predict the signal frequency, resulting in more accurate pitch processing. These top-down processes were contingent on the experimental session and deliberate. By contrast, the top-down processes in our study rest on tonal knowledge, which results from long-term exposure to the statistical regularities of Western music and is stored in long-term memory.

Another difference from Green and McKeown’s study is that, in our study, listeners’ tonal knowledge did not provide direct information about the task and could not serve to develop task-related strategies. The fact that tonal relatedness was irrelevant to the tasks argues for the automaticity of the induced top-down processes. This is in agreement with harmonic priming data showing that chord processing is influenced automatically by schematic expectations despite conflicting veridical expectations (Justus & Bharucha, 2001; Tillmann & Bigand, 2004).

In psychoacoustic studies, contextual pitch perception effects have been explained in terms of attentional bands or filters formed around the cued frequency: a cue draws listeners’ attention to a narrow frequency band centered on the cue frequency, and detection performance is best at the cue frequency and decreases progressively with increasing frequency distance between cue and target (Greenberg & Larkin, 1968; Scharf, 1998; Scharf, Quigley, Aoki, Peachey, & Reeves, 1987). Multiple attentional bands can be formed when multiple cues are presented (e.g., Scharf, 1998), and bands can be based on specific relations to a cue (e.g., musical fifths; Hafter et al., 1993). Although our study differs in many respects from these psychoacoustic studies, the concept of attentional filters may be applied here: tonal knowledge may lead to an attentional band centered on the tonic pitch, providing improved processing accuracy for the tonic pitch (in comparison to the subdominant pitch). In our experimental material, the pitch of the targets differed between melodies, and the pitch of the tonic depended on the key instilled by the melodic context. The observed influence of tonal relatedness on pitch processing thus suggests that attentional bands can change from trial to trial on the basis of the contextual tonal center. Consistent with this framework, the relevance of attentional bands for music perception has previously been discussed by Bharucha (1996) for the phenomenon of melodic anchoring.

Our data showed top-down influences on pitch processing in musical contexts. Since performance in behavioral tasks reflects the outcome of several processing stages, the question remains whether the performance benefit is a direct consequence of knowledge-based processes acting on low-level, perceptual processes, whether it is mediated by attentional processes (which then influence perceptual processes), whether it is a consequence of decisional or response-related processes, or whether it is a mixture of these processes. This question is not specific to music perception, but applies also to visual perception and to other domains of auditory perception. For speech perception, the question is whether the influence of lexical knowledge is restricted to post-perceptual decision processes or extends to pre-lexical perceptual processes (see McClelland, Mirman, & Holt, 2006; McQueen, Norris, & Cutler, 2006; Mirman, McClelland, & Holt, 2006). Using a behavioral approach, Samuel (2001), for example, provided evidence for the influence of top-down, lexical processes on selective adaptation, a perceptual phenomenon based on repeated sound processing. To separate perceptual sensitivity from response bias, signal detection theory (SDT) has been applied to analyses of discrimination performance. Data from sensitivity measures have been taken as evidence for the influence of attention on early perceptual processes in visual perception (Correa, Lupiáñez, & Tudela, 2005; Hawkins, Hillyard, Luck, Mouloua, Downing, & Woodward, 1990; Luck, Hillyard, Mouloua, Woldorff, Clark, & Hawkins, 1994). For example, temporal expectations increased sensitivity in a perceptual discrimination task without changing the response criterion (Correa et al., 2005). In our study, the SDT analyses using area scores (Experiment 1 and particularly Experiment 3) indicate top-down influences (here based on tonal knowledge) on bias-free discrimination performance, and thus suggest influences at perceptual levels.
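For illustration, a bias-free area score can be obtained from a single pair of hit (H) and false-alarm (F) rates with the nonparametric index commonly denoted A′, which estimates the area under the ROC curve; we give it here only as an example of this family of measures (for H ≥ F), not necessarily as the exact score computed in our analyses:

A' = \frac{1}{2} + \frac{(H - F)\,(1 + H - F)}{4\,H\,(1 - F)}

For instance, H = .80 and F = .30 yield A′ ≈ .83, H = F yields A′ = .5 (chance), and H = 1 with F = 0 yields A′ = 1 (perfect discrimination); like other area scores, A′ is designed to index sensitivity independently of shifts in the response criterion.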

While it is difficult to provide evidence for specific processing levels by means of behavioral methods, measuring neurophysiological responses to expected and unexpected stimuli can help to unveil the levels at which top-down processes modulate event processing. Research on visual and auditory perception has shown that early electrophysiological markers reflecting perceptual processing can be influenced by top-down processes (e.g., Correa, Lupiáñez, Madrid, & Tudela, 2006). In the visual modality, spatial orienting of attention influenced not only sensitivity in detection performance, but also the amplitude of the N1-P1 complex in the ERP data (Luck et al., 1994). Similarly, temporal attention can influence early visual processing, as reflected in the modulation of N1 components (Correa et al., 2006). For auditory perception, early processing stages (i.e., N1 components) have been modulated by spatial as well as temporal attention (e.g., Hillyard, Hink, Schwent, & Picton, 1973; Lange, Krämer, & Röder, 2006; Lange, Rösler, & Röder, 2003). Selective attention can influence auditory processing not only at the level of the auditory cortex (including the sharpness of tuning curves; Kauramäki, Jääskeläinen, & Sams, 2007), but also at the level of the brainstem and probably the cochlea (Giard, Fort, Mouchetant-Rostaing, & Pernier, 2000).

This brief review of neurophysiological studies shows that top-down influences on perceptual processes have mostly been studied in terms of the effects of attentional processes and expectations on the processing of temporal and spatial information. For music perception, links between musical structures and attention have been proposed in a dynamic theory of attending (Jones, 1987; Jones & Boltz, 1989). In this theory, musical events that are important in the melodic and temporal structure modulate attention over time, directing attentional resources to the processing of structurally important events. In the priming paradigm, more attentional resources would be allocated to tonally related targets than to less-related targets, resulting in facilitation effects (see Bigand et al., 2001; Escoffier & Tillmann, in press). Tonally induced top-down expectations could thus modulate perceptual processes, such as pitch processing, via attention. Our study provided behavioral data suggesting that top-down, knowledge-based processes modulate pitch processing in music perception. Electrophysiological evidence for differences in N1 amplitude as a function of tonal relatedness has recently been reported for tones in music-like contexts, even within the constraints of an MMN paradigm, albeit in an active listening situation (Krohn, Brattico, Välimäki, & Tervaniemi, 2006). Using ERPs with material controlled for sensory expectations, such as the melodies used here, will allow us to further investigate the perceptual processes influenced by cognitive expectations, that is, expectations linked to listeners’ tonal knowledge.