Simulations

The goal of the simulations with a sensory model was to track down possible sensory components in our experimental material that might prevent us from concluding in favor of cognitive priming in melody processing. The sensory model used was Leman's model of auditory short-term memory (Leman, 2000). The rationale of this model is to transform sound stimuli into two auditory images, representing respectively the immediate pitch percept (local image) and a more integrated pitch image computed on a longer time scale (global image), and to calculate the correlation between these two images at a given time point t. The main components of this model are depicted in Figure 3 (for computational details, see Leman et al., 2000). The first stage of the model transforms an acoustical signal into patterns of neural firing-rate codes in the auditory nerve. This first stage simulates the filtering processes of the outer and middle ear, the resonance of the basilar membrane (i.e., inner ear), and the coding of the temporal dynamics of the auditory neurons (i.e., the hair cells). The resulting patterns of firing probabilities are then processed by a temporal model of pitch perception. A periodicity analysis, by means of a windowed autocorrelation function7, is performed for each of the auditory channels resulting from the first stage. The resulting periodicity patterns are then summed over the channels, yielding a single autocorrelation pattern for each time window. This autocorrelation pattern is a pitch image that represents the common periodicity across the auditory channels in the frequency region of 80-1250 Hz, which is the range assumed to contain the most important pitch periodicity information (Leman, 2000).
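To make the periodicity-analysis stage concrete, the following sketch computes a pitch image as the sum of windowed autocorrelations over the auditory channels. It is a minimal illustration under stated assumptions, not Leman's implementation: the firing-probability array, sampling rate, and window length are placeholders, and only the lag range corresponding to 80-1250 Hz is retained.

```python
import numpy as np

def pitch_image(channels, fs, frame_step=0.060, min_f=80.0, max_f=1250.0):
    """Sum of windowed autocorrelations over auditory channels (sketch).

    channels : array (n_channels, n_samples) of firing probabilities per
               auditory channel (placeholder for the first-stage output)
    fs       : sampling rate of those signals in Hz (assumed value)
    Returns an array (n_frames, n_lags): one summed autocorrelation
    pattern per frame, restricted to lags in the 80-1250 Hz range.
    """
    hop = int(frame_step * fs)         # frames shifted by 60 ms (Leman, 2000)
    win = 2 * hop                      # window length: an assumption of this sketch
    min_lag = int(fs / max_f)          # shortest period of interest (1250 Hz)
    max_lag = int(fs / min_f)          # longest period of interest (80 Hz)
    n_frames = max((channels.shape[1] - win) // hop + 1, 0)

    image = np.zeros((n_frames, max_lag - min_lag))
    for t in range(n_frames):
        seg = channels[:, t * hop : t * hop + win]
        for i, lag in enumerate(range(min_lag, max_lag)):
            # autocorrelation at this lag, summed over all channels
            image[t, i] = np.sum(seg[:, :win - lag] * seg[:, lag:])
    return image
```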

In the short-term memory component of this model, the pitch image is the input, and an image based on a leaky integrator (called echoic image) is the output. The leaky-integrated image is updated at each time step by adding a certain amount of the previous image to the new incoming pitch image. The specified echo defines the amount of context that is taken into account. With a very short half-decay time (e.g., a short echo of 0.5 s) representing a very short context, a Local Pitch Image is obtained (i.e., an immediate pitch percept). With a longer half-decay time (e.g., an echo of 1.5 s) taking into consideration more of the preceding contextual information, a Global Pitch Image is obtained. Both local and global images are pitch images of the auditory short-term memory.
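A minimal sketch of this leaky integration is given below, assuming a 60-ms frame step and a decay factor chosen so that the stored image halves after the specified half-decay time; the input pitch images are placeholders.

```python
import numpy as np

def leaky_integrate(pitch_images, half_decay, frame_step=0.060):
    """Echoic (leaky-integrated) pitch images for a given half-decay time.

    pitch_images : array (n_frames, n_lags), one pitch image per frame
    half_decay   : half-decay time in seconds (0.5 s -> local, 1.5 s -> global)
    frame_step   : time between successive frames (60-ms shift assumed here)
    """
    # decay factor such that the stored image halves after `half_decay` seconds
    decay = 0.5 ** (frame_step / half_decay)
    echoic = np.zeros_like(pitch_images)
    previous = np.zeros(pitch_images.shape[1])
    for t, img in enumerate(pitch_images):
        previous = decay * previous + img   # add a fraction of the previous image
        echoic[t] = previous
    return echoic

# usage sketch with placeholder pitch images
frames = np.random.rand(200, 100)
local_images = leaky_integrate(frames, half_decay=0.5)    # immediate pitch percept
global_images = leaky_integrate(frames, half_decay=1.5)   # longer context
```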

Finally, the correlation coefficient between the local and global pitch images is calculated at each time point. This index, referred to as tonal contextuality, gives the current "tension" of a local pitch image with respect to the global pitch image. Tonal contextuality reflects the degree of similarity between local and global images: higher tonal contextuality values reflect higher similarity between the two. Because the longer echo of the global image integrates the tone events of the context over a longer time period, global images contain more residual information about past tone events than local images do. Higher tonal contextuality values thus reflect a better fit of the local pitch information with the global context. In sum, the model takes acoustic information over time into consideration and processes it according to the characteristics of the peripheral auditory system. It does not include any cognitive components, so any observed differences between related and less-related conditions are based on sensory features only.
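Assuming local and global images like those produced by the sketch above, the tonal contextuality at each frame is simply the correlation coefficient between the two images:

```python
import numpy as np

def tonal_contextuality(local_images, global_images):
    """Frame-by-frame correlation between local and global pitch images."""
    tc = np.empty(len(local_images))
    for t, (loc, glo) in enumerate(zip(local_images, global_images)):
        # Pearson correlation: similarity of the local percept to the context
        tc[t] = np.corrcoef(loc, glo)[0, 1]
    return tc
```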

Simulations were carried out for the melodic pairs used in Experiment 1 played with either piano tones or pure tones. If the spectral richness of the piano timbre creates sensory differences between related and less-related melodies that go beyond the controlled tone repetition, tonal contextuality values should be higher for related targets than for less-related targets. However, for pure tones, no difference in tonal contextuality values should be observed between related and less-related conditions.

Figure 3. Schematic diagram of the peripheral auditory model used in the present simulations (LPF = low-pass filter, BPF1 to BPF40 = band-pass filters, HCM = hair cell model, ANI = auditory nerve image, A-C = autocorrelation function, PI = pitch image, EM = echoic memory, GPI = Global Pitch Image, LPI = Local Pitch Image, TC = Tonal Contextuality).
Notes
7. This function compares a signal with temporally delayed versions of itself, in order to find the delay that produces the highest correlation between the signal and its delayed copy. It indicates periodicities, or repetitions, present in the signal. The time windows used are shifted by 60 ms (Leman, 2000).