Two hundred adults (69 females, 3 non-binary) participated in this study. They were mainly university students, with a mean age of 23.86 years (SD = 4.65). One hundred and seventeen participants reported having received some music training in their life (M = 11.08 years, SD = 4.66, range = 3–22 years). All participants had a high level of education: 112 individuals had completed high school, 61 held a bachelor's degree, 26 a master's degree, and one person a higher degree.
The minimum number of participants required to reach 80% statistical power (estimated with G*Power for a repeated-measures ANOVA with three within-subject factors) was 73.
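The G*Power estimate above can also be approximated by Monte-Carlo simulation. The sketch below is purely illustrative and is not the authors' computation: the effect size (Cohen's f = 0.25), the assumed correlation between repeated measures (rho = 0.5), and the function names are our own assumptions, and the resulting power will not exactly match G*Power's analytic result.

```python
import numpy as np
from scipy.stats import f as f_dist

def rm_anova_f(data):
    """F statistic for a one-way repeated-measures ANOVA.
    data: array of shape (n_subjects, k_conditions)."""
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_error = ((data - grand) ** 2).sum() - ss_cond - ss_subj
    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error

def simulated_power(n=73, k=3, f=0.25, rho=0.5, alpha=0.05,
                    n_sims=2000, seed=0):
    """Monte-Carlo power for the within-subject condition effect.
    f is Cohen's f relative to the total score SD; rho is the assumed
    correlation between repeated measures, modelled as a subject
    random intercept."""
    rng = np.random.default_rng(seed)
    # condition means scaled so their population SD equals f (total SD = 1)
    means = np.linspace(-1.0, 1.0, k)
    means = means / means.std() * f
    hits = 0
    for _ in range(n_sims):
        subj = rng.normal(0.0, np.sqrt(rho), size=(n, 1))         # shared across conditions
        noise = rng.normal(0.0, np.sqrt(1.0 - rho), size=(n, k))  # condition-specific error
        F, df1, df2 = rm_anova_f(means + subj + noise)
        hits += F > f_dist.ppf(1.0 - alpha, df1, df2)
    return hits / n_sims
```

Running `simulated_power()` returns the proportion of simulated experiments in which the condition effect reaches significance; varying `n` shows how power changes with sample size under these assumptions.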
Music. The music excerpts we selected had already been used in previous works and categorized with the Geneva Emotional Music Scale (GEMS) [5,8,31]. The GEMS is a domain-specific scale of musical emotions consisting of nine first-order categories (Wonder, Transcendence, Nostalgia, Tenderness, Peacefulness, Joyful Activation, Power, Tension, and Sadness) and three second-order emotion factors (Sublimity, Vitality, and Unease). We used the GEMS because it is based on categories (e.g., tension, nostalgia) rather than on dimensions (i.e., valence and arousal). This distinction seems important because participants' affective state has been found to facilitate the recognition of emotionally congruent stimuli (i.e., words) when both the affective state and the word belonged to the same emotional category, but not when they merely shared the same valence. We selected 15 music excerpts, equally divided into three categories: two second-order categories of the GEMS (i.e., vitality and sublimity) and one first-order category (i.e., tension). The second-order category “unease” (which, besides “tension”, also includes “sadness”) was represented by tension alone, rather than by both tension and sadness, because the intercorrelations between the two have been found to be relatively low. Moreover, we used the higher-order categories sublimity, vitality, and unease (represented here by tension) rather than the nine more specific emotional categories for two reasons. First, the narrower GEMS emotions may be specific to music (e.g., “transcendence” could hardly be elicited by a picture). Second, it would have been nearly impossible to achieve an experimentally viable balance of items/trials, symmetry of picture presentation, and session duration using nine categories.
Loudness was equalized across all excerpts. The complete list of the excerpts used is reported in the supplemental material.
Pictures. In order for music and pictures to be matched in terms of emotion, the pictures were taken from the EmoMadrid database and were pre-tested in a pilot study, in which we asked 98 subjects (divided into four groups, each evaluating 50 pictures) to categorize each picture according to the emotion it represented/induced. Ratings were made on the GEMS-9, with the addition of the category “neutral”. We then selected 80 pictures, equally distributed across the four categories (i.e., tension, sublimity, vitality, and neutral).
The memory experiment was designed specifically for the present study with the program PsychoPy. The first part consisted of the presentation of the music and the pictures and comprised 15 trials, equally distributed across the three GEMS categories (i.e., tension, sublimity, vitality), which belonged to three separate blocks. Each trial consisted of listening to a classical music excerpt for 45 s and then looking at four pictures that appeared on the screen for 2 s. One picture was always congruent with the emotion expressed by the music, two pictures were emotionally incongruent with the music, and one was neutral. The order of the blocks was randomized across subjects, as was the order of the excerpts within each block. The assignment of pictures as congruent or incongruent was also randomized across subjects (e.g., picture 1 could be congruent for one subject and incongruent for another). An example of a trial is depicted in Fig. 1. Between the first and the second part, participants had to solve four arithmetic equations and answer questions about their familiarity with and liking of the music. In the second part of the experiment, participants performed the recognition task: 61 pictures were presented one at a time at the centre of the screen, and the subject had to decide whether each picture had already been seen in the first part or not. Thirty-one pictures had already been presented and 30 were new. Among those already presented, 15 were congruent images and 16 were incongruent images, equally distributed across the four categories (i.e., tension, sublimity, vitality, and neutral). Accuracy was computed as the percentage of pictures correctly identified as “already seen” (separately for the congruent and incongruent pictures) and as “new”. The experiment script and the stimuli used are available on the OSF platform.
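The accuracy measure described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code; the trial-record format and field names are our own assumptions.

```python
def recognition_accuracy(trials):
    """Percentage of correct responses in the recognition task.
    trials: list of dicts with 'status' ('old' or 'new'),
    'congruence' ('congruent'/'incongruent' for old pictures, None for new),
    and 'said_old' (True if the subject answered "already seen")."""
    def pct(correct, total):
        return 100.0 * correct / total if total else float("nan")
    old_c = [t for t in trials if t["status"] == "old" and t["congruence"] == "congruent"]
    old_i = [t for t in trials if t["status"] == "old" and t["congruence"] == "incongruent"]
    new = [t for t in trials if t["status"] == "new"]
    return {
        "congruent": pct(sum(t["said_old"] for t in old_c), len(old_c)),    # hits
        "incongruent": pct(sum(t["said_old"] for t in old_i), len(old_i)),  # hits
        "new": pct(sum(not t["said_old"] for t in new), len(new)),          # correct rejections
    }
```

The function returns three separate accuracy scores, mirroring the separate computation for congruent, incongruent, and new pictures described above.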
Affective state. The current affective state was measured with the short form of the Positive and Negative Affect Schedule (I-PANAS-SF) developed by Thompson (2007). It contains 10 items describing different affective states (e.g., “Right now I feel attentive.”), each rated on a five-point Likert scale from “not at all” to “extremely”. The score is computed separately for positive and negative affect.
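The scoring rule can be sketched as below. The item assignment follows the published I-PANAS-SF (Thompson, 2007); the function name and the dict-based response format are our own assumptions for illustration.

```python
# I-PANAS-SF item assignment (Thompson, 2007)
POSITIVE_ITEMS = ["active", "alert", "attentive", "determined", "inspired"]
NEGATIVE_ITEMS = ["afraid", "ashamed", "hostile", "nervous", "upset"]

def score_ipanas_sf(responses):
    """responses: dict mapping each of the 10 items to a rating from
    1 ("not at all") to 5 ("extremely"). Returns (positive, negative)
    affect sums, each ranging from 5 to 25."""
    positive = sum(responses[item] for item in POSITIVE_ITEMS)
    negative = sum(responses[item] for item in NEGATIVE_ITEMS)
    return positive, negative
```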
Emotional intelligence. The Emotional Intelligence Scale was used to assess the subjects' emotional intelligence. The scale includes 33 items rated on a five-point Likert scale (1 = "I strongly disagree" to 5 = "I strongly agree"). An example item is: “I know what other people are feeling just by looking at them.”
Age, gender, education, and musical background (i.e., musical status, years of music training) were assessed with a questionnaire.
Music liking and familiarity. We asked participants to rate on a scale from 1 to 4 how much they liked the musical excerpts presented, as previous studies have suggested that this variable can influence emotional experience. Furthermore, we assessed whether participants were familiar with the musical excerpts, as familiarity also seems to play a role in the experience of musical emotions.
On the starting page of the survey platform (LimeSurvey 2.64.1), participants were informed of the nature of the tasks. This was followed by an informed consent statement that participants had to agree to in order to proceed with the study. Demographic information was then collected, followed by the emotional intelligence and PANAS questionnaires. At the end of the questionnaires, participants were redirected to the platform Pavlovia, where the memory experiment began. The entire session lasted around 25 minutes.
The current study was approved by the ethics committee of the University of Innsbruck.