1. Search strategy
YouTube was searched on 7 December 2018 using the default settings, and all relevant video metadata were documented for later assessment (13). Searches were conducted in incognito mode to reduce the influence of location and of the search history of the computer or user on the displayed videos (14). Two search strings, “lumbar puncture medical procedure” and “spinal tap medical procedure”, were entered in the search bar. Quotation marks were used instead of the logical operator “AND” to ensure the concomitant presence of all search terms. The first 100 videos returned for each search string were considered for further analysis.
Videos were included if they were presented in the English language and if they were educational in nature. Duplicate videos, videos about lumbar puncture in children, videos about epidural anesthesia and irrelevant videos (e.g. advertising, presentation of a company) were excluded. All videos were fully viewed to confirm that they included the lumbar puncture procedure. If duplicates of an instructional video were detected, only the video that appeared first on the list was retained for further analysis. All videos that were part of a series were considered as a single file.
To compare the “lumbar puncture videos” with YouTube instructional videos from other categories, two additional search terms were applied: “CPR instructions” (as an example of a different medical intervention) and “baking a cake instructions” (as an example of a more general instructional video).
2. Data collection
The following video characteristics were documented: number of views, number of likes and dislikes, number of comments, video length in minutes and days since upload. The numbers of views, likes, dislikes and comments were expressed relative to the number of days on YouTube (15). To evaluate the popularity of the videos, a Video Power Index (VPI) was calculated: VPI = like ratio × view ratio / 100, with like ratio = likes × 100 / (likes + dislikes) and view ratio = number of views / days on YouTube. This index combines the number of views, user interaction and audience scale.
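The VPI formula above can be illustrated with a minimal Python sketch. The function name, argument names and the handling of videos without any likes or dislikes (like ratio set to 0) are assumptions made for illustration only; the source does not specify how such videos were treated.

```python
def video_power_index(likes: int, dislikes: int, views: int, days_online: int) -> float:
    """Return VPI = like ratio * view ratio / 100, as defined in the text."""
    if days_online <= 0:
        raise ValueError("days_online must be a positive number of days")
    total_votes = likes + dislikes
    # Assumption (not stated in the source): a video with no likes and no
    # dislikes is given a like ratio of 0 to avoid a division by zero.
    like_ratio = likes * 100 / total_votes if total_votes else 0.0
    # View ratio: number of views per day on YouTube.
    view_ratio = views / days_online
    return like_ratio * view_ratio / 100


# Hypothetical video: 90 likes, 10 dislikes, 36,500 views after 365 days.
print(video_power_index(likes=90, dislikes=10, views=36500, days_online=365))  # 90.0
```

In this hypothetical example the like ratio is 90% and the view ratio is 100 views per day, giving a VPI of 90.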
Videos were categorized by source into six groups: patient/individual, government/news agencies, university/school, health care professionals (doctor, hospital), standalone health information websites and other (16,17).
3. Video assessment
Four scoring checklists were used to evaluate the videos. All checklists with detailed item description can be found in the supplementary materials.
The Lumbar Puncture Assessment Tool (LumPAT) was used as the gold standard for the assessment of a lumbar puncture procedure (Supplementary Figure 1). The videos were judged on planning and preparation, performance of the procedure, finalization and communication. A total of 11 items were scored, each from 1 to 5, giving a maximum total score of 55. A pass/fail standard of 44/55 was imposed, as this score has been validated to be consistent with trained medical observers’ global judgments of pass/fail (18).
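As a minimal sketch of how the LumPAT total and the pass/fail decision could be computed, consider the following Python function. The names are hypothetical, and treating a total of exactly 44 as a pass is an assumption; the text only gives the standard as 44/55.

```python
LUMPAT_ITEMS = 11   # number of scored items
PASS_MARK = 44      # pass/fail standard out of a maximum of 55


def lumpat_result(item_scores):
    """Return (total score, passed) for one video's 11 LumPAT item scores."""
    if len(item_scores) != LUMPAT_ITEMS:
        raise ValueError("expected exactly 11 item scores")
    if any(not 1 <= score <= 5 for score in item_scores):
        raise ValueError("each item must be scored from 1 to 5")
    total = sum(item_scores)
    # Assumption: a total equal to the pass mark counts as a pass.
    return total, total >= PASS_MARK


print(lumpat_result([4] * 11))        # (44, True)
print(lumpat_result([4] * 10 + [3]))  # (43, False)
```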
Reliability of information was scored from 1 to 4, based on 4 questions (adapted from the DISCERN tool) (Supplementary Table 1) (19).
The educational quality of each video was rated using the five-point Global Quality Score (GQS), modified from Singh et al. (Supplementary Table 2) (19).
Attractiveness and understandability (AU) were scored using an eight-point modified PEMAT (Patient Education Materials Assessment Tool) score, adapted from the original PEMAT score for the evaluation of the understandability and actionability of print and audiovisual materials (20). Two organizational aspects, three visual aspects and three auditory aspects were scored as present (1) or absent (0) (Supplementary Table 3).
4. Statistical analyses
All data were analyzed using SPSS software (version 25; SPSS Inc., Chicago, IL, USA). For each video, the median of the numeric evaluations of the four investigators was used for statistical analysis. Normality was assessed using the Shapiro-Wilk test (21).
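The aggregation of the four investigators' ratings into a single value per video can be sketched as follows. This is an illustration in Python rather than the SPSS procedure actually used, and the function name and example ratings are hypothetical.

```python
from statistics import median


def consensus_score(rater_scores):
    """Return the median of the four investigators' scores for one video."""
    if len(rater_scores) != 4:
        raise ValueError("expected one score from each of the four investigators")
    # With an even number of raters, the median is the mean of the two
    # middle values, so half-point results are possible.
    return median(rater_scores)


print(consensus_score([3, 4, 4, 5]))  # 4.0
print(consensus_score([2, 3, 4, 5]))  # 3.5
```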
Descriptive statistics are reported as mean and standard deviation (for normally distributed variables), median and interquartile range (IQR; for non-normally distributed variables), minimum and maximum values, and proportions (for categorical variables). Finally, linear regression analyses were performed. A P-value < 0.05 was considered statistically significant.
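The choice of summary statistics described above can be sketched in Python with the standard library. This is an illustrative assumption, not the SPSS procedure used in the study: the normality decision is passed in as a flag (the Shapiro-Wilk test itself is not reimplemented here), and the "inclusive" quartile method may give slightly different IQR bounds than SPSS.

```python
from statistics import mean, median, quantiles, stdev


def describe(values, normally_distributed):
    """Summarize a numeric variable according to its distribution."""
    summary = {"min": min(values), "max": max(values)}
    if normally_distributed:
        # Normally distributed: mean and sample standard deviation.
        summary["mean"] = mean(values)
        summary["sd"] = stdev(values)
    else:
        # Not normally distributed: median and interquartile range (Q1, Q3).
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary["median"] = median(values)
        summary["iqr"] = (q1, q3)
    return summary


# Hypothetical scores for five videos.
print(describe([1, 2, 3, 4, 5], normally_distributed=False))
```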