Evaluation of the reliability, utility, and quality of the lid loading videos on YouTube

This study aimed to determine the utility, reliability, and quality of lid loading videos on YouTube, a video sharing platform. YouTube searches were made with the keywords ‘Eyelid Loading,’ ‘Gold Weight Implantation,’ and ‘Lid Loading for Lagophthalmos’ (without user login, with cleared search history, in an incognito tab). A total of 75 videos were recorded. Video length (seconds), number of views, uploading source (doctor/health institution/medical channel), number of subscribers, number of likes, time since upload (days), video content (surgical/theoretical information), and type of narration (verbal narration/subtitles) were recorded. DISCERN, Journal of the American Medical Association (JAMA), and Global Quality Scores (GQS) of the videos were evaluated and recorded by two experienced oculoplastic surgeons (KSC, HT). After applying the exclusion criteria, the remaining 46 videos were included in the study. The mean DISCERN score was 25.17 ± 6.88 (very poor quality), the mean JAMA score was 0.79 ± 0.63 (very poor quality), and the mean GQS was 2.84 ± 1.03 (medium quality). Thirty videos (65.2%) had verbal narration, and 16 videos (34.8%) had subtitled narration. The DISCERN score and GQS were significantly higher in videos with verbal narration than in videos with subtitled narration (p < 0.05). All three scores were positively correlated with each other. There was also a positive correlation between video length, number of subscribers, and the DISCERN score. The videos about lid loading on YouTube are of poor reliability, accuracy, and educational quality. Video duration and narration type can be taken into account when choosing a video. Experts must review the content that is uploaded to websites such as YouTube.


Introduction
The Internet has made access to information easier. As the Internet accelerates access to information, searches for health-related topics have also increased. Doctors use videos in their own specialties to expand and update their current knowledge, and patients use the Internet to obtain information about their own health issues. However, much of the material on the Internet contains incorrect or incomplete information. This necessitates the evaluation of Internet content by authorities in the field.
YouTube is the most widely used video sharing website worldwide and the second most visited website after Google [1]. Every minute, 500 h of content are added to the website, and a total of 2 billion different people visit YouTube monthly [2]. Because of its frequent use, YouTube disseminates information to the public. However, since uploading is unrestricted, there is a risk of misinformation spreading from uncertain sources. Today, YouTube videos are used by doctors, medical students, allied health personnel, and patients to obtain information [3,4].
Lagophthalmos is the incomplete closure of the eyelid and may cause corneal exposure complications such as punctate epitheliopathy, corneal scarring, epithelial defects, keratitis, and even perforation [5,6]. Conservative approaches such as artificial tears, ophthalmic ointment, and eyelid taping can be applied for lagophthalmos [7]. Lid loading is an alternative surgical treatment for lagophthalmos due to facial palsy [8]. In this treatment method, a weight is implanted in the eyelid to prevent the complications of lagophthalmos. Gold or platinum weights are used during the operation [9]. Since this surgery requires advanced expertise and is not performed frequently during residency, inexperienced surgeons try to acquire the technical skills by watching previously recorded surgery videos. Patients also use the Internet to obtain detailed information about their health conditions, and YouTube is a resource often used for this purpose. Higher-quality videos convey more information, and vice versa. Therefore, we aimed to assess the reliability, utility, and quality of lid loading videos on YouTube. To our knowledge, no research has been done for this purpose so far, and our study is the first.

Materials and methods
Institutional review board permission was not required for this retrospective, cross-sectional, register-based study. On May 19, 2022, a search was done on YouTube (www.youtube.com) with the keywords 'Eyelid Loading,' 'Gold Weight Implantation,' and 'Lid Loading for Lagophthalmos.' The search was done in an incognito browser tab, without logging in and with the search history cleared. The default search setting, "Sort videos by relevance," was used. Seventy-five videos in all were watched. Only English-language videos were included. Videos that were shorter than 30 s, in other languages, duplicated, or unrelated to lid loading were excluded.
The information recorded included the length (seconds), number of views, source (doctor/health institution/medical channel), number of subscribers, number of likes, number of days since the upload date, type of narration (verbal narration/subtitles), and content (surgery/theoretical information) of each video. Videos were viewed and analyzed by two oculoplastic surgeons (KSC, HT).
The DISCERN questionnaire score, the Journal of the American Medical Association (JAMA) benchmark criteria, and the Global Quality Score (GQS) were all utilized to assess the videos' reliability and educational content. Three sections and sixteen questions make up the DISCERN scoring system (Table 1). Each question receives an evaluation from 1 to 5. With reference to medical information in general and treatment in particular, this grading evaluates objectivity and completeness. With the first eight questions, the first section assesses the validity of the source, in this case an online video. In the second section, seven questions are used to grade treatment-related material. A single question in the third section evaluates the videos' overall quality, but this last question is not included in the total score [10]. The excellent (63-75 points), good (51-62 points), fair (39-50 points), poor (27-38 points), and very poor (16-26 points) categories are determined by the DISCERN scoring system, which has a range of 15 to 75 points [11]. The JAMA grading system is a tool for evaluating the quality of knowledge found on health-related websites. It has a total possible score of four points and four criteria (authorship, attribution, disclosure, and currency), with one possible point for each (Table 2). The best quality is indicated by a score of four points [12,13]. A video's informative value was assessed using the GQS grading system. The GQS offers viewers a five-point Likert scale to rate the overall quality of a video's content (Table 3). A score of 1 or 2 points denotes low quality, 3 points denote medium quality, and 4 or 5 points denote high quality. The GQS also mirrors the organization and usability of the data shown in the online video [14].
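The DISCERN arithmetic described above can be summarized in a short sketch (function names are ours, not from the study): the 15 scored items are summed and the total is mapped to the quality bands used here.

```python
def discern_total(item_scores):
    """Sum the 15 scored DISCERN items (the 16th question, overall
    quality, is excluded from the total), each rated 1-5."""
    assert len(item_scores) == 15
    assert all(1 <= s <= 5 for s in item_scores)
    return sum(item_scores)

def discern_category(total):
    """Map a DISCERN total (range 15-75) to the quality bands
    used in this study."""
    if total <= 26:
        return "very poor"
    if total <= 38:
        return "poor"
    if total <= 50:
        return "fair"
    if total <= 62:
        return "good"
    return "excellent"
```

For example, the mean DISCERN score reported below, 25.17, falls in the "very poor" band.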
Version 26.0 of SPSS (SPSS Inc., Chicago, IL) was used for the statistical analysis. The Shapiro-Wilk test was used to evaluate the normality of the data distribution. Continuous variables were reported as means and standard deviations (SD). Student's t test was used for continuous variables with a normal distribution, and the Mann-Whitney U test was used for continuous variables with a non-normal distribution. When appropriate, one-way ANOVA and Kruskal-Wallis tests were employed to compare the parameters among three or more groups. The associations between variables were investigated using Spearman's correlation test. The intraclass correlation coefficient (ICC) and 95% confidence interval (CI) for each score were calculated to measure interobserver reliability. DISCERN, JAMA, and GQS all had ICC values greater than 0.90. A p value less than 0.05 was considered significant.

Table 2 JAMA benchmark criteria
Authorship: Authors and contributors, their affiliations, and relevant credentials should be provided
Attribution: References and sources for all content should be listed clearly, and all relevant copyright information noted
Disclosure: Web site "ownership" should be prominently and fully disclosed, as should any sponsorship, advertising, underwriting, or commercial funding
Currency: Dates that content was posted and updated should be indicated

Table 3 Global Quality Score (GQS)
1: Poor quality; very unlikely to be of any use to patients
2: Poor quality but some information present; of very limited use to patients
3: Suboptimal flow, some information covered but important topics missing; somewhat useful to patients
4: Good quality and flow, most important topics covered; useful to patients
5: Excellent quality and flow; highly useful to patients

Results
Twenty-five videos for each keyword, a total of 75 different videos, were screened for the study. Duplicate videos, videos not in English, and videos unrelated to lid loading were removed. The remaining 46 videos were statistically analyzed. Descriptive statistics are provided in Table 4.
According to the DISCERN score, 34.7% (n = 16) of the videos had very poor quality, 50% (n = 23) had poor quality, 13.1% (n = 6) had fair quality, and 2.2% (n = 1) had good quality, while none of the videos had excellent quality. Considering the distribution of the DISCERN score by section, section 1, which questions the reliability and accuracy of the website, accounted for 65.6% of the total DISCERN score, and section 2, which questions the accuracy of the information about treatment options, accounted for 34.4%. When the high-quality threshold for the JAMA score was set at 3, none of the videos were high quality. While information about authorship was given in 29 of the videos, currency was specified in only six. None of the videos provided information about attribution or disclosure. By GQS, 37% (n = 17) of the videos were of low quality, 39% (n = 18) were of medium quality, and 24% (n = 11) were of high quality.
While 23 videos (50%) showed surgical technique and 10 videos (21.73%) contained theoretical information, 13 videos (28.27%) included both surgical technique and theoretical information. In all the videos showing the surgical technique, the implants were placed in the pre-tarsal area, and none of the videos showed complications. The implant material was mentioned in 35 videos (76.08%): 31 of these (88.57%) mentioned a gold implant, three (8.57%) a platinum implant, and one (2.86%) both implants.
There was verbal narration in 30 videos (65.2%) and subtitled narration in 16 videos (34.8%). In videos with verbal narration, the DISCERN score was 27.16 ± 7.52, the JAMA score was 0.85 ± 0.65, and the GQS was 3.14 ± 0.89; in videos with subtitled narration, the DISCERN score was 21.43 ± 3.14, the JAMA score was 0.68 ± 0.60, and the GQS was 2.31 ± 1.07. The DISCERN score and GQS of the videos with verbal narration were significantly higher than those of the videos with subtitles (p = 0.006 and p = 0.025, respectively). The JAMA score showed no significant difference (p = 0.351).
Nineteen (41.3%) videos were uploaded by doctors, five (10.9%) by health institutions, and 22 (47.8%) by medical channels. In terms of DISCERN, JAMA, and GQS scores, there was no statistically significant difference between these three groups (p > 0.05). Of the doctors who uploaded videos, nine were ophthalmologists (47.4%), eight were otolaryngologists (42.1%), and two were plastic surgeons (10.5%). Again, there was no statistically significant difference in the DISCERN score, JAMA score, or GQS between these three groups (p > 0.05).
All three scores were positively correlated with each other. Video duration and number of subscribers correlated significantly with the DISCERN score, but not with the JAMA score or GQS (Table 5). There was no significant correlation between the quality scores and the other video parameters.

Discussion
YouTube is a popular resource on medical topics and can be used by doctors and patients for educational purposes. However, since it is a free sharing site, anyone can share and spread false information. This creates the need to question the reliability and accuracy of the material conveyed by such sources. Our study evaluated the usefulness, reliability, and educational quality of videos on lid loading surgery.
There have been many studies in the field of ophthalmology questioning the quality of videos on YouTube [15-17], including previous YouTube studies of ptosis surgery and upper lid blepharoplasty in the field of oculoplastics [18,19]. To our knowledge, however, ours is the first study on lid loading. Upper eyelid loading is one of the frequently used surgical treatment options for paralytic lagophthalmos. Rigid implants made of gold or platinum, as well as flexible implants, can be used, and implants can be placed in different locations as a single piece or in multiple pieces [20,21]. Post-implantation complications such as foreign body reaction, allergic reaction, redness, astigmatism, entropion, and implant exposure may occur [22-24].
The DISCERN score measures the accuracy and reliability of information on websites that provide information about treatment options. It can be used by those who make the treatment decision or those who receive the recommended treatment [25]. In our results, we found a mean DISCERN score of 25.17 ± 6.88 (very poor quality). When we analyzed the scores by section, 65.4% of the points came from Section 1 and 34.6% from Section 2. Section 1 evaluates the reliability and accuracy of the website, while Section 2 questions the quality of information about treatment options. The main reason for the low DISCERN score was that the videos dealt only with lid loading and did not include other treatment options for lagophthalmos. The fact that, in the video with the highest DISCERN score, more than half (58%) of the points came from Section 2 also supports this. Similarly, the lack of mention of implant types, implantation locations, and complications causes low scores even in videos in which a good surgery is performed. Additionally, there was a positive correlation between the DISCERN score and both the length of the video and the number of subscribers. Longer videos are to be expected, since time is needed to show the surgical steps in more detail or to show and explain details such as examination, indication, postsurgical follow-up, and complications; therefore, longer videos can be of higher educational quality. However, when we analyzed the videos shorter than the average duration one by one, we found that seven had a DISCERN score above the average. It should therefore be kept in mind that if the information conveyed is accurate and clear, videos can be of high quality even when they are short.
Authorship, attribution, disclosure, and currency are the four factors that the JAMA benchmark criteria, first published by Silberg et al. [26], use to assess the reliability of an information source. Authorship requires that the affiliations and credentials of the writers and collaborators be disclosed. Attribution refers to the inclusion of all pertinent references, sources, and copyright information.
Disclosure evaluates the extent to which website "ownership" and any sponsorship, advertising, financial support from businesses, or conflicts of interest are stated. To satisfy currency, the website must state the dates on which the material was uploaded and modified. One point is given for each criterion satisfied, up to a maximum score of 4. A score of 4 denotes a reputable source, while a score of 0 denotes a doubtful source. In our study, we found a mean JAMA score of 0.79 ± 0.63 (very poor quality): authorship was stated in 29 videos, and currency was mentioned in only six, but attribution and disclosure were not mentioned in any of the videos. This reduces the reliability and academic value of the videos and explains the low JAMA scores.
The GQS is a five-point Likert scale that evaluates the accuracy, usability, and flow of information available online. In our results, we found a mean GQS of 2.84 ± 1.03 (medium quality), indicating that the information flow in these videos is suboptimal, that some information is covered but important topics are missing, and that the videos are only moderately beneficial for patients. Two channels of multimedia learning have been described: a visual channel and an auditory channel. Mayer and Moreno noted that videos using both channels can be more effective instructionally and that videos presenting only written text negatively affect learning [27]. De Koning et al. [28], on the other hand, stated that using written text solely to emphasize a piece of information can be beneficial. In our study, consistent with the literature, videos with verbal narration had higher DISCERN scores and GQS. Moreover, none of the videos used subtitles in the manner recommended by De Koning et al.
According to the study by Garip and Sakallioglu [18], the median DISCERN score for ptosis surgery videos was 32.8 ± 10, the median JAMA score was 1.3 ± 0.5, and the median GQS was 3.1 ± 1.1. In another study, Sayin et al. [29] reported mean DISCERN, JAMA, and GQS values of 37.65 ± 10.49, 0.82 ± 0.52, and 2.86 ± 0.86, respectively, for vitreoretinal surgery videos on YouTube. Quality scores in both studies were similar to those in our study, and both concluded that the educational quality of surgical videos on YouTube was low. In contrast, Çetinkaya Yaprak and Erkan Pota found mean DISCERN, JAMA, and GQS values of 42.92 ± 18.14, 2.7 ± 0.73, and 3.07 ± 1.25 for YouTube videos about the treatment of keratoconus and reported that they were educational and informative for patients [16]. This difference may be related to the total number of videos and the specificity of the researched topic. Explaining the surgical stages and method to doctors in a video is more difficult and complex than providing information about a disease to patients, and this may also have contributed to the difference. Videos containing surgical techniques may score lower because DISCERN, JAMA, and GQS are not specifically designed to evaluate material involving surgical techniques.
Our study does have certain limitations. First, only videos uploaded to YouTube within a certain time period were evaluated; because of the dynamic nature of YouTube, videos may be added or deleted, and studies conducted at different times may yield different results. Second, only English-language videos were evaluated, although videos in other languages were very few. Third, we included only 75 videos in our study, and this limited number is too small to be representative of the videos on the Internet. Fourth, the answers to the questions in DISCERN, JAMA, and GQS are subjective, although there was a significantly high correlation between the two independent reviewers. Finally, the scoring methods were insufficient to assess the videos' audio and visual quality; however, audio and visual quality do not affect the quality of the video content and can only affect the number of likes and views.
In conclusion, according to our findings, the educational quality of lid loading videos on YouTube is low. Longer videos and videos with verbal narration may have higher educational quality. Content uploaded to websites such as YouTube needs to be inspected and evaluated by professionals. Doctors and health care professionals who do not have enough information, and patients searching for information about their own health conditions, should be more selective when watching lid loading videos on YouTube.
Author's contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Ali Safa Balci and Kubra Serefoglu Cabuk. The first draft of the manuscript was written by Ali Safa Balci, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.