The academic database searches identified 2790 references (after duplicate removal), and the first five pages of the Google search identified 71 references (after duplicate removal). The search results were supplemented with 2 known relevant references from SC’s personal library that had not been identified by either the academic or Google searches. The number needed to read (NNR) to identify a primary paper from the combined academic databases, Google search and SC’s personal library was 286.3 ((2790 + 71 + 2) refs / 10 primary refs).
Database Comparison
Figure 2 shows the databases where primary papers were found in the searches. It also shows the databases that held these references and could have been located with ‘the optimum search’ but were not necessarily identified with our search. Seven of the ten included references were identified in the Google searches [27-33]. One of these [34] was also found in the database search. Therefore, six references were uniquely found via the Google searches. The NNR for the Google searches was 10.1 references.
Across the academic databases, the CINAHL search identified two primary papers. Several database searches were ‘redundant’: HMIC did not identify any relevant references, and The Cochrane Library, EMBASE and Medline searches found two duplicates [34, 35] that were also found in the CINAHL search. A further two primary papers were identified in SC’s personal library [36, 37]. Using our search strategies, the 10 primary papers could have been identified using three sources (CINAHL, Google and SC’s personal library) rather than the seven sources searched (Figure 3). The NNR for the academic searches was 1395, while for the combined academic databases and Google search it was 286.3.
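The NNR values reported above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming that NNR is simply the number of records screened divided by the number of primary papers those records yielded; the function name `nnr` is illustrative and not part of the original analysis.

```python
def nnr(records_screened: int, primary_papers_found: int) -> float:
    """Number needed to read: records screened per primary paper found."""
    if primary_papers_found <= 0:
        raise ValueError("NNR is undefined when no primary papers are found")
    return records_screened / primary_papers_found


# Counts as reported in the text above.
academic = nnr(2790, 2)            # academic databases: 2 primary papers
google = nnr(71, 7)                # first five pages of Google: 7 primary papers
combined = nnr(2790 + 71 + 2, 10)  # all sources together: 10 primary papers

print(f"academic={academic:.1f}, google={google:.1f}, combined={combined:.1f}")
# prints "academic=1395.0, google=10.1, combined=286.3"
```

A lower NNR means fewer records have to be screened for each primary paper identified.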
Reasons for non-retrieval in existing searches
Figure 1 shows that some primary paper references were available in CINAHL, EMBASE, Medline and Google, but were not identified by the original searches. Table 1 lists the reasons for non-retrieval. All 10 primary references were available in Google; however, only 7 appeared in the first five pages of search results from the original two Google searches.
When the original Google searches were replicated in 2019, the 10 primary references were found within the first 12 pages of Google (i.e. the first 216 records across the two searches, assuming an average of 9 references per page in Google search results). This indicates that if a larger set of results (216) had been screened from the original searches, all 10 primary papers could have been identified with this one source. However, this is an estimate, as we do not have the data for a download of the first 12 pages from the search in 2017.
Five references were not indexed in the academic databases we searched (Table 1). These were a guideline [37], blog item [29], conference abstract [31], book chapter [33] and one journal article [36]. At the time of the original search, the journal reference record was available in PubMed and ‘Ovid Medline In-Process & Other Non-Indexed Citations’ but not in the version of Ovid Medline searched (Ovid MEDLINE(R) <1946 to April Week 3 2017>). Of the remaining 5 primary references, CINAHL contained 5 but only retrieved 2, EMBASE contained 4 but only retrieved 1, Ovid Medline (1946-present) contained 4 but only retrieved 1, and The Cochrane Library contained and retrieved 1. The references found in EMBASE, Medline and The Cochrane Library were duplicates of the references found within CINAHL. Three references [27, 28, 32] were not identified by the academic database searches despite being available in at least one database. For two of these [28, 32], the cause was the search limit ‘Adult’ or ‘Aged’, for which the references had not been indexed. In addition, search terms (index terms and free-text words) used in the search concept for ‘theory and publication types’ were not present in some database indexing records [27, 28, 32].
The optimum search, with the lowest NNR (21.8), was the original Google search (‘Pressure ulcer risk assessment’ and ‘Pressure ulcer risk assessment tools’). To retrieve all 10 primary references, we extended the records screened from five pages (approx. 36 unique records) per Google search to at least 12 pages (approx. 108 unique records) per Google search.
Replicability of Google and Google Scholar searches
The results of the replicability searches in Google, where the first five pages were screened by 2 researchers working independently, were similar but not identical. For search (i) ‘Pressure Ulcer Risk Assessment’, 11 results were on the same page for each researcher but in a different order, and six references were found by one but not both researchers. For search (ii) ‘Pressure Ulcer Risk Assessment Tool’, 10 results appeared in a different order for each researcher, and 22 results were found by one but not both researchers.
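A comparison of this kind can be sketched in code. The example below is illustrative only: the function `compare_result_lists` and the record identifiers are hypothetical, the lists are invented and are not the study data, and the sketch compares exact rank positions rather than result pages, which is a simplifying assumption.

```python
def compare_result_lists(a: list[str], b: list[str]) -> dict[str, int]:
    """Compare two ranked result lists (each assumed to hold unique records)."""
    pos_a = {record: i for i, record in enumerate(a)}
    pos_b = {record: i for i, record in enumerate(b)}
    shared = set(pos_a) & set(pos_b)
    # Shared records whose rank differs between the two researchers.
    reordered = sum(1 for record in shared if pos_a[record] != pos_b[record])
    return {
        "shared": len(shared),
        "shared_but_reordered": reordered,
        "found_by_one_only": len(set(pos_a) ^ set(pos_b)),
    }


# Hypothetical result lists for two researchers running the same search.
researcher_1 = ["r1", "r2", "r3", "r4", "r5"]
researcher_2 = ["r1", "r3", "r2", "r6", "r5"]

summary = compare_result_lists(researcher_1, researcher_2)
print(summary)
# prints "{'shared': 4, 'shared_but_reordered': 2, 'found_by_one_only': 2}"
```

Identical lists would give zero for both `shared_but_reordered` and `found_by_one_only`.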
Comparison searches were undertaken to test whether Google Scholar performed better than Google searches for identifying our ten primary papers, and to check replicability. The first five pages for each search (76 records in total) were downloaded. The search results from both researchers were exactly the same, indicating strong replicability. However, in both the ‘Pressure Ulcer Risk Assessment’ and the ‘Pressure Ulcer Risk Assessment Tools’ searches, only one of the ten primary papers was identified [34]. In this case study, Google Scholar was inferior to Google for finding reports that identify programme theories.
Optimising the Search
In light of our experience in this case study, we used a combined but adapted approach to help mitigate the publication bias that can arise from using only one source for publications or, in the case of Google searching, an ‘Internet Research Bubble’ [38]. This comprised the existing Google search, extended to page 12, and an optimised academic database search restricted to CINAHL, the most relevant database (it contained the most included studies (5), with the other databases containing only a subset of these 5). We also optimised the precision of the CINAHL search by searching only in the title field, and not the abstract or keywords, for the phrases “pressure ulcer” or “pressure ulcers” AND “risk assessment”. This would identify the same five available primary references from CINAHL with fewer abstracts needing to be screened. To demonstrate this, we ran the specific phrase search in CINAHL for the same time period as our original search (search results were limited to studies published up to May 2017, when the original CINAHL search was conducted); 186 papers were identified, with an NNR of 18.6.
The Google and CINAHL optimised searches retrieved 402 references with an NNR of 40.2, whereas the original searches retrieved (2790 + 71 + 2) 2863 references with an NNR of 286.3. The researcher would save time by screening 2461 (2863 – 402) fewer references, and also by not developing complex search strategies and downloading records from four academic databases. If we estimate that it takes on average 2 minutes to screen each paper, with 10% being screened by a second reviewer, this would save approximately 90 hours of researcher time. This illustrates the large difference in efficiency between the optimised and original searches, though we acknowledge that it is impossible to develop such an optimised search when developing a literature search at the start of a project, as the included studies the search needs to find are not yet known. However, the results indicate some search methods worthy of further testing and research for more efficient identification of programme theory.
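The estimated time saving can be checked with a short calculation. This is a minimal sketch using the figures stated above; it assumes that the second reviewer also takes 2 minutes per record, which the text does not state explicitly, and the variable names are illustrative.

```python
original_refs = 2790 + 71 + 2    # references retrieved by the original searches
optimised_refs = 402             # references retrieved by the optimised searches
minutes_per_record = 2           # estimated average screening time per record
second_reviewer_fraction = 0.10  # share of records screened a second time
# Assumption: the second reviewer also needs 2 minutes per record.

refs_saved = original_refs - optimised_refs
minutes_saved = refs_saved * minutes_per_record * (1 + second_reviewer_fraction)
hours_saved = minutes_saved / 60

print(f"{refs_saved} fewer references, about {hours_saved:.0f} hours saved")
# prints "2461 fewer references, about 90 hours saved"
```

Under these assumptions the saving is just over 90 hours, consistent with the approximate figure given in the text.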