Quality assessment of guidelines
The ICC values for appraisal of the identified guidelines ranged from 0.81 to 0.97, indicating good agreement among appraisers. The overall quality of the included CPGs was moderate, with the domain ‘clarity of presentation’ receiving the highest score and the domain ‘applicability’ receiving the lowest score (Table 2, Additional file 4).
Scope and purpose
The domain ‘scope and purpose’ received a median score of 69.44% (IQR: 35.42% to 85.42%). The highest-scoring guideline in this domain (86.11%) clearly defined its scope and overall objectives and specified the relevant clinical field and target populations .
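For readers unfamiliar with how such percentages arise, the sketch below shows the standard AGREE II scaled domain score: the summed item ratings are rescaled between the minimum and maximum possible totals for that domain. The function name `scaled_domain_score` and the example ratings are hypothetical and are not taken from this study; the number of appraisers in the example is also an assumption.

```python
def scaled_domain_score(ratings):
    """Return the AGREE II scaled domain score as a percentage.

    `ratings` holds one list per appraiser, each containing that
    appraiser's 1-7 ratings for every item in the domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    for appraiser_ratings in ratings:
        if len(appraiser_ratings) != n_items:
            raise ValueError("every appraiser must rate every item")
        if any(not 1 <= r <= 7 for r in appraiser_ratings):
            raise ValueError("AGREE II items are rated from 1 to 7")

    obtained = sum(sum(r) for r in ratings)
    min_possible = 1 * n_items * n_appraisers  # every item rated 1
    max_possible = 7 * n_items * n_appraisers  # every item rated 7
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical example: two appraisers rating a three-item domain.
example = [[6, 5, 5], [5, 5, 5]]
print(f"{scaled_domain_score(example):.2f}%")
```

With these illustrative ratings the obtained total is 31, the minimum possible total is 6 and the maximum is 42, so the scaled score is (31 − 6) / (42 − 6) = 69.44%.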
Stakeholder involvement
The appraised guidelines received the second lowest scores for the domain ‘stakeholder involvement’ (median, 41.67%; IQR: 30.56% to 75.00%). Six guidelines (66.67%) scored lower than 50% in this domain [3,8,10,11,13,15]. The panels of the other three guidelines were multidisciplinary groups comprising clinicians [9,12,14], methodologists [9,12,14], pharmacists  and administrative staff . Two guidelines involved patients or their representatives in guideline development to take account of the preferences of the target population [9,14].
Rigour of development
The median score for the domain ‘rigour of development’ was 48.96% (IQR: 27.08% to 65.63%). Five guidelines (55.56%) scored lower than 50% [8,10,11,13,15], probably because these guidelines did not report systematic methods for searching for or evaluating the evidence [8,11,13]. Only one guideline described how final decisions were made . In the four guidelines that presented their body of evidence clearly, SRs accounted for approximately 11.27% , 12.78% , 14.39%  and 14.73%  of the evidence types.
Clarity of presentation
The domain ‘clarity of presentation’ received a median score of 80.56% (IQR: 66.67% to 93.06%), with all guidelines scoring > 60%, as the most relevant recommendations in all guidelines could be easily found and were accompanied by an explicit SOR and LOE.
Applicability
The domain ‘applicability’ received the lowest median score (median, 34.38%; IQR: 22.92% to 40.63%). In general, there was little information regarding potential organizational barriers, cost implications and tools for application, except in the NICE guideline , which scored 81.25%. Some derivative products, including pathways , summaries for the public , quick reference documents  and various translated versions , could be useful for application. Cost-effectiveness was considered only in the NICE guideline, which involved health economists in the guideline panel, incorporated health economics evidence and discussed the budget implications of its recommendations .
Editorial independence
The greatest range of scores was observed in the domain ‘editorial independence’ (IQR: 35.42% to 85.42%). Although all the guidelines disclosed their conflicts of interest (COI), the quality of disclosure was not ideal: whether in tabular or narrative form, the disclosures gave minimal information about how any COI were managed. A complete summary of the process for identifying, managing and reporting COI during guideline development was presented in only one of the guidelines .
Synthesis of recommendations
Of the nine guidelines, one did not present the LOE underpinning its recommendations , and the remaining eight used six grading systems to rate the LOE and seven grading systems to rate the SOR (Additional file 5).
A total of 177 recommendations on the management of NMIBC were extracted for analysis (Additional file 6). Three guidelines tended to support a single recommendation with more than one type of evidence, so the number of evidence types did not correspond one-to-one with the number of recommendations [9,10,12]. Recommendations rated as grade A (33.9%) or grade B (49.7%) together accounted for the majority of recommendations (83.6% combined), whereas evidence rated as level 2 (48.1%) or level 3 (20.9%) together accounted for the majority of the evidence (69.0% combined).
To demonstrate differences between the identified guidelines, the key recommendations for the management of NMIBC were extracted and summarized (Tables 3–5, Additional files 7–9). Although the recommendations were largely in agreement in most areas, there were some noteworthy discrepancies between the guidelines.