2.2 Assessment of Guideline Quality
Four assessors, two experienced clinicians and two public health fellows with experience in developing and evaluating guidelines, completed the online overview tutorial and practice exercise recommended by the AGREE collaboration before the evaluation2.
The assessors independently rated a total of 23 items across six domains using the AGREE II instrument: (1) scope and purpose of the guideline, (2) stakeholder involvement, (3) rigour of development, (4) clarity of presentation, (5) applicability, and (6) editorial independence2, 3. Each item was rated on a scale from 1 (“strongly disagree”) to 7 (“strongly agree”)2, 3. After rating the 23 items, each assessor provided an overall assessment of each guideline and decided whether the guideline could be recommended. This decision was based on the assessor’s personal judgement and the domain scores4. To reduce discrepancies among the four assessors, we followed a previously described method4, 5: if the scores assigned by the four assessors differed by 1 point, the lower score was kept; if they differed by 2 points, the scores were averaged; and if they differed by ≥3 points, agreement was reached through discussion.
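The discrepancy rule above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `reconcile` is hypothetical, and the sketch assumes that "differed by" refers to the spread between the highest and lowest of the four scores for a single item, which the text does not state explicitly.

```python
from statistics import mean
from typing import Optional, Sequence


def reconcile(scores: Sequence[int]) -> Optional[float]:
    """Reconcile one item's ratings (1 to 7) from several assessors.

    Assumption: the 'difference' is max(scores) - min(scores).
    Returns the reconciled score, or None when the item has to be
    settled by discussion among the assessors.
    """
    spread = max(scores) - min(scores)
    if spread == 0:
        # All assessors agree, so the common score is kept.
        return float(scores[0])
    if spread == 1:
        # Differ by 1 point: keep the lower score.
        return float(min(scores))
    if spread == 2:
        # Differ by 2 points: average the scores.
        return float(mean(scores))
    # Differ by 3 points or more: no automatic value, discuss instead.
    return None


# Hypothetical ratings for single items, for illustration only.
print(reconcile([5, 6, 6, 5]))
print(reconcile([4, 6, 5, 5]))
print(reconcile([2, 6, 5, 5]))
```

Returning `None` for a spread of 3 or more keeps the discussion step outside the function, since that outcome cannot be computed from the scores alone.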
According to the AGREE II methodology2, 4, domain scores were calculated as follows: (obtained score − minimum possible score)/(maximum possible score − minimum possible score) × 100%, where the obtained score was defined as the sum of the four assessors’ scores for all items in the domain. In line with previous studies3, 4, 6, a domain score >60% was considered sufficient and a score >80% was considered good. A median score across all six domains was calculated for each guideline.
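As a worked illustration of the scaled domain score, the following Python sketch applies the formula to a three-item domain rated by four assessors. The ratings and the six domain scores are hypothetical and are not data from this study. The function names `domain_score` and `classify` are assumptions introduced for the example, as is the label "below threshold" for scores of 60% or less, which the text does not name.

```python
from statistics import median
from typing import Sequence

MIN_RATING, MAX_RATING = 1, 7  # AGREE II item scale


def domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Scaled domain score in percent.

    ratings[a][i] is the rating given by assessor a to item i
    of one domain.
    """
    n_assessors = len(ratings)
    n_items = len(ratings[0])
    obtained = sum(sum(row) for row in ratings)
    minimum = MIN_RATING * n_items * n_assessors
    maximum = MAX_RATING * n_items * n_assessors
    return (obtained - minimum) / (maximum - minimum) * 100


def classify(score: float) -> str:
    """Apply the >80% and >60% thresholds described in the text."""
    if score > 80:
        return "good"
    if score > 60:
        return "sufficient"
    return "below threshold"  # label assumed for this sketch


# Hypothetical ratings: four assessors, three items.
example = [
    [5, 6, 6],
    [6, 6, 7],
    [2, 4, 3],
    [3, 3, 2],
]
# Obtained = 53, minimum = 12, maximum = 84, so (53 - 12) / (84 - 12) * 100.
score = domain_score(example)
print(round(score, 1), classify(score))

# Median across six hypothetical domain scores for one guideline.
six_domains = [85.0, 60.0, 45.5, 90.0, 30.0, 70.0]
print(median(six_domains))
```

Because the minimum and maximum scale with the number of items and assessors, the same function applies to every domain regardless of how many items it contains.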