Abstract
Objective:
In urology there is currently no validated and objective way to measure the ‘Relationship with Patients’ aspect of re-validation. The Sheffield Patient Assessment Tool (SHEFFPAT) questionnaire has been validated in a paediatric setting and is recommended by the Picker Institute for patient feedback. The aim of this study is to assess the feasibility, reliability and validity of the SHEFFPAT questionnaire in urology to determine if it is an appropriate tool to be used for patient feedback.
Subjects and methods:
Ten consultants in the North West Region gave permission for the SHEFFPAT questionnaire to be distributed to their patients. A minimum of 25 completed questionnaires was required per consultant. A detailed analysis looking at reliability, bias, feasibility and validity was then carried out.
Results:
In total there were 464 completed questionnaires. The cohort mean score was 4.66 (S.D. 0.19), ranging from 2.0 to 5.0. Twenty-three patients are needed to provide feedback in order to achieve a reliability of 0.7 (95% CI 0.21). Neither the gender and ethnicity of the patient nor their familiarity with the urologist helped to explain the variability in scores (R = 0.27, R² = 0.072, standard error of estimate 0.525).
Conclusion:
The SHEFFPAT questionnaire appears to provide reliable, valid and unbiased feedback from the patients' perspective, fulfilling the White Paper's and the Health Minister's request for patient involvement in the re-validation process.
Introduction
Within the next 5 years every urologist will have to undertake the process of re-validation [1]. This became a requirement, and no longer just a suggestion, with the publication of the government White Paper ‘Trust, Assurance and Safety’ [2]. Re-validation will be divided into re-licensure and re-certification. Re-licensure will be a general requirement for all doctors, and will be the responsibility of the General Medical Council (GMC). Re-certification will be for those doctors on the Specialist Register and will be the responsibility of the Royal Colleges and the Speciality Associations. The remit of each division is still being established but it is known that the whole process will incorporate all seven domains of ‘Good Medical Practice’ set out by the GMC [3] (Fig. 1). One of these domains is dedicated to the ‘Relationship with Patients’ and therefore this will be an integral part of re-validation. The White Paper states that “communication skills will be an important part of feedback” and asks for patient involvement in every step of the evolutionary process of re-certification [2]. The recent report by the Health Minister, Professor Darzi, ‘High Quality Care For All’ [4], calls for the systematic measurement and publication of information about the quality of care. Measures are to include patients' own views on the quality of their experiences. Currently doctors' relationships with their patients are an essential component of appraisal, but are often only subjectively reflected by the accumulation of “thank you” cards and a register of complaints. A more satisfactory, objective measure is required to fulfil the obligation of re-certification. The Picker Institute has published a review of 10 questionnaires for gathering patients' feedback on their doctor [5]. Three questionnaires were recommended, but only one was from the UK. The Sheffield Patient Assessment Tool (SHEFFPAT) questionnaire (Figs. 2 and 3) scores strongly on all aspects of the review.
The questionnaire has been validated and shown to be reliable in paediatric practice [6,7]. The primary objective of this study was to evaluate the feasibility, reliability and validity of the SHEFFPAT questionnaire in Urology.

Fig. 1. The seven domains of the GMC's “Good Medical Practice”.
Subjects and methods
Ten consultants in the North West Region, who had been assessed using SHEFFPAT as part of their appraisal evidence, gave permission for the SHEFFPAT questionnaire to be distributed to their patients. This paper reports the quality assurance of these data. Patients seen in the outpatients' clinic were randomly selected and, only after the consultation, asked by an administration assistant to complete a questionnaire. Using this method the consultants were blinded to which patients were being asked to participate; the administration assistant was unaware of how the consultation had gone; and the patient became aware of the questionnaire only after the consultation. Fifty questionnaires were distributed per consultant with the hope of achieving at least 25 returns [7]. Feedback was provided to each consultant showing their mean score for each question as well as their overall score, each compared to the cohort means. Free text comments were also reported verbatim in the summary.
A detailed analysis for the use of SHEFFPAT in urology was carried out. Data were anonymised and the ‘unable to comment’ responses were removed prior to analysis. The mean scores and S.D.s at the level of the form and participant were calculated with ranges. The free text statements received from patients underwent a content analysis. Comments were categorised into negative or positive comments and the focus of the negative comment noted.
Reliability was explored using 95% confidence intervals (CIs) for mean ratings and as a traditional coefficient based on Generalisability theory [8] and is reported in line with a recently published expert consensus [9]. A reliability of D = 0.7 or above is conventionally accepted [10]. Variance components were calculated using MINQUE in SPSS v.14.0. Due to the naturalistic study design, the model was fully nested (assessors were taken as unique to each doctor) only allowing for the calculation of variance attributable to the doctor (true) and that which was not (error). The square root of the error variance is the standard error of measurement; this was calculated for 1–15 assessors. The 95% confidence intervals are equal to the standard error of measurement multiplied by 1.96 and are added to and subtracted from a mean rating.
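The reliability calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS analysis used in the study. The two variance components are hypothetical values back-calculated from the figures reported in this paper (a reliability of 0.7 and a 95% CI of about 0.21 at 23 patients); they are not published estimates. The sketch also assumes that the error variance of a mean of n ratings is the single-rating error variance divided by n. All function names are illustrative.

```python
import math

# Hypothetical variance components, back-calculated from the reported results.
# They are assumptions for illustration, not values published by the authors.
VAR_DOCTOR = 0.027  # "true" variance attributable to the doctor
VAR_ERROR = 0.264   # error variance of a single patient rating


def sem(n_patients, var_error=VAR_ERROR):
    """Standard error of measurement for a mean of n_patients ratings."""
    return math.sqrt(var_error / n_patients)


def ci_half_width(n_patients, var_error=VAR_ERROR):
    """Half-width of the 95% CI: 1.96 times the standard error of measurement."""
    return 1.96 * sem(n_patients, var_error)


def reliability(n_patients, var_doctor=VAR_DOCTOR, var_error=VAR_ERROR):
    """Reliability coefficient: true variance over true plus error variance."""
    return var_doctor / (var_doctor + var_error / n_patients)


def patients_needed(target, max_patients=200):
    """Smallest number of patients whose mean rating reaches the target reliability."""
    for n in range(1, max_patients + 1):
        if reliability(n) >= target:
            return n
    return None  # target not reachable within max_patients


if __name__ == "__main__":
    for n in (5, 11, 15, 23):
        print(n, round(reliability(n), 2), round(ci_half_width(n), 2))
    print("patients needed for 0.7:", patients_needed(0.7))
```

Under these assumed components the sketch gives the same pattern as the reported results: the interval narrows and the coefficient rises as more patients contribute to a doctor's mean rating.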
It can be hypothesised that known characteristics of the patient might independently affect scores, reducing the validity of the process. The effect of confounding factors (patient gender, ethnicity and number of times the patient had previously seen the doctor) was evaluated using multiple regression. Taking the t-statistic as a measure of the relative importance of each potential confounder, those above +2 or below −2 with significance (p < 0.05) are reported.
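As an illustration of the screening rule described above, the following sketch fits an ordinary least squares regression in plain Python and flags predictors whose t-statistic lies outside ±2. It is not the SPSS procedure used in the study. The eight data points are synthetic and constructed only to show the mechanics, the predictor names are hypothetical, and the p-value check is omitted because it requires the t-distribution.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_statistics(rows, y):
    """Return (coefficients, t_statistics) for y ~ intercept + predictors."""
    X = [[1.0] + list(r) for r in rows]
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    residuals = [y[k] - sum(b * x for b, x in zip(beta, X[k])) for k in range(n)]
    s2 = sum(e * e for e in residuals) / (n - p)  # residual variance
    t = [beta[i] / math.sqrt(s2 * inv[i][i]) for i in range(p)]
    return beta, t


# Synthetic data (NOT study data): columns are previous visits and gender (0/1).
names = ["previous_visits", "gender"]
rows = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]
scores = [4.0 + 0.2 * visits + e for (visits, _), e in zip(rows, noise)]

beta, t = ols_t_statistics(rows, scores)
flagged = [name for name, tv in zip(names, t[1:]) if abs(tv) > 2]

if __name__ == "__main__":
    print("coefficients:", [round(b, 3) for b in beta])
    print("t-statistics:", [round(v, 2) for v in t])
    print("flagged:", flagged)
```

In this synthetic example only the number of previous visits is flagged, because the scores were built to depend on it. In the study itself none of the candidate confounders explained the variability in scores.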

Fig. 2. SHEFFPAT questions.

Fig. 3. Scoring system for the SHEFFPAT questionnaire.

Fig. 4. Aggregate scores achieved by urologists.
Results
Descriptive statistics
In total, 464 questionnaires were completed by patients for the 10 consultants.
The consultants' mean age was 46.5 years (range 38–51) and the mean number of years in consultant practice was 7.2 (range 1–15). A mean of 46.4 questionnaires were correctly completed per consultant (range 30–57).
The cohort mean score was 4.66 (S.D. 0.19) ranging from 2.0 to 5.0. The aggregate score for each urologist showed that all 10 consultants had a mean score above 4.0 and 9 of the 10 consultants had a mean score above 4.60 (Fig. 4).
Consultants had an average of six free text comments (range 3–10). Of the 60 free text comments, 86.7% were positive, with only 13.3% commenting on a negative aspect. All eight of these negative comments were constructive criticisms about the department and organisation rather than the specific consultant.

Fig. 5. Reliability estimates.
Reliability
The conventional D study and 95% CIs are presented in Fig. 5. Twenty-three patients are needed to provide feedback in order to achieve a reliability of 0.7 (95% CI 0.21).
Sources of bias
Neither the gender and ethnicity of the patient nor their familiarity with the urologist helped to explain the variability in scores (R = 0.27, R² = 0.072, standard error of estimate 0.525).
Discussion
There was a high response rate from patients, which supports the feasibility of using SHEFFPAT in urology. However, this was only possible because of dedicated and motivated administration assistance. This will be a requirement, and therefore a resource issue, if patient feedback is to be used more widely.
Using the SHEFFPAT questionnaire the mean score for urology is 4.66 (S.D. 0.19). This places the result more towards “Best I can imagine” than “Same as most doctors”. There was very little variation in the overall mean between the consultants (4.14–4.79) with only one below 4.60. These results approximate to a normal distribution with a positive skew to the right (Fig. 4). Our results are in keeping with the evaluation of SHEFFPAT in the paediatric setting [6]. Therefore, this questionnaire would appear to have a ceiling effect in both settings. This could be interpreted as leniency bias [11], where assessors over-mark inappropriately, or alternatively, that medical cohorts simply perform well. In other specialities, when exploring poor performance by other complementary means, only about 1–3% of doctors are found to be poor performers [12]. It would therefore appear that our SHEFFPAT results demonstrate good performance rather than leniency bias. Also, although we automatically focus on poor performance we should always remember that the White Paper states that “professional regulation is as much about sustaining, improving and assuring the professional standards of the overwhelming majority of the health professionals as it is about identifying and addressing poor practice or bad behaviour” [2]. The SHEFFPAT questionnaire does this in urology.
The free text section makes analysis of the questionnaire more complicated and time consuming. However, there is evidence that doctors respond more to patient feedback than to that of their colleagues [13], and constructive criticism about facilities and organisation can be used to procure funds for future workforce planning.
Using Generalisability theory it is possible to calculate how many consultations would be required to achieve a given reliability. Twenty-three completed questionnaires were needed to provide an acceptable level of reliability in this study (0.7). However, certain situations may warrant higher or lower reliabilities. High-stakes assessments require higher reliability and therefore more completed questionnaires, which in turn reduces feasibility. One way round this is to calculate 95% CIs. The 95% CI can be placed around an individual's aggregate score when the number of patients who contributed to that score is known. This gives a measure of precision for that score and therefore of the confidence that can be placed in it in relation to a cut score (in this case 3.0, same as most practitioners). A cut score is the score considered acceptable to demonstrate competence. If we were to use the 95% CI of 0.3 generated from our data, then only 11 questionnaires would need to be completed (Fig. 5). This would further improve feasibility. If the individual urologist's score and 95% CI lie above the cut score then their competency will have been demonstrated. If their score and 95% CI fall below the cut score then reliability would need to be improved before any judgement on competency could be made. Reliability would be improved by setting a higher D value, which requires the completion of more questionnaires. This will then either show that the doctor is actually competent or at least provide more reliable evidence of deficiencies in this aspect of care.
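The comparison of an aggregate score with the cut score can be sketched as follows. This is an illustrative reading of the procedure, not a tool used in the study. The single-rating error variance is a hypothetical value back-calculated from the 95% CIs reported in this paper. The sketch also assumes that any interval not lying wholly above the cut score is treated as needing more questionnaires before a judgement is made.

```python
import math

VAR_ERROR = 0.264  # hypothetical single-rating error variance (assumption)
CUT_SCORE = 3.0    # cut score given in the text: same as most practitioners


def confidence_interval(mean_score, n_patients, var_error=VAR_ERROR):
    """95% CI around a doctor's mean score based on n_patients questionnaires."""
    half_width = 1.96 * math.sqrt(var_error / n_patients)
    return mean_score - half_width, mean_score + half_width


def judge(mean_score, n_patients, cut=CUT_SCORE):
    """Compare the whole 95% CI with the cut score."""
    low, _high = confidence_interval(mean_score, n_patients)
    if low > cut:
        return "competence demonstrated"
    return "collect more questionnaires before judging"


if __name__ == "__main__":
    print(judge(4.66, 11))  # cohort mean with 11 questionnaires
    print(judge(3.2, 11))   # a score close to the cut with few questionnaires
    print(judge(3.2, 50))   # the same score with more questionnaires
```

A score close to the cut needs more questionnaires before its interval clears the threshold, whereas a score near the cohort mean clears it with relatively few.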
The SHEFFPAT questionnaire has previously been validated, showing both good construct and criterion validity [5]. Changing any aspect of the questionnaire's content would compromise this validity. Provided we assume that a consultation between a doctor and their patient remains essentially the same across all specialities, this validity is preserved in the urology setting.
Conclusions
The SHEFFPAT questionnaire appears to provide reliable, valid and unbiased feedback from the patients' perspective, fulfilling the White Paper's and the Health Minister's request for patient involvement in the re-validation process.
However, it does appear to have a ceiling effect, which limits its power to discriminate between individuals grouped as satisfactory performers. The SHEFFPAT questionnaire needs to retain its current format in order to maintain its validity. The number of questionnaires that need to be completed depends on the balance between cost, feasibility and reliability. Although cost and feasibility are important, their consideration should not be at the expense of reliability and validity.
Footnotes
The authors declare that there are no conflicts of interest.
