Abstract
Objective
The Americleft study is a North American initiative to undertake an intercenter outcome study for patients with repaired complete unilateral cleft lip and palate from five well-established North American cleft centers.
Design
Retrospective cohort study.
Setting
Five cleft palate centers in North America.
Methods
This is the first paper in a series of five that outlines the overall goals of the study and sets the basis for the clinical outcome studies that are reported in the following four papers. The five centers’ samples and treatment protocols as well as the methods used for each study are reported. The challenges encountered and possible mechanisms to resolve them and reduce methodological error with intercenter studies are also reviewed.
By the 1990s the array of surgical treatment approaches and primary infant management protocols for patients with cleft lip and palate had become enormous, with little reliable information available upon which to base rational decisions to choose one method over another (Semb and Shaw, 1998). In a survey of 201 European cleft centers, 194 different surgical protocols were in use for the primary closure of a complete unilateral cleft lip and palate (CUCLP) alone (Shaw et al., 2000). With this number of approaches possible, the inadequate research methods of previous decades offered little chance to identify specific individual procedures that could be shown to be unequivocally superior to others (Spriestersbach et al., 1973; Roberts et al., 1991; Shaw et al., 1996; Semb and Shaw, 1998).
However, as described in a “State of the Art” paper by Long et al. (2000), an approach using rigorous comparisons of outcomes from different centers marked a significant change in direction from previous methodologies. In 1992, the landmark Eurocleft study (Asher-McDade et al., 1992; Mars et al., 1992; Mølsted et al., 1992; Shaw et al., 1992a; Shaw et al., 1992b) clearly established the value of intercenter collaboration to carry out well-controlled assessments of treatment outcomes using retrospective data. Using records routinely collected for clinical diagnosis and treatment purposes, the study design and execution of the original Eurocleft investigation allowed for objective and unbiased detection of favorable versus unfavorable effects of varying primary infant management protocols for patients with CUCLP in the areas of dental arch relationship (Mars et al., 1992), craniofacial morphology (Mølsted et al., 1992), and nasolabial aesthetics (Asher-McDade et al., 1992). Most notably, the outcomes found most favorable to improve results in these key areas appeared to be in centers with simple surgical and orthodontic approaches (Shaw et al., 1992a; Shaw et al., 1992b). Though clearly not capable of identifying the individual procedures within a total treatment management protocol that are responsible for favorable or unfavorable outcomes, intercenter comparisons have become a key part of quality improvement programs, leading to international agreement on standardized documentation (Shaw et al., 2000).
Although the original Eurocleft project subsequently led to the networking of most cleft–craniofacial teams in Europe (Eurocleft Network) and to two randomized controlled trials of surgical and orthopedic interventions for primary infant management, similar progress among centers in North America has not materialized. North American centers have been unable to generate any significant momentum in intercenter, collaborative clinical research in spite of the large number of high-volume, well-organized centers and the organizational support available through the American Cleft Palate–Craniofacial Association. In 2004, the World Health Organization (WHO, 2004, pp. 46–47) devoted an entire report to promoting international collaborative research and found that “… unlike the European experience, in which the original study generated a groundswell of support, extension of the clinical research approach throughout European centres and led to the establishment of Eurocran, Scandcleft and strong financial support from government and nongovernmental sources, the experience in the US has been the opposite.” Not only have there been no other US centers to join in the European collaborations, but attempts to duplicate the approach within the US between centers have been limited to isolated collaborations yielding very limited information, and failing to lead to expansion of interest to include additional centers.
The reasons for this failure are complex. They include a lack of funding, lack of research training, and lack of time and interest on the part of primary care providers to engage in research projects. Another critical problem in the United States is the current health care climate that tends to prioritize cost containment and access to care over quality of care. This in turn has created a tendency for decentralization of care provision. Although the existence of a large number of centers and individuals providing treatment for craniofacial anomalies improves patients’ geographical accessibility to care, it simultaneously creates a fractionation of the study population, thereby reducing the probability of developing patient samples of adequate size to enable valid research. The entire landscape is further complicated by noncomparable patient populations, noncomparable treatment records, unquantifiable differences in operator skills, difficulties in letting go of biases, and an unwillingness to question our own protocols developed and used for years within individual centers. For intercenter collaboration to succeed, individual centers must be willing to entertain a degree of uncertainty about the effectiveness of their chosen protocols and a desire to discover ways to improve those protocols based on the experiences and outcomes of others. Finally, there remains a general lack of agreement among centers on minimal standards for reporting and recording outcomes, as well as cost and ethical concerns over taking records that cannot be clearly identified as essential for diagnosis and treatment. In an attempt to remedy this problem, in 2006 the American Cleft Palate–Craniofacial Association (ACPA) established a Task Force on Intercenter Collaboration as part of its Research Committee and provided support for the initiation of an Americleft study, based on the principles established in the original Eurocleft study. These basic principles are as follows.
(1) Understanding the Strength of Evidence Hierarchy
Numerous authors have emphasized the relative strengths of evidence obtained through various reporting methods and types of investigations (Roberts et al., 1991; Long and Deacon, 2008). The strongest evidence is clearly in the domain of the randomized controlled trial (RCT) because it best identifies the effects of specific features of treatment management protocols and controls for bias. For cleft lip and palate treatment, such RCTs are difficult to execute due to the time, costs, and large samples required. Fortunately, though, the effect of these methodological challenges, confounding variables, and research biases can be minimized by study design. In the hierarchy of evidence, intercenter comparisons of outcomes are considered second in strength only to RCTs, assuming appropriate interpretation of results and rigorous attempts to minimize bias (WHO, 2002).
(2) Controlling Bias
Bias has been identified by many as a primary weakness that limits the strength of evidence of retrospective studies. As emphasized in the 2002 WHO Report, “differences arising from the biases … are likely to exceed actual differences attributable to the procedures” (p. 18). In this investigation, susceptibility (case selection) bias was controlled by the requirement for large, consecutively treated samples with documentation of CUCLP to ensure equivalency of the samples at the outset. Follow-up bias was minimized by the requirement to account for any patients initially enrolled in the center for whom records were not available for the outcomes study (e.g., patients moving away or failing to return for team evaluations). Analysis bias was managed through statistically determined minimal sample sizes; blinding of all raters and examiners as to the source of the records; training and calibration of all raters and examiners; and statistical determination of intrarater and interrater reliability. Only proficiency bias related to variation in operator skills could not be controlled, and it remains a potentially significant factor in explaining the outcome differences identified.
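The reliability step mentioned above can be illustrated with a short sketch. The text does not state which statistic was used, so the linearly weighted kappa below is only one common choice for ordinal ratings such as a five-point scale. The function name `weighted_kappa`, the category range 1 to 5, and the example ratings are hypothetical and are not study data.

```python
def weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4, 5)):
    """Linearly weighted kappa for two sets of ordinal ratings of the same cases.

    Assumption: this is an illustrative statistic, not necessarily the one
    used in the study. Returns 1.0 for perfect agreement, about 0 for
    chance-level agreement.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Cross-tabulate the two sets of ratings.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weight grows linearly with the distance between scores.
    def weight(i, j):
        return abs(i - j) / (k - 1)

    d_obs = sum(weight(i, j) * observed[i][j]
                for i in range(k) for j in range(k)) / n
    d_exp = sum(weight(i, j) * row_totals[i] * col_totals[j]
                for i in range(k) for j in range(k)) / (n * n)
    if d_exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - d_obs / d_exp


# Hypothetical ratings: the same rater scoring five models on two occasions
# (intrarater), or two raters scoring the same five models (interrater).
first_pass = [1, 2, 3, 4, 5]
second_pass = [1, 2, 3, 4, 4]
print(round(weighted_kappa(first_pass, second_pass), 3))
```

The same function covers both uses: pass one rater's two scoring sessions for intrarater reliability, or two raters' scores of the same records for interrater reliability.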
(3) Using Standardized Clinical Records
One of the greatest obstacles to meaningful intercenter outcome comparisons is the inconsistency and noncomparability of the records that document the outcomes. Therefore, some agreement among participating centers on the minimal standards for record taking was critical to this study. It was equally important that the records used for the intercenter comparison were those that were normally taken for treatment planning purposes. Given that mixed dentition orthodontic treatment is standard in most centers, dental models, photographs, and radiographs are generally taken by the orthodontists as a routine part of this treatment planning. Fortunately, as shown in the original Eurocleft study, with appropriate ethics approval these records can be used simultaneously to evaluate the dental arch relationship, skeletal morphology, and facial aesthetics outcomes resulting from prior interventions. Thus, these records and the outcomes they represent are the most immediate and simplest way to initiate an intercenter collaborative study.
Key Methodological Considerations
In this study, there were seven main considerations that were established for the methodology. Again, these were largely attempts to mirror the methods used in the original Eurocleft study. Given the difficulties described above in making any progress whatsoever in establishing the interest in, and foundations for, successful intercenter comparisons, it was decided to take advantage of the successful Eurocleft experience as a model for Americleft. Therefore the following considerations were established as ideal targets for this study, with the understanding that, as with the Eurocleft experience, achieving them all would be a difficult undertaking. However, by setting these standards for the study at the outset, it was also hoped that any shortcomings encountered would be more clearly evident when considered in the context of what was ideal. Although five centers were included as the pilot group for this study, based on the varied ability of each center to meet all criteria for all of the outcome measures investigated, desired samples were nearly impossible to achieve for all centers for all outcome measures. Nonetheless, it was felt of value to set these target methodological considerations at the outset of the study to mirror Eurocleft as much as possible and to use as a yardstick to help better understand the challenges of initiating a project of this nature. These key considerations were:
Sample sizes in the range of 30 to 40 were sought. This was based on the findings of the Eurocleft study, part 5 (Shaw et al., 1992b). The final conclusions in that study regarding sample sizes suggested that for three of the pertinent outcome measures (Goslon score of dental arch relationship, cephalometric maxillary prominence, and sagittal soft tissue relationships), five-center comparisons based on a .05 significance level and 80% power would need sample sizes of 30 to 40 to allow for detection of between 0.50 and 0.75 of a Goslon point and 3 to 4 degrees of soft tissue relationship difference. In the case of cephalometric maxillary prominence and to increase the sensitivity of difference detection for Goslon scores and soft tissue relationship, significantly larger sample sizes (97, 97, and 63, respectively) would be required. At the outset it was understood that sample sizes of that magnitude would not be feasible for most centers given the other methodological requirements of consecutiveness, complete nonsyndromic UCLP, and availability of records. Therefore, the 30 to 40 sample size was targeted to at least enable significant power for a few of the most sensitive and important outcome measures. Smaller samples were included in the study but would not represent a reliable and valid assessment of the outcome in question. Center A (see below) was only included in the dental arch relationship part of the study because that measure required only a sample of 16 for detection of a 1.0 Goslon point difference.
All patients in the study were white. In addition, samples had to be separated according to cleft type. Because CUCLP is the most frequently encountered type of cleft, CUCLP was chosen to facilitate achieving adequate sample sizes.
All unilateral cleft lip and palates had to be complete clefts and nonsyndromic; Simonart bands were permissible as part of a CUCLP as long as the skeletal (alveolar) cleft was complete. Furthermore, records had to be available to document and confirm the initial condition (i.e., initial entry chart notes, photos, or dental models).
Each center was required to provide evidence, usually through chart note entry dates, patient number, or patient birth date, that the samples consisted of consecutively enrolled patients. It was understood that although pure consecutiveness was desired, it might not be possible due to patients lost to long-term follow-up.
Records used for outcome assessments were unlinked to any cleft-craniofacial clinic or health center and were prepared in such a way as to ensure blinding of raters and examiners. This was deemed especially important in intercenter comparisons of this nature due to the sensitivity of investigating favorable or unfavorable center protocols and the possibility of introducing analysis bias.
Records describing the primary treatment protocol and number of operating surgeons involved in the primary surgeries had to be available. Furthermore, it was required that all primary surgeries were completed at each respective center.
All centers had to obtain patient permission and institutional review board (ethics) approval for use of the clinical records in an intercenter outcome assessment study. The study was designed to further ensure privacy of patient identity and information as well as the identity of the individual centers and surgeons/clinicians.
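The sample size consideration above rests on a trade-off between the smallest detectable difference and the number of patients per center. As a rough illustration only, the sketch below uses the standard normal-approximation formula for comparing two group means at a two-sided .05 significance level and 80% power. It is not the five-center calculation from the Eurocleft study, and the standard deviation of 1.0 Goslon point is a hypothetical value chosen for the example, not a figure reported in the text.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group needed to detect a mean difference
    `delta` between two groups (two-sided test, normal approximation).

    Assumption: equal group sizes and a common standard deviation `sd`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


ASSUMED_SD = 1.0  # hypothetical SD in Goslon points, for illustration only

for delta in (1.0, 0.75, 0.5):
    print(f"detectable difference {delta}: n per group = "
          f"{n_per_group(delta, ASSUMED_SD)}")
```

Because the required sample grows with the inverse square of the detectable difference, halving the difference roughly quadruples the sample needed. This is why a center with a small sample can contribute only to comparisons that look for large differences.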
Summary of Participating Centers and Their Primary Protocols
Five centers were chosen to participate in the Americleft study. The first criterion for the participating centers was to meet the above seven methodological conditions. Second, yet equally important, was the willingness and commitment of the individual cleft teams to openly enter into, without preconceived bias, an intercenter collaboration to determine the relative benefits of their protocols compared with those achieved by the other teams. All participating centers had well-established North American cleft teams with long histories of involvement in cleft lip and palate treatment that provided centralized, multidisciplinary care and used standardized protocols. In addition, the centers, identified as A through E in Table 1, were chosen to provide a wide range of protocols. The protocols for the individual centers are summarized in Table 1. Noteworthy within this table is the inclusion of one center that used primary bone grafting (Center B), three centers that used variations of presurgical orthopedic treatment (Centers B, D, and E), and one center using two-stage palate repair (Center C). There was also a wide and representative range of lip and palate repair techniques. Finally, the number of surgeons involved at the various centers was small, especially compared with the number of surgeons reported for some centers in the original Eurocleft study (Shaw et al., 1992a). This is relevant due to the suggestion from the Eurocleft study of a surgical volume/skill effect on outcomes.
Table 1. Sample Characteristics and Treatment Protocols for the Study Centers
Center B. The molding plate was started at 2 to 3 months and continued through primary bone grafting to the palate repair at 11 to 15 months. The molding plate was discontinued at the time of palate repair.
Center D. Infant presurgical orthopedic treatment was done using a modified McNeil technique with extraoral traction. The orthopedic appliance was placed prior to lip repair at 3 months and discontinued once lip repair was done.
Center E. Infant presurgical orthopedic treatment was done using a modified McNeil technique with extraoral traction. The orthopedic appliance was placed prior to lip repair at 3 to 4 months and continued until the time of palate repair at 12 to 14 months.
Those patients who received a 6- to 12-week Millard lip repair had a 9- to 12-month Bardach palate repair, whereas those who received a 5- to 6-month Delaire lip and soft palate repair received a 9- to 12-month Delaire palate repair.
IVP = intravelar veloplasty.
The total sample pool from the five centers was 172 patients, with individual sample sizes all reaching desirable levels except for Center A. For many patients, the complete records necessary for all three parts of this study were available. However, final sample sizes in parts 2, 3, and 4 were slightly smaller than those listed in Table 1 due to occasional missing or poor-quality dental models or radiographic or photographic records on individual patients. In addition, Center A was not included in parts 3 and 4 due to lack of the necessary radiographs and photographs on all patients, further reducing the already suboptimal sample size.
Following identification of the participating centers, each center was given time to obtain ethics approval and prepare their samples. The initial outcomes to be assessed were identical to those reported in the initial Eurocleft study and included dental arch relationships using dental models and the Goslon rating system, skeletal and soft tissue craniofacial morphology using lateral cephalometric radiographs, and nasolabial aesthetics using standard frontal and lateral photographs. The methodology and results for each of these assessments will be reported in papers 2 through 4, and a summary paper will close the series as the fifth paper.
Footnotes
Acknowledgments.
The authors and Americleft group would like to acknowledge Jennifer Gregory, Data Manager and Research Assistant, Lancaster Cleft Palate Clinic. We would also like to acknowledge Drs. Bryan Ruda, Mairaj Ahmed, and Brian Smith and Mr. J.B. Peterman, research assistants at the Lancaster Cleft Palate Clinic, and Dr. Mike Horst, Director of Research at Lancaster General Research Institute.
