Development and Reliability Assessment of the Presurgical Infant Orthopedics Assessment Tool for Evaluating Clinical Outcomes of Presurgical Infant Orthopedics

Abstract

Objective

Presurgical infant orthopedics (PSIO) is used to optimize anatomical outcomes in infants with cleft lip and palate to facilitate favorable surgical results. However, standardized, reliable tools to assess PSIO effectiveness are lacking due to phenotypic variability and diverse treatment protocols. This study aimed to develop and perform a preliminary reliability assessment of a novel phenotype-based clinical outcome assessment tool, the PSIO Assessment Tool (PAT), to assess PSIO-related morphological changes in unilateral and bilateral cleft lip with or without palate.

Design

Tool Development and Reliability Assessment Study.

Setting

Multicenter expert consensus involving craniofacial orthodontists from diverse global regions.

Participants

Standardized pretreatment and post-treatment clinical cases of unilateral and bilateral cleft lip and alveolus with or without palate were used for calibration and reliability assessment.

Intervention

A panel of 7 expert craniofacial orthodontists collaboratively developed the PAT through iterative calibration using clinical cases. The tool assesses cleft severity pretreatment and morphological correction post-treatment, grading cleft width, nasal symmetry, and alveolar alignment.

Main Outcome Measure(s)

Reliability was evaluated via inter-rater and intrarater agreement using Fleiss' κ and quadratic weighted Cohen's κ statistics.

Results

The PAT demonstrated good preliminary inter-rater and intrarater reliability, with inter-rater Fleiss’ κ of 0.83 for pretreatment grading and weighted Cohen's κ of 0.75 for post-treatment grading. Intrarater reliability was substantial to almost perfect (κ = 0.70-0.81).

Conclusions

The PAT demonstrated encouraging preliminary reliability among experienced craniofacial orthodontists evaluating standardized PSIO records. Further multicenter studies are needed to establish broader validity and clinical applicability.

Keywords

presurgical infant orthopedics, cleft lip and palate, outcome assessment tool, unilateral cleft lip and palate, bilateral cleft lip and palate

Introduction

Comprehensive and interdisciplinary care is essential in managing cleft lip and palate (CLP), given its complex impact on feeding, breathing, speech, dentofacial growth, and psychosocial development.^1–3

CLP presents a spectrum of phenotypes, ranging from microforms to complete clefts.⁴ A unilateral cleft can be complete or incomplete, with the latter sometimes appearing as a Simonart's band.^4,5 Associated deformities include distortion of the vermilion, rotation of Cupid's bow and philtrum toward the noncleft side, lateral and posterior displacement of the alar cartilage, a depressed and deviated nasal tip, and a shortened, deviated columella. The nostril on the cleft side often has a more horizontal orientation. The nasal cartilage may or may not be deficient.^2,4,5 Bilateral cleft lip and palate presents additional reconstructive challenges. The premaxilla is separated from the palatine processes, and the prolabium lacks defined philtral ridges, Cupid's bow, and orbicularis oris muscle fibers. The nasal cartilage is displaced laterally, resulting in a flattened nasal tip and a markedly shortened columella.^2,4,5

Presurgical infant orthopedics (PSIO) has long been used to optimize cleft anatomy and facilitate surgical outcomes. McNeil⁶ first introduced passive acrylic plates, followed by Latham et al's⁷ active appliance utilizing surgically anchored pins. In 1993, Grayson developed nasoalveolar molding (NAM), integrating nasal stents with passive alveolar plates to mold both structures simultaneously.⁸ PSIO techniques have continuously evolved, including the use of nasal elevators, such as DynaCleft or steri-strips, which use elastic tension to apply light pressure and mold the alveolar ridge and nasal cartilage preoperatively.⁹ Mejia et al¹⁰ introduced the Presurgical Orthopedic Appliance, and most recently, the 3-dimensional (3D)-printed Rhinoplasty Appliance System was developed for early nasal molding.¹¹ Advances in digital technology have enabled the development of digital NAM and OrthoAligner NAM, offering improved precision and efficiency.¹²

Although over 100 PSIO protocols have been described, outcome measures used to evaluate their effectiveness remain highly variable and heterogeneous.^13,14 The evaluation of clinical outcomes, through standardized protocols and auditing, is essential for improving care and enabling meaningful comparisons among providers and centers.^13,14

Several approaches have been proposed for comprehensive CLP evaluation, ranging from standardized frameworks such as the International Consortium for Health Outcomes Measurement to broader outcome measures including patient-reported outcome measures and expert or layperson aesthetic ratings.^15,16 However, many of these methods rely on subjective impressions, which may lack consistency and reproducibility. Reliable outcome assessment tools must be both valid and reproducible. Moreover, many existing cleft outcome frameworks are designed for broader interdisciplinary or long-term outcome assessment and do not specifically provide a simple, phenotype-based method for standardized evaluation of PSIO-related morphological changes during the presurgical infant stage. While reliability testing is relatively straightforward, validity requires consensus from expert panels, especially in the absence of a gold standard. An ideal tool should be practical, quick to administer, and not require specialized training or equipment.¹³

Despite efforts toward international standardization of outcome measures in CLP, this goal is challenged by the wide phenotypic variability and diversity in treatment approaches.¹⁴ These limitations hinder consistent and reproducible assessment of morphological changes associated with PSIO treatment. Given the absence of universally accepted objective outcome measures for PSIO and the ongoing variability in treatment approaches, there remains a need for practical and reproducible clinical assessment frameworks capable of documenting observable morphological changes in a standardized manner. Additionally, the long-term effectiveness and stability of PSIO-related outcomes remain subjects of ongoing debate within cleft care literature. Hence, this study aimed to develop and preliminarily assess the reliability of a novel tool to assess morphological outcomes associated with PSIO treatment based on cleft phenotype. We hypothesized that experienced cleft orthodontists would demonstrate acceptable inter-rater and intrarater agreement when applying a standardized phenotype-based assessment framework to evaluate cleft severity and PSIO-related morphological changes.

Materials and Methods

Study Design

The present study was designed to develop and perform a preliminary reliability assessment of a clinical outcome assessment tool for PSIO treatment in infants born with unilateral and bilateral cleft lip and alveolus, with or without cleft palate.

Ethical Considerations

The data were collected from Smile Train Express Records. All patients’ legal guardians had previously provided informed consent for clinical documentation and use of de-identified records for research and educational purposes according to institutional and Smile Train documentation protocols. Only anonymized and de-identified photographic records were shared with the expert panel for evaluation, without any patient identifiers or treatment center information. Data sharing across participating international collaborators was conducted using secure de-identified records in accordance with applicable institutional ethical standards and principles governing research involving human participants. The study was approved by the Institutional Ethical Committee (Ref No.: MRIIRS/MRDC/SDS/IEC/2024/129).

Data Source and Image Standardization

For each patient, standardized 2-dimensional (2D) clinical photographs were available both before and after PSIO treatment. These included extra-oral frontal and basal nasal views, as well as intraoral maxillary occlusal views, obtained as part of routine Smile Train documentation protocols. All images were captured following Smile Train's standardized clinical photography guidelines to ensure consistency in head position, lighting, magnification, and framing. Only cases with complete and adequate-quality pretreatment and post-treatment image sets were included for evaluation. To minimize assessment bias, all images were anonymized and de-identified prior to distribution. The images were presented in a standardized format and sequence, without any patient identifiers, treatment center information, or timing cues, allowing blinded evaluation by the expert panel.

Working Group and Sample

The working group comprised 7 craniofacial orthodontists (AF, MD, PB, MM, TC, JP, and RHL), members of a Smile Train Global Orthodontics Advisory Group, with experience in CLP treatment in different global regions (United States, India, Brazil, Mexico, and the Philippines). All participating raters had substantial clinical experience in cleft orthodontics and PSIO management, with involvement in multidisciplinary cleft care programs and international cleft initiatives. Pretreatment records were obtained within the first month after birth, and post-treatment outcomes were evaluated immediately after completion of PSIO and prior to any primary lip repair, ensuring that all assessments reflected PSIO-related changes only, without surgical influence. The primary goal of the working group was to develop and perform a preliminary reliability assessment of an orthodontic outcome assessment tool for PSIO treatment that considers cleft type and severity. Although 7 experts participated in the development and calibration phases of the expert-based orthodontic assessment tool, only 6 evaluators completed all rating rounds and were therefore included in the final reliability analysis. The development and reliability datasets included both unilateral and bilateral cleft cases representing varying severities of deformity.

Case Selection

Cases were retrospectively selected from the Smile Train database. Inclusion criteria were: (1) infants with unilateral or bilateral cleft lip and alveolus with or without palate undergoing PSIO, (2) availability of complete standardized pretreatment and post-treatment photographic records, and (3) adequate image quality for evaluation. Exclusion criteria were syndromic clefts, incomplete records, prior surgical intervention, and poor-quality or nonstandardized photographs.

Development of the PSIO Assessment Tool

Conceptual Basis for Tool Development

The initial framework for the PSIO Assessment Tool (PAT) was informed by clinical experience, literature describing common cleft morphological characteristics evaluated during PSIO, and a previously developed unpublished internal grading framework created by AF. This unpublished framework served only as a conceptual starting reference and underwent substantial modification during expert consensus discussions.

Parameters were selected based on their clinical relevance, visibility on standardized 2D photographs, and frequent use in routine PSIO assessment. These included cleft width, nasal symmetry, columella deviation, alar cartilage displacement, premaxillary position, and alveolar alignment. Parameters that required advanced imaging or were difficult to reproduce from retrospective photographic records were not included. Angular and dimensional thresholds were intended only as visual clinical reference guides to support ordinal grading, not as precise photogrammetric measurements obtained from photographs, and no digital angular measurement software was used during scoring. Accordingly, formal measurement error analysis for photogrammetric landmark identification was not performed; instead, reproducibility of the assessment framework was evaluated through inter-rater and intrarater reliability testing.

The process involved 2 main phases: development of the assessment tool and reliability analysis.

Phase 1: Development of the Assessment Tool

In the first phase, a panel of experts was convened to develop the tool. During the initial meeting, the group discussed and defined the key parameters to be included, using illustrative clinical cases as references. The parameters were divided into a pretreatment grade (mild, moderate, or severe, depending on the severity of the deformity) and a post-treatment grade (grades 1-3, depending on the quality of the outcome based on pretreatment to post-treatment changes), for both unilateral and bilateral cleft lip and alveolus with or without palate. The development process was planned and discussed through a series of in-person and online meetings. Following this, 15 cases were randomly selected from a data bank, and the experts were asked to rate the pretreatment and post-treatment grades using the tool.

The finalized PAT comprised 2 components: a pretreatment grade assessing the initial cleft severity and a post-treatment grade evaluating morphological correction following PSIO. In unilateral cleft cases, the pretreatment grade included parameters such as cleft width, columella angle, and alar cartilage displacement, while in bilateral cases, it included cleft width, premaxillary deviation, nasolabial angle, alar cartilage displacement, and nasal tip projection. The post-treatment grade assessed improvements in alveolar alignment, nasal symmetry, and soft tissue contour. Each criterion was graded on a 3-point scale (1-3) corresponding to the percentage of correction from baseline. During the development phase, 15 unilateral and 15 bilateral cleft lip and alveolus cases with or without palate undergoing PSIO were included for expert calibration and assessment. Figure 1 illustrates the anatomical landmarks and visual reference parameters incorporated into the PAT framework for unilateral and bilateral cleft assessment. A 3-point ordinal grading system was intentionally selected to balance simplicity, reproducibility, and clinical applicability. More complex grading systems with additional categories were considered more susceptible to subjective variability, particularly when applied to retrospective 2D photographic records.

Figure 1.

PAT parameters: (a) Width of the cleft: distance between the most medial curvature of the greater segment and the most medial curvature of the lesser segment; (b) Width of the bigger cleft: PSIO unilateral lip with or without palate pretreatment grade; (c) Deviation of the premaxilla: distance from the incisive papilla to the midline, where the reference line is perpendicular to the horizontal line between the pterygomaxillary junctions (tuberosities); (d) Columella angle: the angle formed by the intersection of the line from the tip of the nose to the philtrum and the line running from one alar base to the other, measured on the affected nostril side; (e) Alar contour: the curvature formed from the uppermost point of the columella to the insertion of the nasal base bilaterally; (f) Nasal tip projection: the vertical distance between the nasal tip and subnasale (the junction between the columella and the upper lip or philtrum); (g) Nasolabial angle: the angle formed by the nasal tip (or the most superior point of the columella), subnasale, and upper lip vermilion border. PSIO, presurgical infant orthopedics; PAT, PSIO assessment tool.

Calibration

Calibration was performed through structured review sessions involving representative unilateral and bilateral cleft cases with varying severities. During these sessions, the panel reviewed grading discrepancies, discussed interpretation of each parameter, and refined operational definitions to improve consistency. Consensus was achieved through iterative discussion and modification of grading descriptors.

Phase 2: Validation and Reliability Analysis

In the second phase, an additional set of 10 new clinical cases was distributed to the same panel of experts. The experts rated the pretreatment and post-treatment grades, and their ratings were compared to assess inter-rater agreement. To evaluate intrarater agreement, the experts rated the same cases again, presented in a different order, after an interval of 10 days.

Statistical Analysis

To evaluate agreement among the experts, inter-rater and intrarater reliability were calculated using Fleiss' κ for pretreatment grades and quadratic weighted Cohen's κ for post-treatment grades, each with a 95% confidence interval, in JAMOVI version 2.3. Fleiss' κ was used to assess agreement among multiple raters, while quadratic weighted Cohen's κ was applied to ordinal pairwise data to account for partial agreement between categories. The Landis and Koch¹⁷ interpretation categories (Table 1) were used for consistency with prior reliability studies, although their limitations and potential optimism are acknowledged.
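For readers who wish to compute comparable coefficients outside JAMOVI, the two statistics can be sketched in plain Python. This is a minimal illustration based on the standard textbook definitions, not the routine used in this study; the function names and any ratings passed to them are hypothetical, and confidence intervals are not computed.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for several raters.

    ratings: one list per case, holding the grade given by each rater
             (every case must be rated by the same number of raters).
    categories: all possible grades, e.g. ["Mild", "Moderate", "Severe"].
    Raises ZeroDivisionError if every rating falls in a single category.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(case) for case in ratings]

    # Observed agreement for each case, then averaged over cases.
    per_case = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_cases * n_raters) for cat in categories
    ]
    p_expected = sum(s ** 2 for s in shares)

    return (p_observed - p_expected) / (1 - p_expected)


def quadratic_weighted_kappa(first, second, categories):
    """Cohen's kappa with quadratic weights for two sets of ordinal grades.

    first, second: grades for the same cases, in the same order.
    categories: possible grades listed in ordinal order, e.g. [1, 2, 3].
    Raises ZeroDivisionError if both sets use only one identical category.
    """
    k = len(categories)
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(first)

    # Joint proportions of (first grade, second grade).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = ((i - j) / (k - 1)) ** 2  # disagreement penalty
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]

    return 1 - weighted_observed / weighted_expected
```

Under this sketch, the same `quadratic_weighted_kappa` function would serve both a pairwise inter-rater comparison (two raters, same cases) and an intrarater comparison (one rater, two sessions), because a disagreement of two grades is penalized four times as heavily as a disagreement of one grade.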

Table 1.

Qualitative Interpretation of κ According to Landis and Koch, 1977.

Value of κ	Interpretation
0-0.2	Poor to slight
0.2-0.4	Fair
0.4-0.6	Moderate
0.6-0.8	Substantial
0.8-1	Almost perfect
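The bands in Table 1 can be expressed as a simple lookup. The sketch below is illustrative only: because adjacent ranges in the table share their boundary values, it assumes that a boundary value belongs to the lower band, which is consistent with κ = 0.60 being labeled moderate in Table 5. The function name is hypothetical.

```python
def interpret_kappa(kappa):
    """Map a kappa value to the qualitative label used in Table 1.

    Assumption: a value lying exactly on a shared boundary (0.2, 0.4,
    0.6, 0.8) is assigned to the lower band.
    """
    if kappa <= 0.2:
        return "Poor to slight"
    if kappa <= 0.4:
        return "Fair"
    if kappa <= 0.6:
        return "Moderate"
    if kappa <= 0.8:
        return "Substantial"
    return "Almost perfect"
```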

Results

PSIO Treatment Outcome Assessment Tool

The finalized PAT was systematically organized into 2 sections: pretreatment and post-treatment grading allowing comprehensive evaluation of PSIO outcomes in unilateral and bilateral cleft cases.

Table 2 summarizes the pretreatment grading parameters used to categorize the initial cleft severity as mild, moderate, or severe. Agreement was generally higher for pretreatment severity grading than for post-treatment correction grading, likely reflecting increased subjectivity in estimating treatment-related morphological improvement. Table 3 presents the post-treatment grading criteria reflecting the degree of morphological correction achieved following PSIO, based on nasal symmetry, alveolar alignment, and soft tissue improvement. The structured scoring framework demonstrated clear differentiation among varying severities and outcomes, supporting its potential utility as a standardized descriptive assessment framework.

Table 2.

Presurgical Infant Orthopedics (PSIO) Pretreatment Grade for Unilateral and Bilateral Cleft lip and Alveolus with or Without Cleft Palate.

	PSIO Unilateral Lip with or Without Palate Pretreatment Grade
Evaluation criteria		Mild	Moderate	Severe
Width of the cleft	Distance between the most medial curvature of the greater segment to the most medial curvature of the lesser segment (Figure 1).	0-3 mm	4-7 mm	≥8 mm
Columella angle	The angle formed by the intersection between the line from the tip of the nose to the philtrum and the line that goes from the alar base from one side to the other. The angle measurement is on the affected nostril side (Figure 2).	75°-90°	45°-75°	≤45°
Alar cartilage contour	The curvature formed from the most upper point of the columella to the insertion of the lateral nasal base on the affected side (Figure 3).	Convex	Flat	Concave

PSIO bilateral lip with or without palate pretreatment grade
Width of the bigger cleft	PSIO Unilateral lip with or without palate pretreatment grade (Figure 4).	0-3 mm	4-7 mm	≥8 mm
Deviation of the premaxilla	Distance from the incisive papilla to the midline. The reference line is the perpendicular line to the horizontal line between the pterygomaxillary junctions (tuberosities). If the intraoral record is not clear, the facial midline perpendicular to the midpoint of the distance between both medial canthi of the eyes can be used instead (Figure 5).	0-5 mm	5-10 mm	≥10 mm
Nasolabial angle	The angle formed by the nasal tip (or the most superior point of the columella), subnasale, and upper lip vermilion border (Figure 6).	75°-90°	45°-75°	≤45°
Alar cartilage contour	The curvature formed from the most upper point of the columella to the insertion of the lateral nasal base on the affected side (Figure 3).	Convex	Flat	Concave
Nasal tip projection	The vertical distance between the nasal tip and subnasale (junction point between the columella and the upper lip or philtrum) (Figure 7).	≥4 mm	1-3 mm	0

*For unilateral cleft, at least 2 of the 3 criteria must fall in the same category for that category to be assigned; for bilateral cleft, at least 3 of the 5 criteria must fall in the same category. If the prolabium is less than 50% of the premaxilla size (estimated visually relative to the visible premaxillary width), add one point to the severity.
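The category rule in the footnote to Table 2 can be sketched as follows. This is a hypothetical illustration, not part of the PAT itself: the inputs are the rater's visual grade for each criterion rather than measurements, the function and variable names are invented, the prolabium adjustment is assumed to be capped at severe, and the source does not state what to do when no category reaches the required count, so the sketch returns `None` in that situation.

```python
from collections import Counter

SEVERITY = ["Mild", "Moderate", "Severe"]

# (number of criteria, minimum number that must share a category)
REQUIRED = {"unilateral": (3, 2), "bilateral": (5, 3)}


def pretreatment_grade(cleft_type, criterion_grades, small_prolabium=False):
    """Combine per-criterion grades into one pretreatment grade.

    cleft_type: "unilateral" or "bilateral".
    criterion_grades: one of "Mild", "Moderate", "Severe" per criterion.
    small_prolabium: True if the prolabium is visually judged to be less
        than 50% of the premaxilla size (relevant to bilateral clefts).
    """
    n_criteria, needed = REQUIRED[cleft_type]
    if len(criterion_grades) != n_criteria:
        raise ValueError(f"{cleft_type} grading expects {n_criteria} criteria")

    category, count = Counter(criterion_grades).most_common(1)[0]
    if count < needed:
        return None  # not specified in the source; left unresolved here

    level = SEVERITY.index(category)
    if small_prolabium:
        # Assumption: "add one point" moves up one category, capped at Severe.
        level = min(level + 1, len(SEVERITY) - 1)
    return SEVERITY[level]
```

For example, a unilateral case graded mild for cleft width and moderate for both columella angle and alar cartilage contour would be classed as moderate under this sketch.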

Table 3.

Presurgical Infant Orthopedics (PSIO) Post-treatment Grade for Unilateral and Bilateral Cleft lip and Alveolus with or Without Cleft Palate.

	PSIO Unilateral Lip with or Without Palate Post Treatment Grade*
Evaluation criteria		1	2	3
Alveolar cleft width correction 100% = 0 mm	Correction of the alveolar cleft width compared with the beginning of the treatment, judged in conjunction with overall alveolar alignment and arch form continuity. Each cleft side was assessed individually, and overall grading reflected combined clinical severity. When there is no space between the segments, 100% is achieved.	0%-40%	40%-60%	60%-100%
Columella angle correction 100% = 90°	Correction of columella angle compared with the beginning of treatment.	0%-40%	40%-60%	60%-100%
Alar cartilage contour correction	Correction of Alar cartilage contour regarding the convexity.	No change or minimum change	Mild to moderate convexity	Perfect convexity

	PSIO bilateral lip with or without palate post-treatment grade**
Alveolar cleft width correction 100% = 0 mm	Correction of the alveolar cleft width compared with the beginning of the treatment. When there is no space between the segments, 100% is achieved.	0%-40%	40%-60%	60%-100%
Correction of premaxilla deviation	Correction of the premaxilla deviation compared with the beginning of the treatment. Distance from the incisive papilla to the midline.	0%-40%	40%-60%	60%-100%
Nasolabial angle 100% = 90° (approximate visual reference)	Correction of nasolabial angle compared with the beginning of the treatment. A good Nasolabial angle is near to 90°, which represents 100% of correction achieved.	0%-40%	40%-60%	60%-100%
Alar cartilage correction	Correction of Alar cartilage contour regarding the convexity.	No change or minimum change	Mild to moderate convexity	Perfect convexity
Nasal tip projection 100% ≥4 mm	Correction of Nasal tip projection compared with the beginning of the treatment. Nasal tip projection is considered 100% corrected when distance between the nasal apex and subnasale is ≥ 4 mm.	0%-40%	40%-60%	60%-100%

*For unilateral cleft, cases in which the alveolar segments collapsed during treatment are graded as follows: severe collapse = 1, moderate collapse = 2.

**For bilateral cleft, at least 3 out of the 5 to be considered in one category. If the premaxilla is deflected due to treatment, consider one point less.
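The percentage bands in Table 3 can be illustrated with a short worked calculation. In this study the percentages were estimated visually from photographs rather than measured, so the sketch below only shows the arithmetic implied by the "100% =" reference values. The function names and example numbers are hypothetical, the assignment of the shared boundaries (40% to grade 2, 60% to grade 3) is an assumption, and the collapse and premaxilla deflection adjustments in the footnotes are not modeled.

```python
def percent_correction(baseline, post, target):
    """Share of the baseline-to-target gap closed by treatment (0-100).

    baseline: value before PSIO (e.g. cleft width in mm, angle in degrees).
    post: value after PSIO.
    target: value that counts as 100% correction (e.g. 0 mm, 90 degrees).
    """
    gap = target - baseline
    if gap == 0:
        return 100.0  # assumption: already at the target before treatment
    pct = (post - baseline) / gap * 100
    # Clamp so that worsening reads as 0% and overshoot reads as 100%.
    return max(0.0, min(100.0, pct))


def correction_grade(pct):
    """Map a percentage correction to post-treatment grade 1, 2, or 3."""
    if pct < 40:
        return 1
    if pct < 60:
        return 2
    return 3
```

Under these assumptions, a cleft narrowing from 8 mm to 2 mm closes 75% of the gap to 0 mm and falls in grade 3, while a columella angle improving from 50° to 70° closes 50% of the gap to 90° and falls in grade 2.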

Inter-rater Reliability

Pretreatment Grade Reliability

The inter-rater reliability for pretreatment grades showed a mean κ of 0.83 (95% CI: 0.63-1.00), indicating almost perfect agreement overall. Pairwise κ values ranged from 0.67 to 1.0, with percentage agreement between 81.8% and 100%. Agreement interpretations, according to Landis and Koch criteria, ranged from substantial to almost perfect (Table 4).

Table 4.

Inter-rater Reliability for Pretreatment and Post-treatment Grades.

Pregrade Inter-rater Agreement Test. Total: Mean κ = 0.83 (95% CI: 0.63-1.00)
Evaluator	Evaluator (comparison)	κ	p-value	95% Confidence Interval	Percentage Agreement	Agreement Interpretation
1	2	1.0	<.001	1-1	100%	Almost perfect
	3	0.83	<.001	0.49-1.23	90.9%	Almost perfect
	4	0.83	<.001	0.46-1.24	90.9%	Almost perfect
	5	1.0	<.001	1-1	100%	Almost perfect
	6	0.83	<.001	0.47-1.24	90.9%	Almost perfect
2	3	0.83	<.001	0.47-1.25	90.9%	Almost perfect
	4	0.83	<.001	0.48-1.19	90.9%	Almost perfect
	5	1.0	<.001	1-1	100%	Almost perfect
	6	0.83	<.001	0.48-1.22	90.9%	Almost perfect
3	4	0.67	.006	0.24-1.17	81.8%	Substantial
	5	0.83	<.001	0.45-1.26	90.9%	Almost perfect
	6	0.67	.006	0.25-1.17	81.8%	Substantial
4	5	0.83	<.001	0.46-1.26	90.9%	Almost perfect
	6	1.0	<.001	1-1	100%	Almost perfect
5	6	0.83	<.001	0.47-1.19	90.9%	Almost perfect

Postgrade Inter-rater Agreement Test. Total: Mean κ = 0.75 (95% CI: 0.52-1.00)
Evaluator	Evaluator (comparison)	κ	p-value	95% Confidence Interval	Percentage Agreement	Agreement Interpretation
1	2	1.0	<.001	1-1	100%	Almost perfect
	3	0.67	.006	0.27-1.14	80%	Substantial
	4	1.0	<.001	1-1	100%	Almost perfect
	5	1.0	<.001	1-1	100%	Almost perfect
	6	0.65	.006	0.22-1.14	80%	Substantial
2	3	0.67	.005	0.25-1.14	80%	Substantial
	4	1.0	<.001	1-1	100%	Almost perfect
	5	1.0	<.001	1-1	100%	Almost perfect
	6	0.65	.006	0.23-1.16	80%	Substantial
3	4	0.67	.005	0.25-1.17	80%	Substantial
	5	0.67	.005	0.27-1.14	80%	Substantial
	6	0.66	.005	0.27-1.16	80%	Substantial
4	5	1.0	<.001	1-1	100%	Almost perfect
	6	0.66	.005	0.24-1.11	80%	Substantial
5	6	0.65	.006	0.24-1.14	80%	Substantial

Pairwise κ Values, 95% Confidence Intervals (Truncated at 1.0), Percentage Agreement, and Agreement Interpretation are Shown.

Post-treatment Grade Reliability

For post-treatment grades, the mean inter-rater κ was 0.75 (95% CI: 0.52-1.00). Pairwise κ values ranged from 0.65 to 1.0, with percentage agreement between 80% and 100%, indicating substantial to almost perfect agreement (Table 4).

Intrarater Reliability

Pretreatment Grade Reliability

Intrarater reliability for pretreatment grades showed a mean κ of 0.81 (95% CI: 0.57-1.00), with percentage agreement ranging from 75% to 100%. Agreement interpretations ranged from moderate to almost perfect (Table 5).

Table 5.

Intrarater Reliability for Pretreatment and Post-treatment Grades.

Pregrade Intrarater Agreement Test. Total: Mean κ = 0.81 (95% CI: 0.57-1.00)
First Evaluation	Second Evaluation	κ	p-value	95% Confidence Interval	Percentage Agreement	Agreement Interpretation
1	1	0.83	<.001	0.46-1.23	90.9%	Almost perfect
2	2	0.67	.005	0.24-1.16	80%	Substantial
3	3	0.83	<.001	0.45-1.25	90.9%	Almost perfect
4	4	0.60	.02	0.22-1.11	75%	Moderate
5	5	1	<.001	1-1	100%	Almost perfect
6	6	1	<.001	1-1	100%	Almost perfect

Postgrade Intrarater Agreement Test. Total: Mean κ = 0.70 (95% CI: 0.60-0.80)
First Evaluation	Second Evaluation	κ	p-value	95% Confidence Interval	Percentage Agreement	Agreement Interpretation
1	1	0.67	.005	0.26-1.15	80%	Substantial
2	2	0.83	<.001	0.46-1.25	90.9%	Almost perfect
3	3	0.83	<.001	0.45-1.25	90.9%	Almost perfect
4	4	0.83	<.001	0.45-1.25	90.9%	Almost perfect
5	5	0.67	.005	0.24-1.16	80%	Substantial
6	6	0.67	.005	0.24-1.15	80%	Substantial

κ Values, 95% Confidence Intervals (Truncated at 1.0), Percentage Agreement, and Agreement Interpretation are Presented.

Post-treatment Grade Reliability

For post-treatment grades, the mean κ was 0.70 (95% CI: 0.60-0.80), with percentage agreement between 80% and 90.9%, indicating substantial to almost perfect agreement (Table 5).

Discussion

The absence of standardized PSIO appliances, treatment protocols, and assessment tools has made it difficult to measure treatment effectiveness in patients with unilateral and bilateral clefts of the lip with or without cleft palate, and has hindered cleft teams from systematically monitoring outcomes and refining care strategies. The present study primarily demonstrates that experienced cleft orthodontists were able to apply the PAT with acceptable consistency when evaluating standardized retrospective photographic records. The observed inter-rater and intrarater agreement suggests that the tool may provide a reproducible framework for describing pretreatment cleft severity and post-treatment morphological changes following PSIO therapy. However, the present findings should be interpreted as preliminary reliability data rather than comprehensive validation of the tool.

The PAT was designed to provide a structured and standardized framework for documenting observable morphological changes associated with PSIO treatment in infants with CLP. The tool allows consistent assessment of features such as cleft width, alveolar alignment, nasal symmetry, columella position, and soft tissue morphology using standardized clinical photographs. Unlike advanced quantitative approaches such as 3D imaging, digital morphometric analysis, or stereophotogrammetry, the PAT was intentionally designed as a simple, clinically accessible, and low-resource assessment framework that can be readily implemented in routine cleft care settings. While the present study demonstrated encouraging reliability among experienced orthodontists, the study was not designed to evaluate the biological effectiveness or long-term clinical benefit of PSIO itself. Ongoing debate persists regarding the magnitude, stability, and durability of PSIO-related outcomes, particularly with respect to long-term facial growth, dental arch development, and functional outcomes. Accordingly, PAT should be viewed as a descriptive assessment framework rather than as evidence supporting any specific PSIO protocol or treatment approach. The PAT is therefore not intended to replace high-end quantitative analyses, but rather to complement them by providing a practical and reproducible clinical assessment tool applicable across diverse healthcare environments. Previous studies have suggested that although PSIO may improve early presurgical morphology and facilitate surgical approximation of tissues, evidence regarding sustained long-term craniofacial benefits remains inconclusive. Some authors have also questioned whether early orthopedic changes necessarily translate into meaningful long-term growth advantages, emphasizing the need for cautious interpretation of short-term treatment effects.^18,19

At the same time, the ability to reliably document early morphological changes may still hold important clinical relevance. Standardized assessment of presurgical changes can assist multidisciplinary cleft teams in communication, treatment planning, longitudinal record keeping, and comparison of outcomes across institutions and treatment protocols. Furthermore, a reproducible tool such as PAT may facilitate future prospective multicenter studies investigating the relationship between early PSIO-induced morphological improvements and longer-term surgical, esthetic, functional, and growth-related outcomes. In this context, the PAT may contribute not only to short-term treatment assessment but also to the broader understanding of the long-term impact of PSIO within comprehensive cleft care.^18,19

The PAT was designed to be intuitive and clinically applicable, requiring only minimal calibration for consistent use. Because the included parameters (cleft width, nasal symmetry, columella angle, and alveolar alignment) are routinely evaluated by clinicians trained in PSIO, the learning curve for implementing the PAT is expected to be low. A brief calibration session involving review of reference cases and consensus on grading criteria was sufficient to achieve high inter-rater and intrarater agreement among the expert panel. It is important to note that this study was not designed to quantify the biological or clinical treatment effect of PSIO, but rather to assess the preliminary reliability of a standardized, expert-based tool for detecting and grading PSIO-related changes. The grading categories were not intended to dictate surgical technique or establish treatment thresholds, but rather to provide standardized descriptive assessment of presurgical morphology. Further studies are necessary to determine the applicability of the tool across broader clinical settings and multidisciplinary teams, including validation among nonorthodontic cleft care providers and clinicians with varying levels of PSIO experience.

Future developments in cleft outcome assessment are likely to incorporate artificial intelligence (AI), machine learning, and advanced digital imaging technologies. Recent advances in automated image analysis, 3D facial scanning, stereophotogrammetry, and AI-based morphometric assessment have demonstrated the potential to improve the objectivity, precision, and reproducibility of craniofacial evaluations.^20–23 In this context, standardized frameworks such as the PAT may provide an important foundation for the development of automated or semi-automated assessment systems. The structured parameters included in the PAT could potentially be integrated with digital image analysis algorithms to facilitate objective quantification of cleft morphology, automated scoring of treatment-related changes, and large-scale multicenter outcome comparisons. Furthermore, machine learning models trained on standardized datasets may help identify predictors of treatment response and support more individualized treatment planning.^21 While such applications remain exploratory and require rigorous validation, the PAT may serve as a clinically relevant framework that complements future AI-driven approaches to cleft outcome assessment.

Limitations

This study has several limitations. First, the reliability analysis was conducted on a relatively small retrospective sample without a formal sample size calculation, limiting the precision and generalizability of the agreement estimates. Second, the study used retrospective standardized 2D photographic records, which may not fully capture the 3D complexity of cleft morphology and may introduce variability related to image quality, head positioning, and timing of image acquisition. Third, all evaluators were experienced orthodontists with cleft and PSIO expertise; the findings may therefore not be generalizable to nonorthodontist clinicians or less experienced raters. Fourth, although the PAT demonstrated encouraging inter-rater and intrarater reliability, the present study did not evaluate construct validity, criterion validity, responsiveness, or correlation with objective morphometric measurements or long-term surgical and functional outcomes. Fifth, several post-treatment grading criteria relied on subjective estimation of percentage correction from baseline photographs, which may introduce observer variability despite calibration efforts. Sixth, the study was limited to records obtained through a single documentation system and has not yet undergone external multicenter validation using independent datasets. Seventh, κ values categorized using the Landis and Koch criteria should be interpreted cautiously, particularly in small-sample reliability studies. Finally, the tool focuses primarily on morphological outcomes and does not capture functional or psychosocial impacts. Future studies with larger and more diverse samples, objective morphometric comparisons, and longitudinal follow-up are necessary to further establish the clinical utility and validity of the PAT.
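
The caution regarding κ interpretation in small samples can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the ratings are hypothetical, the function names (`quadratic_weighted_kappa`, `landis_koch`, `bootstrap_interval`) are illustrative, and the percentile bootstrap is only one of several possible ways to express uncertainty. The sketch computes a quadratic weighted Cohen's κ for two raters, maps it to a Landis and Koch category, and derives an approximate 95% interval so that the spread of plausible values can be compared with the category boundaries.

```python
import random
from collections import Counter


def quadratic_weighted_kappa(r1, r2, k):
    """Quadratic weighted Cohen's kappa for two raters, grades coded 0..k-1."""
    n = len(r1)
    joint = Counter(zip(r1, r2))
    m1, m2 = Counter(r1), Counter(r2)
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            observed += w * joint[(i, j)] / n
            expected += w * (m1[i] / n) * (m2[j] / n)
    if expected == 0.0:  # both raters used one identical grade: kappa undefined
        return float("nan")
    return 1.0 - observed / expected


def landis_koch(kappa):
    """Descriptive category for a kappa value (Landis and Koch, 1977)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


def bootstrap_interval(r1, r2, k, n_boot=2000, seed=0):
    """Percentile bootstrap 95% interval, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(r1)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        est = quadratic_weighted_kappa(
            [r1[i] for i in idx], [r2[i] for i in idx], k)
        if est == est:  # drop undefined (NaN) resamples
            estimates.append(est)
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


# Hypothetical grades (0-3) from two raters on 12 cases; NOT study data.
rater_a = [0, 1, 1, 2, 2, 3, 3, 0, 1, 2, 3, 2]
rater_b = [0, 1, 2, 2, 2, 3, 2, 0, 1, 2, 3, 3]

kappa_hat = quadratic_weighted_kappa(rater_a, rater_b, k=4)
low, high = bootstrap_interval(rater_a, rater_b, k=4)
print(f"kappa = {kappa_hat:.2f} ({landis_koch(kappa_hat)})")
print(f"95% bootstrap interval: {low:.2f} to {high:.2f} "
      f"({landis_koch(low)} to {landis_koch(high)})")
```

When the interval spans more than one Landis and Koch category, the category attached to the point estimate alone may overstate the certainty of the agreement, which is the reason for reporting interval estimates alongside κ in small-sample reliability studies.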

Conclusion

The PAT demonstrated encouraging preliminary inter-rater and intrarater reliability among experienced cleft orthodontists evaluating standardized retrospective photographic records. The tool may provide a structured framework for describing pretreatment cleft severity and post-treatment morphological changes following PSIO therapy in unilateral and bilateral cleft cases. However, the present findings represent an initial reliability assessment rather than comprehensive validation. Further studies involving larger and more diverse populations, external validation, objective morphometric comparisons, and correlation with surgical and functional outcomes are necessary before broader clinical implementation.

Footnotes

ORCID iDs

Puneet Batra

Tatiana Castillo

Ethical Statement

Ethical approval for the study was obtained from the Institutional Ethical Committee of Manav International Institute of Research and Studies (Ref No: MRIIRS/MRDC/SDS/IEC/2024/129).

Informed Consent

Not applicable.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability

Data related to this article are available from the corresponding author upon reasonable request.

References

1. Dixon, Marazita, Beaty, Murray. Cleft lip and palate: Understanding genetic and environmental influences. Nat Rev Genet. 2011;12(3):167-178.

2. Shkoukani, Chen, Vong. Cleft lip - a comprehensive review. Front Pediatr. 2013;1:53.

3. Losee, Kirschner. Comprehensive Cleft Care. McGraw-Hill; 2009.

4. Henry, Samson, Mackay. Evidence-based medicine: The cleft lip nasal deformity. Plast Reconstr Surg. 2014;133(5):1276-1288.

5. Ozsoy, Demirel, Yildirim, Tosun, Sarikcioglu. Method selection in craniofacial measurements: Advantages and disadvantages of 3D digitization method. J Craniomaxillofac Surg. 2009;37(5):285-290.

6. McNeil. Congenital oral deformities. Arch Pediatr. 1950;67(4):285-298.

7. Latham, Kusy, Georgiade. An extraoral force appliance for newborn cleft lip and palate. Plast Reconstr Surg. 1980;66(5):687-694.

8. Grayson, Santiago, Brecht, Cutting. Presurgical nasoalveolar molding in infants with cleft lip and palate. Cleft Palate Craniofac J. 1993;30(6):379-385.

9. Vinson. The effect of DynaCleft® on cleft width in unilateral cleft lip and palate patients. J Clin Pediatr Dent. 2017;41(6):442-445.

10. Mejia, Wolfe, Murphy, Rothenberg, Tejero, Bauer, Wolfe. Gingivosupraperiosteoplasty following presurgical maxillary orthopedics is associated with normal midface growth in complete unilateral and bilateral cleft patients at mixed dentition. Plast Reconstr Surg. 2021;148(6):1335-1346.

11. Mejia, Pablo Gomez, Moon, Wolfe, Perlyn, Anthony Wolfe, Steinberg. 3D Infant orthopedic nasal molding system for improved outcomes in cleft nasal deformity. FACE. 2023;4(2):141-147.

12. Batra, Gribel, Abhinav, Arora, Raghavan. Orthoaligner “NAM”: A case series of presurgical infant orthopedics (PSIO) using clear aligners. Cleft Palate Craniofac J. 2020;57(5):646-655.

13. Jones, Al-Ghatam, Atack, Deacon, Power, Albery, Ireland, Sandy. A review of outcome measures used in cleft care. J Orthod. 2014;41(2):128-140.

14. Sarilita, Sjamsudin, Mossey. Scoping review of outcome measures in cleft care used in research and reports. Orthod Craniofac Res. 2024;27(Suppl 1):42-48.

15. Aycart, Caterson. Advances in cleft lip and palate surgery. Medicina (Kaunas). 2023;59(11):1932.

16. Allori, Kelley, Meara, Albert, Bonanthaya, Chapman, Cunningham, Daskalogiannakis, de Gier, Heggie, et al. A standard set of outcome measures for the comprehensive appraisal of cleft care. Cleft Palate Craniofac J. 2017;54(5):540-554.

17. Landis, Koch. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159-174.

18. Batra, Kubavat, Kuijpers-Jagtman, Gribel, Ahuja. Impact of three types of presurgical infant orthopedics on nasolabial appearance in unilateral cleft lip and palate: A 4-year follow-up study. Angle Orthod. 2026:e082825-729.1. doi: 10.2319/082825-729.1

19. Batra, Kubavat, Ahuja. Four-year follow-up comparison of three pre-surgical infant orthopedic methods on mandibular arch morphology in unilateral cleft lip and palate: A retrospective study. Int Orthod. 2025;23(4):101013. doi: 10.1016/j.ortho.2025.101013

20. Nalabothu, Nandan, Reddy, Mueller. Smartphone and AI workflow for 3D printed plate for presurgical therapy in cleft lip and palate: Retrospective evaluation of outcomes. Cleft Palate Craniofac J. 2026;63(3):376-382. doi: 10.1177/10556656251400877

21. Nalabothu, Nandan, Gosla Reddy, de Macêdo Santos, Mueller. Smartphone scanning and machine learning for automated presurgical 3D-printed plate fabrication from cleft impressions. Plast Reconstr Surg Glob Open. 2025;13(9):e7134. doi: 10.1097/GOX.0000000000007134

22. Gomez, Batra, Echeverry, Dominguez, Ahuja, Saha. A step-by-step guide for implementing digital scanning in presurgical infant orthopedics (PSIO). Cleft Palate Craniofac J. 2025:10556656251380611. doi: 10.1177/10556656251380611

23. Nalabothu, Benitez, de Macedo Santos, Mueller. Cleft lip and palate digital impression workflow. Plast Reconstr Surg Glob Open. 2025;13(5):e6741. doi: 10.1097/GOX.0000000000006741