Abstract
Background
This systematic review aims to compare the effectiveness of all-polyethylene (PE) glenoid implants and metal-backed (MB) implants in anatomical total shoulder arthroplasty.
Methods
We systematically reviewed the literature and included full-text randomized clinical trials (RCTs) comparing PE (keeled and pegged) versus MB implants, available in PubMed, Scopus, Embase, Cochrane CENTRAL, LILACS, Web of Science, WHO ICTRP, and ClinicalTrials.gov. Grey literature was also hand searched and assessed.
Results
Eight RCTs were included, with 323 patients and 338 shoulders and follow-up ranging from six weeks to ten years. Among PE components, shoulder function (Constant-Murley score), range of motion, and pain (Visual Analogue Scale) showed no difference between designs; radiolucent lines (RL) up to grade two were reduced by around 50%, and complications and revision surgeries by 12.4%, in favour of the peg over the keel. Compared with MB, there was 25% more RL around PE; however, function and complication rates were equivalent.
Discussion
The PE component in anatomical total shoulder arthroplasty shows more RL than the MB component, especially around the keel design. Despite this, there were no significant differences in complications or revision surgery between the two groups.
Keywords
Introduction
Osteoarthritis (OA) of the glenohumeral joint is a common clinical condition often managed non-surgically with non-steroidal anti-inflammatory drugs, intra-articular injections, and physiotherapy. Arthroscopic debridement may also be considered; however, anatomic total shoulder arthroplasty (TSA) remains the preferred treatment in advanced stages or when conservative methods fail.1–3 A functional rotator cuff is essential for anatomic TSA; otherwise, reverse shoulder arthroplasty is indicated. 4
Failure of prosthetic components accounts for up to 39% of complications following anatomic TSA, 5 with the glenoid component being the most common site of failure, primarily due to the development of radiolucent lines (RL) at the implant–bone interface (80%), increasing the risk of aseptic loosening.6–9
In addition to surgical technique, selecting a stable, anatomically conforming implant is critical.10,11 Polyethylene (PE) glenoids may be cemented in a keeled 12 or pegged design, the latter characterized by posterior fixation pegs. 13 To reduce RL, preserve subchondral bone, and enable central reaming, metal-backed (MB) designs have been adopted.14,15
Substantial uncertainty remains regarding implant performance and complications. Although prior systematic reviews (SR)5,9,16–19 addressed this topic, many included non-comparative, low-evidence studies, yielding conflicting results. This SR includes only randomized or quasi-randomized trials directly comparing PE and MB components.
Materials and methods
This systematic review was conducted in accordance with the Cochrane recommendations 20 and adhered to the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), 21 aiming to assess the effectiveness of polyethylene (PE) versus metal-backed (MB) glenoid implants in anatomical total shoulder arthroplasty. The PICOS strategy was applied as follows: Population (P) – Adults above 18 years old; Intervention (I) – Total shoulder arthroplasty with metal-backed glenoid component; Comparison (C) – Total shoulder arthroplasty with polyethylene glenoid component, keeled or pegged; Outcomes (O) – Primary: shoulder function and complications. Secondary: quality of life, complications; Type of study (S) – Randomized or quasi-randomized controlled trial. This review was registered in PROSPERO (CRD 42018079537) on 11 January 2018. The methodology was established prior to conducting the review. The study protocol was previously published in BMJ Open. 22
Eligibility criteria
The eligibility criteria included full-text RCTs published up to April 2025. Searches were conducted from January 2021 in MEDLINE (PubMed), Cochrane Central (issue 1, 2025), LILACS, EMBASE, WHO ICTRP and ClinicalTrials.gov.
Grey literature was sourced from Google Scholar, OpenGrey, and GreyNet. No language or date restrictions were applied.
Non-RCT, experimental, cadaveric, cohort, observational, case report, and case–control studies were excluded, as were studies with fewer than five participants.
Search strategy
Detailed strategies for each database are in Supplementary Material 1.
Selection of studies
Three reviewers (RZ, MT, and RS) independently screened titles and abstracts. Full texts were reviewed to identify RCTs and quasi-RCTs (where allocation was not fully randomized). Disagreements were resolved through discussion or adjudicated by a senior reviewer (JB).
Data summarization
Two reviewers (RZ and MT) independently extracted data using a pre-designed Excel® form (version 2205). Extracted data included study methods, author/institutional details, funding sources, participant characteristics (number of shoulders randomized/analysed, losses, baseline characteristics, eligibility criteria), interventions, and outcomes (type, scale, units, direction, and remarks). Discrepancies were resolved by discussion. Final data were exported to Review Manager 5 by reviewer JB.
Quality assessment
The risk of bias was independently assessed by two reviewers (RZ, FM) utilizing the Cochrane Risk of Bias tool for randomized trials (RoB 2). 23 The overall risk of bias for each outcome was then categorized as ‘Low risk’, ‘Some concerns’, or ‘High risk’.
Discrepancies were resolved by consensus. The following domains were considered: bias arising from the randomization process, deviations from intended interventions, missing outcome data, outcome measurement, and selection of the reported result.
Statistical analysis
For continuous data, means, standard deviations (SD), and group sizes were extracted. If studies reported other dispersion measures, SDs were calculated per Cochrane Handbook guidance. 24
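The conversion of other dispersion measures to SD can be illustrated with a minimal sketch. It assumes the large-sample formulas commonly given for a single group (SD from a standard error, and SD from a 95% confidence interval of the mean using z = 1.96); the function names and the numeric values are hypothetical and are not taken from the included trials.

```python
import math

Z_95 = 1.96  # two-sided 95% normal quantile (large-sample assumption)


def sd_from_se(se, n):
    """SD of one group from the standard error of its mean."""
    return se * math.sqrt(n)


def sd_from_ci95(lower, upper, n):
    """SD of one group from the 95% CI of its mean (normal approximation)."""
    return math.sqrt(n) * (upper - lower) / (2 * Z_95)


# Hypothetical group: n = 25, SE = 2.0, 95% CI of the mean 62.08 to 69.92
sd_a = sd_from_se(2.0, 25)
sd_b = sd_from_ci95(62.08, 69.92, 25)
print(round(sd_a, 2), round(sd_b, 2))
```

For small groups, a t-distribution quantile would usually replace 1.96; that refinement is omitted here.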
Meta-analyses used the inverse variance or generic inverse variance method under a random-effects model. Standardized mean differences (SMD) were used when outcomes had different units and were presented with 95% confidence intervals (CI). For interpretability, SMDs were translated to a typical scale in the ‘Summary of Findings’ table. 25 If medians and interquartile ranges (IQR) or ranges were reported, the method of Wan et al. 26 was used to estimate means and SDs.
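The estimation of means and SDs from medians with IQR or range can be sketched as follows. This is an illustration of the estimators usually attributed to Wan et al., written from the published formulas as understood here; the function names and example values are hypothetical, and the source does not state which scenario applied to which trial.

```python
from statistics import NormalDist


def mean_sd_from_iqr(q1, median, q3, n):
    """Estimate mean and SD from first quartile, median, third quartile."""
    mean = (q1 + median + q3) / 3
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    sd = (q3 - q1) / (2 * z)
    return mean, sd


def mean_sd_from_range(minimum, median, maximum, n):
    """Estimate mean and SD from minimum, median, maximum."""
    mean = (minimum + 2 * median + maximum) / 4
    z = NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    sd = (maximum - minimum) / (2 * z)
    return mean, sd


# Hypothetical group of 20 shoulders
mean_iqr, sd_iqr = mean_sd_from_iqr(60, 70, 80, 20)
mean_rng, sd_rng = mean_sd_from_range(40, 70, 100, 20)
print(round(mean_iqr, 1), round(sd_iqr, 1), round(mean_rng, 1), round(sd_rng, 1))
```

Both estimators assume approximately normal data; for markedly skewed outcomes the estimates can be misleading.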
For dichotomous outcomes, event counts and group sizes were extracted, and risk ratios (RR) were calculated using a random-effects model. Each shoulder was considered the unit of analysis. When studies reported multiple timepoints, the longest follow-up data were used.
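For a single trial, the risk ratio and its confidence interval can be computed on the log scale. The sketch below is a minimal illustration assuming the standard large-sample standard error of ln(RR); the function name and the event counts are hypothetical, and zero-event arms (which would need a continuity correction) are rejected rather than handled.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A versus group B with a 95% CI (log method)."""
    if min(events_a, events_b) == 0:
        raise ValueError("zero events: a continuity correction would be needed")
    rr = (events_a / n_a) / (events_b / n_b)
    # Large-sample standard error of ln(RR)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


# Hypothetical trial: 5 events in 50 shoulders versus 10 events in 50 shoulders
rr, low, high = risk_ratio(5, 50, 10, 50)
print(round(rr, 2), round(low, 2), round(high, 2))
```

A confidence interval that includes 1 corresponds to a difference that is not statistically significant at the 5% level.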
Clinical heterogeneity was assessed by comparing study populations, interventions, and outcomes. Statistical heterogeneity was evaluated using forest plots and the Chi² test (P < .10 threshold) and quantified with the I² statistic. 27 Publication bias was to be assessed using funnel plots if ten or more studies were included in a meta-analysis; this criterion was not met. Available protocols were examined to detect reporting bias.
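The heterogeneity statistics named above can be sketched from study-level effects and variances. This minimal example computes Cochran's Q (the Chi² heterogeneity statistic) and I²; the P value of the Chi² test is omitted because it needs a chi-square distribution function that the Python standard library does not provide. The function name and the input values are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom, and I-squared (percent)."""
    weights = [1 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q is smaller than its degrees of freedom
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Three hypothetical studies with equal variances
q, df, i2 = heterogeneity([0.0, 0.4, 0.8], [0.04, 0.04, 0.04])
print(round(q, 2), df, round(i2, 1))
```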
Meta-analyses were performed only when more than one study with comparable participants, interventions, and outcomes was available. All analyses followed the Cochrane Handbook 27 and were conducted using Review Manager 5. Continuous outcomes were summarized using mean differences (MD) or standardized mean differences (SMD); dichotomous outcomes were summarized using RR, all via random-effects models.
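Inverse-variance pooling under a random-effects model can be sketched as below. The source does not name the between-study variance estimator, so this example assumes the DerSimonian–Laird estimator; the function name and the inputs are hypothetical. For dichotomous outcomes the same steps would be applied to ln(RR) and its variance, with the result exponentiated afterwards.

```python
import math


def dersimonian_laird(effects, variances, z=1.96):
    """Random-effects pooled estimate, 95% CI, and tau-squared (DL estimator)."""
    w = [1 / v for v in variances]  # fixed-effect inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - z * se, pooled + z * se, tau2


# Three hypothetical studies (effect estimate, variance)
pooled, low, high, tau2 = dersimonian_laird([0.0, 0.4, 0.8], [0.04, 0.04, 0.04])
print(round(pooled, 2), round(low, 2), round(high, 2), round(tau2, 2))
```

When tau-squared is zero the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.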
Results
The database search, applying filters according to pre-established inclusion and exclusion criteria, yielded 641 records. Initially, 125 duplicates were removed, and an additional 451 records were excluded based on title and abstract screening. The remaining 64 studies underwent full-text screening. Ultimately, eight RCTs or quasi-RCTs met the eligibility criteria and were included for data extraction and final analysis.28–35 The PRISMA flowchart outlining the study selection process and reasons for exclusion is presented in Figure 1.

Figure 1. PRISMA flow diagram.
Characteristics of the studies
All included studies were single-centre, two-arm parallel group RCTs, except for one multicentre study. 33 Follow-up durations ranged from 6 weeks 32 to 10 years. 28 The trials were conducted in the United States (n = 4), Canada (n = 1), France (n = 1), England (n = 1), and Sweden (n = 1). The included studies were published between 2002 and 2021 (Table 1).
Table 1. Demographic characteristics of included studies.
NA: not available
Summarization of the studies
A total of 323 patients were included across the seven RCTs. Notably, 40 participants initially enrolled in one trial 29 were followed up in a subsequent study, 30 leading to a total of 338 randomized shoulders. There were 19 shoulders lost to follow-up; however, loss to follow-up data was not reported in one study. 30 Among the included patients, 151 were male and 157 were female. The age at the time of surgery ranged from 60.1 to 73.9 years, although one study did not report age or sex, 34 and another provided incomplete data on age. 32 These characteristics are summarized in Table 1.
Primary glenohumeral osteoarthritis was reported as an inclusion criterion in all studies except one. 34 Dominance of the operated shoulder was described in 65 participants in three studies.29–31
Risk of bias of included studies
The risk of bias across the included studies is summarized in Figures S1 and S2. We identified that approximately 62.5% of the studies presented an overall risk of bias classified as ‘some concerns’, while the remaining 37.5% were judged to be at high risk of bias.
Randomization process
Issues related to random sequence generation, shoulder allocation, or group imbalance were identified in three studies28–30 which were therefore classified as having ‘some concerns’. The remaining five studies31–35 were considered to have been appropriately conducted in this regard.
Intervention
Inadequate or unreported blinding of participants or study personnel, or deviations from the intended interventions that could have affected outcomes, were strongly observed in two RCTs,30,34 partially in four,29,31,33,35 and only two studies fully adhered to the protocol.28,32
Outcome
In this domain, seven of the included studies presented issues such as lack of protocol registration or the use of multiple outcome measures, with only the most favourable being reported.28–34 Only the study by Chin et al. 35 reported pre-specified outcomes without selective scale reporting or tailored analyses.
Selective reports
Inadequate or unreported blinding of patients or research staff to the assigned intervention, or deviation from the planned protocol affecting outcome assessment, was strongly noted in two RCTs,30,34 partially in four,29,31,33,35 and only two trials followed the protocol as intended.28,32
Imprecision was identified in five RCTs28–30,32,35 due to unclear reporting regarding the blinding of surgeons, outcome assessors, or participants, which may have impacted the study results due to awareness of the assigned intervention. In the remaining three studies,31,33,34 sufficient information was available to classify them as having a low risk of bias in this domain.
Regarding the overall risk of bias assessment, although the trials were classified as level I or II evidence, approximately 37.5% were judged to be at high risk of bias. This finding has direct implications for the quality of evidence and, consequently, the strength of recommendations, as assessed using the GRADE approach.
Peg versus keel
Four studies28,30,33,34 assessed functional outcomes in 95 shoulders (44 peg and 51 keel components) using the Constant-Murley Score. The keel group had a mean score of 66.2 and the peg group 61.14, with no statistically significant difference between groups (MD −0.17; 95% CI, −0.62 to 0.28; P = .46) (Figure 2). Only one study 28 assessed pain outcomes, evaluating 20 shoulders (10 keel and 10 peg) using the VAS. The keel group had a mean of 6 and the peg group 7, with no statistically significant difference.

Figure 2. Forest plot, peg versus keel: function and complications.
Active anterior flexion and abduction range of motion were assessed in two studies,28,30 totalling 58 shoulders (26 peg and 32 keel). The weighted mean anterior flexion at final follow-up was 125° for the peg group and 142.4° for the keel group (WMD = −11.09; 95% CI, −31.66 to 9.48; P = .29). The weighted mean abduction at final follow-up was 124° for the peg group and 141.2° for the keel group (WMD = −14.17; 95% CI, −42.72 to 14.38; P = .33).
Therefore, no statistically significant differences were observed between groups for active anterior flexion or abduction. Other planes of motion were either not assessed or were evaluated in only one study.
Complications and revision surgeries were assessed in four studies, encompassing 121 shoulders (60 peg and 61 keel). The rate of complications or revision surgery was 19.7% with the keeled glenoid component and 7.3% with the pegged implant (RR 0.37; 95% CI, 0.13 to 1.05; P = .06) (Figure 2).
Isolated surgical complications included implant loosening, rotator cuff tear, instability, glenoid fracture after a fall, infection, haematoma drainage, and polyethylene wear associated with pain and functional deterioration. Complications occurred in 16.7% of shoulders in the keel group and in 7.3% of shoulders in the peg group (RR = 0.44; 95% CI, 0.13 to 1.45; P = .18). Surgical revision was required in 19.7% of cases in the keel group and in 7.3% in the peg group (RR = 0.43; 95% CI, 0.13 to 1.44; P = .17).
Component loosening was assessed based on the presence of radiolucent lines (RL). RL graded zero or one were assessed in three studies,29,32,33 including 114 shoulders (55 peg and 59 keel). Keel implants presented RL grade 0–1 in 71.2% of shoulders, whereas peg implants showed a lower incidence (36.3%) (RR 0.51; 95% CI, 0.30 to 0.88; P = .01). However, for loosening graded as four or five,29,30,32,33 which represents an imminent risk of loosening, the rate was 11.1% for keel implants and 11.7% for peg implants, with no statistically significant difference (RR 1.05; 95% CI, 0.23 to 4.93; P = .95) (Figure 3).

Figure 3. Forest plot, peg versus keel: radiolucent lines, grades 0–1 and 4–5.
Therefore, the peg-type glenoid implant may reduce or have little to no effect on the presence of periprosthetic radiolucency at least grade 2; however, it may increase or have little to no effect on grades 4–5. Nevertheless, the evidence remains very uncertain (Table S-2).
Metal-back versus polyethylene
In the comparative analysis between metal-back (MB) and polyethylene (PE) glenoid components, two studies were included.31,35
Functional outcomes were assessed in 129 shoulders (64 MB and 65 PE) using the ASES score. The MB group achieved a mean score of 88.8 and the PE group 87.9, with no statistically significant difference between the two groups (WMD = −0.06; 95% CI, −0.39 to 0.28; P = .72) (Figure 4). Pain was assessed only in the study by Boileau et al., 31 using the Constant Score pain subscale, in a total of 35 shoulders (18 MB and 17 PE). The PE group reported less pain, with a mean score of 13 points compared with 12 points in the MB group, but the difference was not statistically significant.

Figure 4. Forest plot, polyethylene versus metal-backed.
Range of motion for active anterior flexion and external rotation was evaluated in both included studies,31,35 with no heterogeneity. A total of 124 shoulders (62 MB and 62 PE) were analysed. The weighted mean anterior flexion at final follow-up was 141° for the MB group and 146.5° for the PE group (WMD = 5.5; 95% CI, −2.16 to 13.16; P = .16). The weighted mean external rotation was 49° for the MB group and 44.1° for the PE group (WMD = −4.88; 95% CI, −12.41 to 2.64; P = .20).
In total, 133 shoulders were evaluated for complications and the presence of RL, including 66 MB and 67 PE components. Patients treated with the PE glenoid component presented a 7.2% rate of complications or revision surgeries, whereas the MB implant showed a 13.6% rate (RR 0.53; 95% CI, 0.06 to 4.71; P = .57) (Figure 4).
Surgical revision was required in 9.1% of MB implants and 4.3% of PE components (RR = 0.47; 95% CI, 0.06 to 3.52; P = .46). Although there were fewer complications and revision surgeries in the PE group, this was not statistically significant.
Radiolucent lines were present in 9.1% of metal-backed implants, whereas PE implants showed a higher incidence (34.4%), a statistically significant difference (RR = 3.78; 95% CI, 1.82 to 7.85; P < .01) (Figure 4). However, the certainty of the evidence was rated as low due to study limitations and imprecision, according to the GRADE approach (Table S-3).
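As a quick arithmetic illustration of how the relative and absolute measures relate, the two radiolucent line percentages reported above can be combined directly. This is a sketch using only those two proportions; the variable names are hypothetical, and a pooled RR from a random-effects model need not equal the crude ratio of overall proportions in general.

```python
# Proportions of implants with radiolucent lines, as reported in the text
p_mb = 0.091  # metal-backed
p_pe = 0.344  # polyethylene

risk_ratio = p_pe / p_mb       # relative measure (PE versus MB)
risk_difference = p_pe - p_mb  # absolute measure, as a proportion

print(round(risk_ratio, 2), round(risk_difference * 100, 1))
```

The ratio describes how many times more frequent radiolucent lines were with PE, while the difference expresses the same contrast in percentage points.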
Discussion
This is the first meta-analysis to compare glenoid components in anatomical total shoulder arthroplasty using only randomized or quasi-randomized clinical trials.
When comparing functional outcomes between PE components (peg and keel), an absolute difference of 5.1 points favouring the keel design was found. However, this did not reach the minimum clinically important difference of 5.7 points on the Constant-Murley Score. This finding aligns with the results reported by Moulton et al. 35 in their multicentre retrospective study, which observed no significant difference between PE implant designs.
Regarding complications, Vavken et al. 16 conducted a review limited to keeled and pegged components, including both randomized and retrospective studies. They reported higher surgical complication and revision rates in the keel group, with an average reduction of 8.9% in favour of the peg. Our findings are consistent, showing up to a 17.1% reduction in complications favouring the peg design.
Our study also demonstrated a higher prevalence of radiolucent lines (RL) in the keel group. This result is in line with Lazarus et al. 13 who retrospectively compared 328 shoulders and found a mean RL score of 1.8 for the keel versus 1.3 for the peg. However, no significant difference was observed in RL indicative of loosening (grades four and five), a result that concurs with Moulton et al., 35 who reported RL in 15% and 20% of the keel and peg implants, respectively, with no statistically significant difference.
In the comparison between PE and MB components, no significant difference was observed in complication or revision rates. This contrasts with Papadonikolakis et al. (2014), 17 who reported a threefold increased risk of such events with MB implants. However, their review included retrospective studies that did not directly compare PE and MB components. Instead, the authors assessed studies that included only MB implants separately from those including only PE implants.
In our analysis, radiolucent lines were about 25 percentage points more prevalent in PE components than in MB components (34.4% versus 9.1%), with statistical significance. Kim et al. 19 recently conducted a meta-analysis with a methodology similar to that of Papadonikolakis et al., 17 incorporating retrospective studies that did not compare groups directly. Despite these methodological differences, they also found a higher RL rate in the PE group (41.7%) compared to MB (14.8%).
Study limitations
Overall, the evidence level of the included studies is considered high, given that all were randomized controlled trials. However, the quality of evidence was rated as low or very low according to the GRADE approach, mainly due to the high risk of bias observed across the included studies. The direct comparison between MB and PE components was based on two studies published in 2002 and 2021, which may have introduced heterogeneity related to differences in implant designs, as well as variations in failure and revision rates. Therefore, the conclusions may not be fully applicable to current clinical decision-making regarding the contemporary selection of glenoid components. Moreover, most RCTs were not adequately powered to detect differences in the major outcomes of interest, such as complications and revision rates, limiting the robustness of these comparisons. The low sample sizes and methodological variability across studies – particularly regarding surgical techniques, polyethylene type, and the use of modern pegged designs, which are known to significantly influence implant survivorship – further reduce comparability. These limitations contrast with findings from large international arthroplasty registry data, which provide more robust evidence on survivorship and complication trends across implant designs and materials. Substantial heterogeneity was also observed in follow-up duration and in the instruments used to assess outcomes such as pain, quality of life, and range of motion. Additionally, the generally short follow-up periods in some studies constrain the interpretation of long-term results. Finally, the paucity of studies directly comparing MB with PE restricts the applicability of the conclusions drawn.
Conclusion
In this meta-analysis, encompassing 323 patients (338 shoulders), the pegged design demonstrated superiority over the keeled design regarding radiolucent lines (RL), and the PE component showed a higher incidence of RL compared with MB. However, the quality of evidence for critical outcomes such as complications and revision was rated as low or very low, mainly due to methodological limitations of the included RCTs and their insufficient statistical power to detect these events. These limitations, together with the small sample sizes and inability to control for key confounders such as polyethylene type and the use of modern pegged designs, indicate that data from global arthroplasty registries should be considered to obtain a more comprehensive assessment of implant survivorship. Future prospective studies with larger cohorts and longer follow-up are required to provide more definitive conclusions.
Supplemental Material
Supplemental material (sj-doc-1-sel-10.1177_17585732251414953, sj-docx-2-sel-10.1177_17585732251414953, and sj-docx-3-sel-10.1177_17585732251414953) for Glenoid outcomes after total shoulder arthroplasty with cemented all-polyethylene versus metal-backed implants: A meta-analysis of randomized clinical trials by Renato Aroca Zan, Fabio Teruo Matsunaga, Ramon Sampaio Souza Santos, Nicola Archetti Netto, João Carlos Belloti and Marcel Jun Sugawara Tamaoki in Shoulder & Elbow
Footnotes
Contributorship
RZ, FM and RS researched the literature and conceived the study. NN and MT were involved in protocol development and data analysis. RS and JB assessed risk of bias. RZ and FM wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
