Abstract
TRIZ (Theory of Inventive Problem Solving) has earned recognition as a structured innovation methodology, yet its integration into undergraduate engineering curricula remains limited: students frequently struggle with parameter extraction and contradiction formulation, the very gates to effective TRIZ application. This study proposes a scaffolded fade-out framework that deploys an AI dialogue assistant in three progressively withdrawn phases (Proactive Guidance, Reactive Response, and On-Demand Consultation) over a four-week intervention. A quasi-experimental design (experimental group n = 42, control group n = 40) was implemented in a Mechanical Innovation Design course. Results indicate that the experimental group significantly outperformed the control group on total TRIZ modeling competence (ANCOVA F(1,79) = 18.43, p < 0.001, Cohen's d = 0.89), with large effects on parameter extraction (d = 1.15) and contradiction identification (d = 1.19). Solution innovation scores showed a reversed effect favoring the control group (d = −0.36) that did not reach statistical significance; it may partially reflect an anchoring effect from AI-provided inventive principles, though this interpretation requires further validation. SOLO taxonomy analysis corroborated deeper structural understanding in the experimental group. These findings suggest that scaffolded AI assistance can accelerate TRIZ skill acquisition within the conditions studied, though the fade-out protocol requires careful calibration to avoid constraining creative divergence.
Keywords
Introduction
Innovation capacity has become a defining criterion for engineering graduates, yet the pedagogical mechanisms by which such capacity develops remain contested. TRIZ, the Russian acronym for the Theory of Inventive Problem Solving, offers one of the few systematic approaches to innovation, providing a structured pathway from problem definition to inventive solution through tools such as the contradiction matrix, inventive principles, and substance-field analysis.1–3 In the more than seven decades since its inception, TRIZ has migrated from industrial R&D into university classrooms, and a growing body of literature attests to its pedagogical value.4–6
The migration, however, has not been seamless. Students routinely stumble at two critical junctures: extracting engineering parameters from contextual problem descriptions, and formulating technical contradictions that map onto the 39 × 39 matrix. These are not merely procedural errors; they reflect a deeper conceptual gap between holistic, narrative problem perception and the analytic decomposition TRIZ demands.3,4 The result? TRIZ often functions as a reference tool consulted after, not during, the inventive process, precisely the opposite of its intended role as a generative framework.
Conventional instruction addresses this gap through extended workshop sequences, one-on-one tutoring, or simplified matrix subsets.4,5 Each strategy carries trade-offs: workshops consume curricular hours already under pressure from ABET-style outcome requirements; individualized tutoring is resource-intensive at scale; and simplifying the matrix undermines the very systematicity that distinguishes TRIZ from brainstorming. What is needed, arguably, is a scalable scaffold that can guide students through the analytical bottleneck without permanently supplanting their own reasoning.
Artificial intelligence, specifically large language model (LLM)-based dialogue systems, presents an intriguing possibility. Recent advances have demonstrated that LLMs can support TRIZ-related tasks, from contradiction identification to principle recommendation.2,7–9 Yet a fundamental tension persists: if the AI is too helpful, students may become dependent, outsourcing the very cognitive operations the course aims to develop; if it is too withholding, the scaffold fails to bridge the initial gap. This tension mirrors a well-established problem in educational scaffolding: how to provide sufficient support for novice learners while systematically withdrawing that support to promote autonomous competence.10,11
The present study addresses this tension by operationalizing the scaffolded fade-out concept within an AI-assisted TRIZ teaching framework. Rather than a static AI assistant, we deploy a three-phase protocol in which the AI's role transitions from proactive guidance to reactive response and finally to on-demand consultation, with fade-out triggered by objective performance criteria. This design draws on Vygotsky's 12 zone of proximal development, Kolb's 13 experiential learning cycle, and Lin and Atman's 14 meta-analytic evidence on scaffolding effectiveness, each informing a different facet of the intervention. The framework also incorporates insights from Kim et al.'s 15 Bayesian meta-analysis on scaffolding customization in STEM education, Zawacki-Richter et al.'s 16 systematic review of AI applications in higher education, and Sweller et al.'s 17 updated cognitive load theory, which predicts that the guidance-fading effect should reduce extraneous cognitive load as learners acquire expertise.
In a closely related study published in this journal, Akhavan-Safar et al. 18 surveyed 145 undergraduate mechanical engineering students at the University of Porto and documented widespread acceptance of video-based learning resources as supplementary aids for reinforcing complex concepts, alongside a broad recognition of AI's potential to transform traditional pedagogy. Their finding that students perceived AI tools as more advantageous for less intricate topics-while preferring video resources for conceptually demanding material-is particularly germane to the present study. This perception aligns with the theoretical prediction from cognitive load theory 17 that learners benefit most from external guidance (e.g., AI scaffolding) when tasks exceed their current competence, but may experience redundancy or even interference when such support is applied to already-mastered operations. Our three-phase fade-out protocol operationalizes precisely this principle: by progressively withdrawing AI assistance as students’ expertise grows, we aim to capture the benefits of AI-supported learning for complex problem-solving while avoiding the over-reliance concerns that Akhavan-Safar et al. 18 explicitly flagged. Moreover, their call for balancing technological innovation with traditional pedagogical approaches resonates directly with our design, in which the instructor retains a substantive conceptual-scaffolding role throughout the intervention, complementing rather than ceding to the AI. Their work thus provides both empirical motivation and conceptual grounding for the scaffolded fade-out framework we investigate here.
Two research questions guide the investigation:
(1) Does scaffolded AI assistance with a fade-out protocol improve students’ TRIZ modeling competence compared to conventional instruction?
(2) What differential effects does the scaffold produce across the constituent dimensions of TRIZ modeling (parameter extraction, contradiction identification, principle judgment, and solution innovation)?
Theoretical framework
Scaffolded learning and the zone of proximal development
Vygotsky's 12 zone of proximal development (ZPD) posits that learners can achieve with assistance what they cannot yet achieve independently, implying that effective teaching must operate within this zone, providing just enough support to enable performance beyond current capability without eliminating the productive struggle that drives development. Two design principles follow. First, scaffolds must be calibrated to the learner's current competence: support exceeding the ZPD creates dependency, while support below it fails to accelerate growth. Second, scaffolds must fade. Permanent assistance is not scaffolding; it is a prosthesis. Without systematic withdrawal, there is no mechanism by which the learner transitions from assisted to autonomous performance. Meta-analytic evidence indicates that performance-adapted fading tends to outperform fixed-schedule fading.10,15
In TRIZ instruction, the ZPD typically encompasses the gap between holistic problem perception and analytic parameter extraction: students can describe an engineering problem in narrative terms but cannot yet decompose it into the 39-parameter framework. The scaffold must therefore model the decomposition process initially, then progressively transfer responsibility to the student. Research on AI-assisted learning has demonstrated analogous patterns: students who receive continuous, unmodified AI assistance may develop dependency rather than autonomous competence.11 From a cognitive load perspective, this pattern is consistent with the guidance-fading effect: as learners construct schemas in long-term memory, continued guidance becomes redundant and may impose extraneous load.17
Distributed and personalized scaffolding
Effective scaffolding in complex domains distributes support across multiple agents, tools, and curricular structures, a concept termed “distributed scaffolding”,10 which serves two functions: redundancy (multiple supports for the same skill) and synergy (different supports for complementary skills). Lim et al. 19 demonstrated that real-time analytics-based personalized scaffolds enhance self-regulated learning, with differentiated scaffolding producing stronger effects than generic support.
In TRIZ instruction, the AI handles the procedural-analytical dimension (parameter extraction, matrix navigation), the instructor addresses the conceptual dimension, and peer interaction supports the creative dimension. This distribution also provides a safety net: if one scaffold fades prematurely, others remain active. The present framework employs both synergistic scaffolding (multiple supports reinforcing the same skill at different grain sizes) and complementary scaffolding (supports addressing different skills), aligning with Kim et al.'s 15 finding that scaffolding customization produces consistently positive cognitive outcomes in STEM education.
Experiential learning and TRIZ
Kolb's 13 experiential learning cycle (concrete experience, reflective observation, abstract conceptualization, and active experimentation) maps naturally onto TRIZ practice: engaging with a real engineering problem constitutes concrete experience; analyzing it through the contradiction matrix enables reflective observation and abstract conceptualization; and proposing an inventive solution represents active experimentation. Pande and Bharathi 20 elaborated this alignment, demonstrating that iterative, experience-based pedagogies naturally support the acquisition of complex analytical skills.
This structure suggests that TRIZ instruction should not be front-loaded with theory; students should encounter concrete problems early, with analytical tools introduced as needed. The AI scaffold supports this by providing on-demand analytical guidance precisely when the student encounters a bottleneck. This design is consistent with Liao et al.'s 9 finding that ChatGPT-based scaffolding systems effectively support skill acquisition when designed to follow a developmental process, and aligns with Kavousi et al.'s 21 argument that metacognitive monitoring during design thinking enables learners to regulate their own problem-solving.
Perhaps most importantly, experiential learning theory suggests that premature withdrawal of scaffolding is less harmful than excessive retention: struggle within the ZPD is productive, but over-scaffolding eliminates the experiential basis for learning. This asymmetry informs our fade-out design: we err toward earlier withdrawal, accepting some productive struggle rather than risking the dependency that Darvishi et al. 11 empirically identified in AI-assisted contexts. Not incidentally, this cyclical alignment also explains why TRIZ is often better learned through problem-solving than through didactic instruction, a point often overlooked in introductory curricula.
The AI-assisted TRIZ teaching framework
Three-phase fade-out protocol
The core innovation of this framework is a three-phase protocol governing the AI assistant's behavior, with phase transitions triggered by objective performance criteria rather than fixed time schedules (though the protocol is designed to fit within a four-week window). Figure 1 illustrates the overall scaffold fade-out timeline with representative dialogue examples for each phase.

Figure 1. AI-assisted TRIZ teaching scaffold fade-out timeline. The three phases progressively withdraw AI involvement from proactive guidance (Weeks 1–2) through reactive response (Week 3) to on-demand consultation (Week 4). Phase transitions are triggered by the criterion of scoring ≥80/100 on two consecutive assessment tasks. Representative AI-student dialogue excerpts are shown below each phase. The gradient bar at the bottom indicates increasing student autonomy from left to right. Simplified dialogue excerpts are shown; see Table 1 for full representative exchanges.
Table 1. Representative AI-student dialogue excerpts by phase.
Phase 1: Proactive Guidance (Weeks 1–2). The AI initiates instructional interactions. When a student begins working on a TRIZ problem, the AI proactively suggests parameter categories, identifies potential contradictions, and offers navigation cues for the contradiction matrix. The AI models the analytical process explicitly: “This problem involves conflicting requirements for X and Y. Shall we explore which parameters in the 39-parameter framework correspond to these requirements?” This phase targets the ZPD directly, demonstrating the decomposition process that students cannot yet perform independently.
Phase 2: Reactive Response (Week 3). The AI no longer initiates but responds to student queries. Students must formulate their own questions about parameters, contradictions, or principles; the AI then provides targeted assistance. The shift from proactive to reactive places the cognitive initiative with the student while maintaining access to analytical support. This phase operationalizes the transition from assisted to semi-autonomous performance.
Phase 3: On-Demand Consultation (Week 4). The AI serves as a reference resource, consulted only when students explicitly request help. Responses are deliberately concise, offering directional hints rather than procedural guidance. This phase approximates the post-scaffold condition, allowing students to practice autonomous TRIZ application while retaining a safety net.
Fade-out trigger criterion: Transition between phases occurs when a student achieves a score of ≥80/100 on two consecutive assessment tasks. This criterion ensures that withdrawal is performance-based rather than time-based, accommodating individual differences in learning pace. In practice, most students transitioned within the designated weekly windows, though three students in the experimental group required an additional session to meet the criterion for transitioning to Phase 2.
Implementation details. Students attended two 60-min sessions per week. In Phase 1, AI interaction was a structured component of every session-all students engaged with the AI assistant as a guided activity. In Phase 2, students chose when to query the AI, though all continued to interact at least once per session. In Phase 3, AI consultation was entirely voluntary. All students in both groups worked on the same engineering problem (the active Braille input system) throughout the four weeks. The 80/100 transition criterion was operationalized as scoring ≥80% on two consecutive in-class assessment tasks administered at the end of each session.
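The performance-based trigger described above can be illustrated with a minimal sketch. The class name `FadeOutTracker`, the example scores, and the decision to restart the streak after each transition are assumptions introduced for illustration; only the 80/100 threshold, the two-consecutive-task rule, and the phase names come from the protocol.

```python
from dataclasses import dataclass, field

PHASES = ("Proactive Guidance", "Reactive Response", "On-Demand Consultation")
THRESHOLD = 80       # minimum score (out of 100) stated in the protocol
REQUIRED_STREAK = 2  # consecutive qualifying tasks stated in the protocol


@dataclass
class FadeOutTracker:
    """Tracks one student's phase under the performance-based trigger (hypothetical helper)."""
    phase_index: int = 0
    streak: int = 0
    history: list = field(default_factory=list)

    @property
    def phase(self) -> str:
        return PHASES[self.phase_index]

    def record(self, score: float) -> str:
        """Record one assessment score and return the phase that applies next."""
        self.history.append(score)
        # A score below the threshold breaks the run of qualifying tasks.
        self.streak = self.streak + 1 if score >= THRESHOLD else 0
        if self.streak >= REQUIRED_STREAK and self.phase_index < len(PHASES) - 1:
            self.phase_index += 1
            self.streak = 0  # assumption: the streak restarts in each new phase
        return self.phase


# Illustrative scores only; these are not study data.
tracker = FadeOutTracker()
for score in (72, 85, 78, 81, 90):
    tracker.record(score)
print(tracker.phase)
```

In this sketch a student who alternates between passing and failing scores stays in the current phase, which mirrors the case of the three students who needed an additional session before entering Phase 2.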
AI dialogue design
The AI assistant was implemented using DeepSeek (https://chat.deepseek.com), accessed via its web-based chat interface, with TRIZ knowledge integrated through prompt engineering and a retrieval-augmented generation (RAG) architecture rather than fine-tuning. Figure 2 presents the system architecture.

Figure 2. Architecture of the AI-assisted TRIZ teaching system. The system comprises three layers: (1) student interface: web browser access to DeepSeek chat; (2) AI engine: the DeepSeek LLM (not fine-tuned), a phase-specific prompt controller governing interaction parameters, and design constraints (guardrails) limiting the AI's role; and (3) RAG knowledge base: the 39 engineering parameters, the 39 × 39 contradiction matrix, the 40 inventive principles, and Su-Field analysis models. An instructor monitor can intervene when students operate beyond their ZPD.
The RAG knowledge base comprises the complete 39 engineering parameter definitions with examples, the 39 × 39 contradiction matrix, the 40 inventive principles with annotated explanations, and substance-field (Su-Field) analysis models. While LLMs offer significant potential for education, their integration requires careful pedagogical design to avoid the pitfalls of over-reliance and superficial engagement that Kasneci et al. 22 have identified.
System prompt architecture. Phase-specific prompts govern three interaction dimensions: initiative (proactive vs. reactive vs. on-demand), response length (300 words in Phase 1, reducing to 200 in Phase 2 and 100 in Phase 3), and interaction style (diagnostic questioning in Phase 1, targeted response in Phase 2, directional hinting in Phase 3).
Key design constraints:
Constrained creativity: The AI does not generate inventive solutions. Its role is strictly limited to parameter suggestion, contradiction formulation, and principle identification, preventing the AI from substituting for the student's inventive reasoning.
Error-tolerant interaction: When students propose incorrect mappings, the AI poses diagnostic questions rather than immediate corrections, preserving productive struggle.
Metacognitive prompting: The AI asks students to summarize their analytical process. This is retained in Phases 1–2 but removed in Phase 3, as students are expected to have internalized self-reflection.
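One way to picture the phase-specific prompt controller is as a small configuration table plus a prompt builder. The sketch below is hypothetical: the names `PhaseConfig` and `build_system_prompt` and the prompt wording are assumptions, and the study's actual prompts are not reproduced here. The word counts are treated as upper limits, which is also an assumption; the initiative modes, interaction styles, and design constraints are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    name: str
    initiative: str
    max_words: int             # response length per phase (assumed to be an upper limit)
    style: str
    metacognitive_prompt: bool  # retained in Phases 1-2, removed in Phase 3


PHASE_CONFIGS = {
    1: PhaseConfig("Proactive Guidance", "proactive", 300, "diagnostic questioning", True),
    2: PhaseConfig("Reactive Response", "reactive", 200, "targeted response", True),
    3: PhaseConfig("On-Demand Consultation", "on-demand", 100, "directional hinting", False),
}


def build_system_prompt(phase: int) -> str:
    """Assemble an illustrative system prompt; the wording is hypothetical."""
    cfg = PHASE_CONFIGS[phase]
    lines = [
        f"Phase: {cfg.name}.",
        f"Initiative: {cfg.initiative}.",
        f"Keep each reply to at most {cfg.max_words} words.",
        f"Interaction style: {cfg.style}.",
        # Constrained creativity
        "Do not generate inventive solutions; limit help to parameter suggestion, "
        "contradiction formulation, and principle identification.",
        # Error-tolerant interaction
        "When a mapping is incorrect, ask a diagnostic question instead of correcting it.",
    ]
    if cfg.metacognitive_prompt:
        lines.append("Ask the student to summarize their analytical process.")
    return "\n".join(lines)


print(build_system_prompt(3))
```

Keeping the guardrails in every phase while varying only initiative, length, and style makes the fade-out a change in how much the assistant says, not in what it is permitted to do.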
Student interface. Students accessed DeepSeek through a standard web browser on personal laptops or tablets. No specialized interface was used. All interactions were logged through the DeepSeek platform's conversation history.
Representative dialogue examples. Table 1 presents dialogue excerpts for each phase, all responding to the student query “How should I extract the engineering parameters?”
The overall technical description was limited to approximately 300 words per interaction, ensuring the pedagogical focus remains on TRIZ modeling processes.
Pedagogical architecture
The framework integrates three scaffolding sources in a distributed configuration (see Section 2.2). The AI handles procedural-analytical scaffolding (parameter extraction, matrix navigation); the instructor provides conceptual scaffolding (theoretical explanations, strategic guidance); and peer collaboration supports creative scaffolding (divergent thinking, solution evaluation), echoing findings on AI-supported collaborative learning.23 The instructor's role also includes monitoring the fade-out process and intervening when students appear to be struggling beyond their ZPD, as identified through patterns of repeated failed attempts without progressive improvement.
Class sessions followed a consistent structure: a brief instructor-led introduction (10 min), individual TRIZ problem-solving with AI assistance (40 min), and peer discussion and reflection (10 min). The control group received identical curricular content and time allocation but without AI assistance; instead, they had access to standard TRIZ reference materials (the 39-parameter list, contradiction matrix, and 40 inventive principles).
Method
Design and participants
A quasi-experimental pretest–posttest design with a nonequivalent control group was employed, following established conventions for educational research in authentic classroom settings.10,24 Participants were 82 undergraduate students enrolled in two parallel sections of the Mechanical Innovation Design course at Yangzhou University during the spring semester of 2024. One section (n = 42) was assigned as the experimental group (EG) and the other (n = 40) as the control group (CG). Assignment was by natural class section to avoid contamination; random assignment at the individual level was impractical given the collaborative nature of the course.
Both groups were in their third year of a Mechanical Engineering program, had completed equivalent prerequisite courses (Engineering Mechanics, Mechanical Design, Control Theory), and had no prior formal instruction in TRIZ. A pretest administered in Week 0 confirmed baseline equivalence in TRIZ modeling competence (t(80) = 0.34, p = 0.73). Cumulative GPA in core mechanical engineering courses showed no significant difference between groups (EG: M = 3.12, SD = 0.41; CG: M = 3.08, SD = 0.38; t(80) = 0.46, p = 0.65). Neither group reported prior experience with structured innovation methods or AI-assisted learning tools.
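As a consistency check on the baseline comparison, the t statistic can be recomputed from the reported summary statistics. The sketch below assumes the equal-variance (pooled) form of the independent-samples t-test, which matches the reported 80 degrees of freedom; the function name `pooled_t` is introduced here for illustration.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic (equal variances assumed) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Cumulative GPA as reported: EG M = 3.12, SD = 0.41, n = 42; CG M = 3.08, SD = 0.38, n = 40.
t_gpa, df_gpa = pooled_t(3.12, 0.41, 42, 3.08, 0.38, 40)
print(f"t({df_gpa}) = {t_gpa:.2f}")  # t(80) = 0.46
```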
Table 2 presents the baseline demographic and background characteristics of the two groups.
Table 2. Baseline demographic and background characteristics.
Note: EG = experimental group; CG = control group. All participants were third-year Mechanical Engineering students at Yangzhou University.
Intervention
The intervention spanned four weeks (Weeks 1–4), embedded within the regular course schedule. Both groups studied the same TRIZ content: the 39 engineering parameters, the contradiction matrix, and the 40 inventive principles. The core engineering problem used for instruction was the design of an active Braille input system, a device requiring both measurement accuracy (sensing tactile input) and device simplicity (portability for visually impaired users), representing a classic technical contradiction (Parameter 28, measurement accuracy, vs. Parameter 36, device complexity, in the standard 39-parameter numbering). Figure 3 illustrates the TRIZ contradiction matrix query process that students followed, and Figure 4 shows the active Braille input system that served as the design problem.

Figure 3. TRIZ contradiction matrix query process. The four-step procedure begins with defining the technical contradiction (Step 1), proceeds through locating the relevant matrix cell in the 39 × 39 contradiction matrix (Step 2), retrieving the recommended inventive principles (Step 3), and finally applying those principles to generate inventive solutions (Step 4). Note: The TRIZ contradiction matrix is standardized as 39 improving features (rows) by 39 worsening features (columns).
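Steps 1 to 3 of the query procedure amount to a keyed lookup, sketched below. The cell contents of `TOY_MATRIX` are placeholders chosen for illustration and are not entries from the published contradiction matrix; the function name `query_matrix` is likewise an assumption. Step 4, applying the principles, remains the student's inventive task and is deliberately not modeled.

```python
from typing import Dict, List, Tuple

# Placeholder cells only: these principle numbers are illustrative,
# NOT values from the published 39 x 39 contradiction matrix.
TOY_MATRIX: Dict[Tuple[int, int], List[int]] = {
    (1, 2): [5, 12],
    (2, 1): [7],
}


def query_matrix(improving: int, worsening: int,
                 matrix: Dict[Tuple[int, int], List[int]] = TOY_MATRIX) -> List[int]:
    """Return the inventive principles recommended for one technical contradiction."""
    # Step 1: the contradiction is defined as an (improving, worsening) parameter pair.
    if not (1 <= improving <= 39 and 1 <= worsening <= 39):
        raise ValueError("parameter numbers must lie in 1..39")
    if improving == worsening:
        raise ValueError("a technical contradiction needs two different parameters")
    # Steps 2-3: locate the cell (row = improving, column = worsening) and read it.
    return matrix.get((improving, worsening), [])


print(query_matrix(1, 2))
```

Because rows and columns play different roles, swapping the two parameters addresses a different cell, which is one reason correct contradiction formulation matters before any lookup.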

Figure 4. Active Braille input system with tactile feedback. The system comprises three functional modules: (1) PVDF piezoelectric sensor array with charge amplifier and ADC for analog signal acquisition, (2) 3 × 2 Braille encoding module for digital pattern generation, and (3) pneumatic tactile feedback unit with solenoid valve array for haptic output. All modules are coordinated by a 32-bit MCU with regulated 5 V power supply.
Both groups were taught by the same instructor to eliminate instructor-related confounding. In the experimental group sessions, the instructor's role shifted from providing procedural answers to focusing on conceptual scaffolding (e.g., explaining why a contradiction formulation matters), complementing the AI's analytical role. The control group received instructor-led instruction with standard TRIZ reference materials; the instructor answered procedural questions and maintained a consistent level of guidance throughout the four weeks, but did not implement any systematic fade-out of support. This design choice was intentional: the control condition represents standard TRIZ teaching practice, providing an ecologically valid comparison. To ensure parity of instructional time, both groups attended the same number of sessions with identical duration. The principal systematic difference was the availability and behavior of the AI assistant, together with the accompanying shift in the instructor's role.
Technical descriptions of the Braille input system were deliberately limited to approximately 300 words across all sessions. The pedagogical focus was consistently on TRIZ modeling (how to identify parameters, formulate contradictions, and navigate the matrix) rather than on the technical merits of any particular solution. This constraint applied equally to both groups.
Assessment instrument
A rubric-based assessment instrument was developed to measure TRIZ modeling competence across four dimensions:
Parameter Extraction (maximum 25 points): Accuracy and completeness of identifying relevant engineering parameters from the problem description.
Contradiction Identification (maximum 25 points): Appropriateness of the technical contradiction formulation, including correct parameter pairing and contradiction type classification.
Principle Judgment (maximum 20 points): Reasonableness of selecting and justifying inventive principles from the matrix recommendations.
Solution Innovation (maximum 20 points): Originality and feasibility of the proposed inventive solution, assessed relative to the problem constraints.
The total possible score was 90 points. Two trained raters independently scored all posttest responses. Inter-rater reliability, assessed using Cohen's κ, was 0.81, indicating substantial agreement. Discrepancies were resolved through discussion.
Scoring procedure. Two raters participated in a two-hour training session prior to scoring. The training included rubric explanation, practice scoring on three sample responses, and calibration to >85% agreement. Raters were blind to group membership. When scores differed by more than 3 points, raters engaged in discussion-based consensus; if agreement could not be reached, a third rater served as tiebreaker.
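For readers unfamiliar with the agreement statistic, Cohen's κ compares observed agreement with the agreement expected by chance. The sketch below assumes κ is computed on categorical ratings such as the rubric's four proficiency levels (the text does not state the unit of analysis), and the two rating lists are invented for illustration; they are not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equally long")
    n = len(rater_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Illustrative labels only.
a = ["Competent", "Proficient", "Developing", "Competent", "Insufficient", "Proficient"]
b = ["Competent", "Proficient", "Competent", "Competent", "Insufficient", "Developing"]
kappa = cohen_kappa(a, b)
print(round(kappa, 3))
```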
Full scoring rubric. Table 3 presents the complete scoring rubric with performance descriptors across four proficiency levels (Insufficient, Developing, Competent, Proficient) for each of the four assessment dimensions.
Table 3. Complete TRIZ modeling competence scoring rubric.
Additionally, students’ post-test responses were classified according to the SOLO (Structure of Observed Learning Outcome) taxonomy 24 to characterize the structural complexity of their TRIZ understanding, following approaches validated in recent systematic reviews. 24 Five levels were used: Prestructural, Unistructural, Multistructural, Relational, and Extended Abstract.
Statistical analysis
Baseline equivalence was tested using independent-samples t-tests on pretest scores. Posttest comparisons employed analysis of covariance (ANCOVA) with pretest scores as the covariate to control for any residual baseline differences. Effect sizes were calculated using Cohen's d, interpreted as small (0.2), medium (0.5), and large (0.8). All analyses were conducted using SPSS with α = 0.05.
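The effect size benchmarks can be made concrete with a short sketch. It uses the common pooled-standard-deviation form of Cohen's d, which is an assumption: the text does not specify whether d was computed from raw or covariate-adjusted means. The "negligible" label for values below 0.2 is also an addition for completeness, and the sample lists are illustrative.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def label(d):
    """Map |d| onto the conventional benchmarks (0.2 small, 0.5 medium, 0.8 large)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: label for values below the smallest benchmark


print(cohens_d([2, 4, 6], [1, 3, 5]))  # toy data, not study scores
print(label(0.89), label(0.43), label(-0.36))
```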
Assumption checking. ANCOVA assumptions were verified prior to analysis: normality of residuals was confirmed using Shapiro-Wilk tests (all p > 0.05); homogeneity of variances was confirmed using Levene's test (all p > 0.05); and homogeneity of regression slopes was confirmed by testing the group × pretest interaction (all p > 0.05). Effect size 95% confidence intervals were computed using bias-corrected bootstrap methods with 1,000 resamples: total competence, d = 0.89, 95% CI [0.42, 1.36]; parameter extraction, d = 1.15, 95% CI [0.64, 1.66]; contradiction identification, d = 1.19, 95% CI [0.67, 1.71]; principle judgment, d = 0.43, 95% CI [−0.01, 0.87]; solution innovation, d = −0.36, 95% CI [−0.78, 0.06]. Adjusted means with 95% confidence intervals are reported in Table 4.
Multiple comparison correction. With five outcomes tested (total competence plus the four dimensions), the Bonferroni-corrected significance threshold was α = 0.01 (0.05/5). Effects for total TRIZ competence (p < 0.001), parameter extraction (p < 0.001), and contradiction identification (p < 0.001) remain significant after correction. Principle judgment (p = 0.037) and solution innovation (p = 0.053) do not survive Bonferroni correction.
Table 4. Post-test TRIZ modeling competence by group.
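The Bonferroni step can be reproduced in a few lines. In this sketch the results reported as p < 0.001 are entered as 0.001, an upper bound, because exact values are not given; the variable names are illustrative.

```python
ALPHA = 0.05

# "p < 0.001" results are entered as 0.001 (an upper bound); exact values are not reported.
p_values = {
    "total competence": 0.001,
    "parameter extraction": 0.001,
    "contradiction identification": 0.001,
    "principle judgment": 0.037,
    "solution innovation": 0.053,
}

# Bonferroni: divide the family-wise alpha by the number of tests.
threshold = ALPHA / len(p_values)
survives = {name: p < threshold for name, p in p_values.items()}

for name, ok in survives.items():
    print(f"{name}: {'significant' if ok else 'not significant'} at alpha = {threshold:.2f}")
```

Note that principle judgment would count as significant at the uncorrected 0.05 level but not at the corrected 0.01 level, which is why the two thresholds lead to different readings of that dimension.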
Note: EG = experimental group; CG = control group. Adjusted means are estimated from ANCOVA with pretest score as covariate. Cohen's d interpreted as small (0.2), medium (0.5), and large (0.8).
Results
Overall performance
The pretest confirmed baseline equivalence between the two groups (EG M = 31.24, SD = 8.67; CG M = 30.85, SD = 9.12; t(80) = 0.34, p = 0.73). After the four-week intervention, the experimental group significantly outperformed the control group on total TRIZ modeling competence (ANCOVA F(1,79) = 18.43, p < 0.001). The effect size was large (Cohen's d = 0.89), indicating a practically meaningful difference.
Table 4 presents the descriptive and inferential statistics for all dimensions, including adjusted means and confidence intervals. Figure 5 visualizes the group comparisons, with error bars indicating ±1 SD and significance brackets for the Bonferroni-corrected comparisons.

Figure 5. Post-test TRIZ modeling competence by group. ANCOVA was conducted with pretest score as the covariate, and Bonferroni correction was applied for pairwise comparisons (corrected α = 0.01). Error bars represent ±1 SD. *** denotes p < 0.001 for significant between-group differences; ns indicates non-significant comparisons after correction. Effect sizes (Cohen's d) are labeled above each pair of bars.
Dimension-specific findings
The largest effects appeared in Parameter Extraction (d = 1.15) and Contradiction Identification (d = 1.19), precisely the two dimensions the AI scaffold was designed to target. The AI's proactive modeling of parameter decomposition and contradiction formulation during Phase 1 appears to have accelerated students’ acquisition of these analytical skills.
Principle Judgment showed a small-to-medium effect (d = 0.43), suggesting that the scaffold's benefit diminished for the more interpretive task of evaluating which inventive principles best apply to a given contradiction. This is perhaps unsurprising: principle judgment requires integration of technical domain knowledge with TRIZ knowledge, and the AI was deliberately constrained from providing domain-specific solution guidance.
Most striking, however, was the reversed effect on Solution Innovation (d = −0.36). The control group scored higher on this dimension, though the difference did not reach conventional statistical significance (p = 0.053) and falls well short of the Bonferroni-corrected threshold (α = 0.01). The reversal should therefore be interpreted with caution; it is discussed in Section 6.4.
Analysis of solution innovation and SOLO classification
SOLO taxonomy analysis revealed a marked shift in the structural complexity of students’ TRIZ understanding (see Figure 6). In the experimental group, 38.1% of responses were classified at the Relational level and 19.0% at the Extended Abstract level, compared to 17.5% and 5.0% respectively in the control group. Conversely, the control group showed higher proportions at the Prestructural (20.0% vs. 7.1%) and Unistructural (32.5% vs. 16.7%) levels. A chi-square test of independence confirmed that the group-level SOLO distribution differed significantly (χ²(4) = 12.37, p = 0.015, Cramér's V = 0.39), indicating a moderate-to-strong association between instructional condition and structural complexity of TRIZ understanding.
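The reported Cramér's V follows directly from the chi-square statistic and the sample size. The sketch below uses the standard definition V = sqrt(χ² / (N · min(r − 1, c − 1))) with the reported χ² = 12.37, N = 82 (42 + 40), five SOLO levels, and two groups; the function name is illustrative.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V for an n_rows x n_cols contingency table with n observations."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


# Five SOLO levels by two groups gives (5 - 1) * (2 - 1) = 4 degrees of freedom.
v = cramers_v(12.37, 82, 5, 2)
print(round(v, 2))  # 0.39
```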

Figure 6. Post-test SOLO taxonomy level distribution by group. The experimental group (blue) shows larger proportions at the Relational (38.1%) and Extended Abstract (19.0%) levels, whereas the control group (gray) is concentrated at the lower Prestructural (20.0%) and Unistructural (32.5%) levels. A chi-square test of independence was performed to compare categorical distribution differences between groups.
The SOLO distribution suggests that scaffolded AI assistance promoted not merely higher scores but qualitatively deeper understanding. Experimental group students were more likely to connect multiple TRIZ concepts (Relational level) and to generalize TRIZ principles to novel contexts (Extended Abstract level), whereas control group students more frequently operated at the level of isolated elements (Unistructural) or failed to engage with the TRIZ framework meaningfully (Prestructural). This finding aligns with Adeniji et al.'s 25 systematic review confirming that the SOLO model appropriately reflects students’ learning outcomes across disciplines and educational levels.
Discussion
Scaffolding effectiveness and the fade-out protocol
The large effect sizes on parameter extraction (d = 1.15) and contradiction identification (d = 1.19) are consistent with the interpretation that the scaffolded AI assistant effectively targeted the analytical bottleneck in TRIZ learning; however, the quasi-experimental design limits the strength of causal inference, and these findings should be interpreted accordingly. They align with the ZPD framework: by proactively modeling the decomposition process in Phase 1, the AI made visible the cognitive operations that students could not yet perform independently. The progressive fade-out then created structured opportunities for students to assume these operations themselves.
We acknowledge, however, that the present design cannot fully disentangle the effects of scaffolding from those of AI availability per se. A more rigorous decomposition would require additional conditions: one receiving static AI assistance (no fade-out) and another receiving fade-out without AI (instructor-mediated scaffolding). Such a factorial design would isolate the unique contribution of the fade-out mechanism, consistent with the experimental comparisons recommended by Darvishi et al. 11 in their study of AI assistance and student agency. What the present data do indicate is that the combined scaffolded-AI package produces substantial gains on the targeted dimensions, and that these gains are unlikely to reflect measured baseline differences or mere exposure to TRIZ content.
The performance-based fade-out trigger deserves particular attention. Three students in the experimental group required an additional session to meet the Phase 2 transition criterion, suggesting that the criterion captured genuine individual differences in learning pace. Had phase transitions been purely time-based, these students might have entered Phase 2 before achieving the analytical competence that Phase 1 was designed to develop, potentially undermining the scaffold's effectiveness. This observation supports Belland et al.'s 10 meta-analytic finding that performance-adapted scaffolding customization yields consistently positive outcomes in STEM education, and echoes the guidance-fading effect described in cognitive load theory. 17
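The contrast between performance-based and time-based transitions can be sketched as a small state machine. This is an illustration only: the phase names follow the framework, but the 0-to-1 score scale, the 0.80 criterion, and the function name are hypothetical placeholders rather than the criterion used in the study.

```python
from enum import IntEnum


class Phase(IntEnum):
    PROACTIVE_GUIDANCE = 1
    REACTIVE_RESPONSE = 2
    ON_DEMAND_CONSULTATION = 3


HYPOTHETICAL_CRITERION = 0.80  # placeholder threshold, not the study's value


def next_phase(current: Phase, score: float,
               criterion: float = HYPOTHETICAL_CRITERION) -> Phase:
    """Advance one phase only when the session score meets the criterion.

    A learner who misses the criterion stays in the current phase and
    receives an additional session instead of moving on with the calendar.
    """
    if current is Phase.ON_DEMAND_CONSULTATION:
        return current  # final phase: nothing further to fade
    if score >= criterion:
        return Phase(current + 1)
    return current


# A learner who needs one extra Phase 1 session before moving on.
phase = Phase.PROACTIVE_GUIDANCE
history = []
for session_score in (0.60, 0.85):
    phase = next_phase(phase, session_score)
    history.append(phase)
print([p.name for p in history])
```

A time-based rule would have advanced this learner after the first session regardless of the score, which is the situation the performance-based trigger is meant to avoid.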
Disentangling the AI effect from the scaffolding protocol
The present study compares AI-assisted scaffolded instruction with fade-out against conventional TRIZ instruction. This comparison addresses whether the combined intervention works, but it cannot isolate the unique contribution of the AI medium from the contribution of the scaffolding protocol. This is a fundamental methodological limitation that we acknowledge explicitly.
We argue, however, that establishing the efficacy of the combined intervention is a necessary and valid prior step. Following Penuel et al.'s 26 argument for educational design research, it is reasonable to first establish that a designed intervention produces meaningful outcomes before investing in the factorial decomposition needed to isolate mechanism-level effects. Moreover, the fade-out protocol is not an independent variable in the present design but a constitutive design feature of the AI system; stripping it out would yield a fundamentally different artifact.
Nevertheless, the magnitude of the observed effects warrants consideration. The effect sizes for parameter extraction (d = 1.15) and contradiction identification (d = 1.19) substantially exceed typical scaffolding meta-analytic effect sizes in STEM education (d ≈ 0.40–0.60, as reported by Belland et al. 10 and Kim et al. 15 ). This incremental effect may plausibly be attributed to the AI's domain-specific TRIZ knowledge and its capacity to model analytical decomposition with high fidelity. We propose a three-group design for future research: (1) AI with fade-out, (2) AI without fade-out (static AI assistance), and (3) instructor-led scaffolding with fade-out (no AI). Comparing groups 1 and 2 would isolate the fade-out effect, and comparing groups 1 and 3 would isolate the AI effect. Estimating their interaction would additionally require a fourth condition (instructor-led support without fade-out) to complete a 2 × 2 factorial.
Productive struggle and the risk of over-support
While the fade-out protocol is designed to prevent over-scaffolding, a residual concern warrants explicit discussion. Easy and constant access to a support system, even one that is eventually removed, may reduce a student's productive struggle, self-confidence, and intrinsic drive to learn. This risk is mitigated by the fade-out design but not eliminated: the ZPD is itself a moving target, and calibrating support to it in real time is inherently imperfect.
From a cognitive load perspective, 17 the guidance-fading effect predicts that continued guidance becomes extraneous as learners develop schemas, so the systematic withdrawal of the AI's proactive role should reduce extraneous load. However, from a self-determination perspective, 27 autonomy and competence are core psychological needs, and any assistance, even temporary, may undermine perceived autonomy if the learner feels controlled rather than supported.
The SOLO taxonomy evidence partially addresses this concern: the experimental group showed higher proportions at Relational (38.1%) and Extended Abstract (19.0%) levels, suggesting genuine conceptual development rather than surface-level pattern matching. If students were merely dependent on the AI, we would expect strong performance on practiced items but weak generalization, which is precisely what the Extended Abstract results argue against. We interpret this as evidence that the scaffold facilitated rather than bypassed genuine cognitive engagement.
Nevertheless, three students required an additional session to meet the Phase 2 criterion, and we cannot rule out that for some students, the reduction of support in Phase 3 may have approached or even exceeded the boundary of their ZPD. Future research should incorporate measures of self-efficacy and autonomous motivation to assess whether scaffolded fade-out protocols support or undermine these psychological needs over time.
The anchoring effect on solution innovation
The reversed effect on Solution Innovation (d = −0.36) represents perhaps the most theoretically interesting finding of this study. We interpret this reversal as possibly reflecting an anchoring effect (Tversky and Kahneman 28 ): the AI-provided inventive principles may have anchored students’ solution spaces around conventional interpretations, constraining the divergent thinking that innovative solutions require.
The mechanism, we propose, operates as follows. During Phase 1, the AI explicitly suggests inventive principles from the contradiction matrix. Students process these principles as starting points for solution generation, a reasonable strategy given the scaffold's design. However, this initial processing may establish an anchor: subsequent solution attempts may be drawn toward the anchor's neighborhood, reducing the probability of generating highly original or unconventional solutions. The control group, lacking this anchor, may have explored a wider solution space, producing more diverse (if sometimes less analytically grounded) proposals.
This interpretation is, by necessity, speculative, as anchoring was not directly measured. The effect falls short of conventional significance (p = 0.053, two-tailed), does not survive Bonferroni correction, and process measures (think-aloud protocols, eye-tracking, or interaction log analysis) that would validate the anchoring mechanism were not collected. We therefore frame anchoring as a plausible explanation rather than a confirmed mechanism, pending future research.
This interpretation is consistent with the broader literature on creativity constraints. Acar et al. 29 proposed an integrative framework showing that constraints can both foster and hinder creativity through motivational, cognitive, and social mechanisms, with an inverted U-shaped relationship between constraint intensity and creative outcomes. While structure can enhance creativity by reducing the vastness of the solution space to manageable proportions (the “benefit of constraints” hypothesis; Tromp and Baer 30 ), excessive structure can paradoxically narrow the search space below the optimal level. The present data suggest that the AI scaffold may have crossed this threshold for the solution innovation dimension by providing too much structure for a task that benefits from cognitive flexibility.
One practical consideration follows from this interpretation, though it should be treated as a design hypothesis rather than a prescription: the fade-out protocol might be calibrated not only for analytical dimensions (where early anchoring is beneficial) but also for creative dimensions (where anchoring may be detrimental). Perhaps Phase 1 should proactively scaffold parameter extraction and contradiction identification while deliberately withholding principle suggestions, leaving solution generation entirely to the student. This asymmetric scaffolding strategy would leverage the AI's analytical strengths without imposing creative constraints, but empirical testing of this design variant is needed before strong recommendations can be offered.
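The asymmetric strategy can be written down as a simple policy rule. The sketch below is a hypothetical encoding of the design hypothesis, not a description of the deployed assistant: the dimension labels, the integer phase numbering, and the function name are all assumptions introduced for illustration.

```python
# Hypothetical policy for asymmetric scaffolding (design hypothesis only).
ANALYTICAL = {"parameter_extraction", "contradiction_identification"}
CREATIVE = {"solution_generation"}


def ai_may_volunteer(dimension: str, phase: int) -> bool:
    """Return True if the AI should offer unprompted guidance.

    Proactive guidance is limited to Phase 1 and to analytical dimensions;
    inventive-principle suggestions for solution generation are withheld
    in every phase so that students generate solutions unanchored.
    """
    if dimension not in ANALYTICAL | CREATIVE:
        raise ValueError(f"unknown dimension: {dimension}")
    if phase not in (1, 2, 3):
        raise ValueError(f"unknown phase: {phase}")
    return phase == 1 and dimension in ANALYTICAL


for dim in sorted(ANALYTICAL | CREATIVE):
    print(dim, [ai_may_volunteer(dim, ph) for ph in (1, 2, 3)])
```

Keeping the rule per dimension rather than per phase alone is what makes the scaffold asymmetric: the same phase can be proactive for analysis and silent for ideation.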
Implications for engineering education
These findings carry three practical implications for day-to-day engineering education practice. The first concerns the pedagogical positioning of AI tools. AI-assisted TRIZ instruction need not be an all-or-nothing proposition. The scaffolded fade-out framework demonstrates that AI can serve as a transitional tool, one that bridges the initial competence gap and then systematically withdraws, leaving students with autonomous TRIZ skills. This middle path avoids both the uncritical embrace of AI (which risks dependency, as documented by Darvishi et al. 11 ) and the reflexive rejection (which forgoes genuine pedagogical benefits).
The second concerns the architecture of scaffolding systems. The distributed scaffolding architecture reminds us that AI is one component of a broader pedagogical ecosystem. The instructor's conceptual guidance and peer interaction's creative stimulation each contributed complementary support. Institutions adopting AI-assisted instruction should resist the temptation to replace human instruction; instead, they should redesign the human role to complement the AI's analytical capabilities, as recommended by Zawacki-Richter et al. 16 Guaman-Quintanilla et al. 31 similarly demonstrated that design thinking interventions in higher education achieve stronger effects when multiple actors (facilitators, peers, and tools) contribute complementary scaffolding functions, rather than relying on a single mechanism.
The third concerns dimensional calibration. The possible anchoring effect underscores the importance of domain-specific calibration. A scaffold that works well for analytical skills may undermine creative skills within the same task. This contingency demands that instructional designers attend not only to the presence or absence of scaffolding but to its dimensional alignment, that is, matching scaffold type to skill type, informed by the constraint-creativity framework of Acar et al. 29 Liu et al. 32 provided converging evidence in a related context: their study of design thinking-based interventions in teacher education showed that inventive problem-solving skills and creative self-efficacy improved simultaneously, suggesting that properly structured scaffolding can support both analytical and creative dimensions when the scaffold design explicitly addresses both.
Limitations
Several limitations qualify the present findings. The quasi-experimental design, while necessitated by practical constraints, does not provide the causal certainty of random assignment. Although pretest equivalence was established and GPA data confirmed no group differences in prior academic performance, unmeasured confounds (intrinsic motivation, prior creative problem-solving experience, digital literacy, familiarity with AI tools) may have differed between groups. We explicitly acknowledge these possible confounds. The single-institution, single-course context limits generalizability; replication across institutions and engineering disciplines is essential.
The four-week intervention period is a particularly important limitation that warrants candid discussion. The reported results capture within-intervention performance only; that is, they reflect how students performed on TRIZ modeling tasks while the scaffold was still active or had only recently been withdrawn. The central question of whether students retain and transfer their TRIZ modeling skills to novel engineering problems after the scaffold has fully faded, and over a longer time horizon, cannot be answered by the present data. A delayed post-test, administered 4–8 weeks post-intervention, is needed to assess retention; without it, the durability of any skill gains remains unknown. The present study should therefore be understood as demonstrating a short-term within-intervention effect, not as evidence of lasting competence development. Future studies should also examine whether the anchoring effect on solution innovation attenuates, persists, or reverses after the scaffold is fully withdrawn.
The anchoring effect interpretation, while theoretically coherent and consistent with the constraint-creativity literature, is inferential rather than confirmed. We did not directly measure anchoring processes. Future research should incorporate process measures (think-aloud protocols, eye-tracking, or AI interaction log analysis) to validate the proposed mechanism.
Conclusions
Within a four-week intervention with undergraduate mechanical engineering students at a single institution, this study found evidence consistent with the view that a scaffolded fade-out framework for AI-assisted TRIZ instruction can significantly improve TRIZ modeling competence, particularly in the analytical dimensions of parameter extraction and contradiction identification. The three-phase protocol (proactive guidance, reactive response, and on-demand consultation) operationalizes the ZPD concept in a practically implementable form, with performance-based phase transitions accommodating individual learning differences.
The reversed effect on solution innovation serves as a cautionary note: scaffolding that benefits analytical reasoning may inadvertently constrain creative divergence, possibly through anchoring. This finding invites a more nuanced approach to scaffold design-one that differentiates between skill dimensions and calibrates support accordingly, as the constraint-creativity literature suggests.29,30 The AI scaffold may benefit from an asymmetric design that provides proactive guidance for analytical tasks while withholding guidance for creative solution generation.
These conclusions carry important scope conditions. The study involved a single cohort of undergraduate mechanical engineering students over four weeks, and the observed gains reflect within-intervention performance rather than demonstrated long-term retention or transfer. Whether comparable effects would emerge in different engineering disciplines, with different student populations, or over extended timeframes remains an open question. Replication studies are needed before broader claims can be made.
The broader message is one of tempered optimism. AI-assisted instruction, when guided by principled pedagogical frameworks and disciplined withdrawal protocols, appears capable of accelerating skill acquisition in complex analytical dimensions within the specific context studied. The key, as ever in education, lies not in the tool itself but in the wisdom of its deployment. What this study cannot yet establish is whether those gains endure, transfer, or hold across the full diversity of engineering learning environments.
Footnotes
Acknowledgements
The authors thank the students who participated in this study and the teaching assistants who supported data collection. During the preparation of this work the authors used GPT-4o (OpenAI, San Francisco, CA) for language polishing and modifying text within figures. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Ethical considerations
This study was approved by the Institutional Review Board of Yangzhou University (Approval No. YZU-IRB-2024-015).
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent for publication
Not applicable.
Authors’ contributions
Yixiang Bian: Conceptualization, Methodology, Software, Formal analysis, Investigation, Writing – original draft. Yani Jiang: Conceptualization, Supervision, Writing – review and editing, Funding acquisition, Resources.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The datasets generated during the current study are available from the corresponding author on reasonable request.
Use of artificial intelligence
During the preparation of this work, GPT-4o (OpenAI, San Francisco, CA) was used for language polishing and editing of text within figures. All AI-assisted content was reviewed and approved by the authors. No AI tools were used for data analysis, content generation, or image creation.
