Abstract
In this final part of the study, a new systematic data driven fuzzy modelling approach has been developed, taking into account both the modelling accuracy and its interpretability (transparency) as attributes. For the first time, a data driven modelling framework has been proposed, designed and implemented in order to model the intricate friction stir welding (FSW) behaviours relating to AA 5083 aluminium alloy, consisting of the grain size, mechanical properties, as well as internal process properties. As a result, Pareto-optimal predictive models have been successfully elicited which, through validations on real data for the aluminium alloy AA 5083, have been shown to be accurate, transparent and generic despite the small number of data points used for model training and testing. Compared with analytically based methods, the proposed data driven modelling approach provides a more effective way to construct prediction models for FSW when there is an apparent lack of fundamental process knowledge.
Keywords
Introduction
For a comprehensive understanding of the effects of process conditions, such as tool rotation speed and traverse speed, on the friction stir welding (FSW) process as well as on the characteristics of welded materials, it is essential to construct accurate and reliable prediction models. Such models can help to enhance welding productivity and process reliability. Because of the high complexity of the FSW process, caused mainly by intense plastic deformation and complex thermomechanical processes, it is often difficult to derive practical physical models. For this reason, a systematic data driven fuzzy modelling strategy is developed in this paper to elicit adequate prediction models based on experimental data, which include the internal process features, microstructural features, as well as mechanical properties relating to the AA 5083 aluminium alloy.
Compared with analytically based methods, fuzzy systems1, 2 are simpler in structure and easier to apply. They are capable of learning from data without needing much prior knowledge of the materials and machining processes. Fuzzy models are also convenient to combine with optimisation techniques to identify the input parameters that will provide a desirable welding profile.3 Furthermore, compared with black box modelling approaches, such as artificial neural networks (ANNs), fuzzy systems are more transparent and the relationships between inputs and outputs are more interpretable, because they use descriptive language, such as linguistic IF–THEN rules.
In this paper, the proposed fuzzy modelling methodology allows fuzzy models to be generated considering not only the accuracy (precision) but also the transparency (interpretability) of fuzzy systems, via the use of multiobjective optimisation techniques. As a result, a set of so called Pareto-optimal4 models, covering various accuracy and interpretability levels, is constructed, which provides a wide range of choices for practitioners or users. In addition, a hierarchical optimisation structure is proposed to improve the modelling efficiency, where two learning phases are systematically combined in order to improve various attributes of fuzzy systems: one multiobjective optimisation algorithm, the multiobjective reduced space searching algorithm (MORSSA),5, 6 is used to optimise the model's structure. Based on a fixed model structure, another single objective optimisation algorithm, the reduced space searching algorithm (RSSA),5, 6 is employed to improve the model's parameters.
The remaining parts of this paper are organised as follows. The section on ‘Modelling methodology’ introduces the proposed modelling framework. The section on ‘Experimental studies’ presents the experimental studies on modelling the FSW related properties in detail. Finally, conclusions in relation to the whole study are drawn in the section on ‘Conclusions’.
Modelling methodology
Introduction to fuzzy systems and fuzzy modelling
Fuzzy rule based systems2 are robust universal approximators for non-linear mappings between inputs and outputs. They allow a system to be represented using a descriptive language (linguistic IF–THEN rules), which can easily be understood and explained by humans and helps them to gain a deeper insight into uncertain, complex and ill defined systems. Generally, a fuzzy system consists of four fundamental components: fuzzy rule base, fuzzy inference engine, fuzzifier and defuzzifier (as shown in Fig. 1). The central part of a fuzzy system is the rule base (knowledge base) consisting of the fuzzy rules, where a fuzzy rule is an IF–THEN linguistic statement in which some words are characterised by continuous membership functions. The fuzzifier is defined as a mapping from a real valued point to a fuzzy set. In a fuzzy inference engine, fuzzy logic principles direct how the fuzzy rules are combined into a mapping from an input fuzzy set to an output fuzzy set. The defuzzifier is a mapping from the output fuzzy set to a real valued point. Conceptually, the purpose of the defuzzifier is to specify a point that best represents the output fuzzy set.7

Basic configuration of fuzzy systems
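To make the four components concrete, the following is a minimal sketch in Python of a fuzzy rule based system with two inputs and one output. It assumes Gaussian membership functions, a singleton fuzzifier, product inference and centre-average defuzzification; these choices, together with all names and numerical values, are illustrative assumptions rather than the configuration used in this study.

```python
import math

def gaussian(x, centre, sigma):
    """Membership degree of the crisp value x in a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

# Hypothetical rule base: each rule holds one (centre, sigma) pair per input
# and the centre of its output fuzzy set. Values are for illustration only.
RULES = [
    {"antecedents": [(0.2, 0.2), (0.3, 0.2)], "consequent": 0.8},
    {"antecedents": [(0.5, 0.2), (0.7, 0.2)], "consequent": 0.4},
    {"antecedents": [(0.9, 0.2), (0.5, 0.2)], "consequent": 0.6},
]

def predict(inputs, rules=RULES):
    """Map a crisp input vector to a crisp output."""
    weighted_sum = 0.0
    total_strength = 0.0
    for rule in rules:
        # Fuzzification and inference: the firing strength of a rule is the
        # product of the membership degrees of its antecedents (fuzzy AND)
        strength = 1.0
        for x, (centre, sigma) in zip(inputs, rule["antecedents"]):
            strength *= gaussian(x, centre, sigma)
        weighted_sum += strength * rule["consequent"]
        total_strength += strength
    # Defuzzification: centre average of the rule consequents
    return weighted_sum / total_strength if total_strength > 0.0 else 0.0
```

Calling `predict([0.2, 0.3])` returns a value close to the consequent of the first rule, because that rule fires most strongly for this input.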
Fuzzy modelling, in particular, is a systems modelling approach employing fuzzy systems. Normally, there are two complementary approaches to fuzzy modelling, namely knowledge acquisition from human experts and knowledge discovery from data. The knowledge acquisition approach lends itself to the design of fuzzy models based on existing expert knowledge. However, complete and consistent expert knowledge is not always available, or the cost of deriving such expert knowledge may be too high. On the other hand, knowledge discovery from data, i.e. data driven fuzzy modelling, can enable one to identify the structure and the parameters of fuzzy models from numerical data automatically. In recent years, we have witnessed a significant growth in both the generation and the collection of data, which allows the data driven modelling approach to take on a more pragmatic flavour. For the data driven fuzzy modelling methods, the main learning and optimisation techniques include linear least squares, gradient descent methods, neural fuzzy training methods, and some evolutionary optimisation techniques. Compared with fuzzy systems using other learning techniques, evolutionary fuzzy systems are better suited to improving not only the parameters but also the structure of the fuzzy systems.8, 9 Moreover, multiobjective optimisation techniques within evolutionary computation can prove very helpful in studying the trade-off between the accuracy and the interpretability of fuzzy systems.8, 9
Reduced space searching algorithm
Inspired by natural and social behaviours, researchers have developed many successful optimisation algorithms. For example, the genetic algorithm (GA) originates from the simulation of natural evolution, while the particle swarm optimisation (PSO) algorithm is motivated by the simulation of the social behaviour of a bird flock. In the same way, a search and optimisation algorithm, named RSSA, was developed previously.5, 6 This algorithm is inspired by a simple human experience when searching for an optimal solution to a real life problem, i.e. when humans search for a candidate solution given a certain objective, a large area tends to be scanned first; should one succeed in finding clues in relation to the predefined objective, then the search space is greatly reduced for a more detailed search. The most important difference between RSSA and other heuristic algorithms lies in the emphasis of the operations within a search. Most optimisation algorithms concentrate on generating new solutions using various equations (derivative-related equations, PSO equations, etc.) or operators (mutation, recombination, etc.), while RSSA concentrates on transforming the search space so as to find the optimal subspace; the generation of solutions within a subspace is not the real emphasis. In addition, RSSA was further extended to include the multiobjective optimisation case (MORSSA).5, 6 Both RSSA and MORSSA have been validated using a set of challenging benchmark problems. Compared with some other salient evolutionary algorithms, the introduced algorithms perform as well as and sometimes better than these well known optimisation algorithms.5, 6 In the following proposed modelling paradigm, both the single objective and multiobjective versions of RSSA will be implemented to improve the modelling performance.
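The scan-then-narrow idea can be illustrated with a short sketch. The code below is not the RSSA of Refs. 5 and 6; it is a simplified, hypothetical search that samples uniformly within the current bounds and then shrinks the bounds around the best point found so far. The function name, shrink factor and sample counts are all assumptions made for illustration.

```python
import random

def reduced_space_search(objective, bounds, samples=50, rounds=30, shrink=0.7, seed=0):
    """Minimise `objective` by repeatedly sampling a box and shrinking it
    around the best point found so far (illustrative, not the published RSSA)."""
    rng = random.Random(seed)
    lower = [lo for lo, _ in bounds]
    upper = [hi for _, hi in bounds]
    best_x = [(lo + hi) / 2.0 for lo, hi in bounds]
    best_f = objective(best_x)
    for _ in range(rounds):
        # Scan the current (sub)space with uniform random samples
        for _ in range(samples):
            x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
            f = objective(x)
            if f < best_f:
                best_x, best_f = x, f
        # Reduce the search space: centre a smaller box on the best point,
        # clipped to the original bounds
        for i, (lo0, hi0) in enumerate(bounds):
            half_width = shrink * (upper[i] - lower[i]) / 2.0
            lower[i] = max(lo0, best_x[i] - half_width)
            upper[i] = min(hi0, best_x[i] + half_width)
    return best_x, best_f

# Usage on a simple quadratic with its minimum at (1, -2)
best_x, best_f = reduced_space_search(
    lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2,
    bounds=[(-5.0, 5.0), (-5.0, 5.0)],
)
```

The sketch keeps solution generation deliberately trivial (uniform sampling) so that the effort goes into moving and shrinking the subspace, which mirrors the emphasis described above.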
Fast hierarchical multiobjective fuzzy modelling approach
In the previously proposed modelling approaches,8, 9 a multiobjective optimisation algorithm was used to improve the fuzzy models’ structure and tune their parameters at the same time. This method requires relatively more computation and takes longer to converge, because a large number of decision variables need to be adjusted and optimised simultaneously. In this paper, a hierarchical double loop optimisation structure is proposed, where two learning phases are conducted sequentially and iteratively to improve the different aspects of fuzzy systems: the multiobjective optimisation algorithm MORSSA is mainly employed to optimise the model's structure, while the single objective optimisation algorithm RSSA is employed to improve the model's parameters. Figure 2 illustrates the proposed fuzzy modelling approach. It can be divided into several components and execution steps, which are described as follows:

Flow chart of proposed fuzzy modelling approach
(i) data clustering: a modified agglomerative complete link clustering algorithm8 is employed to process training data in order to obtain the information relating to clusters. This algorithm has been shown to be more efficient and perform better than other well known clustering algorithms, such as the fuzzy C-means (FCM) clustering algorithm
(ii) initial model construction: the information provided by the clusters identified in step (i) is then used to construct an initial fuzzy model. In this approach, one cluster corresponds directly to one fuzzy rule; the centres of membership functions are defined using the information of their corresponding clusters’ centre positions; other parameters relating to the membership functions are defined under the principle that one membership function must cover all the training data included in its corresponding cluster. More details about this step have been introduced in Ref. 10
(iii) interpretability improvement: the fuzzy system is improved in structure, including the variation of the fuzzy rules and fuzzy sets, considering the interpretability issue. This task can be achieved using a four-step operation, including removing redundant fuzzy rules, merging similar fuzzy rules, removing redundant fuzzy sets and merging similar fuzzy sets. These four steps are controlled by four threshold parameters, Th1–Th4. The details are explained in Refs. 8 and 9
(iv) accuracy improvement: the fuzzy models are improved by the RSSA algorithm in terms of accuracy based on a fixed modelling structure
(v) non-dominated sorting and diversity sorting: the non-dominated fuzzy models with a good diversity are found using the non-dominated sorting and diversity sorting mechanisms, which are introduced in the algorithm MORSSA6
(vi) termination check: if the termination criterion is achieved, the modelling process is stopped and the final Pareto-optimal solutions are obtained; if not, all the modelling and performance information is passed to the multiobjective optimisation algorithm. Normally, the termination criterion is that the number of function evaluations reaches a predefined value
(vii) multiobjective optimisation using MORSSA (see the section on ‘Reduced space searching algorithm’): the algorithm generates new control parameters (Th1–Th4) for interpretability improvement based on the multiobjective optimisation strategy, then returns to step (iii).
It should be noted that the structure of a fuzzy model is not directly coded into the optimisation procedure, but is rather varied and optimised via controlling the thresholds. The accuracy of a fuzzy model can be evaluated using the root mean square error (RMSE) index, which is described as follows
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{l=1}^{N}\left(y_l - \hat{y}_l\right)^2}$$
where $y_l$ is the measured output data and $\hat{y}_l$ is the predicted output data, l = 1, 2, …, N; N is the total number of data. The interpretability of a fuzzy model is affected by the number of fuzzy rules Nrule, the number of fuzzy sets Nset and the total length of fuzzy rules Lrule. To normalise these two objectives and make them similar and comparable in scale, they are formulated as follows:
Objective 1
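The accuracy index and the non-dominated sorting step of the procedure can be sketched in a few lines. The RMSE function uses the standard definition of the index; the Pareto filter is a generic dominance check for objectives that are all minimised, not the specific sorting and diversity mechanisms of MORSSA. The candidate models and their (RMSE, number of rules) values are invented for illustration.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted outputs."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(y - y_hat) ** 2 for y, y_hat in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def dominates(a, b):
    """True if objective vector a is no worse than b in every objective and
    strictly better in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(candidates):
    """Return the candidates that are not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates if other is not c)
    ]

# Hypothetical candidates as (RMSE, number of rules): accuracy versus simplicity
models = [(2.1, 8), (3.0, 5), (3.5, 6), (4.8, 3), (5.0, 4)]
front = non_dominated(models)
```

Here `front` keeps the models with eight, five and three rules: each of them is either more accurate or simpler than every other candidate, which is the kind of trade-off set offered to the user.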
Experimental studies
AA 5083 is a non-heat treatable aluminium alloy, which has excellent corrosion resistance, good strength and formability.11 In this work, 5·8 mm AA 5083 plates were friction stir processed using an MX-Triflute tool.12 All experimental trials were butt welds, which were made under position control with the tool at 0° tilt.
The Triflute concept has proved to be a successful second generation FSW tool design, where three deep helical grooves are cut into the probe of the tool to encourage vertical movement of the weld metal. It can further feature a second thread with a shallower thread depth and pitch angle, which is referred to as multihelix (MX) design. The improvements in material flow introduced by the MX-Triflute tool design significantly increase the maximum achievable welding speed in aluminium alloys.13 In this work, the MX-Triflute tool was used in conjunction with a 25 mm diameter scroll shoulder.
For the welding, two attributes are used to control the process: tool rotation speed (rev min−1) in clockwise or counter-clockwise direction and forward movement per revolution along the joint line (a function of welding speed) (mm rev−1). The rotation of tool results in stirring and mixing of material around the rotating pin and the translation of tool moves the stirred material from the front to the back of the pin. Normally, a higher tool rotation speed generates higher temperatures due to increased friction heating and results in more intense stirring and mixing of material.14 In this work, an assessment was undertaken using a parameter test matrix consisting of five levels of tool rotation speed, i.e. 280, 355, 430, 505 and 580 rev min−1, and five levels of traverse feedrates, 0·6, 0·8, 1·0, 1·2, and 1·4 mm rev−1.
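The test matrix can be enumerated programmatically. The sketch below assumes a full factorial design, i.e. all 25 combinations of the five rotation speeds and five feed rates listed above, and derives the welding speed in mm min−1 as rotation speed multiplied by forward movement per revolution. The variable and key names are illustrative.

```python
from itertools import product

ROTATION_SPEEDS = [280, 355, 430, 505, 580]   # rev/min
FEED_PER_REV = [0.6, 0.8, 1.0, 1.2, 1.4]      # mm/rev

def build_test_matrix():
    """Return one dict per weld condition in the assumed 5 x 5 full factorial design."""
    matrix = []
    for rs, fm in product(ROTATION_SPEEDS, FEED_PER_REV):
        matrix.append({
            "rotation_speed_rev_min": rs,
            "feed_mm_rev": fm,
            # Welding speed follows from the two controlled attributes
            "welding_speed_mm_min": rs * fm,
        })
    return matrix

conditions = build_test_matrix()
```

Under this assumption the slowest condition (280 rev min−1, 0·6 mm rev−1) corresponds to a welding speed of about 168 mm min−1 and the fastest (580 rev min−1, 1·4 mm rev−1) to about 812 mm min−1.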
In the following, the proposed modelling method is applied to predict multiple properties of the FSW process. In these experiments, the initial number of clusters was set to nine, which means that the initial fuzzy model was generated using nine rules. For the MORSSA algorithm, the number of function evaluations was set to 5000; for the inner loop RSSA, the number of function evaluations was set to 200; all other parameter settings were the same as those recommended in Ref. 6. Every experiment was repeated over 20 runs to test the repeatability and consistency of the results. Only one set of typical results out of the 20 runs is selected and shown in the following sections.
Internal process variables
In order to design a safe and practical FSW process, it is crucial to establish correlations between the controllable process conditions and some internal process variables, such as temperatures and forces, which can help to avoid overheating and tool wear problems in the weld design. In this work, the internal process variables considered consist of tool temperature, shaft temperature, torque, traverse force, compression force, and bending force. All the relevant data were collected via an advanced rotating tool environment monitoring and information system (Artemis) unit13 developed by TWI (see also Part 1 of this paper), which is an extensively instrumented rotating tool holder for in-process collection of the internal data representing welding status.
In the following case relating to the peak temperature of the tool (TPT), 20 data points were used for training and five data points were used for final testing. Figure 3 illustrates the trade-offs among the Pareto-optimal models with respect to the multiple objectives and various criteria, including the root mean square error, the number of fuzzy rules, the number of fuzzy sets and the total length of fuzzy rules.

Performance of one set of optimised Pareto-optimal fuzzy models for tool peak temperature modelling problem
Table 1 includes the main parameters of three optimised fuzzy models, which are selected from all the Pareto-optimal models and with eight, five and three rules respectively. Figure 4 shows the prediction performance of these models. It can be seen that, for these optimised models, more fuzzy rules and more parameters will lead to better accuracy while the models with fewer fuzzy rules and parameters are simpler in structure and easier to understand.

Tool peak temperature models’ predicted outputs versus measured outputs (with +1% and −1% error bands)
Main parameters of some obtained tool peak temperature models
To provide more details about the obtained fuzzy models, Fig. 5 shows the fuzzy rule base of the three rule fuzzy model. These fuzzy rules can be rewritten as the following approximate linguistic rules3, 9 using the linguistic hedges approach:15

Rule base of the three rule tool peak temperature model
R1: IF tool rotation speed is very small AND forward movement rate is small, THEN tool peak temperature is more or less large
R2: IF tool rotation speed is more or less small AND forward movement rate is large, THEN tool peak temperature is small
R3: IF tool rotation speed is small medium AND forward movement rate is more or less medium, THEN tool peak temperature is medium large.
It is clear that such linguistic fuzzy rules allow for a better insight into the FSW process.
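The hedges in these rules can be read through the classical power interpretation, in which 'very' concentrates a fuzzy set (membership squared) and 'more or less' dilates it (square root of membership). The sketch below applies these two operators to a Gaussian set for 'small'; the membership parameters and the normalised input scale are hypothetical, and the hedge approach of Ref. 15 may differ in detail.

```python
import math

def small(x, centre=0.0, sigma=0.25):
    """Hypothetical Gaussian membership function for 'small' on a 0-1 scale."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)

def very(mu):
    """Concentration hedge: sharpens the fuzzy set."""
    return mu ** 2

def more_or_less(mu):
    """Dilation hedge: widens the fuzzy set."""
    return math.sqrt(mu)

x = 0.3  # normalised tool rotation speed (illustrative value)
mu = small(x)
degrees = {
    "very small": very(mu),
    "small": mu,
    "more or less small": more_or_less(mu),
}
```

For any membership degree strictly between 0 and 1, the hedged degrees are ordered as very small < small < more or less small, so a hedge can be chosen to match how narrow or wide a fitted membership function is relative to the base label.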
To verify the physical interpretation of the obtained models, Fig. 6 shows their three-dimensional response surfaces. From these surfaces, it can be seen that the models with more fuzzy rules can capture more details from the training data. It can also be observed that, with increasing forward movement per revolution (the ratio of welding speed to rotation speed), the tool temperature tends to decrease. This trend is consistent with the findings of Refs. 16–18 and follows the behaviour expected by domain experts.

Response surfaces of tool peak temperature models
Similarly to the above, multiple sets of fuzzy models for other internal process variables have also been established. Figure 7 shows an instance of the prediction models for the peak torque during the welding process.

Modelling case of peak torque
Mechanical properties
All the welds have been tested for tensile properties at room temperature, including yield strength (YS), ultimate tensile strength (UTS), reduction of area (ROA) and elongation, where a two-dimensional digital image correlation (DIC)19 system (LaVision 2D system running a 2MP monochrome camera) was used for data acquisition and displacement measurements. For each weld profile, five separate specimens were produced and tested. Tensile specimens were machined from the nugget zone in transverse orientation to the weld. It should be emphasised that the strength obtained in the transverse tensile test represents the weakest region of the weld and the observed ductility is an average strain over the gauge length including various zones.
In these tensile tests, failures occurred mainly as a shear fracture in the heat affected zone (HAZ) (see Fig. 8a), because the HAZ has the lowest strength due to significantly coarsened precipitates and the development of the precipitate free zones (PFZs).14 For the welds with defects, failures may also occur in the nugget zone (see Fig. 8b), where voids were produced.

Fracture surfaces of tensile specimens: fracture occurring in a HAZ and b nugget zone
Based on these measured data, the proposed intelligent modelling approach is then employed to construct the prediction models for the above mechanical properties. For these cases, 16 data points were used for training and four data points were used for final testing. Figure 9 shows the trade-offs among the multiple criteria within Pareto-optimal fuzzy models for yield strength, ultimate tensile strength and elongation respectively. Figure 10 shows the prediction performance of some selected models and Fig. 11 shows the three-dimensional input–output surfaces of these fuzzy models.

Performance of optimised Pareto-optimal fuzzy models

Models’ predicted outputs versus measured outputs

Response surfaces
From the surfaces, it can also be observed that, with an increasing forward movement per revolution, the yield strength tends to increase; and with an increasing tool rotation speed, the UTS tends to decrease. These trends are considered to be consistent with the findings of Ref. 20 and follow the behaviour expected by domain experts. It is also worth noting that these fuzzy models represent non-linear mappings with a good generalisation ability, which is evidenced by the smooth input–output response surfaces.
Grain size
The FSW process results in significant microstructural evolution, including grain size, grain boundary character, dissolution and coarsening of precipitates, breakup and redistribution of dispersoids, and texture.14 In this paper, the average grain size (AGS) in the nugget zone is investigated. Figure 12 shows the micrographs of the preweld and post-weld materials, where one can identify fine equiaxed grains with different grain sizes.

Micrographs of a parent material, b weld with RS = 280 rev min−1 and FM = 1·4 mm rev−1, c weld with RS = 580 rev min−1 and FM = 0·6 mm rev−1, and d weld with RS = 580 rev min−1 and FM = 1·4 mm rev−1
In this experiment, 20 data points were used for training and five data points were reserved for testing. Figure 13 shows the trade-offs among the multiple criteria within the non-dominated fuzzy solutions. It can be observed that these Pareto-optimal models exhibit fuzzy sets pattern behaviour, which means that they provide a wider choice of different solutions to users.

Performance of one set of optimised Pareto-optimal fuzzy models for average grain size modelling problem
Table 2 includes the main parameters of three optimised fuzzy models, which are selected from all the Pareto-optimal models and with nine, seven and four rules respectively. It can be seen that, for these optimised models, more fuzzy rules and more parameters will lead to a better accuracy while the models with fewer rules and parameters are simpler in structure and easier to understand (interpret).
Main parameters of some obtained average grain size models
Figure 14 shows the prediction performance of the model with seven fuzzy rules and its response surface. It is noted that the grain size can be reduced by decreasing the tool rotation rate at a constant tool traverse speed, which agrees with the findings of Refs. 17 and 21. This observation is consistent with the general principles of recrystallisation,22 because an increase in heat input (high tool rotation speed) leads to the generation of coarse grains.

Seven rule average grain size model's a predicted outputs versus measured outputs (with +10% and −10% error bands) and b response surface
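The error bands in such plots can also be checked numerically. The following sketch counts the fraction of predictions that fall within a relative band around the measured values; the function name and the sample numbers are illustrative and are not data from this study.

```python
def fraction_within_band(measured, predicted, band=0.10):
    """Fraction of predictions within +/- band (relative) of the measured values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for y, y_hat in zip(measured, predicted)
        if abs(y_hat - y) <= band * abs(y)
    )
    return hits / len(measured)

# Invented values for illustration only
measured = [10.0, 12.0, 8.0, 9.5]
predicted = [10.5, 13.5, 8.2, 9.0]
share = fraction_within_band(measured, predicted)
```

With these invented values, three of the four predictions lie inside the ±10% band; tightening `band` to 0.01 reproduces the stricter ±1% criterion used for the tool peak temperature plots.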
For more details, Fig. 15 shows the rule base of the four rule fuzzy model. It can be further represented as the following approximate linguistic rules3, 9 using the linguistic hedges approach:15

Rule base of four rule average grain size model
R1: IF tool rotation speed is medium small AND forward movement rate is medium large, THEN average grain size is medium small
R2: IF tool rotation speed is more or less medium AND forward movement rate is medium small, THEN average grain size is more or less medium
R3: IF tool rotation speed is medium large AND forward movement rate is medium, THEN average grain size is large
R4: IF tool rotation speed is large AND forward movement rate is large, THEN average grain size is medium small.
By inspecting these linguistic rules, one can understand more about the system's behaviour.
Conclusions
In spite of the relatively short history of FSW, it has found a number of key applications in industry. Friction stir welding can also lead to significant improvements in the mechanical properties of the welded materials (weld and parent material) over the more conventional welding techniques. However, the fundamental physical understanding of the process is still lacking, as the material flow and microstructural evolution within the weld are very complex. To achieve a comprehensive understanding of the relationships among the process conditions, internal process variables and post-weld properties, the foregoing Part 1 of this paper has studied the correlations between temporal internal process variables and the weld quality using multiple correlation analysis techniques, such as Fourier frequency analysis and wavelet based analysis. Subsequently, in this Part 2, a systems modelling framework has been successfully applied within the context of predicting multiple properties for the FSW processed welds, including the mechanical properties, average grain size of welded materials, and some internal process properties. In the derived modelling method, the multiobjective optimisation technique has been employed to improve both the accuracy and the interpretability attributes of fuzzy models, and a hierarchical double loop optimisation structure, including two learning techniques (MORSSA and RSSA), has also been included to improve the modelling efficiency. In future, the developed models will be exploited to serve as the core module for reverse engineering designs that are able to suggest optimal process conditions (process routes) by taking into account a set of desired objectives relating to achieving structurally sound, defect free, and reliable welds.
