Filtered prioritized and cluster-based optimization towards correction of antipatterns

Abstract

As software systems evolve, their complexity and maintenance effort increase, which introduces poor design solutions known as anti-design patterns or antipatterns. Refactoring improves an existing system by changing its internal structure without affecting its external behavior. Many studies have revealed that refactoring is underused by software development teams in industry because of the effort it requires and a lack of knowledge about existing approaches. Thus, good refactoring recommendations need a prioritized correction of antipatterns based on the relevance of classes, the analyzed development history, the risk to refactor, minimized code changes, and the maintainer’s context and preferences. We propose an approach based on filtering relevant classes, generic linkage-based clustering, and an adapted nondominated sorting genetic algorithm III to find good refactoring recommendations that maximize software quality and reduce refactoring effort. Before optimization, our approach uses our own node centrality-based sub-graph isomorphism algorithm to dynamically understand the system with good accuracy. The approach is evaluated on six open-source systems, and the results show its effectiveness compared with existing approaches.

Keywords

Antipatterns, artifacts, metaheuristic, refactorings, refactor, understandability

Introduction

Software maintenance and evolution are both essential to the success of a product in modern-day software development. The design and code must be optimized periodically to avoid technical debt resulting from the decay of software artifacts such as classes, packages, and requirement documents. The decision depends on the developer’s expertise, but maintenance tasks applied to software systems increase their complexity, which causes the system to deviate considerably from its original design and structure. This may lead to design defects known as antipatterns.^1,2 These antipatterns have a negative effect on software quality attributes such as maintainability, flexibility, and understandability.³

Refactoring is a widely used technique for improving software quality. Martin Fowler defined refactoring as the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.⁴ Refactoring mainly involves two steps: (1) identify the portions of code that contain antipatterns or code smells and (2) identify a good sequence of refactoring operations. The first step is well covered in the literature,^3,5 but these approaches are purely metric based and are not interactive, that is, they do not put the programmer in the loop. Mkaouer et al.⁶ recommended a scalable high-dimensional search-based approach, but its search space is still too large to handle.

In this research work, we propose a novel interactive approach to refactoring recommendation in which architecturally relevant classes are filtered and a cluster-based selection prioritizes the code smells according to criteria such as relevance, the maintainer’s context and preferences, risk to refactor, minimized code changes, and the analyzed version history. The refactoring problem as a whole is thus treated as a multi-objective optimization problem that uses generic linkage-based clustering selection to find good refactoring solutions. Our work builds on the law of the vital few, that is, “only 20% of the code contains 80% of the errors.”⁷ We choose only the most relevant, refactoring-prone classes by considering historical information for urgent refactoring needs. We evaluated our approach on six medium- and large-scale open-source systems against the related approach of Ouni et al.³ and report the effectiveness of our results.

The remainder of this paper is structured as follows. The next section presents the challenges to refactoring recommendations, followed by a section that defines refactoring as a multi-objective optimization problem. A further section introduces a search-based approach to this problem using generic linkage-based clustering selection and the adapted nondominated sorting genetic algorithm III (NSGA III), followed by a section that describes the solution coding and fitness functions. The penultimate section presents the research questions, results, and discussion and describes the related work, and the final section gives the conclusion, future work, and references.

Educational benefits

Design patterns and pattern-oriented software architecture is an important course in software engineering at the postgraduate and doctoral levels, but students are often not aware of the professional benefits of dynamically understanding software. They also ask how design patterns turn into antipatterns or code smells that have a negative impact on software quality and future maintenance. This paper addresses these issues by showing how to prioritize antipatterns within a framework of interactive machine learning and graph algorithms to achieve high-quality software products. This work provides a basis for further research and for an enhanced course that helps students become future entrepreneurs who deliver good-quality products.

Challenges to refactoring recommendations

Background and definitions

The idea of refactoring is to reorganize classes, variables, and methods to adapt to changing requirements. The software system thereby improves in quality aspects such as maintainability and understandability.^8,9 Our approach detects 12 kinds of bad smells: Blob, feature envy, spaghetti code, anti-singleton, class data should be private, complex class, lazy class, long parameter list, message chain, refused parent bequest, data class, and Swiss army knife.^10,11 In this approach, we attempt to correct those design defects, termed antipatterns,^1,2 that have a negative effect on software quality and often lead to many bugs and failures.

Many real-life problems have objectives that conflict with each other. In this research work, we have to maximize the quality, minimize the refactoring effort, minimize the change score, and improve the code correction ratio (CCR). The goal is to find a set of solutions, each of which satisfies the objectives at an acceptable level without being dominated by any other solution. These solutions form the set of all feasible non-dominated solutions, that is, the Pareto-optimal set. Identifying an optimal solution for conflicting objectives is practically impossible because of the size of the search space, so it is worthwhile to reduce the search space before the genetic operations. For that purpose, we adapt generic linkage-based clustering to the NSGA III algorithm.¹² Our objectives are as follows: (a) improved software quality, (b) minimized code changes, (c) consistency with the analyzed version history, (d) the architectural relevance of the classes and their ranked prioritization, and (e) consideration of the developer’s context and preferences.
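The notion of a non-dominated solution can be made concrete with a small sketch. It assumes that every objective is expressed as a value to be minimized (a quality score such as CCR is negated), and the function names and sample objective vectors are illustrative only; they are not part of the framework described in this paper.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates as (-CCR, refactoring effort); both are minimized.
candidates = [(-0.96, 0.22), (-0.87, 0.48), (-0.90, 0.20)]
front = pareto_front(candidates)
```

In this example the second candidate is dominated because the first one has both a higher CCR and a lower effort, while the first and third candidates trade one objective against the other and therefore both stay in the front.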

Filtered cluster-based optimization perspective towards refactoring recommendation

An overview of the approach

The approach explores a large search space to gather good refactoring solutions that correct the detected antipatterns or bad smells. A generic linkage-based clustering optimization, that is, adapted NSGA III, is used to generate refactoring solutions. The general structure of our approach is sketched in Figure 1. Our approach first performs a class prioritization, which is a three-step process. First, different versions are analyzed: according to the study by Girba et al.,¹³ the classes that have been refactored frequently in the past are likely to be refactored in the near future. Second, architecturally relevant classes are identified from the current version. Third, a rank is calculated.

\begin{array}{l} \text{Rank} = \text{Frequency of refactoring } (Fr) \times \text{Severity score } (Sr) \\ \times \sum \text{severity of each code smell for a class } (S(c_i)) \\ \times \text{no. of code smells } (NC(c_i)) \end{array}

(1)
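As a worked illustration of equation (1), the sketch below computes the rank of a single class. It assumes one particular reading of the equation, namely a plain product of the four factors in which the sum runs over the severities of the smells found in the class and NC is the number of those smells. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def rank_score(refactoring_frequency: float,
               severity_score: float,
               smell_severities: Sequence[float]) -> float:
    """Rank = Fr x Sr x sum(S(ci)) x NC(ci), read as a plain product.

    smell_severities holds one severity value per code smell detected
    in the class, so its length plays the role of NC(ci).
    """
    return (refactoring_frequency
            * severity_score
            * sum(smell_severities)
            * len(smell_severities))


# Hypothetical class: refactored 3 times in past versions, severity score 2,
# with two smells whose severities are 13 and 12.
example_rank = rank_score(3, 2, [13, 12])
```

Classes can then be sorted by this value in descending order so that the highest-ranked classes are passed to the optimization first.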

Figure 1.

Proposed structure of the approach.

Refactoring process as a multi-objective problem

Quality

The quality objective is evaluated with the CCR, which is the ratio of the number of antipatterns¹ removed from the system after refactoring to the total number of antipatterns observed in the system before refactoring.

CCR = \frac{\text{number of antipatterns removed after refactoring}}{\text{total number of antipatterns observed in the system before refactoring}}

(2)

Code changes

To calculate the code changes score in the analyzed version history, each refactoring solution is considered as a set of n refactoring operations (ROs), and a weight w_i in the range [1, 3] is assigned to each RO.

\text{Code changes} = \sum_{i=1}^{n} w_i

(3)
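The code changes score of equation (3) amounts to summing one weight per RO. The following sketch assumes a lookup table from RO type to weight; the RO names and the weights in `HYPOTHETICAL_WEIGHTS` are invented for illustration, and only the range of 1 to 3 comes from the text.

```python
from typing import Dict, Sequence

# Hypothetical weights; the paper only states that each weight lies in [1, 3].
HYPOTHETICAL_WEIGHTS: Dict[str, int] = {
    "pull_up_field": 1,
    "move_method": 2,
    "extract_class": 3,
}


def code_changes_score(operations: Sequence[str],
                       weights: Dict[str, int]) -> int:
    """Sum the weight of every refactoring operation in a solution."""
    total = 0
    for op in operations:
        weight = weights[op]
        if not 1 <= weight <= 3:
            raise ValueError(f"weight for {op!r} must lie in [1, 3]")
        total += weight
    return total


score = code_changes_score(["move_method", "extract_class"],
                           HYPOTHETICAL_WEIGHTS)
```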

Our own Java-based optimization framework calculates the number of redirected method calls, field references, return types, and parameters of a method.

Consistency with the version history

Different stable versions of the input system are analyzed. The classes that have been refactored at least once in previous versions are recorded along with their frequency score. For this purpose, we use the Ref-finder¹⁴ tool, and the current version of the system is analyzed with the organic tool (http://www.organic/).¹⁵ To detect antipatterns, we use our own node centrality-based sub-graph isomorphism approach, as described in the studies by Sreeji and Lakshmi.^10,11 The classes that have not been refactored in the past are filtered out in this step. This process is conducted before the optimization, and the rank score is given as input to the optimization framework.

Severity

Developers attach different importance to different types of antipatterns because each type has a different impact on software quality. In this approach, we consider 13 types of antipatterns, and the highest score of 13 is given to Blob because it is the most risky and affects a large number of classes, as indicated in the literature³ and in our observations^10,11 (12 for feature envy, 11 for spaghetti code, and so on), depending on the frequency of each type of antipattern. The value may vary from one system to another depending on the context. In this research work, the software developer’s context and preferences are taken into account when choosing the severity.

Adapted NSGAIII towards software refactoring

Algorithm 1: Pseudo code for adaptation of generic linkage-based clustering selection to NSGA III towards refactoring

Input: a cluster set of each individual refactoring solutions i $\in$ R

Output: reduced non dominated set as a list L

Initialise cluster set of all refactoring solutions.

if |R| ≤N where N is the maximum sequence of refactoring solutions

/*Assume that the update points are vector based with the Euclidean distance as dissimilarity measures. The distance between two clusters R₁ and R₂ is the average distance of all pairs of refactoring solutions (x $\in$ R₁ and y $\in$ R₂) and is calculated using the formula. */

d_{xy} = \frac{1}{|R_{1}| \cdot |R_{2}|} \sum_{x \in R_{1},\, y \in R_{2}} d(x, y)

Determine two clusters with minimal distance d_xy and form a priority queue L implemented as a binary heap.

Else

For each cluster do

update the minimum distance

calculate the centroid

update the priority queue

End Do

End if

Output the reduced non dominated set as an output list L.
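The clustering step of Algorithm 1 can be sketched as follows. This is a simplified illustration under several assumptions: solutions are represented as tuples of objective values, clusters are merged by average linkage with Euclidean distance until at most `n` clusters remain, and each cluster is then represented by the member closest to its centroid. It recomputes all pairwise distances on every merge instead of maintaining a binary-heap priority queue, so it does not reproduce the exact control flow of Algorithm 1.

```python
import math
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, ...]


def average_linkage(c1: List[Point], c2: List[Point]) -> float:
    """Average Euclidean distance over all pairs (x in c1, y in c2)."""
    total = sum(math.dist(x, y) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def centroid(cluster: List[Point]) -> Point:
    """Component-wise mean of the points in a cluster."""
    dim = len(cluster[0])
    return tuple(sum(p[k] for p in cluster) / len(cluster) for k in range(dim))


def reduce_by_clustering(solutions: List[Point], n: int) -> List[Point]:
    """Merge the two closest clusters until at most n remain, then keep
    the member nearest to the centroid of each cluster."""
    clusters = [[s] for s in solutions]
    while len(clusters) > n:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # i < j, so index i is unaffected by the deletion
    return [min(c, key=lambda p: math.dist(p, centroid(c))) for c in clusters]


reduced = reduce_by_clustering(
    [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (5.05, 5.2)], n=2
)
```

With the sample points, the two groups near the origin and near (5, 5) each collapse to a single representative, which is how the reduction shrinks the set handed to the genetic operators.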

Algorithm 2: Pseudo code of adaptation to NSGA III¹² towards refactoring

Input: H initial model, a set of quality metrics, structured reference points Z^r, parent population containing design defects Pt

Output: Pt’, a best refactoring solution

St = ∅, i = 1

Pt’ = Generic_linkage_based_cluster_selection(Pt)

Qt = Recombination + Mutation(Pt’)

Rt = Pt’ ∪ Qt

(F1, F2, …) = Non-Dominated_sort(Rt)

repeat

St = St ∪ Fi and i = i + 1

until |St| ≥ N

Fl = Fi /* last front to be included */

If |St| = N then

Pt’ = St, break

else

Pt’ = $\bigcup_{j = 1}^{l - 1} F_j$

K = N − |Pt’| /* number of solutions still to be chosen from Fl */

/* Normalize the objectives and set up new reference set R */

Associate each element s of St with its closest reference point π(s)

Compute d(s), the perpendicular distance between s and the reference line of π(s)

Compute the niche count pj of each reference point j ∈ Z^r:

pj = $\sum_{s \in S_t \setminus F_l} ((\pi(s) = j) \, ? \, 1 : 0)$

Choose K refactoring solutions, one at a time, from the last front Fl to complete Pt’: Niching(K, pj, π(s), d(s), Z^r, Fl, Pt’)

End if

End
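The front-filling part of Algorithm 2 can be sketched as follows. The sketch assumes minimized objectives and uses a simple quadratic non-dominated sort; it returns the indices accepted from the complete fronts, the last front Fl, and the number K of solutions that the niching step would still have to choose from Fl. Normalization, reference-point association, and niching are deliberately left out, and all names are illustrative.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b when it is no worse everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Partition solution indices into fronts F1, F2, ... by peeling off
    the non-dominated subset of the remaining solutions each round."""
    remaining = list(range(len(objs)))
    fronts: List[List[int]] = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


def split_for_niching(objs: List[Sequence[float]],
                      n: int) -> Tuple[List[int], List[int], int]:
    """Return (accepted, last_front, k): accepted holds the fronts that fit
    entirely into n slots, last_front is Fl, and k = n - len(accepted)."""
    accepted: List[int] = []
    for front in non_dominated_sort(objs):
        if len(accepted) + len(front) <= n:
            accepted.extend(front)
            if len(accepted) == n:
                return accepted, [], 0
        else:
            return accepted, front, n - len(accepted)
    return accepted, [], 0  # fewer than n solutions in total


objectives = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
fronts = non_dominated_sort(objectives)
```

When the complete fronts fill the population exactly, no niching is needed; otherwise the K remaining slots are filled from the last front, which is where the reference-point-based selection of NSGA III applies.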

Solution coding

The refactoring order is given by the position in the vector. First, an empty vector is created for the current solution; then (1) an RO is randomly selected from the list of possible refactorings and (2) a set of code elements is randomly selected for it. Each suggested RO should satisfy a set of pre- and post-conditions. The operation is applied to an intermediate model, and the process is repeated until a solution of maximal length n is obtained.
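The construction of a random solution vector described above can be sketched as follows. The sketch assumes that each vector entry is a pair of an RO name and a code element, and that the pre- and post-conditions are supplied by the caller as a predicate. The names `random_solution` and `is_applicable`, the attempt limit, and the sample data are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Step = Tuple[str, str]  # (refactoring operation, code element)


def random_solution(operations: Sequence[str],
                    code_elements: Sequence[str],
                    is_applicable: Callable[[str, str, List[Step]], bool],
                    max_length: int,
                    rng: random.Random) -> List[Step]:
    """Append randomly chosen (operation, element) pairs that satisfy the
    caller-supplied conditions until the vector reaches max_length.

    The attempt limit guards against conditions that can never be met.
    """
    solution: List[Step] = []
    attempts = 0
    while len(solution) < max_length and attempts < 100 * max_length:
        attempts += 1
        op = rng.choice(operations)
        element = rng.choice(code_elements)
        if is_applicable(op, element, solution):
            solution.append((op, element))
    return solution


# Hypothetical condition: do not apply the same operation to the same
# element twice within one solution.
solution = random_solution(
    ["pull_up_field", "move_method"],
    ["ClassA", "ClassB"],
    lambda op, el, so_far: (op, el) not in so_far,
    max_length=3,
    rng=random.Random(0),
)
```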

Mutation and crossover applied to the individuals

A single-point crossover and a mutation, performed by probabilistically interchanging positions in the vector, are illustrated as follows:

a. Crossover

Parent 1: PDF, PUF, ESUBC

Parent 2: PUF

⇓ Cross-over

Child 1 and Child 2: PUF, PDF, PUF, ESUBC

b. Mutation

Parent: PUF, PDM

⇓ Mutation

Child: PUF, PDM
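A single-point crossover and a position-wise mutation on such vectors can be sketched as follows. The sketch treats the abbreviations used above (PDF, PUF, PDM, ESUBC) as opaque operation labels, and it assumes that both parents contain at least two operations so that a cut point exists. The mutation rate and the pool of replacement operations are parameters chosen by the caller, not values from the paper.

```python
import random
from typing import List, Sequence, Tuple


def single_point_crossover(p1: Sequence[str], p2: Sequence[str],
                           rng: random.Random) -> Tuple[List[str], List[str]]:
    """Cut both parents at the same random position and swap the tails."""
    shortest = min(len(p1), len(p2))
    if shortest < 2:
        raise ValueError("both parents need at least two operations")
    cut = rng.randint(1, shortest - 1)
    child1 = list(p1[:cut]) + list(p2[cut:])
    child2 = list(p2[:cut]) + list(p1[cut:])
    return child1, child2


def mutate(solution: Sequence[str], pool: Sequence[str],
           rate: float, rng: random.Random) -> List[str]:
    """Replace each position, with probability rate, by a random
    operation drawn from pool."""
    return [rng.choice(pool) if rng.random() < rate else op
            for op in solution]


parent1 = ["PDF", "PUF", "ESUBC"]
parent2 = ["PUF", "PDM", "PDF"]
child1, child2 = single_point_crossover(parent1, parent2, random.Random(1))
mutant = mutate(parent1, ["PDM"], rate=0.3, rng=random.Random(2))
```

Swapping tails at a common cut point keeps every operation of the two parents in exactly one child, so the children together contain the same multiset of operations as the parents.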

Fitness function

After a solution S is created, its fitness is calculated. Because the objectives for each filtered class conflict, the problem is not solved with equal weights but multiple times with different weight combinations. A normalized weight vector is randomly generated for each solution during the selection phase at each generation. The simulation code is written in our Java optimization framework, which supports the analysis of Java programs, metric calculation, and the simulation of refactorings in Eclipse.
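The randomly generated normalized weight vector can be illustrated with the short sketch below. It assumes that the weights are drawn uniformly and then scaled to sum to 1, and that the scalar fitness is the weighted sum of the objective values; the paper does not specify either detail, so both are assumptions, as are the function names.

```python
import random
from typing import List, Sequence


def random_weight_vector(n: int, rng: random.Random) -> List[float]:
    """Draw n non-negative weights and scale them so that they sum to 1."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    if total == 0.0:  # all draws were exactly zero; fall back to equal weights
        return [1.0 / n] * n
    return [r / total for r in raw]


def weighted_fitness(objectives: Sequence[float],
                     weights: Sequence[float]) -> float:
    """Weighted sum of objective values for one solution."""
    if len(objectives) != len(weights):
        raise ValueError("one weight per objective is required")
    return sum(w * o for w, o in zip(weights, objectives))


weights = random_weight_vector(4, random.Random(7))
```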

Results and discussions

Research questions

RQ1: To what extent does filtering classes prior to optimization affect the CCR, the estimated effort to refactor, and the rank score?

RQ2: To what extent does the approach correct the filtered classes that contain antipatterns?

RQ3: How much quality is gained after filtered and prioritized refactoring?

RQ4: How does the approach compare with the existing approach without prior filtration?

Experimental setup and systems studied

The approach is applied to six medium- and large-sized open-source projects, including GanttProject (www.ganttproject.biz), JFreeChart (http://www.JFreechart.org/), and JHotDraw (http://www.jhotdraw.org/). GanttProject is a cross-platform tool for project scheduling. JHotDraw is a GUI framework for drawing editors. ArgoUML v0.26 and v0.3 are frameworks for modeling purposes. Xerces-J is a family of packages for parsing XML. Finally, JFreeChart is an open-source framework, written in Java, for creating charts.

Result analysis

To answer RQ1 and RQ2, our method uses two performance metrics: CCR and estimated refactoring effort (ERE).

CCR is given in equation (2) and is the ratio of antipatterns corrected after refactoring to the total number of antipatterns observed in the system. After filtering the classes that were refactored at least once across three consecutive versions, the architecturally relevant classes were identified. The resulting CCR values are shown in Table 2. The proposed refactorings were applied using the Eclipse meta-model, and the program behavior was observed. The ERE is calculated using the formula

ERE = \frac{\text{number of classes to be refactored}}{\text{total number of filtered classes observed in the system}}

(4)

The results are shown in Table 1.

Table 1.

Performance Evaluation metrics for the systems studied.

System	Step	CCR	ERE
GanttProject 2.6.6	MORE	578/668=87%	327/668=48%
GanttProject 2.6.6	Cluster-based (CB) NSGA III	517/541=96%	124/541=22%
JHotDraw 7.6	MORE	635/669=95%	237/669=35%
JHotDraw 7.6	CB NSGA III	527/548=96%	117/548=21%
JFreeChart 1.0.19	MORE	997/1214=82%	525/1214=43%
JFreeChart 1.0.19	CB NSGA III	815/889=91%	217/889=24%
ArgoUML v0.26	MORE	1258/1358=93%	138/1358=10%
ArgoUML v0.26	CB NSGA III	760/771=98%	75/771=9%
ArgoUML v0.3	MORE	1275/1409=90%	558/1409=39%
ArgoUML v0.3	CB NSGA III	770/775=99%	129/775=17%
Xerces-J v2.7	MORE	885/991=89%	550/991=55%
Xerces-J v2.7	CB NSGA III	769/775=99%	82/775=10%

CCR: code correction ratio; ERE: estimated refactoring effort.
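The two ratios reported in Table 1 follow directly from equations (2) and (4). The sketch below recomputes them for the GanttProject 2.6.6 row of the cluster-based NSGA III step, using the counts from the table; the function names are illustrative.

```python
def code_correction_ratio(removed: int, total_before: int) -> float:
    """CCR: antipatterns removed after refactoring divided by the
    antipatterns observed before refactoring."""
    if total_before <= 0:
        raise ValueError("total_before must be positive")
    return removed / total_before


def estimated_refactoring_effort(to_refactor: int, filtered_total: int) -> float:
    """ERE: classes to be refactored divided by the filtered classes
    observed in the system."""
    if filtered_total <= 0:
        raise ValueError("filtered_total must be positive")
    return to_refactor / filtered_total


# Counts taken from Table 1, GanttProject 2.6.6, CB NSGA III.
ccr = code_correction_ratio(517, 541)
ere = estimated_refactoring_effort(124, 541)
print(f"CCR = {ccr:.1%}, ERE = {ere:.1%}")
```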

The statistical study of our approach uses 31 independent runs for each problem instance. The results were compared with the Wilcoxon rank sum test at a 99% confidence level (α < 0.004) for CCR, with p value < 0.004, and at a 95% confidence level (α < 0.05) for ERE, for the following algorithms: (1) MORE and (2) our approach. The results indicate that the new approach improves the CCR with reduced refactoring effort. Hence, our approach outperforms MORE on both metrics.

RQ3: We use the QMOOD quality model and its related attributes, as in the study by Ouni et al.³ These attributes are calculated using 11 low-level design metrics: design size in classes, number of class hierarchies in a design, average number of ancestors, data access metric, direct class coupling, cohesion among methods, measure of aggregation, functional abstraction, number of polymorphic methods, class interface size, and complexity in terms of number of methods. Let Q = {q1, q2, q3, q4, q5, q6} be the quality attributes before filtration, prioritized correction, and optimization, and let QP = {qp1, qp2, qp3, qp4, qp5, qp6} be the quality attributes after filtered prioritized correction. A weight between 0 and 1 is assigned to each attribute, and the quality gain is the weighted sum of the differences between the corresponding attributes.
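The weighted quality gain described for RQ3 can be written as a short function. The sketch assumes that the difference is taken as the value after refactoring minus the value before, so that a positive result means an improvement; the attribute values and weights in the example are invented for illustration.

```python
from typing import Sequence


def quality_gain(before: Sequence[float],
                 after: Sequence[float],
                 weights: Sequence[float]) -> float:
    """Weighted sum of (qp_i - q_i) over the quality attributes."""
    if not (len(before) == len(after) == len(weights)):
        raise ValueError("before, after, and weights must have equal length")
    if any(not 0.0 <= w <= 1.0 for w in weights):
        raise ValueError("every weight must lie between 0 and 1")
    return sum(w * (qp - q) for q, qp, w in zip(before, after, weights))


# Hypothetical values for two attributes.
gain = quality_gain(before=[0.2, 0.5], after=[0.4, 0.5], weights=[0.5, 1.0])
```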

RQ4: Regarding the improvement in quality attributes, our approach achieves a larger improvement in QMOOD attributes such as flexibility and understandability, as Figure 2 illustrates. The graph plots the QMOOD values of the six systems for both approaches. The quality gain values are normalized between 0 and 1 using min–max normalization.
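Min–max normalization maps the smallest value to 0 and the largest to 1. The minimal sketch below assumes that a list whose values are all equal is mapped to zeros, a convention chosen here only to avoid division by zero.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


normalized = min_max_normalize([2.0, 4.0, 6.0])
```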

Figure 2.

QMOOD quality attributes comparison over various systems.

Parameter set-up and industrial case study

Our experiments use population sizes of 90, 200, and 150 for 3, 5, and 8 objectives, respectively. The maximum number of generations is 300, 400, and 600, respectively. Each algorithm is executed for 31 runs with different configurations and parameters. The performance measures, namely CCR, ERE, computational time, the inverted generational distance (IGD), which is the average Euclidean distance between each reference point and its closest solution, and quality gain, are then calculated for the algorithms and for each open-source system. The scalability of NSGA III is illustrated in Mkaouer et al.⁶ We compared the computational time and IGD of high-dimensional NSGA III,⁶ our adapted NSGA III, and multi-objective NSGA II (MORE). To check the efficacy of our adapted approach, we asked five software engineers to use it on a medium-scale, web-based project developed with a Java framework. Our approach detected abstract factory, bridge, and command design patterns, with a total of 79 code smell instances in 200 classes. The engineers were asked to perform the suggested refactorings and were able to fix 95% of the code smells with minimized effort and improved quality. They found the recommended refactorings consistent with their preferences and useful.
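The IGD indicator used in this comparison can be computed as in the sketch below, which follows the usual definition: for every reference point, take the Euclidean distance to the nearest obtained solution, then average over the reference points. The function name and the sample points are illustrative, and the reference set used in the experiments is not reproduced here.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def inverted_generational_distance(reference_front: Sequence[Point],
                                   obtained_front: Sequence[Point]) -> float:
    """Average, over the reference points, of the distance to the closest
    obtained solution. Lower values indicate a better approximation."""
    if not reference_front or not obtained_front:
        raise ValueError("both fronts must be non-empty")
    total = sum(min(math.dist(r, s) for s in obtained_front)
                for r in reference_front)
    return total / len(reference_front)


perfect = inverted_generational_distance([(0.0, 1.0), (1.0, 0.0)],
                                         [(0.0, 1.0), (1.0, 0.0)])
```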

Discussions

Table 2 shows the median IGD and CCR values of 31 independent runs for the three algorithms under comparison. For three objectives, the three algorithms perform almost the same, but for a higher number of objectives, MORE using NSGA II performs worse. High-dimensional NSGA III⁶ performs well for a higher number of objectives, but our cluster-based NSGA III achieves a CCR that is at least as high in every case. This is due to its non-metric, graph-based antipattern detection, its cluster-based selection that reduces the search space, and finally the reference point-based selection for the next iteration. For both NSGA III variants, the CCR increases with the number of objectives.

Table 2.

The median IGD and CCR values of 31 independent runs for Mkaouer et al. (NSGA III), cluster-based NSGA III, and MORE using NSGA II.

Problem	N	Maximum generations (M)	Mkaouer et al. (NSGA III) CCR (%)	Mkaouer et al. (NSGA III) IGD	Cluster-based NSGA III CCR (%)	Cluster-based NSGA III IGD	MORE using NSGA II CCR (%)	MORE using NSGA II IGD
ArgoUML v0.26	3	300	67	1.357 × 10⁻³	71	1.358 × 10⁻³	67	1.357 × 10⁻³
	5	400	69	1.356 × 10⁻³	82	3.256 × 10⁻³	58	4.42 × 10⁻³
	8	600	86	3.957 × 10⁻³	99	3.257 × 10⁻³	42	∼
Xerces-Jv 2.7	3	300	64	9.751 × 10⁻⁴	69	9.493 × 10⁻⁴	62	9.752 × 10⁻⁴
	5	400	78	7.874 × 10⁻³	84	7.653 × 10⁻³	56	8.806 × 10⁻³
	8	600	84	8.001 × 10⁻³	89	6.678 × 10⁻³	48	∼
ArgoUML v0.3	3	300	63	2.667 × 10⁻³	65	1.953 × 10⁻³	63	2.647 × 10⁻³
	5	400	79	4.283 × 10⁻³	94	5.672 × 10⁻³	54	4.124 × 10⁻³
	8	600	91	5.545 × 10⁻³	98	4.925 × 10⁻³	∼	∼
GanttProject v2.6.6	3	300	65	5.661 × 10⁻³	68	4.901 × 10⁻³	65	5.601 × 10⁻³
	5	400	69	6.601 × 10⁻³	92	6.521 × 10⁻³	54	6.945 × 10⁻³
	8	600	88	7.726 × 10⁻³	96	6.976 × 10⁻³	∼	∼
JHotDraw7.0.6	3	300	66	5.337 × 10⁻³	68	5.337 × 10⁻³	66	5.332 × 10⁻³
	5	400	79	6.202 × 10⁻³	87	5.901 × 10⁻³	74	6.776 × 10⁻³
	8	600	88	7.116 × 10⁻³	88	5.934 × 10⁻³	∼	∼
JFreechart1.0.19	3	300	67	5.691 × 10⁻³	69	5.592 × 10⁻³	68	5.591 × 10⁻³
	5	400	80	6.102 × 10⁻³	85	5.690 × 10⁻³	72	6.667 × 10⁻³
	8	600	89	6.676 × 10⁻³	91	5.561 × 10⁻³	∼	∼

CCR: code correction ratio; IGD: inverted generational distance; NSGA: nondominated sorting genetic algorithm.

When we consider the computational time (CPU time), which is a crucial measure in the evaluation, cluster-based NSGA III performs well, with reduced computational time, because filtration and clustering further reduce the number of classes in the solution space. Figure 3 illustrates this fact.

Figure 3.

CPU time values of 31 independent runs of different algorithms studied.

Related work

Murphy-Hill et al.¹⁶ conducted empirical studies to provide support for refactoring. M Fowler^8,9 first coined the word refactoring as an effective approach to improve the design while preserving the external behavior. Some of the major bottlenecks in adopting refactoring in industrial projects include deadline pressure and inadequate tool support. This scenario requires the refactoring research community to provide an effective approach that helps developers attain good-quality software without incurring much cost and effort. Search-based software engineering^17–20 has successfully adopted many different approaches to automate software engineering tasks such as refactoring.

Many of the search-based approaches use metric-based objective functions to find an optimal refactoring sequence.³ The researchers used the QMOOD quality model and attained minimal improvement in quality factors such as understandability and flexibility. Seng et al.²¹ proposed a mono-objective optimization approach that uses a genetic algorithm to maximize a weighted sum of several quality factors such as cohesion and complexity, but the approach needs human intervention to decide whether a suggested refactoring can be applied. Recently, Ouni et al.³ introduced a search-based approach using NSGA II that considers, in addition to the quality fitness function, a semantic fitness function and a code smell objective function. The approach is able to fix 84% of code smells and introduces an average of six design patterns. However, it produces offline refactoring suggestions, which is a tedious and time-consuming process. The approach also fails to assess the ERE for urgent refactoring needs.

Very few research works address the prioritization and correction of antipatterns when the software community needs urgent refactoring treatments. R Marinescu^8,22 developed the InFusion tool and defined a severity index, in which severity is computed by measuring how many times the value of a chosen metric exceeds a given threshold, considering size, encapsulation, complexity, coupling, and cohesion metrics. Other facts such as the change history, the context, and the characteristics of the smell are also considered to prioritize and manage antipatterns. Arcelli et al.²³ proposed a tool called JCodeOdor to filter and prioritize code smells. The approach defines an index called code smell harmfulness to approximate how harmful each code smell is with the help of metrics and threshold computation. Arcoverde et al.²⁴ evaluated different heuristics such as change density and error density to prioritize antipatterns and to help maintainers by automatically ranking the code anomalies that are more harmful. Steidl and Eder²⁵ proposed a prioritization mechanism for maintainability defects, such as code clones and long methods, that are easy to refactor. Palomba et al.²⁶ proposed approaches that use the development version history of various applications to identify the code smells in the current version. Tsantalis et al.²⁷ described a search-based approach for identifying good refactoring solutions based on historical information. In the study by Ouni et al.,⁵ a search-based approach for identifying good refactoring solutions based on chemical reaction optimization was described and a good CCR was achieved, but no attempt was made to assess the quality or to reduce the refactoring effort. Choudhary and Singh²⁸ proposed an approach to minimize the refactoring effort based on historical, relevance, and code smell information, but the three-step filtration process filters out many of the relevant classes, and the design quality is not checked. Mkaouer et al.⁶ studied the scalability of the NSGA III algorithm for 15 different low-level design metrics and claimed theirs to be the first study using NSGA III. The approach uses static metric analyzers to detect the antipatterns, and no attempt is made to reduce the refactoring effort.

The above works did not take into account the developer’s context and preferences, and none of them attempted to estimate the refactoring effort or to reduce the search space. They also rely on static metric analyzers for the detection of antipatterns, and relevant code smells were found to be missed. Our approach is novel in that it uses graph algorithms, instead of simple metrics, to detect more antipatterns and design patterns, and it employs a filtered, cluster-based adapted NSGA III with reduced computational effort, more quality gain, and minimized refactoring effort to reduce technical debt.

Conclusion and future work

Our approach filters the relevant classes that have been refactored at least once, prioritizes them, and then computes the normalized rank score. The highest-ranked classes are then optimized using generic linkage-based clustering and NSGA III multi-objective optimization. The approach achieves a higher quality gain, lower refactoring effort, and a better CCR than MORE using NSGA II³ and high-dimensional NSGA III.⁶ We applied our approach to six open-source systems written in the Java programming language; the computational effort is also lower because only relevant classes are considered for optimization.

Our approach has two parts. The first dynamically recommends design patterns by using the concept of anti-design patterns; for this, we used a combined graph algorithm and metric analyzer implemented in our own program. The second part uses filtered search-based optimization. In future work, a semantic fitness function can also be incorporated to better ensure that the refactorings are meaningful.

Footnotes

Declaration of Conflicting Interests:

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Koenig. Patterns and antipatterns. J Object Oriented Program 1995; 8: 46–48.

2. Brown, McCormick, Mowbray, et al. AntiPatterns: refactoring software, architectures, and projects in crisis. New York, NY: Wiley, 1998.

3. Ouni, Kessentini, Sahraoui, et al. A multiobjective refactoring approach to introduce design patterns and fix anti-patterns. In: North American search based software engineering symposium, pp. 1–15, 2015. Oxford, UK: Sciverse ScienceDirect.

4. Fowler. Refactoring: improving the design of existing code. Boston, MA: Addison-Wesley, 1999.

5. Ouni, Kessentini, Bechikh, et al. Prioritizing code-smells correction tasks using chemical reaction optimization. Softw Qual J 2015; 23: 323–361.

6. Mkaouer, Kessentini, Bechikh, et al. High dimensional search-based software engineering: finding tradeoffs among 15 objectives for automating software refactoring using NSGA-III. In: Proceedings of the 2014 conference on Genetic and evolutionary computation, 2014. New York, NY: ACM.

7. Pressman RS. Software engineering: a practitioner’s approach. Basingstoke, UK: Palgrave Macmillan, 2005.

8. Marinescu. Assessing technical debt by identifying design flaws in software systems. IBM J Res Dev 2012; 56: 9:1–13.

9. Opdyke WF. Refactoring: A program restructuring aid in designing object-oriented application frameworks. PhD Thesis, University of Illinois at Urbana-Champaign, IL, 1992.

10. Sreeji, Lakshmi. Automated recommendations of useful design patterns and prioritized correction of antipatterns – a node centrality based approach. Int J Contr Theory Appl 2016; 9: 469–480.

11. Sreeji, Lakshmi. Dynamic understanding of software system by detecting Design patterns and Antipatterns. J Comput Theor Nanosci 2018; 15: 3044–3050.

12. Seada, Deb. U-NSGA-III: a unified evolutionary algorithm for single, multiple, and many-objective optimization. COIN Report Number 2014022.

13. Girba, Ducasse, Lanza. Yesterday’s Weather: guiding early reverse engineering efforts by summarizing the evolution of changes. In: 20th IEEE international conference on software maintenance 2004, pp. 40–49. Piscataway, NJ: IEEE.

14. Prete, Rachatasumrit, Sudan, et al. Template-based reconstruction of complex refactorings. In: Proceedings of the international conference on software maintenance (ICSM), Timisoara, Romania, 12–18 September 2010.

15. http://www.organic/

16. Murphy-Hill, Parnin, Black AP. How we refactor, and how we know it. IEEE Trans Software Eng 2012; 38: 5–18.

17. Harman. The current state and future of search based software engineering. In: Briand, Wolf (eds) Future of software engineering. Los Alamitos, CA: IEEE Computer Society Press, pp. 342–357, 2007.

18. Harman, Jones BF. Search-based software engineering. Inf Softw Technol 2001; 43: 833–839.

19. Harman, Tratt. Pareto optimal search based refactoring at the design level. In: Proceedings of the genetic and evolutionary computation conference (GECCO’07), 2007, pp. 1106–1113.

20. Harman, Mansouri, Zhang. Search-based software engineering: trends, techniques and applications. ACM Comput Surv 45: 61.

21. Seng, Stammel, Burkhart. Search-based determination of refactorings for improving the class structure of object-oriented systems. In: Proceedings of the genetic and evolutionary computation conference (GECCO’06), 2006, pp. 1909–1916.

22. Marinescu. Detection strategies: Metrics-based rules for detecting design flaws. In: Proceedings of the 20th international conference on software maintenance, 2004, pp. 350–359. Los Alamitos, CA: IEEE Computer Society Press.

23. Arcelli, Ferme, Zanoni, et al. Filtering and prioritising code smells detection, submitted to conference, 2014.

24. Arcoverde, Guimaraes, Macia, et al. Prioritization of code anomalies based on architecture sensitiveness. In: Software engineering (SBSE), 2013 27th Brazilian Symposium, IEEE, Brasilia, Brazil.

25. Steidl, Eder. Prioritizing maintainability defects based on refactoring recommendations. In: Proceedings of international conference on program comprehension, 2014, pp. 168–176.

26. Palomba, Bavota, Penta, et al. Mining Version histories for detecting code smells. IEEE Trans Softw Eng 2015; 41: 462–489.

27. Tsantalis, Chaikalis, Chatzigeorgiou. JDeodorant: Identification and removal of type checking bad smells. In: Proceedings of CSMR, 2008, pp. 329–331.

28. Choudhary, Singh. Minimizing refactoring effort through prioritization of historical, architectural and code smell information. In: 1st International workshop on technical debt analysis, 2016.
