Survey on association rule analysis: Exploration using mining analysis

Abstract

Association rule mining is a technique for discovering common patterns and correlations in data sets drawn from relational, transactional and other types of data repositories. Data mining deals with different types of patterns, such as frequent patterns, extended patterns and regular patterns. Much research has focused on finding frequent patterns, and very little work has been carried out on negative or rare patterns. It has also been observed that most algorithms handle only positively correlated (frequent) items, while very little attention is given to negatively correlated items. Negatively correlated items, also called infrequent items, are items whose occurrences negate each other. Items that do not satisfy the minimum threshold value are generally ignored by researchers. Mining negative associations is useful in business, for example for customer segmentation, as well as in risk management and in the medical field. The main aim of this paper is to provide a short overview of the research issues involved in finding positive and negative associations.

Keywords

Association rule mining, negative association mining, item sets, positive association mining, frequent pattern

1. Introduction

Data mining is the process of discovering patterns and relationships in very large datasets; two of its key purposes are to explore ever-larger databases and to improve market segmentation. Businesses can use data mining techniques to anticipate future trends and make better decisions. Association rule mining is applied to relational, transactional, and other types of databases to discover frequent patterns, correlations, associations, or causal structures. Using association rule mining algorithms, it is feasible to uncover hidden regularities in enormous datasets and to mine transaction databases for interesting relationships between groups of items [1]. The purpose of association rule mining is to discover interesting and strong associations between variables. A positive association refers to items that are present and observed together, for example items that have been bought or sold. When products bought individually or in combination are examined in market basket analysis, negative associations introduce a non-linear constraint that must be handled carefully. Clustering, classification, and association rule mining are examples of data mining approaches [2]. Association analysis also covers the process of combining positive item sets with field data to produce negative relationships. Association rule mining and frequent item set mining are two popular and widely studied data analysis techniques with a range of applications [3]. Negative associations relate to the absence of items: in contrast to positive rules, negative rules describe items or properties that are missing from the data [4]. Negative association rules therefore link items that are present with items that are absent.
Knowing which items are not bought when others are can be informative. Substantial research has been done on extracting association rules from transactional databases since the technique was introduced, making item set mining and association rule mining essential for understanding buying behavior [5, 6]. Market basket analysis uses association rule mining to learn about customers' purchasing behavior by examining what they put in their shopping carts, and it has been shown to increase revenue in several businesses [7]. An association rule can be characterized by its support and confidence parameters [8]. A negative correlation, which is also regarded as a negative association, exists when the values of one variable decrease as the values of the other increase. A transaction database with a very large number of items and infrequent pattern sets is a good candidate for negative association rule mining, since the approach identifies the problem and optimizes the rule generation process. A negative association rule states that a record containing a given linguistic term will not contain another linguistic term associated with it [9].
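
The support and confidence parameters mentioned above can be illustrated with a minimal Python sketch. It assumes the usual definitions (support as the fraction of transactions containing an item set, confidence of A $\to$ B as the support of A and B together divided by the support of A); the function names and the toy baskets are hypothetical and are not taken from any of the cited works.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated conditional frequency of `consequent` given `antecedent`."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    sup_a = support(a, transactions)
    if sup_a == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(both, transactions) / sup_a


# Hypothetical toy market-basket data (assumption, for illustration only).
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support({"bread", "milk"}, baskets))       # 2 of 4 baskets
print(confidence({"bread"}, {"milk"}, baskets))  # 2 of the 3 bread baskets
```

A rule is normally kept only if both values reach user-chosen minimum thresholds, which is why items with low support tend to be discarded by positive rule mining.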

Mining negative rules requires more time and space, but it is more accurate than standard association rule mining. An association rule mining algorithm scans the database transaction by transaction and compresses it into a more compact representation. The problem of mining association rules in a database has both statistical and non-statistical aspects. The goal is to find k-way correlations among the products bought by customers and to establish associations between those items. Most typical association pattern mining models use the frequency of item sets as the measure of association strength.

1.1 Types of association rule mining

Association rule mining is a method for finding correlations between items in a vast database of cases and for drawing inferences about how the presence of some subsets of items affects the presence of other subsets. It can be used to find patterns, correlations, associations, and causal structures across a broad variety of data repositories, including relational and transactional databases. Association rule learning is a popular and extensively studied technique for identifying interesting relationships between variables in large datasets. It relies on identifying frequent item sets in a large database, both to keep the algorithms scalable and to find the frequent sets that lead to valuable association rules [10]. Positive and negative association rules are the two basic types of association rules.

1.1.1 Positive association rule

Positive association rule mining extracts frequent items or item sets, but it may also discard many valuable items or item sets because of their low support. These infrequent items or item sets can give rise to many negative association rules, whereas a strong positive association describes a positive relationship between two item sets [11]. A positive association rule takes the form A $\to$ B, and association rule mining has been applied in diverse fields ranging from web usage mining to medical diagnosis and from business database intelligence to geographical information systems [12]. Positive association rule mining is used when mining a large database for relevant information and determining the support and confidence of the rules. Earlier studies have shown the accuracy of association rules in dealing with vast volumes of unstructured data, and visualization can be very beneficial to association rule mining by providing users with meaningful visual representations instead of poorly intelligible textual lists [13]. Positive association rules reveal items that are positively associated, whereas negative association rules reveal items that are negatively correlated [14].

1.1.2 Negative association rule

A negative association corresponds to a negative correlation, in which the values of one variable decrease as the values of the other increase. The relationship is positive when the lift value is greater than 1, negative when the lift value is less than 1, and neutral when the lift value is equal to 1. Genetic operators and fitness function assignments have been used to generate negative association rules in databases. Most quantitative association rule mining techniques pay little attention to negative dependencies [15, 16]. Algorithms for constructing negative association rules have been developed using a variety of approaches. Negative rules are difficult to discover because the number of potential negative rules is very large. This can lead to millions of negative rules that a manager would not find helpful, so the goal is to discover only the interesting negative association rules that managers can act on [17].
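
The lift criterion described above can be sketched in Python as follows. The sketch assumes the common definition of lift as the support of A and B together divided by the product of their individual supports; the helper names, the tolerance value and the toy baskets are hypothetical choices made for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def lift(a_items, b_items, transactions):
    """Lift of the rule A -> B: sup(A and B) / (sup(A) * sup(B))."""
    a, b = frozenset(a_items), frozenset(b_items)
    denominator = support(a, transactions) * support(b, transactions)
    if denominator == 0:
        raise ValueError("lift is undefined when A or B never occurs")
    return support(a | b, transactions) / denominator


def classify_lift(value, tol=1e-9):
    """Map a lift value to the three cases named in the text."""
    if value > 1 + tol:
        return "positive"
    if value < 1 - tol:
        return "negative"
    return "neutral"


# Hypothetical toy baskets (assumption, for illustration only).
baskets = [
    frozenset({"tea", "sugar"}),
    frozenset({"tea", "sugar"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"coffee"}),
    frozenset({"tea"}),
]

print(classify_lift(lift({"tea"}, {"sugar"}, baskets)))
print(classify_lift(lift({"tea"}, {"coffee"}, baskets)))
```

In this toy data tea and coffee never appear together, so their lift falls below 1 and the pair is a candidate for a negative rule such as "tea implies no coffee".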

1.2 Application of association rule mining

Association rule mining can be applied to data sets stored in relational or transactional databases, as well as in other types of data repositories, to look for common patterns, relationships, associations, and causal structures. Such analysis can reveal commonalities among datasets. Association rule mining is a powerful technique for uncovering hidden links and connections within massive datasets. An association rule indicates how frequently a certain item occurs across transactions. Association rule mining can help to better understand the risk of cardiovascular disease in men and women [18]. Association rules are important in data mining for consumer behavior research and forecasting. Association rule mining can reveal previously unknown patterns of purchasing behavior that can be leveraged to design more effective marketing campaigns. Clinical diagnosis can benefit from soft set-based association rule mining, and association model assessment is now being used to discover characteristics that could improve the health of patients with schistosomiasis [19].

1.3 Data mining

The purpose of data mining is to discover patterns in large datasets that are both useful and interesting. The Apriori technique, for instance, is used to find groups of items (symbols) that frequently occur together in a database of customer transactions. Pattern mining techniques have grown more popular because they uncover patterns in big datasets that people can understand and use for evaluating data and making decisions. Frequent item sets are the basis of pattern mining, and they became popular because of their wide range of uses.
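
The level-wise idea behind the Apriori technique can be sketched in Python as below. This is a simplified illustration rather than an optimized or authoritative implementation; the function name `apriori`, the toy baskets and the 0.5 threshold are assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frequent item set: support}, built level by level."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the threshold.
        level = {}
        for candidate in current:
            sup = sum(1 for t in transactions if candidate <= t) / n
            if sup >= min_support:
                level[candidate] = sup
        frequent.update(level)
        k += 1
        # Join step: merge frequent sets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy baskets (assumption, for illustration only).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

for itemset, sup in sorted(apriori(baskets, 0.5).items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search tractable: a candidate is dropped as soon as one of its subsets is known to be infrequent, so it is never counted against the database.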

Association rule mining (ARM) is a pattern detection method that does not take the sequential order of occurrences into consideration. These algorithms are therefore unsuitable for data that contains temporal or sequential ordering information: if the sequential link between events or objects is ignored, important patterns in the data might be missed, or patterns of limited use might be uncovered. In terms of the development of new computational and algorithmic techniques, frequent pattern mining (FPM) is one of the data mining challenges that has received the most attention. Apart from frequent patterns, there are extended patterns, such as rare patterns and negative patterns, which aim to extract more insightful and meaningful information from the data [20]. Feature selection is also a critical step in association mining, aimed at identifying the most relevant attributes or variables that contribute significantly to the patterns or associations within the dataset [21]. Association mining can also be combined with opinion mining to uncover patterns or associations within opinions or sentiments expressed in text [22].

2. Review of literature

The following review covers work on improving association mining with a range of algorithms. The findings of several researchers are summarized below.

Subrata Datta and Kalyani Mali (2021) [23] provide a novel framework for significant association rule mining with high associability, built upon the concept of flexible dissociation. Rule pruning is a critical step in this framework: discovered association rules are refined by leveraging the flexible dissociation concept while ensuring a certain level of confidence in the association. Pruning helps eliminate redundant or less meaningful rules, so that only the most relevant and significant ones are retained. The paper introduces the SARMHA (Significant Association Rule Mining with High Associability) framework, which is based on support, confidence and flexible dissociation.

Jabbour and his colleagues (2018) [24] noted that one of the most basic issues in data mining is how to use methods such as association rule discovery to find patterns in transactions, with absent items treated as contraindications. Negative associations are mined as completely as possible, a measure such as lift or conviction is commonly used to restrict the set of extracted association rules, and a satisfiability (SAT) approach is used. The results show that the non-linear constraints require a dedicated technique for pruning the search space, and the efficiency of this technique is examined.

I. Berin Jeba Jingle and J. Jeya A. Celin (2017) [25] proposed that efficient positive and negative association rules can be mined with the help of an artificial bee colony (ABC) algorithm. Frequent and infrequent item sets are generated using the Apriori multilevel and multi-support algorithm (Apriori_AMLMS); from the infrequent item sets, PAR and NPAR are derived, which become the input for the optimized ABC algorithm. A generating positive and negative association rule (GPNAR) algorithm is used, with parameters such as comprehensibility, time, support and confidence.

Peddi Kishor and Dr. Sammulal Porika (2016) [26] found that association rule mining is essential for finding patterns that relate different components of large databases. Standard association rules only consider items that are present in the transactions of the dataset. In this paper, the Yule's correlation coefficient formula is used to generate negative rules without first finding the infrequent item sets, which avoids scanning the database again.
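
To make the correlation-based idea concrete, the Python sketch below computes Yule's Q coefficient from a 2x2 contingency table of item presence and absence. It assumes that the coefficient meant is the standard Yule's Q, (n11*n00 - n10*n01) / (n11*n00 + n10*n01); the exact formulation and thresholds used in [26] are not given in this survey, and the function name and toy baskets are hypothetical.

```python
def yules_q(a_items, b_items, transactions):
    """Yule's Q for item sets A and B; ranges from -1 to +1."""
    a, b = frozenset(a_items), frozenset(b_items)
    n11 = n10 = n01 = n00 = 0  # both, A only, B only, neither
    for t in transactions:
        has_a, has_b = a <= t, b <= t
        if has_a and has_b:
            n11 += 1
        elif has_a:
            n10 += 1
        elif has_b:
            n01 += 1
        else:
            n00 += 1
    denominator = n11 * n00 + n10 * n01
    if denominator == 0:
        return 0.0  # degenerate table: treated as no association (assumption)
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical toy baskets (assumption, for illustration only).
baskets = [
    frozenset({"tea", "sugar"}),
    frozenset({"tea", "sugar"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"coffee"}),
    frozenset({"tea"}),
]

print(yules_q({"tea"}, {"coffee"}, baskets))  # strongly negative
print(yules_q({"tea"}, {"sugar"}, baskets))   # mildly positive
```

A value near -1 suggests a negative rule between the two item sets, and the four counts come from a single pass over the transactions.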

Youcef Djenouri and his co-workers (2014) [27] investigate the extraction of meta-rules as a means to effectively prune irrelevant rules in association rule mining. The emphasis lies on clustering association rules, specifically for handling large datasets. The k-Means algorithm is used to identify independent clusters, and the interdependencies among rules within the same cluster are identified using a meta-rules algorithm. A pruning strategy is then applied, which reduces redundancy and retains the most effective rules for each cluster.

Jose A. Diaz-Garcia and his team (2022) [28] presented a brief overview of applications of association rules in the field of social media. The paper also describes the strengths and weaknesses of using association rules to solve various tasks on textual social media. Because social media mining deals with large amounts of unstructured data, the focus has shifted to unsupervised techniques, which do not require labelled datasets.

Chandrasekar Ravi and Neelu Khare (2014) [29] presented EO-ARM as an efficient algorithm for finding positive and negative association rules by scanning the database only once, irrespective of the size of the dataset. It identifies the frequent item sets using a two-dimensional matrix that resembles a K-map. The algorithm was implemented in MATLAB and compared with the standard Apriori algorithm. The results show that, with the proposed correlation measure, EO-ARM is more efficient than standard Apriori in terms of execution time and the number of rules created.

John Tsiligaridis (2013) [30] described a set of strategies for mining both positive and negative association rules in databases, with a focus on the negative ones. The classical Apriori association rule method is adapted to identify two types of negative association rules (NAR): constrained negative association rules (CNR) and generalized negative association rules (GNAR). The results include a series of formulae that depend on the tree's height. Binary tree rules construction (BTRC) was developed for GNAR and can compute negative rules without creating any additional tuples.

Weimin Ouyang (2013) [31] noted that association rule mining is an important approach in data mining. The Mining Positive and Negative Association Rules over Sliding Window (MPNAR-SW) technique is used in market basket analysis to identify goods that are either complementary or conflicting. Experiments were carried out to find negative associations over data streams.

Sikha Bagui and Probhal Chandra Dhar (2019) [32] presented a Hadoop implementation of the Apriori algorithm to mine both positive and negative associations in big data. MapReduce jobs are used to find the frequent item sets in the database. The results showed that more negative associations than positive associations were found, and that different datasets achieved their best runtime performance with different numbers of slave nodes and different block sizes.

Thanh-Long Nguyen and his associates (2017) [33] proposed the CP-Miner algorithm for mining colossal patterns. CP-Miner mines colossal patterns using the CP-tree and pre-processing methods to reduce the search space. The CP-Miner, PCP-Miner and BVBUC algorithms were applied to compute the patterns of nodes and to prune nodes without loss of information when mining colossal patterns.

NVS Pavan Kumar and his co-workers [34] studied negative associations across all distributed locations in order to determine the regular or global nature of patterns, which can also be helpful for business operations. Organizations working in a global market sometimes need to carry out data mining on distributed data sources, which can be either homogeneous or heterogeneous; mining such distributed databases requires some of the databases to be processed in parallel.

Rakesh Duggirala and P. Narayana [35] proposed a method by which association rules can be mined from coherent rules. These coherent rules require only knowledge of propositional logic. The coherent rules were analysed, and the rules derived from them were compared with those of other algorithms. With this method, positive and negative association rules can be derived without knowing a support threshold value.

E. Bala Krishna, B. Rama and A. Nagaraju [36] proposed a new improved FP-tree (IFP-Tree) and a frequent sequence mining algorithm (FISM) to mine negative association rules. The proposed approach produced useful and valid negative association rules and is reported to be more capable than existing FP-Growth based algorithms. The improved algorithm was implemented without generating candidate item sets.

Hetal Jadhav and Kinjal Thakar [37] proposed a modified Apriori algorithm that finds negatively associated data items within a realistic execution time. The main aim of the modified Apriori algorithm was to reduce space and time complexity when finding negative associations in large databases in a distributed environment.

Sajid Mahmood and his colleagues [38] describe a methodology for finding frequent and infrequent item sets and generating association rules based on these item sets. The experiments were done on a medical blog dataset. The positive and negative associations identified can help doctors reach a conclusion about the presence or absence of a particular disease.

K. S. Ranjith and A. Geetha Mary [39] proposed a privacy-preserving data mining technique for mining negative association rules in spatiotemporal databases. Privacy has become an important factor in mining association rules. The paper also focuses on a cryptographic encryption technique in which the partial support for all distributed sites is calculated.

Bemarisika Parfait and her co-workers [40] proposed an efficient method for counting support, called reduction-access-database, and introduced a new technique for generating association rules, called reduction-rules-space. Different problems for positive and negative association rules were studied in the context of big data.

Bowei Wang and his co-workers [41] proposed discovering association rules with an FP-Growth based algorithm, PNFP-Growth, which mines both positive and negative frequent patterns. PNFP-Growth aims to enhance the quality of pattern discovery by considering negated items in the analysis process. Experiments were carried out on a public dataset from the UCI Machine Learning Repository.

Toung-Long Nguyen and his co-workers [42] proposed a method to extract colossal patterns with length constraints, and also attempted to reduce the number of candidates during the extraction process. The problem of minimum and maximum length constraints was also highlighted in this study. The length constraint for colossal pattern (LCCP) algorithm was compared with existing algorithms such as PCP-Miner-POST1 and PCP-Miner-POST2 in order to demonstrate its effectiveness. The algorithm helps to determine whether colossal patterns satisfy the length constraints. Candidates that do not help to improve the mining time and do not satisfy the length constraints are eliminated by the algorithm.

Akbar Telikani and his colleagues [43] surveyed the emerging research on evolutionary computation for association rule mining (ARM). Applications of different types of ARM approaches were discussed, such as market basket analysis, recommendation systems and computer networks. The paper also discusses the different types of patterns extracted from large datasets in ARM. A statistical analysis of different ARM approaches, along with their evolution, is also presented.

Ping Qiu and his co-workers [44] addressed the task of determining whether a data sequence contains a negative sequence, and proposed the NegI-NSP algorithm to calculate the support of negative sequences. In this work, two loose constraints were introduced: a frequency constraint and a 1-length-neg element format constraint. Several experiments were carried out on synthetic and real datasets to compare the performance of NegI-NSP with the baseline approach e-NSP.

Mikhail Moshkov and his colleagues [45] proposed a method to reduce the study of association rules for a dispersed set of information systems with equal sets of attributes to the study of association rules for a single information system. The work also shows how a joint information system can be built in polynomial time. The study of association rules was carried out on transaction data.

B. Rini Rathan and Dr. K. Swarupa Rani [46] implemented a MapReduce model for mining frequent patterns from uncertain data, which can be expressed in terms of probabilities. Different datasets were considered for mining these frequent patterns. To improve efficiency, the MR-Growth algorithm is modified and the data is represented in the form of a compact tree. The paper also compares sequential and MapReduce strategies for uncertain data mining. The implementation was carried out on the Java platform.

Moksha Shridhar and Mahesh Parmar [47] presented an efficient mining-based algorithm for rule generation. In this study, the Apriori algorithm is used in order to improve precision and recall. Execution times were measured with different support and confidence parameters, and Apriori was compared with regression on the highest execution time. The paper also carries out a simulation of Apriori with regression in the .NET framework. The main focus of this work was to locate associations within the item sets. The work was carried out on partitions of the dataset rather than the full dataset, which reduces the execution time.

Anindita Borah and Bhabesh Nath [48] provided a concise overview of mining rare patterns from synthetic and real datasets. The study also gives an extensive review of previous work on rare pattern mining and compares frequent and rare pattern mining. It was observed that, compared with frequent pattern mining techniques, only limited attempts have been made to mine rare patterns from data streams, graph databases and sequential patterns. The significance of rare patterns in various domains is also recognized.

Table 1
Summary of the reviewed literature

Author name	Algorithm used	Parameters and datasets	Result
Subrata Datta and Kalyani Mali (2021) [23]	SARMHA and Apriori	minconf thresholds $=$ 0.5, 0.6 and 0.7 for the Chess dataset and minconf $=$ 0.3, 0.4 and 0.5 for the Skin dataset; minsup and maxdisso are 0.7 and 0.5 for the Chess dataset and 0.1 and 0.3 for the Skin dataset	Results showed that SARMHA consumes less memory and takes less execution time than the Apriori algorithm.
Jabbour and his colleagues (2018) [24]	DPLL-NAR and DPLL-NAR C backtracking search algorithm	Support value $=$ 812 and minimum confidence value $=$ 80%	Result proves that set of negative rules using conviction measure was limited as compared to one using without conviction.
I. Berin Jeba Jingle and J. Jeya A. Celin (2017) [25]	Apriori_AMLMS based Advance ABC algorithm	Different min_sup values such as 0.2, 0.5, 0.7 and 0.1 and min_conf values such as 0.6, 0.65, 0.7 and 0.8 were used	As the support value is increased, the number of frequent item sets generated decreases, and the time to generate FIS and InFIS also decreases.
Peddi Kishor and Dr. Sammulal Porika (2016) [26]	PNARYCC algorithm: Positive Negative association rule Yules Correlation coefficient	Minimum support of 45%,55% and 65% and minimum confidence of 45%,55% and 65% is used, one parameter varied other remains constant	PNARY algorithm generated more positive and negative association rules when compared to ML-Rules.
Youcef Djenouri and his co-workers (2014) [27]	Meta-rules algorithm and K-means algorithm	Metrics such as lift, leverage, conviction and coverage are used. Different similarity measures are also used, Experiments encompassing clustering, meta-rule extraction, and pruning steps conducted on IBM Quest dataset	Both mining methods were successful to prune the irrelevant rules and keeping only necessary rules in the cluster.
Jose A. Diaz-Garcia and his team (2022) [28]	Survey is carried out on association rules in textual social media	Various application areas where association rules along with unsupervised techniques were discussed.	Extraction of user generated text from social media were carried out.
Chandrasekar Ravi and Neelu Khare (2014) [29]	EO-ARM and Apriori	Number of rules generated with 5, 10, 15 and 20 attributes, keeping min_conf at 0.5, 0.55 and 0.65 and min_sup at 0.25, 0.3 and 0.35	Execution time of EO-ARM is less than that of the Apriori algorithm, thus increasing efficiency. Also, nearly 40-50% of the rules were pruned, thus optimization is achieved.
John Tsiligaridis (2013) [30]	BTRC and MPNAR	Different support, correlation and conviction measure is used	Results showed that BTRC gives more negative rules without additional pass of tuples when compared with MPNAR algorithm.
Weimin Ouyang (2013) [31]	MPNAR-SW	Different window sizes of 200K and 1000K transactions. Min_sup threshold $=$ 20,000 and other support threshold values set to 0.1%, 0.2% and 50%	Total sliding window time and pattern mining time were reduced, and thus MPNAR proves to be scalable and efficient.
Sikha Bagui and Probhal Chandra Dhar (2019) [32]	Hadoop implementation of Apriori algorithm	Different support and confidence levels are used in one case and in another case confidence is only kept constant,25 slave nodes with block size of 25 MB was used	Result showed that more negative rules were generated when we use variable support and confidence values.
Thanh-Long Nguyen and his colleagues (2017) [33]	PCP-Miner, CP-Miner and BVBUC	Experiments were carried out on different transactions such as 50, 100 and 150.5 and different databases were used	It was observed that PCP-Miner proves to be more efficient than CP-Miner and BVBUC; BVBUC is also much slower than the other two techniques. The search space for BVBUC is very large.
NVS Pavan Kumar and his co-workers [34]	Generation of Regular, Negative and Frequent item sets	Metrics such as support count, max regularity and minimum frequency; different transactions of the IBM dataset	Results showed that negative associations among frequent and regular patterns can be obtained at local and global locations.
Rakesh Duggirala and P. Narayana (2013) [35]	New Coherent rule algorithm	Min_sup $=$ 0.6, min_conf $>$ 0.9, number of cycles performed $=$ 8, NetBeans IDE	Negative association rules could be found based on the implication of propositional logic.
E Bala Krishna B. Rama and A. Nagaraju (2014) [36]	IFP-Tree, FISM	Different confidence and support parameters are used	Negative association rules are generated from frequent item sets which were generated by FISM algorithm.
Hetal Jadhav and Kinjal Thakar (2018) [37]	Modified Apriori	Experiments were done on Bakery dataset and carried out using NetBeans IDE and Weka tool, support and confidence parameters used	It was observed that space and time complexity was reduced and the proposed algorithm was carried out.
Sajid Mahmood and his colleagues (2014) [38]	PNARM	Different support and confidence values are taken to generate rules; experiments are carried out on medical blog datasets, total number of blogs $=$ 1926; Java platform is used	Results showed the negative association among frequent item sets and positive association among infrequent item sets.
K.S. Ranjith and A. Geetha Mary (2020) [39]	DFPGA	Threshold support $=$ 40% Negative support, confidence and partial support are used to calculate the actual support taken on spatiotemporal dataset, cryptographic id calculated based on support count	Algorithm proved to be best at mining the association rules with help of cryptographic technique.
Bemarisika Parfait and her co-workers (2018) [40]	ERAPN	Number of rules generated based on threshold values such as minsup $=$ 1% and minmgk $=$ 60%. Adult, German, Income and Iris datasets are used	Results showed that as the threshold value decreases, the number of rules generated increases. Execution time of ERAPN is much faster than that of the other two algorithms.
Bowei Wang and his co-workers (2017) [41]	PNFP-Growth method	STATLOG heart dataset was used, minsup values is used which determines the frequent patterns or negative patterns	Results showed that PNFP algorithm was better used to generate frequent patterns as compared to simple FP growth method.
Tuong Long and his co-workers (2021) [42]	LCCP and PCP miner algorithm	Min-length $=$ 20 to 26, min_sup and max-length $=$ 35 parameters are used; different numbers of transactions are run on different threshold values	When the min_length constraint is increased, the runtime of LCCP decreases significantly. The runtime of the PCP miner algorithm also decreases significantly.
Akbar Telikani and his colleagues (2020) [43]	Association rule mining (ARM)	Comprehensive survey of evolutionary computation approaches for the association rule mining problem	Several pros and cons of evolutionary ARM methods in terms of objective, encoding, search mechanism and parallelization were discussed.
Ping Qiu and his co-workers (2017) [44]	NegI-NSP (Negative Sequential pattern)	Algorithms were implemented in Eclipse; minimum support factor is calculated in terms of percentage based on frequency of patterns compared to sequence in database	Experiments revealed that number of NSP mined by NegI-NSP is more than that of single NSP and the average runtime of single NSC for NegI-NSP algorithm is higher than that for single NSP.
Mikhail Moshkov and his colleagues (2022) [45]	Apriori, Éclat, Greedy heuristics dynamic programming	Finite sets of information system, set of attributes, support and confidence parameters are also used	Result proved how to reduce the problem of study of common association rules for a dispersed set of information systems with equal sets of attributes.
B. Rini Rathan and Dr. K. Swarupa Rani [46]	Map-Reduce (MR-Growth)	Experiments were carried out on synthetic datasets; the number of nodes is taken into consideration when forming a tree	Results show that the MR-PUF-Growth algorithm was efficient in terms of both time and space. MapReduce implementations were also better compared to sequential patterns.
Moksha Shridhar and Mahesh Parmar [47]	Apriori with Regression	Simulation was carried out in the .NET framework; different support and confidence values were used	Results indicated that combining Apriori with regression produces the output in less execution time than the Apriori algorithm alone.
Anindita Borah and Bhabesh Nath [48]	MS-Apriori to extract rare item sets and Apriori to derive frequent item sets	Minsup values range from 20% to 40%. Experiments were carried out on synthetic and real datasets such as mushroom and retail	The significance of rare patterns in a wide range of application areas was discussed.
Tianyou Wei and his colleagues (2020) [49]	frequent closed high-utility item set mining (FCHUIM)	Minsup $=$ 0 and a minutil threshold value are used; experiments were implemented in Java on the Kosarak, Chainstore, Accidents, Chess and Retail databases	Results showed that FCHUIM reduced memory consumption compared to other algorithms. Its pruning strategy and precheck method also ensure that fewer candidates are generated.
Saad M Darwish and his co-workers (2022) [50]	Apriori and genetic Tabu algorithms	Privacy preserving data mining techniques are used; minsup $=$ 25% and minconf $=$ 58%; Minsup-conf is set at 60%, 70%, and 80% for 500, 1000, 2000, 3500, and 5000 transactions; different datasets such as heart disease, breast cancer, iris, adult and diabetes are used	The integration of Apriori and the genetic Tabu search method was successful in finding negative association rules based on negative interestingness measures.
Rashed Salem and his colleagues (2018) [51]	Fast updated frequent pattern tree (FUFP-Tree)	Different sliding window sizes are used; number of transactions $=$ 88163; the retail market basket dataset is used	FUFP-Tree is very effective in terms of memory storage when finding correct frequent patterns, and runtime is also reduced. Results also show that with more sliding windows, more incoming streaming data can be processed in the same time.
Animesh Paul (2016) [52]	Traditional Apriori and PNAR algorithm	msup $=$ 0.2; the maximum correlation coefficient for positive rules is 0.25 and the maximum negative correlation coefficient is $-$0.44; a dual confidence approach is also used	The PNAR algorithm proves to be better at pruning the search space and thus improves efficiency.
Bharadwaj Mudumba and Md Faisal Kabir (2024) [53]	Mine-first association rule mining along with global and local ARM	Different threshold values are used for local and global ARM. The confidence value remained consistent across all approaches	The mine-first approach proved better in efficiency and memory utilization than the integrated-first method on distributed datasets.

Tianyou Wei and his colleagues [49] presented an algorithm for mining frequent closed high-utility item sets (FCHUIM) that reduces the search space. Experiments were carried out on synthetic and real datasets, and a precheck method and a nested list structure were also introduced. FCHUIM eliminates the need to scan the database and provides a summarized list structure for retrieving the utility list.

Saad M. Darwish and his co-workers [50] focused on implementing data mining techniques in the domain of health care. The study investigates metaheuristic-based data cleaning for healthcare data in order to keep patient privacy safe. Its main purpose was to protect patients' data from being revealed in an unauthorized manner. Several privacy-preserving data mining (PPDM) techniques were applied to keep patient information hidden. The proposed method uses negative interestingness to explain the success of negative association rules.

Rashed Salem and his colleagues (2018) [51] proposed a technique to mine association rules from streaming data. The technique uses the FUFP-Tree algorithm, which reduces the number of traversals between tree nodes while extracting association rules between items. In this study, a sliding window approach is also used to divide incoming data equally among all available windows in order to avoid congestion.
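To make the sliding window idea concrete, the following is a minimal sketch of a count-based sliding window over a transaction stream. It is not the FUFP-Tree algorithm of [51]: the class name `SlidingWindowCounter`, its methods, and the choice to track only single-item counts are assumptions made for illustration.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Keeps item counts for the most recent `window_size` transactions.

    Hypothetical helper for illustration only; it tracks single items,
    not a frequent pattern tree.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()    # transactions currently inside the window
        self.counts = Counter()  # item -> number of window transactions containing it

    def add(self, transaction):
        """Insert a new transaction and evict the oldest one if the window is full."""
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(expired)
            # Remove items whose count dropped to zero so the table stays small.
            for item in expired:
                if self.counts[item] == 0:
                    del self.counts[item]

    def frequent_items(self, min_sup):
        """Return items whose support within the current window is at least `min_sup`."""
        n = len(self.window)
        if n == 0:
            return set()
        return {item for item, c in self.counts.items() if c / n >= min_sup}


# Usage: only the three most recent transactions contribute to the counts.
stream = [["a", "b"], ["a"], ["b", "c"], ["c"]]
counter = SlidingWindowCounter(window_size=3)
for t in stream:
    counter.add(t)
print(counter.frequent_items(min_sup=0.6))
```

Because counts are updated incrementally on arrival and eviction, the window never has to be rescanned, which is the general motivation for window-based stream mining.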

Animesh Paul [52] used the positive and negative association rules (PNAR) algorithm, which generates association rules with the help of a correlation measure and a dual confidence approach, reducing the number of redundant and contradictory rules.
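As a rough illustration of how a correlation measure can separate positive from negative rule candidates, the sketch below computes the phi correlation coefficient between two items over a small, made-up transaction list. The function names, the sample transactions, the threshold value, and the choice of the phi coefficient are all assumptions for illustration; they are not taken from [52], and the dual confidence step is not modelled.

```python
from math import sqrt


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def phi_correlation(transactions, a, b):
    """Phi coefficient between the presence of item `a` and item `b`.

    Values near +1 suggest the items co-occur, values near -1 suggest
    that one tends to be absent when the other is present.
    """
    s_a = support(transactions, [a])
    s_b = support(transactions, [b])
    s_ab = support(transactions, [a, b])
    denom = sqrt(s_a * (1 - s_a) * s_b * (1 - s_b))
    if denom == 0:
        return 0.0  # an item present in all or no transactions carries no signal
    return (s_ab - s_a * s_b) / denom


def rule_polarity(transactions, a, b, threshold=0.2):
    """Label the pair as a positive, negative, or independent candidate.

    `threshold` is an illustrative cut-off, not a value from the paper.
    """
    rho = phi_correlation(transactions, a, b)
    if rho >= threshold:
        return "positive"
    if rho <= -threshold:
        return "negative"
    return "independent"


# Usage on made-up data.
transactions = [["bread", "butter"], ["bread", "butter"], ["tea"], ["coffee"]]
print(rule_polarity(transactions, "bread", "butter"))
print(rule_polarity(transactions, "tea", "coffee"))
```

Pairs labelled as independent can be discarded before any rule is generated, which is one way a correlation test prunes the search space.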

Bharadwaj Mudumba and Md Faisal Kabir [53] collected data from varied sources; local association rules were discovered by mining each source, and those rules were then integrated through global association rule mining. Class-based association rule mining is used where the dataset contains multiple classes or categories and the aim is to understand how different attributes are associated within each class separately.

Many authors have used these techniques and presented their findings, as summarized in Table 1.

3. Challenges and optimization of association rule mining

Computational cost and complexity are among the main challenges in association rule mining. Support and confidence thresholding, pruning techniques, top-K rule mining, feature selection and other techniques can be used to manage the complexity and computational cost of large-scale association tasks. The generation of a large number of rules poses another challenge for analysis and interpretation. To cope with this, clustering and classification methods can be applied to the generated rules. Association rules can also be optimized by reducing the search space of the generated rules and by applying pruning techniques. Such optimization strategies can enhance the efficiency of association rule mining and help generate different or extended patterns from the data.
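The thresholding, pruning and top-K ideas mentioned above can be sketched for the simplest case of rules between two single items. The function name `mine_pair_rules`, its default thresholds, and the sample transactions are hypothetical and chosen only for illustration; real miners such as Apriori extend the same pruning idea to larger item sets.

```python
from heapq import nlargest
from itertools import combinations


def mine_pair_rules(transactions, min_sup=0.4, min_conf=0.6, top_k=3):
    """Return the top_k rules x -> y (single items) ranked by confidence.

    Each rule is a tuple (x, y, support, confidence).
    """
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]

    # Count how many transactions contain each item.
    item_count = {}
    for t in sets:
        for item in t:
            item_count[item] = item_count.get(item, 0) + 1

    # Support-based pruning: an infrequent item cannot be part of a frequent pair.
    frequent = sorted(i for i, c in item_count.items() if c / n >= min_sup)

    rules = []
    for a, b in combinations(frequent, 2):
        pair_count = sum(1 for t in sets if a in t and b in t)
        sup = pair_count / n
        if sup < min_sup:
            continue  # support threshold
        for x, y in ((a, b), (b, a)):
            conf = pair_count / item_count[x]
            if conf >= min_conf:  # confidence threshold
                rules.append((x, y, sup, conf))

    # Top-K selection keeps the rule set small enough to interpret.
    return nlargest(top_k, rules, key=lambda r: (r[3], r[2]))


# Usage on made-up market basket data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["milk"],
    ["bread", "butter"],
]
for x, y, sup, conf in mine_pair_rules(baskets, top_k=2):
    print(f"{x} -> {y}: support={sup:.2f}, confidence={conf:.2f}")
```

Raising `min_sup` or lowering `top_k` trades completeness for a smaller search space and a shorter rule list, which is the balance the techniques above are meant to control.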

4. Conclusion and future scope

Association analysis is the act of merging positive item sets with field data to develop negative links. The goal of this process is to construct association rules, that is, the rules that govern how associations are formed. A mining technique is required to improve the quality of negative associations. Pattern mining research has focused on frequent patterns over the years and neglected rare or negative ones. This study reviews different algorithms that have been used for mining positive and negative associations. In future, some data mining techniques could be applied, or some algorithms integrated, on rarely used extended patterns in order to obtain negative association items that support better decision-making in marketing strategies. Researchers have conducted an extensive investigation of the variety of algorithms and approaches now available for pattern mining. Future work aims to include the context of the characteristics in order to improve the quality and effectiveness of the resulting association rules. Tools for mining negative association rules could be added to the source as further research.

References

1. Zhang, Wang and Zhang, Privacy preserving association rule mining: Taxonomy, techniques and metrics, IEEE Access (2019).

2. Kumar and Toshniwal, Analysing road accident data using association rule mining, International Conference on Computing, Communication and Security (2015), 1–6.

3. et al., Privacy-preserving-outsourced association rule mining on vertically partitioned databases, IEEE Transactions on Information Forensics and Security 11(8) (2016), 1847–1861.

4. Naredi and Deshmukh R.A., Improved extraction of quantitative rules using best M positive negative association rules algorithm, IEEE International Conference on Electronics, Computing and Communication Technologies (2015), 1–5.

5. Nijssen and Raedt L.D., k-pattern set mining under constraints, IEEE Trans Knowl Data Eng 25 (2013), 402–418.

6. Cheng C.-W. et al., icuARM-An ICU clinical decision support system using association rule mining, IEEE Journal of Translational Engineering in Health and Medicine 1(1), 122–131.

7. Hossain, Sattar A.H.M.S. and Paul M.K., Market basket analysis using Apriori and FP growth algorithm, International Conference on Computer and Information Technology (2019), 1–6.

8. Tian, Brand marketing leveraging the advantage of emoji pack relying on association rule algorithm in data mining technology, Computational Intelligence and Neuroscience (2022).

9. Gosain and Bhugra, A comprehensive survey of association rules on quantitative data in data mining, IEEE Conference on Information and Communication Technologies (2013), 1003–1008.

10. Bose and Datta, Frequent pattern generation in association rule mining using weighted support, Third International Conference on Computer, Communication, Control and Information Technology (2015), 1–5.

11. Suryawansi and Jain, A review of negative and positive association rule mining with multiple constraints and correlation factor, International Journal of Emerging Technology and Advanced Engineering 2(12) (2012), 778–781.

12. Agarwal and Nanavati, Association rule mining using hybrid GA-PSO for multi-objective optimization, IEEE International Conference on Computational Intelligence and Computing Research (2016), 1–7.

13. Said Z.B. et al., Visualization of association rules based on a molecular representation, IEEE International Conference on Information Visualisation (2013), 577–581.

14. et al., Mining causal association rules, IEEE International Conference on Data Mining Workshops (2013), 114–123.

15. Martin et al., A new multiobjective evolutionary algorithm for mining a reduced set of interesting positive and negative quantitative association rules, IEEE Transactions on Evolutionary Computation 18(1) (2013), 54–69.

16. Karthikeyan and Ravikumar, A survey on association rule mining, International Journal of Advanced Research in Computer and Communication Engineering 3(1) (2014), 5223–5227.

17. Sethi and Shekar, Issues in negative association rule mining with business analytics perspectives, International Journal of Business 11(20) (2017), 13–20.

18. Nahar et al., Association rule mining to detect factors which contribute to heart disease in males and females, Expert Systems with Applications 40(4) (2013), 1086–1093.

19. Ali et al., Detection of schistosomiasis factors using association rule mining, IEEE Access 7 (2019), 186108–186114.

20. Asmat, Basic pattern mining: a review on techniques and applications, International Journal of Scientific and Engineering Research 9(5) (2018), 31–44.

21. Salmi and Atif, A Data Mining Approach for Imbalanced Automobile Insurance Fraud Data with Evaluation of Two Sampling Techniques and Two Filters, Journal of Information Assurance and Security 17 (2022), 122–135.

22. Kumar and Abraham, Opinion mining to assist user acceptance testing for open-beta versions, Journal of Information Assurance and Security 12 (2017), 146–153.

23. Datta and Mali, Significant association rule mining with high associability, IEEE International Conference on Intelligent Computing and Control Systems (2021), 1159–1164.

24. Jabbour, El Mazouri F.E. and Sais, Mining negatives association rules using constraints, Procedia Computer Science 127 (2018), 481–488.

25. Jingle I.B.J. and Celin, Mining optimized positive and negative association rule using advance ABC algorithm, Journal of Theoretical and Applied Information Technology 95(24) (2017), 6846–6855.

26. Kishor and Porika, An efficient approach for mining positive and negative association rules from large transactional databases, IEEE International Conference on Inventive Computation Technologies (2016), 1–5.

27. Djenour and Drias, Pruning irrelevant association rules using knowledge mining, International Journal of Business Intelligence and Data Mining 9(2) (2014), 112–144.

28. Diaz-Garcia J.A., Ruiz M.D. and Martin-Bautista M.J., A survey on the use of association rules mining techniques in textual social media, Springer Department of Computer Science and Artificial Intelligence (2022), 1175–1200.

29. Ravi and Khare, EO-ARM: An efficient and optimized k-map based positive-negative association rule mining technique, IEEE International Conference on Circuits, Power and Computing Technologies (2014), 1723–1727.

30. Tsiligaridis, Mining positive and negative association rules: An approach for binary trees, International Journal of New Computer Architectures and their Applications (2013).

31. Ouyang, Mining positive and negative association rules in data streams with a sliding window, IEEE Fourth Global Congress on Intelligent Systems (2013), 205–209.

32. Bagui and Dhar P.C., Positive and negative association rule mining in Hadoop’s MapReduce environment, Springer Journal of Big Data (2019).

33. Nguyen T.-L. and Snasel, Efficient algorithms for mining colossal patterns in high dimensional databases, Knowledge-Based Systems (2017).

34. Pavan Kumar N.V.S. et al., Mining distributed databases for negative associations from regular and frequent patterns, International Journal of Advanced Trends in Computer Science and Engineering 8(4) (2019), 1449–1463.

35. Duggirala and Narayana, Mining Positive and Negative Association Rules Using Coherent Approach, International Journal of Computer Trends and Technology 4(2) (2013).

36. Bala Krishna, Rama and Nagaraju, Mining of negative association rules using improved frequent pattern tree, International Conference on Computing and Communication Technologies (2014), 1–6.

37. Jadav and Thakar, Mining Negative Association Rules in Distributed Environment, International Journal of Scientific Research in Science, Engineering, and Technology (2018).

38. Mahmood, Shahbaz and Guergachi, Negative and Positive Association Rules Mining from Text Using Frequent and Infrequent Item sets, The Scientific World Journal (2014).

39. Ranjith K.S. and Geetha Mary, Privacy-preserving data mining in spatiotemporal databases based on mining negative association rules, Springer Nature Emerging Research in Data Engineering Systems and Computer Communications (2020), 329–339.

40. Parfait, Harrimann and Andre, An efficient approach for extraction positive and negative association rules from big data, Springer International Federation for Information Processing (2018), 79–97.

41. Wang et al., Comprehensive association rules mining of health examination data with an extended FP-growth method, Mobile Networks and Applications, Springer (2017).

42. et al., Mining colossal patterns with length constraints, Springer Applied Intelligence (2021), 8629–8640.

43. Telikani, Gandomi A.H. and Shahbahrami, A survey of evolutionary computation for association rule mining, Information Sciences, Elsevier (2020).

44. Qiu, Zhao and Dong, NegI-NSP: Negative Sequential Pattern Mining based on loose constraints, IEEE Industrial Electronics Society (2017).

45. Moshkov, Zielosko and Tetteh E.T., Common association rules for dispersed information systems, Knowledge-Based and Intelligent Information and Engineering Systems, Elsevier (2022).

46. Rathan B.R. and Rani K.S., A novel approach for mining patterns from large uncertain data using mapreduce model, IEEE International Conference on Computer Communication and Informatics (2017).

47. Shridhar and Parmar, Association rule mining with regression for optimization approach, IEEE International Conference on Intelligent Computing and Control Systems (2017), 1225–1230.

48. Borah and Nath, Rare pattern mining: challenges and future perspectives, Springer Complex and Intelligent Systems (2018).

49. Wei et al., FCHUIM: Efficient frequent and closed high-utility item sets mining, IEEE Transactions and Journals 8 (2020), 109928–109939.

50. Darwish S.M. et al., Privacy preserving data mining framework for negative association rules: An application to healthcare informatics, IEEE Access 10 (2022), 76268–76280.

51. Abd Elaty A.A. et al., Efficient Association Rules Mining from streaming data with a fault tolerance, IEEE International Conference on Computer Engineering and Systems (2018), 627–632.

52.