An intelli AFM: An intelligent association based fuzzy rule miner to predict high blood pressure using bio-psychological factors

Abstract

High Blood Pressure (HBP) is one of the major triggering factors for many health-related issues such as brain stroke, heart stroke, kidney failure, and eye damage. The number of people affected by HBP is increasing rapidly across the globe. Predicting HBP in advance is more beneficial for controlling Blood Pressure (BP) than relying on BP control medications. This paper therefore focuses on an intelligent fuzzy classification model called Association based Fuzzy rule Miner (AFM) to predict HBP. Although there are numerous parameters that contribute to HBP, the impact of bio-psychological factors on HBP is always worth noting. This paper considers the biological factors obesity level, cholesterol level, and age, and the psychological factors anxiety level and anger level of a person for experimental analysis. The proposed model first converts the crisp data set into a fuzzified data set. Association rules are then extracted using the apriori algorithm, subject to conditions imposed as constraints. In the final step, the association rules extracted separately for each decision class together constitute the AFM, which predicts whether a person is a victim of HBP or not. The experiments are conducted on a real-time dataset of 1000 records, of which 600 are used for training and 400 for testing. The AFM achieved 90.75% accuracy, which is better than the accuracy of existing classifiers such as Random Forest, Naïve Bayes, Simple logistic regression, J48, and PART.

Keywords

Association, blood pressure, fuzzy based systems, classification, apriori

1. Introduction

Today, many people around the globe suffer from High Blood Pressure (HBP), also called hypertension. There may be numerous reasons for elevated blood pressure, but bio-psychological factors [1, 2] are always worth noting. We therefore focus on the age, obesity, cholesterol, anger, and anxiety levels of a person to predict that person's blood pressure (BP). BP is the most often measured and the most intensively studied parameter in medical and physiological practice [3]. If the measured BP is above the normal range, it is called HBP [3]. Although different machine learning techniques exist to classify medical data, the proposed AFM uses a fuzzy-based approach to predict whether a person is a victim of HBP or not. Fuzzy logic is an approach used to compute the degree of truth rather than the precise value of truth [4]. It is a branch of mathematics that comes under the umbrella of artificial intelligence, and it is sometimes also called a soft computing method. Fuzzy theory was introduced by L.A. Zadeh in 1965 [4].

Table 1
Existing work on HBP prediction

S. no	Paper title	Published in	Published year	Author	Factors considered	Approach used	Limitation
1	AWBP	IEEE Journal of Biomedical and Health Informatics	2017	Mikko Peltokangas et al.	Age	Linear regression	Single attribute
2	BPA	Postgrad Med J	2007	Elisabete Pinto	Age	Linear regression	Single attribute
3	IASDBPSR	Hypertension	2012	Julie K.K. Vishram et al.	Age	Linear regression	Single attribute
4	UAACBP	International Journal of Biomedical and Advance Research	2015	Meenakshi Kalyan et al.	Abdominal fat	Linear regression	Single attribute
5	PHBPRBW	Asian Nursing Research	2012	Yu-Li Lan et al.	Body weight	Linear regression	Single attribute
6	OOBP	Obesity Research	2000	Ilse L. Mertens	Overweight, Obesity	Linear regression	Two attributes
7	ORH	The Ochsner Journal	2009	Richard N. Re	Obesity	Linear regression	Single attribute
8	O&H	Nephrol Dial Transplant	2006	Krzysztof Narkiewicz	Obesity	Linear regression	Single attribute
9	SC&BP	Journal of Human Hypertension	2002	LA Ferrara	Cholesterol	Linear regression	Single attribute
10	HC&HBP	American journal of hypertension	2005	D. Sesso	Cholesterol	Linear regression	Single attribute
11	RDC&BP	J Hypertensions	2011	Masaru Sakurai	Cholesterol	Linear regression	Single attribute
12	SMW&CD	Hypertension	2007	Daniela Lucini	Stress level	Linear regression	Single attribute

In fuzzy controlled systems, a crisp data set is first collected and then converted into a fuzzy data set using linguistic variables and fuzzy membership functions; this step is called fuzzification [4]. In general, all fuzzified values of the crisp data set lie in the interval [0, 1]. After fuzzification, a fuzzy rule base and an inference engine are used to obtain the fuzzy output, which is converted back into a crisp output by the defuzzification process [5, 6].

2. Background work

This section covers the existing work on blood pressure and explains how the data set was collected, what it contains, and how it was converted into the form required for the experimental analysis. Table 1 shows that most existing approaches to blood pressure prediction consider only a single attribute, and that they conclude that the attribute considered and blood pressure are positively correlated. However, the combined quantitative influence of biopsychological factors on blood pressure has not been studied experimentally in the literature. This paper addresses that quantitative influence using a fuzzy approach. Table 2 expands the abbreviated paper titles used in Table 1.

Figure 1.

The architecture of the fuzzy controlled system.

Table 2

Paper title of Table 1 and its abbreviation

S. no	Abbreviation	Paper title	Number of records considered
1	AWBP	Age Dependence of Arterial Pulse Wave Parameters Extracted From Dynamic Blood Pressure and Blood Volume Pulse Waves	82
2	BPA	Blood Pressure and Aging	1500
3	IASDBPSR	Impact of Age on the Importance of Systolic and Diastolic Blood Pressures for Stroke Risk	68551
4	UAACBP	Ultrasonographic Assessment of Abdominal fat and its Correlation with Blood Pressure	75
5	PHBPRBW	Prevalence of HBP and its Relationship with Body Weight Factors among Inpatients with Schizophrenia in Taiwan	1030
6	OOBP	Overweight, Obesity, and Blood Pressure	50
7	ORH	Obesity-Related Hypertension	Not mentioned
8	O&H	Obesity and Hypertension the issue is more complex than we thought	Not mentioned
9	SC&BP	Serum cholesterol affects Blood Pressure regulation	73
10	HC&HBP	High cholesterol may lead to HBP in men	3110
11	RDC&BP	Relationship of Dietary Cholesterol to Blood Pressure: The Intermap Study	4680
12	SMW&CD	Stress Management at the Worksite: Reversal of Symptoms Profile and Cardiovascular Dysregulation	91

Table 3

Fuzzified values of age, anger and anxiety levels

Age continuous data	Fuzzified as	Anger level continuous data	Fuzzified as	Anxiety level continuous data	Fuzzified as
(0, 20)	Very young	(0, 1)	Healthy	(0, 1)	Mild
(15, 40)	Young	(0.8, 2)	Mild	(0.8, 2)	Moderate
(35, 50)	Middle aged	(1.8, 3)	Serious	(1.8, 3)	Severe
$>$ 50	Old	$>$ 2.8	Extreme	$>$ 2.8	Panic

Table 4

Fuzzified values of obesity, cholesterol levels

Obesity level continuous data	Fuzzified as	Cholesterol level continuous data	Fuzzified as
(0, 18)	Under weight	(0, 125)	Low
(15, 24.9)	Normal	(100,200)	Normal
(21, 30)	Over weight	(175, 239)	Borderline
$>$ 30	Obese	$>$ 239	High

2.1 Conversion of crisp data set into fuzzy data set

In the data conversion process, the biological parameters age, obesity level, and cholesterol level and the psychological parameters anger level and anxiety level are first treated as fuzzy linguistic variables. The linguistic terms of each linguistic variable are identified as shown in Tables 3 and 4 [7]. This fuzzification is based on the scientific and medical reports available on the web [8, 9]. For example, the linguistic terms of the fuzzy linguistic variable anger level are {healthy, mild, serious, extreme}. A sample of the data after fuzzification is shown in Table 5. For crisp values that fall in the overlap between two terms, the fuzzy value is assigned based on the degree of membership of the attribute value. During data collection, anxiety and anger levels are measured on the scale [0, 3], whereas age, cholesterol, and obesity levels are measured on different scales. To bring the whole data set onto the scale [0, 3], age, obesity level, and cholesterol level are normalized using the min-max normalization technique.
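
The min-max normalization mentioned above can be sketched as follows. This is only an illustrative sketch: the function name `min_max_scale` and the sample ages are assumptions and are not taken from the collected data set.

```python
def min_max_scale(values, new_min=0.0, new_max=3.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate column: all values are identical, so map them all to new_min.
        return [new_min for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]


# Hypothetical ages, used only to illustrate the rescaling to [0, 3].
ages = [18, 25, 37, 50, 62]
scaled_ages = min_max_scale(ages)
```

With these sample values the youngest age maps to 0 and the oldest to 3, and every other age falls strictly in between.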

Table 5
A sample copy of fuzzified data set

Age	Obesity level	Cholesterol level	Anger level	Anxiety level	Bp_patient
Young	Over weight	Borderline	Serious	Severe	Yes
Young	Normal	Normal	Mild	Severe	Yes
Old	Over weight	Borderline	Serious	Moderate	Yes
Old	Over weight	Borderline	Mild	Severe	Yes
Young	Over weight	Borderline	Healthy	Moderate	No
Young	Normal	Normal	Serious	Severe	Yes
Old	Over weight	Normal	Healthy	Mild	No
Young	Normal	Normal	Mild	Mild	No
Young	Normal	Borderline	Healthy	Moderate	No
Young	Normal	Normal	Healthy	Mild	No
Middle aged	Over weight	Borderline	Mild	Moderate	No
Young	Over weight	Low	Healthy	Mild	No
Young	Over weight	Borderline	Healthy	Moderate	No
Middle aged	Over weight	Normal	Serious	Severe	Yes
Middle aged	Over weight	Normal	Serious	Severe	Yes
Young	Normal	Normal	Mild	Moderate	No
Old	Over weight	Borderline	Mild	Mild	No
Young	Under weight	Normal	Healthy	Mild	No
Young	Over weight	Normal	Healthy	Mild	No
Young	Normal	Normal	Healthy	Mild	No
Young	Normal	Low	Healthy	Mild	No

2.2 Fuzzy membership function

The fuzzy membership function is used to calculate the fuzzy membership value, or degree of membership, of the selected attribute in the interval [0, 1] [10, 11, 12]. The triangular, trapezoidal, and Gaussian functions are the most often used fuzzy membership functions [13]. Equation (1) represents the triangular function used for the experimental analysis, where $a$, $m$, and $b$ (with $a<m<b$) are positive constants marking the start, peak, and end of the triangle, and $x$ is the attribute value.

$\displaystyle\mu_{A}(x)=\begin{cases}0,&x\leqslant a\\ \frac{x-a}{m-a},&a<x\leqslant m\\ \frac{b-x}{b-m},&m<x<b\\ 0,&x\geqslant b\end{cases}$ (1)
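
Equation (1) translates directly into code. The sketch below is illustrative; the function name `triangular` is an assumption, and the example parameters (15, 30, 40) are the bounds of the term young for age given elsewhere in this paper.

```python
def triangular(x, a, m, b):
    """Triangular membership of Eq. (1): rises from a to the peak m, falls from m to b."""
    if x <= a or x >= b:
        return 0.0
    if x <= m:
        return (x - a) / (m - a)
    return (b - x) / (b - m)


# Membership of three ages in the term "young" (a = 15, m = 30, b = 40).
mu_rising = triangular(22.5, 15, 30, 40)   # halfway up the rising edge
mu_peak = triangular(30, 15, 30, 40)       # at the peak
mu_falling = triangular(35, 15, 30, 40)    # halfway down the falling edge
```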

2.3 The proposed triangular membership function for AFM

The advantage of the triangular membership function is that it represents a linear relationship between the crisp value of the selected attribute and its degree of membership: the membership value increases continuously up to a specific point and decreases continuously after it [14]. For example, for the term young, defined on the range [15, 40], the membership value of age increases until age reaches 30 and decreases thereafter. The degree of membership of each attribute is calculated using the membership functions shown in Table 6.

Here, the degree of membership represents the fuzzified value of the attribute in the interval [0, 1]. The continuous data set is then replaced by fuzzy linguistic terms using the membership functions. In overlapping areas, for example age in the range 15–20, the membership values with respect to both very young and young are calculated, and the crisp value is replaced by the linguistic term with the higher membership value.

Figure 2.

Fuzzy membership value of age.

Figure 3.

Fuzzy membership value of obesity level.

Figure 4.

Fuzzy membership value of cholesterol level.

Figure 5.

Fuzzy membership value of anger level.

The crisp values of each attribute considered to build the AFM are fuzzified as shown in Figs 2–6. In crisp terms, young corresponds to the range (0, 30), middle-aged to (31, 50), and old to more than 51. When age is fuzzified as shown in Fig. 2, one more term, very young, is introduced, and overlapping areas between neighbouring terms are allowed. A crisp value in an overlapping area now has two membership values, and it is replaced by the fuzzy term with the higher membership value. The AFM uses a triangular membership function to fuzzify the crisp data of each attribute in each range: the degree of membership increases up to the middle of the range and then decreases towards zero. For the last term of each attribute, the AFM uses a special case of the trapezoidal function, the L-function, to obtain the membership values: age in the range old, obesity level in the range obese, cholesterol level in the range high, anger level in the range extreme, and anxiety level in the range panic.

Figure 6.

Fuzzy membership value of anxiety level.

Table 6

Fuzzy membership function

Attribute name	Crisp value	Fuzzy membership function
Age	$\leqslant$ 20	(20 – Age)/20
	$\geqslant$ 15 && $<$ 30	(Age – 15)/15
	$\geqslant$ 30 && $\leqslant$ 40	(40 – Age)/10
	$\geqslant$ 35 && $\leqslant$ 43	(Age – 35)/8
	$>$ 43 && $<$ 50	(50 – Age)/7
	$\geqslant$ 50	$1$
Obesity level	$\leqslant$ 18	(18 – Obesity Level)/18
	$\geqslant$ 15 && $\leqslant$ 20	(Obesity Level – 15)/5
	$>$ 20 && $\leqslant$ 24.9	(24.9 – Obesity Level)/4.9
	$\geqslant$ 21 && $\leqslant$ 25.5	(Obesity Level – 21)/4.5
	$>$ 25.5 && $\leqslant$ 30	(30 – Obesity Level)/4.5
	$>$ 30	1
Cholesterol level	$\leqslant$ 125	(125 – Cholesterol Level)/125
	$>$ 100 && $\leqslant$ 150	(Cholesterol Level – 100)/50
	$>$ 150 && $\leqslant$ 200	(200 – Cholesterol Level)/50
	$\geqslant$ 175 && $\leqslant$ 207	(Cholesterol Level – 175)/32
	$>$ 207 && $\leqslant$ 239	(239 – Cholesterol Level)/32
	$>$ 239	1
Anger level	$\leqslant$ 1	(1 – Anger Level)
	$>$ 0.8 && $\leqslant$ 1.5	(Anger Level – 0.8)/0.7
	$>$ 1.5 && $\leqslant$ 2	(2 – Anger Level)/0.5
	$\geqslant$ 1.8 && $\leqslant$ 2.5	(Anger Level – 1.8)/0.8
	$>$ 2.5 && $\leqslant$ 3	(3 – Anger Level)/0.5
	$>$ 2.8 && $\leqslant$ 3	(Anger Level – 2.8)/0.2
	$>$ 3	1
Anxiety level	$\leqslant$ 1	(1 – Anxiety Level)
	$>$ 0.8 && $\leqslant$ 1.5	(Anxiety Level – 0.8)/0.7
	$>$ 1.5 && $\leqslant$ 2	(2 – Anxiety Level)/0.5
	$\geqslant$ 1.8 && $\leqslant$ 2.5	(Anxiety Level – 1.8)/0.8
	$>$ 2.5 && $\leqslant$ 3	(3 – Anxiety Level)/0.5
	$>$ 2.8 && $\leqslant$ 3	(Anxiety Level – 2.8)/0.2
	$>$ 3	1
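
The age rows of Table 6, combined with the rule that an overlapping crisp value takes the term with the higher membership, can be sketched as below. The function names are assumptions, and the mapping of each row of Table 6 to a linguistic term is inferred from Table 3; ties are resolved in favour of the earlier term, which the paper does not specify.

```python
def age_memberships(age):
    """Membership of a crisp age in each linguistic term, following the age rows of Table 6."""
    mu = {"Very young": 0.0, "Young": 0.0, "Middle aged": 0.0, "Old": 0.0}
    if age <= 20:
        mu["Very young"] = (20 - age) / 20
    if 15 <= age < 30:
        mu["Young"] = (age - 15) / 15       # rising edge of "Young"
    elif 30 <= age <= 40:
        mu["Young"] = (40 - age) / 10       # falling edge of "Young"
    if 35 <= age <= 43:
        mu["Middle aged"] = (age - 35) / 8  # rising edge of "Middle aged"
    elif 43 < age < 50:
        mu["Middle aged"] = (50 - age) / 7  # falling edge of "Middle aged"
    if age >= 50:
        mu["Old"] = 1.0                     # L-function: full membership
    return mu


def fuzzify_age(age):
    """Replace a crisp age by the linguistic term with the highest membership."""
    mu = age_memberships(age)
    return max(mu, key=mu.get)


term_16 = fuzzify_age(16)   # overlap of "Very young" and "Young"
term_18 = fuzzify_age(18)   # same overlap, closer to "Young"
term_38 = fuzzify_age(38)   # overlap of "Young" and "Middle aged"
```

The same pattern would apply to the other four attributes by substituting their rows of Table 6.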

3. Proposed methodology

This section presents the approach used to build the AFM classifier, which predicts whether a person becomes a victim of HBP or not using bio-psychological factors. First, the crisp data set is converted into a fuzzified data set using the fuzzy linguistic terms shown in Tables 3 and 4. The fuzzy linguistic terms of the bio-psychological factors considered for the experiments are shown in Figs 2–6. To train the AFM, the data instances of the Yes class and the No class are separated. Apriori is then executed to generate association rules for each decision class separately, and the generated rules are pruned using the coverage of each rule. The Support Count (SCount) and Specified Confidence (SC) are set during the training phase of the AFM. In the last phase, the AFM is built from the rules extracted by apriori and is used to predict the class label of the supplied test instances.

3.1 Proposed architecture of the AFM

The architecture of the proposed system is shown in Fig. 7. The term modified apriori is used there because the rules generated by apriori are pruned using the coverage of each rule.

Figure 7.

The architecture of proposed AFM.

3.2 Algorithm to extract association rules

This paper proposes an intelligent association based fuzzy rule miner, named the AFM, to predict high blood pressure using bio-psychological factors. It is an intelligent classifier that selects rules from those generated by apriori using the coverage of each rule [15]. The rules are extracted for each decision class separately using the training data set. The extracted rules are used as classification rules in the next stage to classify the input test instances.

3.3 Yes class rules extraction

Initially, all the association rules are generated using apriori on the basis of the support count and confidence specified during the training phase of the classifier. Yes class rule extraction means extracting the rules that satisfy the conditions below, which are imposed on the antecedent part of the rule; the consequent part of the rule is always Yes.

Condition 1: While extracting Yes class rules, the algorithm looks for the rule that covers the highest number of training records.

Condition 2: Rules are generated from frequent itemsets based on the support count. The first extracted rule is taken only from rules generated from 1-item frequent itemsets, the next extracted rule from rules generated from 2-item frequent itemsets, and so on, each satisfying Condition 1. If no such rule is generated, either decrease the support count and repeat, or consider a rule generated from the previous frequent itemset that satisfies Condition 1. This is done to improve the reliability and accuracy of the rules during the training phase of the classifier.

Condition 3: At each step, generate the top N rules (initially N = 10), pick the best rule, that is, the one that satisfies the above two conditions, and remove the records covered by that rule from the training data. If no best rule is found, increment N by 10 and repeat the process. Stop the algorithm only when the leftover training records amount to less than 5% of the total training records after removing the records covered by the rule extracted in the current iteration.

Algorithm 1: Pseudocode to generate Yes class rules of AFM
Input: D is the set of training records with class label Yes, TD is the number of training records with class label Yes, TR is the set of training records covered by rule R, SC is the threshold Support Count, C is the threshold Confidence;
Output: Extracted set of Yes Class Rules;
Begin
AFM_yesclass_rulegeneration (D, SC, C) /* function definition */
Initialize LD $=$ D; /* LD is the left-over data set after removing the records covered by the selected rule */
Generate top 10 association rules using apriori;
Extract a rule R; /* R satisfies all conditions mentioned in Section 3.3 */
Write R;
LD $=$ LD - TR;
if |LD| $>$ ((5/100)*TD) then
AFM_yesclass_rulegeneration (LD, SC, C);
else
return;
end if
End
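
The greedy, coverage-driven extraction of Algorithm 1 can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' Java implementation: brute-force enumeration of antecedents stands in for apriori and its top-N rule list, support is measured as the fraction of still-uncovered class records, the fallback to a previous itemset size is omitted, and all names (`covers`, `candidates`, `extract_class_rules`, the `"bp"` key) are hypothetical. The sample uses only the anger, anxiety, and class columns of Table 5.

```python
from itertools import combinations


def covers(antecedent, record):
    """True when every (attribute, term) pair of the antecedent matches the record."""
    return all(record[attr] == term for attr, term in antecedent)


def candidates(records, size, attributes):
    """All distinct antecedents of the given size occurring in the records."""
    return {tuple((a, rec[a]) for a in attrs)
            for rec in records for attrs in combinations(attributes, size)}


def extract_class_rules(train, label, attributes, min_support=0.6,
                        min_conf=1.0, stop_fraction=0.05):
    target = [r for r in train if r["bp"] == label]
    remaining, rules, size = list(target), [], 1
    # Stop once fewer than 5% of the class records are left uncovered.
    while remaining and len(remaining) > stop_fraction * len(target):
        best, best_hits, support = None, 0, min_support
        while best is None and support > 0:
            for ant in sorted(candidates(remaining, size, attributes)):
                matched = [r for r in train if covers(ant, r)]
                conf = sum(r["bp"] == label for r in matched) / len(matched)
                hits = sum(covers(ant, r) for r in remaining)
                # Condition 1: keep the qualifying rule covering the most records.
                if (conf >= min_conf and hits / len(remaining) >= support
                        and hits > best_hits):
                    best, best_hits = ant, hits
            support = round(support - 0.1, 10)  # lower the support threshold and retry
        if best is None and size == len(attributes):
            break  # nothing qualifies even at the largest antecedent size
        if best is not None:
            rules.append(best)
            remaining = [r for r in remaining if not covers(best, r)]
        # Condition 2: the next rule comes from the next larger itemset size.
        size = min(size + 1, len(attributes))
    return rules


# (anger, anxiety, class) columns of the 21 sample records in Table 5.
SAMPLE = [
    ("Serious", "Severe", "Yes"), ("Mild", "Severe", "Yes"),
    ("Serious", "Moderate", "Yes"), ("Mild", "Severe", "Yes"),
    ("Healthy", "Moderate", "No"), ("Serious", "Severe", "Yes"),
    ("Healthy", "Mild", "No"), ("Mild", "Mild", "No"),
    ("Healthy", "Moderate", "No"), ("Healthy", "Mild", "No"),
    ("Mild", "Moderate", "No"), ("Healthy", "Mild", "No"),
    ("Healthy", "Moderate", "No"), ("Serious", "Severe", "Yes"),
    ("Serious", "Severe", "Yes"), ("Mild", "Moderate", "No"),
    ("Mild", "Mild", "No"), ("Healthy", "Mild", "No"),
    ("Healthy", "Mild", "No"), ("Healthy", "Mild", "No"),
    ("Healthy", "Mild", "No"),
]
train = [{"anger": a, "anxiety": x, "bp": c} for a, x, c in SAMPLE]
yes_rules = extract_class_rules(train, "Yes", ["anger", "anxiety"])
```

On this small sample the first rule selected is anxiety = Severe, which covers six of the seven Yes records; the remaining record is then covered by a two-item rule.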

3.4 No class rules extraction

Initially, all the association rules are generated using apriori on the basis of the support count and confidence specified during the training phase of the classifier. No class rule extraction means extracting the rules that satisfy the conditions below, which are imposed on the antecedent part of the rule; the consequent part of the rule is always No.

Condition 1: While extracting No class rules, the algorithm looks for the rule that covers the highest number of training records.

Condition 2: Rules are generated from frequent itemsets based on the support count. The first extracted rule is taken only from rules generated from 1-item frequent itemsets, the next extracted rule from rules generated from 2-item frequent itemsets, and so on, each satisfying Condition 1. If no such rule is generated, either decrease the support count and repeat, or consider a rule generated from the previous frequent itemset that satisfies Condition 1. This is done to improve the reliability and accuracy of the rules during the training phase of the classifier.

Condition 3: At each step, generate the top N rules (initially N = 10), pick the best rule, that is, the one that satisfies the above two conditions, and remove the records covered by that rule from the training data. If no best rule is found, increment N by 10 and repeat the process. Stop the algorithm only when the leftover training records amount to less than 5% of the total training records after removing the records covered by the rule extracted in the current iteration.

Algorithm 2: Pseudocode to generate No class rules of AFM
Input: D is the set of training records with class label No, TD is the number of training records with class label No, TR is the set of training records covered by rule R, SC is the threshold Support Count, C is the threshold Confidence;
Output: Extracted set of No Class Rules;
Begin
AFM_noclass_rulegeneration (D, SC, C) /* function definition */
Initialize LD $=$ D; /* LD is the left-over data set after removing the records covered by the selected rule */
Generate top 10 association rules using apriori;
Extract a rule R; /* R satisfies all conditions mentioned in Section 3.4 */
Write R;
LD $=$ LD - TR;
if |LD| $>$ ((5/100)*TD) then
AFM_noclass_rulegeneration (LD, SC, C);
else
return;
end if
End

4. Experimental results and discussion

Experimental analysis is carried out on a real-time data set of 1000 people, where each person's data is considered one record. Each record consists of anxiety level, anger level, age, cholesterol level, obesity level, systolic blood pressure (SBP), and diastolic blood pressure (DBP). Age, cholesterol level, and obesity level are considered biological factors, whereas anxiety level and anger level are considered psychological factors. For comparative analysis of the proposed AFM classifier, the existing classifiers supported in WEKA are considered. Because WEKA processes input data in ARFF (Attribute-Relation File Format), the collected data is converted into an ARFF file in the data preprocessing phase. The proposed AFM is implemented in Java. The total data set is divided into a training set and a test set: the proposed AFM uses 60% of the data for training and 40% for testing.

4.1 Rules extracted using modified apriori

Initially, Yes class and No class records are separated from the training data set. Apriori is then executed to extract association rules based on the supplied support count and confidence. Support count and confidence are both initially set to 1. At support count 1 no rules are generated, so the support count is decreased by 0.1 in each iteration using a recursive approach. At confidence 1 and support count 0.6 many association rules exist, so confidence is fixed at 1 and support count at 0.6. The fuzzified and defuzzified rules are given below.
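
To make the two thresholds concrete, the sketch below computes the support and confidence of one candidate rule, anxiety = Severe implies Yes, over the 21 sample records of Table 5. It uses the standard definitions (support as the fraction of all records matching both sides, confidence as the fraction of antecedent matches that also match the consequent); computing them over the whole sample rather than per class, and the name `support_and_confidence`, are assumptions made for illustration.

```python
def support_and_confidence(records, antecedent, consequent):
    """Relative support and confidence of the rule antecedent -> consequent."""
    ant = [r for r in records if all(r[k] == v for k, v in antecedent.items())]
    both = [r for r in ant if all(r[k] == v for k, v in consequent.items())]
    support = len(both) / len(records)
    confidence = len(both) / len(ant) if ant else 0.0
    return support, confidence


# (anxiety, class) columns of the 21 sample records in Table 5.
PAIRS = [
    ("Severe", "Yes"), ("Severe", "Yes"), ("Moderate", "Yes"), ("Severe", "Yes"),
    ("Moderate", "No"), ("Severe", "Yes"), ("Mild", "No"), ("Mild", "No"),
    ("Moderate", "No"), ("Mild", "No"), ("Moderate", "No"), ("Mild", "No"),
    ("Moderate", "No"), ("Severe", "Yes"), ("Severe", "Yes"), ("Moderate", "No"),
    ("Mild", "No"), ("Mild", "No"), ("Mild", "No"), ("Mild", "No"), ("Mild", "No"),
]
records = [{"anxiety": a, "bp": c} for a, c in PAIRS]
support, confidence = support_and_confidence(
    records, {"anxiety": "Severe"}, {"bp": "Yes"})
```

In this sample all six records with severe anxiety belong to the Yes class, so the rule has confidence 1 and a support of 6/21.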

4.1.1 Fuzzified rules extracted

1.
If (anxiety $=$ Severe) Then Class $=$ Yes
2.
If ((obesity $=$ Over weight) AND (anxiety $=$ Moderate)) Then Class $=$ Yes
3.
If ((cholesterol $=$ Normal) AND (anger $=$ Mild) AND (anxiety $=$ Moderate)) Then Class $=$ Yes
4.
If (cholesterol $=$ Normal) Then Class $=$ No

Table 7
Confusion matrices of different classifiers

		Predicted class
Classifier name	Actual class	Yes	No
Logistic regression	Yes	49	59
	No	27	265
Naïve Bayes	Yes	94	14
	No	48	244
J48	Yes	71	37
	No	21	271
Random forest	Yes	70	38
	No	13	279
PART	Yes	70	38
	No	19	273
Proposed AFM	Yes	94	14
	No	23	269

Table 8
Accuracy of different classifiers

Classifier name	Accuracy (%)
Logistic regression	78.50
Naïve Bayes	81.75
J48	85.50
Random forest	87.25
PART	85.75
Proposed AFM	90.75

Table 9
Performance measures of the different classifiers

Algorithm used	Class	TP rate	FP rate	Precision	Recall	F-measure
Logistic regression	Yes	0.454	0.092	0.645	0.454	0.533
	No	0.908	0.546	0.818	0.908	0.860
Naïve Bayes	Yes	0.769	0.164	0.634	0.769	0.695
	No	0.836	0.231	0.907	0.836	0.870
J48	Yes	0.657	0.072	0.772	0.657	0.710
	No	0.928	0.343	0.880	0.928	0.903
Random forest	Yes	0.648	0.045	0.843	0.648	0.733
	No	0.955	0.352	0.880	0.955	0.916
PART	Yes	0.648	0.065	0.787	0.648	0.711
	No	0.935	0.352	0.878	0.935	0.905
Proposed AFM	Yes	0.870	0.078	0.878	0.870	0.746
	No	0.921	0.130	0.951	0.921	0.936

Figure 8.
Accuracy of different classifiers.

5.
If ((obesity $=$ Over weight) AND (cholesterol $=$ Borderline)) Then Class $=$ No
6.
If (age $=$ Young) Then Class $=$ No
7.
If ((obesity $=$ Normal) AND (cholesterol $=$ Borderline) AND (anxiety $=$ Mild)) Then Class $=$ No

Figure 9.
Performance of Yes class using the proposed AFM.

Figure 10.
Performance of No class using the proposed AFM.

4.1.2 Defuzzified rules to build AFM

1.
If ((anxiety $\geqslant$ 2) AND (anxiety $\leqslant$ 3)) Then Class $=$ Yes
2.
If ((obesity $\geqslant$ 22.9 AND obesity $\leqslant$ 30) AND (anxiety $\geqslant$ 0.9 AND anxiety $\leqslant$ 1.9)) Then Class $=$ Yes
3.
If ((cholesterol $\geqslant$ 108 AND cholesterol $\leqslant$ 184) AND (anger $\geqslant$ 0.9 AND anger $\leqslant$ 1.9) AND (anxiety $\geqslant$ 0.9 AND anxiety $\leqslant$ 1.9)) Then Class $=$ Yes
4.
If (cholesterol $\geqslant$ 108 AND cholesterol $\leqslant$ 184) Then Class $=$ No
5.
If ((obesity $\geqslant$ 22.9 AND obesity $\leqslant$ 30) AND (cholesterol $\geqslant$ 185 AND cholesterol $\leqslant$ 239)) Then Class $=$ No
6.
If (age $\geqslant$ 18 AND age $\leqslant$ 37) Then Class $=$ No
7.
If ((obesity $\geqslant$ 22.9 AND obesity $\leqslant$ 30) AND (cholesterol $\geqslant$ 185 AND cholesterol $\leqslant$ 239) AND (anxiety $\geqslant$ 0.9 AND anxiety $\leqslant$ 1.9)) Then Class $=$ No
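
The seven defuzzified rules can be applied to a crisp record as sketched below. The paper does not state how conflicts between Yes and No rules are resolved, so this sketch assumes the rules are tried in the listed order (Yes rules first), that the first matching rule decides, and that a record matching no rule defaults to No. The names `RULES` and `classify` and the two example persons are hypothetical.

```python
# Each rule: ({attribute: (lower bound, upper bound)}, class label), in the listed order.
RULES = [
    ({"anxiety": (2, 3)}, "Yes"),
    ({"obesity": (22.9, 30), "anxiety": (0.9, 1.9)}, "Yes"),
    ({"cholesterol": (108, 184), "anger": (0.9, 1.9), "anxiety": (0.9, 1.9)}, "Yes"),
    ({"cholesterol": (108, 184)}, "No"),
    ({"obesity": (22.9, 30), "cholesterol": (185, 239)}, "No"),
    ({"age": (18, 37)}, "No"),
    ({"obesity": (22.9, 30), "cholesterol": (185, 239), "anxiety": (0.9, 1.9)}, "No"),
]


def classify(person, rules=RULES, default="No"):
    """Return the label of the first rule whose ranges all contain the person's values."""
    for conditions, label in rules:
        if all(lo <= person[attr] <= hi for attr, (lo, hi) in conditions.items()):
            return label
    return default  # assumption: records matched by no rule are labelled No


# Hypothetical persons, for illustration only.
calm_person = {"age": 25, "obesity": 22.0, "cholesterol": 150, "anger": 0.2, "anxiety": 0.3}
anxious_person = {"age": 45, "obesity": 27.0, "cholesterol": 200, "anger": 1.0, "anxiety": 2.5}
calm_label = classify(calm_person)
anxious_label = classify(anxious_person)
```

Under the first-match assumption, rule 7 can never fire before rule 5, since its conditions are a superset of those of rule 5; a different conflict-resolution strategy would change which rule decides but not the rule set itself.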

4.2 Confusion matrices

The performance of a classification system is normally evaluated using the data in its confusion matrix. The confusion matrix is a two-dimensional matrix that contains information about the actual and predicted classifications made by the classification system. Table 7 shows the confusion matrices of the existing classifiers and the proposed AFM. Of the 400 records in the test data set, 108 are Yes class records and 292 are No class records. The proposed AFM predicted 94 out of 108 Yes class records as Yes and 269 out of 292 No class records as No. The accuracy of the proposed and existing classifiers is shown in Table 8 and in Fig. 8.

4.3 Performance evaluation of the proposed AFM

If the positive class is considered, the TP rate of a classifier represents the fraction of positive class records predicted as positive, and it shows how good the classifier is at predicting positive class records.

The FP rate represents the fraction of negative class records predicted as positive, and it shows the error rate of the classifier in predicting negative class records. The precision of the classifier shows how exact the classifier is in predicting positive class records. The F-measure is the harmonic mean of precision and recall, and it represents the overall performance of the classifier with respect to the positive class.

If the negative class is considered, the True Negative (TN) rate represents the fraction of negative class records predicted as negative, and it shows how good the classifier is at predicting negative class records. The False Negative (FN) rate represents the fraction of positive class records predicted as negative, and it shows the error rate of the classifier in predicting positive class records. The precision of the classifier shows how exact the classifier is in predicting negative class records. The F-measure is the harmonic mean of precision and recall, and it represents the overall performance of the classifier with respect to the negative class. The performance measures of the experiments conducted are shown in Table 9. Figure 9 shows the performance details of the proposed AFM with respect to the Yes class. Figure 10 shows the performance details of the proposed AFM with respect to the No class.
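
The measures defined above follow directly from the four cells of a two-class confusion matrix. The sketch below uses the standard formulas and the AFM counts reported in this paper (94 and 14 for actual Yes, 23 and 269 for actual No); the function name `class_metrics` is an assumption.

```python
def class_metrics(tp, fn, fp, tn):
    """Per-class measures from a 2x2 confusion matrix, with one class treated as positive."""
    tp_rate = tp / (tp + fn)            # recall of the positive class
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tp_rate / (precision + tp_rate)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"tp_rate": tp_rate, "fp_rate": fp_rate, "precision": precision,
            "recall": tp_rate, "f_measure": f_measure, "accuracy": accuracy}


# Proposed AFM with Yes as the positive class: 94 Yes predicted Yes, 14 Yes predicted No,
# 23 No predicted Yes, 269 No predicted No.
afm_yes = class_metrics(tp=94, fn=14, fp=23, tn=269)
# The same matrix with No as the positive class.
afm_no = class_metrics(tp=269, fn=23, fp=14, tn=94)
```

For the AFM matrix this reproduces the 90.75% accuracy of Table 8 and the Yes class TP rate of 0.870 in Table 9.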

4.4 Comparative study of AFM with existing classifiers

This paper proposes a new model named AFM to predict the victims of HBP using a real-time data set. The AFM considers bio-psychological factors to predict the class label attribute. The experimental results are compared with those of classifiers supported in WEKA. The proposed AFM outperforms logistic regression, Naïve Bayes, J48, Random Forest, and PART in terms of F-measure and accuracy. The proposed AFM also shows improved performance in terms of TP rate, FP rate, and precision for each decision class, as shown in Table 9.

5. Conclusion

This paper proposed an intelligent AFM to predict HBP based on bio-psychological factors. Age, cholesterol level, and obesity level are considered biological factors, and anxiety level and anger level are considered psychological factors. The proposed approach first generates association rules using apriori and then prunes them based on the coverage of records. The extracted association rules are used to build the AFM. A real-time data set of 1000 records is considered for experimental analysis. The AFM is trained using 60% of the data and tested using the remaining 40%. The proposed AFM has shown improved performance in terms of accuracy, TP rate, FP rate, precision, and F-measure compared with existing classifiers supported in WEKA such as simple logistic regression, Naïve Bayes, J48, Random Forest, and PART. The proposed approach has shown 90.75% accuracy in classifying the test instances.

The objective of this paper is to find the influence of biopsychological factors on the blood pressure of a person. The biological factors age, obesity level, and cholesterol level are collected from the medical laboratory, and the anxiety and anger levels of the same people are collected from their responses to a predefined questionnaire. From the experimental evaluation, we draw the following conclusions:
1. The rules extracted by the AFM are simple to understand, and they are useful for both technical and nontechnical communities to manage BP rather than using BP medications.
2. These rules can be used by an individual to keep his/her blood pressure in a healthy range.
3. The experimental results clearly show that people with higher (severe) anxiety levels, in the range (2, 3), are more prone to HBP.
4. Overweight people whose anxiety level is moderate (0.9, 1.9) or higher are prone to HBP, even if their cholesterol level is in the normal range (108, 184).
5. If the cholesterol level of a person is in a healthy range and the person is not anxious, he or she is less prone to HBP.
6. Young people aged between 18 and 37 are less prone to HBP.
7. If a person's anger level and anxiety level are close to zero (normal), the person is less prone to HBP even if he or she is overweight and has a borderline cholesterol level (185, 239).

References

1. Borrell-Carrio. The biopsychosocial model 25 years later: Principles, practice, and scientific inquiry. The Annals of Family Medicine. 2004; 2(6): 576-582.

2. Cuffee, Ogedegbe, Williams, Ogedegbe, Schoenthaler. Psychosocial risk factors for hypertension: An update of the literature. Current Hypertension Reports. 2014; 16(10).

3. Frese, Fick, Sadowsky. Blood pressure measurement guidelines for physical therapists. Cardiopulmonary Physical Therapy Journal. 2011; 22(2): 5-12.

4. Ravi, Khare. Review of fuzzy rule-based classification systems. Research Journal of Pharmacy and Technology. 2016; 9(8): 1299.

5. Fuzzy Sets and Operations. [Online]. Available: https://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/sbaa/report.fuzzysets.html. [Accessed: 21-Jan-2016].

6. Verikas, Guzaitis, Gelzinis, Bacauskiene. A general framework for designing a fuzzy rule-based classifier. Knowledge and Information Systems. 2010; 29(1): 203-221.

7. Mohammadpour, Abedi, Bagheri, Ghaemian. Fuzzy rule-based classification system for assessing coronary artery disease. Computational and Mathematical Methods in Medicine. 2015; 2015: 1-8.

8. Yuen KKF, Lau HGW. Fuzzy linguistic variable matrix and parabola-based fuzzy normal distribution. IFIP International Federation for Information Processing, Intelligent Information Processing III, pp. 205-215.

9. Health risks of obesity: MedlinePlus Medical Encyclopedia, MedlinePlus. [Online]. Available: https://medlineplus.gov/ency/patientinstructions/000348.htm. [Accessed: 29-Apr-2015].

10. Fletcher. Cholesterol levels by age: Differences and recommendations, Medical News Today. [Online]. Available: https://www.medicalnewstoday.com/articles/315900.php. [Accessed: 14-Feb-2015].

11. Guillaume. Designing fuzzy inference systems from data: An interpretability-oriented review. IEEE Transactions on Fuzzy Systems. 2001; 9(3): 426-443.

12. Taboada, Shimada, Mabu, Hirasawa. Association rules mining for handling continuous attributes using genetic network programming and fuzzy membership functions. SICE Annual Conference 2007. 2007.

13. Boston. Effects of the shape of fuzzy membership functions on fuzzy inference. Proceedings of 3rd International Symposium on Uncertainty Modeling and Analysis and Annual Conference of the North American Fuzzy Information Processing Society.

14. Alcala-Fdez, Alcala, Herrera. A fuzzy association rule-based classification model for high-dimensional problems with genetic rule selection and lateral tuning. IEEE Transactions on Fuzzy Systems. 2011; 19(5): 857-872.

15. Kianmehr, Kaya, Elsheikh, Jida, Alhajj. Fuzzy association rule mining framework and its application to effective fuzzy associative classification. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery. 2011; 1(6): 477-495.

16. Pach, Gyenesei, Abonyi. Compact fuzzy association rule-based classifier. Expert Systems with Applications. 2008; 34(4): 2406-2416.
