Abstract
The covering rough set model is viewed as a generalization of the rough set model. The covering rough set is built on a universe of discourse. A similar idea can be applied into three-way decisions theory to obtain covering-based three-way decisions which is the main object of this study. This paper firstly provides a summary of study on the three-way decisions based on covering rough set by applying the approach of decision-theoretic rough set. Then, the Bayesian decision procedure theory is implemented to covering decision systems.
Keywords
Introduction
Rough set theory was introduced by Pawlak [15] in 1982, which was considered as a mathematical approach to handle imprecision and uncertainty in data analysis. The key idea of rough set is to approximate an undefinable concept (usually a set) by a pair of definable concepts/sets obtained by the lower and upper approximation operators. Two approximation operators divide the universe into three disjoint regions: the positive region POS (X), boundary region BND (X) and negative region NEG (X). Probabilistic rough set (PRS) theory [11, 26] has appeared in many forms, such as decision-theoretic rough set model [31, 32], the variable precision rough set model [8, 35], and the Bayesian rough set model [8, 20]. By researching the common points of these rough set models, Yao recently proposed the notion of three-way decisions (positive, negative and boundary decisions) which correspond to the above three regions, respectively [27–29]. Yao first studied the decision-theoretic rough set approach which is combined with the Bayesian minimum decision theory. By setting different constraints on the loss functions, it obtains different types of PRSs, such as 0.5-PRS [16], asymmetric PRS [8], α-PRS [26, 31]. Yao et al. [25, 32] introduced Decision-theoretic rough set models, which considered the cost of each error and Bayesian decision theory. By the means of Bayesian Decision Procedure, Yao established probabilistic rough set models [26, 29], which solve probabilistic decision problems by allowing certain acceptable level of errors. Up to now, theory of three-way decisions is applied in many fields such as medical decision-making [14], risk decision-making [10], investment decisions [12] and further investigated extensions of three-way decisions [11, 30]. In 2014, Hu studied the three-way decisions from the mathematical viewpoint and proposed three-way decisions spaces based on fuzzy lattice [5] and partially ordered sets [6].
However, most researches of three-way decisions are based on relations (classical equivalence relation [28] and fuzzy relation [18, 38]). Besides, such an equivalence relation is still restrictive for many applications, such as incomplete information systems [9] and real-valued information systems [7]. To extend applicability of rough set model, many authors suggest many models by replacing equivalence relation or partition with notions. One of those extensions is coverings of the universe of discourse that is given by Bonikowski [1]. Shi [17] mentioned the covering-based probabilistic rough set and its Bayesian decision-making. The three-way decisions based on covering rough set have not been considered clearly. This inspired us to study three-way decisions based on covering rough set.
What is the covering? Bonikowski [1] proposed covering rough set as one of the extensions of the concept of ordinary rough set, in which covering-based rough set are derived by replacing the partitions of a universe with its coverings. The overviews on rough set based on covering were given in [13, 41–44]. The measure of roughness in covering rough set based on minimal description of the object x were also studied in [22, 40].
The rest of this paper is organized as follows.Section 2 recalls concepts of rough set, the Bayesian decision procedure and covering. Section 3 suggests the measure of roughness in covering rough set and their properties. Section 4 presents decision-theoretic rough set approach based on coverings. Section 5 studies probabilistic decision covering system. The last section concludes this paper.
Preliminaries
Rough sets
An information system is a pair (U, A), in which U is a nonempty set of samples {x1, x2, . . . , x n }, called a universe or sample space and A is a nonempty set of attributes (features, variables). Each attribute a ∈ A defines an information function f a : U → V a , in which V a is a set of values of a, called the domain ofattribute a.
For any subset B ⊆ A, we can define an equivalence relation as follows:
A decision system is a pair (U, A ∪ D), in which D = {d}, d is a decision attribute and elements in A are called conditional attributes.
The Bayesian decision procedure
The concept of the Bayesian decision procedure is introduced by Duda [3]. Then it becomes an important theory in decision-theoretic rough sets of Yao [25, 31].
Let Ω = {ω1, ω2, . . . , ω
s
} be a finite set of s states, and let be a finite set of m possible actions. Let P (ω
j
|
Then, is called the conditional risk or cost.
Given a description x, a decision rule is a function τ (
Covering rough sets
It is clear that a partition of U is certainly a covering of U, and the concept of a covering is an extension of the concept of a partition.
For every x ∈ U, C x is the minimal set including x in Cov (C). Each element in Cov (C) cannot be written as the union of other element in Cov (C). Cov (C) = C if and only if C is a partition. For every x, y ∈ U, if y ∈ C x then C x ⊇ C y ; so if y ∈ C x and x ∈ C y , then C x = C y .
For every x ∈ U, Δ x is the minimal set including x in Cov (Δ). Each element in Cov (Δ) cannot be written as the union of other element in Cov (Δ). If every covering in Δ is a partition, then Cov (Δ) is also a partition and Δ x is the equivalence class that includes x. For every x, y ∈ U, if y ∈ Δ x then Δ x ⊇ Δ y ; so if y ∈ Δ x and x ∈ Δ y , then Δ x = Δ y .
Let U be a nonempty universe of discourse and Δ be a family of coverings of U. Then, (U, Δ) is called a covering information system.
The pair of approximation operators are dual to each other. The positive, negative and boundary domains of X are computed as follows
Evaluation functions of three-way decisions
Let X and Y be two universes. Map (X, Y) is the family of all mappings from X to Y, i.e. Map (X, Y) = {f ∣ f : X → Y}.
Let (L, ∧ L , ∨ L , N L , 0 L , 1 L ) be a fuzzy lattice, i.e. complete distributive lattices with involutive negators. If A ∈ Map (U, L), then A is an L-fuzzy set of U [4].
Especially, if A ∈ Map (U, {0, 1}), then A is a subset of U, i.e. Map (U, {0, 1}) is the power set of U, which can also be simply written as 2 U . If A ∈ Map (U, [0, 1]), then A is a fuzzy set of U [34], namely Map (U, [0, 1]) is the fuzzy power set of U.
The union, intersection and complement for L-fuzzy sets in U are defined pointwise by the followingformulas
Then (Map (U, L) , ∩ L , ∪ L , N L , ∅ , U) is a fuzzy lattice, in which ∅ (x) =0 L , ∀x ∈ U and U (x) =1 L , ∀x ∈ U, Order relation of Map (U, L) is written as ⊆ L .
Let (L C , ∧ L C , ∨ L C , N L C , 0 L C , 1 L C ) and (L D , ∧ L D , ∨ L D , N L D , 0 L D , 1 L D ) be two fuzzy lattices in the following. Let U be a nonempty universe to make a decision on it, called decision universe and V be a nonempty universe in which condition function is defined, named condition universe.
E (∅) = ∅, i.e., E (∅) (x) = ∅
L
D
, ∀x ∈ U, A ⊆
L
C
B ⇒ E (A) ⊆
L
D
E (B), ∀A, B ∈ Map (V, L
C
), i.e., E (A) (x) ⊆
L
D
E (B) (x), ∀x ∈ U, and N
L
D
(E (A)) = E (N
L
C
(A)), ∀A ∈ Map (V, L
C
), i.e., N
L
D
(E (A)) (x) = E (N
L
C
(A)) (x), ∀x ∈ U.
Then E (A) is called a decision evaluation function of U (for A ∈ Map (V, L C )). Decision evaluation function is named an evaluation function for short. (U, Map (V, L C ) , L D , E) is called a three-way decision space.
A measure of roughness in covering rough set
Clearly, for all x ∈ U, 0 ≤ P (X|Δ x ) ≤1.
It is obvious to see the following results fromDefinition 2.4, 2.7 and Definition 2.4, 2.7.
P (X|Δ
x
) + P (X
c
|Δ
x
) =1. X ⊆ Y implies P (X|Δ
x
) ≤ P (Y|Δ
x
). If X
i
is a partition of U, then Σ
i
(P (X
i
|Δ
x
)) =1, for i = 1, . . , m. P (X|Δ
x
) =1 if and only if x ∈ POS
Δ
(X). P (X|Δ
x
) =0 if and only if x ∈ NEG
Δ
(X). 0 < P (X|Δ
x
) <1 if and only if x ∈ BND
Δ
(X).
P (X ∪ Y|Δ
x
) ≥ Max {P (X|Δ
x
) , P (Y|Δ
x
)}, P (X ∩ Y|Δ
x
) ≤ Min {P (X|Δ
x
) , P (Y|Δ
x
)}. X ⊆ Y implies P (X ∪ Y|Δ
x
) = P (Y|Δ
x
), P (X ∩ Y|Δ
x
) = P (X|Δ
x
).
In rough set theory, the equivalence class of x is definitely created by the equivalent relations, but in the rough set based on covering we have to construct one rule to create covering class of x. However, there are many ways to create covering class, such as: for each conditional attribute C i , (i = 1, 2, 3, 4), a covering element of x k with respect to C i can be defined as (C i ) x k = {x : d (x k , x) ≤ ɛ}, in which d (. , .) is distance function and ɛ ≥ 0 is a specified threshold. Then, C i = {(C i ) x k : k = 1, 2, . . . , 7} is acovering of U, and Δ = {C1, C2, C3, C4} is a family of covering of U.
The readers can see other ways to construct a family of covering of U in [].
Decision-theoretic covering rough set
Let (U, Δ) be a covering information system, Δ x be considered to be the description of x and Cov (Δ) be the set of all possible descriptions.
To apply Bayesian decision procedure into a covering information system, we have a set of two states and a set of three actions for each state. The set of states is given by Ω = (X, X c ). And, the set of three actions is given by , in which P, B, and N represent the three actions in classifying an object x, namely, deciding x ∈ POS (X), deciding x ∈ BND (X), and deciding x ∈ NEG (X), respectively. Then, Table 2 presents the loss function regarding the risk or cost of actions in different states.
In Table 2, λ
PP
, λ
BP
, and λ
NP
denote the losses incurred for taking actions P, B, and N, respectively, when an object belongs to A, and λ
PN
, λ
BN
, and λ
NN
denote the losses incurred for taking actions P, B, and N, respectively, when an object does not belong to X. We assume that the risk of assigning an object into the boundary region is between a correct classification and an incorrect classification. That is, we can see more detail in [25, 32],
That means the loss of classifying an object x belonging to X into the positive region POS (X) is less than or equal to the list of classifying x into the boundary region BND (X), and both of these losses are strictly less than the loss of classifying x into the negative region NEG (A). The reverse order of losses is used for classifying an object that does not belong to A.
For every x ∈ U, the conditional risks of the three actions given by the covering class are computed as:
The Bayesian decision procedure gives rise to three minimum-risk decision rules: If and , decide x ∈ POS (X), If and , decide x ∈ BND (X), If and , decide x ∈ NEG (X).
When P (X|Δ
x
) + P (X
c
|Δ
x
) =1, Equations (2), (3) and (4) can be rewritten as
Combined with condition in Equations (1), the minimum-risk decision rules (P1)-(B1) can be rewritten as: If P (X|Δ
x
) ≥ α and P (X|Δ
x
) ≥ γ, decide x ∈ POS (X), If P (X|Δ
x
) ≤ α and P (X|Δ
x
) ≥ β, decide x ∈ BND (X), If P (X|Δ
x
) ≤ β and P (X|Δ
x
) ≤ γ, decide x ∈ NEG (X).
In which
Consider an additional condition on the lossfunction:
If the loss function satisfies the two conditions in two Equations (1) and (8), then 0 ≤ β < γ < α ≤ 1. Based on tie-breaking criteria, we obtain the decision rules If P (X|Δ
x
) ≥ α, decide x ∈ POS (X), If β < P (X|Δ
x
) < α, decide x ∈ BND (X), If P (X|Δ
x
) ≤ β, decide x ∈ NEG (X).
The threshold γ is no longer needed. Then, the (α, β)-probabilistic positive, boundary and negative regions of X can be defined as follows,
The corresponding α-probabilistic lower approximation and β-probabilistic upper approximation are defined as follows,
Then, we have
Then,
Δ x 1 = {x1, x2}, Δ x 2 = {x2}, Δ x 3 = {x2, x3}, Δ x 4 = {x4, x5}, Δ x 5 = {x5}, Δ x 6 = {x5, x6}, Δ x 7 = {x7, x8}, Δ x 8 = {x8}, Δ x 9 = {x8, x9}.
For X = {x1, x3, x6, x7, x8}, and the loss function is as follows,
Then, this function for A satisfies Equations (1) and (8). By Equations (5) and (6), we have
The conditional probabilities for Δ x are computed as follows
P (X|Δ x 1 ) = P (X|Δ x 3 ) = P (X|Δ x 6 ) = P (X|Δ x 9 ) =0.5,
P (X|Δ x 2 ) = P (X|Δ x 4 ) = P (X|Δ x 5 ) =0,
and P (X|Δ
x
7
) = P (X|Δ
x
8
) =1.
Let (U, Δ ∪ D) be a covering decision system, in which Δ is a family of covering of U that is called conditional attribute, D is called decision attribute. Let U/D = {D1, . . . , D m } be the decision partition of U. If for any object x, there exists D j ∈ U/D such that Δ x ⊆ D j , then the (U, Δ ∪ D) is called a consistent covering decision system, denoted by Cov (Δ) ≤ U/D. Otherwise, (U, C ∪ D) is called an inconsistent covering decision system.
In the covering decision partition U/D, in which D i ∩ D j = ∅ , for i ≠ j, for each D i , we have different loss functions for different decision classes in Tablet 3.
Corresponding to decision partition, we have these families α, β, γ such as α = {α1, . . . , α
m
} , β = {β1, . . . , β
m
} and γ = {γ1, . . . , γ
m
}, then, a group of parameters α
i
, β
i
, γ
i
, i ∈ {1, 2, . . , m} for each covering decision class D
i
. Then, conditions for λ, β, γ in Equations (1) and (8) are:
Then, the (α
i
, β
i
)-probabilistic positive, boundary and negative regions of D
i
with respect to Cov (Δ) can be defined as follows
The corresponding α
i
-probabilistic lower approximation and β
i
-probabilistic upper approximation are defined as follows,
Then, we have
For this m-class problem, we can solve it in terms of m two class problems. Then, indicates the union of all the covering class defined by Cov (Δ). indicates the union of all the covering class defined by Cov (Δ). We have:
Clearly,
If , then B is called α-lower consistent set with respect to U/D. If , then B is called β-upper consistent set with respect to U/D.
If B is an α-lower consistent set with respect to U/D and , then B is called α-lower reduct of Δ with respect to U/D. If B is a β-upper consistent set with respect to U/D and for any attribute C
i
∈ B, then B is called β-upper reduct set of Δ with respect to U/D.
If , then B is called α-probabilistic consitent set with respect to U/D. If B is α-probabilistic consitent set with respect to U/D and for any attribute C
i
∈ B, then B is called α-probabilistic reduct set of Δ with respect to U/D.
Compute Cov (Δ) based on the family covering Δ, α, β based on the given loss functions. Compute α-probabilistic positive region of U/D; and let Red =∅. Select an attribute into Red each time until and output Red.
So, we have α = (0.64, 0.58) , β = (0.42, 0.42). Actually, the loss function for D1 is same as the one for X in Example 3.1. Thus, the fuzzy conditional probability, P (D2|Δ x ), is same as P (D2|Δ x i ), i = 1, 2, . . . , 9, and the conditional probabilities for D2 are as follows. P (D2|Δ x 1 ) = P (D2|Δ x 3 ) = P (D2|Δ x 6 ) = P (D2|Δ x 9 ) =0.5, P (D2|Δ x 2 ) = P (D2|Δ x 4 ) = P (D2|Δ x 5 ) =1
and P (D2|Δ x 7 ) = P (D2|Δ x 8 ) =0.
The corresponding fuzzy probabilistic regions of D1 and D2 are shown in Table 5.
Then,
Let us consider a house evaluation problem. Suppose U = {x1, x2, . . . , x10} is set of ten houses, and let Δ = {structure ; color ; price ; surroundings} be a set of conditional attributes, D = {sale, further - evaluation} be a set of decision attributes. The values of “price” are {high,middle, low}, the values of “structure” are {reasonable, ordinary, poor}, the values of “color” are {good; bad}, the values of “surroundings” are {quiet; slightly noisy; noisy; very noisy}. Their evaluation results are independent of each other. The evaluation are shown in Table 6.
As was noted in [2], a covering can be generated by set-valued attributes, from the set-valued decision above we can induce a family of coverings {Δ
i
: i = 1, 2, 3, 4},
The decision D is given as
Then,
Δ x 1 = {x1, x2, x3, x6}; Δ x 2 = {x2, x3, x6}; Δ x 3 = {x3, x6}; Δ x 4 = {x3, x4, x6, x7}; Δ x 5 = {x3, x4, x5, x6, x7}; Δ x 6 = {x6}; Δ x 7 = {x6, x7}; Δ x 8 = {x6, x8, x9}; Δ x 9 = {x6, x9}; Δ x 10 = {x10}.
We use loss functions for different decision classes be different which shown in Tablet 4,
so α = (0.64, 0.58) , β = (0.42, 0.42).
The conditional probabilities for Δ x with respect to D1 are computed as follows
P (D1|Δ x 1 ) = D1|Δ x 2 ) = P (D1|Δ x 3 ) = P (D1|Δ x 6 ) =1, , , , P (D1|Δ x 10 ) =0.
The conditional probabilities for Δ x with respect to D2 are computed as follows
P (D2|Δ x 1 ) = P (D2|Δ x 2 ) = P (D2|Δ x 3 ) = P (D2|Δ x 6 ) =0, , , .
The corresponding fuzzy probabilistic regions of D1 and D2 are shown in Table 7.
Then,
So,
Base on Algorithm 1, we can compute Red {Δ, D} = {{C1, C3, C4} , {C1, C2, C3} , {C1, C2, C4}}.
Conclusion
By using concept of measure of roughness in covering rough set , this paper investigates probabilistic covering decision system. A loss function is defined to state how each action is costly, and the final decision is to select the action for which the overall cost is minimum. A pair of threshold parameters α and β generated by the loss functions were used to define the lower and upper approximations. Then, two parameters were used to generate three pair-wise disjoint and obtain the corresponding decision rules automatics. Three-way decision covering rough set mainly studies how to make decision for every region in three regions of a set. We give one algorithm to find α-probabilistic reduct set of Δ with respect to U/D. Our future work will concentrate on the examinationof possible numerical characterizations in the framework of three-way decisions covering rough set and development other measure of roughness in covering rough set to study three-way decisions covering rough set.
Acknowledgments
This research was supported by the National Nature Science Foundation of China (Grant nos. 11571010 and 61179038).
