Three-way decisions based on covering rough set

Abstract

The covering rough set model is viewed as a generalization of the rough set model and is built on a covering of a universe of discourse. A similar idea can be applied to three-way decision theory to obtain covering-based three-way decisions, which are the main object of this study. This paper first summarizes the study of three-way decisions based on covering rough sets using the decision-theoretic rough set approach. Then, the Bayesian decision procedure is applied to covering decision systems.

Keywords

Bayesian decision procedure; decision-theoretic based on covering rough set; probabilistic covering decision system; three-way decisions

1 Introduction

Rough set theory, introduced by Pawlak [15] in 1982, is a mathematical approach to handling imprecision and uncertainty in data analysis. The key idea of rough sets is to approximate an undefinable concept (usually a set) by a pair of definable concepts (sets) obtained from the lower and upper approximation operators. The two approximation operators divide the universe into three disjoint regions: the positive region POS (X), the boundary region BND (X) and the negative region NEG (X). Probabilistic rough set (PRS) theory [11, 26] has appeared in many forms, such as the decision-theoretic rough set model [31, 32], the variable precision rough set model [8, 35], and the Bayesian rough set model [8, 20]. By studying the common features of these rough set models, Yao recently proposed the notion of three-way decisions (positive, negative and boundary decisions), which correspond to the above three regions, respectively [27–29]. Yao first studied the decision-theoretic rough set approach, which is combined with Bayesian minimum-risk decision theory. Setting different constraints on the loss functions yields different types of PRS, such as 0.5-PRS [16], asymmetric PRS [8] and α-PRS [26, 31]. Yao et al. [25, 32] introduced decision-theoretic rough set models, which take into account the cost of each error and Bayesian decision theory. By means of the Bayesian decision procedure, Yao established probabilistic rough set models [26, 29], which solve probabilistic decision problems by allowing a certain acceptable level of error. The theory of three-way decisions has been applied in many fields, such as medical decision-making [14], risk decision-making [10] and investment decisions [12], and extensions of three-way decisions have been investigated further [11, 30]. In 2014, Hu studied three-way decisions from a mathematical viewpoint and proposed three-way decision spaces based on fuzzy lattices [5] and partially ordered sets [6].

However, most research on three-way decisions is based on relations (the classical equivalence relation [28] and fuzzy relations [18, 38]). An equivalence relation is still too restrictive for many applications, such as incomplete information systems [9] and real-valued information systems [7]. To extend the applicability of the rough set model, many authors have proposed models that replace the equivalence relation or partition with more general notions. One such extension, given by Bonikowski [1], uses coverings of the universe of discourse. Shi [17] discussed the covering-based probabilistic rough set and its Bayesian decision-making. However, three-way decisions based on covering rough sets have not yet been studied explicitly. This motivated us to study three-way decisions based on covering rough sets.

What is a covering? Bonikowski [1] proposed the covering rough set as an extension of the ordinary rough set, in which covering-based rough sets are obtained by replacing the partitions of a universe with its coverings. Overviews of rough sets based on coverings were given in [13, 41–44]. Measures of roughness in covering rough sets based on the minimal description of an object x were also studied in [22, 40].

The rest of this paper is organized as follows. Section 2 recalls the concepts of rough sets, the Bayesian decision procedure and coverings, and introduces a measure of roughness in covering rough sets together with its properties. Section 3 presents the decision-theoretic rough set approach based on coverings. Section 4 studies probabilistic covering decision systems. Section 5 gives an illustrative example. The last section concludes the paper.

2 Preliminaries

2.1 Rough sets

An information system is a pair (U, A), in which U is a nonempty set of samples {x₁, x₂, . . . , x_n}, called a universe or sample space, and A is a nonempty set of attributes (features, variables). Each attribute a ∈ A defines an information function f_a : U → V_a, in which V_a is the set of values of a, called the domain of attribute a.

For any subset B ⊆ A, we can define an equivalence relation as follows: $Ind (B) = \{(x, y) \in U \times U : f_{a} (x) = f_{a} (y), \forall a \in B\} .$ Then, Ind (B) = ∩_{a∈B} Ind ({a}). For X ⊆ U, the lower and upper approximations of X with respect to Ind (B) are defined as $\begin{matrix} \underline{B} (X) & = & \{x \in U : [x]_{B} \subseteq X\}, \\ \bar{B} (X) & = & \{x \in U : [x]_{B} \cap X \neq \emptyset\}, \end{matrix}$ in which [x]_B denotes the equivalence class of x under Ind (B).

A decision system is a pair (U, A ∪ D), in which D = {d}, d is a decision attribute and elements in A are called conditional attributes.

2.2 The Bayesian decision procedure

The Bayesian decision procedure was introduced by Duda and Hart [3]. It later became an important tool in the decision-theoretic rough sets of Yao [25, 31].

Let Ω = {ω₁, ω₂, . . . , ω_s} be a finite set of s states, and let $A = \{a_{1}, a_{2}, . . ., a_{m}\}$ be a finite set of m possible actions. Let P (ω_j|x) be the conditional probability of an object being in state ω_j given that the object is described by x. Let λ (a_i|ω_j) denote the loss, or cost, for taking action a_i when the state is ω_j. Since P (ω_j|x) is the probability that the true state is ω_j given x, the expected loss associated with taking action a_i is given by: $R (a_{i} | x) = \sum_{j = 1}^{s} λ (a_{i} | ω_{j}) P (ω_{j} | x) .$

Then, $R (a_{i} | x)$ is called the conditional risk or cost.

Given a description x, a decision rule is a function τ (x) that specifies which action to take. That is, for every x, τ (x) takes one of the actions a₁, a₂, . . . , a_m. The overall risk $R$ is the expected loss associated with a given decision rule. Since $R (τ (x) | x)$ is the conditional risk associated with action τ (x), the overall risk is defined by: $R = \sum_{x} R (τ (x) | x) P (x),$ in which the summation is over the set of all possible descriptions of objects. If τ (x) is chosen so that $R (τ (x) | x)$ is as small as possible for every x, the overall risk $R$ is minimized.
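The procedure above can be illustrated with a minimal Python sketch. The function names `conditional_risk` and `bayes_decision`, the loss table and the posterior values are hypothetical and chosen only for illustration; they do not come from [3] or [25, 31].

```python
def conditional_risk(loss_row, posterior):
    """R(a_i | x) = sum_j lambda(a_i | w_j) * P(w_j | x)."""
    return sum(l * p for l, p in zip(loss_row, posterior))


def bayes_decision(loss, posterior):
    """Return the action with the smallest conditional risk and all risks."""
    risks = {action: conditional_risk(row, posterior)
             for action, row in loss.items()}
    best = min(risks, key=risks.get)
    return best, risks


# Hypothetical loss table: one row per action, one column per state (w1, w2).
loss = {"a1": [0.0, 1.0], "a2": [0.5, 0.5], "a3": [1.0, 0.0]}
# Hypothetical posterior probabilities P(w1 | x) and P(w2 | x).
posterior = [0.9, 0.1]

best, risks = bayes_decision(loss, posterior)
print(best, risks)
```

Choosing the minimum-risk action separately for every description x is what minimizes the overall risk R, since R is a P (x)-weighted sum of the conditional risks.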

2.3 Covering rough sets

Definition 2.1. [Covering] [1, 21] Let U be a nonempty universe of discourse and C be a family of subsets of U. If ∪C = U, C is called a covering of U.

It is clear that a partition of U is certainly a covering of U, and the concept of a covering is an extension of the concept of a partition.

Definition 2.2. [21] Suppose U is a finite universe and C = {K₁, K₂, . . . , K_n} is a covering of U. For every x ∈ U, let C_x = ∩ _j {K_j : K_j ∈ C, x ∈ K_j}, then Cov (C) = {C_x : x ∈ U} is also a covering of U. We call it the induced covering of C.

For every x ∈ U, C_x is the minimal set including x in Cov (C). No element of Cov (C) can be written as the union of other elements of Cov (C). Cov (C) = C if and only if C is a partition. For every x, y ∈ U, if y ∈ C_x then C_x ⊇ C_y; so if y ∈ C_x and x ∈ C_y, then C_x = C_y.

Definition 2.3. [21] Let Δ = {C_i : i = 1, 2, . . . , m} be a family of coverings of U. For every x ∈ U, let Δ_x = ∩_i {(C_i)_x : (C_i)_x ∈ Cov (C_i)}. Cov (Δ) = {Δ_x : x ∈ U} is then also a covering of U, and we call it the induced covering of Δ.

For every x ∈ U, Δ_x is the minimal set including x in Cov (Δ). No element of Cov (Δ) can be written as the union of other elements of Cov (Δ). If every covering in Δ is a partition, then Cov (Δ) is also a partition and Δ_x is the equivalence class that includes x. For every x, y ∈ U, if y ∈ Δ_x then Δ_x ⊇ Δ_y; so if y ∈ Δ_x and x ∈ Δ_y, then Δ_x = Δ_y.
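Definitions 2.2 and 2.3 translate directly into set intersections. The following Python sketch assumes a small hypothetical universe and two hypothetical coverings; the function names are illustrative only.

```python
from functools import reduce


def neighborhood(x, covering):
    """C_x: intersection of all blocks K_j of one covering that contain x."""
    blocks = [frozenset(K) for K in covering if x in K]
    return reduce(frozenset.intersection, blocks)


def family_neighborhood(x, family):
    """Delta_x: intersection of (C_i)_x over every covering C_i in the family."""
    return reduce(frozenset.intersection,
                  [neighborhood(x, C) for C in family])


def induced_covering(universe, family):
    """Cov(Delta) = {Delta_x : x in U}; equal neighborhoods collapse."""
    return {family_neighborhood(x, family) for x in universe}


# Hypothetical toy data (not taken from the paper).
U = {1, 2, 3, 4}
C1 = [{1, 2}, {2, 3}, {3, 4}]
C2 = [{1, 2, 3}, {3, 4}]
family = [C1, C2]

for x in sorted(U):
    print(x, sorted(family_neighborhood(x, family)))
```

On this toy data the sketch also lets one check the stated property that y ∈ Δ_x implies Δ_y ⊆ Δ_x. If some x is not covered by a family member, `reduce` raises an error, which signals that the input is not a covering.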

Let U be a nonempty universe of discourse and Δ be a family of coverings of U. Then, (U, Δ) is called a covering information system.

Definition 2.4. [2, 21] For any X ⊆ U, the lower and upper approximations of X with respect to Δ are defined as follows: $\begin{matrix} \underline{Δ} (X) & = & \{x \in U : Δ_{x} \subseteq X\}, \\ \bar{Δ} (X) & = & \{x \in U : Δ_{x} \cap X \neq \emptyset\} . \end{matrix}$

The pair of approximation operators are dual to each other. The positive, negative and boundary domains of X are computed as follows $\begin{matrix} {POS}_{Δ} (X) & = & \underline{Δ} (X), \\ {NEG}_{Δ} (X) & = & U - \bar{Δ} (X), \\ {BND}_{Δ} (X) & = & \bar{Δ} (X) - \underline{Δ} (X) . \end{matrix}$
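As a hedged illustration of Definition 2.4 and the three regions, the sketch below takes the neighborhoods Δ_x as a precomputed dictionary. The universe, the neighborhoods and the target set X are hypothetical values chosen for the illustration.

```python
def approximations(universe, nbhd, X):
    """Lower and upper approximations of X from the neighborhoods Delta_x."""
    lower = {x for x in universe if nbhd[x] <= X}   # Delta_x is a subset of X
    upper = {x for x in universe if nbhd[x] & X}    # Delta_x intersects X
    return lower, upper


def regions(universe, nbhd, X):
    """Positive, negative and boundary regions of X."""
    lower, upper = approximations(universe, nbhd, X)
    return {"POS": lower, "NEG": universe - upper, "BND": upper - lower}


# Hypothetical neighborhoods Delta_x on a four-element universe.
U = {1, 2, 3, 4}
nbhd = {1: {1, 2}, 2: {2}, 3: {3}, 4: {3, 4}}
X = {2, 3}

result = regions(U, nbhd, X)
print(result)

# Duality check: lower(X) equals the complement of upper(complement of X).
lower, _ = approximations(U, nbhd, X)
_, upper_of_complement = approximations(U, nbhd, U - X)
print(lower == U - upper_of_complement)
```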

Remark 2.5. Let Δ = {C_i : i = 1, 2, . . . , m} be a family of coverings of U. If the elements of every covering C_i ∈ Δ are pairwise disjoint, that is, if every C_i is a partition of U, then $\underline{Δ} (X)$ and $\bar{Δ} (X)$ are Pawlak’s lower and upper approximations of X.

2.4 Evaluation functions of three-way decisions

Let X and Y be two universes. Map (X, Y) is the family of all mappings from X to Y, i.e. Map (X, Y) = {f ∣ f : X → Y}.

Let (L, ∧_L, ∨_L, N_L, 0_L, 1_L) be a fuzzy lattice, i.e. a complete distributive lattice with an involutive negator. If A ∈ Map (U, L), then A is an L-fuzzy set of U [4].

Especially, if A ∈ Map (U, {0, 1}), then A is a subset of U, i.e. Map (U, {0, 1}) is the power set of U, which can also be simply written as 2^U. If A ∈ Map (U, [0, 1]), then A is a fuzzy set of U [34], namely Map (U, [0, 1]) is the fuzzy power set of U.

The union, intersection and complement for L-fuzzy sets in U are defined pointwise by the following formulas: $\begin{matrix} (A \cap_{L} B) (x) & = & A (x) \land_{L} B (x), \\ (A \cup_{L} B) (x) & = & A (x) \lor_{L} B (x), \\ N_{L} (A) (x) & = & N_{L} (A (x)) . \end{matrix}$

Then (Map (U, L), ∩_L, ∪_L, N_L, ∅, U) is a fuzzy lattice, in which ∅ (x) = 0_L and U (x) = 1_L for all x ∈ U. The order relation of Map (U, L) is written as ⊆_L.

Let (L_C, ∧_{L_C}, ∨_{L_C}, N_{L_C}, 0_{L_C}, 1_{L_C}) and (L_D, ∧_{L_D}, ∨_{L_D}, N_{L_D}, 0_{L_D}, 1_{L_D}) be two fuzzy lattices in the following. Let U be a nonempty universe on which a decision is to be made, called the decision universe, and let V be a nonempty universe on which the condition function is defined, called the condition universe.

Definition 2.6. [5] Let U be a decision universe and V be a condition universe. Then a mapping E : Map (V, L_C) → Map (U, L_D) is called a decision evaluation function of U if it satisfies the following axioms.

E (∅) = ∅, i.e., E (∅) (x) = 0_{L_D}, ∀x ∈ U,

A ⊆_{L_C} B ⇒ E (A) ⊆_{L_D} E (B), ∀A, B ∈ Map (V, L_C), i.e., E (A) (x) ≤_{L_D} E (B) (x), ∀x ∈ U, and

N_{L_D} (E (A)) = E (N_{L_C} (A)), ∀A ∈ Map (V, L_C), i.e., N_{L_D} (E (A)) (x) = E (N_{L_C} (A)) (x), ∀x ∈ U.

Then E (A) is called a decision evaluation function of U for A ∈ Map (V, L_C). A decision evaluation function is called an evaluation function for short. (U, Map (V, L_C), L_D, E) is called a three-way decision space.

2.5 A measure of roughness in covering rough set

Definition 2.7. Let (U, Δ) be a covering information system. The degree of covering rough membership of x in U with respect to Δ, denoted by P (X|Δ_x) for every X ⊆ U, is defined by $P (X | Δ_{x}) = \frac{| X \cap Δ_{x} |}{| Δ_{x} |} .$

Clearly, for all x ∈ U, 0 ≤ P (X|Δ_x) ≤1.

The following results follow directly from Definitions 2.4 and 2.7.

Proposition 2.8. Let (U, Δ) be a covering information system. For all x ∈ U, all X, Y ⊆ U and all X_i ⊆ U (i = 1, . . . , m), the following properties always hold:

P (X|Δ_x) + P (X^c|Δ_x) = 1.

X ⊆ Y implies P (X|Δ_x) ≤ P (Y|Δ_x).

If {X₁, . . . , X_m} is a partition of U, then Σ_{i=1}^{m} P (X_i|Δ_x) = 1.

P (X|Δ_x) = 1 if and only if x ∈ POS_Δ (X).

P (X|Δ_x) = 0 if and only if x ∈ NEG_Δ (X).

0 < P (X|Δ_x) < 1 if and only if x ∈ BND_Δ (X).
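The membership degree of Definition 2.7 and the last three properties of Proposition 2.8 can be checked on a small case. The sketch below uses exact fractions; the neighborhoods Δ_x and the set X are hypothetical toy values.

```python
from fractions import Fraction


def membership(X, delta_x):
    """P(X | Delta_x) = |X intersect Delta_x| / |Delta_x|."""
    return Fraction(len(X & delta_x), len(delta_x))


# Hypothetical neighborhoods Delta_x and target set X.
U = {1, 2, 3, 4}
nbhd = {1: {1, 2}, 2: {2}, 3: {3}, 4: {3, 4}}
X = {2, 3}

P = {x: membership(X, nbhd[x]) for x in U}

# Regions read off from the membership degree alone.
pos = {x for x in U if P[x] == 1}
neg = {x for x in U if P[x] == 0}
bnd = {x for x in U if 0 < P[x] < 1}
print(P, pos, neg, bnd)

# First property: P(X | Delta_x) + P(X^c | Delta_x) = 1 for every x.
print(all(P[x] + membership(U - X, nbhd[x]) == 1 for x in U))
```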

Proposition 2.9. Let (U, Δ) be a covering information system. The following properties always hold:

P (X ∪ Y|Δ_x) ≥ Max {P (X|Δ_x) , P (Y|Δ_x)}, P (X ∩ Y|Δ_x) ≤ Min {P (X|Δ_x) , P (Y|Δ_x)}.

X ⊆ Y implies P (X ∪ Y|Δ_x) = P (Y|Δ_x) and P (X ∩ Y|Δ_x) = P (X|Δ_x).

Remark 2.10. It is easy to see that P (X|Δ_x) satisfies the three conditions of Definition 2.6. So it is a decision evaluation function and (U, Map (U, {0, 1}), [0, 1], P) is a three-way decision space.

Example 2.11. Table 1 shows the examination results of one class, in which ‘No.’ is the ID number of a student, ‘C₁’ denotes the math score, ‘C₂’ denotes the physics score, ‘C₃’ denotes the chemistry score and ‘C₄’ denotes the English score. ‘D’ denotes the evaluation of a student, in which ‘E’ stands for an excellent student and ‘F’ stands for a fair student. All of the conditional attributes in Table 1 are numerical.

In rough set theory, the equivalence class of x is determined by the equivalence relation, but in rough sets based on coverings we have to choose a rule to construct the covering class of x. There are many ways to do so. For example, for each conditional attribute C_i (i = 1, 2, 3, 4), a covering element of x_k with respect to C_i can be defined as (C_i)_{x_k} = {x : d (x_k, x) ≤ ɛ}, in which d (. , .) is a distance function and ɛ ≥ 0 is a specified threshold. Then, C_i = {(C_i)_{x_k} : k = 1, 2, . . . , 7} is a covering of U, and Δ = {C₁, C₂, C₃, C₄} is a family of coverings of U.

Readers can find other ways to construct a family of coverings of U in [].
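The distance-based construction in Example 2.11 can be sketched as follows. The scores are hypothetical, since Table 1 is not reproduced here, and the absolute difference is assumed as the distance d; any other distance function could be substituted.

```python
def epsilon_covering(values, eps):
    """Covering element of each x_k: all x with |value(x_k) - value(x)| <= eps.

    `values` maps each object to its score on one numerical attribute.
    """
    return {
        xk: frozenset(x for x, v in values.items() if abs(values[xk] - v) <= eps)
        for xk in values
    }


# Hypothetical scores on one attribute (not the data of Table 1).
math_scores = {"x1": 90, "x2": 86, "x3": 70, "x4": 73}
cov = epsilon_covering(math_scores, eps=5)

for xk in sorted(cov):
    print(xk, sorted(cov[xk]))

# Every object lies in its own covering element, so the family covers U.
print(set().union(*cov.values()) == set(math_scores))
```

Because d (x_k, x_k) = 0 ≤ ɛ, each object belongs to its own covering element, which is why the resulting family is always a covering of U.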

3 Decision-theoretic covering rough set

Let (U, Δ) be a covering information system, Δ_x be considered to be the description of x and Cov (Δ) be the set of all possible descriptions.

To apply the Bayesian decision procedure to a covering information system, we use a set of two states and a set of three actions. The set of states is given by Ω = {X, X^c}, and the set of actions is given by $A = \{P, B, N\}$, in which P, B, and N represent the three actions in classifying an object x, namely, deciding x ∈ POS (X), deciding x ∈ BND (X), and deciding x ∈ NEG (X), respectively. Table 2 presents the loss function regarding the risk or cost of actions in different states.

In Table 2, λ_PP, λ_BP, and λ_NP denote the losses incurred for taking actions P, B, and N, respectively, when an object belongs to X, and λ_PN, λ_BN, and λ_NN denote the losses incurred for taking actions P, B, and N, respectively, when an object does not belong to X. We assume that the loss of assigning an object to the boundary region lies between the loss of a correct classification and that of an incorrect classification. That is (see [25, 32] for more detail), $λ_{PP} \leq λ_{BP} < λ_{NP}, λ_{NN} \leq λ_{BN} < λ_{PN} .$ (1)

That means the loss of classifying an object x belonging to X into the positive region POS (X) is less than or equal to the loss of classifying x into the boundary region BND (X), and both of these losses are strictly less than the loss of classifying x into the negative region NEG (X). The reverse order of losses is used for classifying an object that does not belong to X.

For every x ∈ U, the conditional risks of the three actions given by the covering class are computed as: $R (a_{P} | Δ_{x}) = λ_{PP} P (X | Δ_{x}) + λ_{PN} P (X^{c} | Δ_{x}),$ (2) $R (a_{B} | Δ_{x}) = λ_{BP} P (X | Δ_{x}) + λ_{BN} P (X^{c} | Δ_{x}),$ (3) $R (a_{N} | Δ_{x}) = λ_{NP} P (X | Δ_{x}) + λ_{NN} P (X^{c} | Δ_{x}) .$ (4)

The Bayesian decision procedure gives rise to three minimum-risk decision rules:

(P1) If $R (a_{P} | Δ_{x}) \leq R (a_{N} | Δ_{x})$ and $R (a_{P} | Δ_{x}) \leq R (a_{B} | Δ_{x})$, decide x ∈ POS (X),

(B1) If $R (a_{B} | Δ_{x}) \leq R (a_{P} | Δ_{x})$ and $R (a_{B} | Δ_{x}) \leq R (a_{N} | Δ_{x})$, decide x ∈ BND (X),

(N1) If $R (a_{N} | Δ_{x}) \leq R (a_{P} | Δ_{x})$ and $R (a_{N} | Δ_{x}) \leq R (a_{B} | Δ_{x})$, decide x ∈ NEG (X).

Since P (X|Δ_x) + P (X^c|Δ_x) = 1, Equations (2), (3) and (4) can be rewritten as $\begin{matrix} R (a_{P} | Δ_{x}) & = & λ_{PP} P (X | Δ_{x}) + λ_{PN} (1 - P (X | Δ_{x})), \\ R (a_{B} | Δ_{x}) & = & λ_{BP} P (X | Δ_{x}) + λ_{BN} (1 - P (X | Δ_{x})), \\ R (a_{N} | Δ_{x}) & = & λ_{NP} P (X | Δ_{x}) + λ_{NN} (1 - P (X | Δ_{x})) . \end{matrix}$

Combined with the conditions in Equation (1), the minimum-risk decision rules (P1)-(N1) can be rewritten as:

If P (X|Δ_x) ≥ α and P (X|Δ_x) ≥ γ, decide x ∈ POS (X),

If P (X|Δ_x) ≤ α and P (X|Δ_x) ≥ β, decide x ∈ BND (X),

If P (X|Δ_x) ≤ β and P (X|Δ_x) ≤ γ, decide x ∈ NEG (X).

Here, $α = \frac{λ_{PN} - λ_{BN}}{(λ_{PN} - λ_{BN}) + (λ_{BP} - λ_{PP})},$ (5) $β = \frac{λ_{BN} - λ_{NN}}{(λ_{BN} - λ_{NN}) + (λ_{NP} - λ_{BP})},$ (6) $γ = \frac{λ_{PN} - λ_{NN}}{(λ_{PN} - λ_{NN}) + (λ_{NP} - λ_{PP})} .$ (7) It is easy to see that α ∈ (0, 1], γ ∈ (0, 1) and β ∈ [0, 1). Readers can find more detail about the parameters α, β and γ in [24, 29].
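Equations (5)-(7) can be evaluated mechanically from a loss table. The sketch below uses a hypothetical loss function that satisfies Equation (1); the function names and the numeric losses are illustrative assumptions, not values from the paper. It also checks that at P (X|Δ_x) = α the risks of actions P and B coincide, which is how α arises from comparing the two risks.

```python
def thresholds(l_pp, l_bp, l_np, l_nn, l_bn, l_pn):
    """Thresholds alpha, beta, gamma of Equations (5)-(7)."""
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    gamma = (l_pn - l_nn) / ((l_pn - l_nn) + (l_np - l_pp))
    return alpha, beta, gamma


def risk(loss_if_in_x, loss_if_not_in_x, p):
    """Conditional risk of one action when P(X | Delta_x) = p."""
    return loss_if_in_x * p + loss_if_not_in_x * (1 - p)


# Hypothetical losses with l_pp <= l_bp < l_np and l_nn <= l_bn < l_pn.
loss = dict(l_pp=0, l_bp=2, l_np=6, l_nn=0, l_bn=1, l_pn=4)
alpha, beta, gamma = thresholds(**loss)
print(alpha, beta, gamma)

# At p = alpha, actions P and B have (numerically) the same risk.
risk_p = risk(loss["l_pp"], loss["l_pn"], alpha)
risk_b = risk(loss["l_bp"], loss["l_bn"], alpha)
print(abs(risk_p - risk_b) < 1e-12)
```

Under Equation (1) the denominators are strictly positive, so no division by zero can occur for an admissible loss function.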

Consider an additional condition on the loss function:

$\begin{matrix} (λ_{PN} - λ_{BN}) (λ_{NP} - λ_{BP}) > \\ (λ_{BN} - λ_{NN}) (λ_{BP} - λ_{PP}) . \end{matrix}$ (8)

If the loss function satisfies the conditions in Equations (1) and (8), then 0 ≤ β < γ < α ≤ 1. Based on tie-breaking criteria, we obtain the decision rules

If P (X|Δ_x) ≥ α, decide x ∈ POS (X),

If β < P (X|Δ_x) < α, decide x ∈ BND (X),

If P (X|Δ_x) ≤ β, decide x ∈ NEG (X).

The threshold γ is no longer needed. Then, the (α, β)-probabilistic positive, boundary and negative regions of X can be defined as follows: $\begin{matrix} {POS}^{α} (X) & = & \{x \in U | P (X | Δ_{x}) \geq α\}, \\ {BND}^{(β, α)} (X) & = & \{x \in U | β < P (X | Δ_{x}) < α\}, \\ {NEG}^{β} (X) & = & \{x \in U | P (X | Δ_{x}) \leq β\} . \end{matrix}$

The corresponding α-probabilistic lower approximation and β-probabilistic upper approximation are defined as follows, ${\underline{Δ}}^{α} (X) = {POS}^{α} (X), {\bar{Δ}}^{β} (X) = ({NEG}^{β} (X))^{c} .$

Then, we have $\begin{matrix} {\underline{Δ}}^{α} (X) & = & \{x \in U : P (X | Δ_{x}) \geq α\}, \\ {\bar{Δ}}^{β} (X) & = & \{x \in U : P (X | Δ_{x}) > β\}, \end{matrix}$ since the order on [0, 1] is a linear order. The pair $({\underline{Δ}}^{α} (X), {\bar{Δ}}^{β} (X))$ is called the (α, β)-probabilistic rough set of X.

Example 3.1. Let us consider the example in [2]. Suppose U = {x₁, . . . , x₉}, Δ = {C_i : i = 1, . . . , 4}, and $\begin{matrix} C_{1} & = & \{\{x_{1}, x_{2}, x_{4}, x_{5}, x_{7}, x_{8}\}, \{x_{2}, x_{3}, x_{5}, x_{6}, x_{8}, x_{9}\}\}, \\ C_{2} & = & \{\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}\}, \{x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9}\}\}, \\ C_{3} & = & \{\{x_{1}, x_{2}, x_{3}\}, \{x_{4}, x_{5}, x_{6}, x_{7}, x_{8}, x_{9}\}, \{x_{7}, x_{8}, x_{9}\}\}, \\ C_{4} & = & \{\{x_{1}, x_{2}, x_{4}, x_{5}\}, \{x_{2}, x_{3}, x_{5}, x_{6}\}, \{x_{4}, x_{5}, x_{7}, x_{8}\}, \{x_{5}, x_{6}, x_{8}, x_{9}\}\}, \\ U / D & = & \{\{x_{1}, x_{3}, x_{6}, x_{7}, x_{8}\}, \{x_{2}, x_{4}, x_{5}, x_{9}\}\} . \end{matrix}$

Then,

Δ_{x₁} = {x₁, x₂}, Δ_{x₂} = {x₂}, Δ_{x₃} = {x₂, x₃}, Δ_{x₄} = {x₄, x₅}, Δ_{x₅} = {x₅}, Δ_{x₆} = {x₅, x₆}, Δ_{x₇} = {x₇, x₈}, Δ_{x₈} = {x₈}, Δ_{x₉} = {x₈, x₉}.

Let X = {x₁, x₃, x₆, x₇, x₈}, and let the loss function be as follows: $\begin{matrix} λ_{PP} = 0.25, λ_{BP} = 0.45, λ_{NP} = 0.80, \\ λ_{NN} = 0.14, λ_{BN} = 0.40, λ_{PN} = 0.75 . \end{matrix}$

Then, this loss function for X satisfies Equations (1) and (8). By Equations (5) and (6), we have $\begin{matrix} α = \frac{0.75 - 0.4}{(0.75 - 0.4) + (0.45 - 0.25)} \approx 0.64, \\ β = \frac{0.4 - 0.14}{(0.4 - 0.14) + (0.80 - 0.45)} \approx 0.42 . \end{matrix}$

The conditional probabilities for Δ_x are computed as follows

P (X|Δ_{x₁}) = P (X|Δ_{x₃}) = P (X|Δ_{x₆}) = P (X|Δ_{x₉}) = 0.5,

P (X|Δ_{x₂}) = P (X|Δ_{x₄}) = P (X|Δ_{x₅}) = 0,

and P (X|Δ_{x₇}) = P (X|Δ_{x₈}) = 1.

Hence, $\begin{matrix} {POS}^{0.64} (X) & = & \{x_{7}, x_{8}\}, \\ {BND}^{(0.42, 0.64)} (X) & = & \{x_{1}, x_{3}, x_{6}, x_{9}\}, \\ {NEG}^{0.42} (X) & = & \{x_{2}, x_{4}, x_{5}\} . \end{matrix}$ And, $\begin{matrix} {\underline{Δ}}^{0.64} (X) & = & \{x_{7}, x_{8}\}, \\ {\bar{Δ}}^{0.42} (X) & = & \{x_{1}, x_{3}, x_{6}, x_{7}, x_{8}, x_{9}\} . \end{matrix}$
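Example 3.1 can be recomputed end to end with a short script. The sketch below encodes x_i as the integer i and uses the coverings, the set X and the loss values stated in the example; the helper names are illustrative assumptions. The thresholds are computed exactly from Equations (5) and (6) rather than rounded.

```python
from fractions import Fraction
from functools import reduce

# Data of Example 3.1; the integer i stands for the object x_i.
U = set(range(1, 10))
family = [
    [{1, 2, 4, 5, 7, 8}, {2, 3, 5, 6, 8, 9}],                  # C1
    [{1, 2, 3, 4, 5, 6}, {4, 5, 6, 7, 8, 9}],                  # C2
    [{1, 2, 3}, {4, 5, 6, 7, 8, 9}, {7, 8, 9}],                # C3
    [{1, 2, 4, 5}, {2, 3, 5, 6}, {4, 5, 7, 8}, {5, 6, 8, 9}],  # C4
]
X = {1, 3, 6, 7, 8}


def delta(x):
    """Delta_x: intersection of every block, in every covering, containing x."""
    return reduce(set.intersection, [K for C in family for K in C if x in K])


# Loss function of the example and the thresholds of Equations (5) and (6).
l_pp, l_bp, l_np = 0.25, 0.45, 0.80
l_nn, l_bn, l_pn = 0.14, 0.40, 0.75
alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))

P = {x: Fraction(len(X & delta(x)), len(delta(x))) for x in U}
pos = {x for x in U if P[x] >= alpha}
neg = {x for x in U if P[x] <= beta}
bnd = U - pos - neg

print(sorted(pos), sorted(bnd), sorted(neg))
print(sorted(U - neg))  # beta-probabilistic upper approximation
```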

4 Probabilistic covering decision system

Let (U, Δ ∪ D) be a covering decision system, in which Δ is a family of coverings of U, called the conditional attributes, and D is called the decision attribute. Let U/D = {D₁, . . . , D_m} be the decision partition of U. If for every object x there exists D_j ∈ U/D such that Δ_x ⊆ D_j, then (U, Δ ∪ D) is called a consistent covering decision system, denoted by Cov (Δ) ≤ U/D. Otherwise, (U, Δ ∪ D) is called an inconsistent covering decision system.

In the decision partition U/D, in which D_i ∩ D_j = ∅ for i ≠ j, each decision class D_i may have its own loss function, as shown in Table 3.

Corresponding to the decision partition, we have the families α = {α¹, . . . , α^m}, β = {β¹, . . . , β^m} and γ = {γ¹, . . . , γ^m}, that is, a group of parameters αⁱ, βⁱ, γⁱ, i ∈ {1, 2, . . . , m}, for each decision class D_i. Then, the conditions on the loss functions in Equations (1) and (8) become: $λ_{PP}^{i} \leq λ_{BP}^{i} < λ_{NP}^{i}, λ_{NN}^{i} \leq λ_{BN}^{i} < λ_{PN}^{i},$ (9) and,

$\begin{matrix} (λ_{PN}^{i} - λ_{BN}^{i}) (λ_{NP}^{i} - λ_{BP}^{i}) > \\ (λ_{BN}^{i} - λ_{NN}^{i}) (λ_{BP}^{i} - λ_{PP}^{i}) . \end{matrix}$ (10)

Then, the (αⁱ, βⁱ)-probabilistic positive, boundary and negative regions of D_i with respect to Cov (Δ) can be defined as follows: $\begin{matrix} {POS}_{Cov (Δ)}^{α^{i}} (D_{i}) & = & \{x \in U | P (D_{i} | Δ_{x}) \geq α^{i}\}, \\ {BND}_{Cov (Δ)}^{(β^{i}, α^{i})} (D_{i}) & = & \{x \in U | β^{i} < P (D_{i} | Δ_{x}) < α^{i}\}, \\ {NEG}_{Cov (Δ)}^{β^{i}} (D_{i}) & = & \{x \in U | P (D_{i} | Δ_{x}) \leq β^{i}\} . \end{matrix}$

The corresponding αⁱ-probabilistic lower approximation and βⁱ-probabilistic upper approximation are defined as follows: $\begin{matrix} {\underline{Δ}}_{Cov (Δ)}^{α^{i}} (D_{i}) & = & {POS}_{Cov (Δ)}^{α^{i}} (D_{i}), \\ {\bar{Δ}}_{Cov (Δ)}^{β^{i}} (D_{i}) & = & ({NEG}_{Cov (Δ)}^{β^{i}} (D_{i}))^{c} . \end{matrix}$

Then, we have $\begin{matrix} {\underline{Δ}}_{Cov (Δ)}^{α^{i}} (D_{i}) & = & \{x \in U : P (D_{i} | Δ_{x}) \geq α^{i}\}, \\ {\bar{Δ}}_{Cov (Δ)}^{β^{i}} (D_{i}) & = & \{x \in U : P (D_{i} | Δ_{x}) > β^{i}\}, \end{matrix}$ since the order on [0, 1] is a linear order. The pair $({\underline{Δ}}_{Cov (Δ)}^{α^{i}} (D_{i}), {\bar{Δ}}_{Cov (Δ)}^{β^{i}} (D_{i}))$ is called the (αⁱ, βⁱ)-probabilistic rough set of D_i. The lower and upper approximations of the partition U/D with respect to Cov (Δ) are the families of the lower and upper approximations of the classes in U/D: $\begin{matrix} {\underline{apr}}_{Cov (Δ)}^{α} (U / D) & = & ({\underline{Δ}}_{Cov (Δ)}^{α^{1}} (D_{1}), {\underline{Δ}}_{Cov (Δ)}^{α^{2}} (D_{2}), . . ., {\underline{Δ}}_{Cov (Δ)}^{α^{m}} (D_{m})), \\ {\bar{apr}}_{Cov (Δ)}^{β} (U / D) & = & ({\bar{Δ}}_{Cov (Δ)}^{β^{1}} (D_{1}), {\bar{Δ}}_{Cov (Δ)}^{β^{2}} (D_{2}), . . ., {\bar{Δ}}_{Cov (Δ)}^{β^{m}} (D_{m})) . \end{matrix}$

This m-class problem can be solved in terms of m two-class problems. Then, ${POS}_{Cov (Δ)}^{α} (U / D)$ is the union of the positive regions of all decision classes with respect to Cov (Δ), and ${BND}_{Cov (Δ)}^{(β, α)} (U / D)$ consists of the objects that lie in the boundary region of some decision class and in neither the positive nor the negative region of any decision class. We have: $\begin{matrix} {POS}_{Cov (Δ)}^{α} (U / D) & = & \underset{1 \leq i \leq m}{\cup} {POS}_{Cov (Δ)}^{α^{i}} (D_{i}), \\ {BND}_{Cov (Δ)}^{(β, α)} (U / D) & = & \underset{1 \leq i \leq m}{\cup} {BND}_{Cov (Δ)}^{(β^{i}, α^{i})} (D_{i}) - \underset{1 \leq i \leq m}{\cup} ({POS}_{Cov (Δ)}^{α^{i}} (D_{i}) \cup {NEG}_{Cov (Δ)}^{β^{i}} (D_{i})), \\ {NEG}_{Cov (Δ)}^{β} (U / D) & = & U - ({POS}_{Cov (Δ)}^{α} (U / D) \cup {BND}_{Cov (Δ)}^{(β, α)} (U / D)) . \end{matrix}$

Clearly, $\begin{matrix} {POS}_{Cov (Δ)}^{α} (U / D) \cap {BND}_{Cov (Δ)}^{(β, α)} (U / D) = \emptyset, \\ {POS}_{Cov (Δ)}^{α} (U / D) \cap {NEG}_{Cov (Δ)}^{β} (U / D) = \emptyset, \\ {BND}_{Cov (Δ)}^{(β, α)} (U / D) \cap {NEG}_{Cov (Δ)}^{β} (U / D) = \emptyset . \end{matrix}$ And $\begin{matrix} {\underline{Δ}}_{Cov (Δ)}^{α} (U / D) & = & {POS}_{Cov (Δ)}^{α} (U / D), \\ {\bar{Δ}}_{Cov (Δ)}^{β} (U / D) & = & ({NEG}_{Cov (Δ)}^{β} (U / D))^{c} . \end{matrix}$

Definition 4.1. Let (U, Δ ∪ D) be a covering decision system and B ⊆ Δ be an attribute set, and let Cov (Δ)_B and Cov (Δ)_Δ denote the coverings induced by B and by Δ, respectively.

If ${\underline{apr}}_{Cov (Δ)_{B}}^{α} (U / D) = {\underline{apr}}_{Cov (Δ)_{Δ}}^{α} (U / D)$ , then B is called α-lower consistent set with respect to U/D.

If ${\bar{apr}}_{Cov (Δ)_{B}}^{β} (U / D) = {\bar{apr}}_{Cov (Δ)_{Δ}}^{β} (U / D)$ , then B is called β-upper consistent set with respect to U/D.

Definition 4.2. Let (U, Δ ∪ D) be a covering decision system and B ⊆ Δ be an attribute set.

If B is an α-lower consistent set with respect to U/D and ${\underline{apr}}_{Cov (Δ)_{B - \{C_{i}\}}}^{α} (U / D) \neq {\underline{apr}}_{Cov (Δ)_{Δ}}^{α} (U / D)$ for any attribute C_i ∈ B, then B is called an α-lower reduct of Δ with respect to U/D.

If B is a β-upper consistent set with respect to U/D and ${\bar{apr}}_{Cov (Δ)_{B - \{C_{i}\}}}^{β} (U / D) \neq {\bar{apr}}_{Cov (Δ)_{Δ}}^{β} (U / D)$ for any attribute C_i ∈ B, then B is called a β-upper reduct of Δ with respect to U/D.

Definition 4.3. Let (U, Δ ∪ D) be a covering decision system and B ⊆ Δ be an attribute set.

If ${POS}_{Cov (Δ)_{B}}^{α} (U / D) = {POS}_{Cov (Δ)_{Δ}}^{α} (U / D)$, then B is called an α-probabilistic consistent set with respect to U/D.

If B is an α-probabilistic consistent set with respect to U/D and ${POS}_{Cov (Δ)_{B - \{C_{i}\}}}^{α} (U / D) \neq {POS}_{Cov (Δ)_{Δ}}^{α} (U / D)$ for any attribute C_i ∈ B, then B is called an α-probabilistic reduct set of Δ with respect to U/D.

Theorem 4.4. Given a covering decision system (U, Δ ∪ D) and B ⊆ Δ, if B is an α-lower reduct, then B is also an α-probabilistic reduct.

Proof. The proof is trivial. □

Algorithm 1. Find α-probabilistic reduct set of Δ with respect to U/D

Compute Cov (Δ) from the family of coverings Δ, and compute α and β from the given loss functions.

Compute the α-probabilistic positive region of U/D, ${POS}_{Cov (Δ)_{Δ}}^{α} (U / D)$, and let Red = ∅.

Add one attribute to Red at a time until ${POS}_{Cov (Δ)_{Red}}^{α} (U / D) = {POS}_{Cov (Δ)_{Δ}}^{α} (U / D)$, then output Red.
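The steps of Algorithm 1 can be sketched in Python as a forward selection. Algorithm 1 does not state which attribute to add at each step, so the rule used here (add the covering whose inclusion leaves the fewest objects in disagreement with the target positive region) is an assumption, as are the function names, the toy data and the thresholds. The output preserves the positive region but is not guaranteed to be minimal in the sense of Definition 4.3.

```python
from functools import reduce


def delta(x, coverings, universe):
    """Delta_x for the given coverings; with no coverings it is all of U."""
    blocks = [K for C in coverings for K in C if x in K]
    return reduce(set.intersection, blocks, set(universe))


def positive_region(universe, coverings, classes, alphas):
    """Union over i of {x : P(D_i | Delta_x) >= alpha^i}."""
    pos = set()
    for x in universe:
        d = delta(x, coverings, universe)
        if any(len(D & d) / len(d) >= a for D, a in zip(classes, alphas)):
            pos.add(x)
    return pos


def greedy_reduct(universe, family, classes, alphas):
    """Add coverings one at a time until the positive region is restored."""
    target = positive_region(universe, list(family.values()), classes, alphas)

    def mismatch(names):
        chosen = [family[n] for n in names]
        return len(positive_region(universe, chosen, classes, alphas) ^ target)

    red = []
    while mismatch(red) != 0:
        candidates = [n for n in family if n not in red]
        # Assumed selection rule: minimize disagreement with the target.
        red.append(min(candidates, key=lambda n: mismatch(red + [n])))
    return red, target


# Hypothetical toy covering decision system.
U = {1, 2, 3, 4, 5, 6}
family = {
    "C1": [{1, 2, 3}, {3, 4, 5, 6}],
    "C2": [{1, 2}, {2, 3, 4}, {4, 5, 6}],
    "C3": [{1, 2, 3, 4, 5, 6}],
}
classes = [{1, 2, 3}, {4, 5, 6}]
alphas = [0.6, 0.6]

red, target = greedy_reduct(U, family, classes, alphas)
print(red, sorted(target))
```

The loop always terminates, because once every covering has been added the positive region coincides with the target by construction.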

Example 4.5. Continuing Example 3.1, let the loss functions for different decision classes be different (shown in Table 4).

So, we have α = (0.64, 0.58), β = (0.42, 0.42). Actually, the loss function for D₁ is the same as the one for X in Example 3.1. Thus, the conditional probability P (D₁|Δ_{x_i}) is the same as P (X|Δ_{x_i}), i = 1, 2, . . . , 9, and the conditional probabilities for D₂ are as follows: P (D₂|Δ_{x₁}) = P (D₂|Δ_{x₃}) = P (D₂|Δ_{x₆}) = P (D₂|Δ_{x₉}) = 0.5, P (D₂|Δ_{x₂}) = P (D₂|Δ_{x₄}) = P (D₂|Δ_{x₅}) = 1 and P (D₂|Δ_{x₇}) = P (D₂|Δ_{x₈}) = 0.

The corresponding fuzzy probabilistic regions of D₁ and D₂ are shown in Table 5.

The objects x₁, x₃, x₆ and x₉ have P (D₁|Δ_x) = P (D₂|Δ_x) = 0.5, which lies strictly between βⁱ and αⁱ for both classes, so they fall in the boundary region. Then, $\begin{matrix} {POS}_{Cov (Δ)}^{(0.64, 0.58)} (U / D) = \{x_{2}, x_{4}, x_{5}, x_{7}, x_{8}\}, \\ {BND}_{Cov (Δ)}^{((0.42, 0.42), (0.64, 0.58))} (U / D) = \{x_{1}, x_{3}, x_{6}, x_{9}\}, \\ {NEG}_{Cov (Δ)}^{(0.42, 0.42)} (U / D) = \emptyset . \end{matrix}$ And, $\begin{matrix} {\underline{Δ}}_{Cov (Δ)}^{(0.64, 0.58)} (U / D) & = & \{x_{2}, x_{4}, x_{5}, x_{7}, x_{8}\}, \\ {\bar{Δ}}_{Cov (Δ)}^{(0.42, 0.42)} (U / D) & = & U . \end{matrix}$

5 An example

Let us consider a house evaluation problem. Suppose U = {x₁, x₂, . . . , x₁₀} is a set of ten houses, Δ = {structure, color, price, surroundings} is a set of conditional attributes and D = {sale, further-evaluation} is a set of decision attributes. The values of “price” are {high, middle, low}, the values of “structure” are {reasonable, ordinary, poor}, the values of “color” are {good, bad}, and the values of “surroundings” are {quiet, slightly noisy, noisy, very noisy}. The evaluation results are independent of each other. The evaluations are shown in Table 6.

As was noted in [2], a covering can be generated by set-valued attributes. From the set-valued evaluations above we can induce a family of coverings Δ = {C_i : i = 1, 2, 3, 4}: $\begin{matrix} \text{Price: } C_{1} & = & \{\{x_{1}, x_{2}, x_{3}, x_{4}, x_{6}, x_{7}, x_{8}, x_{9}, x_{10}\}, \{x_{3}, x_{4}, x_{6}, x_{7}\}, \{x_{3}, x_{4}, x_{5}, x_{6}, x_{7}\}\}, \\ \text{Structure: } C_{2} & = & \{\{x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}\}, \{x_{6}, x_{7}, x_{8}, x_{9}\}, \{x_{10}\}\}, \\ \text{Color: } C_{3} & = & \{\{x_{1}, x_{2}, x_{3}, x_{6}, x_{8}, x_{9}, x_{10}\}, \{x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{9}\}\}, \\ \text{Surroundings: } C_{4} & = & \{\{x_{1}, x_{2}, x_{3}, x_{6}\}, \{x_{6}, x_{7}, x_{9}\}, \{x_{6}, x_{8}, x_{9}, x_{10}\}, \{x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}\}\} . \end{matrix}$

The decision D is given as $U / D = \{\{x_{1}, x_{2}, x_{3}, x_{6}\}, \{x_{4}, x_{5}, x_{7}, x_{8}, x_{9}, x_{10}\}\} .$

Then,

Δ_{x₁} = {x₁, x₂, x₃, x₆}; Δ_{x₂} = {x₂, x₃, x₆}; Δ_{x₃} = {x₃, x₆}; Δ_{x₄} = {x₃, x₄, x₆, x₇}; Δ_{x₅} = {x₃, x₄, x₅, x₆, x₇}; Δ_{x₆} = {x₆}; Δ_{x₇} = {x₆, x₇}; Δ_{x₈} = {x₆, x₈, x₉}; Δ_{x₉} = {x₆, x₉}; Δ_{x₁₀} = {x₁₀}.

We use the loss functions shown in Table 4, which differ between decision classes, so α = (0.64, 0.58), β = (0.42, 0.42).

The conditional probabilities for Δ_x with respect to D₁ are computed as follows

P (D₁|Δ_{x₁}) = P (D₁|Δ_{x₂}) = P (D₁|Δ_{x₃}) = P (D₁|Δ_{x₆}) = 1, $P (D_{1} | Δ_{x_{4}}) = P (D_{1} | Δ_{x_{7}}) = P (D_{1} | Δ_{x_{9}}) = \frac{1}{2}$, $P (D_{1} | Δ_{x_{5}}) = \frac{2}{5}$, $P (D_{1} | Δ_{x_{8}}) = \frac{1}{3}$, P (D₁|Δ_{x₁₀}) = 0.

The conditional probabilities for Δ_x with respect to D₂ are computed as follows

P (D₂|Δ_{x₁}) = P (D₂|Δ_{x₂}) = P (D₂|Δ_{x₃}) = P (D₂|Δ_{x₆}) = 0, $P (D_{2} | Δ_{x_{4}}) = P (D_{2} | Δ_{x_{7}}) = P (D_{2} | Δ_{x_{9}}) = \frac{1}{2}$, $P (D_{2} | Δ_{x_{5}}) = \frac{3}{5}$, $P (D_{2} | Δ_{x_{8}}) = \frac{2}{3}$, $P (D_{2} | Δ_{x_{10}}) = 1$.

The corresponding fuzzy probabilistic regions of D₁ and D₂ are shown in Table 7.

Then, $\begin{matrix} {POS}^{(0.64, 0.58)} (U / D) = \{x_{1}, x_{2}, x_{3}, x_{5}, x_{6}, x_{8}, x_{10}\}, \\ {BND}^{((0.42, 0.42), (0.64, 0.58))} (U / D) = \{x_{4}, x_{7}, x_{9}\}, \\ {NEG}^{(0.42, 0.42)} (U / D) = \emptyset . \end{matrix}$

So, $\begin{matrix} {\underline{Δ}}^{(0.64, 0.58)} (U / D) = \{x_{1}, x_{2}, x_{3}, x_{5}, x_{6}, x_{8}, x_{10}\}, \\ {\bar{Δ}}^{(0.42, 0.42)} (U / D) = U . \end{matrix}$ The results show that houses x₁, x₂, x₃, x₅, x₆, x₈, x₁₀ are most probably on sale under this scheme, with a possibility not less than the corresponding threshold in α = (0.64, 0.58), while no houses are less likely to be on sale. We are not sure about x₄, x₇ and x₉, which need further evaluation.
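The regions of U/D in this example can be recomputed with the union formulas of Section 4. The sketch below encodes house x_i as the integer i and takes the thresholds α = (0.64, 0.58) and β = (0.42, 0.42) as stated above; the helper names are illustrative assumptions.

```python
from fractions import Fraction
from functools import reduce

# Data of the house example; the integer i stands for house x_i.
U = set(range(1, 11))
family = [
    [{1, 2, 3, 4, 6, 7, 8, 9, 10}, {3, 4, 6, 7}, {3, 4, 5, 6, 7}],  # C1, price
    [{1, 2, 3, 4, 5, 6, 7}, {6, 7, 8, 9}, {10}],                    # C2, structure
    [{1, 2, 3, 6, 8, 9, 10}, {2, 3, 4, 5, 6, 7, 9}],                # C3, color
    [{1, 2, 3, 6}, {6, 7, 9}, {6, 8, 9, 10}, {2, 3, 4, 5, 6, 7}],   # C4, surroundings
]
classes = [{1, 2, 3, 6}, {4, 5, 7, 8, 9, 10}]  # D1, D2
alphas, betas = (0.64, 0.58), (0.42, 0.42)


def delta(x):
    """Delta_x: intersection of every block, in every covering, containing x."""
    return reduce(set.intersection, [K for C in family for K in C if x in K])


def class_regions(D, alpha, beta):
    """(POS, BND, NEG) of one decision class D_i."""
    P = {x: Fraction(len(D & delta(x)), len(delta(x))) for x in U}
    pos = {x for x in U if P[x] >= alpha}
    neg = {x for x in U if P[x] <= beta}
    return pos, U - pos - neg, neg


per_class = [class_regions(D, a, b) for D, a, b in zip(classes, alphas, betas)]

POS = set().union(*(pos for pos, _, _ in per_class))
BND = (set().union(*(bnd for _, bnd, _ in per_class))
       - set().union(*(pos | neg for pos, _, neg in per_class)))
NEG = U - (POS | BND)

print(sorted(POS), sorted(BND), sorted(NEG))
```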

Based on Algorithm 1, we can compute Red {Δ, D} = {{C₁, C₃, C₄}, {C₁, C₂, C₃}, {C₁, C₂, C₄}}.

6 Conclusion

Using the measure of roughness in covering rough sets, $P (X | Δ_{x}) = \frac{| X \cap Δ_{x} |}{| Δ_{x} |}$, this paper investigates probabilistic covering decision systems. A loss function is defined to state how costly each action is, and the final decision selects the action for which the overall cost is minimum. A pair of threshold parameters α and β generated by the loss functions is used to define the lower and upper approximations. These two parameters then generate three pairwise disjoint regions, from which the corresponding decision rules are obtained automatically. Three-way decisions based on covering rough sets mainly study how to make a decision for each of the three regions of a set. We give an algorithm to find an α-probabilistic reduct set of Δ with respect to U/D. Our future work will concentrate on examining possible numerical characterizations in the framework of three-way decisions based on covering rough sets and on developing other measures of roughness in covering rough sets to study such decisions.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant nos. 11571010 and 61179038).

References

1. Bonikowski, Bryniarski and Wybraniec, Extensions and intentions in the rough set theory, Information Sciences 107 (1998), 149–167.

2. Chen D.G., Wang C.Z. and Hu Q.H., A new approach to attribute reduction of consistent and inconsistent covering decision systems with covering rough set, Information Sciences 177(17) (2007), 3500–3518.

3. Duda R.O. and Hart P.E., Pattern classification and scene analysis, Wiley, New York, 1973.

4. Goguen J.A., L-Fuzzy sets, Journal of Mathematical Analysis and Applications 18 (1967), 145–174.

5. Hu B.Q., Three-way decisions space and three-way decisions, Information Sciences 281 (2014), 21–52.

6. Hu B.Q., Three-way decisions spaces based on partially ordered sets and three-way decisions based on hesitant fuzzy sets, submitted to Knowledge-Based Systems.

7. Q.H., Yu D.R. and Xie Z.X., Neighborhood classifiers, Expert Systems with Applications 34(2) (2006), 866–876.

8. Katzberg J.D. and Ziarko, Variable precision rough set with asymmetric bounds, Workshops in Computing (1994), 167–177.

9. Latkowski, On decomposition for incomplete data, Fundamenta Informaticae 54 (2003), 1–16.

10. H.X. and Zhou X.Z., Risk decision making based on decision-theoretic rough set: A three-way view decision model, International Journal of Computational Intelligence Systems 4 (2011), 1–11.

11. Liu, Li and Ruan, Probabilistic model criteria with decision-theoretic rough set, Information Sciences 181 (2011), 3709–3722.

12. Liu, Yao Y.Y. and Li T.R., Three-way investment decisions with decision-theoretic rough set, International Journal of Computational Intelligence Systems 4 (2011), 66–74.

13. Liu G.L. and Sai, A comparison of two types of rough set induced by coverings, International Journal of Approximate Reasoning 50(3) (2009), 521–528.

14. Lurie J.D. and Sox H.C., Principles of medical decision making, Spine 24 (1999), 493–498.

15. Pawlak, Rough set, International Journal of Computer Information Science 11(5) (1982), 341–356.

16. Pawlak, Wong S.K.W. and Ziarko, Rough set: Probabilistic versus deterministic approach, International Journal of Man-Machine Studies 29 (1988), 81–95.

17. Shi Z.H. and Gong Z.T., The further investigation of covering based rough set: uncertainty characterization, similarity measure and generalized models, Information Sciences 180(19) (2010), 3745–3763.

18.

Sun

, Ma

and Zhao

, Decision-theoretic rough fuzzy set model and application, Information Sciences 283 (2014), 180–196.

19.

Ślęzak

, Rough set, Bayes factor, LNCS Transactions on Rough set III, LNCS 3400 (2005), 202–229.

20.

Ślęzak

and Ziarko

, The investigation of the Bayesian rough set model, International Journal of ApproximateReasoning 40 (2005), 81–91.

21.

Wang

C.Z.

, He

, Chen

D.G.

and Hu

Q.H.

, A novel method for attribute reduction of covering decision systems, Information Sciences 254 (2014), 181–196.

22.

W.H.

and Zhang

W.X.

, Measuring roughness of generalized rough set induced by a covering, Fuzzy Sets and Systems 158 (2007), 2443–2455.

23.

Yang

X.P.

and Yao

J.T.

, Modelling multi-agent three-way decisions with decision-theoretic rough set, Fundamenta Informaticae 115 (2012), 157–171.

24.

Yao

Y.Y.

, Information granulation and approximation in a decision-theoretical model of rough set, in: Polkowski

, Pal

S.K.

and Skowron

, (Eds),Roughneuro Computing: Techniques for Computing with Words, Springer, Berlin, 2003, pp. 491–516.

25.

Yao

Y.Y.

, Decision-theoretic rough set models, in: Rough Sets and Knowledge Technology, Second International Conference, RSKT 2007, Proceedings, LNAI 4481, 2007, pp. 1–12.

26.

Yao

Y.Y.

, Probabilistic rough set approximations, International Journal of Approximate Reasoning 49(2) (2008), 255–271.

27.

Yao

Y.Y.

, Three-way decision: An interpretation of rules in rough set theory, in: Rough set and Knowledge and Data Engineering 21(7) (2009), 1014–1026.

28.

Yao

Y.Y.

, Three-way decisions with probabilistic rough set, Information Sciences 180(3) (2010), 341–353.

29.

Yao

Y.Y.

, The superiority of three-way decisions in probabilistic rough set models, Information Sciences 181(6) (2011), 1080–1096.

30.

Yao

Y.Y.

, An outline of a theory of three-way decisions, in: Yao

, Yang

, Slowinski

, Greco

, Li

, Mitra

and Polkowski

, (Eds.), Proceedings of the 8th International RSCTC Conference, LNCS (LNAI) 7413 2012, pp. 1–17.

31.

Yao

Y.Y.

, Wong

S.K.M.

and Lingras

, A decision-theoretic rough set model, in: Ras

Z.W.

, Zemankova

and Emrich

M.L.

, (Eds.), Methodologies for Intelligent Systems, 5, North-Holland, New York, 1990, pp. 17–24.

32.

Yao

Y.Y.

and Wong

S.K.W.

, A decision theoretic framework for approximating concepts, International Journal of Manmachine Studies 37(6) (1992), 793–809.

33.

Yun

Z.Q.

, Ge

and Bai

X.L.

, Axiomatization and conditions for neighborhoods in a covering to from a partition, Information Sciences 181(9) (2011), 1735–1740.

34.

Zadeh

L.A.

, Fuzzy sets, Information and Control 8 (1965), 338–353.

35.

Ziarko

, Variable precision rough set model, Journal of Computer and System Science 46 (1993), 39–59.

36.

Zhang

Y.L.

, Li

J.J.

and Wu

W.Z.

, On axiomatic characterizations of three pairs of covering based approximation operators, Information Sciences 180(2) (2010), 274–287.

37.

Zhang

Y.L.

and Luo

M.K.

, On minimization of axiom sets characterizing covering-based approximation operators, Information Sciences 181(14) (2011), 3032–3042.

38.

Zhao

X.R.

and Hu

B.Q.

, Fuzzy probabilistic rough set and their corresponding three-way decisions, submitted to Knowledge-Based Systems.

39.

Zhou

, Wu

W.Z.

and Zhang

W.X.

, On characterization of intuitionistic fuzzy rough set based on intuitionistic fuzzy implicators, Information Sciences 179 (2009), 883–898.

40.

Zhua

and Wen

Q.Y.

, Entropy and co-entropy of a covering approximation space, International Journal of Approximate Reasoning 53 (2012), 528–540.

41.

Zhu

, Topological approaches to covering rough set, Information Sciences 177(6) (2007), 1499–1508.

42.

Zhu

, Relationship between generalized rough set based on binary relation and covering, Information Sciences 179(3) (2009), 210–225.

43.

Zhu

and Wang

F.Y.

, Reduction and axiomization of covering generalized rough set, Information Sciences 152 (2003), 217–230.

44.

Zhu

and Wang

F.Y.

, On three types of covering-based rough set, IEEE Transactions on Knowledge and Data Engineering 19(8) (2007), 1131–1144.