Some Properties of Residual R-norm Entropy and Divergence Measures

Abstract

Shannon entropy plays an important role in measuring the expected uncertainty contained in the probability density function about the predictability of an outcome of a random variable. However, in certain systems Shannon entropy may not be appropriate, and only some generalized version of it is suitable. One such generalization, due to Boekee and Lubee[1], is called R-norm entropy. Recently, Nanda and Das[2] studied the R-norm entropy and its divergence measure in the context of used items, which is useful in reliability modelling. In the present article, we further study R-norm entropy and divergence in the context of weighted models. We also extend these measures to the conditionally specified and conditional survival models, and study their properties.

Keywords

reliability measures; weighted models; conditionally specified models; stochastic ordering

1 Introduction

Let f be the probability density function of an absolutely continuous non-negative random variable (rv) X. A classical measure of uncertainty for X is the differential entropy, known as Shannon entropy, defined as

H (f) = - E (ln f (X)) = - \int_{0}^{\infty} (ln f (x)) f (x) dx,

(1.1)

where ‘ln’ means natural logarithm. If a unit has survived to an age t, then $H (f)$ is not a useful tool for measuring the uncertainty about the remaining lifetime of the unit. Accordingly, Ebrahimi[3] introduced a dynamic measure of uncertainty based on the residual lifetime $(X - t | X > t)$ , defined by

H (f; t) = - \int_{t}^{\infty} (ln \frac{f (x)}{\overset{̅}{F} (t)}) \frac{f (x)}{\overset{̅}{F} (t)} dx,

(1.2)

where $\overset{̅}{F} (t) = P (X > t) = 1 - F (t)$ is the survival function of the rv X. Once the unit has survived up to time t, Equation (1.2) measures the expected uncertainty contained in the conditional density of $X - t$ , given $X > t$ , about the predictability of the remaining lifetime of the unit.

Several generalizations of the classical Shannon entropy are available in the literature, obtained by introducing additional parameters[5],[6],[7]. All these entropies reduce to Equation (1.1) when the additional parameters tend to one. These generalized information measures have important properties, such as smoothness and a large dynamic range under certain conditions, that make them applicable in practice. Pharwaha and Singh[8] showed that non-Shannon measures can be used to determine the randomness of mammograms because they have a higher dynamic range than Shannon's entropy over a variety of scattering conditions. Non-Shannon measures are also applicable in estimating scatter density and regularity[9]. A generalization of Shannon entropy useful in pattern recognition and coding theory, called R-norm entropy, is due to Boekee and Lubee[1] in the discrete case. It has simple relationships with other generalizations of Shannon entropy, namely Renyi's entropy of order α[5] and information of type β[10]. If we define an appropriate measure for the average length of codewords, then coding theorems can be obtained using R-norm entropy. For more recent works and applications of R-norm entropy in coding theory, one could refer to Kumar and Choudhary[11] and Kumar, Ram, and Gupta[12]. A continuous version of R-norm entropy was introduced by Nanda and Das[2], given as

H_{R} (f) = \frac{R}{R - 1} [1 - {(\int_{0}^{\infty} f^{R} (x) dx)}^{\frac{1}{R}}],

(1.3)

which is a real-valued function for $R \in (0, \infty)$ , $R \neq 1$ . When $R \to 1$ , Equation (1.3) reduces to Equation (1.1).
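The limit $R \to 1$ can be illustrated numerically. The sketch below evaluates Equation (1.3) for an exponential density $f (x) = λ e^{- λ x}$ , for which $\int_{0}^{\infty} f^{R} (x) dx = λ^{R - 1} / R$ and $H (f) = 1 - ln λ$ . The exponential example, the function names, and the rate value are assumptions chosen here for illustration; they are not taken from Nanda and Das[2].

```python
import math


def r_norm_entropy_exponential(lam: float, R: float) -> float:
    """R-norm entropy (1.3) of an exponential density with rate lam.

    Uses the closed form  int_0^inf f(x)**R dx = lam**(R - 1) / R,
    valid for every R > 0 (illustrative example, not from the paper).
    """
    if R <= 0 or R == 1:
        raise ValueError("R must be positive and different from 1")
    integral = lam ** (R - 1) / R
    return R / (R - 1) * (1.0 - integral ** (1.0 / R))


def shannon_entropy_exponential(lam: float) -> float:
    """Shannon (differential) entropy (1.1) of the same density: 1 - ln(lam)."""
    return 1.0 - math.log(lam)


if __name__ == "__main__":
    lam = 2.0  # hypothetical rate
    for R in (2.0, 1.1, 1.01, 1.001, 0.999):
        print(R, r_norm_entropy_exponential(lam, R))
    print("Shannon:", shannon_entropy_exponential(lam))
```

As R moves towards one from either side, the printed R-norm values should approach the Shannon value.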

Nanda and Das[2] introduced the residual R-norm entropy, given by

H_{R} (f; t) = \frac{R}{R - 1} [1 - {(\int_{t}^{\infty} {(\frac{f (x)}{\overset{̅}{F} (t)})}^{R} dx)}^{\frac{1}{R}}],

(1.4)

for $R \in (0, \infty), R \neq 1$ . Note that as $R \to 1$ , Equation (1.4) approaches Equation (1.2).
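Equation (1.4) can also be evaluated by direct numerical integration. The following is a minimal sketch, assuming a simple midpoint rule after mapping $(t, \infty)$ to the unit interval; the helper names `tail_integral` and `residual_r_norm_entropy` and the exponential test case are hypothetical choices made here, not part of the original work.

```python
import math
from typing import Callable


def tail_integral(func: Callable[[float], float], t: float, n: int = 20000) -> float:
    """Approximate the integral of func over (t, inf) with the midpoint rule.

    The substitution x = t + u / (1 - u), u in (0, 1), maps the infinite
    range to the unit interval, with dx = du / (1 - u)**2.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        x = t + u / (1.0 - u)
        total += func(x) / (1.0 - u) ** 2
    return total * h


def residual_r_norm_entropy(pdf, sf, t: float, R: float) -> float:
    """Residual R-norm entropy (1.4), computed numerically."""
    integral = tail_integral(lambda x: (pdf(x) / sf(t)) ** R, t)
    return R / (R - 1) * (1.0 - integral ** (1.0 / R))


if __name__ == "__main__":
    lam, R = 2.0, 1.5  # hypothetical values
    pdf = lambda x: lam * math.exp(-lam * x)
    sf = lambda x: math.exp(-lam * x)
    for t in (0.0, 1.0, 3.0):
        print(t, residual_r_norm_entropy(pdf, sf, t, R))
```

For the memoryless exponential case the printed values should agree for every t, up to integration error, since the residual density does not depend on the age t.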

Measures of divergence play an important role in measuring the distance between two populations or distribution functions. Among the different discrimination measures available in the literature, a popular one is the Kullback–Leibler (KL) divergence measure. Let X and $Y$ be two absolutely continuous rvs with the common support $S = (l, \infty)$ for $l \geq 0$ . Denote by f, $F$ and $\overset{̅}{F}$ the probability density function (pdf), the cumulative distribution function (cdf) and the survival function (sf) of X, respectively, and by $g$ , $G$ and $\overset{̅}{G}$ the corresponding functions of $Y$ . As an information distance between X and $Y$ , Kullback and Leibler[13] proposed a directed divergence (also known as information divergence, information gain, relative entropy or discrimination measure) defined by

H (f, g) = \int_{l}^{\infty} f (x) ln \frac{f (x)}{g (x)} dx .

(1.5)

Note that if $f = g$ (a.e.), then $H (f, g) = 0$ . The more the pdf $g$ differs from the pdf f, the larger $H (f, g)$ becomes. For used items, the KL divergence between f and $g$ at time t[4] is of the form

H (f, g, t) = \int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} (ln \frac{f (x) / \overset{̅}{F} (t)}{g (x) / \overset{̅}{G} (t)}) dx .

(1.6)

For $t = 0$ , Equation (1.6) reduces to Equation (1.5) with $l = 0$ . There are different generalizations of the KL divergence; a recent one, due to Nanda and Das[2], is based on the R-norm entropy and its residual form. The residual R-norm entropy divergence is given by

H_{R} (f, g; t) = \frac{R}{R - 1} [{(\int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} {(\frac{f (x) / \overset{̅}{F} (t)}{g (x) / \overset{̅}{G} (t)})}^{R - 1} dx)}^{\frac{1}{R}} - 1] .

(1.7)

For more properties of the residual R-norm entropy Equation (1.4) and its divergence Equation (1.7), we refer to Nanda and Das.[2]
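As a small worked illustration of Equation (1.7), consider two exponential lifetimes with rates λ and μ. The residual densities are then free of t, and the integral in Equation (1.7) equals $λ^{R} μ^{1 - R} / (λ R - μ (R - 1))$ whenever $λ R > μ (R - 1)$ . This closed form was worked out for the sketch below and is not quoted from Nanda and Das[2]; the function name is hypothetical.

```python
def residual_divergence_exponential(lam: float, mu: float, R: float) -> float:
    """Residual R-norm divergence (1.7) between Exp(lam) and Exp(mu).

    Closed form derived for this sketch: the integral in (1.7) equals
    lam**R * mu**(1 - R) / (lam*R - mu*(R - 1)), independent of t,
    provided lam*R > mu*(R - 1).
    """
    if R <= 0 or R == 1:
        raise ValueError("R must be positive and different from 1")
    denom = lam * R - mu * (R - 1)
    if denom <= 0:
        raise ValueError("the integral in (1.7) diverges for these parameters")
    integral = lam ** R * mu ** (1 - R) / denom
    return R / (R - 1) * (integral ** (1.0 / R) - 1.0)


if __name__ == "__main__":
    print(residual_divergence_exponential(1.0, 1.0, 1.5))  # identical densities
    print(residual_divergence_exponential(1.0, 2.0, 1.5))
    print(residual_divergence_exponential(1.0, 2.0, 0.5))
```

The first call should return zero because f = g; the other two should be positive, in line with the interpretation of Equation (1.7) as a distance-like quantity.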

The concept of weighted distributions is usually considered in connection with modelling statistical data, where the usual practice of employing standard distributions is not found appropriate in some cases. Associated with a rv X with probability density function f and a non-negative real function $w$ , we can define the weighted random variable $X_{w}$ with density function

f^{w} (t) = \frac{w (t) f (t)}{Ew (X)},

(1.8)

where we assume $0 < E (w (X)) < \infty$ . When $w (t) = t$ , $X_{w}$ is called the length (or size) biased rv, and it is denoted by $X^{*}$ . For recent works on weighted distributions, we refer the reader to Navarro, Ruiz, and del Aguila,[14] Bartoszewicz,[15] Kumar and Taneja,[16] and Sunoj and Sreejith.[17]
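The construction in Equation (1.8) is easy to reproduce numerically. The sketch below builds a weighted density from a given pdf and weight function, estimating $E (w (X))$ by a midpoint rule on a support assumed to be $(0, \infty)$ by default. The helper names and the exponential example are hypothetical; for an exponential pdf with rate λ and $w (x) = x$ , the length-biased density should be $λ^{2} x e^{- λ x}$ .

```python
import math
from typing import Callable


def tail_integral(func: Callable[[float], float], t: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of func over (t, inf).

    Uses x = t + u / (1 - u), u in (0, 1), so that dx = du / (1 - u)**2.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        x = t + u / (1.0 - u)
        total += func(x) / (1.0 - u) ** 2
    return total * h


def weighted_pdf(pdf, w, lower: float = 0.0):
    """Return the weighted density (1.8): x -> w(x) f(x) / E[w(X)]."""
    mean_w = tail_integral(lambda x: w(x) * pdf(x), lower)
    if not (0.0 < mean_w < math.inf):
        raise ValueError("E[w(X)] must be positive and finite")
    return lambda x: w(x) * pdf(x) / mean_w


if __name__ == "__main__":
    lam = 2.0  # hypothetical rate
    length_biased = weighted_pdf(lambda x: lam * math.exp(-lam * x), lambda x: x)
    x = 1.3
    print(length_biased(x), lam ** 2 * x * math.exp(-lam * x))
```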

It is inherently difficult to visualize bivariate distributions. Unlike marginal or joint densities, conditional densities can be easily visualized. A variety of transformations are used to characterize the joint distribution function; the joint characteristic function, the joint moment generating function, and the joint hazard function are some of them. They are well defined and determine the joint distribution function uniquely. Sometimes one can identify the joint distribution by specifying one of the marginals and a conditional density. That is, the knowledge of one marginal density, say $f_{X}$ (or $f_{Y}$ ), and the conditional density of $(Y | X = x)$ , say $f_{Y | X}$ (or the conditional density of $(X | Y = y)$ , say $f_{X | Y}$ ), will completely specify the joint density function $f_{X, Y}$ . Alternatively, one may specify the distribution solely in terms of the features of two families of conditional densities. This approach is called conditional specification of the joint distribution[18]. Another type of conditioning popular in the literature gives the conditional survival models, in which the conditioning is on component survival, that is, on events such as $[X > x]$ and $[Y > y]$ . These two types of models are often useful in two-component reliability systems where the operational status of one component is known in advance. For more recent works on conditionally specified and conditional survival models, we refer to Arnold[19], Arnold, Castillo, and Sarabia[18], Sunoj and Sankaran[20], Navarro and Sarabia[21], Sunoj and Linu[22],[23], and Navarro, Sunoj, and Linu[24],[25], and the references therein.

In the present article, the residual R-norm entropy in Equation (1.4) and its relative form in Equation (1.7) are extended to weighted models, and some bounds based on them are obtained. These two measures are also extended to the conditionally specified and conditional survival models. For the conditionally specified models, the concept of proportional hazard rate (PHR) models and the R-norm divergence in the context of weighted distributions are studied to obtain some new properties. Moreover, bounds for these entropy measures in terms of hazard rates are obtained using the likelihood ratio (LR) ordering.

2 Weighted Residual R-norm Entropy

Using Equations (1.4) and (1.8), the residual R-norm entropy for weighted rv $X_{w}$ is given by

H_{R} (f^{w}; t) = \frac{R}{R - 1} [1 - {(\int_{t}^{\infty} \frac{f_{w}^{R} (x)}{{\overset{̅}{F}}_{w}^{R} (t)} dx)}^{\frac{1}{R}}] .

(2.1)

The corresponding weighted residual R-norm divergence measure based on Equation (1.7) is of the form

H_{R} (f, f^{w}; t) = \frac{R}{R - 1} [{(\int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} {(\frac{f (x) / \overset{̅}{F} (t)}{f_{w} (x) / {\overset{̅}{F}}_{w} (t)})}^{R - 1} dx)}^{\frac{1}{R}} - 1] .

(2.2)

For certain probability distributions, the computation of $H_{R} (f^{w}; t)$ becomes difficult, and in such cases bounds on the measure in terms of other known measures are very useful. The following inequality provides a bound for $H_{R} (f^{w}; t)$ , which is an upper bound when $R > 1$ and a lower bound when $0 < R < 1$ .

Theorem 2.1. If $w (x)$ is increasing in $x$ , then for $R \neq 1$

[1 - (\frac{R - 1}{R}) H_{R} (f^{w}; t)] \geq \frac{h_{X_{w}} (t)}{h_{X} (t)} [1 - (\frac{R - 1}{R}) H_{R} (f; t)],

(2.3)

where $h_{X} (t) = \frac{f (t)}{\overset{̅}{F} (t)}$ and $h_{X_{w}} (t) = \frac{f_{w} (t)}{{\overset{̅}{F}}_{w} (t)}$ denote the failure (hazard) rate functions of X and $X_{w}$ , respectively.

Proof: Since $w (x)$ is increasing in $x$ , we have $w (x) \geq w (t) \forall x > t$ . Therefore,

\begin{matrix} [1 - (\frac{R - 1}{R}) H_{R} (f^{w}; t)] & = & {(\int_{t}^{\infty} \frac{f_{w}^{R} (x)}{{\overset{̅}{F}}_{w}^{R} (t)} dx)}^{\frac{1}{R}} \\ & = & {(\int_{t}^{\infty} \frac{w^{R} (x) f^{R} (x)}{{[E (w (X) | X > t)]}^{R} {\overset{̅}{F}}^{R} (t)} dx)}^{\frac{1}{R}} \\ & \geq & \frac{w (t)}{E (w (X) | X > t)} {(\int_{t}^{\infty} \frac{f^{R} (x)}{{\overset{̅}{F}}^{R} (t)} dx)}^{\frac{1}{R}} \\ & = & \frac{h_{X_{w}} (t)}{h_{X} (t)} [1 - (\frac{R - 1}{R}) H_{R} (f; t)] . \end{matrix}

Example 2.1. Consider the Pareto I distribution with pdf $f (x) = c k^{c} x^{- c - 1}, x > k, k > 0, c > 1$ , and the weight $w (x) = x$ , which is increasing in $x$ . Theorem 2.1 is then easily illustrated.
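A short numerical check of this example may help. For the Pareto I model with $w (x) = x$ , the weighted rv is again Pareto I with shape $c - 1$ , and both sides of inequality (2.3) have closed forms for $t \geq k$ . The closed forms in the docstring were derived for this sketch and are not quoted from the source; the function name and parameter values are hypothetical.

```python
def pareto_theorem_2_1(c: float, t: float, R: float):
    """Return (lhs, rhs) of inequality (2.3) for Pareto I with w(x) = x.

    Closed forms derived for this sketch (t >= k, c > 1, c*R > 1):
      int_t^inf (f(x)/Fbar(t))**R dx       = c**R * t**(1-R) / ((c+1)*R - 1)
      int_t^inf (f_w(x)/Fbar_w(t))**R dx   = (c-1)**R * t**(1-R) / (c*R - 1)
    The scale k cancels and does not appear.
    """
    if c <= 1 or R <= 0 or R == 1 or c * R <= 1:
        raise ValueError("need c > 1, R > 0, R != 1 and c*R > 1")
    lhs = ((c - 1) ** R * t ** (1 - R) / (c * R - 1)) ** (1.0 / R)
    base = (c ** R * t ** (1 - R) / ((c + 1) * R - 1)) ** (1.0 / R)
    hazard_ratio = ((c - 1) / t) / (c / t)  # h_{X_w}(t) / h_X(t)
    return lhs, hazard_ratio * base


if __name__ == "__main__":
    for c, t, R in [(3.0, 2.0, 2.0), (3.0, 2.0, 0.5), (1.5, 5.0, 1.2)]:
        lhs, rhs = pareto_theorem_2_1(c, t, R)
        print(c, t, R, lhs, rhs, lhs >= rhs)
```

In each case the left side should be at least as large as the right side, because $c R - 1 < (c + 1) R - 1$ .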

Definition 2.1. If X and $Y$ have probability density functions f and $g$ , respectively, then X is said to be less than $Y$ in the LR order (denoted by $X \leq_{LR} Y$ ), if $\frac{g}{f}$ is increasing in the union of their supports.

We now obtain a bound for the residual R-norm divergence using the LR order. It gives a simple upper (lower) bound for $H_{R} (f, g; t)$ in terms of the hazard functions of the rvs X and $Y$ . Thus, bounds for $H_{R} (f, g; t)$ can be obtained easily from the reliability functions of X and $Y$ .

Theorem 2.2. If $X \leq_{LR} Y$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{R} (f, g; t)] \leq (\geq) {(\frac{h_{X} (t)}{h_{Y} (t)})}^{\frac{R - 1}{R}} .

Proof: Since $X \leq_{LR} Y$ , $\frac{f (x)}{g (x)}$ is decreasing in $x$ . Therefore, $\frac{f (x)}{g (x)} \leq \frac{f (t)}{g (t)}$ for every $x > t$ . For $R > 1$ ,

\begin{matrix} [1 + (\frac{R - 1}{R}) H_{R} (f, g; t)] & = & {(\int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} {(\frac{f (x) / \overset{̅}{F} (t)}{g (x) / \overset{̅}{G} (t)})}^{R - 1} dx)}^{\frac{1}{R}} \\ & \leq & {(\int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} {(\frac{h_{X} (t)}{h_{Y} (t)})}^{R - 1} dx)}^{\frac{1}{R}} \\ & = & {(\frac{h_{X} (t)}{h_{Y} (t)})}^{\frac{R - 1}{R}}, \end{matrix}

which proves the result. The case $0 < R < 1$ is similar.

Corollary 2.1. If $w (x)$ is increasing in $x$ (equivalently, $X \leq_{LR} X_{w}$ ), then for $R > 1 (0 < R < 1)$

(1 + (\frac{R - 1}{R}) H_{R} (f, f^{w}; t)) \leq (\geq) {(\frac{h_{X} (t)}{h_{X_{w}} (t)})}^{\frac{R - 1}{R}} .

Example 2.2. Corollary 2.1 follows easily from Example 2.1.
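Corollary 2.1 can be checked in the same Pareto I setting with $w (x) = x$ . A short calculation, done for this sketch and not quoted from the source, gives the integral in Equation (2.2) as $c {(c / (c - 1))}^{R - 1} / (c + R - 1)$ , which does not depend on t, and the hazard ratio $h_{X} (t) / h_{X_{w}} (t) = c / (c - 1)$ . The function name is hypothetical.

```python
def pareto_corollary_2_1(c: float, R: float):
    """Return (lhs, rhs) of Corollary 2.1 for Pareto I with w(x) = x.

    Closed form derived for this sketch: the integral in (2.2) equals
    c * (c / (c - 1))**(R - 1) / (c + R - 1), free of t and of the scale k.
    """
    if c <= 1 or R <= 0 or R == 1:
        raise ValueError("need c > 1, R > 0 and R != 1")
    integral = c * (c / (c - 1)) ** (R - 1) / (c + R - 1)
    lhs = integral ** (1.0 / R)                # 1 + ((R-1)/R) H_R(f, f^w; t)
    rhs = (c / (c - 1)) ** ((R - 1) / R)       # (h_X / h_{X_w})**((R-1)/R)
    return lhs, rhs


if __name__ == "__main__":
    print(pareto_corollary_2_1(3.0, 2.0))   # R > 1: lhs <= rhs expected
    print(pareto_corollary_2_1(3.0, 0.5))   # 0 < R < 1: lhs >= rhs expected
```

The two sides differ by the factor ${(c / (c + R - 1))}^{1 / R}$ , which is below one for $R > 1$ and above one for $0 < R < 1$ , matching the direction of the inequality in each case.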

The following theorem provides bounds for $H_{R} (f, g; t)$ in terms of the hazard function of $Y$ and $H_{R} (f; t)$ .

Theorem 2.3. If $g (x)$ is decreasing in $x$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{R} (f, g; t)] \geq (\leq) {[\frac{1}{h_{Y} (t)}]}^{\frac{R - 1}{R}} [1 - (\frac{R - 1}{R}) H_{R} (f; t)] .

Proof: Since $g (x)$ is decreasing in $x$ , $g (x) \leq g (t) \forall x > t$ . For $R > 1$ , we have

\begin{matrix} [1 + (\frac{R - 1}{R}) H_{R} (f, g; t)] & = & {(\int_{t}^{\infty} \frac{f (x)}{\overset{̅}{F} (t)} {(\frac{f (x) / \overset{̅}{F} (t)}{g (x) / \overset{̅}{G} (t)})}^{R - 1} dx)}^{\frac{1}{R}} \\ & \geq & {(\frac{1}{h_{Y} (t)})}^{\frac{R - 1}{R}} {(\int_{t}^{\infty} \frac{f^{R} (x)}{{\overset{̅}{F}}^{R} (t)} dx)}^{\frac{1}{R}} . \end{matrix}

The case $0 < R < 1$ is similar.

Corollary 2.2. If $f^{w} (x)$ is decreasing in $x$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{R} (f, f^{w}; t)] \geq (\leq) {(\frac{1}{h_{X_{w}} (t)})}^{\frac{R - 1}{R}} [1 - (\frac{R - 1}{R}) H_{R} (f; t)] .

Example 2.3. Consider an exponential rv with probability density function $f (x) = λ e^{- λ x}, x > 0, λ > 0$ , and $w (x) = e^{- ax}, a > 0$ . Then $f_{w} (x) = (λ + a) e^{- (λ + a) x}, x > 0$ , is decreasing in $x$ . For $R > 1$ with $λ > a (R - 1)$ , so that the integral in Equation (2.2) converges, we get

\begin{matrix} [1 + (\frac{R - 1}{R}) H_{R} (f, f^{w}; t)] & = & \frac{λ}{{(λ - a (R - 1))}^{\frac{1}{R}} {(λ + a)}^{\frac{R - 1}{R}}} \\ & = & \frac{λ^{\frac{R - 1}{R}}}{R^{\frac{1}{R}} {(λ + a)}^{\frac{R - 1}{R}}} {(\frac{λ R}{λ - a (R - 1)})}^{\frac{1}{R}} \\ & = & {(\frac{λ R}{λ - a (R - 1)})}^{\frac{1}{R}} {(\frac{1}{h_{X_{w}} (t)})}^{\frac{R - 1}{R}} [1 - (\frac{R - 1}{R}) H_{R} (f; t)] \\ & \geq & {(\frac{1}{h_{X_{w}} (t)})}^{\frac{R - 1}{R}} [1 - (\frac{R - 1}{R}) H_{R} (f; t)] . \end{matrix}
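The chain of equalities in this example can be checked numerically. The sketch below evaluates the first and last expressions for hypothetical parameter values, under the assumption $λ > a (R - 1)$ needed for the integral to converge, and confirms that they differ by the factor ${(λ R / (λ - a (R - 1)))}^{1 / R} \geq 1$ . The function name is an illustrative choice.

```python
def example_2_3(lam: float, a: float, R: float):
    """Return (lhs, rhs) of Corollary 2.2 for f = Exp(lam), w(x) = exp(-a*x).

    lhs is 1 + ((R-1)/R) H_R(f, f^w; t); rhs is the lower bound
    (1/h_{X_w}(t))**((R-1)/R) * [1 - ((R-1)/R) H_R(f; t)], with
    h_{X_w}(t) = lam + a. This sketch covers R > 1 only.
    """
    if R <= 1:
        raise ValueError("this sketch covers R > 1 only")
    gap = lam - a * (R - 1)
    if lam <= 0 or a <= 0 or gap <= 0:
        raise ValueError("need lam > 0, a > 0 and lam > a*(R - 1)")
    lhs = lam / (gap ** (1.0 / R) * (lam + a) ** ((R - 1) / R))
    entropy_term = lam ** ((R - 1) / R) / R ** (1.0 / R)
    rhs = (1.0 / (lam + a)) ** ((R - 1) / R) * entropy_term
    return lhs, rhs


if __name__ == "__main__":
    lam, a, R = 2.0, 1.0, 1.5  # hypothetical values
    lhs, rhs = example_2_3(lam, a, R)
    print(lhs, rhs, lhs / rhs, (lam * R / (lam - a * (R - 1))) ** (1.0 / R))
```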

3 Residual R-norm Entropy for Conditionally Specified Models

In this section, we study the residual R-norm entropy measure Equation (2.1) and its divergence measure Equation (2.2) for conditionally specified models. Let $(X_{1}, X_{2})$ and $(Y_{1}, Y_{2})$ be two bivariate random vectors, absolutely continuous with respect to Lebesgue measure, with support in the positive quadrant $R_{+}^{2} = \{(t_{1}, t_{2}) |t_{i} > 0, i = 1, 2\}$ of the two-dimensional Euclidean space $R^{2}$ . The joint probability density function and survival function of $(X_{1}, X_{2})$ are denoted by f and $\overset{̅}{F}$ , and those of $(Y_{1}, Y_{2})$ by $g$ and $\overset{̅}{G}$ , respectively. Consider the conditionally specified rvs $(X_{i} | X_{j} = t_{j})$ and $(Y_{i} | Y_{j} = t_{j})$ for $i, j = 1, 2$ , $i \neq j$ . Their probability density functions, survival functions, and hazard rates are denoted by $f_{i} (t_{i} | t_{j})$ , $g_{i} (t_{i} | t_{j})$ , ${\overset{̅}{F}}_{i} (t_{i} | t_{j})$ , ${\overset{̅}{G}}_{i} (t_{i} | t_{j})$ , $h_{i}^{X} (t_{i} | t_{j})$ , and $h_{i}^{Y} (t_{i} | t_{j})$ , respectively, for $i, j = 1, 2, i \neq j$ . Using Equation (2.1), the conditionally specified residual R-norm entropies for $(X_{1} | X_{2} = t_{2})$ and $(X_{2} | X_{1} = t_{1})$ are defined respectively by

H_{X_{1}}^{R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(\int_{t_{1}}^{\infty} {(\frac{f_{1} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1} (t_{1} | t_{2})})}^{R} dx_{1})}^{\frac{1}{R}}]

(3.1)

and

H_{X_{2}}^{R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(\int_{t_{2}}^{\infty} {(\frac{f_{2} (x_{2} | t_{1})}{{\overset{̅}{F}}_{2} (t_{2} | t_{1})})}^{R} dx_{2})}^{\frac{1}{R}}] .

(3.2)

Note that $H_{X_{1}}^{R} (f; t_{1}, t_{2}) = H_{(X_{1} | X_{2} = t_{2})}^{R} (f; t_{1})$ and $H_{X_{2}}^{R} (f; t_{1}, t_{2}) = H_{(X_{2} | X_{1} = t_{1})}^{R} (f; t_{2})$ .

A basic problem in reliability analysis, when the data on lifetimes are the only input, is to identify the underlying distribution that is supposed to generate the observations. A standard practice adopted in such modelling situations is to ascertain the physical properties of some basic reliability concepts such as failure rate, mean residual life, vitality, etc., express them by means of equations or inequalities, and then solve them to obtain the model. Accordingly, the following theorem characterizes three important bivariate lifetime models based on a functional relationship between the conditional residual R-norm entropy and the conditional hazard rates.

Theorem 3.1. For the random vector $(X_{1}, X_{2})$ , the relationships

H_{X_{1}}^{R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(C {(h_{1}^{X} (t_{1} | t_{2}))}^{R - 1})}^{\frac{1}{R}}]

(3.3)

and

H_{X_{2}}^{R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(C {(h_{2}^{X} (t_{2} | t_{1}))}^{R - 1})}^{\frac{1}{R}}],

(3.4)

where $C$ is a constant independent of $t_{1}$ and $t_{2}$ , hold if and only if $(X_{1}, X_{2})$ is distributed as

(a) bivariate distribution with Pareto conditionals given in Arnold [26] with pdf

f (x_{1}, x_{2}) = c_{1} {(1 + a_{1} x_{1} + a_{2} x_{2} + b x_{1} x_{2})}^{- c}, c_{1}, a_{1}, a_{2}, b > 0, c > 2, x_{1}, x_{2} > 0,

(b) bivariate distribution with exponential conditionals of Arnold and Strauss [27] with pdf

f (x_{1}, x_{2}) = c_{2} e^{- α_{1} x_{1} - α_{2} x_{2} - β x_{1} x_{2}}, c_{2}, α_{1}, α_{2}, β > 0, x_{1}, x_{2} > 0,

(c) bivariate distribution with beta conditionals with pdf

\begin{matrix} f (x_{1}, x_{2}) & = & c_{3} {(1 - p_{1} x_{1} - p_{2} x_{2} + q x_{1} x_{2})}^{d}, c_{3}, p_{1}, p_{2}, q, d > 0, \\ & & 0 < x_{1} < \frac{1}{p_{1}}, 0 < x_{2} < \frac{1 - p_{1} x_{1}}{p_{2} - q x_{1}}, \end{matrix}

according as $C < \frac{1}{R}$ , $C = \frac{1}{R}$ , or $C > \frac{1}{R}$ , respectively, for $R > 1$ , and according as $C > \frac{1}{R}$ , $C = \frac{1}{R}$ , or $C < \frac{1}{R}$ , respectively, for $0 < R < 1$ .

Proof: The first part is straightforward. To prove the converse, assume that Equation (3.3) holds and that $R > 1$ . Comparing Equation (3.3) with Equation (3.1), we get

\int_{t_{1}}^{\infty} {(\frac{f_{1} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1} (t_{1} | t_{2})})}^{R} dx_{1} = C {(h_{1}^{X} (t_{1} | t_{2}))}^{R - 1},

that is,

\int_{t_{1}}^{\infty} {(f_{1} (x_{1} | t_{2}))}^{R} dx_{1} = C {(h_{1}^{X} (t_{1} | t_{2}))}^{R - 1} {({\overset{̅}{F}}_{1} (t_{1} | t_{2}))}^{R} .

(3.5)

Differentiating Equation (3.5) with respect to $t_{1}$ , we get

\begin{matrix} - {(f_{1} (t_{1} | t_{2}))}^{R} & = & C (R - 1) {(h_{1}^{X} (t_{1} | t_{2}))}^{R - 2} \frac{\partial}{\partial t_{1}} h_{1}^{X} (t_{1} | t_{2}) {({\overset{̅}{F}}_{1} (t_{1} | t_{2}))}^{R} \\ & & - CR {(h_{1}^{X} (t_{1} | t_{2}))}^{R - 1} {({\overset{̅}{F}}_{1} (t_{1} | t_{2}))}^{R - 1} f_{1} (t_{1} | t_{2}) . \end{matrix}

(3.6)

Dividing Equation (3.6) by $({\overset{̅}{F}}_{1} (t_{1} | t_{2}))^{R} (h_{1}^{X} (t_{1} | t_{2}))^{R}$ and using $f_{1} (t_{1} | t_{2}) = h_{1}^{X} (t_{1} | t_{2}) {\overset{̅}{F}}_{1} (t_{1} | t_{2})$ , we get

- 1 = \frac{C (R - 1)}{{(h_{1}^{X} (t_{1} | t_{2}))}^{2}} \frac{\partial}{\partial t_{1}} h_{1}^{X} (t_{1} | t_{2}) - CR,

\frac{CR - 1}{C (R - 1)} = - \frac{\partial}{\partial t_{1}} (\frac{1}{h_{1}^{X} (t_{1} | t_{2})}) .

Equivalently,

\frac{CR - 1}{C (1 - R)} = \frac{\partial}{\partial t_{1}} (\frac{1}{h_{1}^{X} (t_{1} | t_{2})}) .

(3.7)

Integrating Equation (3.7) with respect to $t_{1}$ , we obtain

\frac{1}{h_{1}^{X} (t_{1} | t_{2})} = \frac{CR - 1}{C (1 - R)} t_{1} + B_{1} (t_{2}) = A t_{1} + B_{1} (t_{2}) .

(3.8)

Applying similar steps on Equation (3.4), we get

\frac{1}{h_{2}^{X} (t_{2} | t_{1})} = \frac{CR - 1}{C (1 - R)} t_{2} + B_{2} (t_{1}) = A t_{2} + B_{2} (t_{1}),

(3.9)

where $A = \frac{CR - 1}{C (1 - R)}$ . Equations (3.8) and (3.9) are equivalent to

h_{1}^{X} (t_{1} | t_{2}) = \frac{1}{A t_{1} + B_{1} (t_{2})}

and

h_{2}^{X} (t_{2} | t_{1}) = \frac{1}{A t_{2} + B_{2} (t_{1})} .

The remaining part of the proof follows directly from Theorem 4.2 of Sunoj and Linu[22]. Similar steps hold for $0 < R < 1$ .
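The role of the constant $C$ can be made concrete for model (a). There, $f_{1} (x_{1} | t_{2})$ is proportional to ${(1 + a_{2} t_{2} + (a_{1} + b t_{2}) x_{1})}^{- c}$ , a Pareto type II (Lomax) density with shape $α = c - 1$ . For a Lomax survival function ${(1 + x / σ)}^{- α}$ , a direct calculation made for this sketch (not quoted from the source) gives $C = α / ((α + 1) R - 1)$ and hence slope $A = 1 / α$ in Equation (3.8). The function names are hypothetical.

```python
def lomax_constant(alpha: float, R: float) -> float:
    """Constant C in (3.3) when the conditional law is Lomax with shape alpha.

    Derived for this sketch: with sf (1 + x/sigma)**(-alpha),
      int_t^inf (f(x)/Fbar(t))**R dx = alpha**R * (sigma + t)**(1-R) / ((alpha+1)*R - 1)
      h(t) = alpha / (sigma + t),
    so integral / h(t)**(R-1) = alpha / ((alpha+1)*R - 1), free of t and sigma.
    """
    denom = (alpha + 1) * R - 1
    if alpha <= 0 or R <= 0 or R == 1 or denom <= 0:
        raise ValueError("need alpha > 0, R > 0, R != 1 and (alpha+1)*R > 1")
    return alpha / denom


def slope_A(C: float, R: float) -> float:
    """Slope A = (C*R - 1) / (C*(1 - R)) of 1/h in Equations (3.8)-(3.9)."""
    return (C * R - 1) / (C * (1 - R))


if __name__ == "__main__":
    alpha = 3.0  # hypothetical shape, i.e. c = 4 in model (a)
    for R in (2.0, 0.5):
        C = lomax_constant(alpha, R)
        print(R, C, 1.0 / R, slope_A(C, R))
    print(slope_A(1.0 / 1.5, 1.5))  # exponential conditionals: C = 1/R
```

For the Lomax case the slope should come out as $1 / α$ , with $C < 1 / R$ when $R > 1$ and $C > 1 / R$ when $0 < R < 1$ ; for $C = 1 / R$ the slope is zero, which corresponds to a constant conditional hazard as in model (b).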

Consider the random vector $(X_{1}^{w}, X_{2}^{w})$ which has bivariate weighted distribution associated to $(X_{1}, X_{2})$ and non-negative real functions $w_{1}$ and $w_{2}$ , whose probability density function is given by

f^{w} (x_{1}, x_{2}) = \frac{w_{1} (x_{1}) w_{2} (x_{2}) f (x_{1}, x_{2})}{E (w_{1} (X_{1}) w_{2} (X_{2}))},

(3.10)

where f is the joint probability density function of $(X_{1}, X_{2})$ and $0 < E (w_{1} (X_{1}) w_{2} (X_{2})) < \infty$ . From Equation (3.10), it is easy to see that the marginal rv $X_{i}^{w}$ has the univariate weighted distribution associated with $X_{i}$ with weight function

w_{i}^{*} (x_{i}) = w_{i} (x_{i}) E (w_{3 - i} (X_{3 - i}) | X_{i} = x_{i}), i = 1, 2 .

Similarly, we can easily show that $(X_{i}^{w} | X_{j}^{w} = x_{j})$ has the univariate weighted distribution associated with $(X_{i} | X_{j} = x_{j})$ and $w_{i} (x_{i})$ for $i, j = 1, 2$ , $i \neq j$ , with density

f_{i}^{w} (x_{i} | x_{j}) = \frac{w_{i} (x_{i}) f_{i} (x_{i} | x_{j})}{E (w_{i} (X_{i}) | X_{j} = x_{j})}, i, j = 1, 2, i \neq j .

Now we obtain the following bounds based on the weighted conditional rv $(X_{i}^{w} | X_{3 - i}^{w} = t_{3 - i})$ and $(X_{i} | X_{3 - i} = t_{3 - i})$ for $i = 1, 2$ .

Theorem 3.2. If $w_{i} (x_{i})$ is decreasing in $x_{i}$ for $i = 1, 2$ , then for $R \neq 1$

[1 - (\frac{R - 1}{R}) H_{X_{1}^{w}}^{R} (f; t_{1}, t_{2})] \leq \frac{h_{1}^{X_{w}} (t_{1} | t_{2})}{h_{1}^{X} (t_{1} | t_{2})} [1 - (\frac{R - 1}{R}) H_{X_{1}}^{R} (f; t_{1}, t_{2})],

and

[1 - (\frac{R - 1}{R}) H_{X_{2}^{w}}^{R} (f; t_{1}, t_{2})] \leq \frac{h_{2}^{X_{w}} (t_{2} | t_{1})}{h_{2}^{X} (t_{2} | t_{1})} [1 - (\frac{R - 1}{R}) H_{X_{2}}^{R} (f; t_{1}, t_{2})] .

Proof: By definition, we have

\begin{matrix} [1 - (\frac{R - 1}{R}) H_{X_{1}^{w}}^{R} (f; t_{1}, t_{2})] & = & {(\int_{t_{1}}^{\infty} \frac{{(f_{1}^{w} (x_{1} | t_{2}))}^{R}}{{({\overset{̅}{F}}_{1}^{w} (t_{1} | t_{2}))}^{R}} dx_{1})}^{\frac{1}{R}} \\ & = & {(\int_{t_{1}}^{\infty} \frac{{(w_{1} (x_{1}) f_{1} (x_{1} | t_{2}))}^{R}}{{(E_{1} (w_{1} (X_{1}) | t_{2}) {\overset{̅}{F}}_{1} (t_{1} | t_{2}))}^{R}} dx_{1})}^{\frac{1}{R}} \\ & \leq & \frac{w_{1} (t_{1})}{E_{1} (w_{1} (X_{1}) | t_{2})} {(\int_{t_{1}}^{\infty} \frac{{(f_{1} (x_{1} | t_{2}))}^{R}}{{({\overset{̅}{F}}_{1} (t_{1} | t_{2}))}^{R}} dx_{1})}^{\frac{1}{R}}, \end{matrix}

which completes the proof.

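Example 3.1. Suppose $(X_{1}, X_{2})$ follows the Arnold and Strauss bivariate exponential distribution given in Theorem 3.1, so that $(X_{i} | X_{j} = t_{j})$ is exponential with rate $λ_{i} + θ t_{j}$ (writing $λ_{i} = α_{i}$ and $θ = β$ ). Taking the weight function $w_{i} (x_{i}) = e^{- x_{i}}$ , a decreasing function, we obtain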

\begin{matrix} [1 - (\frac{R - 1}{R}) H_{X_{i}^{w}}^{R} (f; t_{1}, t_{2})] & = & \frac{(λ_{i} + θ t_{j} + 1)}{[R (λ_{i} + θ t_{j} + 1)]^{\frac{1}{R}}} \\ & = & (\frac{λ_{i} + θ t_{j} + 1}{λ_{i} + θ t_{j}}) \frac{(λ_{i} + θ t_{j})}{[R (λ_{i} + θ t_{j})]^{\frac{1}{R}}} {(\frac{λ_{i} + θ t_{j}}{λ_{i} + θ t_{j} + 1})}^{\frac{1}{R}} \\ & = & \frac{h_{X_{i}^{w} | X_{j}^{w}} (t_{i} | t_{j})}{h_{X_{i} | X_{j}} (t_{i} | t_{j})} [1 - (\frac{R - 1}{R}) H_{X_{i}}^{R} (f; t_{1}, t_{2})] {(\frac{λ_{i} + θ t_{j}}{λ_{i} + θ t_{j} + 1})}^{\frac{1}{R}} \\ & \leq & \frac{h_{X_{i}^{w} | X_{j}^{w}} (t_{i} | t_{j})}{h_{X_{i} | X_{j}} (t_{i} | t_{j})} [1 - (\frac{R - 1}{R}) H_{X_{i}}^{R} (f; t_{1}, t_{2})] . \end{matrix}

We now define the conditional residual R-norm divergence between the rvs $(X_{1} | X_{2} = t_{2})$ and $(Y_{1} | Y_{2} = t_{2})$ , and between $(X_{2} | X_{1} = t_{1})$ and $(Y_{2} | Y_{1} = t_{1})$ , as

H_{X_{1}, Y_{1}}^{R} (f, g; t_{1}, t_{2}) = \frac{R}{R - 1} [{\{\int_{t_{1}}^{\infty} \frac{f_{1} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1} (t_{1} | t_{2})} {(\frac{f_{1} (x_{1} | t_{2}) / {\overset{̅}{F}}_{1} (t_{1} | t_{2})}{g_{1} (x_{1} | t_{2}) / {\overset{̅}{G}}_{1} (t_{1} | t_{2})})}^{R - 1} dx_{1}\}}^{\frac{1}{R}} - 1]

(3.11)

and

H_{X_{2}, Y_{2}}^{R} (f, g; t_{1}, t_{2}) = \frac{R}{R - 1} [{\{\int_{t_{2}}^{\infty} \frac{f_{2} (x_{2} | t_{1})}{{\overset{̅}{F}}_{2} (t_{2} | t_{1})} {(\frac{f_{2} (x_{2} | t_{1}) / {\overset{̅}{F}}_{2} (t_{2} | t_{1})}{g_{2} (x_{2} | t_{1}) / {\overset{̅}{G}}_{2} (t_{2} | t_{1})})}^{R - 1} dx_{2}\}}^{\frac{1}{R}} - 1],

(3.12)

where $H_{X_{1}, Y_{1}}^{R} (f, g; t_{1}, t_{2}) = H_{(X_{1} | X_{2} = t_{2}), (Y_{1} | Y_{2} = t_{2})}^{R} (f, g; t_{1})$ , and $H_{X_{2}, Y_{2}}^{R} (f, g; t_{1}, t_{2}) = H_{(X_{2} | X_{1} = t_{1}), (Y_{2} | Y_{1} = t_{1})}^{R} (f, g; t_{2})$ . Hence, Equations (3.11) and (3.12) provide dynamic information on the distance between the conditionally specified rvs $(X_{1} | X_{2} = t_{2})$ and $(Y_{1} | Y_{2} = t_{2})$ , and $(X_{2} | X_{1} = t_{1})$ and $(Y_{2} | Y_{1} = t_{1})$ .

Cox's PHR model is the most widely used semi-parametric model in survival studies. Two rvs X and $Y$ with common support $(l, \infty)$ satisfy the PHR model when

h_{Y} (t) = θ h_{X} (t),

(3.13)

for all $t \geq l$ , where $θ > 0$ , and $h_{Y} = \frac{g}{\overset{̅}{G}}$ and $h_{X} = \frac{f}{\overset{̅}{F}}$ are the respective failure (or hazard) rate functions. Equivalently, X and $Y$ satisfy the PHR model when $\overset{̅}{G} (t) = (\overset{̅}{F} (t))^{θ}$ for all $t \geq l$ [28]. Nanda and Das[2] obtained the following result.

Theorem 3.3. (Nanda and Das (2006)) $H_{R} (f, g; t)$ is independent of t, if and only if, X and $Y$ satisfy the PHR model in Equation (3.13).
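The "if" direction of this result can be illustrated numerically. The sketch below takes a Weibull survival function $\overset{̅}{F} (x) = e^{- x^{2}}$ and $\overset{̅}{G} = {\overset{̅}{F}}^{θ}$ , and evaluates Equation (1.7) at several ages by a midpoint rule. The Weibull choice, the helper names, and the use of log-densities (to avoid dividing two underflowed densities in the far tail) are assumptions made here. Under the PHR model the substitution $u = \overset{̅}{F} (x) / \overset{̅}{F} (t)$ gives the integral in Equation (1.7) as $θ^{1 - R} / ((θ - 1) (1 - R) + 1)$ whenever the denominator is positive; this closed form was derived for the sketch.

```python
import math


def tail_integral(func, t: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of func over (t, inf).

    Uses x = t + u / (1 - u), u in (0, 1), so that dx = du / (1 - u)**2.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        x = t + u / (1.0 - u)
        total += func(x) / (1.0 - u) ** 2
    return total * h


def residual_divergence(log_f, log_sf_f, log_g, log_sf_g, t: float, R: float) -> float:
    """Residual R-norm divergence (1.7) from log-densities and log-survival functions.

    The integrand (f/Fbar(t)) * ((f/Fbar(t)) / (g/Gbar(t)))**(R-1) is evaluated
    as exp(R*log(f/Fbar(t)) + (1-R)*log(g/Gbar(t))).
    """
    def integrand(x: float) -> float:
        log_rf = log_f(x) - log_sf_f(t)
        log_rg = log_g(x) - log_sf_g(t)
        return math.exp(R * log_rf + (1.0 - R) * log_rg)

    integral = tail_integral(integrand, t)
    return R / (R - 1) * (integral ** (1.0 / R) - 1.0)


def phr_weibull_divergence(t: float, theta: float, R: float) -> float:
    """Divergence between Fbar(x) = exp(-x**2) and Gbar = Fbar**theta at age t."""
    log_f = lambda x: math.log(2.0 * x) - x * x
    log_sf_f = lambda x: -x * x
    log_g = lambda x: math.log(2.0 * theta * x) - theta * x * x
    log_sf_g = lambda x: -theta * x * x
    return residual_divergence(log_f, log_sf_f, log_g, log_sf_g, t, R)


if __name__ == "__main__":
    theta, R = 2.0, 1.5  # hypothetical values with (theta-1)*(1-R) + 1 > 0
    for t in (0.0, 0.5, 2.0):
        print(t, phr_weibull_divergence(t, theta, R))
```

The printed values should coincide across t up to integration error, as the theorem asserts for the PHR model.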

In a similar way, the random vectors $(X_{1}, X_{2})$ and $(Y_{1}, Y_{2})$ satisfy the conditional proportional hazard rate (CPHR) model[29], when the corresponding conditional hazard rate functions of $(X_{i} | X_{3 - i} = t_{3 - i})$ and $(Y_{i} | Y_{3 - i} = t_{3 - i})$ satisfy

h (Y_{i} | Y_{3 - i} = t_{3 - i}) = θ_{i} (t_{3 - i}) h (X_{i} | X_{3 - i} = t_{3 - i})

(3.14)

for $i = 1, 2$ and $t_{1}, t_{2} \geq l$ , where $θ_{1} (t_{2})$ and $θ_{2} (t_{1})$ are positive functions of $t_{2}$ and $t_{1}$ , respectively. Then, we have the following result.

Theorem 3.4. For $i = 1, 2$ , the function $H_{X_{i}, Y_{i}}^{R} (f, g; t_{1}, t_{2})$ depends only on $t_{3 - i}$ , if and only if, $(Y_{i} | Y_{3 - i} = t_{3 - i})$ and $(X_{i} | X_{3 - i} = t_{3 - i})$ satisfy the CPHR model in Equation (3.14).

Proof: The proof follows from Theorem 3.3 and Equation (3.14).

Theorem 3.5. Let $(X_{1}^{w}, X_{2}^{w})$ be a random vector having bivariate weighted distribution associated to $(X_{1}, X_{2})$ and to non-negative differentiable functions $w_{1}$ and $w_{2}$ . Assume that the support of $(X_{1}, X_{2})$ is $S = (l, \infty) \times (l, \infty)$ for $l \geq 0$ . Then, the following conditions are equivalent.

(a) $(X_{1}^{w}, X_{2}^{w})$ and $(X_{1}, X_{2})$ satisfy the CPHR model in Equation (3.14) for $i = 1, 2$ .

(b) $H_{X_{i}, X_{i}^{w}}^{R} (f, g; t_{1}, t_{2})$ is independent of $t_{i}$ for $i = 1, 2$ and $(θ_{i} (t_{3 - i}) - 1) (1 - R) + 1 > 0$ .

(c) For $i, j = 1, 2$ , $i \neq j$ ,

log {\overset{̅}{F}}_{i} (t_{i} | t_{j}) = \frac{log [w_{i} (t_{i}) / w_{i} (l)]}{θ_{i} (t_{3 - i}) - 1} .

(d) $(X_{1}, X_{2})$ has the following joint pdf

f (x_{1}, x_{2}) = c a_{1} a_{2} \frac{w_{1}^{'} (x_{1}) w_{2}^{'} (x_{2})}{w_{1}^{a_{1} + 1} (x_{1}) w_{2}^{a_{2} + 1} (x_{2})} exp (- ϕ a_{1} a_{2} log [\frac{w_{1} (x_{1})}{w_{1} (l)}] log [\frac{w_{2} (x_{2})}{w_{2} (l)}])

for $x_{1}, x_{2} \geq l$ , where $c > 0, ϕ \geq 0$ , and $a_{i} > 1$ or $a_{i} < 0$ for $i = 1, 2$ .

Proof: The equivalence between (a) and (b) is a consequence of Theorem 3.4. The rest of the proof is similar to that of Theorem 3 of Navarro, Sunoj, and Linu[24].

In analogy with Theorem 2.2 of the univariate case, in the next theorem, we obtain a bound for conditional residual R-norm divergence using the LR order.

Theorem 3.6. If $(X_{i} | X_{j} = t_{j}) \leq_{LR} (Y_{i} | Y_{j} = t_{j})$ for $i, j = 1, 2; i \neq j$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{X_{1}, Y_{1}}^{R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{1}^{X} (t_{1} | t_{2})}{h_{1}^{Y} (t_{1} | t_{2})})}^{\frac{R - 1}{R}}

and

[1 + (\frac{R - 1}{R}) H_{X_{2}, Y_{2}}^{R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{2}^{X} (t_{2} | t_{1})}{h_{2}^{Y} (t_{2} | t_{1})})}^{\frac{R - 1}{R}} .

Proof: We first consider $i = 1$ . Then, $(X_{1} | X_{2} = t_{2}) \leq_{LR} (Y_{1} | Y_{2} = t_{2})$ implies that $\frac{f_{1} (x_{1} | t_{2})}{g_{1} (x_{1} | t_{2})}$ is decreasing in $x_{1}$ , that is, $\frac{f_{1} (x_{1} | t_{2})}{g_{1} (x_{1} | t_{2})} \leq \frac{f_{1} (t_{1} | t_{2})}{g_{1} (t_{1} | t_{2})} \forall x_{1} > t_{1}$ . Now using Equation (3.11) and for $R > 1$ , we obtain

\begin{matrix} [1 + (\frac{R - 1}{R}) H_{X_{1}, Y_{1}}^{R} (f, g; t_{1}, t_{2})] & = & {(\int_{t_{1}}^{\infty} \frac{f_{1} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1} (t_{1} | t_{2})} {(\frac{f_{1} (x_{1} | t_{2}) / {\overset{̅}{F}}_{1} (t_{1} | t_{2})}{g_{1} (x_{1} | t_{2}) / {\overset{̅}{G}}_{1} (t_{1} | t_{2})})}^{R - 1} dx_{1})}^{\frac{1}{R}} \\ & \leq & {(\frac{h_{1}^{X} (t_{1} | t_{2})}{h_{1}^{Y} (t_{1} | t_{2})})}^{\frac{R - 1}{R}} {(\int_{t_{1}}^{\infty} \frac{f_{1} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1} (t_{1} | t_{2})} dx_{1})}^{\frac{1}{R}} \\ & = & {(\frac{h_{1}^{X} (t_{1} | t_{2})}{h_{1}^{Y} (t_{1} | t_{2})})}^{\frac{R - 1}{R}} . \end{matrix}

The cases $i = 2$ and $0 < R < 1$ are similar.

Corollary 3.1. If $(X_{i} | X_{j} = t_{j}) \leq_{LR} (X_{i}^{w} | X_{j}^{w} = t_{j})$ for $i, j = 1, 2; i \neq j$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{X_{1}, X_{1}^{w}}^{R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{1}^{X} (t_{1} | t_{2})}{h_{1}^{X_{w}} (t_{1} | t_{2})})}^{\frac{R - 1}{R}}

and

[1 + (\frac{R - 1}{R}) H_{X_{2}, X_{2}^{w}}^{R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{2}^{X} (t_{2} | t_{1})}{h_{2}^{X_{w}} (t_{2} | t_{1})})}^{\frac{R - 1}{R}} .

Remark 3.1. Recently, Evren and Tuna[30] compared different goodness of fit statistics, including those based on divergence measures, namely the KL, Jeffrey's and Hellinger distances. Since $H_{X, Y}^{R} (f, g; t_{1}, t_{2})$ has simple relationships with many of these generalized divergence measures, it can be equally useful as a goodness of fit test to compare two probability distributions. One can also refer to Baratpour and Rad[31],[32] for the use of the cumulative KL divergence in deriving a consistent test statistic for testing the hypothesis of exponentiality against some alternatives.

4 Residual R-norm Entropy for Conditional Survival Models

In this section, we consider the conditional survival rvs $(X_{i} | X_{j} > t_{j})$ and $(Y_{i} | Y_{j} > t_{j})$ ; $i, j = 1, 2, i \neq j$ . Their probability density functions, survival functions, and hazard rates are denoted by $f_{i}^{*} (t_{i} | t_{j})$ , $g_{i}^{*} (t_{i} | t_{j})$ , ${\overset{̅}{F}}_{i}^{*} (t_{i} | t_{j})$ , ${\overset{̅}{G}}_{i}^{*} (t_{i} | t_{j})$ , $h_{i}^{X *} (t_{i} | t_{j})$ , and $h_{i}^{Y *} (t_{i} | t_{j})$ , respectively, for $i = 1, 2$ , where $h_{i}^{X *} (t_{i} | t_{j}) = - \frac{\partial}{\partial t_{i}} log {\overset{̅}{F}}_{i}^{*} (t_{i} | t_{j})$ and $h_{i}^{Y *} (t_{i} | t_{j}) = - \frac{\partial}{\partial t_{i}} log {\overset{̅}{G}}_{i}^{*} (t_{i} | t_{j})$ . Using Equation (2.1), the conditional survival residual R-norm entropies for $(X_{1} | X_{2} > t_{2})$ and $(X_{2} | X_{1} > t_{1})$ are defined respectively as

H_{X_{1}}^{* R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(\int_{t_{1}}^{\infty} \frac{{(f_{1}^{*} (x_{1} | t_{2}))}^{R}}{{({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R}} dx_{1})}^{\frac{1}{R}}]

(4.1)

and

H_{X_{2}}^{* R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} [1 - {(\int_{t_{2}}^{\infty} \frac{{(f_{2}^{*} (x_{2} | t_{1}))}^{R}}{{({\overset{̅}{F}}_{2}^{*} (t_{2} | t_{1}))}^{R}} dx_{2})}^{\frac{1}{R}}] .

(4.2)

Similar to Theorem 3.1, we now prove another characterization theorem, which uniquely determines three bivariate lifetime models using the functional relationships between the residual R-norm entropies defined in Equations (4.1) and (4.2) and the hazard rates in the conditional survival case. Thus, knowledge of the functional relationship between the conditional R-norm entropy and the hazard rates determines the underlying bivariate distribution.

Theorem 4.1. For the random vector $(X_{1}, X_{2})$ , the relationships

H_{X_{1}}^{* R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} (1 - {(K {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 1})}^{1 / R})

(4.3)

and

H_{X_{2}}^{* R} (f; t_{1}, t_{2}) = \frac{R}{R - 1} (1 - {(K {(h_{2}^{X *} (t_{2} | t_{1}))}^{R - 1})}^{1 / R}),

(4.4)

where $K$ is a constant independent of $t_{1}$ and $t_{2}$, hold if and only if $(X_{1}, X_{2})$ is distributed as

(a) bivariate Pareto with sf

\overset{̅}{F} (x_{1}, x_{2}) = (1 + a_{1} x_{1} + a_{2} x_{2} + b x_{1} x_{2})^{- c}, \quad a_{1}, a_{2}, c, x_{1}, x_{2} > 0,

(4.5)

(b) Gumbel's exponential with sf

\overset{̅}{F} (x_{1}, x_{2}) = exp (- α_{1} x_{1} - α_{2} x_{2} - β x_{1} x_{2}), α_{1}, α_{2}, β, x_{1}, x_{2} > 0,

(4.6)

(c) bivariate beta density with sf

\begin{matrix} \overset{̅}{F} (x_{1}, x_{2}) = (1 - p_{1} x_{1} - p_{2} x_{2} + q x_{1} x_{2})^{d}, \quad p_{1}, p_{2}, q, d > 0, \\ 0 < x_{1} < \frac{1}{p_{1}}, \quad 0 < x_{2} < \frac{1 - p_{1} x_{1}}{p_{2} - q x_{1}}, \end{matrix}

(4.7)

according as $K \lesseqgtr \frac{1}{R}$ for $R > 1$, and according as $K \gtreqless \frac{1}{R}$ for $0 < R < 1$.

Proof: The first part is direct. To prove the converse, assume that Equation (4.3) holds. For $R > 1$ , Equation (4.3) is equivalent to

\int_{t_{1}}^{\infty} \frac{{(f_{1}^{*} (x_{1} | t_{2}))}^{R}}{{({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R}} {dx}_{1} = K {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 1},

that is,

\int_{t_{1}}^{\infty} {(f_{1}^{*} (x_{1} | t_{2}))}^{R} {dx}_{1} = K {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 1} {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R} .

(4.8)

Differentiating Equation (4.8) with respect to $t_{1}$ , we get

\begin{matrix} - {(f_{1}^{*} (t_{1} | t_{2}))}^{R} & = & K (R - 1) {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 2} \frac{\partial}{\partial t_{1}} h_{1}^{X *} (t_{1} | t_{2}) {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R} \\ & & - KR {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 1} {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R - 1} f_{1}^{*} (t_{1} | t_{2}), \end{matrix}

which, since $f_{1}^{*} (t_{1} | t_{2}) = h_{1}^{X *} (t_{1} | t_{2}) {\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2})$, becomes

\begin{matrix} - {(f_{1}^{*} (t_{1} | t_{2}))}^{R} & = & K (R - 1) {(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 2} \frac{\partial}{\partial t_{1}} h_{1}^{X *} (t_{1} | t_{2}) {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R} \\ & & - KR {(h_{1}^{X *} (t_{1} | t_{2}))}^{R} {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R} . \end{matrix}

(4.9)

Dividing Equation (4.9) by ${({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R} {(h_{1}^{X *} (t_{1} | t_{2}))}^{R}$ yields

K (R - 1) \frac{\frac{\partial}{\partial t_{1}} h_{1}^{X *} (t_{1} | t_{2})}{{(h_{1}^{X *} (t_{1} | t_{2}))}^{2}} = KR - 1,

or equivalently,

K (1 - R) \frac{\partial}{\partial t_{1}} (\frac{1}{h_{1}^{X *} (t_{1} | t_{2})}) = KR - 1 .

(4.10)

Integrating Equation (4.10) with respect to $t_{1}$ , we obtain

\frac{1}{h_{1}^{X *} (t_{1} | t_{2})} = \frac{KR - 1}{K (1 - R)} t_{1} + B_{1} (t_{2}) = A t_{1} + B_{1} (t_{2}),

where $A = \frac{KR - 1}{K (1 - R)}$ . Thus, $h_{1}^{X *} (t_{1} | t_{2}) = \frac{1}{A t_{1} + B_{1} (t_{2})}$ . Applying similar steps to Equation (4.4), we obtain $h_{2}^{X *} (t_{2} | t_{1}) = \frac{1}{A t_{2} + B_{2} (t_{1})}$ . Now using the result of Roy[33], the models (4.5), (4.6), and (4.7) follow. The case $0 < R < 1$ can be obtained similarly.
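The direct part of Theorem 4.1 can be checked numerically for the bivariate Pareto model (4.5). The sketch below estimates the ratio of the integral in (4.8) to ${(h_{1}^{X *} (t_{1} | t_{2}))}^{R - 1} {({\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2}))}^{R}$ at several points and compares it with $K = c / ((c + 1) R - 1)$, a closed form worked out here for illustration and not stated in the text. The parameter values, the helper names, and the change of variable used to handle the infinite range are assumptions.

```python
import math

def tail_integral(func, lower, steps=20000):
    """Integrate func over [lower, inf) using x = lower + z/(1-z) and the midpoint rule in z."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        z = (k + 0.5) * h
        x = lower + z / (1.0 - z)
        total += func(x) / (1.0 - z) ** 2   # Jacobian dx/dz = 1/(1-z)^2
    return total * h

# Illustrative (assumed) parameters of the bivariate Pareto sf (4.5).
a1, a2, b, c = 1.0, 0.8, 0.3, 3.0
R = 2.0

def cond_sf(x1, t2):
    """Survival function of X1 | X2 > t2: Fbar(x1, t2) / Fbar(0, t2)."""
    return ((1 + a1 * x1 + a2 * t2 + b * x1 * t2) / (1 + a2 * t2)) ** (-c)

def cond_hazard(x1, t2):
    """Conditional hazard rate: -d/dx1 log cond_sf(x1, t2)."""
    return c * (a1 + b * t2) / (1 + a1 * x1 + a2 * t2 + b * x1 * t2)

def cond_pdf(x1, t2):
    return cond_hazard(x1, t2) * cond_sf(x1, t2)

def k_estimate(t1, t2):
    """Ratio of the integral in (4.8) to h^(R-1) * Fbar^R, which should not depend on (t1, t2)."""
    integral = tail_integral(lambda x: (cond_pdf(x, t2) / cond_sf(t1, t2)) ** R, t1)
    return integral / cond_hazard(t1, t2) ** (R - 1)

estimates = [k_estimate(t1, t2) for t1, t2 in [(0.2, 0.5), (1.0, 0.1), (2.5, 3.0)]]

K_closed = c / ((c + 1) * R - 1)                 # derived for this sketch
A = (K_closed * R - 1) / (K_closed * (1 - R))    # slope of 1/h in t1, as in the proof

print(estimates, K_closed, A)
```

With these values $K < 1 / R$ for $R > 1$, in line with case (a), and the slope $A$ equals $1 / c$, so the reciprocal conditional hazard rate is linear and increasing in $t_{1}$.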

The following theorem gives a bound for the conditional survival residual R-norm entropy in the context of weighted models.

Theorem 4.2. If $w_{i} (x_{i})$ is decreasing in $x_{i}$ for $i = 1, 2$ , then for $R \neq 1$

[1 - (\frac{R - 1}{R}) H_{X_{1}^{w}}^{*} (R; t_{1}, t_{2})] \leq \frac{h_{1}^{X_{w} *} (t_{1} | t_{2})}{h_{1}^{X *} (t_{1} | t_{2})} [1 - (\frac{R - 1}{R}) H_{X_{1}}^{*} (R; t_{1}, t_{2})]

and

[1 - (\frac{R - 1}{R}) H_{X_{2}^{w}}^{*} (R; t_{1}, t_{2})] \leq \frac{h_{2}^{X_{w} *} (t_{2} | t_{1})}{h_{2}^{X *} (t_{2} | t_{1})} [1 - (\frac{R - 1}{R}) H_{X_{2}}^{*} (R; t_{1}, t_{2})] .

We now define the conditional survival residual R-norm divergence between the rvs $(X_{1} | X_{2} > t_{2})$ and $(Y_{1} | Y_{2} > t_{2})$ , and $(X_{2} | X_{1} > t_{1})$ and $(Y_{2} | Y_{1} > t_{1})$ as

H_{X_{1}, Y_{1}}^{* R} (f, g; t_{1}, t_{2}) = \frac{R}{R - 1} [{\{\int_{t_{1}}^{\infty} \frac{f_{1}^{*} (x_{1} | t_{2})}{{\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2})} {(\frac{f_{1}^{*} (x_{1} | t_{2}) / {\overset{̅}{F}}_{1}^{*} (t_{1} | t_{2})}{g_{1}^{*} (x_{1} | t_{2}) / {\overset{̅}{G}}_{1}^{*} (t_{1} | t_{2})})}^{R - 1} {dx}_{1}\}}^{\frac{1}{R}} - 1]

and

H_{X_{2}, Y_{2}}^{* R} (f, g; t_{1}, t_{2}) = \frac{R}{R - 1} [{\{\int_{t_{2}}^{\infty} \frac{f_{2}^{*} (x_{2} | t_{1})}{{\overset{̅}{F}}_{2}^{*} (t_{2} | t_{1})} {(\frac{f_{2}^{*} (x_{2} | t_{1}) / {\overset{̅}{F}}_{2}^{*} (t_{2} | t_{1})}{g_{2}^{*} (x_{2} | t_{1}) / {\overset{̅}{G}}_{2}^{*} (t_{2} | t_{1})})}^{R - 1} {dx}_{2}\}}^{\frac{1}{R}} - 1] .

Note that $H_{X_{1}, Y_{1}}^{* R} (f, g; t_{1}, t_{2}) = H_{(X_{1} | X_{2} > t_{2}), (Y_{1} | Y_{2} > t_{2})}^{* R} (f, g; t_{1})$ and $H_{X_{2}, Y_{2}}^{* R} (f, g; t_{1}, t_{2}) = H_{(X_{2} | X_{1} > t_{1}), (Y_{2} | Y_{1} > t_{1})}^{* R} (f, g; t_{2})$ . Hence, $H_{X_{1}, Y_{1}}^{* R} (f, g; t_{1}, t_{2})$ and $H_{X_{2}, Y_{2}}^{* R} (f, g; t_{1}, t_{2})$ provide dynamic information on the distance between the conditional survival rvs $(X_{1} | X_{2} > t_{2})$ and $(Y_{1} | Y_{2} > t_{2})$ , and between $(X_{2} | X_{1} > t_{1})$ and $(Y_{2} | Y_{1} > t_{1})$ , respectively.

The following theorem and corollary provide simple bounds for the conditional survival residual R-norm divergence in terms of the ratio of the conditional hazard rates of $(X_{i} | X_{j} > t_{j})$ and $(Y_{i} | Y_{j} > t_{j})$ , $i, j = 1, 2; i \neq j$ .

Theorem 4.3. If $(X_{i} | X_{j} > t_{j}) \leq_{LR} (Y_{i} | Y_{j} > t_{j}); i, j = 1, 2, i \neq j$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{X_{1}, Y_{1}}^{* R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{1}^{X *} (t_{1} | t_{2})}{h_{1}^{Y *} (t_{1} | t_{2})})}^{\frac{R - 1}{R}}

and

[1 + (\frac{R - 1}{R}) H_{X_{2}, Y_{2}}^{* R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{2}^{X *} (t_{2} | t_{1})}{h_{2}^{Y *} (t_{2} | t_{1})})}^{\frac{R - 1}{R}} .

Corollary 4.1. If $(X_{i} | X_{j} > t_{j}) \leq_{LR} (X_{i}^{w} | X_{j}^{w} > t_{j}); i, j = 1, 2, i \neq j$ , then for $R > 1 (0 < R < 1)$

[1 + (\frac{R - 1}{R}) H_{X_{1}, X_{1}^{w}}^{* R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{1}^{X *} (t_{1} | t_{2})}{h_{1}^{X_{w} *} (t_{1} | t_{2})})}^{\frac{R - 1}{R}}

and

[1 + (\frac{R - 1}{R}) H_{X_{2}, X_{2}^{w}}^{* R} (f, g; t_{1}, t_{2})] \leq (\geq) {(\frac{h_{2}^{X *} (t_{2} | t_{1})}{h_{2}^{X_{w} *} (t_{2} | t_{1})})}^{\frac{R - 1}{R}} .
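The bound of Theorem 4.3 can be illustrated numerically. The sketch below assumes that both $(X_{1}, X_{2})$ and $(Y_{1}, Y_{2})$ follow Gumbel's bivariate exponential model, so that $(X_{1} | X_{2} > t_{2})$ and $(Y_{1} | Y_{2} > t_{2})$ are exponential with rates $α_{1} + β t_{2}$ computed from their respective parameters; the likelihood ratio ordering then holds when the rate for $X$ is at least the rate for $Y$. All parameter values and helper names are hypothetical, and the integral is written in the residual variable $s = x_{1} - t_{1}$, which is valid here because of the memoryless property.

```python
import math

def simpson(func, lower, upper, steps=20000):
    """Composite Simpson rule on [lower, upper]; steps must be even."""
    h = (upper - lower) / steps
    total = func(lower) + func(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * func(lower + k * h)
    return total * h / 3.0

def lhs_of_bound(rate_x, rate_y, R):
    """Left side of Theorem 4.3: 1 + ((R-1)/R) * divergence, i.e. the braced integral to the power 1/R."""
    decay = R * rate_x - (R - 1.0) * rate_y   # integrand decays like exp(-decay * s); must be > 0
    def integrand(s):
        fx = rate_x * math.exp(-rate_x * s)   # residual density of X1 | X2 > t2
        gy = rate_y * math.exp(-rate_y * s)   # residual density of Y1 | Y2 > t2
        return fx * (fx / gy) ** (R - 1.0)
    return simpson(integrand, 0.0, 60.0 / decay) ** (1.0 / R)

def rhs_of_bound(rate_x, rate_y, R):
    """Right side of Theorem 4.3: ratio of conditional hazard rates to the power (R-1)/R."""
    return (rate_x / rate_y) ** ((R - 1.0) / R)

# Hypothetical Gumbel parameters (alpha_1, beta) for X and Y, evaluated at t2.
t2 = 1.2
rate_x = 1.0 + 0.5 * t2   # conditional hazard of X1 | X2 > t2
rate_y = 0.6 + 0.2 * t2   # conditional hazard of Y1 | Y2 > t2; rate_x >= rate_y gives the LR order

results = {R: (lhs_of_bound(rate_x, rate_y, R), rhs_of_bound(rate_x, rate_y, R))
           for R in (2.0, 0.5)}
for R, (lhs, rhs) in results.items():
    print(R, lhs, rhs)
```

For $R = 2$ the left side should not exceed the right side, and for $R = 0.5$ the inequality should reverse, as the theorem states.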

5 Conclusion

This article studies the residual R-norm entropy and its relative form, and obtains certain bounds for them in the context of weighted models. It provides some simple bounds for R-norm entropy in terms of hazard functions. We have also introduced R-norm entropy and its relative measure for the conditionally specified and conditional survival models, which are useful in deriving new bounds for these measures and in identifying bivariate models through their relationships with the corresponding conditional hazard functions. The concept of conditional proportional hazard rate (CPHR) models is also used, together with R-norm divergence measures, to obtain new results.

Acknowledgments

The authors wish to thank the editor and referees for their constructive comments. The second author gratefully acknowledges the support of the University Grants Commission, India, under the Special Assistance Programme.

References

1. Boekee, van der Lubbe JCA. The R-norm information measure. Inform Control. 1980; 45(2): 136–155.

2. Nanda, Das. Study on R-norm residual entropy. Calcutta Statistic Assoc Bull. 2006; 58(231–232): 197–209.

3. Ebrahimi. How to measure uncertainty about residual lifetime. Sankhya, A. 1996; 58(1): 48–57.

4. Ebrahimi, Kirmani SNUA. A characterization of the proportional hazards model through a measure of discrimination between two residual life distributions. Biometrika. 1996; 83(1): 233–235.

5. Renyi. On measures of entropy and information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability (Volume 1, pp. 547–561). Berkeley, California: University of California Press; 1961.

6. Kapur JN. Generalized entropy of order α and type β. Math Sem. 1967; 4(4): 78–94.

7. Tsallis. Possible generalization of Boltzmann–Gibbs statistics. J Statistic Phys. 1988; 52(1–2): 479–487.

8. Pharwaha APS, Singh. Shannon and non-Shannon measures of entropy for statistical texture feature extraction in digitized mammograms. Proceedings of the World Congress on Engineering and Computer Science (Volume 2, pp. 1286–1291). San Francisco, USA; 20–22 October 2009.

9. Smolíková, Wachowiak, Tourassi, Elmaghraby, Zurada JM. Characterization of ultrasonic backscatter based on generalized entropy. Proceedings of the second joint Biomedical Engineering Society EMBS/BMES Conference, Houston, USA (Volume 2, pp. 953–954). 2002.

10. Havrda, Charvat. Quantification method of classification processes: Concept of structural α-entropy. Kybernetika. 1967; 3(1): 30–35.

11. Kumar, Choudhary. R-norm Shannon–Gibbs type inequality. J Appl Sci. 2011; 11(15): 2866–2869.

12. Kumar, Ram, Gupta. On 'useful' R-norm relative information and J-divergence measures. Int J Pure Appl Math. 2012; 77(3): 349–358.

13. Kullback, Leibler RA. On information and sufficiency. Annal Math Statistics. 1951; 22(1): 79–86.

14. Navarro, Ruiz, del Aguila Y. Multivariate weighted distributions. Statistics. 2006; 40(1): 51–54.

15. Bartoszewicz. On a representation of weighted distributions. Statistics Probab Letters. 2009; 79(15): 1690–1694.

16. Kumar, Taneja HJ. On length biased dynamic measure of past inaccuracy. Metrika. 2012; 75(1): 73–84.

17. Sunoj, Sreejith TB. Some results on reciprocal subtangent in the context of weighted models. Comm Statistics: Theor Method. 2012; 41(8): 1397–1410.

18. Arnold, Castillo, Sarabia JM. Conditional specification of statistical models. New York, NY: Springer Verlag; 1999.

19. Arnold BC. Conditional survival models. In: Balakrishnan N, editor. Recent advances in lifetesting and reliability. Boca Raton, FL: CRC Press; 1995: 589–601.

20. Sunoj, Sankaran PG. Some characterizations of bivariate weighted distributions in the context of reliability modeling. Calcutta Statistic Assoc Bull. 2005; 27(227–228): 179–194.

21. Navarro, Sarabia JM. Alternative definitions of bivariate equilibrium distributions. J Statistic Plan Infer. 2010; 140(7): 2046–2056.

22. Sunoj, Linu MN. Dynamic cumulative residual Renyi's entropy. Statistics. 2012a; 46(1): 41–56.

23. Sunoj, Linu MN. Cumulative measure of uncertainty for conditionally specified models. Calcutta Statistic Assoc Bull. 2012b; 64(253–254): 59–78.

24. Navarro, Sunoj, Linu MN. Characterizations of bivariate models using dynamic Kullback–Leibler discrimination measures. Statistics Probab Letters. 2011; 81(11): 1594–1598.

25. Navarro, Sunoj, Linu MN. Characterizations of bivariate models using some dynamic conditional information divergence measures. Comm Statistics: Theor Method. 2014; 43(9): 1939–1948.

26. Arnold BC. Bivariate distributions with Pareto conditionals. Statistics and Probability Letters. 1987; 5(4): 263–266.

27. Arnold, Strauss. Bivariate distributions with exponential conditionals. Journal of American Statistical Association. 1988; 83(402): 522–527.

28. Cox DR. The analysis of exponentially distributed lifetimes with two types of failure. Journal of Royal Statistical Society, Series B. 1959; 21(2): 411–421.

29. Sankaran, Sreeja VN. Proportional hazard model for multivariate failure time data. Comm Statistics: Theor Method. 2007; 36(8): 1627–1641.

30. Evren, Tuna. On some properties of goodness of fit measures based on statistical entropy. Int J Research Review Appl Sci. 2012; 13(1): 192–205.

31. Baratpour, Rad AH. Testing goodness-of-fit for exponential distribution based on cumulative residual entropy. Comm Statistic: Theor Method. 2012; 41(8): 1387–1396.

32. Baratpour, Rad AH. Exponentiality test based on the progressive type II censoring via cumulative entropy. Comm Statistic: Simulation Comput. 2016; 45(7): 2625–2637.

33. Roy. A characterization of Gumbel's bivariate exponential and Lindley and Singpurwalla's bivariate Lomax distributions. Journal of Applied Probability. 1989; 26(4): 886–891.