Some counterexamples of a skew-normal distribution

Jun Zhao (a), Sang Kyu Lee (a), Hyoung-Moon Kim (1, a)

(a) Department of Applied Statistics, Konkuk University, Korea
Correspondence to: (1) Department of Applied Statistics, Konkuk University, Korea, 120 Neungdong-ro, Gwangjin-gu, Seoul 05029, Korea. E-mail: hmkim@konkuk.ac.kr
Received July 17, 2019; Revised August 20, 2019; Accepted September 19, 2019.
Abstract
Counterexamples of a skew-normal distribution are developed to improve our understanding of this distribution. Two examples of bivariate non-skew-normal distributions with skew-normal marginal distributions are first provided. The sum of dependent skew-normal and normal variables need not follow a skew-normal distribution. A continuous bivariate density with a discontinuous marginal density also exists in the skew-normal setting. An example shows that the range of possible correlations for a bivariate skew-normal distribution is constrained to a relatively small set. For unified skew-normal variables, an example about convergence in law is discussed. Convergence in distribution is involved in two separate examples for skew-normal variables. The point estimation problem, which is not a counterexample, is provided because of its importance in understanding the skew-normal distribution. These materials are useful for undergraduate and/or graduate teaching courses.
Keywords : skew-normal, bivariate distribution, independence, quadratic form
1. Introduction

Counterexamples are crucial to understand the main ideas of probability and statistics. Romano and Siegel (1986) and Stoyanov (2013) neatly introduced many counterexamples in this direction. A skew-normal distribution is well summarized by Azzalini and Capitanio (2014, p.24). The probability density function (pdf) of a random variable Z ~ SN(0, 1, α) is given by

$f_Z(z; \alpha) = 2\phi(z)\Phi(\alpha z), \quad \alpha \in \mathbb{R},\ z \in \mathbb{R},$

where φ(·) and Φ(·) are the pdf and cumulative distribution function of the standard normal distribution, respectively. This can be further extended to include the location and scale parameters:

$Y = \mu + \sigma Z, \quad \mu \in \mathbb{R},\ \sigma \in \mathbb{R}^{+},$

and thus we write Y ~ SN(μ, σ², α). A multivariate skew-normal pdf (Azzalini and Capitanio, 2014, p.124) is given by

$f_Z(z; \alpha) = 2\phi_n(z; \mu, \Omega)\Phi(\alpha^{\top}\omega^{-1}(z - \mu)), \quad \alpha \in \mathbb{R}^n,\ z \in \mathbb{R}^n, \qquad (1.2)$

where $\phi_n(z; \mu, \Omega)$ is the n-dimensional normal pdf with mean μ and covariance matrix Ω. Note that ω can be written as $\omega = (\Omega \odot I_n)^{1/2}$, where $\odot$ denotes the entry-wise or Hadamard product. When Z has the pdf (1.2), we write $Z \sim SN_n(\mu, \Omega, \alpha)$. When α = 0, (1.2) reduces to the $N_n(\mu, \Omega)$ pdf; hence, the parameter α is referred to as a “skewness (slant) parameter.”
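The univariate definition above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `sn_pdf` and `integrate` are illustrative and not from the paper. It evaluates the SN(μ, σ², α) density, confirms that it integrates to one, and compares the numerical mean with the known value $\sqrt{2/\pi}\,\alpha/\sqrt{1+\alpha^2}$ for SN(0, 1, α).

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def sn_pdf(y, mu=0.0, sigma=1.0, alpha=0.0):
    """Density of SN(mu, sigma^2, alpha): (2/sigma) phi(z) Phi(alpha z), z = (y - mu)/sigma."""
    z = (y - mu) / sigma
    return 2.0 / sigma * STD.pdf(z) * STD.cdf(alpha * z)


def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


alpha = 3.0
total_mass = integrate(lambda z: sn_pdf(z, alpha=alpha), -10.0, 10.0)
mean = integrate(lambda z: z * sn_pdf(z, alpha=alpha), -10.0, 10.0)
delta = alpha / math.sqrt(1.0 + alpha ** 2)
expected_mean = math.sqrt(2.0 / math.pi) * delta

print(total_mass)           # should be very close to 1
print(mean, expected_mean)  # the two values should agree closely
```

Setting `alpha=0.0` recovers the normal density, which mirrors the remark that the slant parameter controls the departure from normality.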

This paper provides some counterexamples related to the skew-normal distribution to enhance our understanding of this distribution. As far as we know, only one paper (Lin and Stoyanov, 2009) has appeared in this direction. Our contribution to this field is useful for theoretical and applied statisticians. The examples suggested are also good teaching materials for undergraduate and/or graduate students.

2. Some counterexamples

We provide some counterexamples related to skew-normal distribution.

Example 1. A bivariate distribution that is not bivariate skew-normal, but has skew-normal marginal distributions.

The marginal distributions are also skew-normal (Azzalini and Capitanio, 2014, p.133) if the joint distribution of (X, Y) is a bivariate skew-normal distribution; however, the converse is false. We provide two examples here.

Let a bivariate density function g(x, y), where $x∈R$ and $y∈R$, be the product of two independent density functions, g1(x) and g2(y), with corresponding medians a and b, respectively. Now let (X, Y) have a joint density given by

$f(x, y) = \begin{cases} 2g(x, y), & \text{if } \{x > a \text{ and } y > b\} \text{ or } \{x \le a \text{ and } y \le b\}, \\ 0, & \text{otherwise.} \end{cases}$

Then the marginal density functions of X and Y are g1(x) and g2(y), respectively. Note that g1(x) and g2(y) can be any density functions; continuous choices include the normal, skew-normal, t, and skew-t densities. Even discrete densities are possible for g1(x) and g2(y).

For example, let g(x, y) be a product of two independent skew-normal density functions of the distributions SN(0, 1, α1) and SN(0, 1, α2), whose medians are a and b, respectively. Then f(x, y) is nonnegative and integrates to one over $\mathbb{R}^2$, so it is a density function. It is not a bivariate skew-normal density, however, because it vanishes on two quadrants, whereas a bivariate skew-normal density is positive everywhere. Furthermore, simple integration shows that the marginal distributions of X and Y are SN(0, 1, α1) and SN(0, 1, α2), respectively.
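The median-based construction can be verified numerically. The sketch below is an illustration rather than part of the original derivation: the helper names, the truncation of the real line to [−8, 8], and the parameter values α1 = 2 and α2 = −1 are all assumptions. It finds the two medians by bisection, integrates f(x, y) over y, and compares the result with the SN(0, 1, α1) density.

```python
from statistics import NormalDist

STD = NormalDist()   # standard normal: STD.pdf is phi, STD.cdf is Phi
LO, HI = -8.0, 8.0   # truncation of the real line; the mass outside is negligible


def sn_pdf(z, alpha):
    """Density of SN(0, 1, alpha)."""
    return 2.0 * STD.pdf(z) * STD.cdf(alpha * z)


def midpoint(f, lo, hi, steps=4000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def sn_median(alpha):
    """Median of SN(0, 1, alpha) by bisection on the numerically integrated cdf."""
    a, b = LO, HI
    for _ in range(40):
        mid = 0.5 * (a + b)
        if midpoint(lambda z: sn_pdf(z, alpha), LO, mid) < 0.5:
            a = mid
        else:
            b = mid
    return 0.5 * (a + b)


def joint_pdf(x, y, alpha1, alpha2, a, b):
    """f(x, y) = 2 g1(x) g2(y) on the two 'same side of the median' quadrants, else 0."""
    same_side = (x > a and y > b) or (x <= a and y <= b)
    return 2.0 * sn_pdf(x, alpha1) * sn_pdf(y, alpha2) if same_side else 0.0


def marginal_x(x, alpha1, alpha2, a, b):
    """Integrate f(x, y) over the y-range where it is nonzero."""
    y_lo, y_hi = (b, HI) if x > a else (LO, b)
    return midpoint(lambda y: joint_pdf(x, y, alpha1, alpha2, a, b), y_lo, y_hi)


alpha1, alpha2 = 2.0, -1.0
a, b = sn_median(alpha1), sn_median(alpha2)
for x in (-1.0, 0.2, 1.5):
    # the two printed densities should agree closely
    print(x, marginal_x(x, alpha1, alpha2, a, b), sn_pdf(x, alpha1))
```

The factor 2 in f compensates exactly for the half of the mass of g2 that is discarded on each side of the median b, which is why the marginal of X is unchanged.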

Another example is as follows. Let (X, Y) ∈ $\mathbb{R}^2$ with the joint density f(x, y) given by

$\frac{2}{\pi} \exp\left\{-\frac{1}{2}(x^2 + y^2)\right\} \Phi(\alpha x)\Phi(\alpha y) + \frac{1}{\pi e} x^3 y^3 \Phi(\alpha)\Phi(-\alpha) I_{[-1,1]}(x) I_{[-1,1]}(y),$

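where $\alpha \in \mathbb{R}$. The second term integrates to zero because $x^3 y^3$ is odd in each argument on [−1, 1], so f(x, y) integrates to one over $\mathbb{R}^2$. The second term is most negative at the corners (−1, 1) and (1, −1), where f(−1, 1) = f(1, −1) = (1/(πe))Φ(α)Φ(−α) > 0, a value that tends to 0 as α → ∞ or α → −∞; thus, f(x, y) is nonnegative. Therefore, f(x, y) is a density function. The marginal density of X is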

$f(x) = \int_{-\infty}^{\infty} \frac{2}{\pi} \exp\left\{-\frac{1}{2}(x^2 + y^2)\right\} \Phi(\alpha x)\Phi(\alpha y)\,dy + \int_{-1}^{1} \frac{1}{\pi e} x^3 y^3 \Phi(\alpha)\Phi(-\alpha) I_{[-1,1]}(x)\,dy = 2\phi(x)\Phi(\alpha x), \quad \alpha \in \mathbb{R},\ x \in \mathbb{R},$

which is a density function of SN(0, 1, α). Similarly, Y ~ SN(0, 1, α).
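A numerical check of this second construction is sketched below; the function names, the grid, and the value α = 1.5 are illustrative assumptions. The code integrates the joint density over y, compares the result with 2φ(x)Φ(αx), and scans a grid on [−1, 1]² to confirm that the perturbation term never makes the density negative there.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def joint_pdf(x, y, alpha):
    """Product of two SN(0, 1, alpha) densities plus the odd perturbation on [-1, 1]^2."""
    value = (2.0 / math.pi) * math.exp(-0.5 * (x * x + y * y)) \
        * STD.cdf(alpha * x) * STD.cdf(alpha * y)
    if abs(x) <= 1.0 and abs(y) <= 1.0:
        value += x ** 3 * y ** 3 * STD.cdf(alpha) * STD.cdf(-alpha) / (math.pi * math.e)
    return value


def midpoint(f, lo, hi, steps=2000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def marginal_x(x, alpha):
    """Integrate over y in three pieces so the jumps at y = -1 and y = 1 sit on piece boundaries."""
    pieces = [(-8.0, -1.0), (-1.0, 1.0), (1.0, 8.0)]
    return sum(midpoint(lambda y: joint_pdf(x, y, alpha), lo, hi) for lo, hi in pieces)


alpha = 1.5
for x in (-0.8, 0.3, 2.0):
    # the two printed densities should agree closely
    print(x, marginal_x(x, alpha), 2.0 * STD.pdf(x) * STD.cdf(alpha * x))

grid = [-1.0 + 0.05 * k for k in range(41)]
min_on_square = min(joint_pdf(x, y, alpha) for x in grid for y in grid)
corner = STD.cdf(alpha) * STD.cdf(-alpha) / (math.pi * math.e)
print(min_on_square, joint_pdf(-1.0, 1.0, alpha), corner)
```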

Example 2. Sum of dependent normal and skew-normal random variables is not distributed as skew-normal.

It is well known that the sum of independent skew-normal and normal random variables follows a skew-normal distribution (Azzalini and Capitanio, 2014, Proposition 2.3). However, Proposition 2.3 fails if independence is replaced with dependence.

Let X follow SN(0, 1, α). Observe X; then, toss a fair coin and define Y as:

$Y = \begin{cases} X, & \text{if the toss is “heads”}, \\ -X, & \text{if the toss is “tails”}. \end{cases}$

This can be defined more rigorously as follows.

$(Y \mid Z = 1) \overset{d}{=} X \quad \text{and} \quad (Y \mid Z = 0) \overset{d}{=} -X,$

where X ~ SN(0, 1, α), Z ~ Ber(1/2) and X and Z are independent. If the toss is “heads,” Y ~ SN(0, 1, α), and if the toss is “tails,” Y ~ SN(0, 1, −α). Therefore, the unconditional distribution of Y is given by the standard normal distribution as:

$f_Y(y) = f(y \mid z = 1)P(Z = 1) + f(y \mid z = 0)P(Z = 0) = \phi(y)[\Phi(\alpha y) + \Phi(-\alpha y)] = \phi(y).$

However, X + Y equals zero whenever the toss is tails, so the sum has a point mass of 1/2 at zero. When the toss is heads, X + Y = 2X is not degenerate. Such a mixture of a discrete and a continuous distribution cannot be a skew-normal distribution.
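The coin-toss construction is easy to simulate. The sketch below is an illustration under stated assumptions: it draws X from SN(0, 1, α) using the standard stochastic representation $X = \delta|U_0| + \sqrt{1-\delta^2}\,U_1$ with independent standard normal $U_0, U_1$ (a well-known property of the skew-normal distribution, not taken from this example), and the names `sample_sn` and `simulate` are hypothetical.

```python
import math
import random


def sample_sn(alpha, rng):
    """Draw from SN(0, 1, alpha) via X = delta*|U0| + sqrt(1 - delta^2)*U1."""
    delta = alpha / math.sqrt(1.0 + alpha ** 2)
    u0 = rng.gauss(0.0, 1.0)
    u1 = rng.gauss(0.0, 1.0)
    return delta * abs(u0) + math.sqrt(1.0 - delta ** 2) * u1


def simulate(alpha, n, seed=0):
    """Return (fraction of draws with X + Y == 0, sample mean of Y, sample variance of Y)."""
    rng = random.Random(seed)
    zero_count = 0
    y_sum = 0.0
    y_sq_sum = 0.0
    for _ in range(n):
        x = sample_sn(alpha, rng)
        heads = rng.random() < 0.5   # fair coin, independent of X
        y = x if heads else -x
        if x + y == 0.0:             # exactly zero whenever the toss is tails
            zero_count += 1
        y_sum += y
        y_sq_sum += y * y
    mean_y = y_sum / n
    var_y = y_sq_sum / n - mean_y ** 2
    return zero_count / n, mean_y, var_y


frac_zero, mean_y, var_y = simulate(alpha=3.0, n=100000)
print(frac_zero)       # close to 1/2: the point mass of X + Y at zero
print(mean_y, var_y)   # close to 0 and 1: Y behaves like a standard normal
```

A skew-normal variable is absolutely continuous, so an observed atom of size about 1/2 at zero already rules out a skew-normal law for X + Y.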

Example 3. Continuous bivariate density with discontinuous marginal density.

Let X have an exponential distribution with mean one. Conditional on X, let Y have a skew-normal distribution with location 1/X, scale 1, and skewness (slant) parameter α, that is, Y|X ~ SN(1/X, 1, α). Then, (X, Y) has a bivariate density function:

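$f(x, y) = \begin{cases} \frac{2}{\sqrt{2\pi}} \exp\left\{-x - \frac{1}{2}\left(y - \frac{1}{x}\right)^2\right\} \Phi\left(\alpha\left(y - \frac{1}{x}\right)\right), & \text{if } x > 0, \\ 0, & \text{otherwise.} \end{cases}$

Suppose (xn, yn) → (x, y). If x ≠ 0, it is clear that f(xn, yn) → f(x, y). If x = 0, then f(xn, yn) → f(x, y) = 0 regardless of how many xn values are positive, because (yn − 1/xn)² → ∞ along any positive xn → 0. Therefore, f(x, y) is continuous everywhere in the plane. However, if x > 0,

$f(x) = \int_{-\infty}^{\infty} \frac{2}{\sqrt{2\pi}} \exp\left\{-x - \frac{1}{2}\left(y - \frac{1}{x}\right)^2\right\} \Phi\left(\alpha\left(y - \frac{1}{x}\right)\right) dy = e^{-x} \int_{-\infty}^{\infty} \frac{2}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left(y - \frac{1}{x}\right)^2\right\} \Phi\left(\alpha\left(y - \frac{1}{x}\right)\right) dy = e^{-x}$

and f(x) = 0 if x ≤ 0. That is, the marginal density $f(x) = e^{-x} I(x > 0)$ has a jump at zero. Hence, the marginal density f(x) is not a continuous function.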
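The contrast between the continuous joint density and the discontinuous marginal can be seen numerically. The sketch below assumes the conditional density of Y given X = x is that of SN(1/x, 1, α) as defined in the text, namely $2\phi(y - 1/x)\Phi(\alpha(y - 1/x))$; the function names and the integration window of half-width 10 around 1/x are illustrative choices.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal: STD.pdf is phi, STD.cdf is Phi


def joint_pdf(x, y, alpha):
    """Exp(1) density of X times the SN(1/x, 1, alpha) density of Y given X = x."""
    if x <= 0.0:
        return 0.0
    z = y - 1.0 / x
    return math.exp(-x) * 2.0 * STD.pdf(z) * STD.cdf(alpha * z)


def marginal_x(x, alpha, half_width=10.0, steps=8000):
    """Midpoint-rule integral over y, centred at the conditional location 1/x."""
    if x <= 0.0:
        return 0.0
    lo = 1.0 / x - half_width
    h = 2.0 * half_width / steps
    return h * sum(joint_pdf(x, lo + (k + 0.5) * h, alpha) for k in range(steps))


alpha = 2.0
for x in (1.0, 0.1, 0.001):
    # marginal density next to exp(-x): both approach 1 as x decreases to 0
    print(x, marginal_x(x, alpha), math.exp(-x))
print(marginal_x(0.0, alpha))         # 0.0, so the marginal jumps at zero
print(joint_pdf(0.001, 0.5, alpha))   # essentially 0: the joint density stays continuous
```

As x decreases to zero the conditional mass escapes along y = 1/x, so the joint density at any fixed point near the y-axis vanishes even though its integral over y stays near one.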

For the conditional distribution of Y given X, any continuous distribution is possible with location 1/X and scale c > 0. For example, an elliptical distribution (Fang et al., 1990) is possible. Furthermore, a skew-elliptical distribution (Azzalini and Capitanio, 2014) is also possible. A skew-normal distribution is a special case of a skew-elliptical distribution.

Example 4. A family of bivariate distributions such that the range of possible correlations is a small subset of [−1, 1].

Suppose that X and Y are bivariate skew-normal variables, namely (X, Y) ~ SN2(μ, Ω, α) with

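$\mu = \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \quad \Omega = \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}, \quad \alpha = \begin{pmatrix} \alpha_X \\ \alpha_Y \end{pmatrix}.$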

Define

$W = \exp(X) \quad \text{and} \quad Z = \exp(Y).$

This produces a set (W, Z) of bivariate log-skew-normal random variables, where

$X \sim SN(\mu_X, \sigma_X^2, \alpha_{1(2)}), \quad Y \sim SN(\mu_Y, \sigma_Y^2, \alpha_{2(2)}),$

with

$\alpha_{1(2)} = \left(1 + \alpha_Y^2(1 - \rho^2)\right)^{-\frac{1}{2}}(\alpha_X + \rho\alpha_Y), \quad \alpha_{2(2)} = \left(1 + \alpha_X^2(1 - \rho^2)\right)^{-\frac{1}{2}}(\alpha_Y + \rho\alpha_X).$

Using the moment generating functions (mgfs) of X and Y, we obtain the following moments:

$\begin{aligned} E(W) &= E(e^{X}) = E(e^{Xt})\big|_{t=1} = 2\exp\left(\mu_X + \tfrac{1}{2}\sigma_X^2\right)\Phi(\delta_X\sigma_X), \\ \mathrm{Var}(W) &= E(W^2) - (E(W))^2 = 2\exp(2\mu_X + \sigma_X^2)\left\{e^{\sigma_X^2}\Phi(2\delta_X\sigma_X) - 2\Phi^2(\delta_X\sigma_X)\right\}, \\ E(Z) &= 2\exp\left(\mu_Y + \tfrac{1}{2}\sigma_Y^2\right)\Phi(\delta_Y\sigma_Y), \\ \mathrm{Var}(Z) &= 2\exp(2\mu_Y + \sigma_Y^2)\left\{e^{\sigma_Y^2}\Phi(2\delta_Y\sigma_Y) - 2\Phi^2(\delta_Y\sigma_Y)\right\}, \end{aligned}$

where

$\delta_X = \frac{\alpha_{1(2)}}{\sqrt{1 + \alpha_{1(2)}^2}} \quad \text{and} \quad \delta_Y = \frac{\alpha_{2(2)}}{\sqrt{1 + \alpha_{2(2)}^2}}.$

The expectation of WZ becomes

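$E(WZ) = 2\exp\left(\frac{\sigma_X^2}{2} + \frac{\sigma_Y^2}{2} + \mu_X + \mu_Y + \rho\sigma_X\sigma_Y\right) \times \Phi\left(\frac{\frac{\alpha_X}{\sigma_X}(\mu_X^* - \mu_X) + \frac{\alpha_Y}{\sigma_Y}(\mu_Y^* - \mu_Y)}{\sqrt{1 + h^{\top}\Sigma^* h}}\right)$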

by using the perfect square, a transformation, and Lemma 5.2 (Azzalini and Capitanio, 2014), where

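$\mu_X^* = \mu_X + \rho\sigma_X\sigma_Y + \sigma_X^2, \quad \mu_Y^* = \mu_Y + \rho\sigma_X\sigma_Y + \sigma_Y^2, \quad h = \begin{pmatrix} \alpha_X/\sigma_X \\ \alpha_Y/\sigma_Y \end{pmatrix} \quad \text{and} \quad \Sigma^* = \begin{pmatrix} \sigma_X^2 & \rho\sigma_X\sigma_Y \\ \rho\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}.$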

Therefore, the correlation between W and Z is

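$\mathrm{Corr}(W, Z) = \frac{E(WZ) - E(W)E(Z)}{\sqrt{\mathrm{Var}(W)\mathrm{Var}(Z)}} = \frac{e^{\rho\sigma_X\sigma_Y}\Phi\left(\frac{\frac{\alpha_X}{\sigma_X}(\mu_X^* - \mu_X) + \frac{\alpha_Y}{\sigma_Y}(\mu_Y^* - \mu_Y)}{\sqrt{1 + h^{\top}\Sigma^* h}}\right) - 2\Phi(\delta_X\sigma_X)\Phi(\delta_Y\sigma_Y)}{\left[e^{\sigma_X^2}\Phi(2\delta_X\sigma_X) - 2\Phi^2(\delta_X\sigma_X)\right]^{\frac{1}{2}}\left[e^{\sigma_Y^2}\Phi(2\delta_Y\sigma_Y) - 2\Phi^2(\delta_Y\sigma_Y)\right]^{\frac{1}{2}}}.$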

When α1 = α2 ∈ [1, 50] or α1 = −α2 ∈ [−50, −2], the correlation between W and Z lies between −0.0003 and 0.0156 or between −0.0009 and 0.0137, respectively. In spite of a near-zero correlation, W and Z are perfectly functionally (but nonlinearly) related.
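The closed-form correlation can be evaluated directly. The sketch below codes the formula for Corr(W, Z) from the expressions above; the function name `log_sn_corr` and the parameter values in the usage lines are illustrative assumptions, since the text does not state which σ and ρ produce the reported ranges. The location parameters cancel in the ratio, so they do not appear as arguments.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf


def log_sn_corr(sigma_x, sigma_y, rho, alpha_x, alpha_y):
    """Corr(exp(X), exp(Y)) for (X, Y) ~ SN_2(mu, Omega, alpha), using the closed form."""
    # marginal slant parameters alpha_{1(2)} and alpha_{2(2)}
    a1 = (alpha_x + rho * alpha_y) / math.sqrt(1.0 + alpha_y ** 2 * (1.0 - rho ** 2))
    a2 = (alpha_y + rho * alpha_x) / math.sqrt(1.0 + alpha_x ** 2 * (1.0 - rho ** 2))
    d_x = a1 / math.sqrt(1.0 + a1 ** 2)
    d_y = a2 / math.sqrt(1.0 + a2 ** 2)
    # (alpha_X/sigma_X)(mu*_X - mu_X) + (alpha_Y/sigma_Y)(mu*_Y - mu_Y)
    shift = alpha_x * (sigma_x + rho * sigma_y) + alpha_y * (sigma_y + rho * sigma_x)
    # h' Sigma* h with h = (alpha_X/sigma_X, alpha_Y/sigma_Y)'
    quad = alpha_x ** 2 + 2.0 * rho * alpha_x * alpha_y + alpha_y ** 2
    numerator = (math.exp(rho * sigma_x * sigma_y) * Phi(shift / math.sqrt(1.0 + quad))
                 - 2.0 * Phi(d_x * sigma_x) * Phi(d_y * sigma_y))
    v_x = math.exp(sigma_x ** 2) * Phi(2.0 * d_x * sigma_x) - 2.0 * Phi(d_x * sigma_x) ** 2
    v_y = math.exp(sigma_y ** 2) * Phi(2.0 * d_y * sigma_y) - 2.0 * Phi(d_y * sigma_y) ** 2
    return numerator / math.sqrt(v_x * v_y)


# Hypothetical parameter values, chosen only to exercise the function.
print(log_sn_corr(1.0, 1.0, 0.5, 2.0, 2.0))
print(log_sn_corr(1.0, 1.0, 0.5, 0.0, 0.0))  # alpha = 0: the bivariate log-normal case
```

With α = 0 the expression reduces to the familiar log-normal correlation $(e^{\rho\sigma_X\sigma_Y} - 1)/\sqrt{(e^{\sigma_X^2} - 1)(e^{\sigma_Y^2} - 1)}$, which gives a convenient sanity check on the coding of the formula.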

Example 5. Two independent sequences of random variables, each converging in law to a limit, such that the sequence of term-by-term sums does not converge in law to the sum of the limits.

Consider two independent sequences of random variables

$X_1, X_2, \ldots \sim SUN_{1,1}(0, 1, \rho, \tau, 1) \quad \text{and} \quad Y_1, Y_2, \ldots \sim SUN_{1,1}(0, 1, -\rho, \tau, 1).$

Here SUN denotes the unified skew-normal distribution (Azzalini and Capitanio, 2014). Consider two other random variables: let $X \sim SUN_{1,1}(0, 1, \rho, \tau, 1)$ and define Y = −X, so that $Y \sim SUN_{1,1}(0, 1, -\rho, \tau, 1)$ but Y is not independent of X. Then

$X_n \xrightarrow{d} X \quad \text{and} \quad Y_n \xrightarrow{d} Y.$

However,

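$X_n + Y_n \sim SUN_{1,2}\left(0, 2, \begin{pmatrix} \frac{1}{\sqrt{2}}\rho & -\frac{1}{\sqrt{2}}\rho \end{pmatrix}, \begin{pmatrix} \tau \\ \tau \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right),$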

but X + Y = 0. Hence, Xn + Yn does not converge in law to X + Y.

To formulate this type of example, it is sufficient that the distributions of X1, X2, … and Y1, Y2, … have the convolution property, which means that sums of independent random variables from the family remain in the same distribution family. For example, if $X_i \sim \chi^2(n_i)$ for i = 1, 2, …, n and they are independent, then $\sum_{i=1}^{n} X_i \sim \chi^2\left(\sum_{i=1}^{n} n_i\right)$.
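The failure of convergence for the sums can be illustrated by simulation. The sketch below rests on an assumption that should be checked against Azzalini and Capitanio (2014): it takes $SUN_{1,1}(0, 1, \rho, \tau, 1)$ to be the law of Z given U + τ > 0, where (U, Z) is standard bivariate normal with correlation ρ, and samples it by rejection. The function name and the values ρ = 0.6, τ = 0.5 are illustrative.

```python
import math
import random
from statistics import NormalDist


def sample_sun11(rho, tau, rng):
    """Rejection sampler: Z given U + tau > 0, with corr(U, Z) = rho (assumed representation)."""
    while True:
        u = rng.gauss(0.0, 1.0)
        z = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if u + tau > 0.0:
            return z


rho, tau, n = 0.6, 0.5, 50000
rng = random.Random(2019)

xs = [sample_sun11(rho, tau, rng) for _ in range(n)]    # X_n draws
ys = [sample_sun11(-rho, tau, rng) for _ in range(n)]   # Y_n draws, independent of X_n

sums_indep = [x + y for x, y in zip(xs, ys)]            # X_n + Y_n
sums_limit = [x + (-x) for x in xs]                     # X + Y with Y = -X

mean_x = sum(xs) / n
mean_y = sum(ys) / n
mean_s = sum(sums_indep) / n
var_sum = sum((s - mean_s) ** 2 for s in sums_indep) / n

std = NormalDist()
print(mean_x, rho * std.pdf(tau) / std.cdf(tau))  # E(X) = rho * phi(tau) / Phi(tau)
print(mean_y)                                     # close to -E(X): Y_n has the law of -X
print(var_sum)                                    # clearly positive: X_n + Y_n is not degenerate
print(max(abs(s) for s in sums_limit))            # exactly 0.0: X + Y is degenerate
```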

Example 6. A sequence of absolutely continuous distributions with support equal to the entire plane that converges in law to a limit degenerate at the origin.

Let Xn and Yn be independent skew-normal random variables with location zero, squared scale 1/n, and slant parameter α/n, namely SN(0, 1/n, α/n), where $\alpha \in \mathbb{R}$. Then, the joint characteristic function (Kim and Genton, 2011) of (Xn, Yn) is given by

$\varphi_{(X_n, Y_n)}(s, t) = \exp\left(-\frac{s^2 + t^2}{2n}\right)\left\{1 + iF\left(\frac{\delta s}{\sqrt{n}}\right)\right\}\left\{1 + iF\left(\frac{\delta t}{\sqrt{n}}\right)\right\},$

where

$F(x) = b\int_{0}^{x} \exp\left(\frac{u^2}{2}\right) du, \quad b = \sqrt{\frac{2}{\pi}} \quad \text{and} \quad \delta = \frac{\alpha}{\sqrt{n^2 + \alpha^2}},$

for all s and t. Since δ → 0 and F(0) = 0, $\varphi_{(X_n, Y_n)}(s, t)$ converges to one as n → ∞. Hence, (Xn, Yn) converges to a limit law degenerate at the origin (0, 0).
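The convergence of the characteristic function to one can be observed numerically. The sketch below evaluates the joint characteristic function with F computed by a midpoint rule, reading the arguments of F as $\delta s/\sqrt{n}$ and $\delta t/\sqrt{n}$ (an assumption consistent with squared scale 1/n); the function names and the evaluation point (s, t) = (1.5, −2) are illustrative.

```python
import math


def F(x, steps=2000):
    """F(x) = sqrt(2/pi) * integral from 0 to x of exp(u^2/2) du (midpoint rule, odd in x)."""
    h = x / steps
    return math.sqrt(2.0 / math.pi) * h * sum(
        math.exp(0.5 * ((k + 0.5) * h) ** 2) for k in range(steps)
    )


def joint_cf(s, t, n, alpha):
    """Joint characteristic function of independent X_n, Y_n ~ SN(0, 1/n, alpha/n)."""
    delta = alpha / math.sqrt(n ** 2 + alpha ** 2)
    root_n = math.sqrt(n)
    gaussian_part = math.exp(-(s * s + t * t) / (2.0 * n))
    return (gaussian_part
            * complex(1.0, F(delta * s / root_n))
            * complex(1.0, F(delta * t / root_n)))


alpha = 3.0
for n in (1, 10, 100, 1000, 10 ** 6):
    value = joint_cf(1.5, -2.0, n, alpha)
    print(n, value, abs(value - 1.0))  # the distance to 1 shrinks as n grows
```

A characteristic function identically equal to one is that of the point mass at the origin, which is the degenerate limit claimed in the example.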

Example 7. A sequence of dependent bivariate random variables that converges in distribution to an independent bivariate random variable.

In general, if Xn and Yn are independent and (Xn, Yn) converges in distribution to (X, Y), then X and Y are also independent. The analogous statement with “independent” replaced by “dependent” need not hold. Let

$(X_n, Y_n)^{\top} \sim SN_2\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 1/n \\ 1/n & 1 \end{pmatrix}, \begin{pmatrix} \alpha_1 \\ 0 \end{pmatrix}\right).$

Then, Xn and Yn are dependent since 1/n is not equal to zero. The necessary and sufficient conditions for independence in the bivariate skew-normal distribution are as follows.

If $Y \sim SN_2(0, \Omega, \alpha)$ with $\Omega = \{\omega_{ij}\}$, i, j = 1, 2, and $\alpha = (\alpha_1\ \alpha_2)^{\top}$, then its components are independent if and only if

$(a)\ \omega_{12} = \omega_{21} = 0, \quad (b)\ \alpha_i \ne 0 \text{ for at most one } i,\ i = 1, 2.$

The joint mgf of (Xn, Yn) is

$M_{X_n, Y_n}(t) = 2\exp\left\{\frac{1}{2}\left(t_1^2 + t_2^2 + \frac{2}{n}t_1 t_2\right)\right\}\Phi\left(\frac{\alpha_1}{\sqrt{1 + \alpha_1^2}}\left(t_1 + \frac{1}{n}t_2\right)\right).$

The limit of the joint mgf becomes

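$\lim_{n \to \infty} M_{X_n, Y_n}(t) = 2\exp\left\{\frac{1}{2}(t_1^2 + t_2^2)\right\}\Phi\left(\frac{\alpha_1}{\sqrt{1 + \alpha_1^2}}t_1\right) = \exp\left(\frac{1}{2}t_2^2\right) \cdot 2\exp\left(\frac{1}{2}t_1^2\right)\Phi\left(\frac{\alpha_1}{\sqrt{1 + \alpha_1^2}}t_1\right).$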

So,

$(X_n, Y_n) \xrightarrow{d} (X, Y),$

where X ~ SN(0, 1, α1) and Y ~ N(0, 1) independently.
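The factorization of the limiting mgf can be checked at a few points. The sketch below codes the joint mgf of (Xn, Yn) and the product of the SN(0, 1, α1) and N(0, 1) mgfs; the function names and the evaluation point are illustrative assumptions.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf  # standard normal cdf


def joint_mgf(t1, t2, n, alpha1):
    """Joint mgf of (X_n, Y_n) ~ SN_2(0, [[1, 1/n], [1/n, 1]], (alpha1, 0))."""
    d = alpha1 / math.sqrt(1.0 + alpha1 ** 2)
    return 2.0 * math.exp(0.5 * (t1 ** 2 + t2 ** 2 + 2.0 * t1 * t2 / n)) * Phi(d * (t1 + t2 / n))


def limit_mgf(t1, t2, alpha1):
    """Product of the SN(0, 1, alpha1) mgf in t1 and the N(0, 1) mgf in t2."""
    d = alpha1 / math.sqrt(1.0 + alpha1 ** 2)
    sn_part = 2.0 * math.exp(0.5 * t1 ** 2) * Phi(d * t1)
    normal_part = math.exp(0.5 * t2 ** 2)
    return sn_part * normal_part


t1, t2, alpha1 = 0.5, -0.7, 2.0
for n in (1, 10, 1000, 10 ** 7):
    # the gap between the joint mgf and the product form shrinks as n grows
    print(n, joint_mgf(t1, t2, n, alpha1) - limit_mgf(t1, t2, alpha1))
```

For every finite n the joint mgf does not factor into a function of t1 times a function of t2, which reflects the dependence of Xn and Yn; only the limit factors.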

3. Discussion

We developed some counterexamples related to the skew-normal distribution that extend those of the normal distribution in some cases. This material is useful for undergraduate and/or graduate teaching courses. For future research, completeness and/or more properties of the skew-normal distribution are needed to develop further counterexamples.

Acknowledgments

This paper was supported by Konkuk University in 2017.

Appendix

This is not a counterexample, but it is good for understanding a skew-normal distribution. Let (Z1, …, Zn) be a random sample from SN(0, 1, α). An unbiased estimator of δ is

$\hat{\delta}_{u.e.} = \frac{1}{nb}\sum_{i=1}^{n} Z_i$

since E(Zi) = bδ, where $b = \sqrt{2/\pi}$, $\alpha = \delta/\sqrt{1 - \delta^2}$, and $\delta = \alpha/\sqrt{1 + \alpha^2}$. The likelihood function of δ or α is

$L(\alpha) = \prod_{i=1}^{n} 2\phi(z_i)\Phi(\alpha z_i) = 2^n \prod_{i=1}^{n} \phi(z_i) \prod_{i=1}^{n} \Phi(\alpha z_i).$

Therefore, a random sample (Z1, …, Zn) or the order statistic, (Z(1), …, Z(n)), are sufficient statistics for δ or α.

We show that (Z(1), …, Z(n)) is not a complete statistic for δ. It is obvious that $\sum_{i=1}^{n} Z_{(i)} = \sum_{i=1}^{n} Z_i$ and $\sum_{i=1}^{n} Z_{(i)}^2 = \sum_{i=1}^{n} Z_i^2$, so $E(\bar{Z}) = b\delta$, where $\bar{Z} = \sum_{i=1}^{n} Z_{(i)}/n = \sum_{i=1}^{n} Z_i/n$. From simple algebra, we find that

$E(1 - S_Z^2) = b^2\delta^2 \quad \text{and} \quad E\left(\frac{n\bar{Z}^2 - 1}{n - 1}\right) = b^2\delta^2,$

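where $S_Z^2 = (1/(n - 1))\sum_{i=1}^{n}(Z_i - \bar{Z})^2$. The difference $(1 - S_Z^2) - (n\bar{Z}^2 - 1)/(n - 1)$ is therefore a function of the order statistic that has expectation zero for every δ without being identically zero. Hence, the order statistic is not a complete statistic for δ.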
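The two unbiased estimators of $b^2\delta^2$ can be compared by Monte Carlo. The sketch below is illustrative: it samples SN(0, 1, α) through the standard representation $Z = \delta|U_0| + \sqrt{1-\delta^2}\,U_1$ (a known property of the skew-normal distribution, not taken from this appendix), and the function names, the sample size n = 5, and α = 2 are assumptions.

```python
import math
import random


def sample_sn(alpha, rng):
    """Draw from SN(0, 1, alpha) via Z = delta*|U0| + sqrt(1 - delta^2)*U1."""
    delta = alpha / math.sqrt(1.0 + alpha ** 2)
    return delta * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - delta ** 2) * rng.gauss(0.0, 1.0)


def simulate(alpha, n, reps, seed=1):
    """Average delta-hat, T1 = 1 - S^2 and T2 = (n*Zbar^2 - 1)/(n - 1) over many samples."""
    rng = random.Random(seed)
    b = math.sqrt(2.0 / math.pi)
    sum_delta_hat = sum_t1 = sum_t2 = 0.0
    max_gap = 0.0
    for _ in range(reps):
        z = [sample_sn(alpha, rng) for _ in range(n)]
        zbar = sum(z) / n
        s2 = sum((zi - zbar) ** 2 for zi in z) / (n - 1)
        t1 = 1.0 - s2
        t2 = (n * zbar ** 2 - 1.0) / (n - 1)
        sum_delta_hat += zbar / b          # the unbiased estimator of delta
        sum_t1 += t1
        sum_t2 += t2
        max_gap = max(max_gap, abs(t1 - t2))
    return sum_delta_hat / reps, sum_t1 / reps, sum_t2 / reps, max_gap


alpha = 2.0
delta = alpha / math.sqrt(1.0 + alpha ** 2)
target = (2.0 / math.pi) * delta ** 2      # b^2 * delta^2
mean_delta_hat, mean_t1, mean_t2, max_gap = simulate(alpha, n=5, reps=20000)

print(mean_delta_hat, delta)     # should agree closely
print(mean_t1, mean_t2, target)  # both averages should be near b^2 * delta^2
print(max_gap)                   # clearly positive: T1 and T2 are different statistics
```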

References
1. Azzalini A and Capitanio A (2014). The Skew-Normal and Related Families, Cambridge University Press, New York.
2. Fang KT, Kotz S, and Ng KW (1990). Symmetric Multivariate and Related Distributions, Chapman and Hall, New York.
3. Kim HM and Genton MG (2011). Characteristic functions of scale mixtures of multivariate skew-normal distributions, Journal of Multivariate Analysis, 102, 1105-1117.
4. Lin GD and Stoyanov J (2009). The logarithmic skew-normal distributions are moment-indeterminate, Journal of Applied Probability, 46, 909-916.
5. Romano JP and Siegel AF (1986). Counterexamples in Probability and Statistics, Wadsworth & Brooks/Cole Advanced Books & Software, California.
6. Stoyanov JM (2013). Counterexamples in Probability, Dover Publications, New York.