^{a}Department of Statistics, Sungkyunkwan University, Korea
Correspondence to:^{1}Department of Statistics, Sungkyunkwan University, 25-2, Sungkyunkwan-ro, Jongno-gu, Seoul 03063, Korea. E-mail: cshong@skku.edu
Received January 29, 2021; Revised March 10, 2021; Accepted March 18, 2021.
Abstract
Most graphical representation methods for two-dimensional contingency tables are based on the frequencies, probabilities, association measures, and goodness-of-fit statistics. In this work, a method is proposed to represent the correlation coefficients for each of the two selected levels of the row and column variables. Using the correlation coefficients, one can obtain the vector-matrix that represents the angle corresponding to each cell. Thus, these vectors are represented as a unit circle with angles. This is called a CC plot, which is a correlation plot for a contingency table. When the CC plot is used with other graphical methods as well as statistical models, more advanced analyses including the relationship among the cells of the row or column variables could be derived.
1. Introduction
Many graphical representation methods exist for categorical data. Some methods express the frequencies and probabilities of each cell, such as the bar chart, pie chart, and star chart for one categorical variable. For a 2 × 2 contingency table, Fienberg (1975) proposed the four-fold circular display. Other graphical representations that can be applied to two-dimensional categorical data include the block chart, mosaic plot (Hartigan and Kleiner, 1981, 1984; Friendly, 1992, 1994), association plot (Cohen, 1980; Friendly, 1991), grouped bar graph (Tufte, 1985), grouped dot plot and framed rectangle chart (Cleveland and McGill, 1984), trellis display (Becker et al., 1996), and the diamond graph (Li et al., 2003).
Other graphical methods represent the relationships among categorical variables and the fit of statistical models to them. Fienberg (1968) and Fienberg and Gilbert (1970) proposed a method to geometrically represent the association measure using a tetrahedron for a 2 × 2 contingency table. Tukey (1977) suggested the two-way plot that represents the goodness-of-fit (GOF) for a two-dimensional contingency table. Darroch et al. (1980) developed graphical models that could describe an independence model and a conditional independence model for multidimensional contingency tables. There are two other methods that are based on the odds ratios and their confidence intervals for 2 × 2 contingency tables: the contour plot (Doi et al., 2001; Yamamoto and Doi, 2001) and the raindrop plot (Barrowman and Myers, 2000, 2003). Moreover, Hong et al. (1999) proposed graphical methods to describe the relationship among the GOFs of the hierarchical log-linear models by constructing a right-angled triangle plot and a polyhedron plot.
There are other graphical methods to display the relations of the correlation coefficients. Corsten and Gabriel (1976) extended the biplot of Gabriel (1971) and proposed the h-plot to express the correlation coefficients as angles. Gower and Hand (1996) have extended and generalized the ideas of Gabriel. Trosset (2005) later proposed a correlation diagram using the cosine function of the angles. The correlation diagram represents a correlation coefficient matrix using a set of points on a unit circle. Pittelkow and Wilson (2005) developed the GE biplot using the biplot approaches to represent the relationships between the genes and samples. Park et al. (2008) compared the performance of the principal component analysis biplot, factor analysis biplot, multidimensional scaling biplot, and correspondence analysis biplot by analyzing various types of gene expression data. Hong and Lee (2006) suggested the G^{2}-plot that contains information about each log-linear model and all possible pairs of the hierarchical log-linear models using the ideas of the correlation diagram introduced by Trosset (2005).
For the two-dimensional I × J contingency table, there are many graphical representation methods, including correspondence analysis. Correspondence analysis explores the relationship between variables by simultaneously displaying the row and column categories of contingency table data based on the multidimensional reduction (scaling) method. Most of these methods are based on the frequencies, probabilities, and relationships of the statistical models. Even though there are some statistics that measure the association of categorical variables, it is not easy to find graphical methods that represent these measures of association. In this paper, a graphical method is proposed based on the correlation coefficients of each cell for a two-dimensional contingency table.
Section 2 defines the correlation coefficient for each of the two selected levels of the row and column variables of an I × J contingency table. One can then obtain an I(I − 1) × J(J − 1) correlation coefficient matrix. In Section 3, the I × J vector-matrix is found from the correlation coefficient matrix by extending the ideas of the correlation diagram of Trosset (2005). Each element in the vector-matrix expresses an angle corresponding to a cell, so that a correlation plot for the correlation coefficient matrix is represented on a unit circle with angles. This is called the CC plot. The CC plots are explored for various 2 × 2 and 3 × 3 contingency tables, and some characteristics obtained from the CC plots are derived. An empirical 4 × 4 contingency table is discussed in Section 4. Section 5 explains how the CC plot can be extended to a high-dimensional contingency table. Section 6 summarizes the conclusions of this study.
2. Correlation coefficients for a contingency table
For an I × J contingency table, a partial 2 × 2 contingency table can be formed from two selected levels of the row and column variables, for example the i^{th} and i′^{th} (i ≠ i′) levels of the row and the j^{th} and j′^{th} (j ≠ j′) levels of the column. The correlation coefficient ρ_{i ji′ j′} for each such pair of selected row and column levels can then be defined as
where p_{i+} = p_{i j} + p_{i j′} and p_{+j} = p_{i j} + p_{i′ j}. The correlation coefficient for each of the two selected levels of the row and column from an I × J contingency table exhibits the following properties:
(1) The value is invariant when the orders of the two levels of both the row and column variables are exchanged together, i.e., ρ_{i ji′ j′} = ρ_{i′ j′i j}.
(2) The sign of the value is reversed when the order of the levels of either the row or the column is exchanged, i.e., ρ_{i ji′ j′} = −ρ_{i′ ji j′} = −ρ_{i j′i′ j}.
This can be summarized as ρ_{i ji′ j′} = ρ_{i′ j′i j} = −ρ_{i′ ji j′} = −ρ_{i j′i′ j}. For an I × J contingency table (I ≥ 3, J ≥ 3), one obtains an I(I − 1) × J(J − 1) correlation coefficient matrix, which is denoted as P = (ρ_{i ji′ j′}). For a 2 × 2 contingency table, however, there is only one distinct correlation coefficient, since ρ_{1122} = ρ_{2211} = −ρ_{1221} = −ρ_{2112}. Hence, the correlation coefficient is represented as a scalar for a 2 × 2 table.
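The defining display for ρ_{i ji′ j′} is not legible in this copy. One form that agrees with the marginals p_{i+} and p_{+j} given above, and with properties (1) and (2), is the phi-type coefficient of the partial 2 × 2 table, ρ = (p_{i j} p_{i′ j′} − p_{i j′} p_{i′ j}) / sqrt(p_{i+} p_{i′+} p_{+j} p_{+j′}). The sketch below assumes that form; it is not a quotation of the authors' formula. The function names and the 3 × 3 counts are hypothetical, and all marginals are assumed to be positive.

```python
from itertools import permutations
from math import sqrt


def partial_rho(table, i, i2, j, j2):
    """Phi-type correlation of the partial 2 x 2 table (rows i, i2; columns j, j2)."""
    a, b = table[i][j], table[i][j2]
    c, d = table[i2][j], table[i2][j2]
    # Marginals of the partial table, e.g. p_{i+} = p_{ij} + p_{ij'}.
    # The ratio is unchanged if counts are rescaled to probabilities.
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def correlation_entries(table):
    """All rho_{iji'j'} with i != i' and j != j', keyed by 0-based (i, j, i2, j2)."""
    n_rows, n_cols = len(table), len(table[0])
    return {
        (i, j, i2, j2): partial_rho(table, i, i2, j, j2)
        for i, i2 in permutations(range(n_rows), 2)
        for j, j2 in permutations(range(n_cols), 2)
    }


# Hypothetical 3 x 3 counts, used only to exercise the functions.
counts = [[20, 5, 2], [4, 18, 6], [3, 7, 25]]
P = correlation_entries(counts)
print(len(P))  # number of entries, I(I - 1) * J(J - 1)
```

Keying the entries by ordered pairs of rows and ordered pairs of columns mirrors the I(I − 1) × J(J − 1) layout of P, so properties (1) and (2) can be checked directly on the dictionary.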
3. Correlation plot for the correlation coefficients matrix
The correlation diagram of Trosset (2005) was proposed to visualize a p × p correlation coefficient matrix P = (ρ_{i j}) on a unit circle. The p vectors on the unit circle are represented by the vector θ = (θ_{1}, θ_{2}, . . . , θ_{p}), whose elements are the angles that solve the optimization problem with the following objective function.
Trosset (2005) used the S-Plus function nlminb, which is a quasi-Newton algorithm developed by Gay (1983, 1984), to solve the minimization problem in (3.1).
In this work, using the I(I − 1) × J(J − 1) correlation coefficient matrix P = (ρ_{i ji′ j′}) obtained in Section 2, one can find the I × J vector-matrix θ = (θ_{i j}), whose elements are the angles corresponding to the cells, by solving the optimization problem with the following objective function.
Since ρ_{i ji′ j′} has a value from −1 to 1, the value of θ_{i j} belongs to (0, 2π), so that the I × J elements in the vector-matrix can be represented on a unit circle with angles θ_{i j}. We call this correlation plot for the correlation coefficient matrix obtained from the contingency table the CC plot.
If the correlation coefficient ρ_{i ji′ j′} is positive and close to 1.0, the difference between the two angles, θ_{i j} − θ_{i′ j′}, is close to 0 degrees, so the two vectors θ_{i j} and θ_{i′ j′} lie close together. When the correlation coefficient is negative and close to −1.0, the difference is almost 180 degrees, and the two vectors point in opposite directions. If the correlation coefficient is close to 0.0, the difference is close to 90 degrees, so the two vectors are nearly at a right angle.
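The displays (3.1) and (3.2) are not legible in this copy. A least-squares form that is consistent with the angle interpretation above (cos 0° = 1, cos 90° = 0, cos 180° = −1) and with the closed-form 2 × 2 solution in Section 3.1 is sketched here. The summation ranges and the absence of weights are assumptions rather than the authors' stated formulas.

```latex
% Assumed form of (3.1): correlation diagram for a p x p matrix
\[
  \min_{\theta_1,\dots,\theta_p}\;
  \sum_{i<j} \bigl(\cos(\theta_i-\theta_j)-\rho_{ij}\bigr)^{2}
\]
% Assumed form of (3.2): one angle per cell of the I x J table
\[
  \min_{\boldsymbol{\theta}}\;
  \sum_{i\neq i'}\sum_{j\neq j'}
  \bigl(\cos(\theta_{ij}-\theta_{i'j'})-\rho_{iji'j'}\bigr)^{2}
\]
```

Under this form only differences of angles enter the criterion, which is why one angle (θ_{11} = 0 in the text) can be fixed as a reference direction without loss.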
3.1. 2 × 2 contingency table
Consider a 2 × 2 contingency table. The objective function with I = J = 2 in (3.2) is then equal to
Setting θ_{11} = 0 as an initial value, we then obtain θ_{22} = cos^{−1}(ρ_{1122}) and θ_{12} − θ_{21} = cos^{−1}(−ρ_{1122}), which implies that θ_{12} = −θ_{21} = cos^{−1}(−ρ_{1122})/2 since ρ_{1122} = −ρ_{1221} = −ρ_{2112}. For example, if ρ_{1122} = 0.2588, then θ_{22} = 75° and θ_{12} = −θ_{21} = 105°/2 = 52.5°. If ρ_{1122} = 0.9063, then θ_{22} = 25° and θ_{12} = −θ_{21} = 155°/2 = 77.5°. And if ρ_{1122} = −0.8660, then θ_{22} = 150° and θ_{12} = −θ_{21} = 30°/2 = 15°. Three CC plots for this example are displayed in Figure 1.
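The closed-form angles for the 2 × 2 case can be checked numerically. The following minimal sketch uses θ_{22} = cos^{−1}(ρ_{1122}) and θ_{12} = −θ_{21} = cos^{−1}(−ρ_{1122})/2 with θ_{11} = 0; the function name and the dictionary layout are hypothetical choices made for illustration.

```python
from math import acos, degrees


def cc_angles_2x2(rho):
    """Angles in degrees for a 2 x 2 table, with theta_11 fixed at 0."""
    theta_22 = degrees(acos(rho))
    # theta_12 - theta_21 = acos(-rho), split symmetrically about 0.
    theta_12 = degrees(acos(-rho)) / 2.0
    return {"11": 0.0, "12": theta_12, "21": -theta_12, "22": theta_22}


# The three correlation values used in the worked example above.
for rho in (0.2588, 0.9063, -0.8660):
    angles = cc_angles_2x2(rho)
    print(rho, {cell: round(angle, 1) for cell, angle in angles.items()})
```

The printed angles should agree, up to rounding, with the values quoted for the three cases in the text.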
When a correlation coefficient ρ_{1122} has a positive and large value, one can find from Figure 1(b) that the vector θ_{22} is located close to the vector θ_{11} = 0, and both the vectors −θ_{12} and θ_{21} are located between 0 and θ_{22}. When a correlation coefficient ρ_{1122} has a negative and large value, it is found from Figure 1(c) that both the vectors −θ_{12} and θ_{21} are located between 0 and θ_{22}, but the vector θ_{22} is located far away from θ_{11} = 0. Moreover, we could say that the vector θ_{22} in Figure 1(a) is located between those in Figure 1(b) and Figure 1(c), since a correlation coefficient ρ_{1122} has a positive but small value.
3.2. 3 × 3 contingency table
Consider four 3 × 3 contingency tables in Table 1. The first and second tables show strong positive and negative relations and the third and fourth tables display two different odd relationships. It is then easy to obtain four correlation coefficient matrices
The 3 × 3 vector-matrices θ can also be obtained by solving the optimization problem in (3.2) with a quasi-Newton algorithm such as nlminb, setting θ_{11} = 0 as an initial value.
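The numerical step can be illustrated with a small self-contained sketch. It makes three assumptions that are not taken from the paper: the objective is the unweighted least-squares criterion Σ (cos(θ_{i j} − θ_{i′ j′}) − ρ_{i ji′ j′})², the correlation is the phi-type coefficient of each partial 2 × 2 table, and plain gradient descent is swapped in for nlminb so that only the standard library is needed. The counts, step size, starting angles, and all function names are hypothetical.

```python
from itertools import permutations
from math import cos, sin, sqrt


def correlation_entries(t):
    """Assumed phi-type rho for every ordered pair of rows and of columns."""
    out = {}
    for i, i2 in permutations(range(len(t)), 2):
        for j, j2 in permutations(range(len(t[0])), 2):
            a, b, c, d = t[i][j], t[i][j2], t[i2][j], t[i2][j2]
            denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
            out[(i, j, i2, j2)] = (a * d - b * c) / denom
    return out


def objective(theta, rho):
    """Assumed criterion: sum of (cos(angle gap) - rho)^2 over all entries."""
    return sum(
        (cos(theta[i][j] - theta[i2][j2]) - r) ** 2
        for (i, j, i2, j2), r in rho.items()
    )


def gradient(theta, rho):
    grad = [[0.0] * len(row) for row in theta]
    for (i, j, i2, j2), r in rho.items():
        gap = theta[i][j] - theta[i2][j2]
        resid = cos(gap) - r
        # Derivative of resid^2 with respect to each of the two angles.
        grad[i][j] -= 2.0 * resid * sin(gap)
        grad[i2][j2] += 2.0 * resid * sin(gap)
    return grad


def initial_angles(n_rows, n_cols):
    """Arbitrary spread-out start in radians; theta_11 starts at 0."""
    return [[0.3 * (i * n_cols + j) for j in range(n_cols)] for i in range(n_rows)]


def fit_angles(rho, n_rows, n_cols, steps=2000, rate=0.005):
    """Gradient descent stand-in for nlminb; theta_11 is held at 0."""
    theta = initial_angles(n_rows, n_cols)
    for _ in range(steps):
        g = gradient(theta, rho)
        for i in range(n_rows):
            for j in range(n_cols):
                if (i, j) != (0, 0):
                    theta[i][j] -= rate * g[i][j]
    return theta


# Hypothetical 3 x 3 counts with a heavy main diagonal.
counts = [[20, 5, 2], [4, 18, 6], [3, 7, 25]]
rho = correlation_entries(counts)
fitted = fit_angles(rho, 3, 3)
print(objective(initial_angles(3, 3), rho), objective(fitted, rho))
```

Gradient descent can stop at a local minimum, so in practice several starting points or a quasi-Newton routine would be preferable; the sketch only shows how the angles are driven by the correlation entries.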
Figure 2 shows the four CC plots for the four correlation coefficient matrices. We now explore the relations between Table 1 and Figure 2. Table 1(a) exhibits a strong positive relation. The CC plot in Figure 2(a) shows that the vectors in the diagonal set (θ_{11}, θ_{22}, θ_{33}) have similar values, whereas the vectors in each of the three pairs (θ_{12}, θ_{21}), (θ_{13}, θ_{31}), and (θ_{23}, θ_{32}) are located opposite each other. In other words, the diagonal cells (1, 1), (2, 2), and (3, 3) have a positive relation, but the three pairs of cells ((1, 2), (2, 1)), ((1, 3), (3, 1)), and ((2, 3), (3, 2)) that face each other across the diagonal have negative relations.
Table 1(b) has a strong negative relation. The CC plot in Figure 2(b) shows that the vectors in the other diagonal set (θ_{13}, θ_{22}, θ_{31}) have similar values, whereas the vectors in each of the three pairs (θ_{11}, θ_{33}), (θ_{12}, θ_{23}), and (θ_{21}, θ_{32}) are located opposite each other. In other words, the diagonal cells (1, 3), (2, 2), and (3, 1) have a positive relation, but the three pairs of cells ((1, 1), (3, 3)), ((1, 2), (2, 3)), and ((2, 1), (3, 2)) that face each other across this other diagonal have negative relations.
The diagonal cells of Table 1(c) and Table 1(d) have large and similar values. However, the cells under the diagonal of Table 1(c) and those over the diagonal of Table 1(d) also have large values. Hence, the three vectors (θ_{11}, θ_{22}, θ_{33}) are observed to be located close together in the CC plots in Figure 2(c) and (d). One can say that the (1, 1), (2, 2), and (3, 3) diagonal cells have positive but weak relations.
From the CC plot in Figure 2(c), the vectors (θ_{21}, θ_{31}, θ_{32}) corresponding to the cells under the diagonal have similar values and are located close to the vectors θ_{11} and θ_{22}. However, among the vectors θ_{12}, θ_{13}, and θ_{23} for the cells above the diagonal, θ_{13} and θ_{23} have similar values and are located close to θ_{33}, but θ_{12} is located far away from θ_{11}. It is found that the (2, 1), (3, 1), and (3, 2) cells under the diagonal have strong positive relations with the (1, 1) and (2, 2) cells. On the other hand, among the (1, 2), (1, 3), and (2, 3) cells over the diagonal, the (1, 3) and (2, 3) cells have similar relations, while the (1, 2) cell has a strong negative relation with the (1, 1) cell.
In Table 1(d), the diagonal cells and the cells above the diagonal have large frequencies, the opposite of Table 1(c). From the CC plot in Figure 2(d), the vectors (θ_{12}, θ_{13}, θ_{23}) corresponding to the cells over the diagonal have similar values and are located close to the vectors θ_{11} and θ_{22}. However, among the vectors θ_{21}, θ_{31}, and θ_{32} for the cells under the diagonal, θ_{31} and θ_{32} have similar values and are located close to θ_{33}, but θ_{21} is located far away from θ_{11}. Hence, it is found that the (1, 2), (1, 3), and (2, 3) cells over the diagonal have strong positive relations with the (1, 1) and (2, 2) cells. On the other hand, among the (2, 1), (3, 1), and (3, 2) cells under the diagonal, the (3, 1) and (3, 2) cells have similar relations, while the (2, 1) cell has a strong negative relation with the (1, 1) cell.
Therefore, we can derive some characteristics from the CC plots in Figure 2. The (i, j) cells have positive relations when the corresponding vectors θ_{i j} have similar values. On the other hand, the (i, j) cells have negative relations with others when the corresponding vectors θ_{i j} are close to 180 degrees from the other vectors.
Positive relations are found in:
The diagonal cells (1, 1), (2, 2), and (3, 3) in Figure 2(a),
The other diagonal cells (1, 3), (2, 2), and (3, 1) in Figure 2(b),
The diagonal cells (1, 1), (2, 2), and (3, 3) in Figure 2(c) and (d), although these relations are weak,
The cells under the diagonal with the (1, 1) and (2, 2) cells in Figure 2(c),
The cells over the diagonal with the (1, 1) and (2, 2) cells in Figure 2(d).
Negative relations are found in:
Three pairs of cells ((1, 2), (2, 1)), ((1, 3), (3, 1)), and ((2, 3), (3, 2)) in Figure 2(a),
Three pairs of cells ((1, 1), (3, 3)), ((1, 2), (2, 3)), and ((2, 1), (3, 2)) in Figure 2(b),
The (1, 2) cell with the (1, 1) cell in Figure 2(c), a strong relation,
The (2, 1) cell with the (1, 1) cell in Figure 2(d), a strong relation.
4. Correlation plot for an illustrated example
Consider the 4 × 4 contingency table in Table 2 with the income and job satisfaction variables (Norušis, 1988). The independence model is accepted for these data, so that the income variable is independent of the job satisfaction variable (Hong, 1995). However, the linear-by-linear uniform association model fits better than the independence model (Hong, 1995). Therefore, it is found that the two variables have a linear relation.
Based on the correlation coefficient matrix (the description of the matrix is omitted since it is a 12×12 matrix), the vector-matrix, θ, from Table 2 is obtained and the CC plot is represented in Figure 3 using the vector-matrix.
Let us take a look at the vectors corresponding to the row variable in Figure 3. The four vector sets (θ_{11}, θ_{12}, θ_{13}), (θ_{21}, θ_{23}, θ_{24}), (θ_{32}, θ_{33}, θ_{34}), and (θ_{41}, θ_{42}, θ_{43}, θ_{44}) have similar values. The income variable I = 1 (lowest level) has relations with the job satisfaction levels J = 1, 2, 3 (very dissatisfied, little dissatisfied, moderately satisfied), while the income variable I = 4 (highest level) has relations with all levels of the job satisfaction. On the other hand, the income variable I = 2 (low level) has relations with the job satisfaction levels J = 3, 4 (moderately satisfied, very satisfied), and the income variable I = 3 (high level) has relations with the job satisfaction levels J = 2, 3, 4 (little dissatisfied, moderately satisfied, very satisfied). Hence, those whose incomes belong to either very low or very high levels are neither dissatisfied nor satisfied with their job. Nonetheless, for the middle income levels (low or high levels), the related job satisfaction levels are the moderately satisfied and very satisfied levels, excluding the very dissatisfied level.
The vectors corresponding to the column variable are then considered. The two vector sets (θ_{12}, θ_{22}, θ_{32}, θ_{42}) and (θ_{13}, θ_{23}, θ_{33}, θ_{43}) have similar values. This means that the job satisfaction levels J = 1 (very dissatisfied) and J = 4 (very satisfied) have no relation with the levels of the income variable. However, the job satisfaction levels J = 2 (little dissatisfied) and J = 3 (moderately satisfied) have relations with all levels of the income variable. Hence, when the job satisfaction is either very dissatisfied or very satisfied, the income levels do not have relations with the job satisfaction, whereas the middle job satisfaction levels (little dissatisfied and moderately satisfied) have relations with all levels of the income variable. Therefore, we can conclude that the job satisfaction variable has a linear relation with the income variable, which is similar to the result of the linear-by-linear uniform association model.
Correspondence analysis is another method that represents contingency table data. It explores not only the relationship between the row and column variables, with an emphasis on correspondence, but also the relationship between each variable’s categories. The CC plot is proposed as an alternative method that can also describe contingency table data graphically and can explain the relationship between each variable’s categories. Moreover, correspondence analysis is based on the chi-squared distance with an emphasis on correspondence, whereas the CC plot is based on the correlation coefficients between the category levels of the row and column variables. The CC plot represents the correlation coefficients geometrically as the angles between two vectors in a unit circle, whereas the correspondence analysis result is shown in a rectangle.
5. Correlation plot for high-dimensional contingency tables
We consider a three-dimensional I × J × K contingency table. For a given k^{th} category of the third layer variable (K = k), the correlation coefficient for the two selected rows and columns is denoted as ρ^{k}_{i ji′ j′}. Then, the K correlation coefficient matrices, (P^{1}, . . . , P^{K}), where P^{k} = (ρ^{k}_{i ji′ j′}), k = 1, . . . , K, can be obtained. For each k^{th} correlation coefficient matrix, the I × J vector-matrix, θ^{k} = (θ^{k}_{i j}), is calculated. With each vector-matrix, the CC plot can be represented. Hence, we could discuss K CC plots for a three-dimensional contingency table and derive some relationships from the CC plots.
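The layer-by-layer procedure can be sketched for the simplest case, a 2 × 2 × K table, where each layer has a single correlation coefficient and the closed-form angles of Section 3.1 apply. The phi-type coefficient is an assumed form of the correlation, and the counts and function names below are hypothetical; they are not the data of Table 3.

```python
from math import acos, degrees, sqrt


def phi(layer):
    """Assumed phi-type correlation rho_1122 of one 2 x 2 layer."""
    (a, b), (c, d) = layer
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def layer_angles(layer):
    """Return rho^k and (theta_11, theta_12, theta_21, theta_22) in degrees."""
    rho = phi(layer)
    theta_12 = degrees(acos(-rho)) / 2.0
    return rho, (0.0, theta_12, -theta_12, degrees(acos(rho)))


# Hypothetical 2 x 2 x 2 counts: one 2 x 2 slice per level k of the layer variable.
table = [
    [[10, 20], [20, 10]],
    [[30, 10], [10, 30]],
]
per_layer = [layer_angles(layer) for layer in table]
for k, (rho, angles) in enumerate(per_layer, start=1):
    print(k, round(rho, 4), [round(angle, 1) for angle in angles])
```

Each element of `per_layer` holds what is needed to draw one CC plot, so comparing the K plots amounts to comparing these angle tuples across layers.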
For the example of a 2 × 2 × 2 contingency table in Table 3, two correlation coefficients, P^{1} = (ρ^{1}_{1122} = −0.1746) and P^{2} = (ρ^{2}_{1122} = −0.1590), and two vector-matrices, θ^{1} = (θ^{1}_{i j} = 0°, 40°, −40°, 100°) and θ^{2} = (θ^{2}_{i j} = 0°, 40°, −40°, 99°), are calculated.
The two CC plots in Figure 4 have similar shapes since the correlation coefficient ρ^{1}_{1122} = −0.1746 is almost the same as ρ^{2}_{1122} = −0.1590. Also, these CC plots are similar to that in Figure 1(a) except that the angles of the vector θ_{22} in Figure 4(a) and (b) are larger than that in Figure 1(a). This is because the correlation coefficients in Figure 4 have the opposite sign to ρ_{1122} = 0.2588 in Figure 1(a), while their absolute values are not very different.
6. Conclusions
There are many graphical representation methods for two-dimensional I × J contingency tables. Most methods are based on the frequencies, probabilities, association measures, and goodness-of-fit statistics. In this work, a graphical method is proposed using the correlation coefficient matrix, P = (ρ_{i ji′ j′}), whose elements are the correlation coefficients for selected levels of the row and column variables of the I × J contingency table, such as the i^{th} and i′^{th} (i ≠ i′) levels of the row and the j^{th} and j′^{th} (j ≠ j′) levels of the column.
Each value in the I × J vector-matrix, θ = (θ_{i j}), is represented as the angle corresponding to each (i, j) cell. Therefore, the θ_{i j} vectors could be represented as a unit circle with angles. This plot is named as the CC plot, which is a correlation plot for the contingency table.
CC plots are constructed for several 2 × 2 and 3 × 3 contingency tables. From the CC plots, the relationships among the cells in a contingency table could be explained. For an illustrated example, the resulting relations are found to be almost the same as those of the log-linear model analysis.
The CC plot can also be extended to contingency tables with more than two dimensions. The CC plots are explained for a three-dimensional contingency table and explored for a 2 × 2 × 2 contingency table.
Correspondence analysis is another method that represents contingency table data. It explores, on a rectangle, not only the relationship between the row and column variables, with an emphasis on correspondence, but also the relationship between each variable’s categories. The CC plot is proposed as an alternative graphical method for a contingency table. Moreover, correspondence analysis is based on the chi-squared distance with an emphasis on correspondence, whereas the CC plot is based on the correlation coefficients between the category levels of the row and column variables. The CC plot represents the correlation coefficients geometrically as the angles between two vectors in a unit circle, whereas the correspondence analysis result is shown in a rectangle.
Because the angles between vectors are easy to compute and a plot on a unit circle is simple to interpret, the CC plot could be used alongside other graphical methods as an alternative method for a contingency table. The CC plot proposed in this work can therefore be a useful graphical representation method for categorical data.
Figures
Fig. 1. CC plots for the 2 × 2 contingency tables.
Fig. 2. CC plots for the 3 × 3 contingency tables.
Barrowman NJ and Myers RA (2000). Still more spawner-recruitment curves: the hockey stick and its generalizations. Canadian Journal of Fisheries and Aquatic Sciences, 57, 665-676.
Barrowman NJ and Myers RA (2003). Raindrop plots: a new way to display collections of likelihoods and distributions. The American Statistician, 57, 268-274.
Becker RA, Cleveland WS, and Shyu MJ (1996). The visual design and control of trellis display. Journal of Computational and Graphical Statistics, 5, 123-155.
Cleveland WS and McGill R (1984). Graphical perception: theory, experimentation, and application to the development of graphical methods. Journal of the American Statistical Association, 79, 531-554.
Cohen A (1980). On the graphical display of the significant components in two-way contingency tables. Communications in Statistics-Theory and Methods, 9, 1025-1041.
Corsten LCA and Gabriel KR (1976). Graphical exploration in comparing variance matrices. Biometrics, 32, 851-863.
Darroch JN, Lauritzen SL, and Speed TP (1980). Markov fields and log-linear interaction models for contingency tables. The Annals of Statistics, 8, 522-539.
Doi M, Nakamura T, and Yamamoto E (2001). Conservative tendency of the crude odds ratio. Journal of the Japan Statistical Society, 31, 53-65.
Fienberg SE (1968). The estimation of cell probabilities in two-way contingency tables (Doctoral dissertation), Harvard University, USA.
Fienberg SE and Gilbert JP (1970). The geometry of a two by two contingency table. Journal of the American Statistical Association, 65, 694-701.
Fienberg SE (1975). Perspective Canada as a social report. Social Indicators Research, 2, 153-174.
Friendly M (1991). SAS System for Statistical Graphics (1st ed), USA, SAS Publishing.
Friendly M (1992). Mosaic displays for loglinear models. Proceedings of the Statistical Graphics Section, American Statistical Association, 61-68.
Friendly M (1994). Mosaic displays for multi-way contingency tables. Journal of the American Statistical Association, 89, 190-200.
Gabriel KR (1971). The biplot graphical display of matrices with applications to principal component analysis. Biometrika, 58, 453-467.
Gay DM (1983). Algorithm 611: subroutines for unconstrained minimization using a model / trust-region approach. ACM Transactions on Mathematical Software, 9, 503-524.
Gay DM (1984). A trust region approach to linearly constrained optimization. Proceedings of the Numerical Analysis Dundee, Springer, Berlin, 171-189.
Gower J and Hand D (1996). Biplots, London, Chapman and Hall.
Hartigan JA and Kleiner B (1981). Mosaic for contingency tables. Computer Science and Statistics: Proceedings of the 13th Symposium on the Interface, Springer-Verlag, New York, 268-273.
Hartigan JA and Kleiner B (1984). A mosaic of the television ratings. The American Statistician, 38, 32-35.
Hong CS (1995). Loglinear Model, Seoul, Freedom Academy.
Hong CS, Choi HJ, and Oh MG (1999). Geometric descriptions for hierarchical log-linear models. InterStat on the Internet.
Hong CS and Lee UK (2006). Graphical methods for hierarchical log-linear models. Communications for Statistical Applications and Methods, 13, 755-764.
Li X, Buechner JM, Tarwater PM, and Munoz A (2003). A diamond-shaped equiponderant graphical display of the effects of two categorical predictors on continuous outcomes. The American Statistician, 57, 193-199.
Park M, Lee JW, Lee JB, and Song SH (2008). Several biplot methods applied to gene expression data. Journal of Statistical Planning and Inference, 138, 500-515.
Pittelkow Y and Wilson S (2005). Use of principal component analysis and of the GE-biplot for the graphical exploration of gene expression data. Biometrics, 61, 630-632.
Trosset MW (2005). Visualizing correlation. Journal of Computational and Graphical Statistics, 14, 1-19.
Tufte ER (1985). The visual display of quantitative information. The Journal for Healthcare Quality (JHQ), 7, 15.
Tukey JW (1977). Exploratory Data Analysis, California, Addison-Wesley Publishing Company.
Yamamoto E and Doi M (2001). Noncollapsibility of common odds ratios without/with confounding. Bulletin of The 53rd Session of the International Statistical Institute, Seoul, Korea, 39-40.