An approach based on clustering for detecting differentially expressed genes in microarray data analysis
Communications for Statistical Applications and Methods 2024;31:571-584
Published online September 30, 2024
© 2024 Korean Statistical Society.

Yuki Ando (1, a), Asanao Shimokawa (b)

(a) Department of Applied Mathematics, Tokyo University of Science, Japan
(b) Department of Mathematics, Tokyo University of Science, Japan
(1) Correspondence to: Department of Applied Mathematics, Tokyo University of Science, 1-3, Kagurazaka, Shinjuku-ku, Tokyo 162-8601, Japan. Email: 1422701@ed.tus.ac.jp
Received March 14, 2024; Revised May 18, 2024; Accepted August 3, 2024.
Abstract
To identify differentially expressed genes (DEGs), researchers typically apply a statistical test to each gene. However, microarray data are often characterized by high dimensionality and a small sample size, which leads to problems such as low statistical power and a large number of tests. We therefore propose a clustering-based method in which genes with similar expression patterns are clustered and a test is conducted for each cluster. This increases the sample size available for each test and reduces the number of tests. Because independence between samples cannot be assumed when genes are related, we use a nonparametric permutation test in the proposed method. We compared the accuracy of the proposed method with that of conventional methods. In the simulations, each method was applied to data generated under positive correlation between genes, and the area under the curve, power, and type I error were calculated. The results show that the proposed method outperforms the conventional methods in all cases under the simulated conditions. We also found that when independence between samples cannot be assumed, the nonparametric permutation test controls the type I error better than the t-test.
Keywords: two-group comparison, microarray data, DEGs, permutation test
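
The cluster-then-test idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the test statistic, or the number of clusters, so average-linkage hierarchical clustering on correlation distance, an absolute difference of pooled means, and the function names `cluster_genes` and `cluster_permutation_pvalue` are all assumptions made for illustration. The toy data are simulated and carry no results from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_genes(expr, n_clusters):
    """Group genes (rows of expr) by correlation distance.

    Assumption: average-linkage hierarchical clustering; the paper may differ.
    """
    tree = linkage(expr, method="average", metric="correlation")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def cluster_permutation_pvalue(expr, groups, n_perm=2000, rng=None):
    """Permutation p-value for a two-group mean difference over one cluster.

    All genes in the cluster are pooled into one statistic. Group labels are
    permuted per sample (whole columns), so the dependence between genes
    measured on the same sample is preserved under permutation.
    """
    rng = np.random.default_rng(rng)
    groups = np.asarray(groups, dtype=bool)

    def stat(g):
        # Assumed statistic: absolute difference of pooled group means.
        return abs(expr[:, g].mean() - expr[:, ~g].mean())

    observed = stat(groups)
    exceed = sum(stat(rng.permutation(groups)) >= observed for _ in range(n_perm))
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 40 genes, 12 samples, 6 samples per group.
rng = np.random.default_rng(0)
groups = np.array([False] * 6 + [True] * 6)
shared = rng.normal(scale=0.5, size=12)   # shared sample effect: positive correlation between genes
expr = rng.normal(size=(40, 12)) + shared
expr[:10, groups] += 3.0                  # first 10 genes are truly differential

labels = cluster_genes(expr, n_clusters=4)
pvals = {
    int(c): cluster_permutation_pvalue(expr[labels == c], groups, rng=1)
    for c in np.unique(labels)
}
```

Here `pvals` maps each cluster label to one p-value, so the number of tests equals the number of clusters and not the number of genes. Every gene in a cluster shares that cluster's decision, which is why the quality of the clustering matters in this kind of approach.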