Theory-based Model Validation in the Generalized Multifactor Dimensionality Reduction Algorithm for Ordinal Phenotypes

Section: Research Paper
Published: Jun 25, 2025
Pages: 212-224

Abstract

Clinical studies indicate a close relationship between some diseases and specific interactions among genetic factors. Identifying the genetic interactions that significantly affect the emergence of genetic diseases requires extensive statistical analysis. Because of the enormous volume of human genetic data, statistical methods adapted to high-dimensional data had to be developed. Multifactor Dimensionality Reduction (MDR) is one of the leading nonparametric algorithms in this field. The algorithm reduces the dimensionality of genetic data to identify the interaction that most strongly increases the likelihood of a genetic disease. It relies on a set of nonparametric procedures to detect the genetic interaction with the highest impact, and it applies only to binary response variables. Like any statistical method, the algorithm has weaknesses and limitations, which has motivated further development. One weakness is that it cannot handle data sets with an ordinal response variable. Some researchers have generalized the MDR algorithm to work with ordinal data, but the generalized algorithm is more complex than the original. We therefore propose a simple extension of the original algorithm that uses ordinal logistic regression to classify the individuals in the sample while keeping all other steps of the original algorithm unchanged. In addition, the MDR algorithm uses a nonparametric procedure to verify the significance of the candidate interactions it identifies. This procedure is based on permutation tests and takes far longer than parametric procedures that rely on theoretical approaches. Some researchers have suggested using the generalized extreme value distribution to verify the statistical significance of candidate interactions, but this method has only been used with continuous and binary dependent variables. In this research, the theoretical method based on the generalized extreme value distribution is used in place of the permutation tests adopted in the algorithm when the response variable is ordinal.
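
The idea of replacing a full permutation test with a generalized extreme value (GEV) fit can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the model score is a toy cell-majority accuracy over SNP pairs rather than the ordinal logistic regression classifier, and the GEV parameters are estimated with Hosking's L-moment approximation, which may differ from the estimator the authors used. The shape parameter `k` follows Hosking's sign convention (`k = -xi`).

```python
import math
import random
from collections import Counter, defaultdict
from itertools import combinations


def best_pair_score(genotypes, phenotype):
    # Toy stand-in for the MDR model search (hypothetical): for every SNP pair,
    # predict the most frequent phenotype level in each genotype cell and
    # return the best in-sample accuracy over all pairs.
    best = 0.0
    for a, b in combinations(range(len(genotypes[0])), 2):
        cells = defaultdict(Counter)
        for row, y in zip(genotypes, phenotype):
            cells[(row[a], row[b])][y] += 1
        correct = sum(max(c.values()) for c in cells.values())
        best = max(best, correct / len(phenotype))
    return best


def sample_l_moments(values):
    # First two sample L-moments and the L-skewness, computed from
    # probability-weighted moments of the sorted sample (needs n >= 3).
    x = sorted(values)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return b0, l2, l3 / l2


def fit_gev_lmom(values):
    # Hosking's approximate L-moment estimators for the GEV distribution.
    l1, l2, t3 = sample_l_moments(values)
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-8:
        # Gumbel limit of the GEV family.
        scale = l2 / math.log(2)
        return l1 - 0.5772156649 * scale, scale, 0.0
    g = math.gamma(1.0 + k)
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    loc = l1 - scale * (1.0 - g) / k
    return loc, scale, k


def gev_upper_tail(x, loc, scale, k):
    # P(X > x) under the fitted GEV distribution (Hosking's parameterization).
    z = (x - loc) / scale
    if abs(k) < 1e-8:
        return 1.0 - math.exp(-math.exp(-z))
    t = 1.0 - k * z
    if t <= 0.0:
        # x lies beyond the upper bound (k > 0) or below the lower bound (k < 0).
        return 0.0 if k > 0 else 1.0
    return 1.0 - math.exp(-t ** (1.0 / k))


def gev_p_value(genotypes, phenotype, n_perm=30, seed=0):
    # Fit a GEV to the best scores from a small number of phenotype
    # permutations, then read the p-value of the observed score from its tail.
    rng = random.Random(seed)
    observed = best_pair_score(genotypes, phenotype)
    shuffled = list(phenotype)
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_scores.append(best_pair_score(genotypes, shuffled))
    loc, scale, k = fit_gev_lmom(null_scores)
    return observed, gev_upper_tail(observed, loc, scale, k)


# Simulated data: 200 individuals, 5 SNPs coded 0/1/2, and a three-level
# ordinal phenotype driven entirely by the first two SNPs.
rng = random.Random(1)
genotypes = [[rng.randint(0, 2) for _ in range(5)] for _ in range(200)]
phenotype = [(row[0] + row[1]) // 2 for row in genotypes]
observed, p_value = gev_p_value(genotypes, phenotype)
print(f"observed score = {observed:.3f}, GEV p-value = {p_value:.4g}")
```

The sketch uses only 30 permutations, an arbitrary choice, because the fitted tail is meant to stand in for the thousands of permutations that an empirical p-value would otherwise need. The simulated data are purely illustrative and say nothing about the paper's results.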

How to Cite

Al-Khaledi, Z. T., & Othman, M. I. (2025). Theory-based Model Validation in the Generalized Multifactor Dimensionality Reduction Algorithm for Ordinal Phenotypes. IRAQI JOURNAL OF STATISTICAL SCIENCES, 20(2), 212–224. Retrieved from https://rjps.uomosul.edu.iq/index.php/stats/article/view/20644