Using ARIMA and Random Forest Models for Climatic Datasets Forecasting

Section: Research Paper
Published: Jun 25, 2025
Pages: 42-55

Abstract

Climatic changes play an important role and may cause serious problems for the health of humans and other organisms. It is therefore necessary to study and forecast climatic datasets in order to reduce the damage through planning for and controlling these changes in the future. The main problem can be summarized as the nonlinearity of climatic datasets and their chaotic changes. The common approach is the autoregressive integrated moving average (ARIMA) model, a traditional univariate time series approach. This model cannot deal with nonlinear data correctly, which may lead to inaccurate forecasting results. Therefore, a more appropriate model for climatic data, the random forest (RF) model, has been proposed to obtain more accurate forecasts. In this study, the climatic datasets are represented by minimum air temperature and relative humidity from an agricultural meteorological station in Nineveh. The study aims to satisfy data homogeneity across the different seasons and to find a suitable model that deals with nonlinear data correctly, with minimal forecasting error compared to ARIMA as the traditional model. The research found the model adequate for this type of data, as it was found that there are some factors that contribute to the increase in the number of deaths in the epidemic, such as the advanced age of the patient, the length of stay in the hospital, the percentage of oxygen in the patient's blood, and the incidence of some chronic diseases such as asthma. The study recommended a more in-depth study of other types of these models and the use of other estimation methods, in addition to paying attention to the methods of data recording by the city health department.
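
The abstract compares models by forecasting error on a univariate climatic series. As a rough illustration only, the sketch below shows one common way such a series can be prepared for a tree-based model (a matrix of lagged values) and how two standard error measures can be computed. The temperature values, the number of lags, and the helper names `make_lagged`, `rmse`, and `mae` are assumptions made for this example and are not taken from the paper; the persistence forecast stands in for a fitted model.

```python
import math


def make_lagged(series, n_lags):
    """Build (features, targets): each target is predicted from the previous n_lags values."""
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append(series[t - n_lags:t])
        targets.append(series[t])
    return features, targets


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical daily minimum air temperatures (degrees C); made-up values for illustration.
temps = [4.0, 5.0, 3.0, 6.0, 7.0, 5.0, 8.0, 9.0]

# Lagged design matrix: the kind of input a random forest regressor could be trained on.
X, y = make_lagged(temps, n_lags=3)

# Persistence baseline (assumed stand-in for a model): forecast = most recent observed value.
naive_forecast = [row[-1] for row in X]

print("rows:", len(X), "first row:", X[0], "first target:", y[0])
print("RMSE:", rmse(y, naive_forecast), "MAE:", mae(y, naive_forecast))
```

In a fuller comparison, `X` and `y` could be passed to a random forest regressor while an ARIMA model is fitted to the raw series, and the same error functions applied to both sets of forecasts on a held-out period.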

How to Cite

Basheer Shukur, O., عدی, & Aljuborey, O. (2025). Using ARIMA and Random Forest Models for Climatic Datasets Forecasting. IRAQI JOURNAL OF STATISTICAL SCIENCES, 19(2), 42–55. Retrieved from https://rjps.uomosul.edu.iq/index.php/stats/article/view/20901