Using Logistic Regression with Time-Stratified Method for Air Pollution Datasets Forecasting

Section: Research Paper
Published: Jun 25, 2025
Pages: 19-28

Abstract

Studying and forecasting particulate matter (PM10) is necessary to control and reduce damage to the environment and human health. Many pollutants act as sources of air pollution and may affect the PM10 variable. The datasets studied were taken from the Kuala Lumpur meteorological station, Malaysia. Logistic regression (LR) is built on the generalized linear model as a special case of linear statistical methods, so it may give inaccurate results when used with nonlinear datasets. A time-stratified (TS) method, in several variants, is proposed to make the datasets more homogeneous. It groups similar seasons from different years together to form new variables that are smoother than the originals. In this study, the LR model performs better on the time-stratified datasets than on the full dataset. In conclusion, LR forecasting can be relied on after time-stratifying the datasets, giving more accurate results for nonlinear multivariate datasets in which PM10 is the dependent variable.
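
The time-stratifying idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that PM10 has been binarized (for example, 1 when a threshold is exceeded), that seasons are defined by a hypothetical month-to-season mapping, and that one LR model is fitted per stratum by plain gradient descent. The names `SEASON_OF_MONTH`, `time_stratify`, `fit_logistic`, and `predict_proba` are invented for illustration.

```python
import math
from collections import defaultdict
from datetime import date

# Hypothetical month-to-season mapping (an assumption, not taken from the paper).
SEASON_OF_MONTH = {
    11: "northeast_monsoon", 12: "northeast_monsoon", 1: "northeast_monsoon",
    2: "northeast_monsoon", 3: "northeast_monsoon",
    4: "inter_monsoon", 10: "inter_monsoon",
    5: "southwest_monsoon", 6: "southwest_monsoon", 7: "southwest_monsoon",
    8: "southwest_monsoon", 9: "southwest_monsoon",
}


def time_stratify(records):
    """Pool records from the same season across all years, in date order.

    Each record is (day: date, features: list[float], label: 0 or 1).
    """
    strata = defaultdict(list)
    for day, x, y in sorted(records, key=lambda r: r[0]):
        strata[SEASON_OF_MONTH[day.month]].append((x, y))
    return dict(strata)


def predict_proba(w, x):
    """Logistic probability for features x; w[0] is the intercept."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, lr=0.1, epochs=500):
    """Fit LR weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(rows[0][0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in rows:
            err = predict_proba(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Synthetic toy records, invented purely to show the call pattern.
    toy = [
        (date(2020, 1, 5), [0.2], 0),
        (date(2021, 1, 7), [0.9], 1),
        (date(2020, 7, 1), [0.1], 0),
        (date(2021, 7, 3), [0.8], 1),
    ]
    models = {s: fit_logistic(rows) for s, rows in time_stratify(toy).items()}
    print(sorted(models))
```

Fitting one model per stratum, rather than one model on the full dataset, mirrors the comparison the abstract reports. How the strata are actually formed and evaluated in the study may differ from this sketch.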

How to Cite

Amir Mohammad, S., اسامة, & Basheer Shukur, O. (2025). Using Logistic Regression with Time-Stratified Method for Air Pollution Datasets Forecasting. IRAQI JOURNAL OF STATISTICAL SCIENCES, 17(1), 19–28. Retrieved from https://rjps.uomosul.edu.iq/index.php/stats/article/view/20791