Robust Weighted Approaches to Detect and Deal with Outliers in Estimating the Principal Component Regression Model.
Abstract
This paper proposes an approach for dealing with multicollinearity among the explanatory variables and with outliers in the data. Principal component regression is used to address the multicollinearity, and robust weighting functions are then applied to the objective function to handle the outliers. To assess the efficiency of the estimators, an experimental study was carried out by simulation, and the methods were also applied to real data collected from the files of the Badoush Cement Factory in Nineveh Governorate for the period 2008-2014, with nine explanatory variables representing the chemical properties of cement and a dependent variable representing a physical property of cement (hardness). The data were first tested for multicollinearity, and the model was then estimated by least squares using the principal components as explanatory variables. The explanatory variables were found to suffer from multicollinearity, and the problem was treated by applying principal component regression weighted with robust weights, since the data contained outlying values in addition to the collinearity problem.
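To make the estimation procedure concrete, the sketch below illustrates the general idea of principal component regression combined with Huber-type iteratively reweighted least squares in Python. It is a minimal illustration, not the authors' implementation: the function names, the tuning constant c = 1.345, the MAD-based scale estimate, and the fixed number of retained components are all assumptions made for the example.

```python
import numpy as np

def huber_weights(residuals, c=1.345):
    """Huber weights: 1 for standardized residuals inside c, c/|r| outside."""
    # Robust scale estimate based on the median absolute deviation (MAD).
    scale = np.median(np.abs(residuals - np.median(residuals))) / 0.6745 + 1e-12
    r = residuals / scale
    w = np.ones_like(r)
    big = np.abs(r) > c
    w[big] = c / np.abs(r[big])
    return w

def robust_weighted_pcr(X, y, n_components, n_iter=50, tol=1e-8):
    """Principal component regression estimated by Huber-weighted IRLS (sketch)."""
    # Standardize the regressors and centre the response, as is usual for PCR.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yc = y - y.mean()

    # Eigen-decomposition of the correlation matrix of the regressors.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    V = eigvecs[:, order[:n_components]]   # loadings of the retained components
    Z = Xs @ V                             # principal component scores

    # Iteratively reweighted least squares with Huber weights on the residuals.
    gamma = np.linalg.lstsq(Z, yc, rcond=None)[0]
    for _ in range(n_iter):
        w = huber_weights(yc - Z @ gamma)
        Zw = Z * w[:, None]
        gamma_new = np.linalg.solve(Z.T @ Zw, Z.T @ (w * yc))
        if np.max(np.abs(gamma_new - gamma)) < tol:
            gamma = gamma_new
            break
        gamma = gamma_new

    # Coefficients of the standardized regressors implied by the components.
    beta_std = V @ gamma
    return beta_std, gamma

# Hypothetical usage with simulated collinear data containing a few outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=100)   # near-collinear column
y = X @ rng.normal(size=9) + rng.normal(size=100)
y[:5] += 15                                       # outlying responses
beta_std, gamma = robust_weighted_pcr(X, y, n_components=4)
```

In practice the number of retained components would be chosen from the eigenvalues (or the cumulative explained variance) of the correlation matrix, and the coefficients of the standardized regressors can be rescaled back to the original units if required.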
References
- Al-Rawi, Rawiya Emad Karim, (2017), Using fuzzy ordinal functions in the impartial estimation of the parameters of the simple linear regression model, master's thesis, College of Computer Science and Mathematics, University of Mosul, Iraq.
- Farrar, D. E., and Glauber, R. R., (1967), Multicollinearity in regression analysis: The problem revisited, The Review of Economics and Statistics, 49:92-107.
- Greene, W. H., (2003), Econometric Analysis, Prentice-Hall, New Jersey, 5th edition.
- Hadi, A. S., and Ling, R. F., (1998), Some cautionary notes on the use of principal components regression, The American Statistician, 52, 15-19.
- Huber, P. J., (1964), Robust estimation of a location parameter, Annals of Mathematical Statistics, 35:73-101.
- Huber, P. J., (1973), Robust regression: Asymptotics, conjectures, and Monte Carlo, Annals of Statistics, Vol. 1, no. 5, 799-821.
- Hubert, M., and Vanden Branden, K., (2003), Robust methods for partial least squares regression, Journal of Chemometrics, 17, 537-549.
- Hubert, M., and Verboven, S., (2003), A robust PCR method for high-dimensional regressors, Journal of Chemometrics, 17, 438-452.
- Imdadullah, M., Aslam, M., and Altaf, S., (2016), mctest: An R package for detection of collinearity among regressors, The R Journal, 8, 499-509. Available online: https://journal.r-project.org/archive/2016/RJ-2016-062/index.html (Accessed on 26 March 2020).
- Jolliffe, I. T., (1982), A note on the use of principal components in regression, Applied Statistics, 31, 300-303.
- Kendall, M. G., (1957), A Course in Multivariate Analysis, Charles Griffin & Company, London.
- Student, Diaa Majeed, Student, Bashar Abdel Aziz, and Student, Ali Diaa, (2011), Using some robust statistical methods to determine the expected achievements in the men's jumping competitions at the London and Rio de Janeiro Olympic Games (2012, 2016), Al-Qadisiyah Conference, published in the conference proceedings, Al-Qadisiyah University, Iraq.
- Kovács, P., Petres, T., and Tóth, (2005), A new measure of multicollinearity in linear regression models, International Statistical Review / Revue Internationale de Statistique, 73(3):405-412.
- Kutner, M. H., Nachtsheim, C. J., and Neter, J., (2004), Applied Linear Regression Models, McGraw-Hill/Irwin, 4th ed.
- Maddala, G. S., (1992), Introduction to Econometrics, Macmillan, New York.
- Makridakis, S., and Hibon, M., (1995), Evaluating Accuracy (or Error) Measures, INSEAD Working Papers Series 95/18/TM, Fontainebleau, France.
- Marquardt, D. W., (1970), Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation, Technometrics, 12(3):591-612.
- Memmedli, M., and Ozdemir, O., (2009), A comparison study of performance measures and length of intervals in fuzzy time series by neural networks, Proceedings of the 8th WSEAS International Conference on System Science and Simulation in Engineering.
- Montgomery, D. C., Peck, E. A., and Vining, G. G., (2001), Introduction to Linear Regression Analysis, 3rd edition, John Wiley & Sons, New York, USA.
- Sarwar, A., and Sharma, V., (2014), Comparative analysis of machine learning techniques in prognosis of type II diabetes, AI & Society, 29(1), 123-129.
- Willmott, C., and Matsuura, K., (2005), Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance, Climate Research, 30(1), 79-82.
- Woschnagg, E., and Cipan, J., (2004), Evaluating Forecast Accuracy, UK Ökonometrische Prognose, University of Vienna, Department of Economics.
- Adam Bremah Suleiman Mastour and Amal Al-Sir Al-Khader Abdel-Rahim, (2016), Treatment of the multicollinearity problem using principal components analysis (with an application to fuel consumption in cars), Sudan University of Science and Technology, College of Science, Department of Applied Statistics.
- Jibril, Muhammad Suleiman Muhammad, (2014), Multicollinearity, its causes and effects, and its treatment with ridge regression and principal component regression, with an application to hypothetical data, Sudan University of Science and Technology, College of Graduate Studies.
- Asteriou, D., and Hall, S. G., (2007), Applied Econometrics: A Modern Approach Using EViews and Microfit, Palgrave Macmillan, New York.
- Boiroju, N.K., and Reddy, M.K., (2012), A Graphical Method for Model Selection, Pakistan Journal of Statistics & Operation Research, pp. 767-776.
- Belsley, D. A., Kuh, E., and Welsch, R. E., (2004), Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, John Wiley & Sons, New York.
- Chatterjee, S., and Hadi, A. S., (2012), Regression Analysis by Example, 4th ed., John Wiley and Sons, New York.
- Curto, J. D., and Pinto, J. C., (2011), The corrected VIF (CVIF), Journal of Applied Statistics, 38(7):1499-1507.