
Thesis

2024

Using Nonlinear Transformations with Robust and Bootstrap Regression Analysis

2023
Transforming the dependent variable, the independent variables, or both can improve model fit and correct violations of model assumptions such as constant error variance, normality of the errors, and a linear relationship between the dependent and independent variables. Ordinary Least Squares (OLS) is the most widely used approach for estimating the parameters of linear regression models; in the presence of outliers, however, robust estimators are preferred to OLS. In this study, we applied different nonlinear transformation functions with OLS and robust regression models. To identify the best-performing transformation function, we compared the coefficient of determination, the Breusch-Pagan test, and the Shapiro-Wilk test across transformation functions, before and after transformation. We also applied the bootstrap approach, which has been used effectively for many statistical inference problems. In this thesis, we used an R package called “trafo”, which makes it simple for the user to decide which transformation function is suitable for fulfilling the assumptions. In practice, the assumptions of linear regression are often violated, for example when highly influential outliers exist in the dataset, which adversely affects the validity of the statistical analysis. Detecting outliers is important because they have a disproportionate influence on the estimated parameters and can therefore lead to invalid inferences and inaccurate predictions. Outliers are divided into Vertical Outliers (VOs), Good Leverage Points (GLPs), and Bad Leverage Points (BLPs), but only VOs and BLPs have an undue effect on parameter estimates. We compared several outlier detection techniques, using a robust diagnostic plot to distinguish VOs from BLPs while reducing both swamping and masking effects, for both the untransformed and transformed variables.
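The before-and-after comparison of a transformation described above can be illustrated with a minimal sketch. It is not the thesis's code and does not use the “trafo” package: it assumes synthetic data, a single predictor, and a log transformation of the response, and all function names (`ols_fit`, `r_squared`, `breusch_pagan_lm`, `compare_log_transform`) are hypothetical. The heteroscedasticity check is the studentized (Koenker) form of the Breusch-Pagan statistic, computed as n times the R² of the squared residuals regressed on the predictor; the Shapiro-Wilk test is omitted for brevity.

```python
import math
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the simple OLS line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """Coefficient of determination of the simple OLS fit of y on x."""
    a, b = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(x, y):
    """Studentized Breusch-Pagan statistic: n * R^2 of squared residuals on x."""
    a, b = ols_fit(x, y)
    squared_resid = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, squared_resid)


def compare_log_transform(n=300, seed=0):
    """Compare fit diagnostics before and after log-transforming y.

    Assumption: the data are synthetic, generated so that log(y) is linear
    in x with constant error variance.
    """
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 10.0) for _ in range(n)]
    y = [math.exp(0.5 + 0.3 * xi + rng.gauss(0.0, 0.2)) for xi in x]
    log_y = [math.log(yi) for yi in y]
    return {
        "r2_raw": r_squared(x, y),
        "r2_log": r_squared(x, log_y),
        "bp_raw": breusch_pagan_lm(x, y),
        "bp_log": breusch_pagan_lm(x, log_y),
    }


result = compare_log_transform()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With one predictor, the statistic is compared with a chi-square distribution on one degree of freedom, so values above roughly 3.84 suggest non-constant error variance at the 5% level. A higher R² together with a smaller Breusch-Pagan statistic after transformation is the kind of evidence the comparison relies on.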
The results indicate that finding a suitable transformation function for the independent and dependent variables helps in obtaining a more accurate regression model. When the data contain outliers, the robust MM (Modified Maximum Likelihood) estimator and the proposed bootstrap robust MM-estimator (Boot-MM) are recommended for fitting the regression model. The thesis also shows that the modified generalized DFFITS (difference in fits) plotted against the Diagnostic Robust Generalized Potential (MGDFF-DRGP) successfully detects outliers in the data. All results and figures were obtained using R.
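The idea of bootstrapping a robust regression estimator can be sketched as follows. This is not the thesis's Boot-MM procedure: as a simpler stand-in for the MM-estimator it assumes a Huber M-estimator fitted by iteratively reweighted least squares, which resists vertical outliers but not bad leverage points, and it assumes a pairs bootstrap with a percentile interval. The data are synthetic and all function names are hypothetical.

```python
import random
import statistics


def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b*x; returns (intercept, slope)."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def huber_fit(x, y, c=1.345, max_iter=50, tol=1e-8):
    """Huber M-estimate via IRLS, with a MAD-based residual scale."""
    a, b = weighted_fit(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(max_iter):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = statistics.median(resid)
        scale = 1.4826 * statistics.median([abs(r - med) for r in resid])
        if scale == 0.0:
            break
        # Huber weights: 1 inside the threshold, shrinking beyond it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in resid]
        a_new, b_new = weighted_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


def make_data(n=100, n_out=10, seed=1):
    """Synthetic line y = 2 + 1.5*x with vertical outliers at the largest x."""
    rng = random.Random(seed)
    x = sorted(rng.uniform(0.0, 10.0) for _ in range(n))
    y = [2.0 + 1.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
    for i in range(n - n_out, n):
        y[i] += 30.0  # shift the response only, leaving x unchanged
    return x, y


def bootstrap_slopes(x, y, n_boot=200, seed=2):
    """Pairs bootstrap: resample (x, y) rows and refit the Huber line."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        _, b = huber_fit([x[i] for i in idx], [y[i] for i in idx])
        slopes.append(b)
    return sorted(slopes)


x, y = make_data()
_, ols_slope = weighted_fit(x, y, [1.0] * len(x))
_, huber_slope = huber_fit(x, y)
slopes = bootstrap_slopes(x, y)
ci_low = slopes[int(0.025 * len(slopes))]
ci_high = slopes[int(0.975 * len(slopes)) - 1]
print(f"OLS slope:   {ols_slope:.3f}")
print(f"Huber slope: {huber_slope:.3f}")
print(f"95% percentile bootstrap interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The true slope in this synthetic setup is 1.5, so comparing the two printed slopes shows how much the contaminated points pull each fit. The pairs bootstrap is chosen here because it does not rely on the errors being identically distributed; a residual bootstrap would be an alternative under stronger assumptions.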
