Universiti Teknologi Malaysia Institutional Repository

Binary logistic regression modelling with appropriate sample size in determining graduate employability factors for public universities in Malaysia

Tengku Mohamed, Tengku Salbiah (2020) Binary logistic regression modelling with appropriate sample size in determining graduate employability factors for public universities in Malaysia. Masters thesis, Universiti Teknologi Malaysia.


Official URL: http://dms.library.utm.my:8080/vital/access/manage...

Abstract

Effective variable selection is essential to building a sound logistic regression model. In general, p-values are used to identify the significant variables or factors in the model. However, national tracer study data sets are typically large, and a large sample deflates the p-values, which degrades variable selection performance. It is therefore crucial to use an appropriate sample size and sampling ratio. In this study, an appropriate sample size is proposed based on simulated correlation tests and significant variables in order to improve the accuracy of variable selection. In addition, the sampling ratio of the response variable performs best when it reflects the population ratio. Based on the proposed samples, a logistic regression model for graduate employability factors is then proposed. Age, Cumulative Grade Point Average (CGPA), discipline of study, gender, state, and type of university are found to significantly affect graduate employability among public universities in Malaysia. The results show that the proposed model improves variable selection, model fitting, and classification accuracy compared with the full model. Thus, with a smaller sample size, the proposed model maintains its statistical power in a real-data scenario by accurately selecting the significant factors.
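The two ideas in the abstract, p-values shrinking as the sample grows and subsampling that keeps the response ratio equal to the population ratio, can be illustrated with a minimal Python sketch. This is not the thesis's procedure: the pooled two-proportion z-test stands in for the Wald test on a logistic regression coefficient, the employment rates 0.80 and 0.81 and the 8,000/2,000 label split are invented for illustration, and the function names `two_proportion_p_value` and `stratified_subsample` are hypothetical.

```python
import math
import random


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value of a pooled two-proportion z-test, equal group sizes.

    Stand-in for a coefficient test in logistic regression (an assumption,
    not the method used in the thesis).
    """
    pooled = (p1 + p2) / 2
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    z = (p1 - p2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def stratified_subsample(labels, sample_size, seed=0):
    """Return indices of a subsample whose 0/1 ratio mirrors that of `labels`."""
    rng = random.Random(seed)
    ones = [i for i, y in enumerate(labels) if y == 1]
    zeros = [i for i, y in enumerate(labels) if y == 0]
    # Allocate the sample to each class in proportion to its population share.
    n_ones = round(sample_size * len(ones) / len(labels))
    n_zeros = sample_size - n_ones
    return rng.sample(ones, n_ones) + rng.sample(zeros, n_zeros)


if __name__ == "__main__":
    # Same one-point difference in (hypothetical) employment rates,
    # tested at increasing sample sizes.
    for n in (500, 5_000, 50_000, 500_000):
        print(n, two_proportion_p_value(0.80, 0.81, n))

    # Hypothetical population: 1 = employed, 0 = not employed.
    labels = [1] * 8_000 + [0] * 2_000
    idx = stratified_subsample(labels, 1_000)
    print(sum(labels[i] for i in idx) / len(idx))
```

With the effect held fixed, the test statistic grows with the square root of the sample size, so a difference that is not significant in a small sample becomes significant in a very large one. This is why a smaller, ratio-preserving sample can give a more discriminating variable selection.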

Item Type: Thesis (Masters)
Uncontrolled Keywords: p-values, simulated correlation, Cumulative Grade Point Average (CGPA)
Subjects: Q Science > QA Mathematics
Divisions: Science
ID Code: 101869
Deposited By: Widya Wahid
Deposited On: 17 Jul 2023 02:36
Last Modified: 17 Jul 2023 02:36
