Identifying key patterns of college student’s background through exploratory data analysis
Abstract
Declining student interest has forced universities to examine the characteristics of their students. Higher education statistics show that the number of new students has fluctuated in recent years. Several studies have used an exploratory data analysis (EDA) approach to analyze new student admissions data. EDA offers a summary of the dataset and preliminary findings. Some variables were dropped because they contained a high number of missing values. Other variables were filled with the mean or mode because no more than 20% of their values were missing. The missing values in each attribute might also be cleaned in other ways. The university admission team might encourage registrants to enter complete and correct data into the system. Based on the visualizations, we found that college students applied to the university from a range of regional and demographic backgrounds. The marketing division might apply a different strategy in areas with a small number of colleges, such as Kalimantan. Public health, computer science, and industrial technology are majors with potential to be promoted because of their job prospects.
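The missing-value rule described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the 20% threshold comes from the abstract, while the column names, the sample records, and the choice of mean for numeric columns and mode for all other columns are hypothetical.

```python
from statistics import mean, mode

# From the abstract: impute when no more than 20% of a variable is missing.
MISSING_THRESHOLD = 0.20


def clean_missing(rows, threshold=MISSING_THRESHOLD):
    """Drop sparse columns; fill the rest with the mean (numeric) or mode (other)."""
    if not rows:
        return []
    fill_values = {}
    for col in rows[0].keys():
        present = [r[col] for r in rows if r.get(col) is not None]
        missing_share = (len(rows) - len(present)) / len(rows)
        if missing_share > threshold or not present:
            continue  # too many missing values: the column is dropped
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        # Assumption: numeric columns get the mean, everything else the mode.
        fill_values[col] = mean(present) if numeric else mode(present)
    return [
        {col: (r[col] if r.get(col) is not None else fill)
         for col, fill in fill_values.items()}
        for r in rows
    ]


# Hypothetical admissions records; None marks a missing value.
applicants = [
    {"province": "Sulawesi Selatan", "exam_score": 80, "parent_income": None},
    {"province": "Sulawesi Selatan", "exam_score": None, "parent_income": None},
    {"province": None, "exam_score": 70, "parent_income": 5},
    {"province": "Kalimantan Timur", "exam_score": 90, "parent_income": None},
    {"province": "Sulawesi Selatan", "exam_score": 60, "parent_income": None},
]

cleaned = clean_missing(applicants)
```

In this sample, `parent_income` is missing in four of five records and is dropped, while `province` and `exam_score` are each missing once (exactly 20%) and are filled instead. A stricter or looser threshold can be passed through the `threshold` argument.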

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.