Designing and Evaluating a Big Data Analytics Approach for predicting students’ success factors

doi:10.21203/rs.3.rs-2075479/v1

Method Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 13 Oct, 2023

Read the published version in Journal of Big Data →

You are reading this latest preprint version

Reducing student attrition in tertiary education plays a significant role in the core mission and financial well-being of an educational institution. The big data available from a Learning Management System (LMS) can be analysed to help address attrition. This study uses an integrated Design Science Research (DSR) methodology to develop and evaluate a Big Data Analytical Solution (BDAS), embedded in an Educational Decision Support System, as an educational artefact. The BDAS, as a DSR artefact, applies Artificial Intelligence (AI) based approaches to a dataset collected from student interaction with the LMS to train a predictive model that identifies students who are potentially at risk. Identifying students at risk enables timely intervention in the learning process to improve student academic progress and increase the retention rate. To evaluate the performance of the predictive model, we compare the accuracy of a collection of representative Artificial Intelligence algorithms from the literature. The BDAS aims not to replace any existing practice but to support educators in implementing a variety of pedagogical practices to improve students’ academic performance.

Design Science Research (DSR)

Big Data

Big Data Analytical Solution (BDAS)

Machine Learning (ML)

Deep Learning (DL)

DSR Evaluation

Artificial Intelligence (AI)

Despite the increasing demand for higher qualifications in industry, more students than in the past discontinue their studies without completing a degree. On average, 23% of students enrolled in the tertiary sector leave without completing their course (1, 2). Student attrition is a challenging issue and a growing concern for higher education (HE) providers. HE providers compete to acquire students and seek strategies to retain them. These institutions pay close attention to student numbers in terms of declining enrolment, increased competition, and retention or attrition rates. Attrition is a natural part of higher education and is defined by the number of non-completing students who leave the degree before finishing (3). Several studies claim that attrition has increased significantly in Australia.

The increasing attrition rate has multiple consequences, both social and economic (4). Student attrition not only negatively affects the social interaction of individuals but also has negative financial consequences for students, institutions, and the economy. Students who do not complete their degree fail to find better or more appropriate career opportunities. HE providers lose revenue and reputation if students leave before finishing their education. Student attrition costs not only HE providers but the government as well. Non-completing students are unable to pursue progressive careers that earn a higher income. Consequently, they may be unable to pay back their study loans (5). According to the Parliament of Australia (6), the total outstanding study loans of approximately 3 million Australians amounted to $68.7 billion in 2020, approximately 16% of which is not expected to be repaid. Existing studies have addressed curriculum design (7, 8) and student performance improvement (discussed in the next section), but student attrition has not been given much attention. Considering these factors, the Department of Education, Science, and Training (DEST) has recently emphasised student attrition as one of the indicators for improving the performance of HE providers (9, 10). This has opened a lasting opportunity for researchers to study HE student attrition and to measure different factors and strategies (11) to reduce it.

In the relevant literature (12), student academic progress is considered one of the key determinants of student attrition. Providers can extend academic support to students through quality learning and teaching to enhance their academic performance. Early and timely identification of students at risk using an Information System (IS) can help HE providers take appropriate measures to enhance student academic progress (13–15). For example, an Educational Decision Support System (DSS) can be considered a key IS for supporting the relevant decisions (16). Management can arrange early interventions that help students cope with their studies and improve their academic progress. This reduces the likelihood that students leave their studies, leading to a lower attrition rate.

In HE, educational big data is gathered from different educational management activities and from students’ academic and non-academic activities. Large and varied sets of student data are generated by educational information systems such as student management information systems, LMS, and administrative management systems; these include demographic and socio-economic data, personal and social data, enrolment data, academic attribute data, and LMS log data. Big data analytics processes large heterogeneous datasets and supports data visualization, adaptive learning, and feedback systems to provide valuable insight for educators (17–19), and it is widely adopted in the educational sector. Big data analytics can be classified as descriptive analytics, diagnostic analytics, decisive analytics, prescriptive analytics, and predictive analytics (20). Machine Learning (ML), cluster analysis, text mining, knowledge domain and reasoning based approaches, decision making methods, pattern matching, search and optimization theory algorithms, and semantic analysis are big data analytics techniques and approaches in the AI discipline (21–23). AI based data analytics techniques can be applied to these types of big data to automate analytical model building and to identify students at risk of failing by predicting their academic performance. These AI based predictive models can be embedded into an Educational DSS to help educational management plan and offer support mechanisms that are beneficial and effective for struggling students, assisting them in attaining their academic goals.

In this research, we adopted an innovative research methodology to develop a novel BDAS and to evaluate it for accurately predicting students at risk of failing in the early weeks of the semester, using a model trained on a student LMS interaction dataset. The BDAS allows educators to focus more on teaching and research instead of undertaking tedious and inefficient administrative duties, which can be biased by human intervention. This study has three novelties. First, the innovative research methodology is grounded in the similarities between Design Science Research (DSR) and Design-based Research (DBR) for developing and evaluating the BDAS. Second, the BDAS uses LMS data to detect, early in the semester, students who may fail, so that student learning can be enhanced through accurate and timely intervention. Third, an extended evaluation framework is used to rigorously evaluate the BDAS based on simulation of real scenarios. Timely detection and measurement will improve student progress, which will result in increased retention and decreased attrition, with a positive impact on students, HE providers, and the economy.

The remainder of the paper is organized as follows. First, we review the educational environment, research methodology, BDAS, and evaluation framework to identify the gap to explore. After the background, the paper details the major components of the research, including the methodology, artefact design, and evaluation framework. Subsequently, the study presents the results and contributions made. In the final section, we summarize the study and suggest future directions.

Recently, AI has been adopted extensively and effectively in the computing field. The benefits and enhancements that AI brings to the education sector have been highlighted in the literature. Examples of the application of AI in the educational sector include, but are not limited to, data analytics, prediction of student enrolments, recommendation systems for career pathways or resource management, adaptive tutoring, prediction of student readiness for employment, and monitoring and predicting student academic performance or identifying struggling students. Table 1 presents a brief overview of related previous work.

Table 1

A brief overview of related work
Early prediction of undergraduate Student’s academic performance in completely online learning: A five-year study (15)	Proposed a collection of AI models to predict student academic progress from LMS interaction data and student academic data like GPA and enrolment test data. The data consists of LMS log files, demographics, and academic achievement. No research methodology is identified.
Predicting Students’ Academic Performance Through Supervised Machine Learning (24)	Developed an AI based system to predict student performance from demographic and LMS interaction data. The dataset comprises demographic characteristics and LMS interaction data, including gender, country, birthplace, views of the LMS content, quiz attempts, and assessment submissions. The nature of the dataset does not allow early prediction. The research methodology is not clear.
Predicting Students’ Academic Procrastination in Blended Learning Course Using Homework Submission Data (25)	Developed an algorithm to enhance students’ academic progress by detecting struggling students through their homework submission behaviours, e.g., no submission or late submission. The nature of the dataset does not allow enough time to offer timely interventions and support to enhance student academic performance. No research methodology (e.g., DSR or DBR) is identified for constructing the predictive model.
An Efficient Approach for Multiclass Student Performance Prediction based upon Machine Learning (26)	Predicted students’ performance by using four classification algorithms. The same dataset is used in other studies as well, but with different ML classifiers (27, 28). The study used data from secondary school students, not HE students, and did not use LMS data. It used socio-economic attributes of students, which do not allow timely identification of at-risk students. The research approach is not based on the similarities of DSR and DBR principles.
Design, development, and evaluation of a mobile learning application for computing education (29)	Applied the DSR approach to develop a mobile learning application for HE for better student learning. The research approach is based only on DSR, not on DBR principles or on the similarities between DSR and DBR. No AI (DL or ML) models are used to predict student academic performance.

Existing studies do not focus on LMS big data to predict academic performance early in the learning pathway. Most studies have used data generated in traditional on-campus or completely online educational settings, and few have studied data generated by student interaction with an LMS in blended learning. Also, most of the existing research did not highlight the significance of identifying at-risk students in the early stages of their studies. There is a need to investigate a real-time automated analytical solution that identifies students at risk of failing early in a blended learning environment, so that strategies and remedial measures can be offered in time to keep student academic progress on track. Furthermore, most of the related studies are insufficient with respect to research methodology and DSR artefact construction and evaluation: they did not use an integrated DSR and DBR methodology to lay out the study for designing and developing an artefact; they use big data analytics approaches but do not employ a DSR, DBR, or integrated DSR paradigm; and they did not evaluate the DSR artefacts according to their complexity. However, the existing literature can be leveraged and extrapolated to achieve the objective of this study, thus forming its foundation.

Integrated Design Science Research Methodology

A research methodology defines the guidelines and boundaries within which a study can be conducted, ensuring its scientific value and significance. Researchers highlight research methodology as the most significant step in accomplishing the purposes of the research. This study developed and used an innovative IS research methodology based on the similarities of two research approaches: the DSR methodology from IS and the Design-based Research (DBR) methodology. DBR is considered a realization of DSR in the education sector; here it is used to conduct research to develop and evaluate a BDAS as an IT and DSR artefact. DSR complements DBR and provides multi-paradigm perspectives to construct fundamental knowledge by researching social pragmatisms (30–32).

The DSR approach suits studies that justify the research requirement and contribute to knowledge and to the development of the artefact (33). For example, Miah et al. (34) used the DSR framework to design a mobile based application for education; Carstensena and Bernhard (35) designed and improved teaching in the engineering education sector by utilizing the DSR methodology; Miah et al. (36) utilized the DSR approach to extend a mobile health information system; and Miah et al. (37) described the development of the design of a DSS as a method artefact. The DBR methodology intends to achieve outcomes that improve student learning or enhance understanding of teaching and learning or other educational phenomena (38). The similarities between the two methodologies are:

Both are problem solving methodologies
Both approaches design from a viable practical perspective
Both approaches contribute to the knowledge base
Both reflect on the nature of the theory
Both produce the theoretical and practical artefact
Both have an iterative cycle of design and rigorous evaluation

The study followed an integrated DSR methodology (39) consisting of five phases based on the similarities of DSR and DBR, leveraging a variation of Peffers’ DSR Methodology (33). The five phases, as shown in Fig. 1, are: (1) Problem Identification; (2) Solution Analysis; (3) Artefact Design and Development; (4) Evaluation; (5) Outcome Communication.

The study begins with a detailed problem description and an analysis of existing studies to derive, from the literature, the design requirements and objectives for a BDAS. By executing a systematic literature review and meta-analysis, this phase formulates the design principles for the design and development of the DSR artefact in a later phase. Next, the study evaluates the findings to establish design considerations for the BDAS. In the third phase, the BDAS as a DSR artefact is designed, developed, and evaluated formatively by using AI data analysis techniques (ML and DL algorithms). In the final phases, the summative evaluation is carried out and the outcomes of the study are communicated as a contribution to the knowledge area.

Problem Identification and Objectives of the Artefact

In the initial phases of our integrated DSR research methodology, an extensive systematic literature review and meta-analysis (SLRM) was conducted on the application of AI based technology in HE with regard to student academic progress. The SLRM aims to understand the trends in applying AI based technology across a wide spectrum of tasks related to monitoring and predicting student academic performance, and to identify the different AI algorithms and the process of developing AI models. The SLRM was conducted using the PRISMA (40) framework by defining a search protocol that incorporates inclusion and exclusion criteria, and it provided rich findings. The SLRM highlighted the phases, algorithms, and evaluation metrics used in the studies. These algorithms and evaluation metrics form the foundation of the design and development of the BDAS.

The objective of designing and developing the BDAS is to train and evaluate a predictive model on classified data to predict students’ academic progress. The predictive model must be sufficiently accurate to identify students who are at risk of failing. The prediction can assist educators in implementing strategies to enhance student learning and improve academic performance. The BDAS can be integrated into coursework for timely and accurate identification of student academic progress, especially for students at risk. This timely identification supports earlier intervention to improve their academic performance. The generic computational model consists of data collection, data pre-processing, data analysis with algorithms, and evaluation. This generic model is tailored for each iteration of the design and development phase of the BDAS. Each iteration utilized different pre-processing techniques and algorithms to achieve the objective of the BDAS. In the case of educational big data, a large amount of real-time data is generated by the LMS. The BDAS predictive model is trained on a training dataset and will be deployed and integrated with the LMS using a data processing framework that sections the real-time big data stream into small segments via pipelines; these segments are fed to the BDAS to predict student academic performance, supporting enhanced student academic progress and better decision making. Fig. 2 shows the process of design and development of the BDAS as a DSR artefact.

Big Data, LMS and Big Data Analytics

Big data technologies can play a significant role in improving data processing, data storage, data analytics, and visualization (41). Big data significantly affects the transformation of the learning process and the adoption of relevant innovative technologies (13). An overview of big data analytics in HE is illustrated in Fig. 3. LMS platforms, e.g., Moodle, Blackboard, Canvas, Forma LMS, and OpenOLAT, are considered a major source of big data and are essential applications for planning, delivering, monitoring, and assessing the learning process. Moodle and Blackboard are the most popular LMS platforms. An LMS platform has three key purposes: (i) management of digital content material and student access records, (ii) management of assessments and student progress, and (iii) management of student feedback and interaction (42).

LMS generate rich and large volumes of data, which increases the need for innovative solutions to improve learning and education management. There is also an emerging requirement for LMS-integrated tools to interpret and manipulate the data generated by LMS (42, 43).

Big data is produced by users (educators, administrators, and students) interacting with the LMS in different ways. For example, educators upload digital course materials for their students, and students access these materials for learning; students attempt LMS based tests related to a specific concept; or students submit assessment documents on the LMS. Big data analytics applies a set of analytical techniques to extract useful information and provide insight from big educational data related to students’ learning behaviours, assessment scores, learning styles, login information, time spent on a task or module, assessment submission patterns, most visited pages or content, completion of a task or module, or posted details about extracurricular activities (44–46).

Big data analytics makes it possible to identify the real learning patterns of students more accurately than traditional practices. It helps HE providers make better-informed decisions based on the big data generated by LMS. It supports (42, 45, 47–49):

Customized and adaptive learning for better learning path
Plagiarism detection in student submissions to improve academic integrity
Student performance prediction for better course delivery planning
Course Selection or Recommendation System
Identification of students at risk based on their behaviour patterns to plan and deliver appropriate and timely interventions
Dropout prediction
Student participation and engagement measurement tracking to enhance learning experience
Strategic planning to achieve HE goals

AI algorithms typically take all input data at once and process it to produce output, which is not feasible in big data analytics because of the high velocity and huge volume of big data. There are multiple approaches to solving this issue and applying AI algorithms to educational big data, e.g., high-performance computing infrastructure, parallel processing, and/or data processing platforms for data segmentation. In this study, a data processing platform is suggested for deploying the BDAS artefact (42, 45).
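
The segmentation idea can be illustrated with a minimal sketch in plain Python. It is not the data processing platform the study proposes; the function name `micro_batches`, the batch size, and the toy event records are assumptions made only for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most batch_size items from a stream."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(stream)
    while True:
        # islice pulls at most batch_size items without loading the whole stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical LMS events (student id, keystroke count); not from the study's dataset.
events = [(f"s{i % 3}", i * 10) for i in range(7)]
batches = list(micro_batches(events, batch_size=3))
```

Each batch could then be passed to a trained model independently, which is the property that makes stream segmentation attractive when the full data cannot be held in memory.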

Artefact Design and Development

An AI based DSR artefact is a complex artefact and is designed according to the requirements and objectives identified in previous phases. Design approaches developed around contextual knowledge and general practices lead to enhanced artefact design (50). This study used two sets of iterations to design and develop the BDAS as a predictive model based on existing approaches in the literature: an ML based predictive model and a DL based predictive model. In this phase, we apply ML and DL algorithms to design and develop ML based and DL based predictive models as DSR artefacts that accurately identify students potentially at risk of failing from a dataset of student LMS interaction. The iterative approach in this phase continuously improves the construction of the DSR artefact by evaluating various performance metrics derived from the confusion matrix in each iteration. The performance metrics of the different AI algorithms in each iteration are compared to select the best predictive model.
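
As a minimal sketch of how performance metrics follow from a confusion matrix, the Python below computes accuracy, precision, recall, and F1 for a binary at-risk label. The function names and the toy labels are assumptions for illustration; the study does not list which metrics it derived beyond accuracy.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Dict[str, int]:
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive standard metrics; undefined ratios are reported as 0.0."""
    tp, fp, tn, fn = (counts[k] for k in ("tp", "fp", "tn", "fn"))
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = at risk); hypothetical, not results from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
scores = metrics(counts)
```

For at-risk detection, recall is often the metric to watch alongside accuracy, because a false negative corresponds to a struggling student who receives no intervention.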

The BDAS as a DSR artefact is constructed through a series of tasks consisting of data collection, data pre-processing, data analysis with AI algorithms, evaluation, and successful decision making (13, 51). All these tasks are tailored to develop and evaluate ML and DL based predictive models. The workflow of training an AI based artefact is illustrated in Fig. 4.

This study has sourced a freely available dataset comprising 230,318 instances of students’ activities and interactions with LMS to train the predictive model. The dataset consists of 13 features including time-series based features i.e., Session number, Student number, Exercise number, Activity name abbreviation, Start time of the activity, End time of the activity, Idle time during activity, Mouse wheel movement count, Mouse wheel click count, count of Mouse left click, count of Mouse right click, Mouse movement count and count of Keystroke. The dataset is pre-processed and normalized, and features are selected by correlational analysis to build a dimensional vector including categorised features. This transformed dataset is then used to train the predictive model by using ML and DL algorithms to detect students at risk of failing.
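
The pre-processing steps can be sketched as follows, assuming min-max normalization and a Pearson correlation between each feature and the target label. Both choices, the 0.5 threshold, and the toy values are assumptions; the study states only that the data were normalized and that features were selected by correlational analysis.

```python
import math
from typing import Dict, List, Sequence


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Rescale a column to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def select_features(features: Dict[str, Sequence[float]],
                    target: Sequence[float], threshold: float) -> List[str]:
    """Keep features whose absolute correlation with the target meets the threshold."""
    return [name for name, column in features.items()
            if abs(pearson(min_max_normalize(column), target)) >= threshold]


# Toy values for two of the dataset's feature types; hypothetical numbers.
features = {"keystrokes": [10, 20, 30, 40, 50], "idle_time": [5, 3, 4, 5, 3]}
target = [0, 0, 0, 1, 1]
selected = select_features(features, target, threshold=0.5)
```

Correlation-based selection is a simple filter method: it ignores interactions between features, which is one reason the study compares several algorithms on the transformed dataset rather than relying on the filter alone.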

In the first iteration, five tree based supervised ML algorithms (J48, Random Forest, OneR, Decision Stump, and NBTree) are used to train and evaluate the predictive model. These tree based algorithms use a series of if-then decisions to generate highly accurate, easily interpretable predictions that identify students potentially at risk of failing. A boosting ensemble technique is applied to the transformed dataset for further fine-tuning. The predictive model is trained and tested using k-fold cross validation with each of the five supervised ML algorithms in turn. In the final step, the performance metrics of the predictive models based on the five ML algorithms are compared to select the most accurate predictive model for constructing the BDAS. In the real-time implementation of the BDAS, a data processing framework, e.g., Apache Spark, will be used to receive and segment the real-time big data stream from the LMS and decompose the large data into small batches to be processed by the BDAS predictive model.
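
The k-fold procedure can be sketched in plain Python with a hand-written decision stump on a single feature. This is an illustration only, not the study's implementation: the function names, the toy idle-time values and labels, and k = 5 are all assumptions.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[float, int, int]  # (threshold, label if x <= threshold, label otherwise)


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_stump(xs: Sequence[float], ys: Sequence[int]) -> Stump:
    """Pick the threshold and label assignment with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        for low, high in ((0, 1), (1, 0)):
            errors = sum((low if x <= threshold else high) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    return best[1], best[2], best[3]


def predict_stump(stump: Stump, x: float) -> int:
    threshold, low, high = stump
    return low if x <= threshold else high


def cross_validate(xs: Sequence[float], ys: Sequence[int], k: int) -> float:
    """Mean held-out accuracy of the stump over k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        stump = fit_stump(train_x, train_y)
        correct = sum(predict_stump(stump, xs[i]) == ys[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Toy data: idle seconds per session and an at-risk label (hypothetical).
idle_seconds = [5, 40, 8, 35, 3, 50, 10, 45, 6, 38]
at_risk = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
mean_accuracy = cross_validate(idle_seconds, at_risk, k=5)
```

Because every instance is held out exactly once, the averaged accuracy is a less optimistic estimate than accuracy on the training data, which is why it is a reasonable basis for comparing the five algorithms.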

In the second iteration of continuous improvement of the design of the AI based artefact, two different data pre-processing techniques are used to modify the class distribution and augment the dataset to resolve the implications of an imbalanced dataset. DL algorithms are made up of neural networks with several layers of differentiable nonlinear nodes. Three DL algorithms, Long Short-term Memory (LSTM), Multi-layer Perceptron (MLP), and Sequential Model (SM), are trained on the augmented dataset, which yielded higher classification accuracy of the prediction model and fewer false predictions. Higher classification accuracy and fewer false predictions mean fewer cases in which a student’s risk status is identified incorrectly, thereby addressing the objective of the BDAS as a DSR artefact.
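
The study does not name the two pre-processing techniques it used. As one common way to modify a class distribution, the sketch below shows random oversampling of the minority class; the technique choice, the function name `random_oversample`, the seed, and the toy data are assumptions and should not be read as the authors' method.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def random_oversample(rows: Sequence, labels: Sequence[int],
                      seed: int = 0) -> Tuple[List, List[int]]:
    """Duplicate randomly chosen minority rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [row for row, lab in zip(rows, labels) if lab == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy imbalanced data: 8 not-at-risk rows and 2 at-risk rows (hypothetical).
rows = list(range(10))
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

Oversampling should be applied to the training portion only; duplicating rows before the data are split would leak copies of test instances into training and inflate the measured accuracy.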

Artefact Evaluation

The evaluation phase focuses on whether the developed artefact has achieved the purpose it was designed for, and it is a vital phase of a study in the DSR domain. The evaluation of the developed artefact within its context is a vital component of the evaluation strategy (52). In this study, the BDAS as the artefact is evaluated by an innovative DSR evaluation framework to evaluate the utility, efficacy, and effectiveness (53, 54) of the artefact with hybrid evaluation requirements by using the confusion matrix, given in Fig. 5. In addition, to train, test, and evaluate an AI based predictive model, the original dataset is divided into three (3) sections, i.e., a training dataset, a testing dataset, and a validation dataset. The predictive model is trained and tested on the training dataset and testing dataset respectively during the construction of the predictive model. The trained predictive model is then evaluated on the validation dataset to obtain a generalized predictive model.
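
A three-way split of this kind can be sketched as below. The 70/15/15 proportions, the seed, and the function name `three_way_split` are assumptions; the study does not report the split ratios it used.

```python
import random
from typing import List, Sequence, Tuple


def three_way_split(rows: Sequence, train: float = 0.7, test: float = 0.15,
                    seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle rows and split them into training, testing, and validation parts."""
    if not (0 < train < 1 and 0 < test < 1 and train + test < 1):
        raise ValueError("train and test must be fractions that sum to less than 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    # Whatever remains is held back for the final, summative evaluation.
    validation = shuffled[n_train + n_test:]
    return training, testing, validation


# Toy row identifiers standing in for LMS interaction instances (hypothetical).
rows = list(range(20))
training, testing, validation = three_way_split(rows)
```

Keeping the validation part untouched until model selection is finished is what allows it to serve as unseen data for the terminal evaluation episode.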

Summative evaluation episodes highlight the outcome and impact of the implemented artefact in a context and are thus performed towards the completion of the study. One summative episode was performed to evaluate the effectiveness and efficacy of the predictive model in accurately identifying students at risk early in the semester. The validation dataset is used to execute this terminal evaluation episode, to evaluate the effectiveness of the BDAS predictive model and to obtain a generalized BDAS predictive model. The second and final summative episode, an ex-post evaluation of the utility for real users with live unseen data, is left for future work.

The study outlined an integration of two research methodologies, DSR and DBR, based on key similarities between them, to design, construct, and evaluate a BDAS as a DSR artefact. This forms an appropriate research paradigm for designing, developing, and evaluating the BDAS artefact, which can be implemented to enhance academic performance through timely intervention strategies for those who are at risk of failing and to support better decision making.

Several technological opportunities, such as learning analytics, are emerging because of the big data from LMS in HE. The BDAS artefact complements existing practices by helping educators discover students potentially at risk very early in the semester and contact them to take remedial action and mitigate the risk of dropping out. This paper presents the steps to design and develop an AI based BDAS using an integrated DSR methodology and to rigorously evaluate it to improve the accuracy with which the BDAS identifies students at risk. The big data analytics approach contributes to the knowledge area because it utilizes multiple AI techniques to improve the accuracy of the predictive model, i.e., performing correlation analysis between LMS attributes to select attributes, tuning classifier algorithm parameters, augmenting the dataset, and applying both ML and DL algorithms to select the best performing predictive model for constructing the BDAS artefact. This paper presents the two phases used to design and develop a predictive model that improves identification accuracy early in the semester. The AI based BDAS can serve as an alert system for educators, who can then provide appropriate support by taking the necessary steps to improve student academic progress. Our BDAS approach fills the gap of using data generated by student interaction with an LMS in blended learning, together with an automated, near real-time process for early detection of students at risk of failing in a blended learning environment, which is beneficial from both academic and administrative perspectives. In addition, this paper places great focus on evaluating the AI based BDAS by executing numerous formative and summative evaluation episodes, owing to the hybrid and complex nature of the AI based BDAS. The innovative evaluation framework provides well designed phases, including evaluation episode plans, to guide future researchers in evaluating hybrid and complex artefacts such as the BDAS.
The AI based BDAS as an Educational DSS would help students and educators at different HE providers (e.g., Massive open online course (MOOC) providers, universities, and Non-University Higher Education (NUHE) providers) keep learning pathways from being derailed.

High-performance computational infrastructure and interoperability of educational big data are required for practical deployment of the BDAS in an educational system. In the future, we will work on the full implementation of the BDAS and its integration into the students’ LMS to evaluate its efficiency and utility in real-time use by students as the client. The extension will add detail on how the BDAS might support decision making about which strategies to use for students identified as at risk.

ADR	Action Design Research
AI	Artificial Intelligence
BDAS	Big Data Analytical Solution
BIE	Building, Intervention, and Evaluation
DBR	Design-based Research
DEST	Department of Education, Science, and Training
DL	Deep Learning
DSR	Design science research
DSS	Decision Support System
FEDS	Framework for Evaluation in Design Science
HE	Higher Education
IS	Information System
LMS	Learning Management System
LSTM	Long Short-term Memory
ML	Machine Learning
MLP	Multi-layer perceptron
MOOC	Massive open online course
NUHE	Non-University Higher Education
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
SLRM	Systematic literature review and meta-analysis
SM	Sequential Model

Acknowledgements

Not Applicable.

Author contributions

All authors contributed equally. Both authors read and approved the final manuscript.

Funding

Not Applicable.

Availability of data and materials

The data used in the study were downloaded from a public data repository and are publicly available.

Declarations

Ethics approval and consent to participate

Not Applicable

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Martín MD, Jansen L, Beckmann EA. Understanding the Problem Student attrition and retention in university Language & Culture programs in Australia. The Doubters' Dilemma. Exploring student attrition and retention in university language and culture programs: ANU Press; 2016. p. 1-30.
Cherastidtham I, Norton A. University attrition: what helps and what hinders university completion? : Grattan Institute; 2018.
TEQSA. Tertiary Education Quality and Standards Agency (TEQSA)'s Risk Assessment Framework. Australian Government; 2019. Contract No.: 2.3.
Ulriksen L, Madsen LM, Holmegaard HT. What do we know about explanations for drop out/opt out among young people from STM higher education programmes? Studies in Science Education. 2010;46(2):209-44.
Sarra A, Fontanella L, Di Zio S. Identifying Students at Risk of Academic Failure Within the Educational Data Mining Framework. Social Indicators Research. 2019;146.
Ferguson H. Parliament of Australia. 2021 [cited 2022]. Available from: https://www.aph.gov.au/About_Parliament/Parliamentary_Departments/Parliamentary_Library/FlagPost/2021/November/HELP-2020-21.
Miah S, Solomonides I, Gammack J. A design-based research approach for developing data-focussed business curricula. Education and Information Technologies. 2020;25.
Miah S, Solomonides I. Design Requirements of a Modern Business Master's Degree Course: Perspectives of Industry Practitioners. Education and Information Technologies. 2021;26.
Panel HES. Final Report - Improving retention, completion and success in higher education. Department of Education and Training (DEST); 2017. Contract No.: ISBN : 978-1-76051-156-2.
Institute TV. Student Attrition Report : Comprehensive Analysis and Recommendations. Victoria University; 2013.
Aljohani O. A Comprehensive Review of the Major Studies and Theoretical Models of Student Retention in Higher Education. Higher Education Studies. 2016;6:1-18.
Beer C, Lawson C. The problem of student attrition in higher education: An alternative perspective. Journal of Further and Higher Education. 2017;41(6):773-84.
Miah S, Miah M, Shen J. Editorial note: Learning management systems and big data technologies for higher education. Education and Information Technologies. 2020;25.
Plak S, Cornelisz I, Meeter M, van Klaveren C. Early warning systems for more effective student counselling in higher education: Evidence from a Dutch field experiment. Higher Education Quarterly. 2022;76(1):131-52.
Bravo-Agapito J, Romero SJ, Pamplona S. Early prediction of undergraduate Student's academic performance in completely online learning: A five-year study. Computers in Human Behavior. 2021;115:106595.
Miah SJ. An ontology based design environment for rural decision support. Griffith University; 2008.
Kumar P, editor. Big Data Analytics: An Emerging Technology. 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom); 2021 17-19 March 2021.
Chunzi S, Xuanren W, Ling L, editors. The Application of Big Data Analytics in Online Foreign Language Learning among College Students : Empirical Research on Monitoring the Learning Outcomes and Predicting Final Grades. 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI); 2020 23-25 Oct. 2020.
Babiceanu RF, Seker R. Big Data and virtualization for manufacturing cyber-physical systems: A survey of the current status and future outlook. Computers in Industry. 2016;81:128-37.
Sun X, Fu Y, Zheng W, Huang Y, Li Y. Big Educational Data Analytics, Prediction and Recommendation: A Survey. Journal of Circuits, Systems and Computers. 2022.
Sekeroglu B, Abiyev R, Ilhan A, Arslan M, Idoko J. Systematic Literature Review on Machine Learning and Student Performance Prediction: Critical Gaps and Possible Remedies. Applied Sciences. 2021;11:10907.
Rahmani A, Azhir E, Ali S, Mohammadi M, Ahmed O, Ghafour M, et al. Artificial intelligence approaches and mechanisms for big data analytics: a systematic study. PeerJ Computer Science. 2021;7:e488.
Begum A, Fatima F, Haneef R. Big Data and Advanced Analytics: Helping Teachers Develop Research Informed Practice. 2019. p. 594-601.
Bhutto ES, Siddiqui IF, Arain QA, Anwar M, editors. Predicting Students’ Academic Performance Through Supervised Machine Learning. 2020 International Conference on Information Science and Communication Technology (ICISCT); 2020 8-9 Feb. 2020.
Akram A, Fu C, Li Y, Javed MY, Lin R, Jiang Y, et al. Predicting Students’ Academic Procrastination in Blended Learning Course Using Homework Submission Data. IEEE Access. 2019;7:102487-98.
Jain A, Solanki S, editors. An Efficient Approach for Multiclass Student Performance Prediction based upon Machine Learning. 2019 International Conference on Communication and Electronics Systems (ICCES); 2019 17-19 July 2019.
Imran M, Latif S, Mehmood D, Shah MS. Student Academic Performance Prediction using Supervised Learning Techniques. International Journal of Emerging Technologies in Learning (iJET). 2019;14(14):92-104.
Ma X, Zhou Z, editors. Student pass rates prediction using optimized support vector machine and decision tree. 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC); 2018 8-10 Jan. 2018.
Oyelere SS, Suhonen J, Wajiga GM, Sutinen E. Design, development, and evaluation of a mobile learning application for computing education. Education and Information Technologies. 2018;23(1):467-95.
Singh H, Miah S. Smart Education Literature: A theoretical analysis. Education and Information Technologies. 2020;25.
Genemo H, Miah S, McAndrew A. A design science research methodology for developing a computer-aided assessment approach using method marking concept. Education and Information Technologies. 2015;21.
Miah SJ, Gammack JG. Ensemble Artifact Design For Context Sensitive Decision Support. Australasian Journal of Information Systems. 2014;18(2).
Peffers K, Tuunanen T, Rothenberger MA, Chatterjee S. A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems. 2007;24(3):45-77.
Singh H, Miah SJ. Design of a mobile-based learning management system for incorporating employment demands: Case context of an Australian University. Education and Information Technologies. 2018;24(2):995-1014.
Carstensen A-K, Bernhard J. Design science research – a powerful tool for improving methods in engineering education research. European Journal of Engineering Education. 2019;44(1-2):85-102.
Miah SJ, Gammack J, Hasan N. Extending the framework for mobile health information systems Research: A content analysis. Information Systems. 2017;69:1-24.
Miah S, Kerr D, Hellens L. A collective artefact design of decision support systems: Design science research perspective. Information Technology & People. 2014;27.
Anderson T, Shattuck J. Design-Based Research. Educational Researcher. 2012;41:16-25.
Fahd K, Miah SJ, Ahmed K, Venkatraman S, Miao Y. Integrating design science research and design based research frameworks for developing education support systems. Educ Inf Technol. 2021;26:4027-48.
Page M, McKenzie J, Bossuyt P, Boutron I, Hoffmann T, Mulrow C, et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
Otoo-Arthur D, van Zyl T. A Scalable Heterogeneous Big Data Framework for e-Learning Systems. 2020. p. 1-15.
Ang L-M, Ge F, Seng K. Big Educational Data & Analytics: Survey, Architecture and Challenges. IEEE Access. 2020;PP:1-.
Cantabella M, Martínez-España R, Ayuso B, Yáñez J, Muñoz A. Analysis of student behavior in learning management systems through a Big Data framework. Future Generation Computer Systems. 2018;90.
Otoo-Arthur D, van Zyl T. A Systematic Review on Big Data Analytics Frameworks for Higher Education - Tools and Algorithms. 2019. p. 1-9.
Elatia S, Ipperciel D. Learning Analytics and Education Data Mining in Higher Education. 2021. p. 108-26.
Anshari M, Alas Y, Yunus N, Sabtu N, Hamid M. Online Learning: trends, issues, and challenges in the Big Data Era. Journal of E-Learning and Knowledge Society. 2016;12:121-34.
Sharma A, Dhaka A, Nandal A, Swastik K, Kumari S. Big Data Analysis: Basic Review on Techniques. 2021. p. 208-33.
Ashaari MA, Dara Singh K, Abbasi G, Amran A, Cabanillas F. Big data analytics capability for improved performance of higher education institutions in the Era of IR 4.0: A multi-analytical SEM & ANN perspective. Technological Forecasting and Social Change. 2021;173:121119.
Şahin M, Yurdugül H. Educational Data Mining and Learning Analytics: Past, Present and Future. 2020;9:121-31.
Miah SJ, Gammack JG, McKay J. A Metadesign Theory for Tailorable Decision Support. Journal of the Association for Information Systems. 2019:570-603.
Janssen M, Voort H, Wahyudi A. Factors influencing big data decision-making quality. Journal of Business Research. 2016;70.
Miah S, Debuse J, Kerr D. A Development-Oriented IS Evaluation Approach: Case Demonstration for DSS. Australasian Journal of Information Systems. 2012;17.
Hevner AR, March ST, Park J, Ram S. Design Science in Information Systems Research. Management Information Systems Quarterly. 2004;28:75.
Venable J. The role of theory and theorising in design science research. First International Conference on Design Science Research in Information Systems and Technology. 2006.
Venable J, Pries-Heje J, Baskerville R. FEDS: a Framework for Evaluation in Design Science Research. European Journal of Information Systems. 2016;25(1):77-89.
Sein M, Henfridsson O, Purao S, Rossi M, Lindgren R. Action Design Research. MIS Quarterly. 2011;35:37-56.


Editorial decision: Major revision
09 May, 2023
Reviews received at journal
31 Mar, 2023
Reviewers agreed at journal
01 Mar, 2023
Reviewers invited by journal
01 Mar, 2023
Editor assigned by journal
26 Sep, 2022
Submission checks completed at journal
26 Sep, 2022
First submitted to journal
17 Sep, 2022
