1 Introduction
Artificial Intelligence for Education (AI4ED) [1–3] has revolutionized traditional education and emerged as a trending topic. These modern educational platforms are dedicated to leveraging artificial intelligence technology to provide personalized and high-quality educational experiences for students. Student cognitive modeling [4], as a fundamental task in intelligent tutoring systems, aims to capture students' cognitive abilities on diverse aspects (typically various knowledge components (KCs)) through their historical learning behaviors (especially exercise-answering records). Accurate student cognitive modeling can facilitate a wide range of downstream tasks, such as student profiling [5, 6], education resource recommendation [7–14], adaptive testing [15], and so on.
With the rapid progress of student cognitive modeling, it is imperative for researchers to develop a project for easily reproducing these published algorithms and designing new algorithms with minimum effort. However, this is not trivial as current student cognitive modeling works are rather fragmented. Researchers put repeated efforts into finding related datasets and reproducing algorithms. Therefore, there is a need to reconsider the implementation of student cognitive modeling techniques. In this paper, we develop a PyTorch-based library called EduStudio for student cognitive modeling and provide a range of user-friendly eco-services to enhance EduStudio. We are committed to promoting research and development for the AI4ED community.
EduStudio integrates models from both cognitive diagnosis (CD) [16, 17] and knowledge tracing (KT) [18–20], the two mainstream categories in the student cognitive modeling field. Fig.1 shows widely used application scenarios of these two categories. Specifically, CD is often used to quantify a student's cognitive ability (e.g., the mastery degree of a specific KC) with well-designed questions from an assessment or test. For instance, a well-known scenario of CD is the Programme for International Student Assessment (PISA) [21, 22]: around 690,000 students took the PISA assessment in 2022, representing about 29 million 15-year-olds from schools in 81 participating countries and economies [23].
CD is based on the static cognitive assumption and deals with the challenge of how to better quantify student ability with sparse student records. In contrast, KT focuses on tracking students' knowledge states over a long period under the dynamic cognitive assumption. Many online tutoring apps are equipped with KT technologies. As such, these apps can provide personalized exercise recommendations to improve students' abilities (as students can see feedback after answering each exercise) and predict their future performance by mining their historical learning behaviors [7, 24, 25]. In summary, both categories utilize student answering records to mine students' cognitive abilities. However, due to the differences in cognitive modeling approaches between CD and KT, existing libraries often focus on a particular category and overlook the relationships between them.
Recent efforts have been devoted to developing libraries for student cognitive modeling [26–28]. They consider static or dynamic modeling separately and implement some cognitive modeling models. Nevertheless, we have identified some shortcomings and limitations in their endeavors to advance the community. Existing libraries: 1) focus on a single category, which ignores the relation between the two categories; 2) lack sufficient abstraction, which leads to poor flexibility and reusability; 3) lack adequate eco-services, which limits the development of the community. Therefore, we set out to develop a highly reusable and flexible library covering both CD and KT, with comprehensive eco-services. The comparison of our EduStudio with other libraries is detailed in Section 7. The primary advanced features of EduStudio are summarized as follows:
● We develop a unified library that combines CD and KT under the student cognitive modeling view. Unlike existing open-source libraries that primarily focus on a single category, we not only enable reusability within each individual category but also facilitate sufficient reusability between the two categories. We aim to facilitate communication between the two research groups for better student cognitive modeling.
● We provide a modularized and templatized design when implementing models for better flexibility and reusability. Existing libraries often lack clear boundaries between the individual procedures in the algorithmic pipeline, leading to poor flexibility. We decompose each algorithm pipeline into six modules and propose a horizontal modularization flow for each algorithm. Besides, we extract the commonality of each module into reusable templates and implement a vertical templatization design of each module for high-level management.
● We offer a range of eco-services surrounding EduStudio, which can further enable more researchers to understand and quickly participate in the field of student cognitive modeling. We provide a GitHub repository that collects valuable resources for student cognitive modeling. In addition, we develop a Leaderboard website to provide a comprehensive comparison of various models.
2 Background
In this section, we introduce the task categorization and data description of student cognitive modeling. Subsequently, we provide a review of existing works on CD and KT.
2.1 Task and data description
Task description. Student cognitive modeling aims to model students’ cognitive states based on learning data, such as their interactive records of answering exercises. Classified from the perspective of variation in cognitive states, CD and KT are two mainstream categories for modeling students’ cognitive states. CD is typically used to assess students’ static cognitive states on knowledge components. It helps to understand students’ knowledge mastery in specific domains and identify their weaknesses and areas for improvement. KT focuses more on monitoring students’ dynamic cognitive changes and learning progress. It tracks the development of students’ cognitive ability at different time steps and identifies their learning trajectories and trends. Therefore, CD and KT are two types of tasks proposed from different perspectives of cognitive state variation.
Data description. Here we discuss the various types of data involved in student cognitive modeling. As shown in Fig.2, the dataset includes the interactive records of students answering exercises, as well as the relationship information between the exercises and the KCs. Additionally, the features of students and exercises, as well as the relations among KCs, also contain rich information that can enhance the accuracy of modeling. Various models selectively utilize different features and data formats based on their requirements.
● Student-side features typically include information about students’ family background, school background, and other relevant factors. These pieces of information are valuable for modeling students’ abilities as prior knowledge.
● Student-exercise interactions are the fundamental input for student cognitive modeling. They encompass common features such as correctness labels, answer textual content, and interaction timestamps. In addition, some studies [29–31] also design diverse forgetting features via interaction timestamps to capture students' forgetting characteristics.
● Exercise-side features refer to the content information of exercises. This includes various modalities such as textual descriptions, images, and other multimedia elements associated with the exercises. They are valuable for modeling the difficulty of exercises and identifying the KCs they cover.
● Exercise-KC relationships are referred to as the Q-matrix [32] in student cognitive modeling. The Q-matrix reveals the KCs encompassed within each exercise and serves as a bridge for establishing student cognition of KCs through student-exercise interactions (see the toy example after this list).
● KC-side features mainly lie in KC relationships, which typically fall into two categories: inclusion relationships and prerequisite relationships. Inclusion relationships refer to coarse-grained KCs that encompass multiple finer-grained KCs. Prerequisite relationships indicate that one KC should usually be learned before another.
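To make the Q-matrix concrete, here is a toy example; the values are illustrative and not drawn from any real dataset:

```python
import numpy as np

# A toy Q-matrix for 3 exercises and 4 knowledge components (KCs).
# Entry Q[i, k] = 1 means exercise i requires KC k.
Q = np.array([
    [1, 0, 1, 0],   # exercise 0 covers KC 0 and KC 2
    [0, 1, 0, 0],   # exercise 1 covers KC 1 only
    [1, 1, 0, 1],   # exercise 2 covers KC 0, KC 1, and KC 3
])

# KCs required by exercise 2:
print(np.flatnonzero(Q[2]))  # -> [0 1 3]
```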
2.2 Existing works
We introduce the development of student cognitive modeling, including CD and KT.
Cognitive diagnosis. Originating from psychometrics, CD emerges as a pivotal branch of test theory. Test theory methods are predominantly formulated on the foundations of educational and psychometric theories and assumptions [33]. The most exemplary of these is Item Response Theory (IRT) [34], which integrates factors such as student ability, exercise difficulty, exercise discrimination, and exercise guess probability into a logistic function to forecast the probability of a correct response. Like its predecessor, Classical Test Theory (CTT) [35], the student ability measured by IRT is on a macro level. Consequently, subsequent researchers propose incorporating micro-level knowledge structures (e.g., the Q-matrix) into cognitive modeling [32, 36, 37], improving the interpretability of the model.
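For reference, the following sketch spells out the classical three-parameter logistic (3PL) formulation that the above description of IRT alludes to; the parameter values are illustrative:

```python
import math

def irt_3pl(theta, a, b, c):
    """Three-parameter logistic (3PL) IRT model.

    theta: student ability
    a:     exercise discrimination
    b:     exercise difficulty
    c:     exercise guess probability
    Returns the probability of a correct response.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# A student slightly above the exercise difficulty:
print(irt_3pl(theta=0.5, a=1.2, b=0.0, c=0.25))  # ~0.73
```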
With the emergence of deep learning, the NCDM [38] model pioneered the use of neural networks to replace simple logistic functions in modeling the complex interactions of students when answering exercises. Subsequently, more and more neural CD models [39–42] further refine the model architecture to enhance the prediction performance of CD. Beyond architectural enhancements, researchers are progressively integrating diverse data sources, including exercise-, student-, and KC-side data. For exercise-side data, CNCD-F [39] and CNCD-Q [39] respectively extract the reading comprehension difficulty factor and KCs from the textual content of exercises. For student-side data, ECD [43] incorporates information such as the student's family background into the prediction of student performance, while FairCD [44] and FairLISA [45] use students' sensitive attributes for fairness research. Models like MGCD [46] utilize features such as the class identifier to consider group-level CD. Regarding KC-side data, RCD [47] and HierCDF [48] introduce the prerequisite relationships of KCs into CD to further enhance performance, while DCD [49] uses the inclusion relationships of KCs for CD in scenarios where KC annotations of exercises are substantially absent.
Knowledge tracing. In the field of KT, its early iterations primarily encompassed probabilistic models and logistic models. Probabilistic models assume that the learning process follows a Markov process, where students' latent knowledge states can be estimated from their observed performance [19, 50]. Within this paradigm, models such as BKT [51] and DBKT [52] stand out as exemplary. Logistic models constitute a significant category of models grounded in logistic functions, which encapsulate the probability of correctly answering exercises within a mathematical framework that accounts for both student and KC parameters. Notable models within this class include LFA [53], PFA [54], and KTM [55].
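As a concrete illustration of the probabilistic paradigm, below is a minimal sketch of one BKT update step (the standard Bayesian update with learn/slip/guess parameters); the parameter values are illustrative and not tied to any particular implementation:

```python
def bkt_update(p_mastery, correct, p_learn=0.1, p_slip=0.1, p_guess=0.2):
    """One Bayesian Knowledge Tracing step for a single KC.

    p_mastery: prior probability that the student has mastered the KC.
    correct:   whether the observed response was correct.
    Returns the posterior mastery probability after the response.
    """
    if correct:
        evidence = p_mastery * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_mastery) * p_guess)
    else:
        evidence = p_mastery * p_slip
        posterior = evidence / (evidence + (1 - p_mastery) * (1 - p_guess))
    # Learning transition: the student may acquire the KC after practicing.
    return posterior + (1 - posterior) * p_learn

p = 0.3  # initial mastery
for obs in [True, True, False, True]:
    p = bkt_update(p, obs)
    print(f"mastery = {p:.3f}")
```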
In the era of deep learning, the evolution of KT is manifested through sophisticated network architectures that enhance performance. The primary characteristic lies in the incorporation of diverse network structures to model the dynamic cognition of students. DKT [24] pioneered the introduction of RNNs and LSTMs to model the evolving cognitive states of students. Subsequently, an array of models based on LSTM or RNN architectures has been proposed [29, 56–58]. Inspired by memory-augmented neural networks, subsequent models began to enhance the representation of students' memory processes [59–61]. With the rise of the Transformer, there has been a surge in attention-based architectures [62–64]. Since the interactions between students and exercises, the relationships between exercises and KCs, and the interconnections among KCs can all be represented as graph structures, some studies explore graph-based KT [65, 66].
Due to the similarities between CD and KT, some works that integrate CD and KT have also been proposed [67]. A typical category of such work uses CD models to enhance the interpretability of traditional KT models [33, 68, 69]. For instance, Deep-IRT [68] is a synthesis of the IRT [34] model and DKVMN [59] that makes deep learning-based KT interpretable. DynamicCD [33] incorporates educational priors from CD models into KT for better interpretability.
3 Overview of EduStudio
In this section, we first summarize the challenges faced in developing EduStudio when unifying CD and KT. To address these challenges, we present the design philosophy in Fig.3. Grounded in the design philosophy, the overall architecture is depicted in Fig.4.
3.1 Challenges of developing EduStudio
After introducing the background, we can observe that the data usage of student cognitive modeling is diverse and that there are both commonalities and differences between CD and KT. Here we mainly analyze the challenges in developing a unified library for CD and KT. The solutions to these challenges are detailed in Section 4.7.
● Unified management of multifaceted data. Data utilized by CD and KT, relating to students, exercises, and KCs, varies in format among different dataset publishers. Standardizing data file formats and maintaining commonality for effective data management is a pressing issue.
● Ensuring reusability and flexibility in the context of unifying CD and KT. Since both CD and KT are methods for student cognitive modeling, there are commonalities and differences in their approaches. Therefore, ensuring reusability for commonalities and ensuring flexibility for differences is a major challenge.
● Compatibility for future task scenarios. In both CD and KT, there are various task scenarios, such as fairness, cold start, and so on. When designing EduStudio, it is necessary to consider compatibility with both existing task scenarios and unknown future task scenarios.
3.2 Design philosophy
3.2.1 Horizontal modularization
From the horizontal modularization viewpoint, we decompose the general algorithmic pipeline into six modules: Configuration reading, data preparation, model implementation, training control, model evaluation, and log storage.
● Configuration reading (Step 1) aims to collect, categorize, and deliver configurations from different configuration portals.
● Data preparation (Step 2) aims to read raw data files from the disk and then convert them into model-friendly data objects.
● Model implementation (Step 3) refers to the process of implementing the structure of each model and facilitating the reuse of model components.
● Training control (Step 4) focuses on the training process of various models.
● Model evaluation (Step 5) focuses on the implementation of various evaluation metrics.
● Log storage (Step 6) aims to implement storage specifications when storing generated data.
Horizontal modularization establishes clear boundaries for each step throughout the algorithm pipeline, facilitating the incorporation of new elements into individual modules.
3.2.2 Vertical templatization
When it comes to a specific module, we observe that there are numerous elements within the module that require implementation and management. Without proper high-level management of these elements, subsequent development and reusability can become overly complex. Thus, we implement vertical templatization design within the modules for Steps 2–5 in Fig.3. We manage these complex elements within the modules using templates, which ensures a well-organized structure. Furthermore, we have developed numerous base templates and created new templates by inheriting from these base templates. These templates are reusable by the models, enhancing their reusability. It should be noted that since all models share the same configuration reading method and log storage path management, these two modules are called in a common, model-independent area. In this case, there is no need for the template-based design for them.
3.3 Overall architecture
Based on the above design philosophy, the overall architecture of EduStudio is illustrated in Fig.4. Steps 2–5 are four templatized modules, while Steps 1 and 6 are common modules that are shared by all the models.
For the four templatized modules (i.e., data preparation, model implementation, training control, and model evaluation), we abstract the intricate elements within each module into various reusable templates. Within each templatized module, we implement multiple templates with inheritance relationships. Each template inherits from a basic template prefixed with Base. These basic templates only provide basic functionalities to maintain the fundamental operation of the library. With this templatized design, we can easily extend a new template within any module, enabling reusability when implementing new models.
In addition to the aforementioned four modules, there are two additional modules (i.e., configuration reading and log storage) that are shared by all models. For configuration reading, we prioritize and categorize the configurations from four flexible configuration portals. This allows us to identify five categories of configurations, where four categories correspond to the four templatized modules, and the last category involves framework-specific configurations. For Log Storage, we store logs from failed or ongoing runs in temporary storage, while successful run logs are stored archivally. This allows users to conveniently discard failed experiments.
4 Design of EduStudio
We organize this section into multiple subsections based on horizontal modularization. Within each subsection, we delve into our vertical templatization design. Ultimately, we provide an in-depth explanation of addressing challenges that are described in Section 3.1.
4.1 Configuration reading
Configuration Reading aims to collect, categorize, and deliver configurations from different configuration portals. We first collect configurations from four flexible configuration portals (e.g., configuration file and command line). Then we retain the highest-priority configurations and categorize them into five groups: data template configuration, model template configuration, training template configuration, evaluation template configuration, and frame configuration (library-specific configurations). Categorized configuration objects make it easier for users to find and utilize them. Finally, we deliver categorized configuration objects to their corresponding modules.
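The following sketch illustrates the general idea of priority-based configuration merging; the portal names, priority order, and keys here are assumptions for illustration, not EduStudio's actual defaults:

```python
# A minimal sketch of priority-based configuration merging across portals.
from collections import ChainMap

default_cfg   = {"epoch_num": 100, "lr": 1e-3, "device": "cpu"}
yaml_file_cfg = {"lr": 2e-3}          # e.g., from a configuration file
code_cfg      = {"device": "cuda:0"}  # e.g., passed in Python code
cmd_line_cfg  = {"epoch_num": 50}     # e.g., parsed from the command line

# ChainMap looks up keys left to right, so earlier portals win conflicts.
merged = dict(ChainMap(cmd_line_cfg, code_cfg, yaml_file_cfg, default_cfg))
print(merged)  # {'epoch_num': 50, 'lr': 0.002, 'device': 'cuda:0'}
```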
4.2 Data preparation
Data Preparation aims to convert raw data from the hard disk into model-friendly data objects. Standardizing the data preparation pipeline is challenging in the library design because various student cognitive models utilize data with diverse content and formats. For example, CD handles interaction data ignoring timestamp, while KT handles sequential interaction data. Additionally, models may selectively utilize features such as relations, contexts, and other relevant data features.
To address the aforementioned challenges, let us first clarify the workflow of data preparation, as shown in Fig.5. The first step is to load the raw data from the hard disk. Then, a series of processing steps are performed to obtain model-friendly data objects. Finally, these data objects are passed on to other modules. We simplify the data preparation into three stages:
● Data loading: Loading necessary data from the hard disk.
● Data processing: Convert the raw data into model-friendly data objects by a range of data processing operations.
● Data delivery: Deliver model-friendly data objects to other modules.
Among these three stages, data processing is the most complex and feature-rich stage in data preparation. Therefore, we have established a set of standardized protocols and developed a series of atomic data operations for data processing (detailed in Section 4.2.1). These protocols and operations help streamline and enhance the data processing stage, making it more efficient and effective. Finally, we utilize the data template (detailed in Section 4.2.2) to manage and control these three stages, enabling a complete and reusable data preparation process. The data template ensures consistency and standardization throughout the stages, facilitating efficient data preparation for the following steps.
4.2.1 Protocols for data processing
In order to standardize the complete workflow of data preparation, we propose three protocols for the data processing stage: data status, middle data format, and atomic data operation protocols.
● Data status protocol. We categorize data into three statuses: 1) inconsistent rawdata: the original data format provided by the dataset publisher. This data format is diverse and lacks unification; 2) standardized middata: the standardized middle data format defined by EduStudio. This unified format is friendly for researchers to read; 3) model-friendly cachedata: the data format that is convenient for model usage. In EduStudio, we implement data cache functionality, which allows users to bypass the data processing procedure in subsequent experiments after saving cached data from the previous experiment.
● Middle data format protocol. As mentioned in the data status protocol, middata is the standardized data format. We define standardized data files for student-exercise interaction data, student-side features, exercise-side features, and so on, which are detailed on the EduStudio official website.
● Atomic data operation protocol. To achieve reusability and flexibility in data preparation, we propose the concept of atomic data operations, converting the whole data processing procedure into a sequence of reusable atomic data operations. From rawdata to middata, we require users to specify one atomic data operation (i.e., a Python class prefixed with R2M) to convert raw data into standardized middata. From middata to cachedata, we allow users to specify multiple atomic data operations (i.e., multiple Python classes prefixed with M2C) that are applied sequentially.
Founded on the above protocols, we offer a comprehensive range of atomic data operations to facilitate the transformation of rawdata into middata, and subsequently into cachedata. These operations include R2M (Rawdata to Middata) and M2C (Middata to Cachedata) atomic operations. The ability to combine and substitute atomic operations provides flexibility.
● Atomic data operations for transformation of rawdata to middata: Due to the diverse nature of rawdata in different datasets, we provide a total of 18 R2M operations for all supported datasets within the library. These operations are designed to transform the raw data into an intermediate data format, facilitating subsequent processing and analysis.
● Atomic data operations for transformation of middata to cachedata: To ensure the compatibility of data objects with models, particularly cachedata, we meticulously devise a range of M2C operations. These operations can be broadly classified into four main categories based on the type of data processing: data cleaning, data conversion, data partition, and data generation. As indicated in Tab.1, data cleaning focuses on refining the data by applying filters to students or exercises and addressing missing values. Data conversion aims to modify the data format; we specifically design operations to accommodate the triple form in CD and the sequence form in KT. Data partition involves dividing the entire dataset into training, validation, and test sets for CD and KT. Data generation aims to produce additional features that can enhance prediction capabilities, such as KC inclusion relationships and KC prerequisite relationships (a hypothetical operation sketch follows this list).
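Below is a minimal sketch of what an M2C atomic data operation could look like, assuming a simple process(df) -> df interface; the class and method names are hypothetical, and EduStudio's actual base-class contract may differ:

```python
import pandas as pd

class M2C_FilterSparseRecords:
    """Hypothetical data-cleaning operation: drop students and exercises
    with too few interactions before building cachedata."""

    def __init__(self, min_stu_inters=10, min_exer_inters=5):
        self.min_stu_inters = min_stu_inters
        self.min_exer_inters = min_exer_inters

    def process(self, inter_df: pd.DataFrame) -> pd.DataFrame:
        df = inter_df.copy()
        df = df.groupby("stu_id").filter(lambda g: len(g) >= self.min_stu_inters)
        df = df.groupby("exer_id").filter(lambda g: len(g) >= self.min_exer_inters)
        return df

# Atomic operations compose sequentially: middata -> ... -> cachedata,
# e.g., pipeline = [M2C_FilterSparseRecords(), ...]  # further hypothetical ops
```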
4.2.2 Data templates
Data templates ensure consistency and standardization throughout the three stages of data preparation, facilitating efficient data preparation for the following steps. Tab.2 demonstrates three highly reusable data templates: the base template, general template, and educational template. The base data template is not specific to educational data and provides basic functionalities to maintain the fundamental operation of the library. The general template inherits from the base template and focuses on scenarios involving simple educational data with only student-exercise interaction data. It implements three protocols in data preparation. The educational template inherits from the general template and includes additional student-side and exercise-side features. When implementing a new data template, the focus lies in loading data and composing various atomic data operations.
4.3 Model implementation
Model Implementation refers to the process of implementing the structure of each model and facilitating the reuse of model components. We designed two basic model templates: Base (BaseModelTPL) and Gradient Descent Base (GDBaseModelTPL). By inheriting the basic model templates, we collectively implemented 45 student cognitive models.
As listed in Tab.2, there are two basic model templates, namely BaseModelTPL and GDBaseModelTPL, which define the specifications for model implementation in EduStudio. The difference between them lies in the fact that the latter builds upon the former by considering models that can be optimized by gradient descent. GDBaseModelTPL provides additional tools and functionalities specifically designed for gradient descent-based models. All models are required to inherit from one of these basic model templates and adhere to the corresponding interface functions. For example, we specify an interface function for loading extra required data (such as the Q-matrix and KC relationships) beyond student-exercise interactions. Additionally, we define an interface function that returns a loss dictionary containing multiple losses. A sketch of this contract is given below.
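The following sketch illustrates this contract on a toy IRT-like model. The method names (add_extra_data, get_loss_dict, predict) are assumptions for illustration; consult the developer guide for the exact interface of GDBaseModelTPL:

```python
import torch
import torch.nn as nn

class MyCDModel(nn.Module):  # inside EduStudio this would inherit GDBaseModelTPL
    def __init__(self, n_stu, n_exer):
        super().__init__()
        self.stu_ability = nn.Embedding(n_stu, 1)  # IRT-style student ability
        self.exer_diff = nn.Embedding(n_exer, 1)   # IRT-style exercise difficulty

    def add_extra_data(self, **kwargs):
        # Receive extra required data (e.g., Q-matrix, KC relationships)
        # beyond student-exercise interactions.
        self.Q_mat = kwargs.get("Q_mat")

    def predict(self, stu_id, exer_id):
        logit = self.stu_ability(stu_id) - self.exer_diff(exer_id)
        return torch.sigmoid(logit).squeeze(-1)

    def get_loss_dict(self, stu_id, exer_id, label):
        # Return a dictionary that may hold multiple losses.
        pred = self.predict(stu_id, exer_id)
        return {"loss_main": nn.functional.binary_cross_entropy(pred, label)}

m = MyCDModel(n_stu=50, n_exer=20)
stu = torch.randint(0, 50, (16,))
exer = torch.randint(0, 20, (16,))
label = torch.randint(0, 2, (16,)).float()
print(m.get_loss_dict(stu, exer, label))
```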
During the implementation process of the models, we develop reusable components for portability. For instance, we implement a Positive MultiLayer Perceptron (PosMLP) to support the monotonicity assumption [38] that is widely used in CD models for interpretability. The monotonicity assumption states that the probability of a correct response to an exercise increases monotonically with any dimension of the student's cognitive proficiency.
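A minimal sketch of how a positive-weight layer can enforce monotonicity is shown below: taking the absolute value of the weights makes the output non-decreasing in every input dimension. EduStudio's PosMLP may realize this differently (e.g., by clipping negative weights):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PosLinear(nn.Linear):
    """Linear layer whose effective weights are always non-negative,
    so the output is monotonically non-decreasing in every input."""
    def forward(self, x):
        return F.linear(x, self.weight.abs(), self.bias)

# Stacking positive-weight layers with monotone activations keeps the
# whole network monotone in the student's proficiency vector.
pos_mlp = nn.Sequential(PosLinear(4, 8), nn.Sigmoid(), PosLinear(8, 1))
x = torch.rand(2, 4)
print(pos_mlp(x))
```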
We currently implement 16 models for CD and 29 models for KT in EduStudio. We arrange implemented models in terms of data usage and technique usage in Tab.3.
4.4 Training control
Training control focuses on the training methods of different models. It is worth noting that in the training control procedure, some implemented training templates are shared between the CD and KT. This highlights the ability of EduStudio to promote significant reusability between them.
For the models implemented so far, we summarize the mainstream training paradigms for student cognitive modeling and provide a corresponding training template for each paradigm, such as general training (GeneralTrainTPL) and adversarial training (AdversarialTrainTPL), as listed in Tab.2. Their ancestral training template (i.e., BaseTrainTPL) provides the necessary functionality to maintain the basic operation of the library. GDBaseTrainTPL, which builds on BaseTrainTPL, provides utilities for gradient descent-based models. When a new training paradigm emerges, we can inherit these base training templates to implement a new training template.
4.5 Model evaluation
Model evaluation primarily focuses on the implementation of various evaluation metrics. They can be shared by all CD and KT models according to their respective needs. As illustrated in Tab.2, we currently implement four kinds of important metrics for student cognitive models.
● Student performance prediction evaluation aims to evaluate the prediction of students' responses to exercises, which can usually be formulated as a binary classification task. Common metrics include classification metrics such as Area Under the Curve (AUC) and ACCuracy (ACC), as well as regression metrics like Root Mean Square Error (RMSE) (see the sketch after this list).
● Cognitive representation interpretability evaluation aims to evaluate students' cognitive results. NCDM [38] proposes the Degree of Agreement (DOA) metric, whose intuition is that if student a has a better mastery of KC k than student b, then a is more likely to answer exercises related to k correctly than b. The authors of IC-IDM [85] consider that the order of interpretable students' knowledge proficiencies should be consistent with the order of response scores on relevant exercises, and propose the Degree of Consistency (DOC) metric.
● Cognitive representation identifiability evaluation aims to measure the discrepancy between the cognitive abilities of students with the same response distribution. In general, students exhibiting the same response distribution should demonstrate similar cognitive outcomes. IC-IDM [85] proposes the identifiability concept for various CD models and a quantitative Identifiability Score (IDS) to measure identifiability.
● Cognitive fairness evaluation aims to measure fairness. FairCD [44] explores fairness in CD and proposes a metric whose intuition is that a model is considered fair if the gap between true proficiency and predicted proficiency is identical across different groups. FairLISA [45] utilizes the classical fairness metrics Demographic Parity (DP) [86] and Equal Opportunity (EO) [87] to measure fairness.
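As promised above, here is a short sketch of the student-performance-prediction metrics using scikit-learn; the toy labels and probabilities are illustrative:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, accuracy_score, mean_squared_error

y_true = np.array([1, 0, 1, 1, 0])              # observed correctness
y_prob = np.array([0.9, 0.3, 0.6, 0.8, 0.4])    # predicted probabilities

auc = roc_auc_score(y_true, y_prob)
acc = accuracy_score(y_true, (y_prob >= 0.5).astype(int))
rmse = np.sqrt(mean_squared_error(y_true, y_prob))
print(f"AUC={auc:.3f}, ACC={acc:.3f}, RMSE={rmse:.3f}")
```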
4.6 Log storage
Log Storage aims to implement a storage specification for generated data, which primarily depends on path management. Tab.4 displays the path management. For log storage paths, we designate the <project>/temp/ directory to store logs of ongoing or failed experiments as temporary storage, and the <project>/archive/ directory to store logs of completed experiments as archive storage, which makes it convenient for users to abandon failed experiments. For a detailed experiment log, we stipulate: 1) config.json stores all configuration information; 2) <ID>.log stores the training log; 3) result.json stores the model evaluation results; 4) the /pth/ directory stores model parameters at each epoch or at the best epoch.
4.7 Summary
After introducing the detailed design of EduStudio, in this section, we elaborate our solutions to the challenges discussed in Section 3.1.
The primary challenge of EduStudio is to efficiently reuse the commonalities of CD and KT (reusability) while preserving their differences (flexibility). We adopt a modularized and templatized design (detailed in Section 3.2) to address this challenge. This design philosophy is reflected in all six delineated modules, as elaborated below. 1) For the Configuration Reading and Log Storage modules, we reuse the same configuration and storage methodologies across both CD and KT, as these two modules are task-agnostic. 2) In the Data Preparation module, we segment the entire data processing procedure into a series of atomic data operations, some of which are shared between CD and KT, while others are specific to each task. 3) In the Model Implementation module, we develop reusable components between CD and KT for portable model implementation. 4) In the Training Control module, from the perspective of training methodologies (such as general training and adversarial training), we develop various training templates that can be utilized by both CD and KT models. 5) In the Model Evaluation module, we design distinct evaluation templates based on different assessment types, some of which are shared between CD and KT (e.g., PredictionEvalTPL), while others are specific to CD (e.g., IdentifiabilityEvalTPL).
For the challenge of unified management of multifaceted data, we devise a series of protocols for data processing (detailed in Section 4.2) to manage data efficiently. For the challenge of compatibility for existing task scenarios and future task scenarios, the modularized and templatized design can support the challenge in a user-friendly manner. When we face new task scenarios, what we need to consider is to follow relevant protocols to develop new templates to support new models (detailed in Section 5.2).
5 Usage of EduStudio
The code example of running a model is illustrated in Fig.6. A single entry function launches the whole experimental process, covering both running an existing model and running a customized model.
5.1 Running existing models
To run an existing model, we need to specify at least the dataset name (i.e., the dataset parameter) and the template name for each step of the algorithm workflow (i.e., the cls key in the corresponding parameter dictionary). The templates corresponding to each model are detailed in the online Reference Table. In addition, users can specify further parameters in a parameter dictionary to override lower-priority configurations; for instance, a parameter specified in the training parameter dictionary would replace the corresponding default training configuration. A sketch following this pattern is shown below.
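The following sketch mirrors the quickstart pattern of Fig.6. The entry function and parameter names follow the EduStudio quickstart documentation, but treat the exact signature and template names as indicative rather than authoritative:

```python
from edustudio.quickstart import run_edustudio

# Run an existing CD model (NCDM) on an existing dataset. Template names
# follow the naming convention described in this section; check the online
# Reference Table for the templates of each model.
run_edustudio(
    dataset='FrcSub',
    datatpl_cfg_dict={'cls': 'CDInterExtendsQDataTPL'},
    modeltpl_cfg_dict={'cls': 'NCDM'},
    traintpl_cfg_dict={'cls': 'GeneralTrainTPL', 'epoch_num': 2},  # overrides a default
    evaltpl_cfg_dict={'clses': ['PredictionEvalTPL']},
)
```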
5.2 Implementing new templates
We can implement a new template by inheriting an existing template (i.e., a Python class). To run a customized model or replace an existing template, we simply specify the corresponding template class itself as the value of the cls key, instead of a string-type template name. In Fig.6, the training template and model template are customized in this way. As can be seen, EduStudio is highly flexible and can accommodate new elements at each step. Regarding how to implement a new template, we provide detailed instructions in the developer guide of the documentation, which helps developers quickly build custom templates. A minimal sketch is shown below.
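A minimal sketch of such a customization, assuming the training templates are importable from edustudio.traintpl (the import path and the overridden content are assumptions for illustration):

```python
from edustudio.traintpl import GeneralTrainTPL

class MyTrainTPL(GeneralTrainTPL):
    """A customized training template reusing everything from
    GeneralTrainTPL except the pieces we choose to override."""
    pass  # override, e.g., a training-loop hook here

# Then reference the class object directly instead of a template-name string:
# run_edustudio(..., traintpl_cfg_dict={'cls': MyTrainTPL})
```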
6 Eco-services of EduStudio
To further enable more researchers to understand and quickly participate in the field of student cognitive modeling, we offer some eco-services surrounding EduStudio, including a GitHub repository and a Leaderboard website.
6.1 Awesome-student-cognitive-modeling repository
The GitHub repository awesome-student-cognitive-modeling collects valuable resources about student cognitive modeling:
● Dataset collection and description. Here, we collect available public datasets for educational data mining and provide a detailed description for each dataset. We summarize the characteristics of each dataset to facilitate researchers in efficient selection of the dataset that is applicable to their current research.
● Research direction categorization. We summarize existing research directions in student cognitive modeling, including a detailed description, representative papers, and commonly used datasets for each research direction. This enables researchers to swiftly comprehend student cognitive modeling.
● Paper collection and categorization. We collect and keep up-to-date with the latest related literature. The collected papers can be categorized into: 1) research papers; 2) survey papers; 3) dataset papers. For research papers, we also make a detailed categorization. We illustrate data usage, technique usage, and research direction of each paper, which facilitates researchers to rapidly grasp the content of these papers.
6.2 Leaderboard
To ensure the reproducibility and comparison of various student cognitive modeling models, we provide a public leaderboard. As illustrated in Fig.7, there are two major features: Task Selection and Detailed Leaderboard. The former requires users to specify elements such as task type and dataset. The latter provides a comprehensive comparison between models in the form of graphs and tables based on specified elements.
To support all users in uploading their experiment results, we provide a portable processing flow. In EduStudio, each experiment eventually forms a specific log directory (as depicted in Tab.4). After users submit their own experiment log directory to the designated GitHub repository, a Python script processes the new experiments and converts them into the .json files required by the Leaderboard frontend, which then automatically displays the new experimental results according to these files.
7 Comparison with existing libraries
With the growing attention from researchers toward student cognitive modeling, in the past few years, there has been a successive release of open-source algorithm libraries. Like existing libraries, EduStudio is also built using PyTorch. We summarize and compare the characteristics of existing student cognitive modeling libraries in Tab.5.
EduStudio boasts a more extensive collection of models compared to the existing libraries, thereby reducing the burden of extensive model reimplementation. Specifically, even when considering individual tasks such as CD or KT alone, the number of models in EduStudio surpasses those in the existing libraries. Regarding dataset support, EduStudio supports a greater number of datasets. Furthermore, we provide a comprehensive data preparation process, tailoring a data status protocol, a middle data format protocol, and an atomic data operation protocol for the data.
EduStudio supports more features: 1) from the perspective of student cognitive modeling, it integrates CD and KT rather than considering individual tasks alone, which not only facilitates communication among researchers from both communities but also encourages the integration of the two types of student cognitive modeling approaches into one unified model; 2) the modularized and templatized design makes the library highly reusable and flexible; 3) comprehensive eco-services encourage more researchers to understand and participate in this field.
8 Future directions
In this section, we first discuss the research trend of student cognitive modeling. Subsequently, we discuss the future work of EduStudio based on this research trend and the existing limitations of EduStudio.
8.1 Research trend of student cognitive modeling
For the research trend of student cognitive modeling, we summarize some aspects according to current hotspots and opportunities.
● Data perspective. From the data perspective, multimodal and cold-start research are two promising directions. Multimodal student cognitive modeling [88, 89] aims to employ multi-modal data from the student side, exercise side, and KC side. Existing cold-start work covers studies related to cold-start students [90–94], cold-start exercises [95], and cold-start KCs [96].
● Model perspective. From the model perspective, student cognitive modeling with Large Language Models (LLMs) is emerging as a mainstream trend. LLMs have recently attracted global attention in various fields, leading some researchers to incorporate relevant technologies into student cognitive modeling [94, 97–100].
● Evaluation perspective. From the evaluation perspective, beyond accuracy evaluation, researchers have recently proposed various evaluation aspects based on students' cognitive characteristics. Fairness has consistently been a trending topic in trustworthy AI [101–103], and ensuring fairness in education is also essential. Recently, an increasing number of researchers are delving into fair student cognitive modeling [44, 45, 104–107]. IC-IDM [85] proposes the identifiability evaluation, which aims to measure the discrepancy between the cognitive abilities of students with the same response distribution.
8.2 Future work of EduStudio
Here, we discuss the future work for EduStudio based on existing limitations and the research trend of student cognitive modeling.
● Implement models covering more scenarios. EduStudio adopts a modular and template-based design, focusing on balancing commonality and diversity, but it lacks sufficient consideration for diverse scenarios of student cognitive modeling. The model integration for specific scenarios (such as cold-start [90–94] and causality-based [104, 108] scenarios) is not yet comprehensive. As described in the research trend, more new scenarios will emerge. Therefore, we will keep track of developments in the field of student cognitive modeling and promptly implement relevant models.
● Integrate models of downstream educational applications. Student cognitive modeling has a series of downstream applications, among which the two most representative types are educational recommendation systems and Computerized Adaptive Testing (CAT). Educational recommendation systems aim to recommend relevant learning resources to students, such as learning path recommendation [8–10, 109], course recommendation [110, 111], and exercise recommendation [11, 112, 113]. CAT aims to provide tests that adapt dynamically to each student by tailoring test exercises based on the student's performance [114]. The CD model is an essential component of CAT, as CAT requires CD to continuously assess students' cognitive states [115, 116]. In the future, we may consider integrating models of downstream applications based on student cognitive modeling.
● Refine and update the eco-services promptly. The current eco-services still require refinement, such as enhancing the comprehensiveness and richness of the awesome-student-cognitive-modeling repository. As future trends continue to evolve, we will incorporate the latest content into the eco-services to keep them up-to-date and continually improve the usability of EduStudio.
9 Conclusions
In this paper, we released a unified library, EduStudio, for student cognitive modeling. Compared to existing libraries, we unified cognitive diagnosis and knowledge tracing, which not only enables reusability within each individual category but also facilitates sufficient reusability between them. In addition, EduStudio adopts a modularized and templatized design when implementing models, which substantially improves reusability and flexibility. To further enable more researchers to understand and quickly participate in the field of student cognitive modeling, we also offer a range of user-friendly eco-services surrounding EduStudio.
© The Author(s) 2024. This article is published with open access at link.springer.com and journal.hep.com.cn.