1 Introduction
The rapid advancement of AI technology has significantly impacted all sectors of society, including higher education (Yau et al., 2023), offering both advantages and challenges (Alasadi & Baiz, 2023). The current generation of students can readily access extensive material through AI-powered search engines. Consequently, conventional educational approaches, particularly classroom lectures, are being increasingly scrutinized and face growing competition (Alzayani et al., 2022; Guo et al., 2022; Kuo et al., 2023).
AI technologies are now prevalent in higher education, assisting students in their academic and personal endeavours (Alsobhi et al., 2022). However, despite their widespread use, significant issues persist regarding students’ engagement with and perception of the technology. Many students lack a comprehensive understanding of AI’s capabilities, often perceiving it as a study instrument rather than leveraging its unique cognitive and analytical potential (Wang et al., 2024). Moreover, limitations inherent to AI, particularly in accuracy and reliability, complicate its effective utilization (Parker et al., 2024). These issues are more noticeable where detailed and correct information is essential.
To mitigate these challenges, educators should devise creative pedagogical approaches that incorporate AI rather than dismissing it. Teachers should actively use AI technology to enhance student learning (Dangi et al., 2023). This initiative primarily aims to determine the application of AI tools, such as AI teaching assistants in educational settings (Kim et al., 2020), which aim to address comprehension gaps and enhance individual engagement and productivity.
This study investigates the application of AI teaching assistants in biochemistry sessions at a prominent medical university in China to enhance pedagogical approaches. It explores the impact of using early biochemistry literature published by Science Press, in conjunction with AI functionalities, on students’ learning outcomes. The AI assistant used in this study, Blueink, facilitates an exploration of how AI can enhance critical thinking skills and deepen students’ comprehension.
Pre-test and post-test surveys were administered to assess changes in students’ knowledge, application, and critical thinking skills regarding AI (Domenghini et al., 2014). The results offer practical guidelines for educators and specialists seeking to integrate AI into higher education. Moreover, this study can provide valuable insights for developers of AI-based educational tools, illustrating the impact of this technology on classroom environments.
This study proposes an integrative framework to support the adoption of AI teaching assistants in higher education, focusing on reconciling technological innovation with pedagogical needs. Based on the implementation of the Blueink system in biochemistry courses, the framework offers three preliminary guidelines for educators: first, to leverage AI tools as supplements to enhance their students’ critical thinking and problem-solving skills; second, to balance AI-enabled personalization with structured curriculum delivery; third, to address emerging challenges such as algorithmic transparency and equitable access. While acknowledging this study’s limitations in generalizability beyond biochemistry education, the findings suggest that well-structured AI integration can strengthen students’ analytical capabilities without compromising human-centric educational values. The framework serves as a foundation for interdisciplinary dialogue, inviting further refinement across diverse academic contexts. By fostering collaboration between AI and educators, this study contributes to ongoing efforts to align technological advancements with the enduring goals of higher education.
2 Literature Review
AI is not only influencing education but also revolutionizing how we teach and learn. From personalized learning tools to smart grading systems, AI is reshaping classrooms worldwide (Cumming, 1998). Over the past five years, numerous advances have been made in AI applications in education, leading to its increased adoption (Renz et al., 2020). Research has explored AI’s potential to enhance instruction, improve teaching strategies, and create smart campus ecosystems (Liu et al., 2020). As part of this expansion, frameworks for the application and assessment of AI in higher education are being developed (Jantakun et al., 2021).
AI is starting to play a significant role in school administration. Researchers are investigating its capacity to enhance digitalization and transparency, facilitate outcome prediction, and improve management and supervision decisions (Yu, 2021). However, challenges persist, such as the division of responsibilities between AI and humans and the handling of potential issues during operation (Yu & Lu, 2021). Moreover, the moral concerns surrounding the application of AI in education are gaining prominence. Human roles should be carefully considered when developing and utilizing AI-based educational systems (Holmes et al., 2022). These systems need human involvement at every stage, including data collection, pattern recognition, and adaptability (Ninaus & Sailer, 2022).
AI can also be used for individualized education. Researchers investigated how AI-driven systems, such as social media, chatbots, expert systems, intelligent mentors, machine learning, and virtual learning environments, developed personalized learning pathways for individual students (Tapalova & Zhiyenbayeva, 2022). Moreover, AI’s role in remote learning and the potential of blockchain technology to enhance its efficacy were investigated (Rakhimberdiev et al., 2022). The proliferation of generative AI (GenAI), particularly technologies such as ChatGPT, is attracting significant interest. Researchers examined its potential applications in education and the technical assistance required for its successful integration (Konstantinova et al., 2023; Yu & Guo, 2023).
In the context of higher education, Tian et al. (2024) integrated the unified theory of acceptance and use of technology and the expectation-confirmation model to examine the adoption of AI chatbots among Chinese graduate students. They found that performance expectancy, confirmation, satisfaction, and personal innovativeness significantly influenced students’ behavioural intentions, while effort expectancy, social influence, and facilitating conditions had insignificant impacts. Jaboob et al. (2025) studied the effects of GenAI on students’ cognitive achievement in the Arabic higher education system, revealing that it had a positive and significant impact on student behaviour and cognitive achievement. Similarly, Kumar et al. (2024) used a qualitative mixed-methods approach to explore the impact of ChatGPT on higher education, especially in business education. They highlighted ChatGPT’s potential to foster pedagogical innovation while raising concerns such as plagiarism. Moreover, Michel-Villarreal et al. (2023) adopted a thing ethnography approach to analyze ChatGPT’s role in higher education, identifying both challenges, such as academic integrity and quality control, and opportunities, including round-the-clock assistance and personalized learning. Chiu (2024) investigated the impact of GenAI in primary and secondary education, taking ChatGPT and Midjourney as examples from the perspectives of teachers and administrators. The study examined its influence across four domains (learning, teaching, assessment, and administration), finding that GenAI altered teachers’ views on new learning outcomes, prerequisite knowledge, and the importance of generic skills. It also had implications for practice, policy, and future research directions.
Despite AI’s numerous potential benefits, it also entails specific concerns that necessitate careful governance of digital technology utilization (Filgueiras, 2024). Researchers examined the advantages and challenges of implementing AI in primary and secondary education, along with the potential hazards it might pose to education overall (Guan, 2023; Rochim, 2024). They investigated the possible influence of AI on the future of educational institutions, especially higher education (Begum, 2024; Juma, 2021). Research was also underway on using AI systems to enhance efficiency, including the application of support vector machines in conjunction with differential evolution algorithms (Long & Gao, 2022). Other work examined wisdom education and its correlation with AI-driven teaching systems (Liu, 2018). Explainable AI (XAI) was emerging as a critical research area, complementing fundamental studies on AI in education (Khosravi et al., 2022; Yu & Guo, 2023; Yu & Lu, 2021). Therefore, integrating AI into education is a multifaceted issue that requires both creativity and scrutiny as it evolves.
The existing research on this topic presents three obvious shortcomings. First, most studies are cross-sectional, and longitudinal studies that consistently assess the long-term impact of AI on education are lacking. For example, few studies assess how students’ attitudes, skills, and academic performance change over a semester of AI use. Second, while ethical issues are recognized, there is little in-depth research on AI’s broader social consequences, such as over-reliance on AI and the development of AI usage habits. Third, further discussion is needed on how to better integrate AI into different educational environments and curriculum contents, considering factors such as subject matter complexity and discipline-specific requirements.
3 Blueink, an Intelligent Virtual Learning Assistant
3.1 Brief Introduction of Blueink
AI-based virtual assistants, also referred to as dialogue robots or intelligent personal assistants, facilitate natural user interactions and foster innovation across applications. These assistants are at the forefront of dialogue-driven intelligent application services, with conversational AI forming the core of the Conversations as a Platform (CaaP) model. This model is incrementally establishing a significant informational ecosystem in the AI era. Intelligent virtual assistants combine natural human–computer interaction, intelligent service models, and comprehensive technical frameworks, establishing an effective learning support system. Canbek and Mutlu (2016) asserted that virtual assistant robots provided distinct benefits in course tutoring, knowledge retrieval, practice and simulation, and personalized learning, thereby facilitating the attainment of personalized education objectives. Dagnon (2017) further posited that intelligent assistants could enhance students’ interest and participation in the learning process. The use of general-purpose voice assistants, such as Cortana and Siri (Fleming, 2014; McNeal, 2016), as well as intelligent learning assistants, such as Jill Watson (Maderer, 2016), MOOCBuddy (Holotescu, 2016), and QuizzleBot (Klopfenstein & Bogliolo, 2017), substantially improved the viability of intelligent learning for students within the educational framework.
This paper presents Blueink, an intelligent virtual learning assistant integrated into the three-dimensional AI-enhanced textbook Medical biochemistry from Science Press (Hu & Liang, 2019). Built on Zhipu AI, a large language model (LLM) based on the GLM-4-Air foundational model developed by Tsinghua University, Blueink is designed to respond to personalized student inquiries online 24 hours a day. It offers three distinct question-answer modes. First, the dialogical question-answering mode searches for answers in the textbook and identifies specific chapter topics through multimodal search. Second, the exploratory question-answering mode expands beyond the textbook by leveraging an LLM. Third, the role-playing question-answering mode provides a personalized and engaging learning environment where the AI tutor poses questions and students are expected to participate actively. Students are encouraged to engage with Blueink throughout their educational journey to enhance their AI proficiency and critical thinking skills.
3.2 Core Functions of Blueink in the Course
Blueink, an AI assistant, is tailored to the medical biochemistry curriculum. It is built on Zhipu AI and supports students by offering intelligent positioning and recommendation of the textbook, diagnosis of differential chemistry, and generative iterative sharing. Blueink constructs a complete closed-loop AI education ecosystem through its three core functions: accurate recommendations, learning diagnosis, and personalized tutoring.
First, Blueink recommends accurate content based on students’ questions and locates the associated traditional textual knowledge explanations, structured chapter knowledge guide maps, knowledge animation demonstrations, instructor teaching videos, and English pronunciation of professional medical terminology in various forms.
Second, based on students’ interactions with AI, Blueink provides learning diagnosis reports on students’ behavioural patterns in engaging with AI. These records include learning progress, learning time allocation, note-taking activities, discussions, exercises, and accuracy rate. Moreover, statistical analyses are conducted on AI Blueink queries, including the total number of questions, the percentage of dialogue-based and inquiry-based interactions, and the distribution of questions in different chapters. These data contribute to the development of a diagnostic report on students’ learning progress and provide a comprehensive profile of each learner. The analysis captures individual and collective learning difficulties and effectiveness accurately, providing a scientific basis for large-scale personalized learning support strategies and information references for future students’ learning and teachers’ instruction process.
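The query statistics described above (total number of questions, share of each interaction mode, and distribution across chapters) can be illustrated with a minimal sketch. The log format, field order, mode labels, and the function name `summarize_queries` are hypothetical and are not taken from Blueink’s actual implementation.

```python
from collections import Counter

# Hypothetical query log: each record is (student_id, chapter, mode).
# Field names and mode labels are illustrative assumptions, not Blueink's schema.
QUERY_LOG = [
    ("s001", "Vitamins", "dialogue"),
    ("s001", "Glucose metabolism", "inquiry"),
    ("s002", "Glucose metabolism", "dialogue"),
    ("s002", "Glucose metabolism", "dialogue"),
    ("s003", "Metabolic integration", "inquiry"),
]


def summarize_queries(log):
    """Return the total question count, percentage per mode, and per-chapter counts."""
    total = len(log)
    mode_counts = Counter(mode for _, _, mode in log)
    chapter_counts = Counter(chapter for _, chapter, _ in log)
    mode_share = {}
    if total:
        # Percentage of all questions asked in each interaction mode.
        mode_share = {m: round(100.0 * c / total, 1) for m, c in mode_counts.items()}
    return {"total": total, "mode_share": mode_share, "by_chapter": dict(chapter_counts)}


summary = summarize_queries(QUERY_LOG)
```

Aggregating the same counts per student rather than over the whole log would give the individual learner profile, while the pooled version corresponds to the collective view.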
Third, Blueink accompanies students in personalized thinking throughout the learning process. It supports students through dialogue-based question-and-answer sessions, inquiry-based discussions, and guided lecturing. This is the primary function of AI Blueink and a comprehensive application of the first two functions. At the beginning of the learning process, AI Blueink serves as a learning tutor and provides students with relevant learning content, tiered exercises, and feedback. During the learning process, if students struggle to retain knowledge during exercises or independent reviews, they can utilize AI Blueink’s dialogue-based search function to seek clarification. AI Blueink delivers targeted instructional content, including knowledge lectures, 3D demonstrations of cellular pathology, teaching videos, and extended exercises. These resources are personalized based on students’ questions and learning status to enhance comprehension ability effectively through interactive question-and-answer sessions and learning assistance. When students advance and deepen their understanding, AI Blueink leverages its LLM functions to foster students’ critical thinking competency. Through associative prompts, it encourages students to engage in transfer thinking, variant thinking, analogical reasoning, and innovative problem-solving. After completing a learning module, AI Blueink can intelligently manage students’ learning progress, including their initial learning behaviour reports, customized learning plans, summary of test points, and random questions.
3.3 Teachers’ Strategies for Applying Blueink in Teaching and Learning
A guided self-study course, such as the vitamins course, serves as an illustration. It follows a two-stage learning process facilitated through group cooperation. In the fundamental stage, students divide responsibilities, engage with the micro-teaching videos, cloud teaching materials, and other resources to capture key knowledge points, and fill out the online form. In the advanced stage, students use the AI Blueink quiz to complete the highlighted reflections section, which requires searching for, discussing, and refining information on the research frontiers as well as the practical and prospective applications of specific vitamins.
For instance, in a glucose metabolism course, theoretical lectures are conducted on specific topics. The pre-course stage involves students independently reviewing a 10-minute micro-teaching video. While the teacher emphasizes the fundamentals, a challenging extended multiple-choice question is intentionally omitted. Instead, students are encouraged to use AI Blueink to stimulate their thoughts and make decisions. The results are tallied by the teacher before the class. At the beginning of the class, the teacher displays the statistical results of the multiple-choice questions, followed by an opportunity for students who select alternative options to elaborate their reasoning processes. Then, the teacher displays the dialogue page encouraging students to comprehend the significance of critical and precise questioning.
To enhance knowledge acquisition in a particular subject, such as a metabolic integration and regulation course, students are instructed to review the 10-minute micro-teaching video independently during the pre-course stage. While the primary focus is on the fundamental knowledge, a comprehensive subjective question is included. Students are tasked with selecting a hub metabolite, using AI Blueink to identify gaps, and tracing all metabolic pathways through the hub. Students are instructed to project their responses onto the teacher’s main screen and take turns presenting their findings to the audience. Meanwhile, the teacher records key concepts on the board and summarizes the main points.
As an example of a basic clinical dual-teacher course, metabolic integration thinking in a clinical nutritional support course incorporates a collaborative teaching approach. During the pre-course thinking phase, the biochemistry and general surgery teachers jointly developed two case studies on enteral and parenteral nutrition. These case studies were distributed to students before the class, along with 12 biochemical and clinical questions. Working in groups, the students selected four of the provided questions and employed AI Blueink to assist them in reorganizing and thinking creatively to formulate their initial opinions. During the classroom discussion phase, the general surgery teacher summarized the clinical experience associated with the problem, while the biochemistry teacher focused on the application of biochemical mechanisms. Three to four students were invited to present each issue to the entire class, fostering discussion and intellectual exchange among peers. In the post-course phase, students iteratively refined their pre-course thinking assignments by incorporating insights gained from group discussions and teacher–student discussions. Using various coloured annotations, students documented the evolution of their understanding, demonstrating the generative learning process.
4 Research Methods
This study explores the integration of AI search engines into college biochemistry classrooms and assesses their effect on students’ critical thinking abilities.
4.1 Questionnaire Design
A total of 460 undergraduates enrolled in the biochemistry course participated in a study that assessed their AI utilization through two questionnaires based on the UNESCO framework for AI competencies. This framework emphasizes four core skills: human-centred thinking, AI ethics, AI technology and application, and AI system design. The questionnaires were structured around two primary aspects, AI cognition and AI usage proficiency. They aimed to gauge students’ perspectives on AI’s role in education, the balance between human involvement and AI in the current age, the trustworthiness of AI-generated responses, and their ability to discern vital information from AI sources. The participants rated their agreement with each statement on a 10-point Likert scale, with 1 representing strong disagreement and 10 indicating strong agreement. The questionnaire items align with key AI competencies, cognition, and usage, as shown in Tab.1.
4.2 Sample Collection
The study sample comprised sophomores majoring in clinical medicine. These students had completed two semesters of academic study and were not yet occupied by concerns such as job searching. While they demonstrated a strong commitment to academic learning and knowledge acquisition, they often lacked the competency to develop independent critical thinking.
Regarding their knowledge framework, these students had taken courses such as normal human morphology and cell biology. Moving forward, they will engage in more advanced studies, including the basics of infection and immunity and the medicine of organs and systems. The elective course for this study, medical biochemistry, comprehensively examines molecular interaction networks in human health and illness from a microscopic viewpoint, effectively bridging the two types of courses. It provides an in-depth exploration of the concepts underlying disease manifestation and the fundamental challenges related to critical thinking and learning, requiring students to possess a foundational understanding of the physiological aspects of the human body.
In 2024, Science Press, a well-known Chinese publisher, released the nation’s inaugural collection of stereoscopic 3D AI biochemistry textbooks. These textbooks, developed through LLM training, incorporate an integrated virtual dialogue robot, Blueink (Xiaolan AI), which forms the foundation of this study.
4.3 Research Procedure
This study examined changes in students’ AI usage and critical thinking development over time. The experiment commenced with participants completing a pre-evaluation questionnaire to assess their comprehension of AI usage and their proficiency in critical thinking. In class, educators instructed students on the effective use of AI as a search engine. A primary teaching objective was to assist students in critically analyzing information from AI and enhancing their critical thinking skills. At the end of the class, the same questionnaire was administered again to evaluate any changes in students’ AI proficiency and critical thinking skills. Subsequently, the pre-test and post-test data were compared to identify these changes.
4.4 Data Analysis
Data analysis was conducted using IBM SPSS software. First, when the same student submitted multiple questionnaires, we retained only the most recent submission so that each student appeared once. We then matched student numbers across the pre-test and post-test and kept only the students who participated in both, ensuring that the data used were complete and valid. We used descriptive statistics to summarize the participants’ demographic information and their response distributions. Second, the study conducted paired sample t-tests and the McNemar-Bowker test to compare students’ AI perceptions and usage before and after the intervention. Then, the research group performed multiple linear regression to control for the effects of gender and students’ prior AI experience, thereby enabling a more accurate assessment of the intervention’s impact. Finally, for the post-intervention data, the study examined individuals’ behaviours, such as students’ activities, students’ feelings about using AI, AI responses, AI-preferred issue-solving situations, and any AI modifications.
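The screening step described above can be sketched as follows. This is an illustrative reconstruction only: the study used SPSS, and the record layout (`student_id`, `submitted_at`, `score`) and function names are assumptions introduced for the example.

```python
from datetime import datetime


def latest_per_student(rows):
    """Keep only the most recent submission for each student ID."""
    latest = {}
    for row in rows:
        sid = row["student_id"]
        if sid not in latest or row["submitted_at"] > latest[sid]["submitted_at"]:
            latest[sid] = row
    return latest


def pair_pre_post(pre_rows, post_rows):
    """Return {student_id: (pre_row, post_row)} for students present in both waves."""
    pre = latest_per_student(pre_rows)
    post = latest_per_student(post_rows)
    return {sid: (pre[sid], post[sid]) for sid in sorted(pre.keys() & post.keys())}


# Hypothetical records for illustration only.
pre_rows = [
    {"student_id": "s001", "submitted_at": datetime(2024, 9, 2, 9, 0), "score": 7},
    {"student_id": "s001", "submitted_at": datetime(2024, 9, 2, 9, 30), "score": 8},
    {"student_id": "s002", "submitted_at": datetime(2024, 9, 2, 9, 5), "score": 6},
    {"student_id": "s003", "submitted_at": datetime(2024, 9, 2, 9, 7), "score": 9},
]
post_rows = [
    {"student_id": "s001", "submitted_at": datetime(2024, 12, 20, 10, 0), "score": 9},
    {"student_id": "s002", "submitted_at": datetime(2024, 12, 20, 10, 2), "score": 8},
]

paired = pair_pre_post(pre_rows, post_rows)
```

In this toy data, the duplicate submission from `s001` is collapsed to the later one, and `s003` is dropped because no post-test record exists.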
5 Experimental Results
5.1 Descriptive Statistics
A total of 418 valid paired responses from the pre-test and post-test were retained after selecting participants who completed both assessments. The sample comprised 68.4% males (n = 286) and 31.6% females (n = 132). The pre-test results revealed a substantial disparity in students’ understanding of AI tools: most students used these tools infrequently, while a minority employed them consistently for learning purposes, as shown in Tab.2.
The descriptive statistics further indicated that, while most students possessed a basic comprehension of AI tools and encountered them in their learning process, their proficiency in leveraging AI for learning remained underdeveloped, as shown in Tab.2.
5.2 Paired Sample t-Test Results
5.2.1 Utilizing AI in Learning
The first aspect examined students’ perceptions of AI as an essential tool for modern education. The results showed that the pre-test average score was 9.480 ± 1.700 and the post-test average increased to 9.830 ± 1.510. The difference was statistically significant, with a t-value of –4.180 (p < 0.001) and Cohen’s d of –0.204. The intervention reinforced students’ perception that AI was critical in education, possibly because they used AI tools in their biochemistry courses. Although the effect size is small, we anticipate that with increased interaction and more extensive AI integration, students’ awareness of AI’s educational significance will continue to grow.
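For readers who wish to reproduce this type of analysis, the sketch below computes a paired-sample t statistic and a standardized effect size from raw scores. It assumes that differences are taken as pre minus post (so an increase yields negative values, matching the signs reported here) and that Cohen’s d is the mean difference divided by the standard deviation of the differences; the study itself used SPSS, and the sample scores are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_and_dz(pre, post):
    """Paired-sample t statistic, Cohen's d_z, and degrees of freedom on (pre - post)."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    sd = stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    d_z = mean(diffs) / sd      # standardized mean difference
    t = d_z * math.sqrt(n)      # same as mean / (sd / sqrt(n))
    return t, d_z, n - 1


# Hypothetical scores on the 10-point scale, for illustration only.
pre_scores = [7, 8, 6, 9, 7, 8]
post_scores = [8, 9, 8, 9, 8, 10]
t_stat, d_z, df = paired_t_and_dz(pre_scores, post_scores)

# Under this definition d_z = t / sqrt(n); with n = 418 the reported
# t = -4.180 corresponds to d of about -0.204.
implied_d = round(-4.180 / math.sqrt(418), 3)
```

Under the same assumption, the identity d = t / √n also reproduces the other effect sizes reported in this section from their t-values and n = 418.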
5.2.2 Assessing the Importance of Human and AI
The students were asked to evaluate whether human cognition or AI plays a more crucial role in the AI era. The mean score increased from the pre-test score of 8.920 ± 1.940 to the post-test score of 9.270 ± 1.820, with a t-value of –3.270 (p < 0.001) and a Cohen’s d of –0.160. This suggested that the intervention effectively emphasized the complementary role of human intelligence in AI-driven education, aligning with the research purpose of fostering critical thinking skills and reducing over-reliance on AI. However, fostering a human-centred approach to AI is a complex process. Future research should explore more effective and intensive interventions to further enhance its impact.
5.2.3 Trusting in AI-Generated Answers
The average rating of the trustworthiness of AI-generated answers increased slightly, from the pre-test score of 7.150 ± 1.410 to the post-test score of 7.270 ± 1.650. The difference was not statistically significant (t = –1.430, p > 0.05). While the intervention improved the students’ ability to evaluate AI outputs, their trust in AI systems remained largely unchanged.
5.2.4 Identifying Key Information
As shown in Tab.3, the intervention significantly improved students’ ability to extract key information from AI-generated content. The mean score increased from 7.790 ± 1.590 in the pre-test to 8.670 ± 1.560 in the post-test, with a t-value of –10.040 (p < 0.001) and a Cohen’s d of –0.491. This clear improvement showed that the teaching strategies were effective in helping students analyze complex AI outputs. Although the effect size remained modest, the cumulative effects of such interventions could be of significant value in education, which is a practice with a long-term perspective.
Meanwhile, gender and AI usage experience were selected as control variables, with the change in responses to each of the four key questions set as the dependent variable in a multiple regression analysis. The coefficient of determination (R²) of all models is below 1.5%, with adjusted R² near zero or negative, indicating that gender and AI usage experience had minimal effects on the dependent variables, and neither factor reached statistical significance overall (p > 0.05). These findings supported the conclusion that the observed changes in students’ perceptions and critical thinking abilities could be attributed primarily to the intervention itself, rather than pre-existing demographic factors.
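The relationship between R² and adjusted R² explains why the latter can fall to zero or below for such weak models. The sketch below uses the standard adjustment formula; the values n = 418 and p = 2 (one predictor each for gender and prior AI experience) are assumptions for illustration, since the exact coding of the predictors is not stated here.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a linear model with n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Assumed sample size and predictor count (see the lead-in).
N_OBS, N_PRED = 418, 2

upper = adjusted_r2(0.015, N_OBS, N_PRED)   # R^2 at the reported ceiling of 1.5%
weak = adjusted_r2(0.004, N_OBS, N_PRED)    # a hypothetical weaker model
threshold = N_PRED / (N_OBS - 1)            # adjusted R^2 < 0 whenever R^2 < p / (n - 1)
```

Under these assumptions, any model with R² below roughly 0.5% has a negative adjusted R², which is consistent with the pattern of near-zero or negative values described above.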
In Tab.4, b is the unstandardized regression coefficient while β is the standardized regression coefficient. Notably, a t-value of –2.115 (p < 0.05) for prior AI usage was found for the “assessing the importance of human and AI” item, indicating that individuals with more experience in AI use might place more emphasis on human importance than on AI capabilities, which is consistent with findings. Meanwhile, a t-value of –2.115 (p < 0.05) for gender was observed for the “trusting in AI-generated answers” item, indicating that gender had a statistically significant effect on this item, as shown in Tab.4. Specifically, being female might be associated with lower trust in AI-generated answers. Future studies should include more female participants to further explore this phenomenon.
5.3 Current Status Analysis
The current status analysis delves into students’ post-intervention behaviours and perceptions concerning AI usage, focusing on response accuracy, preferred problem-solving contexts, and adjustments in usage patterns.
A post-intervention assessment of perceived AI response accuracy revealed substantial improvement. As shown in Tab.5, the proportion of students who said that AI answers were “mostly accurate” increased significantly from 35.3% to 60.4%. Conversely, the proportion of students who said that AI was right “half the time” dropped from 55.1% to 33.4%. These results suggest that the intervention helped students improve at asking questions and using AI tools more effectively. Learning about keyword optimization and critical analysis likely contributed to this progress, enabling individuals to ask more precise questions and critically evaluate AI responses. These findings also underscore the importance of AI literacy education, particularly in academic environments where AI integration is widespread and inevitable.
The McNemar-Bowker test was conducted to assess the symmetry of the paired categorical data. Here, df denotes the degrees of freedom, that is, the number of independent values that are free to vary in computing the test statistic. The results revealed a significant difference (χ² = 81.457, df = 12, p < 0.001) between the pre-test and post-test response distributions, indicating a non-symmetrical distribution of responses. The results from the Monte Carlo simulation and the Fisher–Freeman–Halton exact test were consistent (p < 0.001). The total sample size for the analysis was 419. To ensure the proper functioning of the McNemar-Bowker test, the “never used” category was added. Given the large sample size, this adjustment is unlikely to introduce any significant error in the experimental results.
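As an illustration of how the symmetry statistic is formed, the sketch below computes the Bowker chi-square for a square pre-test by post-test table. The table is hypothetical, and skipping category pairs whose two off-diagonal cells are both zero is a common convention assumed here rather than a documented detail of the SPSS analysis. With k categories, at most k(k − 1)/2 pairs contribute.

```python
def bowker_statistic(table):
    """Bowker symmetry chi-square and degrees of freedom for a square k x k table."""
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    chi2, df = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            total = table[i][j] + table[j][i]
            if total == 0:
                continue  # empty pair carries no information; skipped by assumption
            chi2 += (table[i][j] - table[j][i]) ** 2 / total
            df += 1
    return chi2, df


# Hypothetical 3 x 3 table: rows are pre-test categories, columns are post-test.
example = [
    [10, 5, 1],
    [2, 20, 8],
    [0, 3, 15],
]
chi2_value, dof = bowker_statistic(example)
```

The p-value is then obtained from the upper tail of a chi-square distribution with the returned degrees of freedom, which requires a statistics package and is omitted here.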
The participants were also asked to identify the problem types where AI was the most helpful. The analysis showed modest shifts. As shown in Fig.1, a greater proportion of students said that AI was helpful for completely open-ended or complex problems without standard answers, increasing from 9.6% to 14.4%. In contrast, the percentage of students who viewed AI as beneficial for knowledge structure or framework-related questions declined from 20.3% to 15.3%. Meanwhile, the perceived utility of AI for routine, fixed-answer problems remained stable at 34.7%. These findings indicated that, while students increasingly recognized AI’s utility for complex, non-standardized tasks, its perceived effectiveness for knowledge-structuring tasks diminished. This shift might reflect a growing awareness of AI’s limitations and strengths. Given this trend, developers of AI-enhanced learning tools should focus on augmenting capabilities for addressing complex and higher-order cognitive tasks, which were increasingly valued by students.
Students also used other AI platforms, showing varied preferences. Overall, 52.9% of students said they would use Blueink with one other major AI platform, while 32.5% would use Blueink with two or three other platforms. Only 7.4% of students relied entirely on Blueink. This showed that the AI environment used by students was diverse, with many opting for multiple tools to address different learning needs. The widespread use of multiple platforms highlighted students’ experimentation with different AI features to complement classroom tools. Educators and platform developers should consider the compatibility of these tools and their respective advantages to enhance the learning experience.
When faced with an inadequate AI response, 51.0% of students adopted a strategic approach by rephrasing the question, demonstrating their willingness to refine their inquiry methods. Approximately 33.0% of students opted to conduct further research on the initial answer, reflecting a commitment to deeper understanding. Notably, 12.9% of students switched to alternative human‒machine interaction platforms, while a mere 0.5% discontinued their search altogether. The prevalence of rephrasing and further exploration suggested that students were encountering challenges in using AI tools effectively. To optimize AI’s role in education, instructional strategies should emphasize iterative questioning, enabling students to better harness these technologies.
The teaching intervention proved effective in enhancing students’ ability to critically assess the accuracy of AI-generated responses, as evidenced by an average post-intervention score of 8.22 out of 10. Nearly 70% of students achieved a high level of accuracy, as shown in Fig.2. This improvement reflected the development of their critical thinking and judgment skills, aligning with one of the primary objectives of the intervention. By discerning the quality of AI outputs, the students demonstrated a crucial aptitude for the responsible and judicious use of AI-assisted learning. Future endeavours should reinforce these assessment skills and address any remaining shortcomings in judgment.
Targeted instructional interventions effectively enhanced students’ engagement with AI tools in both practice and assessment. The improvements manifested in increased response accuracy, enhanced adaptability, and better critical evaluation skills. However, the analysis also revealed areas requiring further development, particularly AI’s capacity to handle specific question types and its seamless integration across multiple platforms. These outcomes reflected the evolving nature of AI-assisted learning and offered practical guidance for educators, developers, and policy-makers.
6 Results
6.1 Role of AI in Enhancing Learning Necessity and Awareness
Integrating AI tools in biochemistry education heightened students’ recognition of AI’s importance in contemporary learning. Both before and after the intervention, the students acknowledged the importance of AI and highlighted the usefulness of study methods that incorporated AI as a crucial learning resource. Regular use of AI tools provided students with practical experience and bridged theoretical knowledge with real-world applications. This experiential learning improved the students’ engagement considerably, which aligned with the current trend of using AI to improve education.
Although students accurately identified AI’s value, the intervention also revealed shortcomings in their awareness of AI capabilities. While it was encouraging to see students becoming more conscious of using AI, most of their interactions remained focused on search-engine-style use rather than deeper analysis. This suggested that further interventions are needed to illustrate AI’s innovative capabilities and to help students develop more sophisticated cognition.
6.2 Balancing Human Cognition and AI Capability
The research sought to understand students’ views on how human cognition and AI interact. Post-intervention data revealed that the participants rated human reasoning abilities higher than AI capabilities. This shift demonstrated the AI’s ability to promote a solid understanding of knowledge among students and to encourage the use of AI as a supplementary tool for enhancing subject-matter proficiency. The system emphasized AI’s limitations, highlighting its reliance on accurate information and situational awareness, which enabled students to judge AI outputs carefully and to trust their own views.
Notably, this study highlights that using AI in education, particularly in biochemistry, demands thorough comprehension and critical thinking. According to the research, AI does not negate the value of human cognition. Instead, it emphasizes the need for a symbiotic relationship between human cognitive abilities and AI capabilities to solve complex problems. Therefore, educational approaches that strike a balance between human and AI strengths are essential.
6.3 Critical Evaluation and Credibility of AI Outputs
The results of the research revealed significant changes in students’ ability to critically evaluate AI-generated content. Post-intervention tests showed a substantial improvement in students’ capacity to extract key information from AI-generated results. This suggested that the educational approach improved students’ capacity to use AI tools more efficiently. However, the increase in students’ trust in AI interactions was not statistically significant.
Despite the progress in critical analysis, students’ continued trust in AI posed two key challenges: maintaining a reasonable scepticism toward AI and teaching students to critically evaluate its outputs. Future efforts should address these challenges by enhancing the transparency of AI systems and developing strategies to verify AI-generated information, so that students can use these tools safely.
6.4 Practical Applications and Behavioural Adjustments
The intervention led to significant changes in students’ use of AI tools. The proportion of students who perceived AI responses as “mostly accurate” increased from 35.3% to 60.4%, as shown in Fig.2, which suggested that they had become more adept at using and analyzing AI. Moreover, students were more cognizant of the value of AI in solving open-ended or complex problems, as demonstrated by the rise in the percentage of students who found AI important in this context from 9.6% to 14.4%. In contrast, AI’s perceived effectiveness in routine work and information retrieval showed a slight decline, perhaps because students’ focus had shifted toward AI’s higher-level applications.
A significant proportion of students (51.0%) rephrased their initial questions in response to unsatisfactory AI responses, while 33.0% chose to conduct further research based on the AI responses. This flexibility demonstrated how AI use could be improved by fostering a habit of iterative questioning among students. Moreover, students showed interest in exploring various AI platforms alongside specific tools such as Blueink, as they were enthusiastic about using a range of features to enhance their learning.
Students’ growing use of modern tools, which improved learning methods and outcomes, attested to the rise of AI in education. To meet different learning needs and maintain student engagement, educators and software developers should collaborate to create user-friendly platforms that seamlessly integrate AI into continuous learning.
7 Discussion and Conclusions
This study investigates the integration of AI teaching assistants in biochemistry education, evaluating their impact on students’ cognitive competencies, critical thinking skills, and efficient use of AI technologies. The results point to a meaningful shift in how students engage with AI in higher education. In essence, the course makes students more aware of the value of AI in modern education. Integrating AI tools into college activities enables students to move from a conceptual understanding of AI to a practical understanding of its applications, thereby enhancing their appreciation of AI’s importance in education. This practice highlights the advantages of incorporating AI into curricula, especially in fields such as biochemistry. Students’ understanding of human cognition and of the role of AI also improved. The intervention emphasizes human cognition, which enhances the effectiveness of AI use and develops critical thinking abilities in AI-supported educational settings. The results demonstrate the potential for resolving complex professional problems by effectively combining AI’s functions with human intelligence. Moreover, students show significant improvement in obtaining the necessary information from AI-generated content, which suggests that AI technologies can improve students’ capacity for critical analysis and data comprehension. However, no statistically significant increase in trust toward AI-generated responses is detected, underscoring the need for greater transparency and dependability in AI systems to maintain user confidence. Interaction with AI also prompts notable behavioural changes among learners: more refined queries and iterative problem-solving methods indicate more proficient use of AI tools. The increased use of different AI platforms and devices further supports the effectiveness of a diverse AI environment in addressing specific learning needs.
The findings highlight both the benefits and the potential for improvement in AI-assisted education. One key priority moving forward is to address the scepticism surrounding AI by improving the transparency and explainability of AI tools. These improvements should allow users to understand the decision-making process behind the outputs of AI tools. Collaboration across different areas, such as cognitive science, education, and computer science, is necessary to create more effective and engaging AI-supported educational tools. Such collaboration can result in personalized learning platforms tailored to the unique needs of each student.
Moreover, the significance of introducing AI assistants into pedagogical innovation research in biochemistry education lies not only in verifying the effectiveness of the technology but also in promoting systematic changes in the paradigm of higher education. Regarding curriculum design, AI should be seen not only as a learning assistant but also as an integral part of the curriculum. Its efficient use should be treated as a curriculum objective, helping students master AI tools to optimize the learning process. A modular and flexible curriculum design can help students adjust the learning difficulty and content sequence with the assistance of AI, creating personalized and adaptive learning paths and breaking through the limitations of the traditional linear teaching framework. Meanwhile, the role of educators has been redefined in the AI era. Teacher training should focus on the dual transformation of technical skills and pedagogy, enabling teachers to shift from knowledge transmitters to learning guides and to master human‒AI collaborative instructional design. At the higher education policy level, it is necessary to formulate guiding principles for applying AI in education, including quality standards, data privacy protection, and the adjustment of the teacher evaluation system, to build an AI-ready education ecosystem while addressing the digital divide to ensure the equity of education.
Despite Blueink’s demonstrated benefits, educators face three challenges in adopting AI tools. First, technical barriers, such as limited familiarity with AI interfaces or algorithmic transparency issues, could hinder its effective implementation. Second, resistance to pedagogical shifts, such as balancing AI-driven personalized instruction with structured curricula, may require institutional support through targeted teacher training programs. Third, ethical concerns call for clear guidelines to ensure responsible AI usage. To address these challenges, universities should prioritize professional development workshops, establish AI ethics committees, and foster collaborative platforms where educators share best practices for human‒AI co-teaching models.
While this study focuses on the learning effects of students’ use of AI, it does not address ethical issues in the application of AI in education, such as data privacy, algorithmic bias, and the decline in students’ ability to think for themselves as a result of over-reliance on AI. Therefore, future research can begin with the construction of a reliable data protection system, diversity in algorithm design, the assessment of regulatory issues, and proper guidance from educators on students’ perceptions and usage of AI, so that AI can better serve education.
Furthermore, this study prioritizes quantitative data over students’ qualitative AI experiences. Qualitative data, such as students’ feelings, learning difficulties, practical skills, and subjective evaluations of their learning experiences, can reflect AI’s actual effects on teaching from different perspectives. Therefore, follow-up studies can expand qualitative research on students’ AI experience by conducting in-depth interviews to understand how they choose and apply AI tools to different learning tasks and how AI develops their problem-solving skills and shapes their learning strategies. Such studies may also observe students’ AI use both inside and outside the classroom, record their behaviours, and assess their motives and needs.
So far, this research comprises only a one-semester cross-sectional study. Future research should therefore adopt longitudinal designs to comprehensively assess the sustained impact of AI teaching assistants on students at various educational levels. For instance, a study can track students’ AI utilization patterns, critical thinking development, and academic performance across different stages of their biochemistry curriculum by conducting periodic surveys and recording behavioural logs of Blueink interactions. Subsequent studies should also adapt the functionalities and content of AI teaching assistants to meet individual needs. Multi-institution collaborations that gather varied data, including from students in diverse majors and grades, can further enhance the generalizability and applicability of the findings and validate their robustness. This study offers some methods to enhance learning experiences and their benefits, contributing to the growing body of research supporting the integration of AI and education. By addressing the challenges and leveraging the potential of AI, educators and institutions can develop pioneering, equitable, and forward-thinking training strategies that help students meet modern demands. Overall, this study aims to serve as a lever for the digital transformation of higher education. Through the combined force of technological and institutional innovation, the leap from AI-assisted teaching to an AI-enabled education ecosystem can ultimately be realized.