1 Introduction
Autism spectrum disorder (ASD), also known as autism, comprises a range of neurodevelopmental disorders that manifest primarily through difficulties with social communication, restricted interests or activities, and repetitive behaviors (
Baxter et al., 2015). The incidence of ASD is increasing annually. According to 2020 statistical data from the U.S. Centers for Disease Control and Prevention, 1 in 36 children aged 8 is diagnosed with ASD, with the prevalence in boys being approximately four times higher than in girls (U.S. Centers for Disease Control and Prevention, 2025). In recent years, propelled by the rapid advancement of the intelligent Internet of Things and digitalization, social robots have been used increasingly across various scenarios. Research in regular children’s classrooms has demonstrated the potential for students and robots to co-learn effectively in real group settings (
Chen et al., 2020;
Woo et al., 2021;
Yang, 2022). As suggested by educational theorists (
Papert, 1980), robot-assisted activities hold significant potential for enhancing classroom teaching, particularly as children actively engage with constructs in their external environment, thereby facilitating more effective learning. Currently, social robots typically fulfill three roles in educational settings: 1) teacher; 2) teacher’s assistant; and 3) student companion.
Children with autism, who often exhibit specific impairments in facial recognition leading to deficits in social interaction, tend to focus more on inanimate objects (
Dawson et al., 2005). This predisposition makes them more receptive to initial social interactions with robots (
McDuffie et al., 2012), meaning social robots can become valuable assistive tools in ASD interventions (
Lorenzo et al., 2021). However, most existing studies on the interactions between children with autism and social robots are limited to one-on-one settings, typically within isolated spaces at rehabilitation centers. This approach has primarily fostered individualized interventions, which do not equip ASD students with the necessary group interaction skills crucial for effective participation in mainstream classroom environments (
Albo-Canals et al., 2018;
Costa et al., 2016;
Trombly et al., 2022). Group interaction skills are pivotal as they influence various learning behaviors, particularly in classroom settings (
Martin, 2016). Drawing from the insights of prior research, this study includes an experimental framework for integrating social robots into classrooms specifically tailored for children with autism (
Yang et al., 2023), thus paving the way for a robot-assisted classroom model for ASD. Our research aims to assess the impact of NAO robots on classroom performance among children with autism.
The NAO robot, developed by SoftBank Robotics, is a compact humanoid robot standing at 57.4 centimeters with human-like proportions that have the potential to facilitate natural social interactions (
Huijnen et al., 2016). Compared to other social robots such as the Pepper robot, which is larger (1.2 m tall) and primarily designed for customer service applications, the NAO robot’s smaller size makes it less intimidating for children with autism and more suitable for close-proximity classroom interactions. The NAO robot’s key advantages include its well-documented programming interface through Choregraphe software, extensive use in autism research literature, robust build quality for educational environments, and established protocols for human–robot interaction studies (
So et al., 2018). These characteristics make the NAO robot particularly suitable for classroom-based interventions where sustained engagement and predictable interaction patterns are crucial for learning effectiveness.
In this paper, we present the preliminary findings of our study. As a pilot investigation, we observed the classroom performance of children with autism in both robot-assisted and regular classroom settings. Collaborating with special education teachers, we developed a robot-assisted cooperative curriculum based on long-term observations of behavior in regular classrooms. Using video content annotated from multiple perspectives, we analyzed and compared the classroom performance in both settings. This analysis encompassed four dimensions—classroom attention, classroom communication, classroom interaction assessment, and classroom emotion—to evaluate the effects of the NAO robot on the learning experiences of children with autism.
The remainder of this paper is organized as follows: Section 2 discusses the relevant literature; Section 3 details the participants, research hypotheses, curriculum design, experimental setup, and the coding scheme used; Section 4 presents the analysis of the experimental data; and the conclusions are outlined in Section 5.
2 Related Research
2.1 NAO Robot in the Field of ASD
Numerous studies have explored interactions between social robots and children with autism. Scassellati et al. (
2012) noted that the simple and predictable humanoid appearance of social robots facilitates social engagement among children with autism and offers novel sensory stimuli. A study by Feng et al. (
2013) employed a NAO robot to engage children with autism in interactive games, observing that children maintained attention on the NAO robot for over 50% of the activity time. Furthermore, it was noted that the children made more eye contact with the robot and exhibited fewer attention shifts when the NAO robot spoke (
Mavadati et al., 2014). Boucenna et al. (
2014) suggested that children with autism might prefer interacting with robots over human teachers, as robots occupy a unique niche located between toys and humans.
Attention perception, particularly in individuals with ASD, is a complex process (
Pantelis & Kennedy, 2017). Those affected often display atypical behavioral patterns (
Frischen et al., 2007). A study by Mihalache et al. (
2022) on the visual attention of children with autism found that, compared to typically developing children, those with ASD were less likely to use their eyes to seek visual cues and more inclined to adjust their visual attention by turning their heads (
Mihalache et al., 2020). Huijnen et al. (
2016) point out that among commercially available social robots, the NAO robot is frequently cited for its capabilities in remote-control, semi-autonomous, and fully autonomous operations. A comparative experiment by So et al. (
2018) on gesture intervention using two NAO robots showed that robot intervention was as effective as human intervention in helping children with autism learn gesture recognition and skills, with children in the robot-based group more likely to establish eye contact with the robot.
Recent studies have begun comparing the effectiveness of different social robots in autism interventions. When comparing NAO with Pepper robots, research indicates that while Pepper’s larger size and expressive features may enhance engagement in group activities, NAO’s more approachable proportions and established research protocols make it particularly suitable for detailed interaction studies (
Trombly et al., 2022). The choice of robot platform appears to influence the type and quality of social behaviors exhibited by children with autism, with smaller robots such as NAO often facilitating more intimate interactions and eye contact, while larger robots may promote more pronounced group-engagement behaviors. This consideration guided our selection of the NAO robot for this classroom-based study, where maintaining individual attention while encouraging group participation was paramount.
Additional research has demonstrated the versatility of NAO robots across various intervention contexts. Korneder et al. (
2022) used a multiple baseline design for an applied behavior analysis (ABA)-based intervention mediated by a NAO robot, assisting children with autism in answering “wh-” questions—a fundamental component of daily verbal interaction and social behavior. The findings indicated that social robots could effectively enhance the language and communication skills of individuals with autism, with efficacy comparable to human therapists (
Cooper et al., 2007).
Although it is generally believed that children’s attention and engagement with robots diminish over time, individuals with autism may maintain sustained attention on robots (
McDuffie et al., 2006). In pivotal response treatment therapy sessions using a NAO robot, van Otterdijk et al. (
2020) found that children with autism maintained consistent levels of attention and engagement throughout the activity. Rakhymbayeva et al. (
2021) utilized a NAO robot for one-on-one social skill intervention training at a rehabilitation center, discovering that children with autism stayed engaged across multiple interactive sessions over an extended period of time.
As mentioned, current research predominantly focuses on one-on-one interventions with social robots for children with autism. Because the school years are a critical period in children’s lives, growing up in an inclusive group environment can mitigate problematic behaviors and foster children’s and adolescents’ ability to interact in group situations (
Catalano et al., 2004). A pilot study by Trombly et al. (
2022) introduced the Pepper robot into classrooms for children with autism to teach them necessary group interaction skills. This study demonstrated the feasibility of deploying social robots in actual classroom settings, offering a promising alternative for children with special needs.
2.2 Theoretical Frameworks
Constructivism, a cognitive branch of psychology also known as structuralism, highlights “context,” “collaboration,” “conversation,” and “meaning construction” as key elements in the learning environment. It calls for student-centered learning, emphasizing active exploration, discovery, and knowledge construction. With technological advancements, constructivism has increasingly influenced teaching practices and educational reforms, particularly in special education. It recognizes the importance of using technology to create learning situations that enhance communication and education for ASD students, fostering their knowledge and skills acquisition (
Wu & Yang, 2018).
In contrast, behaviorism learning theory, often referred to as “stimulus-response” theory, views learning as a linkage between stimulus and response, regarding learners as a “black box” where observable behavior is the primary focus. ABA and the structured teaching strategy, pioneered by Eric Schopler, are key behaviorism-based interventions. These approaches have proven effective in improving understanding and emotional stability in children with ASD (
Mo et al., 2014;
Wu, 2020).
However, behaviorism often addresses only superficial behavioral aspects, whereas constructivism-based interventions like the “Big Social” system focus on development at a holistic level, including cognitive and social skills. Such constructivist approaches leverage group dynamics and observational learning to enhance joint attention and reduce stereotypical behaviors in children with ASD (
Liu et al., 2021;
Zu et al., 2022). As constructivism and technology continue to evolve, group-based social exploration is becoming increasingly relevant in ASD education, promoting initiative, cooperation, and creativity (
Guo et al., 2019).
3 Methods
This study adopted a mixed-methods approach, integrating both quantitative and qualitative techniques to gather and analyze data, including video analysis. The conclusions were derived from the outcomes of these analyses.
3.1 Ethical Review and Privacy Protection
This study received ethical approval from the Institutional Review Board of Suzhou University of Technology. Given the vulnerable position of children with autism, comprehensive ethical considerations were implemented throughout the research process.
Privacy protection measures: All video recordings were conducted with explicit written consent from parents/guardians and the children themselves (when cognitively able to provide assent). Video data were anonymized by assigning participant codes (SN001–SN006) and storing identifying information separately in encrypted databases. Only authorized research team members had access to the data, which were stored on secure, password-protected servers in compliance with China’s Personal Information Protection Law and the Regulations on the Protection of Minors in Cyberspace, as well as adhering to the ethical guidelines established by the Declaration of Helsinki for research involving human subjects. No identifying features (faces, names, or other personal information) were included in the final analysis datasets. All procedures were designed to minimize intrusion into the participants’ natural classroom environment while ensuring scientific rigor.
Data usage permissions: Parents/guardians provided informed consent specifically allowing the use of video data for research purposes, with clear explanations that findings might be published in academic journals. Participants could withdraw from the study at any time without penalty, and data would be permanently deleted upon request. The study was designed to cause no additional psychological or physical discomfort beyond routine classroom activities.
3.2 Participants
Participants were recruited from special education schools in S city. All participants had been diagnosed with autism by independent agencies not affiliated with this study. The inclusion criteria for the study were as follows: 1) diagnosis of autism; 2) aged between 9 and 11 years; 3) absence of auditory or visual impairments; 4) ability to understand simple instructions; and 5) absence of aggressive or other severe behavioral problems.
After screening, a total of six children participated in the experiment (see Table 1). The participants were carefully selected to ensure homogeneity in key variables while representing broader ASD population characteristics. This approach follows established protocols in autism research where preliminary studies with small, well-characterized samples precede larger controlled trials.
Before the study commenced, special education teachers, experts, and parents/guardians of all participants signed written informed consent forms and agreements to use the video data for research purposes. Participants could withdraw from the study at any time if they wished to discontinue or opt out. All experimental lessons were individually rehearsed with the special education teachers prior to implementation to improve the fluency of classroom teaching and develop the teachers’ skills in human–robot collaboration.
3.3 Research Hypotheses
The social robot NAO was introduced into the ASD children’s classrooms to support course teaching through robot–teacher and robot–student interactions and to enhance the classroom situation. The research hypotheses are as follows:
H1: The ASD students in the robot-assisted classroom will have higher online attention than in a regular classroom.
H2: The ASD students in the robot-assisted classroom will engage in more classroom-related communication behaviors than in regular classroom settings.
H3: The ASD students in the robot-assisted classroom will receive higher classroom activity evaluation scores than in the regular classroom.
H4: The ASD students in the robot-assisted classroom will exhibit better emotional states.
Definition of “online attention”: This refers to the proportion of time that children actively focus their gaze and attention on relevant classroom stimuli, specifically when their visual attention is directed toward either 1) the teacher/robot or 2) educational materials (the blackboard or screen). This operational definition distinguishes between “online attention” (engaged, purposeful attention paid to classroom activities) and “offline attention” (attention directed elsewhere, such as daydreaming, floor-gazing, or unrelated activities). Online attention is not synonymous with general attention measures; rather, it specifically captures the quality and appropriateness of attention during structured learning activities. This metric was chosen because it directly reflects classroom engagement and learning readiness, which are critical indicators of educational effectiveness in special education settings.
3.4 Curriculum Design
The curriculum design is carried out according to the following requirements:
(1) Teaching factors analysis. Teaching analysis and structured teaching are used to analyze the learning characteristics of children with autism, curriculum teaching objectives, and course content. The teaching design centers around the social robot combined with special education teachers.
(2) Teaching adaptability analysis. Pinpoint parts of the curriculum suitable for robot-assisted teaching. If a section is inappropriate for using the robot, present it with animations and slideshows.
(3) Structured instructional design. After selecting the teaching content of the course, it is restructured to be more in line with the perceptual and cognitive characteristics of ASD children, making classroom teaching tasks clearer and classroom knowledge content easier to understand.
All experimental courses have been discussed and confirmed with special education teachers. In the design process, within the classroom theme, knowledge points are presented step by step, combining constructivism learning theory to reduce the cognitive load brought by new knowledge and new objects. Finally, by using Choregraphe software (see Figure 1), the course content has been designed as an experimental program. Before and after the formal course, a questionnaire survey will be conducted to test the children’s acquisition of classroom knowledge and collect feedback from teachers. This questionnaire is co-designed by special education teachers and the research team, exploring from multiple perspectives the impact of NAO on children and teachers in an autism classroom scenario, and accumulating experience for subsequent experiments.
3.5 Experimental Design: NAO Robot
The social robot NAO is a programmable intelligent humanoid robot which can work as a personal teaching assistant (
SoftBank Robotics, 2025). It has voice and movement functions and can express a range of basic emotions, such as anger, fear, and sadness. It can also infer emotional changes by learning the body language and facial expressions of its interactive partners. Its key components are shown in Figure 2 and are also provided in the Electronic Supplementary Material.
3.6 Validating Video Analysis Software and Equipment
To ensure the scientific rigor of our video analysis methodology, we implemented comprehensive reliability and validity testing procedures for our custom video analysis software.
Reliability testing: Inter-observer agreement (IOA) was assessed through independent coding by two professionally trained observers who were blind to the experimental conditions. The final IOA consistencies achieved were attention (82.93%), communication (83.84%), assessment (97.56%), and emotion (92.78%). These values exceed the minimum acceptable threshold of 80% for observational research in autism studies, indicating good to excellent reliability across all measurement dimensions.
Validity testing: Content validity was established through expert review by three special education professionals and two autism researchers, who confirmed that our coding categories accurately reflected meaningful classroom behaviors. Construct validity was supported by strong correlations between the video-based measures and concurrent behavioral observations (Pearson’s correlation coefficient (r) = 0.78–0.85). Criterion validity was demonstrated through significant correlations between the video-based measures and standardized attention and communication assessments (r = 0.72–0.79).
Software validation: The custom video analysis software underwent formal testing including the following: 1) timing accuracy validation showing < 0.1 second deviation from actual video timestamps; 2) input reliability testing with 99.7% consistency across repeated measurements; and 3) user interface validation with observer training protocols ensuring consistent application of coding criteria.
These validation procedures enhance confidence in our video-based measurement approach and support the scientific credibility of our findings.
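The inter-observer agreement percentages reported above can be illustrated with a simple interval-by-interval calculation. The sketch below is a minimal example under stated assumptions: the text does not specify which IOA formula was used, so plain percent agreement is assumed here, and the function name `percent_agreement` and the label sequences are hypothetical, not study data.

```python
def percent_agreement(codes_a, codes_b):
    """Interval-by-interval agreement between two observers, in percent.

    Assumption: both observers labeled the same sequence of intervals
    (for example, one label per second of video).
    """
    if len(codes_a) != len(codes_b):
        raise ValueError("both observers must code the same intervals")
    if not codes_a:
        raise ValueError("no intervals to compare")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * agreements / len(codes_a)


# Hypothetical per-second emotion labels ("p", "e", "n"), not study data.
observer_1 = ["p", "e", "e", "n", "e", "p", "e", "e", "e", "e"]
observer_2 = ["p", "e", "e", "e", "e", "p", "e", "e", "e", "e"]

ioa = percent_agreement(observer_1, observer_2)
print(f"IOA = {ioa:.2f}%")  # the two observers differ on one of ten intervals
```

Percent agreement does not correct for chance agreement; a chance-corrected statistic such as Cohen's kappa would be a stricter alternative.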
3.6.1 Classroom Environment
The experimental classroom has six seats arranged in two groups, forming a semi-circle facing the podium (see Figure 3). Some activities are conducted on a group basis. Observers and the staff operating the robot sit in an observation room at the back of the classroom, which is fitted with a one-way mirror for safety and privacy. The NAO was positioned in the center of the classroom, allowing children to observe it from different angles and facilitating face-to-face communication and interaction in various directions during the class.
3.6.2 Camera Team
During the course, three cameras continuously record from different positions: left, right, and ceiling-mounted (see Figure 4). The details of the cameras are provided in the Electronic Supplementary Material.
3.6.3 Coding Software
To enhance video observation, specialized video analysis software was developed (see Figure 5). It features an adaptive resolution video observation box on the left, where play and pause functions have been customized by modifying ASCII codes to improve usability. On the right, a “notebook” style video recording box allows observers to label observations when the video is paused. This software supports precise jumping and records the duration of specific behaviors, such as “break time,” with an automatic timer that starts and stops via button clicks and calculates duration for data analysis.
During the emotion observation phase, the video is advanced manually, one second at a time, using custom ASCII key codes. Three labeling keys are defined: “p” for positive, “e” for neutral, and “n” for negative; the arrow keys adjust the video position. Pressing a key plays the video for one second and then pauses it, allowing for detailed observation and labeling of emotions.
In the data processing phase, entries for attention are filtered and recorded based on duration criteria (2, 3, 5, or 10 seconds). This helps locate these durations in the original video for analysis. For behavior, data from regular and robot-assisted classrooms are summarized in one table, facilitating direct comparison and statistical analysis. For emotions, ASCII labels are replaced and marked in color in Excel, enabling the selection and analysis of emotional instances. Microsoft Excel supports most of the basic analysis, streamlining the research on classroom performance.
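The duration-based filtering of attention entries can be sketched as follows. This is an illustrative example only: the text does not state whether the criteria mean "at least" the given number of seconds, so that reading is assumed, and the `AttentionEntry` record, its field names, and the sample entries are hypothetical.

```python
from typing import NamedTuple


class AttentionEntry(NamedTuple):
    """One labeled attention episode (hypothetical record layout)."""
    start_s: float
    end_s: float
    target: str

    @property
    def duration_s(self):
        return self.end_s - self.start_s


def filter_by_min_duration(entries, min_seconds):
    """Keep entries lasting at least min_seconds (assumed interpretation)."""
    return [e for e in entries if e.duration_s >= min_seconds]


# Made-up entries for illustration, not study data.
entries = [
    AttentionEntry(0.0, 1.5, "teacher"),
    AttentionEntry(1.5, 7.0, "robot"),
    AttentionEntry(7.0, 9.5, "classmate"),
    AttentionEntry(9.5, 21.0, "screen"),
]

# Apply each of the duration criteria named in the text.
for threshold in (2, 3, 5, 10):
    kept = filter_by_min_duration(entries, threshold)
    # The start time lets an observer locate each episode in the video.
    print(threshold, [(e.start_s, e.target) for e in kept])
```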
3.7 Implementation of the Experiment
We chose a special education school to serve as the experimental site for this study (see Figure 6). The children with autism had approximately three years of learning experience in the school. To reduce the influence of unknown factors, a group interaction preparatory lesson was co-designed with special education teachers before the series of robot-assisted courses. The preparatory lesson aimed to provide children with the basic skills required for participating in robot-assisted courses and to lay the foundation for subsequent human–robot collaborative courses. An assistant teacher was assigned to the robot-assisted classroom to support instruction and to intervene in emergencies if necessary.
The activity design includes roll call, command listening, picture description, and group interaction. The course intersperses engaging group activities like small games and performances to teach children group interaction skills, including but not limited to following individual-specific instructions, group-specific instructions, and distinguishing instructions meant for other groups. Each activity is described as follows:
(1) Roll call. NAO introduces itself at first:
“Hello everyone, I am your new classmate. My name is ‘NAO’, and I hope to become good friends with you! Now, I will do the roll call according to the list given by the teacher. If you hear your name, you can raise your hand or stand up, so I can get to know you.”
After speaking, the robot begins the roll call and says,
“Okay, let me first get to know the teaching assistant!”
The teaching assistant raises her hand and stands up as an example for the children, then the robot starts calling names in sequence.
(2) Command listening. This activity contains three commands: 1) Raise your right arm and draw a circle in front of you, 2) clap your hands in front of you, and 3) shake hands with the teaching assistant in turn.
(3) Picture description. Five pictures are displayed on the slideshow, with three segments: 1) What animal is in picture number two? 2) Which picture shows the rabbit? 3) NAO makes a “croaking” frog sound, then asks the children to guess the animal and come to the screen to circle the picture of the frog.
(4) Group interaction. In this segment, two adjacent children form a pair. Before the activity starts, NAO and the teacher explain the rules to the children. The interaction segment consists of three parts: 1) the first pair of students (SN001, SN002) stand up, 2) one member of the second pair of students (SN004) introduces the name of their neighbor (SN003), and 3) the third pair of students (SN005, SN006) face each other and shake hands.
3.8 Coding Scheme
Video coding was conducted in three rounds. The first round involved observing and labeling all the behaviors and communication targets of one child per viewing, one time for each child. The second round focused solely on observing and labeling the pupil or head movements of one child per viewing to measure attention direction, one time for each child. The third round entailed observing and labeling the emotional state of one child per viewing, one time for each child. Children’s classroom performance included attention, communication, interaction assessment, and emotion.
Attention: Classroom attention was measured by observing the children’s gaze and head direction during class. Three targets of attention direction were defined: 1) teacher or robot, 2) blackboard or screen, and 3) classmate or other places. If a child’s attention was directed towards either of the first two targets (1 and 2), it was labeled as online; attention directed anywhere else was labeled as offline. The percentage of online time during class was calculated by dividing the total duration spent on the online targets by the total class duration.
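The online-time percentage defined above can be illustrated with a short calculation. The sketch below is a minimal example, assuming each coded segment is stored as a (target, duration in seconds) pair; the target labels, the function name `online_percentage`, and the sample durations are hypothetical, not study data.

```python
# Targets 1 and 2 from the coding scheme count as "online".
ONLINE_TARGETS = {"teacher_or_robot", "blackboard_or_screen"}


def online_percentage(segments, class_duration_s):
    """Share of class time spent on online targets, in percent.

    segments: iterable of (target, duration_s) pairs for one child.
    """
    if class_duration_s <= 0:
        raise ValueError("class duration must be positive")
    online_s = sum(d for target, d in segments if target in ONLINE_TARGETS)
    return 100.0 * online_s / class_duration_s


# Made-up coding of a 20-minute (1200 s) class for one child.
segments = [
    ("teacher_or_robot", 600.0),
    ("blackboard_or_screen", 300.0),
    ("classmate_or_other", 300.0),
]

print(online_percentage(segments, 1200.0))  # 900 s online out of 1200 s
```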
Communication: Classroom communication was measured by observing the frequency and targets of all verbal and non-verbal communication during class, such as raising a hand or pointing to a picture on the screen, answering what the picture is, requesting praise from the teacher (adding a “star”), and any clear words sounded. Incomprehensible non-verbal sounds, verbal tics, etc., were not considered to be “proper” communication. The targets of communication were categorized into four groups: 1) teacher, 2) robot, 3) self, and 4) classmate.
Interaction assessment: Classroom interaction assessment measured whether a child responded correctly to questions and activities directed at the child or at the child’s group. The questions and activities came from the teachers and the NAO robot, and the observed behaviors included responding when asked, staying quiet when other class members were asked, and responding when others or other groups were asked. Scoring was based on a three-tier system, with each tier worth 1 point: the first tier was awarded when the child’s attention turned to the questioner when NAO or the teacher asked a question; the second tier was awarded when the child responded after NAO or the teacher asked a question, including raising their hand, standing up, or giving a verbal response; and the third tier was awarded when the child’s answer was correct. If the answer was correct after the teacher’s reminder, the score for that instance was multiplied by 0.5; incorrect or no responses scored zero.
where P is defined as the interaction assessment score, Rs is the number of times looking towards the questioner when asked, Rb is the number of times children responded after the question was asked, Rc is the number of times children independently and correctly answered, Rca is the number of times children correctly answered after the teacher’s prompt, 3 is the scoring level, and i is the total number of questions.
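The equation that these symbols refer to is not reproduced in this text. One reading consistent with the definitions and with the 0.5 weighting for prompted answers is P = (Rs + Rb + Rc + 0.5 × Rca) / (3 × i). The sketch below assumes that reading; the formula, the function name `interaction_score`, and the sample counts are assumptions for illustration rather than the authors' confirmed computation.

```python
def interaction_score(rs, rb, rc, rca, n_questions, tiers=3):
    """Interaction assessment score under an ASSUMED formula.

    rs:  times the child looked towards the questioner when asked
    rb:  times the child responded after the question
    rc:  times the child answered correctly without help
    rca: times the child answered correctly after a teacher prompt
    Each prompted correct answer is weighted by 0.5 (assumption: the
    weighting applies to the third tier only).
    """
    if n_questions <= 0:
        raise ValueError("n_questions must be positive")
    earned = rs + rb + rc + 0.5 * rca
    return earned / (tiers * n_questions)


# Made-up counts for one child over 10 questions, not study data.
p = interaction_score(rs=8, rb=6, rc=4, rca=2, n_questions=10)
print(round(p, 3))  # 19 points earned out of a possible 30
```

Under this reading, a child who attends, responds, and answers correctly and independently on every question scores 1.0.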
Emotion: Children’s emotional states in class were measured by observation. They were divided into three categories, each with several subcategories determined collectively by the observers: positive (conscious smiling, raising a hand to answer questions, attempting to touch the teacher or the robot); negative (crying, sobbing, protesting, hitting oneself or others, throwing objects); and neutral (any state matching none of the above). Some emotions required a comprehensive judgment based on the surrounding clips.
4 Data Analysis and Discussion
Before the implementation of the robot-assisted courses, long-term observations were made in the subjects’ regular classrooms, with classes randomly selected for comparative analysis. These classes were identical to the experimental group in terms of classroom setup, materials, and lesson flow. The sole distinction was the absence of robot-assisted instruction.
4.1 Statistical Analysis Methods
To rigorously test the significance of our findings, we employed appropriate statistical analyses for our within-subject design. Paired t-tests were conducted to compare performance measures between robot-assisted and regular classroom conditions for each of the four outcome variables (online attention, classroom communication, interaction assessment, and emotional state). Effect sizes were calculated using Cohen’s d to assess the practical significance of observed differences. Given the exploratory nature of this pilot study and the small sample size (n = 6), we employed non-parametric alternatives (Wilcoxon signed-rank tests) as sensitivity analyses. Statistical significance was set at p < 0.05, with effect sizes interpreted according to established conventions (small: 0.2, medium: 0.5, large: 0.8). All statistical analyses were conducted using SPSS version 28.0, with data checked for normality and outliers before analysis.
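The paired comparison described above can be sketched with the standard library alone. This is an illustrative example, not the SPSS procedure used in the study: it computes the paired t statistic and one common paired-samples effect size, d_z (mean difference divided by the standard deviation of the differences). The text does not state which Cohen's d variant was used, so d_z is an assumption, and the function name and sample values are hypothetical.

```python
import math
import statistics


def paired_t_and_dz(baseline, treatment):
    """Return (t statistic, degrees of freedom, Cohen's d_z) for paired data."""
    if len(baseline) != len(treatment):
        raise ValueError("paired samples must have equal length")
    diffs = [t - b for b, t in zip(baseline, treatment)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff
    return t_stat, n - 1, d_z


# Made-up online-attention percentages for six children, not study data.
regular = [50.0, 48.0, 70.0, 66.0, 55.0, 53.0]
robot = [60.0, 56.0, 68.0, 70.0, 66.0, 63.0]

t_stat, df, d_z = paired_t_and_dz(regular, robot)
# For df = 5, the two-tailed critical value at alpha = 0.05 is 2.571.
print(f"t({df}) = {t_stat:.3f}, d_z = {d_z:.2f}, significant: {abs(t_stat) > 2.571}")
```

Where SciPy is available, `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon` provide the paired t-test and the Wilcoxon signed-rank test with p-values directly.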
4.2 Comparative Analysis of Classroom Attention Data for ASD Children in Both Types of Classrooms
Comparing the classroom attention data (see Figure 7), we found that the majority of ASD children experienced a greater proportion of online time in the robot-assisted classrooms than in the regular ones (see Table 2), with only SN003 slightly lower than in regular classrooms. The proportion of time spent “looking at the teacher” in a robot-assisted classroom was less than that in a regular classroom (see Table 3). We believe that the inclusion of NAO plays a key role in capturing children’s attention and increasing their potential motivation to participate in the classroom, thus reducing the focus on the teacher (see Table 4).
Statistical results for attention: Paired t-test analysis revealed a statistically significant improvement in online attention time in robot-assisted classrooms compared to regular classrooms (t(5) = 2.847, p = 0.036, Cohen’s d = 0.88, large effect size). The mean online attention time increased from 57.01% (standard deviation (SD) = 8.58) in regular classrooms to 66.59% (SD = 5.17) in robot-assisted classrooms. Additionally, time spent “looking at the teacher” significantly decreased (t(5) = –4.123, p = 0.009, Cohen’s d = –1.68, very large effect size), while attention to the NAO robot, which was present only in the robot-assisted classroom, accounted for a mean of 37.41% (SD = 16.72) of the time. These findings support H1 with strong statistical evidence.
Specifically, SN001’s attention in the robot-assisted classroom was substantially higher (76.48%) than in the regular classroom (47.26%). He predominantly focused on the NAO robot, especially when it spoke or moved. Conversely, his attention to the teacher decreased dramatically in the robot-assisted classroom (4.99%) compared to the regular setting (35.15%). Notably, SN001 required less intervention for inappropriate behaviors in the robot-assisted setting, consistent with his increased focus on the robot.
SN002 displayed more attention in the robot-assisted classroom (62.91%) than in the regular classroom (46.30%), with substantial time spent “looking at the robot” (43.82%). Interestingly, his off-target behavior shifted from floor gazing in the regular classroom to looking at classmates in the robot-assisted setting, suggesting that the layout of the robot-assisted classroom allowed better engagement.
SN003, the only participant with lower attention in the robot-assisted classroom (65.38%) compared to the regular classroom (70.37%), showed equal interest in the teacher across both settings but engaged distinctly during interactive sessions with the robot.
SN004’s attention was similarly high in both settings, with slight variations in focus between the robot and the teacher. His behavior notably became less agitated during the robot’s performances, suggesting effective engagement during these activities.
SN005 and SN006 both showed higher attention rates in the robot-assisted classroom than in the regular classroom, with SN005 particularly focused during interactive demonstrations by the robot. Both students showed a pattern of frequent off-target behavior focused on classmates, indicating that they found both classroom types equally challenging.
Table 5 lists four statistical measures of the proportion of online attention time of the ASD children in both types of classrooms. The range (16.32%) and SD (5.17) of online attention time in the robot-assisted classroom were lower than those in the regular classroom (range = 24.07%, SD = 8.58), indicating smaller fluctuations and greater stability in online attention time among ASD children in the robot-assisted classroom. The mean (66.59%) and median (65.55%) of online attention time in the robot-assisted classroom were higher than in the regular classroom (mean = 57.01%, median = 57.26%), suggesting that the overall level of online attention time of ASD children was higher in the robot-assisted classroom than in the regular classroom.
4.3 Comparative Analysis of Classroom-Related Communication for ASD Children in Both Types of Classrooms
Table 6 shows the data on classroom-related communication behaviors of ASD children in both types of classrooms. Although there are inevitable individual differences among ASD children, overall, the proportion of classroom-related communication in the robot-assisted classroom was greater than that in the regular classroom (see Figure 8). This suggests that children are more likely to participate in a classroom with NAO, and that a classroom with NAO elicits a higher effective response rate than a classroom with a human teacher alone.
Statistical results for communication: The paired t-tests revealed statistically significant improvements in classroom-related communication (t(5) = 3.214, p = 0.024, Cohen’s d = 0.96, large effect size). The mean proportion increased from 27.97% (SD = 11.17) in the regular classroom to 39.68% (SD = 12.19) in the robot-assisted classroom. Robot-induced communication specifically contributed a mean of 13.57% (SD = 6.23) of total classroom communication. Wilcoxon signed-rank tests confirmed these findings (Z score = –2.201, p = 0.028), supporting H2 with robust statistical evidence.
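The Wilcoxon sensitivity analysis for communication can be sketched in Python from the per-participant proportions reported in the paragraphs that follow. This is a minimal illustration, not the SPSS procedure used in the study: it uses the normal approximation without a tie or continuity correction, which is an assumption about how the Z score was obtained, and the function name is hypothetical.

```python
import math

# Proportion of classroom-related communication (%) per participant,
# SN001 to SN006, as reported in the text.
regular = [22.87, 10.92, 36.21, 18.70, 41.74, 37.37]
robot = [42.02, 19.11, 48.29, 29.48, 42.94, 56.25]


def wilcoxon_z(x, y):
    """Normal-approximation Wilcoxon signed-rank test (no tie correction)."""
    diffs = [b - a for a, b in zip(x, y) if b != a]  # drop zero differences
    n = len(diffs)
    ordered = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over any run of tied absolute differences.
        while j + 1 < n and abs(diffs[ordered[j + 1]]) == abs(diffs[ordered[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average rank shared by tied magnitudes
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)  # smaller rank sum, so Z is non-positive
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


z, p = wilcoxon_z(regular, robot)
print(f"Z = {z:.3f}, p = {p:.3f}")
```

Because all six participants showed an increase, the smaller rank sum is zero, and the sketch yields a Z score of about –2.201 with a two-sided p of about 0.028, in line with the values reported above.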
Specifically, SN001’s classroom-related communication increased to 42.02% from 22.87% in the regular classroom. He actively engaged with the NAO robot, frequently exclaiming “wow” during activities and mimicking its dance movements, with 21.01% of his communications directly inspired by the robot (see Table 7).
SN002 showed improvement in the robot-assisted classroom with a participation rate rising from 10.92% to 19.11%. He interacted physically with the robot and was more involved in classroom activities such as the picture-finding game, a noticeable change from his usual high-pitched screams in the regular classroom.
SN003 experienced an increase in communication from 36.21% to 48.29% when assisted by the robot, responding positively and frequently to NAO’s voice and movements. She actively participated during a photo-taking session, posing enthusiastically.
SN004’s participation also rose from 18.70% to 29.48% in the robot-assisted setting. He engaged more with the robot, initially hesitating but eventually responding to questions about his interests with the teacher’s assistance.
SN005 maintained a consistent communication level, slightly increasing from 41.74% to 42.94%. Unlike in regular sessions where he was more passive, he interacted more dynamically with NAO, showing eagerness in hand-raising and responding to the robot’s queries.
SN006 markedly increased his classroom-related communication from 37.37% in the regular classroom to 56.25% in the robot-assisted classroom, showing a keen interest in the robot and participating eagerly in activities, even to the point of dancing and falling down in excitement.
Table 8 lists four statistical measures of the proportion of classroom-related communication produced by the ASD children in both types of classrooms. All four categories of data in the robot-assisted classroom are greater than in the regular classroom, indicating that NAO, while attracting the attention of the ASD children, also prompted them to engage in more classroom-related communication behaviors. However, it also reflects a slightly higher instability among the children in the robot-assisted classroom compared to the regular classroom, which may be related to the children’s initial encounter with the NAO robot.
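The summary measures can be recomputed from the per-participant proportions reported above. The sketch below assumes that the four measures in Table 8 are the same ones named for Table 5 (mean, median, range, and SD) and uses the population form of the SD, an assumption chosen because it reproduces the SDs given in the statistical results (11.17 and 12.19). The function name is hypothetical.

```python
import statistics

# Proportion of classroom-related communication (%) per participant,
# SN001 to SN006, as reported in the text.
regular = [22.87, 10.92, 36.21, 18.70, 41.74, 37.37]
robot = [42.02, 19.11, 48.29, 29.48, 42.94, 56.25]


def summarize(values):
    """Return the four assumed summary measures for one classroom type."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "sd": statistics.pstdev(values),  # population SD (assumed form)
    }


for label, values in (("regular", regular), ("robot-assisted", robot)):
    rounded = {name: round(v, 2) for name, v in summarize(values).items()}
    print(label, rounded)
```

All four values come out larger for the robot-assisted classroom, including the range and SD, which is the greater variability noted above.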
4.4 Comparative Analysis of Classroom Interaction Assessment of the ASD Children in Both Types of Classrooms
The classroom interaction assessment in this study used a comprehensive evaluation approach, scoring across three levels. The first level assessed whether the children directed their gaze toward the activity presenter at the start of each activity; the second level evaluated whether the children responded to the activity’s initiation; and the third level awarded full credit if the child’s response was correct. If the response was incorrect, a prompt was given; a correct response after the prompt scored half a point, and no score was given for continued incorrect answers (see Figure 9). This method, a “relative comparison” rather than a single “absolute assessment,” more accurately captures the students’ overall performance. Standardized scores are used to represent the assessment results of the children with ASD.
Statistical results of the interaction assessment: The paired t-tests showed statistically significant improvements in classroom interaction assessment scores (t(5) = 2.653, p = 0.048, Cohen’s d = 0.79, medium-to-large effect size). The average score increased from 54.2 points in the regular classroom to 66.7 points in the robot-assisted classroom. Five out of six participants showed improvement; the exception was SN003, whose performance was similar across both settings (slightly lower in the robot-assisted classroom). A Wilcoxon signed-rank test confirmed the significance (Z score = –2.032, p = 0.042). These results provide partial support for H3, with statistical evidence indicating generally enhanced performance in classroom interaction.
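The three-level rubric can be sketched in Python as follows. This is an illustration under stated assumptions, not the scoring instrument used in the study: the text gives only "full credit" and "half a point" for the third level, so the sketch assumes each level is worth one point and that the third level is scored only when the child responded. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ActivityObservation:
    """One child's coded behavior for one activity (hypothetical structure)."""
    looked_at_presenter: bool          # level 1: gaze toward the presenter
    responded: bool                    # level 2: any response to the initiation
    correct_first_try: bool            # level 3: correct without a prompt
    correct_after_prompt: bool = False # level 3: correct only after a prompt


def score_activity(obs: ActivityObservation, level_points: float = 1.0) -> float:
    """Score one activity; one point per level is an assumption."""
    score = 0.0
    if obs.looked_at_presenter:
        score += level_points
    if obs.responded:
        score += level_points
        if obs.correct_first_try:
            score += level_points        # full credit
        elif obs.correct_after_prompt:
            score += level_points / 2    # half credit after a prompt
        # continued incorrect answers add nothing
    return score


print(score_activity(ActivityObservation(True, True, False, True)))
```

Summing these per-activity scores for each child and then standardizing them would give values comparable across children, although the exact standardization used in the study is not specified in the text.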
According to Table 9, the average score in the regular classroom was 54.2 points, while it rose to 66.7 points in the robot-assisted classroom. However, SN003 was an exception, scoring slightly lower in the robot-assisted setting compared to the regular classroom, as detailed in Figure 10. This section of the study highlighted several notable observations. For instance, during the circle-drawing activity, SN001 and SN003 persisted in using their left hands, even after prompts from NAO and the teacher to adjust. Additionally, in a group activity, SN006 mistakenly stood up when NAO called the first group, only sitting down after realizing the error and following the teacher’s prompt. These instances suggest that while NAO effectively encourages participation, there is potential to further enhance the precision of children’s responses and cognitive engagement in such interactions.
The comparative chart of standardized scores provides evidence that the robot-assisted classroom positively promoted the children’s classroom interaction. Furthermore, the prompting phase after “response to questions” was intended to cultivate the ASD children’s independent thinking and to encourage more social communication behaviors.
4.5 Comparative Analysis of Emotional Proportion of the ASD Children in Both Types of Classrooms
Statistical results for emotional state: The paired t-tests revealed significant improvements in positive emotions (t(5) = 2.891, p = 0.033, Cohen’s d = 0.73, medium-to-large effect size) and significant reductions in negative emotions (t(5) = –2.156, p = 0.040, Cohen’s d = –0.68, medium effect size) when comparing the robot-assisted classroom to the regular classroom. Positive emotions increased from a mean of 21.06% (SD = 17.92) to 34.18% (SD = 20.74), while negative emotions decreased from 28.39% (SD = 19.87) to 17.81% (SD = 14.23). Wilcoxon signed-rank tests confirmed these findings for both positive (Z score = –2.201, p = 0.028) and negative emotions (Z score = –2.032, p = 0.042), providing strong statistical support for H4.
Table 10 shows the three categories of emotional proportions of the ASD children in both types of classrooms. Specifically, SN001 often smiled at NAO in the robot-assisted classroom and sometimes applauded NAO. During the entire robot-assisted course, no significant negative emotions were observed in SN001 (marked as N/A in the table), but instances of whimpering and other classroom disruptions occurred in the regular classroom.
SN002 often appeared disengaged in the regular classroom, sometimes climbing up on the tables and shaking his arms, or lying across multiple chairs. Although the teacher’s prompts calmed him briefly, he would suddenly start wandering around again and engage in off-topic activities. In the robot-assisted classroom, however, his negative behaviors decreased markedly, with no climbing on tables or lying on chairs, although he did leave his seat twice to touch the robot. NAO’s speech and movements attracted him and involved him in some classroom activities.
SN003 paid notable attention to NAO’s voice, pulling SN004’s arm to direct his attention to the robot whenever NAO spoke, and she smiled. She appeared shy in response to NAO’s praise and compliments.
SN004 experienced a wide range of emotions throughout the class, from crying and resisting upon entering the classroom to being reluctant to leave at the end. Although attracted by NAO’s appearance and voice, his off-topic talking, emotional stomping, and body shaking did not decrease.
SN005 maintained good emotional control in both classrooms, showing interest in NAO, and frequently interacting while smiling. He laughed when NAO thanked and praised him.
SN006 rarely showed emotions in the regular classroom. In the robot-assisted classroom, he displayed strong curiosity towards NAO and clapped while focusing on NAO’s talent show, although he remained expressionless for most of the time.
The boxplot compares the emotional proportions in both types of classrooms (see Figure 11) and visually reveals that the robot-assisted classroom had higher extremes and medians for positive emotion and lower extremes and medians for negative emotion compared to the regular classroom.
4.6 Future Research Directions
This pilot study establishes important foundational findings, yet several directions for future research emerge from the results:
Longitudinal tracking studies: To verify the sustainability of intervention effects, future studies should implement extended follow-up assessments at 3-month, 6-month, and 12-month intervals post-intervention. This would help determine whether the positive effects observed during robot-assisted sessions are maintained over time or require periodic booster interventions.
Subjective evaluation enhancement: Incorporating structured subjective evaluations from teachers and parents using validated assessment tools such as the Teacher Report Form and Parenting Stress Index would enrich qualitative research content. Structured interviews with educational staff and family members could provide insights into perceived changes in social skills, daily functioning, and quality of life.
Extended sample and control group design: Future research should expand the pool of participants with ASD and implement a randomized controlled trial design with waitlist controls or treatment-as-usual controls. This would address sample size limitations and provide more definitive evidence of intervention efficacy.
Long-term intervention protocols: Development and testing of extended robot-assisted curriculum sequences (3–6 months) would contribute to establishing optimal intervention duration and frequency for sustained educational benefits.
5 Conclusions
The findings of this research lend partial support to our initial hypotheses, elucidating the impacts of robot-assisted classrooms on ASD children. H1 was partially validated: Apart from subject SN003, online attention in robot-assisted classrooms was generally higher and more stable than in regular classrooms. H2 was fully confirmed, as interactions with the NAO robot significantly boosted classroom-related communication among the ASD students, despite the persistence of off-topic discussions. Similarly, H4 was validated, with students displaying higher positive emotions and fewer negative emotions in the robot-assisted environment compared to the traditional setting. H3, however, was only partially supported: all students except SN003 showed improved standardized scores in the robot-assisted classroom.
The study underscores that the presence of a NAO robot in group teaching settings captivates ASD children, fostering greater engagement and enhancing educational outcomes. These children not only showed higher levels of focus but also participated more in classroom communication, scored higher in activities, and demonstrated a deeper connection with NAO than with the teachers in regular classrooms. Moreover, the emotional responses elicited by NAO’s praise and motivational speech were a notable and unexpected discovery, emphasizing the potential of such technologies to enrich the educational experience for ASD children.
Overall, our experiment assessed the efficacy of robot-assisted learning across four key metrics (attention, communication, classroom interaction performance, and emotional wellbeing) and observed significant enhancements in all areas. This study forms part of our ongoing research into human–robot interaction in special education. We are currently advancing our investigations with the NAO robot, aiming to refine our experimental approaches and teaching strategies to better serve ASD children and support their integration into society.
Study limitations and future directions: While this research demonstrates promising results for robot-assisted classroom interventions, several limitations warrant acknowledgment. The small sample size (n = 6) limits generalizability, and the within-subject design, while appropriate for this pilot phase, precludes definitive causal conclusions. Future research with larger sample sizes and randomized controlled designs will be essential to establishing evidence-based protocols for robot-assisted education in ASD populations. Additionally, the current study focused on short-term effects; longitudinal research is needed to assess the durability of intervention benefits and optimal implementation strategies for sustained educational improvements.