Large language model (LLM)-based debugging training significantly boosts K-12 students' programming debugging skills, computational thinking, and self-efficacy through interactive dialogue.
Objective: The main goal of this study was to evaluate the effectiveness of incorporating a large language model (LLM) into programming debugging training for K-12 students, specifically examining its impact on students' programming debugging abilities, computational thinking, and self-efficacy compared to conventional teaching methods.
Methods: The researchers conducted a quasi-experiment with 80 sixth-grade students from an elementary school in China, who were randomly assigned to either an experimental group (40 students) using LLM-based programming debugging training or a control group (40 students) using conventional debugging methods. The study lasted four weeks, during which both groups participated in eight sessions focusing on debugging variables, control structures, loop structures, and program functions. Assessment was conducted at three time points: before intervention (T0), immediately after intervention (T1), and one month post-intervention (T2). Evaluation measures included programming debugging tests, computational thinking assessments, self-efficacy questionnaires, and semi-structured interviews. Data were analyzed using linear mixed-effects modeling to examine changes in student performance over time.
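For readers who want to see how such a group-by-time analysis is typically set up, the sketch below fits a linear mixed-effects model with a random intercept per student. It is a minimal illustration only, assuming Python with pandas and statsmodels; the paper does not report its analysis software, and the file and column names (debugging_scores_long.csv, student_id, group, time, debug_score) are hypothetical.

```python
# Minimal sketch of a group x time linear mixed-effects analysis,
# assuming Python with pandas and statsmodels; the paper does not
# specify its software, and all names here are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per student per time point,
# with columns student_id, group ("LLM"/"control"),
# time ("T0"/"T1"/"T2"), and an outcome such as debug_score.
df = pd.read_csv("debugging_scores_long.csv")

# Fixed effects: group, time, and their interaction; a random
# intercept per student accounts for the repeated measures.
model = smf.mixedlm(
    "debug_score ~ C(group, Treatment('control')) * C(time, Treatment('T0'))",
    data=df,
    groups=df["student_id"],
)
result = model.fit()
print(result.summary())
```

In a model of this form, the group-by-time interaction coefficients correspond to the kind of between-group differences at T1 and T2 reported in the Key Findings below.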
Key Findings:
- Students in the LLM group showed significantly greater improvement in programming debugging ability than the control group, most notably at the one-month follow-up, where they scored 2.587 units higher (p=.007).
- The LLM group demonstrated significantly improved computational thinking on the delayed test (T2), with scores 8.46 units higher than the control group (p=.039), while the control group showed no significant improvement in computational thinking.
- Programming self-efficacy increased significantly in the LLM group immediately after intervention (0.41 units higher than the control group, p=.034), but this advantage diminished by the one-month follow-up.
- Interviews revealed that students in the LLM group reported deeper understanding of programming concepts, improved problem-solving skills, increased learning interest, and appreciated the immediate feedback provided by the LLM.
- Control group students frequently mentioned a lack of guidance, boredom, and difficulty debugging because of their limited syntax knowledge.
Implications: The findings demonstrate that LLM integration in programming education can significantly enhance the teaching and learning experience for novice programmers at the K-12 level. The dialogic interaction with the LLM provides personalized, immediate feedback that helps students better understand programming concepts, identify errors, and develop problem-solving strategies. This approach aligns with Vygotsky's Zone of Proximal Development theory, where the LLM serves as a scaffold to help students achieve higher levels of understanding. The study contributes to the growing field of AI in education by providing empirical evidence for the effectiveness of LLMs as educational tools, particularly for programming education where immediate and personalized feedback is crucial for learning complex concepts.
Limitations: The study acknowledges several limitations. The sample was restricted to one grade level in a single elementary school in China, potentially limiting the generalizability of the findings. Additionally, there was relatively little process data that could provide deeper insight into how the LLM specifically affected the programming debugging process. The researchers also noted that the effects on programming self-efficacy appeared to be transient, suggesting that maintaining long-term motivation remains a challenge even with LLM support.
Future Directions: The researchers suggest several avenues for future research:
- Examining a more diverse sample of novice students across different grade levels and regions to explore the impact of LLM-based programming debugging methods on various demographics.
- Collecting and analyzing more detailed process data during the learning experience to better understand how LLMs influence specific aspects of programming debugging.
- Recording student behavioral data to identify and address issues with the platform through user feedback and usage data analysis.
- Incorporating the latest AI technologies to improve the LLM's interaction capabilities and response accuracy for more personalized debugging recommendations.
- Establishing ongoing online learning communities where students can share experiences and support each other to maintain long-term self-efficacy and engagement.
- Ensuring alignment with established learning theories like constructivism and self-efficacy theory when applying LLMs in programming education.
Title and Authors: "Employing large language models to enhance K-12 students' programming debugging skills, computational thinking, and self-efficacy" by Shu-Jie Chen, Xiaofen Shan, Ze-Min Liu, and Chuang-Qi Chen.
Published On: 2025
Published By: Educational Technology & Society, 28(2), 259-278.