GAI Versus Teacher Scoring: Which is Better for Assessing Student Performance?

Li, X.; Zappatore, M.; Li, T.; Zhang, W.; Tao, S.; Wei, X.; Zhou, X.; Guan, N.; Chan, A.

doi:10.1109/TLT.2025.3572518

The integration of generative artificial intelligence (GAI) into educational settings offers new opportunities to improve the efficiency of teaching and the effectiveness of learning, particularly on online platforms. This study evaluates the development and application of a customized GAI-powered teaching assistant, trained specifically to improve teaching efficiency for educators and learning outcomes for students in online education. Using four Grade 12 courses (English, Mathematics, Financial Accounting, and Simplified Chinese), we assessed the performance of generative pretrained transformer (GPT)-4, GPT-4o, and the Trained-GPT model. Results demonstrate that the Trained-GPT achieved grading accuracy and consistency comparable to those of human teachers, with strong correlations observed in Mathematics (0.996) and English (0.874). While GPT-4o performed well in specific cases, its variability highlights areas for improvement. These findings underscore the potential of AI-powered teaching assistants to streamline grading, deliver timely feedback, and support scalable, high-quality online education.
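
The abstract reports correlations for Mathematics and English without stating which coefficient was used. As a rough illustration only, the sketch below assumes a Pearson correlation between teacher-assigned and model-assigned scores for the same submissions. The function name `pearson_r` and the score lists are hypothetical and invented for this example; they are not data or code from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five submissions (illustration only, not study data).
teacher_scores = [78, 85, 92, 64, 70]
model_scores = [80, 84, 90, 66, 73]

r = pearson_r(teacher_scores, model_scores)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 indicates that the model ranks and spaces submissions much as the teacher does. It does not by itself show that the absolute scores agree, so a measure of score difference would usually be reported alongside it.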