Addressing the Feedback Bottleneck in Large Technical Courses: A Comparative Study of LLM-Assisted Assessment in Computer Architecture

Baumeister T, Abdennadher H, Fey D (2026)


Publication Language: English

Publication Type: Conference contribution

Publication year: 2026

Event location: Valencia, ES

Abstract

Providing timely, personalized feedback is critical for learning in technical higher education, yet remains a significant challenge in large courses. With over 100 students submitting complex assignments in domains like Computer Architecture, teaching staff struggle to deliver detailed, consistent feedback before its pedagogical value diminishes.

This study investigates whether Large Language Models (LLMs) can help address this feedback bottleneck. We comparatively evaluated four state-of-the-art models (ChatGPT-4o, Gemini 2.5 Pro, Claude Sonnet 4, and DeepSeek-V3) on 380 real, anonymized student submissions from an undergraduate Computer Architecture course covering RISC-V assembly programming, microprogramming, and pipeline analysis. Each submission was processed using four distinct prompt configurations, ranging from a basic structured rubric to a hybrid prompt combining reference solutions with feedback exemplars. The resulting 5,960 AI-generated feedback instances were then assessed through systematic human evaluation across six pedagogical criteria: Correctness, Clarity, Depth, Consistency, Usefulness, and Strictness.

Results demonstrate that current LLMs can produce technically accurate and pedagogically valuable feedback when supported by well-designed prompts. Gemini 2.5 Pro achieved the highest overall correctness (mean score of 8.71/10), while ChatGPT-4o exhibited the most balanced performance across all criteria. Claude Sonnet 4 demonstrated superior clarity and pedagogical tone, and DeepSeek-V3 provided a cost-efficient open-weight alternative. Critically, prompt engineering influenced feedback quality as strongly as model selection: enriched prompts improved correctness by up to 17%, with the hybrid prompt consistently yielding the best results across all models.

These findings suggest that LLM-assisted feedback systems offer substantial potential for addressing scalability challenges in technical education. Practical implications include deployment as draft feedback generators for teaching assistants, enabling consistent, detailed feedback for all students while preserving essential human oversight. With careful prompt engineering and transparent implementation, LLMs can help make personalized formative assessment achievable at scale.


How to cite

APA:

Baumeister, T., Abdennadher, H., & Fey, D. (2026). Addressing the Feedback Bottleneck in Large Technical Courses: A Comparative Study of LLM-Assisted Assessment in Computer Architecture. In Proceedings of the 20th annual International Technology, Education and Development Conference. Valencia, ES.

MLA:

Baumeister, Tobias, Hazem Abdennadher, and Dietmar Fey. "Addressing the Feedback Bottleneck in Large Technical Courses: A Comparative Study of LLM-Assisted Assessment in Computer Architecture." Proceedings of the 20th annual International Technology, Education and Development Conference, Valencia, 2026.
