
Framework for LLM-Driven NPCs in Educational Games
Exploring how Large Language Models can introduce agency and dynamic NPC-driven narratives
Overview
A collaborative research project with PhD researcher and game designer Scott DeJong to develop a testing framework that lets non-technical designers create LLM-driven NPCs. The system supports rapid prototyping of dynamic interactions that balance learning objectives with player agency in educational games.
Context & Challenge
Media literacy games struggle to balance structured learning objectives with meaningful player agency, often relying on linear narratives that limit student exploration. As misinformation proliferates, there is a growing need for engaging educational tools that teach middle and high school students to critically evaluate sources and recognize bias.
The educational game Zarhi's Missing faced a critical challenge: creating dynamic, scalable NPC interactions for 30+ simultaneous student players while maintaining pedagogical alignment with media literacy goals. Traditional static narratives couldn't adapt to diverse student questioning or provide the emergent storytelling needed for authentic learning about journalism, bias, and fact-checking.
Large Language Models offered a solution but introduced new challenges: preventing hallucinations that could undermine learning objectives, maintaining narrative consistency across multiple student groups, and enabling non-technical educators to rapidly prototype and refine NPC behaviors without engineering expertise.
User Pain Points
Educators struggled to create scalable, dynamic media literacy experiences without extensive technical expertise or a heavy classroom facilitation burden.
Students disengaged from static narratives that offered little agency to explore questions about journalism, bias, and fact-checking.
Technical Challenges
Preventing LLM hallucinations while maintaining factual consistency across multiple simultaneous student interactions.
Synchronizing a multi-layered memory architecture (session, character, global) without context loss or prompt overload.
Balancing character authenticity with knowledge boundaries and pedagogical goals through effective prompt engineering.
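To make the challenges above more concrete, here is a minimal sketch of how the three memory layers (global, character, session) might be merged into one bounded prompt with an explicit knowledge boundary. It is an illustration under stated assumptions, not the project's actual implementation: the names `NPCMemory` and `build_prompt`, the field layout, the wording of the instructions, and the character-based budget `max_chars` are all hypothetical.

```python
from dataclasses import dataclass, field

# Label that introduces the trimmed conversation history (hypothetical wording).
HISTORY_LABEL = "Conversation so far:"


@dataclass
class NPCMemory:
    """The three layers named in the text; this field layout is an assumption."""
    global_facts: list[str] = field(default_factory=list)     # shared by every NPC and group
    character_facts: list[str] = field(default_factory=list)  # what this one NPC knows
    session_turns: list[str] = field(default_factory=list)    # one student group's dialogue


def build_prompt(persona: str, memory: NPCMemory, question: str,
                 max_chars: int = 1200) -> str:
    """Assemble a prompt: fixed facts first, then as many recent turns as fit."""
    header = [
        f"You are {persona}.",
        "Answer only from the facts listed below.",
        "If the answer is not in them, say you do not know.",
        "Facts:",
    ]
    facts = [f"- {fact}" for fact in memory.global_facts + memory.character_facts]
    footer = [f"Student: {question}"]

    # Budget left for session history once the fixed parts are reserved.
    fixed = header + facts + [HISTORY_LABEL] + footer
    budget = max_chars - sum(len(line) + 1 for line in fixed)

    # Walk the history backwards so the newest turns survive trimming.
    kept: list[str] = []
    for turn in reversed(memory.session_turns):
        cost = len(turn) + 1  # +1 for the joining newline
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    kept.reverse()

    history = [HISTORY_LABEL] + kept if kept else []
    return "\n".join(header + facts + history + footer)
```

In this sketch the global and character layers are always included, because they carry the facts the NPC is allowed to state, while the session layer is trimmed oldest-first to avoid prompt overload. A real system would likely count tokens rather than characters and would need a strategy for summarizing dropped turns; both are left out here.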
Development and Testing Workflow
