Mandarin Learning Machine
Interactive System for Mandarin Pronunciation and Character Writing
Duration
2022
Researcher
Hanju Seo
Affiliation
- Imperial College London · Dyson School of Design Engineering
- Royal College of Art · School of Design
Context
Cyber Physical Systems 2022 · Imperial College London & Royal College of Art
Role
- Hardware Designer (modular frame, motor control, mechanical assembly)
- Embedded Systems Developer (Arduino motor control)
Collaborators
- Sabrina Tian
- Xinyi Zhou
Most language-learning tools present a character on a flat screen. The stroke order appears as a sequence of static lines, and the learner traces it with a finger or stylus. Mandarin Learning Machine proposes a different premise: that the physical emergence of a character in three-dimensional space, timed to the learner's own voice, creates a more direct and memorable connection between sound and form than a flat representation can offer.
The system works in two stages. First, the learner speaks into a microphone. A Wekinator machine learning model, trained on native speaker recordings of the four Mandarin tones, classifies the input in real time and determines whether the pronunciation is sufficiently close to the target. Only when the tone is correctly recognised does the second stage begin. A 5x5 grid of physical blocks, controlled by Arduino-driven motors, activates in sequence to render the corresponding character in three-dimensional space, stroke by stroke, according to correct writing order. The characters selected for the initial system, 一, 二, 三, 口, and 日, were chosen for their structural simplicity and the range of fundamental strokes they contain, making them well-suited to early learning stages.
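The two-stage logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cell coordinates, the `STROKES` table, and the `activation_sequence` function are hypothetical and are not taken from the project's Arduino code. The sketch shows how a tone check could gate the display, and how each character could be stored as an ordered list of strokes so that the blocks rise in writing order.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row, col) on the 5x5 grid, origin at top-left


def h_line(row: int, c0: int, c1: int) -> List[Cell]:
    """Cells of a horizontal stroke, written left to right."""
    return [(row, c) for c in range(c0, c1 + 1)]


def v_line(col: int, r0: int, r1: int) -> List[Cell]:
    """Cells of a vertical stroke, written top to bottom."""
    return [(r, col) for r in range(r0, r1 + 1)]


# Hypothetical layouts: one list of cells per stroke, in standard stroke order.
# The actual block positions used in the installation are not documented here.
STROKES: Dict[str, List[List[Cell]]] = {
    "一": [h_line(2, 0, 4)],
    "二": [h_line(1, 1, 3), h_line(3, 0, 4)],
    "三": [h_line(0, 1, 3), h_line(2, 1, 3), h_line(4, 0, 4)],
    # 口: left vertical, then top + right side as one turning stroke, then bottom
    "口": [v_line(0, 0, 4), h_line(0, 0, 4) + v_line(4, 1, 4), h_line(4, 0, 4)],
    # 日: same frame as 口, with the middle bar written before the closing bar
    "日": [
        v_line(0, 0, 4),
        h_line(0, 0, 4) + v_line(4, 1, 4),
        h_line(2, 1, 3),
        h_line(4, 0, 4),
    ],
}


def activation_sequence(char: str, tone_ok: bool) -> List[Cell]:
    """Return the order in which blocks should rise, or [] if the tone check failed."""
    if not tone_ok:
        return []  # stage two never starts without a recognised tone
    raised = set()
    order: List[Cell] = []
    for stroke in STROKES[char]:
        for cell in stroke:
            if cell not in raised:  # a block shared by two strokes rises only once
                raised.add(cell)
                order.append(cell)
    return order


if __name__ == "__main__":
    for cell in activation_sequence("口", tone_ok=True):
        print(cell)
```

Storing strokes rather than a flat bitmap keeps the writing order explicit, which is the property the physical display is meant to teach.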
The hardware was constructed as a modular frame designed for easy adjustment and component replacement. Mechanical prototypes were developed through iterative 3D printing, with each revision informed by testing of the movement mechanisms. Audio input was handled in a Processing sketch, with OSC packets sent to the Arduino for motor actuation. The result is a system in which the learner's voice directly drives a physical display: pronunciation and character formation are linked not through instruction but through embodied cause and effect.
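For readers unfamiliar with OSC, the following is a minimal sketch of how a single OSC message is laid out in bytes, following the OSC 1.0 encoding rules (null-terminated strings padded to four bytes, big-endian arguments). The address `/grid/cell` and the idea of sending a row and column per message are assumptions made for illustration; the project's actual message format is not documented here.

```python
import struct


def _pad(data: bytes) -> bytes:
    """Null-terminate an OSC string and pad it to a multiple of 4 bytes."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def encode_osc(address: str, *args) -> bytes:
    """Encode one OSC message with int32, float32, or string arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)  # big-endian 32-bit integer
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)  # big-endian 32-bit float
        elif isinstance(arg, str):
            tags += "s"
            payload += _pad(arg.encode("utf-8"))
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return _pad(address.encode("ascii")) + _pad(tags.encode("ascii")) + payload


if __name__ == "__main__":
    # Hypothetical message: raise the block at row 2, column 3.
    packet = encode_osc("/grid/cell", 2, 3)
    print(packet.hex(" "))
```

Every field is aligned to four bytes, which is what lets a small receiver parse the packet without a general-purpose decoder.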
Modular frame construction, 5x5 block grid mechanism, Wekinator tone classification interface, and system demonstration.