MindLens·Lab
Originating paper

Artificial Intelligence in Emotional Intelligence Training for Autism.

Kim, E. (2025). Curieux Academic Journal, July 2025, Part 2 Issue, pp. 394–410.

Publication

Published in the Curieux Academic Journal (July 2025), and on the UNESCO Learning Planet Institute Youth Fellow platform.

1 · What the paper asked

Do AI tools actually help autistic users read emotion?

By 2024 the field had produced a generation of AI-driven tools meant to help autistic users — children especially — build emotional intelligence. The promise was clear: turn ambiguous social moments into trainable, predictable signals. The review asked the simpler version of the question: across the actual tools available, are they delivering on that promise, and where are they running into trouble?

The review covered four tools, picked to span the design space: a wearable (EmotionNet Nano), a classroom platform (FECTS), a recognition engine (Affectiva), and a social robot (RoboKind's Zora). The goal wasn't to rank them — it was to find what kept failing across all of them.

2 · What it found

The same four limitations, across all four tools.

Despite their very different surfaces, the four tools all ran into the same four problems. None of these were implementation bugs — they were structural to how the field built emotion-AI for autism support.

Finding i

The annotation bottleneck

Every tool was trained on emotion-labelled data — and the labels assumed one correct emotion per moment. Annotators were instructed to pick the “right” emotion, and divergent annotations were treated as noise to be averaged or excluded. But real readings of social moments are plural and culturally inflected; the single-label assumption builds the wrong foundation. Every downstream claim — accuracy, transferability, generalization — inherits that assumption.
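The difference between single-label and plural annotation can be made concrete with a minimal sketch. The clip, its labels, and the function names below are hypothetical illustrations, not data or code from the paper or from any of the four tools; the sketch only shows what information a majority vote discards and what a label distribution keeps.

```python
from collections import Counter
from math import log2


def majority_label(annotations):
    """Single-label aggregation: keep the most common label, drop the rest."""
    return Counter(annotations).most_common(1)[0][0]


def label_distribution(annotations):
    """Plural aggregation: keep every reading as a share of annotators."""
    counts = Counter(annotations)
    total = len(annotations)
    return {label: n / total for label, n in counts.items()}


def disagreement_bits(distribution):
    """Shannon entropy in bits; zero means the annotators were unanimous."""
    return sum(-p * log2(p) for p in distribution.values() if p > 0)


# Hypothetical annotations for one social moment (illustrative only).
clip = ["surprised", "surprised", "afraid", "confused"]

single = majority_label(clip)       # one "right" answer
plural = label_distribution(clip)   # every reading, with its share
spread = disagreement_bits(plural)  # how contested the moment was

print(single)
print(plural)
print(spread)
```

Under the single-label assumption, the "afraid" and "confused" readings vanish from the training data. The distribution keeps them, and the entropy value gives one possible way to flag moments that annotators genuinely read differently.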

Finding ii

Short-term gains, weak transfer

Users improved on the trained tasks, and that part was real. But the gains rarely transferred to messy, multimodal real-world interactions: a child who could identify “surprised” on a flashcard couldn't always identify it on a classmate's face. The skill stayed inside the training environment. The paper read this as a structural mismatch between the controlled training conditions and the ambiguous, contextual conditions where the skill is actually used.

Finding iii

Overdependence on AI feedback

Users — children included — increasingly deferred to algorithmic suggestions, even when those suggestions were demonstrably wrong. Tools designed to scaffold independent emotional judgement risked substituting for it instead. The paper called this an under-recognized cost of high-frequency AI feedback in developmental contexts.

Finding iv

Unresolved ethical concerns

Emotion data sits in a regulatory gap. GDPR doesn't explicitly name it among its special categories of personal data, and most national frameworks haven't caught up to what it means to collect, store, and potentially share affective signals from minors. The paper argued that the field had moved faster on capability than on the frameworks needed to use that capability responsibly.

3 · What it recommended

AI should complement — not replace — human-mediated learning.

The paper's central recommendation was explicit. AI tools for emotional intelligence training are most useful when they sit alongside teachers, caregivers, and peers — augmenting context-sensitive human judgement rather than acting as a substitute for it. Two corollaries followed:

  • Training data must reflect plural readings. Future systems should be trained on data that captures the natural variability in how humans read emotion — including autistic-user-specific patterns where they differ from the neurotypical baseline — rather than averaging that variability into a single answer.
  • Ethical scaffolding should precede deployment. Frameworks for affective data — especially when collected from minors and from clinical populations — need to mature before any large-scale rollout.

4 · What came next

From paper to platform.

The recommendations pointed at training data that didn't yet exist. So the next step wasn't another tool — it was the substrate. MindLens Lab Phase 1 is the empirical follow-up: a dataset of plural human readings, structured so that later phases can build on it responsibly.

For the personal story of how that decision was made — the intent, the intermediate exploration, the wall, the pivot — see the Project story.

For the formal pre-registered methodology of the current work, see the Pre-registered research plan.