"Building multimodal human-machine-human interaction is similar to how a child learns empathy — through observation, interpretation, and meaningful response."
Introduction
H-M-H Interactions — In this project, I was tasked with designing the entire communication process between human and machine — not just from a UX standpoint, but also on the architectural level, involving deep cognitive science research, AI interpretation models, and real-time system behavior.
By studying how humans communicate through voice, facial expression, gesture, intonation, and emotional delivery, I created a blueprint for how AI can perceive, decode, and respond empathetically. I led research, modeling, and design efforts — ultimately delivering an adaptive, emotion-aware communication framework based on multimodal input analysis.
Industry
B2C Mental Healthcare
Technology
Web, AEM, RIA
Teams
Cross-functional AI & Cognitive
Project
Agile, Scrum
Input AI
The “Emotions Tracker” module allows for real-time monitoring of the mood and emotional states of a patient experiencing depression. Information collected in real time (e.g., well-being, levels of anxiety, sadness, tension) is analyzed to detect sudden emotional fluctuations and responses to everyday life events.
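To make the fluctuation detection concrete, here is a minimal Python sketch of how sudden shifts between consecutive tracker readings could be flagged. The `EmotionSample` fields and the 0.3 threshold are illustrative assumptions, not the production schema.

```python
from dataclasses import dataclass

@dataclass
class EmotionSample:
    """One timestamped reading from the Emotions Tracker (field names are illustrative)."""
    timestamp: float   # seconds since session start
    wellbeing: float   # 0.0 .. 1.0
    anxiety: float     # 0.0 .. 1.0
    sadness: float     # 0.0 .. 1.0
    tension: float     # 0.0 .. 1.0

def detect_fluctuation(prev: EmotionSample, curr: EmotionSample, threshold: float = 0.3) -> bool:
    """Flag a sudden emotional shift when any dimension jumps by more than
    `threshold` between two consecutive samples. The threshold is a placeholder value."""
    deltas = (
        abs(curr.wellbeing - prev.wellbeing),
        abs(curr.anxiety - prev.anxiety),
        abs(curr.sadness - prev.sadness),
        abs(curr.tension - prev.tension),
    )
    return max(deltas) > threshold
```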
The SensumAI mental model integrates data from the Emotions Tracker (emotional variability), Emotional Traits, and Personality Profile modules. This integration enables the creation of a coherent picture of the patient’s psychological functioning, supporting both the treatment process and the ongoing monitoring of therapy progress.
The “Emotional Traits” module focuses on the patient’s long-term emotional predispositions, such as a tendency toward pessimism, elevated levels of anxiety, or difficulties in mood regulation.
The “Personality Profile” module analyzes the patient’s stable traits, e.g., based on the Big Five model or other standardized questionnaires. This data enables a deeper understanding of character predispositions and helps identify areas that may influence the effectiveness of therapy.
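One way the outputs of the three Input AI modules could be represented before they feed the SensumAI mental model is sketched below; the dataclass names, fields, and Big Five keys are assumptions for illustration only, not the actual data schema.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class StateEmotions:
    """Momentary scores from the Emotions Tracker."""
    scores: Dict[str, float]  # e.g. {"anxiety": 0.7, "sadness": 0.4}

@dataclass
class EmotionalTraits:
    """Long-term predispositions from the Emotional Traits module."""
    pessimism: float
    baseline_anxiety: float
    mood_regulation_difficulty: float

@dataclass
class PersonalityProfile:
    """Stable traits, here expressed on the Big Five dimensions."""
    openness: float
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

@dataclass
class MentalModel:
    """Integrated picture of the patient's psychological functioning."""
    state: StateEmotions
    traits: EmotionalTraits
    personality: PersonalityProfile
```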
Output AI
This diagram visualizes the key personal characteristics that influence how an individual experiences and expresses empathy. The four axes represent interconnected dimensions of psychological functioning:
State Emotions (SE): momentary emotional responses in specific situations,
Trait Emotions (TE): long-term emotional tendencies and predispositions,
Personality Profile (PP): stable traits shaping how a person relates to others and processes emotions,
Mental Model (MM): an integrated representation of the individual’s emotional and cognitive patterns.
Together, these elements form a personalized map of how empathy manifests in a given individual — both as a reaction to others and as a reflection of their inner world.
Core TherapistAI Model (CTM) reflects the patient’s stable personality and character traits. Based on comprehensive data — including the patient’s personality profile, emotional traits, and character structure — the system selects one of the primary communication profiles tailored to long-term predispositions.
State TherapistAI Models (STM) dynamically adjust to the patient’s current emotional state. These adaptive profiles are triggered by real-time emotional responses and may shift throughout the session to ensure optimal therapeutic alignment.
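A simplified sketch of this two-tier selection, assuming a core profile chosen once from stable traits and a state profile re-evaluated during the session; the profile names and threshold values are hypothetical.

```python
def select_core_profile(neuroticism: float, extraversion: float) -> str:
    """Pick a long-term communication profile (CTM) from stable traits.
    Profile names and cut-offs are illustrative only."""
    if neuroticism > 0.6:
        return "gentle-reassuring"
    if extraversion > 0.6:
        return "engaged-conversational"
    return "calm-structured"

def select_state_profile(anxiety: float, sadness: float) -> str:
    """Pick a session-level adjustment (STM) from the current emotional state."""
    if anxiety > 0.7:
        return "slow-pace-grounding"
    if sadness > 0.7:
        return "warm-validating"
    return "neutral-supportive"

# During a session the STM may shift while the CTM stays fixed:
core = select_core_profile(neuroticism=0.72, extraversion=0.40)  # -> "gentle-reassuring"
state = select_state_profile(anxiety=0.81, sadness=0.35)         # -> "slow-pace-grounding"
```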
Mental Health Care
SensumAI can be used to detect emotional distress from speech and facial expressions, surfacing early warning signs of mental health issues such as depression or anxiety and enabling timely intervention.
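As a rough illustration of that use case, the sketch below fuses per-channel negativity estimates from speech and facial analysis into a single early-warning flag; the linear weighting and threshold are placeholder choices, not the system's actual fusion logic.

```python
def distress_score(speech_negativity: float, facial_negativity: float,
                   speech_weight: float = 0.5) -> float:
    """Fuse per-channel negativity estimates (0..1) into one distress score.
    The linear weighting stands in for a more elaborate fusion step."""
    return speech_weight * speech_negativity + (1.0 - speech_weight) * facial_negativity

def needs_early_intervention(score: float, threshold: float = 0.65) -> bool:
    """Raise an early-warning flag when distress crosses a threshold (value illustrative)."""
    return score >= threshold
```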
Live Testing
Me, testing the SensumAI Console. During the test, the console displays real-time analysis of empathic data. The AI’s output is highlighted in red for clear visibility.
Approach
My approach combined cognitive science research, data-driven interaction modeling, and custom neural architecture design. Here’s how I structured the solution:
- I mapped multimodal human communication into formal interaction models, identifying signals such as micro-expressions, gesture patterns, linguistic structure, and voice dynamics.
- I defined the processing and decision layers: how the AI would classify emotions (both dynamic and static), detect underlying personality traits, and adapt its response strategy.
- I architected a system using four core neural network models responsible for real-time interpretation, along with four auxiliary models that analyzed emotional context and personality mapping.
- The auxiliary models communicated among themselves, and their collective output was used to select and configure the primary model best suited to the current interaction (a simplified sketch of this consensus step follows the list).
- This configuration enabled adaptive empathy: the AI could adjust tone, language, and pacing to align with the user's current emotional and cognitive state.
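A minimal sketch of the consensus step, assuming each auxiliary model emits a small configuration proposal and a per-key majority vote determines the primary model's setup; all class names, keys, and values are illustrative.

```python
from collections import Counter
from typing import Dict, List

def select_primary_config(auxiliary_votes: List[Dict[str, str]]) -> Dict[str, str]:
    """Combine the auxiliary models' outputs by majority vote per parameter.

    Each auxiliary model contributes a proposal such as
    {"model": "empathic-dialog", "tone": "warm", "pacing": "slow"};
    the most frequent value wins for every key. Tie-breaking and
    confidence weighting are omitted in this sketch."""
    config: Dict[str, str] = {}
    keys = {k for vote in auxiliary_votes for k in vote}
    for key in keys:
        values = [vote[key] for vote in auxiliary_votes if key in vote]
        config[key] = Counter(values).most_common(1)[0][0]
    return config

votes = [
    {"model": "empathic-dialog", "tone": "warm", "pacing": "slow"},
    {"model": "empathic-dialog", "tone": "warm", "pacing": "moderate"},
    {"model": "coaching-dialog", "tone": "warm", "pacing": "slow"},
    {"model": "empathic-dialog", "tone": "neutral", "pacing": "slow"},
]
primary = select_primary_config(votes)
# -> {"model": "empathic-dialog", "tone": "warm", "pacing": "slow"}
```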
Challenge
H-M-H Interactions — the main challenge was to go far beyond voice-based interaction and build a true empathy engine, one that understands not just what is being said, but how and why. That meant designing an AI capable of interpreting non-verbal cues, tone, emotional variability, and deeper psychological traits. The system needed to adapt to user personality types, learn over time, and produce a response style that was natural, respectful, and emotionally intelligent, all while processing inputs across audio, video, and text channels in real time.
Services
- Cognitive interaction research (non-verbal & multimodal communication)
- UX strategy for AI-human dialog systems
- Neural architecture design for emotion-aware systems
- Mapping behavioral signals into system logic
- Data analysis for audio, video, and text-based inputs
- System logic for emotion recognition, personality detection & mental modeling
- Prototyping empathic dialog behavior
- Response orchestration based on model consensus