Sumeru AI Updates Mugen3D to Turn a Single Photo Into a Live 3D Teacher

Deployed across multiple universities, the platform turns a single photo into a live 3D teacher that speaks, gestures, and responds in real time.

SHENZHEN, China -- Sumeru AI announced a major update to Mugen3D, its real-time interactive 3D content engine, enabling users to turn one photograph and a voice sample into a talking, emoting 3D human ready for live conversation. The technology is already deployed in university classrooms across China, marking an early operational use of geometry-based spatial AI in education.

From a Single Photo to a Physics-Aware 3D Asset

A user uploads one photograph. Within minutes, proprietary geometric algorithms combined with 3D Gaussian Splatting (3DGS) produce a true-to-source model preserving facial structure, hair, fabric texture and surface lighting at 4K resolution. A single unified pipeline handles generation across humans, objects and scenes.

The update shifts Mugen3D from reconstructing the body to enabling real-time interaction. SumeruAI, the company’s interaction engine, connects the 3D figure to voice input, multilingual dialogue, role-based knowledge and a speech-to-face animation pipeline with under 150 milliseconds of latency. The result is not a prerendered loop or rigid avatar, but a live persona that speaks, reacts, answers questions and converses.

Geometry-First Architecture

Sumeru AI built its generation model on eight RTX 5090 GPUs. Generating and optimizing a single asset takes minutes on one RTX 5090, and real-time interaction runs on standard consumer hardware, including smartphones.

Image 2: Compute & Cost Efficiency infographic — comparison of traditional video generation versus Mugen3D geometry pipeline

Because Mugen3D produces spatial geometry rather than flat pixels, each asset is generated once and can then be reused, re-rendered and redeployed in any rendering environment without repeating the generation step. Video platforms, by contrast, require continuous high-cost inference to sustain output. This reusability makes the geometry-based approach practical for deployments that require consistent, repeatable interaction at scale.
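
The generate-once argument can be read as a simple break-even calculation. The sketch below is illustrative only: the function name, its parameters and the numbers in the usage note are assumptions made for this example, not figures published by Sumeru AI. It compares a one-time generation cost plus a small per-use rendering cost against a pipeline that pays full inference cost on every use.

```python
import math


def break_even_uses(one_time_generation_cost, render_cost_per_use, inference_cost_per_use):
    """Return the smallest number of uses at which generating an asset once
    is strictly cheaper than paying for inference on every use.

    All three costs are hypothetical inputs in the same arbitrary unit.
    Returns None if per-use rendering is not cheaper than per-use inference,
    in which case the one-time cost is never recovered.
    """
    saving_per_use = inference_cost_per_use - render_cost_per_use
    if saving_per_use <= 0:
        return None
    # Generate-once total:   G + n * r
    # Per-use inference:     n * v
    # Strictly cheaper when  n > G / (v - r)
    return math.floor(one_time_generation_cost / saving_per_use) + 1
```

With an assumed one-time cost of 100 units, a rendering cost of 1 unit per use and an inference cost of 5 units per use, break_even_uses(100, 1, 5) returns 26: from the 26th use onward, the reusable asset is the cheaper option under these assumptions.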

Real-World Deployment

In its latest public demo, Sumeru AI showcases a Math Teacher persona generated from a real photograph. The avatar explains concepts, answers live questions and switches languages on demand.

Image 3: Classroom deployment — students interacting with 3D AI teacher on large display

Multiple universities across China already use the platform. Nearly 1,000 educators and tens of thousands of students now interact with Sumeru AI’s digital teaching personas at these institutions.

On June 17, 2026, Sumeru AI earned second place in the AI Agent track, the “Second Brain” Challenge, at the 36Kr National AI+ Scenario Application Competition during WAVES 2026 in Guangzhou.

Image 4: Character Plaza presentation at 36Kr WAVES 2026 competition by Dr. Cheng Feng, CEO of Sumeru AI

Image 5: Robot Human-Likeness Test demo at 36Kr WAVES 2026

“Most image-to-3D tools stop at the render. We treat that as the starting point,” said Dr. Cheng Feng, CEO of Sumeru AI. “World models cannot be built on flat video. Reality is 3D. Whether training robots in simulated rooms or teaching students in virtual classrooms, you need geometry with physical bounds, not hallucinated pixels. A 3D teacher can explain a concept, answer a follow-up and remain with a student until the idea sticks. That is presence, not content.”

Mugen3D replaces manual sculpting, rigging and animation with a photo-driven workflow that cuts production time while maintaining precise visual reconstruction. The pipeline supports Unity, Unreal Engine and WebGL environments, positioning the platform for education, robotics simulation, spatial computing, 3D printing and interactive entertainment.

Sumeru AI was founded by Dr. Cheng Feng, a physicist and University of Leicester PhD who previously built a computer vision company to a 1.5 billion RMB valuation, and Dr. Tomohiro Nagasaka, a Kyoto University mathematics PhD and International Mathematical Olympiad medalist who built China’s first optical motion capture system. The company completed a ten-million-yuan angel round in 2025 and joined the Microsoft China Innovation Center and NVIDIA Inception programs in 2024.

Mugen3D is available at www.sumeruai.us/mugen3d.

About Sumeru AI

Sumeru AI, also known as Shenzhen Quxiang Spacetime Technology, builds spatial intelligence infrastructure for education, robotics and spatial computing. Its Mugen3D platform turns images and voice into precise 3D assets and conversational digital personas capable of real-time interaction.
