The 2nd MLC-SLM Challenge 2026 Opens Registration with a USD 20,000 Prize Pool

by Joseph Wilson May 8, 2026

The 2nd Multilingual Conversational Speech Language Models Challenge 2026, also known as MLC-SLM 2026, is now open for registration. This year’s challenge features a total prize pool of USD 20,000 and invites academic teams, industry teams, and individual researchers to advance Speech Large Language Models for real-world multilingual conversational speech.

The 2nd MLC-SLM Challenge focuses on key capabilities required for next-generation Speech LLMs, including speaker diarization, speech recognition, and conversational speech understanding. Participants will work with multilingual, multi-speaker conversational speech data designed to reflect real-world dialogue scenarios.

2nd MLC-SLM Challenge Offers:

Free registration
Free access to a large-scale multilingual conversational speech dataset for registered participants, featuring around 2,100 hours of data across 14 languages
A total prize pool of USD 20,000
Support for both academic and industry teams, as well as individual researchers

Following the success of the first MLC-SLM Challenge, which attracted 78 teams from 13 countries and regions, the 2026 edition introduces a larger and more diverse dataset. The first challenge also received 489 valid leaderboard submissions and 14 technical reports, and its summary paper has been accepted by ICASSP 2026.

Challenge Tasks

Participants can join two tasks:

Task 1: Multilingual Conversational Speech Diarization and Recognition

Participants are required to build systems that can identify who is speaking when and transcribe multilingual conversational speech. During evaluation, no oracle segmentation or speaker labels will be provided, making the task closer to real-world speech processing scenarios.

Task 2: Multilingual Conversational Speech Understanding

Participants are required to build systems that understand multilingual conversations using both acoustic and semantic information. Evaluation will be based on multiple-choice questions about the full conversation, testing the model’s ability to capture meaning, context, and speaker-level information.

Both pipeline-based systems and end-to-end Speech LLM systems are welcome. External datasets and pretrained models are allowed, as long as they are freely accessible and clearly reported.

Dataset Highlights

The challenge dataset contains around 2,100 hours of two-speaker conversational speech across 14 languages, including English, French, German, Italian, Portuguese, Spanish, Japanese, Korean, Russian, Thai, Vietnamese, Tagalog, Urdu, and Turkish.

The dataset also includes diverse regional accents, such as Canadian French, Mexican Spanish, Brazilian Portuguese, British English, American English, Australian English, Indian English, and Philippine English.

This makes the MLC-SLM Challenge a valuable benchmark for researchers working on multilingual ASR, speaker diarization, Speech LLMs, and spoken language understanding.

Registration

Registration is now open. Participation is free, and the dataset will be provided free of charge to registered participants.

Registration Link: https://forms.gle/jfAZ95abGy4ZiNHo7

Official Website: https://www.nexdata.ai/competition/mlc-slm

Contact Email: mlc-slmw@nexdata.ai

Join the 2nd MLC-SLM Challenge 2026 and help advance the next generation of multilingual Speech Large Language Models.

MLC-SLM Organizing Committee

NEXDATA TECHNOLOGY INC.

Joseph Wilson

Joseph Wilson is a veteran journalist with a keen interest in covering the dynamic worlds of technology, business, and entrepreneurship.
