Sesame Unveils Groundbreaking AI Model Behind Viral Virtual Assistant Maya

Sesame, the AI company behind the viral and strikingly realistic voice assistant Maya, has released the base model that powers it. The release has generated excitement in the tech community, in part because of the model's potential applications across industries.

Introducing CSM-1B: The Backbone of Maya

The newly released model, CSM-1B, has 1 billion parameters, the learned weights that determine how a model behaves, and it generates audio from text and audio inputs. It is released under the Apache 2.0 license, which permits commercial use with few restrictions, making it an attractive option for developers.
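
As a rough, back-of-envelope illustration of what 1 billion parameters means in practice, the sketch below estimates the memory needed just to hold the model weights at common numeric precisions. It assumes exactly 1,000,000,000 parameters (the "1B" in the name is nominal), counts weights only (no activations or runtime overhead), and uses a hypothetical helper name, weight_memory_gib.

```python
# Assumption: the parameter count is exactly 1e9; the real figure may differ.
PARAMS = 1_000_000_000

# Standard storage sizes per parameter for common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_memory_gib(PARAMS, nbytes):.2f} GiB")
```

Under these assumptions the weights take a little under 2 GiB at 16-bit precision, which is small enough to fit on many consumer GPUs.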

Understanding RVQ Audio Codes

CSM-1B generates RVQ audio codes. RVQ stands for residual vector quantization, a technique for encoding audio into discrete tokens, or codes, in which each quantization stage encodes the error left over by the previous one. The approach is used in a number of contemporary AI audio technologies.
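
To make the idea of encoding audio into discrete codes more concrete, here is a minimal sketch of residual vector quantization (the usual expansion of RVQ) in Python with NumPy. It is not Sesame's implementation: the two tiny codebooks, the 2-D "frame", and the function names rvq_encode and rvq_decode are all hypothetical and chosen purely for illustration. Real codecs learn far larger codebooks over high-dimensional audio features.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Quantize vector x with a stack of codebooks; return one index per stage."""
    residual = np.asarray(x, dtype=float)
    codes = []
    for cb in codebooks:
        # Pick the codebook entry nearest to what is still unexplained.
        dists = np.linalg.norm(cb - residual, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        # The next stage only has to encode the leftover error.
        residual = residual - cb[idx]
    return codes


def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the selected entry from each stage."""
    return sum(cb[i] for cb, i in zip(codebooks, codes))


# Toy codebooks with made-up values: a coarse stage followed by a fine stage.
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]]),
    np.array([[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]]),
]

frame = np.array([1.2, 0.9])          # stand-in for one frame of audio features
codes = rvq_encode(frame, codebooks)  # discrete tokens, one per stage
approx = rvq_decode(codes, codebooks)

print(codes)   # [1, 1]
print(approx)  # [1.25 1.  ]
```

In this toy case the first stage picks a coarse match and the second stage refines it, so the frame is represented by just two small integers. A generative model can then predict such integer codes instead of raw audio samples.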

Technical Details of CSM-1B

CSM-1B employs a model from Meta’s Llama family as its backbone, supplemented with an audio “decoder” component. According to Sesame, a fine-tuned version of this model powers the Maya voice assistant.

Sesame has emphasized that the open-sourced model is a base generation model: it can produce a variety of voices but has not been fine-tuned on any specific voice. It also has some capacity for non-English languages because of contamination in its training data, though it is unlikely to perform well in them.

Concerns Regarding User Safeguards

While the model showcases impressive capabilities, there are notable concerns regarding user safety. Sesame has not implemented significant safeguards and relies instead on an honor system. The company urges developers and users to refrain from:

  • Mimicking a person’s voice without consent
  • Creating misleading content, such as fake news
  • Engaging in harmful or malicious activities

User Experiences and Industry Insights

In the demo on Hugging Face, users reported that cloning a voice took less than a minute, after which it was easy to generate speech on any topic, including sensitive ones such as elections and propaganda. This capability has raised alarms, particularly since Consumer Reports has cautioned that many voice cloning tools lack effective safeguards against fraud and misuse.

About Sesame

Co-founded by Brendan Iribe, a co-founder of Oculus, Sesame has gained traction for voice assistant technology realistic enough to approach uncanny valley territory. Both Maya and Sesame’s other assistant, Miles, exhibit natural speech patterns, including breaths and disfluencies, and can be interrupted mid-speech, much like OpenAI’s Voice Mode.

Future Developments at Sesame

In addition to voice assistant technology, Sesame is prototyping AI glasses designed for all-day wear, which will incorporate its custom models. The company has secured funding from prominent investors including Andreessen Horowitz, Spark Capital, and Matrix Partners.

As Sesame continues to innovate, the intersection of AI technology and voice assistance promises to reshape the future of communication.
