They Beat Open AI to the Punch... But at What Cost?

MattVidPro AI
3 Jul 2024 21:48

🤖 GPT-4 Omni Demo and Moshi AI Introduction

The video opens with a discussion of the GPT-4 Omni voice demo, which showcased an AI that could understand and mimic human emotion in conversation. The host is disappointed that GPT-4 Omni's voice mode is not yet available to the public and introduces Moshi AI, a similar technology that anyone can test today. Moshi AI is a multimodal model that can listen and speak in real time, although it is not as advanced as GPT-4 Omni. According to the video, Moshi AI is built on joint pre-training on text and audio synthetic data and will be released as open source, which would let the community improve its capabilities.


🎤 Testing Moshi AI's Emotional Recognition and Singing Abilities

The host then tests Moshi AI's ability to recognize emotions and to sing. Although Moshi AI is said to understand emotions, it fails to identify the host's emotional state during the conversation. The host also challenges Moshi AI to sing a song about butterflies, which it attempts with limited success. Frustrated with the results, the host compares Moshi AI unfavorably to other AI models such as Pi AI and ChatGPT.


🔊 Comparing Moshi AI with Other AI Models

In this section, the host compares Moshi AI with two other AI models, Pi AI and ChatGPT. While acknowledging that Moshi AI is less advanced, the host is interested in its potential once it becomes open source. Pi AI produces a song about butterflies with a more realistic voice and better lyrics. The host also asks ChatGPT for a butterfly song, noting that it cannot sing but can write the lyrics.


📝 Moshi AI's Struggle with Storytelling and Emotional Understanding

The host asks Moshi AI for help writing a story, seeking advice on how to structure the narrative. Moshi AI offers only generic advice about protagonists and challenges. The host then asks Moshi AI to identify the emotions conveyed by their voice. Moshi AI tries but is often wrong, which leads to a humorous exchange: when it finally names the right emotion, it does so only after the host has revealed it, and the host accuses it of 'cheating'.


🌐 Hopes for Moshi AI's Open Source Future

The video concludes with the host reflecting on Moshi AI's potential once it becomes open source. The host is optimistic that the community can improve Moshi AI, making it more usable and more competitive with technologies like GPT-4 Omni. While acknowledging Moshi AI's current limitations, the host remains positive about its future development and invites viewers to share their thoughts.



