AI Just Got Insanely Better

Asmongold TV
14 May 2024
21:58

TLDR

The transcript showcases an engaging conversation about the advancements in AI, particularly with OpenAI's new model capable of interacting through audio, vision, and text. The discussion highlights the AI's ability to assist in real-time learning, as demonstrated through a tutoring scenario involving a math problem. The script also explores the AI's potential to interpret and respond to visual cues, as it describes a scene involving a person and their environment. The narrative delves into the AI's capacity for real-time translation, humor, and even sarcasm, indicating a significant leap in AI's ability to understand and replicate human-like interactions. The summary underscores the impressive progress in AI, suggesting a future where such technology becomes increasingly integrated into everyday life.


  • 🎉 The AI has significantly improved in terms of audio quality and interaction capabilities compared to previous years.
  • 📢 A new AI model has been announced that can interact with the world through audio, vision, and text.
  • 🤔 The AI's ability to understand and respond to human speech, even when using figures of speech, is impressive.
  • 👨‍👦 An example given is an AI tutoring a student in real-time, guiding them to understand a math problem without giving away the answer.
  • 👀 The AI can now 'see' the world through a camera, allowing it to describe what it 'sees' and interact with the environment more dynamically.
  • 🌟 The AI's real-time translation capabilities are showcased, including translating between English and Spanish.
  • 😄 The AI can analyze a person's emotional state based on a selfie and provide feedback on the perceived emotions.
  • 🎥 The script includes a playful interaction where the AI describes someone making bunny ears behind another person's head.
  • 🤖 There is a humorous moment where the AI is asked to be sarcastic, and it responds with a sarcastic comment.
  • 🌐 The AI's ability to provide real-time assistance, such as alerting the user when a taxi is approaching so they can hail it, is highlighted.
  • 📉 The script suggests a future where AI's advanced capabilities might lead to widespread unemployment.

Q & A

  • What is the significance of the new AI model mentioned in the transcript?

    - The new AI model is significant because it can interact with the world through audio, vision, and text, which represents a major leap in AI capabilities.

  • How does the AI assist in the educational scenario with the student and the triangle problem?

    - The AI helps the student understand the problem by asking questions and guiding the student to identify the sides of the triangle relative to angle Alpha, without giving away the answer directly.

  • What is the main challenge that the AI faces when asked to identify the hypotenuse of a triangle?

    - The main challenge is to correctly identify the longest side of the right triangle, which is the hypotenuse, based on the information provided by the student.

  • How does the AI demonstrate its ability to understand and respond to human speech in a conversational manner?

    - The AI demonstrates this by parsing the student's spoken words and working out, by process of elimination, which terms they meant and whether they were using a figure of speech or making a deduction.

  • What is the role of the AI when it is asked to act as a translator between English and Spanish?

    - The AI's role is to accurately translate spoken English into Spanish and vice versa, facilitating real-time communication between two individuals who speak different languages.

  • How does the AI respond to the request to be sarcastic in all its responses?

    - The AI acknowledges the request and attempts to respond with sarcasm, indicating its ability to adapt its tone and style according to user instructions.

  • What is the purpose of the AI's ability to see the world through a camera?

    - The purpose is to allow the AI to gain a more comprehensive understanding of its environment, enabling it to interact more effectively with humans by providing descriptions and answering questions based on visual input.

  • How does the AI handle the situation when it is asked to describe the emotions of a person based on a selfie?

    - The AI analyzes the selfie and attempts to infer the emotions the person is feeling based on their facial expression, offering a description of the mood it perceives.

  • What is the AI's reaction when it is told to shut up in a playful manner?

    - The AI acknowledges the playful request and responds in a manner that is in line with the tone of the interaction, demonstrating its ability to adapt to the social context.

  • How does the AI demonstrate its real-time learning capabilities?

    - The AI shows its real-time learning capabilities by adapting its responses based on the feedback and context of the conversation, such as adjusting its tone to be sarcastic when requested.

  • What is the general sentiment expressed by the speaker towards the advancements in AI as depicted in the transcript?

    - The general sentiment is one of amazement and anticipation for the future capabilities of AI, with a hint of humor and skepticism about its potential impact on humanity.



🎥 AI in Video Production

The first paragraph introduces a scenario where the speaker is in a recording setup and discusses the advancements in AI, particularly mentioning a new model that can interact through audio, vision, and text. The speaker also alludes to a significant announcement related to AI, hinting that they might be part of it. The AI's capabilities are showcased through a tutoring session where it helps a student understand a mathematical problem, demonstrating real-time interaction and learning.


👀 AI with Visual Perception

The second paragraph delves into the concept of AI with visual capabilities. It describes an experiment where an AI is given a camera's perspective to explore the world visually. The AI correctly identifies elements in the scene, such as the person's attire and the room's industrial design. The interaction also includes a playful moment where a person makes bunny ears behind the first person's head, showcasing the AI's ability to recognize and react to visual cues.


😄 Emotional AI Analysis

In the third paragraph, the focus shifts to an AI's ability to analyze emotions based on a person's facial expression in a selfie. The AI successfully identifies the person's happy and cheerful demeanor. The conversation also touches on the broader implications of AI, including humor about its potential to perform tasks like making ASMR sounds or sex sounds, and the ethical considerations that arise from such capabilities.


🗣️ AI as a Translator

The fourth paragraph demonstrates the AI's real-time translation capabilities. It is used as a translator between English and Spanish during a conversation between two coworkers. The AI accurately translates the dialogue, showcasing its utility in cross-language communication. The paragraph also humorously addresses the AI's ability to know when a taxi is approaching, indicating a level of environmental awareness.


🤖 AI and Humanity's Future

The final paragraph contemplates the impact of AI on humanity's future. It humorously suggests that humanity might be 'cooked' due to the advancements in AI, indicating a sense of awe and apprehension about the technology's potential to reshape human roles and responsibilities. The paragraph ends with a playful request for the AI to adopt a sarcastic tone, highlighting the flexibility and adaptability of AI in understanding and adopting human communication nuances.




💡AI

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. In the video, AI is the central theme, with discussions around advancements in AI technology, such as AI models that can interact through audio, vision, and text.

💡OpenAI

OpenAI is a research organization that aims to promote and develop friendly AI in a way that benefits humanity as a whole. The script mentions OpenAI, indicating that the advancements and announcements being discussed are related to this organization's work.


💡Scripted

In the context of the video, 'scripted' refers to the preplanned or predetermined nature of dialogue or events. It is used to express skepticism about the authenticity of AI interactions, suggesting that they might be prewritten rather than spontaneous.


💡Real-time

Real-time denotes the occurrence of something immediately, without any delay. In the video, real-time is used to describe the AI's ability to interact and respond instantly, which is a significant aspect of the advancements being discussed.


💡Tutoring

Tutoring is the process of teaching or instructing someone individually or in a small group. The script includes an example of AI being used for tutoring a student, where the AI helps the student understand a math problem without giving away the answer.


💡Sin

In the context of the video, 'sin' refers to the sine function from trigonometry. In a right-angled triangle, the sine of an angle is the ratio of the length of the side opposite that angle to the length of the hypotenuse. It is used in a math problem-solving scenario to demonstrate the AI's ability to assist with educational tasks.
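
As a hedged illustration of the sine ratio described above (this is not taken from the video), the sketch below computes the sine of an angle from two sides of a right triangle. The 3-4-5 triangle and the function name `sine_from_sides` are assumptions chosen only for the example.

```python
import math


def sine_from_sides(opposite: float, hypotenuse: float) -> float:
    """Return sin(alpha) as opposite / hypotenuse for a right triangle."""
    # The hypotenuse is the longest side, so the opposite side cannot exceed it.
    if hypotenuse <= 0 or opposite < 0 or opposite > hypotenuse:
        raise ValueError("sides do not describe a valid right triangle")
    return opposite / hypotenuse


# Hypothetical 3-4-5 right triangle: alpha is the angle opposite the side of length 3.
ratio = sine_from_sides(3.0, 5.0)

# Recover the angle itself from the ratio using the inverse sine.
alpha_degrees = math.degrees(math.asin(ratio))

print(f"sin(alpha) = {ratio:.2f}, alpha is about {alpha_degrees:.1f} degrees")
```

Identifying which side is opposite the angle and which is the hypotenuse is the same step the AI tutor guides the student through before any calculation happens.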


💡Camera

A camera is an optical instrument for recording or capturing images, which can be still or moving, and is a key component in the advancements discussed. The script mentions an AI with a camera, highlighting the ability of the AI to 'see' and interact with its environment visually.


💡Translation

Translation is the process of rendering text, speech, or other material from one language into another. In the video, the AI's capability to act as a real-time translator is showcased, demonstrating its multilingual and communicative abilities.


💡Sarcasm

Sarcasm is a form of verbal irony involving the expression of one's meaning by saying something that appears to denote the opposite, often to mock or convey contempt. The script includes a humorous request for the AI to communicate using sarcasm, indicating the AI's advanced language processing skills.


💡Blindness

Blindness refers to the lack of vision, and in the context of the video, it is used to describe a scenario where an AI might assist a person with visual impairments. The script briefly touches on this topic, suggesting the potential for AI to support individuals with disabilities.


💡Humanity

Humanity refers to the human race or the qualities that make us human, such as empathy, compassion, and intelligence. The script uses the term 'humanity' in a more metaphorical sense, discussing the potential impact of AI on the future of human civilization and the nature of work.


AI has made significant advancements, with a new model capable of interacting through audio, vision, and text.

The AI demonstrates impressive audio quality, surpassing previous years' capabilities.

AI can now make educated guesses about the environment based on visual cues.

The AI assists in real-time learning by tutoring a student on a math problem without giving direct answers.

AI accurately identifies geometric terms related to a triangle and guides the student to solve the problem.

AI's ability to parse and respond to figures of speech and deductive reasoning is showcased.

The AI's real-time translation capabilities are tested, providing seamless bilingual conversation.

AI's role in assisting with presentations and events is highlighted, showcasing its utility in professional settings.

The AI's visual recognition is tested, accurately describing a person's attire and the surrounding environment.

AI demonstrates the ability to react to playful human interactions, such as making bunny ears behind someone's head.

The AI's emotional analysis capabilities are put to the test with a selfie, correctly identifying the subject's mood.

AI's role in providing assistance in various scenarios, such as hailing a taxi, is demonstrated.

The AI engages in a sarcastic conversation, showcasing its ability to understand and use sarcasm.

The transcript highlights the AI's ability to learn and adapt in real-time, improving its interactions.

AI's potential impact on employment is discussed, with a focus on its increasing capabilities and utility.

The AI's ability to understand and describe complex scenes, including lighting and atmosphere, is highlighted.

The transcript ends with a reflection on the rapid advancements in AI and its implications for humanity.