Ethical AI Music Production with Udio and Kits AI

Bob Doyle Media
18 Apr 2024, 26:36

TLDR: In this video, the host explores the creative potential of AI in music production by using Udio to create an original song and Kits AI to modify vocals and harmonies. The host highlights the ethical sourcing of AI voices in Kits AI and demonstrates how to train a custom voice model using clear, dry vocals. The process involves converting the original male voice to a female voice, experimenting with different voice models, and blending voices to create unique vocal tracks. The video also discusses the importance of using these tools as a source of inspiration rather than a replacement for human creativity. The host encourages musicians to embrace AI as a means to fill in creative gaps, such as when live singers are not available, and to use AI-generated music as a starting point for further creative development.


Takeaways

  • 🎵 The video demonstrates how to create an original song in Udio and then use Kits AI to modify vocals and harmonies, showcasing a workflow for music generation tools.
  • 🌟 Kits AI has a new desktop app and updates to its website, offering voice conversion tools and the ability to train custom voice models ethically.
  • 📚 Concerns about copyright and data sourcing are addressed, with Kits AI ensuring all voices are ethically sourced and transparent about their data practices.
  • 🎙️ The process of training a voice model is outlined, emphasizing the need for clear, dry vocal samples without effects or background music for best results.
  • 🔄 Kits AI allows blending of different voice models to create unique vocal sounds, which can be used for various creative music projects.
  • 🚫 The video stresses the importance of avoiding copyright infringement by creating original songs and using AI as a tool for inspiration rather than replication.
  • 👩‍🎤 Live musicianship and the value of human performance are highlighted as irreplaceable aspects of the music industry, countering fears that AI might replace human artists.
  • 🎶 AI music generation tools are praised for their potential to inspire creativity, rather than being a substitute for original, human-made music.
  • 🛠️ The video provides a tutorial on using AI-generated music as a base, which can then be built upon with real instruments and further creative editing.
  • ⚙️ The potential for using AI-generated music with additional software, like Band in a Box, to rework tracks in various styles is mentioned.
  • 📈 The video encourages viewers to explore AI music tools for practical applications in the music industry, beyond simple cover songs or mimicry.
  • 💡 Ethical considerations and creative freedom are emphasized, encouraging musicians to use AI as a tool to enhance their creative visions in an original and respectful manner.

Q & A

  • What is the main focus of the video?

    -The video demonstrates creating an original song in Udio, then modifying and adding vocals and harmonies with Kits AI, as a workflow for using these music generators as creative inspiration.

  • What is the purpose of using Kits AI in the process?

    -Kits AI is used to change the vocals of the original song created in Udio, adding different vocal styles and harmonies to enhance the creative process and demonstrate the capabilities of AI in music production.

  • How does Kits AI ensure ethical sourcing of voices?

    -Kits AI sources all of its voices ethically, and more information on their ethical sourcing and philosophy can be found through the provided link in the video description.

  • What is the process of training a voice model in Kits AI?

    -To train a voice model, one needs to upload clear, dry vocal samples without effects, background music, or harmonies. The process involves providing a good sample of the voice to be used, ideally recorded in a sound booth, and can be based on speaking or singing, depending on the desired outcome.

  • How can users create unique voice models in Kits AI?

    -Users can blend a model from Kits AI's voice library with their own trained model, or with another voice in the library, and set the ratio between the voice sources to create a new and unique voice model.

  • What is the role of Udio in the music creation process shown in the video?

    -Udio is used to create the original song structure, including the music and the initial vocals. The user can input specific styles and lyrics to generate a song that will later be modified using Kits AI.

  • How does the video address concerns about AI replacing musicians?

    -The video suggests that AI is not a threat to musicians, especially those who perform live, as live performances have unique nuances that AI cannot replicate. It emphasizes that AI tools are meant to inspire creativity rather than replace human musicians.

  • What are the steps taken to modify the original song created in Udio?

    -The steps include extending the song to create a complete track, downloading the audio, separating the vocals from the instrumentals, and then using Kits AI to convert the original male voice to different vocal styles, including female voices and harmonies.

  • How does the video demonstrate the use of AI voices for backing vocals?

    -The video shows how AI voices can be used as backing vocals by trying different voice models, adjusting pitch levels, and adding effects to fit the song's style. The AI-generated backing vocals are then mixed with the lead vocals to create a fuller sound.

  • What is the importance of using original lyrics in the song creation process?

    -Original lyrics are important because the lyric writing capability of AI services is often basic and not very creative. Writing your own lyrics allows for a more personalized and specific message, which is especially important when the song is about personal experiences.

  • How does the video presenter plan to use the AI-generated song in future videos?

    -The presenter plans to take the song creation process even deeper in future videos, exploring more advanced features of Udio and Kits AI, as well as using other software to convert audio tracks into MIDI tracks for further manipulation and creation of harmonies.



Outlines

🎵 Introduction to Kits AI and Music Generation

The video begins by introducing the plan: create an original song with Udio, then use Kits AI to change and add vocals and harmonies. The host discusses the creative potential of music generators and provides an overview of Kits AI's new desktop app and website updates. The focus is on using these tools as sources of creative inspiration rather than as replacements for human musicianship.


🔄 Exploring Kits AI's Voice Conversion and Training

The host dives into the functionality of Kits AI, emphasizing its use as a voice conversion tool for AI covers or adding singers to projects. The importance of ethical voice sourcing is highlighted, with a link provided for more information. The process of training a voice model is outlined, stressing the need for clear, dry vocal samples. The video also covers blending different voice models to create unique ones and the various AI voices available.
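
Blending two voice models by a chosen ratio can be pictured as a weighted average. The sketch below is only an illustration of that idea: the video does not describe how Kits AI implements blending, so the function `blend_voices` and the representation of a voice as a short list of numbers are assumptions made here, not part of any Kits AI API.

```python
def blend_voices(voice_a, voice_b, ratio):
    """Return a weighted average of two equal-length feature lists.

    ratio is the share given to voice_a (0.0 to 1.0); voice_b gets the rest.
    Hypothetical illustration only, not how Kits AI is known to work.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    if len(voice_a) != len(voice_b):
        raise ValueError("voices must have the same number of features")
    return [ratio * a + (1.0 - ratio) * b for a, b in zip(voice_a, voice_b)]


# Toy "voices": made-up numbers standing in for learned vocal characteristics.
trained_voice = [0.0, 1.0, 4.0]
library_voice = [2.0, 3.0, 0.0]

# 75% of the trained voice, 25% of the library voice.
blended = blend_voices(trained_voice, library_voice, 0.75)
print(blended)
```

A ratio of 1.0 returns the first voice unchanged and 0.0 returns the second, which matches the intuition of a slider between two sources.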


🎶 Creating and Extending a Song in Udio

The video demonstrates creating a song in Udio, focusing on writing original lyrics while allowing the AI to generate music. The host discusses the limitations of AI-generated lyrics and the creative process of extending the song using Udio's features. The process of downloading and separating vocals from instrumentals is shown, followed by importing the tracks into an audio editor for further work.


🎧 Adjusting Vocals and Adding Effects in Adobe Audition

The host uses Adobe Audition to manipulate the vocal tracks, removing reverb and echo to prepare for adding effects. The process of converting the original male vocal track to a female voice using Kits AI is detailed. The challenges of finding a matching vocal model and adjusting pitch levels are discussed. The video also covers adding background vocals and experimenting with different voice models to achieve a desired sound.


🎤 Layering Vocals and Creating Harmony

The video continues with the process of layering vocals, adding a custom voice model to the mix, and adjusting the harmony. The host experiments with pitch shifting and panning to create a fuller sound. The limitations of the starting vocal model are acknowledged, and the video shows how to blend different vocal tracks for a richer sound. The potential for using additional software to convert audio tracks into MIDI for further manipulation is mentioned.
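
The pitch shifting and panning described above rest on two standard pieces of audio arithmetic. Here is a minimal sketch in Python, assuming equal temperament (a shift of n semitones multiplies frequency by 2^(n/12)) and a constant-power pan law. The video does not say which settings or pan law were used in Adobe Audition, and the function names below are hypothetical.

```python
import math


def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def constant_power_pan(pan):
    """Return (left_gain, right_gain) for pan in [-1.0, 1.0].

    -1.0 is hard left, 0.0 is center, 1.0 is hard right. The squared gains
    always sum to 1, so perceived loudness stays roughly constant.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1.0 and 1.0")
    angle = (pan + 1.0) * math.pi / 4.0  # maps pan to 0..pi/2
    return math.cos(angle), math.sin(angle)


# Shifting up one octave doubles the frequency.
octave_up = semitones_to_ratio(12)

# Two copies of a backing vocal panned to opposite sides.
left_copy = constant_power_pan(-0.5)
right_copy = constant_power_pan(0.5)
print(octave_up, left_copy, right_copy)
```

For example, a harmony part a major third above the lead is a shift of 4 semitones, a frequency ratio of about 1.26. Panning two backing-vocal copies to opposite sides, as in the last lines, is one common way to get a wider, fuller sound.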


🎷 Final Touches and the Role of AI in Music Creation

The host discusses the potential for real musicians to enhance the AI-generated music, suggesting the use of AI as a creative starting point rather than a finished product. The video concludes with thoughts on the ethical use of AI in music, encouraging musicians to see the practical uses of AI in the industry. The host invites viewers to engage with the content, subscribe for more, and share their thoughts on the use of AI in music creation.



Keywords

💡Ethical AI

Ethical AI refers to the responsible and moral development and use of artificial intelligence, ensuring that AI systems are designed and deployed with respect for human rights, inclusivity, and fairness. In the context of the video, ethical AI is crucial when sourcing voices for music production, ensuring that the data used to train AI models is obtained legally and with consent.


💡Udio

Udio is a music creation platform that allows users to create original songs by selecting various styles and inputs. It is highlighted in the video as a tool for generating a base song that will later be modified using AI voice conversion tools like Kits AI. The platform is used to create a 'honky tonk Barrel House Dr. John Style' song, showcasing its capabilities in music generation.

💡Kits AI

Kits AI is an AI voice conversion tool that enables users to convert one voice into another, which can be used for various creative projects such as music production. The tool is showcased in the video for changing vocals and adding harmonies to a song created in Udio, demonstrating its utility in enhancing and modifying the original music composition.

💡Voice Conversion

Voice conversion is the process of altering a voice's characteristics to match or resemble another voice. This technology is central to the video's narrative, as it is used to change the original male vocals of a song created in Udio to different AI-generated voices, including female and custom models, to create a unique vocal arrangement.

💡AI Voices

AI voices are synthetic vocal models created using machine learning techniques. They are a core component of the Kits AI tool and are used in the video to replace and add to the original vocals of the song. The video discusses the ethical sourcing of these voices and their application in creating music without infringing on copyright or using someone else's voice without permission.

💡Training a Voice Model

Training a voice model involves providing AI with a sample of a voice—either singing or speaking—to create a unique vocal profile. In the video, the creator has trained their own voice model based on speaking, which is then used to generate vocals for the song. This process requires clear, dry vocals without effects or background music to ensure the model can be effectively used in post-production.

💡Vocal Remover

A vocal remover is a tool that attempts to isolate and remove the vocal track from a song's instrumentals. In the context of the video, the vocal remover feature within Kits AI is used to separate the vocals from the instrumentals of the song created in Udio, allowing for further manipulation and the addition of new AI-generated vocals.

💡Multitrack Session

A multitrack session is a method of recording and mixing music where different elements, such as vocals and instrumentals, are recorded on separate tracks. This allows for greater control during the editing process. In the video, the creator uses a multitrack session in Adobe Audition to edit and layer the AI-generated vocals onto the original song.
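
At its simplest, mixing a multitrack session down to one track means adding the tracks together sample by sample, each scaled by its own level. The sketch below assumes mono tracks stored as lists of floats in the range -1.0 to 1.0. The function `mix_tracks` and the sample values are hypothetical and are not taken from Adobe Audition.

```python
def mix_tracks(tracks, gains):
    """Sum equal-length mono tracks, scaling each by its gain.

    Each output sample is clamped to [-1.0, 1.0], a crude stand-in for the
    limiting a real editor would apply to avoid clipping.
    """
    if not tracks or len(tracks) != len(gains):
        raise ValueError("need one gain per track and at least one track")
    length = len(tracks[0])
    if any(len(track) != length for track in tracks):
        raise ValueError("all tracks must have the same length")
    mixed = []
    for i in range(length):
        sample = sum(gain * track[i] for track, gain in zip(tracks, gains))
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


# Made-up three-sample tracks: instrumental, lead vocal, backing vocal.
instrumental = [0.5, -0.5, 0.25]
lead_vocal = [0.5, 0.5, -0.25]
backing_vocal = [1.0, 1.0, 1.0]

# Backing vocal sits at half level under the lead.
mixdown = mix_tracks([instrumental, lead_vocal, backing_vocal], [1.0, 1.0, 0.5])
print(mixdown)
```

Keeping each element on its own track until this final sum is what allows the levels of the AI-generated vocals to be adjusted independently of the instrumental.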


💡Harmonies

Harmonies are the combination of simultaneous musical notes to produce a richer, more complex sound. The video discusses adding harmonies to the song using AI voices, which can enhance the overall quality and depth of the music. The process involves singing a harmony part and then converting it into an AI voice to be added to the mix.

💡Audio Editor

An audio editor is software used to manipulate and process audio files, such as adding effects, adjusting levels, and mixing tracks. In the video, Adobe Audition is used as the audio editor to integrate the AI-generated vocals with the original instrumentals and to apply effects like reverb and pitch shifting to achieve the desired sound.

💡Creative Inspiration

Creative inspiration refers to the stimulation of new ideas or solutions in the creative process. The video emphasizes the use of AI music generation tools not as a replacement for human creativity but as a source of inspiration to aid in the creative process. The AI tools help fill in gaps where traditional resources may be lacking, such as when a musician needs a specific type of voice or instrument they do not have access to.


Highlights

Today's session involves creating an original song in Udio and using Kits AI to modify vocals and harmonies.

Kits AI has released a new desktop app alongside updates to their website.

Kits is a voice conversion tool that can be used for AI covers or adding singers to projects.

Kits AI offers over 35 AI voices and ensures ethical sourcing of voice data.

Users can train their own voice models with Kits AI using clear, dry vocal samples.

The process of training a voice model with Kits AI is straightforward but requires specific types of audio.

Kits AI allows blending of different voice models to create unique vocal profiles.

The presenter emphasizes the importance of using AI music tools as a source of creative inspiration rather than a replacement for human musicians.

AI-generated music can be a starting point for fleshing out musical ideas and can be combined with real instruments.

The presenter discusses the capabilities of Udio for creating a song in the style of Dr. John, albeit without using his actual voice for legal reasons.

Lyrics for the song are written by the presenter, highlighting the limitations of AI in creative writing.

The song created in Udio is extended to its full length using the platform's features.

Vocals are separated from the instrumentals using Kits AI's vocal remover tool.

Adobe Audition is used as the external audio editor to further process and add effects to the vocals and instrumentals.

Different voice models are tested for compatibility with the original song's vocal range.

The presenter converts the original male vocal track to a female voice using Kits AI.

Harmony tracks are created by having the presenter sing each part and then converting that recording into an AI voice for inclusion in the song.

The final song is a blend of AI-generated vocals and manually adjusted tracks to create a unique piece.

The use of AI in music production is positioned as a tool for enhancing creativity, not replacing human input.