How to transcribe audio to text for FREE - Riverside’s new AI transcription tool

Joey /// VP Land
21 Mar 2023 · 07:04

TLDR

Riverside has launched a new AI-powered transcription tool that is completely free to use. You can upload any audio or video file, and it will transcribe speech in over 100 languages. The tool is built on Whisper, a speech recognition model from OpenAI that is known for its accuracy. The video demonstrates the tool's ease of use: no account is needed, and files are added through a simple drag-and-drop interface. It also shows the transcript appearing in real time and the options to download it as a text file or as an SRT file for captioning. The tool doesn't identify speakers or allow text editing, but it is highly accurate and a good starting point for transcribing content. The video concludes by comparing the tool's output against a proofread transcript, finding 88% similarity, which makes the tool a valuable resource for podcasters and content creators.

Takeaways

  • 🆓 Riverside has launched a free AI transcription tool that can transcribe audio or video files in over 100 languages.
  • 🌐 The tool is accessible at riverside.fm/transcription and does not require an account to use.
  • 📂 Users can simply drag and drop their audio or video files for transcription.
  • 🕒 The transcription process is real-time, and the tool processed a 50-minute interview in approximately 1-2 minutes.
  • 🚫 The tool does not identify speakers, so for speaker labels, users would need to use another tool like Descript.
  • 📋 The transcription can be copied to clipboard with timecode markers but without speaker labels.
  • 📄 Transcripts can be downloaded in text file format, which includes a raw text output without timecodes or speaker identification.
  • 📺 An SRT file can also be downloaded for video captioning purposes, which is compatible with platforms like YouTube and Vimeo.
  • ⚠️ The free service does not allow users to preview or edit the transcribed text for accuracy.
  • 🔍 For higher accuracy, users might need to compare and edit the transcript using another tool.
  • 📈 Riverside's transcription tool is separate from the Riverside app, but there might be future integration.
  • 📊 The tool's accuracy was tested and found to be approximately 88% similar to a proofread transcript, with minor differences in punctuation and proper nouns.

Q & A

  • What is the new AI transcription tool released by Riverside?

    -Riverside has released a new AI transcription tool that is completely free and can transcribe audio or video in over 100 different languages.

  • What is the basis of Riverside's transcription tool?

    -The transcription tool is built on top of Whisper, another tool from OpenAI, the same company that created ChatGPT, GPT-3, and GPT-4.

  • How does the Riverside transcription tool work?

    -The tool has a simple interface where users can drag and drop their audio or video files to start the transcription process without needing an account.

  • What are the limitations of Riverside's transcription tool?

    -While the tool is free, it does not identify speakers, so there are no speaker labels in the transcript. Users also cannot preview the audio for accuracy or edit the text directly within the tool.

  • How long did it take to transcribe a 50-minute interview using Riverside's tool?

    -The transcription of the 50-minute interview took approximately 1 to 2 minutes using Riverside's tool.

  • What format options are available for the transcribed text?

    -The transcribed text can be copied to the clipboard with timecode markers, downloaded as a text file without timecodes, or downloaded as an SRT file for video captioning.

  • What is the process for using the transcription tool with Riverside?

    -As of the time of the video, the transcription tool is a separate service from the Riverside app, and there is no direct integration to transcribe podcast episodes.

  • How accurate is the transcription provided by the tool?

    -The tool's accuracy is quite high, with an overall similarity of 88% compared to a proofread transcript. Most differences are minor, such as punctuation and capitalization.

  • What is the process for uploading a file to Riverside's transcription tool?

    -To use the tool, one can go to riverside.fm/transcription, drag and drop their file, and then click 'start transcribing' to upload and process the file.

  • What is the file size limit for Riverside's transcription tool?

    -The video does not mention specific file size limits, but it demonstrates the tool handling a file just under 5 gigabytes.

  • How can one save the transcription for future reference?

    -The transcription can be saved for future reference by copying it to the clipboard with timecode markers or by downloading it as a text file or an SRT file for captioning.

  • What is the suggested next step if speaker identification is needed in the transcript?

    -If speaker identification is required, the transcript can be loaded into another tool like Descript, which can perform speaker identification.

Outlines

00:00

🚀 Riverside's New Free AI Transcription Tool Overview

Riverside has launched a free AI-powered transcription service that can transcribe audio and video files in over 100 languages. Built on OpenAI's Whisper technology, the tool is accessible at riverside.fm/transcription and features a simple interface that doesn't require an account. The video does not state a file size limit, but the demo uploads a file just under 5 gigabytes, and the transcript appears in real time with timestamps. The on-screen output includes timecodes but does not identify speakers; for speaker identification, the video suggests loading the transcript into another tool like Descript. Users can copy the text to the clipboard or download it as a text file or as an SRT file for captioning. The service is free but lacks the ability to preview audio for accuracy or edit the text directly. There is also no current integration with the Riverside app for podcast episodes, but an announcement regarding app updates is expected soon.

05:02

📊 Comparing Riverside's Transcription Accuracy with Proofread Transcript

The video compares the Riverside transcription tool's output with a proofread transcript. The comparison is run through an online comparison tool, which reports an overall similarity of 88%. The differences are minor, mostly related to punctuation and a few proper nouns. The tool incorrectly transcribed 'GPT' as 'GTB-3' and misspelled a name, but it correctly transcribed other proper nouns and common words. The video emphasizes that while the tool is highly accurate for a free service, any auto-transcription should be proofread. The transcript can be used as a starting point for captioning or reference, and the tool is praised for its impressive accuracy and utility.
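
A comparison of this kind can be approximated locally. The sketch below is not the online tool used in the video, whose method is not described; it swaps in Python's standard `difflib.SequenceMatcher` and compares the two transcripts word by word. The function name `similarity_percent` and the sample sentences are hypothetical, and the resulting percentage should not be expected to match the 88% figure.

```python
import difflib


def similarity_percent(auto_text: str, proofread_text: str) -> float:
    """Word-level similarity between two transcripts, as a percentage.

    Assumption: similarity is difflib's ratio, 2 * matches / total words.
    The comparison tool in the video may compute its score differently.
    """
    auto_words = auto_text.split()
    proof_words = proofread_text.split()
    matcher = difflib.SequenceMatcher(None, auto_words, proof_words, autojunk=False)
    return round(matcher.ratio() * 100, 1)


# Hypothetical sample text, not taken from the video.
auto = "we asked GTB-3 to write the intro"
proofread = "We asked GPT-3 to write the intro."
print(similarity_percent(auto, proofread))
```

Because the comparison is word-based and case-sensitive, punctuation and capitalization differences count as mismatches, which is consistent with the kinds of differences the video reports.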

Keywords

💡AI transcription tool

An AI transcription tool refers to software that uses artificial intelligence to convert spoken language from audio or video files into written text. In the context of the video, Riverside's new AI transcription tool is highlighted as a free service that can transcribe content in over 100 languages, demonstrating the growing capabilities of AI in facilitating tasks that were previously manual and time-consuming.

💡Whisper

Whisper is a tool developed by OpenAI, the same company behind ChatGPT and GPT models. It is an advanced AI model designed for processing and understanding speech. In the video, Whisper serves as the underlying technology for Riverside's transcription tool, emphasizing the accuracy and efficiency of the transcriptions it produces.

💡Transcribe

To transcribe means to convert spoken language into written form. In the video, the process of transcribing involves uploading an audio or video file to Riverside's tool, which then uses AI to generate a written transcript. This is a key feature of the tool, as it allows users to obtain written records of spoken content quickly.

💡Accuracy

Accuracy in the context of transcription refers to how closely the written text matches the spoken words in the audio or video file. The video script discusses testing the accuracy of Riverside's AI transcription tool by comparing its output to a proofread transcript, highlighting the importance of precision in transcription services.

💡Timecode markers

Timecode markers are timestamps that indicate specific points in time within a video or audio file. They are used to reference particular moments in the media. In the video, it is mentioned that the transcription includes timecode markers, which can be helpful for locating parts of the audio or video during review or editing.
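
As a rough illustration of how timecode markers attach to transcript text, the sketch below formats a start time in seconds as `HH:MM:SS` and prefixes it to each line. The bracketed layout, the function names, and the sample segments are assumptions; the exact marker format Riverside's tool produces is not described here.

```python
def format_timecode(seconds: float) -> str:
    """Format a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def with_timecodes(segments):
    """Prefix each (start_seconds, text) pair with its timecode marker."""
    return [f"[{format_timecode(start)}] {text}" for start, text in segments]


# Hypothetical segments, not taken from the video.
segments = [
    (0.0, "Welcome to the show."),
    (75.5, "Let's get started."),
    (3671.0, "Thanks for listening."),
]
for line in with_timecodes(segments):
    print(line)
```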

💡Speaker labels

Speaker labels are identifiers used to denote which speaker is speaking at any given moment in a transcript. The video script notes that Riverside's tool does not provide speaker labels, meaning that it does not differentiate between speakers in the transcript, which could be a limitation for users needing to attribute quotes or dialogue.

💡SRT file

An SRT file, or SubRip file, is a format used for video captions and subtitles. It includes timestamps that synchronize the text with the spoken words in the video. The video mentions that users can download an SRT file from Riverside's tool, which can then be used to add captions to videos on platforms like YouTube or Vimeo.
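
For reference, an SRT file is plain text: each caption is a numbered block with a `start --> end` line in `HH:MM:SS,mmm` form, followed by the caption text and a blank line. The sketch below builds that layout from a list of cues; the function names and sample cues are hypothetical and are not Riverside's output.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical cues, not taken from the video.
cues = [(0.0, 2.0, "Welcome to the show."), (2.0, 4.5, "Let's get started.")]
print(to_srt(cues))
```

The resulting text can be saved with a `.srt` extension and uploaded alongside a video as a caption track.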

💡Proofread

Proofreading is the process of reviewing written text to correct errors and improve clarity. In the video, the term is used in the context of comparing the AI-generated transcript with a proofread version to evaluate the tool's accuracy. This step is crucial for ensuring the quality of transcriptions, especially for professional or public use.

💡Custom vocab library

A custom vocab library is a collection of specific terms or words that an AI transcription tool can be trained to recognize accurately. The video script points out that Riverside's tool does not allow for uploading a custom vocab library, which could lead to inaccuracies in transcribing proper names or specialized terminology.
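
Without custom vocabulary support, one possible workaround is to correct known mis-transcriptions after downloading the text file. The sketch below is a simple find-and-replace pass and not a feature of Riverside's tool; the function name and the correction table are hypothetical, with the `GTB-3` entry assuming the intended term was `GPT-3`.

```python
import re


def apply_corrections(text: str, corrections: dict[str, str]) -> str:
    """Replace known mis-transcriptions with their intended spellings.

    Each wrong term is matched only as a whole token, so longer words
    that merely contain it are left untouched.
    """
    for wrong, right in corrections.items():
        pattern = r"(?<!\w)" + re.escape(wrong) + r"(?!\w)"
        # A function replacement avoids backslash handling in `right`.
        text = re.sub(pattern, lambda _match, r=right: r, text)
    return text


# Hypothetical correction table; the intended spelling is an assumption.
corrections = {"GTB-3": "GPT-3"}
print(apply_corrections("We asked GTB-3 to write the intro.", corrections))
```

A pass like this only fixes errors that are already known, so it complements proofreading and does not replace it.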

💡Integration

Integration refers to the process of combining different systems or applications to work together seamlessly. The video mentions that Riverside's transcription tool is currently separate from the main Riverside app, suggesting that future updates may integrate the transcription feature more closely with Riverside's other services.

Highlights

Riverside has released a free AI transcription tool that can transcribe audio or video in over 100 languages.

The tool is built on Whisper, a technology from OpenAI, the company behind ChatGPT and the GPT models.

The transcription service is completely free and does not require an account to use.

Users can simply drag and drop their audio or video files for transcription.

The tool processed a 50-minute interview file in approximately 1 to 2 minutes.

The transcription includes timecode markers but does not identify speakers.

For speaker identification, the transcript can be loaded into another tool like Descript.

Transcripts can be copied to clipboard, pasted, and saved with timecode markers.

Downloads are available in text file format and SRT format for video captioning.

The SRT file is likely more accurate than YouTube's auto transcription for captioning.

The free service does not allow for audio preview or text editing within the tool.

The transcription tool is a separate entity from the Riverside app and does not require a Riverside account.

There might be future integration with the Riverside app, as the video hints at an upcoming announcement.

The accuracy of the transcription was tested and found to be around 88% similar to a proofread transcript.

Most inaccuracies were minor, such as punctuation and capitalization differences.

The tool did not correctly transcribe certain proper nouns and lacked custom vocabulary support.

Despite the limitations, the tool is considered highly accurate and useful for a free transcription service.

The transcription can serve as a great starting point, though proofreading is still recommended.

The tool is accessible at riverside.fm/transcription and offers transcription for podcasting and video podcasting.