When working with multimedia files, users often need to convert speech from an audio or video recording into text, that is, to transcribe what was said.
It is not always possible to listen to an audio file or watch a video. Reading a voice message or the spoken part of a video as text is often faster and more convenient.
The resulting text can then be reused, for example, in an article. To get it, speech is recognized and written down as text, either manually or automatically with a program or online service.
What is transcription
Transcription is the conversion of speech from an audio or video recording into text. It can be done manually or with the help of applications and services.
Any of the following can serve as the source material:
- your own speech dictated into a voice recorder or spoken into a microphone;
- an audio podcast;
- a local audio or video file;
- a video on YouTube or another hosting service;
- audio from the Internet;
- a TV show;
- a phone conversation;
- a voice message.
Manual transcription can be done in two ways:
- You do it yourself, typing the text into a text editor as you listen to the speech.
- A specially hired person converts the speech into text for you.
There are specialized exchanges where you can post a paid speech-to-text task. Manual transcription is a rather time-consuming process.
The quality of speech-to-text conversion depends on the following:
- clear diction;
- a normal pace of speech;
- correct pronunciation.
Speech that is too fast or too slow, an accent, slurred diction, background noise, or quiet sound can all cause recognition errors. In any case, the resulting text has to be edited: add punctuation marks and correct the errors.
This guide describes several ways to convert voice to text using programs and online services, which make transcription much easier.
Depending on the circumstances and the software used, dictated speech can also be translated from or into a foreign language. This feature is mentioned in the descriptions of the tools that support it.
Additional setup
Some computers require the VB-CABLE Virtual Audio Device driver. After installing it, enable the virtual audio cable in the volume mixer settings so that voice input works on the PC in online services and in some programs.
On a PC with a Realtek sound card, no driver is needed: in the sound settings, open the “Sound” window, go to the “Recording” tab, and enable the “Stereo Mixer” option.
On my computer, none of these steps were necessary. Therefore, before installing the virtual cable driver, check whether the microphone works in the online translator. If voice input from the microphone works, you do not need to install the driver.
Google Translate can convert voice to text online. This method works in Google Chrome and in other Chromium-based browsers.
Open the Google Translate service in your browser, and then do the following:
- Select the source language first; this enables voice input.
- Click the Voice Input (Microphone) icon.
- Allow the translator to use the microphone on your device.
- After the microphone image changes color, start speaking into the microphone. The application will automatically enter text into the translator window.
- Copy the resulting text and paste it into any text editor, such as Notepad or Microsoft Word.
The dictated text can be translated into another language right away. To do this, select the target language in the adjacent translator panel.
The service has a limit of 5,000 characters per translation. To get around it, dictate the text in parts and copy each part into a text editor before continuing.
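If you already have a long text that has to go through the translator, the same "in parts" approach can be scripted. The following is a minimal Python sketch and not part of Google Translate itself: the function name `split_into_chunks` and its `limit` parameter are hypothetical, and the sentence boundary rule (a period, exclamation mark, or question mark followed by whitespace) is a simplifying assumption.

```python
import re
from typing import List


def split_into_chunks(text: str, limit: int = 5000) -> List[str]:
    """Split text into pieces of at most `limit` characters.

    Breaks are placed at sentence boundaries where possible; a single
    sentence longer than the limit is cut at the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")

    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split a sentence that cannot fit into one chunk on its own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue

        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence

    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for number, part in enumerate(split_into_chunks("One. Two. Three.", limit=9), 1):
        print(number, part)
```

Each returned piece can be pasted into the translator separately. Changing `limit` adapts the sketch to a service with a different character limit.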
Google Translate can also convert speech to text from audio or video files located on the Internet:
- Click the microphone icon in the translator’s input field.
- Then, in another browser tab, start playing the video or audio.
In this image, Google Translate is converting speech to text from a YouTube video.
What if the text has to be extracted from a local audio or video file on the computer? Google Translate helps here as well.
Do the following:
- Open Google Translate and turn on voice input.
- Start playing the video or audio file on your computer.
- The text will appear in the translator window.
Unlike Google Translate, the Yandex Translator speech-to-text service works in any browser.
Follow these steps:
- Open the Yandex Translator page in a browser.
- Click on the microphone icon (Voice input) located in the source text input field.
- Allow Yandex Translate to use the microphone on your computer.
- Speak into the microphone. The text will appear in the translator window and, if you need it, will be translated into another language at the same time.
Yandex Translator can also convert video or audio from the Internet into text:
- Turn on the microphone in the source text panel.
- Open another browser tab and start playing the audio or video.
- The text will appear in the Yandex Translator window. If you need it, the translation into another language is entered in parallel.
Yandex Translator has a limit of 10,000 characters per translation. To work around the limit:
- When approaching the limit, pause the player, or stop dictating into the microphone.
- Copy the translated text into any text editor.
- Enable voice input, and then play the original video or audio file again to continue translating audio to text online.
To convert speech to text from a video or audio file stored on a PC using Yandex Translator:
- Open the Yandex Translator window and click the “Voice input” (microphone) button.
- Play the video or audio file on your computer in a multimedia player.
- The recognized text will appear in the source text field of the translator.
The Google Drive cloud storage includes the Google Docs service, which can convert speech into text. This method works in Google Chrome and in other Chromium-based browsers.
Follow these steps:
- Sign in to Google Drive.
- Click on the “Create” button.
- In the context menu, select first “Google Docs”, and then “Create a new document”.
- In the new document window, open the “Tools” menu and click “Voice typing” (keyboard shortcut: “Ctrl” + “Shift” + “S”).
- Click the microphone button and start talking.
- The speech spoken into the microphone is converted into text and entered on the document page.
- Save the document to the cloud storage, or download the file to your computer in one of the supported text formats.
There is no limit on the number of characters you can type in Google Docs.
To extract text from video or audio on the Internet, enable voice input and then start playing the file in another browser tab.
If you need to convert speech from a video or audio file on your computer, do the following:
- In the Google Docs window, enable voice input.
- Play the video or audio file in the player on the PC.
- Text from the local video or audio file being played will appear in the document.
Speechpad – Notepad for speech input
The speechpad.ru online service works in the Google Chrome browser. It uses a Google service to recognize speech. There is also a SpeechPad (voice text input) browser extension, which lets you enter text by voice on websites.
For best quality, it is recommended to use an external microphone.
On the speechpad.ru page, do the following:
- On the “Notepad for speech input” page, click the “enable recording” button.
- The result field will display the text recognized from your voice.
- Edit the received text, and then download it to your computer.
Recording time in this mode is limited to 15 minutes.
The service can convert video or audio from the Internet or from a computer into text. Two methods are available.
Method 1:
- In another browser tab, start playing video or audio on the Internet, or play an audio or video file from your computer in a player.
- On the Speechpad speech notepad page, click the “enable recording” button.
- The text from the video or audio will appear in the result field.
Method 2:
- On the main page of the service, click the “Transcription” button located under the result field.
- On the Transcription Panel page, select a file from your computer or enter the media file’s URL.
- Start media playback in the built-in player. For a YouTube video, insert the video ID into the field, not the full link, as in the example.
The transcription panel has many settings that you can adjust to get the best result.
In transcription mode, the recording time is not limited.
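Because the transcription panel expects a YouTube video ID rather than a full link, it can help to pull the ID out of a link automatically. The Python sketch below is only an illustration: the function name `youtube_video_id` is hypothetical, and it assumes the link has one of a few common shapes (`/watch?v=…`, `youtu.be/…`, `/embed/…`, `/shorts/…`). Other link formats are not handled.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: these hostnames cover the links you are likely to paste.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube link, or None if not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None

    if host in YOUTUBE_HOSTS:
        # Regular links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embedded and Shorts links: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None

    return None


if __name__ == "__main__":
    # "abcDEF12345" is a made-up ID used only for demonstration.
    print(youtube_video_id("https://www.youtube.com/watch?v=abcDEF12345"))
```

The value returned for a recognized link is what goes into the field on the Transcription Panel page.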
Online service Dictation.io
The dictation.io service is free. It converts speech dictated into a microphone, or speech from video and audio files, into text.
Using the service is very simple:
- Select a voice input language.
- Click the microphone button.
- Start speaking into the microphone.
- The field will show the text of your dictation, or of speech played from the Internet (in another browser tab) or from a file played in a multimedia player on your computer.
The result can be copied, downloaded to your computer as a text file, sent by e-mail, played back (this requires a voice engine installed in Windows), or printed.
LossPlay is a free program for transcribing audio or video, available in Russian. It is a multimedia player for audio and video files, developed for transcribing them manually.
The main features of the LossPlay program:
- Support for a large number of media formats;
- Hotkeys and multimedia keys on the keyboard;
- Timecode insertion;
- Adjustable playback speed;
- Screenshots of the file being played.
The program can be downloaded from the official website of the developer.
After installation, the program offers to download and install the codecs it needs from the K-Lite Codec Pack and QuickTime so that it can play all supported media formats.
Manual transcription in LossPlay is done with the following steps:
- Add a multimedia file to the program window.
- Open a text editor.
- Start playing the file in the player.
- Listen and at the same time manually type the text you are listening to in the text editor window.
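When typing a transcript by hand, timecodes mark where each passage occurs in the recording. As an illustration only, here is a small Python sketch that turns a playback position in seconds into a timecode string. The `[HH:MM:SS]` layout and the function name `format_timecode` are assumptions made for this example; they are not taken from LossPlay.

```python
def format_timecode(seconds: float) -> str:
    """Format a playback position in seconds as [HH:MM:SS].

    Fractions of a second are dropped, not rounded.
    """
    if seconds < 0:
        raise ValueError("seconds must not be negative")

    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


if __name__ == "__main__":
    # A position of 1 hour, 2 minutes and 5.4 seconds.
    print(format_timecode(3725.4), "Speaker: first sentence of the passage.")
```

A marker produced this way can be placed at the start of each paragraph of the transcript so that a passage is easy to find in the recording later.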
VOCO – a program for converting speech to text
The VOCO application converts voice to text in the Windows operating system. The program is paid and works in Russian.
You can download the application from the official website of the Center for Speech Technologies. The program’s motto is “Write with your voice.”
The main features of the VOCO program:
- launching the program using hot keys;
- basic dictionary of 85,000 words;
- automatic insertion of punctuation marks in recognized speech from audio files;
- installing a transcription plugin for Microsoft Word in Voco.Professional and Voco.Enterprise versions;
- the ability to work without using the Internet.
Voice to text recognition goes like this:
- Launch the Voco program on your computer. With default settings, the program starts with the system.
- Click inside the window of the text editor (Notepad, Word, etc.) where you want to enter text.
- Enable recognition from the context menu of the program icon in the notification area, or with the hotkey: press the “Ctrl” key twice. A green microphone icon will appear above the notification area.
- To disable recognition, press the “Ctrl” key twice again.
In the Voco.Professional and Voco.Enterprise versions, a “Transcriber” tab appears in the Microsoft Word text editor. This function converts audio recorded in single-channel (mono) mode into text. If the audio is recorded in stereo, the text will appear as if it had been spoken by multiple speakers.
Do the following:
- Open the Transcriber tab in the Word window.
- Click the “Transcriber” icon; the buttons for managing the transcription process will appear.
- The built-in player will open in a separate window.
- Click on the “Open” button, select an audio recording.
- Click on the “Recognize” button to start the process of converting voice to text.
- Wait for the recognition to complete, and then edit the received text.
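Since the Transcriber function is meant for mono recordings, it may be worth checking the number of channels before opening a file. The Python sketch below uses only the standard library and rests on several assumptions: the recording is an uncompressed 16-bit PCM WAV file, the function name `downmix_to_mono` and the file names are hypothetical, and channels are mixed by simple averaging. Whether VOCO accepts the resulting file is not something this sketch can guarantee.

```python
import array
import sys
import wave


def downmix_to_mono(src_path: str, dst_path: str) -> int:
    """Write a mono copy of a 16-bit PCM WAV file.

    Returns the number of channels found in the source file.
    """
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM WAV files")

    # WAV samples are little-endian; swap bytes on big-endian machines.
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()

    if channels == 1:
        mono = samples
    else:
        # Samples are interleaved (L, R, L, R, ...); average each frame.
        mono = array.array(
            "h",
            (
                sum(samples[i:i + channels]) // channels
                for i in range(0, len(samples), channels)
            ),
        )

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())

    return channels


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical file names): python downmix.py interview.wav interview_mono.wav
    found = downmix_to_mono(sys.argv[1], sys.argv[2])
    print(f"Source had {found} channel(s); mono copy written.")
```

Averaging keeps both voices in a two-person stereo recording, at the cost of mixing them into one track.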
Many users need to convert speech from an audio or video source into text. This process is called transcription. It can be done with online services or programs. Depending on the tool, the text is produced automatically or typed manually.