Voice/Speech Recognition
A voice recognition voiceprint is a spectrogram. A spectrogram is a graph that shows a sound’s frequency on the vertical axis and time on the horizontal axis. Different speech sounds create different shapes on the graph. Spectrograms also use colour or shades of grey to show how strong the sound is at each frequency and moment in time.
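As a rough illustration of how such a graph is built, the sketch below cuts a signal into short overlapping frames and measures the strength of each frequency in every frame. The function names, the 8000 Hz sample rate, the frame size and the two test tones are assumptions chosen for the example; real systems use optimised FFT libraries and recorded speech rather than synthetic tones.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per time frame."""
    # A Hann window tapers each frame so energy leaks less into nearby bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Naive discrete Fourier transform, bins from 0 Hz up to half the sample rate.
        for k in range(frame_size // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


def tone(freq_hz, sample_rate, length):
    """Generate a pure sine tone (a stand-in for recorded speech)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(length)]


SAMPLE_RATE = 8000   # assumed, roughly telephone quality
FRAME_SIZE = 64
signal = tone(500, SAMPLE_RATE, 256) + tone(2000, SAMPLE_RATE, 256)
spec = spectrogram(signal, frame_size=FRAME_SIZE)

bin_hz = SAMPLE_RATE / FRAME_SIZE  # width of one frequency bin in Hz
peak_first = max(range(len(spec[0])), key=spec[0].__getitem__) * bin_hz
peak_last = max(range(len(spec[-1])), key=spec[-1].__getitem__) * bin_hz
print(peak_first, peak_last)
```

Each inner list is one vertical slice of the graph. Plotting the magnitudes as colour or grey level, frame by frame, gives the picture described above.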
Every voice is unique (even among twins) and cannot be exactly duplicated. Speech is made up of two components: a physiological component (the vocal tract) and a behavioural component (the accent).
Some companies use voice recognition so that people can gain access to information without being physically present, for example during a phone call. Unfortunately, people can bypass such a system by playing a pre-recorded voice sample from an authorized person. That is why some systems use several randomly chosen voice passwords, or use general voiceprints instead of prints of specific words.
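The idea of randomly chosen voice passwords can be sketched as follows. The function name `make_challenge` and the phrase list are hypothetical, and the sketch assumes the user has already enrolled every phrase in the pool.

```python
import secrets


def make_challenge(enrolled_phrases, count=3):
    """Pick `count` distinct prompts at random from the enrolled phrases."""
    if count > len(enrolled_phrases):
        raise ValueError("not enough enrolled phrases for this challenge")
    pool = list(enrolled_phrases)
    chosen = []
    for _ in range(count):
        # secrets draws from the operating system's secure random source,
        # so an attacker cannot predict which phrases will be requested.
        chosen.append(pool.pop(secrets.randbelow(len(pool))))
    return chosen


# Hypothetical phrases the user recorded during enrolment.
phrases = ["seven two nine", "open my account", "blue horizon",
           "four one six", "morning coffee"]
print(make_challenge(phrases))
```

Because the prompts change on every attempt, a single recording of one phrase is much less likely to match what the system asks for.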
The voiceprint generated at enrolment is characterised by the vocal tract, and a cold does not affect the vocal tract. Only extreme vocal conditions such as laryngitis will prevent the system from recognising the voice properly.
During enrolment, the user is prompted to repeat a short phrase or a sequence of numbers. Voice recognition can work with various audio capture devices, such as telephones and PC microphones. The performance of a voice recognition system may vary depending on the quality of the audio signal. Random words and phrases are used to guard against unauthorized use.
The benefits of voice recognition are that it can use existing telephone systems, that it can be automated and combined with speech recognition, and that it has a low perceived invasiveness. The weakness of the system is a high false non-match rate, meaning that genuine users are rejected too often.
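To make the false non-match rate concrete, the sketch below computes it together with the false match rate for a given decision threshold. The scores are invented for illustration only and do not come from any real system; the function name `error_rates` is likewise an assumption.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false non-match rate, false match rate) at a threshold.

    A score at or above the threshold counts as a match.
    """
    false_non_matches = sum(1 for s in genuine_scores if s < threshold)
    false_matches = sum(1 for s in impostor_scores if s >= threshold)
    return (false_non_matches / len(genuine_scores),
            false_matches / len(impostor_scores))


# Invented similarity scores, for illustration only.
genuine = [0.91, 0.84, 0.62, 0.77, 0.58]    # true user speaking
impostor = [0.30, 0.45, 0.66, 0.21, 0.52]   # someone else speaking

strict = error_rates(genuine, impostor, threshold=0.65)
lenient = error_rates(genuine, impostor, threshold=0.50)
print(strict, lenient)
```

Raising the threshold rejects more impostors but also more genuine users, which is the trade-off behind a high false non-match rate.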
Speaker recognition is the computing task of validating a user’s claimed identity by using characteristics extracted from their voice. It uses the acoustic features of speech that differ from person to person. These acoustic patterns reflect both anatomy (the size and shape of the mouth and throat) and learned behaviour patterns (voice pitch and speaking style).
If a speaker claims a certain identity and their speech is used to check that claim, the task is called verification or authentication. Identification is the task of determining an unknown speaker’s identity.
Speaker recognition can be divided into two methods: text-dependent and text-independent. A text-dependent method relies on the person saying a predetermined phrase, whereas a text-independent method accepts any text or phrase. Either method can be deceived by someone playing a pre-recorded phrase spoken by an authorized person, especially when the expected phrase never changes.
A speaker recognition system has two phases: enrolment and verification. During enrolment, the speaker’s voice is recorded and a number of features are typically extracted to form a voiceprint, template or model.
In the verification phase, a speech sample or utterance is compared against a previously created voiceprint. Identification systems compare the utterance against multiple voiceprints in order to determine the best match or matches, while verification systems compare the utterance against a single voiceprint. Because it needs only one comparison, verification is generally faster than identification.
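The difference between the two comparisons can be sketched in a few lines. This is a simplified illustration, not how any particular product works: the voiceprint is assumed to be the average of a few feature vectors, the match score is assumed to be cosine similarity, and the names, numbers and the 0.9 threshold are made up.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def enrol(feature_vectors):
    """Form a voiceprint by averaging the enrolment feature vectors."""
    n = len(feature_vectors)
    return [sum(column) / n for column in zip(*feature_vectors)]


def verify(utterance, voiceprint, threshold=0.9):
    """One-to-one check: does the utterance match the claimed voiceprint?"""
    return cosine_similarity(utterance, voiceprint) >= threshold


def identify(utterance, voiceprints):
    """One-to-many search: return the best-matching enrolled name."""
    return max(voiceprints,
               key=lambda name: cosine_similarity(utterance, voiceprints[name]))


# Made-up feature vectors standing in for features extracted from speech.
voiceprints = {
    "alice": enrol([[0.9, 0.1, 0.3], [1.0, 0.2, 0.2]]),
    "bob": enrol([[0.1, 0.8, 0.7], [0.2, 0.9, 0.6]]),
}
sample = [0.95, 0.15, 0.25]

print(verify(sample, voiceprints["alice"]))  # one comparison
print(identify(sample, voiceprints))         # one comparison per enrolled user
```

Here `verify` performs a single comparison, while `identify` has to score the utterance against every enrolled voiceprint, so its cost grows with the number of users.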
Voice and speech recognition systems are mostly used for telephone-based applications. Voice verification is used in government offices, healthcare, call centres and financial services, and for customer authentication on service calls.