A website enhances both your digital presence and your functionality. Websites are growing in number day by day and becoming more sophisticated over time.
Startups are leveraging various tactics to establish their online market through stunning web designs and capable blogging frameworks.
Web designers spend a great deal of their time mulling over out-of-the-box ideas. They are continuously looking for creative and arresting designs and features to blend into their websites. However, there is a simple but attractive feature that many of them miss, and that is speech recognition.
Understanding Speech Recognition
Many reckon speech recognition to be a complicated task. It may have been, but that was before the dawn of the Web Speech API. You may have seen the mic icon in the Google search box while using Google Chrome. That feature is built on the Web Speech API, which Chrome implements on top of Google's speech recognition service. Support in other browsers, including Firefox, is more limited. But before we go into the Web Speech API, it is essential to understand the fundamentals of speech recognition.
Conversion of Speech Form
In a speech recognition system, the first element is “speech.” The microphone attached to the device picks up the speech as sound and turns it into an electrical signal. An analog-to-digital converter, which is part of the system along with the mic, then produces the digital equivalent of the speech.
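As a rough illustration of what the converter produces, the sketch below takes sample values in the range -1 to 1 and quantizes them to 16-bit integers. The function names, the 16-bit depth, and the 440 Hz / 8000 Hz figures are assumptions made for this example; real hardware and browsers handle this step for you.

```javascript
// Convert samples in the range [-1, 1] to 16-bit signed integers.
// The 16-bit depth is an assumption for illustration; real converters vary.
function quantizeTo16Bit(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the integer range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    out[i] = Math.round(clamped * 32767);
  }
  return out;
}

// Stand-in for a microphone signal: a sine wave measured at regular intervals.
function sampleSine(frequencyHz, sampleRateHz, count) {
  const samples = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    samples[i] = Math.sin((2 * Math.PI * frequencyHz * i) / sampleRateHz);
  }
  return samples;
}

// Eight digital samples of a 440 Hz tone taken 8000 times per second.
const digital = quantizeTo16Bit(sampleSine(440, 8000, 8));
```

The higher the sample rate and bit depth, the more faithfully the digital signal follows the original sound, at the cost of more data to process.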
Structuring of Speech
The analysis and theory of speech structure belong to the domain of Natural Language Processing. Here the complex concepts of linguistics and phonetics are employed to devise strategies to structure and analyze speech. A programmer need not go to such trouble. Still, a firm understanding of the process can provide meaningful insights during coding.
Speech is a waveform and is thus analyzed part by part. The waveform is split into segments at the silences or gaps that occur in it.
Each segment is then broken down into phonemes, the fundamental units of speech.
The way a phoneme is uttered varies from person to person, so the recognizer matches the sequence of phonemes against known patterns to decipher the most likely word.
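The segmentation step can be sketched in a few lines. This is a simplified illustration, not how a production recognizer works: the function name `splitOnSilence` and the `threshold` and `minSilence` values are assumptions chosen for the example.

```javascript
// Split a sampled waveform into segments wherever a run of quiet samples occurs.
// threshold and minSilence are illustrative tuning values, not standard constants.
function splitOnSilence(samples, threshold = 0.05, minSilence = 3) {
  const segments = [];
  let start = -1;    // index where the current segment began, or -1 if none
  let lastLoud = -1; // index of the last loud sample in the current segment
  let quietRun = 0;  // length of the current run of quiet samples

  for (let i = 0; i < samples.length; i++) {
    const loud = Math.abs(samples[i]) >= threshold;
    if (loud) {
      if (start === -1) start = i;
      lastLoud = i;
      quietRun = 0;
    } else if (start !== -1) {
      quietRun++;
      if (quietRun >= minSilence) {
        // A long enough gap closes the segment at its last loud sample.
        segments.push(samples.slice(start, lastLoud + 1));
        start = -1;
        quietRun = 0;
      }
    }
  }
  // Flush a segment that was still open when the input ended.
  if (start !== -1) segments.push(samples.slice(start, lastLoud + 1));
  return segments;
}
```

A short dip below the threshold stays inside a segment; only a gap of at least `minSilence` samples starts a new one.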
Web Speech API
The W3C Speech API Community Group introduced the Web Speech API specification in 2012. Google Chrome was the first browser to capitalize on it, and support elsewhere remains uneven. Hence Google's speech recognition engine can be used to develop creative and inventive speech recognition features. Google currently offers it free of charge as well.
Broadly speaking, the Web Speech API can perform two functions:
- Speech Recognition
- Speech Synthesis
This API allows us to take voice input and generate the corresponding text. That text can then drive various features on the website, such as tab control, speech-based search, booking, payment, and password entry.
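Once the speech has been turned into text, routing it to a feature is ordinary string handling. The sketch below shows one possible approach; the command phrases, the action names, and the `parseVoiceCommand` function are all hypothetical and would be replaced by whatever your site supports.

```javascript
// Hypothetical command table: each entry pairs a spoken pattern with an action name.
const COMMANDS = [
  { pattern: /^search for (.+)$/, action: "search" },
  { pattern: /^(?:open|go to) (?:the )?(.+) tab$/, action: "openTab" },
  { pattern: /^book (.+)$/, action: "book" },
];

// Turn a recognized transcript into an action and its argument.
function parseVoiceCommand(transcript) {
  // Recognizers may return mixed case, padding, or trailing punctuation.
  const text = transcript.trim().toLowerCase().replace(/[.!?]+$/, "");
  for (const { pattern, action } of COMMANDS) {
    const match = text.match(pattern);
    if (match) return { action, argument: match[1] };
  }
  // Anything unrecognized is passed through so the page can decide what to do.
  return { action: "unknown", argument: text };
}
```

For example, `parseVoiceCommand("Open the pricing tab")` yields the action `openTab` with the argument `pricing`.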
A Sample Speech Recognition Tool
The Web Speech API has a primary interface, SpeechRecognition, which acts as the controlling interface for the speech recognition service. It also includes several closely related interfaces dealing with vocabulary, grammar, results, etc.
Here is some simple code to design and test your own speech recognition tool. If all goes well, you will get a text box with a microphone icon on a light blue page.
The CSS below provides the blue colouring and the layout. You can build on it further to express your creativity.
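Before using the controlling interface, it helps to check that the browser actually provides it. The following is a minimal sketch; `createRecognizer` and its `options` and `root` parameters are names made up for this example, while `lang`, `continuous`, `interimResults`, and `maxAlternatives` are properties of the recognition object itself.

```javascript
// Build a configured recognizer, or return null when the browser lacks support.
// `root` defaults to the global object, which is `window` in a browser.
function createRecognizer(options = {}, root = globalThis) {
  // Chrome exposes the constructor under a webkit prefix.
  const Recognition = root.SpeechRecognition || root.webkitSpeechRecognition;
  if (!Recognition) return null;

  const recognition = new Recognition();
  recognition.lang = options.lang || "en-US";
  recognition.continuous = Boolean(options.continuous);
  recognition.interimResults = Boolean(options.interimResults);
  recognition.maxAlternatives = options.maxAlternatives || 1;
  return recognition;
}
```

Returning `null` instead of throwing lets the page quietly hide the microphone icon in browsers that do not support the feature.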
<!-- CSS Styles -->
<style>
html, body {
display: flex;
align-items: center;
justify-content: center;
background-color: lightblue;
}
.record {
position: relative;
width: 246px;
display: inline-block;
}
.record input {
text-align:center;
border: 0;
width: 240px;
display: inline-block;
height: 30px;
}
.record img {
float: right;
width: 25px;
height: 25px;
border: none;
position: absolute;
right: 7px;
top: 3px;
}
.container {
display: inline-block;
text-align: center;
}
h1 {
font-family: constantia;
}
</style>
The HTML and JavaScript you need to call the API and perform the recognition are provided below:
<!DOCTYPE html>
<html>
<head>
  <title>Voice Recognition: Mindster</title>
</head>
<body>
  <!-- Search Form -->
  <div class="container">
    <h1>Voice Recognition in HTML</h1>
    <div class="record">
      <form id="speak-form" method="get" action="/8ffdefbdec956b595d257f0aaeefd623/search">
        <input type="text" name="q" id="transcript" placeholder="Speak" />
        <img onclick="startRecording()" src="http://icons.iconarchive.com/icons/designbolts/free-multimedia/1024/Studio-Mic-icon.png" alt="Start voice input" />
      </form>
    </div>
  </div>
  <!-- HTML5 Speech Recognition API -->
  <script>
    function startRecording() {
      // Chrome exposes the recognizer under a webkit prefix
      var Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
      if (!Recognition) {
        return; // speech recognition is not supported in this browser
      }
      var recognition = new Recognition();
      recognition.continuous = false; // stop after a single phrase
      recognition.interimResults = false; // report final results only
      recognition.lang = "en-US";
      // Attach the handlers before starting so that no event is missed
      recognition.onresult = function(e) {
        // Put the top transcript into the text box and submit the search
        document.getElementById('transcript').value = e.results[0][0].transcript;
        recognition.stop();
        document.getElementById('speak-form').submit();
      };
      recognition.onerror = function(e) {
        recognition.stop();
      };
      recognition.start();
    }
  </script>
</body>
</html>
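The onerror handler in the sample simply stops the recognition, so the user never learns what went wrong. One possible refinement is to translate the error code carried by the event into a readable message. The codes below are ones defined by the Web Speech API; the wording of the messages and the `describeRecognitionError` helper are assumptions for illustration.

```javascript
// Map error codes from the recognition error event to user-facing messages.
// The message wording is illustrative; adapt it to your site.
const ERROR_MESSAGES = new Map([
  ["no-speech", "No speech was detected. Please try again."],
  ["audio-capture", "No microphone was found."],
  ["not-allowed", "Microphone access was blocked."],
  ["network", "The recognition service could not be reached."],
]);

function describeRecognitionError(code) {
  // Fall back to a generic message for codes that are not in the table.
  return ERROR_MESSAGES.get(code) || "Speech recognition failed (" + code + ").";
}

// Possible use inside the sample:
// recognition.onerror = function(e) { alert(describeRecognitionError(e.error)); };
```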
Security Challenges
There are some significant security challenges in using this feature. Some of them are listed below:
- User consent before the activation of the feature is necessary. Users should know that audio is being recorded in the background.
- A visible cue, such as a blinking light or a similar indicator, should be provided to alert the user that audio is being recorded.
- Developers must work on creating a three-level security module to categorize the recording of passwords and confidential speech elements.
- Provide detailed guidelines that educate users about the kinds of threats and about how to set up their surroundings to prevent overhearing, disturbance, etc.
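The first two points, consent and a visible indicator, can be enforced in code rather than left to convention. The sketch below is one way to do it under stated assumptions: `createGuardedRecorder`, `askConsent`, and `indicator` are hypothetical names supplied by the page, while `onstart`, `onend`, `start()`, and `stop()` belong to the recognition object. The browser's own microphone permission prompt still applies on top of this.

```javascript
// Wrap a recognizer so it only starts after consent and always shows an indicator.
// `askConsent` and `indicator` are hypothetical hooks supplied by the page.
function createGuardedRecorder(recognition, indicator, askConsent) {
  let consented = false;

  // Keep the indicator in sync with the actual recording state.
  recognition.onstart = () => indicator.show();
  recognition.onend = () => indicator.hide();

  return {
    start() {
      // Ask once, and remember a positive answer for later calls.
      if (!consented) consented = Boolean(askConsent());
      if (!consented) return false; // never record without explicit consent
      recognition.start();
      return true;
    },
    stop() {
      recognition.stop();
    },
  };
}
```

In a browser, `askConsent` could be as simple as `() => window.confirm("Allow voice input?")`, and `indicator` could toggle a CSS class on the microphone icon.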
The speech recognition feature and its integration have opened up a whole new dimension in web design and in how we use the internet. Speech recognition is already revolutionizing the entire realm of mobile application development.
With speech recognition, the web will become much friendlier to people with disabilities. You won’t have to type out the mundane details when filling in application forms; all you have to do is speak aloud, lying comfortably on your back, and the blanks will fill themselves in. If you do not like the color of a website, speak your favorite color aloud and it will be so.
All the above applications and many more are possible just by customizing your website with speech recognition. So, do not sit on this new possibility. Transform your site now with speech recognition and see the miracle.
Premjith leads the Digital Marketing team at Aufait Technologies, a top-notch SharePoint development company in India. With 4 years of valuable experience in online marketing, he helps clients expand their online presence and grow novel business ideas.