Theme
Today, and every day in the UK, about seven million people will talk to themselves, with only a machine as company. On a weekly basis, about 13 million will do so.
Most often they will be asking questions. If they are at home they are likely to be naming a song. Some may simply be wondering what day of the week it is; others may clamour for a joke or ask an irreverent question.
These people will likely be heard, and should get a response, albeit from a machine, and via a synthesised voice that serves as the vocal representation of digital voice assistants that now reside on tens of millions of devices in the UK. The best-known assistants currently are: Apple’s Siri, Amazon’s Alexa, Google’s Assistant, Microsoft’s Cortana and Samsung’s Bixby.
Most smartphones, tablets and PCs have a pre-loaded digital assistant, and apps for additional voice assistants can be downloaded. The latest TV sets include a voice assistant, and many smart TVs feature voice recognition. This technology is also found in and supported by an ever-widening array of devices and objects, from cars to ovens, and from robot hoovers to garden sprinklers.1
The availability of a voice assistant does not guarantee it will be used. As of mid-2018, the majority of voice assistants integrated into devices are dormant. For example, 88 per cent of 16-75 year olds have a smartphone, but only 34 per cent (about a third) have ever used any voice assistant on this device (see Figure 1), and a mere nine per cent (about a tenth) use this functionality daily (see Figure 2). Daily usage levels on most other devices are even lower: the exception is the smart speaker (see Figure 3).
While a mere 12 per cent of 16-75 year olds have access to a smart speaker, this cohort are enthusiasts. Over half (55 per cent) of smart speakers are used daily (equivalent to four per cent of all respondents), and 85 per cent are used weekly (see Figure 3). The smart speaker’s voice assistant is the second most commonly used on a daily basis across all devices.
The high frequency of usage may reflect a high proportion of early adopters among current owners of smart speakers, as well as a degree of novelty. Or it may simply be that voice assistants and speakers are a particularly effective combination, available at an accessible price point, and with voice being the predominant means of interacting with this device.
Over the last year, access to a smart speaker (which would most commonly mean one being in the respondent’s home) more than doubled from 5 to 12 per cent. Should the base of smart speakers grow 140 per cent again over the next 12 months and the frequency of usage be maintained, this device may displace the smartphone as the device most commonly used to access a voice assistant.
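The arithmetic behind this projection can be sketched as follows. This is a rough illustration rather than a forecast: it uses only the survey figures quoted above, and it assumes that daily smart speaker usage scales in proportion to access and that daily smartphone voice usage stays at its mid-2018 level. The function and variable names are illustrative.

```python
# Figures quoted in the text (per cent of all 16-75 year old respondents).
ACCESS_2017 = 5.0          # access to a smart speaker a year earlier
ACCESS_2018 = 12.0         # access to a smart speaker, mid-2018
SPEAKER_DAILY_2018 = 4.0   # daily users of a smart speaker's voice assistant
PHONE_DAILY_2018 = 9.0     # daily users of a smartphone's voice assistant


def growth_rate(before: float, after: float) -> float:
    """Return year-on-year growth as a fraction (1.4 means 140 per cent)."""
    return (after - before) / before


def project(value: float, rate: float) -> float:
    """Grow a value by the given fractional rate for one year."""
    return value * (1 + rate)


rate = growth_rate(ACCESS_2017, ACCESS_2018)
projected_access = project(ACCESS_2018, rate)

# Assumption: daily usage grows in line with access (frequency maintained).
projected_speaker_daily = project(SPEAKER_DAILY_2018, rate)

print(f"Growth in access over the last year: {rate:.0%}")
print(f"Projected access in 12 months: {projected_access:.1f}%")
print(f"Projected daily smart speaker users: {projected_speaker_daily:.1f}%")
print(f"Daily smartphone voice users (held flat): {PHONE_DAILY_2018:.1f}%")
```

Under these assumptions, daily smart speaker usage would come to a little under ten per cent of respondents, marginally ahead of the nine per cent who use a smartphone voice assistant daily. The margin is narrow, so any growth in smartphone voice usage over the same period would change the outcome.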
Smart speaker adoption should continue to grow strongly: they are still relatively affordable and scarce. Their potential intrigues. And their capabilities should increase over the coming years as voice recognition becomes more accurate, and as the range of applications available expands.
Access to smart speakers has been able to climb fast partly because of the cost, and also because of the prior low ownership base. Entry level smart speakers from Amazon and Google are often available on promotion for about £30, a level at which people may be tempted to experiment with one. At this relatively modest cost, the device may have to fulfil only a few applications to be considered value for money. Thirty pounds would pay for a reasonable radio, a reasonable Bluetooth speaker, or it would pay for a smart speaker (at promotional prices) that you could ask to play a radio station or your music. And that speaker could also be your kitchen timer, a gadget that would otherwise cost a few pounds.
One analyst has forecast a global base of 100 million smart speakers by the end of 2018.2 This is a large number, and as they are placed in homes, the user base of these devices may be 200 million, but for context, 1.5 billion smartphones, most of which would come pre-loaded with a voice assistant, were shipped in 2017. The base of smart speakers is also far smaller than the global base of PCs, tablets or TV sets.
The application of a voice assistant is likely to vary by context. Each device is used in different environments.
Smartphones accompany users, are typically held when used, and may be used in environments with background noise; these factors may cause a higher failure rate among questions asked.
Smart speakers are often in a room, tethered to the mains, and often with no ambient noise. Users would typically be within a few metres of the device.
The most common application of a digital assistant on a smartphone is to search for general information. This was mentioned by 54 per cent of smartphone owners who used a voice assistant on this device (see Figure 4).
For speakers, it is to play music (it is a speaker, after all), used by 77 per cent of those utilising a voice assistant on this device (see Figure 5). While the audio quality of an entry level smart speaker may be merely adequate, this may suffice for a large chunk of the population.3
For both devices, the second most common application is weather information (we are British, after all).
On speakers, search is the third most popular application; on smartphones playing music via a digital assistant is the fifth most popular.
Voice assistants on smartphones and smart speakers will always be used in different contexts. Voice commands are likely to become more popular on TV sets, and TV related commands, from requesting specific programmes to changing the volume, are inevitably going to be the most common commands on that device. As online content libraries swell, voice recognition may become increasingly used as a way to request specific episodes of a favourite series.
While the surge in usage of digital assistants remains relatively recent, adoption and usage may be strongest among older age groups. Access to smart speakers trebled year on year among 55-75 year olds from a low base; across most other age groups, ownership doubled (see Figure 6).
As of mid-2018, daily usage of smart speakers was highest among 45-75 year olds, at 63 per cent, compared with only 40 per cent among 18-24 year olds. Weekly usage among 45-54 year olds was 90 per cent.
One reason for smart speakers’ popularity among these age groups may simply be ease of use. The ability to speak a request of a machine, as opposed to typing it, may avoid the need to find reading glasses. Some older users may also be more at ease conversing with a machine than traversing a menu on a screen.
Over the next few years, voice assistants will likely proliferate. Usage should rise steadily, accuracy will continue improving, language support should increase, the array of devices incorporating a voice assistant will increase, and the number of applications that support a voice assistant will rise.
It is very likely that we will talk to machines more, but this does not mean that we will cease to tap, type or swipe. In fact usage of all forms of interface with machines may increase.
Voice works well in specific contexts, but may always be hampered by the inherent inefficiency and nuance of voice. Ordering a pizza via an app that pulls name, address and credit card information off the phone is simple; speaking that same information is a chore. Furthermore, tapping, typing and swiping have no equivalent of an accented voice. In London there are three hundred different languages spoken, and perhaps thousands of variations on the pronunciation of ‘Gloucester Terrace’ or ‘Theydon Bois’.
Making a pizza with dough-sodden hands, and using a voice assistant to get instructions, set timers, and select music, is where voice comes into its own. This qualifies voice as a core facet of the evolution of the user interface, but it does not mean it will become the dominant user interface.
1. 15 Best Amazon Alexa compatible gadgets to buy in 2018, MakeUseOf, 18 January 2018: link
2. Smart speaker installed base to hit 100 million by end of 2018, Canalys, 7 July 2018: link
3. The smart speaker category describes a vast array of audio capability. Entry level devices, which are likely to represent the majority of units sold, may be purchased predominantly for their voice assistant. High-end speakers costing hundreds of pounds may come with a voice assistant by default, but be selected on the basis of their perceived audio quality.