We’re always trying to learn new things and experiment with new technologies. Often a project starts with a question we don’t know the answer to. Our latest project stemmed from one such question: will voice activation work for out-of-home (OOH) advertising?
Voice recognition and activation technology is growing rapidly. According to ComScore, 50% of all searches will be completed via voice by 2020. That figure becomes less surprising as time moves on and more people own an Alexa or Google Home. With voice commands seamlessly assisting with the day-to-day running of households, why couldn’t they be used in OOH as well?
To find out how feasible it is, we started by buying a few different types of microphone to test which would be most suitable in the OOH environment. We opted for two small mics, one stage mic and one shotgun mic. We thought this would provide a decent range of sizes and directional options to at least give us a starting point.
The actual process for testing a microphone isn’t remarkably scientific. Our creative technologist Jon Jones attached various mics to an easyVR module for testing. By attaching, we mean literally getting the old soldering iron out and sticking the parts together. Then someone stands a metre or so away to see whether the easyVR can understand what they’re saying without being thrown by irrelevant noise.
We found that while the easyVR module was trainable to a particular voice, it would struggle to accurately understand the same phrase said by someone else. As the module is primarily designed as a hobbyist item, we had to explore different software options for speech recognition. Writing some software to interface with Microsoft Azure’s Cognitive Services for speech proved hugely successful, accurately transcribing what a user was saying.
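For anyone curious what that kind of interface code can look like, here is a minimal sketch rather than the software we actually wrote. It assumes Azure’s speech-to-text REST endpoint for short audio, a 16 kHz PCM WAV recording, and a subscription key and region of your own; the function names (`build_request`, `parse_transcript`, `transcribe`) and the `en-GB` default are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_request(wav_bytes, key, region, language="en-GB"):
    """Build the POST request for one short WAV clip (assumed 16 kHz PCM)."""
    query = urllib.parse.urlencode({"language": language, "format": "simple"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # your own Azure Speech key
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")


def parse_transcript(body):
    """Return the recognised text, or None if nothing was recognised."""
    result = json.loads(body)
    if result.get("RecognitionStatus") == "Success":
        return result.get("DisplayText")
    return None


def transcribe(wav_path, key, region):
    """Send a recorded clip to the service and return the transcript."""
    with open(wav_path, "rb") as audio_file:
        request = build_request(audio_file.read(), key, region)
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_transcript(response.read())
```

Calling `transcribe("clip.wav", key, region)` with a clip recorded from the mic would return the text the service heard, or `None` when it could not match any speech.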
In our tests, all of the microphones performed well, and the directional shotgun mic gave us the greatest flexibility. Being realistic, though, it wouldn’t be practical for digital OOH. We would likely need something much smaller and more compact that could deliver the same accuracy.
We decided to see how far we could push the compactness of the microphone and invested in a small USB lapel mic, which would be perfect for hiding in a digital OOH installation. A couple of test trips down Regent Street and at bus stops later … Success! This combination of hardware and software was performing brilliantly.
With the mic sorted, the next phase was training the AI to recognise what people are trying to say. People generally veer away from the script if the question is too vague or open-ended.
Adding a bit more code enabled us not only to hear what a user was saying but also to understand, in depth, the meaning and intent behind it. For example, “I’m hungry” and “What’s for lunch?” have the same intent: a request to find somewhere to eat. With voice activation/detection in OOH, simplicity is key, so minimise the options available as potential answers or actions.
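To make the idea of intent concrete, here is a toy sketch in which different phrases collapse into one action. It uses plain keyword matching as a stand-in for a trained language-understanding model, so it is not how our build worked; the intent names and keyword lists are invented for illustration.

```python
import re

# Hypothetical intents and trigger words. A small set keeps the choices simple.
INTENT_KEYWORDS = {
    "find_food": {"hungry", "lunch", "eat", "food", "dinner"},
    "find_coffee": {"coffee", "latte", "espresso"},
}


def detect_intent(utterance):
    """Return the intent whose keywords best match the utterance, or None."""
    # Normalise curly apostrophes, lowercase, and split into words.
    text = utterance.lower().replace("’", "'")
    words = set(re.findall(r"[a-z']+", text))

    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)  # number of keywords heard
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


for phrase in ["I'm hungry", "What's for lunch?", "Nice weather today"]:
    print(phrase, "->", detect_intent(phrase))
```

Both food phrases map to `find_food`, while the weather remark maps to nothing at all. That is the behaviour you want on a screen with only a handful of possible actions.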
To train the device, we had our colleagues talk to the computer and answer the set question. We ended up with some rather rogue responses… But it was all a learning curve and really reinforced the need to ask simple questions that couldn’t be misinterpreted.
Overall it’s been an incredibly interesting project, and certainly the potential for bringing voice activation to digital out of home is there. It will be exciting to see how voice technology evolves and becomes a more prominent feature in people’s lives.