Voice is About to Become the New User Interface and a Global Equalizer

In November 2014, Amazon released the first smart speaker with a built-in digital assistant, named Alexa. For those not familiar with it, it’s simply a cloud-connected speaker to which a user can issue voice-based commands to do simple things like play music, get weather, traffic and news updates, and control other smart devices in the home. The user starts the conversation by uttering a ‘wake’ word before the command, for example ‘Alexa, how is the weather today?’, and Alexa responds with the weather update. Here ‘Alexa’ is the wake word: the moment the speaker hears it, it actively listens for the words that follow and decodes them using cloud-based speech recognition systems. As of mid 2019, Amazon estimates that 30% of American and European homes have a smart speaker, up from 22% a year ago. This is about 100 million Alexa devices in the market.
Not to be left behind, Google released its own virtual assistant, Google Assistant, in 2016. It was initially available on select smart speakers, but by 2019 Google had made it available on over 1 billion Android phones worldwide (talk about scale!). Other virtual assistants include Apple’s Siri, available on all iPhones, Microsoft’s Cortana in Windows 10 and Samsung’s Bixby.
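
The wake-word flow described above can be sketched in a few lines of Python. This is a toy that works on text that has already been transcribed, not on audio; as I understand it, real devices spot the wake word in the audio signal itself and only then send speech to the cloud. The function name `extract_command` is my own and purely illustrative.

```python
import re
from typing import Optional


def extract_command(utterance: str, wake_word: str = "alexa") -> Optional[str]:
    """Return the words spoken after the wake word, or None if it was not heard."""
    # Lowercase and drop punctuation so that "Alexa," still matches "alexa".
    words = re.findall(r"[a-z0-9']+", utterance.lower())
    if wake_word not in words:
        return None  # No wake word: the device stays idle.
    start = words.index(wake_word) + 1
    command = " ".join(words[start:])
    return command or None  # Wake word alone carries no command.


print(extract_command("Alexa how is the weather today?"))
print(extract_command("How is the weather today?"))
```

Only the first call yields a command; the second is ignored because the wake word never appears.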

With these assistants available on smart speakers and phones, a user can interact with a computing device such as a phone or personal computer to access information and carry out tasks that would traditionally have required an input device such as a touch screen, mouse or keyboard. For example, instead of unlocking my phone and opening Google Maps to check traffic conditions to, say, Galleria Mall, all I need to do now is say to my phone ‘Hey Google, traffic to Galleria Mall?’ and the assistant answers with something like ‘There is moderate traffic to Galleria Mall. From where you are, it should take you 7 minutes to get there’. I can also make a phone call by simply saying ‘Hey Google, call Thomas Sankara’ and the assistant will look up his number in my phone book and place the call without me touching the phone. On the appliances and electronics side, I no longer need to look for the TV remote to change channels; I can simply say ‘TV, change channel to BBC News’ and it’s done. This is good in many ways because:

  1. It’s much faster and involves fewer steps to get the same results, if not better
  2. It is more natural and intuitive than current interfaces, which often need some training, skill or even literacy to use
  3. I can do all this while my hands and eyes are occupied with something else. For example, if I’m driving, I can still use maps and make calls without looking at or touching the phone. Likewise, I could ask the TV to change channels while I’m busy preparing a sandwich.

Beyond accessing information from the internet as in the examples above, voice-based assistants can also be used to control smart devices and appliances (which explains Samsung’s foray with Bixby), schedule or cancel meetings and open apps on the phone, all by voice command.

Why is this a big deal?
With the recent advances in artificial intelligence and machine learning, speech recognition systems have become pretty accurate at deciphering words in human speech. Speech is a highly variable input: everyone has a unique voice and accent, and surrounding noise varies. This initially made it difficult to get computing systems to understand human speech, but with the AI and machine learning advances of the last 5 years, it is now possible. Google Assistant and Alexa can now decipher English spoken by Lemaiyan from Narok or by Billy Ray Cyrus from Kentucky with near-equal accuracy.

The biggest advantage voice has is that the AI systems that power these digital assistants are now being trained in various languages and dialects. As of mid 2019, Amazon’s Alexa supports seven languages overall: English, French, German, Italian, Japanese, Portuguese (Brazilian), and Spanish. Google Assistant, on the other hand, currently supports sixty languages overall, including Swahili, Telugu, Gujarati, Zulu, Mandarin and many more.
With more languages being added, voice-based interaction with the Internet through mobile phones and smart speakers means that people who were previously locked out of the benefits of the Internet because they could not read and write will all of a sudden be able to access the limitless opportunities that being connected presents, in the comfort of their local language. It will soon be possible for everyone in the world to search for information on the internet and to interact with mobile phone or computer apps, home appliances and electronics by simply speaking to them in the local language. This will be the most significant step in bridging the digital divide since the liberalization of telecommunications in the 1990s and can be leveraged to create a more equal society. The multiplier effect of this is mind-boggling if you think about it. A farmer in Eldoret will be able to seek markets for his produce or even operate a herbicide-spraying drone by issuing voice commands in his local language. A mother in rural Sri Lanka will be able to seek nutritional information for her child by speaking the local language to her phone’s digital assistant, or set reminders for hospital visits and school meetings, without needing to know how to read and write in English. A non-Greek speaker will also be able to participate seamlessly in conversations taking place in Greek by using the assistant to translate back and forth.

The popularity of voice-based interaction is also growing, with the touch screen slowly taking a back seat as the main user interface to the treasure trove that is the Internet and to modern appliances and electronics. The stats below, sampled from developed countries, support the view that adoption of voice as a user interface to technology and the services it provides is on a hockey stick trajectory (source):

  1. 40% of adults use voice search on a daily basis (Forbes)
  2. 52% of people use voice search while driving (Social Media Today)
  3. 65% of consumers ages 25-49 years old talk to their voice-enabled devices daily (PwC)
  4. On average, more men than women use voice search at least once per month (Social Media Today)
  5. A study conducted by Uberall found that 21% of respondents were using voice search on a weekly basis (Search Engine Watch)
  6. Close to 50% of people are now researching products using voice search (Social Media Today)
  7. The number of voice searches increased 35x from 2008 to 2016 (Kleiner Perkins)
  8. A HubSpot survey found that 74% of respondents had used voice search within the last month (HubSpot)
  9. Mobile voice search on Google is now available in over 60 languages (Wikipedia)

With voice becoming the main mode of interaction with the online world, voice-based services will also be on the rise. Organizations are today deploying chatbots and voicebots to answer customer queries, take orders and fulfill them. For example, in the USA, it’s now possible to order pizza from Pizza Hut by simply saying ‘Alexa, order Pizza Hut’ and it will provide the menu options. If you instead say ‘Alexa, reorder Pizza Hut’, it re-orders what you ordered last time. This improves the efficiency of service delivery, as these bots are available 24/7 at nearly zero marginal cost per additional customer, unlike hiring humans to do the work. These systems are also very well versed in the specific details and operations of the company and know where each bit of information sits in the organization. A chatbot does not need to put the customer on hold to confirm something with the sales or finance department; it has access to all this information and can serve the customer in real time.

Social media will also move from the current text and multimedia based platforms such as Facebook to voice-based personas or avatars. Instead of curating an abstract Facebook wall with posts and status updates, people will curate voice avatars that are continuously trained to learn about them and even speak on their behalf (in their exact voice, even). For example, I could train my avatar to respond to questions on social media on my behalf. If my avatar has access to my calendar and I have allowed it to tell people (or specific people) about my schedule and itinerary for the day, then another avatar or user can ask it where I am or what I will be doing at 3PM today and get an answer. My avatar can also represent me in online meetings, take note of what was discussed and what my takeaways or action points are, and share this with me at the end of the day. The line between social media and real life will also blur, as this avatar can take on responsibilities in real life. For example, instead of the HR manager sending a mail to staff inviting them to a physical meeting for a briefing on the new staff medical cover, the manager can invite all staff avatars to the meeting and leave me free to do more productive things during that time, a win-win for everyone. Being AI-based systems, the avatars will also recall and analyse information better than a human, and can be used to carry out repetitive tasks or work on my behalf while I get paid. How efficient the avatar is, and how closely it matches my offline behaviour and character, will be a function of how much I allow it to learn about me. The more I let it learn (how I speak, my moods, my social life, my work life, my plans for the day etc.), the closer it will come to resembling me as I am in real life.
Mix this with all the information that is on the Internet and I have myself a virtual worker who can work on my behalf and interact with others online while I sleep or go fishing in Murang’a. This is the idea behind Microsoft Cortana: create a digital assistant for the workplace that can learn about you and assist you in your work, scheduling and reminding you of meetings, looking for information in the company ERP systems, responding to emails, reading reports and taking action, etc.

Despite all these possibilities, privacy and security stand out as the major roadblock to the adoption of voice-based user interfaces. For example, is your smart speaker, or the Google Assistant on your phone, constantly listening to conversations that do not begin with the wake word? Can hackers eavesdrop on your intimate or personal one-on-one talk with others in the room?
The truth is there will be no escaping voice adoption, as it presents the most natural way for most humans to use and control technology, and for technology to talk back to us with feedback or results in a way we understand. With the coming hyper-connected world and IoT devices, current user interfaces such as touch screens will not let us interact with technology efficiently. There is therefore a need for the developers of these systems to put in place measures that build trust and instill confidence that the systems are not being abused or used to intrude into our private spaces, thoughts and speech.

The other fear is the cybersecurity aspect. There was a story last year where hackers used AI speech generation systems to imitate the voice of a company CEO on the phone and stole a large amount of money. (Read about it here or a local version of the same here). This is a new threat that voice-based systems bring to cyberspace, and it needs to be dealt with in the design and implementation of these systems.

Finally, web-based systems and apps are these days designed with a ‘mobile first’ philosophy. This is about to change to ‘voice first’. Watch this space.

My Thoughts on The US-China Trade Wars’ Effect on US Tech Dominance

The arrest of Huawei’s CFO, Meng Wanzhou, in Vancouver in December 2018 at the request of the US brought to the fore the ongoing trade war between the US and China, which is mostly instigated by US president Donald Trump. The arrest came after the U.S. Department of Justice accused Meng (who is also the daughter of Huawei CEO Ren Zhengfei) of allowing SkyCom (a Huawei subsidiary) to do business in Iran, violating U.S. sanctions against the country and misleading American financial institutions in the process. This offence attracts a jail term of over 30 years in the US.
Hot on the heels of her arrest came concerns that Huawei’s perceived close ties with the Chinese communist government would allow the Chinese state to spy on any country that runs Huawei telecom equipment, especially in the upcoming 5G networks. The US alleges that Huawei has ‘backdoors’ in all its hardware that allow unhindered entry into any network to conduct espionage or even shut down the equipment. The US has therefore banned all US telecom operators from using Huawei equipment in their networks, especially in 5G deployments.

Why 5G and not 4G?
Unlike 4G, which was backward compatible with the older 3G and 2G technologies, 5G is the first generation of mobile technology that is not backward compatible. This means that rolling out 5G requires a totally new network, whereas an upgrade from 3G to 4G was mostly done by upgrading the software of several network components and adding a few more. For mobile operators, 5G therefore means building an entirely new network from scratch.
Another factor is that 5G is designed to power the next generation of connected devices and speed up the adoption of IoT worldwide. In the near future, vehicles, furniture and factory machinery will all be connected; surgeries, remote robot operations and many of today’s manual activities will be done by machines connected via the 5G network. In a global economy that increasingly depends on connectivity for its economic and technical efficiencies, the desire to have full control of the 5G network is obvious. Take a scenario where the entire US transport, agriculture and manufacturing sectors run on 5G network equipment to which the Chinese government has direct access through backdoors created by Huawei for the Chinese state. This is what Trump fears. He is somewhat justified to harbour these fears, but the question is whether they are valid. I will not delve into the politics of it but will instead look at the possible effects of America’s actions towards Huawei and at the industry dynamics.

Trade Wars and Cybersecurity
With the escalating trade war between the US and China, Donald Trump last week signed an executive order declaring a national emergency relating to securing the US cybersecurity supply chain. Under the order’s provisions, the U.S. government will be able to ban any technologies that could be deemed a national security threat. The order, “Securing the Information and Communications Technology and Services Supply Chain,” opened the door for the US government to classify companies like Huawei as a national security threat and ban their technology from the US. It also forbade US companies from trading with Huawei unless they have a special license from the government.

This order has resulted in companies such as Intel, Qualcomm and Alphabet (Google’s parent company) stopping the supply to Chinese firms of components that use American technology. One of the most notable announcements was Alphabet’s decision to stop Huawei from accessing the licensed Android operating system that it uses on its smartphones. With Huawei being the largest telecom equipment manufacturer and the 2nd largest smartphone manufacturer in the world (Samsung being the leader and Apple number 3), this ban will have far-reaching effects on Huawei’s business plans. However, if the rumors are true, Huawei has been anticipating this day and already has an in-house mobile OS that is believed will take over from Android. It has also been stockpiling chips and components that can last it 3 more months as it seeks alternatives.

In their 2011 book “That Used to be Us: How America Fell Behind in the World It Invented and How We Can Come Back”, Thomas Friedman and Michael Mandelbaum list the major problems the US faces today and possible solutions. These problems are: globalization, the revolution in information technology, the nation’s chronic deficits, and its pattern of energy consumption. They also go ahead and state that to reclaim their position, the US must approach this revival with ‘war-like’ conviction. Something Trump seems to be doing to the letter.

US Tech Dominance is Waning
Several events have shown that the US is losing its dominance in the technology space and is doing all it can to protect that privilege.
Early in 2018, the US banned ZTE, a Chinese mobile equipment manufacturer, from sourcing electronic parts from US suppliers. The reason for the ban, again, was the flouting of the Iran sanctions by supplying Iran with telecom equipment. The ban resulted in ZTE seeking alternative suppliers and also led to more investment in China’s chip manufacturing sector. A month before, Trump vetoed the proposed takeover of the US-based microchip manufacturer Qualcomm by Broadcom (a US-founded firm domiciled in Singapore) due to the ownership structure. FYI, Qualcomm holds several key patents on 3G and 4G, which are among its biggest revenue streams. Every 3G and 4G phone purchased globally pays a royalty fee to Qualcomm for using its patents.
With the upcoming massive uptake of 5G and the rollout of a connected global economy, the US feared that Broadcom’s takeover of Qualcomm would make it lose control of the 5G technology space. With Huawei having developed its own 5G microchip and the global market for 5G chipsets in smartphones expected to grow at a compound annual rate of 75 per cent between 2019 and 2024, the US is fearful that cheaper and better Huawei 5G chips will erode Qualcomm’s revenues and dominance.
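
To put that 75 per cent figure in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes the rate compounds once a year over the five years from 2019 to 2024; the function name is mine and purely illustrative.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """How many times larger a market gets after compounding annually."""
    return (1 + annual_rate) ** years


# 75% a year, compounded over the 5 years from 2019 to 2024
print(round(growth_multiple(0.75, 5), 1))  # prints 16.4
```

Under that assumption, the 5G smartphone chipset market of 2024 would be roughly sixteen times the size of the 2019 one, which explains why nobody wants to cede it.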

The recent settlement between Apple and Qualcomm inadvertently made Huawei a top contender in the 5G chip market. When Apple agreed to pay Qualcomm the disputed royalties for Apple’s use of Qualcomm patents, it also agreed to use Qualcomm chips in all its subsequent 5G phones. Since 2016, Apple had been using Intel chips as the court battle raged, which gave Intel an opportunity to finally ride the mobile wave it nearly missed some years ago. However, Intel’s shock announcement that it was pulling out of 5G chip research and development, made hours after Apple and Qualcomm settled their dispute, positioned Huawei as a possible major 5G chip supplier. All this time, Huawei has been developing 5G chips for its own consumption (used in Huawei equipment only). But with Intel pulling out, Qualcomm’s obsession with high royalty fees, and Huawei offering the same technology (if not better) at a much lower cost and free of royalties, Huawei can easily start supplying its 5G chips to third parties such as Nokia, Ericsson, Samsung and others. This prospect of a global economy running on Chinese-supplied 5G chips is what is scaring Trump. All the ‘national security’ talk is but an excuse to defend the seemingly drastic measures the US is taking against China as it heads full-steam towards tech dominance. On the Internet front, the US is also feeling the heat of the world’s decreasing dependency on US infrastructure to power the global internet. China has effectively managed to create a separate internet for its citizens, one whose size cannot be ignored.
For example, WeChat has 1.04 billion active users in China (compared to WhatsApp’s 1.5 billion globally), and Weibo, China’s version of Twitter, has over 462 million monthly active users (Twitter has 260 million). E-commerce site Alibaba has been growing at an annual rate of 58% and, although its revenues are far behind Amazon’s ($178B for Amazon vs $40B for Alibaba), its net profit is comparable to Amazon’s. If this growth is sustained (and all signs show it will be), Alibaba will surpass Amazon in sales within 5 years. China also has all the odds stacked in its favour because:

  • China has over 830 million internet users (that’s more than twice the entire US population)
  • China plus its Asian neighbors account for 49% of global internet users, followed by Europe with 16.8%.
  • China’s 830 million internet users account for 20% of all users globally, followed by India’s 500 million users.
  • 98% of Chinese internet users access the web via a mobile device, compared to 73% in the USA.

These few examples show that China has succeeded in creating its own version of the web that is largely independent of western resources.
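
The Alibaba versus Amazon projection earlier can be sanity-checked with a little arithmetic. The sketch below takes the $40B and $178B revenue figures and the 58% growth rate quoted above, and assumes (my assumption, not a forecast) that Alibaba holds that rate for five straight years. It then works out how fast Amazon would have to grow to stay ahead. The function names are illustrative only.

```python
def future_value(today: float, annual_growth: float, years: int) -> float:
    """Size after growing at a constant annual rate."""
    return today * (1 + annual_growth) ** years


def break_even_growth(leader_today: float, challenger_future: float, years: int) -> float:
    """Annual growth the leader needs to exactly match the challenger's future size."""
    return (challenger_future / leader_today) ** (1 / years) - 1


alibaba_in_5_years = future_value(40, 0.58, 5)  # in $B
amazon_needs = break_even_growth(178, alibaba_in_5_years, 5)

print(f"Alibaba after 5 years: ${alibaba_in_5_years:.0f}B")
print(f"Amazon stays ahead only if it grows faster than {amazon_needs:.1%} a year")
```

Under these assumptions the break-even comes out at roughly 17 per cent a year, so the ‘within 5 years’ claim holds only if Amazon’s sales grow more slowly than that.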

Russia’s National Internet Plans

With the announcement that a law is now in place to create what it calls a ‘national internet’, Russia aims to be in full control of how Russian internet traffic is routed. Among other measures, the law dictates the creation of infrastructure to ensure the smooth operation of the Russian internet should Russian telecom operators fail to connect to foreign root servers.

This seemingly drastic move was necessitated by the aggressive nature of the U.S. National Cybersecurity Strategy adopted in 2018 under Trump. As background to why Putin signed the bill into law, it is worth noting that the US currently controls the top-level root servers that help all internet users worldwide convert human-friendly domain names (such as http://www.tommakau.com) to computer-friendly addresses (see the list here). This effectively means that if the US decided to block certain countries from accessing the root servers, it would have blocked them from accessing much of the internet and world wide web. The US also has legal rights over all dot com, dot org and dot net domain names (including this blog) and can take down any website it wishes because they are all legally US websites. “Under these conditions, protective measures are necessary for ensuring the long term and stable functioning of the internet in Russia,” said Vladimir Putin when signing the bill.
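
To see why the root servers matter so much, here is a toy model in Python of how a name lookup walks down from the root. It is a deliberate simplification: every server name and address below is made up (the addresses come from ranges reserved for documentation), and real resolvers cache answers, so losing the root would not break everything at once.

```python
from typing import Optional

# Toy data: each level only knows who to ask next. All names are hypothetical.
ROOT = {"com": "tld-com-server", "org": "tld-org-server"}
TLD_SERVERS = {
    "tld-com-server": {"example.com": "ns-example"},
    "tld-org-server": {},
}
AUTHORITATIVE = {
    "ns-example": {"www.example.com": "192.0.2.10"},
}


def resolve(hostname: str, root_reachable: bool = True) -> Optional[str]:
    """Walk root -> TLD -> authoritative server and return an address, or None."""
    if not root_reachable:
        return None  # With no root answer (and no cache), the lookup cannot start.
    name = hostname.lower().rstrip(".")
    labels = name.split(".")
    if len(labels) < 2:
        return None
    tld_server = ROOT.get(labels[-1])  # Step 1: root points to the TLD server.
    if tld_server is None:
        return None
    domain = ".".join(labels[-2:])
    name_server = TLD_SERVERS[tld_server].get(domain)  # Step 2: TLD points to the domain's server.
    if name_server is None:
        return None
    return AUTHORITATIVE[name_server].get(name)  # Step 3: final answer.


print(resolve("www.example.com"))
print(resolve("www.example.com", root_reachable=False))
```

The second lookup fails even though the website itself is untouched, which is the scenario the Russian law is meant to guard against.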

What Next?

These recent events indicate one thing: the world is detaching itself from over-dependence on US technology and infrastructure. This has sent the US into panic mode. With the future becoming highly connected, technology will be at the forefront of human advancement, and the US fears it might not be the world leader it has been in the past. As with oil today, whoever controls technology in future will control the world. To clothe this fear in ‘national security’ is callous of Trump, to say the least; this is not about national security but about commercial dominance. The outcomes of such drastic moves cut both ways and will end up harming the US more than helping it. What I am sure of is that China will emerge victorious, if today’s statement by the Huawei CEO is anything to go by. In the last 8 hours, the US has offered a 3-month ‘stay of execution’ on the Huawei ban because of the number of US citizens who own Huawei phones and operators who are heavily dependent on Huawei gear. This is to enable them to transition to other phones and network equipment.

Locally, the US-China trade war might have major ramifications, especially if the ban extends to other Chinese telecom gear and handset manufacturers. Tecno, Infinix and iTel together command a 34% share of the smartphone market. All three brands are made by one company, which concentrates the risk should that company also face the ban. It is also worth noting that large sections of Kenya’s mobile network run on Huawei gear (including the base stations serving State House). The short-term effect of such a ban will be increased operating costs for local mobile operators as they seek alternatives from European suppliers, whose equipment and deployment are more expensive. The long-term effect is that these Chinese companies will learn to do without US tech and come up with their own technology and processes to manufacture even cheaper telecom equipment. Couple this with Xi Jinping’s ambitious Belt and Road Initiative and the US will have forever lost its dominant position on the world stage.