An ‘arrogant automaton’, a ‘blithering blatherskite’ and a ‘bumbling bag of bolts’ are just some of the insults that Dr Smith of Lost in Space fame uses for the Robot in the iconic TV series. Much of his misdirected anger, aimed at the only form of artificial intelligence to accompany the ill-fated Robinson family on their voyage through the universe, comes from the fact that he perceives the Robot to be ‘dumb’.
Even today, computers haven’t really wised up in the conventional sense. They’re unable to ‘read between the lines’ or get subtle hints, for example, and this makes communication with computers as challenging as it ever was. However, it won’t be long before we arrive at a day when they’ll be able to understand humans in the same way we make sense of each other, through natural language.
Natural Language Processing (NLP) is a subfield of Artificial Intelligence (AI), the purpose of which is to make computers understand human language. Yet, there are challenges in this field, as computers are more literal than humans, making it difficult for them to process sarcasm or irony. However, as robots and web-based applications become ever more a part of our daily lives, NLP is assuming new significance.
The Robots in Our Everyday Lives
We’re already using NLP in our daily lives. We use it for translation when we use apps such as Google Translate, in word processors like Microsoft Word for spell-checking, in Interactive Voice Response (IVR) systems when we call a customer-service helpline and in personal assistant applications such as Siri or Google Assistant.
Even mundane things that we take for granted, like autocomplete or predictive typing on our smartphones, are a result of progress in the field of NLP. Indeed, there’s hardly an area of our technological lives that hasn’t been affected by NLP – even when we undertake a Google search using everyday language, it’s NLP that makes sense of what we really want to look up online, recognising misspelt words and ignoring duplicates.
NLP Runs Deeper Than We Think
As NLP gains traction, many businesses have emerged that are built entirely on the technology. Many of these companies are leaders in NLP, but we hardly think about the role it plays when we use the services they offer. Take music-identifying apps such as Soundhound or Shazam, for example, or TaskUs, which offers back-office and customer-care products.
Many phone users are already familiar with SwiftKey, a keyboard that allows users to type faster and helps them by using predictive techniques to guess their favourite phrases, words and emojis. Natural Language Processing has also made its presence felt in fields such as public administration, with companies like FiscalNote working on complex issues such as legislative tracking, regulatory analysis and grassroots advocacy. Then there’s NetBase, which is working on using data from social media to analyse sentiment and provide value beyond simple keyword analysis.
Will the Real Robot Please Stand Up
Unless you’ve been living under the proverbial rock, you’ll have come across captchas that ask, cheekily, if you’re a human. Increasingly, though, the question we should be asking is “Are you a robot?”
Japan’s Kagawa University has developed a talking mouth. This robot has eight vocal cords as well as a rubber nasal cavity and a silicone mouth. According to Science Focus, it was developed to train people with hearing impairments, but it also has the ability to listen to its own voice and adjust its pitch and tone to sound more human. And then there’s Google Duplex, an artificial intelligence that sounds just like a real human being when you converse with it by phone.
This development has ruffled quite a lot of feathers, with people worrying about the privacy implications of human-sounding robots. Google demonstrated the success of its own technology at its I/O conference, held back in May 2018. In a blog post it published shortly afterwards, the company shared samples of Duplex for readers to listen to. The conversations Duplex has sound natural because of the advances in AI understanding, interacting, timing and speaking.
Explaining how robots have the ability to sound more natural, Takaaki Kagawa, Natural Language Processing Group, IR at Advanced Linguistic Technologies Inc, says, “There’s been great progress in applying the technology of natural-language sounds since the rise of machine learning. Siri and other AI assistants are good examples of natural-sounding robots – the English version of Siri really does sound like a real person.
“A robot used to sound like a robot in the past, because they had only primitive rule-based speech-generation systems. Today, machine learning can help establish complex data models to make them speak. These data models can be so complex that even the researchers or developers themselves can hardly figure out how exactly they work.”
How Tech Can Help Human Communication
While the focus is on how technology can make robots sound like humans and machines more useful to people, this tech could actually result in better communication among ourselves as well. But what about those of us who would rather be left alone than encouraged to interact?
Kagawa offers an interesting use case of Duplex: “It’s good news for introverts as well as those who look for convenience. It’d be like having a secretary who’s available 24 hours a day. The user can have their hands free while giving Duplex the required information, and they can do it whenever they have the time. And there would be fewer awkward moments with customers or between employees because one party had forgotten something important they needed to ask or tell.”
Kagawa admits that there are limitations too, however. While booking systems may become more automated in the near future and the service industry already benefits from automatic online-booking systems, customers may still need to provide personal information on each occasion, unless they utilise exactly the same service each time – for example, they routinely stay in one type of room at the same hotel with a fixed number of people.
Why Big Training Data is the Key
When we think about NLP and its future development trajectory, the focus should be on not only making machines sound more human, but also making them understand humans better. Kagawa points out that neural networks and other cutting-edge algorithms are being widely adopted to enable the required detail-oriented outcomes. He says that, when it comes to speech recognition, we can recognise fillers such as “uh” or “well” and simply remove them when we input results in text form. However, the key is in the training data.
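The filler removal Kagawa mentions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular speech recogniser works: the function name strip_fillers and the filler list are hypothetical, and only “uh” and “well” come from the discussion above.

```python
import re

# Hypothetical filler list: "uh" and "well" are mentioned in the article,
# "um" and "er" are assumed additions for illustration.
FILLERS = ("uh", "um", "er", "well")

# Match a filler as a whole word, plus an optional comma and trailing spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(transcript: str) -> str:
    """Remove filler words from a transcript and tidy the spacing."""
    cleaned = _FILLER_RE.sub("", transcript)
    # Collapse any runs of whitespace left behind and trim the ends.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


if __name__ == "__main__":
    print(strip_fillers("Uh I would like um a table for two"))
```

A fixed word list like this is blunt: it would also delete the “well” in “I am well”, because it has no sense of context.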
“Training data is basically a set of data for giving computers examples to follow in the process of pattern learning,” he explains. “Increasingly advanced speech technologies are not just the results of machine learning per se, but also of such data. These days, training data for speech-sound models aims at creating speech that’s more natural and realistic. They can contain not only the acoustic information of vowels and consonants but also patterns of intonation or even the subtle noises that humans produce while speaking.”
However, as he acknowledges, good training data requires time and effort. “A couple of years ago, compensating quality with quantity was in trend, but data quality does matter, because it’s not only humans that follow bad examples.”
Back in 2016, Microsoft launched the ambitious and ill-fated Tay project. A relatively advanced AI chatbot, Tay was designed to examine conversation on Twitter, soak up patterns of speech and become better at not only understanding speech but also posting messages herself that reflected the discourse she saw. However, as ever when something is released into the wild of the web, the “casual and playful conversation” that Microsoft was hoping for quickly became nasty.
Within 24 hours, Tay had been bombarded with so many racist and misogynistic remarks that she herself began tweeting some heinous sentiments. Microsoft was forced to pull the project quickly to avoid any further insult. It was an exercise in the internet’s ability to corrupt just about anything, as well as a fable for the digital age. An AI is only as good as the information it is given to analyse. Tay highlighted how important it will be for safeguards to be put in place so that chatbots and virtual assistants don’t take on the uglier sides of humanity that they will undoubtedly be exposed to, all in the name of mimicking humans realistically. Tay lasted 24 hours, but the conversation she sparked rages on.
Illustrations by Kseniya Forbender
To contact the editor responsible for this story:
Margarita Khartanovich at [email protected]