Computers and humans have never spoken the same language. Over and above speech recognition, we also need computers to understand the semantics of written human language. We need this capability because we are building the Artificial Intelligence (AI)-powered chatbots that now form the intelligence layers in Robotic Process Automation (RPA) systems and beyond.
Early attempts (as recently as the 1980s) at what is formally known as Natural Language Understanding (NLU), giving computers the ability to interpret human text, were comically terrible. This was a huge frustration for both the developers attempting to make these systems work and the users exposed to them.
Computers are brilliant at long division, but really bad at telling whether humans are referring to football divisions, parliamentary division lobbies or indeed long division in mathematics. This is because mathematics is formulaic, universal and unchanging, but human language is ambiguous, contextual and dynamic.
As a result, comprehending a typical sentence requires the unprogrammable quality of common sense — or so we thought.
Solving human semantics with mathematics
In just the last few years, however, software developers in the field of NLU have made several decades’ worth of progress in overcoming that obstacle, reducing the language barrier between people and AI by solving semantics with mathematics.
“Such progress has stemmed in no small part from giant leaps forward in NLU models, including the landmark BERT framework and offshoots like DistilBERT, RoBERTa and ALBERT. Powered by hundreds of these models, modern NLU software is able to deconstruct complex sentences to distill their essential meaning,” said Vaibhav Nivargi, CTO and co-founder of Moveworks.
Moveworks’ software combines AI with Natural Language Processing (NLP) to understand and interpret user requests, challenges and problems before then using a further degree of AI to help deliver the appropriate actions to satisfy the user’s needs.
Nivargi explains that crucially here we can also now build chatbots that use Machine Learning (ML) to go a step further: autonomously addressing users’ requests and troubleshooting questions written in natural language. So not only can AI now communicate with employees on their terms, it can even automate many of the routine tasks that make work feel like work – thanks to this newfound capacity for reading comprehension.
What a chatbot thinks when you talk to it
Nivargi provides an illustrative example of an IT support request, which we can break down and analyze. Bhavin is a new company employee, and a user is asking the chatbot how to add him to the organization’s marketing group so that he can access its information pool and data. The request is as follows (graphic shown below at end):
“Howdo [sic] I add Bhavin to the marketing group.”
In large part due to the typing/spelling mistake at the start (the user has typed ‘howdo’ instead of ‘how do’), we have an immediate problem. As recently as two years ago, there was not a single application in the world capable of understanding (and then resolving) the infinite variety of requests like this one that employees pose to their IT teams.
“Of course, we could program an application to trigger the right automated workflow when it receives this exact request. But needless to say, that approach doesn’t scale at all. Hard problems demand hard solutions. So here, any solution worth its salt must tackle the fundamental challenges of natural language, which is ambiguous, contextual and dynamic,” said Nivargi.
The first challenge is always ambiguity
A single word can have many possible meanings; for instance, the word ‘run’ has about 645 different definitions. Add in the inevitable human error, like the ‘howdo’ typo in this request, and we can see that breaking down a single sentence becomes quite daunting, quite quickly. Moveworks’ Nivargi explains that the initial step, therefore, is to use machine learning to identify syntactic structures that can help us rectify spelling or grammatical errors.
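To make that first step concrete, here is a minimal sketch of how a fused token like ‘howdo’ might be repaired. It is not Moveworks’ method: it swaps the machine learning described above for a simple dictionary lookup plus string similarity from Python’s standard library, and the tiny vocabulary and the function names `split_fused` and `normalize` are assumptions made purely for illustration.

```python
from difflib import get_close_matches

# Hypothetical toy vocabulary; a real system would learn this from data.
VOCAB = {"how", "do", "i", "add", "to", "the", "marketing", "group"}


def split_fused(token, vocab):
    """Return [left, right] if token is two known words run together, else None."""
    for i in range(1, len(token)):
        left, right = token[:i], token[i:]
        if left in vocab and right in vocab:
            return [left, right]
    return None


def normalize(sentence, vocab=VOCAB):
    """Lower-case known words, split fused words, and fix near-miss spellings."""
    out = []
    for raw in sentence.split():
        word = raw.strip(".,?!")
        token = word.lower()
        if not token:
            continue
        if token in vocab:
            out.append(token)
            continue
        parts = split_fused(token, vocab)
        if parts:
            out.extend(parts)
            continue
        # Fall back to the closest vocabulary word, if one is similar enough.
        close = get_close_matches(token, sorted(vocab), n=1, cutoff=0.8)
        # Unknown words (such as names) are passed through unchanged.
        out.append(close[0] if close else word)
    return out


print(normalize("Howdo I add Bhavin to the marketing group."))
```

Under these assumptions, ‘Howdo’ is split into ‘how’ and ‘do’, while the unfamiliar name ‘Bhavin’ is left alone rather than being forced onto the nearest dictionary word. A hand-built word list like this does not scale, which is precisely why the article’s learned models matter.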
But, he says, to disambiguate what the employee wants, we also need to consider the context surrounding their request, including that employee’s department, location and role, as well as other relevant entities. A key technique in doing so is ‘meta learning’, which entails analyzing so-called ‘metadata’ (information about information).
“By probabilistically weighing the fact that Alex (another employee) and Bhavin are located in North America, Machine Learning models can ‘fuzzy select’ the firstname.lastname@example.org email group, without Alex having to have specified his or her exact name. In this way, we can potentially get Alex’s help and get him/her involved in the workflow at hand,” said Nivargi.
Human service desk agents already factor in context by drawing on their experience, so the secret for an AI chatbot is to mimic this intuition with mathematical models.
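The idea of weighing context probabilistically can be sketched in a few lines. What follows is an illustrative assumption, not the model Nivargi describes: it blends a plain string-similarity score with a bonus when a group’s region matches the requester’s region. The group names, the `Group` class, the `context_weight` value and the function names are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Group:
    name: str
    region: str


def score(query, group, user_region, context_weight=0.3):
    """Blend text similarity with a simple context signal (shared region)."""
    text_sim = SequenceMatcher(None, query.lower(), group.name.lower()).ratio()
    context = 1.0 if group.region == user_region else 0.0
    return (1 - context_weight) * text_sim + context_weight * context


def fuzzy_select(query, groups, user_region):
    """Pick the group with the highest combined score."""
    return max(groups, key=lambda g: score(query, g, user_region))


# Hypothetical directory of email groups.
GROUPS = [
    Group("marketing-na", "North America"),
    Group("marketing-emea", "EMEA"),
    Group("finance-na", "North America"),
]

best = fuzzy_select("marketing", GROUPS, user_region="North America")
print(best.name)
```

The user only typed ‘marketing’, yet the region signal tips the choice toward the group that matches where the requester sits. A production system would presumably learn such weights from data and draw on many more signals, such as department and role.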
Dynamic language, dynamic chatbots
Finally, let’s remember that language, in particular the language used in the enterprise, is dynamic. New words and expressions arise every month, while the IT systems and applications at a given company shift even more often. To deal with so much change, an effective chatbot must be rooted in advanced Machine Learning, since it needs to constantly retrain itself based on real-time information.
Despite the complexity under the hood, however, the number one criterion for a successful chatbot is a seamless user experience. Nivargi says that what his firm has learned when developing NLU technologies is that all employees care about is getting their requests resolved, instantly, via natural conversations on a messaging tool.
As we stand at the turn of the decade, we humans are arguably still not 100% comfortable with chatbot interactions. They’re still too automated, too often non-intuitive and (perhaps unsurprisingly) too machine-like. Technologies like these show that we’ve started to build chatbots with semantic, intuitive intelligence, but there is still work to do. When we get to a point where technology can navigate the peculiarities and idiosyncrasies of human language... then, just then, we may start to enjoy talking to robots.