I have spent the last few weeks trying to understand the impact on the customer experience of a great revolution in the world of Artificial Intelligence and NLP. Not from a purely technical point of view, but trying to estimate the competitive advantage that this new approach can generate. We are facing yet another disruptive innovation, and it can bring significant advantages. Let’s try to find out which ones.
It all started with the paper “Attention Is All You Need”, which threw the NLP world into turmoil. It was immediately clear that something new had appeared in the world of artificial intelligence. The release of BERT and its associated paper followed soon after. Skipping a few milestones, OpenAI presented its new model GPT-3, a model that, according to those who have tried it, offers incredible performance. Unfortunately, GPT-3 will not be an open-source model, as GPT-2 is.
But what can these models really do, and why are they hailed as the new NLP frontier? I will try to explain in simple terms the competitive advantage they can provide to those who can skillfully employ them to support business processes.
First of all, we are talking about pre-trained models: models trained on millions and millions of data points. They are designed to learn in a completely unsupervised way, generalizing the context as much as possible. It is, of course, possible to fine-tune them with minimal effort to focus on a specific context.
In this case, we are talking about transfer learning: something like training a human brain to acquire the knowledge needed to pass the high-school leaving exam, and then having it study a very specific field, fine-tuning the brain to understand and generalize in that field. I realize the analogy is a bit of a stretch, but you get the idea.
In business terms, it means leaving behind something very expensive and cumbersome: the creation of training datasets. Whoever has run up against this reality in recent years knows what I’m talking about: hours and hours of people manually classifying datasets, such as those for sentiment analysis. Read some texts, and classify them as positive, neutral, or negative. Quite a nuisance, in terms of cost and time.
I carried out my study over these last weeks with one library in particular: huggingface.co. This library is a small revolution within the great transformer revolution.
Consider that today, to work with these kinds of models, you have to use one of two reference frameworks: PyTorch and/or TensorFlow. In the second case, you can use a slightly higher-level interface, Keras, sacrificing some flexibility. Well, huggingface.co is like the X Window System for Unix distributions. It is the Mac OS of transformers: an extremely simple interface for dealing with transformers.
But let’s see what a huggingface pipeline can do today, out-of-the-box, from a business point of view:
- Sentiment Analysis: return the sentiment of a text, for example, feedback from a customer. No need for previous training.
- Text Generation: give the model the beginning of a sentence, and it will complete it, for example to comment on financial or operational results. No need for previous training.
- Named Entity Recognition (NER): identify exactly the entities described in a text, an extremely important capability for any chatbot that we want to converse automatically with clients. No need for previous training.
- Question answering: in this case you provide the model with a context, a long text like a section of a user manual, and a series of questions. The model will automatically answer the questions. No need for previous training.
- Filling Masked Text: given a text with masked words, fill the blanks. This is a fundamental ability of transformers to “understand” the word in its context, and not in sequence.
- Summarisation: generate a summary of a long text.
- Translation: translate a text into another language.
- Feature extraction: return a tensor representation of the text.
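As a rough illustration of how little code the tasks listed above require, here is a minimal sketch. It assumes the `transformers` package from huggingface.co is installed and that a default model can be downloaded on first use. The helper names `load_pipeline`, `sentiment_report`, and `answer_questions` are hypothetical, chosen only for this example; the task strings passed to `pipeline()` are the library's own.

```python
from typing import Any, Callable, Dict, List, Tuple


def load_pipeline(task: str) -> Callable[..., Any]:
    """Build a default pipeline for a task such as "sentiment-analysis",
    "ner", "question-answering", "fill-mask" or "summarization".

    Assumption: `pip install transformers` plus PyTorch or TensorFlow;
    a default model is downloaded the first time a task is used.
    """
    # Imported lazily so the helpers below can be used without the library.
    from transformers import pipeline
    return pipeline(task)


def sentiment_report(
    texts: List[str], classifier: Callable[..., Any]
) -> List[Tuple[str, str, float]]:
    """Pair each piece of customer feedback with its predicted label and score."""
    # The classifier returns one {"label": ..., "score": ...} dict per input text.
    results = classifier(texts)
    return [(text, r["label"], round(r["score"], 3)) for text, r in zip(texts, results)]


def answer_questions(
    context: str, questions: List[str], qa: Callable[..., Any]
) -> Dict[str, str]:
    """Ask several questions about one context, e.g. a section of a user manual."""
    return {q: qa(question=q, context=context)["answer"] for q in questions}
```

With the library installed, `sentiment_report(["The support team solved my problem in minutes."], load_pipeline("sentiment-analysis"))` would return the label and score predicted by whichever default model the library selects for that task. No training step appears anywhere in the sketch.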
But what really surprised me in the last few days from the point of view of using these models in NLP is what is called zero-shot classification.
In practice, these are models pre-trained on millions and millions of data points that need no further training to classify new texts. Just give them the texts you want to classify and the candidate labels you want to assign. The model, without any further training, will classify them with an accuracy that proved to be very high in my tests. (No need for previous training)
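Zero-shot classification as described here can be sketched in a few lines. This is a minimal example, not the code used in my tests: it assumes the `transformers` package is installed and relies on its `"zero-shot-classification"` pipeline, and the helper names `load_zero_shot_classifier` and `top_label` are hypothetical.

```python
from typing import Any, Callable, List, Tuple


def load_zero_shot_classifier() -> Callable[..., Any]:
    """Build the library's zero-shot classification pipeline.

    Assumption: `pip install transformers` plus PyTorch or TensorFlow;
    a default model is downloaded on first use.
    """
    # Imported lazily so top_label() can be used without the library.
    from transformers import pipeline
    return pipeline("zero-shot-classification")


def top_label(
    text: str, candidate_labels: List[str], classifier: Callable[..., Any]
) -> Tuple[str, float]:
    """Return the candidate label the model scores highest for `text`."""
    # The pipeline returns a dict with parallel "labels" and "scores" lists.
    result = classifier(text, candidate_labels)
    best_score, best_label = max(zip(result["scores"], result["labels"]))
    return best_label, best_score
```

A call such as `top_label("My parcel arrived two weeks late.", ["shipping", "billing", "product quality"], load_zero_shot_classifier())` passes only the text and the candidate labels. The labels are chosen at call time and can change from one call to the next, with no labelled dataset involved.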
We are certainly faced with something extremely innovative that can completely change the analysis of unstructured data and the competitive advantage that this can create. Transformers will definitely help you to better understand and serve your customers. It can be a chatbot, a Q&A service, a framework to analyze sentiment, a topic detection data visualization, etc. Transformers will help us to better humanize interactions between customers and machines. They will definitely help to make The Phygital Journey faster and easier for your clients.
In my small world, this is the next AI step I will bring to sandsiv+ to manage and improve the performance of our Customer Experience platform, with something truly revolutionary, helping our clients to gain competitive advantage through Customer Experience Management.
For the geeks only:
Previously published at https://www.linkedin.com/pulse/impact-ai-transformers-customer-experience-federico-cesconi