Chatbots: a scourge of humankind

Who has ever interacted with a helpdesk-type chatbot and come away satisfied and impressed? In my personal experience, my satisfaction hovers around -5 on a scale of 1 to 10. Is yours different? [The Dutch version of this article can be found here]

Working in IT, I also know why chatbots are generally put in place. Real people to answer your questions are expensive in many ways. A chatbot is much cheaper. Sadly, your customer gets what you pay for.

Chatbots are also interesting for another reason. With all the recent reporting on AI, you might think that AI really is intelligent. But chatbots are a perfect example that today's AI hype isn't that much different from the AI hype of 30 and 50 years ago (yep, there already was an AI hype fifty years ago). In reality, chatbots are morons. Irritating morons, at that. They might help only in the simplest cases, which is of course why I find them so frustratingly irritating: simple cases I can generally fix myself. When I need help, I'm beyond “did you turn it on?”, so my perspective is skewed. Still, my interaction with them shows how ‘smart’ (read: ‘dumb’) they actually are and what kind of 1980's-style (failed) AI technology we are actually dealing with.

“1980's?”, I hear you say. Yep: 1980's. Either you were not yet aware of the ‘first waves of AI’, or you had the idea that today's AI is so different in technology from the past that it will not end in the same kind of fiasco. But appearances mislead here. Most chatbots (probably all outside the domain of pure research) are task-oriented chatbots, meaning they are based on a human-crafted model of a limited problem space, where the text you type is used to move from one state in that model to another. Basically, such models are exactly the same as what was the cutting edge of AI in the 1980's: the ‘expert system’. I get a déjà vu: the first code I wrote in a job was a frontend in C for an expert system written in Prolog, but I digress, as usual. These days, the linguistic analysis may be a deep neural net or it may be simply the use of trigger words. But the knowledge model you are confronted with is much like the expert system of long ago.
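Such a task-oriented bot can be sketched in a few lines. The sketch below is my own guess at the mechanics, not any vendor's actual code; the states, trigger words, and replies are invented for illustration:

```python
# A minimal, hypothetical task-oriented chatbot: a hand-crafted state machine
# where trigger words in the user's text move the conversation between states.

RULES = {
    # (current state, required trigger words) -> next state
    ("start", frozenset({"where", "messages"})): "sent_overview",
    ("start", frozenset({"messages"})): "messages_problem",
    ("messages_problem", frozenset({"yes"})): "safe_senders",
}

REPLIES = {
    "sent_overview": "Do you want to know where you can find an overview of your sent messages?",
    "messages_problem": "If I understand correctly, you are experiencing problems with your messages. Is that right?",
    "safe_senders": "Most Messages issues are resolved by adding our domains to your safe senders.",
    "fallback": "I don't quite understand what you mean. May I ask you again what it is about?",
}

def step(state: str, user_text: str) -> tuple[str, str]:
    """Pick the rule from this state whose trigger words all occur in the input."""
    words = set(user_text.lower().replace("?", "").split())
    best = None
    for (from_state, triggers), to_state in RULES.items():
        if from_state == state and triggers <= words:
            # Prefer the rule that matches the most trigger words.
            if best is None or len(triggers) > len(best[0]):
                best = (triggers, to_state)
    if best is None:
        # No rule matched: the bot is confused and falls back to the beginning.
        return "start", REPLIES["fallback"]
    return best[1], REPLIES[best[1]]

state, reply = step("start", "Where are my old messages?")
# Note: the word 'old' is simply ignored; it is not a trigger word in the model.
```

Everything outside the trigger words is discarded, which is exactly why the conversation below goes the way it does.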

Back to 2020. Recently, I had to interact with the chatbot of Marktplaats, a Dutch online marketplace owned by eBay. eBay is not a minor player, so I can assume that the chatbot technology used is not the cheapskate version but something serious. Actually, all my other experiences with chatbots from large organisations are the same, so Marktplaats is as good an example as any. The situation was this: I wanted to look into an older conversation with someone who had bought something from me. I was going to offer something else later and had promised to tell him as soon as it became available. As I like to keep promises, I went looking for his messages on the website. But all my messages older than six months were gone. I remembered — probably incorrectly — that I had seen much older messages just a few months ago, so I wondered: where have my old messages gone? On to the website, start up the help function, and my (GW) chat with the chatbot (CB) begins:

CB: Welcome to Marktplaats. How can I be of assistance?

GW: Where are my old messages?

CB: Good to know: Now more than ever, it is our job to properly match supply and demand. With corona help you can offer or ask for help, for yourself or others. For more information, please refer to our Help and Info page.

CB: Do you want to know where you can find an overview of your sent messages?

The chatbot state machine has, from the analysis of my question, set my state to ‘trying to find overview of sent messages’. I did not ask for that, but my question did contain the word ‘messages’ and the word ‘where’, and this is probably the most asked question based on those two words. You need to talk really simply and clearly, otherwise they get confused. So, to tell the engine this is not my state, I start with a clear ‘no’ and repeat my question.
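Why ‘where’ plus ‘messages’ lands on the sent-messages question becomes obvious with a hypothetical scoring scheme: pick the known question with the biggest keyword overlap, and break ties by how often each question is asked. The keywords, frequencies, and canonical questions below are all made up for illustration:

```python
# A hypothetical intent matcher: keyword overlap, ties broken by popularity.

INTENTS = [
    # (keywords, times asked, canonical question)
    ({"where", "messages"}, 9000, "Where is the overview of my sent messages?"),
    ({"messages", "receive"}, 4000, "Why do I not receive messages?"),
    ({"old", "messages", "deleted"}, 10, "Why are my old messages gone?"),
]

def guess_intent(user_text: str):
    words = set(user_text.lower().strip("?.").split())
    # Score each intent; tuple comparison means overlap first, then frequency.
    scored = [(len(keywords & words), freq, question)
              for keywords, freq, question in INTENTS]
    overlap, _, question = max(scored)
    return question if overlap > 0 else None

guess_intent("Where are my old messages?")
# 'old messages' overlaps two keywords of the rare question, but 'where' +
# 'messages' overlaps two keywords of the popular one -- popularity wins.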

GW: No, where my old messages are 

The chatbot now knows my state is not ‘trying to find overview of sent messages’, so it falls back to establishing whether ‘messages’ really is my problem area. After all, I could have written something like “I received messages promising me a discount but I never got it”, and that question is not about messages — even if the word is there — but about a discount.

CB: If I understand correctly, you are experiencing problems with your messages. Is that right?

GW: Yes

CB: Most Messages issues are resolved by adding the following Marktplaats domain names to safe senders within your email account:
– @
– @

CB: You can read how to do this on the ‘Safe Sender’ website. Is there anything else I can help you with?

First, the chatbot — with that last sentence — assumes that it has correctly answered my question, which of course is rather irritating. It has apparently put me in the state ‘messages do not arrive’. Again, that is probably an often encountered problem, with junk mail filters and all. But it is — again — not my problem, nor have I in any way said that it is. So, I tell the chatbot as simple as I can that this is not my problem:

GW: This is not my problem

CB: I don’t quite understand what you mean. May I ask you again what it is about?

So the chatbot is back at the beginning, and I start over.

GW: Where are my old messages?

CB: If I understand correctly, you are experiencing problems with your messages. Is that right?

Hmm, seen that one before. The chatbot tentatively sets my state to ‘messages problems’. It also ignores the word ‘old’, for which it apparently has no use. At this point I could conclude that it has no idea what ‘old messages’ means: the concept is simply not in the expert system / state machine. But I do not know that yet; I'm still hoping against the odds, so:

GW: Yes

CB: Most Messages issues are resolved by adding the following Marktplaats domain names to safe senders within your email account:
– @
– @

CB: You can read how to do this on the ‘Safe Sender’ website. Is there anything else I can help you with?

I am put back where I was before, with the same answer I already got and already said was not my problem. Having arrived here, I could keep this going indefinitely — well, not really: they can, I cannot — so I put in a message that I hope will put me in the state ‘needs human assistance’:

GW: This chatbot is worthless

CB: I’m sorry I couldn’t give a good answer. Would you like to contact my colleagues?

I am apparently — see below — put in the tentative state ‘wants to speak to a human’ and the chatbot checks if this is correct. So I answer:

GW: Yes

CB: Maybe I can also help you with your question or comment, otherwise I can check what the contact options are. What do you want to contact?

Apparently, the chatbot has put me in the state ‘wants to speak to a human’. But in this state, it tries to steer me away from that — expensive — option, regardless of the whole conversation that went before. So I get the irritating first suggestion (“No! Been there, done that! Bad chatbot!”).

I now make a mistake, but not a completely illogical one: the final question is probably meant as “do you want to go on with me or with a human?” but is not quite clear, and I apparently misread it as “what do you want to contact about?”.

GW: My messages

CB: I can also give you more information about an overview of your sent messages⁠. Do you want this?

The chatbot does another try with a tentative state I have already rejected. I get the feeling I'm playing a sadistic version of Adventure: “You're in a maze of little twisty passages, all different”.

GW: No.

CB: My colleague will come to chat with you live. To help you even faster, I can already pass on some information. What’s your name?

Yes!! I’m allowed to speak to a real human! So, I answer:

GW: Gerben

CB: My colleague will be there soon. You can indicate at any time that you no longer want to wait.

“Please! Please! Do not use an expensive agent. Please use me or go away!”.

Now, every thirty seconds or so, I get a message. There is variation, apparently to soothe me and give me the feeling I am talking to something humanoid, ‘worthy of respect’, and not to a system that mainly acts as the first line of defense to prevent me from getting the help that I need, and that I really want to go after with a blunt instrument.

CB: Please be patient, my colleague will help you.

CB: Please be patient, my colleague will help you.

CB: Please be patient.

CB: We have to wait a while for my colleague. This will take a maximum of 8 minutes.

“We have laid off most humans in the help desk and replaced them with a state of the art chatbot. Our shareholders are happy. Isn’t that great?”

CB: Please wait a moment.

CB: Please wait a moment.

CB: Please be patient.

CB: Please be patient, my colleague will help you.

CB: Wait a minute please.

CB: Please be patient.

CB: Just a moment, my colleague will come and help you.

CB: [Agent’s name] writes a message …

[It turned out they routinely throw away your messages when they are older than a number of months.]

This, I'm afraid, is the state of the art in AI with regard to interacting with humans. And this, I'm also afraid, is not that different from what it was 30(!) years ago, which is telling given the unbelievable growth of computing power in that period. If you don't believe me, read the short post The State of Chatbots in 2019, which uses the chatbot Mitsuku, four-time winner of the Loebner Prize, a prize given yearly to the most human-like chatbot.

Expert systems are — whatever way you interact with them — hard and expensive to maintain and brittle. They do not scale well. So, unless your problem space is really limited and simple, they will not work well. They are — by and large — a rehash of one of the failed attempts at AI.

You won't hear me saying that all current AI is nonsense. Certainly not. A nice example, about discovering new antibiotics, was described in IT is Us. In Something is (still) rotten in the kingdom of artificial intelligence, I have written that deep neural networks and other statistical approaches are far more powerful than the AI of 30 or 50 years ago. But in that same article, I have also explained why anything resembling actual ‘intelligence’ is not to be expected from statistical methods on digital computers, and that these methods have a number of essential weaknesses that are generally overlooked.

One aspect of this ‘third wave of AI’ we are in is that it is still in large part driven by the assumption that what we are doing will eventually get us the desired outcomes, even if those outcomes are currently nowhere to be seen. Take the critical article The State of Chatbots in 2019 that I mentioned before. Here is a quote:

We can expect chatbots to improve markedly over the coming years, as natural language processing in AI improves and proliferates. Major tech companies now regularly build deep neural network technology that allows speech recognition machines to better understand and interact with people. Once deep neural networks become more common, chatbots will start surpassing the classic Turing Test, proliferating online, and becoming a less awkward part of our everyday lives. In fact, some experts claim that chatbots could, one day, evolve to replace search bars in our internet browsing.

[Cover of the 1979 edition of What Computers Can't Do by Hubert Dreyfus]

This can be summarised as: “It doesn't work very well now, but we can expect it to work very well in the near future, based on the assumption that the problems can be overcome.” Exactly such statements were already made in the 1960's and 1980's, and as Hubert Dreyfus showed in the early 1970's, they rest on erroneous assumptions that are deeply embedded in our culture: above all the belief that human-level intelligence has something to do with logic (rather little, as it turns out). If the first waves of AI showed one thing, it was the experimental verification that philosophers like Uncle Ludwig had been right. It took the AI community about 30 years and two AI winters to finally admit Dreyfus had been right in pointing that out. Seriously: don't read my book on Enterprise Architecture (another mostly failed discipline). If you read one non-technical book on IT, read What Computers (Still) Can't Do by Hubert Dreyfus, his complete and utter takedown of the first 10 (edition 1), 20 (edition 2), or 30 (edition 3) years of AI, including a dissection of the hype of those years and insight into where it came from (the same place it comes from now).

It is currently a good time to be in AI. And I am convinced that data scientists and neural network engineers are here to stay, though probably not in the numbers many are now expecting. The current statistical AI is extremely powerful in the hands of humans. Billions are now being invested on the promise of magical results. It is quite possible that this third wave of AI will also be followed by an AI winter, though my guess is that the usefulness of statistics will make this winter less harsh.

And chatbots? There, customer dissatisfaction will probably in the end put a limit on using them to keep the customer away from costly helpdesk humans. Of course, there is the advantage for the user that you can get yourself tortured 24/7. So, watch out for anyone peddling them or, worse, getting you on board with a dream where chatbots become really intelligent and can be used for complex interactions with real humans. That ship already sailed, hit a rock, and sank in the 1980's…

PS. There are chatbots that really work. In an extremely simple domain, you can have a meaningful and effective interaction. E.g. a chatbot where you can reserve a slot in an appointment calendar (a very simple domain: it only needs to understand dates, times, and a calendar). In some countries, you can cancel subscriptions via your credit card by simply cancelling the recurring payment. So, a service exists that scans your credit card statements for recurring items, marks these as subscriptions, and lets you cancel them by typing ‘cancel’. Such chatbots seem ‘chat’-based, but they are more akin to somewhat fuzzy ‘command’ interfaces. It's not what a human would call ‘chatting’.
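To make that concrete, here is a toy version of such an appointment bot (entirely hypothetical, not any real product). Strip away the chat veneer and what remains is a regex pulling a date and a time out of the sentence — a fuzzy command interface:

```python
# A toy appointment "chatbot": everything except the date and time is ignored.
import re

CALENDAR = {"2020-06-15 10:00"}  # slots that are already booked

def book(user_text: str) -> str:
    # The entire "understanding" is one pattern: a date followed by a time.
    m = re.search(r"(\d{4}-\d{2}-\d{2}).*?(\d{2}:\d{2})", user_text)
    if not m:
        return "Sorry, on which date and time? (e.g. 2020-06-15 at 10:00)"
    slot = f"{m.group(1)} {m.group(2)}"
    if slot in CALENDAR:
        return f"{slot} is taken; please pick another time."
    CALENDAR.add(slot)
    return f"Booked you in on {slot}."

book("Can I come by on 2020-06-15 at 11:00 please?")
# -> 'Booked you in on 2020-06-15 11:00.'
```

The politeness words contribute nothing; “2020-06-15 11:00” typed bare would work just as well, which is why this feels effective and yet is nothing like chatting.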


  1. Haha! My qualification is -10. I have had similar experiences. At least the backtracking mechanism of Prolog in the ’80s worked.

