Koko, ChatGPT, and the outrage over corporate experimentation

Online mental health service Koko inadvertently sparked outrage recently by disclosing that it had experimented with using ChatGPT to write messages to users. But the outrage, in my view, was focused on the wrong issue: nonconsensual experimentation.

What happened

Koko is a peer-to-peer messaging service for mental health support. Users post messages anonymously, and other users answer them anonymously. According to tweets from its CEO, Rob Morris, it ran an experiment where help-givers had the option to generate draft messages with OpenAI's ChatGPT. The human helper then had a few options – send the message without editing, edit it and then send, or scrap it and write their own. By analyzing thousands of chats with and without ChatGPT, the company said it found people rated ChatGPT-assisted messages more highly than human-composed messages.
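To make that workflow concrete, here is a minimal sketch of what a human-in-the-loop drafting flow of this kind might look like, assuming the openai Python client and a chat-completion model. The model name, prompt, and helper functions (draft_reply, helper_review) are illustrative assumptions, not Koko's actual implementation.

```python
# Illustrative sketch only (not Koko's actual code): a human-in-the-loop
# drafting flow where a peer helper can send, edit, or scrap an AI draft.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def draft_reply(user_post: str) -> str:
    """Ask the model for a draft supportive reply to an anonymous post."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the service's choice isn't public
        messages=[
            {"role": "system",
             "content": "Write a short, supportive peer-support reply."},
            {"role": "user", "content": user_post},
        ],
    )
    return response.choices[0].message.content


def helper_review(draft: str) -> str | None:
    """The human helper decides: send as-is, edit, or scrap the draft."""
    print(f"Draft reply:\n{draft}\n")
    choice = input("(s)end / (e)dit / (x) scrap: ").strip().lower()
    if choice == "s":
        return draft
    if choice == "e":
        edited = input("Enter your edited version:\n")
        return edited or draft
    return None  # scrapped: the helper writes their own message instead


if __name__ == "__main__":
    post = "I've been feeling really isolated lately."
    final = helper_review(draft_reply(post))
    if final is not None:
        # disclosure notice of the kind described in the article
        print(final + "\n\n(written in collaboration with kokobot)")
```

The key design point is that the model only produces a draft; a human decides whether anything reaches the recipient at all.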

At first it appeared that recipients of help weren’t informed of ChatGPT’s involvement, and several mainstream media outlets reported the story that way. “Mental health service used an AI chatbot without telling people first,” read the headline of a New Scientist article, which has since been changed. Later, articles in Gizmodo and Vice clarified that all users were informed, and that AI-generated messages came with a notice saying “written in collaboration with kokobot.” It was easy to get the wrong impression, though. Morris had said on Twitter, “once people learned the messages were co-created by a machine, it didn’t work,” which implied that recipients hadn’t been informed from the start.

What people got outraged about

The outraged discussion about this has focused on the fact that Koko was (apparently) experimenting on users without their knowledge. We now know this was never true, but the nature of the outrage is interesting. People wanted to know: did Koko have this reviewed by an Institutional Review Board (IRB)? The CEO responded, correctly, that IRB review wasn’t required. Further outrage ensued. This kind of experimentation is wildly unethical and probably illegal, folks said.

Experimentation is the wrong focus for our moral outrage

In general, product testing is not subject to IRB approval. Companies experiment on users all the time without IRB review and without specifically notifying users. Such tests are not only legal, but may even be a really good thing.

The term "A/B Illusion", coined by Bioethics professor Michelle Meyer, brilliantly captures the faulty reasoning behind fear of corporate experiments. It describes the fact that we tend to think A/B testing is bad, but just implementing A or B without any testing is fine. In a 2015 New York Times Op-Ed with the provocative title “Please, Corporations, Experiment on Us,” Meyer and Christopher Chabris make a compelling argument that (in some cases) experimenting on people without their consent is not only legal and ethical – it’s a good idea. If we don’t know whether A or B is better, performing an experiment may be the best way of finding out. Absence of consent might be essential, because If people know about an experiment, that knowledge can distort the results. They’re careful to note this argument only applies to low-risk settings, and no one should be performing high-risk experiments without informed consent.

You might argue that mental health support is a high-risk setting, and therefore doesn’t fit Meyer and Chabris’s argument. In addition, OpenAI specifically disallows the use of ChatGPT for health purposes. But keep in mind that Koko is a peer support network; the human helpers are likely unqualified too. No one is receiving, or expecting to receive, standard-of-care treatment through Koko.

To be clear, I’m not arguing that it would therefore have been fine for Koko to use ChatGPT without informing users. My point is that failing to inform users would have violated a different ethical principle, one unrelated to experimentation.

Transparency is the real issue

Transparency is one of the core principles of AI ethics frameworks such as the EU’s Ethics Guidelines for Trustworthy AI, the IEEE’s Ethically Aligned Design, the OECD AI Principles, and many others. In fact, a systematic review of AI ethics guidelines found that transparency was the single most common principle across them. Transparency includes many elements, but one of the most basic requirements is that users know when they’re interacting with AI.

In this, Koko appeared to have failed. In Morris’s initial descriptions, it sounded as if support-givers knew they were using ChatGPT, but support-receivers didn’t. If this had been true, it would have been a clear violation of one of the most universally agreed-upon principles in AI ethics.

We can learn something else about transparency from this incident: having a human in the loop doesn’t remove the need for transparency. Even though Morris always made it clear that humans were reviewing every ChatGPT-generated message and deciding whether to use it, people were outraged at the idea that recipients didn’t know the messages came from an algorithm. If there’s one important takeaway here, it’s this: people really want to know when they’re interacting with AI.
