Generative AI needs more than a light touch

Chatbots such as ChatGPT raise huge data-protection and moral questions regulators must address.

The ‘gee-whiz’ promotion of ChatGPT cannot mean OpenAI gets a regulatory free pass (rafapress/shutterstock.com)

Italian users cannot gain access to ChatGPT. The chatbot based on artificial intelligence, launched in November 2022, is now geo-blocked in the country. At the end of last month, following an investigation, the Italian Data Protection Authority (DPA, also known as the Garante), adopted a landmark precautionary order to limit temporarily the local processing of Italian users’ data by OpenAI, the company based in the United States developing and managing the platform.

Mainstream media outlets and even powerful ministers lamented the DPA’s move as reckless. Technology pundits and start-uppers accused it of conspiring against the country’s global competitiveness. The story is however more complicated—and, with Spain and the European Union’s data watchdog adding to the scrutiny, offers important lessons.

‘Privacy nightmare’

All over the world, broad concerns are emerging about the nefarious consequences and ‘risks to society’ posed by generative-AI models, prompting experts and business leaders to call for a moratorium on updates, to favour research and implement safety protocols. While to those enthralled by ‘digital enchantment’ this cautious approach may seem a neo-Luddite plot ignited in academic and policy circles, it is about fundamental values in democratic societies. Indeed, technology companies are often accorded ‘a regulatory latitude not given to other sectors’.

As experts have demonstrated, large language models (LLMs) represent ‘a privacy nightmare’. They are based on processing gigantic swaths of data, scraped from undisclosed sources. This critically relies on a free underlying infrastructure of personal data, in some cases even proprietary or copyrighted—never mind the sensitive data which may be lightheartedly shared by users when interacting with these systems.

Professionals are starting to use generative-AI applications as low-cost assistants. The information they enter (a draft employment contract, a budgetary report to revise or top-secret data) could resurface as output in response to others’ queries. Such privacy nihilism is troubling.

Data breaches

On March 22nd, the chief executive of OpenAI, Sam Altman, tweeted that it ‘had a significant issue in ChatGPT due to a bug in an open source library’. In other words, some users could see the titles of other users’ stored conversation histories. The company admitted to ‘feel[ing] awful about this’. A similar data breach was reported regarding information on payments by subscribers.

Both glitches ‘seemed to indicate that OpenAI has access to user chats’, the BBC reported from San Francisco. In a parallel reality, this invasion of informational self-determination would not go unnoticed, causing public outrage and reputational damage. Not for the first time however, a purported corporate ‘disruptor’ appeared to enjoy a huge ‘get out of jail’ card.

The Italian DPA notified OpenAI of a set of serious infringements. First, the company had provided no information to users and data subjects whose data it collected (as required by article 13 of the General Data Protection Regulation, GDPR). Secondly (and remarkably), it had not identified any robust lawful basis for the massive collection and processing of personal data (article 6, GDPR).

Thirdly, it had shown lack of respect for accuracy: the chatbot was inclined to make up details which proved false. Finally, the absence of any age-verification mechanism might expose children to responses inappropriate to their age and awareness, while the service was only addressed to users aged over 13 according to OpenAI’s terms.

Clearly mandated

The reaction of Altman was patronising. Nevertheless, the requirements invoked by the Garante are clearly mandated by the EU GDPR. Using its power ‘to impose a temporary or definitive limitation including a ban on processing’ (article 58(2)(f), GDPR), it set an example other national authorities may soon follow.

OpenAI was given weeks to explain how it intended to come within European guardrails. It however decided to discontinue its service in Italy. The move has sparked uncertainty for all operators in the field. Yet, after a meeting between the company and the DPA, several conditions to be met by the end of April were identified for the ban to be lifted. Should it fail to demonstrate that the ‘legitimate interest’ or ‘consent’ criteria are fulfilled, the company could face fines, sanctions or a definitive ban.

OpenAI’s reaction is typical of some technology companies when they believe they can sidestep universal constraints: selectively withdraw from a market, blame the regulator and mobilise users (and others who fall for this pitch) to defend a service operating free of constraint. It recalls the dawn of the platform era, when food-delivery and other gig-economy players circumvented legislation under the strange assumption that an innovation was only genuine if retrospective forgiveness, not advance permission, was sought for it. It remains to be seen what the response will be, considering that analogous controversies may soon emerge now that the European Data Protection Board has launched a dedicated task force ‘to foster cooperation and to exchange information on possible enforcement actions’.

Light-touch approach

The proposed EU regulation on AI envisages minimum transparency obligations for specific systems—in particular where chatbots or ‘deep fakes’ are used. The draft text was presented in April 2021 and is undergoing legislative scrutiny.

The advent of complex, generative-AI models however shows the need for a broad comprehension of AI, whose application can be malign as well as benign. The risks include mass misinformation, the generation of prejudiced and stereotypical content and large-scale manipulation (otherwise prohibited in the proposed regulation). This reckoning should push EU co-legislators to reconsider the light-touch approach, with only notification obligations for lower-risk systems.

We have argued that, despite the aim of delivering a modular and targeted framework, AI technologies are classified in an ‘abstract’ and ‘context-neutral’ manner within the regulation as drafted, with no consideration of case-specific uses. This fails to appreciate the multi-purpose, versatile and adaptive nature of AI systems. Aside from the developers’ duties, the framework only barely addresses the progressive widening of the use of systems beyond the purposes for which they were originally intended and designed. The co-legislators should consider the general approach of the Council of the EU, adding provision for situations where AI systems can be used for many different purposes.

There is another, often neglected, dimension to this. The sociologists Jenna Burrell and Marion Fourcade have written that ‘what stands beneath the fetish of AI is a global digital assembly line of silent, invisible men and women, often laboring in precarious conditions, many in postcolonies of the Global South’. And a Time investigation has documented that OpenAI depends on the exploitation of Kenyan, Ugandan and Indian workers.

To reduce toxic and unsafe content, the company outsourced labelling to a San Francisco-based firm, Sama, whose contracted-out workers had to tag situations such as ‘child sexual abuse, bestiality, murder, suicide, torture, self harm, and incest’ as inappropriate material. These labelling, classifying and filtering tasks were remunerated at between $1.32 and $2 per hour, according to roles and seniority.

Enormous concerns

ChatGPT and its sisters, such as DALL·E, Synthesia and MusicLM, raise enormous technological, ethical, social, environmental and political concerns. The DPA simply addressed the challenge from the perspective of data protection, which at the moment is one of the few sets of operating rules to target the very first phases of the AI lifecycle. Non-European tech firms dealing with EU-based data subjects must follow the same rules as European companies.

OpenAI’s first response lacks moral scruple. Imagine a car company failing to fit mandatory seat belts in its cars and being alerted to this by a national transport authority. How should one judge a corporate choice to quit selling cars in the country rather than remedy the error?

The norm-breaking ethos of technology companies must be tackled with less lenient responses, not allowing vague pro-innovation rhetoric to go uncontested. Digital progress may well improve the way we live, work, learn and interact with one another. But emerging technologies must be governed in such a way as to achieve socio-economic sustainability.

Antonio Aloisi

Antonio Aloisi is a professor of European and comparative labour law and digital transformation researcher at IE University Law School, Madrid. He co-authored Your Boss Is an Algorithm (Hart, 2022) and advises institutions on algorithmic management and AI-at-work policy.

Valerio De Stefano

Valerio De Stefano is a law professor at Osgoode Hall Law School, York University, Toronto.
