Artificial intelligence (AI) development offshore
The most talked-about aspect of Generative AI (GenAI) is AI developers’ use of other people’s content to ‘train’ foundation Large Language Models (LLMs), which are mostly used to analyse and generate text. There are also AI Diffusion Models, which primarily generate images, audio and video.
These training practices have resulted in a large number of lawsuits, mostly in the US, where the developers are based. The companies being sued include OpenAI, Microsoft, Alphabet (Google), Meta (Facebook, Instagram), Stability AI (Stable Diffusion), Anthropic, GitHub, Nvidia, Databricks (MosaicML), Suno and Udio.
Some of these companies have used Australians’ works, such as Australian writers’ works in the ‘Books3’ pirated books dataset (see here). It is difficult to know the extent to which Australians’ works have been used, however, because AI developers do not currently disclose the content that they have used.
There have been some licensing deals between OpenAI and publishing companies (including News Corp and the Financial Times), but the past use of content remains unresolved.
For a range of reasons, including cost and the fact that the training occurred in another country, it is difficult for Australians to take legal action under copyright law to redress the unauthorised and uncompensated use of their content in connection with foundation LLMs.
The companies that have developed foundation LLMs are offering products and services in Australia. Copyright Agency and others have asked the Government to look at mechanisms to compensate Australians whose works have been used in connection with foundation LLMs.
AI-related activities in Australia
Ethical and equitable use of content for AI development in Australia
It seems unlikely that AI development on the scale of the foundation LLMs would occur in Australia, because of the computing power and resources required. Development in Australia is likely to continue to involve smaller datasets, which can be sourced ethically and legally, including under copyright licensing arrangements.
Australia’s copyright legislation is well set up to enable ethical, equitable AI development in Australia that will be beneficial to Australians. Copyright Agency and others in the creative industries have asked the Government to resist calls by multinational companies to water down Australia’s copyright system, as doing so would not be in Australia’s interest.
So far, companies and governments that are piloting local AI development appear to be using data that they own copyright in, or are permitted to use. These early developments include chatbots developed by consultancy firms (here), personalised learning tools for school students (here and here), internal AI tools that help bank staff summarise documents and generate communications to customers (here), and government analysis of documentation (here).
As developments progress, locally developed AI products, services and tools may benefit from other people’s content available under licensing arrangements.
Using other people’s content in ‘prompts’ for AI tools
People using AI tools can use other people’s content as part of their prompt. For example, a person may ask an AI tool to summarise a report or journal article.
This can be covered by copyright licensing arrangements for the person’s workplace. For example, teachers can use other people’s content in prompts under the education statutory licence, provided the requirements of the licence are met, including that the content is not copied or stored outside the school system (see here).
Human-authored content in outputs from AI tools
Developers of foundation LLMs say that their models are not supposed to reproduce human-authored content in outputs. This has nonetheless occurred, as evidenced by some of the lawsuits against the developers.
A person using an AI tool may not know if the output includes human-authored content. Copyright licensing arrangements for workplaces can cover this scenario. For example, teachers can copy and share outputs from AI tools that include human-authored content under the education statutory licence, provided the requirements of the licence are met (see here).
Copyright protection for outputs from AI tools
There is considerable discussion about whether outputs from AI tools can be protected by copyright. This discussion is occurring mostly in the US, where there is a copyright registration system.
Australia does not have a copyright registration system. If someone were sued for breach of copyright for using an AI output generated by someone else, a court would likely look at the level of human involvement in the generation of the output and how the output had been used by the alleged infringer.
August 2024