AI Licensing Glossary | Copyright Agency

AI Licensing: Permission for content to be used in specific AI-related activities under agreed terms and conditions. Licensing arrangements may cover:

use of content in RAG systems
fine-tuning
AI model development
enterprise AI tools

AI model: The underlying software system that powers an AI tool. An AI model is trained on large amounts of information so it can recognise patterns and generate responses.

AI model developer: A company or organisation that builds or develops AI models. Some develop large general-purpose models, while others build smaller specialist models for particular industries or tasks.

AI training dataset: A collection of content used to train an AI model. Datasets can include books, articles, websites, images, research papers, audio and other materials.

Artificial Intelligence (AI): A broad term for computer systems that can perform tasks that normally require human intelligence, such as writing, summarising, searching, analysing or generating content. Examples include chatbots, image generators, recommendation systems and research assistants.

Collective licensing: A licensing model where a collecting society, such as Copyright Agency, manages permissions on behalf of many rights holders. Collective licensing can help simplify permissions and payment processes where large amounts of content and many creators are involved.

Enterprise AI: AI tools used within organisations for internal business purposes. Examples include:

workplace chatbots
research assistants
document summarisation tools
internal knowledge systems

Expression of Interest (EOI): An early consultation process where members indicate whether they may be interested in future AI licensing opportunities. An EOI is not a commitment to participate.

Generative AI: AI systems that create new content such as text, images, audio, video or code. Examples include:

chatbot responses
AI-generated images
automated summaries
drafted reports or emails

Grounding: A technique used to make AI outputs more accurate and reliable by connecting the AI system to trusted reference content. Grounding helps reduce incorrect or invented answers (sometimes called “hallucinations”). The term is often used alongside RAG.

Guardrails: Rules, protections and conditions designed to control how AI systems use licensed content. Examples may include:

transparency requirements
security protections
limits on how outputs can be used
restrictions on competing uses

Large Language Model (LLM): A type of AI model designed to understand and generate human language. LLMs are trained on massive collections of text and can answer questions, summarise information, draft documents and hold conversations. Examples include the models that power ChatGPT, Claude and Gemini.

Non-exclusive rights: Rights that allow multiple licensing arrangements to exist at the same time. If members opt in on a non-exclusive basis, they may still choose to enter direct licensing arrangements separately.

Prompt: The instruction or question a person gives to an AI tool. Examples:

“Summarise this article”
“Draft a media release”
“Explain this legal case”

Prompt input: The content that a person uploads, copies or enters into an AI tool as part of a prompt. This could include text, images, reports or articles.

Reference content: Content that an AI system can access or retrieve to improve its responses. This content may not become permanently part of the model itself. Instead, it is consulted when needed. Examples include:

journal articles
news archives
databases
professional/trade publications

Retrieval-Augmented Generation (RAG): A method that allows an AI system to retrieve information from trusted sources before generating a response. Rather than relying only on what the model learned during training, the AI system searches approved content sources in real time. For example:

an AI research assistant may retrieve licensed journal articles before answering a question
a workplace chatbot may search internal company documents to provide accurate information

RAG systems are important because they can improve accuracy and allow content owners to license access to trusted content.
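
For readers who want to see the retrieve-then-generate idea in concrete terms, the following is a minimal illustrative sketch in Python, not a description of any real product. The names Document, retrieve and build_prompt, and the sample content, are hypothetical. Simple word overlap stands in for the far more capable search that real systems use, and the final call to an AI model is left out.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """A piece of approved reference content (hypothetical structure)."""
    title: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,;:?!\"'()").lower() for word in text.split()} - {""}


def retrieve(question: str, documents: list[Document], top_k: int = 2) -> list[Document]:
    """Return up to top_k documents that share the most words with the question.

    Word overlap is an assumption made for illustration only.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc.text)), doc) for doc in documents]
    # Keep documents with at least one shared word, highest overlap first.
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_prompt(question: str, retrieved: list[Document]) -> str:
    """Combine the retrieved content and the question into one prompt for a model."""
    sources = "\n".join(f"[{doc.title}] {doc.text}" for doc in retrieved)
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


# Hypothetical approved content sources.
library = [
    Document("Journal A", "Coral reefs are affected by rising ocean temperatures."),
    Document("News B", "The council approved a new library budget."),
    Document("Journal C", "Ocean acidification harms coral growth."),
]

question = "What affects coral reefs in the ocean?"
retrieved = retrieve(question, library)
prompt = build_prompt(question, retrieved)
print(prompt)
```

In this sketch the retrieved content is placed in the prompt at the time of the question and is not added to the model itself, which matches the description of reference content above.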

Training: The process of teaching an AI model by exposing it to large amounts of data and content. During training, the model learns patterns in language, images or other material.

Trusted content: Reliable, high-quality content from known and authoritative sources. AI companies and enterprises increasingly value trusted content because it can improve the quality and accuracy of AI outputs.
