How to draft an AI contract
Best practice for using AI (i.e. in this context large language models) is in a state of flux. Our clients tell us that they see a lot of abstract high-level articles about AI, but little concrete practical advice about how to draft an AI contract.
What should we consider when drafting and negotiating an AI contract?
You will probably either download AI models directly from the model creator or access AI models via a service provided over the internet
Large language models come in two forms:
(a) models provided as a service that you can access remotely, most famously the OpenAI models (such as ChatGPT); and
(b) models that you can download and use on your own computers. For example:
- Google DeepMind (Gemma-2-9B);
- Meta (Llama 3.1-8B);
- Microsoft (Phi-3-medium-128K-Instruct); and
- Mistral AI (Mistral-Nemo-Base-2407).
Your AI contract will probably be a software as a service agreement and not a model licence agreement
Unless your company is building its AI models from scratch (i.e. unless your company is OpenAI, Google, Meta, or a similar organisation), to use AI your company will probably either:
- procure AI models directly from a model creator under a model licence agreement; or
- access AI models via a service provided over the internet by a supplier under a software as a service agreement.
If customers download the models (i.e. option (b) above), then their AI contract will be a model licence agreement governing how the customer uses the models downloaded to their computers. These customers will probably end up using the model creator's standard terms of use for the relevant model listed above.
Alternatively, if customers access models remotely (i.e. option (a) above, such as ChatGPT), or use other AI vendors that process customer data and access AI functionality via the option (a) or (b) models on the customer's behalf, then the customer's AI contract will be a software as a service agreement with the AI vendor.
Your AI contract will probably be with an AI vendor that is not the underlying model creator or provider
While typing prompts into a ChatGPT interface may be fun, many AI agreements signed by business customers are with AI vendors that do not create the underlying AI models. Instead, these AI vendors build industry-specific services for customers using AI models provided by model creators.
These AI vendors build their own software that interacts with the option (a) or (b) models discussed above, and host that software and the customer's data on the AI vendor's servers. These AI vendors often access the models using retrieval-augmented generation (RAG), storing the customer's data in vector databases used to interact with the AI models.
In other words, customers upload data and documents to the AI vendor's servers, the software on those servers interacts with the AI models, and the software then sends responses in the form of data and documents back to the customer.
Key considerations when drafting an AI software as a service contract
(a) Your AI software as a service agreement should contain all the terms and conditions contained in a non-AI software as a service contract
For example, your AI contract should include the following details (mostly paraphrasing APRA requirements here):
- details regarding the scope of the arrangement, services to be supplied, and fees;
- service levels and performance requirements;
- details regarding the form in which data is to be kept and clear provisions identifying ownership and control of data;
- reporting requirements, including content and frequency of reporting;
- audit and monitoring procedures;
- business continuity management and force majeure processes;
- terms and conditions in relation to confidentiality, privacy and the security of information (including whether the vendor complies with ISO 27001 or a similar security standard);
- data location and transfer restrictions;
- default arrangements and termination provisions;
- dispute resolution arrangements;
- liability and indemnity clauses, and obligations for the supplier to comply with applicable laws;
- sub-contracting obligations;
- insurance obligations; and
- to the extent applicable, offshoring arrangements (including through sub-contracting).
(b) Data de-identification and aggregation (model creators will not necessarily keep customers’ information confidential)
Customers should assume that model creators will use the information in customer requests (prompts) to improve their products (i.e. model creators will not necessarily keep customers' information confidential). To avoid this potential problem, customers can either avoid including personal or confidential information when using these AI services, or de-identify the information before it is provided to the model creator.
Customers should ensure that their software as a service agreements (whether AI or non-AI) require the AI vendor to keep the data that customers upload secure, comply with privacy laws, and only use the confidential and personal information of the customer in order to provide services to the customer.
In other words, if the goal is to not provide identifiable information to the model creator, customers can either:
(a) de-identify the information before it is provided to the AI vendor; or
(b) provide the information to the AI vendor under an AI contract with appropriate security, privacy, and data use limitations. The AI contract can then require the AI vendor to de-identify the customer information before providing that information to the model creator.
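As an illustration of option (a), a minimal rule-based de-identification sketch in Python follows. The placeholder patterns and labels below are illustrative assumptions only; real de-identification requires vetted tooling and legal review.

```python
import re

# Illustrative patterns only: real de-identification needs a vetted
# tool and review, since regexes will miss many identifiers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def deidentify(text: str) -> str:
    """Replace matched identifiers with placeholder tokens."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

# Example: identifiers are replaced before the text leaves the customer.
print(deidentify("Contact Jane at jane.doe@example.com or +61 2 9999 0000."))
# → Contact Jane at [EMAIL] or [PHONE].
```

In practice the same step can sit inside the AI vendor's pipeline under option (b), with the AI contract requiring it to run before any data reaches the model creator.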
(c) Vector databases are commonly used for AI services, and may contain customer personal information and confidential information
Depending on the AI model used, the AI vendor may store customer information in a vector database and interact with the AI models using retrieval-augmented generation (RAG). A vector database stores vectors (fixed-length lists of numbers that mathematically represent data in a high-dimensional space) alongside other data items, allowing the user to search the database with a query vector and retrieve the closest matching records.
Databases more commonly used by software as a service providers (such as relational databases and NoSQL databases) generally store customer records in the database as text. On the other hand, vector databases store customer data as fixed-length lists of numbers, and AI vendors then use the information contained in the vector database with the AI models to generate AI model responses.
While vector databases may not contain raw text, they may still contain retrievable customer personal information and confidential information. Even if customers create their own vector database and provide the database (or access to it) to the AI vendor, the AI vendor may still have access to customer personal information and confidential information that will need to be appropriately protected and secured.
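To make the point concrete, here is a minimal Python sketch of the nearest-neighbour lookup at the heart of a vector database. The three-dimensional vectors and record names are made-up assumptions (real systems use learned embeddings with hundreds or thousands of dimensions and approximate indexes), but note that the records stored alongside the vectors remain retrievable, which is why they need the same protection as any other customer data.

```python
import math

# Toy "vector database": each record pairs an identifier (which in a
# real system may reference customer text) with its embedding vector.
# The vectors and names below are illustrative only.
records = [
    ("contract_clause_1", [0.9, 0.1, 0.0]),
    ("contract_clause_2", [0.0, 0.8, 0.2]),
    ("contract_clause_3", [0.1, 0.1, 0.9]),
]

def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query):
    """Return the stored record whose vector best matches the query."""
    return max(records, key=lambda r: cosine_similarity(query, r[1]))

print(nearest([0.85, 0.15, 0.05])[0])
# → contract_clause_1
```

In a RAG pipeline, the retrieved record's content is then passed to the AI model as context, which is the step at which customer information may reach the model creator.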
(d) Using AI in high-risk settings and changes in law
Global AI regulation is in a state of flux.
For example, in Australia, the Department of Industry, Science and Resources has released a Proposals Paper for consultation on Introducing mandatory guardrails for AI in high-risk settings, alongside a Voluntary AI Safety Standard. The proposal is that organisations developing or deploying high-risk AI systems will be required to:
- Establish, implement and publish an accountability process including governance, internal capability and a strategy for regulatory compliance;
- Establish and implement a risk management process to identify and mitigate risks;
- Protect AI systems, and implement data governance measures to manage data quality and provenance;
- Test AI models and systems to evaluate model performance and monitor the system once deployed;
- Enable human control or intervention in an AI system to achieve meaningful human oversight;
- Inform end-users regarding AI-enabled decisions, interactions with AI and AI-generated content;
- Establish processes for people impacted by AI systems to challenge use or outcomes;
- Be transparent with other organisations across the AI supply chain about data, models and systems to help them effectively address risks;
- Keep and maintain records to allow third parties to assess compliance with guardrails; and
- Undertake conformity assessments to demonstrate and certify compliance with the guardrails.
While the AI regulations that will affect most customers may still be abstract or years away, your AI contracts should preferably:
(a) require AI vendors to comply with obligations such as those listed above, track the use of customer information and compliance with those obligations, and report on that compliance to the AI vendor's customers; and
(b) allow for changes in law, and require AI vendors to comply with new and updated laws.
(e) Hallucination
Enough has already been said about AI hallucination. Customers should ensure that their AI use cases involve human checks and intervention where appropriate.
(f) An exit plan
For all software as a service arrangements (whether AI or non-AI), it is also recommended that:
(a) customers have an immediate exit plan if laws change, allowing the customer to move to a new or different service (if required); and
(b) preferably, the applicable AI contract should include terms requiring the AI vendor to delete all of the customer's data if requested by the customer, noting that (as discussed above) the model creator may be able to continue using any customer data it has received even if the AI vendor intermediary deletes all of its copies of the customer's data.