The Three Data Models — And What Each Means for Your Firm

Model 1: Consumer AI tools (ChatGPT free, Bing Chat)
Data is processed by the vendor and may be used for model training. Do not use these tools for client work. Full stop.

Model 2: Enterprise AI tools (Microsoft Copilot for M365, ChatGPT Enterprise, Harvey)
Data is processed on the vendor's cloud infrastructure under a Data Processing Agreement (DPA), and the vendor commits not to use it for model training. The data still leaves your firm's direct control and is processed on third-party infrastructure. Appropriate for general productivity; a concern for highly sensitive client documents where confidentiality obligations are absolute.

Model 3: Custom AI on private deployment (Govistudio model)
The entire AI system (vector database, embedding model, language model inference, and query processing) runs inside your firm's own cloud environment. No data is transmitted to any external AI provider during operation. The firm retains full control of all data at all times. This is the appropriate architecture for client documents subject to attorney-client privilege.

Does OpenAI Train on Your Law Firm's Data?

This is the question every managing partner asks. The correct answer requires a distinction:

Consumer ChatGPT (chat.openai.com): OpenAI's privacy policy states that conversations may be used to improve and train models. Do not use this with client data.

OpenAI API (used by developers to build systems): When a Data Processing Agreement (DPA) is signed, OpenAI commits that API inputs and outputs are not used for training. However, data is still processed on OpenAI's shared infrastructure.

Azure OpenAI Service (Microsoft): Runs on Microsoft's cloud infrastructure rather than OpenAI's, so data stays within Microsoft's environment. Microsoft's enterprise DPA similarly commits to no training use.

Govistudio's custom AI (private deployment): Govistudio builds your system to run in your firm's own Azure tenant or AWS VPC. When an attorney submits a query, it is processed by infrastructure your firm controls. No external AI provider receives the query, the documents, or the answer.
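For the firm's IT team, one way to sanity-check the "nothing leaves our environment" claim is to confirm that every component the system calls lives on a host the firm controls. The following is a minimal sketch, not Govistudio's implementation: the domain suffix, component names, and URLs are hypothetical placeholders chosen for illustration.

```python
from urllib.parse import urlparse

# Hypothetical domain suffix for hosts inside the firm's own cloud environment (assumption).
FIRM_CONTROLLED_SUFFIXES = (".internal.examplefirm.co.uk",)


def external_endpoints(endpoints: dict[str, str]) -> list[str]:
    """Return the names of components whose host is outside firm-controlled domains."""
    flagged = []
    for name, url in endpoints.items():
        host = urlparse(url).hostname or ""
        if not host.endswith(FIRM_CONTROLLED_SUFFIXES):
            flagged.append(name)
    return sorted(flagged)


# Illustrative configuration: two internal components and one external AI API.
config = {
    "vector_database": "https://vectors.internal.examplefirm.co.uk:6333",
    "embedding_model": "https://embed.internal.examplefirm.co.uk",
    "inference": "https://api.openai.com/v1",
}

print(external_endpoints(config))  # ['inference']
```

A check like this only inspects configuration. It complements, and does not replace, network-level controls and a review of the actual deployment.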

UK GDPR and Attorney-Client Privilege Requirements

UK GDPR (for UK firms and firms handling UK client data):

AI vendor must sign a Data Processing Agreement (DPA) as a data processor under Article 28
Processing of personal data must have a lawful basis documented in your Record of Processing Activities (ROPA)
High-risk processing requires a Data Protection Impact Assessment (DPIA)
Data residency: if data must stay in the UK, deploy in a UK region such as Azure UK South or AWS eu-west-2 (London)
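The data residency requirement above can be turned into a simple automated check on deployment settings. This is a sketch under stated assumptions: it lists only the two regions named in this guide, using their provider identifiers (`uksouth` for Azure UK South, `eu-west-2` for AWS London), and the function name is illustrative.

```python
# Regions named in this guide, keyed by cloud provider.
# Other UK regions may exist; extend this mapping as your policy allows.
UK_REGIONS = {
    "azure": {"uksouth"},
    "aws": {"eu-west-2"},
}


def is_uk_resident(provider: str, region: str) -> bool:
    """Return True if the provider/region pair is on the approved UK list."""
    return region.lower() in UK_REGIONS.get(provider.lower(), set())


print(is_uk_resident("aws", "eu-west-2"))   # True
print(is_uk_resident("aws", "eu-west-1"))   # False (Ireland, not UK)
```

Running a check like this before each deployment gives the firm a documented control it can reference in its DPIA.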

Attorney-client privilege (UK: Legal Professional Privilege):

Privilege protects communications between solicitor and client
Communications disclosed to third parties without appropriate safeguards risk losing privilege
A properly constructed private AI deployment, where the firm's own cloud environment processes all data, helps preserve privilege by keeping data within the firm's control
Shared-cloud tools where a third-party vendor processes data introduce privilege risk that should be assessed by your data protection and professional responsibility counsel

Data Handling Comparison

	Consumer ChatGPT	Microsoft Copilot (Enterprise)	Custom AI (Private Deployment)
Data used for model training	Potentially	No (DPA)	No
Data leaves firm's environment	✓	✓	✗
Processed on third-party infrastructure	✓	✓	✗
Data residency control	✗	Partial	✓ Full
Privilege risk	High	Low–Medium	Minimal
Appropriate for client documents	✗	Consult counsel	✓

What to Require From Any AI Vendor Before Proceeding

Before signing any AI vendor agreement, obtain and review:

A signed Data Processing Agreement (DPA) designating the vendor as a data processor under applicable data protection law
Written confirmation of data residency — specifically which data centres process your data and in which jurisdiction
Written confirmation that your data is not used to train or improve the vendor's AI models
A description of the security architecture — specifically whether your data is processed on shared or dedicated infrastructure
Audit rights or a SOC 2 Type II report confirming security controls
A data retention and deletion policy stating what happens to your data when you terminate

Govistudio provides a standard DPA, security architecture documentation, and signed data handling commitments before any engagement begins.
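The six items in the checklist above can be tracked per vendor so that nothing is signed while a document is still missing. The sketch below is one possible way to do that; the item keys, class name, and vendor name are illustrative assumptions, not part of any vendor's process.

```python
from dataclasses import dataclass, field

# One key per checklist item in this section (names are illustrative).
REQUIRED_ITEMS = (
    "signed_dpa",
    "data_residency_confirmation",
    "no_training_confirmation",
    "security_architecture_description",
    "audit_rights_or_soc2_type2",
    "retention_and_deletion_policy",
)


@dataclass
class VendorReview:
    vendor: str
    received: set[str] = field(default_factory=set)

    def outstanding(self) -> list[str]:
        """Checklist items not yet received, in checklist order."""
        return [item for item in REQUIRED_ITEMS if item not in self.received]

    def ready_to_sign(self) -> bool:
        """True only when every required item has been received."""
        return not self.outstanding()


review = VendorReview("Example Vendor Ltd", {"signed_dpa", "no_training_confirmation"})
print(review.ready_to_sign())  # False
print(review.outstanding())
```

Receipt of a document is not the same as approval of its contents; each item still needs review by the appropriate person at the firm.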

FAQs

Q: Does OpenAI train on data submitted through the API?
A: No. When a Data Processing Agreement is signed with OpenAI (or with Microsoft, for Azure OpenAI), your data is not used for model training; this is a contractual commitment. The key distinction is between API or enterprise access and consumer ChatGPT, which may use data for training under its default privacy policy.

Q: What is a private AI deployment and why do law firms need it?
A: A private deployment means all AI processing occurs within your firm's own cloud environment: your Azure subscription or AWS account. No queries, document contents, or answers are transmitted to any external AI provider. This is the architecture required when client confidentiality and privilege obligations are absolute.

Q: Do we need a DPIA for our AI system?
A: Under UK GDPR, a DPIA is required when processing is "likely to result in a high risk" to individuals. AI systems that process client personal data (for example in legal matters involving medical records or employment disputes) typically meet this threshold. Govistudio provides a DPIA template as part of the implementation process.

Q: What does data residency mean and why does it matter?
A: Data residency refers to the physical location of the servers that process and store your data. For UK firms under UK GDPR, data processed outside the UK or EEA may require additional safeguards, such as standard contractual clauses. Private deployment on Azure UK South or AWS eu-west-2 (London) keeps data in the UK, so those additional transfer safeguards are not needed.

Is Your Law Firm's Client Data Safe With AI? A Plain-English Guide for Managing Partners

Quick Take / Direct Answer