DATA SAFETY

How We Keep Regulated Data Safe in Every AI System We Build

If your data is regulated, the first question about any AI system is not what it can do. It is where your data goes, who can see it, and how you prove it. Here are the four layers that answer that question before a single pipeline ships.

If your business runs on regulated data, you have probably already had the conversation that ends an AI project before it starts. Someone wants to paste a patient note, a client file, or a card transaction into a chatbot to see what it can do. Someone else, usually the person whose name is on the compliance filing, asks where that data goes once it leaves the building. Nobody has a clean answer, so the project quietly dies. That instinct is correct. The problem is not the question. The problem is that most AI tools were never built to answer it.

Watchtower is the AI system Ereos builds to sit underneath your operation, and we treat that question as the foundation, not an afterthought. Keeping regulated data safe is not one feature we add at the end. It is four layers we build in sequence, before any pipeline produces a single output. This is the framework we use on every engagement, whether you are bound by HIPAA, by SEC books-and-records rules, by fair-housing law, by PCI, or by all of them at once. Each layer is reviewable. None of them asks you to take AI on faith.

Layer one: network and identity

The first decision is where the system lives, and the answer is your environment, not ours. Watchtower runs inside your own Microsoft 365 tenant, your Azure subscription, or your equivalent. It authenticates through the identity provider your team already uses. It inherits the security controls, the conditional access policies, and the network boundaries your IT team has already built and already defends.

This matters more than it sounds. A separate AI vendor platform means a separate place your regulated data goes to live, a separate set of credentials to manage, and a new attack surface that did not exist last quarter. Every one of those is something your compliance team now has to account for and your security team now has to monitor. Watchtower does not create any of that. It runs where your data already is, under the controls that already govern it. There is no parallel system for your IT team to maintain and no new perimeter to defend.

  • Deployment inside your own tenant or subscription, not a third-party AI platform you do not control.
  • Authentication through your existing identity provider, so access follows the rules your team already set.
  • Security and network controls you already operate, with no new perimeter to harden.

Layer two: data scrubbing before any model call

Running inside your environment is necessary, but it is not enough on its own, because the AI models doing the actual reasoning often run outside it. That is the moment regulated data is most exposed: the instant content leaves your pipeline to reach a model. So we put a scrubbing layer in front of every model call. Before any content reaches an AI model, the pipeline strips credentials, regulated identifiers, protected health information, personally identifiable information, and the other sensitive content the model does not need in order to do its job.

The principle is that a model should see the shape of the problem, not the identity of the person attached to it. A pipeline scoring a front-desk call does not need the patient's medical record number to tell you the call went badly. A pipeline flagging a trade-surveillance exception does not need the client's Social Security number to spot the pattern. Scrubbing first shrinks what is at stake on every model call to the minimum the work requires. If a model provider had a breach tomorrow, the content that passed through it would already be stripped of the identifiers that trigger a HIPAA breach notification or a PCI investigation.
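
To make the idea concrete, here is a minimal sketch of a scrubbing step that runs before a model call. It is an illustration under stated assumptions, not Watchtower's actual scrubber: the pattern set, the placeholder format, and the function name `scrub` are all hypothetical, and a production scrubber would need a reviewed, regime-specific rule set that goes well beyond regular expressions.

```python
import re

# Hypothetical patterns for illustration only. Real identifiers vary widely,
# and regexes alone will miss names, addresses, and free-text PHI.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,10}\b", re.IGNORECASE),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def scrub(text):
    """Replace each match with a typed placeholder.

    Returns the scrubbed text plus a count of removals per identifier type,
    so the removal itself can be recorded without storing the raw values.
    """
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


note = ("Caller (MRN: 00123456) gave SSN 123-45-6789 and card "
        "4111 1111 1111 1111, then hung up.")
clean, removed = scrub(note)
print(clean)    # Caller ([MRN]) gave SSN [SSN] and card [CARD], then hung up.
print(removed)  # {'SSN': 1, 'CARD': 1, 'MRN': 1}
```

The scrubbed sentence still tells a scoring pipeline that the caller hung up, which is the shape of the problem. The identifiers are replaced by typed placeholders before anything is sent on.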

Layer three: provider routing under signed agreements

Scrubbing reduces the exposure, but it does not eliminate the need to know who is on the other end of the call. So the third layer governs which AI providers your data is ever allowed to reach. Ereos only uses providers we hold signed data agreements with, including business associate agreements for healthcare work where a BAA is required. We route each pipeline's traffic to providers under the agreements that fit its regulatory regime, and we do not send your data to providers we have no contract with.

Two commitments inside those agreements matter for regulated data. First, your data is not used to train public models. The content that flows through a pipeline is processed for that pipeline and nothing else. It does not become training data that could resurface somewhere you cannot account for. Second, the agreement establishes the legal accountability your own filings depend on. A BAA is not a formality. It is the document that lets a healthcare client say, truthfully, that every party touching protected health information is contractually bound to protect it. The same logic applies to the agreements behind financial, housing, and payment data. The provider routing is built so that what is true on paper is also true in the running system.
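
A routing rule of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not a description of how Watchtower is implemented: the `Provider` type, the provider names, the `"dpa"` label for a general data agreement, and the regime-to-agreement mapping are all hypothetical. The point it shows is the fail-closed behavior: if no provider holds the required agreement, the call is refused rather than rerouted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    agreements: frozenset  # agreement types signed with this provider
    no_training: bool      # agreement bars training public models on the data


# Hypothetical mapping from regulatory regime to the agreement it requires.
REQUIRED_AGREEMENT = {
    "hipaa": "baa",
    "sec": "dpa",
    "pci": "dpa",
    "fair_housing": "dpa",
}


def eligible_providers(regime, providers):
    """Providers that hold the agreement this regime requires."""
    required = REQUIRED_AGREEMENT[regime]  # unknown regime raises KeyError
    return [p for p in providers
            if required in p.agreements and p.no_training]


def route(regime, providers):
    """Pick an eligible provider, or refuse the call entirely."""
    candidates = eligible_providers(regime, providers)
    if not candidates:
        raise LookupError(
            f"no provider under a signed agreement for regime {regime!r}")
    return candidates[0]


providers = [
    Provider("provider_a", frozenset({"dpa"}), True),
    Provider("provider_b", frozenset({"dpa", "baa"}), True),
    Provider("provider_c", frozenset(), False),  # no contract: never eligible
]
print(route("hipaa", providers).name)  # provider_b
print(route("sec", providers).name)    # provider_a
```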

Layer four: audit logging and spend tracking

The first three layers control where data goes and what reaches a model. The fourth layer is how you prove it, and proof is the part regulators actually ask for. Every interaction Watchtower has is logged. For each pipeline, the data flow is a diagram, not a verbal assurance: it shows what gets read, what gets scrubbed, which provider sees what, and what comes back. Your compliance officer reviews that diagram and signs off on it before the pipeline ships. Nothing reaches production that your compliance team has not seen on paper first.

That logging is what turns a defensible design into a defensible record. When an SEC examiner asks an RIA to produce its books and records, the audit log is the answer. When a payer audits a practice, the log shows exactly which content moved and where. When a fair-housing complaint lands on a multifamily operator, the trail shows what the system saw and what a human decided. The same log carries spend tracking, so the cost of every pipeline is visible alongside its activity. You can read more about how that spend control works on its own in our piece on predictable AI spend, but the point here is simpler: a system you cannot audit is a system you cannot defend, and we build the audit in from the first day.
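
As an illustration of how one log can carry both the audit trail and the spend figures, here is a minimal sketch of a log record. Every field name is a hypothetical choice, and storing a hash of the scrubbed content rather than the content itself is an assumption made for this example, not something the framework above specifies.

```python
import hashlib
from collections import defaultdict
from datetime import datetime, timezone


def audit_record(pipeline, provider, scrubbed_text, scrub_counts, cost_usd):
    """Build one log entry for a single model call.

    The entry records which pipeline called which provider, what was
    scrubbed, and what the call cost. It keeps only a SHA-256 digest of the
    content that was sent, so the log itself holds no message text.
    """
    digest = hashlib.sha256(scrubbed_text.encode("utf-8")).hexdigest()
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "pipeline": pipeline,
        "provider": provider,
        "content_sha256": digest,
        "scrub_counts": scrub_counts,
        "cost_usd": cost_usd,
    }


def spend_by_pipeline(records):
    """Total cost per pipeline, read from the same records as the audit trail."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["pipeline"]] += rec["cost_usd"]
    return dict(totals)


log = [
    audit_record("call_scoring", "provider_b", "Caller [SSN] hung up.", {"SSN": 1}, 0.25),
    audit_record("call_scoring", "provider_b", "Caller asked for a refill.", {}, 0.5),
    audit_record("trade_surveillance", "provider_a", "Exception on [ACCOUNT].", {"ACCOUNT": 1}, 0.5),
]
print(spend_by_pipeline(log))  # {'call_scoring': 0.75, 'trade_surveillance': 0.5}
```

Because activity and cost live in the same record, a question like "what did this pipeline send, to whom, and what did it cost" is one query instead of a reconciliation between two systems.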

Why all four, every time

Any one of these layers on its own leaves a gap. Running inside your tenant does not help if the pipeline ships raw identifiers to a model under no agreement. A signed agreement does not help if you cannot prove which data actually moved. Scrubbing does not help if there is no record that it happened. The four layers are a single framework because regulated data demands all four at once: control the boundary, minimize the content, govern the providers, and keep the proof.

This is also why we sequence the work the way we do. The scrubber, the audit log, the per-pipeline cost tracking, and the user permissions all stand up before a single AI call hits production. We learned that order by running Watchtower on our own operation first, where our own regulated data was the thing at stake. If the way your team handles regulated data has kept you out of AI entirely, that is usually the right reason to start the conversation, not the reason to avoid it.
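
That sequencing can be expressed as a simple readiness gate. The sketch below is hypothetical: the control names mirror the list above plus the compliance sign-off described under layer four, and the `status` dictionary stands in for whatever checks a real deployment would run.

```python
# Controls that must be in place before a pipeline makes its first model call.
# The names are illustrative labels, not an actual configuration schema.
REQUIRED_CONTROLS = (
    "scrubber",
    "audit_log",
    "cost_tracking",
    "user_permissions",
    "compliance_signoff",
)


def missing_controls(status):
    """Return the required controls that are absent or not yet enabled."""
    return [c for c in REQUIRED_CONTROLS if not status.get(c, False)]


def assert_ready(pipeline, status):
    """Raise if any control is missing, so the pipeline cannot go live."""
    missing = missing_controls(status)
    if missing:
        raise RuntimeError(
            f"{pipeline}: not ready for production, missing {', '.join(missing)}")


status = {"scrubber": True, "audit_log": True, "cost_tracking": True,
          "user_permissions": True}
print(missing_controls(status))  # ['compliance_signoff']
```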

Common questions

Where does Watchtower run, and does my data leave my environment?
Watchtower runs inside your own Microsoft 365 tenant, Azure subscription, or equivalent, on your existing identity provider and security controls. Regulated content is scrubbed before any model call, and only minimized content ever reaches an AI provider, under a signed data agreement. There is no separate vendor platform holding your data.

Will my regulated data be used to train AI models?
No. Ereos only uses providers we hold signed data agreements with, and those agreements specify that your data is not used to train public models. Content that flows through a pipeline is processed for that pipeline and nothing else.

Can Watchtower meet HIPAA, SEC, fair-housing, and PCI requirements?
The four-layer framework is built to fit each regime: business associate agreements for HIPAA, an audit log built to the SEC books-and-records standard, language flagging for fair housing, and scope control for PCI. The data-flow diagram for each pipeline is reviewable and signable by your compliance officer before it ships.

How do I prove to an auditor what the AI system did?
Every interaction is logged, and each pipeline ships with a reviewable data-flow diagram showing what gets read, scrubbed, sent, and returned. When an examiner, payer, or regulator asks for records, that log is the answer, alongside per-pipeline spend tracking.

See what this would do inside your operation.

A discovery call is a conversation, not a commitment. We will walk through what a custom Watchtower would do against your specific systems and data.