LGPD and patient data in LLMs — what you cannot leave to improvisation

In healthcare, the right question is not only whether AI helps. It is also what happens to data when you paste a clinical history into an external platform. Without legal basis, governance, and minimization, the shortcut can become an ethical, institutional, and legal risk.

Published on: May 18, 2026
Reading time: 5 min read
Author: Equipe Humaniza Health

The most common mistake begins with a banal gesture

The scenario is familiar. A professional wants to save time. They copy a clinical summary, paste it into a chatbot, and ask: "organize this better", "summarize it", "help me think through it." There is rarely bad intent. What exists is time pressure, curiosity, and the feeling that the text is "not that identifiable anyway."

But in healthcare, improvised data handling is never neutral.

Health-related data receives heightened protection under Brazil's LGPD. That raises the standard of care expected whenever you use an external platform to process clinical information.

Warning

Pasting identifiable or easily re-identifiable patient information into an external LLM without clear institutional basis, minimization, and governance is risky behavior. In many settings, it is incompatible with the minimum compliance standard healthcare requires.

What matters legally and operationally

In real-world practice, memorizing statute numbers matters less than being able to answer four questions:

01. What data is being processed?
02. For what purpose is it being sent?
03. What legal basis and governance sustain that processing?
04. Is the chosen environment appropriate for that level of sensitivity?

Under LGPD, health data is sensitive personal data. That does not mean "never touch it." It means the required level of care rises sharply. Purpose, necessity, security, access control, retention, and accountability stop being administrative details and become part of the clinical and institutional decision.

If the organization has not clearly defined whether a given tool may or may not be used with clinical information, the prudent standard is not to invent a personal policy in the middle of a busy shift.
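One way to picture that prudent standard is a lookup that denies by default. The sketch below is only an illustration: the tool names, the three sensitivity levels, and the `APPROVED_TOOLS` register are all hypothetical, and a real institution would define its own classes and approval process.

```python
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = 1
    INTERNAL = 2
    CLINICAL = 3  # health data: sensitive personal data under LGPD


# Hypothetical institutional register: tool name -> highest sensitivity approved.
# Both tool names are invented for this example.
APPROVED_TOOLS = {
    "internal-clinical-assistant": Sensitivity.CLINICAL,
    "public-chatbot": Sensitivity.PUBLIC,
}


def may_use(tool: str, data: Sensitivity) -> bool:
    """Allow a tool only up to its approved ceiling.

    A tool missing from the register is treated as approved for public
    data only, so an undefined policy never permits clinical data.
    """
    ceiling = APPROVED_TOOLS.get(tool, Sensitivity.PUBLIC)
    return data.value <= ceiling.value


print(may_use("public-chatbot", Sensitivity.CLINICAL))
print(may_use("internal-clinical-assistant", Sensitivity.CLINICAL))
```

The design choice that matters here is the default: when nobody has decided, the answer for clinical data is "no", and changing that answer is an institutional act rather than a personal one.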

Not every "de-identified" text is actually safe

A recurring mistake is confusing the absence of a full name with real anonymization. Very often, the text remains re-identifiable because it still contains precise age, dates, rare diagnoses, location, service context, or a combination of clinical clues.

In other words: removing the patient's name does not solve the problem by itself.

Note

Useful de-identification depends on context. A rare case in a small city, inside a specific service, may remain identifiable even without an explicit name, ID number, or chart number.
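To make the "rare case in a small city" problem concrete, here is a minimal sketch of a group-size check in the spirit of k-anonymity. The records, field names, and choice of quasi-identifiers are invented for illustration; a real assessment would need a much richer view of context than three columns.

```python
from collections import Counter

# Hypothetical records with names already removed.
records = [
    {"age_band": "40-49", "city": "Large City", "diagnosis": "type 2 diabetes"},
    {"age_band": "40-49", "city": "Large City", "diagnosis": "type 2 diabetes"},
    {"age_band": "40-49", "city": "Large City", "diagnosis": "type 2 diabetes"},
    {"age_band": "30-39", "city": "Small Town", "diagnosis": "rare metabolic disorder"},
]

# Assumed quasi-identifiers: fields that are harmless alone but revealing together.
QUASI_IDENTIFIERS = ("age_band", "city", "diagnosis")


def _group_counts(rows, keys):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def smallest_group(rows, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group; 1 means at least one row is unique."""
    return min(_group_counts(rows, keys).values(), default=0)


def unique_rows(rows, keys=QUASI_IDENTIFIERS):
    """Rows whose combination of quasi-identifiers appears exactly once."""
    counts = _group_counts(rows, keys)
    return [row for row in rows if counts[tuple(row[k] for k in keys)] == 1]


print(smallest_group(records))
print(unique_rows(records))
```

In this toy data, the three diabetes records hide among each other, while the rare diagnosis in the small town stands alone even though no name appears anywhere.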

That is why, when the goal is to use AI for writing support, teaching, or case discussion, the mature practice usually looks like one of three strategies:

fully fictional case
heavily abstracted and minimized case
institutionally approved environment for that use

What is usually inappropriate and what is usually safer

Situation	Main risk	Safer alternative
Pasting identifiable clinical notes into a public chatbot	improper exposure of sensitive data	rewrite as a fictional case or use an approved institutional environment
Asking the model to summarize a report with patient details	transfer of sensitive information without clear governance	extract only the clinical question without identifiers, or work inside an internal system
Using AI for teaching with a nearly literal rare case	re-identification	build a synthetic teaching case
Reviewing academic material that still contains sensitive research data	use outside the intended scope	actually anonymize and align with protocol and committee requirements when applicable

The core point is this: the usefulness of the tool never removes the obligation to design a use proportional to the risk.
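As one small example of proportional design, a pre-send check can flag text that still carries obvious identifier-like patterns. This is a sketch under strong assumptions: the three regular expressions are illustrative, they miss most of what makes clinical text identifiable, and an empty result does not mean the text is anonymized.

```python
import re

# Illustrative patterns only; real clinical text contains many identifiers these miss.
PATTERNS = {
    # Brazilian CPF written in its usual punctuated form, e.g. 000.000.000-00
    "cpf": re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"),
    # Slash-separated dates such as 12/03/2025
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    # Exact ages such as "47 years old" or "47 yo"
    "exact_age": re.compile(r"\b\d{1,3}\s*(?:years?\s*old|yo|y/o)\b", re.IGNORECASE),
}


def flag_identifiers(text: str) -> dict:
    """Return {pattern name: list of matches} for every pattern that matched."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


note = "Patient, 47 years old, seen on 12/03/2025, CPF 123.456.789-09."
print(flag_identifiers(note))
```

A check like this is best treated as a tripwire that stops an obvious mistake, not as a substitute for rewriting the case or moving to an approved environment.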

What a physician, professor, or researcher should do in practice

If you work on the front line, asking yourself a few simple questions before each use already goes a long way:

am I using real patient data?
is that data necessary for this task?
is there an institutional policy allowing this workflow?
can I reformulate the request with sufficient abstraction?
would I be comfortable justifying this use to leadership, legal, the DPO, or the patient?

If the answer starts to feel uncomfortable, the workflow is probably not mature enough.
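The five questions above can be sketched as a small pre-flight check. The field names, the three outcomes, and the order in which the answers are weighed are assumptions made for this example, not an official decision procedure.

```python
from dataclasses import dataclass


@dataclass
class PreflightAnswers:
    uses_real_patient_data: bool
    data_is_necessary: bool
    policy_allows_workflow: bool
    can_abstract_request: bool
    comfortable_justifying: bool


def recommend(a: PreflightAnswers) -> str:
    """Map the five answers to 'proceed', 'minimize', or 'stop'."""
    if not a.uses_real_patient_data:
        return "proceed"
    # Real data that is unnecessary, or a request that can be abstracted,
    # should be minimized before anything else is considered.
    if not a.data_is_necessary or a.can_abstract_request:
        return "minimize"
    # Real, necessary data needs both an institutional policy and a use
    # you could justify to leadership, legal, the DPO, or the patient.
    if a.policy_allows_workflow and a.comfortable_justifying:
        return "proceed"
    return "stop"


busy_shift = PreflightAnswers(
    uses_real_patient_data=True,
    data_is_necessary=True,
    policy_allows_workflow=False,
    can_abstract_request=False,
    comfortable_justifying=True,
)
print(recommend(busy_shift))
```

Note that personal comfort alone never unlocks the workflow in this sketch: without an institutional policy, real and necessary data still ends at "stop".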

In teaching and research, this matters twice as much. Excitement about AI does not replace consent, research ethics, data governance, or authorship responsibility.

Where RAG, controlled environments, and governance fit

Not every solution looks like "never use AI." The serious solution usually looks like architecture plus process.

When the environment is controlled, the corpus is defined, access is restricted, and the use has a legitimate and bounded purpose, the discussion moves to a different level. Tools based on RAG (retrieval-augmented generation), enterprise environments, and models operating over internal knowledge bases can offer far more responsible paths than improvising on open platforms.

But that requires institutional design. It does not arise spontaneously because an individual user meant well.
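To show what "architecture plus process" can mean at the smallest possible scale, here is a toy retrieval step with a defined corpus, role-based access, a bounded set of purposes, and an audit trail. Everything in it is hypothetical: the class names, the role and purpose labels, and the naive word-overlap ranking that stands in for a real retriever.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_roles: frozenset  # roles permitted to see this document


@dataclass
class ControlledRetriever:
    corpus: list                 # the defined, institution-curated corpus
    allowed_purposes: frozenset  # bounded purposes approved by governance
    audit_log: list = field(default_factory=list)

    def _log(self, role, purpose, doc_ids, outcome):
        # Record who asked, why, and what was returned. The query text is
        # left out of the log because it may itself contain sensitive details.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "role": role,
            "purpose": purpose,
            "doc_ids": doc_ids,
            "outcome": outcome,
        })

    def retrieve(self, query, role, purpose, top_k=3):
        if purpose not in self.allowed_purposes:
            self._log(role, purpose, [], "denied")
            raise PermissionError(f"purpose not approved: {purpose}")
        # Access control happens before ranking, so restricted text is never scored.
        visible = [d for d in self.corpus if role in d.allowed_roles]
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.text.lower().split())), d) for d in visible]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        hits = [d for score, d in ranked if score > 0][:top_k]
        self._log(role, purpose, [d.doc_id for d in hits], "allowed")
        return hits
```

None of these controls is something an individual user can add on their own from inside a chat window, which is the point: the corpus, the roles, the purposes, and the log all have to be decided and maintained by the institution.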

Limits of this post

This text is editorial and operational guidance, not legal advice. It does not replace review by counsel, the data protection officer, an ethics committee, or institutional governance.

Warning

If your use involves real patients, sensitive data, or protected institutional material, the final question is not "can I do it?", but "am I authorized, covered, and governed to do it?"

One more distinction matters: being technically able to remove names does not mean the case has been adequately anonymized. And a teaching context does not suspend privacy duties.

What to carry forward

In healthcare, useful AI is not AI without rules. It is AI with better rules.

LGPD is not an irrational brake on technology use. It is a reminder that health data carries its own weight. Problems almost never begin with a huge planned scandal. They begin with a small shortcut, repeated without institutional design.

In healthcare, mature AI use does not begin with the prompt. It begins with data governance.

In the next post, we will unpack another common cognitive shortcut: the headline claiming that "AI passed the USMLE" and suggesting, from that alone, that it is ready for the real world.

To follow the full series in its current V0 form, go to /en/blog?category=guia-ia-saude.

References

Brazil's General Data Protection Law (Law No. 13,709/2018). Planalto
Brazilian National Data Protection Authority (ANPD). Public materials and guidance. gov.br/anpd
