LGPD and patient data in LLMs — what you cannot leave to improvisation
In healthcare, the right question is not only whether AI helps. It is also what happens to data when you paste a clinical history into an external platform. Without legal basis, governance, and minimization, the shortcut can become an ethical, institutional, and legal risk.
Published on
May 18, 2026
Reading time
5 min read
Author
Equipe Humaniza Health
The most common mistake begins with a banal gesture
The scenario is familiar. A professional wants to save time. They copy a clinical summary, paste it into a chatbot, and ask: "organize this better", "summarize it", "help me think through it." There is rarely bad intent. What exists is time pressure, curiosity, and the feeling that the text is "not that identifiable anyway."
But in healthcare, improvised data handling is never neutral.
Health-related data receives reinforced protection under Brazil's LGPD. That completely changes the standard of care expected whenever you use an external platform to process clinical information.
Pasting identifiable or easily re-identifiable patient information into an external LLM without clear institutional basis, minimization, and governance is risky behavior. In many settings, it is incompatible with the minimum compliance standard healthcare requires.
What matters legally and operationally
In day-to-day practice, memorizing statute numbers matters less than being able to answer four questions:
- 01. What data is being processed?
- 02. For what purpose is it being sent?
- 03. What legal basis and governance sustain that processing?
- 04. Is the chosen environment appropriate for that level of sensitivity?
Under LGPD, health data is sensitive personal data. That does not mean "never touch it." It means the required level of care rises sharply. Purpose, necessity, security, access control, retention, and accountability stop being administrative details and become part of the clinical and institutional decision.
If the organization has not clearly defined whether a given tool may or may not be used with clinical information, the prudent standard is not to invent a personal policy in the middle of a busy shift.
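One way to picture that prudent standard is a default-deny lookup: a tool is usable for a given class of data only if the institution has explicitly registered it for that class. The sketch below is illustrative only. The tier names, the tool names, and the `APPROVED_TOOLS` register are all assumptions made for the example, not anything defined by LGPD or by a specific institution.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity tiers, ordered from least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    HEALTH_DATA = 2  # sensitive personal data under LGPD


# Hypothetical institutional register: tool name -> highest tier it is approved for.
# In a real setting this would be owned by governance, not by individual users.
APPROVED_TOOLS = {
    "public-chatbot": Sensitivity.PUBLIC,
    "internal-rag-assistant": Sensitivity.HEALTH_DATA,
}


def is_use_allowed(tool: str, data_tier: Sensitivity) -> bool:
    """Return True only if the tool is registered and approved for this tier."""
    approved_up_to = APPROVED_TOOLS.get(tool)
    if approved_up_to is None:
        # No institutional decision exists for this tool: default to deny.
        return False
    return data_tier <= approved_up_to


if __name__ == "__main__":
    print(is_use_allowed("public-chatbot", Sensitivity.HEALTH_DATA))
    print(is_use_allowed("internal-rag-assistant", Sensitivity.HEALTH_DATA))
    print(is_use_allowed("some-new-app", Sensitivity.PUBLIC))
```

The design choice worth noticing is the `None` branch: an unregistered tool is refused even for low-sensitivity data, so the absence of a policy never turns into an implicit permission.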
Not every "de-identified" text is actually safe
A recurring mistake is confusing the absence of a full name with real anonymization. Very often, the text remains re-identifiable because it still contains precise age, dates, rare diagnoses, location, service context, or a combination of clinical clues.
In other words: removing the patient's name does not solve the problem by itself.
Useful de-identification depends on context. A rare case in a small city, inside a specific service, may remain identifiable even without an explicit name, ID number, or chart number.
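To make the point concrete, here is a minimal sketch of a pre-send check that flags a few residual quasi-identifiers in a text that no longer contains a name. The patterns and the function name `find_quasi_identifiers` are assumptions chosen for illustration. Regular expressions like these catch only the most obvious clues; they cannot detect a rare diagnosis or an identifying service context, so an empty result does not mean the text is anonymized.

```python
import re

# Illustrative patterns only. A real de-identification review needs far more
# than regular expressions: rare diagnoses and context cannot be matched this way.
QUASI_IDENTIFIER_PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "age": re.compile(r"\b\d{1,3}[- ]year[- ]old\b", re.IGNORECASE),
    "record_number": re.compile(
        r"\b(?:chart|record|MRN)\s*(?:no\.?|number|#)?\s*\d+\b", re.IGNORECASE
    ),
}


def find_quasi_identifiers(text: str) -> dict[str, list[str]]:
    """Return the matches found for each pattern label, omitting empty labels."""
    found = {}
    for label, pattern in QUASI_IDENTIFIER_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found


if __name__ == "__main__":
    note = (
        "A 47-year-old woman seen on 03/02/2025, chart #88231, "
        "with a rare metabolic disorder."
    )
    for label, matches in find_quasi_identifiers(note).items():
        print(label, matches)
```

In the sample note, the name is already gone, yet the age, the visit date, and the chart number are all flagged, and the rare diagnosis passes through unnoticed. That last gap is exactly why a tool like this can support human review but not replace it.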
That is why, when the goal is to use AI for writing support, teaching, or case discussion, the mature practice usually looks like one of three strategies:
- fully fictional case
- heavily abstracted and minimized case
- institutionally approved environment for that use
What is usually inappropriate and what is usually safer
| Situation | Main risk | Safer alternative |
|---|---|---|
| Pasting identifiable clinical notes into a public chatbot | improper exposure of sensitive data | rewrite as a fictional case or use an approved institutional environment |
| Asking the model to summarize a report with patient details | transfer of sensitive information without clear governance | extract only the clinical question without identifiers, or work inside an internal system |
| Using AI for teaching with a nearly literal rare case | re-identification | build a synthetic teaching case |
| Reviewing academic material that still contains sensitive research data | use outside the intended scope | actually anonymize and align with protocol and committee requirements when applicable |
The core point is this: the usefulness of the tool never removes the obligation to design a use proportional to the risk.
What a physician, professor, or researcher should do in practice
If you work on the front line, a few simple questions already go a long way toward better conduct:
- am I using real patient data?
- is that data necessary for this task?
- is there an institutional policy allowing this workflow?
- can I reformulate the request with sufficient abstraction?
- would I be comfortable justifying this use to leadership, legal, the DPO, or the patient?
If the answer starts to feel uncomfortable, the workflow is probably not mature enough.
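For teams that want to embed those questions in a form or an internal tool, they can be written down as a small structured check. This is a sketch under stated assumptions: the `UseCheck` fields and the way the answers are combined in `open_concerns` are one hypothetical reading of the list above, not an official compliance test.

```python
from dataclasses import dataclass


@dataclass
class UseCheck:
    """Answers to the five questions; field names are hypothetical."""
    uses_real_patient_data: bool
    data_is_necessary: bool
    policy_allows_workflow: bool
    could_be_abstracted: bool
    comfortable_justifying: bool


def open_concerns(check: UseCheck) -> list[str]:
    """List the questions whose answers suggest the workflow is not mature yet."""
    concerns = []
    if check.uses_real_patient_data:
        # Necessity and abstraction only matter when real data is involved.
        if not check.data_is_necessary:
            concerns.append("real patient data is not necessary for this task")
        if check.could_be_abstracted:
            concerns.append("the request could be reformulated with more abstraction")
    if not check.policy_allows_workflow:
        concerns.append("no institutional policy allows this workflow")
    if not check.comfortable_justifying:
        concerns.append(
            "the use would be hard to justify to leadership, legal, the DPO, or the patient"
        )
    return concerns


if __name__ == "__main__":
    rushed_shift = UseCheck(
        uses_real_patient_data=True,
        data_is_necessary=False,
        policy_allows_workflow=False,
        could_be_abstracted=True,
        comfortable_justifying=False,
    )
    for concern in open_concerns(rushed_shift):
        print("-", concern)
```

An empty list does not certify the use as compliant. It only means none of these five questions raised a flag.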
In teaching and research, this matters twice as much. Excitement about AI does not replace consent, research ethics, data governance, or authorship responsibility.
Where RAG, controlled environments, and governance fit
Not every solution looks like "never use AI." The serious solution usually looks like architecture plus process.
When the environment is controlled, the corpus is defined, access is restricted, and the use has a legitimate and bounded purpose, the discussion moves to a different level. Tools based on retrieval-augmented generation (RAG), enterprise environments, and models operating over internal knowledge bases can offer much more responsible paths than improvisation inside open platforms.
But that requires institutional design. It does not arise spontaneously because an individual user meant well.
Limits of this post
This text is editorial and operational guidance, not legal advice. It does not replace review by counsel, the data protection officer, an ethics committee, or institutional governance.
If your use involves real patients, sensitive data, or protected institutional material, the final question is not "can I do it?", but "am I authorized, covered, and governed to do it?"
One more distinction matters: being technically able to remove names does not mean the case has been adequately anonymized. And a teaching context does not suspend privacy duties.
What to carry forward
In healthcare, useful AI is not AI without rules. It is AI with better rules.
LGPD is not an irrational brake on technology use. It is a reminder that health data carries its own weight. Problems almost never begin with a huge planned scandal. They begin with a small shortcut, repeated without institutional design.
In healthcare, mature AI use does not begin with the prompt. It begins with data governance.
In the next post, we will unpack another common cognitive shortcut: the headline claiming that "AI passed the USMLE" and suggesting, from that alone, that it is ready for the real world.
To follow the full trail in its V0 form, use /en/blog?category=guia-ia-saude.
References
- Brazil's General Data Protection Law (Law No. 13,709/2018). Planalto
- Brazilian National Data Protection Authority (ANPD). Public materials and guidance. gov.br/anpd