{"id":23071,"date":"2026-05-29T12:13:07","date_gmt":"2026-05-29T12:13:07","guid":{"rendered":"https:\/\/engineerbabu.com\/blog\/?p=23071"},"modified":"2026-05-29T12:13:07","modified_gmt":"2026-05-29T12:13:07","slug":"hipaa-compliant-ai-in-healthcare","status":"publish","type":"post","link":"https:\/\/engineerbabu.com\/blog\/hipaa-compliant-ai-in-healthcare\/","title":{"rendered":"HIPAA-Compliant AI in Healthcare: Building Clinical AI Scribes, Decision Support, and Patient-Facing AI"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">In October 2023, a clinical AI startup in New York, Series A, $16M raised, shipped an ambient documentation feature. The product listened to provider-patient conversations during outpatient visits, transcribed the audio, and used an LLM to generate a structured SOAP note that pre-populated in the provider&#8217;s EHR.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Providers loved it. Documentation time dropped from 22 minutes per encounter to 6 minutes. The NPS from the first 40 provider beta users was 74.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In the sixth week of beta, a family medicine physician in their Pittsburgh pilot site caught something. The AI-generated note for a patient visit about knee pain included a medication in the Assessment and Plan section, ibuprofen 600mg TID, that the provider had not prescribed during the visit. The patient was allergic to NSAIDs. The allergy was documented in their chart.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The medication had not been discussed in the visit. The LLM had hallucinated it, generating a plausible but entirely fabricated treatment recommendation from context patterns in its training data. The physician caught it in review before signing the note. The note was corrected. 
The patient was not harmed.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But here is what the founder told me when he called two weeks later: his product had a one-click &#8220;Accept and Sign&#8221; button that a provider under time pressure, and every provider is under time pressure, could use to push the AI-generated note to the EHR without reading it in detail.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One click. A hallucinated NSAID prescription for an NSAID-allergic patient. No harm occurred because the physician caught it. Not because the product was designed to catch it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">They redesigned the note review workflow before the next beta cohort. The one-click sign was removed. A required field-by-field attestation was added. The time savings dropped from 16 minutes to 11 minutes per encounter. Three beta providers complained about the extra steps. The founder shipped anyway.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Eleven minutes is still eleven minutes. And no hallucinated NSAID prescription has reached a clinical record since.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is what clinical AI guardrail design looks like when you build it from first principles rather than from conversion rate optimization. It costs you some of the efficiency gain. It protects your users&#8217; patients. And it keeps your product from being the one that caused the harm that ended the company.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Build the guardrails first. Optimize the efficiency second.<\/span><\/p>\n<h2><b>Eight Things Clinical AI Founders Get Wrong Before They Build<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #1: &#8220;OpenAI\/Anthropic can sign a BAA, we&#8217;re covered.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">OpenAI offers a BAA under an enterprise agreement. Anthropic offers a BAA through direct enterprise negotiation. 
But a BAA with the LLM provider covers only the API service under the BAA&#8217;s specific terms. It does not cover every service those companies offer, it does not cover your data pipeline from clinical source to LLM and back, and it does not cover the third-party services in your stack that also touch the clinical data. The BAA with the LLM provider is one piece of a multi-vendor compliance architecture, not a complete solution.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #2: &#8220;The LLM won&#8217;t hallucinate on clinical data.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">All LLMs hallucinate. The frequency and severity vary by model, by prompt design, by the specificity of the clinical domain, and by how well the model&#8217;s training data covered the relevant clinical area. Building a <\/span><a href=\"https:\/\/engineerbabu.com\/blog\/8-benefits-of-using-ai-for-clinical-diagnosis\/\"><span style=\"font-weight: 400;\">clinical AI product<\/span><\/a><span style=\"font-weight: 400;\"> on the assumption that the LLM will not hallucinate is not a product decision, it is a liability decision. Every clinical AI output that could influence a clinical decision requires architectural guardrails that catch, flag, or prevent hallucinated content from reaching clinical use.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #3: &#8220;We&#8217;ll add clinical review later, let&#8217;s ship with auto-accept first.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">There is no &#8220;add clinical review later&#8221; in clinical AI. The moment your product produces an output that a clinician could act on, a medication in a SOAP note, a diagnosis in a clinical summary, a recommendation in a patient communication, that output can cause harm if it is wrong and the clinician does not review it. 
Ship with the clinical review workflow from Day 1 or do not ship.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #4: &#8220;Fine-tuning on clinical data is just training on more data.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Fine-tuning a foundation LLM on clinical data, patient records, clinical notes, medical literature, involves ePHI that is subject to HIPAA. The fine-tuning process itself, the training infrastructure, the model weights that embed clinical data patterns, and the storage of fine-tuning datasets all have HIPAA implications. Fine-tuning on ePHI without a compliant data governance framework is a HIPAA violation regardless of whether the fine-tuning produces a better model.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #5: &#8220;Patient-facing AI just needs a disclaimer.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A disclaimer that says &#8220;this is not medical advice&#8221; does not eliminate clinical liability if the AI provides information that a reasonable patient could interpret as clinical guidance and that information is incorrect. Patient-facing clinical AI requires clinical content governance, crisis escalation pathways, culturally appropriate health literacy design, and, for mental health contexts, specific safety protocols. A disclaimer is legal language. It is not a clinical safety architecture.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #6: &#8220;We don&#8217;t need IRB approval for clinical AI development.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">If your clinical AI development involves collecting data from patients or using patient data in a way that constitutes human subjects research, which includes using identified patient data to train or validate a clinical AI model, IRB approval may be required. 
The determination depends on whether the activity meets the regulatory definition of human subjects research under 45 CFR Part 46. Get a research compliance attorney&#8217;s opinion before collecting or using patient data for model development.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #7: &#8220;The FDA doesn&#8217;t regulate clinical AI unless it&#8217;s a diagnostic tool.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The FDA&#8217;s regulatory reach for AI in healthcare is broader than most founders realize. Software that provides patient-specific recommendations that influence treatment decisions, even if positioned as decision support rather than diagnosis, may meet the SaMD definition. The <\/span><a href=\"https:\/\/www.fda.gov\/regulatory-information\/search-fda-guidance-documents\/clinical-decision-support-software\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">FDA&#8217;s 2022 CDS guidance<\/span><\/a><span style=\"font-weight: 400;\"> and the 2023 AI action plan clarify the boundaries. Do not assume your clinical AI product is outside FDA jurisdiction without a regulatory attorney&#8217;s written opinion.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Wrong #8: &#8220;Model performance on the benchmark dataset means clinical performance.&#8221;<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Academic AI benchmark performance, USMLE pass rates, clinical NLP benchmarks, diagnostic accuracy on published datasets, does not translate directly to clinical performance in your specific use case on your specific patient population. Clinical validation in your intended clinical environment, with your intended user population, on your intended patient population, is required to characterize real-world performance. Benchmark performance is a starting point. 
It is not clinical evidence.<\/span><\/p>\n<h2><b>The Clinical AI Landscape in 2026, Four Categories, Four Different Builds<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Clinical AI in 2026 spans four meaningfully distinct product categories. Each has different regulatory requirements, different architectural patterns, different clinical risk profiles, and different go-to-market motions.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Category 1: Clinical AI Scribes (Ambient Documentation)<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><b>What it is:<\/b><span style=\"font-weight: 400;\"> AI that listens to provider-patient conversations, transcribes the audio, and generates structured clinical documentation, SOAP notes, visit summaries, after-visit summaries, that the provider reviews, edits, and signs.<\/span><\/p>\n<p><b>Clinical risk profile: <\/b><span style=\"font-weight: 400;\">Moderate. The primary risk is hallucination, content in the generated note that was not in the conversation, or content that was discussed but is incorrectly captured. The provider&#8217;s review step is the primary safety control.<\/span><\/p>\n<p><b>Regulatory profile: <\/b><span style=\"font-weight: 400;\">Generally outside FDA SaMD jurisdiction if the software does not independently diagnose or treat, it is generating documentation, not making clinical decisions. The FDA&#8217;s CDS guidance suggests ambient documentation software is not a medical device if it only documents what the provider said and does not add clinical interpretation. Confirm with a regulatory attorney.<\/span><\/p>\n<p><b>HIPAA profile: <\/b><span style=\"font-weight: 400;\">Significant. Session audio is ePHI. Transcripts are ePHI. Generated notes are ePHI. 
The entire pipeline, audio capture, transcription, LLM processing, note storage, requires HIPAA BAA coverage for every service involved.<\/span><\/p>\n<p><b>Market size and maturity: <\/b><span style=\"font-weight: 400;\">The most commercially mature category of clinical AI in 2026. Nuance DAX Copilot (Microsoft), Suki, Abridge, Ambience Healthcare, and DeepScribe are the established players. The market is competitive but not saturated, and health systems that have not yet deployed ambient documentation are actively evaluating solutions.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Category 2: Clinical Decision Support AI<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><b>What it is:<\/b><span style=\"font-weight: 400;\"> AI that analyzes patient clinical data, structured EHR data, lab results, imaging reports, clinical notes, and surfaces patient-specific recommendations, risk scores, care gaps, or clinical insights to providers or care managers.<\/span><\/p>\n<p><b>Clinical risk profile: <\/b><span style=\"font-weight: 400;\">High. The primary risks are: incorrect recommendations that lead to inappropriate treatment decisions, missed recommendations that lead to delayed treatment, and algorithmic bias that produces disparate recommendations across patient subgroups.<\/span><\/p>\n<p><b>Regulatory profile: <\/b><span style=\"font-weight: 400;\">Variable and FDA-sensitive. Clinical decision support that analyzes physiological signals, medical images, or in vitro diagnostic data is a medical device. CDS that surfaces evidence-based guideline recommendations based on structured clinical data may qualify for the CDS exemption, but the exemption analysis is complex and case-specific. Get a regulatory attorney&#8217;s written opinion.<\/span><\/p>\n<p><b>HIPAA profile: <\/b><span style=\"font-weight: 400;\">Significant. The patient clinical data analyzed by the AI is ePHI. 
Every service in the pipeline, EHR data extraction, AI inference, results storage, requires <\/span><a href=\"https:\/\/engineerbabu.com\/blog\/what-is-hipaa-baa-healthcare-apps-usa\/\"><span style=\"font-weight: 400;\">HIPAA BAA<\/span><\/a><span style=\"font-weight: 400;\"> coverage.<\/span><\/p>\n<p><b>Market opportunity: <\/b><span style=\"font-weight: 400;\">High. Clinical decision support AI is the highest-value category for health system enterprise sales, risk stratification, care gap identification, sepsis prediction, readmission prevention, medication safety. The market is large, the buyer is the health system CMO and CMIO, and the contract values are significant ($200K\u2013$2M\/year per health system).<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Category 3: Patient-Facing AI<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><b>What it is: <\/b><span style=\"font-weight: 400;\">AI that interacts directly with patients, symptom checkers, medication adherence chatbots, mental health support tools, care navigation assistants, chronic disease management companions.<\/span><\/p>\n<p><b>Clinical risk profile: <\/b><span style=\"font-weight: 400;\">Highest of the four categories. The patient is the direct user. The patient may be medically illiterate, may be in acute distress, may misinterpret AI outputs as clinical advice, and may take clinical action based on AI outputs without provider involvement. The consequences of incorrect AI outputs reach the patient directly, without a clinician as an intermediary safety layer.<\/span><\/p>\n<p><b>Regulatory profile: <\/b><span style=\"font-weight: 400;\">FDA-sensitive. Patient-facing AI that provides patient-specific health information that could be used to make clinical decisions, &#8220;your symptoms are consistent with X&#8221;, may be a medical device. The FDA has been actively developing its regulatory posture for direct-to-patient AI. 
Get a regulatory attorney&#8217;s opinion before building patient-facing clinical AI.<\/span><\/p>\n<p><b>HIPAA profile: <\/b><span style=\"font-weight: 400;\">Variable. Patient-facing AI in a consumer context (direct-to-consumer wellness app) may not be subject to HIPAA if the app has no Covered Entity relationship. Patient-facing AI embedded in a health plan&#8217;s member portal or a provider&#8217;s patient portal is subject to HIPAA. Understand your HIPAA applicability before building.<\/span><\/p>\n<p><b>Market opportunity: <\/b><span style=\"font-weight: 400;\">Large but structurally challenging. Patients are cost-sensitive (consumer willingness to pay is lower than B2B), the regulatory risk is higher than B2B clinical AI, and consumer health AI products face FTC scrutiny as well as potential FDA oversight. The highest-value patient-facing AI opportunity is embedded in payer or employer benefit offerings where the B2B buyer funds access.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Category 4: Administrative Healthcare AI<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><b>What it is: <\/b><span style=\"font-weight: 400;\">AI that automates <\/span><a href=\"https:\/\/engineerbabu.com\/blog\/how-ai-transforms-hospital-workflows\/\"><span style=\"font-weight: 400;\">healthcare administrative workflows<\/span><\/a><span style=\"font-weight: 400;\">, prior authorization, <\/span><a href=\"https:\/\/engineerbabu.com\/blog\/ai-revenue-cycle-management-software-usa\/\"><span style=\"font-weight: 400;\">revenue cycle management<\/span><\/a><span style=\"font-weight: 400;\">, claims processing, clinical documentation coding (ICD-10\/CPT), scheduling optimization, contract management.<\/span><\/p>\n<p><b>Clinical risk profile: <\/b><span style=\"font-weight: 400;\">Low. Administrative AI does not directly influence clinical decisions or patient treatment. 
The risk is financial and operational, incorrect coding, incorrect prior authorization, billing errors, rather than clinical.<\/span><\/p>\n<p><b>Regulatory profile: <\/b><span style=\"font-weight: 400;\">Generally outside FDA jurisdiction. Administrative AI that does not involve patient-specific clinical decision-making is typically not a medical device.<\/span><\/p>\n<p><b>HIPAA profile: <\/b><span style=\"font-weight: 400;\">Significant. Administrative healthcare AI processes claims data, billing data, and scheduling data, all of which may contain ePHI. HIPAA requirements apply fully.<\/span><\/p>\n<p><b>Market opportunity:<\/b><span style=\"font-weight: 400;\"> Underestimated and growing rapidly. Revenue cycle management AI, prior authorization automation, and clinical coding AI are among the fastest-growing segments of healthcare AI investment in 2026, because the ROI is measurable, the buyer is the CFO and revenue cycle director, and the regulatory complexity is lower than clinical AI.<\/span><\/p>\n<p><b>From a US founder call:<\/b><span style=\"font-weight: 400;\"> &#8220;I spent two years building a clinical decision support AI for sepsis prediction. Raised $12M. The health system sales cycle was 18 months. The IRB approval for the clinical validation study took 8 months. The EMR integration took 6 months per health system.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Two of my engineers left and started a company doing prior authorization automation AI. They raised $6M, had their first paying customer in four months, and are at $2M ARR while I am still closing my second health system. I am not saying clinical AI is wrong. 
I am saying administrative AI is a faster path to revenue for a founder who needs to show traction.&#8221;, Series A clinical AI founder, Atlanta.<\/span><\/p>\n<h2><b>The Regulatory Stack for Clinical AI, HIPAA, FDA SaMD, FTC, and State Laws<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>HIPAA, The Baseline<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">All patient clinical data processed by clinical AI is ePHI. The Privacy Rule governs what clinical data can be used for AI purposes. The Security Rule governs how clinical data must be protected in the AI pipeline. The Breach Notification Rule applies when ePHI is improperly accessed.<\/span><\/p>\n<p><b>The minimum necessary principle (\u00a7164.502(b)) applies to clinical AI:<\/b><span style=\"font-weight: 400;\"> your AI system should access only the patient data elements necessary for the specific AI function being performed. An AI that needs the patient&#8217;s medication list to perform drug interaction checking should not have access to the patient&#8217;s mental health records, SUD history, or HIV status.<\/span><\/p>\n<p><b>The authorization requirement for using ePHI for AI development:<\/b><span style=\"font-weight: 400;\"> using identifiable patient ePHI to train or fine-tune an AI model is a use of ePHI. Under HIPAA, ePHI can be used for treatment, payment, and healthcare operations without patient authorization, but AI model training may not fall neatly within &#8220;healthcare operations&#8221; depending on the context. The safest approach for AI training data: use de-identified data (de-identified under the Safe Harbor or Expert Determination method per \u00a7164.514), or obtain patient authorization, or use a Limited Data Set under a Data Use Agreement.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>FDA SaMD Framework<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><b>For clinical AI specifically:<\/b><span style=\"font-weight: 400;\"> the CDS exemption is narrower than most founders assume. 
A Predetermined Change Control Plan (PCCP) is essential for AI\/ML SaMD that will be retrained post-clearance. The FDA&#8217;s 2023 AI action plan signals increasing regulatory attention to clinical AI across all four categories.<\/span><\/p>\n<p><b>The FDA&#8217;s current enforcement posture for clinical AI that has not sought clearance: <\/b><span style=\"font-weight: 400;\">the FDA has generally focused enforcement on the highest-risk categories first: AI diagnostic tools for cancer, cardiac conditions, and ophthalmology. But the trend is toward broader enforcement, not narrower. Build with the assumption that your product will eventually require regulatory engagement with the FDA.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>FTC, For Consumer-Facing Clinical AI<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The FTC&#8217;s enforcement authority under Section 5 of the FTC Act (unfair or deceptive acts) and the FTC Health Breach Notification Rule apply to consumer-facing health AI that is not subject to HIPAA. 
<\/span><a href=\"https:\/\/www.lexology.com\/library\/detail.aspx?g=2c9dc6b1-25a4-41bb-ae34-973a6c871d7a\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">The FTC&#8217;s 2023 policy statement on AI<\/span><\/a><span style=\"font-weight: 400;\"> makes clear that the FTC considers health-related AI claims, including claims about AI accuracy, clinical evidence for AI recommendations, and data privacy, within its enforcement purview.<\/span><\/p>\n<p><b>For patient-facing AI: <\/b><span style=\"font-weight: 400;\">avoid overstating the clinical evidence for AI recommendations, avoid making accuracy claims that cannot be substantiated, and ensure your privacy policy accurately describes how patient data is used for AI purposes.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>State Laws, The Emerging Patchwork<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Several states have enacted or are considering AI-specific health regulations in 2026:<\/span><\/p>\n<p><b>California SB 1120 (2024): <\/b><span style=\"font-weight: 400;\">Requires health plan algorithms used for clinical decisions to be disclosed to patients and to be auditable for bias. Applies to payer-side clinical AI.<\/span><\/p>\n<p><b>Colorado SB 169 (2024): <\/b><span style=\"font-weight: 400;\">Regulates algorithmic decision-making in insurance, including health insurance. Applies to AI used in coverage determinations.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Several states with comprehensive privacy laws (California, Colorado, Virginia) include provisions on automated decision-making that apply to health-related AI.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The state AI regulatory landscape for healthcare is evolving rapidly. 
A multi-state regulatory analysis from a healthcare attorney is worth the investment before launching clinical AI in a multi-state market.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23111\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/04_scribe_pipeline.png\" alt=\"\" width=\"1920\" height=\"1040\" title=\"\"><\/p>\n<h2><b>The 16-Question Clinical AI Readiness Audit<\/b><\/h2>\n<ul>\n<li aria-level=\"1\"><b>Have you determined whether your clinical AI product is a medical device?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Regulatory attorney written opinion. Not a verbal discussion. A written opinion that documents the analysis.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>If your product is a medical device, what is your FDA regulatory pathway?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">510(k), De Novo, or PMA. If you do not have a pathway, you do not have a product launch timeline.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>If your product claims CDS exemption, is the exemption analysis documented in writing?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Document the four-part test analysis. Do not assert exemption without written documentation.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Does your clinical AI process ePHI?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Almost certainly yes. Identify every point in the pipeline where ePHI is processed: audio capture, transcription, EHR data ingestion, LLM inference, output storage, audit logging.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Does every service in your ePHI processing pipeline have a HIPAA BAA?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">List every service. 
Confirm BAA availability and execution for each.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Which LLM are you using and does it have a BAA covering your use case?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">AWS Bedrock (under AWS BAA), Azure OpenAI with HIPAA mode, OpenAI Enterprise BAA, Anthropic Enterprise BAA, each has different coverage, different scope, and different conditions. Know exactly which services are covered before clinical data flows through the API.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Have you designed hallucination guardrails for every clinical AI output that could influence a clinical decision?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Not &#8220;do you plan to add guardrails&#8221;, have you designed them. Source citation requirements, confidence framing, human review gates, clinical escalation pathways.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Is there a required human review step before any AI-generated clinical content becomes a clinical record?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Auto-accept is not appropriate for any AI output that will be stored as a clinical record. The provider must review and attest. Design this into the product from Day 1.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Have you designed the clinical escalation pathway for patient-facing AI?<\/b><\/li>\n<\/ul>\n<p><b>For <\/b><a href=\"https:\/\/engineerbabu.com\/blog\/mental-health-app-development\/\"><b>mental health AI<\/b><\/a><b>:<\/b><span style=\"font-weight: 400;\"> crisis detection and 988\/emergency services connection. 
For general patient-facing AI: escalation to a human care navigator or clinical staff when the AI cannot safely respond to a patient query.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>What is your training data governance policy?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">What data was used to train or fine-tune your model? Was it de-identified? Was it obtained under appropriate authorization? Is the training data pipeline documented?<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>What is your model versioning and audit trail?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Every model version deployed to production must be logged. Every clinical output must be attributed to the model version that generated it. If a hallucination is discovered in a clinical record, you must be able to identify which model version produced it and which other outputs that version may have affected.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>What is your model performance monitoring architecture?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">How do you detect performance degradation after deployment? What metrics do you track? What threshold triggers investigation or model update?<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Have you conducted a bias analysis on your AI model?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Performance across demographic subgroups: age, sex, race\/ethnicity, primary language, insurance status. Performance across clinical subgroups: disease severity, comorbidity burden, clinical site.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>What is your patient data use policy for AI improvement?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Can you use patient interaction data to improve your models? Under what authorization? With what patient notice? 
This must be documented in your privacy policy and implemented in your data governance framework.<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>Have you designed for minimum necessary data access?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Does your AI access only the patient data elements it needs for the specific function it performs? Or does it have broad access to the full patient record?<\/span><\/p>\n<ul>\n<li aria-level=\"1\"><b>What is your incident response plan for a clinical AI failure?<\/b><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A hallucinated clinical record is discovered. A patient-facing AI provides incorrect medical information. What is your response? Who is notified? How is the clinical impact assessed? How is the affected output corrected in the clinical record?<\/span><\/p>\n<h2><b>LLM Selection Under BAA, The 2026 Decision Tree<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">This is the decision every clinical AI founder must make explicitly before clinical data enters the AI pipeline. Here is the complete 2026 picture.<\/span><\/p>\n<p><b>The decision framework:<\/b><\/p>\n<p><b>Question 1: <\/b><span style=\"font-weight: 400;\">Does your AI feature process ePHI, individually identifiable health information from a patient?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If no, any LLM can be used. The BAA requirement does not apply to non-ePHI data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If yes, proceed to Question 2.<\/span><\/p>\n<p><b>Question 2: <\/b><span style=\"font-weight: 400;\">Is a BAA available from the LLM provider that covers your specific use case?<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>AWS Bedrock:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA: Covered under the standard AWS BAA for HIPAA-eligible services. AWS Bedrock is on the HIPAA-eligible services list as of 2026. 
Models available: Claude (Anthropic), Llama (Meta), Mistral, Titan (Amazon), and others through the Bedrock model catalog. Verdict: Our default recommendation for most clinical AI features processing ePHI.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The BAA situation is unambiguous: it is the standard AWS BAA you already have. No enterprise negotiation required. No separate legal review needed for the LLM specifically. Limitation: Model selection is limited to Bedrock&#8217;s catalog. Not every frontier model is available on Bedrock. For use cases where a specific model capability is required that Bedrock does not yet offer, evaluate alternatives.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Azure OpenAI with HIPAA mode:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA: Available under an Azure enterprise agreement with HIPAA configuration enabled. Azure OpenAI is on Microsoft&#8217;s list of HIPAA-covered services when configured correctly. Models available: GPT-4o, GPT-4 Turbo, GPT-4, GPT-3.5 Turbo, text embedding models. Verdict: Strong option for clinical AI products already in the Azure ecosystem, or for use cases requiring GPT-4o class performance where an Azure enterprise relationship is in place.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Requires Azure enterprise agreement and correct HIPAA configuration; confirm with your Microsoft account team that the specific services you use are in HIPAA mode. Limitation: Requires Azure enterprise agreement. Configuration complexity higher than AWS Bedrock. Microsoft&#8217;s covered services list must be validated specifically; not all Azure AI services are in the HIPAA-covered list.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>OpenAI API with Enterprise BAA<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA available under OpenAI Enterprise agreement. The BAA covers the API (chat completions, embeddings, fine-tuning) under the enterprise agreement terms. 
Models available: GPT-4o, GPT-4 Turbo, GPT-4, o1, o3, text embedding models. Verdict: Valid option for clinical AI teams with an OpenAI enterprise agreement and legal counsel review of the BAA scope.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The BAA scope (which specific services are covered, what data processing limitations apply, and what the data retention terms are) must be reviewed by your healthcare attorney before clinical ePHI flows through the API. Limitation: Requires enterprise agreement (not available on standard API plans). BAA scope review adds legal cost and time. OpenAI&#8217;s enterprise pricing is higher than standard API pricing.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Anthropic API with Enterprise BAA<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA: Available through direct enterprise negotiation with Anthropic&#8217;s enterprise team. Not yet available as a self-service agreement. Models available: Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku, Claude 3.5 Haiku. Verdict: Available for clinical AI use cases where Claude&#8217;s specific capabilities (long context, instruction following, clinical reasoning) justify the enterprise negotiation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Requires direct engagement with Anthropic&#8217;s enterprise team, not a self-service signup. Limitation: Enterprise negotiation timeline (4\u20138 weeks). The BAA terms must be reviewed by your healthcare attorney.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Google Cloud Vertex AI (Gemini models):<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA: Google Cloud&#8217;s HIPAA BAA covers Vertex AI services when using the Healthcare Data Engine and when the specific Vertex AI services are on Google&#8217;s covered services list. Confirm current coverage with Google Cloud&#8217;s healthcare team. 
Models available: Gemini 1.5 Pro, Gemini 1.5 Flash, Gemini 1.0 Pro, Med-PaLM 2 (available to select partners).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Verdict: Strong option for clinical AI products in the Google Cloud ecosystem. Med-PaLM 2, Google&#8217;s clinical-domain-specific foundation model, is available to select clinical AI partners and shows strong performance on clinical reasoning benchmarks. For products requiring clinical domain performance specifically, the Med-PaLM 2 access pathway is worth investigating.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Limitation: Google&#8217;s HIPAA-covered services list must be validated specifically for each Vertex AI service used. Med-PaLM 2 access is not self-service.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Self-hosted open-source models (Llama, Mistral, Mixtral, clinical fine-tunes):<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA: No separate BAA is required for the model, because you control the infrastructure. The model runs on your HIPAA-compliant cloud infrastructure under your existing AWS\/GCP\/Azure BAA. Models available: Llama 3 (70B, 8B), Mistral 7B, Mixtral 8x7B, BioMistral (clinical fine-tune), ClinicalCamel (clinical fine-tune), and others.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Verdict: The appropriate choice when: (1) the clinical data sensitivity is high enough that your legal team is not comfortable with any cloud LLM processing ePHI, (2) the patient population includes individuals with particularly sensitive conditions (SUD, HIV\/AIDS, psychiatric history) where data sovereignty concerns are paramount, or (3) enterprise health system customers require on-premises deployment without data leaving the health system&#8217;s infrastructure.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Limitation: Significant infrastructure overhead: GPU compute, model serving infrastructure, model management, and inference optimization. 
Open-source model performance is generally below cloud-hosted GPT-4o or Claude 3.5 Sonnet for complex clinical reasoning tasks. Self-hosted deployment is the right architectural choice for a minority of clinical AI use cases.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The BAA is not the only compliance consideration:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Having a BAA with the LLM provider covers the LLM API service. It does not cover:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The audio transcription service (if you are processing session audio): Amazon Transcribe Medical (covered under AWS BAA), Deepgram for Healthcare (BAA available), AssemblyAI (BAA available for healthcare). Confirm for your specific transcription service.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The vector database storing clinical embeddings (if you use RAG architecture): Pinecone (BAA available on enterprise plan), Weaviate (BAA available), Chroma (self-hosted avoids the BAA question).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The prompt logging and observability platform: LangSmith, Helicone, and similar LLM observability platforms may capture prompt content including ePHI in logs. Confirm BAA availability before enabling prompt logging for clinical AI features.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The error monitoring service (Sentry, Datadog), if it captures API request\/response payloads including clinical content.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Map every service in the ePHI processing pipeline. Confirm BAA coverage for each. 
This is the BAA registry for your clinical AI product, maintain it from Day 1.<\/span><\/p>\n<p><b>EB Index 2026:<\/b><span style=\"font-weight: 400;\"> Across 28 clinical AI products we have supported since 2022, the most common BAA gap discovered during SOC 2 readiness assessments is the LLM observability\/prompt logging platform.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Founders who enable prompt logging for debugging (a reasonable engineering decision) inadvertently route clinical prompt content, which often includes patient symptoms, medications, and clinical history, through an observability platform that has no HIPAA BAA. Disable prompt logging in production or confirm BAA availability for your observability platform before going live.<\/span><\/p>\n<h2><b>Clinical AI Scribes, Ambient Documentation Done Right<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Ambient clinical documentation is the most commercially deployed category of clinical AI in 2026. It is also the category with the most well-understood failure modes, which means the guardrail architecture is better established than for more novel clinical AI applications.<\/span><\/p>\n<h3><b>The ambient documentation pipeline:<\/b><\/h3>\n<p><b><i>Audio capture:<\/i><\/b> <span style=\"font-weight: 400;\">Session audio captured via device microphone (<\/span><a href=\"https:\/\/engineerbabu.com\/services\/mobile-app-development\"><span style=\"font-weight: 400;\">mobile app<\/span><\/a><span style=\"font-weight: 400;\">, desktop app, or dedicated hardware). The audio is ePHI from the moment it is captured, patient voice is individually identifiable and the content is protected health information. Encrypted in transit from the point of capture. Not stored permanently on the capture device.<\/span><\/p>\n<p><b><i>Transcription:<\/i><\/b> <span style=\"font-weight: 400;\">Audio converted to text in real time or near-real time. Transcription service must have a HIPAA BAA. 
Amazon Transcribe Medical (under AWS BAA) is our standard recommendation, it is trained on medical terminology and handles clinical speech patterns better than general-purpose transcription services. Speaker diarization (identifying who is speaking, provider vs. patient) significantly improves note structure.<\/span><\/p>\n<p><b><i>Clinical entity extraction:<\/i><\/b> <span style=\"font-weight: 400;\">NLP processing of the transcript to identify clinical entities, symptoms, diagnoses, medications, procedures, vitals. This step can be performed by the LLM or by a separate NLP layer before the LLM generates the structured note.<\/span><\/p>\n<p><b><i>SOAP note generation:<\/i><\/b><span style=\"font-weight: 400;\"> The LLM receives the transcript (and optionally the extracted clinical entities and the patient&#8217;s relevant EHR context) and generates a structured SOAP note. The note structure, what goes in Subjective, Objective, Assessment, Plan, is defined by a system prompt that encodes clinical documentation standards.<\/span><\/p>\n<p><b><i>Provider review and attestation:<\/i><\/b> <span style=\"font-weight: 400;\">The generated note is presented to the provider for review. The provider edits, approves, and signs. The signed note is pushed to the EHR. No AI-generated content enters the clinical record without provider review and attestation.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The SOAP note generation system prompt architecture:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The system prompt that governs AI scribe behavior is among the most clinically consequential engineering artifacts in the product. 
It defines:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">What clinical documentation standards the note must follow<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">What content should be in each SOAP section<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">What the AI should do when content is ambiguous or unclear in the transcript<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">What the AI must NOT do, fabricate information not in the transcript, add clinical interpretation beyond what was discussed, include medications not mentioned in the visit<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A poorly designed system prompt produces notes that are plausible-sounding but clinically incorrect. A well-designed system prompt produces notes that accurately capture the clinical encounter with appropriate uncertainty markers where the transcript was unclear.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Example system prompt constraint (the constraint that prevents the medication hallucination from the opening story):<\/span><\/p>\n<p><span style=\"font-weight: 400;\">You are generating a clinical SOAP note from a provider-patient conversation transcript.<\/span><\/p>\n<p><strong>CRITICAL RULES:<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">&#8211; Include ONLY medications explicitly mentioned in the transcript.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8211; If a medication is NOT clearly mentioned in the transcript, DO NOT include it in the Plan section.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8211; If a clinical finding is unclear or ambiguous in the transcript, use the phrase &#8220;Provider to clarify:&#8221; followed by the unclear element rather than inferring a specific value.<\/span><\/p>\n<p><span 
style=\"font-weight: 400;\">&#8211; Do NOT add clinical interpretation, diagnoses, or recommendations that were not explicitly discussed in the conversation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">&#8211; If the transcript is incomplete or inaudible for a portion of the encounter, note &#8220;[Inaudible section]&#8221; rather than inferring content.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">These constraints reduce the creative latitude of the LLM, which means the notes are less polished than an unconstrained LLM would produce. They also mean the notes are far less likely to contain hallucinated clinical content. Clinical safety beats polish. Ship the constrained version.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The provider review UX, designed for safety, not speed:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The note review interface is where hallucinations are caught or missed. Design it for clinical accuracy, not for fastest possible signing.<\/span><\/p>\n<p><b>What works:<\/b><\/p>\n<p><b>Section-by-section review: <\/b><span style=\"font-weight: 400;\">The SOAP note is presented one section at a time (S \u2192 O \u2192 A \u2192 P), with the relevant transcript excerpt visible alongside each section. The provider can see exactly what the AI used to generate each part of the note and can verify accuracy against the source conversation.<\/span><\/p>\n<p><b>Confidence indicators: <\/b><span style=\"font-weight: 400;\">Fields where the AI&#8217;s confidence in the transcription or the clinical entity extraction is below a threshold are highlighted for required provider attention. Not all content needs equal scrutiny, the provider&#8217;s attention should be directed to the highest-uncertainty content first.<\/span><\/p>\n<p><b>Medication-specific attestation: <\/b><span style=\"font-weight: 400;\">A separate attestation step for any medication that appears in the Plan section. 
The provider explicitly confirms that each medication was discussed and prescribed during the visit before it is included in the signed note.<\/span><\/p>\n<p><b>What does not work:<\/b><\/p>\n<p><span style=\"font-weight: 400;\">One-click sign with full note visible: Providers under time pressure will sign without reading. Do not design for this pathway.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Auto-populated clinical fields without source citation: Any clinical field populated from the transcript should show the transcript excerpt that supports it. Dark-pattern auto-population that hides the source encourages trust without verification.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The EHR integration for note delivery:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The reviewed and signed note must be delivered to the EHR in the correct location: the encounter note section for the specific visit date. Depending on the target EHR, note delivery requires one or more of the following:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">FHIR R4 DocumentReference resource creation (for FHIR-capable EHRs)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">HL7 v2 MDM message for legacy EHR note delivery<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Epic SMART on FHIR write-back for Epic customers (requires App Orchard write-back certification)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Athenahealth API note creation endpoint for Athena customers<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The note delivery integration is a separate engineering workstream from the AI scribe itself; scope it explicitly in discovery.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Audio retention policy:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 
400;\">Session audio is ePHI. Retaining session audio long-term creates ongoing HIPAA compliance obligations and creates a subpoena risk for the full audio of every clinical encounter ever recorded.<\/span><\/p>\n<p><b>Our recommendation: <\/b><span style=\"font-weight: 400;\">retain session audio for 72 hours after the note is signed, sufficient for any note correction that requires reference to the original recording. Delete automatically at 72 hours. The clinical record is the signed note, not the audio. The audio served its purpose in generating the note. Retain only what is clinically necessary.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is a data minimization decision that also reduces long-term liability and storage costs. Document the retention policy explicitly and implement automated deletion before going live.<\/span><\/p>\n<p><b>Compliance trap:<\/b><span style=\"font-weight: 400;\"> AI scribe products that use session audio to train or improve their transcription or note generation models must obtain appropriate authorization before using identified audio for this purpose.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">An opt-in consent for &#8220;audio may be used to improve our service&#8221; that is buried in the terms of service is insufficient for using clinical ePHI for model training. Design a clear, affirmative opt-in for model improvement data use, and build the technical mechanism to honor opt-out requests by excluding a provider&#8217;s session audio from training pipelines.<\/span><\/p>\n<h2><b>Clinical Decision Support AI, The Guardrails That Cannot Be Optional<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Clinical decision support AI analyzes patient clinical data and surfaces recommendations, risk scores, care gaps, or clinical insights to providers or care managers. 
The guardrail architecture for CDS AI is different from ambient documentation, the AI is not documenting what was said, it is generating new clinical insights from data analysis.<\/span><\/p>\n<h3><b>The source citation requirement:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Every CDS AI output that influences a clinical decision must be accompanied by the source data that generated it. A sepsis risk score of 78% is not clinically useful without knowing which clinical features drove that score, elevated lactate, tachycardia, hypotension, recent antibiotic administration. A care gap alert for a patient overdue for a mammogram is not clinically useful without knowing the source of the patient&#8217;s age and last mammogram date.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The source citation serves two clinical purposes: it allows the clinician to verify that the AI&#8217;s data inputs are correct (a common failure mode is AI reasoning from incorrect or outdated data in the EHR), and it allows the clinician to apply their clinical judgment to the recommendation rather than simply accepting the AI&#8217;s output.<\/span><\/p>\n<p><b>Technically: <\/b><span style=\"font-weight: 400;\">the source citation must trace from the AI output to the specific EHR data elements that generated it, with timestamps showing when that data was last updated. This requires that the inference pipeline capture data provenance, which data elements, from which records, at what version, alongside the inference output.<\/span><\/p>\n<h3><b>The uncertainty communication requirement:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">CDS AI outputs must communicate uncertainty to the clinician. 
A risk score without a confidence interval, a recommendation without a supporting evidence level, a clinical insight without a stated limitation, these create false certainty in clinical environments where uncertainty is clinically meaningful.<\/span><\/p>\n<p><b>Design patterns for uncertainty communication:<\/b><\/p>\n<p><span style=\"font-weight: 400;\">Confidence ranges: &#8220;Sepsis risk: 72\u201384% (moderate confidence)&#8221; rather than &#8220;Sepsis risk: 78%.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Evidence level attribution: &#8220;Based on SEPSIS-3 criteria applied to structured EHR data. Does not incorporate clinical gestalt or findings not documented in the EHR.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Data recency warning: &#8220;This recommendation is based on lab values last updated 6 hours ago. Clinical status may have changed.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Limitation disclosure: &#8220;This algorithm was validated on adult patients. Performance in pediatric patients has not been validated.&#8221;<\/span><\/p>\n<h3><b>The minimum necessary data access requirement:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">CDS AI must access only the patient data elements necessary for the specific clinical question being answered. A drug interaction checker does not need the patient&#8217;s psychiatric history. A diabetic retinopathy screening reminder does not need the patient&#8217;s HIV status or substance use history.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is both a HIPAA requirement (minimum necessary, \u00a7164.502(b)) and a clinical ethics requirement. 
Unnecessary access to sensitive clinical data creates risk: both the risk of data exposure and the risk of AI systems incorporating sensitive data in ways that the patient did not authorize and the clinician did not intend.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Implement a data access manifest for every CDS AI feature: a documented list of the specific data elements the feature accesses, why each element is needed, and a technical control that prevents the feature from accessing data elements outside the manifest.<\/span><\/p>\n<h3><b>The algorithmic bias monitoring requirement:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">CDS AI trained on historical clinical data inherits the biases in that data. Historical clinical data reflects historical disparities in healthcare: disparities in diagnostic rates, treatment rates, and outcomes across demographic groups. A model that predicts hospital readmission risk, if trained on data from a health system that historically underdiagnosed heart failure in Black women, will likely underpredict readmission risk for Black women.<\/span><\/p>\n<h3><b>Bias monitoring for CDS AI requires:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Subgroup performance analysis at launch: sensitivity, specificity, PPV, NPV, and AUC-ROC by age group, sex, race\/ethnicity, insurance type, and any other clinically relevant demographic subgroup.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Ongoing subgroup performance monitoring post-launch: tracking whether real-world performance differs by subgroup and whether performance gaps emerge over time.<\/span><\/p>\n<p><b>Remediation plan: <\/b><span style=\"font-weight: 400;\">documented processes for investigating and addressing subgroup performance disparities, including model retraining on more representative data, feature engineering to reduce disparate impact, or, in cases where disparate performance cannot be adequately addressed, limiting the tool&#8217;s deployment contexts 
to those where performance has been validated.<\/span><\/p>\n<h3><b>The clinician alert fatigue problem:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">CDS AI that generates too many alerts creates alert fatigue, clinicians learn to dismiss alerts reflexively because so many of them are false positives. A CDS system with a 90% false positive rate, even if the 10% true positives are clinically meaningful, will be ignored after the first week.<\/span><\/p>\n<h3><b>Alert design principles for CDS AI:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">High specificity over high sensitivity for actionable alerts: a few highly specific alerts that the clinician can trust are more clinically valuable than many low-specificity alerts that the clinician ignores.<\/span><\/p>\n<p><b>Alert fatigue monitoring:<\/b><span style=\"font-weight: 400;\"> track the rate at which clinicians dismiss alerts without acting on them. Alert dismissal rates above 70% are a signal of alert fatigue that requires tuning.<\/span><\/p>\n<p><b>Tiered alert urgency: <\/b><span style=\"font-weight: 400;\">not all CDS insights need to be alerts. Surface time-sensitive, high-confidence, actionable insights as alerts. Surface lower-urgency insights as background information in the patient chart view that the clinician can review at their discretion.<\/span><\/p>\n<h2><b>Patient-Facing AI, The Highest UX Stakes in Healthcare<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Patient-facing clinical AI is the category where the consequences of design errors are most direct. The patient is the end user. There is no clinician as an intermediary. Here is how to build it safely.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The clinical scope definition:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Before building patient-facing AI, define explicitly what the AI can and cannot do. 
The clinical scope definition is not just a product decision, it is a safety architecture decision.<\/span><\/p>\n<p><b>What is in scope: <\/b><span style=\"font-weight: 400;\">answering general health questions from a curated, clinically-reviewed knowledge base, reminding patients about scheduled appointments and medications, helping patients understand their clinical test results (with appropriate framing), connecting patients to appropriate care resources.<\/span><\/p>\n<p><b>What is never in scope: <\/b><span style=\"font-weight: 400;\">providing patient-specific diagnosis, recommending specific treatments or medications, advising patients to change or stop prescribed medications, providing emergency medical guidance in place of emergency services contact.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The scope definition must be implemented technically, not just stated in a disclaimer. The system prompt must explicitly prohibit out-of-scope responses. The AI must be configured to recognize when a patient question falls outside the defined scope and to respond with an appropriate referral to clinical care rather than attempting to answer.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The crisis escalation architecture:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For any patient-facing AI serving a general health population, crisis situations will occur. Patients will disclose suicidal ideation, express hopelessness, describe symptoms of acute medical emergencies, or indicate that they are in danger. The product must be designed for these moments.<\/span><\/p>\n<p><b>Crisis detection layer: <\/b><span style=\"font-weight: 400;\">keyword and semantic pattern matching for explicit and implied crisis language. 
Categories: suicidal ideation, self-harm, acute medical emergency (chest pain, stroke symptoms, severe allergic reaction), domestic violence or abuse, substance use crisis.<\/span><\/p>\n<p><b>Immediate crisis response: <\/b><span style=\"font-weight: 400;\">when crisis language is detected, by keyword match, semantic analysis, or both, the product immediately surfaces crisis resources to the patient. 988 Suicide and Crisis Lifeline for mental health crises. 911 for medical emergencies. National Domestic Violence Hotline for safety concerns. The crisis resources must be surfaced within the product, not as an external link that the patient may not follow, and must be accessible within two interactions from any screen.<\/span><\/p>\n<p><b>Escalation to human support:<\/b><span style=\"font-weight: 400;\"> for platforms with a human care navigator or clinical support function, a crisis detection event automatically creates an escalation notification to the clinical support team. The escalation is logged: what the patient said, when, what the AI responded, and when the clinical support team was notified.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Do not rely solely on AI to manage crisis situations. The AI&#8217;s role is crisis detection and immediate resource surfacing. A human clinical support pathway must exist for every patient-facing AI platform serving health-related queries.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Health literacy and accessibility design:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Patient health literacy in the US is lower than most clinical AI products assume. 
Approximately <\/span><a href=\"https:\/\/www.ohsu.edu\/center-for-ethics\/health-literacy-and-clear-communication-basics\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">36% of US adults<\/span><\/a><span style=\"font-weight: 400;\"> have basic or below-basic health literacy, meaning they have difficulty reading and understanding health information presented at a college reading level.<\/span><\/p>\n<p><b>Patient-facing AI must be designed for low health literacy:<\/b><span style=\"font-weight: 400;\"> use plain language (sixth-grade reading level or below for all patient-facing content), avoid medical jargon or define it immediately when used, use short sentences and simple vocabulary, confirm patient understanding at key points in the interaction.<\/span><\/p>\n<p><b>Language access: <\/b><span style=\"font-weight: 400;\">for patient populations that include non-English speakers, the AI must provide responses in the patient&#8217;s preferred language or provide clear pathways to human language interpretation services. An AI that responds only in English to a Spanish-speaking patient has created a health equity problem, not a health equity solution.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The &#8220;I don&#8217;t know&#8221; pattern:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Patient-facing AI must know the limits of its knowledge and communicate those limits clearly. When a patient asks a question that is outside the AI&#8217;s knowledge base, or that requires clinical judgment the AI cannot provide, the AI must say so, directly and without hedging, and direct the patient to appropriate clinical care.<\/span><\/p>\n<p><b>Bad:<\/b><span style=\"font-weight: 400;\"> &#8220;That&#8217;s a great question! 
While I can&#8217;t give medical advice, it sounds like your symptoms might be related to [plausible but unvalidated clinical speculation].&#8221;<\/span><\/p>\n<p><b>Good: <\/b><span style=\"font-weight: 400;\">&#8220;I&#8217;m not able to answer that question safely, your symptoms need a clinician&#8217;s evaluation. Please contact your doctor, or if this feels urgent, go to an urgent care center or emergency department.&#8221;<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The second response is less engaging. It is also safer. Build the &#8220;I don&#8217;t know&#8221; response as a first-class feature, not as an edge case handler.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Content moderation for patient-generated inputs:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Patients will provide inputs that are outside the expected scope, abuse, harassment, personally distressing content, content that indicates acute crisis. Patient-facing AI must have content moderation that:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Detects out-of-scope inputs and routes them appropriately (crisis \u2192 escalation, abuse \u2192 graceful response and topic change)<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Does not reinforce harmful patterns, AI responses to distressed patients must not validate harmful thoughts or behaviors, even implicitly<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Maintains appropriate clinical boundaries, the AI is not a therapist, it is not a friend, and its responses must not encourage the patient to treat it as a primary emotional support relationship.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23110\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/06_crisis_escalation.png\" alt=\"\" width=\"1920\" height=\"1120\" title=\"\"><\/p>\n<h2><b>AI for Administrative Healthcare Workflows, The Underrated Opportunity<\/b><\/h2>\n<p><span 
style=\"font-weight: 400;\">Administrative healthcare AI is the fastest path to clinical AI revenue for a founder who needs to show traction before a Series A. Here is why and what it takes to build it.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Prior authorization automation:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Prior authorization, the process by which a provider requests insurance approval before delivering specific services, is one of the most burdensome administrative processes in US healthcare. It consumes an estimated $35 billion per year in administrative costs across the US healthcare system. Providers spend an average of 13 hours per week on prior authorization. 94% of physicians report that prior authorization delays care, and 34% report it has led to a serious adverse event.<\/span><\/p>\n<p><b>AI for prior auth automation: <\/b><span style=\"font-weight: 400;\">an AI that reads the clinical criteria for a specific payer&#8217;s prior auth requirement, pulls the relevant clinical documentation from the patient&#8217;s EHR (diagnosis codes, clinical notes, lab results, imaging reports), and pre-populates the prior auth request with supporting documentation, reducing the provider&#8217;s time from 45 minutes to 5 minutes per request.<\/span><\/p>\n<p><b>Clinical risk profile:<\/b><span style=\"font-weight: 400;\"> low, the AI is assisting with documentation, not making clinical decisions. Regulatory profile: generally outside FDA jurisdiction. Revenue model: per-authorization fee ($3\u2013$10\/auth), or monthly subscription per provider. Time to first customer: 3\u20136 months.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Revenue cycle management AI:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Clinical documentation is the foundation of the revenue cycle, inaccurate or incomplete documentation leads to denied claims, delayed reimbursement, and lost revenue. 
AI that analyzes clinical documentation and suggests more specific or complete coding (for example, identifying that a note describes &#8220;diabetes with peripheral vascular disease&#8221; and should be coded E11.51 rather than the unspecified E11.9) reduces claim denial rates and increases revenue capture.<\/span><\/p>\n<p><b>Clinical risk profile: <\/b><span style=\"font-weight: 400;\">low, the AI is suggesting coding, not making clinical decisions. Regulatory profile: generally outside FDA jurisdiction. Revenue model: percentage of additional revenue captured (2\u20135%), or monthly subscription per provider. Time to first customer: 4\u20138 months.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Clinical documentation coding (ICD-10\/CPT):<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Computer-assisted coding (CAC), AI that reads clinical notes and suggests appropriate ICD-10 diagnosis codes and CPT procedure codes, is an established market with significant AI-driven improvement opportunity. Traditional CAC systems use rule-based engines. 
LLM-based CAC can handle the full complexity of clinical language, including free-text notes, with significantly higher accuracy.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is an area where <\/span><a href=\"https:\/\/engineerbabu.com\/\"><span style=\"font-weight: 400;\">EngineerBabu&#8217;s<\/span><\/a><span style=\"font-weight: 400;\"> CMMI Level 5 credential and healthcare experience create a meaningful differentiation from pure-AI startups without healthcare process expertise.<\/span><\/p>\n<h2><b>The Clinical AI Data Architecture, Training, Fine-Tuning, and Inference Under HIPAA<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The data governance hierarchy for clinical AI:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Level 1, De-identified data (preferred for model development): Clinical data de-identified under the HIPAA Safe Harbor method (\u00a7164.514(b)(2)), which removes all 18 HIPAA identifiers, or the Expert Determination method (\u00a7164.514(b)(1)), which relies on a statistical certification that re-identification risk is very small. De-identified data is not ePHI and is not subject to HIPAA restrictions on use.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is the preferred data governance level for model training and fine-tuning. De-identify training data before using it for model development whenever possible.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Level 2, Limited Data Set under Data Use Agreement: A Limited Data Set (LDS) retains some data elements that Safe Harbor de-identification would require removing (dates and geographic data below state level, such as city and ZIP code) but has the 16 most direct identifiers removed. 
LDS use requires a Data Use Agreement with the data source and limits use to research, public health, or healthcare operations purposes.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Level 3, Identified ePHI under authorization or treatment\/operations: Using identified ePHI for AI development requires either patient authorization or a determination that the use falls within treatment, payment, or healthcare operations under HIPAA. AI model training is not clearly within &#8220;healthcare operations&#8221; without careful legal analysis. Using identified ePHI for <\/span><a href=\"https:\/\/engineerbabu.com\/services\/ai-development\"><span style=\"font-weight: 400;\">AI development<\/span><\/a><span style=\"font-weight: 400;\"> without patient authorization and without a clear healthcare operations basis is a HIPAA risk.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The RAG (Retrieval-Augmented Generation) architecture for clinical AI:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For clinical AI features that need patient-specific clinical context, CDS that reasons about a specific patient&#8217;s medical history, ambient scribes that access the patient&#8217;s medication list to contextualize the visit conversation, RAG is the architectural pattern that provides real-time clinical context without fine-tuning the LLM on ePHI.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">RAG pipeline for clinical AI:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Patient-specific clinical data is retrieved from the EHR via FHIR API at inference time<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Retrieved data is formatted as clinical context documents<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Context documents are included in the LLM prompt alongside the user query<\/span><\/li>\n<li 
style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The LLM generates a response grounded in the retrieved patient-specific context<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The response is returned to the provider or patient<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Retrieved data and generated response are logged with an ePHI-appropriate audit trail<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">The RAG approach provides patient-specific clinical context at inference time without requiring ePHI to be embedded in model weights through fine-tuning. This is a significantly simpler HIPAA compliance posture than fine-tuning on ePHI.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The vector database for clinical RAG:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Clinical RAG architectures often use a vector database to store embeddings of reference documents (medical literature, clinical guidelines, formulary data, institutional protocols) that the AI can retrieve as context alongside patient-specific EHR data.<\/span><\/p>\n<p><b>HIPAA consideration:<\/b><span style=\"font-weight: 400;\"> if the vector database stores embeddings of patient-specific ePHI documents (clinical notes, lab results, imaging reports), the vector database is handling ePHI and requires BAA coverage.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Vector databases with healthcare BAA availability:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Pinecone: BAA available on enterprise plan<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Weaviate: BAA available for enterprise deployments<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Chroma: self-hosted deployment avoids the BAA 
question<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">If the vector database stores only non-patient-specific content (medical literature, guidelines, formulary data), it is not handling ePHI and does not require a BAA.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Model versioning and audit trail for clinical AI:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Every model version that is deployed to production must be logged in your model registry with:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Model version identifier<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Model architecture and parameters<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Training data description (what data, what de-identification method, what date range)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Performance metrics on the validation dataset (overall and by subgroup)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Deployment date<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Deprecation date (when the model version was retired from production)<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Every clinical output must be attributable to the model version that generated it. 
If a hallucination is discovered in a clinical record six months post-deployment, you must be able to identify: which model version generated that output, when it was deployed, what its performance characteristics were, and whether other outputs from that model version may have the same failure mode.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is not just a regulatory requirement, it is a clinical liability management requirement.<\/span><\/p>\n<h2><b>Hallucination Guardrails, The Engineering Architecture That Protects Patients<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Hallucination is the defining reliability challenge of LLM-based clinical AI. Here is the engineering architecture that addresses it.<\/span><\/p>\n<h3><b>Layer 1: Prompt engineering constraints<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">The system prompt must explicitly prohibit hallucination-prone behaviors:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Prohibit adding clinical information not present in the source data or conversation<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Prohibit speculating about diagnoses or treatments not explicitly discussed<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Require uncertainty markers when the source is ambiguous or incomplete<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Require the &#8220;I don&#8217;t know&#8221; response when the question falls outside validated knowledge<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Prohibit fabricating references, statistics, or clinical guidelines<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Prompt constraints reduce hallucination frequency. They do not eliminate it. 
The remaining guardrail layers are all still necessary.<\/span><\/p>\n<h3><b>Layer 2: Grounding validation<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">For clinical AI that generates outputs based on specific source data (ambient documentation based on a transcript, CDS based on EHR data), a grounding validation step verifies that each claim in the AI output is supported by specific source content.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Implementation: after the LLM generates the clinical output, a validation pass extracts specific clinical claims from the output and attempts to trace each claim to the source data. Claims that cannot be traced to source data are flagged as ungrounded and then either removed from the output, highlighted for mandatory provider review, or used to trigger regeneration of the output with a more constrained prompt.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is the architectural layer that would have caught the hallucinated NSAID prescription from the opening story: the medication was not in the transcript, and a grounding validation would have identified it as an ungrounded claim.<\/span><\/p>\n<h3><b>Layer 3: Clinical entity validation<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Clinical entities in the AI output (medications, diagnoses, procedures, lab values) are validated against clinical reference databases:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Medications: validated against RxNorm (is this a real medication with the stated dosage and route?)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Diagnoses: validated against ICD-10-CM (is this a real diagnosis code?)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Drug interactions: flagged if the generated note includes a medication that interacts with a medication already in the patient&#8217;s EHR<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Allergies: flagged if the generated note includes a medication to which the patient has a documented allergy<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The allergy check alone would have prevented the harm scenario from the opening story. 
It must be automated, not dependent on the provider catching it in review.<\/span><\/p>\n<h3><b>Layer 4: Confidence scoring<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">The LLM is prompted to generate a confidence score or uncertainty level for each clinical claim in its output. Low-confidence claims are surfaced to the provider with visual highlighting and a mandatory review flag.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Implementation: structured output from the LLM (using JSON schema output mode) that includes both the clinical content and a confidence assessment for each content element. Claims below the confidence threshold are highlighted in the review interface.<\/span><\/p>\n<h3><b>Layer 5: Human review gate<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Every clinical AI output that will enter a clinical record must pass through a provider review gate before being stored as an official record. The review gate design:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Presents the AI output with source citations visible<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Highlights ungrounded claims, low-confidence claims, and flagged clinical entities<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Requires explicit provider action on each flagged element before the note can be signed<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Records the provider&#8217;s review and attestation with timestamp and provider identity in the audit log<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The human review gate is the last line of defense against hallucinated clinical content. Every previous layer reduces the frequency of errors that reach the review gate. 
The review gate catches what gets through.<\/span><\/p>\n<h3><b>Layer 6: Post-deployment hallucination monitoring<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">After deployment, monitor for hallucination patterns in clinical outputs:<\/span><\/p>\n<p><b>Provider edit rate by field type: <\/b><span style=\"font-weight: 400;\">if providers consistently edit a specific field (e.g., the medication list in the Plan section), that field&#8217;s generation logic has a systematic error that needs investigation.<\/span><\/p>\n<p><b>Provider rejection rate: <\/b><span style=\"font-weight: 400;\">if providers frequently delete entire sections of AI-generated notes, the generation quality for those sections needs improvement.<\/span><\/p>\n<p><b>Clinical entity anomaly detection: <\/b><span style=\"font-weight: 400;\">flag AI outputs that include clinical entities (medications, diagnoses) that are statistically unusual for the visit type, a weight loss medication in a pediatric visit note, a cardiac medication in a well-child visit note.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Hallucination monitoring requires that you collect and analyze data on provider edits to AI-generated content. This data is clinically sensitive, it contains both the AI output and the provider&#8217;s corrections. 
Handle it under the same HIPAA framework as any clinical data.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23113\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/02_guardrail_layers.png\" alt=\"\" width=\"1920\" height=\"1240\" title=\"\"><\/p>\n<h2><b>The Real Cost Stack for Clinical AI Development in 2026<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Engineering (what you pay us):<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Clinical AI Scribe MVP (ambient documentation, SOAP note generation, EHR integration for one EHR, provider review workflow): $140K\u2013$220K \/ 14\u201320 weeks<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Clinical Decision Support AI MVP (EHR data ingestion, risk stratification model, CDS alert delivery, bias monitoring infrastructure): $180K\u2013$290K \/ 18\u201326 weeks<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Patient-Facing AI MVP (symptom checker or care navigation, crisis escalation, health literacy design, one language): $110K\u2013$185K \/ 12\u201318 weeks<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Administrative AI MVP (prior auth automation or revenue cycle coding, EHR integration, provider workflow): $90K\u2013$155K \/ 10\u201316 weeks<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Dedicated clinical AI pod post-MVP: $28K\u2013$46K\/month<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>LLM infrastructure costs:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">AWS Bedrock, Claude 3.5 Sonnet: $3.00\/million input tokens, $15.00\/million output tokens (2026 pricing, confirm current rates)<\/span><\/p>\n<p><span style=\"font-weight: 400;\">At 500 ambient documentation sessions\/day, averaging 4,000 input tokens and 1,500 output tokens per session:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Input: 500 \u00d7 4,000 = 2,000,000 tokens\/day = $6.00\/day = $2,190\/year<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Output: 500 \u00d7 1,500 = 750,000 tokens\/day = $11.25\/day = $4,106\/year<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Total LLM cost at 500 sessions\/day: approximately $6,300\/year, well within reasonable product economics<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">At 5,000 sessions\/day: approximately $63,000\/year, needs margin management but commercially viable<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Amazon Transcribe Medical: $0.0086\/second of audio (confirm current rates). At 500 sessions\/day \u00d7 15 minutes average: 500 \u00d7 900 = 450,000 seconds\/day = $3,870\/day, or roughly $1.4 million\/year. This is expensive at scale: audio transcription, not LLM inference, is the dominant infrastructure cost for ambient documentation at volume. Optimize session length and consider batched transcription for non-real-time use cases.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>HIPAA compliance infrastructure:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">BAA enterprise agreements (OpenAI, Anthropic, where applicable): $15K\u2013$40K\/year enterprise plan<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Vector database with healthcare BAA (Pinecone enterprise): $2K\u2013$8K\/month depending on index size and query volume<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Clinical NLP tooling (Amazon Comprehend Medical, Azure Text Analytics for Health): $0.01\u2013$0.05\/API call<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Clinical validation:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">IRB submission and approval: $3K\u2013$10K (legal and administrative)<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Clinical study conduct for CDS AI (site costs, data collection, monitoring): $80K\u2013$250K depending on study design<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Human factors validation study for patient-facing AI: $25K\u2013$60K<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Regulatory (if FDA SaMD pathway is required):<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For ambient documentation and administrative AI that qualifies for CDS exemption or is outside FDA jurisdiction: regulatory costs are limited to the classification opinion ($5K\u2013$15K) and ongoing compliance monitoring.<\/span><\/p>\n<p><b>EB Index 2026:<\/b><span style=\"font-weight: 400;\"> The median 
total first-year cost for a clinical AI scribe product, engineering, LLM infrastructure, HIPAA compliance, clinical validation, and EHR integration for two health system customers, was $347,000. The median time from project start to first paying health system customer was 11 months. The largest timeline driver was health system clinical governance review at a median of 14 weeks per health system.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23103\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/05_build_cost.png\" alt=\"\" width=\"1920\" height=\"1160\" title=\"\"><\/p>\n<h2><b>The 14-Week Clinical AI MVP Sprint<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">This timeline covers a clinical AI scribe <\/span><a href=\"https:\/\/engineerbabu.com\/services\/mvp-development\"><span style=\"font-weight: 400;\">MVP development<\/span><\/a><span style=\"font-weight: 400;\">, the most common clinical AI category for a first build. Adjust for CDS AI (longer clinical validation workstream), patient-facing AI (longer health literacy and crisis UX workstream), or administrative AI (shorter regulatory workstream).<\/span><\/p>\n<h3><b>Week 1: Discovery, Regulatory Scoping, and Data Architecture Design<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Intended use statement written. Regulatory attorney CDS exemption analysis initiated. BAA mapping: every service in the ePHI pipeline identified, BAA availability confirmed for each. LLM selection finalized (AWS Bedrock recommended). Transcription service selected (Amazon Transcribe Medical). Training data governance policy documented. Minimum necessary data access manifest designed for each AI feature.<\/span><\/p>\n<h3><b>Week 2: BAA Execution and Prompt Architecture<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">AWS enterprise agreement confirmed (if not already in place). 
LLM system prompt architecture designed, clinical constraints documented, hallucination prevention rules written, uncertainty marker requirements specified. Grounding validation approach selected. Clinical entity validation reference databases identified (RxNorm for medication validation, allergy check integration designed). Audio retention policy documented (72-hour default).<\/span><\/p>\n<h3><b>Week 3: Infrastructure Provisioning and Pipeline Foundation<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">HIPAA-eligible cloud infrastructure provisioned. Audio capture infrastructure built, encrypted in transit from capture device, not stored on device. Transcription pipeline built, Amazon Transcribe Medical integrated, speaker diarization configured. SBOM generation in CI\/CD pipeline. Audit trail service deployed, every ePHI access, every model inference, every provider action logged.<\/span><\/p>\n<h3><b>Week 4: LLM Integration and Prompt Implementation<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">AWS Bedrock integration. System prompt implemented with clinical constraints. Structured output schema designed, SOAP note JSON schema with confidence scores per field. Grounding validation layer implemented, claim extraction from output, source tracing against transcript. Low-confidence field flagging logic implemented.<\/span><\/p>\n<h3><b>Week 5: Clinical Entity Validation Layer<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">RxNorm medication validation integration. ICD-10 diagnosis code validation. Drug-drug interaction check against patient medication list (requires EHR integration or medication list input). Allergy check against patient allergy list (requires EHR integration or allergy list input). 
Clinical entity anomaly detection (statistically unusual entities flagged for review).<\/span><\/p>\n<h3><b>Week 6: Provider Review Interface<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">SOAP note review interface with source citations, transcript excerpt shown alongside each note section. Confidence indicators on low-confidence fields. Medication attestation workflow, explicit per-medication confirmation required. Mandatory review flags on ungrounded claims and clinical entity validation failures. Audit log of provider review actions: what was reviewed, what was edited, when the note was signed.<\/span><\/p>\n<h3><b>Week 7: EHR Integration for Note Delivery<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">FHIR R4 DocumentReference write-back (if the target EHR supports FHIR write). HL7 v2 MDM message for legacy EHR note delivery. Epic write-back (if Epic is the target EHR, requires App Orchard write-back certification initiated in Week 1). Athenahealth API note creation. Note delivery confirmation and error handling.<\/span><\/p>\n<h3><b>Week 8: Mobile Capture Application<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Provider-facing mobile app (iOS, Android) for session audio capture. Session start\/stop controls. Pre-session setup (patient selection, visit type selection). Session audio encrypted in transit from app to transcription service. No audio stored on device after transmission.<\/span><\/p>\n<h3><b>Week 9: Hallucination Monitoring Infrastructure<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Provider edit tracking, every field edit logged with original AI value and provider correction value. Edit rate analysis by field type and by clinical context. Alert for statistically elevated edit rates on specific fields. 
Post-deployment hallucination monitoring dashboard for engineering team.<\/span><\/p>\n<h3><b>Week 10: Bias Analysis and Clinical Validation Setup<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Demographic subgroup analysis design for clinical validation. IRB submission prepared (if clinical study requires IRB approval). Clinical validation study protocol drafted. Performance benchmarking on de-identified test dataset, overall performance and by subgroup. Any subgroup performance gaps identified and documented.<\/span><\/p>\n<h3><b>Week 11: Internal QA and Clinical Advisor Review<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Full test suite including hallucination edge cases, transcripts with absent medications, ambiguous diagnoses, inaudible sections, multiple providers speaking. Clinical advisor review of 50 AI-generated notes against reference notes from the same transcripts. Clinical advisor sign-off on prompt architecture and guardrail design. HIPAA compliance review, BAA registry, audit trail completeness, data flow diagram.<\/span><\/p>\n<h3><b>Week 12: Security Review and Penetration Testing<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Third-party penetration test scoped and initiated. SAST findings reviewed and addressed. Prompt injection testing, attempts to override the system prompt through adversarial patient or provider inputs. Audio data security review, confirm no audio retained beyond 72-hour retention window. SBOM reviewed against NVD for known vulnerabilities.<\/span><\/p>\n<h3><b>Week 13: Pilot Deployment and Clinical Governance<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">First pilot health system: clinical governance submission prepared. Health system IT onboarding guide written. SMART on FHIR credentials issued for the pilot health system. Pilot deployment to 5\u201310 providers. Daily provider feedback collection during pilot. 
Any critical issues identified in pilot: fix within 48 hours, re-pilot before broader deployment.<\/span><\/p>\n<h3><b>Week 14: Pilot Review and Commercial Launch Preparation<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">Pilot results reviewed: provider NPS, documentation time reduction, edit rate by field, hallucination incident count, provider-reported issues. Hallucination monitoring findings reviewed: any systemic hallucination patterns requiring prompt or guardrail updates. Clinical governance approval from pilot health system. Commercial launch preparation: customer onboarding documentation, pricing, customer success model.<\/span><\/p>\n<h2><b>Human-in-the-Loop Design, Why Every Clinical AI Feature Needs It<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Human-in-the-loop (HITL) design is the architectural principle that ensures a human expert reviews and confirms AI outputs before those outputs influence consequential decisions. In clinical AI, HITL is not optional, it is the primary safety mechanism.<\/span><\/p>\n<h3><b>The three levels of HITL for clinical AI:<\/b><\/h3>\n<p><b><i>Level 1, Required review before action:<\/i><\/b> <span style=\"font-weight: 400;\">The AI generates an output. A human must review and explicitly approve the output before it is acted upon. No AI output can become a clinical record, a patient communication, a clinical recommendation, or a treatment decision without human review and approval.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is the minimum HITL requirement for any clinical AI that generates content that could influence patient care. The ambient documentation example, provider must review and sign before the note enters the EHR, is Level 1 HITL.<\/span><\/p>\n<p><b><i>Level 2, Confidence-gated automatic action:<\/i><\/b> <span style=\"font-weight: 400;\">The AI generates an output with a confidence score. High-confidence outputs (above a validated threshold) are acted upon automatically. 
Low-confidence outputs require human review before action.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This level is appropriate for administrative AI with lower clinical stakes, high-confidence prior auth determinations that meet a validated threshold, high-confidence coding suggestions that match a validated clinical pattern. It is not appropriate for clinical AI that directly influences patient care decisions.<\/span><\/p>\n<p><b><i>Level 3, Supervised automation:<\/i><\/b> <span style=\"font-weight: 400;\">The AI acts autonomously within a constrained domain. A human supervisor reviews AI actions in aggregate and can intervene when patterns suggest systematic errors. The AI does not stop and wait for human review of individual actions.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This level is appropriate for very low-stakes administrative tasks, appointment reminder scheduling, routine refill requests for stable chronic condition patients. It is not appropriate for any clinical AI that generates clinical content or clinical recommendations.<\/span><\/p>\n<h3><b>HITL and workflow design:<\/b><\/h3>\n<p><span style=\"font-weight: 400;\">HITL is not just a safety requirement, it is a product design challenge. A HITL design that adds too much friction will be circumvented by clinicians under time pressure. A HITL design that adds too little friction will not catch clinical AI errors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The design goal is the minimum friction required to ensure meaningful human review. 
Meaningful review means the clinician is actually reading and evaluating the AI output, not clicking through it reflexively.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Design strategies for meaningful review without excessive friction:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Focused attention direction: the review interface highlights the content that most needs review, ungrounded claims, low-confidence fields, clinical entity flags, so the clinician&#8217;s attention is directed to the highest-risk content first.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Contextual source display: showing the source (transcript excerpt, EHR data element) alongside the AI output makes review faster and more accurate than requiring the clinician to recall or look up the source independently.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Progressive attestation: break the review into stages that match clinical reasoning, chief complaint, then history, then assessment, then plan, rather than presenting the full note at once.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Time feedback: show the clinician how long the review has taken and how it compares to their previous reviews. 
Clinicians who are reviewing very quickly relative to their typical pace may be clicking through without reading.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23108\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/10_hitl_levels.png\" alt=\"\" width=\"1920\" height=\"1080\" title=\"\"><\/p>\n<h2><b>Post-Launch: Model Monitoring, Drift Detection, and Retraining Under HIPAA<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Model drift, the clinical AI failure mode that happens slowly:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">A clinical AI model that performs well at launch may perform worse six months later, not because anything changed in the product, but because the clinical environment changed. New ICD-10 codes, new medications, new clinical protocols, seasonal illness patterns, changes in patient documentation behavior, all of these shift the distribution of clinical data the model encounters in ways that may degrade performance.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Model drift detection requires:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Continuous performance monitoring: metrics collected in real time from production, provider edit rate for ambient documentation, alert accuracy for CDS AI, patient satisfaction and escalation rate for patient-facing AI.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Drift detection algorithms: statistical tests that compare the current distribution of model inputs and outputs to the baseline distribution from validation. Significant distribution shift triggers investigation and potential retraining.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Canary deployments: when a retrained model is deployed, it is first deployed to a small percentage of production traffic (5\u201310%). Performance is monitored in the canary cohort before full deployment. 
If performance is worse than the baseline model in the canary cohort, the deployment is rolled back before full production exposure.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Retraining under HIPAA:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">If clinical AI retraining uses production clinical data (provider edits to AI-generated notes, patient interactions with patient-facing AI, EHR data from clinical encounters), that data may be ePHI. Retraining on ePHI requires the same data governance framework as initial training: de-identification where possible, appropriate authorization or a data use agreement (DUA) where de-identification is not possible, and documented data lineage.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Provider edits to AI-generated notes are particularly valuable for retraining ambient documentation models. A provider&#8217;s correction of a hallucinated medication is a labeled training example: the AI output was wrong, and the provider correction was right. This is high-quality signal for model improvement.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To use provider edits for retraining: obtain appropriate authorization (provider consent to use their edits for model improvement, which should be in the provider agreement), de-identify the associated clinical content where possible, and document the data use in your privacy policy.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>The PCCP for clinical AI that requires FDA clearance:<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">For clinical AI products that require FDA clearance, the Predetermined Change Control Plan (PCCP) is the mechanism for pre-approving algorithm updates. 
Include in the PCCP:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Retraining on expanded clinical datasets with the same clinical indication<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Retraining on datasets from new clinical sites or geographic regions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Updating preprocessing pipelines that do not change the model architecture<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Adjusting decision thresholds within validated performance ranges<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The PCCP makes algorithm improvement operationally feasible without requiring a new FDA submission for every update. It must be negotiated with the FDA as part of the original submission, so include every type of change you might want to make. The cost of adding a PCCP element after clearance is a new submission. 
The cost of including it in the original PCCP is the regulatory attorney&#8217;s time to write it.<\/span><\/p>\n<h2><b>When an Indian Engineering Partner Is Wrong for Your Clinical AI Build<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">An Indian engineering partner is the wrong call for your clinical AI product if your clinical AI development involves daily collaboration with clinical advisors who are embedded in the engineering process and available only during US clinical hours, where the clinical review cadence is synchronous and spontaneous in a way that the overlap window cannot accommodate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">It is also the wrong call if your health system customers require that all AI model training and inference occur on-premises within the health system&#8217;s own infrastructure. Some academic medical centers and federal health systems have this requirement, and it means the engineering team must be able to work within the health system&#8217;s network, which may restrict offshore access.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The same applies if your clinical AI product is in an FDA SaMD category where your regulatory attorney has advised that all development personnel have formal FDA-regulated environment training and documentation. This is an uncommon requirement, but one that some high-risk device developers impose.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Finally, it is the wrong call if your clinical AI requires real-time collaboration between engineers and clinical staff responding to patient interactions, for example, a clinical AI product with a human escalation function where the engineering team must be responsive to real-time clinical escalations.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For the vast majority of clinical AI founders building ambient documentation tools, CDS AI, patient-facing health navigation, or administrative healthcare AI, the structured collaboration model is viable.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Clinical advisors 
embedded on the US side, engineering team in Indore with defined US-overlap hours, model training on de-identified data in HIPAA-compliant cloud infrastructure. We have built clinical AI products from Indore that are deployed in US health systems today.<\/span><\/p>\n<h2><b>The Clinical AI Product Scorecard\u2122<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Score each row 0 (absent), 1 (partial), or 2 (fully present), then multiply by the row&#8217;s weight. Maximum score: 76.<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><b>#<\/b><\/td>\n<td><b>Criterion<\/b><\/td>\n<td><b>Weight<\/b><\/td>\n<td><b>Your Score<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">1<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Regulatory attorney written opinion on FDA SaMD classification (or CDS exemption documentation)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">2<\/span><\/td>\n<td><span style=\"font-weight: 400;\">BAA confirmed for every service in the ePHI processing pipeline<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">3<\/span><\/td>\n<td><span style=\"font-weight: 400;\">LLM provider BAA executed and scope reviewed by healthcare attorney<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">4<\/span><\/td>\n<td><span style=\"font-weight: 400;\">System prompt with explicit hallucination prevention constraints<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">5<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Grounding validation layer (AI output claims 
traced to source data)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">6<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Clinical entity validation (medication RxNorm check, allergy check, ICD-10 check)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Required human review gate before any AI output enters clinical record<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">8<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Source citations displayed alongside AI output in review interface<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">9<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Confidence indicators on low-confidence AI output fields<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">10<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Crisis escalation pathway for patient-facing AI<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">11<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Bias analysis by demographic subgroup before clinical deployment<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 
400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">12<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Model version registry with performance metrics and deployment dates<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">13<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Every clinical AI output attributable to specific model version<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">14<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Provider edit rate monitoring post-deployment<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">15<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Audio retention policy documented and automated deletion implemented (if audio captured)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">16<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Prompt logging disabled in production or prompt logging service has HIPAA BAA<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">17<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Training data governance policy (de-identification method, authorization basis)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">18<\/span><\/td>\n<td><span style=\"font-weight: 
400;\">Minimum necessary data access manifest per AI feature<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">19<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Clinical advisor review of AI feature outputs before production deployment<\/span><\/td>\n<td><span style=\"font-weight: 400;\">2\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/4<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">20<\/span><\/td>\n<td><span style=\"font-weight: 400;\">SBOM generated in CI\/CD pipeline<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">21<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Incident response plan for clinical AI failure (hallucination in clinical record)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">22<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Retraining data governance documented (if using production data for retraining)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">23<\/span><\/td>\n<td><span style=\"font-weight: 400;\">PCCP included in FDA submission (if AI\/ML SaMD requiring clearance)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">24<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Real-world performance monitoring infrastructure operational at launch<\/span><\/td>\n<td><span style=\"font-weight: 
400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400;\">25<\/span><\/td>\n<td><span style=\"font-weight: 400;\">Health literacy design for patient-facing AI (sixth-grade reading level or below)<\/span><\/td>\n<td><span style=\"font-weight: 400;\">1\u00d7<\/span><\/td>\n<td><span style=\"font-weight: 400;\">\/2<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><b>Score interpretation:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">55\u201376: Strong clinical AI safety and compliance posture; ready for health system deployment and enterprise sales<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">40\u201354: Proceed once identified gaps are remediated; the patient safety 2\u00d7 items are non-negotiable<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Under 40: Significant patient safety and regulatory exposure; do not deploy clinically until gaps are closed<\/span><\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-23109\" src=\"https:\/\/engineerbabu.com\/blog\/wp-content\/uploads\/2026\/05\/09_scorecard-1.png\" alt=\"\" width=\"1920\" height=\"1120\" title=\"\"><\/p>\n<h2><b>Conclusion<\/b><\/h2>\n<p><span style=\"font-weight: 400;\">Clinical AI is the highest-stakes software category in <\/span><a href=\"https:\/\/engineerbabu.com\/industries\/healthcare-software-development\"><span style=\"font-weight: 400;\">health tech<\/span><\/a><span style=\"font-weight: 400;\">. 
The ambient documentation product that gets it right saves a physician fourteen minutes per patient and hundreds of hours per year.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The product that gets it wrong (the hallucinated NSAID prescription that reaches a patient allergic to NSAIDs, the CDS alert that fires on incorrect data, the patient-facing AI that fails to escalate a suicidal patient to crisis resources) does not get a second chance with the patient who was harmed.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The founders who build clinical AI correctly understand that the guardrail architecture is the product. Not a safety feature added after the core product is built. The foundation the core product is built on.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The LLM is a powerful, unreliable collaborator. Your job is to build the architecture that makes its unreliability clinically acceptable (source citation, grounding validation, clinical entity validation, confidence scoring, mandatory human review) so that the fourteen minutes saved per encounter are genuinely saved, and no patient receives a hallucinated clinical decision as a result.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">I have been on 2,000+ calls with US healthcare founders since 2014. The clinical AI founders who succeed are the ones who treat patient safety as the product constraint, the thing that shapes every architecture decision, not as the compliance requirement they address before launch.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you want 30 minutes to talk through your clinical AI product (which LLM, which HIPAA architecture, what guardrails, what regulatory exposure), book a call with me or Aditi. No slides. No pitch. 
Just the product conversation.<\/span><\/p>\n<h2><b>FAQ<\/b><\/h2>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Can I use ChatGPT or the GPT-4 API for a HIPAA-compliant clinical AI product?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Yes, under specific conditions. OpenAI offers a Business Associate Agreement (BAA) under an enterprise agreement. The BAA covers the API (chat completions, embeddings) under the enterprise agreement terms. Have your healthcare attorney review the BAA scope to confirm it covers your specific use case. Standard API plans (pay-as-you-go) do not include a BAA. If you are processing ePHI through the OpenAI API without an enterprise BAA, you are in violation of HIPAA.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>What is the difference between AWS Bedrock and using the Anthropic API directly for clinical AI?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">AWS Bedrock provides access to Anthropic&#8217;s Claude models (and other models) through AWS&#8217;s infrastructure, covered under the standard AWS HIPAA BAA without requiring a separate enterprise agreement with Anthropic. Using the Anthropic API directly requires a separate enterprise BAA negotiation with Anthropic. For most clinical AI products, AWS Bedrock is faster to compliance and operationally simpler: one BAA (AWS) covers both the cloud infrastructure and the LLM. The trade-off is that Bedrock may not offer the latest Claude model versions as quickly as Anthropic&#8217;s direct API.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>How do I prevent LLM hallucinations in clinical AI outputs?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">No single mechanism eliminates LLM hallucinations. 
A layered guardrail architecture reduces them to a clinically acceptable frequency: (1) prompt engineering constraints prohibiting fabrication and requiring uncertainty markers, (2) grounding validation that traces AI output claims to source data, (3) clinical entity validation checking medications against RxNorm, allergies against patient records, and diagnoses against ICD-10, (4) confidence scoring that flags low-confidence outputs for mandatory provider review, and (5) a required human review gate before any AI output enters a clinical record. Operate all five layers simultaneously.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>Does an AI clinical scribe need FDA clearance?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">An AI clinical scribe that generates clinical documentation from provider-patient conversations is generally not a medical device under FDA definitions, because it is generating documentation of what the provider said, not making independent clinical decisions. However, the line between documentation and clinical decision support is not always clear. Get a regulatory attorney&#8217;s written CDS exemption analysis before asserting that your product does not require FDA engagement. The analysis is particularly important if your scribe adds clinical interpretation, suggests diagnoses, or recommends treatments beyond what was explicitly discussed in the conversation.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>What is the minimum necessary principle and how does it apply to clinical AI?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The HIPAA minimum necessary principle (45 CFR \u00a7164.502(b)) requires that ePHI access be limited to the minimum necessary to accomplish the intended purpose. For clinical AI, this means your AI system should access only the patient data elements required for the specific AI function being performed. 
An ambient documentation AI that needs the patient&#8217;s medication list to contextualize visit documentation does not need access to the patient&#8217;s psychiatric history, HIV status, or substance use disorder (SUD) records. Implement a data access manifest for each AI feature listing the specific data elements accessed and why each is necessary.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>How do I handle clinical AI model bias?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Clinical AI bias (disparate performance across demographic subgroups) requires a multi-step approach. Before deployment: analyze model performance by age group, sex, race\/ethnicity, primary language, insurance type, and any other clinically relevant demographic. After deployment: monitor real-world performance by subgroup continuously. For identified performance gaps: investigate the root cause (training data underrepresentation, feature disparities, labeling bias), retrain on more representative data, and validate the retrained model&#8217;s subgroup performance before deployment. Document all bias analysis in your model cards and clinical validation reports.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>What data can I use to train or fine-tune a clinical AI model?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The preferred training data source is de-identified clinical data: data de-identified under either the HIPAA Safe Harbor method (removal of all 18 listed identifiers) or the Expert Determination method. De-identified data is not ePHI and can be used for model training without HIPAA restrictions. Using identified ePHI for model training requires either patient authorization or a determination that the use falls within healthcare operations, which is not always clear for model training purposes. 
Get a healthcare attorney&#8217;s opinion before using identified ePHI for training data.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>How should a patient-facing AI handle suicidal ideation or mental health crises?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The crisis escalation architecture must be built before the first patient interaction. When crisis language is detected (by keyword matching, semantic analysis, or both), the product must: immediately surface 988 Suicide and Crisis Lifeline contact information, Crisis Text Line, and local emergency services within the product; if the platform has clinical staff, create an escalation notification to the clinical support team; log the crisis event with timestamp, patient content, AI response, and escalation status; and not attempt to provide clinical crisis management through the AI. The AI detects and escalates. A human clinical pathway manages the crisis.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>What is prompt injection and how does it affect clinical AI security?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Prompt injection is an attack where malicious content in the AI&#8217;s input (a patient message, a clinical note, an EHR data field) contains instructions that override the AI&#8217;s system prompt. In a clinical AI context, a prompt injection attack could cause the AI to: generate fabricated clinical content, bypass hallucination guardrails, disclose ePHI from the current context window, or perform unintended actions. 
Defenses: test your system with adversarial prompt injection inputs as part of security testing, implement input sanitization for external data sources, use structured input formats that reduce injection risk, and monitor for unusual AI output patterns that may indicate a successful injection.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>How do I handle clinical AI outputs in multiple languages for diverse patient populations?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Patient-facing AI for diverse populations must support the patient&#8217;s preferred language. The implementation options: (1) use a multilingual LLM (Claude, GPT-4, Gemini support many languages natively) with a language detection step that switches the response language to match the patient&#8217;s input, (2) maintain separate system prompts and clinical content in each supported language with clinical review of each language version, (3) for languages where LLM performance is less validated clinically, route to human language interpretation services rather than relying on AI-generated health content. Clinical content in any language must be reviewed by a bilingual clinical advisor for accuracy and health literacy appropriateness.<\/span><\/p>\n<ul>\n<li aria-level=\"1\">\n<h3><b>What is the difference between fine-tuning and RAG for clinical AI?<\/b><\/h3>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Fine-tuning trains the LLM on a clinical dataset, embedding clinical knowledge into the model weights. RAG (Retrieval-Augmented Generation) retrieves relevant clinical documents at inference time and includes them in the prompt as context, without changing the model weights. 
For most clinical AI use cases, RAG is preferable to fine-tuning for three reasons: (1) RAG does not require ePHI in training data: you can use a general-purpose LLM with de-identified clinical knowledge in the retrieval database; (2) RAG knowledge can be updated without retraining, because you update the retrieval database, not the model; and (3) RAG is more transparent: the retrieved documents are visible in the prompt, making it easier to trace the source of AI outputs. Fine-tuning is appropriate when the clinical domain requires specialized language patterns or reasoning that the base LLM does not handle well, and when clean de-identified training data is available.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In October 2023, a clinical AI startup in New York, Series A, $16M raised, shipped an ambient documentation feature. The product listened to provider-patient conversations during outpatient visits, transcribed the audio, and used an LLM to generate a structured SOAP note that pre-populated in the provider&#8217;s EHR. Providers loved it. 
Documentation time dropped from 22 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":23102,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1246],"tags":[],"class_list":["post-23071","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-healthtech"],"_links":{"self":[{"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/posts\/23071","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/comments?post=23071"}],"version-history":[{"count":3,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/posts\/23071\/revisions"}],"predecessor-version":[{"id":23114,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/posts\/23071\/revisions\/23114"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/media\/23102"}],"wp:attachment":[{"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/media?parent=23071"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/categories?post=23071"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/engineerbabu.com\/blog\/wp-json\/wp\/v2\/tags?post=23071"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}