Architecture Summary

The Condensa pipeline converts raw documents (PDFs, scans, images) into standardized healthcare formats such as FHIR through a fixed sequence of processing stages.

High-level Pipeline

Document → Vision Encoder → Parser → Mapper → Validator → Valid FHIR Output
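
The staged flow above can be read as a chain of functions, each consuming the previous stage's output. The following is a minimal sketch under that assumption, not Condensa's actual code: `run_pipeline`, `STAGES`, and the four stub stage functions are hypothetical names, and the stubs only mimic the shape of each stage's input and output.

```python
from typing import Any, Callable, Sequence

Stage = Callable[[Any], Any]


def run_pipeline(document: Any, stages: Sequence[tuple[str, Stage]]) -> Any:
    """Feed the document through each stage in order, naming the stage that fails."""
    data = document
    for name, stage in stages:
        try:
            data = stage(data)
        except Exception as exc:
            raise RuntimeError(f"stage {name!r} failed: {exc}") from exc
    return data


# Hypothetical stand-ins for the real stages; they are illustrative only.
def encode(document: bytes) -> list[str]:
    # Stand-in for the Vision Encoder: treat each line of the input as one token.
    return document.decode("utf-8").splitlines()


def parse(tokens: list[str]) -> dict[str, str]:
    # Stand-in for the Parser: read "key=value" tokens into a dict.
    return dict(token.split("=", 1) for token in tokens if "=" in token)


def map_to_fhir(fields: dict[str, str]) -> dict[str, Any]:
    # Stand-in for the Mapper: build a bare-bones Observation.
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": fields["test"]},
    }


def validate(resource: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the Validator: reject anything that is not an Observation.
    if resource.get("resourceType") != "Observation":
        raise ValueError("unexpected resourceType")
    return resource


STAGES = [
    ("vision_encoder", encode),
    ("parser", parse),
    ("mapper", map_to_fhir),
    ("validator", validate),
]

result = run_pipeline(b"test=Blood Glucose", STAGES)
```

Keeping the stages in a named list makes the order explicit and lets an error message point at the stage that failed, which is one reasonable way to structure a linear pipeline like this one.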

Layer Breakdown

| Layer | What it does | Technology / Notes |
| --- | --- | --- |
| Vision Encoder | Converts raw documents (PDFs, scans, images) to compact visual tokens to reduce file size and complexity. | Condensa Vision Core: efficient visual compression and encoding. |
| Parser | Reads visual tokens and extracts structured information: entities, values, units, sections. | Python scripts + domain-specific ontology rules for entity recognition. |
| Mapper | Transforms parsed output into FHIR or ERP JSON using mapping templates and rules. | LLM-assisted rule templates ensure fields conform to FHIR structures. |
| Validator | Validates mapped data against the expected schema and FHIR compliance checks. | Pydantic for schema validation; FHIR validator for official compliance tests. |
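
To make the Mapper row more concrete, here is a small sketch of a rule-based mapping from parsed fields to a FHIR Observation. It assumes the Parser hands over a flat dict with the keys `name`, `value`, `unit`, and `date`; those keys, the `map_to_observation` function, and the `patient_id` argument are hypothetical and not taken from Condensa. The LLM-assisted part of the real Mapper is not modelled here.

```python
from typing import Any

# Hypothetical keys the Parser is assumed to provide.
REQUIRED_FIELDS = ("name", "value", "unit", "date")


def map_to_observation(parsed: dict[str, Any], patient_id: str) -> dict[str, Any]:
    """Fill a fixed Observation template from parsed fields."""
    missing = [key for key in REQUIRED_FIELDS if key not in parsed]
    if missing:
        raise KeyError(f"parsed output is missing: {missing}")
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": parsed["name"]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": parsed["date"],
        "valueQuantity": {"value": parsed["value"], "unit": parsed["unit"]},
    }


observation = map_to_observation(
    {
        "name": "Blood Glucose",
        "value": 110,
        "unit": "mg/dL",
        "date": "2025-08-01T09:00:00Z",
    },
    patient_id="pat-6789",
)
```

Failing fast on missing fields keeps incomplete Parser output from reaching the Validator as a half-filled resource.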

Example Flow

  1. User uploads a lab report PDF
  2. Vision Encoder compresses and tokenizes images
  3. Parser extracts "Blood Glucose", "Patient Name", "Date"
  4. Mapper converts extracted fields to a FHIR Observation resource
  5. Validator checks the result against the FHIR schema, and the validated output is returned to the user
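
Step 3 of the flow above might look roughly like the sketch below. It assumes the visual tokens have already been decoded into plain "Label: value" text lines, which the source does not state. The regular expressions, the `parse_lines` function, and the sample patient data are illustrative only, and a real parser would also apply the ontology rules mentioned in the layer breakdown.

```python
import re

# "Label: value" lines, e.g. "Blood Glucose: 110 mg/dL".
LINE_PATTERN = re.compile(r"^(?P<label>[A-Za-z ]+?):\s*(?P<value>.+)$")
# A number followed by a unit that starts with a letter or "%", so dates are not split.
QUANTITY_PATTERN = re.compile(r"^(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z%]\S*)$")


def parse_lines(lines: list[str]) -> dict[str, object]:
    """Extract labelled fields; numeric results become {"value", "unit"} dicts."""
    fields: dict[str, object] = {}
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines without a "Label: value" shape
        label = match.group("label").strip()
        value = match.group("value").strip()
        quantity = QUANTITY_PATTERN.match(value)
        if quantity:
            fields[label] = {
                "value": float(quantity.group("number")),
                "unit": quantity.group("unit"),
            }
        else:
            fields[label] = value
    return fields


# Made-up sample lines for illustration.
fields = parse_lines(
    [
        "Patient Name: Jane Doe",
        "Date: 2025-08-01",
        "Blood Glucose: 110 mg/dL",
        "page 1 of 1",
    ]
)
```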

Example FHIR Output (snippet)

{
  "resourceType": "Observation",
  "id": "obs-12345",
  "status": "final",
  "code": { "text": "Blood Glucose" },
  "subject": { "reference": "Patient/pat-6789" },
  "effectiveDateTime": "2025-08-01T09:00:00Z",
  "valueQuantity": { "value": 110, "unit": "mg/dL" }
}
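
A few structural checks on this snippet can be sketched with the standard library alone. This is a deliberately simplified stand-in for the Pydantic models and the official FHIR validator named in the layer breakdown, not a replacement for either. `check_observation` is a hypothetical helper that covers only the two elements FHIR R4 requires on an Observation (`status` and `code`) plus the type of the numeric value.

```python
import json

# Status codes defined for Observation.status in FHIR R4.
OBSERVATION_STATUSES = {
    "registered", "preliminary", "final", "amended",
    "corrected", "cancelled", "entered-in-error", "unknown",
}


def check_observation(resource: dict) -> list[str]:
    """Return a list of problems; an empty list means these basic checks passed."""
    errors = []
    if resource.get("resourceType") != "Observation":
        errors.append("resourceType must be 'Observation'")
    if resource.get("status") not in OBSERVATION_STATUSES:
        errors.append("status is missing or not a known Observation status")
    if "code" not in resource:
        errors.append("code is required")
    quantity = resource.get("valueQuantity")
    if quantity is not None:
        value = quantity.get("value")
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append("valueQuantity.value must be a number")
    return errors


snippet = json.loads("""
{
  "resourceType": "Observation",
  "id": "obs-12345",
  "status": "final",
  "code": { "text": "Blood Glucose" },
  "subject": { "reference": "Patient/pat-6789" },
  "effectiveDateTime": "2025-08-01T09:00:00Z",
  "valueQuantity": { "value": 110, "unit": "mg/dL" }
}
""")

errors = check_observation(snippet)
```

Returning every problem in one list, rather than raising on the first, lets a caller report them all together.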