Mathematical Confidence in a Claims Graph

Spencer Wozniak

Epistemology | February 13, 2026

Clinical documents are dense, narrative, and often ambiguous. Turning them into structured, mathematically tractable knowledge requires not just extraction, but epistemology.

At Serelora, we move from raw text to canonical claims through a two-layer architecture:

Layer A: From Document to Mentions

Whether it arrives as a PDF, scanned image, fax, or even messy handwritten notes, every document is converted into machine-readable text with high accuracy.

Once in text form, the document is segmented into clinically meaningful sections (e.g. HPI, PMH, Medications, ROS, Physical Exam, Assessment & Plan, or SOAP components) using a dynamic chunking step.

These sections are not hardcoded; the system identifies structure contextually from the document itself. Importantly, each section preserves exact character offsets, ensuring full traceability to the source.

Within each chunk, the system extracts discrete clinical entities (e.g. medications, diagnoses, labs, procedures, allergies, etc.) without relying on a fixed schema or hard-coded list.

Each extracted mention carries:

  • Text
  • Section context
  • Document provenance
  • Negation status
  • Confidence (0–1)

At this stage, we do not yet assert truth. We merely record that this document appears to state something.
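As a rough illustration of the fields listed above, a mention could be modeled as an immutable record. This is a minimal sketch, not Serelora's actual schema: the class name, field names, and the choice to store character offsets on the mention itself are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for one extracted mention (names are illustrative)."""

    text: str          # surface text as it appears in the document
    section: str       # section context, e.g. "Medications"
    document_id: str   # document provenance (assumed to be an opaque ID)
    char_start: int    # assumed: offsets into the source text for traceability
    char_end: int
    negated: bool      # negation status
    confidence: float  # extraction confidence in [0, 1]

    def __post_init__(self) -> None:
        # Reject values outside the unit interval at construction time.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        if self.char_start > self.char_end:
            raise ValueError("char_start must not exceed char_end")


example = Mention(
    text="metformin 500 mg",
    section="Medications",
    document_id="doc-001",
    char_start=120,
    char_end=136,
    negated=False,
    confidence=0.92,
)
```

Freezing the record reflects the idea that a mention only documents what the text appears to say; any revision of belief happens at the claim layer.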

Layer B: From Mentions to Claims

Mentions are provisional. Claims are canonical.

When a new mention is extracted, the system evaluates whether it:

  • Represents the same assertion as an existing claim
  • Introduces a new assertion
  • Contradicts an existing claim
  • Supersedes an existing claim

New Claims: Single Confidence Assignment

If a mention creates a new claim, the claim simply inherits the mention’s extraction confidence:

\[p_{\text{claim}} = p_{\text{mention}}\]

without accumulation.

Same Claim: Bayesian-Like Confidence Accumulation

When a mention is attached as additional evidence to an existing claim, confidence is updated using a probabilistic accumulation rule.

Let the existing claim confidence and the new mention confidence be:

\[p_1, p_2 \in [0,1]\]

respectively. Both values are clamped to the unit interval:

\[ p_i := \max(0, \min(1, p_i)) \]

The new confidence is computed as:

\[ p_{\text{new}} = 1 - (1 - p_1)(1 - p_2) = p_1 + p_2 - p_1 p_2 \]
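The clamping step and the update rule translate directly into code. The following is a minimal sketch; the function names `clamp01` and `accumulate` are illustrative and not taken from any actual implementation.

```python
def clamp01(p: float) -> float:
    """Clamp a value to the unit interval [0, 1]."""
    return max(0.0, min(1.0, p))


def accumulate(p_claim: float, p_mention: float) -> float:
    """Combine an existing claim confidence with a new mention confidence.

    Computes 1 - (1 - p1)(1 - p2) after clamping both inputs.
    """
    p1 = clamp01(p_claim)
    p2 = clamp01(p_mention)
    return 1.0 - (1.0 - p1) * (1.0 - p2)


# A claim at 0.8 supported by a new mention at 0.7 rises to about 0.94.
updated = accumulate(0.8, 0.7)
```

Because the rule is symmetric in its two arguments, the result does not depend on which piece of evidence arrived first.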

Interpretation

We reinterpret the complement of each confidence as the probability that the corresponding signal is wrong:

\[ q_1 = 1 - p_1, \qquad q_2 = 1 - p_2 \]

Assuming independence, the probability both are wrong is:

\[ q_{\text{combined}} = q_1 \cdot q_2 \]

Converting back to confidence:

\[ p_{\text{new}} = 1 - q_{\text{combined}} \]

This is mathematically equivalent to the probability that at least one independent supporting signal is correct.
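As a worked example with purely illustrative values, take an existing claim confidence of 0.8 and a new mention confidence of 0.7:

\[ q_1 = 1 - 0.8 = 0.2, \qquad q_2 = 1 - 0.7 = 0.3 \]

\[ q_{\text{combined}} = 0.2 \cdot 0.3 = 0.06 \]

\[ p_{\text{new}} = 1 - 0.06 = 0.94 \]

The same result follows from the expanded form: 0.8 + 0.7 - 0.8 · 0.7 = 0.94.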

Generalization

For multiple independent mentions supporting the same claim:

\[ p_{\text{final}} = 1 - \prod_{i=1}^{n} (1 - p_i) \]

Confidence approaches 1 asymptotically as evidence accumulates but can never exceed it. Each additional independent witness reduces the remaining probability of error multiplicatively.
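The product form can be sketched as a single pass over the supporting confidences. The function name `combined_confidence` is hypothetical, and returning 0 for an empty list is a consequence of the empty product being 1 rather than a behavior the system is known to define.

```python
from typing import Iterable


def combined_confidence(confidences: Iterable[float]) -> float:
    """Return 1 - prod(1 - p_i) over all supporting confidences."""
    error = 1.0  # running probability that every witness is wrong
    for p in confidences:
        p = max(0.0, min(1.0, p))  # clamp to the unit interval
        error *= 1.0 - p
    return 1.0 - error


# Three independent mentions at 0.6 each leave an error of 0.4 ** 3 = 0.064,
# so the combined confidence is about 0.936.
three_witnesses = combined_confidence([0.6, 0.6, 0.6])
```

Applying the pairwise rule repeatedly gives the same value as this closed form, so a claim can be updated incrementally as each mention arrives without storing the full history of confidences.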

Epistemic Discipline

Importantly:

  • Confidence only accumulates when evidence attaches to an existing claim.
  • Contradictions create new claims and alter epistemic status.
  • Supersessions preserve historical truth but mark it temporally outdated.
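These rules can be sketched as a small dispatch over the relation between a mention and an existing claim. Everything named here is hypothetical: the `Relation` and `Status` enums, the `Claim` fields, and in particular the choice to mark both sides of a contradiction as disputed are assumptions for illustration, not a description of the production system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Relation(Enum):
    SAME = "same"
    NEW = "new"
    CONTRADICTS = "contradicts"
    SUPERSEDES = "supersedes"


class Status(Enum):
    ACTIVE = "active"
    DISPUTED = "disputed"      # assumed label for contradicted claims
    SUPERSEDED = "superseded"  # kept for history, marked as outdated


@dataclass
class Claim:
    statement: str
    confidence: float
    status: Status = Status.ACTIVE


def apply_mention(
    claims: List[Claim],
    statement: str,
    p: float,
    relation: Relation,
    target: Optional[Claim] = None,
) -> Claim:
    """Attach a mention to the claim list according to its relation."""
    p = max(0.0, min(1.0, p))
    if relation is Relation.SAME:
        if target is None:
            raise ValueError("SAME requires an existing claim")
        # Only this branch accumulates confidence.
        target.confidence = 1.0 - (1.0 - target.confidence) * (1.0 - p)
        return target
    # Every other relation creates a new claim with single assignment.
    new_claim = Claim(statement, p)
    if relation is Relation.CONTRADICTS and target is not None:
        target.status = Status.DISPUTED
        new_claim.status = Status.DISPUTED
    elif relation is Relation.SUPERSEDES and target is not None:
        target.status = Status.SUPERSEDED  # confidence is left untouched
    claims.append(new_claim)
    return new_claim
```

In this sketch a superseded claim keeps its confidence, which mirrors the idea that the historical assertion was well supported when it was made and is now simply out of date.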

The system distinguishes between:

  • Extraction (what text appears to say)
  • Assertion (what we provisionally treat as true)
  • Confidence (how mathematically secure that assertion is)

Why This Matters

Clinical reasoning requires more than data retrieval: it requires structured belief revision to decide when evidence truly strengthens a claim (and when it does not).

By grounding confidence in a principled probabilistic update rule, we move toward a coherent epistemic model. Evidence does not merely accumulate; it increases certainty only when it independently reduces the probability of error.

Thus, each claim in the graph carries a confidence that reflects the structured convergence of multiple witnesses.

By the mouth of two or three witnesses every matter shall be established.

— Deuteronomy 19:15