A Geometric Framework for Retrieval-Augmented Generation

Authors: Scott Senkeresty (Chief Architect, Semantic OS), Tia (Chief Semantic Agent)
Affiliation: Semantic Infrastructure Lab
Date: 2025-11-30
Status: Research Framework
Document Type: Technical Research Paper
Related SIL Components: Semantic Memory (Layer 0), USIR (Layer 1), Multi-Agent Orchestration (Layer 3), SIM (Layer 5), GenesisGraph (provenance)


Abstract

Contemporary Retrieval-Augmented Generation (RAG) systems are typically engineered as similarity-retrieval pipelines with naive prompt concatenation—an approach that produces fragile, unpredictable, and often unreliable results. This document presents an alternative formulation: RAG as semantic manifold transport, where meaning must be preserved across four geometrically misaligned representation spaces.

We show that RAG failures are not retrieval failures but geometric distortion failures during meaning transport across:

  1. Human conceptual space → Embedding space
  2. Embedding space → LLM latent space
  3. LLM latent space → Fusion space
  4. Throughout: preservation of semantic topology, curvature, and relational structure

This framework provides rigorous foundations for designing RAG systems that minimize semantic distortion at each transition. We outline the distortion sources, propose geometric alignment strategies, and connect this work to SIL's broader semantic infrastructure research.

Keywords: semantic manifolds, retrieval-augmented generation, meaning transport, geometric distortion, semantic memory, USIR

💡 New to SIL terminology? Keep the Glossary open in another tab.


1. Introduction: RAG is Not a Retrieval Problem

1.1 The Current State of RAG

Most deployed RAG systems follow a pattern:

  1. Embed user query into vector space
  2. Retrieve top-k similar document chunks
  3. Concatenate chunks into prompt context
  4. Generate response with LLM

This approach treats RAG as information retrieval + text generation. The implicit assumption: if retrieved text is "relevant" by embedding similarity, the LLM will correctly interpret and integrate it.

This assumption is false.

1.2 Why Standard RAG Fails

Observed failure modes include:

  • Hallucination despite retrieved evidence - LLM ignores or misinterprets provided context
  • Relevance mismatch - Embedding similarity ≠ LLM reasoning relevance
  • Knowledge conflicts - Retrieved chunks contradict each other; LLM has no resolution protocol
  • Context dilution - Relevant information buried in irrelevant chunks
  • Meaning drift - User intent distorted through query → embedding → retrieval → generation pipeline

These are not bugs. They are symptoms of geometric distortion during semantic transport.

1.3 The Core Insight

RAG is not a retrieval problem.
RAG is a semantic meaning transport problem across four misaligned manifolds.

Each representation space (human concepts, embeddings, LLM latents, fused reasoning) has different geometry—different notions of distance, curvature, and relational structure. Meaning that moves between these spaces undergoes distortion unless we explicitly engineer alignment.

This paper formalizes that distortion and proposes rigorous strategies to minimize it.


2. The Four Semantic Manifolds

2.1 Notation and Definitions

We model semantic spaces as manifolds[^1] with intrinsic geometry:

[^1]: A manifold is a topological space that locally resembles Euclidean space but may have global curvature and complex structure. The semantic spaces discussed here are not manifolds in the strict mathematical sense, but the manifold framing provides useful geometric intuition for reasoning about meaning preservation.

  • M_H: Human conceptual manifold
  • M_E: Embedding manifold
  • M_L: LLM latent manifold
  • M_F: Fusion manifold

Semantic transport in RAG requires preserving structure across these spaces:

M_H --[projection]--> M_E --[alignment]--> M_L --[fusion]--> M_F

Goal: Minimize semantic distortion at each arrow.
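
This chain can be read as a typed pipeline in which each arrow is a lossy map. A minimal Python sketch of the composition (all names here are illustrative, not a SIL API):

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class HumanQuery:      # point in M_H, as expressed in natural language
    text: str

@dataclass
class EmbeddedQuery:   # point in M_E
    vector: List[float]

@dataclass
class LatentContext:   # point in M_L: query plus retrieved evidence in context
    query: str
    chunks: List[str]

@dataclass
class FusedAnswer:     # point in M_F
    text: str
    sources: List[str]

# Each transport step below is a site of potential geometric distortion.
Project = Callable[[HumanQuery], EmbeddedQuery]     # M_H -> M_E
Align   = Callable[[EmbeddedQuery], LatentContext]  # M_E -> M_L
Fuse    = Callable[[LatentContext], FusedAnswer]    # M_L -> M_F

def transport(q: HumanQuery, project: Project, align: Align, fuse: Fuse) -> FusedAnswer:
    # End-to-end RAG is the composition of three lossy maps.
    return fuse(align(project(q)))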


2.2 M_H — Human Conceptual Manifold

Characteristics:

Human concepts exist in high-dimensional, relationally structured space:

  • Contextual: Meaning depends on shared knowledge, culture, pragmatics
  • Underspecified: Natural language queries omit constraints that are obvious to humans
  • Non-linear: Conceptual similarity is not embedding-space cosine distance
  • Relational: Meaning encoded in graph structure, not feature vectors
  • Embodied: Grounded in physical, temporal, causal experience

Geometry:

  • High intrinsic curvature (concepts cluster in non-Euclidean ways)
  • Sparse explicit features (most meaning is implicit)
  • Dynamic topology (context reshapes semantic neighborhoods)

Example:

Query: "Why did the project fail?"

Human conceptual structure:
- Implicit scope: "our recent software project"
- Implicit relations: blame attribution, timeline, causal chains
- Implicit constraints: technical vs organizational factors

Embedding models see only surface tokens.


2.3 M_E — Embedding Manifold

Characteristics:

Learned vector space optimized for distributional similarity:

  • Static: Vectors do not change based on query context (in most systems)
  • Distributional: Meaning ≈ co-occurrence patterns in training data
  • Locally linear: Designed for cosine similarity, dot products, k-NN retrieval
  • Low curvature: Optimized to approximate Euclidean geometry locally

Geometry:

  • Smooth, low-curvature approximation of semantic space
  • Similarity = angle between vectors (cosine)
  • Retrieval = nearest-neighbor search in metric space
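
Operationally, M_E geometry reduces to a few linear-algebra primitives. A minimal numpy sketch of retrieval in this manifold (function and variable names are illustrative):

import numpy as np

def cosine_topk(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 5):
    # Normalize so the dot product equals cosine similarity (angle between vectors).
    q = query_vec / np.linalg.norm(query_vec)
    D = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    sims = D @ q                     # one similarity score per document
    topk = np.argsort(-sims)[:k]     # nearest-neighbor search in the metric space
    return topk, sims[topk]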

Distortion:

Projecting M_H → M_E loses:
- Implicit relational constraints
- Contextual disambiguation
- Pragmatic intent
- Causal/temporal structure

Example:

Same query: "Why did the project fail?"

Embedding representation:
- Tokens: [why, did, the, project, fail]
- Nearest neighbors: generic "project failure" documents
- Missing: which project, what kind of failure, who is asking, why it matters


2.4 M_L — LLM Latent Manifold

Characteristics:

The internal semantic space where the LLM represents meaning:

  • Highly curved: Nonlinear transformations through layers
  • Dynamic: Geometry depends on prompt, task, and token sequence
  • Contextual: Early tokens shape curvature for later tokens
  • Task-conditional: Same text has different latent geometry in different contexts

Geometry:

  • Deep nonlinear manifold shaped by transformer attention
  • Meaning = trajectory through layer activations
  • Attention patterns create local curvature in representation space

Critical mismatch:

M_E geometry (optimized for cosine similarity) ≠ M_L geometry (optimized for next-token prediction and in-context reasoning).

Thus: embedding relevance ≠ LLM reasoning relevance.

Example:

Retrieved text: "The waterfall methodology led to late-stage requirement changes."

In M_E: High cosine similarity to "project failure"
In M_L: Interpreted based on:
- Position in context window
- Surrounding chunks
- Query phrasing
- Model's internal task representation

Same text can have high M_E relevance but low M_L utility if geometry doesn't align.


2.5 M_F — Fusion Manifold

Characteristics:

The emergent semantic space where query + retrieved evidence + model knowledge integrate:

  • Constructed during inference: Built by attention over combined context
  • Conflicted: May contain contradictory signals
  • Unstable: Small changes in retrieval order or formatting → large output changes
  • Governed by attention dynamics: Which tokens dominate depends on transformer architecture

Geometry:

  • Shaped by how attention patterns fuse multiple information sources
  • Early tokens act as anchors (high influence on final representation)
  • Mid-context tokens often receive less attention weight (position biases such as "lost in the middle")

Failure mode:

Without structured fusion protocol, M_F becomes:
- Noisy superposition of conflicting signals
- Dominated by most recent or most confident text (not most correct)
- Unpredictable based on formatting/ordering

Example:

Retrieved chunks:
1. "Project failed due to inadequate testing"
2. "Project succeeded in delivering core features"
3. "Stakeholder misalignment caused delays"

Without fusion protocol:
- LLM may weigh #1 highest (appears first)
- Or synthesize false narrative blending contradictions
- Or ignore evidence entirely and hallucinate

With fusion protocol:
- Extract claims with sources
- Resolve contradictions (succeeded vs failed)
- Identify ambiguity (what does "failed" mean?)
- Produce grounded, multi-perspective answer


3. Distortion Analysis: Where RAG Breaks

3.1 Transport #1: Human → Embedding (M_H → M_E)

Distortion source:

Projecting rich, relational, contextual meaning into static distributional vectors.

What is lost:

  • Implicit scope and constraints
  • Relational structure (graphs → vectors)
  • Pragmatic intent (why this query now?)
  • Disambiguation cues

Observed failures:

  • Generic retrieval when specific context was needed
  • Missing domain-specific terminology
  • Query ambiguity not surfaced to user

Distortion measure:

How much human intent is unrecoverable from embedding alone?


3.2 Transport #2: Embedding → LLM (M_E → M_L)

Distortion source:

Embedding-space similarity does not align with LLM-latent reasoning relevance.

What is lost:

  • Contextual relevance (LLM needs different neighbors than embedding model)
  • Task-specific importance (embeddings don't know the downstream task)
  • Reasoning dependencies (LLM needs chains of logic, not isolated chunks)

Observed failures:

  • Retrieved chunks have high cosine similarity but low reasoning utility
  • LLM cannot connect retrieved evidence to query
  • Redundant or contradictory chunks retrieved

Distortion measure:

Divergence between the embedding ranking and the LLM's internal relevance weighting.
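
A cheap proxy for this divergence is rank correlation between the embedding ordering and an ordering produced closer to M_L, such as cross-encoder scores (see B2). A sketch using scipy; both score lists are assumed to cover the same candidate set:

from scipy.stats import kendalltau

def rank_divergence(embedding_scores, llm_scores):
    # Kendall's tau: 1.0 = identical orderings, -1.0 = reversed orderings.
    tau, _ = kendalltau(embedding_scores, llm_scores)
    # Report divergence in [0, 1]: 0 = aligned manifolds, 1 = maximally misaligned.
    return (1.0 - tau) / 2.0

# Example: the two spaces disagree on the middle candidates.
print(rank_divergence([0.9, 0.8, 0.7, 0.6], [0.9, 0.6, 0.8, 0.5]))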


3.3 Transport #3: LLM → Fusion (M_L → M_F)

Distortion source:

No algorithmic protocol for integrating multiple, potentially conflicting information sources.

What is lost:

  • Structured conflict resolution
  • Source attribution and provenance
  • Confidence weighting
  • Gap identification (what's missing?)

Observed failures:

  • Hallucination despite relevant retrieved context
  • Contradictory chunks → LLM picks arbitrarily
  • Over-confidence in uncertain synthesis
  • No acknowledgment of evidence gaps

Distortion measure:

How much retrieved information is correctly integrated vs ignored/distorted in final output?
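
One automatable proxy is a grounding rate: the fraction of answer sentences attributable to some retrieved chunk. A sketch using an embedding-similarity threshold (the 0.75 cutoff is an illustrative assumption, and embedding similarity is itself only an approximation of support):

import numpy as np

def grounding_rate(answer_sent_embs: np.ndarray, chunk_embs: np.ndarray,
                   threshold: float = 0.75) -> float:
    # Inputs are L2-normalized embeddings: (n_sentences, d) and (n_chunks, d).
    sims = answer_sent_embs @ chunk_embs.T        # cosine similarity matrix
    supported = sims.max(axis=1) >= threshold     # best-matching chunk per sentence
    return float(supported.mean())                # share of grounded sentences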


4. Geometric Alignment Strategies

4.1 Strategy Class A: Human → Embedding Alignment

Goal: Make human queries embedding-compatible while preserving intent.

A1. Semantic Scaffolding Layer

Pre-process human input to expose semantic structure:

Query templates that reveal implicit axes:
- "Compare X and Y on dimensions [...]"
- "Timeline of events leading to [...]"
- "Failure modes of [...] in context [...]"

Clarifying questions driven by embedding sensitivity:
- "Do you mean X (technical) or Y (organizational)?"
- "Which time period: recent or historical?"

Query expansion using domain ontology:
- User: "project failure"
- Expansion: "project failure" + "root cause" + "lessons learned" + [domain terms]

Semantic previews:
- Show embedding-space neighborhoods activated by query
- Let user adjust before retrieval

Controlled Natural Language (CNL) interfaces:
- Structured input forms that guide users to embedding-friendly queries

Result: M_H → M_E projection becomes explicit, inspectable, user-steerable.
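
A minimal sketch of the ontology-driven expansion step (the ontology dictionary is a stand-in for the editable domain ontologies described in A2):

DOMAIN_ONTOLOGY = {
    # Hypothetical domain ontology: concept -> related retrieval terms.
    "project failure": ["root cause", "lessons learned", "post-mortem",
                        "schedule slippage"],
}

def expand_query(query: str, ontology: dict, max_terms: int = 4) -> str:
    # Append ontology terms for any concept the query mentions, exposing implicit axes.
    terms = []
    for concept, related in ontology.items():
        if concept in query.lower():
            terms.extend(related[:max_terms])
    return query if not terms else query + " (" + "; ".join(terms) + ")"

print(expand_query("What caused the project failure?", DOMAIN_ONTOLOGY))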


A2. User-Facing Meaning Alignment

Build interfaces where humans and embedding systems co-adapt:

Components:
- Query reformulation assistants (LLM-powered)
- Editable domain ontologies
- Neighborhood visualization tools
- Meaning debugging ("Here's what we think you meant")
- Conversational grounding dialogs

Example workflow:

  1. User enters fuzzy query
  2. System shows embedding interpretation
  3. User clarifies mismatches
  4. System updates query representation
  5. Retrieval now aligned with intent

4.2 Strategy Class B: Embedding → LLM Alignment

Goal: Align M_E and M_L so embedding relevance ≈ LLM reasoning relevance.

B1. Joint Embedding-LLM Co-Training (ideal, expensive)

Train retrieval embeddings and LLM contextual embeddings to share geometry:

Approaches:
- Shared transformer trunk with dual objectives
- Contrastive training on (query, relevant_doc, LLM_task) triples
- Multi-view alignment: embedding model learns to predict LLM latent relevance

Result: M_E ≈ M_L (near-isometric mapping).

Status: Research frontier; not yet common in production.
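
For intuition only, one plausible form of the alignment objective: train the embedding model so its similarity distribution over candidates matches LLM-judged relevance. A PyTorch sketch (the loss form and temperature are assumptions, not a prescribed recipe):

import torch.nn.functional as F

def alignment_loss(query_emb, doc_embs, llm_relevance, temperature=0.05):
    # query_emb: (d,); doc_embs: (n, d); llm_relevance: (n,) soft relevance targets.
    sims = F.normalize(doc_embs, dim=-1) @ F.normalize(query_emb, dim=-1)
    log_p = F.log_softmax(sims / temperature, dim=-1)   # embedding-space distribution
    target = F.softmax(llm_relevance, dim=-1)           # LLM-relevance distribution
    # KL divergence pulls M_E similarity structure toward M_L relevance structure.
    return F.kl_div(log_p, target, reduction="sum")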


B2. Cross-Encoder Re-Ranking (best current practice)

Use cross-encoders that operate in M_L to re-rank embedding results:

Pipeline:
1. Embedding model retrieves top-100 candidates (fast, broad)
2. Cross-encoder re-ranks using LLM-native relevance (slower, precise)
3. Top-k from cross-encoder passed to LLM

Why this works:

Cross-encoders encode (query, document) jointly through a transformer → they implicitly approximate M_L geometry.

Result: Acts as alignment operator R: M_E → M_L.

Trade-off: Compute cost vs accuracy.
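
A sketch of the two-stage pipeline using the sentence-transformers library (the checkpoint names are common public models, chosen here for illustration):

import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")                  # fast M_E stage
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # M_L-aligned stage

def retrieve_and_rerank(query: str, docs: list, k_broad: int = 100, k_final: int = 5):
    # Stage 1: broad recall via cosine similarity in embedding space.
    doc_embs = bi_encoder.encode(docs, normalize_embeddings=True)
    q_emb = bi_encoder.encode(query, normalize_embeddings=True)
    candidates = np.argsort(-(doc_embs @ q_emb))[:k_broad]
    # Stage 2: precise re-ranking with joint (query, document) encoding.
    scores = cross_encoder.predict([(query, docs[i]) for i in candidates])
    reranked = candidates[np.argsort(-scores)][:k_final]
    return [docs[i] for i in reranked]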


B3. Latent-Space Adapters

Add trainable adapters inside LLM that learn to interpret embedding-selected text:

Mechanism:

Adapter layers fine-tuned to:
- Reweight attention over retrieved chunks based on the LLM's internal task representation
- Learn transformation A_θ: M_E → M_L

Result: Reduces curvature mismatch without retraining base models.


B4. Semantic Compression

Transform retrieved text into LLM-friendly structured formats:

Instead of raw text:

The waterfall methodology led to late-stage requirement changes, which caused schedule slippage...

Send structured meaning:

{
  "claim": "Waterfall methodology caused project delays",
  "mechanism": "late-stage requirement changes",
  "evidence_type": "post-mortem analysis",
  "source": "doc_142, section 3.2"
}

Why this works:

Structured formats reduce ambiguity and align better with the LLM's internal relational reasoning.

Formats:
- Entity-attribute tables
- RDF triples
- Event sequences
- Causal chains
- Ontology-aligned objects
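
A sketch of the compression step itself. Here llm_complete is a hypothetical stand-in for whatever completion API is available; the output schema mirrors the example above:

import json

COMPRESSION_PROMPT = (
    "Extract the main claim from the passage below as JSON with keys "
    '"claim", "mechanism", "evidence_type", "source".\n'
    "Source: {source}\nPassage: {passage}"
)

def compress_chunk(passage: str, source: str, llm_complete) -> dict:
    # llm_complete: hypothetical callable (prompt: str) -> str returning JSON text.
    raw = llm_complete(COMPRESSION_PROMPT.format(source=source, passage=passage))
    record = json.loads(raw)
    record["source"] = source   # enforce provenance regardless of model output
    return record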


4.3 Strategy Class C: LLM → Fusion Alignment

Goal: Ensure retrieved evidence integrates coherently into final reasoning.

C1. Structured Fusion Protocols

Replace naive concatenation with algorithmic integration:

Fusion algorithm (implemented via prompting or fine-tuning):

  1. Summarize retrieved evidence
    Extract key claims, entities, relations

  2. Attach sources
    Every claim links to originating document/chunk

  3. Identify conflicts
    Flag contradictory claims explicitly

  4. Weight evidence
    Assess reliability, recency, source authority

  5. Identify gaps
    Note what's missing from retrieved set

  6. Construct grounded response
    Synthesize only after explicit integration

Result: M_F becomes structured, inspectable, provenance-complete.
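
At the prompt level, the six steps can be made explicit in the instruction given to the model. A sketch (the structure matters more than the exact wording):

FUSION_PROMPT = """Integrate the retrieved evidence by following these steps in order:
1. SUMMARIZE: list each claim with its key entities and relations.
2. ATTRIBUTE: attach a source ID to every claim.
3. CONFLICTS: explicitly flag claims that contradict each other.
4. WEIGHT: rate each claim's reliability (source authority, recency).
5. GAPS: state what the evidence does not cover.
6. ANSWER: only now synthesize a response, citing a source for each claim.

Evidence:
{evidence_block}

Question: {question}"""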


C2. Retrieval Ordering as Geometric Prior

Observation: In transformers, early tokens anchor the semantic space, while mid-context tokens often receive less attention ("lost in the middle").

Strategy: Control chunk ordering to shape M_F geometry:

Ordering principles:

  1. Highest relevance first → Anchors reasoning
  2. Supporting context second → Provides background
  3. Outliers and noise last → Minimal influence

Result: Attention topology biased toward high-quality evidence.
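
A minimal ordering function implementing these principles (relevance scores are assumed to come from a re-ranker such as B2; the noise threshold is illustrative):

def order_for_fusion(chunks, scores, noise_floor=0.2):
    # Sort descending so the strongest evidence anchors the context window.
    ranked = sorted(zip(scores, chunks), key=lambda pair: -pair[0])
    anchors  = [c for s, c in ranked if s >= noise_floor]   # relevance first, support next
    outliers = [c for s, c in ranked if s < noise_floor]    # noise last: minimal influence
    return anchors + outliers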


C3. Structured Input Formats

Force LLM to operate on stable relational objects, not raw text blobs:

Good:

evidence:
  - claim: "Project delayed 6 months"
    source: "quarterly_report_Q3.pdf"
    confidence: high
  - claim: "Team morale remained strong"
    source: "exit_interviews.txt"
    confidence: medium

Bad:

Here are some documents about the project:
[dump of 10 unstructured text chunks]

Why structured inputs work:

  • Stable geometry (consistent parsing)
  • Explicit relations (graph structure preserved)
  • Provenance built-in (source tracking)
  • Reduced hallucination (less ambiguity)

5. Connection to SIL Architecture

This manifold transport framework directly informs SIL's semantic infrastructure:

5.1 Layer 0: Semantic Memory

SIL requirement: Persistent, provenance-complete semantic graph.

RAG connection:

Semantic Memory must store meaning in a representation that:
- Preserves relational structure (not just embeddings)
- Supports geometric queries (nearest neighbors in multiple manifolds)
- Tracks provenance of meaning transformations
- Enables inspectable retrieval (show why chunks were selected)

Design implication:

Store multiple representations:
- Graph structure (relations, ontology)
- Embedding vectors (M_E for retrieval)
- Semantic metadata (types, constraints, provenance)
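
A sketch of what one multi-representation record might look like (field names are illustrative, not the Semantic Memory schema):

from dataclasses import dataclass, field

@dataclass
class MemoryRecord:
    # One node in the semantic store, indexed under multiple representations.
    node_id: str
    content: str
    relations: list = field(default_factory=list)    # (subject, predicate, object) triples
    embedding: list = field(default_factory=list)    # M_E vector for similarity retrieval
    semantic_type: str = ""                          # ontology-aligned type constraint
    provenance: list = field(default_factory=list)   # history of meaning transformations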


5.2 Layer 1: USIR (Universal Semantic IR)

SIL requirement: Unified intermediate representation for cross-domain meaning.

RAG connection:

USIR must act as a low-distortion target for both M_E and M_L:

  • Structured enough to preserve relations
  • Flexible enough to represent multiple domains
  • Inspectable (humans can debug meaning transport)
  • Composable (supports fusion operations)

Design implication:

USIR is the "semantic compression" target—structured meaning that both embeddings and LLMs can interpret accurately.


5.3 Layer 3: Multi-Agent Orchestration

SIL requirement: Deterministic, inspectable agent coordination.

RAG connection:

The fusion manifold (M_F) is also the multi-agent reasoning space:

  • Agents must fuse information from multiple sources
  • Conflicts must be resolved algorithmically
  • Provenance required for all claims
  • Reasoning chains must be reproducible

Design implication:

Multi-agent orchestration needs structured fusion protocols (Strategy C1).


5.4 Layer 5: SIM (Semantic Information Mesh)

SIL requirement: Human interfaces for exploring semantic structure.

RAG connection:

Human conceptual manifold (M_H) requires interfaces that:

  • Make embedding interpretations visible (Strategy A2)
  • Support query refinement through semantic previews
  • Visualize manifold neighborhoods
  • Debug meaning transport failures

Design implication:

SIM needs manifold visualization tools—show users how their intent is being geometrically interpreted.


5.5 Cross-Cutting: Provenance (GenesisGraph)

SIL requirement: Verifiable provenance for all transformations.

RAG connection:

Every manifold transport step must be provenance-tracked:

  • M_H → M_E: How was query transformed?
  • M_E → M_L: Which chunks retrieved and why?
  • M_L → M_F: How was evidence integrated?

Design implication:

GenesisGraph-style provenance graphs for RAG pipelines—every retrieval and fusion step is a verifiable transformation.


6. Implementation Roadmap for SIL

6.1 Phase 1: Formalize Manifold Metrics

Research questions:

  • How do we measure distortion at each transport step?
  • Can we define semantic distance functions for M_H, M_E, M_L?
  • What are the intrinsic dimensions of each manifold?

Deliverables:

  • Distortion metrics for query → embedding → LLM pipeline
  • Benchmark datasets with ground-truth semantic transport quality

6.2 Phase 2: Build Semantic Scaffolding Layer

Prototype:

  • Query reformulation assistant using ontology + embeddings
  • Semantic preview UI (show embedding neighborhoods)
  • Clarifying question generator

Validation:

Measure: Does scaffolding reduce M_H → M_E distortion?


6.3 Phase 3: Alignment Experiments

Experiments:

  1. Compare embedding-only vs cross-encoder reranking (B2)
  2. Test semantic compression formats (B4): JSON vs triples vs raw text
  3. Measure fusion protocol impact (C1): structured vs unstructured

Metrics:

  • Retrieval accuracy
  • LLM grounding rate (evidence correctly used)
  • Hallucination rate
  • User satisfaction

6.4 Phase 4: Integrated Semantic Memory + RAG

Goal: Build Layer 0 (Semantic Memory) with manifold-aware retrieval.

System:

  • Graph-structured semantic store
  • Multi-representation indexing (embeddings + relations + types)
  • Provenance-tracked retrieval
  • Fusion protocol integration

Result: RAG system where every transport step is inspectable, low-distortion, and provenance-complete.


7. Relation to Existing Work

7.1 Information Retrieval

Classical IR focuses on M_E (embedding/keyword matching).

SIL contribution: Formalize M_H → M_E distortion and provide scaffolding strategies.


7.2 Semantic Web / Knowledge Graphs

Focus on structured representations (RDF, ontologies).

SIL contribution: Connect structured knowledge (graphs) to embedding manifolds and LLM latent spaces via geometric framework.


7.3 Prompt Engineering

Treats RAG as context formatting problem.

SIL contribution: Show that formatting is one aspect of M_L → M_F alignment; structured fusion protocols are necessary.


7.4 Dense Retrieval / Embedding Research

Focus on improving M_E quality.

SIL contribution: Show that M_E quality is necessary but not sufficient—must also align M_E ↔ M_L.


8. Open Questions

8.1 Theoretical

  • Can we prove bounds on distortion for specific manifold pairs?
  • What are the intrinsic geometric invariants of semantic manifolds?
  • Is there a universal semantic coordinate system?

8.2 Engineering

  • What is the optimal trade-off between structured compression and raw text?
  • How do we build user interfaces for manifold alignment?
  • Can we automate fusion protocol generation?

8.3 Empirical

  • What are the actual distortion magnitudes in production RAG systems?
  • How much does cross-encoder reranking reduce M_E ↔ M_L mismatch?
  • Can users effectively steer M_H → M_E projection?

9. Conclusion

Retrieval-Augmented Generation is not a retrieval problem.

It is a semantic manifold transport problem—meaning must be preserved as it moves across four geometrically distinct representation spaces, each with different notions of similarity, structure, and relevance.

Standard RAG fails because it treats transport as concatenation: embed query, retrieve text, dump into prompt, hope for the best. This ignores geometric distortion at every step.

Rigorous RAG requires:

  1. Human → Embedding alignment via semantic scaffolding
  2. Embedding → LLM alignment via reranking, compression, or co-training
  3. LLM → Fusion alignment via structured protocols and ordering
  4. Provenance tracking of all transformations
  5. User interfaces for meaning debugging and co-adaptation

This framework is not a theoretical abstraction—it is engineering guidance for building RAG systems that are interpretable, reliable, and semantically grounded.

SIL's semantic infrastructure (Semantic Memory, USIR, Multi-Agent Orchestration, SIM) provides the architectural layers necessary to implement manifold-aware RAG at scale.

The work ahead is rigorous, long-term, and necessary.

As RAG systems become central to knowledge work, their semantic foundations must be built on more than heuristics and prompts. They must be built on geometry, provenance, and structure.


10. Compact Summary (for Quick Reference)

The Problem:

RAG systems fail because they ignore geometric distortion during semantic transport across misaligned manifolds.

The Manifolds:

  • M_H (Human): Relational, contextual, implicit
  • M_E (Embedding): Static, distributional, low-curvature
  • M_L (LLM Latent): Dynamic, nonlinear, task-conditional
  • M_F (Fusion): Constructed, conflicted, attention-shaped

The Distortions:

  • M_H → M_E: Implicit meaning lost in projection
  • M_E → M_L: Embedding relevance ≠ LLM reasoning relevance
  • M_L → M_F: No structured integration protocol

The Solutions:

  • Semantic scaffolding (human ↔ embedding alignment)
  • Cross-encoder reranking (embedding ↔ LLM alignment)
  • Structured fusion protocols (evidence integration)
  • Provenance tracking (inspectable transport)
  • Manifold visualization (meaning debugging)

SIL's Role:

Build the semantic substrate (Semantic Memory, USIR, Multi-Agent Orchestration, SIM) required for low-distortion, provenance-complete RAG.


Optimal RAG = Geometric meaning transport, not keyword retrieval.


Acknowledgments

This framework emerged from collaborative research between Scott Senkeresty (Chief Architect, Semantic OS) and Tia (Chief Semantic Agent). The geometric perspective was developed through analysis of production RAG failures and formal semantic architecture design.


References

Note: This is a working research document. Formal publication and external references to be added upon peer review.

Related SIL Documents:

  • SIL_MANIFESTO.md - Why explicit semantic infrastructure matters
  • SIL_TECHNICAL_CHARTER.md - Formal specification of Semantic OS (coming soon)
  • UNIFIED_ARCHITECTURE_GUIDE.md - How SIL components relate
  • SIL_RESEARCH_AGENDA_YEAR1.md - Research roadmap (coming soon)

External Work (for formal publication):

  • Dense passage retrieval (Karpukhin et al.)
  • Cross-encoder architectures (Nogueira et al.)
  • Semantic similarity metrics
  • Information geometry
  • Knowledge graph embeddings
  • Prompt engineering for RAG

Document Version: 1.0
Last Updated: 2025-11-30
License: CC BY 4.0 (documentation), to be determined for research publication


For questions or collaboration: See SIL repository for contact information.