Attribution and Citations

Introduction

One of RAG's greatest strengths is traceability. Unlike pure LLM responses that come from an opaque training process, RAG responses can point to specific source documents. This lesson explores how to implement attribution systems that show users exactly where information came from.

Good attribution builds trust, enables verification, and helps users explore related content.

Why Attribution Matters

Building User Trust

Users are increasingly skeptical of AI-generated content—and rightly so. Attribution provides:

Verifiability: Users can check the source themselves
Transparency: Shows which retrieved documents the answer draws on
Confidence: Cited sources feel more authoritative

Regulatory and Compliance

In some domains, attribution isn't optional:

Legal: Legal advice must cite relevant statutes/cases
Medical: Health information needs source validation
Financial: Recommendations require disclosure of sources
Academic: Research must be properly cited

Quality Feedback Loop

Attribution helps identify issues:

Missing sources: If no citation, is the answer fabricated?
Outdated sources: Old documents may need updating
Low relevance: Poor matches indicate retrieval issues
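
The three checks above can be sketched as a small audit function that runs on each response. This is a minimal illustration, not a definitive implementation: the `AuditableCitation` shape, the `auditCitations` name, and both thresholds (0.7 similarity, 365 days) are assumptions made for the example and should be tuned against real data.

```typescript
// Hypothetical shape: only the fields this audit needs.
interface AuditableCitation {
  source: string;
  similarity: number;   // relevance score, assumed to be in the 0-1 range
  lastUpdated?: Date;   // when the source document last changed, if known
}

type CitationIssue =
  | { kind: 'missing-sources' }
  | { kind: 'outdated-source'; source: string }
  | { kind: 'low-relevance'; source: string };

// Illustrative thresholds (assumptions, not recommendations).
const MIN_SIMILARITY = 0.7;
const MAX_AGE_DAYS = 365;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function auditCitations(
  answer: string,
  citations: AuditableCitation[],
  now: Date = new Date()
): CitationIssue[] {
  const issues: CitationIssue[] = [];

  // Missing sources: the answer contains no [n] markers at all.
  if (!/\[\d+\]/.test(answer)) {
    issues.push({ kind: 'missing-sources' });
  }

  for (const c of citations) {
    // Outdated sources: the document is older than the allowed age.
    if (c.lastUpdated) {
      const ageDays = (now.getTime() - c.lastUpdated.getTime()) / MS_PER_DAY;
      if (ageDays > MAX_AGE_DAYS) {
        issues.push({ kind: 'outdated-source', source: c.source });
      }
    }
    // Low relevance: the retrieval match was weak.
    if (c.similarity < MIN_SIMILARITY) {
      issues.push({ kind: 'low-relevance', source: c.source });
    }
  }

  return issues;
}
```

Logging the returned issues per request gives a rough signal of where retrieval or content freshness needs attention.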

Designing the Citation System

What to Include in Citations

Minimum Information:

Source document identifier (filename, URL)
Section or chunk title

Better:

Source + title
Relevance score
Brief snippet or preview

Best:

All of the above
Direct link to source
Timestamp/version info
Related chunks for context

Citation Data Model

interface Citation {
  // Identification
  id: string;
  source: string;           // e.g., "api-reference.md"
  title: string;            // e.g., "Authentication Endpoints"

  // Context
  snippet: string;          // Brief text preview
  chunkIndex: number;       // Position in document
  similarity: number;       // Relevance score (0-1)

  // Navigation
  url?: string;             // Link to full document
  anchor?: string;          // Link to specific section

  // Metadata
  lastUpdated?: Date;
}

interface AssistantMessage {
  content: string;
  citations: Citation[];
}
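
Because these interfaces exist only at compile time, a client that receives the JSON payload may want to check its shape at runtime before rendering. The sketch below is one way to do that, under stated assumptions: `isCitation` and `parseAssistantMessage` are hypothetical helper names, the `Citation` interface is repeated so the block is self-contained, and `lastUpdated` is left out because a `Date` arrives as a string after JSON serialization.

```typescript
// Repeats the lesson's Citation fields (minus lastUpdated) for self-containment.
interface Citation {
  id: string;
  source: string;
  title: string;
  snippet: string;
  chunkIndex: number;
  similarity: number;
  url?: string;
  anchor?: string;
}

// Type guard: true only if every required field has the expected type.
function isCitation(value: unknown): value is Citation {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === 'string' &&
    typeof v.source === 'string' &&
    typeof v.title === 'string' &&
    typeof v.snippet === 'string' &&
    typeof v.chunkIndex === 'number' &&
    typeof v.similarity === 'number' &&
    (v.url === undefined || typeof v.url === 'string') &&
    (v.anchor === undefined || typeof v.anchor === 'string')
  );
}

// Parses a response body; returns null instead of throwing on bad input.
function parseAssistantMessage(
  json: string
): { content: string; citations: Citation[] } | null {
  let data: unknown;
  try {
    data = JSON.parse(json);
  } catch {
    return null;
  }
  if (typeof data !== 'object' || data === null) return null;

  const record = data as Record<string, unknown>;
  const content = record.content;
  const rawCitations = record.citations;
  if (typeof content !== 'string' || !Array.isArray(rawCitations)) return null;

  // Reject the whole message if any entry is malformed.
  const citations = rawCitations.filter(isCitation);
  if (citations.length !== rawCitations.length) return null;

  return { content, citations };
}
```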

Implementing Attribution

Tracking Citations Through the Pipeline

// app/api/chat/route.ts

export async function POST(request: Request) {
  const { message } = await request.json();

  // 1. Retrieve documents with full metadata
  const queryEmbedding = await embedQuery(message);
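  const { data: docs, error } = await supabase.rpc('search_docs', {
    query_embedding: queryEmbedding,
    match_count: 5
  });

  // Stop here if retrieval failed, rather than answering without sources
  if (error || !docs) {
    return Response.json({ error: 'Retrieval failed' }, { status: 500 });
  }

  // 2. Build context and track citations
  const citations: Citation[] = docs.map((doc: any) => ({
    id: doc.id,
    source: doc.source,
    title: doc.title || doc.source,
    // Append an ellipsis only when the content was actually truncated
    snippet: doc.content.length > 200 ? doc.content.slice(0, 200) + '...' : doc.content,
    chunkIndex: doc.chunk_index,
    similarity: doc.similarity,
    url: `/docs/${doc.source}`
  }));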

  // 3. Build prompt with numbered references
  const context = docs.map((doc: any, i: number) =>
    `[${i + 1}] Source: ${doc.source}\n${doc.content}`
  ).join('\n\n---\n\n');

  // 4. Generate response
  const responseText = await generateResponse(context, message);

  // 5. Return response with citations
  return Response.json({
    content: responseText,
    citations
  });
}

Prompting for In-Line Citations

Instruct the LLM to reference sources in its response:

const systemInstruction = `You are a documentation assistant.

When answering questions:
1. Use ONLY information from the provided context
2. Reference sources using [1], [2], etc. matching the source numbers
3. If multiple sources support a point, cite all of them: [1][3]
4. At the end of your response, briefly list which sources you cited

Example format:
"To configure authentication, you need to set up API keys [1] and configure
the OAuth provider [2]..."

CONTEXT BELOW:`;
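
A prompt only requests citations; the model may still skip them or cite a number that was never provided. A minimal sketch of resolving the `[n]` markers in a response back to the retrieved sources follows. The `resolveCitations` name and its return shape are assumptions made for this example.

```typescript
interface ResolvedCitations<T> {
  // Sources the answer actually references, in order of first appearance
  cited: { marker: number; citation: T }[];
  // Marker numbers that do not correspond to any provided source
  invalid: number[];
}

function resolveCitations<T>(answer: string, citations: T[]): ResolvedCitations<T> {
  const seen: number[] = [];
  const cited: { marker: number; citation: T }[] = [];
  const invalid: number[] = [];

  const pattern = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const marker = Number(match[1]);
    if (seen.indexOf(marker) !== -1) continue; // count each marker once
    seen.push(marker);

    // Markers are 1-based; the citations array is 0-based.
    const citation = citations[marker - 1];
    if (citation !== undefined) {
      cited.push({ marker, citation });
    } else {
      invalid.push(marker);
    }
  }

  return { cited, invalid };
}
```

Returning only the `cited` entries keeps the source list aligned with what the answer claims, and a non-empty `invalid` list is a useful signal that the model referenced a source it was never given.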

Client-Side Citation Display

// components/Message.tsx

interface MessageProps {
  content: string;
  citations: Citation[];
}

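export function AssistantMessage({ content, citations }: MessageProps) {
  // Split on [n] markers; the capture group keeps each marker in the result.
  // Rendering the parts as React children (instead of injecting an HTML string)
  // lets React escape the model output, which avoids script injection.
  const parts = content.split(/(\[\d+\])/g);

  return (
    <div className="message assistant">
      <div className="content">
        {parts.map((part, i) => {
          const match = /^\[(\d+)\]$/.exec(part);
          const citation = match ? citations[Number(match[1]) - 1] : undefined;
          // Markers that match a known source become links; everything else stays plain text
          return citation ? (
            <a key={i} className="citation-marker" href={citation.url} target="_blank" rel="noopener">
              {part}
            </a>
          ) : (
            <span key={i}>{part}</span>
          );
        })}
      </div>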

      {citations.length > 0 && (
        <div className="citations">
          <h4>Sources</h4>
          <ul>
            {citations.map((citation, index) => (
              <li key={citation.id}>
                <span className="citation-number">[{index + 1}]</span>
                <a href={citation.url} target="_blank" rel="noopener">
                  {citation.title}
                </a>
                <span className="relevance">
                  {Math.round(citation.similarity * 100)}% match
                </span>
                <p className="snippet">{citation.snippet}</p>
              </li>
            ))}
          </ul>
        </div>
      )}
    </div>
  );
}

Interactive Citation Experience

Hover Preview:

function CitationTooltip({ citation }: { citation: Citation }) {
  return (
    <div className="citation-tooltip">
      <div className="tooltip-header">
        <strong>{citation.title}</strong>
        <span className="source">{citation.source}</span>
      </div>
      <p className="tooltip-snippet">{citation.snippet}</p>
      <div className="tooltip-footer">
        <span className="relevance">
          {Math.round(citation.similarity * 100)}% relevant
        </span>
        <a href={citation.url}>View full document →</a>
      </div>
    </div>
  );
}

Click to Expand:

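import { useState } from 'react';

// `index` is the citation's position in the citations array, so the label
// matches the [n] markers in the response text. chunkIndex is the chunk's
// position within its source document, which is a different number.
function ExpandableCitation({ citation, index }: { citation: Citation; index: number }) {
  const [expanded, setExpanded] = useState(false);

  return (
    <div className={`citation ${expanded ? 'expanded' : ''}`}>
      <button onClick={() => setExpanded(!expanded)}>
        [{index + 1}] {citation.title}
      </button>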

      {expanded && (
        <div className="citation-content">
          <blockquote>{citation.snippet}</blockquote>
          <div className="citation-meta">
            <span>Source: {citation.source}</span>
            <span>Relevance: {Math.round(citation.similarity * 100)}%</span>
          </div>
          <a href={citation.url} className="view-full">
            View full document
          </a>
        </div>
      )}
    </div>
  );
}

Advanced Attribution Patterns

Highlighting Cited Passages

Show which exact text the LLM used:

interface EnhancedCitation extends Citation {
  highlightStart: number;
  highlightEnd: number;
  citedText: string;
}

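function HighlightedSource({ citation }: { citation: EnhancedCitation }) {
  // highlightStart and highlightEnd are offsets into citation.snippet.
  // Slicing all three parts from the snippet keeps them aligned even if
  // citedText differs slightly (case, whitespace) from the stored text.
  const beforeHighlight = citation.snippet.slice(0, citation.highlightStart);
  const highlighted = citation.snippet.slice(citation.highlightStart, citation.highlightEnd);
  const afterHighlight = citation.snippet.slice(citation.highlightEnd);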

  return (
    <div className="highlighted-source">
      <span className="context">{beforeHighlight}</span>
      <mark className="cited">{highlighted}</mark>
      <span className="context">{afterHighlight}</span>
    </div>
  );
}
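
The component above takes the highlight offsets as given. One simple way to compute them, assuming the cited text is already available as a string (for example, a quote the model was asked to return), is a substring search within the snippet. The `locateCitedText` helper below is a hypothetical sketch of that approach, not the only option; fuzzy matching would be needed if the model paraphrases.

```typescript
interface HighlightRange {
  highlightStart: number;
  highlightEnd: number;
}

// Returns offsets of citedText within snippet, or null if it cannot be found.
function locateCitedText(snippet: string, citedText: string): HighlightRange | null {
  const needle = citedText.trim();
  if (needle.length === 0) return null;

  // Try an exact match first, then fall back to a case-insensitive match.
  // Caveat: for a few non-ASCII characters, lowercasing changes string length,
  // which would shift the offsets; this sketch does not handle that case.
  let start = snippet.indexOf(needle);
  if (start === -1) {
    start = snippet.toLowerCase().indexOf(needle.toLowerCase());
  }
  if (start === -1) return null;

  return { highlightStart: start, highlightEnd: start + needle.length };
}
```

When the function returns null, rendering the snippet without a highlight is a safer fallback than guessing at offsets.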

Citation Confidence Indicators

Visual indicators of citation quality:

function CitationBadge({ similarity }: { similarity: number }) {
  const confidence =
    similarity > 0.85 ? 'high' :
    similarity > 0.7 ? 'medium' : 'low';

  const labels = {
    high: 'Strong match',
    medium: 'Good match',
    low: 'Partial match'
  };

  return (
    <span className={`confidence-badge ${confidence}`}>
      {labels[confidence]}
    </span>
  );
}

No-Citation Fallback

Handle cases where the response isn't well-grounded:

function ResponseWithCitations({ content, citations }: MessageProps) {
  const citationRefs = content.match(/\[\d+\]/g) || [];
  const hasCitations = citationRefs.length > 0;

  return (
    <div className="response">
      <div className="content">{content}</div>

      {hasCitations ? (
        <CitationList citations={citations} />
      ) : (
        <div className="no-citations-warning">
          <span className="warning-icon">⚠️</span>
          <p>
            This response may be based on general knowledge rather than
            specific documentation. Please verify important information.
          </p>
        </div>
      )}
    </div>
  );
}

Citation Styling Best Practices

Visual Design Principles

Make citations discoverable but not distracting:

/* Subtle in-line citation markers */
.citation-marker {
  color: var(--primary-color);
  font-size: 0.85em;
  vertical-align: super;
  cursor: pointer;
  transition: background-color 0.2s;
}

.citation-marker:hover {
  background-color: var(--highlight-color);
  border-radius: 2px;
}

/* Citation list styling */
.citations-list {
  margin-top: 1rem;
  padding-top: 1rem;
  border-top: 1px solid var(--border-color);
}

.citation-item {
  display: flex;
  gap: 0.5rem;
  padding: 0.5rem;
  border-radius: 4px;
  margin-bottom: 0.5rem;
}

.citation-item:hover {
  background-color: var(--hover-background);
}

/* Relevance indicator */
.relevance-bar {
  width: 60px;
  height: 4px;
  background-color: var(--gray-200);
  border-radius: 2px;
  overflow: hidden;
}

.relevance-fill {
  height: 100%;
  background-color: var(--success-color);
  transition: width 0.3s;
}
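
The `.relevance-fill` rule above animates `width` but never sets it, so the width has to come from the component. A minimal sketch of a helper that turns a similarity score into a CSS width follows; `relevanceWidth` is a hypothetical name, and the clamp assumes scores are meant to fall in the 0-1 range.

```typescript
// Converts a similarity score into a CSS percentage width for .relevance-fill.
function relevanceWidth(similarity: number): string {
  // Treat NaN and Infinity as "no relevance" rather than emitting invalid CSS.
  if (!isFinite(similarity)) return '0%';

  // Clamp to 0-1 so out-of-range scores cannot overflow the bar.
  const clamped = Math.min(1, Math.max(0, similarity));
  return Math.round(clamped * 100) + '%';
}
```

In a React component this could be applied as an inline style on the fill element, for example `style={{ width: relevanceWidth(citation.similarity) }}`.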

Mobile Considerations

On mobile, citations need special handling:

@media (max-width: 768px) {
  .citation-tooltip {
    position: fixed;
    bottom: 0;
    left: 0;
    right: 0;
    max-height: 50vh;
    border-radius: 12px 12px 0 0;
  }

  .citations-list {
    max-height: 200px;
    overflow-y: auto;
  }
}

Summary

In this lesson, we built a complete attribution system:

Key Takeaways:

Attribution builds trust: Users can verify information themselves
Track citations through the pipeline: From retrieval to display
Prompt for in-line citations: Instruct the LLM to reference sources
Make citations interactive: Hover, click, and expand for details
Handle edge cases: No citations, low confidence, missing sources
Design for clarity: Discoverable but not distracting

Module 4 Complete

Congratulations! You've completed Module 4: Building Production-Ready Chat Architecture. You now understand:

Frontend-backend communication patterns
Security and Row-Level Security implementation
Attribution and citation systems

In Module 5, we'll explore Optimization and Advanced RAG Techniques—improving retrieval quality, handling conversations, and managing performance and cost.

"Trust is earned in drops and lost in buckets. Good attribution earns trust one citation at a time." — Unknown

