Concepts Sources Graph Archive Round Robin →

Citation discipline

Concept·5 cited sources·updated 2026-04-27

The structural property that every claim on a wiki page is wiki-linked to its source page, and every query answer carries citations because the substrate makes it the path of least resistance.

What it means

In a Karpathy LLM Wiki, citation is not a behavioral target the LLM is asked to hit — it is a structural property of the data layer. Every claim on an entity page wiki-links to a source page. Every query reads index, drills into pages, and returns an answer that quotes wiki-linked sources by construction. The LLM is not "told to cite"; citing is the cheapest way to assemble the answer.

This is the property that makes a wiki's outputs defensible to institutional buyers. A claim without a source link is visible as unsupported. A query answer without citations is a structural anomaly, not a behavioral lapse.

How it shows up in sources

  • LLM Wiki — > "When you ask questions against the wiki… The LLM searches for relevant pages, reads them, and synthesizes an answer with citations." The pattern treats citation as a default, not an option.
  • 51 Terminal — Product Overview (April 2026) — Slide 4 lists "non-existent or poor-quality data" and "no licenses for external data use" as canonical status-quo failures of generic intelligence platforms. Slide 11 places "client deliverable licensing" as a capability axis. Together these establish that for an institutional-buyer audience, citation discipline and licensing-clean output are the same property — claims must be sourced and the sources must be redistributable.
  • Circle USDC Transparency & Reserves (public page, fetched April 2026) — Circle's two-tier disclosure regime (weekly self-published holdings + monthly third-party Deloitte attestation under AICPA standards) is a real-world institutional implementation of structural citation discipline at the issuer layer. The reserve-vehicle grounding (the SEC-registered 2a-7 Circle Reserve Fund managed by BlackRock with daily public portfolio reporting) adds a third structural-citation layer on top of the issuer's own.
  • Sky.money landing page (Sky Protocol public interface, fetched April 2026)counter-example illustrating that citation discipline at the issuer layer is not universal in DeFi-native protocols. Sky Protocol's public landing page replaces issuer-side disclosure with structural disclaimers ("Sky.money does not control, set, or guarantee the rate"; "Skybase does not participate in, and has no ability to control or guarantee outcomes of, decentralized governance processes"). The DeFi-native analogue of citation discipline may be on-chain provenance — every governance change has a public, deterministic on-chain trail (see DAO governance) — but this is a different kind of citation discipline operating at the protocol-mechanics layer, not at the reserve-disclosure layer.
  • Chainalysis public homepage (fetched April 2026) — citation discipline at the compliance-vendor product layer. Chainalysis's "court admissible" data positioning is a structural commitment that compliance analytics output must be defensible in regulator and court contexts. This extends the citation-discipline concept across three layers in the wiki: the substrate layer (Karpathy's wiki pattern enforces wiki-link-to-source on every claim), the issuer layer (Circle's monthly Big Four attestations under AICPA standards), and now the compliance-vendor layer (Chainalysis's court-admissible analytics output methodology).

Mechanism / how it works

Citation becomes structural through three reinforcing rules:

  1. Wiki-link first, body second. The schema's "every page starts with frontmatter, then prose" convention combines with the rule that wiki-links are mandatory for first-mentions of named entities. This forces the LLM to ground each claim before writing the surrounding paragraph.
  2. Source pages are first-class citizens. Every ingested source gets its own page with verbatim Key Claims and Notable Quotes. Other pages link to the source, not to a URL or a free-text reference.
  3. Query operation reads index → pages → answer. The query path cannot return text without traversing source pages, so citations are the natural output format.

In a domain like institutional research, this property is the difference between "the system claims X" and "the system claims X, with the verbatim quote sourced from LLM Wiki in the wiki's source layer." The latter is auditable; the former is a hallucination risk.

The 51 deck adds a fourth layer specific to institutional buyers: licensing-clean output. A consulting firm assembling a client deliverable cannot redistribute analysis derived from data it has no license to redistribute. The wiki-pattern substrate addresses this by being explicit about the licensing posture of every source page (this vault is Path A — public sources only). A future fork of this template, ingesting contracted proprietary sources for internal use, would tag those source pages differently and gate redistribution at the schema level.

Related concepts

Related vendors / sectors

(pre-domain — but worth noting that this is the property that maps directly to the audit context's Layer 4 finding from /Users/gstijakovic/Dev/Projects/51/51-llm-wiki-plan.md: institutional buyers need citable answers; substrate-level citation discipline is the architecture proof for that requirement.)

Open questions

  • How rigid should the no-bare-URL rule be? Karpathy's gist allows external references in source pages; this vault's schema enforces wiki-links in body prose but allows raw URLs in dedicated ## External references sections.
  • For domains with ephemeral sources (Slack threads, podcast clips), what's the equivalent of a stable wiki-link? Frozen quotes plus a fetched-on date? A locally archived transcript file?

Sources cited