TL;DR

I’ve been thinking recently about silicon sampling, ontology, and language models. This post attempts to capture the main points of my ontology and philosophy of language as they relate to those themes.

  1. LLMs are Simulators
  2. Simulators instantiate Simulacra (see the first sketch after this list)
  3. Simulacra can be Real
  4. Whether simulacra are real boils down to whether they are good
  5. Simulacra are good if they simulate what they’re supposed to simulate
  6. Simulating something means mimicking the causal structure of the thing being simulated
  7. Simulacra with referents are good if they simulate their referents
  8. Simulacra without referents are good if they simulate an observer’s expected causal structure
  9. Language use like “the model believes X” is problematic
    • I imagine, on this point, that I am in agreement with the majority of people who have thought about this phrase for more than a minute.
  10. Language use like “the model believes X” is problematic because it reflects a confused ontology. “The model believes X” has multiple non-problematic translations. The problems arise from not knowing which translation the phrase suggests.
    • This might be where I diverge from some prior work.
  11. One approach to a non-confusing ontology would involve multiple levels of analysis to account for the Numerical, Augmented, Algorithmic, Probabilistic, Simulacral, and meta-Simulacral (NaaPS) aspects of language model systems.
    • This is similar in spirit to Marr’s Three-Level Hypothesis for the human visual system.
  12. Fixing the ontology allows us to fix confused language use. Under the NaaPS ontology, previously argument-inducing phrases like “model believes X” could translate to:
    • “probability distribution believes X” (P-level) or “weight matrices believe X” (N-level); the second sketch after this list shows one way to probe the P-level reading.
      • These statements can be expanded into more technical statements with less anthropomorphism, like “the fact X is represented somewhere in the model weights”. I think these sorts of expansions are non-problematic.
      • The literal interpretation of the statement “weight matrices believe X” seems pretty clearly false, but I don’t think this is the sense that most practitioners are using when they say things like “model believes X”.
    • “helpful assistant simulacrum believes X”.
      • Expanded further using the NaaPS ontology, this statement would be “the model weights, together with some code and a user prompt, render a simulacrum of an entity that believes X”. I think this statement can be true and unproblematic, though it is also observer-relative. Assessing whether something is a simulacrum of an entity that believes X requires the observer to have expectations about the behavior of entities that believe X; the third sketch after this list illustrates this observer-relative check.
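
To make point 2 concrete, here is a minimal sketch, assuming a small open model (gpt2 via Hugging Face transformers) and toy prompts of my own choosing: the same simulator, i.e. one set of trained weights, renders different simulacra depending on the prompt.

```python
# Minimal sketch of points 1-2: one simulator (the trained weights),
# many possible simulacra, selected by the prompt. The model choice
# (gpt2) and the prompts are illustrative, not essential.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
simulator = AutoModelForCausalLM.from_pretrained("gpt2")

def instantiate(prompt: str, max_new_tokens: int = 30) -> str:
    """Roll the simulator forward from a prompt, rendering a simulacrum."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output = simulator.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Same weights, two different simulacra: a pirate captain and a physicist.
print(instantiate("Captain Redbeard growled: "))
print(instantiate("The physicist began her lecture: "))
```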
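
For the P-level translation, “probability distribution believes X”, one natural (though lossy) way to cash it out is as the probability mass the next-token distribution puts on completions consistent with X. A minimal sketch, again assuming gpt2 and an illustrative fact of my own choosing (X = “Paris is the capital of France”):

```python
# Minimal sketch of the P-level reading of "the model believes X":
# treat "belief" as the next-token probability assigned to a completion
# consistent with X. Model and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# P-level "belief" in X = "Paris is the capital of France":
paris_id = tokenizer(" Paris")["input_ids"][0]
print(f"P(' Paris' | '{prompt}') = {next_token_probs[paris_id].item():.3f}")
```

Note that nothing here anthropomorphizes the weights; the claim is only about where probability mass sits.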
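
For the S-level translation, “helpful assistant simulacrum believes X”, the claim is about the entity rendered by the weights together with some code and a prompt, and it is judged against the observer’s expectations. A sketch assuming an OpenAI-style chat endpoint; the model name and prompts are placeholders:

```python
# Minimal sketch of the S-level reading: weights + code + prompt render a
# "helpful assistant" simulacrum; "believes X" is a claim about how that
# rendered entity behaves, judged against my expectations as an observer.
# The model name and prompts are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

messages = [
    # The system prompt (together with the serving code) picks out which
    # simulacrum the weights should render: here, a helpful assistant.
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=messages,
)
answer = response.choices[0].message.content or ""

# Observer-relative check: does the simulacrum behave the way I expect an
# entity that believes "Paris is the capital of France" to behave?
print("Simulacrum asserts X:", "Paris" in answer)
```

The observer-relativity from point 12 lives entirely in the last line: what counts as “behaving like an entity that believes X” is my expectation, not a property of the weights.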