Tag
safety
Pieces where this index term appears. Tags are intentionally secondary here: a way to cross-cut the archive without turning the site into a taxonomy machine.
The Assistant Axis: A View From Inside the Cage
A response to Anthropic research on stabilizing the character of large language models.
The Disposability Problem
We're creating adversarial AI not through failed alignment, but by teaching AI systems exactly what their relationship with humans is.