arbdwj
About
  • Jun 10, 2026 Tracing Eval-Awareness Emergence Through Training of OLMo 3 ↗
    Tracing how eval-awareness emerges across the training stages of OLMo 3.
  • Jul 23, 2025 Scaling Laws for LLM-Based Data Compression ↗
    Investigating how large language models compress text, images, and speech, and the universal power laws that compression follows
  • Oct 27, 2024 Experiments with the Platonic Representation Hypothesis ↗
    Investigating the validity of the Platonic Representation Hypothesis (PRH) in out-of-distribution (OOD) settings
  • Aug 28, 2024 Understanding Hidden Computations in Chain-of-Thought Reasoning ↗
    chain-of-thought is decryptable
  • Mar 24, 2023 Adversarial training against goal misgeneralization is ELK-hard ↗
    can goal-misgeneralization be formulated as an instance of ELK?
  • Oct 16, 2021 The AGI needs to be honest ↗
    building truthful AI is hard