Oliver Sieberling

osieberl@mit.edu

I am a first-year PhD student at MIT, advised by Yoon Kim. My research focuses on the pretraining of large neural networks. I am particularly interested in efficient sequence modeling, scalable architectures, and hardware-algorithm co-design.

I received my BSc in Computer Science from ETH Zurich in 2025. I also spent a semester abroad at Princeton University. For my bachelor’s thesis, I worked with Johannes Lengler on discrete stochastic processes. During my undergraduate studies, I also worked with Dan Alistarh on LLM efficiency and with Johannes von Oswald on linear-time sequence modeling.

publications

  1. ICLR
    MesaNet: Sequence Modeling by Locally Optimal Test-Time Training
    Johannes von Oswald*, Nino Scherrer*, Seijin Kobayashi, and 13 more authors
    In International Conference on Learning Representations (ICLR), 2026
  2. NeurIPS
    Quartet: Native FP4 Training Can Be Optimal for Large Language Models
    Roberto L. Castro*, Andrei Panferov*, Soroush Tabesh, Oliver Sieberling, Jiale Chen, Mahdi Nikdan, Saleh Ashkboos, and Dan Alistarh
    In Advances in Neural Information Processing Systems (NeurIPS), 2025
  3. ICML
    EvoPress: Accurate Dynamic Model Compression via Evolutionary Search
    Oliver Sieberling, Denis Kuznedelev, Eldar Kurtic, and Dan Alistarh
    In International Conference on Machine Learning (ICML), 2025
  4. Preprint
    DarwinLM: Evolutionary Structured Pruning of Large Language Models
    Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, and Dan Alistarh
    Preprint, 2025
  5. Algorithmica
    Plus Strategies Are Exponentially Slower for Planted Optima of Random Height
    Johannes Lenglerαβ, Leon Schillerαβ, and Oliver Sieberlingαβ
    Algorithmica, 2026. Conference version at GECCO 2024
  6. SN Comp. Sci.
    Hardest Monotone Functions for Evolutionary Algorithms
    Marc Kaufmannαβ, Maxime Larcherαβ, Johannes Lenglerαβ, and Oliver Sieberlingαβ
    SN Computer Science, 2025. Conference version at EvoCOP 2024 (Best Paper Award Nomination)

* equal contribution  ·  αβ alphabetical ordering