
Oğuz Kaan Yüksel
I am a fourth-year Ph.D. candidate at the Theory of Machine Learning Lab, EPFL, advised by Nicolas Flammarion. I focus on understanding AI from first principles, using mathematical modeling and rigorous theory to explain how and why modern learning systems work. My work spans generalization, optimization, and the structure of data, with a particular interest in language models and autoregressive processes.
Background. M.Sc. in Data Science (minor in Mathematics) and Ph.D. at EPFL. B.Sc. in Computer Engineering & Mathematics at Boğaziçi University.
Selected Publications
News
Two papers, [Incremental Learning of Sparse Attention Patterns in Transformers](/publications/yuksel2026incremental) and [Induction Heads Interpolate N-Grams](/publications/dangelo2026induction), have been accepted to ICML 2026.
Presented a talk on [Incremental Learning of Sparse Attention Patterns in Transformers](/slides/prigm-2025/) at PriGM, EurIPS 2025.
Presented two posters, [Incremental Learning of Sparse Attention Patterns in Transformers](/publications/yuksel2025incremental) and [Generalization Bounds for Autoregressive Processes and In-Context Learning](/publications/yuksel2025generalization), at PriGM, EurIPS 2025.
Presented [On the Sample Complexity of Next-Token Prediction](/publications/yuksel2025sample) at AISTATS 2025.
Presented [Long-Context Linear System Identification](/publications/yuksel2025long) at ICLR 2025.
Presented [First-order ANIL provably learns representations despite overparametrization](/publications/yuksel2024first) at ICLR 2024.
Presented [First-order ANIL provably learns representations despite overparametrization](/publications/yuksel2024first) at NeurIPS 2023.