OpenAI
San Francisco, CA, USA
About the Team The scaling team team builds foundational components that power OpenAI's ML training infrastructure. We focus on developing scalable, robust, and high-performance systems that maximize the productivity of our researchers and hardware. Our mission is to accelerate progress toward AGI by enabling the fastest iteration cycles and highest throughput for model development at scale. About the Role As a software engineer on the Scaling team, you'll help build and optimize the low-level stack that orchestrates computation and data movement across OpenAI's supercomputing clusters. Your work will involve designing high-performance runtimes, building custom kernels, contributing to compiler infrastructure, and developing scalable simulation systems to validate and optimize distributed training workloads. You will work at the intersection of systems programming, ML infrastructure, and high-performance computing, helping to create both ergonomic developer APIs and highly...