
Senior Software Engineer - Kubernetes AI Scheduler
Nvidia
Summary
Senior Software Engineer for KAI-Scheduler, an open-source CNCF project specializing in AI workload scheduling on Kubernetes. You will contribute to the core development of KAI, focusing on scalability, AI workload optimization, and Kubernetes internals. This role involves designing and implementing efficient scheduling algorithms, collaborating with the community, and translating user feedback into product improvements.
Details
- Experience Required: 4+ years
- Posted: ~Jun 25, 2026
Description
KAI-Scheduler is an open-source CNCF project focused on delivering the best scheduling experience for AI workloads on Kubernetes. Adopted by AI frontier labs, leading enterprises, and some of the largest AI infrastructure deployments in the world, KAI helps organizations efficiently run AI at scale.
KAI is designed to support any AI infrastructure—from the latest GPU and networking technologies to future hardware generations—while maximizing performance, utilization, and scalability. As a Senior Software Engineer for KAI, you will help build the future of AI scheduling in the Kubernetes ecosystem, working on challenging problems spanning workload scheduling, Kubernetes internals, and large-scale AI infrastructure.
What you’ll be doing:
- Develop clean, maintainable, and well-tested software in Go.
- Design and implement scalability improvements for KAI, helping it operate efficiently in massive-scale deployments (thousands of nodes) while addressing Kubernetes scaling constraints and bottlenecks.
- Apply strong algorithmic thinking to solve complex AI workload scheduling and placement challenges, balancing performance, fairness, cluster utilization, topology constraints, and scalability.
- Conduct code and design reviews to uphold high-quality standards and mentor team members.
- Work closely with contributors, users, and customers, helping translate feedback from production deployments into product and engineering improvements.
- Collaborate with related upstream projects (schedulers, AI frameworks, cluster autoscalers, Kubernetes SIGs/WGs, etc.) and contribute to community and ecosystem discussions.
What we need to see:
- B.Sc. or M.Sc. in Computer Science or a related field, or equivalent experience.
- 8+ years of experience in backend software development, including system design and architecture.
- 4+ years of advanced Kubernetes development experience, including designing and implementing CRDs and controllers, with deep expertise in Kubernetes internals, networking, storage, and cluster architecture.
- Strong algorithmic skills with experience tackling complex optimization and distributed systems challenges.
- Strong technical skills and a proven ability to collaborate with and mentor other engineers.
We are an equal-opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
