Full Stack Software Engineer - ML Compute Capacity
Apple | Santa Clara, California, United States | $181,100 - $318,400 | 5+ years
Scaling machine learning workloads across thousands of accelerators creates challenges that few engineers ever encounter. In Apple’s Machine Learning Platform Technologies organization, we build the infrastructure that powers large-scale ML training and inference workloads, bringing together expertise in distributed systems, machine learning infrastructure, and high-performance computing.
As a senior engineer on the ML Compute Capacity team, you will design, build, and operate the production systems that ensure compute resources are optimally distributed throughout the company. You'll work across the stack — from data pipelines and backend services to APIs and interactive frontends — developing telemetry systems, optimization algorithms, policies, and intuitive tools for managing demand and improving efficiency across Apple's largest accelerator fleet. Our small, nimble team works in a high-autonomy, fast-paced environment, and we're passionate about digging into data patterns, laying out the performance characteristics of an entire distributed system, and knowledge sharing. If the opportunity to own and operate services that scale, stay highly available, and "just work" excites you, then please reach out to us!
- Build and operate demand and capacity planning systems
- Build data pipelines and telemetry systems that ingest, normalize, and serve fleet-wide utilization and cost data across multi-tenant and heterogeneous fleets
- Develop observability infrastructure — monitoring, alerting, and dashboards — that surfaces real-time fleet health and efficiency signals
- Drive innovation in forecasting, optimization, and supply chain management tooling that works at scale
- Build end-to-end tooling — from data models and APIs to interactive dashboards — that distills complex data into actionable insights for leadership
- Build self-service platforms with well-defined schema contracts and APIs, enabling ML teams, infrastructure engineers, and finance to balance usability, utilization, and costs
- Engage cross-functionally with finance analysts, supply chain managers, data center operations, compute infrastructure engineers, and more
- Support the team through code reviews and knowledge sharing
- 5+ years of experience in relevant areas
- Proficiency in Python for production backend and data engineering work
- Experience building data pipelines and crafting robust queries over large-scale, multi-source data (e.g., Trino, PostgreSQL, Elasticsearch)
- Experience designing and building RESTful APIs and working with cloud storage technologies
- Experience with modern web frameworks like React
- Experience with observability tools (e.g., Prometheus, Grafana) or equivalent monitoring systems
- Excellent problem-framing and problem-solving skills
- Strong CS fundamentals
- Bachelor's degree or higher in Engineering, Mathematics, Economics, or a related quantitative field
- Experience operating Kubernetes at production scale — including scheduling, resource management, and cluster debugging
- Familiarity with accelerator utilization patterns across ML training and inference
- Strong interest in capacity planning, cost attribution, or FinOps systems