Software Engineer, Agentic Evaluation
Apple | Cupertino, California, United States | $147,400 - $272,100 | 3+ years
We're a team at Apple building software that helps shape the next generation of Siri and AI-powered experiences. The work spans frameworks, tooling, and infrastructure, including a strong focus on how we evaluate and measure the quality of what we ship. We can't say much about specifics, but the problems are new, the surface area is large, and the reach is enormous. We're a collaborative, humble, and curious group that learns from each other and builds together.
You'll work alongside engineers, designers, and researchers to design and build software end-to-end — from early prototypes to production systems running on real devices. You'll have meaningful autonomy in how you get there, and the opportunity to shape both what we build and how we know it's working. The work is hard enough to stretch you, and the team is generous enough to support you while you grow.
Responsibilities
- Designing and building software that powers new Siri and AI experiences
- Prototyping quickly to explore what works — then making it real
- Partnering closely with ML, Design, and other engineering teams to shape features, not just implement specs
- Contributing to architecture decisions and helping set the technical bar for the team
- Investing in the quality, reliability, and evaluability of what you ship — through testing, tooling, and infrastructure the team can trust
Minimum Qualifications
- 3+ years of software engineering experience with strong CS fundamentals
- Proficiency in Swift, Objective-C, Python, or another modern language — strong engineers in adjacent stacks will pick up the rest
- You've shipped software that people used, and you're ready to own bigger pieces end-to-end
- Expert in using generative AI models for coding — you've integrated tools like Claude, Cursor, or Codex deeply into how you work, and have a point of view on where they help and where they don't
- An interest in software evaluation and quality — you care about whether what you build actually works, and want to be on a team that takes measurement seriously
- Comfortable with ambiguity; when you're stuck, you dig in
- Strong communication and a track record of working well across teams
- BS in Computer Science or equivalent experience
Preferred Qualifications
- Experience in one or more iOS/macOS domains: system services, UI frameworks, concurrent application architecture, or performance
- Background building developer tools, test infrastructure, evaluation systems, or data pipelines
- Familiarity with how AI systems are evaluated — offline eval, human eval, A/B, or model-graded approaches
- Proficiency with one or more scripting languages (Python, Ruby, Bash)
- You seek out feedback and learn fast from those around you
- Close to the frontier — curious about new models and techniques, and have a point of view on where human-AI interaction is headed