At IBM Software, we transform client challenges into solutions, building the world's leading AI-powered, cloud-native products that shape the future of business and society. We are building the next generation of watsonx.data—a GPU-accelerated, open data lakehouse engineered to deliver category-leading price-performance for analytics and AI workloads. Working in Software means joining a team fueled by curiosity and collaboration, where you'll design query planning pipelines, cost-based optimization logic, and scheduling strategies that drive the intelligence behind how queries are planned, optimized, and executed across the platform. With a culture that values innovation, growth, and continuous learning, IBM Software places you at the heart of IBM's product and technology landscape. Here, you'll have the tools and opportunities to advance your career while creating software that changes the world.
As a Software Engineer focused on query engine planning, optimization, and scheduling, you will design, develop, test, and deliver the planning pipelines, cost-based optimization logic, and scheduling strategies that power watsonx.data's analytical engine.
You will work in an Agile, collaborative environment to understand stakeholder requirements and directly influence query latency, resource efficiency, and high-concurrency performance at petabyte scale.
Your primary responsibilities will include:
- Build Planning & Logical Optimization: Design logical plan generation and cost-based optimization (cardinality estimation, predicate pushdown, join reordering, partition and projection pruning, and subquery decorrelation) for complex query patterns.
- Design Physical Execution & Scheduling: Translate logical plans into physical plans accounting for data locality, storage format, and CPU/GPU capabilities; implement adaptive execution and scheduling that balances throughput, latency, and fairness.
- Contribute to CI/CD & Tooling: Contribute to the automated CI/CD pipeline and build tooling for plan visualization, cost-model inspection, and optimizer decision tracing.
- Debug & Tune: Profile execution plans, unit-test fixes for planning regressions in production and CI, and optimize memory management (spill-to-disk, buffer pools, operator budgeting).
- Collaborate in Agile Environment: Partner with storage, GPU acceleration, catalog, and AI/ML teams, contribute to design reviews and RFCs with measurable acceptance criteria, and document optimization rules and cost-model assumptions.
- Query Engine Experience: 6+ years of professional software engineering, including at least 2 years in query engine development, database internals, or distributed query processing.
- Systems Programming Proficiency: Strong skills in Java, Scala, C++, or a comparable systems language, comfortable with performance-critical, production-grade code.
- Optimization Depth: Hands-on experience with plan generation, logical optimization (predicate pushdown, join reordering, partition pruning), and cost-based optimization, plus the optimizer internals of a modern engine (Presto/Trino, Spark, DuckDB, ClickHouse).
- Execution Models & AQE: Solid grasp of relational algebra, query rewriting, execution-model trade-offs (volcano, vectorized, compiled), adaptive query execution, and profiling tools (async-profiler, perf, flamegraphs).
- Communication & Education: Clear written communication: able to document optimizer design decisions and explain trade-offs to engineers and leadership; Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
- Optimizer & GPU Planning: Experience maintaining a production query optimizer, GPU-aware planning or hardware-accelerated execution (RAPIDS/cuDF, Velox), and statistics infrastructure (histograms, NDV estimation) integrated into cost models.
- Learned Optimization & OSS: Exposure to ML-guided optimization or learned cardinality estimation, contributions to open source (Trino, Spark, Calcite, Arrow, Velox, DuckDB), and familiarity with open table formats and their impact on partition pruning and scan optimization.

United States (Multiple Cities) | Software Engineering | Hybrid | Professional