Extensive experience architecting, deploying, and supporting Linux platforms used for complex engineering applications
10+ years of experience administering large-scale Linux environments
Strong expertise with RHEL, CentOS, or other enterprise Linux distributions
Deep knowledge of:
NFS
LDAP and Active Directory integration
DNS and DHCP
SSH, PAM, and security hardening
Filesystem permissions, ACLs, groups, and identity management
Deep expertise in enterprise storage infrastructure, including:
NAS and SAN topologies
Management of tens to hundreds of petabytes of storage
Data backup, disaster recovery, and business continuity strategies for large volumes of engineering data
Experience supporting:
5,000+ servers or equivalent compute scale
Multi-site or global engineering environments
High-availability infrastructure
Extensive experience operating and scaling distributed compute environments that support semiconductor design workloads
Strong working knowledge of IBM Spectrum LSF, including:
Job scheduling and prioritization
Queue architecture
Resource allocation policies
Fair-share scheduling
Compute farm optimization
Regression infrastructure
EDA workload tuning
Distributed job execution
License-aware scheduling
Experience with:
Compute farms running millions of jobs per day (preferred)
CPU and memory optimization
Farm utilization analytics
Capacity forecasting
Strong understanding of semiconductor engineering workflows and their infrastructure dependencies
Familiarity with:
RTL-to-GDSII flows
Verification regressions
Simulation farms
Synthesis and place-and-route infrastructure
EDA license management
Cadence, Synopsys, and Mentor (now Siemens EDA) environments
Tapeout-critical infrastructure reliability
Clear understanding of:
The impact of downtime on tapeout schedules
The importance of deterministic engineering environments
Reproducibility requirements
Common performance bottlenecks in EDA workflows
Deep operational knowledge of Perforce administration at enterprise scale, including:
Perforce database management and optimization
Backup and recovery strategies
Replication and edge servers
Large binary repository performance
Access controls and permissions
Additional experience preferred with:
Git, GitLab, Gerrit, and GitHub
CI/CD for hardware development flows
Artifact management systems
Setting strategy and direction for application and AI tooling
Strong background in infrastructure automation, platform engineering, and observability
Experience with technologies such as:
Puppet
Python
Shell scripting
Ansible
Terraform
Kubernetes (preferred)
Experience with monitoring and observability platforms such as:
Prometheus
Grafana
Splunk
ELK
Strong understanding of:
Infrastructure as Code
Automated provisioning
Configuration management
Self-service engineering platforms
12+ years of experience leading infrastructure or platform engineering teams
Experience managing senior technical teams, including:
Senior Linux administrators
HPC engineers
Storage engineers
DevOps or platform teams
Proven ability to build and scale organizations that support 1,000+ engineers in demanding technical environments
Demonstrated success leading:
24x7 production operations
Incident management and escalation processes
Root cause analysis and corrective action planning
Change management programs
SLA and SLO ownership
Disaster recovery planning
Security and compliance initiatives
Strong ability to work effectively across:
Silicon engineering
CAD and EDA teams
IT and security
Program management
Executive leadership
Proven ability to translate:
Engineering pain points into infrastructure strategy
Business priorities into operational execution
A leadership style that balances strategic vision with operational rigor
Strong communication skills and the ability to influence across technical and executive audiences
A hands-on mindset with the judgment to prioritize reliability, scale, and engineer productivity in a fast-paced semiconductor environment