Senior Technical Program Manager, Infrastructure Reliability and Quality (IRQ)

Amazon

Other Engineering, IT, Operations, Quality Assurance

Herndon, VA, USA

Posted on Apr 22, 2026

Description

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

The Data Center Infrastructure Reliability and Quality (IRQ) team owns the quality and reliability of critical infrastructure equipment throughout its lifecycle. This includes leading Design for Reliability (DFR) and Design for Quality (DFQ) efforts for AWS Infra New Product Development (NPD), and sustaining the existing AWS critical infra equipment fleet by identifying systemic equipment issues and driving root cause analysis (RCA) and corrective action to mitigate risk in the fleet.

As a Senior Technical Program Manager for Quality and Reliability for the Mechanical and Power Generation Products, you should be an exceptionally strong communicator, both in writing and verbally. You will lead multi-discipline, highly technical program teams. You should have experience driving quality and reliability initiatives in a complex engineering environment and will have worked as a technical project manager on increasingly complex projects. Your experience includes data center infrastructure technologies including HVAC, power distribution systems, security devices and controls. You will generate and maintain enhanced reporting, meaningful KPIs, and process/automation improvements to ensure team efficiency and to give critical stakeholders and leadership visibility into the portfolio's efforts.

You will partner regularly with other engineering teams, the purchasing team, and project execution/delivery teams. You will make strategic decisions regarding specific projects and overall program direction. You will be capable of connecting long-term strategies with the rapid growth patterns of AWS, as well as guiding specific operational needs of the business.

You must be adept at identifying and communicating upcoming risks, issues, and bottlenecks, and be instrumental in resolving those issues, often across multiple departmental boundaries. You must possess strong organization, communication, and team-building skills that foster robust working relationships with both internal and external stakeholders.



Key job responsibilities
This TPM will lead end-to-end quality and reliability programs for mission-critical mechanical and power generation products at AWS data centers — orchestrating complex product strategies, owning program-level metrics and executive reporting, and building compelling cases for internal and external leadership where both speed and quality are non-negotiable.

- Own and execute end-to-end quality and reliability qualification and sustaining programs for highly complex, mission-critical mechanical and power generation products
- Learn and understand the AWS data center lifecycle specific to Data Center Engineering, globally
- Project manage the IRQ Engineering NPD portfolio across scope, schedule, budget, resources, quality, risk, and reporting
- Interface with Quality and Reliability Engineering teams to promote and standardize work-product inputs/outputs globally
- Sync with TPMs and Product Managers to align reliability and quality deliverables to major NPD program milestones
- Develop technical reliability and quality evaluation plans for mechanical and power generation product portfolios with engineering teams
- Work with external partners (reliability labs, OEMs, CMFs) to facilitate testing, qualification, and analysis
- Own product reliability and quality readiness at each NPD gate
- Own supplier readiness for NPDs
- Estimate and manage budget requirements per project; work with NPD teams to allocate resources
- Manage project timelines and report progress bi-weekly to executive leadership and key partners
- Build and present program-level metrics and status to executives on a regular cadence
- Develop business cases to align internal and external leadership on the most effective path for quality and reliability program management
- Drive internal team members and remove roadblocks for product qualification
- Build strong end-to-end team processes and mechanisms

About the team
Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences
Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.

Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness.

Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.