Sr. SDE (L6), ML Ops

Amazon

Software Engineering, Operations, Data Science
Seattle, WA, USA
Posted on Oct 24, 2024

DESCRIPTION

The AWS Infrastructure Services (AIS) team is the backbone of AWS, managing the design, planning, delivery, and operation of our global infrastructure. Essentially, we’re the ones who keep the cloud running. Within AIS, the Science team takes on the exciting challenge of using big data and machine learning to optimize power and cooling, the most critical resources in our data centers. In short, we ensure maximum efficiency while preventing overheating and power outages. Our work helps shape future data center designs and drives exceptional cost savings for AWS customers.

As a Software Engineer on the AIS Science team, you will collaborate with scientists, program managers, and data engineers to build, operationalize, and scale machine learning workflows and platform services. Your work will directly impact how server demand is placed by modeling power and cooling load across AWS's global data centers.

You will play a critical role in building infrastructure meant to support all phases of ML models, from R&D to production, including model retraining and iteration. Our team tackles complex challenges in data processing, model hosting, and metric monitoring. As our responsibilities grow and the number of models we manage increases, we’re seeking an innovative senior engineer with a passion for data, machine learning, and MLOps to join our mission-driven team!

If you're passionate about machine learning and model operations, enjoy working in a collaborative and dynamic team that values work-life balance, and want to make a lasting impact on AWS infrastructure worldwide, this is your opportunity. Come join us on this exciting journey!


Key job responsibilities
In this role, you will leverage your engineering background and expertise in ML to lead the development of platforms for deploying, productionizing, and scaling machine learning models, with a focus on variant retraining and ongoing model monitoring.

A day in the life
- Lead the design and implementation of a stable and efficient training and inference infrastructure that scales to support a variety of different machine learning models.
- Collaborate with tenured applied scientists and data engineers to develop improved training and inference infrastructure that accelerates innovation and promotes best-practice model scoring and model monitoring.
- Quickly learn the ins and outs of the distributed workflows for AWS infrastructure rack planning and forecasting, and engineer solutions to make these systems more robust, fault-tolerant, and efficient across input and output orgs.

BASIC QUALIFICATIONS

- 5+ years of non-internship professional software development experience
- 5+ years of experience programming in at least one software programming language
- 5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems
- Experience as a mentor, tech lead or leading an engineering team

PREFERRED QUALIFICATIONS

- 5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
- Master's degree in machine learning or equivalent
- Experience developing state-of-the-art, best-practice MLOps tooling and frameworks

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $151,300/year in our lowest geographic market up to $261,500/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits. This position will remain posted until filled. Applicants should apply via our internal or external career site.