Senior Applied Scientist - Systems for ML Inference and Training Optimization, Deep Science for Systems and Services

Amazon

Software Engineering, Data Science
Baden-Württemberg, Germany
Posted on Nov 13, 2025

Description

We are seeking an exceptional Senior Applied Scientist specializing in ML Systems, training, and inference optimization to join DS3. This role requires deep expertise in performance engineering, kernel development, distributed systems optimization, and AI workload optimization across heterogeneous compute platforms. You will invent and implement novel optimization techniques that directly impact the performance and cost-efficiency of ML training and inference for AWS customers worldwide.
As a Senior Applied Scientist in DS3, you will work at the lowest levels of the software stack—writing custom CUDA kernels, optimizing PTX assembly, developing high-performance operators for GPUs and AWS Neuron, designing efficient communication patterns for multi-GPU and multi-node training, and inventing new algorithmic approaches to accelerate transformer models and emerging architectures. Your work will span from single-node inference optimization to large-scale distributed training systems, influencing the design of AWS training and inference services and setting new standards for ML systems performance across the industry.


Deep Science for Systems and Services (DS3) is part of AWS Utility Computing (UC), which provides product innovations ranging from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new products that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.


Key job responsibilities
Systems-Level Scientific Innovation: Design and implement novel kernel-level optimizations for ML inference and training workloads, including custom CUDA kernels, PTX-level optimizations, and cross-platform acceleration for CUDA and AWS Neuron SDK.
Performance Engineering Leadership: Drive 2-10× performance improvements in latency, throughput, and memory efficiency for production ML inference & training systems through systematic profiling, analysis, and optimization.
Cross-Platform Optimization: Develop and port high-performance ML operators across GPUs, AWS Inferentia/Trainium, and emerging AI accelerators, ensuring optimal performance on each platform.
Product-Level Impact: Lead the design, implementation, and delivery of scientifically complex optimization solutions that directly improve customer experience and reduce AWS operational costs at scale.
Scientific Rigor: Produce technical documentation and internal research reports demonstrating the correctness, efficiency, and scalability of your optimizations. Contribute to external publications when aligned with business needs.
Technical Leadership: Influence your team's technical direction and scientific roadmap. Build consensus across engineering and science teams on optimization strategies and architectural decisions.
Mentorship & Knowledge Sharing: Actively mentor junior scientists and engineers on performance engineering best practices, kernel development, and systems-level optimization techniques.

About the team
Deep Science for Systems and Services (DS3) is a science organization within AWS Compute & ML Services focused on advancing AI/ML technologies at the systems level. Our team works at the intersection of machine learning and high-performance computing, developing optimizations for large model inference across diverse hardware platforms. We push the boundaries of what's possible in ML inference performance, working directly with CUDA, AWS Neuron, and other low-level compute abstractions to deliver industry-leading latency, throughput, and cost-performance for AWS customers deploying AI at scale.

About AWS

Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Inclusive Team Culture
AWS values curiosity and connection. Our employee-led and company-sponsored affinity groups promote inclusion and empower our people to take pride in what makes us unique. Our inclusion events foster stronger, more collaborative teams. Our continual innovation is fueled by the bold ideas, fresh perspectives, and passionate voices our teams bring to everything we do.

Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.

Work/Life Balance
We value work-life harmony. Achieving success at work should never require sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.