Amazon
AGI Data Services strives to be best in class at acquiring and creating ground-truth data, with the highest standards of privacy and trust, to power the best AI models on Earth.
We are seeking a Senior Software Development Engineer (Sr. SDE) who is passionate about Generative AI and has strong engineering fundamentals to own and accelerate the next generation of GenAI-powered tooling within AGI Data Services. The Sr. SDE will design, build, and maintain LLM-as-a-Judge evaluation pipelines that leverage large language models to assess data quality at scale — including judge architectures, evaluation rubrics, scoring models, and calibration mechanisms that align with the standards set by core scientist teams developing Amazon Nova models. The Sr. SDE will also design and build GenAI-powered workflow tools — such as conversational diagnostic agents, automated quality assessment systems, and guided remediation workflows — that streamline data collection and quality assurance processes, enabling cross-functional teams to rapidly identify issues, reduce resolution time, and continuously improve data throughput.
The Sr. SDE's work will directly improve Amazon Nova models. Our team has built a strong foundation of GenAI-powered engineering practices — this senior role will accelerate and scale that momentum. This role offers direct visibility to VP and SVP leadership.
Key job responsibilities
The Sr. SDE will own the LLM-as-a-Judge evaluation pipeline — designing, building, and scaling automated evaluation systems that leverage large language models to assess data quality. The Sr. SDE will architect judge pipelines, develop evaluation rubrics and scoring frameworks, build calibration and agreement mechanisms, and ensure judge outputs align with quality standards defined by core scientist teams.
The Sr. SDE will design and build GenAI-powered diagnostic and workflow tools, including conversational troubleshooting agents, automated quality assessment tools, guided remediation systems, and workflow copilots. The Sr. SDE will leverage and extend agent orchestration frameworks such as LangChain, LangGraph, or Amazon Bedrock Agents, or design custom orchestration layers tailored to AGI Data Services workflows.
The Sr. SDE will build upon the team's existing GenAI-forward practices — introducing advanced patterns for prompt engineering, RAG, agent orchestration, and LLM evaluation into production systems. The Sr. SDE will design and implement robust backend services, APIs, and data pipelines on AWS leveraging Amazon Bedrock, SageMaker, Lambda, ECS/EKS, Step Functions, DynamoDB, OpenSearch, and S3.
The Sr. SDE will collaborate with Applied Scientists, Technical Program Managers, domain experts, and vendor teams — bridging technology, process, and operations.
A day in the life
The Sr. SDE will review LLM-as-a-Judge pipeline metrics — monitoring judge accuracy, calibration drift, and agreement rates — and collaborate with Applied Scientists to refine evaluation rubrics. The Sr. SDE will design new judge architectures, build and iterate on conversational troubleshooting agents, fine-tune prompt chains, and expand RAG knowledge bases. The Sr. SDE will dive deep into data quality anecdotes to find patterns and root causes, propose tooling solutions that automate manual processes, and share new GenAI integration patterns that build on existing team practices. The Sr. SDE will communicate impact and roadmaps to cross-functional partners and VP leadership.