We are seeking a talented, self-directed Data Engineer to design, build, and operate large-scale, high-performance data infrastructure that powers analytics, AI/ML workloads, and intelligent automation across Device Operations. You will implement data structures using best practices in data modeling and ETL/ELT processes, build real-time and batch pipelines, and enable AI-ready data foundations that support both traditional BI and emerging agentic systems. You will gather business and functional requirements and translate them into robust, scalable solutions that work within the broader data architecture. You will analyze source systems, drive best practices with partner teams, and participate in the full development lifecycle — from design and implementation to documentation, delivery, and operational support.
The ideal candidate relishes working with large volumes of data, enjoys the challenge of highly complex technical contexts, and is passionate about enabling data-driven decisions at scale. They are an expert in data modeling, ETL design, and data warehousing, and are energized by the intersection of data engineering and AI/ML, where well-structured data infrastructure has an outsized impact on intelligent systems. They are a self-starter who is comfortable with ambiguity, can think big while paying careful attention to detail, and thrives in a fast-paced, collaborative environment.
Key job responsibilities
- Design, implement, and operate scalable data pipelines (batch and real-time) that serve analytics, reporting, and AI/ML workloads
- Build and maintain data infrastructure that supports AI-ready datasets — structured for consumption by machine learning models, agents, and natural language interfaces
- Interface with technology teams to extract, transform, and load data from diverse sources using SQL, Python, and distributed computing frameworks
- Implement data models and ETL/ELT processes using best practices in dimensional modeling, data vault, or hybrid approaches on MPP data warehouses
- Build robust data integration pipelines using SQL, Python, and Spark across batch and streaming paradigms
- Design and deliver high-quality datasets that support business analysis, customer reporting, and AI/ML feature engineering
- Partner with scientists and application engineers to ensure data infrastructure meets the needs of ML training, inference, and agentic automation systems
- Interface with business customers, gather requirements, and deliver complete, well-documented data solutions
- Evaluate and make decisions around dataset designs, pipeline architectures, and tooling proposed by peer engineers
- Produce comprehensive dataset documentation, metadata, and data lineage artifacts
- Mentor junior data engineers on best practices in data engineering, code quality, and operational excellence
A day in the life
You will work across the full spectrum of data engineering: building pipelines that ingest from operational systems, designing warehouse schemas that serve thousands of daily queries, optimizing infrastructure for cost and performance, and enabling new AI/ML capabilities by making data accessible, reliable, and well-governed. You will collaborate with scientists, BI engineers, and application developers to solve problems that span traditional analytics and emerging intelligent systems. Some days you will be deep in SQL optimization; other days you will be designing real-time change data capture (CDC) pipelines or enabling a new agent to query data programmatically.
About the team
The Data, Analytics, and Science Hub (DASH) team builds scalable data platforms, analytical frameworks, AI-powered solutions, and reporting infrastructure to support Device Operations & Supply Chain. DASH serves multiple organizations across Device Operations, delivering data engineering, business intelligence, and science capabilities that power operational decision-making. The team operates at the intersection of data infrastructure and AI innovation, building systems that serve both human analysts and intelligent agents supporting Device Operations users.
Basic qualifications
- Bachelor's degree in a quantitative/technical field such as computer science, engineering, or statistics
- 4+ years of data engineering, database engineering, business intelligence or business analytics experience
- Experience writing complex, highly optimized SQL queries across large datasets
- 4+ years of experience with a development, programming, or scripting language (Python, Java, Bash, or Perl)
- Experience with data warehouse technical architectures, data modeling, infrastructure components, ETL/ELT, reporting and analytics tools and environments, data structures, and hands-on SQL coding
- Experience with Redshift, or experience with Hive/Spark/HBase/YARN and experience with Kafka
- Experience with AWS services including S3, Redshift, SageMaker, EMR, Kinesis, Lambda, and EC2
- Knowledge of distributed systems as they pertain to data storage and computing
Preferred qualifications
- Master's degree in engineering, statistics, computer science, mathematics, or a related quantitative field
- Experience with data infrastructures: relational analytic DBMS, Elasticsearch, and Big Data EMR/EC2/Glue/Lambda, or experience with training and deploying machine learning systems to solve large-scale optimizations
- Experience with infrastructure as code, ops automation, and configuration management tools such as Chef, Puppet, or Ansible
- Experience communicating with users, other technical teams, and management to collect requirements and describe data modeling decisions and data engineering strategy
- Experience as a mentor or tech lead, or leading an engineering team, or experience debugging, profiling, and implementing best software engineering practices in large-scale systems
- Knowledge of software engineering best practices across the development life cycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.