Sr. / Staff ML Engineer, FM Training Integration - ML Compute
at Apple
Location
Santa Clara, United States of America
Compensation
$181k–$318k USD
Type
full time
Posted
Yesterday
Job description
We are looking for an ML Engineer to join our ML Compute team to help improve the efficiency, scalability, and reliability of model training and inference workloads in the cloud. In this role, you will lead the integration of large-scale ML workloads with cloud infrastructure, working cross-functionally with ML engineers, infrastructure engineers, and researchers to optimize performance, improve system efficiency, and drive high utilization of accelerator resources.
Responsibilities
Own the integration of large-scale model training workloads with accelerator-based cloud infrastructure, ensuring scalable and reliable execution.
Drive performance optimization across the ML stack, including data pipelines, model execution, and distributed systems, to improve throughput, latency, and hardware utilization.
Design and run benchmarks to evaluate model performance and infrastructure configurations, using results to guide optimization efforts.
Build and improve tooling for observability, profiling, and debugging to increase visibility and reliability of ML workloads.
Collaborate cross-functionally with ML engineers, infrastructure engineers, and researchers to improve system efficiency and scalability.
Establish and promote best practices for performance tuning and resource utilization.
Drive high-quality design and code reviews, share best practices, and elevate engineering standards across the team.
Minimum Qualifications
5+ years of experience in software engineering, ML infrastructure, or related domains.
Hands-on experience with machine learning workflows, including training, evaluation, and inference at scale.
Proficiency in Python and experience with at least one major ML framework (e.g., PyTorch or JAX).
Experience with cloud-based infrastructure and distributed systems (e.g., containers, orchestration, storage, and networking).
Bachelor’s degree in Computer Science, Engineering, or a related field.
Preferred Qualifications
Experience working with accelerator-based systems (e.g., GPUs/TPUs), including performance tuning and debugging of ML workloads.
Hands-on experience with distributed training or inference at scale (e.g., data, model, or pipeline parallelism).
Experience optimizing large-scale ML systems, including bottleneck analysis across compute, memory, and networking.
Familiarity with profiling, tracing, and benchmarking tools for ML workloads (e.g., PyTorch Profiler, NVIDIA Nsight).
Experience building or operating ML infrastructure using containerization and orchestration frameworks (e.g., Docker, Kubernetes).
Advanced degree in Computer Science, Engineering, or a related field.
We are a group of engineers who support the training of foundation models at Apple! We build infrastructure for training foundation models with general capabilities such as understanding and generation of text, images, speech, videos, and other modalities, and we apply these models to Apple products. We are looking for engineers who are passionate about building systems that push the frontier of deep learning in scaling, efficiency, and flexibility, and that delight millions of users of Apple products.
Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs. Apple employees are eligible for discretionary restricted stock unit awards, and can purchase Apple stock at a discount if voluntarily participating in Apple’s Employee Stock Purchase Plan. You’ll also receive benefits including comprehensive medical and dental coverage, retirement benefits, a range of discounted products and free services, and, for formal education related to advancing your career at Apple, reimbursement for certain educational expenses, including tuition. Additionally, this role might be eligible for discretionary bonuses or commission payments as well as relocation.
Note: Apple benefit, compensation and employee stock programs are subject to eligibility requirements and other terms of the applicable plan or program.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here — in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.
Apple accepts applications to this posting on an ongoing basis.