Staff Software Engineer, Shorts ML Infrastructure
at Google
Location
Mountain View, CA, USA
Compensation
$207k–$300k USD
Type
full time
Posted
2 weeks ago
Job description
In this role, you will have the opportunity to work in a changing environment on Machine Learning (ML) infrastructure for Shorts-specific requirements and problems, including: novel model training and serving infrastructure; large-scale training data pipelines; low-latency/real-time training data; online learning (e.g., continuous training with streaming injected training data and partial model updates); feature engineering infrastructure, such as support for high-dimensional features; and training and serving optimizations across a wide range of ML models.
The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.
Responsibilities
- Design and implement highly scalable, reliable, and efficient ML infrastructure solutions to support research and production workloads.
- Stay at the forefront of ML infrastructure technologies and research, identifying and evaluating new approaches to improve performance, scalability, and cost-effectiveness.
- Understand YouTube Shorts product initiatives and technical architecture well enough to translate product initiatives into ML infrastructure initiatives, and deliver product impact incrementally, using good judgment to balance short-term quick wins against longer-term deliveries.
- Provide technical leadership and guidance to junior engineers, fostering a culture of collaboration, innovation, and learning.
- Partner with research scientists, ML engineers, and product teams to understand their needs and deliver infrastructure solutions that accelerate their work.
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 8 years of experience in software development, with a focus on building and maintaining large-scale distributed systems, and experience with Python and C++.
- 3 years of experience with ML infrastructure (e.g., model deployment, model evaluation, data processing, and debugging).
- 3 years of experience with machine learning algorithms, data structures, software design patterns, and system architecture.
Preferred qualifications:
- Experience with ML infrastructure tools and platforms (e.g., Kubeflow, MLflow, TFX, Spark, and Flink).
- Experience in Search/Ads/Recommendation ML methodologies.
- Experience in data analysis and data-driven decision making, including defining and implementing observables for ML infrastructure performance and model quality, and reasoning about such metrics in terms of the concrete training data distribution.
- Experience with hardware acceleration technologies (e.g., GPUs, TPUs) and their application to ML workloads.
- Understanding of MLOps principles and practices, including CI/CD, model monitoring, and experiment tracking.